Do Hitters Get Their Expected RBIs?

 

by Cyril Morong

 


 

 

For the most part, yes.  If we assume hitters hit the same way with runners on base as they do all the time, they generally get about the number of RBIs we would expect based on how many base runners are on when they bat and what bases they are on.

 

To calculate a hitter’s number of expected RBIs, I first computed how many runners were on first base when he batted, how many were on second base and how many were on third base. Then I computed his single, double, triple and homerun frequencies.  I then assumed he had the same frequencies with runners on base as he did for all of his at bats.  For example, if there were 2000 runners on first during the period (the data are explained below), and his homerun frequency is 3%, then we would expect him to get 60 RBIs for that case.  We also assume that he hits homeruns 3% of the time when runners are on second base and third base.  The same is done for triples: assume he hits triples with the same frequency as his overall triple rate and multiply that by how many base runners were on when he batted.

 

With singles and doubles, things are a little different.  Of course runners on third and second will score on a double and a runner on third will score on a single.  But I had to make assumptions about how often runners on first would score on a double and how often runners on second would score on a single.  I assumed that 42.6% of runners scored from first on doubles and 63.4% scored from second on singles. These were the major league averages for the years 1987-2000 that I got from John Jarvis’s website (http://knology.net/~johnfjarvis/stats.html).

 

Let’s look at an example: Jay Bell.  Here is how many runners were on base when he batted, his hit frequencies, his expected RBIs in each case, and how many total RBIs we would have expected him to get. (I had to adjust the numbers for at bats with runners on base that I found at the CNN/SI site; this is explained in the section on the data below.)  This only includes RBIs from hits. RBIs from sacrifice flies and walks with the bases loaded are not included.

 

 

 

                 AB with    AB with      AB with          AB with          AB with
                 None On    Runners on   Runners on 1st   Runners on 2nd   Runners on 3rd   Situational
                 4298       2921         1814             1396             708              Total
1B%    0.1773    0.00       N/A          0.00             156.93           125.54           282.47
2B%    0.0540    0.00       N/A          41.75            75.42            38.25            155.41
3B%    0.0093    0.00       N/A          16.84            12.96            6.57             36.36
HR%    0.0266    114.31     77.69        48.25            37.13            18.83            296.21
                 Expected   Expected     Expected         Expected         Expected
                 RBI        RBI          RBI              RBI              RBI              770.45
                                                                                            Grand Total

 

If Jay Bell had the same hit frequencies with runners on base as he did overall, we would expect him to get 770.45 RBIs.  The N/A in the fourth column for singles, doubles and triples is there because those expected RBIs are shown in the next three columns.  For homeruns, there is a total in both the “None on” and “Runners on” columns since any homeruns in those cases give one RBI for the hitter over and above any runners on base.

 

Bell gets 114.31 expected RBIs from his at bats with none on from homeruns since .0266*4298 = 114.31.  He gets 156.93 expected RBIs with runners on 2nd from singles since 1396*.1773*.634 = 156.93.  He gets 41.75 expected RBIs with runners on 1st from doubles since 1814*.054*.426 = 41.75. I assumed that all runners on 3rd score on singles and all runners on 2nd score on doubles. He gets 16.84 expected RBIs with runners on 1st from triples since 1814*.0093 = 16.84.  I assumed that all runners scored on triples.
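The Jay Bell arithmetic can be sketched in a few lines of Python. This is only an illustration of the method described above, not the original spreadsheet; the function and variable names are made up for the example. It uses the rounded frequencies printed in the table, so the total comes out near, but not exactly at, 770.45 (presumably the table was computed from unrounded frequencies).

```python
# Share of runners who score, major league averages 1987-2000 (from the text).
SCORE_FROM_2ND_ON_SINGLE = 0.634
SCORE_FROM_1ST_ON_DOUBLE = 0.426


def expected_rbi(freq, ab_none_on, ab_runners_on, on_1st, on_2nd, on_3rd):
    """Expected RBIs from hits, assuming the hitter's overall hit
    frequencies also apply with runners on base (hypothetical helper)."""
    runners = on_1st + on_2nd + on_3rd
    # Singles: every runner on 3rd scores, 63.4% of runners on 2nd score.
    singles = freq["1B"] * (on_2nd * SCORE_FROM_2ND_ON_SINGLE + on_3rd)
    # Doubles: runners on 2nd and 3rd score, 42.6% of runners on 1st score.
    doubles = freq["2B"] * (on_1st * SCORE_FROM_1ST_ON_DOUBLE + on_2nd + on_3rd)
    # Triples: every runner scores.
    triples = freq["3B"] * runners
    # Homeruns: the batter scores in every at bat, plus every runner.
    homers = freq["HR"] * (ab_none_on + ab_runners_on + runners)
    return singles + doubles + triples + homers


# Jay Bell's rounded frequencies and counts, copied from the table.
bell_freq = {"1B": 0.1773, "2B": 0.0540, "3B": 0.0093, "HR": 0.0266}
bell_total = expected_rbi(bell_freq, 4298, 2921, 1814, 1396, 708)
print(round(bell_total, 1))  # close to the 770.45 in the table
```

Keeping the two advancement rates as named constants makes it easy to swap in other figures, such as league-specific ones.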

 

Adding up all of the individual cases gives 770.45 expected RBIs.  During the years 1987-2001, Jay Bell actually got 776 RBIs, with sacrifice flies and bases loaded walks excluded.  Below is a list of the 61 players who had 6000 or more plate appearances from the years 1987-2001 and who also had data listed at the CNN/SI site.  The totals are per 600 at bats, or about a full season.

 

Name                Actual   Predicted   Difference
Tino Martinez       103.93       93.75        10.18
Jeff Bagwell        113.06      103.81         9.25
Frank Thomas        117.90      109.89         8.01
Wally Joyner         84.48       77.04         7.44
Robin Ventura        90.57       83.67         6.90
Dante Bichette       99.95       93.10         6.85
Harold Baines        94.28       87.51         6.77
Mark Grace           76.18       69.97         6.21
David Justice       103.19       96.99         6.21
B.J. Surhoff         75.98       70.02         5.96
Juan Gonzalez       123.21      117.78         5.43
Luis Gonzalez        87.50       82.46         5.05
John Olerud          87.83       83.08         4.76
Greg Vaughn         100.19       95.57         4.62
Paul O'Neill         94.80       90.19         4.60
Tony Gwynn           71.58       67.14         4.44
Travis Fryman        86.69       82.43         4.26
Gary Sheffield       98.15       94.06         4.08
Eric Karros          89.70       85.73         3.98
Jay Buhner          106.52      102.96         3.56
Bernie Williams      93.49       89.97         3.52
Barry Bonds         111.72      108.35         3.36
Ken Griffey Jr.     112.30      108.94         3.36
Mark McLemore        51.36       48.06         3.30
Kenny Lofton         55.16       51.87         3.29
Larry Walker        107.72      104.48         3.24
Jose Canseco        111.68      108.56         3.12
Andres Galarraga    102.89       99.77         3.12
Sammy Sosa          108.32      105.25         3.07
Ray Lankford         86.15       83.20         2.95
Delino DeShields     52.73       50.03         2.69
Will Clark           93.75       91.32         2.42
Todd Zeile           80.47       78.20         2.27
Mark McGwire        127.26      125.02         2.24
Matt Williams        99.05       96.96         2.10
Cal Ripken           79.41       77.61         1.81
Fred McGriff        100.76       99.09         1.67
Tim Raines           65.42       63.76         1.67
Edgar Martinez       97.80       96.44         1.36
Ken Caminiti         85.88       85.16         0.72
Jay Bell             64.50       64.04         0.46
Gregg Jefferies      66.30       65.95         0.36
Brady Anderson       63.65       63.36         0.30
Rafael Palmeiro      96.95       96.67         0.29
Chuck Knoblauch      51.53       51.75        -0.22
Tony Fernandez       60.07       60.45        -0.39
Ruben Sierra         89.29       89.83        -0.55
Ron Gant             86.37       87.04        -0.66
Roberto Alomar       71.11       71.86        -0.75
Marquis Grissom      60.58       61.59        -1.02
Bobby Bonilla        90.35       91.37        -1.02
Barry Larkin         68.58       70.52        -1.94
Devon White          64.56       66.50        -1.94
Steve Finley         66.49       68.46        -1.97
Dave Martinez        55.81       57.81        -2.00
Craig Biggio         59.81       61.86        -2.05
Omar Vizquel         47.01       49.76        -2.75
Rickey Henderson     55.40       58.45        -3.05
Benito Santiago      71.44       75.83        -4.39
Ellis Burks          93.29       99.86        -6.57
Wade Boggs           56.42       63.25        -6.83

 

 

Some observations.  47 of the 61 hitters (77%) were predicted to within 5 RBIs per 600 at bats.  So this is reasonably accurate (the correlation between the actual RBIs and predicted RBIs is .987). But there are only 17 hitters who had fewer RBIs than expected. I expected about half the hitters to have more RBIs than expected and half to have fewer.  A few RBIs come from groundouts and I don’t have data on those.  But there are not many such cases.
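The counts in these observations can be checked directly from the Difference column of the table. The Python below is a minimal sketch: the list is copied from the table as printed, and the names are made up for the example. Counting the column as printed gives 47 hitters within 5 RBIs, 17 with a negative difference and one off by more than 10. The last lines redo Jay Bell's per-600 figures, assuming his total at bats are the none-on plus runners-on at bats from his table (4298 + 2921 = 7219).

```python
# Actual minus predicted RBIs per 600 at bats, copied from the table above.
differences = [
    10.18, 9.25, 8.01, 7.44, 6.90, 6.85, 6.77, 6.21, 6.21, 5.96,
    5.43, 5.05, 4.76, 4.62, 4.60, 4.44, 4.26, 4.08, 3.98, 3.56,
    3.52, 3.36, 3.36, 3.30, 3.29, 3.24, 3.12, 3.12, 3.07, 2.95,
    2.69, 2.42, 2.27, 2.24, 2.10, 1.81, 1.67, 1.67, 1.36, 0.72,
    0.46, 0.36, 0.30, 0.29, -0.22, -0.39, -0.55, -0.66, -0.75, -1.02,
    -1.02, -1.94, -1.94, -1.97, -2.00, -2.05, -2.75, -3.05, -4.39, -6.57,
    -6.83,
]

within_5 = sum(1 for d in differences if abs(d) <= 5)   # predicted within 5 RBIs
below = sum(1 for d in differences if d < 0)            # fewer RBIs than expected
off_by_10 = sum(1 for d in differences if abs(d) > 10)  # missed by more than 10

print(len(differences), within_5, below, off_by_10)


def per_600(total, at_bats):
    """Scale a career total to a 600 at-bat season (hypothetical helper)."""
    return total / at_bats * 600


# Jay Bell: 776 actual and 770.45 expected RBIs over an assumed 7219 at bats.
bell_ab = 4298 + 2921
print(round(per_600(776, bell_ab), 2), round(per_600(770.45, bell_ab), 2))
```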

 

Maybe my list of 61 players tends to be populated by very good hitters (who else would last so long?).  They tend to bat in the middle of the order and the runners who are on base might be faster than average since they would be the 1 and 2 hitters. Also, these hitters may have more than average power.  Maybe their singles and doubles go farther (and travel faster) than average, making it easier for runners to score.

 

Notice that the players who are above expectations also seem to be power hitters who hit in the middle of the order.  The players who are negative tend to be players who batted at the top or bottom of the order.  If they were at the top of the order, it might be partly because of their speed.  So they may get more singles and doubles as a result of speed and the runners who are on already may not score.  Also, if they have below average power, maybe their doubles and singles don’t go as far as they do for the power hitters, making it harder for runners to score.

 

But the bottom line is that the vast majority of hitters are predicted fairly well, and only one was off by more than 10 RBIs.

 

 

Alternate methods

 

The assumed runner advancement is for both leagues.  I also tried using different figures for hitters from each league.  For the AL, 64.5% of runners scored from 2nd on singles and 39.5% scored from 1st on doubles. For the NL, these were 62.3% and 45.7%, respectively. For any players who did not get at least 80% of their at bats in one league, I used a weighted average.  The results of that analysis are in the left-hand column.  Then I not only used the league runner advance figures, but I also took into account the fact that, on average, batting average is higher with runners on base (and HR% is lower).  These results are in the right-hand column. The differences are not great.  Email me for details. In each case, I only give the difference between actual RBIs and expectations.
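One way to picture the league adjustment is the Python sketch below. The 80% rule and the league rates come from the text; weighting by each league's share of the hitter's at bats is an assumption, since the weights are not spelled out, and the function name is made up for the example.

```python
# League runner-advancement rates quoted in the text.
AL = {"second_on_single": 0.645, "first_on_double": 0.395}
NL = {"second_on_single": 0.623, "first_on_double": 0.457}


def advancement_rates(ab_al, ab_nl, threshold=0.80):
    """Pick the runner-advancement rates for one hitter (hypothetical helper).

    Uses one league's rates when at least `threshold` of the hitter's at
    bats came in that league; otherwise blends the two leagues, weighted
    by at-bat share (the weighting scheme is an assumption).
    """
    total = ab_al + ab_nl
    share_al = ab_al / total
    share_nl = ab_nl / total
    if share_al >= threshold:
        return dict(AL)
    if share_nl >= threshold:
        return dict(NL)
    return {key: share_al * AL[key] + share_nl * NL[key] for key in AL}


# A hitter who split his at bats evenly between the leagues.
print(advancement_rates(3000, 3000))
```

For an even split, the blended rates work out to about 63.4% and 42.6%, the same both-league figures used in the main analysis.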

 

 

 

 

ADJ FOR LEAGUE                   ADJ FOR LEAGUE AND ROB
Name                PER600 AB    Name                PER600 AB
Tino Martinez           10.29    Tino Martinez            9.76
Jeff Bagwell             9.10    Jeff Bagwell             8.55
Frank Thomas             8.15    Frank Thomas             7.63
Wally Joyner             7.45    Wally Joyner             6.56
Robin Ventura            6.93    Robin Ventura            6.34
Dante Bichette           6.81    Dante Bichette           6.05
Harold Baines            6.74    Harold Baines            6.02
David Justice            6.21    David Justice            5.74
Mark Grace               6.16    Mark Grace               5.21
B.J. Surhoff             6.00    Juan Gonzalez            5.16
Juan Gonzalez            5.58    B.J. Surhoff             5.13
Luis Gonzalez            4.90    Greg Vaughn              4.35
John Olerud              4.89    Luis Gonzalez            4.21
Greg Vaughn              4.66    John Olerud              4.13
Paul O'Neill             4.63    Paul O'Neill             3.83
Tony Gwynn               4.44    Travis Fryman            3.60
Travis Fryman            4.34    Gary Sheffield           3.57
Gary Sheffield           4.03    Eric Karros              3.41
Eric Karros              3.89    Tony Gwynn               3.38
Jay Buhner               3.64    Jay Buhner               3.32
Bernie Williams          3.61    Ken Griffey Jr.          3.09
Ken Griffey Jr.          3.43    Barry Bonds              2.90
Kenny Lofton             3.27    Jose Canseco             2.82
Mark McLemore            3.25    Bernie Williams          2.82
Barry Bonds              3.22    Sammy Sosa               2.76
Jose Canseco             3.16    Kenny Lofton             2.56
Larry Walker             3.10    Larry Walker             2.47
Andres Galarraga         3.10    Andres Galarraga         2.43
Sammy Sosa               3.01    Mark McLemore            2.42
Ray Lankford             2.85    Mark McGwire             2.29
Delino DeShields         2.76    Ray Lankford             2.21
Will Clark               2.42    Delino DeShields         1.99
Todd Zeile               2.23    Will Clark               1.59
Mark McGwire             2.21    Matt Williams            1.56
Matt Williams            2.05    Todd Zeile               1.49
Cal Ripken               1.84    Fred McGriff             1.19
Fred McGriff             1.67    Cal Ripken               1.19
Tim Raines               1.64    Tim Raines               0.82
Edgar Martinez           1.49    Edgar Martinez           0.68
Ken Caminiti             0.67    Ken Caminiti            -0.06
Jay Bell                 0.43    Rafael Palmeiro         -0.19
Rafael Palmeiro          0.36    Brady Anderson          -0.19
Gregg Jefferies          0.35    Jay Bell                -0.27
Brady Anderson           0.34    Gregg Jefferies         -0.47
Chuck Knoblauch         -0.23    Chuck Knoblauch         -0.97
Tony Fernandez          -0.41    Ron Gant                -1.17
Ruben Sierra            -0.49    Ruben Sierra            -1.26
Ron Gant                -0.67    Tony Fernandez          -1.42
Roberto Alomar          -0.76    Marquis Grissom         -1.67
Marquis Grissom         -0.97    Roberto Alomar          -1.68
Bobby Bonilla           -1.07    Bobby Bonilla           -1.84
Barry Larkin            -1.90    Devon White             -2.64
Devon White             -1.94    Steve Finley            -2.68
Dave Martinez           -1.99    Barry Larkin            -2.74
Steve Finley            -2.00    Craig Biggio            -2.79
Craig Biggio            -2.04    Dave Martinez           -2.85
Omar Vizquel            -2.84    Rickey Henderson        -3.67
Rickey Henderson        -3.07    Omar Vizquel            -3.74
Benito Santiago         -4.38    Benito Santiago         -5.13
Ellis Burks             -6.56    Ellis Burks             -7.30
Wade Boggs              -6.84    Wade Boggs              -7.85

 

 

The results are similar to the first case, although some players have noticeably bigger (or smaller) differences between actual and expected RBIs.

 

I also looked at this issue with walks included.  That is, for a player’s hit frequency, I used at bats + walks + HBP as the denominator.  Of course, it also meant that there were more runners to drive in.  The lists below are those three cases.  Again, the results are similar to what we have already seen.  I just show the difference between actual RBIs and expectations per 660 plate appearances.  RBIs from bases loaded walks are included but not RBIs from sacrifice flies. The column on the left makes no adjustment for what league a hitter was in.  The one in the center does.  The one on the right adjusts for league and the normally higher hit frequencies with runners on base (and lower HR%s with runners on).

 

 

 

 

 

NO ADJ                           ADJUSTED FOR LEAGUE              ADJUSTED FOR LEAGUE AND ROB
Name                PER660 PA    Name                PER660 PA    Name                PER660 PA
Tino Martinez            9.11    Tino Martinez            9.24    Tino Martinez            8.80
Dante Bichette           7.11    Dante Bichette           7.06    Dante Bichette           6.84
Jeff Bagwell             6.78    Harold Baines            6.77    Harold Baines            6.28
Harold Baines            6.77    Jeff Bagwell             6.64    Jeff Bagwell             6.22
Mark Grace               6.16    Mark Grace               6.11    Mark Grace               5.59
B.J. Surhoff             5.87    B.J. Surhoff             5.91    B.J. Surhoff             5.41
Robin Ventura            5.62    Robin Ventura            5.67    Robin Ventura            5.19
Wally Joyner             5.61    Wally Joyner             5.62    Wally Joyner             5.08
Frank Thomas             5.41    Frank Thomas             5.55    Frank Thomas             4.72
Juan Gonzalez            4.20    Juan Gonzalez            4.37    Juan Gonzalez            4.06
David Justice            3.93    Paul O'Neill             3.95    David Justice            3.69
Paul O'Neill             3.92    David Justice            3.93    Jose Canseco             3.63
Luis Gonzalez            3.64    Travis Fryman            3.68    Greg Vaughn              3.55
Travis Fryman            3.58    Jose Canseco             3.62    Eric Karros              3.44
Jose Canseco             3.57    Greg Vaughn              3.54    Paul O'Neill             3.43
Greg Vaughn              3.50    Luis Gonzalez            3.51    Travis Fryman            3.36
Eric Karros              3.44    Eric Karros              3.37    Luis Gonzalez            3.16
Kenny Lofton             3.24    Kenny Lofton             3.24    Kenny Lofton             2.94
Tony Gwynn               3.22    Tony Gwynn               3.22    Tony Gwynn               2.47
Ken Griffey Jr.          2.75    Ken Griffey Jr.          2.85    Mark McLemore            2.34
Mark McLemore            2.72    Bernie Williams          2.81    Bernie Williams          2.28
Bernie Williams          2.71    Mark McLemore            2.69    Jay Buhner               2.24
Delino DeShields         2.47    Delino DeShields         2.54    Delino DeShields         2.19
Andres Galarraga         2.17    John Olerud              2.32    Ken Griffey Jr.          2.03
John Olerud              2.17    Jay Buhner               2.22    Sammy Sosa               1.87
Gary Sheffield           2.14    Andres Galarraga         2.16    Andres Galarraga         1.87
Jay Buhner               2.13    Gary Sheffield           2.09    Gary Sheffield           1.85
Sammy Sosa               2.10    Sammy Sosa               2.05    John Olerud              1.47
Matt Williams            1.10    Matt Williams            1.05    Matt Williams            0.91
Will Clark               0.95    Will Clark               0.94    Mark McGwire             0.46
Larry Walker             0.64    Cal Ripken               0.68    Will Clark               0.35
Mark McGwire             0.63    Mark McGwire             0.61    Cal Ripken               0.29
Cal Ripken               0.63    Larry Walker             0.52    Ray Lankford             0.28
Ray Lankford             0.60    Ray Lankford             0.51    Todd Zeile               0.24
Todd Zeile               0.51    Todd Zeile               0.49    Larry Walker             0.18
Fred McGriff             0.44    Fred McGriff             0.44    Fred McGriff             0.06
Barry Bonds              0.08    Barry Bonds             -0.07    Jay Bell                -0.45
Jay Bell                -0.21    Jay Bell                -0.22    Brady Anderson          -0.97
Edgar Martinez          -0.59    Edgar Martinez          -0.44    Chuck Knoblauch         -0.99
Chuck Knoblauch         -0.66    Chuck Knoblauch         -0.65    Edgar Martinez          -1.13
Brady Anderson          -0.73    Brady Anderson          -0.67    Tim Raines              -1.17
Tim Raines              -0.77    Tim Raines              -0.79    Barry Bonds             -1.32
Tony Fernandez          -0.96    Tony Fernandez          -0.98    Tony Fernandez          -1.44
Gregg Jefferies         -1.24    Gregg Jefferies         -1.24    Ron Gant                -1.45
Ruben Sierra            -1.33    Ruben Sierra            -1.25    Gregg Jefferies         -1.68
Ron Gant                -1.45    Ron Gant                -1.46    Ruben Sierra            -1.75
Ken Caminiti            -1.48    Ken Caminiti            -1.52    Ken Caminiti            -1.96
Roberto Alomar          -1.60    Roberto Alomar          -1.61    Roberto Alomar          -1.99
Rafael Palmeiro         -1.95    Rafael Palmeiro         -1.86    Rafael Palmeiro         -2.28
Marquis Grissom         -2.09    Marquis Grissom         -2.04    Marquis Grissom         -2.33
Steve Finley            -2.26    Steve Finley            -2.28    Steve Finley            -2.51
Bobby Bonilla           -2.69    Bobby Bonilla           -2.74    Devon White             -3.02
Devon White             -2.77    Craig Biggio            -2.76    Craig Biggio            -3.11
Craig Biggio            -2.77    Devon White             -2.76    Bobby Bonilla           -3.19
Omar Vizquel            -2.89    Omar Vizquel            -2.96    Omar Vizquel            -3.29
Barry Larkin            -3.62    Barry Larkin            -3.58    Barry Larkin            -3.95
Dave Martinez           -3.72    Dave Martinez           -3.71    Dave Martinez           -4.06
Rickey Henderson        -5.14    Rickey Henderson        -5.16    Rickey Henderson        -5.32
Benito Santiago         -5.34    Benito Santiago         -5.32    Benito Santiago         -5.60
Ellis Burks             -7.10    Ellis Burks             -7.09    Ellis Burks             -7.17
Wade Boggs              -8.64    Wade Boggs              -8.63    Wade Boggs              -9.30

 

 

Again, the results are similar to the analysis based on at bats only.

 

The Data

 

 

I came across a discrepancy in the data I used.  I used data from the CNN/SI site for each player and I calculated each player's RBI opportunities.  For example, I added up the at-bats that CNN/SI reported for Rafael Palmeiro for each situation where runners might be on base:

 

Runner on 1B-1262

Runner on 2B-714

Runner on 3B-236

Runners on 1B, 2B-614

Runners on 1B, 3B-249

Runners on 2B, 3B-142

Bases Loaded-163

 

This adds up to 3380.  But the discrepancy comes in where, in a separate line, they report that he had 3852 at-bats with runners on base, not 3380.  With no runners on, they give him 4521.  Adding this to 3852, you get 8373, the same total that he has in the Lee Sinins sabermetric encyclopedia (for 1987-2001).  So the 3852, not the 3380, must be right.

In my study I looked at opportunities per at-bat.  So I had to calculate opportunities for each hitter.  For Palmeiro it was:

 

1262*2=2524

714*2=1428

236*2=472

614*3=1842

249*3=747

142*3=426

163*4=652

 

If you add that up, you get 8091.  The reason you multiply 1262 by 2 is that an at-bat with a man on first is two opportunities, the man on first and the batter.  Adding the 8091 to the at-bats with none on (which is one opportunity each time), 4521, you get 12612.  That is, Palmeiro had 12612 opportunities.  To get this per at-bat, you divide 12612 by his total at-bats, 8373, and get 1.51.

       Now this is not right, because I did not have the right numbers for each of those base situations.  I don't know why CNN/SI had the discrepancy.  I  discovered it on August 1, 2003.

Then I looked at how many runners-on-base at-bats were missing for Palmeiro. That would be 3852 - 3380, or 472.  The 3852 is 13.96% higher than the 3380.  I assumed that Palmeiro got those other 472 at bats and that the seven base situations came up with the same frequency as they did for the at-bats listed.  So I raised the at-bats with a runner on first 13.96%, the at-bats with a runner on second 13.96%, and so on.  So Palmeiro gets more opportunities, and his opportunities per at-bat go up to 1.64.  I did this for all the hitters and re-ran the first regression from the paper.  The results were actually more accurate. The coefficients on ISO and AVG changed a little, but the coefficient on OPP (RBI opportunities) went up to .187 from .125.  That is quite a jump, and it means that opportunities might be a lot more important than I thought. In fact, I had done a study on the 1995 season a couple of years ago, and the coefficient on OPP was .174. The standard error per 600 at bats fell from 5.03 to 3.39.
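The Palmeiro correction can be retraced with a short Python sketch. The at-bat counts are the ones quoted above; the dictionary layout and function name are made up for the example, and the scaling simply applies the same 13.96% increase to every base situation, as described.

```python
# Palmeiro's at-bats by base situation as listed by CNN/SI (from the text).
# Each entry: (at-bats, number of runners on base in that situation).
situations = {
    "1B": (1262, 1),
    "2B": (714, 1),
    "3B": (236, 1),
    "1B,2B": (614, 2),
    "1B,3B": (249, 2),
    "2B,3B": (142, 2),
    "loaded": (163, 3),
}
AB_NONE_ON = 4521
AB_RUNNERS_ON_REPORTED = 3852  # the separate "runners on base" line


def opportunities_per_ab(scale=1.0):
    """RBI opportunities per at-bat; each at-bat counts the runners on
    base plus the batter himself (hypothetical helper)."""
    with_runners = sum(
        ab * scale * (runners + 1) for ab, runners in situations.values()
    )
    total_ab = AB_NONE_ON + AB_RUNNERS_ON_REPORTED
    return (with_runners + AB_NONE_ON) / total_ab


listed = sum(ab for ab, _ in situations.values())  # 3380 at-bats listed
scale = AB_RUNNERS_ON_REPORTED / listed            # about 1.1396

print(round(opportunities_per_ab(), 2))       # before the correction: 1.51
print(round(opportunities_per_ab(scale), 2))  # after the correction: 1.64
```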

If you went to the CNN/SI site to look for these discrepancies, you would not be able to find them.  They no longer list all of the base-out situations.  They leave one out, the one with runners on first.  So you can’t check the total at bats from the seven on-base situations against the total given in the separate “runners on base” line. Since they had a discrepancy before, it might still exist and you could not detect it.

            I think the corrections I have made are very reasonable.  They result in average number of opportunities per at bat of 1.62.  Before the corrections, it was just 1.52.  The 1.62 is more in line with what I got in a study of the 1995 season (I used data from the STATS, INC. Scoreboard book for that study) and also with the frequency of the different base-out situations.  There was quite a range of discrepancies.  Two hitters were missing more than 20% of their at bats with runners on base while two were missing less than 1%.  That is, the discrepancies varied quite a bit across players.

 
