Do Batters Learn During Their Career?

 

Posted to the SABR-list in July 1999 by Tom Ruane

 

Part 1

 

Back in 1996, Dave Smith, the president and founder of Retrosheet, did

a study showing that batters tended to improve against a pitcher as

a game went on.  His paper was entitled "Do Batters Learn During a

Game?" and, rather than attempt to summarize his methods and findings,

I recommend that you read his work for yourself.  A copy of the paper

can be found at:

 

http://www.retrosheet.org/Research/SmithD/batlearn.pdf

 

This got me to wondering if a batter showed a similar increase in

performance as his career progressed.  Was a batter or a pitcher at

a disadvantage the first time they ever faced each other?  How did

their performance change as they got more familiar?  In order

to attempt to answer these questions, I looked at all batter-pitcher

matchups from 1980 to 1998.  I only considered matchups where the

batter and pitcher faced each other at least 20 times over the course

of their careers and examined how they did in each of their first 20

confrontations.  (Note: since my data started in mid-career for all

players active prior to 1980, I did not include any match-ups where

both pitcher and batter were active in the 1970s.)
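
The selection rule described above can be sketched in Python. This is an illustration only: the `events` layout of `(order_key, batter, pitcher, outcome)` tuples and the function names are hypothetical, not the format of the actual play-by-play data, and the exclusion of players active in the 1970s is left out.

```python
from collections import defaultdict

def first_n_by_matchup(events, n=20):
    """Group plate appearances by (batter, pitcher) and keep the first n
    of every matchup that reached at least n plate appearances.

    events: iterable of (order_key, batter, pitcher, outcome) tuples
    (a hypothetical layout); order_key must sort chronologically.
    """
    by_pair = defaultdict(list)
    for order_key, batter, pitcher, outcome in sorted(events, key=lambda e: e[0]):
        by_pair[(batter, pitcher)].append(outcome)
    return {pair: pas[:n] for pair, pas in by_pair.items() if len(pas) >= n}

def outcomes_at(career_pa, matchups):
    """All outcomes from the career_pa-th meeting (1-based) of each matchup."""
    return [pas[career_pa - 1] for pas in matchups.values()]
```

Each qualifying matchup then contributes exactly one outcome to every career PA from 1 through n, which is why each row of a chart built this way draws on the same set of matchups.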

 

So what did I find?  Here's the obligatory chart containing the totals

for the 23108 batter-pitcher matchups I found with at least 20 plate

appearances:

 

 PA     AB    H   2B   3B   HR   BB IBB    K HBP   AVG   OBP   SLG   OPS

  1  20648 5167  902  146  425 2058 105 4026 131  .250  .320  .369  .689

  2  20734 5479  903  181  514 1978 121 3497 119  .264  .329  .399  .728

  3  20816 5637 1049  143  527 1854 184 3352 121  .270  .331  .410  .741

  4  20772 5584  963  155  526 1943 124 3346 128  .268  .333  .406  .739

  5  20653 5499 1044  132  537 1997 125 3504 129  .266  .332  .407  .739

  6  20743 5537 1014  159  546 1925 134 3328 119  .266  .330  .410  .740

  7  20698 5560 1035  168  527 1983 150 3354 133  .268  .334  .411  .745

  8  20760 5507  998  129  527 1921 122 3344 130  .265  .328  .401  .729

  9  20850 5684 1060  148  591 1874 136 3332 110  .272  .333  .422  .755

 10  20634 5540  970  153  514 2011 157 3298 133  .268  .334  .405  .739

 11  20748 5494 1021  143  594 1959 147 3294 150  .264  .330  .413  .743

 12  20731 5619 1012  141  586 1956 118 3240 148  .271  .335  .418  .753

 13  20775 5535 1036  126  610 1948 150 3261 108  .266  .330  .416  .746

 14  20805 5559  978  131  583 1881 177 3322 134  .267  .329  .410  .739

 15  20702 5541  973  130  546 2000 155 3187 135  .267  .333  .406  .739

 16  20793 5690 1122  134  586 1946 161 3230 116  .273  .336  .425  .761

 17  20690 5615  993  134  646 1981 140 3172 117  .271  .335  .426  .761

 18  20796 5570 1037  147  605 1903 146 3299 129  .267  .330  .419  .749

 19  20682 5673 1013  134  603 2000 170 3259 120  .274  .338  .423  .761

 20  20675 5608 1018  153  595 2018 185 3240 134  .271  .337  .421  .758
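
The rate columns can be recomputed from the counting columns. The sketch below assumes the standard definitions, assumes that BB already includes IBB, and assumes the figures are truncated to three places rather than rounded (the first rows of the chart are consistent with truncation: a slugging percentage of .3698 is shown as .369). Sacrifice flies are not listed in the chart, so with `sf=0` the on-base figure comes out a little above the published one.

```python
def per_thousand(numerator, denominator):
    """Rate expressed in thousandths, truncated (not rounded)."""
    return numerator * 1000 // denominator

def rate_stats(ab, h, d, t, hr, bb, hbp, sf=0):
    """Return (AVG, OBP, SLG) in thousandths from counting stats."""
    total_bases = h + d + 2 * t + 3 * hr  # every hit counts once, plus extra bases
    avg = per_thousand(h, ab)
    obp = per_thousand(h + bb + hbp, ab + bb + hbp + sf)
    slg = per_thousand(total_bases, ab)
    return avg, obp, slg

# First career plate appearance, from the chart above.
print(rate_stats(ab=20648, h=5167, d=902, t=146, hr=425, bb=2058, hbp=131))
# (250, 322, 369)
```

The .250 average and .369 slugging percentage match the chart; the .322 on-base percentage is two points above the published .320, which is the size of gap one would expect from the unlisted sacrifice flies.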

 

So batters seem to be at a noticeable disadvantage when facing an

unfamiliar pitcher.  The field seems to level off after the third time

they see each other, but the batter still seems to get slightly better

the more he sees a pitcher.  Here's the breakdown in groups of five

plate appearances:

 

   PAs    OPS

  1- 5   .727

  6-10   .742

 11-15   .744

 16-20   .758

 

Five of the top six slugging percentages and four of the top five

on-base percentages occurred in PAs 16 through 20.
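
Both observations can be checked directly from the chart. In the sketch below the OBP and SLG lists are copied from the chart in thousandths. The group figures are computed as simple unweighted means of the five per-PA OPS values, which is an assumption (the published group values may come from pooled totals), so they can differ from the published ones by a point.

```python
# OBP and SLG in thousandths for career PAs 1 through 20, copied from the chart.
OBP = [320, 329, 331, 333, 332, 330, 334, 328, 333, 334,
       330, 335, 330, 329, 333, 336, 335, 330, 338, 337]
SLG = [369, 399, 410, 406, 407, 410, 411, 401, 422, 405,
       413, 418, 416, 410, 406, 425, 426, 419, 423, 421]

def late_pas_in_top(values, top_n, first_late_pa=16):
    """Count how many of the top_n values belong to PAs >= first_late_pa."""
    ranked = sorted(range(1, len(values) + 1),
                    key=lambda pa: values[pa - 1], reverse=True)
    return sum(1 for pa in ranked[:top_n] if pa >= first_late_pa)

def group_means(values, size=5):
    """Unweighted mean of each consecutive block of `size` PAs."""
    return [sum(values[i:i + size]) / size for i in range(0, len(values), size)]

print(late_pas_in_top(SLG, 6))  # 5
print(late_pas_in_top(OBP, 5))  # 4
ops = [o + s for o, s in zip(OBP, SLG)]
print(group_means(ops))  # [727.2, 741.6, 744.0, 758.0]
```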

 

 

Part 2

 

I got some very interesting off-list responses to my recent post on
the performance of batters the first twenty times they face a
particular pitcher.  Someone wondered if requiring twenty or
more plate appearances was biasing the sample by eliminating the batters
who fail to learn (and so leave the big leagues before facing any one
pitcher enough to meet my criteria).  And while it is true that we've
eliminated these marginal hitters from consideration, we've also done
the same for pitchers.  Still, this is something to keep in mind.
What's true when good batters face good pitchers may not necessarily
be true in general.  To get a feeling for the impact of this on my
study, I re-ran it dropping the requirement down to 4 plate appearances.
Doing this increased the sample size from 23108 to 188878 and produced
the following results:
 
 PA     AB     H   2B   3B   HR    BB  IBB     K  HBP   AVG   OBP   SLG   OPS
  1 167474 41804 7247 1019 3659 17000 1157 32672 1158  .249  .320  .370  .690
  2 168401 43707 7731 1116 3975 16013 1393 29535 1210  .259  .325  .389  .714
  3 168307 43704 7839 1031 4155 16048 1588 28718 1136  .259  .325  .392  .717
  4 167992 43933 8033 1037 4234 16489 1518 29181 1184  .261  .329  .397  .726
 
Compare that to my earlier report:
 
 PA     AB     H   2B   3B   HR    BB  IBB     K  HBP   AVG   OBP   SLG   OPS
  1  20648  5167  902  146  425  2058  105  4026  131  .250  .320  .369  .689
  2  20734  5479  903  181  514  1978  121  3497  119  .264  .329  .399  .728
  3  20816  5637 1049  143  527  1854  184  3352  121  .270  .331  .410  .741
  4  20772  5584  963  155  526  1943  124  3346  128  .268  .333  .406  .739
 
And it does appear as if the "learning" is far less dramatic in the
larger group.  Lowering the bar to 4 plate appearances does allow a
lot of pitchers into the study as hitters.  Eliminating them yields
183496 samples and the following results:
 
 PA     AB     H   2B   3B   HR    BB  IBB     K  HBP   AVG   OBP   SLG   OPS
  1 162824 41186 7176 1013 3645 16824 1157 30917 1147  .252  .324  .376  .700
  2 163769 43043 7629 1104 3957 15849 1393 27932 1200  .262  .329  .395  .724
  3 163678 43043 7748 1021 4137 15853 1588 27084 1124  .262  .329  .398  .727
  4 163311 43261 7926 1029 4215 16307 1518 27564 1170  .264  .333  .403  .736
 
While the averages all go up with pitchers removed, the relationship
between the plate appearances doesn't change all that much.
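
One way to see this is to express each sample's OPS as a gain over the first career PA. A small sketch, with the OPS values copied from the three tables above in thousandths (the labels are just for display):

```python
# OPS in thousandths for career PAs 1-4, copied from the three tables above.
OPS = {
    "20+ PA": [689, 728, 741, 739],
    "4+ PA": [690, 714, 717, 726],
    "4+ PA, no pitchers": [700, 724, 727, 736],
}

def gains_over_first(ops):
    """Change in OPS points relative to the first career PA."""
    return [value - ops[0] for value in ops]

for label, ops in OPS.items():
    print(label, gains_over_first(ops))
# 20+ PA [0, 39, 52, 50]
# 4+ PA [0, 24, 27, 36]
# 4+ PA, no pitchers [0, 24, 27, 36]
```

Removing the pitchers raises every OPS by ten points but leaves the PA-to-PA gains identical, while the 20-PA sample shows a noticeably steeper rise over the first three meetings.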
 
Someone else argued that what we're really seeing here are within-game
effects, that the first appearance is earlier in a game than the second,
which is probably earlier in a game than the third.  To see how
pronounced this is, I looked at how many batters the pitcher had faced
prior to the hitter coming to the plate.  Here are the averages for the
three groups (20 or more PAs, 4 or more PAs and 4 or more PAs with the
pitchers removed):
 
       >=20    >=4   xPIT
        BFP    BFP    BFP
  1     3.9    4.0    3.9
  2    10.7    9.4    9.2
  3    15.8   12.6   12.6
  4    11.9   10.1   10.0
  5    10.5
  6    13.6
  7    13.1
  8    12.0
  9    12.4
 10    12.9
 11    12.6
 12    12.5
 13    12.6
 14    12.7
 15    12.6
 16    12.5
 17    12.6
 18    12.3
 19    12.3
 20    12.8
 
So perhaps most of what I'm seeing is little more than what Dave Smith
reported in his original study.  The first PA is guaranteed to be the
first time the pitcher faced the batter in a given game, while the
second PA is likely to be from the same game as well.  Notice that the
pitcher in the first group has faced an average of 15.8 batters in the
game when he faces a batter for the third time.  This is significantly
higher than the other two groups (both at 12.6) because (or at least I
think this is the reason) the 20 plate appearance minimum skews the
sample in favor of starting pitchers.
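
The "batters faced prior" figure can be computed along these lines. This is a sketch under assumptions: the `game_pas` layout (a list of `(batter, pitcher)` tuples in game order) and the function name are hypothetical, and "batters faced" is taken to mean plate appearances against the pitcher in that game, not distinct batters.

```python
from collections import defaultdict

def batters_faced_before(game_pas):
    """For one game, pair each plate appearance with the number of batters
    the pitcher had already faced in that game.

    game_pas: list of (batter, pitcher) tuples in game order (hypothetical layout).
    """
    faced = defaultdict(int)  # running count of batters faced per pitcher
    result = []
    for batter, pitcher in game_pas:
        result.append((batter, pitcher, faced[pitcher]))
        faced[pitcher] += 1
    return result
```

Averaging the third element over all plate appearances that share a career PA number would give one cell of a BFP table like the one above.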

 

 

Part 3

 

I had a few follow-up comments to make about the results I posted.

My initial findings showed that batters tended to improve the more

often they faced a pitcher.  The rates per appearance:

 

 PA   OBP   SLG   OPS       PA   OBP   SLG   OPS

  1  .320  .369  .689       11  .330  .413  .743

  2  .329  .399  .728       12  .335  .418  .753

  3  .331  .410  .741       13  .330  .416  .746

  4  .333  .406  .739       14  .329  .410  .739

  5  .332  .407  .739       15  .333  .406  .739

  6  .330  .410  .740       16  .336  .425  .761

  7  .334  .411  .745       17  .335  .426  .761

  8  .328  .401  .729       18  .330  .419  .749

  9  .333  .422  .755       19  .338  .423  .761

 10  .334  .405  .739       20  .337  .421  .758

 

I noted that OPS continued to go up even after 15 plate appearances:

 

   PAs    OPS

  1- 5   .727

  6-10   .742

 11-15   .744

 16-20   .758

 

Someone thought that the improvement during the 16th to 20th PAs

was counterintuitive and wrote:

 

> Casual empiricism suggests that this counterintuitive statistical

> result may be due to the influence of higher batting averages,

> particularly higher SLG, towards the end of his database period

> compared to the beginning; Ruane used 1980 to 1998. The result may

> also be caused by relatively more observations from the American

> League at the end of the period.

 

In order to check this out, I kept track of the year and league

associated with every plate appearance (the NL league averages did

not include pitchers' hitting) that went into each PA average.

So here are the initial results along with the league averages for

each plate appearance:

 

     ---- Batter ----  ---- League ----

 PA   OBP   SLG   OPS   OBP   SLG   OPS

  1  .320  .369  .689  .330  .397  .727

  2  .329  .399  .728  .330  .397  .727

  3  .331  .410  .741  .330  .397  .727

  4  .333  .406  .739  .330  .398  .728

  5  .332  .407  .739  .330  .398  .728

  6  .330  .410  .740  .330  .399  .729

  7  .334  .411  .745  .330  .399  .729

  8  .328  .401  .729  .331  .400  .731

  9  .333  .422  .755  .331  .400  .731

 10  .334  .405  .739  .331  .400  .731

 11  .330  .413  .743  .331  .401  .732

 12  .335  .418  .753  .331  .401  .732

 13  .330  .416  .746  .331  .402  .733

 14  .329  .410  .739  .331  .402  .733

 15  .333  .406  .739  .332  .402  .734

 16  .336  .425  .761  .332  .403  .735

 17  .335  .426  .761  .332  .403  .735

 18  .330  .419  .749  .332  .404  .736

 19  .338  .423  .761  .332  .404  .736

 20  .337  .421  .758  .333  .405  .738

 

Which gives us the following summary:

 

       Batter  League

   PAs    OPS     OPS

  1- 5   .727    .727

  6-10   .742    .730

 11-15   .744    .733

 16-20   .758    .736

 

So while there is some of the effect that Ernie mentioned, it is not

sufficient by itself to explain the rise during the last group.
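
The comparison reduces to subtracting the league OPS from the batter OPS in each group. A short sketch using the summary figures above, in thousandths:

```python
# Group OPS in thousandths, copied from the summary above.
GROUPS = ["1-5", "6-10", "11-15", "16-20"]
BATTER = [727, 742, 744, 758]
LEAGUE = [727, 730, 733, 736]

def margins(batter, league):
    """Batter OPS minus league OPS, in points, for each group."""
    return [b - l for b, l in zip(batter, league)]

for group, margin in zip(GROUPS, margins(BATTER, LEAGUE)):
    print(f"{group:>5}  {margin:+d}")
#   1-5  +0
#  6-10  +12
# 11-15  +11
# 16-20  +22
```

The league context rises 9 points from the first group to the last, while the batters in the sample rise 31, so roughly 22 points of the increase are left unexplained by year and league.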

 

I think that it's also interesting to note that this group of hitters

(the ones facing a single pitcher at least 20 times) is not an

average group.  They consistently perform above the league average in

all but the initial group.  Although not shown, the same is also true

for their plate appearances beyond the 20th.

 

There was also some question at the time about how much of this

effect was really due to batter learning during a game.  In order to

look at this effect, I compared how each hitter did during the first

three times facing a pitcher in his career with how he did during the

first three times facing a pitcher in each game.  Here are the results:

 

First three times in a career:

 

 PA     AB    H   2B   3B   HR   BB IBB    K HBP   AVG   OBP   SLG   OPS

  1  20648 5167  902  146  425 2058 105 4026 131  .250  .320  .369  .689

  2  20734 5479  903  181  514 1978 121 3497 119  .264  .329  .399  .728

  3  20816 5637 1049  143  527 1854 184 3352 121  .270  .331  .410  .741

 

First three times in a game:

 

 PA     AB    H   2B   3B   HR   BB IBB    K HBP   AVG   OBP   SLG   OPS

  1  20639 5386  973  127  525 2055 112 3644 135  .260  .329  .396  .725

  2  20662 5624 1041  137  593 1897 129 3128 127  .272  .334  .421  .755

  3  20148 5569 1032  150  601 1817 171 2840 121  .276  .337  .432  .769

 

So while a batter does significantly worse facing a pitcher for the

first time in his career than when facing him for the first time in

a particular game, the rates of improvement are similar.

 

A technical note: since in the original sample each matchup contributed

a single plate appearance to a PA group, I used their averages (singles

per PA, doubles per PA, and so on) to compute their within-game

performance.  That way, each batter-pitcher pair contributed the same

amount to the final results.  Not all batters faced a pitcher 3 (or

even 2) times in a single game over the course of their careers.  The

number of samples:

 

  PA  Samples

   1    23108   total batter-pitcher matchups with 20 or more PAs

   2    22945   153 pitchers never faced a batter twice in one game

   3    22383   725 pitchers never faced a batter three times in one game

 

If you only include those 22383 matchups from the last group, the OPS

for the first three PAs in a game are .726, .755 and .769 (or just about

the same as the rates above).
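
The weighting described in the technical note can be illustrated as follows. This is a sketch of the idea, not the code used for the study: the input layout (one `(events, plate_appearances)` pair per matchup) and both function names are hypothetical.

```python
def equal_weight_rate(matchups):
    """Average per-matchup rates so every batter-pitcher pair counts once.

    matchups: list of (events, plate_appearances) pairs, one per matchup,
    for within-game PAs of a given order (hypothetical layout).
    Matchups with no qualifying plate appearances are skipped.
    """
    rates = [events / pas for events, pas in matchups if pas > 0]
    return sum(rates) / len(rates) if rates else 0.0

def pooled_rate(matchups):
    """Ordinary pooled rate, where frequent matchups weigh more."""
    total_pas = sum(pas for _, pas in matchups)
    return sum(events for events, _ in matchups) / total_pas if total_pas else 0.0

sample = [(1, 2), (9, 10), (0, 0)]
print(equal_weight_rate(sample))  # mean of 0.5 and 0.9
print(pooled_rate(sample))        # 10 events in 12 plate appearances
```

With equal weighting, a pair that met in many games has no more influence than a pair that barely reached 20 plate appearances, which keeps the within-game figures comparable to the career figures, where each matchup contributes one plate appearance per row.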

 

Here is the number of times each career PA was the first, second,

third, or fourth or later time they faced each other in a game:

 

 PA      1st   2nd   3rd  >3rd

  1    23108     0     0     0

  2     4572 18536     0     0

  3     6040  2278 14790     0

  4    12647  3853  1617  4991

  5     9356 10210  2878   664

  6     6662  7284  8318   844

  7     9512  4765  6013  2818

  8     9352  7502  3808  2446

  9     8265  7368  6137  1338

 10     8449  6364  6219  2076

 11     8944  6533  5317  2314

 12     8654  7019  5398  2037

 13     8561  6807  5911  1829

 14     8582  6675  5678  2173

 15     8710  6684  5519  2195

 16     8773  6762  5598  1975

 17     8673  6760  5673  2002

 18     9080  6532  5560  1936

 19     8912  6894  5399  1903

 20     8591  6673  5848  1996
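
Each row of the table sums to the 23108 matchups, so the counts convert directly to percentages. A brief sketch for a few of the rows (the selection of rows is arbitrary):

```python
# Counts for a few career PAs, copied from the table above:
# (1st, 2nd, 3rd, later) meeting within the game.
COUNTS = {
    2: (4572, 18536, 0, 0),
    3: (6040, 2278, 14790, 0),
    4: (12647, 3853, 1617, 4991),
    20: (8591, 6673, 5848, 1996),
}

def shares(row):
    """Convert one row of counts to percentages of the row total."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]

for pa, row in COUNTS.items():
    print(pa, sum(row), shares(row))
# 2 23108 [19.8, 80.2, 0.0, 0.0]
# 3 23108 [26.1, 9.9, 64.0, 0.0]
# 4 23108 [54.7, 16.7, 7.0, 21.6]
# 20 23108 [37.2, 28.9, 25.3, 8.6]
```

About 80% of second career meetings and 64% of third career meetings came within the same game as the first, which is the overlap between the career and within-game effects discussed above.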

 

I'm not exactly sure what to make of all of this, but thought it was

interesting anyway.

 

I would like to thank Larry McCray, Jeff Powers-Beck, Ernie Nadel and

Dave Smith for their comments on this study.