Do Batters Learn During Their Career?
Posted to the SABR-list in July 1999 by Tom Ruane
Part 1
Back in 1996, Dave Smith, the president and founder of Retrosheet, did
a study showing that batters tended to improve against a pitcher as
a game went on. His paper was entitled "Do Batters Learn During a
Game?" and, rather than attempt to summarize his methods and findings,
I recommend that you read his work for yourself. A copy of the paper
can be found at:
http://www.retrosheet.org/Research/SmithD/batlearn.pdf
This got me to wondering if a batter showed a similar increase in
performance as his career progressed. Was a batter or a pitcher at
a disadvantage the first time they ever faced each other? How did
their performance change as they got more familiar? In order to
attempt to answer these questions, I looked at all batter-pitcher
matchups from 1980 to 1998. I only considered matchups where the
batter and pitcher faced each other at least 20 times over the course
of their careers and examined how they did in each of their first 20
confrontations. (Note: since my data started in mid-career for all
players active prior to 1980, I did not include any matchups where
both pitcher and batter were active in the 1970s.)
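The selection described above can be sketched in a few lines of Python.
This is only an illustration: the input layout (a chronological list of
`(batter_id, pitcher_id, outcome)` tuples) and the function name are
assumptions, not the format of the data actually used, and the sketch
does not implement the exclusion of players active in the 1970s.

```python
from collections import defaultdict


def number_confrontations(plate_appearances, min_pa=20):
    """Group outcomes by career confrontation number (1st, 2nd, ...).

    plate_appearances: chronological iterable of
        (batter_id, pitcher_id, outcome) tuples (hypothetical layout).
    Only batter-pitcher pairs with at least `min_pa` plate appearances
    are kept, and only their first `min_pa` confrontations are used.
    """
    by_pair = defaultdict(list)
    for batter, pitcher, outcome in plate_appearances:
        by_pair[(batter, pitcher)].append(outcome)

    buckets = defaultdict(list)
    for outcomes in by_pair.values():
        if len(outcomes) < min_pa:
            continue  # pair never reached the minimum, so drop it
        for n, outcome in enumerate(outcomes[:min_pa], start=1):
            buckets[n].append(outcome)
    return dict(buckets)


# Toy data with a minimum of 3 instead of 20.
toy = [
    ("b1", "p1", "K"),
    ("b1", "p2", "H"),
    ("b1", "p1", "BB"),
    ("b1", "p1", "H"),
    ("b1", "p1", "HR"),
]
toy_result = number_confrontations(toy, min_pa=3)
```

With this layout every qualifying pair contributes exactly one outcome
to each confrontation number, which is why the per-PA rows can be
compared directly.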
So what did I find? Here's the obligatory chart containing the totals
for the 23108 batter-pitcher matchups I found with at least 20 plate
appearances:
PA    AB    H   2B  3B  HR   BB IBB    K HBP  AVG  OBP  SLG  OPS
 1 20648 5167  902 146 425 2058 105 4026 131 .250 .320 .369 .689
 2 20734 5479  903 181 514 1978 121 3497 119 .264 .329 .399 .728
 3 20816 5637 1049 143 527 1854 184 3352 121 .270 .331 .410 .741
 4 20772 5584  963 155 526 1943 124 3346 128 .268 .333 .406 .739
 5 20653 5499 1044 132 537 1997 125 3504 129 .266 .332 .407 .739
 6 20743 5537 1014 159 546 1925 134 3328 119 .266 .330 .410 .740
 7 20698 5560 1035 168 527 1983 150 3354 133 .268 .334 .411 .745
 8 20760 5507  998 129 527 1921 122 3344 130 .265 .328 .401 .729
 9 20850 5684 1060 148 591 1874 136 3332 110 .272 .333 .422 .755
10 20634 5540  970 153 514 2011 157 3298 133 .268 .334 .405 .739
11 20748 5494 1021 143 594 1959 147 3294 150 .264 .330 .413 .743
12 20731 5619 1012 141 586 1956 118 3240 148 .271 .335 .418 .753
13 20775 5535 1036 126 610 1948 150 3261 108 .266 .330 .416 .746
14 20805 5559  978 131 583 1881 177 3322 134 .267 .329 .410 .739
15 20702 5541  973 130 546 2000 155 3187 135 .267 .333 .406 .739
16 20793 5690 1122 134 586 1946 161 3230 116 .273 .336 .425 .761
17 20690 5615  993 134 646 1981 140 3172 117 .271 .335 .426 .761
18 20796 5570 1037 147 605 1903 146 3299 129 .267 .330 .419 .749
19 20682 5673 1013 134 603 2000 170 3259 120 .274 .338 .423 .761
20 20675 5608 1018 153 595 2018 185 3240 134 .271 .337 .421 .758
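For readers who want to check the rate columns, here is a minimal
sketch that recomputes AVG and SLG for the first row from its counting
stats, using the standard definitions (hits per at-bat and total bases
per at-bat). The function names are illustrative. OBP is left out
because its usual denominator includes sacrifice flies, which the table
does not list. The unrounded SLG comes out near .3698, so the printed
.369 looks truncated rather than rounded; that is an inference from the
numbers, not something stated in the post.

```python
def batting_average(ab, h):
    """Hits per at-bat."""
    return h / ab


def slugging(ab, h, doubles, triples, hr):
    """Total bases per at-bat.

    Every hit is worth one base, with one extra base per double,
    two extra per triple and three extra per home run.
    """
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab


# Counting stats from the row for the first career confrontation.
row1 = dict(ab=20648, h=5167, doubles=902, triples=146, hr=425)

avg_1 = batting_average(row1["ab"], row1["h"])
slg_1 = slugging(**row1)
print(f"AVG {avg_1:.4f}  SLG {slg_1:.4f}")
```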
So batters seem to be at a noticeable disadvantage when facing an
unfamiliar pitcher. The field seems to level off after the third time
they see each other, but the batter still seems to get slightly better
the more he sees a pitcher. Here's the breakdown in groups of five
plate appearances:
PAs     OPS
 1- 5  .727
 6-10  .742
11-15  .744
16-20  .758
Five of the top six slugging percentages and four of the top five
on-base percentages occurred in PAs 16 through 20.
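The grouped figures can be approximated directly from the per-PA OPS
column. The sketch below takes an unweighted mean of each block of five
values; the grouped numbers in the post were presumably computed from
pooled totals, so treating a simple mean as equivalent is an assumption
that works here only because every PA slot has nearly the same number
of at-bats.

```python
# OPS for career confrontations 1 through 20, copied from the table.
ops_by_pa = [
    .689, .728, .741, .739, .739,
    .740, .745, .729, .755, .739,
    .743, .753, .746, .739, .739,
    .761, .761, .749, .761, .758,
]


def grouped_mean(values, size=5):
    """Unweighted mean of each consecutive block of `size` values."""
    return [
        sum(values[i:i + size]) / size
        for i in range(0, len(values), size)
    ]


ops_groups = [round(x, 3) for x in grouped_mean(ops_by_pa)]
print(ops_groups)
```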
Part 2
I got some very interesting off-list responses to my recent post on
the performance of batters the first twenty times they face a
particular pitcher. Someone wondered if requiring twenty or
more plate appearances was biasing the sample by eliminating the batters
who fail to learn (and so leave the big leagues before facing any one
pitcher enough to meet my criteria). And while it is true that we've
eliminated these marginal hitters from consideration, we've also done
the same for pitchers. Still, this is something to keep in mind.
What's true when good batters face good pitchers may not necessarily
be true in general. To get a feeling for the impact of this on my
study, I re-ran it dropping the requirement down to 4 plate appearances.
Doing this increased the sample size from 23108 to 188878 and produced
the following results:
PA AB H 2B 3B HR BB IBB K HBP AVG OBP SLG OPS
1 167474 41804 7247 1019 3659 17000 1157 32672 1158 .249 .320 .370 .690
2 168401 43707 7731 1116 3975 16013 1393 29535 1210 .259 .325 .389 .714
3 168307 43704 7839 1031 4155 16048 1588 28718 1136 .259 .325 .392 .717
4 167992 43933 8033 1037 4234 16489 1518 29181 1184 .261 .329 .397 .726
Compare that to my earlier report:
PA AB H 2B 3B HR BB IBB K HBP AVG OBP SLG OPS
1 20648 5167 902 146 425 2058 105 4026 131 .250 .320 .369 .689
2 20734 5479 903 181 514 1978 121 3497 119 .264 .329 .399 .728
3 20816 5637 1049 143 527 1854 184 3352 121 .270 .331 .410 .741
4 20772 5584 963 155 526 1943 124 3346 128 .268 .333 .406 .739
And it does appear as if the "learning" is far less dramatic in the
larger group. Lowering the bar to 4 plate appearances does allow a
lot of pitchers into the study as hitters. Eliminating them yields
183496 samples and the following results:
PA AB H 2B 3B HR BB IBB K HBP AVG OBP SLG OPS
1 162824 41186 7176 1013 3645 16824 1157 30917 1147 .252 .324 .376 .700
2 163769 43043 7629 1104 3957 15849 1393 27932 1200 .262 .329 .395 .724
3 163678 43043 7748 1021 4137 15853 1588 27084 1124 .262 .329 .398 .727
4 163311 43261 7926 1029 4215 16307 1518 27564 1170 .264 .333 .403 .736
While the averages all go up with pitchers removed, the relationship
between the plate appearances doesn't change all that much.
Someone else argued that what we're really seeing here are within-game
effects, that the first appearance is earlier in a game than the second,
which is probably earlier in a game than the third. To see how
pronounced this is, I looked at how many batters the pitcher had faced
prior to the hitter coming to the plate. Here are the averages for the
three groups (20 or more PAs, 4 or more PAs and 4 or more PAs with the
pitchers removed):
      >=20   >=4  xPIT
PA     BFP   BFP   BFP
 1     3.9   4.0   3.9
 2    10.7   9.4   9.2
 3    15.8  12.6  12.6
 4    11.9  10.1  10.0
 5    10.5
 6    13.6
 7    13.1
 8    12.0
 9    12.4
10    12.9
11    12.6
12    12.5
13    12.6
14    12.7
15    12.6
16    12.5
17    12.6
18    12.3
19    12.3
20    12.8
So perhaps most of what I'm seeing is little more than what Dave Smith
reported in his original study. The first PA is guaranteed to be the
first time the pitcher faced the batter in a given game, while the
second PA is likely to be from the same game as well. Notice that the
pitcher in the first group has faced an average of 15.8 batters in the
game when he faces a batter for the third time. This is significantly
higher than the other two groups (both at 12.6) because (or at least I
think this is the reason) the 20 plate appearance minimum skews the
sample in favor of starting pitchers.
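A minimal sketch of how the "batters faced prior" figure could be
derived from play-by-play data, along with the count of how many times
the batter has seen the pitcher in that game. The event layout (a
chronological list of `(game_id, pitcher_id, batter_id)` tuples) and
the function name are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def within_game_context(events):
    """Return (batters_faced_before, time_facing_in_game) per event.

    events: chronological iterable of (game_id, pitcher_id, batter_id)
        tuples (hypothetical layout).
    batters_faced_before counts the pitcher's earlier plate appearances
    in the same game; time_facing_in_game is 1 for the first meeting of
    this batter and pitcher in the game, 2 for the second, and so on.
    """
    faced = defaultdict(int)     # (game, pitcher) -> batters faced so far
    meetings = defaultdict(int)  # (game, pitcher, batter) -> meetings so far
    context = []
    for game, pitcher, batter in events:
        meetings[(game, pitcher, batter)] += 1
        context.append((faced[(game, pitcher)],
                        meetings[(game, pitcher, batter)]))
        faced[(game, pitcher)] += 1
    return context


toy_events = [
    ("g1", "p", "a"),
    ("g1", "p", "b"),
    ("g1", "p", "c"),
    ("g1", "p", "a"),  # second look at the pitcher in game g1
    ("g2", "p", "a"),  # new game, so both counters start over
]
toy_context = within_game_context(toy_events)
```

Averaging the first element of each pair over all plate appearances
with a given career confrontation number would give a BFP column like
the one in the table.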
Part 3
I had a few follow-up comments to make about the results I posted.
My initial findings showed that batters tended to improve the more
often they faced a pitcher. The rates per appearance:
PA   OBP  SLG  OPS     PA   OBP  SLG  OPS
 1  .320 .369 .689     11  .330 .413 .743
 2  .329 .399 .728     12  .335 .418 .753
 3  .331 .410 .741     13  .330 .416 .746
 4  .333 .406 .739     14  .329 .410 .739
 5  .332 .407 .739     15  .333 .406 .739
 6  .330 .410 .740     16  .336 .425 .761
 7  .334 .411 .745     17  .335 .426 .761
 8  .328 .401 .729     18  .330 .419 .749
 9  .333 .422 .755     19  .338 .423 .761
10  .334 .405 .739     20  .337 .421 .758
I noted that OPS continued to go up even after 15 plate appearances:
PAs     OPS
 1- 5  .727
 6-10  .742
11-15  .744
16-20  .758
Someone thought that the improvement during the 16th to 20th PAs
was counterintuitive and wrote:
> Casual empiricism suggests that this counterintuitive statistical
> result may be due to the influence of higher batting averages,
> particularly higher SLG, towards the end of his database period
> compared to the beginning; Ruane used 1980 to 1998. The result may
> also be caused by relatively more observations from the American
> League at the end of the period.
In order to check this out, I kept track of the year and league
associated with every plate appearance (the NL league averages did
not include pitchers' hitting) that went into each PA average.
So here are the initial results along with the league averages for
each plate appearance:
     ---- Batter ----   ---- League ----
PA    OBP  SLG  OPS      OBP  SLG  OPS
 1   .320 .369 .689     .330 .397 .727
 2   .329 .399 .728     .330 .397 .727
 3   .331 .410 .741     .330 .397 .727
 4   .333 .406 .739     .330 .398 .728
 5   .332 .407 .739     .330 .398 .728
 6   .330 .410 .740     .330 .399 .729
 7   .334 .411 .745     .330 .399 .729
 8   .328 .401 .729     .331 .400 .731
 9   .333 .422 .755     .331 .400 .731
10   .334 .405 .739     .331 .400 .731
11   .330 .413 .743     .331 .401 .732
12   .335 .418 .753     .331 .401 .732
13   .330 .416 .746     .331 .402 .733
14   .329 .410 .739     .331 .402 .733
15   .333 .406 .739     .332 .402 .734
16   .336 .425 .761     .332 .403 .735
17   .335 .426 .761     .332 .403 .735
18   .330 .419 .749     .332 .404 .736
19   .338 .423 .761     .332 .404 .736
20   .337 .421 .758     .333 .405 .738
Which gives us the following summary:
       Batter  League
PAs       OPS     OPS
 1- 5    .727    .727
 6-10    .742    .730
11-15    .744    .733
16-20    .758    .736
So while there is some of the effect that Ernie mentioned, it is not
sufficient by itself to explain the rise during the last group.
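The size of the league effect can be read off the summary with a little
arithmetic. The sketch below subtracts the league OPS from the batter
OPS in each group and compares the rise from the first group to the
last; the variable names are illustrative and the inputs are the
rounded figures from the summary, so the results are only good to about
a point of OPS.

```python
batter_ops = {"1-5": .727, "6-10": .742, "11-15": .744, "16-20": .758}
league_ops = {"1-5": .727, "6-10": .730, "11-15": .733, "16-20": .736}

# Batter OPS relative to the league context of the same plate appearances.
edge = {g: round(batter_ops[g] - league_ops[g], 3) for g in batter_ops}

# Rise from the first group of five PAs to the last group of five.
batter_rise = round(batter_ops["16-20"] - batter_ops["1-5"], 3)
league_rise = round(league_ops["16-20"] - league_ops["1-5"], 3)

print(edge)
print(batter_rise, league_rise)
```

On these rounded figures the league context accounts for roughly 9 of
the 31 points of OPS gained between the first and last groups.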
I think that it's also interesting to note that this group of hitters
(the ones facing a single pitcher at least 20 times) is not an
average group. They consistently perform above the league average in
all but the initial group. Although not shown, the same is also true
for their plate appearances beyond the 20th.
There was also some question at the time about how much of this
effect was really due to batter learning during a game. In order to
look at this effect, I compared how each hitter did during the first
three times facing a pitcher in his career with how he did during the
first three times facing a pitcher in each game. Here are the results:
First three times in a career:
PA    AB    H   2B  3B  HR   BB IBB    K HBP  AVG  OBP  SLG  OPS
 1 20648 5167  902 146 425 2058 105 4026 131 .250 .320 .369 .689
 2 20734 5479  903 181 514 1978 121 3497 119 .264 .329 .399 .728
 3 20816 5637 1049 143 527 1854 184 3352 121 .270 .331 .410 .741
First three times in a game:
PA    AB    H   2B  3B  HR   BB IBB    K HBP  AVG  OBP  SLG  OPS
 1 20639 5386  973 127 525 2055 112 3644 135 .260 .329 .396 .725
 2 20662 5624 1041 137 593 1897 129 3128 127 .272 .334 .421 .755
 3 20148 5569 1032 150 601 1817 171 2840 121 .276 .337 .432 .769
So while a batter does significantly worse facing a pitcher for the
first time in his career than when facing him for the first time in
a particular game, the rates of improvement are similar.
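To put numbers on that comparison, this small sketch computes the
step-to-step change in OPS for the two tables above. It works from the
rounded OPS values as printed, and the helper name is illustrative.

```python
career_ops = [.689, .728, .741]  # first three times in a career
game_ops = [.725, .755, .769]    # first three times in a game


def steps(values):
    """Change from each value to the next, rounded to three places."""
    return [round(b - a, 3) for a, b in zip(values, values[1:])]


career_steps = steps(career_ops)
game_steps = steps(game_ops)
first_pa_gap = round(game_ops[0] - career_ops[0], 3)

print(career_steps, game_steps, first_pa_gap)
```

The first-time gap between the two settings (36 points of OPS) is
larger than the difference between their improvement steps.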
A technical note: since in the original sample each matchup contributed
a single plate appearance to a PA group, I used their averages (singles
per PA, doubles per PA, and so on) to compute their within-game
performance. That way, each batter-pitcher pair contributed the same
amount to the final results. Not all batters faced a pitcher 3 (or
even 2) times in a single game over the course of their careers. The
number of samples:
PA  Samples
 1    23108  total batter-pitcher matchups with 20 or more PAs
 2    22945  153 pitchers never faced a batter twice in one game
 3    22383  725 pitchers never faced a batter three times in one game
If you only include those 22383 matchups from the last group, the OPS
for the first three PAs in a game are .726, .755 and .769 (or just about
the same as the rates above).
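The weighting described in the technical note can be illustrated with a
short sketch. It contrasts an equal-weight average of per-matchup rates
with a pooled rate over all plate appearances. The function names and
the toy numbers are hypothetical; only the idea of averaging each
pair's own rate comes from the note.

```python
def equal_weight_rate(pairs):
    """Average of per-matchup rates, so every pair counts the same.

    pairs: list of (events, plate_appearances), one entry per
        batter-pitcher matchup (hypothetical layout).
    """
    rates = [events / pa for events, pa in pairs if pa > 0]
    return sum(rates) / len(rates)


def pooled_rate(pairs):
    """Total events over total PAs, so busy matchups count for more."""
    total_events = sum(events for events, _ in pairs)
    total_pa = sum(pa for _, pa in pairs)
    return total_events / total_pa


# One pair met twice in this situation, the other twelve times.
toy_pairs = [(1, 2), (3, 12)]
equal = equal_weight_rate(toy_pairs)
pooled = pooled_rate(toy_pairs)
print(equal, round(pooled, 3))
```

In the toy data the pooled rate is pulled toward the pair with twelve
plate appearances, while the equal-weight version treats both pairs
alike, which matches the one-PA-per-matchup structure of the career
tables.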
Here is the number of times each career PA was the first, second,
third, or fourth-or-later time they faced each other in a game:
PA     1st    2nd    3rd   >3rd
 1   23108      0      0      0
 2    4572  18536      0      0
 3    6040   2278  14790      0
 4   12647   3853   1617   4991
 5    9356  10210   2878    664
 6    6662   7284   8318    844
 7    9512   4765   6013   2818
 8    9352   7502   3808   2446
 9    8265   7368   6137   1338
10    8449   6364   6219   2076
11    8944   6533   5317   2314
12    8654   7019   5398   2037
13    8561   6807   5911   1829
14    8582   6675   5678   2173
15    8710   6684   5519   2195
16    8773   6762   5598   1975
17    8673   6760   5673   2002
18    9080   6532   5560   1936
19    8912   6894   5399   1903
20    8591   6673   5848   1996
I'm not exactly sure what to make of all of this, but thought it was
interesting anyway.
I would like to thank Larry McCray, Jeff Powers-Beck, Ernie Nadel and
Dave Smith for their comments on this study.