How Many HRs Would Babe Ruth Have in Integrated Baseball?
by Cyril Morong
Suppose that baseball had been integrated before 1947. The quality of the pitching would have been better, since the talent pool would have been larger. But how much better would the pitching have been? I assume that the racial mix of pitchers would have been about what it was in the post-1947 era.
I estimate that about 15% of the innings pitched (IP) in the post-1947 era were by non-whites: blacks, dark-skinned Hispanics and Asians. Using the Lee Sinins Complete Baseball Encyclopedia, I found all the pitchers with 1,000+ IP in this period and then calculated what percent of the IP by these guys was by non-whites. You can see the list here. I checked the race of any pitcher I did not already know by looking at when they played and finding pictures of them in books or online. Any pitchers with Hispanic names were considered non-white. Before 1947 there were pitchers like Lefty Gomez, whose skin was light enough that they were allowed to play, but I did not want to have to judge who would have been able to play and who would not.
That list includes relative ERA. That is simply ERA divided by the league ERA. The relative ERA of all the whites combined was 105.75, meaning their ERAs were about 5.75% better than the league average. For the non-whites it was just a bit higher at 106.8. In the analysis below, I assume that the ERAs of whites and non-whites will be the same. The pitchers with 1,000+ IP since 1947 accounted for 58% of all the IP in this time period.
Now if you add in a bunch of new pitchers, who do you get rid of? The worst pitchers. Since the non-whites make up 15% of the IP, I assumed that the worst 15% of pitchers in Ruth’s era would be replaced by non-whites. I did that by ranking pitchers in various years by ERA and eliminating the pitchers that made up the bottom 15% of the list. How good would the new mix of pitchers be? Since the non-whites and whites have basically the same ERAs, the new ERA in baseball would be the collective ERA of the top 85% of pitchers (actually the pitchers who made up the top 85% of IP on the ERA list). This ERA is about 8.7% below the league average. Initially I randomly selected 12 seasons between 1901 and 1940 to look at (later I looked at all AL years from 1920-34, the years Ruth played with the Yankees).
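The trimming step described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet work actually done for the article: the function name `top_ip_share_era`, the five sample pitchers, and the choice to prorate the pitcher who straddles the 85% cutoff are all assumptions made for the example.

```python
def top_ip_share_era(pitchers, keep_share=0.85):
    """Collective ERA of the best-ERA pitchers who make up keep_share of all IP.

    pitchers: list of (innings_pitched, earned_runs) tuples.
    Assumption: the pitcher who straddles the cutoff is counted
    proportionally, at his own ERA, for the IP needed to reach it.
    """
    # Rank from best (lowest) ERA to worst; ERA = 9 * ER / IP.
    ranked = sorted(pitchers, key=lambda p: 9 * p[1] / p[0])
    target_ip = keep_share * sum(ip for ip, _ in pitchers)

    kept_ip = 0.0
    kept_er = 0.0
    for ip, er in ranked:
        if kept_ip >= target_ip:
            break
        take = min(ip, target_ip - kept_ip)
        kept_ip += take
        kept_er += er * take / ip  # prorate earned runs for a partial pitcher
    return 9 * kept_er / kept_ip


# Hypothetical league of five pitchers, 200 IP each (not real data).
sample = [(200, 60), (200, 80), (200, 100), (200, 120), (200, 140)]
league_era = top_ip_share_era(sample, keep_share=1.0)
trimmed_era = top_ip_share_era(sample, keep_share=0.85)
print(f"league ERA {league_era:.2f}, best-85% ERA {trimmed_era:.2f}")
print(f"reduction: {1 - trimmed_era / league_era:.1%}")
```

With real data, `sample` would hold one tuple per pitcher in a given league and season, and the printed reduction is the figure that comes out at about 8.7% in the article.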
With lower ERAs, pitchers will allow fewer hits, HRs and walks. But how many fewer? One way I estimated this was to run a regression with ERA as the dependent variable and HRs, non-HR hits and walks (all per 9 IP) as the independent variables. I used pitchers who had 100+ IP in a given year. The coefficients in that equation were used to see how much hits, HRs and walks would have to fall to match the improved pitching quality. To see how this works, let’s look at the regression equation for the 1924 AL:
ERA = -3.46 + 1.76*HR + .60*NONHR + .36*BB
I plugged in the average values for these three from the 1924 AL of .33, 9.75 and 3.41. That gives an ERA of 4.21. The actual ERA was 4.23. If the values for HRs, non-HR hits and walks are each reduced 4.8% (so that they are now .31, 9.28 and 3.25), and the new values are plugged into the equation, the ERA becomes 3.85. This was the ERA in the 1924 AL for the best 85% of the pitchers. As mentioned earlier, I looked at 12 seasons, and the average reduction for those 12 seasons was 5%. I also tried to estimate the drop in HRs, non-HR hits and walks using the linear weights system developed by Pete Palmer. The results were similar, with HRs, non-HR hits and walks falling about 5%.
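The plug-in calculation for the 1924 AL can be reproduced with a short script. The coefficients and league averages are the ones quoted above; the function and variable names are made up for this sketch. Because the published coefficients are rounded to two decimals, the script lands at roughly 4.20 and 3.83 rather than exactly 4.21 and 3.85.

```python
# Rounded 1924 AL regression coefficients as quoted in the text.
INTERCEPT = -3.46
COEF_HR = 1.76      # home runs per 9 IP
COEF_NONHR = 0.60   # non-HR hits per 9 IP
COEF_BB = 0.36      # walks per 9 IP


def predicted_era(hr, nonhr, bb):
    """ERA implied by the regression equation for the given per-9-IP rates."""
    return INTERCEPT + COEF_HR * hr + COEF_NONHR * nonhr + COEF_BB * bb


# 1924 AL averages per 9 IP, as quoted in the text.
league = {"hr": 0.33, "nonhr": 9.75, "bb": 3.41}

# Reduce all three rates by the same 4.8%.
cut = 0.048
reduced = {name: rate * (1 - cut) for name, rate in league.items()}

era_before = predicted_era(**league)
era_after = predicted_era(**reduced)
print(f"predicted ERA at league rates: {era_before:.2f}")
print(f"predicted ERA after a {cut:.1%} cut: {era_after:.2f}")
```

Changing `cut` until `era_after` matches the ERA of the best 85% of pitchers is one way to find the reduction for any other season.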
So if Ruth’s HRs fall 5%, he would have hit 678 HRs. I also took the best 85% of the pitchers in the AL in each season from 1920-34 and checked how much lower than average their HRs, non-HR hits and walks were. The table below has these values, expressed as fractions (0.050 = 5%), followed by the average for the whole period. The average for HRs is also about 5% less. That is, the number of HRs would fall 5% with integration.
Year | HR    | BB    | NONHR
1920 | 0.047 | 0.062 | 0.028
1921 | 0.006 | 0.091 | 0.025
1922 | 0.075 | 0.072 | 0.022
1923 | 0.052 | 0.055 | 0.029
1924 | 0.036 | 0.062 | 0.031
1925 | 0.068 | 0.035 | 0.033
1926 | 0.094 | 0.042 | 0.019
1927 | 0.030 | 0.033 | 0.027
1928 | 0.003 | 0.054 | 0.033
1929 | 0.066 | 0.072 | 0.017
1930 | 0.080 | 0.062 | 0.032
1931 | 0.022 | 0.044 | 0.027
1932 | 0.070 | 0.068 | 0.021
1933 | 0.037 | 0.037 | 0.034
1934 | 0.057 | 0.030 | 0.032
AVG  | 0.050 | 0.055 | 0.027
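As a check on the AVG row and the 678 figure, the table values can be averaged and applied to Ruth’s total. The yearly numbers below are copied from the table. The career total of 714 is not stated in the article; it is Ruth’s well-known career figure and is consistent with 678 being 5% less. The variable names are illustrative only.

```python
# Yearly fractions below league average for the best 85% of AL pitchers,
# copied from the table: year -> (HR, BB, NONHR).
reductions = {
    1920: (0.047, 0.062, 0.028), 1921: (0.006, 0.091, 0.025),
    1922: (0.075, 0.072, 0.022), 1923: (0.052, 0.055, 0.029),
    1924: (0.036, 0.062, 0.031), 1925: (0.068, 0.035, 0.033),
    1926: (0.094, 0.042, 0.019), 1927: (0.030, 0.033, 0.027),
    1928: (0.003, 0.054, 0.033), 1929: (0.066, 0.072, 0.017),
    1930: (0.080, 0.062, 0.032), 1931: (0.022, 0.044, 0.027),
    1932: (0.070, 0.068, 0.021), 1933: (0.037, 0.037, 0.034),
    1934: (0.057, 0.030, 0.032),
}

n_years = len(reductions)
avg_hr = sum(row[0] for row in reductions.values()) / n_years
avg_bb = sum(row[1] for row in reductions.values()) / n_years
avg_nonhr = sum(row[2] for row in reductions.values()) / n_years
print(f"AVG  HR {avg_hr:.3f}  BB {avg_bb:.3f}  NONHR {avg_nonhr:.3f}")

# Assumption: Ruth's career total of 714 HRs (not given in the article).
RUTH_CAREER_HR = 714
adjusted_flat = RUTH_CAREER_HR * (1 - 0.05)       # the article's rounded 5%
adjusted_exact = RUTH_CAREER_HR * (1 - avg_hr)    # unrounded table average
print(f"with a flat 5% cut: {adjusted_flat:.1f}")
print(f"with the unrounded average: {adjusted_exact:.1f}")
```

The flat 5% cut gives about 678.3, matching the 678 in the text. The unrounded table average (about 4.95%) gives about 678.6, so the conclusion does not depend on the rounding.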