Statistical & Scientific Thinking
This is a post about a relatively obscure physicist, Walter A. Shewhart, and how his thinking can be used to make
predictions in baseball.
Heraclitus (around 500 BC) said that all is constantly in flux. The quote attributed to him (which is not exactly what he
said) is, “You cannot step into the same stream twice.”
The idea here, which goes back almost as far in Eastern philosophy, is that all phenomena are dynamic, that is, in a
constant state of change. I’m going to call that variation. Variation is everywhere, all the time.
We see it all the time in baseball players’ performance. The same is true in golf: the same golfer playing the same
course repeatedly does not get the same score every time. There is a distribution of scores, which is, in part, a
reflection of that inherent variability.
In the early part of the 20th century mass production arrived. Interchangeable parts had already been developed (thanks
in part to Eli Whitney’s muskets) and the second wave of the industrial revolution began. In the 1920s, Shewhart, a
physicist, was working for Bell Labs, and his job was to develop means to assure the quality of manufactured telephones
so the advertising slogan “As alike as two telephones” would have verifiable meaning.
He began studying the manufacturing process and naturally detected the variation described above.
Manufacturers could not produce identical parts no matter how hard they tried. But he noticed something else as well.
From time to time something happened that was outside the natural operation of the process and gave
rise to statistical signals. Some people in statistics call them outliers; he called them “Assignable Causes” because he
found that, on investigation, an engineer could usually assign a cause to the fluctuation.
So, in fact the variation in a process can be thought to come from two sources. First there is the inherent variation that
exists in the process (and all things) and then there are ‘special’ events from time to time. These have become known
as Common Cause Variation (common to every occurrence of the process) and Special Cause Variation (a special
event, or change).
He developed a tool for economically distinguishing the two.
So that is who Shewhart was, and that tool, the Statistical Control Chart, is one of his contributions to mankind.
What does this have to do with baseball?
The uses of statistics in baseball are many. Bets are settled. Who was the last batter to hit .400 or more for a single
season? (Ted Williams.) Who was the last batter to win the Triple Crown? (Carl Yastrzemski.)
Baseball statistics are often used to compare players. Who was the best pitcher of all time? Who played more doubleheader
For teams, though, statistics are used for performance measurement, and the primary purpose is prediction. One of
Shewhart’s first principles is that a process (or a player skill, say batting) that has special causes is not
predictable. A process that has no (or very few) special causes is predictable, but only within limits. His statistical
control chart is the tool for deciding if there are special causes and, if not, what those limits are.
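To make "predictable, but only within limits" a little more concrete, here is a minimal sketch of one common form of the chart, the individuals (XmR) chart. It is an illustration only, not necessarily the chart Shewhart or any team would pick; the function names and the measurements are made up for this example. The limits are the average plus or minus 2.66 times the average moving range, which is the usual constant for this chart type.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower, center, upper) natural process limits for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 = 3 / 1.128, the usual scaling constant for XmR charts.
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def special_cause_points(values):
    """Return the indexes of observations that fall outside the limits."""
    lower, _, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical measurements, invented for illustration.
stable = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 10.0, 9.9, 10.2, 10.0]
# The same series with one unusual event inserted in the middle.
with_event = stable[:6] + [13.5] + stable[6:]

print(special_cause_points(stable))      # common cause variation only
print(special_cause_points(with_event))  # flags the inserted event
```

Points inside the limits are treated as common cause variation and left alone. A point outside the limits is a signal worth investigating.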
I won’t get into the mechanics of the chart. But say a player, Joe Smith, has played for ten years and one wants to
look at his batting average over those ten years. One could construct a statistical model to do so. Then, if there are
special causes (as signaled statistically), they can be investigated to see what happened that was different from the
usual variation. If there are none, the limits of his performance, plus or minus some amount, can be predicted. Of
course, the usual caveats apply: if he changes teams or is injured, the predictive ability is lost, because these can be
considered special causes.
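As a rough sketch of what such a model might look like for Joe Smith, the code below builds a p-chart, a control chart for proportions, from his hits and at-bats. The p-chart is my assumption about a reasonable choice here, not the only option, and it treats every at-bat as an independent trial, which is a simplification. The seasons are invented numbers and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical (hits, at_bats) for the fictional Joe Smith's ten seasons.
# These numbers are invented for illustration only.
SEASONS = [
    (152, 540), (160, 560), (145, 530), (171, 580), (158, 550),
    (149, 545), (166, 570), (155, 535), (162, 565), (150, 525),
]


def limits(p_bar, at_bats):
    """Three-sigma binomial limits for a season with the given at-bats."""
    sigma = sqrt(p_bar * (1 - p_bar) / at_bats)
    return p_bar - 3 * sigma, p_bar + 3 * sigma


def p_chart(seasons):
    """Return the overall average and rows of (year, average, lower, upper, signal)."""
    p_bar = sum(h for h, _ in seasons) / sum(ab for _, ab in seasons)
    rows = []
    for year, (hits, at_bats) in enumerate(seasons, start=1):
        lower, upper = limits(p_bar, at_bats)
        average = hits / at_bats
        # A season outside the limits is a statistical signal (special cause).
        rows.append((year, average, lower, upper, not lower <= average <= upper))
    return p_bar, rows


p_bar, rows = p_chart(SEASONS)
for year, average, lower, upper, signal in rows:
    flag = "  <- investigate" if signal else ""
    print(f"season {year:2d}: {average:.3f}  limits {lower:.3f} to {upper:.3f}{flag}")

if not any(signal for *_, signal in rows):
    low, high = limits(p_bar, 550)  # assumes about 550 at-bats next season
    print(f"no signals; next season expected between {low:.3f} and {high:.3f}")
```

With these made-up seasons nothing falls outside the limits, so the final line prints a predicted range. Replacing one season with a much worse one, say an injury year, pushes it below the lower limit and flags it for investigation.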
Another way this theory applies is with regard to training. At some point a player’s performance level at a given skill
comes into statistical control (that is, over time the performance is more or less randomly varying within those limits).
When that happens, further training using the same method will not improve it. And even if some new method is employed,
performance is extremely difficult to change. Dwight Evans did become a better hitter when he adopted a Charlie Lau-type
batting style (via Red Sox batting coach Walt Hriniak), but it took him two or three seasons to make the change successfully.
It is a non-traditional way to look at performance, and it has proven over time (the control chart was first developed in
1924) to be a very effective one.