Posts Tagged ‘stats’

Old Guys; New Stats

December 21, 2012

The proliferation of new statistics in the last few years has been a mixed blessing. Some of them are pretty good, others not so much. In studying 19th Century baseball I’ve used both the traditional stats (ERA, BA, hits, runs, etc.) and the newer stats (OPS+, ERA+, WAR, etc.) to look at the players. The newer stats present something of a conundrum.

Below I’ve listed the OPS+ of two players. The numbers cover five consecutive seasons at the peak of each man’s career (all stats below are from Baseball-Reference.com):

player 1: 186/175/184/147/142

player 2: 211/207/143/176/235

Now the WAR for a five consecutive year period during the career peak for two players:

player 1: 7.9/6.8/8.6/5.7/4.8

player 2: 4.7/5.2/2.5/5.8/6.1

Next the ERA+ of two pitchers, again for a five consecutive year period during their peak years:

pitcher 1: 155/149/185/217/160

pitcher 2: 167/143/135/115/129

Finally the WAR for two pitchers over a five consecutive year period at their peak:

pitcher 1: 7.9/6.8/8.6/5.7/4.8

pitcher 2: 12.3/10.2/11.3/13.4/14.0

First, the obvious question, “who are these guys?” The first player in both OPS+ and WAR is Joe DiMaggio in the years 1939-42 and 1946 (Joltin’ Joe lost three years, 1943-45, to World War II). The second player in both stats is Ross Barnes in the years 1872-76. And here a caveat. I realize that Barnes is in the National Association in 1872-75 and the National League only in 1876, but as his stats are available I’m going to use them. The first pitcher in both ERA+ and WAR is Lefty Grove in 1928-1932 and the second pitcher for both stats is Tommy Bond in 1875-1879 (and, again, Bond is in the NA in 1875).

Notice a few things? First, the two hitters are pretty comparable, aren’t they? According to OPS+ Barnes is better than DiMaggio four times, and in WAR is better twice. In fact, other than Barnes’ third number in both lists, they are pretty much a wash. And somehow we all know that’s just wrong. Does anyone seriously consider Ross Barnes as good as Joe DiMaggio, even if for only a five-year period? I doubt it.
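For anyone who wants to check the season-by-season count rather than eyeball it, here is a minimal sketch in Python. The numbers are copied from the lists above; the function name `seasons_ahead` is just something made up for the illustration.

```python
# Five-season peak figures as listed in this post.
dimaggio = {"OPS+": [186, 175, 184, 147, 142], "WAR": [7.9, 6.8, 8.6, 5.7, 4.8]}
barnes = {"OPS+": [211, 207, 143, 176, 235], "WAR": [4.7, 5.2, 2.5, 5.8, 6.1]}


def seasons_ahead(a, b):
    """Count the seasons, matched by position in the list, where a beats b."""
    return sum(x > y for x, y in zip(a, b))


for stat in ("OPS+", "WAR"):
    wins = seasons_ahead(barnes[stat], dimaggio[stat])
    print(f"{stat}: Barnes ahead in {wins} of {len(barnes[stat])} seasons")
```

Keep in mind this matches the first year of one man’s peak against the first year of the other’s, which is the same rough comparison made in the paragraph above, not a claim that the seasons are truly equivalent.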

Now take a look at the pitchers. The two men are roughly comparable for the first two years of ERA+, then Grove really takes off. In WAR, Bond is consistently better. Really? Would you truly want Tommy Bond over Lefty Grove? Again, I doubt it.

So what’s going on here? Surely a number of things. First, the 19th Century players appear in far fewer games, and anybody can get hot for a few games. Look up Bob Hazle in 1957 if you don’t believe me. Second, the way pitchers are used in the 19th Century, especially early, is so utterly different that it blows statistics completely out of kilter. Sticking with Grove and Bond, if you look at one single stat, batters faced, you see the problem immediately. In his career, the most batters Grove faced in any season was 1191 in 1930. Bond, on the other hand, faced 1408 as his low in 1875 (his high was 2189 in 1879). Think that fact alone doesn’t skew the stats? In the immortal words of Sarah Palin, “you betcha.” (My God, I’m quoting Sarah Palin. Yutz.)
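To put the workload gap in numbers, here is a quick sketch in Python using only the batters-faced figures quoted above. The helper name `workload_ratio` is invented for the example.

```python
# Batters-faced figures quoted in this post.
grove_high = 1191  # Grove's busiest season, 1930
bond_low = 1408    # Bond's low, 1875
bond_high = 2189   # Bond's high, 1879


def workload_ratio(a, b):
    """Return how many times as many batters a faced compared with b."""
    return a / b


print(f"Bond's low vs. Grove's high:  {workload_ratio(bond_low, grove_high):.2f}x")
print(f"Bond's high vs. Grove's high: {workload_ratio(bond_high, grove_high):.2f}x")
```

By these figures Bond’s busiest year involved more than 1.8 times as many batters as Grove’s busiest, and even Bond’s lightest year tops it.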

And these two things alone make it imperative that the newer stats be used carefully when looking at 19th Century players. I’m not suggesting they be ignored. What I am suggesting is that a slavish devotion to any of the stats is a mistake, particularly in the world of 19th Century baseball, where even the word, base ball, is different.

Alas Poor Stats

January 8, 2010

In an earlier post I commented on the decade. One thing I left out on purpose was the proliferation of statistics. My God, have they exploded into the public view. Most of them just say the same thing in different ways and many lead to the same conclusions. The problem with all of them is that they are flawed.

Pick a stat, any stat, and it’s flawed. Sometimes the flaw is obvious. I love runs produced (R + RBI – HR = RP). It gives you an immediate look at just how many runs a particular player gives his team. The flaw? Well, there are a couple. First, there is no context for the run. A run is a run is a run is simply not true. A run in 1965 is different than a run in 1995. Second, it leaves out contributions to runs that don’t actually produce an RBI or the run itself. A player leads off an inning with a double and is bunted to third (one out). The next man hits a sacrifice fly (a run to the man on third, an RBI to the batter, two outs), then the next batter fans (third out, end of inning). So you have a run produced by the first and third batters, but what about the guy with the bunt? If he doesn’t get it down, the fly only puts the man on third. The last batter strikes out and no run scores. Again, what about the guy with the bunt? He gets no run produced, yet his action is critical.
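The bunt problem shows up as soon as you score that inning. Here is a minimal sketch in Python; the formula is the one given above, while the function name and the batter labels are made up for the illustration.

```python
def runs_produced(runs, rbi, home_runs):
    """Runs produced: R + RBI - HR.

    A home run credits the hitter with both a run and an RBI for the
    same score, so it is subtracted once to avoid counting it twice.
    """
    return runs + rbi - home_runs


# The inning from the example: (runs, RBI, home runs) for each batter.
inning = {
    "leadoff double, later scores": (1, 0, 0),
    "sacrifice bunt": (0, 0, 0),
    "sacrifice fly": (0, 1, 0),
    "strikeout": (0, 0, 0),
}

for batter, line in inning.items():
    print(f"{batter}: RP = {runs_produced(*line)}")
```

The bunter comes out with exactly the same runs produced as the man who struck out, which is the flaw in a nutshell.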

Some are more subtle. Take a look at WHIP. Nice stat, but knowing how many walks and hits a pitcher gives up isn’t the same thing as knowing how many runs he gives up. A single that doesn’t score counts as 1 in the WHIP, and so does a home run. Different result, but same stat effect.
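Here is a small sketch in Python of that blind spot. WHIP is walks plus hits divided by innings pitched; the two one-inning outings below are invented purely for the example.

```python
def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings_pitched


# Two made-up one-inning outings that differ only in what the hit was.
stranded_single = {"walks": 0, "hits": 1, "innings": 1.0, "runs": 0}
solo_home_run = {"walks": 0, "hits": 1, "innings": 1.0, "runs": 1}

for name, outing in (("stranded single", stranded_single),
                     ("solo home run", solo_home_run)):
    value = whip(outing["walks"], outing["hits"], outing["innings"])
    print(f"{name}: WHIP = {value:.2f}, runs allowed = {outing['runs']}")
```

Both outings come out with the same WHIP even though one of them put a run on the board.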

Some are just silly. There’s an old book called Super Stats that ends up with Gene Tenace as the greatest catcher ever. Oh, really?

Some come up with odd choices. A guy came up with WAR and tells me Bret Saberhagen was better than Sandy Koufax (not on Saberhagen’s best day and Koufax’s worst, maybe). I saw them both pitch and I know which was better.

So enjoy the new stats. Some are fun, some are silly, most are redundant. Just do me a favor: don’t take them too seriously, and don’t bet the farm on any one of them.