Saturday, January 10, 2009

Pitching, Luck, and the Nationals

You often hear pitchers referred to as "hard luck." Usually, that's in reference to the level of run support a pitcher gets and his won-loss record. I'm not going to get into W-L as a pitching stat, but suffice it to say that I think it's an almost totally useless measure of pitcher value, since so much of what goes into it is about overall team performance.

But in fact luck is a huge factor in the results any pitcher produces. And I don't mean W-L--I mean ERA, WHIP, and other stats that fans often think of as reliable measures of pitcher skill.

If an unusually large number of bloopers fall in for hits, or if an unusually large number of flyballs drift out, or if the team defense is bad, or if the bullpen allows an unusually high number of inherited runners to score... these are all factors outside a pitcher's control that can affect a pitcher's ERA. And the effects aren't just incremental differences--relatively common variations from one season to the next can raise or drop a pitcher's ERA by a run or more.

The "luck stats" that I look at most closely are:

1. Home runs per flyball (HR/FB%): Research shows that pitchers cannot consistently control the percentage of flyballs that become home runs. This one takes a little bit for people to accept, but it's true. Any pitcher with a rate much above or below 11% is due for a correction.

2. Strand rate (LOB%): The percentage of hitters who reach base that are stranded. A typical LOB% is around 70%, and although a better pitcher can sustain a better LOB rate than that (a great pitcher could consistently reach 74-75%), if you have a pitcher whose LOB rate is much out of step with the rest of his rates, look for regression.

3. BABIP: batting average on balls in play off the pitcher. Research shows that pitchers have very little control over the rate at which a ball put in play is converted into an out. Groundballs are more often hits than flyballs are, though groundballs never become homers, so groundball pitchers are better, all other things equal. BABIP is mainly a function of defense and raw luck. A typical BABIP is around .290, though again groundball pitchers may be a little higher and flyball pitchers a little lower.
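If you want to compute these three rates yourself, here is a minimal Python sketch. The formulas are the commonly published definitions, not something spelled out in this post, and the function and parameter names are my own. The 1.4 weight on home runs in the strand rate is the usual FanGraphs convention, so treat it as an assumption. The sample stat line is made up for illustration.

```python
def hr_per_fb(hr, fb):
    """HR/FB: share of flyballs that left the park."""
    return hr / fb


def strand_rate(h, bb, hbp, r, hr):
    """LOB%: (H + BB + HBP - R) / (H + BB + HBP - 1.4 * HR).

    The 1.4 home run weight is an assumed convention, not from the post.
    """
    baserunners = h + bb + hbp
    return (baserunners - r) / (baserunners - 1.4 * hr)


def babip(h, hr, ab, k, sf):
    """BABIP: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - k - hr + sf)


# Hypothetical season line, invented purely for illustration.
print(round(hr_per_fb(hr=20, fb=200), 3))                      # 0.1
print(round(strand_rate(h=180, bb=60, hbp=5, r=90, hr=20), 3))  # 0.714
print(round(babip(h=180, hr=20, ab=700, k=150, sf=5), 3))       # 0.299
```

This made-up pitcher sits close to the norms quoted above on all three rates, so none of them would flag him as especially lucky or unlucky.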

There are a couple of good stats that attempt to measure a pitcher's true skill-based performance, with the luck factors removed. Fielding independent pitching (FIP) is the best known. It is based on walks, strikeouts, and home runs, but it assumes that HR/FB is a skill and that all pitchers are equally good at stranding runners. The Hardball Times' xFIP is an improvement over FIP because it substitutes a league-average HR/FB rate, eliminating that element of luck from the equation. The best (and most complicated) of all is tRA*. That stat (which I assume is pronounced "T-R-A-star," but I don't really know) uses rates for groundballs, flyballs, line drives, hit-by-pitches, walks, strikeouts, and homers (a pretty complete list of the events in a pitcher's control) to project runs allowed, then regresses them to the league average, which helps erase sample-size randomness. FIP and xFIP are scaled to look like ERA, while tRA* is scaled to RA (run average, including earned and unearned runs).
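To make the FIP versus xFIP difference concrete, here is a rough Python sketch. It uses the widely circulated form of the FIP formula. Including hit-by-pitches alongside walks and the 3.20 constant are assumptions on my part (the constant is normally reset each year so that league FIP matches league ERA). The 11% league HR/FB rate is the figure quoted earlier. tRA* is too involved to sketch here.

```python
LEAGUE_HR_PER_FB = 0.11  # the roughly 11% norm quoted earlier in the post
FIP_CONSTANT = 3.20      # assumed value; normally recalculated every season


def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """FIP from home runs, walks (plus HBP, by assumption), and strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, bb, hbp, k, ip, constant=FIP_CONSTANT,
         lg_hr_per_fb=LEAGUE_HR_PER_FB):
    """Same as FIP, but with home runs replaced by flyballs times league HR/FB."""
    expected_hr = fb * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical pitcher who allowed 14 homers on 200 flyballs (7% HR/FB).
actual = fip(hr=14, bb=60, hbp=5, k=150, ip=200)
neutral = xfip(fb=200, bb=60, hbp=5, k=150, ip=200)
print(round(actual, 2), round(neutral, 2))
```

Because this invented pitcher's HR/FB rate is below the league norm, his xFIP comes out higher than his FIP, which is exactly the kind of homer luck xFIP is meant to strip out.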

For a fuller explication of these points, check out this indispensable post from USS Mariner, which I still must admit taught me most of what I know about evaluating pitching.

So with all that context out of the way, let's look at the Washington Nationals pitchers and what these stats tell us. Whose performances in 2008 were most elevated (or depressed) by luck?

The chart below gives you a bunch of stats from 2008. It lists each pitcher's actual ERA, BABIP, HR/FB rate, strand rate, and tRA*. Then, I created a metric called "luck factor," which attempts to roughly quantify the degree to which luck influenced each pitcher's ERA. Luck factor is simply tRA* minus 0.40 (which is the typical difference between RA and ERA) minus actual ERA. A positive number means the pitcher's ERA was better than his tRA* says it should have been. (That 0.40 number is a pretty blunt instrument, so don't try to translate luck factor into anything too precise--this just gives you a relative sense of each pitcher's good or bad fortune.)
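For anyone who wants to check the arithmetic, here is a small Python sketch of the luck factor, run on three rows from the 2008 table below. The function and variable names are made up for illustration. The only inputs are ERA, tRA*, and the blunt 0.40 adjustment.

```python
RA_ERA_GAP = 0.40  # the rough RA-to-ERA difference used in the post


def luck_factor(tra_star, era, gap=RA_ERA_GAP):
    """tRA* minus the RA/ERA gap minus actual ERA, rounded to two decimals."""
    return round(tra_star - gap - era, 2)


# (ERA, tRA*) pairs copied from the 2008 table.
pitchers = {
    "Steven Shell": (2.16, 4.43),
    "John Lannan": (3.91, 4.74),
    "Shawn Hill": (5.83, 4.98),
}

# Compute the luck factor for each pitcher and list them luckiest first.
luck = {name: luck_factor(tra, era) for name, (era, tra) in pitchers.items()}
for name, value in sorted(luck.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<14} {value:+.2f}")
```

Positive values mean the ERA was lower than tRA* suggests it should have been (good luck), and negative values mean the opposite.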

Here are the Nationals pitchers from most to least lucky in 2008:

Pitcher         ERA   BABIP  HR/FB  LOB%   tRA*   Luck
Mike Hinckley   0.00  .213    0.0%  91.7%  4.44   4.04
Steven Shell    2.16  .225    8.3%  85.7%  4.43   1.87
Scott Olsen     4.20  .266   10.9%  71.6%  5.29   0.69
John Lannan     3.91  .273   15.2%  74.0%  4.74   0.43
Wil Ledezma     4.17  .297    5.8%  73.8%  4.82   0.25
Daniel Cabrera  5.25  .298   12.2%  72.4%  5.78   0.13
Saul Rivera     3.96  .336    4.5%  70.0%  4.09  -0.27
Joel Hanrahan   3.95  .306   11.3%  73.5%  4.06  -0.29
Garrett Mock    4.17  .322     10%  73.5%  4.12  -0.45
Collin Balester 5.51  .313   11.5%  66.9%  5.10  -0.81
Jason Bergmann  5.09  .301   11.8%  64.5%  4.61  -0.88
Shawn Hill      5.83  .373    7.7%  61.9%  4.98  -1.25
Shairon Martis  5.66  .269   19.2%  69.6%  4.45  -1.61
Marco Estrada   7.82  .336   26.7%  59.8%  4.57  -3.65

Some observations:
--We should expect Lannan and Olsen, who are tentatively sketched in as our #1 and #2 starters, to see a rise in ERA. In Lannan's case, that BABIP will certainly rise, probably by 30 or more points given his strong groundball rates (54.2% in 2008). Olsen's BABIP is even less sustainable (his career number is .300), and his strand rate will fall too. Of course, their good luck could continue, or they could offset the expected regression with genuinely better pitching, but we shouldn't count on it.
--We should write off Hinckley's and Estrada's ERAs on sample size. We just can't draw many, if any, conclusions about them from those numbers. Martis too.
--Hill, Bergmann, and Balester were all significantly victimized by tough luck. They were all hurt pretty badly by low strand rates, and Hill's BABIP is just silly.
--Steven Shell was about as lucky as a pitcher can be.


JAB said...

C'mon, aren't they all at least a little unlucky? They had Guzman at SS and a rotation of scrubs at first.

Hendo said...

Great post, especially since I've been thinking a lot about BABIP lately. (The average numbers I've been seeing on BP, by the way, have hovered in the .300-305 range for the last several years.)

Considering the way BABIP bounces around, I have a hard time with projections that don't just assume a BABIP of, say, .303 and go from there. Even Bill James varies it considerably, and Marcels also oscillate. Curious.

Harry Pavlidis said...

I had to steal this from you. Thanks.