Sabermetrics: A Science
In this series I will describe the relatively new phenomenon in baseball called sabermetrics. In the 2000s, sabermetrics has really started to gain popularity. Front-office decision makers and hardcore, modern fans have started to pay close attention to these stats.
Why aren't sabermetrics well known by casual fans? Well, old habits die hard. Everyone loves batting average and home runs and ERA. However, these simple stats can be influenced by teammates, the ballpark, and worst of all, luck. I am taking on the responsibility of teaching you about the improved statistical analysis called sabermetrics.
Installment III: FIP and xFIP
My most recent "Sabermetrics: A Science" installment was on OPS and OPS+, which are already fairly popular metrics. OPS+ in particular is very good for comparing players on an even playing field. Since the previous metrics I covered were batting statistics, I decided to go with a pitching one this time.
FIP and xFIP are somewhat similar to OPS and OPS+ in that they are easier than most metrics to calculate and they are already somewhat popular. OPS is a more accurate measure of a hitter's skill than OBP or SLG alone. FIP is a more accurate measure of a pitcher's skill than ERA.
Just over ten years ago, Voros McCracken helped revolutionize baseball statistics by declaring that, basically, once the ball is in play, pitchers no longer have control over what happens. This was quite revolutionary at the time, despite seeming so simple. McCracken explained that the rate at which balls in play fall for hits against a pitcher is inconsistent from season to season, which affects his ERA and his WHIP and makes them less accurate.
Stats like FIP are called defense-independent pitching statistics. There are other stats similar to this one. Clay Dreslough created a very simple formula for this kind of stat called DICE, or Defense-Independent Component ERA. However, the most widely used of these stats is FIP, or Fielding-Independent Pitching, which was developed by Tom Tango. Ultimately, DICE and FIP are very similar.
FIP and xFIP are easy to calculate. They definitely didn't take a 728-page book like win shares did. Tom Tango developed this stat on his own, although it took Voros McCracken a long time to work out his theory. Here is how you calculate FIP:
FIP = ( (13 * HR) + (3 * (BB + HBP - IBB) ) - (2 * K) ) / IP + Constant
Note: The constant is found by first calculating the league-wide FIP without the constant, then subtracting that number from the league-average ERA. The result is typically around 3.2, and I will use that value for the remainder of the article.
Now, I admit that this appears complicated and it is somewhat difficult to read in this format. Here it is written out in words step-by-step:
- 1.) Add the number of walks given up and the number of batters hit. (BB + HBP)
- 2.) Subtract the number of intentional walks from the sum in (1). (BB + HBP - IBB)
- 3.) Multiply the difference from (2) by 3. (3 * (BB + HBP - IBB) )
- 4.) Multiply the number of home runs by 13. (13 * HR)
- 5.) Add the product from (3) to the product from (4). ( (13 * HR) + (3 * (BB + HBP - IBB) ) )
- 6.) Multiply strikeouts by 2. (2 * K)
- 7.) Subtract the product of (6) from the sum of (5). ( (13 * HR) + (3 * (BB + HBP - IBB) ) - (2 * K) )
- 8.) Divide the difference from (7) by the number of innings pitched. ( (13 * HR) + (3 * (BB + HBP - IBB) ) - (2 * K) ) / IP
- 9.) Add the quotient from (8) to the constant. ( (13 * HR) + (3 * (BB + HBP - IBB) ) - (2 * K) ) / IP + Constant
Now, I admit that this seems pretty lengthy. So, to further simplify things, I will use an example. Say a pitcher had this stat line:
200 IP / 175 K / 20 HR / 60 BB / 5 HBP / 1 IBB
With that stat line, his FIP would be found by doing this:
1.) FIP = ( (13 * 20) + (3 * (60 + 5 - 1) ) - (2 * 175) ) / 200 + Constant
2.) FIP = ( 260 + (3 * 64) - 350 ) / 200 + Constant
3.) FIP = (260 + 192 - 350) / 200 + Constant
4.) FIP = (452 - 350) / 200 + Constant
5.) FIP = 102 / 200 + Constant
6.) FIP = 0.51 + Constant
7.) FIP = 3.71 (assuming the constant is 3.2)
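For anyone who would rather let a computer do the arithmetic, here is a minimal Python sketch of the formula and the stat line above. The names `PitchingLine`, `raw_fip`, `fip` and `fip_constant` are my own and purely illustrative, and the default constant of 3.2 is just the round figure used in this article, not an official value for any season.

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    """Counting stats needed for FIP (field names are illustrative)."""
    ip: float  # innings pitched
    k: int     # strikeouts
    hr: int    # home runs allowed
    bb: int    # walks, intentional ones included
    hbp: int   # hit batters
    ibb: int   # intentional walks


def raw_fip(line: PitchingLine) -> float:
    """FIP before the league constant is added (steps 1 through 8)."""
    free_passes = line.bb + line.hbp - line.ibb
    return (13 * line.hr + 3 * free_passes - 2 * line.k) / line.ip


def fip(line: PitchingLine, constant: float = 3.2) -> float:
    """Raw FIP plus the league constant (3.2 is this article's assumption)."""
    return raw_fip(line) + constant


def fip_constant(league_era: float, league_line: PitchingLine) -> float:
    """League-average ERA minus the league's raw FIP, as in the note on the constant."""
    return league_era - raw_fip(league_line)


example = PitchingLine(ip=200, k=175, hr=20, bb=60, hbp=5, ibb=1)
print(round(raw_fip(example), 2))  # 0.51
print(round(fip(example), 2))      # 3.71
```

One thing to watch: box scores write a third of an inning as .1 and two thirds as .2, so a figure like 200.1 would need to be converted to 200 1/3 before being passed in as `ip`.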
Now for xFIP. I will not go into as much detail on this one as I did on FIP. xFIP is really just a small refinement of FIP. As you can see above, FIP includes home runs, which it treats as directly the pitcher's fault. xFIP holds that even this assumption is too strong and that pitchers do not necessarily have much of an impact on how many of their fly balls leave the park. Therefore, xFIP replaces the pitcher's actual home run total with an expected one: the fly balls he allowed multiplied by the league-average home run to fly-ball ratio. It is generally considered slightly more accurate than FIP in its raw form.
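Here is a rough Python sketch of that substitution, assuming the rest of the formula stays exactly as in the FIP section (including the intentional walk adjustment). The function name `xfip`, the fly-ball totals and the 10% league home run to fly-ball rate are all made up for illustration. Real values have to come from a stats site.

```python
def xfip(ip, k, bb, hbp, ibb, fly_balls, league_hr_per_fb, constant=3.2):
    """xFIP sketch: FIP with expected home runs in place of actual ones."""
    # Expected home runs if this pitcher's fly balls left the park
    # at the league-average rate.
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + constant


# Same pitcher as the FIP example, with HYPOTHETICAL fly-ball totals
# and a HYPOTHETICAL league HR/FB rate of 10%.
average_luck = xfip(200, 175, 60, 5, 1, fly_balls=200, league_hr_per_fb=0.10)
more_fly_balls = xfip(200, 175, 60, 5, 1, fly_balls=240, league_hr_per_fb=0.10)

print(round(average_luck, 2))    # 3.71
print(round(more_fly_balls, 2))  # 3.97
```

With 200 fly balls the expected home run total is 20, the same as the pitcher's actual total, so xFIP and FIP agree. With 240 fly balls he "should" have given up 24, and xFIP comes out higher than his FIP.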
What it tells us:
McCracken's finding that pitchers have very little influence on balls in play influenced how many baseball executives evaluated pitchers. As we all know, ERA can fluctuate a lot. It is a terribly inconsistent indicator of how a player's future will turn out. Things can change, such as fielding, the ballpark and even just pure luck. For example, look at the inconsistency in Javier Vazquez' ERA throughout his career, starting in 1998:
'98 - 6.06
'99 - 5.00
'00 - 4.05
'01 - 3.42
'02 - 3.91
'03 - 3.24
'04 - 4.91
'05 - 4.42
'06 - 4.84
'07 - 3.74
'08 - 4.67
'09 - 2.87
There you have it, Javier's ERA year-by-year. As you can tell, it is very inconsistent. I purposely took these numbers out of context, leaving out his age, the team he played on and every other determining factor. How can you predict his ERA the next year? You can look at his '04 ERA and it would tell you to stay away from this guy, or you can look at his ERA from '09 and say that he is a Cy Young-type pitcher.
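To put a number on that inconsistency, here is a small Python sketch that measures how far his ERA moved from one season to the next. The ERAs are copied from the list above. The variable names are my own.

```python
# Javier Vazquez's ERA by season, copied from the list above.
era_by_year = {
    "'98": 6.06, "'99": 5.00, "'00": 4.05, "'01": 3.42,
    "'02": 3.91, "'03": 3.24, "'04": 4.91, "'05": 4.42,
    "'06": 4.84, "'07": 3.74, "'08": 4.67, "'09": 2.87,
}

years = list(era_by_year)  # dicts keep insertion order

# Absolute change in ERA between each pair of consecutive seasons.
swings = [
    (abs(era_by_year[later] - era_by_year[earlier]), earlier, later)
    for earlier, later in zip(years, years[1:])
]

biggest, start, end = max(swings)
average = sum(size for size, _, _ in swings) / len(swings)

print(f"Biggest swing: {biggest:.2f} runs ({start} to {end})")  # 1.80 ('08 to '09)
print(f"Average swing: {average:.2f} runs")                     # 0.93
```

An average move of almost a full run per season is a lot when the gap between a good ERA and a mediocre one is about that size.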
The point of FIP is to create a consistent number throughout the player's career. The main purpose of these two metrics is to make the prediction of the player's next season more accurate.
Let's take a look at the top 10 in ERA and their FIP and xFIP last season:
Pitcher - ERA (FIP) xFIP
1.) Zack Greinke - 2.16 (2.33) 3.15
2.) Chris Carpenter - 2.24 (2.78) 3.38
3.) Tim Lincecum - 2.48 (2.34) 2.87
4.) Felix Hernandez - 2.49 (3.09) 3.42
5.) Jair Jurrjens - 2.60 (3.68) 4.34
6.) Adam Wainwright - 2.63 (3.11) 3.36
T7.) Roy Halladay - 2.79 (3.06) 3.05
T7.) Clayton Kershaw - 2.79 (3.08) 3.90
9.) Javier Vazquez - 2.87 (2.77) 2.82
10.) Matt Cain - 2.89 (3.89) 4.22
- According to FIP, Zack Greinke was still the best of these pitchers in the league, but the difference between him and Lincecum is much, much smaller.
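- According to xFIP, Javier Vazquez was the best of these ten and the best in all of baseball, just ahead of Lincecum. His FIP and xFIP are nearly identical (2.77 and 2.82), so his home run total was about what his fly balls would predict. Most of the others gave up fewer home runs than expected, which is why their xFIP is so much higher than their FIP.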
- When looking at FIP, Jair Jurrjens was the luckiest pitcher out of these ten, which makes sense when you compare his K/9, BB/9, etc. from the two previous seasons. The unluckiest was Lincecum.
- When looking at xFIP, the luckiest was still Jair Jurrjens. Only Vazquez had an xFIP below his ERA, and just barely (2.82 to 2.87), so almost no one was unlucky.
- Overall, Jurrjens was luckiest while Lincecum was the least lucky.
- Obviously, pitchers do have an influence on some of the balls in play. Some balls are hit right back to the pitcher, who has to make a play, and sometimes it is the pitcher's fault when a ball falls for a hit rather than becoming an out. This equation assumes that a pitcher has absolutely no control over a ball in play. Unfortunately, this assumption is wrong.
- This is very important. FIP and xFIP do a better job of predicting the future than telling how a pitcher did at that time. Therefore, Jair Jurrjens did not have a bad season last year. He did very well.
- FIP does not account for the actual runs scored. You cannot look at a FIP and say that the pitcher gave up that many runs. That just isn't how it works.
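The luck comparisons in the bullets above come down to subtracting each pitcher's ERA from his FIP or xFIP. Here is a short Python sketch that does this with the numbers from the table. The helper name `luck_table` is my own, and reading a positive gap as "lucky" is this article's interpretation rather than a formal definition.

```python
# (name, ERA, FIP, xFIP) copied from the 2009 table above.
leaders = [
    ("Zack Greinke", 2.16, 2.33, 3.15),
    ("Chris Carpenter", 2.24, 2.78, 3.38),
    ("Tim Lincecum", 2.48, 2.34, 2.87),
    ("Felix Hernandez", 2.49, 3.09, 3.42),
    ("Jair Jurrjens", 2.60, 3.68, 4.34),
    ("Adam Wainwright", 2.63, 3.11, 3.36),
    ("Roy Halladay", 2.79, 3.06, 3.05),
    ("Clayton Kershaw", 2.79, 3.08, 3.90),
    ("Javier Vazquez", 2.87, 2.77, 2.82),
    ("Matt Cain", 2.89, 3.89, 4.22),
]


def luck_table(rows, estimator_index):
    """Return (estimator minus ERA, name) pairs, largest gap first.

    A positive gap means the ERA was better than the estimator suggests.
    """
    gaps = [(row[estimator_index] - row[1], row[0]) for row in rows]
    return sorted(gaps, reverse=True)


by_fip = luck_table(leaders, 2)
by_xfip = luck_table(leaders, 3)

for label, table in (("FIP", by_fip), ("xFIP", by_xfip)):
    (top_gap, top_name), (low_gap, low_name) = table[0], table[-1]
    print(f"{label}: largest gap {top_name} ({top_gap:+.2f}), "
          f"smallest gap {low_name} ({low_gap:+.2f})")
# FIP: largest gap Jair Jurrjens (+1.08), smallest gap Tim Lincecum (-0.14)
# xFIP: largest gap Jair Jurrjens (+1.74), smallest gap Javier Vazquez (-0.05)
```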
FIP is a very useful metric. It is not popular just yet, but it should become somewhat popular in years to come. If you have a fantasy team, pay attention to this stat! It is VERY useful in predicting a pitcher's next season. It is not always right, of course, but it is a much better indicator than ERA is.
I am definitely a fan of this stat. However, do not take it out of context. Throwing this out there on its own and saying it is the sole reason for pitcher A to be better than pitcher B is not even close to right. It has some big flaws and it is not very telling of how a pitcher performed that particular season.
- Saberlibrary is a very good site for information on sabermetrics. You can find its article on FIP here.
- FanGraphs is the source for my statistics on the 2009 ERA leaders and is a very good site if you are looking for advanced stats.
- The Waiver Wire has a good blog on both FIP and xFIP. It does a good job of comparing the two.