The Rubber, Week 6: Why BABIP Doesn’t Really Matter Until June
It’s commonly accepted that BABIP tends to regress toward the mean. I cite the stat in virtually every article I write, and almost any fantasy writer who subscribes to sabermetrics will use it in a lot of their pieces.
But I think there is a problem with using BABIP early in the season the same way we might in the preseason, when we’re working from either a full season’s worth of BABIP from previous years or even a career BABIP. What I mean is that using BABIP as a reason to “sell high” or “buy low” on a pitcher isn’t as useful in May, because it doesn’t take much for the regression to occur completely.
Let me give an example.
Anibal Sanchez had a .316 BABIP before his start last night. He faced 25 batters, allowed eight hits, gave up one homer and walked no one. Those eight hits pushed his BABIP up to .333. But if Anibal had given up three hits, his BABIP would be down to .289, which is basically league average, as the league BABIP for pitchers was .290 prior to Wednesday. We’re talking about five hits being the difference between a league-average BABIP and one that is 43 points above league average. And because the totals are still so small, his BABIP could easily be back around .290 before June.
I’m not saying Anibal is definitely going to positively regress. But I am saying that if you’re trading for a guy because of some potential positive BABIP regression, it’s only going to take a few hits over the next few starts to eat up that regression.
Glenn DuPaul is a college student who is currently interning for Baseball Info Solutions (aka the company that provides all the stats for Fangraphs and Baseball-Reference). I had the chance to have him as a guest on the podcast, and I could not have been more impressed by his knowledge of how the proverbial stats sausage is made. I’m willing to say that there is almost no one I respect more when it comes to baseball statistics.
I say all that to say this: Glenn mentioned on the podcast that almost anytime you see a guy outside the .270-.310 range in BABIP, you should expect regression. Obviously, there will always be a small number of exceptions, but regression outside that range should almost universally occur.
As we just discussed, we’re still at the point in the season where guys outside of that range can get back within it in just a start or two. How far does someone have to be out of that range at this point in the year for regression to take more than a few starts?
To answer that question, I created a spreadsheet containing all starters with 20+ innings prior to Wednesday’s games, along with the BABIP-relevant stats. Then I created a function to serve as my own little BABIP calculator so I could enter new numbers and quickly see how much a pitcher’s BABIP was affected.
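A calculator like this can be sketched in a few lines of Python. This is only an illustration, not the actual spreadsheet function: it assumes the standard BABIP formula, (H - HR) / (AB - K - HR + SF), and the `PitchingLine` class and every number in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    """Counting stats allowed by a pitcher (hypothetical container)."""
    hits: int
    home_runs: int
    at_bats: int
    strikeouts: int
    sac_flies: int = 0

    def balls_in_play(self) -> int:
        # At-bats that ended with a ball in the field of play, plus sac flies.
        return self.at_bats - self.strikeouts - self.home_runs + self.sac_flies

    def babip(self) -> float:
        bip = self.balls_in_play()
        # Home runs are not "in play", so they come out of the hit total too.
        return (self.hits - self.home_runs) / bip if bip else float("nan")

    def __add__(self, other: "PitchingLine") -> "PitchingLine":
        # Adding two lines gives the cumulative line (season + one more start).
        return PitchingLine(
            self.hits + other.hits,
            self.home_runs + other.home_runs,
            self.at_bats + other.at_bats,
            self.strikeouts + other.strikeouts,
            self.sac_flies + other.sac_flies,
        )


# Made-up season-to-date line and a made-up single start.
season = PitchingLine(hits=40, home_runs=4, at_bats=160, strikeouts=40, sac_flies=1)
start = PitchingLine(hits=8, home_runs=1, at_bats=25, strikeouts=5)

print(f"before: {season.babip():.3f}")
print(f"start:  {start.babip():.3f}")
print(f"after:  {(season + start).babip():.3f}")
```

Swapping in a different `start` line is the quick "what if he had allowed three hits instead of eight" check used throughout this piece.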
In this sample, the average BABIP was a touch lower at .287. I then calculated the standard deviation to try and target some guys whose BABIP is currently well outside that accepted .270-.310 range. This yielded a sample of 18 starters whose BABIPs are between one and two standard deviations below the mean and five pitchers whose BABIP is two or more standard deviations below the mean. As luck would have it, seven of them pitched yesterday, which provided us with a nice little case study. Below is a table showing their stats prior to their Wednesday starts, their stats in their Wednesday starts, and their stats after their Wednesday starts.
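The standard-deviation screen described above can be sketched roughly like this. The pitcher names and BABIPs are invented, `flag_outliers` is a hypothetical helper, and the choice of the sample standard deviation (rather than the population version) is an assumption, since either could have been used in the spreadsheet.

```python
from statistics import mean, stdev


def flag_outliers(babips: dict[str, float], threshold: float = 2.0) -> dict[str, float]:
    """Return {pitcher: z-score} for anyone at least `threshold` SDs from the mean."""
    values = list(babips.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (an assumption)
    z_scores = {name: (value - mu) / sigma for name, value in babips.items()}
    return {name: z for name, z in z_scores.items() if abs(z) >= threshold}


# Invented BABIPs for eight hypothetical starters.
sample = {
    "Pitcher A": 0.290, "Pitcher B": 0.285, "Pitcher C": 0.295, "Pitcher D": 0.280,
    "Pitcher E": 0.300, "Pitcher F": 0.288, "Pitcher G": 0.292, "Pitcher H": 0.210,
}

for name, z in flag_outliers(sample).items():
    print(f"{name}: {z:+.2f} standard deviations from the mean")
```

Lowering `threshold` to 1.0 would widen the net to the "between one and two standard deviations" group as well.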
The first thing that jumps out is that even though Holland had the second largest sample prior to Wednesday, one start was all it took for his BABIP to go from .240 right back into the accepted range. Holland’s .435 BABIP for the day was the highest of anyone in this sample.
Hefner has the smallest sample in the group, but seven non-homer hits on 20 balls in play caused his BABIP to spike by about 25 points. If he goes out next week and has the exact same line plus one more hit, his BABIP will be .274.
Minor’s BABIP luck continued Wednesday as his BABIP dropped to .221. Minor allowed four hits, but because one was a home run, only three counted ‘against’ his BABIP. Had he doubled his non-homer hit total to just six, his BABIP would have risen almost 20 points. And if Minor were to post Holland’s line from yesterday in his next start, his BABIP would jump 33 points up to .255. Even with a .222 BABIP after facing 174 batters, Minor is still just two, maybe three bad-luck starts away from the accepted range.
Of the guys below Minor on the list, Villanueva and Zimmermann own BABIPs higher than Minor’s after Wednesday’s starts.
That leaves Moore.
Moore still owns a BABIP that is more than two standard deviations below the mean despite the fact that his BABIP rose by 18 points yesterday. Even if Moore were to carry a .400 BABIP over his next four games, his BABIP for the season would still be a little less than .270.
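The "what if he runs a .400 BABIP for a while" question is just a weighted average over balls in play. Here is a rough sketch of that arithmetic; the function names are hypothetical and the inputs are made-up numbers for an imaginary pitcher, not Moore’s actual totals.

```python
def projected_babip(current_babip: float, current_bip: int,
                    future_babip: float, future_bip: int) -> float:
    """Blend a season-to-date BABIP with an assumed BABIP over future balls in play."""
    hits_so_far = current_babip * current_bip
    hits_to_come = future_babip * future_bip
    return (hits_so_far + hits_to_come) / (current_bip + future_bip)


def bip_needed(current_babip: float, current_bip: int,
               future_babip: float, target_babip: float) -> float:
    """Balls in play at `future_babip` needed to move the season mark to `target_babip`."""
    gap_to_target = target_babip - current_babip
    pull_per_ball = future_babip - target_babip
    if gap_to_target * pull_per_ball <= 0:
        raise ValueError("future BABIP never moves the season mark to the target")
    return current_bip * gap_to_target / pull_per_ball


# Imaginary pitcher: .190 BABIP on 130 balls in play, then four starts
# of roughly 18 balls in play each at a .400 BABIP.
print(f"{projected_babip(0.190, 130, 0.400, 72):.3f}")
print(f"{bip_needed(0.190, 130, 0.400, 0.270):.0f} balls in play to reach .270")
```

The further a pitcher sits from the accepted range and the more balls in play he has already banked, the more bad luck it takes to drag him back.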
Anyone who knows anything about stats will know that two standard deviations from the mean is the conventional cutoff for calling something significant. Through this exercise I’ve basically figured out that the only pitchers likely to still have a BABIP outside the normal range come June are those who currently have a BABIP that is two or more standard deviations from the mean. I certainly took the long way to get to a point that might have been fairly obvious to some from the get-go.
But I digress.
Below is a list of the starters who currently have a BABIP that is two or more standard deviations from the mean in either direction. If you’re thinking of buying low or selling high based on BABIP, these are the only guys you should do that with until June.
Allow me to offer a few thoughts on these names. First, Wood is an obvious sell-high candidate, but who is buying? As for Iwakuma, I’d probably only deal him if I could get a top-25-caliber starter in return. His strikeout and walk skills have been stellar, and I still think his rest-of-the-season numbers will be worth owning despite the regression.
I’m only trading Harvey for a top ten type guy. That list is as follows for me: Kershaw, Verlander, Wainwright, Felix, Strasburg, Yu, Lee, Hamels, Price, Sale and then maybe Bumgarner and Shields. That’s it.
Moore is the one I’d be the most aggressive about dealing. He’s walking 13.2% of batters faced so far, and he’s never going to be consistently good until the control is at least close to average. Gun to my head, I might take Iwakuma over Moore for this year.
On the buy-low list, just ignore Worley, Davis, and Doubront. Yes, they’re going to be better than they have been, but that doesn’t mean they’re going to be good. Worley can’t strike enough batters out (12.7% K%), and Davis can’t keep the walks at a reasonable rate. Doubront can be tempting with the big strikeout rate, but walks are a bigger problem for him than they are for Davis.
The only guy you should truly consider buying low on is McCarthy. Of course, he’s only owned in 17.7% of ESPN leagues, so we’re really talking about adding him off the waiver wire. Either way, he does have a ton of rebound potential.
He’s similar to Worley in that he has great control but doesn’t strike out too many guys. However, his skills are better than Worley’s. His strikeout rate is closer to average (14.5%), and he has the fifth-best walk rate in the league (3.2%). BABIP hasn’t been the only stat in which he has had bad luck. He also has a 60.4% strand rate, due in part to his HR/FB rate being 4.6% higher than his career rate. The skills appear to be largely the same, and he’s coming off 290 innings of 3.30 ERA work in the last two seasons.
The long and short of this piece is this: trade Matt Moore and pick up Brandon McCarthy. It took about 1,400 words just to get to that.
Big shout out to Bill Baer’s BABIP calculator. I couldn’t have done this article without it. Check it out here.