Hammering Away at the Derby Effect

The HR Derby is having trouble finding participants these days. Players and teams alike are removing themselves or their employees from consideration for fear of hurting their swings and/or their bodies. It's not too hard to cite a handful of players who did well in the Derby only to launch into a second-half slump (exhibit A: Josh Hamilton hit 56* first-half home runs in 2008, only to drag across the finish line with a paltry 11 second-half dingers), so it's unsurprising that players and teams take every precaution over something widely deemed meaningless.


Players who have been confirmed or rumoured to have turned down invitations to participate this year include:

Albert Pujols
Justin Morneau
Ryan Howard
Torii Hunter
Robinson Cano
Ichiro
Micah Owings
Mark McGwire
Barry Bonds
Bobby Thomson (initially agreed, but declined upon learning Ralph Branca would not be available to pitch his rounds)

Meanwhile, eventual participants Chris Young, Corey Hart, Nick Swisher, Vernon Wells, and Hanley Ramirez entered 2010 with collectively fewer home runs than Alex Rodriguez despite their 5500 PA head start.

This article isn't about whether or not the HR Derby sucks, however. After all, it could be worse. For example, it could be the Texas League HR Derby, which was thrown into controversy when participant Koby Clemens failed to homer once after reportedly being threatened with a beaning if he dared take his dad yard. Rather, this article is about whether there has really been any detectable Derby hangover effect holding hitters back.

Before I begin, I should mention that Derek Carty looked at this very issue last year at THT. He has also recently published a follow-up on the same site using a different method. The approach I take here is more similar to the second Carty approach (compare Derby participants to a control group) than to the first (compare Derby participants to their own pre-season projections), but it is worth noting that Carty found similar results using both methods.

Whereas Carty focused only on AB/HR in each half of the season, I have chosen instead to look at all-around hitting using wOBA and total PAs. My reasons for doing so are twofold:

-Derek Carty has already demonstrated, with mostly the same data I am using, that the Derby hangover has not manifested itself in worse HR frequencies than expected
-If players or teams are concerned about the Derby affecting a player's swing, the effect could show up as the player hitting just as many HR but suffering in other areas of hitting, which would be reflected in wOBA but not in AB/HR

For my study, I looked at the 80 participants in the HR Derby from 2000-2009 (some of those 80 participants are really just different seasons for the same player). For each player, I split his season into pre- and post-All-Star-Break and recorded his wOBA (by the way, the wOBA I am using here does not include SB/CS, only batter events) for each half. Here are the results:

1st Half       | 2nd Half       | Diff
wOBA   PA      | wOBA   PA      | wOBA   PA
0.411  29621   | 0.401  23182   | 0.010  6439
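The article does not spell out how a group line like this is pooled. A natural reading is that each hitter's wOBA is weighted by his PAs rather than averaged flat, so treat that weighting as an assumption. A minimal sketch with made-up player lines (none of these are real data):

```python
def pooled_woba(lines):
    """Return (PA-weighted wOBA, total PA) for a list of (wOBA, PA) pairs."""
    total_pa = sum(pa for _, pa in lines)
    if total_pa == 0:
        raise ValueError("group has no plate appearances")
    weighted = sum(woba * pa for woba, pa in lines)
    return weighted / total_pa, total_pa

# Hypothetical first- and second-half lines for three hitters.
first_half = [(0.420, 380), (0.405, 360), (0.398, 150)]
second_half = [(0.410, 300), (0.390, 280), (0.400, 120)]

w1, pa1 = pooled_woba(first_half)
w2, pa2 = pooled_woba(second_half)
print(f"1st half: {w1:.3f} in {pa1} PA")
print(f"2nd half: {w2:.3f} in {pa2} PA")
print(f"Diff:     {w1 - w2:.3f} wOBA, {pa1 - pa2} PA")
```

Weighting by PA keeps a hitter with 150 PAs from counting as much as one with 380, which is the same concern that comes up later when choosing a control group.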

Like Carty, I found a drop in performance from 1st half to second half for Derby participants, but not a very large one, and, as Carty points out in his work, we would expect to see a drop in performance from any group of players who performed that well in the first half. As for how much of a drop we should expect, or whether a drop of .010 points in wOBA is indicative of a hangover effect, well, that's what we need to look at our control group for.

Carty manually selected comps for each Derby participant for his control group in his second study. Rather than repeat his process, I used a simple rule to select my control group. I just ranked all non-Derby participants in each season by first-half wRAA and took the top 8 from each season. I sorted by wRAA rather than wOBA to make sure I was not taking players with a great wOBA in a small number of PAs (since they would not make good comps and would be expected to have significantly more regression in the second half than the Derby-participant group). I could have also set a minimum PA threshold and sorted by wOBA; either way accomplishes more-or-less the same thing.
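The selection rule is simple enough to sketch in a few lines. The function name, the row layout, and the sample numbers below are all hypothetical; only the rule itself (drop Derby participants, rank by first-half wRAA, keep the top N per season) comes from the text.

```python
from collections import defaultdict

def select_control(rows, derby, per_season=8):
    """Pick the top non-Derby hitters per season by first-half wRAA.

    rows  : iterable of (season, player, first_half_wraa)
    derby : set of (season, player) pairs that took part in the Derby
    """
    by_season = defaultdict(list)
    for season, player, wraa in rows:
        if (season, player) not in derby:
            by_season[season].append((wraa, player))
    control = {}
    for season, candidates in by_season.items():
        # Highest wRAA first; the sort is stable, so ties keep input order.
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        control[season] = [player for _, player in ranked[:per_season]]
    return control

# Made-up rows, with per_season=2 so the example stays small.
rows = [
    (2009, "A", 30.1), (2009, "B", 28.4), (2009, "C", 25.0), (2009, "D", 22.7),
    (2008, "E", 27.3), (2008, "F", 19.9), (2008, "G", 31.0),
]
derby = {(2009, "A"), (2008, "G")}
control = select_control(rows, derby, per_season=2)
print(control)
```

Because wRAA is a counting stat, a hitter with a huge wOBA in very few PAs will not rank highly, which is the reason given above for sorting on it.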

Ideally, we want our control group to be as close as possible to the Derby-participant group in the first half so that we can make a good comparison of their second half performances. Let's see how the two groups compare:


         1st Half       | 2nd Half       | Diff
         wOBA   PA      | wOBA   PA      | wOBA   PA
Derby    0.411  29621   | 0.401  23182   | 0.010  6439
Control  0.432  28774   | 0.400  21442   | 0.032  7332


Here, we see that our control group lost .032 points in wOBA, way more than the Derby participants lost. What's more, the control group lost more PAs in the second half, so not only are the Derby participants holding up their rate production better; they're also staying in the lineup more, which is important because of the commonly cited health concerns over Derby participation.

Before we get too excited over these results, we should consider that they could simply reflect an issue with the control group. After all, it doesn't really make sense that over 20-30 thousand PA samples, the control group should lose an extra .022 points in wOBA and about 11 extra PAs per hitter in the second half over the Derby participants. If our control group were properly selected and actually representative, this would suggest that the Derby could actually be helping hitters significantly in the second half, and there's no reason to believe that to be the case. So before we accept these results, let's consider what issues might exist with the control group.
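For anyone who wants to check the size of that gap, here is the arithmetic from the table totals. The one assumption is the group size: 8 hitters a season over 10 seasons, so 80 hitter-seasons per group.

```python
HITTERS = 80  # assumed: 8 hitters per season over the 10 seasons studied

# (1st-half wOBA, 1st-half PA, 2nd-half wOBA, 2nd-half PA), copied from the table
derby = (0.411, 29621, 0.401, 23182)
control = (0.432, 28774, 0.400, 21442)

def drop(group):
    """Return (wOBA lost, PA lost) from first half to second half."""
    w1, pa1, w2, pa2 = group
    return round(w1 - w2, 3), pa1 - pa2

derby_w, derby_pa = drop(derby)
control_w, control_pa = drop(control)

extra_woba = round(control_w - derby_w, 3)
extra_pa_per_hitter = (control_pa - derby_pa) / HITTERS

print(f"extra wOBA lost by control:  {extra_woba:.3f}")
print(f"extra PA lost per hitter:    {extra_pa_per_hitter:.1f}")
```

The PA figure is just the difference in total PAs lost (7332 against 6439) spread over the 80 hitters in each group.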

The first thing to notice is that we wanted both groups to come out as close as possible in the first half. However, the control group had a significantly higher first-half wOBA, as well as fewer first half PAs. This is by itself problematic. Remember that we expect any group that performs exceptionally in the first half to regress in the second half. The more exceptional the performance of the group in the first half, the more we'd expect them to regress in the second half. Additionally, the fewer PAs each player in the group has taken, the more we'd expect them to regress. An extra .021 points in first half wOBA and fewer PAs per player in the control group mean we would expect more regression in the second half than for the Derby group (which, in short, means this is not a good control group).
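To see why both gaps push in the same direction, here is a sketch using a generic shrink-toward-the-mean formula. This is not something the article computes, and the league mean and regression constant are placeholders rather than estimates, so only the ordering of the two results means anything.

```python
def expected_second_half(first_half_woba, pa, league_mean, k):
    """Shrink an observed wOBA toward the league mean.

    k is the number of PAs at which the observation and the league mean
    get equal weight. Both league_mean and k are placeholders here.
    """
    weight = pa / (pa + k)
    return league_mean + weight * (first_half_woba - league_mean)

LEAGUE_MEAN = 0.330  # hypothetical league wOBA, not taken from the article
K = 200              # hypothetical regression constant, not estimated from data

# Per-hitter first-half PAs from the article's totals, assuming 80 hitters per group.
derby_pa = 29621 / 80
control_pa = 28774 / 80

derby_drop = 0.411 - expected_second_half(0.411, derby_pa, LEAGUE_MEAN, K)
control_drop = 0.432 - expected_second_half(0.432, control_pa, LEAGUE_MEAN, K)

print(f"Derby-like group:   expected drop {derby_drop:.3f}")
print(f"Control-like group: expected drop {control_drop:.3f}")
```

Whatever placeholders are plugged in, a higher starting wOBA and fewer PAs both make the expected drop larger. A real estimate would regress each hitter toward his own talent level rather than a flat league mean, so the magnitudes here should not be compared to the tables.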

One possible way to address this is to select more players for the control group. If the top 8 non-Derby participants each year are collectively much better in the first half than the 8 Derby participants, we can select more hitters until the control group hits at about the same level as the Derby group. For example, while the top 8 hitters in the non-Derby group each year have hit at a .432 level, the top 20 might hit at close to a .411 level, which would make for a better control group. Unfortunately, that would still leave a likely problem.

As noted, even though the control group hit significantly better in the first half than the Derby group, they had fewer PAs, which is a bit unusual since we selected the control group based on the top performing hitters (who are generally given a lot of PAs). A possible reason for this presents an even bigger problem for our control group. With our Derby hitters, we know that they performed well in the first half, and that they were healthy at the All Star Break (at least healthy enough to participate). With our control group, we know that they performed well in the first half, but not that they were healthy at the All Star Break. We also know that, despite out-hitting the Derby participants as a group in the first half, they didn't participate in the Derby. There are many reasons hitters sit out the Derby, including pulling themselves out or being passed over for more well-known if less stellar-performing hitters, but one potential reason for a top-performing hitter to not be in the Derby is that he can't because he is already hurt. This was likely the case for a small number of hitters in the control group. It explains why the control group had fewer PAs despite performing better, as well as why they lost more PAs in the second half and why they regressed so much more.

Since we know none of the hitters in the Derby group were hurt as of the All Star Break, having any injured players (as of the All Star Break) in the control group will screw up the control group. Players who got hurt in the first half and still showed up in the top 8 in wRAA would have to have had a really good wOBA in the first half. That means when we look at the second half for their group, the group not only loses PAs because a player is already hurt going into the second half, they also lose a high-wOBA player, so even if everyone else regresses normally, the group as a whole will regress more than expected from losing one of its better hitters.
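A toy example makes the mechanism concrete. Every number below is invented: an eight-man group where the best hitter missed time in the first half, and where every hitter's wOBA falls by the same amount in the second half. The only thing that changes between the two scenarios is whether the star plays after the break.

```python
def pooled(lines):
    """PA-weighted wOBA for a list of (wOBA, PA) pairs."""
    total_pa = sum(pa for _, pa in lines)
    return sum(woba * pa for woba, pa in lines) / total_pa

REGRESS = 0.015  # hypothetical drop applied equally to every hitter

# Hypothetical first half: one star with fewer PAs, seven healthy regulars.
first = [(0.470, 250)] + [(0.425, 370)] * 7

# Second half if everyone plays, and if the star is out (0 PA).
healthy_second = [(w - REGRESS, 290) for w, _ in first]
injured_second = [(w - REGRESS, 0 if i == 0 else 290)
                  for i, (w, _) in enumerate(first)]

drop_healthy = pooled(first) - pooled(healthy_second)
drop_injured = pooled(first) - pooled(injured_second)
pa_lost_to_injury = (sum(pa for _, pa in healthy_second)
                     - sum(pa for _, pa in injured_second))

print(f"group drop, star healthy: {drop_healthy:.3f}")
print(f"group drop, star injured: {drop_injured:.3f}")
print(f"extra PA lost:            {pa_lost_to_injury}")
```

Nobody in the injured scenario hit any worse than in the healthy one, yet the group shows a bigger wOBA drop and fewer PAs, which is the pattern the first control group showed.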

What this means for our control group is that we need to ensure that we have the same restrictions on our control group as we have on our Derby group. Namely, we need to ensure they were healthy going into the All Star Break. This is simple enough to do; we can do it the exact same way we got that restriction on the Derby group in the first place. We'll simply select only from the pool of players who participated in the All Star Game but not in the Derby.

More specifically, I narrowed the pool of players for the control group to the ASG starters who did not participate in the HR Derby (starters just because that was the simplest way to ensure an All Star actually participated and was not just selected, and because, as we'll see, doing so gave me a pretty good match for the control group). I took the top 8 such players each year (again, sorted by first-half wRAA). Now, here is what the new control group looks like compared to the Derby group:



         1st Half       | 2nd Half       | Diff
         wOBA   PA      | wOBA   PA      | wOBA   PA
Derby    0.411  29621   | 0.401  23182   | 0.010  6439
Control  0.411  28768   | 0.395  22332   | 0.016  6436


Still fewer PAs, but an exact match on the wOBA, and we eliminated the issue of having already-injured players in the control group that was throwing off the comparison before.

With this group, we see that the loss in both wOBA and in PAs is pretty close for both groups. It's possible there are still some selection issues with the control group; for example, Derby participants over the last 10 years might tend to be more highly regarded hitters, so, even though they performed at the same level, they were given more PAs and had a slightly higher true talent level, which would cause them to regress a bit less. Still, I think the restrictions placed should take care of the most important problems, and this control group should be good enough to pick up an effect if there were one. The extra .006 points in wOBA the Derby group held over the control group seems like a pretty reasonable cushion to absorb any further problems with the control group. That gap could be explained by remaining minor selection issues, but with the injury problem controlled for, I think the control group is good enough to demonstrate a lack of any easily detectable effect.

Based on this control group, we see that not only have Derby participants not lost any home run frequency in the second half over what was expected, as Carty has shown; they also have not lost any overall hitting performance or total PAs. What that tells us is that, compared to other All Star participants who had equally good first halves at the plate, hitters who participate in the Derby haven't lost any more production or playing time over the past ten years. If there is any meaningful hangover effect, it is certainly far more subtle and less nefarious than it is often made out to be, and it's not reflected in a simple glance at the second half numbers.


*56 first half HR includes 35 unofficial HR from Derby itself