The Z Files: Does Defense Matter?

This article is part of our The Z Files series.

While analyzing defense has come a long way, there's still work to be done. Its relevance in the fantasy game pertains to evaluating pitchers, as it stands to reason their numbers should reflect the quality of their defensive support.

Whether it's a team's penchant for making the routine play, the outstanding play or simply better positioning, the best measure for our purposes is BABIP (batting average on balls in play). This is especially apropos for those generating pitching projections using component metrics. Each hurler's BABIP is usually regressed towards league mean for that type of pitcher (groundball, fly ball). Tempering towards team defense seems like a worthwhile endeavor.
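For readers who haven't built this step themselves, here is a minimal sketch of what regressing a pitcher's BABIP might look like. It is not the method used by any particular projection system. The `ballast` of 1,000 phantom balls in play and the 50/50 `team_weight` are hypothetical settings chosen only to make the arithmetic visible, and both function names are made up for illustration.

```python
def blended_target(league_babip, team_babip, team_weight=0.5):
    """Mix league and team BABIP into a single regression target.

    team_weight is a hypothetical knob: 0 ignores team defense entirely,
    1 regresses purely toward the team's BABIP.
    """
    if not 0.0 <= team_weight <= 1.0:
        raise ValueError("team_weight must be between 0 and 1")
    return (1.0 - team_weight) * league_babip + team_weight * team_babip


def regress_babip(hits_in_play, balls_in_play, target_babip, ballast=1000):
    """Shrink an observed BABIP toward target_babip.

    The shrinkage works by adding `ballast` phantom balls in play that
    fall for hits at the target rate. The ballast size is an assumption,
    not a researched stabilization point.
    """
    if hits_in_play < 0 or balls_in_play < 0 or ballast < 0:
        raise ValueError("counts and ballast must be non-negative")
    if balls_in_play + ballast == 0:
        raise ValueError("need at least one real or phantom ball in play")
    return (hits_in_play + ballast * target_babip) / (balls_in_play + ballast)


# Made-up pitcher: 150 hits on 450 balls in play (.333 observed),
# a .300 league mean for his pitcher type, and a .280 team BABIP.
target = blended_target(0.300, 0.280)          # halfway between the two
projected = regress_babip(150, 450, target)
print(round(projected, 3))                     # prints 0.303
```

The fewer balls in play a pitcher has, the more the phantom sample dominates and the closer his projection sits to the target. Whether that target should lean toward team defense at all is what the rest of this piece tries to answer.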

Unfortunately, there's one catch. For BABIP to be reliable, it needs to be predictable. Today, team BABIP from the past five seasons will be investigated to gauge just how predictable, hence useful, it is in projecting pitcher performance.

The method will look at the correlation between projected team BABIP and actual, both in total and broken into components. The statistics will be kept relatively simple, gauging the linear relationship between projected and actual using the Pearson correlation coefficient (r). By way of explanation, r=1 means there is a perfect direct relationship: actual is perfectly predicted by the past data. If r=0, there is no linear relationship between the two. When r=-1, there's a perfect inverse relationship. In this case, it would entail the worst BABIP becoming the best BABIP and vice versa.
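For those who want to follow along at home, the Pearson coefficient can be computed in a few lines. The sketch below uses the standard textbook formula. The three BABIP values are made up purely to show the two extremes described above, and the function name `pearson_r` is my own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cross / sqrt(ss_x * ss_y)


# Made-up projected BABIPs for three teams.
projected = [0.280, 0.300, 0.320]

# Actual matches projected exactly: a perfect direct relationship.
print(round(pearson_r(projected, projected), 3))

# Best becomes worst and vice versa: a perfect inverse relationship.
print(round(pearson_r(projected, projected[::-1]), 3))
```

The first call should come out at 1 and the second at -1, give or take floating-point rounding. Real team data lands somewhere in between.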

To get things started, here's a table with each team's overall BABIP for the last five seasons:

Team                    2017   2016   2015   2014   2013
Arizona Diamondbacks    0.292  0.297  0.323  0.297  0.316
Atlanta Braves          0.281  0.306  0.296  0.309  0.306
Baltimore Orioles       0.312  0.303  0.301  0.299  0.283
Boston Red Sox          0.295  0.304  0.295  0.307  0.303
Chicago Cubs            0.287  0.288  0.257  0.290  0.308
Chicago White Sox       0.293  0.282  0.300  0.314  0.309
Cincinnati Reds         0.303  0.300  0.294  0.301  0.281
Cleveland Indians       0.299  0.305  0.291  0.290  0.312
Colorado Rockies        0.302  0.309  0.321  0.323  0.311
Detroit Tigers          0.292  0.321  0.302  0.301  0.314
Houston Astros          0.284  0.302  0.308  0.287  0.302
Kansas City Royals      0.311  0.305  0.301  0.288  0.294
Los Angeles Angels      0.295  0.291  0.303  0.288  0.287
Los Angeles Dodgers     0.286  0.284  0.291  0.299  0.296
Miami Marlins           0.293  0.302  0.307  0.297  0.315
Milwaukee Brewers       0.282  0.301  0.303  0.307  0.293
Minnesota Twins         0.304  0.298  0.320  0.303  0.317
New York Mets           0.298  0.322  0.312  0.292  0.299
New York Yankees        0.302  0.282  0.294  0.303  0.301
Oakland Athletics       0.271  0.296  0.301  0.291  0.274
Philadelphia Phillies   0.306  0.307  0.307  0.319  0.300
Pittsburgh Pirates      0.299  0.310  0.311  0.305  0.295
San Diego Padres        0.308  0.301  0.298  0.307  0.294
San Francisco Giants    0.297  0.312  0.290  0.285  0.286
Seattle Mariners        0.296  0.285  0.294  0.300  0.277
St. Louis Cardinals     0.297  0.301  0.307  0.302  0.289
Tampa Bay Rays          0.279  0.284  0.300  0.286  0.288
Texas Rangers           0.301  0.289  0.293  0.296  0.311
Toronto Blue Jays       0.308  0.304  0.283  0.280  0.295
Washington Nationals    0.289  0.290  0.292  0.304  0.298

Three different tests will be run, determining how well the past three years, two years and previous campaign predicted the ensuing BABIP. The correlation coefficient is for the season following the years listed.

Three year

Year       r
2015-2017  0.23
2014-2016  0.25

Two year

Year       r
2016-2017  0.21
2015-2016  0.24
2014-2015  0.33

One year

Year       r
2016       0.26
2015       0.34
2014       0.36
2013       0.26
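The three tests above can be sketched in code. Two assumptions are baked in, since they aren't spelled out here: the multi-year projection is taken as a simple unweighted average of the prior seasons, and the toy teams and their BABIPs are invented for illustration, not pulled from the table. Seasons are numbered 1 through 4 instead of carrying real years.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard textbook formula)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


def window_projection(babip_by_season, target_season, window):
    """Project a season as the plain average of the `window` seasons before it.

    An unweighted average is an assumption; a weighted one would work too.
    """
    prior = range(target_season - window, target_season)
    return sum(babip_by_season[s] for s in prior) / window


def window_correlation(teams, target_season, window):
    """Correlate each team's windowed projection with its actual BABIP."""
    projected = [window_projection(b, target_season, window) for b in teams.values()]
    actual = [b[target_season] for b in teams.values()]
    return pearson_r(projected, actual)


# Invented teams and BABIPs, keyed by season number.
toy_teams = {
    "Team A": {1: 0.290, 2: 0.300, 3: 0.310, 4: 0.305},
    "Team B": {1: 0.300, 2: 0.296, 3: 0.292, 4: 0.290},
    "Team C": {1: 0.310, 2: 0.305, 3: 0.300, 4: 0.310},
    "Team D": {1: 0.285, 2: 0.310, 3: 0.295, 4: 0.299},
}

# Three-year, two-year and one-year tests, all predicting season 4.
for window in (3, 2, 1):
    r = window_correlation(toy_teams, target_season=4, window=window)
    print(f"{window}-year window: r = {r:.2f}")
```

With all 30 clubs and real seasons loaded into the same structure, the loop produces one coefficient per window, the same shape as the tables above.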

While none of the studies demonstrates significant correlation, it appears the previous year's defense best predicts what will occur the following season. That said, we've come a long way since Voros McCracken first introduced DIPS theory (Defense Independent Pitching Statistics). Most notably, BABIP differs by batted-ball type: it is highest on line drives, followed by grounders and then fly balls. Let's look at the predictability of each, using the same set of data as above.

GROUND BALLS

Team                    2017   2016   2015   2014   2013
Arizona Diamondbacks    0.216  0.236  0.251  0.236  0.257
Atlanta Braves          0.223  0.249  0.256  0.251  0.270
Baltimore Orioles       0.252  0.249  0.230  0.243  0.232
Boston Red Sox          0.263  0.254  0.242  0.237  0.248
Chicago Cubs            0.228  0.220  0.196  0.221  0.242
Chicago White Sox       0.243  0.222  0.258  0.262  0.264
Cincinnati Reds         0.247  0.244  0.246  0.239  0.235
Cleveland Indians       0.247  0.233  0.219  0.225  0.251
Colorado Rockies        0.226  0.249  0.251  0.247  0.251
Detroit Tigers          0.248  0.286  0.266  0.257  0.272
Houston Astros          0.228  0.236  0.236  0.219  0.241
Kansas City Royals      0.270  0.265  0.268  0.232  0.255
Los Angeles Angels      0.252  0.248  0.257  0.252  0.249
Los Angeles Dodgers     0.226  0.227  0.229  0.238  0.226
Miami Marlins           0.235  0.250  0.245  0.246  0.270
Milwaukee Brewers       0.232  0.228  0.238  0.234  0.240
Minnesota Twins         0.248  0.259  0.259  0.253  0.257
New York Mets           0.252  0.274  0.265  0.253  0.254
New York Yankees        0.259  0.237  0.245  0.243  0.251
Oakland Athletics       0.208  0.238  0.254  0.242  0.209
Philadelphia Phillies   0.254  0.263  0.242  0.278  0.242
Pittsburgh Pirates      0.257  0.235  0.235  0.219  0.223
San Diego Padres        0.250  0.265  0.260  0.248  0.236
San Francisco Giants    0.239  0.245  0.222  0.217  0.216
Seattle Mariners        0.243  0.250  0.244  0.241  0.232
St. Louis Cardinals     0.235  0.235  0.238  0.255  0.245
Tampa Bay Rays          0.226  0.249  0.279  0.260  0.270
Texas Rangers           0.246  0.228  0.236  0.226  0.275
Toronto Blue Jays       0.271  0.242  0.228  0.221  0.240
Washington Nationals    0.248  0.253  0.243  0.257  0.246

Three year

Year       r
2015-2017  0.15
2014-2016  0.53

Two year

Year       r
2016-2017  0.22
2015-2016  0.59
2014-2015  0.63

One year

Year       r
2016       0.38
2015       0.60
2014       0.61
2013       0.36

The ground ball data shows more correlation, to the point it's actionable. The correlation is still low, but in a couple of instances it exceeds 0.5. That said, this type of analysis assumes all other variables remain constant. With the number of teams employing shifts, and the extent to which they use them, still changing, the BABIP on grounders may not be as stable as is typically necessary to use it in a predictive manner.
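One way to put those ground ball coefficients in perspective is to square them. Under a simple linear model, r squared is the share of the variance in the following season's BABIP that the prior data accounts for. The snippet below just runs that arithmetic on the one-year ground ball values listed above, keyed by the year labels as printed. The function name is my own.

```python
def variance_explained(r):
    """Coefficient of determination (r squared) for a simple linear fit."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return r * r


# One-year ground ball correlations from the table above.
ground_ball_r = {"2016": 0.38, "2015": 0.60, "2014": 0.61, "2013": 0.36}

for year, r in ground_ball_r.items():
    print(f"{year}: r = {r:.2f}, r squared = {variance_explained(r):.2f}")
```

Even the best mark, 0.61, works out to roughly 37 percent of the variance, which leaves plenty of room for shifts, personnel changes and plain luck.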

FLY BALLS

Team                    2017   2016   2015   2014   2013
Arizona Diamondbacks    0.114  0.087  0.106  0.095  0.101
Atlanta Braves          0.107  0.100  0.088  0.098  0.076
Baltimore Orioles       0.132  0.098  0.109  0.099  0.084
Boston Red Sox          0.117  0.097  0.110  0.107  0.115
Chicago Cubs            0.099  0.087  0.067  0.116  0.085
Chicago White Sox       0.131  0.085  0.093  0.092  0.092
Cincinnati Reds         0.109  0.091  0.083  0.087  0.063
Cleveland Indians       0.107  0.090  0.113  0.092  0.113
Colorado Rockies        0.129  0.098  0.129  0.103  0.081
Detroit Tigers          0.113  0.086  0.074  0.089  0.095
Houston Astros          0.116  0.106  0.112  0.112  0.104
Kansas City Royals      0.113  0.079  0.083  0.087  0.080
Los Angeles Angels      0.090  0.073  0.089  0.074  0.063
Los Angeles Dodgers     0.102  0.073  0.077  0.091  0.101
Miami Marlins           0.109  0.078  0.089  0.078  0.074
Milwaukee Brewers       0.091  0.074  0.100  0.084  0.084
Minnesota Twins         0.114  0.080  0.106  0.111  0.111
New York Mets           0.115  0.084  0.090  0.083  0.085
New York Yankees        0.130  0.065  0.089  0.093  0.077
Oakland Athletics       0.119  0.085  0.093  0.076  0.075
Philadelphia Phillies   0.119  0.083  0.078  0.088  0.077
Pittsburgh Pirates      0.103  0.101  0.107  0.105  0.087
San Diego Padres        0.111  0.089  0.081  0.084  0.082
San Francisco Giants    0.131  0.097  0.101  0.103  0.094
Seattle Mariners        0.105  0.056  0.090  0.108  0.072
St. Louis Cardinals     0.104  0.076  0.109  0.082  0.077
Tampa Bay Rays          0.096  0.067  0.071  0.063  0.082
Texas Rangers           0.118  0.082  0.090  0.079  0.091
Toronto Blue Jays       0.122  0.097  0.072  0.097  0.117
Washington Nationals    0.122  0.074  0.089  0.097  0.097

Three year

Year       r
2015-2017  0.39
2014-2016  0.51

Two year

Year       r
2016-2017  0.37
2015-2016  0.48
2014-2015  0.32

One year

Year       r
2017       0.35
2016       0.37
2015       0.34
2014       0.44

Again, the component BABIP is a bit more correlated than overall. Shifts also influence fly balls, though likely not to the extent of grounders. That said, better positioning independent of shifts could be a factor. It's only recently that a big deal has been made of players referring to cards or wristbands, reminding them where to play for specific hitters.

LINE DRIVES

Team                    2017   2016   2015   2014   2013
Arizona Diamondbacks    0.634  0.669  0.679  0.645  0.655
Atlanta Braves          0.579  0.679  0.621  0.671  0.625
Baltimore Orioles       0.625  0.678  0.683  0.660  0.626
Boston Red Sox          0.586  0.656  0.671  0.685  0.660
Chicago Cubs            0.592  0.638  0.614  0.615  0.669
Chicago White Sox       0.608  0.666  0.657  0.691  0.665
Cincinnati Reds         0.611  0.670  0.645  0.640  0.602
Cleveland Indians       0.634  0.662  0.655  0.655  0.667
Colorado Rockies        0.630  0.658  0.666  0.669  0.641
Detroit Tigers          0.609  0.672  0.706  0.667  0.679
Houston Astros          0.595  0.665  0.686  0.645  0.659
Kansas City Royals      0.601  0.705  0.677  0.655  0.641
Los Angeles Angels      0.597  0.652  0.669  0.655  0.657
Los Angeles Dodgers     0.622  0.638  0.664  0.663  0.661
Miami Marlins           0.637  0.682  0.664  0.643  0.648
Milwaukee Brewers       0.578  0.690  0.670  0.676  0.635
Minnesota Twins         0.629  0.647  0.690  0.621  0.673
New York Mets           0.638  0.683  0.662  0.647  0.655
New York Yankees        0.612  0.656  0.662  0.669  0.673
Oakland Athletics       0.581  0.659  0.647  0.661  0.643
Philadelphia Phillies   0.630  0.676  0.690  0.653  0.664
Pittsburgh Pirates      0.597  0.674  0.670  0.675  0.659
San Diego Padres        0.626  0.649  0.659  0.662  0.645
San Francisco Giants    0.593  0.683  0.679  0.657  0.637
Seattle Mariners        0.616  0.645  0.655  0.662  0.619
St. Louis Cardinals     0.618  0.678  0.667  0.644  0.628
Tampa Bay Rays          0.617  0.642  0.686  0.632  0.630
Texas Rangers           0.618  0.664  0.671  0.676  0.642
Toronto Blue Jays       0.635  0.692  0.667  0.640  0.652
Washington Nationals    0.610  0.654  0.655  0.639  0.625

Three year

Year       r
2015-2017  -0.01
2014-2016  0.10

Two year

Year       r
2016-2017  0.18
2015-2016  0.26
2014-2015  0.23

One year

Year       r
2016       -0.03
2015       0.21
2014       0.06
2013       0.05

Based on the groundball and fly ball data, it follows that line drive BABIP shows little, if any, correlation. This makes intuitive sense, since line drives are the most difficult batted balls to defend.

APPLICATIONS

How one chooses to apply the above data revolves around the objective and the depth of statistical understanding. Here are some general considerations.

Even though it's not perfect, and is variable year-to-year, the best BABIP predictor is the previous season's data. Assuming the use of the shift stabilizes, the variance should reduce. That is, once all 30 clubs are comfortable with their deployment and use it consistently each season, the component BABIP should stabilize a bit.

A groundball pitcher also yields fly balls and vice versa, while both obviously surrender line drives. In order to incorporate this into a formulaic projection system, the GB/FB/LD distribution needs to be projected for each pitcher, with each component BABIP influencing overall BABIP in proportion to their individual hit distributions.
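As a rough illustration of that weighting, the sketch below blends component BABIPs according to a pitcher's projected batted-ball mix. The component values are Arizona's 2017 column as printed in the tables above. The two pitcher profiles are hypothetical, and the calculation ignores bunts, popups and anything else outside the three buckets. The function name is invented for the example.

```python
def overall_babip(mix, component_babip):
    """Weight each component BABIP by the pitcher's share of that batted-ball type.

    `mix` maps batted-ball type to its share of balls in play; shares must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-6:
        raise ValueError("batted-ball shares must sum to 1")
    return sum(share * component_babip[kind] for kind, share in mix.items())


# Arizona's 2017 component BABIPs, as printed in the tables above.
arizona = {"GB": 0.216, "FB": 0.114, "LD": 0.634}

# Hypothetical batted-ball mixes for two pitcher types.
groundballer = {"GB": 0.50, "FB": 0.30, "LD": 0.20}
flyballer = {"GB": 0.35, "FB": 0.45, "LD": 0.20}

print(round(overall_babip(groundballer, arizona), 3))  # prints 0.269
print(round(overall_babip(flyballer, arizona), 3))     # prints 0.254
```

Same defense, same line drive rate, yet a 15-point gap in expected BABIP purely from the grounder/fly ball split. That's why the distribution has to be projected before team defense can be layered in.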

Keep in mind a team's BABIP is already baked into the player's BABIP, so depending on how a projection engine works, it could be double-dipping. The same holds true for park factors, which also affect BABIP. To properly regress towards team BABIP, this impact must first be neutralized, then accounted for in the final result.

CONCLUSIONS

Here's my gut feel. While I recognize the advantage of starting with the most accurate baseline, it comes down to balancing practicality with the time and effort required to code the regression of component player BABIP to team BABIP. Quants will argue every decimal point of accuracy is beneficial; I'm not so sure. There's so much inherent cloudiness in player projections already that the small gain in accuracy is totally consumed by the haze. The regression would be towards a BABIP with some additional precision, but still without an especially strong correlation. Because line drive BABIP is essentially random from year to year, luck in either direction throws off the overall player BABIP. As has been discussed previously, skills are only part of a projection; the playing time component is crucial. The effect of 10 percent more or fewer innings is a more significant factor than a slightly better ERA baseline.

From a projectionist perspective, a great deal of work needs to go into estimating GB/FB/LD distribution. Is it based on history? How much does the improved ability to track pitches, and hence changes in repertoire, take the process from objective to subjective? How should the addition or subtraction of an excellent or poor defender alter team BABIP?

Again, quants are thinking it's worth the effort. The more I play fantasy baseball, the more I realize it's what you do with the projection, not the projection itself. It's not a secret I do my own projections. Part of that is determining an expected pitcher BABIP, mostly based on their historical hit distribution. I do not directly incorporate team defense, at least not globally. As alluded to earlier, in part that's because it's already baked into the pitcher's BABIP, especially for those players toiling for the same team for multiple seasons. I will, however, make individual adjustments as needed, usually for individuals changing teams. I'll investigate the difference in the team defense between the clubs and massage subjectively as necessary.

Yes, defense matters, but so many things are more relevant.

ABOUT THE AUTHOR
Todd Zola
Todd has been writing about fantasy baseball since 1997. He won NL Tout Wars and Mixed LABR in 2016 and is a multi-time league winner in the National Fantasy Baseball Championship. Todd is now setting his sights even higher: The RotoWire Staff League. Lord Zola, as he's known in the industry, won the 2013 FSWA Fantasy Baseball Article of the Year award and was named the 2017 FSWA Fantasy Baseball Writer of the Year. Todd is a five-time FSWA awards finalist.