The Z Files: Showing Your Work

Updated on April 20, 2017 8:31AM EST

Back in the winter, when we were finalizing plans for the 2017 season, I lobbied to take over the site's Weekly Pitching Rankings column. The reason, to be honest, is that I've yet to see a column of this nature handle rankings as they should be done. To be clear, I don't know the methods others use, but I've yet to see someone detail a process that satisfies my requirements. Perhaps this is unfair, but my sense is almost everyone starts with a general ranking, then adjusts by feel based on the number of starts, home versus away, and quality of opponent.

The correct approach is deriving a projection for each start, summing up the components for those with a pair of starts, then determining how the projected performance helps or hurts our team. I already have a system to generate a daily projection for DFS, so it wasn't hard to extend that over the entire week using the best probable starting grid in the industry, conveniently found right here on RotoWire. The trick is figuring out how to rank the projected performance as it relates to our fantasy team's performance.

Admittedly, I failed in my initial process. It wasn't apparent the first two weeks of the season, but it was readily obvious in the current week's initial rankings. When called out for the odd-looking rankings, I did what all pundits do: I got defensive. Then I remembered what got me this job, and my stature in the industry: a willingness to accept criticism and learn from it. Even though it was unconventional, I re-ranked the pitchers and asked for a re-post, which included an apology and an explanation.

Before I go all nerdy on you, it's understandable if you're disappointed to be paying good money for someone essentially learning on the job. It's my hope this is balanced by an appreciation for someone striving to get it right. I'm confident the adjustments will get us a lot closer to where we need to be. The impetus for these changes came directly from the comments. It's my sincere desire you continue to question anything of concern and offer constructive criticism. Obviously, it's your right to voice whatever opinion you have, even those of the non-constructive variety. I'm more likely to engage with the former than the latter.

Let's start with how the per-game projection is generated. The starting point is the rest-of-season projection. This assumes half of the hurler's starts will be at home, so the first step is neutralizing the projection. Adjustments are then made based on opponent, home/away and venue. Note there are two distinct adjustments for home/away and venue. Hitters and pitchers both enjoy an organic skills boost at home, irrespective of the venue.

These adjustments are done individually on the component skills: strikeouts, walks, hits and homers. Historical data pertaining to both the venue and opposition is utilized. Some clever (if I do say so myself) Excel programming does all the work. All I need to do is paste in the probable pitching grid, then project the innings pitched for each start.
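For readers who think better in code than in spreadsheets, here is a minimal Python sketch of the adjustment step just described. It is not the Excel workbook. It assumes each adjustment is a simple multiplier on a per-nine-innings rate, and every function name and factor value is made up for illustration.

```python
# Hypothetical sketch: every name and factor below is an illustrative assumption,
# not a value from the actual spreadsheet.
COMPONENTS = ("k", "bb", "h", "hr")


def neutralize(ros_per9, home_factor, away_factor):
    """Strip the built-in half-home, half-away split out of a rest-of-season rate."""
    return ros_per9 / ((home_factor + away_factor) / 2)


def project_start(neutral_per9, innings, opponent, home_away, venue):
    """Apply the three adjustments to each component, then prorate to the innings."""
    line = {}
    for stat in COMPONENTS:
        rate = neutral_per9[stat] * opponent[stat] * home_away[stat] * venue[stat]
        line[stat] = rate * innings / 9
    return line


no_change = {stat: 1.0 for stat in COMPONENTS}
neutral = {"k": 9.0, "bb": 3.0, "h": 8.1, "hr": 0.9}
whiff_prone_opponent = dict(no_change, k=1.10)  # made-up 10 percent strikeout boost

start = project_start(neutral, 6.0, whiff_prone_opponent, no_change, no_change)
print(round(start["k"], 2), round(start["bb"], 2))
```

In this toy case, a pitcher with a neutral rate of nine strikeouts per nine innings, working six innings against a lineup that boosts strikeouts by 10 percent, projects to about 6.6 strikeouts.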

The strikeout projection is straightforward. Earned runs are built out from the skills, just like in conventional seasonal projections.

Wins are a bit more complicated. If I'm estimating the number of runs a pitcher surrenders, I'm also doing the same for his team's offense by way of the number of runs his mound foe is anticipated to give up. Using Bill James' Pythagorean Formula, the win probability based on the runs scored by each team can be computed, after accounting for the runs allowed by the respective bullpens, tempered to the innings projection. The deeper the pitcher works into the game, the better his chance of a win.
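As a rough illustration of the Pythagorean step, the sketch below uses the classic form of the formula, runs scored squared divided by the sum of runs scored squared and runs allowed squared. The bullpen blending is a simplifying assumption (the bullpen covers whatever part of nine innings the starter doesn't), and the function names and sample numbers are hypothetical. It stops at a team win percentage; how that gets tempered into the starter's own chance at the win isn't shown here.

```python
def full_game_runs(starter_runs, starter_ip, bullpen_runs_per9):
    """Starter's projected runs plus the bullpen's share of the remaining innings.

    Assumes a nine-inning game; innings are decimals (6.5, not box-score 6.1).
    """
    return starter_runs + bullpen_runs_per9 * (9 - starter_ip) / 9


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Bill James' Pythagorean expectation with the classic exponent of 2."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Made-up inputs: our starter allows 2.5 runs over 6 innings, backed by a 4.2 bullpen.
runs_allowed = full_game_runs(2.5, 6.0, 4.2)
# Our offense's runs are whatever the opposing starter and his bullpen give up.
runs_scored = full_game_runs(3.0, 6.0, 3.9)

team_win_pct = pythagorean_win_pct(runs_scored, runs_allowed)
print(round(runs_scored, 2), round(runs_allowed, 2), round(team_win_pct, 3))
```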

Not only does the coding generate the projections, it sums them up for multiple-start pitchers, then does the ranking. I do need to write the comments; I'm not that good of a programmer.

The initial method was described in the inaugural installment of the series so I won't waste bandwidth here. The problem with the process was this: the math was fine, but it was devoid of a theoretical basis. That is, it fell short of addressing the question, "How does the expected performance help or hurt my team?"

The specific issue pertains to ratios. It's clear the higher the number of wins and strikeouts, the greater the effect on your team. Ratios are trickier, since we're dealing with a different number of innings for each hurler.

My solution is to determine how each pitcher's ratios affect a baseline ratio. When I did the adjustment for Week 3 rankings, I used a 3.80 ERA and 1.25 WHIP over 1400 innings. Each pitcher's summed innings, walks, hits and earned runs are added to the appropriate baseline, deriving a new ERA and WHIP based on their projected numbers.
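The arithmetic is easier to follow with numbers. This Python sketch uses the Week 3 baseline quoted above (3.80 ERA, 1.25 WHIP, 1400 innings); the function name and the sample pitcher line are hypothetical.

```python
BASE_IP = 1400.0
BASE_ERA = 3.80
BASE_WHIP = 1.25


def ratios_with_pitcher(ip, er, hits, walks):
    """Fold one pitcher's weekly line into the baseline; return (ERA, WHIP).

    Innings are decimals here (6.5, not box-score 6.1).
    """
    base_er = BASE_ERA * BASE_IP / 9       # earned runs implied by the baseline ERA
    base_runners = BASE_WHIP * BASE_IP     # walks plus hits implied by the baseline WHIP
    total_ip = BASE_IP + ip
    era = 9 * (base_er + er) / total_ip
    whip = (base_runners + hits + walks) / total_ip
    return era, whip


# Made-up two-start week: 12 innings, 4 earned runs, 10 hits, 3 walks.
era, whip = ratios_with_pitcher(12.0, 4.0, 10.0, 3.0)
print(round(era, 4), round(whip, 4))
```

That sample week nudges the baseline ERA from 3.80 to roughly 3.793 and the WHIP from 1.25 to roughly 1.249. Because the innings are part of the calculation, the same ratios over a single start would move the baseline less, which is exactly why the approach handles pitchers with different workloads.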

We now have a number that can be ranked, rotisserie-style, in each of the four standard categories germane to a starting pitcher. If there are 150 pitchers scheduled in a given week, 150 points are assigned to the leader in wins, the leader in strikeouts, the lowest ERA and the lowest WHIP, with the points stepping down from there as they do in rotisserie standings. The points are totaled and the pitchers are ranked from most to fewest.
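Here is a small Python sketch of that rotisserie-style scoring. It assumes points drop by one for each spot down a category, breaks ties by input order rather than splitting points, and uses three invented pitchers instead of 150.

```python
def roto_rank(projections):
    """Return (pitcher, total points) pairs sorted from most points to fewest.

    projections maps a pitcher name to his projected w, k, era and whip,
    where era and whip are the baseline-adjusted ratios.
    """
    n = len(projections)
    totals = {name: 0 for name in projections}
    categories = (("w", True), ("k", True), ("era", False), ("whip", False))
    for stat, higher_is_better in categories:
        ordered = sorted(projections, key=lambda p: projections[p][stat],
                         reverse=higher_is_better)
        for place, name in enumerate(ordered):
            totals[name] += n - place  # category leader gets n points, last gets 1
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented projections for illustration only.
week = {
    "Ace":      {"w": 1.1, "k": 14.0, "era": 3.700, "whip": 1.240},
    "Streamer": {"w": 0.5, "k": 7.0,  "era": 3.780, "whip": 1.245},
    "TwoStart": {"w": 0.9, "k": 10.0, "era": 3.820, "whip": 1.243},
}

for name, points in roto_rank(week):
    print(name, points)
```

Note the ranking only sees order, not distance: the leader in strikeouts gets the same points whether he is ahead by one strikeout or by ten.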

While I like this better than the original incarnation, it still doesn't completely answer the all-important question of how each pitcher helps or hurts your team, at least not in an absolute sense. We're close, but not quite there. I have two concerns.

The first concern is the baseline. While it's not arbitrary, it may not be best for all leagues. It was chosen to approximate an average staff in a 15-team mixed league. The baselines are obviously different for AL-only and NL-only leagues, as well as for mixed leagues of different sizes. Not to mention, this doesn't account for all the various points formats.

It's easy enough to set a separate baseline for AL and NL only. But even then, while 12 may be the standard, not all leagues play with a dozen teams. This is a shortcoming we're going to have to live with. The rankings will use a separate baseline for AL, NL and Mixed, but they can't account for all the variations.

The second concern gets to the crux of the matter. Am I truly capturing how each pitcher helps your fantasy team, or is it still just elegant math? In rotisserie-style scoring, the team leading each category gets the most points, regardless of the number of stat units it's ahead of the next team. I'm doing the same here, which is a flaw. I need to make an adjustment so the points awarded are relative to the impact, not just scored top to bottom. I'm still bandying about ways to do this. I'll settle on a method by the weekend, when Week 4 rankings will be posted.

That leaves us with the proverbial elephant in the room. Even with the perfect ranking system, the output is only as good as the input. Everything revolves around my rest-of-season projections for pitchers, along with the prowess of the opposing team's hitters. This was as much of an issue in Week 3 as the ranking process.

The chief advantage of using an objective means of ranking is that spreadsheets aren't influenced by recency bias. Some of the comments obviously were a result of a pitcher's first couple of outings, good or bad. This question transcends this discussion, and will be the topic of next week's Z Files. How should we let early results alter our initial expectations, be it for the pitcher or a team's hitting? Is Jason Vargas really this good? Is Kevin Gausman really this bad? How do we discern when we're flat out wrong in our initial expectations versus when the player or team is just off to a hot or cold start, which will get blended in with an opposite stretch later in the season?

The bottom-line answer to all those queries is, "I don't know." However, I have developed a means to adjust in-season projections based on when skills begin to exhibit a different baseline. The time isn't the same for all the skills, so the adjustment is more than a weighted average. There's a different coefficient for the weight of each skill. As mentioned, I'll review this method next week.

Week 4 will be when rest-of-season projections get incorporated into the rankings. This by no means suggests the inputs will now be correct; I could still be wrong on a player or team. But it does begin to regress the player's initial expectation to current skill level. So, while Vargas isn't this good and Gausman isn't this bad, their rest-of-season skills will move ever-so-slightly based on their early performance.

The final tweak to the method borrows from some research conducted by my friends and colleagues at Baseball HQ. Hot and cold streaks are a hot-button topic, especially in DFS. Something tangible is that pitchers throwing well will, more often than not, continue to throw well. Obviously, at some point they don't, but using probabilities, it's been shown that there's better than a 50 percent chance a pitcher with a string of strong games extends that string. As such, beginning this week, I'm adding a recent performance element to the rankings.

Even after all the adjustments, there will no doubt be disagreements. Some will emanate from differing baseline opinions on each player. While I'm concerned about Gausman's drop in strikeout rate, I'm not ready to write him off. The rest-of-season projection will lower his whiff baseline, which in turn raises his earned run baseline. Relative to where he landed this week, Gausman will drop next week, but probably not enough for everyone. There's nothing I can do about that. I trust my judgment, you trust yours. It's your team, hence your call.

The point of contention I hope this discussion helps assuage is how a lesser two-start pitcher compares to a better guy with just one game. When you think about it, that's the main use of the rankings. We all know the no-brainers to start and sit; it's the group in between that needs clarification. I'll be honest, I'm fighting the fact that you've all been preconditioned to rankings done by feel, where the true influence of a two-start option isn't appreciated. I hope to change that.