Our desire to reach into the future will always exceed our grasp. But debunkers go too far when they dismiss all forecasting as a fool’s errand.
Philip E. Tetlock, Superforecasting
Whether you are ranking players straight up or in tiers, projecting what their stats will look like after game 162 of the upcoming MLB season, or using someone else’s projections and then tweaking them based on your opinion, you are forecasting the future, a necessary skill in fantasy sports.
When we acquire a player in fantasy baseball drafts, we are essentially buying stock in that player hoping to make a profit. The player needs to produce the stats necessary to help defeat the other teams in the league.
I’ve been playing fantasy sports for a long time and I’ve always been a “use somebody else’s projections to calculate player values but don’t follow them religiously” kind of guy. I believe I am not alone in this regard.
Over the years I’ve always been fascinated, not that other people sit down creating baseball projections, but that they make them available for others to scrutinize.
Why Am I Creating Baseball Projections?
Creating baseball projections is my blue duck. I’ve always wanted to do my own projections for fantasy baseball but was too intimidated until I read Mike Podhorzer’s (@MikePodhorzer) Projecting X 2.0: How to Forecast Baseball Player Performance. He does a tremendous job of explaining and demonstrating how he does it. Here’s an example:
The most precise way to go about developing a player projection is to forecast the individual components that affect the statistic in question. For example, instead of manually projecting a hitter’s home run total, the more effective method is to project the variables that directly influence it – the hitter’s strikeout and fly ball percentages and home run per fly ball rate.
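To make the component idea concrete, here is a minimal sketch in Python. It is not the book's spreadsheet formula. The function name and the rate definitions (strikeouts per at-bat, fly balls per batted ball) are simplifying assumptions made for illustration.

```python
def project_home_runs(at_bats, strikeout_rate, fly_ball_rate, hr_per_fb):
    """Estimate home runs from component rates.

    Assumptions (illustrative, not from the book):
    - strikeout_rate is strikeouts per at-bat
    - fly_ball_rate is fly balls per batted ball
    - hr_per_fb is home runs per fly ball
    """
    # At-bats that do not end in a strikeout put the ball in the air or on the ground.
    batted_balls = at_bats * (1 - strikeout_rate)
    # A share of those batted balls are fly balls.
    fly_balls = batted_balls * fly_ball_rate
    # A share of those fly balls leave the park.
    return fly_balls * hr_per_fb


# Hypothetical hitter: 550 AB, 20% K rate, 40% fly balls, 15% HR/FB.
hr = project_home_runs(at_bats=550, strikeout_rate=0.20,
                       fly_ball_rate=0.40, hr_per_fb=0.15)
print(round(hr, 1))  # 26.4
```

Nudging any one input (say, a lower strikeout rate) flows straight through to the home run total, which is the appeal of projecting the components instead of the final number.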
In the book, Mike guides you through setting up a spreadsheet for your projections. Tanner Bell (@smartfantasybb) took it a step further over on SmartFantasyBaseball with his own spreadsheet (same link as above). With Mike and Tanner providing the foundation, I decided I was going to make my own projections.
Note the date of that tweet. This took a loooong time.
What I Learned Creating Baseball Projections
When I started I had every intention of analyzing each player and… I don’t have the time for that. I have a real 40+ hour-a-week job. Plus, I’m not nearly knowledgeable enough to have a hunch about whether a player will or will not improve this rate or that rate.
So I decided to build a formula that provided me with what I need for the spreadsheet. This formula takes into account the player’s past MLB and AAA history. It also takes MLB averages into account because, you know, regression and all that.
Now I’m not going to offer up the exact formula because I’m afraid I would react the same way Moneyball’s Billy Beane reacted when he gave his pie to Monica Geller.
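To be clear, the following is not that formula. It is only a generic sketch of the general idea: weight past seasons, then pull the result toward the league average. The weights, the AAA discount, and the `regression_pa` value are all made-up placeholders, and it skips any proper translation of minor league stats.

```python
def project_rate(seasons, league_rate, regression_pa=200):
    """Blend past seasons of a rate stat and regress toward the league rate.

    seasons: list of (rate, plate_appearances, weight) tuples.
    regression_pa: hypothetical number of league-average plate appearances
        mixed in; a bigger value means more regression.
    """
    weighted_pa = sum(pa * weight for _, pa, weight in seasons)
    weighted_events = sum(rate * pa * weight for rate, pa, weight in seasons)
    # Adding league-average "phantom" plate appearances pulls small samples
    # toward the league rate more strongly than large samples.
    return (weighted_events + league_rate * regression_pa) / (weighted_pa + regression_pa)


# Hypothetical strikeout rates: two MLB seasons and one AAA season.
# The AAA season gets a smaller weight as a crude (assumed) discount.
history = [
    (0.25, 600, 5),  # most recent MLB season
    (0.20, 500, 4),  # previous MLB season
    (0.22, 400, 1),  # AAA season
]
print(round(project_rate(history, league_rate=0.22), 3))
```

A player with no history at all simply gets the league rate, which is the regression part doing its job.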
I learned to chase perfection. Early on in this endeavor, I started calculating the Pythagorean W-L records of each team based on runs scored by hitters and runs allowed by pitchers. My thinking was that if the W-L records passed the eye test then I was probably on the right path. Uh oh.
I purposely started with a “worst to first” mentality. So, I quickly realized I had a problem when Baltimore’s Pythagorean W-L was 81-81, Detroit’s 79-83 and Miami’s 84-78. Something was not right. Digging deeper proved that the issue was with pitchers’ projected earned runs. My projected league-wide ERA was 4.01, something that has only happened in the “real” MLB five times since 1995.
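For anyone who wants to run the same eye test, the Pythagorean record can be sketched as below. This assumes the classic exponent of 2 (some people prefer a value around 1.83), and the function name is just for illustration.

```python
def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Return (wins, losses) implied by runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    # Expected winning percentage from the Pythagorean expectation.
    win_pct = rs / (rs + ra)
    wins = round(win_pct * games)
    return wins, games - wins


# A team that scores exactly as many runs as it allows projects to .500.
print(pythagorean_record(750, 750))  # (81, 81)
# Hypothetical team with a +100 run differential.
print(pythagorean_record(800, 700))  # (92, 70)
```

If projected runs allowed are too low across the league, every team's record drifts upward, which is how bad teams end up looking like .500 clubs.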
So I went back and double-checked Mike’s formula for calculating earned runs. The book mentions that the formula he used is a modified version of the Base Runs formula. On that same link is another version of the formula. I plugged that one in and the results were much more palatable. The league-wide projected ERA came out to 4.61, which is a tad high but closer to last season’s 4.51 MLB ERA.
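For reference, here is a sketch of one commonly cited basic version of Base Runs, along with the standard ERA calculation. It is not necessarily either of the versions mentioned above. Note that Base Runs estimates total runs, so turning that into earned runs takes an extra step that is left out here.

```python
def base_runs(ab, h, tb, hr, bb):
    """One commonly cited basic Base Runs estimate: A * B / (B + C) + D."""
    a = h + bb - hr                                        # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02    # advancement
    c = ab - h                                             # outs
    d = hr                                                 # home runs always score
    return a * b / (b + c) + d


def era(earned_runs, innings_pitched):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_pitched


# Hypothetical pitcher line (stats allowed): 700 AB, 170 H, 280 TB, 22 HR, 55 BB.
print(round(base_runs(ab=700, h=170, tb=280, hr=22, bb=55), 1))
# 90 earned runs over 180 innings is a 4.50 ERA.
print(era(90, 180))  # 4.5
```

Summing earned runs and innings across every projected pitcher and running them through `era` gives the league-wide number used for the sanity check.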
While I learned to chase perfection, I also learned that I will never catch perfection. There is no way I will ever be able to get league-wide projected runs scored to equal runs allowed. Therefore, the Pythagorean W-L records will never be perfect. I spent way too much time trying to accomplish this. Way too much.
Playing time is another aspect that I dwelled on too much. I started with Steamer’s projected playing time and then tweaked it to get each team down to somewhere between 6,000 and 6,200 plate appearances. I repeated the process but factored in ATC projections. Ultimately I ended up comparing projected playing time from various sources to get to my final number.
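The tweaking was manual, but a rough automated version might look like the sketch below. It assumes the 6,000 to 6,200 target refers to team plate appearances, it scales every player by the same factor (a simplification), and the player names and numbers are invented.

```python
def scale_team_pa(player_pa, target=6100):
    """Scale each player's projected PA so the team total hits the target.

    player_pa: dict of player name -> projected plate appearances.
    target: assumed team total, somewhere in the 6,000 to 6,200 range.
    """
    total = sum(player_pa.values())
    factor = target / total
    return {name: pa * factor for name, pa in player_pa.items()}


# Hypothetical roster whose raw projections add up to 6,500 PA.
raw = {"Player A": 700, "Player B": 650, "Rest of roster": 5150}
scaled = scale_team_pa(raw, target=6100)
print({name: round(pa) for name, pa in scaled.items()})
print(round(sum(scaled.values())))  # 6100
```

In practice the cuts would not be spread evenly, since bench players lose playing time before everyday starters do, which is why this part ends up eating so many hours.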
Overall, I’m glad I finished my projections. Maybe I should have opened with that.
I thought about throwing in the towel a couple of times but stuck to it. I am already thinking of ways to improve the process for next season.