I played with some numbers this week, came up with some fairly fancy stuff, then decided to throw it all away and simplify. My conclusions weren’t unreasonable, I was just digging too far for what I was trying to prove. After all, once you’ve decided that 2 + 2 = 4, you don’t really need to break it down to (1 + 1) + (1 + 1) = 4, do you?
What I was working on was seeing where the Rangers need to improve to get into contention. A fairly easy answer, you’d think – pitching. And you’d be right. But there’s a little more to it than that, so here goes:
First of all, a little note about what I’m looking at. I’m analyzing the American League from 2000-2007. That means there are 112 teams, and 32 of them made the playoffs. Any numbers I reference will apply to that dataset, unless I say otherwise.
Wins: Overall, for the decade so far, the Rangers rank 10th in wins, with the obvious suspects (Detroit, Baltimore, Kansas City and Tampa) behind them. At 610, they’re a long way behind the leading Yankees with 773. Sadly, 2nd, 4th and 5th spots are held by Oakland, Anaheim and Seattle, showing just how much they’ve left the Rangers behind. The Rangers are in a bit of a gap right now, 34 behind Toronto and 41 ahead of Detroit. It will take quite a bit for that to change in the next couple of years. Could they finish 17 games ahead of TOR twice in a row? Doubtful. Could they finish 21 games behind DET twice in a row? Actually, that’s quite possible now: this year they were 13 back of DET, and you can only assume they’re going in opposite directions. But as it stands, you’d probably be pretty safe putting your money on the Rangers finishing 10th for the decade, and that’s just about right given their performance.
Runs: Given the win totals, you may be surprised to discover that the Rangers are third in runs scored (6783), behind the Yankees and Boston. Obviously scoring hasn’t been a problem, although probably a lot of this is due to the Ballpark producing runs that other places don’t. That, or that the Rangers have always been able to hit pretty well (more on that later). The rivals in the division all scored in the middle of the pack.
Runs conceded: Ahh, here we see why they’re 10th in wins. Having given up 7073, they also rank third in most runs allowed, not too far behind KC and Tampa for the worst pitching. We also see why the division rivals did so well: OAK, LAA and SEA allowed the fewest, second-fewest and fourth-fewest runs in the league.
Run differential: They’ve allowed 290 runs more than they scored, which puts them in 10th place. Actually, the differential across the league almost exactly follows wins, validating Pythag (not that it needed it). In fact, OAK and BOS swapped places, and CLE dropped two spots in their wins compared to differential, but otherwise everyone was where they were supposed to be. The Yankees scored 1049 runs more than they allowed to lead. The D-Rays allowed 1349 more than they scored (presumably not all to the Yankees).
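For anyone who wants to run the Pythag check themselves, here is a minimal sketch. It assumes the classic exponent-2 form of the formula and a schedule of 8 x 162 = 1296 games, neither of which is spelled out above, and the function name is just a placeholder.

```python
def pythag_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins from the Pythagorean formula (classic exponent of 2 assumed)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Rangers, 2000-2007 totals quoted above: 6783 scored, 7073 allowed.
# 8 * 162 games is an assumption; a season can end a game short or long.
expected = pythag_wins(6783, 7073, games=8 * 162)
print(f"run differential: {6783 - 7073}")
print(f"Pythag wins: {expected:.1f} (actual: 610)")
```

With these inputs the estimate lands at roughly 621 wins, about ten more than the Rangers actually managed.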
Okay, those are some pretty simple numbers to look at. In summary, the Rangers are 10th out of 14 teams, and they pretty much deserve to be there. It’s all about the pitching, too.
Baltimore, Kansas City, Tampa, Texas and Toronto were the five teams who never made the playoffs in that timeframe. BAL, TOR and TB have an excuse, playing in the AL East where they have no chance of competing with the money men in Boston and NY. KC and TEX? Doormats for their divisions.
Looking at how teams made the playoffs, we can see some patterns. The obvious ones are “score more runs” and “allow fewer runs”.
Ranking all 112 teams by runs scored, 16 of the top 21 teams made the playoffs. The rest are scattered throughout, but skewed towards the higher-scoring end, with the 05 White Sox, ranked 84th, being the lowest-scoring team to make it. If I were to draw a dividing line, though, it would be at the 54th-ranked team: 26 of the top 54 made the playoffs, which is close enough to 50%. That run total is exactly 800, meaning if you score 800 or more runs in a season, you’ve got about a 50% chance of making the playoffs.
The problem with that though is that the Rangers have scored over 800 every year, with a low of 816 this year. Obviously the pitching is weighing them down more than we thought. Either that or the Ballpark is really inflating those numbers. Unluckiest Rangers? 2001, scored 890 to rank 15th.
So let’s turn to pitching. Similar pattern: 19 out of the top 32 made it. Lowest was again the White Sox, 2000 this time, who allowed 839 runs. This time I’m going to put the break-point at 790, which is where 25 out of 52 teams got to the playoffs, or once again just about 50% of teams.
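The two break-points above come from the same kind of count: pick a run total, then see what share of the teams on the good side of it made the playoffs. Here is a rough sketch of that count. The function name and the toy rows are made up purely for illustration (they are not real team-seasons); only the 26-of-54 and 25-of-52 figures come from the numbers above.

```python
def playoff_rate(teams, threshold, higher_is_better=True):
    """Share of teams at or beyond `threshold` that made the playoffs.

    teams: list of (runs, made_playoffs) pairs, one per team-season.
    Use higher_is_better=False for runs allowed, where fewer is better.
    Returns None when no team reaches the threshold.
    """
    if higher_is_better:
        group = [made for runs, made in teams if runs >= threshold]
    else:
        group = [made for runs, made in teams if runs <= threshold]
    if not group:
        return None
    return sum(group) / len(group)


# Toy rows for illustration only, NOT real seasons.
toy_seasons = [(900, True), (850, True), (810, False),
               (800, True), (790, False), (700, False)]
print(playoff_rate(toy_seasons, 800))

# The shares quoted in the text:
print(f"800+ runs scored:    {26 / 54:.0%}")
print(f"790 or fewer allowed: {25 / 52:.0%}")
```

Both quoted shares work out to about 48%, which is where the "just about 50%" comes from.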
The Rangers had a year with 794 (2004) and a year with 784 (2006), and then you jump down to 844. As you keep going down and down, you discover the bottom of the list, where the Rangers fill three of the four worst pitching spots (interrupted by KC at #2). Worst was the 2000 team, with 974 runs allowed. Yep, it’s the pitching.
Okay, so the hitting for the Rangers has been good, every year they’ve been above that break-even point for making the playoffs. The pitching has been bad, just one year barely above the break-even, and most years way down in the dumps. Can’t argue with the numbers: it’s the pitching.
But it’s the differential that counts. When I calculated the R-squared values for runs, runs allowed, and differential against wins, the differential was easily the most reliable predictor of wins. Runs gives an R-squared of .487, and RA gives .513, both reasonable values, showing they’re pretty useful predictors. But differential gives an R-squared of .900, meaning it explains about 90% of the variation in wins. Look at differential and you can almost exactly predict the wins.
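For reference, an R-squared like the ones above can be computed in a few lines. This is a minimal sketch that assumes a simple one-variable linear fit, where R-squared is just the square of the correlation between the two columns. The function name is a placeholder and the sample numbers are invented to show the call, not taken from the dataset.

```python
def r_squared(xs, ys):
    """R-squared of a one-variable linear fit (squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Invented sample for illustration only: differential vs. wins.
toy_diff = [-100, -50, 0, 50, 100]
toy_wins = [70, 77, 80, 87, 91]
print(f"R-squared: {r_squared(toy_diff, toy_wins):.3f}")
```

Feeding it the 112 real differentials and win totals should reproduce the .900, assuming the original figure came from the same kind of straight-line fit.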
I then tried something a little interesting, and came up with a result that was a lot interesting. I took the formula for the linear trendline for differential, y = 0.100x + 80.81. The 0.1 on the x value pretty much matches the theory of 10 runs equaling one win, and if you’re wondering why the last number is not 81, it’s because interleague play means there are not the same number of wins as losses. I then plugged that formula in against both the runs scored and the runs allowed. That should show where the wins are coming from.
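As a quick sanity check on the trendline, here is a small sketch that applies it to the decade totals quoted earlier. Feeding it an eight-season average differential is a shortcut for illustration (the line was fitted to single seasons, but it is linear, so averaging first gives the same answer as averaging the predictions). The function and constant names are placeholders.

```python
SLOPE = 0.100      # wins per run of differential, from the trendline above
INTERCEPT = 80.81  # wins at a differential of zero


def wins_from_differential(run_diff):
    """Predicted season wins for a given season run differential."""
    return SLOPE * run_diff + INTERCEPT


SEASONS = 8
# (team, total differential 2000-2007, total wins), all quoted earlier.
for team, total_diff, total_wins in [("Rangers", -290, 610),
                                     ("Yankees", 1049, 773)]:
    predicted = wins_from_differential(total_diff / SEASONS)
    actual = total_wins / SEASONS
    print(f"{team}: {predicted:.1f} predicted wins per season, {actual:.1f} actual")
```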
Here’s an example, the 2004 Rangers. They scored 860 runs. Plug in the formula and that equals 88.48 wins as a result of batting. They allowed 794, which results in 81.32 pitching wins. The batting was good but the pitching was barely average. How did they then win 89 games? Well, I discovered that if I multiply the batting wins by the pitching wins, then divide by 81, I get 88.83, which rounds to 89. This makes sense, because what I think I am saying is that the hitting won 88.48 against average teams, and the pitching won 81.32 against average teams, and if you combine them they get slightly better. Anyway, by doing this math against all the teams, I found a highly accurate predictor of wins, in fact the two numbers (wins and this combined result) correlate at an R-squared of .894. I think I can use this to show just how much a team’s hitting or pitching was worth.
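The combining step is simple enough to write down directly. A minimal sketch, using the 2004 figures above; the function name is a placeholder, and the batting and pitching win values are taken as given rather than recomputed from runs.

```python
AVERAGE_WINS = 81  # a .500 record over 162 games


def combined_wins(batting_wins, pitching_wins):
    """Multiply the two sides, then divide by the wins of an average team."""
    return batting_wins * pitching_wins / AVERAGE_WINS


# 2004 Rangers: 88.48 batting wins, 81.32 pitching wins.
print(round(combined_wins(88.48, 81.32), 2))  # 88.83
```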
What’s interesting is how multiplying the two sides (batting and pitching), then dividing by 81, pushes the result towards the actual wins. You saw just above how the good hitting and average pitching combined for the Rangers in 2004. If you get one of the numbers very close to 81, the wins will be very close to the other number. If you push them both in the same direction, the effect compounds, and the combined total lands beyond either number. Take the team with the best record, the 2001 Mariners, who won 116 games. By this method, their batting was worth 96 and their pitching 99; combining them, we get 118 wins. In other words, the whole is greater than the sum of its parts. Flip to the other end: the 2003 Tigers, with hitting of 57 and pitching of 66, combined to 47 wins (compared to the actual 43). With each side being bad, they push each other down even further. If they’d only had average pitching, they would have won 57, but they didn’t, so they fell even further. If I were naming it, I would call this rubber-banding, where the two sides are bouncing against each other, and if one stretches in one direction the other gets pulled that way too.
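The rubber-band effect is easy to see by running the three situations side by side. This sketch uses the rounded scores quoted above, so its results come out a shade under the 118 and 47 in the text, which presumably used unrounded values. The names are placeholders.

```python
AVERAGE_WINS = 81


def combined_wins(batting_wins, pitching_wins):
    """Batting wins times pitching wins, divided by an average team's wins."""
    return batting_wins * pitching_wins / AVERAGE_WINS


cases = [
    ("2001 Mariners, both sides good", 96, 99),
    ("2003 Tigers, both sides bad", 57, 66),
    ("2003 Tigers with average pitching", 57, AVERAGE_WINS),
]
for label, batting, pitching in cases:
    total = combined_wins(batting, pitching)
    print(f"{label}: {batting} and {pitching} combine to {total:.1f}")
```

When both sides sit above 81 the total comes out higher than either score, when both sit below 81 it comes out lower than either, and when one side is exactly 81 the total is simply the other score.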
So what is the use of this tool? Look at the Rangers from 2000-07. They won between 71 and 80 games every year except one, when they won 89. How did they do it? Good hitting and bad pitching, right? But how good was the hitting? How bad was the pitching? Would you say their 76-win average was due to 80 hitting wins and 72 pitching wins, or something similar? And that exceptional year, was it a huge change to the system, or just random variation pushing them up for a year? Let’s calculate:
The hitting side was remarkably consistent, running from a 91.9 in 2001 to 83.4 in 2007 (proof that the 07 team were the worst hitters?). Just minor variation in the big scheme of things. The pitching varied much more, though. In 2000 it scored a 61.5, while in 2006 it was 82.4. Three years in the 60s, three in the 70s and two in the 80s. Wild variation, and not even a consistent flow, as the scores jumped up and down like water on a hot griddle. And what was the result? The steady hitting dampened the swings in actual wins, so they floated in the 70s most of the time. 2004, the 89-win year, was simply a case of the hitting staying on the high side of where it was (the 88.5 was the third-highest hitting score) and the pitching coming up to average (81.3, one of only two times it went above average). There was no big breakthrough; there was simply pitching being barely adequate.
And where does this leave the Rangers? With the knowledge, if they didn’t know it already, that their pitching sucks. More precisely, with a quantitative tool they can use to see just what they need to do to improve. If they hold the hitting steady, just how much do they have to improve the pitching to get to be contenders? After all, by keeping their hitting at the same level in 2004, and just getting the pitching to average, they were in contention until the last week of the season. Take another step, push those pitchers just slightly above average, and they could contend for some time. Going back to the earlier stuff on run counts, they’re doing fine in the offensive numbers, but they need to get their runs allowed down: in 2007 they allowed 844 runs, and the break-even point is 790. Where can they trim 54 runs from the pitching? Oh yeah, and even if they do that, it just puts them on the very edge of the competition, not deep into the playoff zone.
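To put a number on "how much do they have to improve the pitching", the combining formula can be turned around: fix the hitting score, pick a win total, and solve for the pitching score. A rough sketch, holding the hitting at the 2007 score of 83.4. The win targets are hypothetical examples chosen for illustration, and the function name is a placeholder.

```python
AVERAGE_WINS = 81


def required_pitching_wins(target_wins, batting_wins):
    """Pitching score needed so that batting * pitching / 81 hits the target."""
    return target_wins * AVERAGE_WINS / batting_wins


BATTING_2007 = 83.4  # the 2007 hitting score quoted earlier

for target in (85, 90, 95):  # hypothetical win targets, not from the post
    needed = required_pitching_wins(target, BATTING_2007)
    print(f"{target} wins needs a pitching score of about {needed:.1f}")

# Gap between 2007 runs allowed and the break-even point.
print(f"runs to trim to reach the break-even: {844 - 790}")
```

This only inverts the multiply-and-divide step, so it answers the question in pitching wins; turning that back into runs allowed would need the same conversion used to build the scores in the first place.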