Yesterday I wrote about the random strike zone, and whether a pitch’s location in the strike zone had any effect on the result of the pitch. The obvious ones, a ball and a called strike, are mostly due to location, but the pitches that a batter actually swings at seemed to have a random result regardless of location. Today I was thinking about that, and wondering if I was imagining something that wasn’t there, or if one pitcher (Kam Loe) was too small a sample and was biasing things. So I dug into my database, pulled out the five pitchers who have thrown the most pitches that Gameday has recorded full details for, and analyzed them in the same way.
Those five pitchers are Miguel Batista, Dan Haren, Jake Peavy, Kelvim Escobar and Jarrod Washburn. They’re not necessarily the best pitchers, they just happen to be pitchers who’ve thrown a lot in ballparks that have the Gameday system installed. As it so happens, Haren, Peavy and Escobar are having very good if not excellent seasons, while Washburn and Batista are right about average. If I had dug a little deeper into the numbers, I would have had Kevin Millwood and Loe in the top twenty, but I wanted to get the most pitches I could to see any trends.
As it turns out, all five pitchers ended up showing almost identical results, so I’m not going to print a bunch of graphs for all of them. I’m just going to pick Dan Haren, since as an Oakland A he is in the Rangers’ division. The shapes on the graphs for the other pitchers are slightly different, but the conclusions are the same for all of them.
To remind you of what I was looking at, I took all the plays that Gameday had full data on, and sorted them by result type. I then took the results that were affected by the batter (where he swung at the ball, and either fouled it, put it in play, or missed), and began charting them. Above is Haren’s overall chart, showing where each pitch crossed the strike zone (from the catcher’s perspective). Click on it to enlarge, but you won’t see much more than I see, which is a seemingly random group of dots. The cloud as a whole has a shape, running from up and left to down and right, but there are no specific clusters of the different kinds of results.
Here are foul balls broken out by themselves. Again, up left to down right, but nothing else interesting.
The In Play, Out result shows the same thing as the foul result did. I can’t see a difference.
The In Play, No Out (basically hits) result is the same, although there might seem to be a slight trend to it being a little more horizontal than the others.
In Play, Runs result. Hardly even worth doing, because the dataset is so small, but I think if you imagine really hard you can see it is like the one above.
Last, and surprisingly, not least. After looking at the graphs, I went back to the data. I divided the strike zone into one-foot by one-foot areas, and started doing some counting. I wanted to see how homogeneous the data really is. I took the half dozen zones that actually had a reasonable number of data points in them (the smallest had 42 and the largest 159), split them out by result type, and looked at the percentages. If my theory that the result is unrelated to the location of the pitch is true, then each one-foot square zone should show a percentage breakdown similar to the overall total.
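For anyone who wants to try the same kind of count, here is a minimal sketch in Python. It is not the code I ran, just one way to do it. It assumes each swung-at pitch is a tuple of horizontal position, vertical position (both in feet) and a result label. The names `zone_of` and `zone_breakdown`, the result labels, the `min_count` threshold and the sample pitches are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

def zone_of(px, pz):
    """Return the one-foot square holding (px, pz), keyed by its left and bottom edge in feet."""
    return (math.floor(px), math.floor(pz))

def shares(counter):
    """Convert raw counts per result type into fractions of the total."""
    total = sum(counter.values())
    return {result: n / total for result, n in counter.items()}

def zone_breakdown(pitches, min_count=40):
    """Return (overall shares, shares per zone) for zones with at least min_count pitches.

    min_count is an assumed cutoff for "a reasonable number of data points".
    """
    overall = Counter()
    by_zone = defaultdict(Counter)
    for px, pz, result in pitches:
        overall[result] += 1
        by_zone[zone_of(px, pz)][result] += 1
    zones = {
        zone: shares(counts)
        for zone, counts in by_zone.items()
        if sum(counts.values()) >= min_count
    }
    return shares(overall), zones

# Made-up pitches, only to show the shape of the input; not real Gameday data.
sample = [
    (-0.4, 2.5, "foul"), (-0.2, 2.1, "in_play_out"),
    (-0.9, 2.8, "foul"), (-0.1, 2.9, "swinging_strike"),
    (0.3, 1.5, "swinging_strike"), (0.6, 1.2, "swinging_strike"),
    (0.8, 1.9, "foul"), (0.5, 1.1, "in_play_out"),
]

overall, zones = zone_breakdown(sample, min_count=4)
print("overall", {r: round(p, 2) for r, p in overall.items()})
for zone, pct in sorted(zones.items()):
    print(zone, {r: round(p, 2) for r, p in pct.items()})
```

If location really has no effect, every zone's breakdown should sit close to the overall one, give or take the noise you would expect from zones with only a few dozen pitches in them.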
The result of that test surprised me a little. As it turned out, the zones were almost identical to the overall, except in two places. In one zone, inside to a right-handed batter (to be precise, -2 feet to -1 feet horizontally, 2 feet to 3 feet vertically), the batter is more likely to put the ball in play and make an out, about a third more often than elsewhere in the strike zone. I attribute that to a batter getting busted inside, probably hitting one off the handle to third or short. The other place, which had by far the biggest difference, was down and away to a right-handed batter (0 to 1 feet horizontally, 1 foot to 2 feet vertically), basically the outside half of the plate and down. In this zone a batter was about two and a half times more likely to swing and miss, which I think you see a lot: the batter reaching to hit a pitch outside on a two-strike count and striking out.
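To put a number on "a third more" or "two and a half times", one simple approach is to divide a result's share within a zone by its share overall. The sketch below does that. The function name `relative_rate` and all the counts are invented for illustration; they are not Haren's numbers.

```python
def relative_rate(zone_counts, overall_counts, result):
    """Return how many times more (or less) common a result is in a zone than overall."""
    zone_share = zone_counts.get(result, 0) / sum(zone_counts.values())
    overall_share = overall_counts[result] / sum(overall_counts.values())
    return zone_share / overall_share

# Hypothetical counts of swung-at pitches by result type (made up, not real data).
overall_counts = {
    "foul": 400, "in_play_out": 300, "in_play_no_out": 100,
    "in_play_runs": 50, "swinging_strike": 150,
}
zone_counts = {
    "foul": 30, "in_play_out": 25, "in_play_no_out": 10,
    "in_play_runs": 5, "swinging_strike": 30,
}

for result in overall_counts:
    ratio = relative_rate(zone_counts, overall_counts, result)
    print(f"{result}: {ratio:.2f}x the overall rate")
```

A ratio near 1.0 means the zone looks like everywhere else. With zones this small a ratio can drift from 1.0 by chance alone, so a big ratio on a few dozen pitches is a hint, not proof.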
Here’s where it gets interesting though. Of the five pitchers I listed, all were right-handers except Jarrod Washburn. All showed the same pattern except Washburn. He showed the same result on the middle inside pitch, but his swing-and-miss zone was not down and away; it was what would be down and in to a right-handed batter. I don’t know enough to explain this. The Rangers don’t have any left-handed starters, so I don’t have much experience analyzing their pitches, and I don’t know if it’s due to Washburn facing a different percentage of left-handed batters, or some other function of being left-handed. If that were the case, wouldn’t he also show a different pattern on the middle inside pitch? I need to look at more data to say anything definitive.
So, an interesting analysis in the end. It adds to my previous analysis of Loe, and confirms those results, that location within the strike zone does not affect the outcome of the pitch, from the pitcher’s point of view. One of the things I recalled during the day today was seeing the hot and cold zone charts of various batters. I don’t remember seeing them for pitchers, but that doesn’t mean they don’t exist. Or, maybe, they don’t exist because they don’t show anything. Since there are hot and cold zones for hitters, maybe what we are seeing is the pitcher pitching around them – the pitcher ends up being random because he faces a large variety of hitters, who have different hot zones. The aggregate of the hot and cold zones for batters could end up making an amorphous blob for pitchers.
Thus, my next task will be to look at some batters, and see if I can discern their hot and cold zones. If I can, then I think I could conclude that my analysis here is right, and that the outcome of the pitch is decided more by the batter than the pitcher, at least when it comes to location. And then I could say that the pitcher is involved in other ways, either by speed, or how much the pitch drops, or the break on the pitch, and it is those things that matter from the pitcher’s perspective. But right now, I’m looking at the old adage that says that anyone will hit a pitch grooved down the middle of the plate, and the data shows that no, they won’t. But you try telling a pitcher that.