In my messing around with the Enhanced Gameday data, I’ve produced some weird and wonderful information about the Rangers rotation. Today is definitely going to be on the weird side. I’ve been working on an algorithm to automatically work out what type of pitch any particular pitch is, and I’ve slowly gathered some stuff on it. Right now I have it working fairly well manually, but I want to be able to have the computer tell me what kind of pitch it is, rather than me crunching numbers and getting a result. What I have is nowhere near presentable as a whole, but still, as I’ve gone through it I’ve found that some of the parts are interesting. For example, in the last couple of days I’ve been digging deeply into the break angle and break length fields in the data, comparing different pitchers and pitch types, and I think they are a very useful factor. I have been able to use the angle and length in my algorithm, as another input to split some of the clusters of pitches apart. You’ll recall from some of the previous articles I wrote that each pitcher has some big clumps, and they’re not very easy to separate. Break angle and length have helped add another dimension.
Today is a very preliminary presentation of some of the data, just to throw it out there and show you what I’ve been doing. You’ll see charts of break angle vs break length. Check out comment #18 in this blog to see a description of each from a Gameday employee. Basically the angle is how far it breaks toward or away from the batter, and the length is how much it breaks. Not very intuitive I know. Read the comment for a better answer. Angle is measured in degrees, and the break length is in inches.
One caveat: In recent weeks MLB has been experimenting with some of the calculations and measurements. In particular, they have moved the release point measurement back and forth. In the early months, they were measuring the release point at 55 feet from the plate. During June, they started measuring it at 40 feet, then later moved it to 50 feet. This has caused some of the data to look significantly different to what has been presented before. For example, I had previously measured the different Ranger pitchers’ release point zones as areas measuring x by z, with x and z varying apparently based on how well the pitcher was pitching (smaller = better). When they moved the measurement to 40 feet, everyone’s zone suddenly grew significantly, as you would expect: measure the pitches just after they leave the hand and they’re going to be pretty close together, but closer to the plate they will naturally drift apart. This is not a problem per se, it’s just something to be aware of as we look at the data. In today’s case, I can tell you that everybody except Padilla is being shown with a release point at 50 feet. Padilla is at 55 because he hasn’t pitched in the Ballpark since they moved to 50, and although he had a game measured at 40 feet, I didn’t want to use that. 50 and 55 are close together, but 40 is quite different, and it tended to reduce the break length, which once you see Padilla’s chart would actually work at bringing him closer to everyone else.
Okay, let’s take a look at a picture. We’ll start with Kevin Millwood on July 8:
What the heck is that, I hear you ask. I told you it was going to look weird. Okay, with a lot of visualization going on, and realizing this is entirely not correct either mathematically or linguistically, here’s what you see: The (0,0) point at the top center is a point kind of like the release point, but not really. Imagine Millwood’s hand is at (0,0), releasing the ball. If it was dead straight, it would stay at (0,0) and you wouldn’t see anything (technically, you’d see a dot, or the line end on). The lines therefore show pitches at different angles, and with different amounts of break on them. Confused? You will be.
This chart is not strictly accurate in terms of where the ball is going (not at all, in fact). If I understand it right, the pitch will actually leave the pitcher’s hand, curve up and away, and come back down hitting the strike zone somewhere. Since I can’t really show that curve without some 3-D modelling, I chose to show it this way. You’re seeing the pitch drop a certain distance, and go at a certain angle. Theoretically you’re seeing where the pitch is going. In reality I fudged and flipped the chart (I think). For example, for a pitch to curve from the pitcher’s hand, it would really go up and to the right or left, and curve back to the (0,0) point. I wanted to show it down because that’s what it’s really doing, dropping down. And I wanted to show it this way because it kind of shows what a pitch is doing in relation to the batter. For example, the long lines going down and to the left are curves, and in reality when Millwood throws them, they do come in (left) on a right-hand batter. His fastballs move away, or to the right. To show it pointing in the correct direction would show it on the chart in the opposite direction of what you would expect.
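For anyone who wants to draw a similar chart outside Excel, here is a minimal sketch of how one line could be computed. It assumes the break angle is in degrees measured from straight down, with positive angles plotted to the right, and the break length is in inches; that sign convention and the function name `break_endpoint` are my own assumptions for illustration, not something taken from the Gameday data, so the chart may need flipping to match the ones described here.

```python
import math

def break_endpoint(angle_deg, length_in):
    """Return the (x, y) end of a line that starts at (0, 0).

    Assumption: angle_deg is measured from straight down,
    positive to the right. length_in is the line length in inches.
    """
    theta = math.radians(angle_deg)
    x = length_in * math.sin(theta)   # sideways part of the line
    y = -length_in * math.cos(theta)  # negative because the line is drawn downward
    return x, y

# Hypothetical pitches as (break angle, break length); not real data.
for angle, length in [(0.0, 10.0), (-30.0, 12.0), (20.0, 5.0)]:
    x, y = break_endpoint(angle, length)
    print(f"angle={angle:6.1f} length={length:5.1f} -> x={x:6.2f} y={y:6.2f}")
```

Each pitch then becomes a line segment from (0,0) to the returned point, which is all the chart is.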
[Caveat: I fully expect someone much smarter than me to point out the complete fallacy of these charts, and how I got everything upside down and backwards. Go right ahead, but please be gentle.]
Now today I’m not going to analyze what every pitch is. I’ve already told you the long ones are curves, and that’s true for most every pitcher you’ll see. You can see Millwood has a few long curves, and some clusters of shorter pitches, some to the left and others to the right. In a future article I’ll try and decipher what each pitch is.
Oh yeah, the colors. Like the rainbow effect? I’m charting these in Excel, and at first I had them all printing black, but a) it just looked like a big blob, and b) any time a couple of pitches had the same angle, the shorter one would disappear behind the longer one. Even pitches close to each other would tend to cancel out. By doing the lines in colors, you can actually begin to distinguish the shorter ones. Plus it looks pretty neat.
As a very general guideline, I found that about 10% of pitches will be duplicates. Meaning that for any 100 pitches thrown, there will be about 10 pairs with the same break angle, so one of them will hide the other.
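A rough way to check that duplicate rate on any one start is to group the pitches by break angle and count how many would sit behind a longer line. This is only a sketch: the function name `count_hidden`, the `(angle, length)` tuple layout, and rounding to one decimal place are assumptions for illustration, and the sample numbers are made up.

```python
from collections import Counter

def count_hidden(pitches, decimals=1):
    """Count pitches that share a break angle with another pitch.

    pitches: iterable of (break_angle, break_length) tuples.
    For each angle, every pitch except one is treated as hidden,
    since only the longest line at that angle is fully visible.
    """
    counts = Counter(round(angle, decimals) for angle, _length in pitches)
    return sum(n - 1 for n in counts.values())

# Made-up sample: two pitches at 10.0 degrees, three at 3.2 degrees.
sample = [(10.0, 5.0), (10.0, 8.0), (-20.5, 12.0),
          (3.2, 4.0), (3.2, 4.5), (3.2, 6.0)]
hidden = count_hidden(sample)
print(f"{hidden} of {len(sample)} pitches would be hidden")
```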
The day before Millwood, Brandon McCarthy went against the Orioles. Pretty similar stuff. His curves aren’t breaking quite as long as Millwood’s did. He doesn’t have a short left cluster like Millwood, and his right side is much more spread out, with a little more break.
I should point out that all of these charts have the same scale, except our usual suspect Mr Padilla, who seems to be off the scale in just about everything. Note that all the numbers are inches, meaning the scales go 6 inches each side and 18 inches down. The two axes are not to scale: the horizontal axis is close to 50% bigger than the vertical. I could change it so they match, but I like them the way they are, and when you change them things tend to squish together too much (squish being the technical term).
Kameron Loe on July 6 had another good start. His pitch angles show curves, not as long as McCarthy’s or Millwood’s, but his other pitches to the right appear to be breaking more. As a sinkerballer, this might be a clue. He also has a much clearer delineation between the left and right sides; compare that to Millwood and McCarthy, who filled in the space directly below (0,0) with a few pitches.
On July 5 it was Tejeda’s turn. No, there is nothing wrong with the data. As we’ve discussed before, Tejeda doesn’t have a curve, he only has two pitches, which are a fastball and a slider. What’s worse, they show absolutely no movement in toward a right-handed batter. Although he has pretty good coverage of the right side, most of the pitches are also breaking a similar distance. So batters are keying on a fast pitch with not much break, and always moving away from the right-hand batter. Yet more evidence as to why he has been so hittable.
Jamey Wright finally makes it into one of my charts, since he’s joined the rotation while Padilla is on the DL. Similar charts to the others, although his three areas are clearer: curves bottom left, another pitch going straight down and slightly left, and the rest going away to the right. He seems to be throwing a lot more pitches that break almost vertically than anyone else.
And finally Padilla:
Remember I said his scale was different? It’s the curves. The lowest anyone else went was Millwood, just touching the 16 inch down line. Padilla goes all the way to 25 inches. His problem though is a big delineation between them. The curve is down and left, everything else is to the right, except one aberrant pitch going straight down. There do appear to be three pitches on the right though: a short one, and two long groups with a separation in the middle. As I said at the beginning, this is helpful information in determining the pitch clusters.
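The kind of separation described for Padilla’s right-side pitches can be picked out mechanically by sorting one measurement and cutting wherever two neighbours sit further apart than some threshold. The sketch below is not the pitch-type algorithm mentioned in this article, just a simple gap-splitting illustration; the function name `split_at_gaps`, the threshold, and the sample values are all hypothetical.

```python
def split_at_gaps(values, min_gap):
    """Sort values and start a new group wherever the jump
    between neighbours is larger than min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > min_gap:
            groups.append([cur])    # big jump: begin a new cluster
        else:
            groups[-1].append(cur)  # small jump: same cluster
    return groups

# Made-up break angles (degrees) for pitches on one side of the chart.
angles = [10.0, 11.0, 12.0, 30.0, 31.0, 33.0]
for group in split_at_gaps(angles, min_gap=5.0):
    print(group)
```

The same function works on break length instead of angle; which measurement separates a given pitcher’s clusters best is something to judge from the chart.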
So that’s it. Some pretty pictures to sum up what I’ve been doing lately. All this is useful in creating an algorithm to determine pitch types. Of course, I’m still really only looking at the Ranger pitchers, and as I’ve mentioned before they’re all fairly similar, which is a disadvantage to the team. I will branch out and find some lefties to look at, and maybe some pitchers who throw other types of pitches, and see what I come up with. In the meantime, there’s still two more days to kill until the second half begins, and we can see how the Rangers can continue the season.