Major thanks to @LinesInDirt (Andre) for their assistance in writing this article!
What is baseball? It’s a very silly sport – the clown of the major sports world – but what is baseball really made of? Sadly, it isn’t whoopie cushions, but it is (Savannah) bananas. Nerds like me have been a part of the sport for as long as it has been played, and people have been data-crunching the game – and we aren’t slowing down. We’ve voided our warranty and removed the brake pedal – forever accelerating.
Baseball analytics has accelerated to the point where it is as if we are ignoring the gravity that holds baseball together. Small details like seam shifting are now measurable. The tools that measure things like this were developed in some sort of rocket surgery lab where they grow pitchers now. Let’s zoom out and put ourselves in space and ponder what the Pakleds might ask: “What makes baseball ‘go’?”
The things that make baseball “go” are the very things that make us ‘go’ to the game. The crack of the bat, a diving catch, or a close play at the plate: All the action that leads to ✨drama✨. Can we measure this ✨drama✨ in a historical context? Can we really measure action? I will certainly give it a ‘go’.
Data Lore
I started to try to explore this all the way back in May 2021. Frankly, the amount of trial and error required was bordering on insanity.
My goal is to compare as many seasons as possible from various pro baseball leagues around the world. For my dataset to work, the earliest I can begin is MLB in 1920. Going back further isn’t possible, as the data I need isn’t readily available and what is out there isn’t as reliable.
I had to limit myself to the data that could compare, for example, the 1920 MLB season and the 2020 CPBL season. With these taken into account, the major leagues available to me are MLB, NPB, KBO, CPBL, and ABL.
If I included things such as whiffs, barrel rates, and the like… that would defeat the purpose of wanting a historical dataset. Perhaps someone can make a formula going forward that includes such data, but you won’t find any of that here.
What is and isn’t included
While we are here talking about stuff we can’t measure, this formula won’t account for ANY situational moments. Going back to the 1920s for that sort of thing is, excuse my language, a fuckload of data. Finding all of that data is just an unreasonable amount of work for a single person to do. Moreover, I want this formula to be situationally neutral.
For my formula, I need the following raw data: PA, AB, R, 1B, 2B, 3B, HR, SB, CS, GDP, SH, and SF. That is why I obtained data from Baseball-Reference.
Considering the required stats, this makes 1920 a good starting point. It is the year the dead-ball era gives way to the live-ball era. This makes it much more recognizable and much easier to compare to the modern game (even though the game has still changed a lot since the 1920s as well). An added bonus is that it is the first year CS is measured.
Incomplete Data
An important question: why does my data exclude the Negro Leagues? I would include them, but for the purposes of the formula, they have CS AND SF data missing – as well as up to 25% of data not accounted for. Also, the way my data is organized in my model doesn’t easily enable me to include the Negro Leagues and MLB together (as I first put together my historical league data in 2021 – which is before MLB decided to include Negro Leagues with MLB statistics). The fact that data from the Negro Leagues is incomplete makes it more difficult to get reliable results from the model.
Considering that the purpose of this formula and this article is to compare seasons across history from different leagues, it feels wrong to compare under those circumstances. When I do have a public site that hosts my data, the Negro Leagues will be included then.
Ready… Set…
ACTION needs to mean something, I guess? Let’s just say it stands for Analyzing Competitive Trends In On-field Numbers. Now let’s dig in to see what ACTION is all about!
ACTION = Potential Outs Percentage x ( (( Runs / Plate Appearances ) x 4 ) + (( Singles x 2 ) / Plate Appearances ) + (( Doubles x 4 ) / Plate Appearances ) + (( Triples x 5 ) / Plate Appearances ) + (( Home Runs x 10 ) / Plate Appearances ) + ((( Stolen Bases x 2 + Caught Stealing ) / Plate Appearances ) x Stolen Base Percentage ) + ( Potential Outs / Plate Appearances ) )
Just in case the frequent use of plate appearances as a divisor is a little confusing, it is relatively simple to explain: We simply need to see how often all those positive (and potentially positive) ACTION contributors happen per plate appearance. Now we can dig into what makes this formula work.
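To make the formula concrete, here is a minimal Python sketch of it. It assumes the Potential Outs Percentage multiplies the entire bracket (the run, hit, stolen-base, and potential-out terms), which is one reading of the formula as printed. The `SeasonTotals` class, its field names, and the sample numbers are made up for illustration and are not real league totals. The potential-outs terms (POO and POOP) are the ones defined later in this article.

```python
from dataclasses import dataclass


@dataclass
class SeasonTotals:
    """League-wide season totals (hypothetical container for illustration)."""
    pa: int         # plate appearances
    ab: int         # at-bats
    runs: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    sb: int         # stolen bases
    cs: int         # caught stealing
    babip: float    # batting average on balls in play, taken from the data source


def action_score(s: SeasonTotals) -> float:
    """ACTION under the assumption that POOP scales the whole bracket."""
    hits = s.singles + s.doubles + s.triples + s.home_runs
    poo = hits / s.babip - hits      # potential outs
    poop = poo / s.ab                # potential outs percentage

    # Weighted runs and hits, each expressed per plate appearance.
    offense = (4 * s.runs + 2 * s.singles + 4 * s.doubles
               + 5 * s.triples + 10 * s.home_runs) / s.pa

    # Stolen base score: weighted attempts per PA, scaled by success rate.
    attempts = s.sb + s.cs
    sb_pct = s.sb / attempts if attempts else 0.0
    sbs = (2 * s.sb + s.cs) / s.pa * sb_pct

    # ASSUMPTION: POOP multiplies offense, SBS, and the unweighted out rate.
    return poop * (offense + sbs + poo / s.pa)


# Made-up totals, only to show the call; not a real season.
sample = SeasonTotals(pa=1000, ab=900, runs=100, singles=150, doubles=40,
                      triples=5, home_runs=25, sb=20, cs=5, babip=0.300)
print(f"ACTION = {action_score(sample):.3f}")
```

Every term inside the bracket is a per-plate-appearance rate, so the score does not grow just because a league played more games.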
It All Depends…
Let’s address fieldouts, the first part of the formula. Well… we have a problem. Going back that far across that many leagues poses an issue with getting exact values. But don’t worry, I figured out a way to get around that. I’m a 100 IQ genius: I just POO’d. I apologize for the lax joke. POO stands for POtential Outs. So now if you ever wondered what a POO formula looks like, here it is:
POO = (Hits / BABIP) – Hits
This does a pretty good job of reverse-engineering how many fielding outs there were, albeit with some noise, as it includes errors (those could have been potential outs). But exactly how much noise? Let’s compare it to PO while removing strikeouts and throwing in errors. That way we know we are attempting to measure the same thing.
MLB in 2024:
PO = 129,349
E = 2,594
K = 41,197
So if we take PO, subtract the K’s, and add in the errors, we get:
( 129,349 PO – 41,197 K ) + 2,594 E = 90,746 PO-K+E
How does that compare with POO?
( 39,823 H / *.291 BABIP ) – 39,823 H = 97,244 POO
*Exact BABIP value is .2905375
POO lands about 7% above that count, which I think we can agree is close enough for a league-wide estimate. It’s also useful to consider that given slightly different conditions, many of these ‘outs’ could have been outright hits. I am not explicitly rewarding a hypothetical scenario here, but it does tell us the value of a ball put into play.
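For anyone who wants to reproduce the POO number, here is a small Python sketch of the formula above. The function name `potential_outs` is made up for illustration; the inputs are the MLB 2024 hits and the exact BABIP quoted in this article.

```python
def potential_outs(hits: float, babip: float) -> float:
    """POO = (Hits / BABIP) - Hits."""
    if not 0 < babip <= 1:
        raise ValueError("BABIP must be greater than 0 and at most 1")
    return hits / babip - hits


# MLB 2024 inputs quoted in the article (exact BABIP, not the rounded .291).
poo_2024 = potential_outs(39_823, 0.2905375)
print(round(poo_2024))  # 97244
```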
But we can shine this POO even more. Let’s POOP: Potential Outs Percentage!
POOP = ((Hits / BABIP) – Hits) / At Bats
Why POOP?
What is even the point of POOPing? I have often wondered this myself, but let’s try to not let this article go down the toilet.
In the ACTION formula, we already derived aspects of offense and its contributions to the game. So all we are missing is the defense. After all, most of the game is defensive opportunities. That’s why we need POOP.
Let’s analyze examples of this… extraction. Here are some POOP scores:
MLB 2024: (( 39,823 H / .291 BABIP ) – 39,823 H ) / 163,687 AB = 59.4% POOP
MLB 1950: (( 22,559 H / .280 BABIP ) – 22,559 H ) / 84,823 AB = 68.4% POOP
NPB 1971: (( 12,436 H / .256 BABIP ) – 12,436 H ) / 51,425 AB = *70.1% POOP
*Highest POOP on record.
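The same extraction can be sketched in a few lines of Python. The function name `poop` and the tuple layout are made up for illustration; the inputs come from the three examples above. Note that MLB 2024 uses the exact BABIP quoted earlier, while the other two seasons only have three-decimal BABIPs available here, so their results may differ from the listed figures by a few tenths of a point.

```python
def poop(hits: float, babip: float, at_bats: float) -> float:
    """POOP = ((Hits / BABIP) - Hits) / At Bats, returned as a fraction."""
    return (hits / babip - hits) / at_bats


# (label, hits, BABIP, at-bats) as quoted in the article.
seasons = [
    ("MLB 2024", 39_823, 0.2905375, 163_687),  # exact BABIP
    ("MLB 1950", 22_559, 0.280, 84_823),       # rounded BABIP
    ("NPB 1971", 12_436, 0.256, 51_425),       # rounded BABIP
]

for label, hits, babip, at_bats in seasons:
    print(f"{label}: {poop(hits, babip, at_bats):.1%} POOP")
```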
Why not use an already established stat like BIP (balls in play)? The reason is simply that BIP doesn’t tell us how many outs were converted at all. POO/POOP doesn’t tell us the precise number either but as I showed, it’s more than close enough. Plus, if you add all the hits we’ve already recorded and then add BIP, you’d essentially be double-dipping your balls… in play.
Defense As Action
You may also be scoffing at the idea that defense is action. Even though a defensive opportunity is most likely going to turn into an out, it’s a far more dynamic out than a strikeout. Additionally, the mere act of a defensive play comes with the potential of the ball landing for a hit and causing more defensive opportunities. Add errors on top – and we have a whole new ballgame of action unfolding before our very eyes. By comparison, a strikeout is typically the end of the action in and of itself.
I’ve already gone over POO before, but let’s roll around in it one more time and go over why it’s used as a positive contributor as shown above: Outs are bad for generating the ‘potential’ for more action, but we have to reward action generated from defense – plus the potential that they could have been hits – or caused a batter to reach base.
I recognize that higher POOP means generally weaker contact – and the data I have certainly shows that trend. As weak as the contact likely is, it still has to be taken into account. Plus, high POOP doesn’t always mean weak contact, more on that later.
Valuing Potential
That ‘potential’ for more action, and the defensive plays it generates, is why I take POO and divide it by at-bats to get POOP, the rate at which at-bats turn into field outs. Separately, I divide POO by plate appearances in the formula to reflect how often field outs occur as a percentage. That term has no weight attached. This is done to avoid overvaluing those outs since they are absolutely not generating further action once complete.
That is the reason why all the offensive contributors in the formula are multiplied by POOP as well. It’s ALL the positive actions as a percentage weighed against all the field outs as a percentage. This is so that it rewards hits and all actions that generate more potential action – all while not discounting outs as potential action generators as well. Phew! I’m all poop’d out after all that!
Babying the Formula
I try to measure ACTION with countable events that contribute positively to the potential for more action. The following events meet the definition above: Hits, extra-base hits, total bases, runs, home runs, defensive opportunities/outs, and stolen base attempts with stolen base percentage factored in. These are all ACTION enhancers and increase the probability that more action will occur.
You may notice strikeouts and walks are missing. Those are countable! Strikeouts are fun! That is true, but they also tend to curtail the potential for more action. A ‘measly’ fieldout by comparison, at minimum, has 2 countable events:
- The ball put into play, and
- The fielding opportunity it provides.
A fieldout also carries with it the potential for more action, such as a throwing error which causes a runner to advance or score. It also carries the potential to have a different outcome as that fieldout could have potentially been a hit or another variation.
So the logic follows that it is even better if a batter reaches base regardless of how they get on base. They can steal, be doubled off, or score a run. That’s the delicious baseball stuff we want. So while strikeouts can be thrilling by themselves, in most situations they are neutral for action as they do not contribute positively to the potential for more action.
Strikeouts and Walks
I am aware strikeouts can still generate baserunners via a dropped 3rd strike, but they are already reflected indirectly in the formula as providing more plate appearances which potentially could be a good thing – but only if the batter behind them makes it count. As we will later see, in higher ACTION environments, this is certainly more likely.
As for walks, walks are a sad counterpoint to strikeouts. While they do provide offense, they provide that offense in the most boring manner. They too, on occasion, can be exciting. A walkoff walk sure is fun! But please remember that my formula doesn’t account for those situations.
Even if we took those situations into account, in an overwhelming majority of situations, walks are just ‘balls’. Walks only positively contribute to action by what happens after a runner reaches base in most circumstances – and I already account for that. I also think it’s safe to assume that if asked, most people would rather watch a hit happen compared to a walk.
Stolen Bases
Let’s dive into stolen bases now. A successful stolen base is rewarded with a weight of 2 – with caught stealing receiving no extra weight (each one counts once). SBs receive a weight of 2 because they initiate action where there otherwise would be none – plus the reward for advancing a base. In turn, caught stealing is only “rewarded” for the initiation of action. However, it’s important to note that CS is horrendous for the development of further action.
This is why the rate of stolen base attempts is then multiplied by SB%. This rewards seasons for stealing bases at a more successful rate, while not totally discounting seasons where teams attempted steals a little more than they should have.
Later on, when we get into analyzing seasons by ACTION with tables, you will see in my data that I have a stat called SBS (Stolen Base Score) which is a reference to this exact formula:
Stolen Base Score = ((( Stolen Base x 2 weight + Caught Stealing) / Plate Appearances ) x Stolen Base Percentage )
Here are some example SBS scores:
MLB 2024: ((( 3,617 SB x 2 weight + 961 CS ) / 182,449 PA ) x 79% SB% ) = 3.55% SBS
MLB 1987: ((( 3,585 SB x 2 weight + 1,529 CS ) / 161,922 PA ) x 70.1% SB% ) = 3.77% SBS
KBO 1982: ((( 699 SB x 2 weight + 311 CS ) / 18,208 PA ) x 69.2% SB% ) = *6.5% SBS
*For KBO in 1982, that’s 1,010 steal attempts in only 480 games – an absurd amount – and it is the highest on record!
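Here is a short Python sketch of the Stolen Base Score, using the three seasons above as inputs. The function name `stolen_base_score` is made up for illustration, and SB% is computed as SB / (SB + CS) rather than typed in as a rounded percentage, so the results should land very close to the 3.55%, 3.77%, and 6.5% listed above.

```python
def stolen_base_score(sb: int, cs: int, pa: int) -> float:
    """SBS = ((SB x 2 + CS) / PA) x SB%, returned as a fraction."""
    attempts = sb + cs
    if attempts == 0:
        return 0.0  # no attempts means no stolen-base action to reward
    sb_pct = sb / attempts
    return (2 * sb + cs) / pa * sb_pct


# (label, SB, CS, PA) as quoted in the article.
examples = [
    ("MLB 2024", 3_617, 961, 182_449),
    ("MLB 1987", 3_585, 1_529, 161_922),
    ("KBO 1982", 699, 311, 18_208),
]

for label, sb, cs, pa in examples:
    print(f"{label}: {stolen_base_score(sb, cs, pa):.2%} SBS")
```

Multiplying by SB% means two leagues with the same number of attempts per plate appearance can still score differently if one of them gets thrown out more often.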
Runs
Before we get to hits, I need to give a quick ‘rundown’ of runs since they are very simple to explain. A run is when a baserunner touches all 4 bases in succession. So that is why I have given all runs a weight value of 4. I think we can agree that runs are good – until you split your pants. So let’s skip on over to hits now.
♫ Hit Me Baby One More Time ♫
We’re finally getting into the swing of things! Hits – do they even lift, bro? If they do lift, I have added weights to them. Hits are the thing that ‘slap’ in baseball, so to speak. So let’s slap this together. Here are the weights and the logic behind the weights:
1B = 2
Hit into play (+1) + Defensive opportunity (+1)
2B = 4
Extra-base hit (+2 for 2 bases) + Hit into play (+1) + Defensive opportunity (+1)
3B = 5
Extra-base hit (+3 for 3 bases) + Hit into play (+1) + Defensive opportunity (+1)
HR = 10
Extra-base hit (+4 for 4 bases) + Hit into play (+1) + Defensive opportunity (+1) + Guaranteed Run (+4)
But wait, home runs are hit into play – and are defensive opportunities – despite leaving the field of play (in most circumstances)? Yes! That is technically correct. And that’s the best kind of correct. So is BABIP wrong? No, we’re just measuring different things.
Think about the action of the ball in flight – the anticipation of whether or not it will clear the fence. Remember, it’s only a home run if it’s not caught before it leaves the field of play. That is why I am deciding to reward home runs for that anticipation. Seeing the outfielder running after the ball and watching it go over the wall adds to the excitement of a home run. It’s both hit into play AND a defensive opportunity.
I’m also rewarding them for being a guaranteed run (with extremely rare exceptions). This means that, in a way, I am recounting some runs. Just consider it the excitement bonus for scoring.
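The weights can also be rebuilt from their components, which makes the logic easy to check. This is only a sketch: the constant and function names are made up for illustration, and it encodes the breakdown listed above, where singles get no base bonus and only extra-base hits earn +1 per base.

```python
HIT_INTO_PLAY = 1
DEFENSIVE_OPPORTUNITY = 1
GUARANTEED_RUN = 4  # same value as the weight given to a run


def hit_weight(bases: int) -> int:
    """Weight of a hit worth `bases` total bases (1 = single, 4 = home run)."""
    if bases not in (1, 2, 3, 4):
        raise ValueError("a hit is worth 1 to 4 bases")
    weight = HIT_INTO_PLAY + DEFENSIVE_OPPORTUNITY
    if bases >= 2:
        weight += bases           # extra-base bonus: +1 per base
    if bases == 4:
        weight += GUARANTEED_RUN  # a home run always scores the batter
    return weight


weights = {name: hit_weight(bases)
           for name, bases in [("1B", 1), ("2B", 2), ("3B", 3), ("HR", 4)]}
print(weights)  # {'1B': 2, '2B': 4, '3B': 5, 'HR': 10}
```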
If you have any issues with my reasoning, please file your formal complaint by filling out the 56 required forms through the Central Bureaucracy.
ACTION Hero
While this formula can be used for analyzing games (which would be interesting to see!), I have only collected data for seasons as a whole as of now. Before getting into specific seasons, let’s go over each league’s averages and related data.
[Tables: ACTION average, median, and average deviation by league]
With all leagues combined, the average ACTION is 1.201, with a median of 1.203 and an average deviation of 0.057.
MLB’s average ACTION score is the highest, standing at 1.233, NPB is 2nd at 1.201, KBO in 3rd with 1.188, ABL is 4th at 1.162, and CPBL is last at 1.154. The average deviation is useful to observe here. It means that a score .057 below or above the average is notable, but high scores are NOT a stamp of ‘better’ baseball – but low scores absolutely DO tell us that something needs to change. More on this point later.
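For readers who want to run the same summary on their own list of season scores, here is a minimal Python sketch. It assumes "average deviation" means the mean absolute deviation around the average, which is one common definition and may not be exactly how the numbers above were produced. The scores in the example are made up and are not real ACTION values.

```python
from statistics import mean, median


def average_deviation(values: list[float]) -> float:
    """Mean absolute deviation around the mean (assumed definition)."""
    center = mean(values)
    return mean([abs(v - center) for v in values])


# Made-up season scores, only to show the calls.
scores = [1.10, 1.20, 1.25, 1.30, 1.15]

print(f"average:           {mean(scores):.3f}")
print(f"median:            {median(scores):.3f}")
print(f"average deviation: {average_deviation(scores):.3f}")
```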
Top 10
I’ve beaten around the wrong side of the bush for long enough. Based on the formula, which seasons had the most ACTION? Let’s look at the top 10!
ACTION+ is normalized where 100 is average – meaning that 116.1 = 16.1% more ACTION compared to the average.
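As a quick sketch of that normalization in Python: the function name `action_plus` is made up for illustration, and which baseline to divide by (the all-league average or a league’s own average) is an assumption you would pick for yourself. The example below uses the all-league average of 1.201 quoted earlier.

```python
def action_plus(action: float, baseline: float) -> float:
    """Scale an ACTION score so that the baseline average equals 100."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100 * action / baseline


ALL_LEAGUE_AVERAGE = 1.201  # all-league average ACTION quoted in the article

# A season scoring 16.1% above the baseline maps to an ACTION+ of 116.1.
print(round(action_plus(1.161 * ALL_LEAGUE_AVERAGE, ALL_LEAGUE_AVERAGE), 1))  # 116.1
```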
There are NINE from NPB alone – with a large smattering from 1976 – 1985. This is despite CPBL’s 2016 season (6.73 R/G) and ABL’s 2017 (6.51 R/G) season blowing any season out of the water with runs per game. They still have very much above-average ACTION scores. It’s just that runs are not the only metric that matters.
Understanding Run Environments
Leagues that have seasons of 5+ runs per game may appear chaotic, but they might only be chaotic/action-packed on the scoreboard. And as strange as it sounds to say it – games aren’t played on the scoreboard – only the result of the game is.
Those CPBL and ABL seasons’ runs per game came mostly from the power of home runs. Home runs are awesome, but even when they account for 4% of plate appearances, you still have 96% left of the game to cover. Thus, both of those seasons actually have low POOP. In fact, they are in the Bottom 10 for POOP.
Not surprising when you remember how POOP works. Yet those seasons were so ridiculous with scoring, that they are still very much above average for their ACTION scores – so they weren’t exactly punished for their crimes against baseballs (or rather the crime of manufacturing super-juiced baseballs). Hitting home runs at insane rates certainly can still mean a lot of action, but not when it’s the only thing you churn out. There are certainly healthier ways to POOP.
To make this easier to wrap one’s head around: Those CPBL and ABL seasons provided action, just in an imbalanced way. To achieve a truly high ACTION score, almost everything has to be far above average across the league.
Haha 1980 Ball go Brrr
This brings us to NPB in 1980, the highest ACTION score of them all. Look at all that high-octane balance!
Remember that the comparative scores are against NPB’s own history. In this case, 4.56 R/G is way above average for NPB, OPS too – with plenty of POOP and the best HR% in its history. In almost all the metrics that are good for ACTION, NPB did (at minimum) significantly better than average while being near the top in others.
Inaugural Chaos
Let’s compare this to 2nd place, NPB’s 1950 season.
It is only 0.001 off the 1980 season but with a different profile. It gets most of its score through some of the most POOP you’ve ever seen. Even though 1950 only had a slightly above-average OPS, it scored more R/G too. One thing working in the 1950 season’s favor that closes the gap created by the SLG/OPS difference is SBS, a record high for NPB. So NPB’s 1980 and 1950 seasons were clearly both exciting with a ton happening, but how they get there is different.
What a high score tells us
Which would you rather watch? 1980 looks more like my flavor of ACTION. It is more balanced in the sense that the 1980 NPB season achieved its score with more total hit variety – especially home runs. NPB in 1950 does score better than NPB in 1980 for baserunning, and that may matter more to you. However, the difference is relatively minor when taken into totality against all other ACTION contributors. 1980 still scores noticeably above average in baserunning too. But compared to most seasons across history? NPB in 1950 would be a hoot to watch as well – as compared to the average – it too had a more diverse game to offer.
It (generally) holds true that any high ACTION score will likely have more variety – or at the very least – have events happen in sequence more frequently. By the way, how fitting is it that the season with the most ACTION was also ‘The King’ Sadaharu Oh’s final season?
And if you want the top 5 seasons for each league neatly shown, voila!
ACTION Zero
We’re getting into the abyss now. No Hazmat suit can save you now. Trudging through this pit is actually what kick-started this journey for me. ACTION might not be able to tell us what our ideal version of baseball is, but ACTION can certainly tell us when something is clearly wrong. A low ACTION score is good for nobody. Even if you like small-ball – this is where comparatively nothing happens. It’s not small-ball, it’s no-ball.
What I mean by this is that ACTION can still be around average or even above average with balls being put into play enough. You, the fan, still have something to anticipate and expect to see! But with low ACTION, practically nothing is happening in every aspect of the game.
Bottom 10
We start to see the impact of external factors, such as dejuiced baseballs, expanded strike zones, and the general famine of offense that plagues Asian baseball in the 2020s. Of the bottom 10, 7 were in the last four years – including this year’s CPBL and NPB seasons.
NPB’s Modern Deadball Era
Brotha eww…
Zooming into the 2024 NPB season, I can’t go on without mentioning that it was originally on pace to be even worse for ACTION before the NPB All-Star break. The original pace was UNDER 1.000. That is FAR below any other season on record. But after the All-Star break, the league’s HR% exploded. Before the break, it was hovering around 1.30%. By season’s end, it ended up at 1.53%. That boost alone pulled it up… only to still be in last place.
That HR surge is alarming since, much like MLB, the NPB All-Star break is past the halfway point of the season. Coupled with the fact that most teams play in domes, the summer humidity and heat alone cannot explain the sudden increase in home runs.
NPB’s Communication Issues
It’s a bad look for a league that’s already faced the backlash and controversy of not telling anyone about changes to their baseball only 11 years prior in 2013. For the 2024 season, my evidence isn’t a smoking gun that they juiced the balls in response to them being too dead to start the year. The league’s refusal to admit that the original ball at the start of the year wasn’t up to standards – combined with their past controversy – doesn’t help make those concerns go away either.
It’s all the more befuddling because if the change was intentional, I think people would have welcomed it with open arms. The change helped bring the 2024 season up in ACTION to the point where it nearly wasn’t in last place anymore. It’s still condemnable that they didn’t admit that anything was wrong with the baseballs in the first place. This is despite the data clearly showing that something was indeed terribly wrong. Even more condemnable if they intentionally changed the ball again without saying a word to anyone.
Downward Trends
Speaking of trends, here are the last 7 seasons’ ACTION scores for each league, followed by the average ACTION by decade:
I wouldn’t call these trends exactly ‘concerning’ but more inevitable to some degree. Pitching has absolutely outpaced batting progress. Especially in terms of average velocity which has ballooned K%. As of yet, there are not many ideal solutions to slow down pitchers. I’m not making new observations by telling you baseball is changing or has changed. I only hope going down this rabbit hole has brought new insights in how we conceptualize this reality of baseball.
Before we wrap things up – if you want the bottom 5 seasons for each league neatly shown… why? Well, I’m sick in the head too, so I’ll indulge your morbid curiosity.
If you want to dive into this data more for yourself, I am still working on making the data publicly available. As of writing this, I am mostly doing the formula and data work all by myself. Please have patience. 🙏
Closing Thoughts
So what did we learn? I guess we learned a paradox where if you POOP less, you have more of a mess to clean up. In all seriousness, I hope this at least gets people to recognize what they enjoy about the game more – and when leagues should be proactive. Not just in giving us entertainment, but communication. That includes things such as the ball itself. This is because everything in baseball starts and ends with the ball – as it should. But discussions among fans and players about the quality of the ball itself shouldn’t ever start from amongst themselves.
Additionally, by looking at the worst and best seasons for ACTION, hopefully, it has allowed you to think more about what you wish to see and how we can achieve that. I didn’t list any solutions here, but maybe someone else will crack the code. I also can’t tell you what is best for baseball or what to like about baseball. One thing for certain though is that action benefits when it comes from a variety of places in the game.
It ties in with a philosophy of mine. I’m all about maximizing potential – opening as many doors as possible rather than confining oneself to binary choices – especially for the things we love. If you love baseball, extend this philosophy to baseball. And if you ever have your bad baseball days, just say…
“Even at its worst, baseball is still great. I love this silly game.”
This feature was written by guest author Joseph Aylward.