Batting average. ERA. WHIP. OPS.
Every one of those numbers was decided years before you ever logged in. Someone picked the numerator. Someone picked the denominator. You got the leftover.
You can stack them. Weight them. Regress them. You can't change what they're asking.
So we built something different underneath.
Every pitch in an MLB game is a row in our database. Every ball put in play is a row. Each row carries twenty-plus tags — pitch type, velocity bucket, zone region, count, runners, leverage, platoon — and you write the question when you build the formula.
Not when somebody named the stat.
(That's the whole shift. The rest of this post is just examples.)
Why this matters
Most platforms hand you season-level aggregates and call it a day.
Want "swinging-strike rate on 2-strike four-seamers above 95 in the heart of the zone vs lefties"? You're writing SQL against a Statcast pull and praying your join keys line up.
We did that part for you.
Six raw event streams. Each row carries the full context of the moment — pitch type, velocity, location, count, matchup, outcome. Sum and divide your way to any rate you want.
It's the same vocabulary FanGraphs, Pitcher List, and Driveline have been using for a decade — sitting in your formula editor, waiting.
The six raw events
Every MLB game we ingest produces six streams of event rows — every pitch, every plate appearance, every ball in play, every baserunning move, every scorable fielding chance, every half-inning. One row per event.
The two that do almost all the work are pitch_event (one row per pitch) and batted_ball (one row per ball in play). What makes them work is the tags.
A pitch_event row knows the pitch — type, velocity bucket, spin, horizontal and vertical movement, where it crossed the plate (heart / edge / chase / waste), what happened (called strike, swinging strike, foul, in play). And it knows the moment — count, outs, runners, inning group, who was on the mound, who was at the plate, what hand they each throw and bat with.
If a writer at FanGraphs can describe it, you can filter for it.
A batted_ball row knows trajectory (ground ball, line drive, fly ball, popup), how hard the contact was (the exit-velo buckets that matter for barrel reasoning: 95–104, 105+), and — crucially — which way the ball went relative to the batter's handedness. A lefty pulling to right field and a righty pulling to left field both get tagged pull. The handedness math is already done.
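The handedness normalization can be sketched in a few lines. This is an illustration only, not the platform's actual tagging code: the function name, the "L"/"R" encoding, and the simplified three-way field split are all assumptions.

```python
def spray_direction(bat_side: str, field: str) -> str:
    """Tag a batted ball relative to the batter's handedness.

    bat_side: "L" or "R" (hypothetical encoding).
    field: "left", "center", or "right" (a simplified three-way split).
    """
    if bat_side not in ("L", "R"):
        raise ValueError(f"unknown bat side: {bat_side!r}")
    if field not in ("left", "center", "right"):
        raise ValueError(f"unknown field: {field!r}")
    if field == "center":
        return "center"
    # Lefties pull to right field; righties pull to left field.
    pull_field = "right" if bat_side == "L" else "left"
    return "pull" if field == pull_field else "oppo"


print(spray_direction("L", "right"))  # pull
print(spray_direction("R", "left"))   # pull
print(spray_direction("R", "right"))  # oppo
```

Because the tag is stored on the row, a filter on pull works the same way for every hitter without a handedness join.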
A note on the buckets: velocity, spin, and similar continuous values get bucketed before they hit the database. That's not because we couldn't store the float, but because "97.3 vs 97.6" is a distinction nobody actually models on. Bucketing pools near-identical events so each filter draws on a larger, steadier sample.
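As a rough sketch of what bucketing means in practice, here is one way to map a continuous value to a label. The 95–104 and 105+ exit-velocity buckets come from the description above; the helper name, the catch-all "under_95" label, and the exact edge handling are assumptions.

```python
from bisect import bisect_right


def bucket(value: float, edges: list, labels: list) -> str:
    """Map a continuous value to a labeled bucket.

    edges must be sorted ascending, and len(labels) must equal len(edges) + 1.
    A value equal to an edge falls into the bucket above that edge.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


# Exit-velocity buckets; "under_95" is an assumed catch-all label.
EV_EDGES = [95.0, 105.0]
EV_LABELS = ["under_95", "95-104", "105+"]

print(bucket(97.3, EV_EDGES, EV_LABELS))   # 95-104
print(bucket(97.6, EV_EDGES, EV_LABELS))   # 95-104
print(bucket(108.2, EV_EDGES, EV_LABELS))  # 105+
```

The two readings that differ by 0.3 mph land in the same bucket, so they count toward the same filtered sum.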
That's the whole vocabulary. Sum the rows that match your filter, you get a count. Divide two filtered sums, you get a rate. Build seven of those and you've described a player.
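The sum-and-divide idea can be sketched in plain Python. This is not the formula editor's syntax, only the arithmetic behind it; the row keys (pitch_type, outcome), the tag values, and the helper names are hypothetical.

```python
from typing import Callable, Iterable, Optional

Row = dict  # one event row; every key used below is a hypothetical tag name
Filter = Callable[[Row], bool]


def count(rows: Iterable[Row], keep: Filter) -> int:
    """Sum the rows that match a filter."""
    return sum(1 for r in rows if keep(r))


def rate(rows: list, numerator: Filter, denominator: Filter) -> Optional[float]:
    """Divide two filtered sums.

    The numerator is counted inside the denominator pool, so the result is
    always a share of that pool. Returns None when the pool is empty.
    """
    pool = [r for r in rows if denominator(r)]
    if not pool:
        return None
    return count(pool, numerator) / len(pool)


pitches = [
    {"pitch_type": "FF", "outcome": "swinging_strike"},
    {"pitch_type": "FF", "outcome": "foul"},
    {"pitch_type": "SL", "outcome": "swinging_strike"},
    {"pitch_type": "FF", "outcome": "called_strike"},
]

# Swinging strikes per four-seamer thrown: 1 of 3 in this toy sample.
ff_swstr = rate(
    pitches,
    numerator=lambda r: r["outcome"] == "swinging_strike",
    denominator=lambda r: r["pitch_type"] == "FF",
)
print(ff_swstr)
```

Swapping the denominator filter (per pitch, per swing, per two-strike pitch) changes the question without touching the rows.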
What this looks like in a formula
Seven components.
Every one maps to a question public analysts have already written about.
Every one is a one-line filter against raw events.

A. Can he put hitters away with the heater?
Two strikes, top of the seventh, your closer warming up — the question is whether the guy on the mound can finish at-bats with the fastball or whether he has to nibble. League average whiff rate on 2-strike heaters above 95 is one number. Your guy's number is another. The gap between them is the at-bat.
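A minimal sketch of that comparison, using toy rows. The key names, the "FF" code for a four-seamer, the "95+" bucket label, and the choice to divide by pitches rather than swings are all assumptions made for illustration.

```python
def two_strike_heater_whiff(rows):
    """Swinging strikes per pitch on 2-strike four-seamers in the 95+ bucket."""
    pool = [
        r for r in rows
        if r["pitch_type"] == "FF"
        and r["velo_bucket"] == "95+"
        and r["strikes"] == 2
    ]
    if not pool:
        return None
    whiffs = sum(1 for r in pool if r["outcome"] == "swinging_strike")
    return whiffs / len(pool)


# Toy pitch_event rows; real rows would carry many more tags.
pitch_event = [
    {"pitcher": "A", "pitch_type": "FF", "velo_bucket": "95+", "strikes": 2, "outcome": "swinging_strike"},
    {"pitcher": "A", "pitch_type": "FF", "velo_bucket": "95+", "strikes": 2, "outcome": "foul"},
    {"pitcher": "A", "pitch_type": "FF", "velo_bucket": "95+", "strikes": 1, "outcome": "swinging_strike"},
    {"pitcher": "B", "pitch_type": "FF", "velo_bucket": "95+", "strikes": 2, "outcome": "in_play"},
    {"pitcher": "B", "pitch_type": "FF", "velo_bucket": "95+", "strikes": 2, "outcome": "foul"},
    {"pitcher": "B", "pitch_type": "SL", "velo_bucket": "85-89", "strikes": 2, "outcome": "swinging_strike"},
]

league = two_strike_heater_whiff(pitch_event)
pitcher_a = two_strike_heater_whiff([r for r in pitch_event if r["pitcher"] == "A"])
gap = pitcher_a - league

print(pitcher_a, league, gap)  # 0.5 0.25 0.25
```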

B. Throws hard, or misses bats hard?
The truest pitching question isn't "how hard does he throw?" It's whether bats miss when he's in the zone with two strikes. League-wide, hitters whiff on about 12% of heart-zone pitches in that count. The elite arms double that. The gap between throwing hard and missing bats lives there.

FanGraphs has called this one of the most predictive "stuff" markers in public data.
C. Who actually hurts lefties with runners on?
Runners in scoring position, a lefty on the mound — what you want to know is who can make him pay. Not who'll work a walk or slap a single. Who can put one in the seats. Hard contact in the air off lefties is the signal.
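One way that filter could look, sketched against toy batted_ball rows. The key names (pitcher_hand, risp, trajectory, ev_bucket) and tag values are assumptions, and treating both the 95–104 and 105+ buckets as hard contact is a choice made for this example.

```python
from collections import defaultdict

HARD = {"95-104", "105+"}        # exit-velo buckets treated as hard contact
AIR = {"fly_ball", "line_drive"}  # trajectories treated as "in the air"


def hard_air_rate_vs_lhp_risp(batted_balls):
    """Per batter: hard-hit air balls / all balls in play, vs LHP with RISP."""
    totals = defaultdict(int)
    hard_air = defaultdict(int)
    for r in batted_balls:
        if r["pitcher_hand"] != "L" or not r["risp"]:
            continue
        totals[r["batter"]] += 1
        if r["trajectory"] in AIR and r["ev_bucket"] in HARD:
            hard_air[r["batter"]] += 1
    return {batter: hard_air[batter] / n for batter, n in totals.items()}


batted_ball = [
    {"batter": "X", "pitcher_hand": "L", "risp": True, "trajectory": "fly_ball", "ev_bucket": "105+"},
    {"batter": "X", "pitcher_hand": "L", "risp": True, "trajectory": "line_drive", "ev_bucket": "95-104"},
    {"batter": "X", "pitcher_hand": "L", "risp": True, "trajectory": "ground_ball", "ev_bucket": "105+"},
    {"batter": "X", "pitcher_hand": "R", "risp": True, "trajectory": "fly_ball", "ev_bucket": "105+"},
    {"batter": "Y", "pitcher_hand": "L", "risp": True, "trajectory": "fly_ball", "ev_bucket": "under_95"},
    {"batter": "Y", "pitcher_hand": "L", "risp": False, "trajectory": "fly_ball", "ev_bucket": "105+"},
    {"batter": "Y", "pitcher_hand": "L", "risp": True, "trajectory": "popup", "ev_bucket": "95-104"},
]

rates = hard_air_rate_vs_lhp_risp(batted_ball)
ranked = sorted(rates, key=rates.get, reverse=True)
print(ranked)  # ['X', 'Y']
```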

RotoGraphs found that exit velocity on fly balls and line drives, not overall, is the stickiest year-to-year power signal.
D. Can he lay off the breaking ball?
Bases empty, no pressure to expand, no situational hitting in play. The pitcher drops a slider just below the zone. Whether the bat goes is the cleanest read on plate discipline you can get. The league chases that pitch about a third of the time. The hitters who can lay off have the edge.
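A hedged sketch of a chase-rate filter on toy pitch_event rows. The pitch-type codes, key names, and the set of outcomes counted as a swing are assumptions. The chase zone tag here does not distinguish below the zone from off the sides; narrowing it further would need a location tag this example does not model.

```python
BREAKING = {"SL", "CU", "ST"}  # slider, curveball, sweeper (assumed codes)
SWINGS = {"swinging_strike", "foul", "in_play"}


def chase_rate(rows, batter):
    """Swings per chase-zone breaking ball seen with the bases empty."""
    pool = [
        r for r in rows
        if r["batter"] == batter
        and r["pitch_type"] in BREAKING
        and r["zone"] == "chase"
        and r["runners_on"] == 0
    ]
    if not pool:
        return None
    swings = sum(1 for r in pool if r["outcome"] in SWINGS)
    return swings / len(pool)


pitch_event = [
    {"batter": "H", "pitch_type": "SL", "zone": "chase", "runners_on": 0, "outcome": "swinging_strike"},
    {"batter": "H", "pitch_type": "CU", "zone": "chase", "runners_on": 0, "outcome": "ball"},
    {"batter": "H", "pitch_type": "SL", "zone": "chase", "runners_on": 0, "outcome": "ball"},
    {"batter": "H", "pitch_type": "SL", "zone": "chase", "runners_on": 1, "outcome": "swinging_strike"},
    {"batter": "H", "pitch_type": "FF", "zone": "chase", "runners_on": 0, "outcome": "foul"},
    {"batter": "H", "pitch_type": "SL", "zone": "heart", "runners_on": 0, "outcome": "in_play"},
]

# One swing on three qualifying pitches in this toy sample.
print(chase_rate(pitch_event, "H"))
```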

Sports Info Solutions found hitters chase out-of-zone breaking balls 33% of the time vs 27% on fastballs — the gap is where discipline edges live.
E. Where do home runs actually come from?
A fly ball isn't a fly ball. To the opposite field, it leaves the yard 4% of the time. Pull side, that number jumps to 23%. Add a same-handed matchup, and you've stacked the platoon disadvantage on top. That's where home runs come from.
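The split by direction can be sketched as a grouped rate. The key names (trajectory, direction, is_home_run) are assumptions, and the toy rows below are not meant to reproduce the league percentages quoted above.

```python
from collections import Counter


def hr_per_fly_ball(batted_balls):
    """Home runs per fly ball, grouped by handedness-relative direction."""
    flies = Counter()
    homers = Counter()
    for r in batted_balls:
        if r["trajectory"] != "fly_ball":
            continue
        flies[r["direction"]] += 1
        if r["is_home_run"]:
            homers[r["direction"]] += 1
    return {direction: homers[direction] / n for direction, n in flies.items()}


batted_ball = [
    {"trajectory": "fly_ball", "direction": "pull", "is_home_run": True},
    {"trajectory": "fly_ball", "direction": "pull", "is_home_run": False},
    {"trajectory": "fly_ball", "direction": "pull", "is_home_run": False},
    {"trajectory": "fly_ball", "direction": "pull", "is_home_run": False},
    {"trajectory": "fly_ball", "direction": "oppo", "is_home_run": False},
    {"trajectory": "fly_ball", "direction": "oppo", "is_home_run": False},
    {"trajectory": "line_drive", "direction": "pull", "is_home_run": True},
]

by_direction = hr_per_fly_ball(batted_ball)
print(by_direction)  # {'pull': 0.25, 'oppo': 0.0}
```

Adding a same-handed matchup condition would be one more clause in the filter, assuming the row carries both the batter's and the pitcher's hand.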

Lookout Landing found pull-side flies leave the yard at more than 5x the rate of oppo flies.
F. Does the starter own strike one?
The 0-0 pitch sets the tone. Strike one and the hitter is suddenly in a pitcher's count for the rest of the at-bat. Ball one and the next pitch has to come over the plate, with the hitter sitting on it. The starters who consistently get strike one are the ones who get into the seventh.
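A first-pitch strike rate could be sketched as below. The key names and outcome labels are assumptions, and counting a ball put in play on 0-0 as a strike is a definitional choice made for this example.

```python
STRIKE_ONE = {"called_strike", "swinging_strike", "foul", "in_play"}


def first_pitch_strike_rate(rows, pitcher):
    """Share of a pitcher's 0-0 pitches that go for a strike."""
    first_pitches = [
        r for r in rows
        if r["pitcher"] == pitcher and r["balls"] == 0 and r["strikes"] == 0
    ]
    if not first_pitches:
        return None
    strikes = sum(1 for r in first_pitches if r["outcome"] in STRIKE_ONE)
    return strikes / len(first_pitches)


pitch_event = [
    {"pitcher": "S", "balls": 0, "strikes": 0, "outcome": "called_strike"},
    {"pitcher": "S", "balls": 0, "strikes": 0, "outcome": "ball"},
    {"pitcher": "S", "balls": 0, "strikes": 0, "outcome": "foul"},
    {"pitcher": "S", "balls": 0, "strikes": 0, "outcome": "in_play"},
    {"pitcher": "S", "balls": 1, "strikes": 0, "outcome": "called_strike"},
]

print(first_pitch_strike_rate(pitch_event, "S"))  # 0.75
```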

After a first-pitch strike, 92.7% of at-bats end in an out or another strike (theScore).
G. Whose sweeper is real?
Some breaking balls move like a slider. Others move like they're being remote-controlled. The difference is mostly spin — a 2,800 RPM breaker doesn't behave anything like a 2,200 RPM one, and the bat can't track it the same way. The pitchers with the high-spin version are the ones with strikeout numbers nobody else has.

Driveline's research on spin rate showed why high-spin breakers play like a different pitch entirely.
Seven components. Seven questions backed by years of public research. All built from sums of rows.
Each one runs as a single query against the raw event tables — fast enough to recompute live while you tune the filters. Raw events were the foundation, not an afterthought.
Why this beats aggregated stats
Our other tiers — game-computed stats like team_strikeouts, derived stats like batting_average — are great when the question is fixed. They're fast, they aggregate cleanly, they're what most modeling work runs on.
But every aggregated stat has its question baked in at creation time. "Batting average" decided years ago which events count as the numerator (hits) and which count as the denominator (at-bats — not plate appearances, no walks, no sac flies). Want it phrased differently? "OBP excluding intentional walks against same-handed pitching with two outs"?
You're stuck.
Raw events flip the model. You write the question. We just hand you the rows.
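For instance, the custom OBP from the question above could be sketched from plate-appearance rows like this. The key names and result labels are hypothetical. The denominator follows the standard OBP shape (at-bats, walks, hit-by-pitches, and sac flies), with intentional walks dropped as the question asks.

```python
ON_BASE = {"single", "double", "triple", "home_run", "walk", "hit_by_pitch"}
# Results left out of the pool entirely: intentional walks by request, plus
# events the standard OBP denominator never counts.
EXCLUDED = {"intent_walk", "sac_bunt", "catcher_interference"}


def custom_obp(plate_appearances):
    """OBP without intentional walks, same-handed matchups, two outs."""
    pool = [
        r for r in plate_appearances
        if r["same_hand"] and r["outs"] == 2 and r["result"] not in EXCLUDED
    ]
    if not pool:
        return None
    on_base = sum(1 for r in pool if r["result"] in ON_BASE)
    return on_base / len(pool)


plate_appearance = [
    {"result": "single", "same_hand": True, "outs": 2},
    {"result": "strikeout", "same_hand": True, "outs": 2},
    {"result": "intent_walk", "same_hand": True, "outs": 2},
    {"result": "walk", "same_hand": True, "outs": 2},
    {"result": "field_out", "same_hand": True, "outs": 2},
    {"result": "home_run", "same_hand": False, "outs": 2},
    {"result": "single", "same_hand": True, "outs": 1},
]

print(custom_obp(plate_appearance))  # 0.5
```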
That's the difference between consuming baseball stats and inventing them.
Your turn
The creator platform is where you write the formula, stack the filters, and let 26 algorithms learn from whatever you wire up. No SQL, no spreadsheet wrangling — just the question you've been wanting to ask.
Further reading
Key references
- In-Zone Whiff Rate Leaderboards and League Averages — FanGraphs, on heart-zone whiff as a stuff marker
- Breaking Down Plate Discipline by Pitch Type — Sports Info Solutions, on the fastball-vs-breaker chase gap
- The Near-Immediate Usefulness of Max EV — RotoGraphs, on why fly-ball EV is the stickiest contact-quality signal
- The Direction of Home Runs — Lookout Landing, on the 5x gap between pull and oppo flies
- Quantifying the power of the 1st-pitch strike — theScore, on early-count command compounding
Additional reading
- A Deeper Dive into Fastball Spin Rate — Driveline Baseball
- The Doomed Search for a Perfect Way To Interpret Exit Velocity Data — FanGraphs
- Tanner Scott and the Ideal Zone Rate — FanGraphs
- A Meandering Examination of Fly Ball Pull Rate — FanGraphs, on Isaac Paredes and the pull-air archetype
- Statcast Leaderboards — Baseball Savant, useful for sanity-checking your formulas