Padres bloggin' since 2007

Sample size, randomness, baseball, and you

June 22nd, 2010 by Melvin

Special note: this post doesn’t have a lot of jokes. In exchange for your forgiveness, please accept this photo of Padres prospect Blake Tekotte. Thank you.

[Photo: Blake Tekotte]

When looking at statistics, there are two major pieces of information to learn.

  • How much has a player contributed to his team in the past?
  • How much will a player contribute to his team in the future?

Oftentimes, a player’s ability to contribute to his team in the future is called his “true talent level”. This is a player’s raw ability, with other factors such as luck and the ballpark he plays in stripped from the conversation. This is where the concept of sample size matters most. Without an adequate sample size, all the stuff that doesn’t affect a player’s future performance can skew our opinions. A big enough sample, among other things, is what lets us separate the talent from the noise.

A player’s past contributions are fun and interesting, but when talking about what a particular baseball squadron should or should not do, they generally aren’t relevant. Sure, there are exceptions, such as when a lifelong Padre is negotiating his final contract. But those are rare.

My stupid example: flipping a coin

Suppose you ask me to call heads or tails as you flip a quarter in the air. I choose heads, and wouldn’t you know it, the quarter lands heads up! Does this mean I will know the result of all future coin flips when asked? In other words, do I have a perfect “true talent level” of calling coin flips? Of course not, and we all understand why. Because of luck.

Along these lines, each measurement (or statistic) has its own requirements for sample size. If you flip a second coin and I guess correctly a second time, that still doesn’t prove my coin-guessing ability. We simply haven’t reached the number of coin flips necessary to filter out the luck. As you approach 50 coin flips and calls, my successful calling rate will likely be pretty close to my true talent level of 50%.
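For anyone who wants to watch the luck wash out, here is a minimal sketch of the coin example. It assumes a fair coin and a caller who always says heads; the function names call_rate and standard_error are made up for this illustration, and the "typical luck" figure is just the usual standard error of a proportion.

```python
import math
import random


def call_rate(num_flips, seed=None):
    """Fraction of correct calls when always guessing heads on a fair coin."""
    rng = random.Random(seed)
    # Each flip lands heads (a correct call) with probability 0.5.
    correct = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return correct / num_flips


def standard_error(num_flips, true_rate=0.5):
    """Typical size of the luck component in an observed rate after num_flips."""
    return math.sqrt(true_rate * (1 - true_rate) / num_flips)


if __name__ == "__main__":
    for n in (2, 10, 50, 500, 5000):
        print(f"{n:>5} flips: observed {call_rate(n, seed=n):.3f}, "
              f"typical luck +/- {standard_error(n):.3f}")
```

Under these assumptions the typical luck at 50 flips is still about 7 percentage points, so "pretty close" to 50% can easily mean anywhere from the low 40s to the high 50s. It takes thousands of flips before the swing drops to a point or two.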

Back to baseball: wOBA and UZR

The same applies to baseball measurements. Different stats require different numbers of trials before the noise settles down. I’m not a stat expert, so I can’t say exactly how many tries one should use for each stat. For me, 3-5 years of wOBA (my favorite hitting stat) is what I want to see when looking at a player. 500 plate appearances at minimum.

When measuring defense with UZR, however, things are different. As I understand it, 3 years of UZR data is worth about 1 year of hitting data. So if you want 3 years of hitting data to judge a hitter, you really ought to look at 9 years of data to judge a defender’s true talent level. I’m completely serial. 3 years of UZR at minimum.
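The arithmetic behind that 9-year figure, as a tiny sketch. The 3-to-1 ratio is the rule of thumb as remembered in this post, not an established constant, and the name uzr_years_needed is invented for the example.

```python
# Assumption: 3 years of UZR carry about as much information as 1 year of
# hitting data. This is the post's rule of thumb, not a verified figure.
UZR_YEARS_PER_HITTING_YEAR = 3


def uzr_years_needed(hitting_years):
    """UZR years that match a given number of hitting years under the rule of thumb."""
    return hitting_years * UZR_YEARS_PER_HITTING_YEAR


if __name__ == "__main__":
    for hitting_years in (1, 3, 5):
        print(f"{hitting_years} year(s) of hitting data ~ "
              f"{uzr_years_needed(hitting_years)} years of UZR")
```

If the real ratio is smaller than 3, the required UZR window shrinks in proportion, so the 9-year number is only as good as the ratio behind it.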

So please, everyone from message board posters to SDUT staff writers, be careful when making judgments about a player’s future potential using statistics. Especially UZR.

Your pal,



7 Responses to “Sample size, randomness, baseball, and you”

  1. Myron Logan says:

    Melvin, great advice. Like you say, I think sometimes we tend to look at someone’s UZR through half a year, see that it says +5 runs, and call him a +5 defender. As you say, that just isn’t true, especially with a stat like UZR.

    I do think nine years is too much, however, because what a player did eight or nine years ago doesn’t have much relevance today. You could weight each year, I suppose, but I still don’t think you have to go nine years back. My guess is 3-5 years is pretty good. Of course, utilizing as much information as possible is always your best bet.

    • Melvin says:

      9 years does seem like an extraordinarily long time, I agree. Yet, does a year and a half of wOBA show true talent level? Maybe I’m misremembering MGL’s line about 3 years of UZR equaling 1 year of hitting stats.

  2. Myron Logan says:

    Hmm... I know what you’re referring to, but I don’t remember the exact numbers. I think the key thing to remember is that true talent level is always changing, and we’re always trying to estimate it.

    I think with fielding you definitely want to use a lot of data (weighted somehow, with recent years counting more). Also, I’m a big fan of incorporating something like the Fans Scouting Report into fielding evaluations, especially when not much data is available on a certain player.

    • Myron Logan says:

      Doh. I think I messed up the concept of threaded comments right there.

    • Melvin says:

      Yeah, scouting is so important when there isn’t enough fielding data to be reliable. I wish Fangraphs’ WAR would incorporate some kind of weighted 3-5 year UZR number rather than just that current year’s.

  3. Ray says:

    I’d just like to say that AJ’s the man and he will continue being the man.

