Friday, December 31, 2004

How can Gonzaga be only #21?

[If you're not into sports stats, skip this post]

Roland Patrick tips off whom he roots for by demanding to know how the Gonzaga men's basketball team, after compiling a 9-1 record while defeating the #16, #9 and #3 teams in the nation, and losing only to the #1 team on that team's own home court, can be ranked a lowly 21st in the ESPN/USA Today coaches poll.

What is it with those coaches? Favoritism, nepotism, not being a member of the right club, outright bribery?

Can we instead take an empirical, objective look at where Gonzaga should be ranked, free of the subjective human element?

Sports-stats geeks will say "sure!" Those who are interested in (or obsessed with) measuring such things have developed two pretty effective methods of rating the relative strength of competitors (individuals or teams), judged by how accurately the ratings predict the outcomes of future contests.

The first of these is the Elo system, developed in the late 1950s by Dr. Arpad Elo, then at Marquette University, to rate chess players. The Elo system is based on the observation that a given chess player's performance over a number of games is normally distributed, so that when two players of different strengths meet, the probable outcome of their games will be determined by the overlapping distributions. (More details)
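For the curious, here is a minimal sketch of that overlapping-distributions idea. It assumes each player's single-game performance is normal with a standard deviation of 200 rating points (the figure usually attributed to Elo's original model, but treat it as an assumption here); the function name and constant are mine, for illustration only.

```python
import math

# Assumption: each player's performance in a single game is normally
# distributed with a standard deviation of 200 rating points. The
# difference between two such performances then has a standard
# deviation of 200 * sqrt(2).
PERFORMANCE_SD = 200.0


def expected_score(rating_diff: float) -> float:
    """Expected score for the player who is `rating_diff` points stronger."""
    sd_of_difference = PERFORMANCE_SD * math.sqrt(2)
    z = rating_diff / sd_of_difference
    # Standard normal CDF, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    for diff in (0, 100, 200, 300):
        print(f"{diff:>3} points: {expected_score(diff):.1%}")
```

A negative difference gives the weaker player's expected score, and the two always add up to 100%.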

The simple result is that on the four-digit scale actually used in chess, a 0-point difference in rating predicts each player will score 50%; a difference of 100 points, that the stronger will score about 64%; 200 points, about 76%; 300 points, about 85%; and so on. Of course, while ratings are used to predict contest results, contest results are used to compute ratings.
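That feedback loop, results feeding back into ratings, can be sketched roughly as follows. This is the textbook form of the update, not necessarily what any particular federation or web site runs: it uses the logistic approximation to the expected score rather than the normal curve, and the step size `k=32` is just a commonly seen value that I'm assuming for illustration.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score via the logistic approximation (an assumption here)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating: float, opponent: float, score: float, k: float = 32.0) -> float:
    """Return the new rating after one game.

    score: 1 for a win, 0.5 for a draw, 0 for a loss.
    k: hypothetical step size; real rating bodies choose their own.
    """
    # Move the rating toward the result, in proportion to the surprise.
    return rating + k * (score - expected_score(rating, opponent))


if __name__ == "__main__":
    # An upset win moves the rating more than an expected win does.
    print(update(1500, 1700, 1.0))  # underdog wins
    print(update(1700, 1500, 1.0))  # favorite wins
```

Note that the winner gains exactly what the loser gives up, so the pool of rating points stays constant.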

The Elo system has proved surprisingly robust and has been adopted in a wide range of other sports and areas where competitive ratings are desired -- even to ranking universities and colleges on the basis of who chooses to attend them.

The second strength rating method is the Pythagorean system, first developed for baseball by Bill James. This resulted from James's observation that teams' winning percentages correlate very highly with their runs scored for/against ratios as run through a simple formula.
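The simple formula, in James's original baseball version, is runs scored squared divided by the sum of runs scored squared and runs allowed squared. Here is a small sketch; the function name is mine, and the `exponent` parameter is there only because other sports are usually fitted with other exponents (much higher than 2 for basketball, though any specific value you plug in is an assumption).

```python
def pythagorean_pct(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Expected winning percentage from points (or runs) for and against.

    exponent=2 is James's original baseball value; other values are
    assumptions to be fitted per sport.
    """
    if scored <= 0 and allowed <= 0:
        raise ValueError("need some scoring to compute a ratio")
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


if __name__ == "__main__":
    # A hypothetical baseball team: 800 runs scored, 700 allowed.
    print(f"{pythagorean_pct(800, 700):.3f}")
```

A team that scores exactly as much as it allows comes out at .500 whatever the exponent; a higher exponent just makes the same scoring ratio translate into a more lopsided expected record.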

The Pythagorean system has proved highly adaptable too, working fine with football, basketball, and other sports where game outcomes are determined by scoring. In fact, scoring for/against ratios have been shown to be a somewhat better predictor of future won-lost records than current won-lost records, which leads fans of the Pythagorean system to say it is actually a better measure of how good a team is than its won-lost record. (Here's a calculator to show how strong your favorite team really is by scoring ratio.)

Elo system proponents scoff at that, saying the purpose of a contest is to win -- the World Series is won by the team that wins the most games, not the team that scores the most runs. So the Elo system, which considers only wins and losses, is a superior measure of actual performance ... and we head off into the sort of endless argument among sports-stats geeks that is surpassed for vitriol and inanity only by those among political partisans in election season.

Not that there's really a lot to argue about, because the two systems usually give very much the same result -- and when they don't, the difference usually derives not from the systems themselves but from sample size. If only a small number of games are being measured, there's a big error factor in each computation, which can produce a big difference between them, one that diminishes as more games are played. (E.g., the Elo system requires 20+ games for a reliable rating, which makes it useless for football, although many insist on applying it there.)
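To put a rough number on that error factor, here is a back-of-the-envelope sketch using the ordinary binomial standard error of a winning percentage. It assumes every game is an independent coin flip with the same win probability, which real schedules certainly are not, so take it as an illustration of the trend rather than a property of either rating system.

```python
import math


def win_pct_standard_error(true_win_pct: float, games: int) -> float:
    """Standard error of an observed winning percentage.

    Assumes independent games with a constant win probability
    (a simplification made for this sketch).
    """
    return math.sqrt(true_win_pct * (1.0 - true_win_pct) / games)


if __name__ == "__main__":
    # A hypothetical team that truly wins 60% of the time.
    for games in (10, 20, 40, 160):
        se = win_pct_standard_error(0.6, games)
        print(f"{games:>3} games: +/- {se:.3f}")
```

The error shrinks with the square root of the number of games, so it takes four times as many games to cut it in half.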

FWIW, to me the Elo system is a better measure of a team's actual past achievement at what it is supposed to do, win -- though if I were wagering the mortgage money with my bookie on tomorrow's game I'd check the Pythagorean number against the betting line.

But back to Gonzaga. The obvious thing to do is to compare the coaches' poll giving Gonzaga its #21 ranking with objective, empirical Elo and Pythagorean rankings of its performance. And in this Internet-driven world that's easy to do as every sports-stat geek with an online connection seems to have his own personal variation of a rating system posted on his web site.

However, as the coaches' poll is published by USA Today, we can go to computer ratings published by that very same newspaper: Jeff Sagarin's. Conveniently, Sagarin publishes both Elo and "Predictor" (his version of points-only Pythagorean) rankings for every team. Then, rather than delve into some scholastic argument about which of the two is "best," he puts the two of them together to get his ranking.

His computer's impartial, impersonal, objective and totally empirical rankings for Gonzaga as of this writing: By Elo #2, by Predictor #44, combined for a final ranking of #22.

Should have stayed with the coaches' poll!

What does this mean? Maybe Gonzaga has overachieved or been lucky in winning a couple of those games, relative to its real strength as shown by its points differential -- some of its wins against lesser opponents apparently weren't very impressive. Or maybe it hasn't played enough games to reveal its real strength via points differential.

Most likely, with only 10 games played, it's just small sample size. Who knows where between #2 and #44 they should be ranked? Take a poll and find out.

Maybe there's a point here too that it's often not so easy in life to escape making a subjective judgment by resorting to quantified measures. Not much is simpler than quantifying a basketball team's performance, but when the best you can do is get an answer somewhere from #2 to #44...

Update: Promptly after I wrote the above, Gonzaga lost to Missouri, ranked #110 by Sagarin's computer. Maybe the coaches and the Pythagoreans had a point.