Saturday, October 23, 2004

Probability that the Red Sox will finally escape the Curse of Babe Ruth this year: 43%. Or 61%.

Baseball is the most probabilistic of sports. A game centered on smacking together two round fast-moving objects (ball and bat) can hardly be anything else. That's why it is so loved by statisticians and the statistically minded.

One of the consequences of this is that any team really will beat any other team a certain percentage of the time, even if the difference in quality between them is great (as long as it is not too absurdly great). It's not like football or basketball, where physically superior players can overwhelm weaker opposition by force.

In baseball even a short, slight platoon infielder with a .219 lifetime batting average and all of seven career home runs may be remembered decades later as a "Series Hero" for taking the opposition's All-Star pitcher downtown at a critical moment. And a good thirty-five years later he'll be enshrined on the list of those who hit the 100 Greatest Home Runs of All Time.

Things like that happen in baseball, a certain percentage of the time. They're part of what makes the game so rewarding to watch. The fact that it's the only sport where the team with the ball plays defense isn't the only thing that's unique about it.

Another consequence is that despite what all the fans and sportswriters say -- especially during second-guessing time, which Joe Torre is enjoying in New York right now -- the outcome of every individual game is mostly the result of luck, chance, contingency. The difference in quality between major league teams is not so great as to be visible in results on a game-by-game basis, no matter what fans and sportswriters think they see. It takes a 162-game season for the luck to even out and the difference in quality to assert itself in the won-lost record. In shorter stretches bad teams will frequently outperform good ones.

How likely are they to do so? The statisticians have worked it out, of course. In a given game the probability of the stronger team beating the other is pretty close to the difference in their winning percentages plus .500. So a team with a .550 record will beat a team with a .470 record 58% of the time (.550 - .470 = .080, plus .500). To figure out the odds of each team winning a series of given length, just apply the standard rules of probability.
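For anyone who wants to try that rule of thumb, here is a minimal Python sketch. The function names are my own invention, and it assumes every game in a series is an independent trial with the same odds (no home-field edge, no pitching matchups), which is a simplification.

```python
from math import comb

def single_game_prob(wp_a, wp_b):
    """Rule of thumb from the text: difference in winning percentages plus .500."""
    return 0.5 + (wp_a - wp_b)

def series_win_prob(p, wins_needed=4):
    """Chance of reaching `wins_needed` wins before the opponent does,
    treating each game as an independent trial won with probability p."""
    q = 1.0 - p
    # Sum over k = number of losses suffered before the clinching win.
    return sum(comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
               for k in range(wins_needed))

p = single_game_prob(0.550, 0.470)
print(f"single game:   {p:.3f}")
print(f"best of seven: {series_win_prob(p):.3f}")
print(f"best of five:  {series_win_prob(p, wins_needed=3):.3f}")
```

Passing wins_needed=3 gives the best-of-five case; the shorter the series, the closer the favorite's chances sit to its single-game number.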

What does this mean for post-season playoff series, and for the World Series, where both teams are good and are pretty evenly matched? In the memorable words of Oakland general manager Billy Beane -- quoted in Moneyball, the book about him bringing statistical analysis to team building -- they're "a crapshoot".

Take the Red Sox and Yankee teams that just completed their historic series. During the regular season the Yankees won three more games than the Sox out of 162. So the difference between those two teams was just one win every 54 games. But the contest between them was only seven games, and over the season any seven-game stretch accounted on average for only about 1/8th of a win difference between them -- with the other 7/8ths (or more) of a win resulting from chance.

Or, using the formula mentioned above, on the basis of full regular season records that gave the Yankees a winning percentage 1.8 percentage points higher than that of the Sox, the Yankees had a 51.8% chance of winning each game on average, and (if my binomial calculator is working correctly) a 54% chance of winning that series. Even if you adjust for things like home-field advantage on the odd game, those odds are pretty even. (Although after the Sox lost the first three games, their coming back by winning four in a row was about a 17.5-to-1 shot.)
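The Yankees-Sox arithmetic can be checked with a few lines of Python. This is only a sketch: it takes the 51.8% per-game figure and the three-win gap straight from the text, and it assumes all seven games are independent and equally weighted. The variable names are mine.

```python
from math import comb

p_yankees = 0.518          # per-game chance from the text
p_sox = 1.0 - p_yankees

# Share of the three-win season gap that falls in an average 7-game stretch.
gap_per_seven = 3 / 162 * 7

# "Binomial calculator" view: win at least 4 of 7 games, as if all 7 were played.
# This gives the same answer as stopping the series at the fourth win.
yankees_series = sum(comb(7, k) * p_yankees**k * p_sox**(7 - k)
                     for k in range(4, 8))

# Down three games to none, the Sox needed four straight wins.
comeback = p_sox ** 4
odds_against = (1.0 - comeback) / comeback

print(f"win gap per 7 games:  {gap_per_seven:.2f}")
print(f"Yankees win series:   {yankees_series:.1%}")
print(f"Sox comeback odds:    {odds_against:.1f}-to-1 against")
```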

This "playoff crapshoot" notion isn't appreciated by many baseball purists who think games should be won by the players' fighting character and their manager's cunning and force of will in directing them to victory, but it's hard to argue against it when you look at the numbers.

One of the common pre-Series media stories has been about how the Cardinals' star manager Tony La Russa has taken teams to the post-season playoffs 10 times yet won the Series only once. But with a team having to win three playoff series for its manager to win a ring, if the odds in each are near 50/50 he'd be expected to win through in just one of eight chances -- so Tony's about par for the course.
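To put a number on "par for the course", here is a rough Python sketch. It assumes, as the paragraph above does, that every one of the 10 trips required winning three coin-flip rounds, and that the trips are independent of each other. The names are mine.

```python
from math import comb

ROUNDS = 3          # series a team must win for a ring (as in the text)
APPEARANCES = 10    # post-season trips cited in the text

p_title = 0.5 ** ROUNDS                  # 1 in 8 per trip
expected_titles = APPEARANCES * p_title  # average rings over 10 trips

def prob_exactly(k, n=APPEARANCES, p=p_title):
    """Binomial chance of exactly k titles in n independent trips."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

# Chance of finishing with one ring or none across all 10 trips.
p_at_most_one = prob_exactly(0) + prob_exactly(1)

print(f"expected titles:       {expected_titles:.2f}")
print(f"chance of one or none: {p_at_most_one:.1%}")
```

Under those assumptions a single ring in 10 tries is an ordinary outcome, not a sign of anything wrong with the manager.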

The Times today reports some other post-season numbers...

Before 1969, when there was only one annual best-of-seven-game postseason series, the team with the better regular-season record won 34 of 65 series, just a tick more than 50 percent.

From 1969 through 1993, when baseball played one additional preliminary series, a league championship series (first best of five games, later best of seven), the team with the best regular-season record ended up wearing rings 7 of 25 times, or 28 percent. That is very close to the 25 percent of the time a flipped coin will come up heads twice in a row.

Since 1995, when the postseason expanded to eight teams and three rounds, the best team in the regular-season has won one of nine World Series, just what a coin's theoretical probability (1 in 8, or 12.5 percent) would prescribe.

Not that any of this should in any way diminish a fan's admiration for the fighting character shown by his baseball heroes. Curt Schilling having his tendons sewn to his leg -- in a procedure that had never been done before and was only tested on a cadaver -- so he could pitch the do-or-die sixth game against the Yankees with blood seeping through his pants, was a performance that deserves to put him in any sports heroes' hall of fame for courage alone. But both teams' players are playing as hard as they can, and their managers are leading as well as they can, and that's what makes them evenly matched and the outcome a near toss-up.

Everything is older than we think, and the Times notes that the modern statistical analysis of baseball fathered by Bill James has a grandfather in Frederick Mosteller, a Red Sox fan born in Boston four years before Babe Ruth left for New York and still rooting for them today. He published along these lines in the Journal of the American Statistical Association a generation earlier, starting in 1952.

What odds do Professor Mosteller and his intellectual heirs put on his seeing his Red Sox escape the Curse of the Babe and finally bring the championship home this year?

Boston carried in a winning percentage of .610 (including postseason), St. Louis .647. This suggests that the Cardinals have a 53.7 percent chance of winning the average game between them. And after applying Mosteller's binomial theory, the Cardinals have a 58 percent chance of beating the Red Sox to four victories.

But ... the Sox could play four games at home, and home games typically add 42 points to a team's winning percentage. Making that adjustment, the Cardinals decline to a 57 percent favorite.

Even more significantly, the two teams' rosters have changed, specifically Boston's addition of shortstop Orlando Cabrera. Using only the teams' winning percentages since the July 31 trade deadline, Boston, which has played .700 ball since then, becomes the favorite, at 61 percent.

So there you have it, in real numbers.

After more than 50 years of the development of the statistical analysis of baseball into its modern state of the art, we can say definitively: The Red Sox are either the underdog or the favorite.

My advice: Enjoy the play of the game for its own sake. Don't get too drawn into the issue of whether or not the best team is winning.