MicrosoftВ® Office ExcelВ® 2007: Data Analysis and Business Modeling (Bpg -- Other)

Overview

Many of us follow basketball, football, hockey, and baseball. Bookmakers set point spreads on games in all these sports and others. For example, the bookmakers’ best guess was that the Pittsburgh Steelers would win the 2006 Super Bowl by three points. (Boy, did the Colts choke in the divisional round! Of course, the Colts redeemed themselves by winning the 2007 Super Bowl.) How can you use Microsoft Office Excel 2007 to come up with team “ratings” that generate reasonable point spreads?

Using a simple Solver model, you can generate reasonable point spreads for teams based on the scores of the 2005 season. (Playoff games for this season occurred in 2006). Our work is in file Nfl2005.xlsx (see Figure 32-1). We simply use the score of each game of the NFL 2005 season as input data. The changing cells for the Solver model will be a rating for each team and the size of the home field advantage. For example, if the Indianapolis Colts have a rating of +5 and the New York Jets have a rating of +7, the Jets are considered two points better than the Colts.

Figure 32-1: Data rating NFL teams that we’ll use with Solver

With regard to the home-field edge, in most years, college and professional football teams, as well as professional basketball teams, tend to win by an average of three points (whereas home college basketball teams tend to win by an average of five points). In our model, however, we will define the home edge as a changing cell and have our Solver model estimate the home edge. We can define the outcome of an NFL game to be the number of points by which the home team outscores the visitors. We can predict the outcome of each game by using the following equation (which I’ll refer to as Equation 1):

(1) 

For example, if the home field edge equals three points, when the Colts host the Jets, the Colts will be a one-point favorite (3+5–7). If the Jets host the Colts, the Jets will be a five-point favorite (3+7-5). A complication that arose in the 2005 season is that Hurricane Katrina caused the New Orleans Saints (Team 20) to play games at a nearby site (San Antonio, Texas or Baton Rouge, Louisiana). (This is indicated by a C in column A.) We assigned 0.5 home-field advantage for the Saints in these games. Also Game 59 (San Francisco 49ers against Arizona Cardinals) was played in Mexico City (a neutral site) so no home-field advantage should be assigned for this game.

What target cell will yield “good” ratings? Our goal is to find the set of values for team ratings and home-field advantage that best predict the outcome of all games. In short, we want the prediction for each game to be as close as possible to the outcome of each game. This suggests that we want to minimize the sum over all games of (Actual outcome)–(Predicted outcome). The problem with using this target is that positive and negative prediction errors cancel each other out. For example, if we overpredict the home-team margin by 50 points in one game and underpredict the home-team margin by 50 points in another game, our target cell would yield a value of 0, indicating perfect accuracy, when in fact we were off by 50 points a game. We can remedy this problem by minimizing the sum over all games by using [(Actual Outcome)– (Predicted Outcome)]2. Now positive and negative errors will not cancel each other out.

Категории