In this day and age, we have access to a wealth of data that can be analyzed to help us predict outcomes. This practice is called predictive analytics, and it can be applied to almost anything, including football. It may be that the search engine Bing has the ability to predict whether the Broncos will relive their last Super Bowl experience or be victorious.
You may be thinking, “How does a search engine have the ability to predict how well a team will play?” This, too, was my first thought. But when you think about all the data Bing has access to, it makes a little more sense. I like to think of the data as being separated into two parts: stats and social.
Stats are mostly composed of historical data: team records, margins of victory, and individual player data. The Bing team can compare this data to environmental data: what type of surface games were played on, where they were played, the weather conditions, and whether the stadium was covered. They use this data to determine which conditions favor which teams or players. They can then predict the best lineup for a specific condition.
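To make the idea of "which conditions favor which teams" a little more concrete, here is a minimal sketch of how historical results could be grouped by playing conditions. This is only an illustration, not Bing's actual method: the team names, the records, and the `win_rate_by_condition` function are all made up for the example.

```python
from collections import defaultdict

# Hypothetical game records: (team, surface, weather, won).
# These values are invented purely for illustration.
records = [
    ("Team A", "grass", "snow", False),
    ("Team A", "grass", "clear", True),
    ("Team A", "turf", "clear", True),
    ("Team B", "grass", "snow", True),
    ("Team B", "turf", "clear", False),
    ("Team B", "grass", "snow", True),
]

def win_rate_by_condition(records):
    """Return {(team, surface, weather): fraction of games won}."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for team, surface, weather, won in records:
        key = (team, surface, weather)
        played[key] += 1
        wins[key] += int(won)
    return {key: wins[key] / played[key] for key in played}

rates = win_rate_by_condition(records)
# Compare how each team has fared on grass in the snow.
print(rates[("Team A", "grass", "snow")])
print(rates[("Team B", "grass", "snow")])
```

With a real data set, a table like this would be one input among many; a handful of games per condition is far too few to draw conclusions from.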
Stats alone create pretty good predictions about who might win a game. But pretty good is not good enough.
This is where the second stage of prediction comes in: Bing taps into the social web to solidify its predictions. The Bing team analyzes what people are saying on public networks about games, teams, and players. They can find real-time information on injuries, controversies, changes to the lineup, and more.
If the game is to be played on the East Coast, and they notice that people are talking online about a major snowstorm, Bing will take this into account and predict how well or poorly a player may perform in the cold weather.
Unstructured and social data can also catch some hiccups in the traditional model. For example, a high number of passing yards would typically suggest a win, because it signifies a high number of completed passes. But this may not always be the case. It may actually indicate that a team will lose, because if they are trailing in the fourth quarter, they will surely be throwing that ball around as much as possible in hopes of getting some points on the board. Because someone talked about this situation online, the algorithm learned that it needs to analyze this data per quarter instead of per game.
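The passing-yards example can be sketched in a few lines of code. The numbers below are invented, and the `fourth_quarter_share` measure is just one hypothetical way to look at the data per quarter; it is not taken from Bing's model.

```python
# Hypothetical per-quarter passing yards for two made-up games.
games = {
    "balanced_win": {"passing_yards": [70, 65, 60, 55], "won": True},
    "late_catchup_loss": {"passing_yards": [30, 35, 45, 140], "won": False},
}

def total_passing_yards(game):
    """Per-game view: one number for the whole game."""
    return sum(game["passing_yards"])

def fourth_quarter_share(game):
    """Per-quarter view: fraction of passing yards gained in the fourth quarter."""
    yards = game["passing_yards"]
    return yards[3] / sum(yards)

for name, game in games.items():
    print(
        name,
        total_passing_yards(game),
        round(fourth_quarter_share(game), 2),
        game["won"],
    )
```

Both made-up games have the same total of 250 passing yards, so the per-game number cannot tell them apart. The per-quarter view can: the losing team piled up more than half of its yards in the fourth quarter while trying to catch up.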
You may be surprised, but analyzing the social data actually increases the accuracy of the predictions by 5%.
Thus far, Bing has yet to announce who it thinks will win. Others are predicting that the Panthers will prove victorious, but remember, any prediction can change with the first drop of rain in San Francisco. I guess we will find out next weekend!