Tuesday, May 16, 2017

Nate Silver on Big Data in the 2016 Presidential Election

The world seemed stunned waking up the morning after the 2016 presidential election. In the wake of a drama-filled election season, statisticians had almost uniformly declared the race over in favor of Hillary Clinton. Just four years prior, Nate Silver had correctly predicted the outcome in all 50 states, allowing him to call the winner with ease, but in 2016 his story was very different. Many tried to blame Silver, saying his data, methods, or beliefs were flawed, but among the least surprised was Nate Silver himself.

As the world of data grows so quickly, entering virtually every enterprise and business realm, it is easy to get caught up in all the stories data can tell and all the things it can predict, but those stories and predictions are not immune to flaws. Silver and his team have built an empire with FiveThirtyEight, using data to tell stories about politics, sports, current events, and much more. Again, Silver is the first to acknowledge the limitations of such reporting.

I want to focus in particular on a recent interview with Nate Silver about the polls leading up to the election. He has been asked many times how he, and so many others, got the outcome wrong. His answers reveal a deep understanding not only of what data tells us, but of what we need to keep in mind when using it. For example, he notes that his model gave Donald Trump a 30% chance to win, which is hardly an unlikely outcome. The favorite does not always win; his forecast should instead have been read as saying the race would be competitive, which it was.
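To make that intuition concrete (this is my own illustration, not Silver's model), a 30% chance means that if the same election could somehow be rerun many times, Trump would win roughly three times out of ten. A minimal Monte Carlo sketch in Python, with the probability simply assumed from the forecast above:

```python
import random

# Simulate an event with an assumed 30% probability many times.
# An "upset" at these odds happens about 3 times in 10 -- not rare at all.
TRUMP_WIN_PROB = 0.30   # assumed from the forecast cited above
TRIALS = 100_000

wins = sum(random.random() < TRUMP_WIN_PROB for _ in range(TRIALS))
print(f"Observed upset rate: {wins / TRIALS:.3f}")  # ~0.300
```

For comparison, a fair die comes up 1 or 2 about 33% of the time, and nobody is stunned when it does.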

In later questions, he also talked about the stability (or lack thereof) of the data. Throughout the election season, projected vote shares swung frequently, particularly as news broke in the media, and that variance shows how quickly these predictions can shift. A second thing to keep in mind about predicting elections is that you are forecasting not only how people will vote but also how many people will vote. As more variables are added (there are two here, but many, many more must be taken into account), it becomes harder to be confident in any projection. Lastly, this projection attempted to predict human behavior, which carries an inherent problem: humans don't always do the rational thing, they don't always do what they say they will, and they don't always do what others think they will. These ideas, and many more, need to be kept in mind beyond what the numbers say at their conclusion (here, the projected election winner).
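As a rough sketch of how those two sources of uncertainty compound (the numbers here are invented for illustration and are not from FiveThirtyEight's model), consider a candidate polling at 52%: adding turnout uncertainty on top of polling uncertainty visibly lowers the chance of a win.

```python
import random

TRIALS = 100_000

def win_rate(share_sd: float, turnout_sd: float) -> float:
    """Fraction of simulated elections won by a candidate polling at 52%."""
    wins = 0
    for _ in range(TRIALS):
        support = random.gauss(0.52, share_sd)  # uncertain polled support
        shift = random.gauss(0.0, turnout_sd)   # uncertain turnout effect
        wins += (support + shift) > 0.5
    return wins / TRIALS

# Polling error alone vs. polling error plus turnout error (invented SDs):
print(f"vote-share uncertainty only: {win_rate(0.02, 0.000):.3f}")  # ~0.84
print(f"plus turnout uncertainty:    {win_rate(0.02, 0.015):.3f}")  # ~0.79
```

Each added source of uncertainty widens the distribution of outcomes, so the favorite's win probability drifts closer to a coin flip.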


Silver also talks about the problem that arises when data turns into reporting, since different ways of analyzing and interpreting data can support contradictory conclusions. It's no secret that people take reports seriously, and when data backs them, their public credibility only grows. In logic, one is careful not to hunt for premises that support a predetermined conclusion, but rather to draw a conclusion supported by the premises, and this is no different: the story that stands out from your data should be the one you tell, not the one you want to tell and can mine your data to support. Silver also says that in these situations it is often better to tell no story at all than to give a misrepresentation. Though he is more than qualified to offer an educated opinion, that view is not universally shared. In the age of fake news, integrity is not always at the forefront of people's decision-making. Regardless, big data will be a huge part of politics moving forward, and it will be incredibly interesting to see the differences come the 2020 election.

http://data-informed.com/nate-silver-big-data-has-peaked-and-thats-a-good-thing/
