The world seemed stunned waking up the morning after the 2016 presidential
election. At the end of a drama-filled election season, statisticians had all
but declared the race over in favor of Hillary Clinton. Just four years prior,
Nate Silver had correctly predicted the outcome in every state, allowing him to
announce the winner with ease, but this time his story was very different. Many
tried to blame Silver, saying that his data, methods, or beliefs were flawed,
but among the least surprised was Nate Silver himself.
As the world of data grows so quickly, entering
virtually every enterprise and business realm, it is easy to get caught up in
all of the stories that data can tell and all of the things it can predict, but
that does not mean those predictions are immune to flaws. Silver has built an
empire with FiveThirtyEight, where he and his team use data to tell stories
about politics, sports, current events, and much more. Even so, Silver is the
first to acknowledge the limitations of such reporting.
I want to focus in particular on a recent
interview with Nate Silver regarding the polls leading up to the election. He
has been asked many times how he, and so many others, got the outcome wrong.
His answers show a deep understanding not only of what data tells us, but of
what we need to keep in mind when using it. For example, he notes that his
model gave Donald Trump a 30% chance to win, which is hardly an unlikely
outcome. The favorite does not always win; his forecast should have been read
as predicting that the race would be competitive, which it was.
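To make that concrete, here is a minimal simulation of my own (not Silver's
actual model), which reduces the forecast to a single 30% win probability and
shows how often the underdog still comes out on top:

import random

# Toy sketch, not Silver's model: treat the forecast as one number,
# a 30% win probability, and replay the "election" many times.
TRUMP_WIN_PROB = 0.30   # probability cited in the interview discussed above
TRIALS = 100_000

underdog_wins = sum(random.random() < TRUMP_WIN_PROB for _ in range(TRIALS))
print(f"Underdog won {underdog_wins / TRIALS:.1%} of simulated races")
# Roughly 30% -- about 3 races in 10, far from a shocking upset.

An event with a 30% probability is expected to happen about three times out of
ten; calling it a surprise says more about how the forecast was read than about
the forecast itself.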
In later questions, he also talked about
the stability (or lack thereof) of the data. Throughout the election season,
projected vote percentages swung frequently, particularly as certain news was
released by the media. This variance shows how quickly these predictions can
shift. A second thing to keep in mind about predicting elections is that you
are predicting not only how people will vote but also how many people will
vote. Obviously, as more variables are added (two are named here, but many,
many more must be taken into account), the harder it becomes to be confident
that any projection will hold up, as the quick sketch below suggests. Lastly,
this projection was one that attempted to predict human behavior, which carries
an inherent problem: humans do not always do the rational thing, they do not
always do what they say they will, and they do not always do what people think
they will. These ideas, and many more, need to be kept in mind beyond just what
the numbers say at their conclusion (the projected election winner, in this
case).
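As a rough back-of-envelope illustration (my own hypothetical numbers, not
anything from the interview), suppose each separate input to a forecast (the
polling error, the turnout model, the behavior of undecided voters) holds up
90% of the time and the errors are independent; the chance that all of them
hold up at once shrinks quickly as inputs are added:

# Back-of-envelope sketch with made-up numbers: if every separate input to a
# forecast is "right" 90% of the time and errors are independent, the chance
# that all of them hold up at once falls fast as inputs are added.
per_input_reliability = 0.90
for n_inputs in (1, 2, 5, 10):
    chance_all_hold = per_input_reliability ** n_inputs
    print(f"{n_inputs:2d} inputs -> {chance_all_hold:.0%} chance every input holds")
# 1 input 90%, 2 inputs 81%, 5 inputs 59%, 10 inputs 35%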
Silver also talks about the problem that arises
when data turns into reporting, since different ways of analyzing and
interpreting data can lead to contradictory conclusions. It is no secret that
people take reports seriously, and when there is data backing them, their
public credibility only grows stronger. In logic, one is careful not to hunt
for premises that support a predetermined conclusion, but rather to reach a
conclusion that is supported by the premises, and this is no different: the
story told by your data should be the story that stands out, not the one you
want to tell and can find evidence within your data to support. Silver also
states that in these situations it is often better to tell no story at all than
to give a misrepresentation. Though he is more than qualified to offer his
educated opinion, it is not always an opinion that is shared. In the age of
fake news, integrity is not always at the forefront of people's
decision-making. Regardless, big data will be a huge part of politics moving
forward, and it will be incredibly interesting to see what changes come the
2020 election.
http://data-informed.com/nate-silver-big-data-has-peaked-and-thats-a-good-thing/