The Most Shocking Result in World Cup History (fivethirtyeight.com)
133 points by houseofshards on July 9, 2014 | 79 comments


This isn't just about having trouble with "edge cases", it's a fallacy to even think statistics are useful for predicting outcomes from such a small group where individuals have so much freedom to determine the outcome.

Statistics may generate interesting facts after an event but for populations of people as small as two soccer teams there's no way they can predict individuals' actions. Maybe it works better for baseball because each individual has a sufficiently narrow range of potential impacts, but in soccer each person can do relatively whatever they want at any time.


If you were correct then soccer oddsmakers would have gone bankrupt a long time ago. The fact is that the average outcome occurs the majority of the time. High profile surprises like this cause people to question statistics, but this line of thinking is based on a fundamental misunderstanding. The purpose of statistical inference isn't to predict the outcome of every event, but to be right slightly more often than not.

This world cup has been remarkably predictable on the whole. The favored team won the overwhelming majority of matches in group stages and in the rounds of 16 and 8. And most of the favored teams qualified for the Cup in the first place. Soccer isn't unpredictable because of the impact of individuals or size of the teams or the nature of the game. It's just as predictable as any other sport, which is why bookmakers are able to set neutral odds that attract an equal amount of betting on both sides.

Of course there will be highly unpredictable results, and they will cause people to question the common sense idea that the most likely thing usually happens. Five Thirty Eight is a bit on their heels here because they have chosen to give themselves the job of explaining a predictable world to human minds that aren't built for understanding statistics. It's a difficult job and I don't envy them. This won't be the last time we see them talking about edge cases and outliers instead of just saying "The odds are right most of the time, which implies that sometimes they will be very wrong."


538 predicted the winner in the group stage correctly in 28 of 48 matches. For teams advancing out of the group stage, they got 7 out of 16 right. Maybe the results were "predictable", but only if you assume that people actually making predictions for some reason took no advantage of that predictability.

(Source: http://www.sportingintelligence.com/2014/06/24/bankers-and-b...)


So they are just as good as flipping a coin? That's brilliant!


My understanding is that bookies work on the risk neutral probability, not on any predictive power.


^^ this. Statistics help them average out their wins/losses across an entire season. But if the bookie's business model was to predict only the final game of each season and make odds for it they would _not_ be in business for long


It's hard to predict a small group. Reliably. The masses are easy. Wall Street and Madison Ave do it all the time. But in football (soccer), which is fast paced and unpredictable and where an individual can have major influence, you have a cluster fuck. Anyone's prediction is just a guess.

Now baseball is another matter. A much more rigid game. Easier to predict.

You know, we see this all the time in politics. How many upsets over the years? The models need to be thrown out every year. But it's expensive. So the US media clings to their same old models and is "shocked" when an upset occurs. And then they usually search so fast for a reason, they miss the real reason in the process. Hollywood has been missing the mark too. They also need to throw out their models and start over. BADLY.

Anyway I sometimes wonder just how far away we really are from Asimov's "Psychohistory".


> So the US media clings to their same old models and is "shocked" when an upset occurs. And then they usually search so fast for a reason, they miss the real reason in the process. Hollywood has been missing the mark too. They also need to throw out their models and start over. BADLY.

This seems to assume that the US media is interested in having an accurate model of reality. I think it's far more reasonable to assume that the US media is driven by self interest, and 'shocks' and 'upsets' are part of what consumers are interested in. Speedy answers, not accurate answers. The model works for them.

To quote Steve Jobs from a Wired interview in 1996:

"When you're young, you look at television and think, There's a conspiracy. The networks have conspired to dumb us down. But when you get a little older, you realize that's not true. The networks are in business to give people exactly what they want. That's a far more depressing thought. Conspiracy is optimistic! You can shoot the bastards! We can have a revolution! But the networks are really in business to give people what they want. It's the truth."


People want to know who won the election. Getting that wrong does not help news networks in any way.

And, it's not like news media are the only people who try to predict elections. Eric Cantor and Mitt Romney had vested interests in winning their elections, but they still got it surprisingly wrong.


> Hollywood has been missing the mark too. They also need to throw out their models and start over. BADLY.

Based on what? I work in the film industry, and studio level (ie larger budget) projects are heavily data driven. Considering the slipperiness of the input factors, I think Hollywood does a pretty good job actually. Do your remarks have a quantitative basis, or are you expressing your dislike of Hollywood's output?


Dislike of Hollywood's output, for sure. But "slipperiness" is a great term. As I said, it's hard to predict small groups. Are you saying Hollywood can't do a better job of figuring out what we want to see in big budget films? Cause all we are getting is the same crap over and over and over.


The input factors are slippery because many of them are not easily quantifiable. Quantifiable factors include past financial performance of films employing particular actors, directors, and other creative personnel; ratios of marketing to production budget, run-time, correlation of scene length distribution with power-law spectra, and various others.

Story factors are almost impossible to quantify reliably. There are some popular story schema like the Hero's Journey, and some pacing guides, like Blake Snyder's 'Beat Sheets' that look at the structure of many successful movies to make guesses about the optimum time for story structure - eg if it's an action movie you need a fight scene by minute 15, or the Villain has to appear within a certain time frame and get x minutes of solo screen time - but those tools are fashion-driven and if anything they contribute to the cookie-cutter approach. Whenever someone produces a better-than-average guide to what makes popular movies tick, it is quickly imitated as widely as possible.

I've written several screenplays and dislike templates, but I don't know in detail what people want to see. When I write or am working on film production I just have to keep playing the story like a movie in my head and attempt to gauge whether it's sufficiently involving and consistent over time. The bigger the budget, the more conservative production decisions are likely to be, since executives are biased in favor of repeatability.


I doubt that any particular baseball game is easier to predict than any particular soccer game. But there are so many baseball games that aggregate statistics can become useful tools of analysis. Even the knockout tournaments in baseball are series of 5 or 7 games between the same teams.

World Cup knockout games are rare and are single-elimination. Same with U.S. elections. That makes it easier for the final outcome to be determined by outliers or lower probabilities.

Broader trends can still be predicted, though. It's not like North Korea just defeated Brazil 7-1...Germany is a well-known powerhouse. And in U.S. politics, quite a few people predicted that a protracted conflict in Iraq would hurt Republicans, and it did in 2006 and 2008.


Perhaps, or maybe the right data just isn't tracked / accessible yet.

Basketball is an instructive example --- as recently as 10-15 years ago, it was thought that this game couldn't be quantified/predicted nearly as well as baseball, that it had a lot of the same fluid properties of soccer and (US) football. Fast forward and a lot of work has been done to push basketball much closer to the baseball-side of the spectrum. Who's to say whether or not taking detailed data of every movement of every player in a soccer match might yield similar breakthroughs?


From what I have read (previous 538 blog post on Messi), they are already tracking a good deal of data about the games. I think one issue with soccer is that there is a lack of discrete, measurable outcomes in the game. I read a while back that one of the breakthroughs in basketball analysis came when they started tracking the total point differential during each player's time on court. Because so many points are scored in a game, and because so many games are played in a season, this stat was a fairly reliable and accurate picture of how a player would impact the team's performance (which allowed teams to measure the impact of players who may not rank high in the more traditional stats).

In soccer, you don't have a lot of data points to model against. The number of goals scored is typically low. Because of this, there is probably a higher level of uncertainty and variance in the outcomes of soccer games (and the prediction models as well).


You do have lots of data points. Each pass is a data point, each shot is a data point. Opta logs more than 2000 events per game, each with an outcome and pitch coordinates. Yes, soccer is more complex than even basketball, but there's a lot more money involved and people watching. This stuff is being worked out right now, and it's an exciting field.


Often shots on goal or number of corners are used as a proxy variable because those events occur much more often than scored goals. But you're right that football is incredibly hard to model. For example, what would happen to the Argentinian team if Messi gets injured? Any pundit can tell you that it would probably be "really bad", but quantifying exactly how bad is currently impossible.


It seems that it wasn't so much about the rigidness of the game, as much as it was Neymar missing from the field. Brazil is very superstar-oriented, with excellent offense followed by pretty good midfielders, followed by okay defense, followed by a bad goalkeeper.

Predictions on baseball teams probably fall through when a star player is injured.


No, this match had nothing to do with Neymar missing, and everything to do with Thiago Silva missing.

Neymar wouldn't have marked Muller for the first goal. Nor would he have been at the back to clean up the second fuck up, and the many fuck ups thereafter.

Thiago Silva is an excellent defender who organizes his defense to ensure proper defensive shape. In his absence, David Luiz was in charge of the same, and he isn't a particularly great defender to begin with (decent attacker, sure, but not a good defender).

Further, football resists statistical analysis, unlike most American sports, because momentum plays a HUGE part. Upsets are a common phenomenon in football. Small teams beat the big fish ALL the time.


"Brazil is very superstar-oriented"

Which is exactly why their current models failed. They need to be flexible.

Lots of people lost money. Maybe it was done on purpose? The models I mean. Cause I bet someone made a fortune. Someone short-selling the game?

PS: Or maybe Neymar's injury was "managed". Humm.. a nice juicy conspiracy theory. Stranger things have happened in the game before.


> Lots of people lost money. Maybe it was done on purpose? The models I mean. Cause I bet someone made a fortune. Someone short-selling the game?

The amount of collusion required across FIFA, the teams and players, multiple publishing industries, organizations, countries and cultures to have sports betting and publications like fivethirtyeight make shit up (to the point that they created a very detailed scoring system for their predictions, all for the purpose of lying about a lopsided 7-1 game) would have to be enormous.

I think the better, and frankly incredibly blindingly obvious, way to look at this is that this was just an unusual game that nobody expected and you can maybe take a break today from being suspicious about the whole world being manipulated by a nefarious few who conspire to create every fucking notable turn in history to screw you or someone else over.


"being suspicious about the whole world being manipulated by a nefarious few who conspire to create every fucking notable turn in history to screw you or someone else over."

But it's fun to think of "Illuminati".


Does evidence from betting markets back up this hypothesis?


It's still challenging to predict the outcome of a baseball game. Sabermetrics can project the outcome of a 162-game season, but a playoff series is too small of a sample size. Lewis stresses this in Moneyball.
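
To put a rough number on the sample-size point, here is a minimal sketch, assuming each game is an independent coin flip with a fixed win probability for the better team. The 0.55 figure is made up for illustration and is not taken from Moneyball or any real team.

```python
from math import comb


def series_win_prob(p: float, games: int) -> float:
    """Chance that a team with per-game win probability p takes a best-of-`games` series.

    Assumes independent games with a constant p (a simplification).
    """
    need = games // 2 + 1
    # Same as playing out all the games and winning at least `need` of them.
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(need, games + 1)
    )


# Hypothetical edge: the better team wins 55% of individual games.
for n in (1, 5, 7):
    print(n, round(series_win_prob(0.55, n), 3))
```

Under these assumptions a best-of-7 only nudges the better team's chances up to about 61%, so a short series still leaves plenty of room for the weaker side.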


Has anyone read "The Foundation" by Asimov?


I'm willing to wager more than half the readers here have. For those over 35, make that more than 75%.


Are you kidding? I'm guessing most of us have. And like me, many have probably read the whole series. Oh and if you read the I, Robot series they sort of connect - the whole human future/history spanning 100,000 years.


You mean "Foundation"?


The fact that variation is high does not mean statistics doesn't work. Nate Silver's model never claimed to predict individual behavior.


No, it didn't predict individual behaviour, but it tried to predict a global outcome, which is based on the summation of 22 (plus subs) individual behaviours/actions. And that's what made it fail in this specific case.


Can you rephrase that in a way that makes sense? You appear to be saying that because sports teams are small that no probabilities can be assigned to the results of their games. There's literally a whole gambling industry that exists precisely because such a thing is possible. Likewise you appear to be arguing against insurance as a business model.

My guess is that you're misinterpreting Silver's work and assuming he was "predicting" that Brazil "would" win. He wasn't, and didn't claim to. He said Brazil was more likely to win, and gave numbers to that effect (that in this case happened to be wildly wrong).


I think he's saying it's a souped-up Poisson model, not an agent based model. It's not trying to predict individual behavior.

A Poisson model is probably valid, but it should perhaps be refined a bit. Maybe it should account for a systematic drift if the whole team plays badly (or a vital player, like the goalkeeper, has a bad day). But it's probably really hard to calibrate that, and it's probably not too relevant unless you want to model major upsets.
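
For a feel of how small the tail is under a plain Poisson model, here is a rough sketch. It assumes each team's goals are independent Poisson draws, and the scoring rates are hypothetical numbers I picked, not anything from 538's actual model.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k goals) for a Poisson scoring rate lam."""
    return exp(-lam) * lam**k / factorial(k)


def prob_margin_at_least(lam_a: float, lam_b: float, margin: int, max_goals: int = 25) -> float:
    """P(team A wins by at least `margin` goals), assuming independent Poisson scoring.

    The double sum is truncated at max_goals per team; the mass beyond that is negligible
    for realistic scoring rates.
    """
    total = 0.0
    for a in range(max_goals + 1):
        for b in range(max_goals + 1):
            if a - b >= margin:
                total += poisson_pmf(a, lam_a) * poisson_pmf(b, lam_b)
    return total


# Hypothetical expected goals, with the home side as a slight favourite.
lam_germany, lam_brazil = 1.2, 1.6

print("P(Germany wins by 6+):", prob_margin_at_least(lam_germany, lam_brazil, 6))
print("P(exactly 7-1):", poisson_pmf(7, lam_germany) * poisson_pmf(1, lam_brazil))
```

With rates like these a six-goal margin comes out well under one in a thousand, which is the kind of figure a "whole team has a bad day" term would push up.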


Wouldn't that refinement basically be identifying the tails as quite a bit fatter than you'd thought before? But in that case, the tails are probably substantially fatter for everyone, and so the numbers 538 is quoting may not be far from the "real" ones.

My reading of this 538 World Cup analysis is that the error bars are taken as being very large, and the relative numbers are mostly interesting as a means of comparing the impacts of various players and other factors in the model. Obviously there can still be major flaws in this method, chief among them being that the method tends to atomize the team's contributions. But that's not a horrible first pass, and still can yield some interesting insights.


Yes and yes. It's still probably the biggest upset ever, but not as unlikely as their model suggests.


Statistics is used when a more accurate model cannot be found. This essentially assumes that there are a lot of unmeasured variables. So yes, statistically this was accounted for.


Statistics could be, and I'd say will be, a great tool for soccer. We're only just starting in this massively complex sport, and unfortunately no comprehensive data is available to the public. I'm pretty sure in four years we will be discussing metrics that don't even exist today.


I think you're conflating statistics and probability.


Right, but I think that's what he's saying. Statistics can only describe the average outcome, which is only interesting when evaluating how extraordinary the actual outcome was.


> Statistics can only describe the average outcome...

Um, no. Please go read any statistics tutorial containing the word "standard deviation" to see one example of why you are wrong.


I think he meant average in the sense of generality, not in the sense of the average function.


For those like me who don't know what the saying means:

> Eating crow is an American colloquial idiom, meaning humiliation by admitting wrongness or having been proved wrong after taking a strong position.

http://en.wikipedia.org/wiki/Eating_crow


This is the same person who suggested Dani Alves would be an ideal replacement for the suspended Thiago Silva based on their 'defensive' statistics. While he might be a great statistician, unfortunately it's obvious he doesn't have enough knowledge of the sport to contribute any meaningful analysis.


To be fair, I don't think he could have gone much worse.


Agreed, that was silly.


Neymar is Brazil's talisman. A majority of their goals in this world cup, and quite a large percentage of their goals since he made his debut have come through his direct influence. Ignoring this was the biggest mistake in 538's prediction. Maybe football predictions need a mix of stats and social psychology!

Thiago Silva is thought to be the best centre back in the world by most people. And by quite some distance. He provided leadership and organised that back four. Luiz, Marcelo, Maicon and Dante have never exhibited leadership qualities for their clubs.

I think that suggesting Willian and Dante and Alves (even though Alves is ranked highly in the Guardian's top 100 list, he plays in the wrong position) would be adequate replacements was the biggest mistake.


What would be the justification for removing the discounting term for subsequent goals in the Elo model he mentions?

Silver mentions that it was the lopsidedness of the score that made the game so surprising, and that the Elo model discounts increasingly lopsided games. As a consequence, this game would not be the "most surprising" game in the history of the world cup. Thus he removes the discounting term, re-runs the model, and poof this is now the most surprising game in World Cup history.

That just smacks of attempting to fit a model to one's intuitions. Was it any more "shocking" that Germany won 7-1 than that they were up 5-0 at halftime? I would presume that the "lopsidedness" discount is intended precisely to address the idea that once a team is winning by an overwhelming score the subsequent goals aren't really that surprising.

Really, though, a measure of "shockingness", at least as described in this piece, suggests more about what the Elo model cannot capture than it does about the subjective way that any one game was perceived.
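
To make the effect of the discount concrete, here is a sketch of an Elo-style update with a goal-margin multiplier. The multiplier schedule, the K value, the home bonus and the ratings are all assumptions of mine, loosely following how football Elo ratings are commonly described; I haven't confirmed that the article's model uses these exact numbers.

```python
def expected_score(rating_a: float, rating_b: float, home_bonus: float = 0.0) -> float:
    """Elo expected result for team A; home_bonus is added to A's rating (negative if A is away)."""
    diff = rating_a + home_bonus - rating_b
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def margin_multiplier(goal_diff: int, discount: bool = True) -> float:
    """Weight applied to the update based on the margin of victory.

    With discount=True each extra goal adds less (assumed schedule: 1, 1.5, 1.75, then +1/8 per goal).
    With discount=False every goal counts in full, which is my reading of "removing the discount".
    """
    n = abs(goal_diff)
    if not discount:
        return float(max(n, 1))
    if n <= 1:
        return 1.0
    if n == 2:
        return 1.5
    return 1.75 + (n - 3) / 8.0


def rating_change(rating_a, rating_b, goals_a, goals_b, k=60.0, home_bonus=0.0, discount=True):
    """Points gained (or lost) by team A after one match."""
    result = 1.0 if goals_a > goals_b else 0.5 if goals_a == goals_b else 0.0
    g = margin_multiplier(goals_a - goals_b, discount)
    return k * g * (result - expected_score(rating_a, rating_b, home_bonus))


# Hypothetical ratings; Germany is the away side, so the bonus is negative.
for flag in (True, False):
    print(flag, round(rating_change(2050, 2100, 7, 1, home_bonus=-100, discount=flag), 1))
```

In this sketch the size of the rating swing stands in for "surprise", and dropping the discount nearly triples it for a six-goal margin, which is why the choice looks like it drives the headline.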


I imagine the discount on a lopsided game is because the teams stop really playing. After 3 goals or so it's generally a done deal. So you start to see players pulled, playing very differently (conservatively).


That might apply in a league where you can amortise that loss over a whole bunch of games but in a single-shot knockout game, I'd expect a professional team to keep fighting even at 5-0 down - much as the Brazilians did in the second half. If it weren't for Neuer's outstanding saves, they could have had 3+ goals in the first 15 minutes.

(Although the league sometimes doesn't work like that either - I remember listening to Spurs vs Man Utd when Man Utd went in at half-time 3-0 down and ended up winning 5-3 in what is possibly the most epic half of football I've heard in 30+ years.)


Arsenal coming back from 4-1 down against Reading to win 7-5 was pretty epic as well.

Also, goal difference is important in leagues, so teams tend to play to the end even if the gap is large. Though I think this could be improved by changing to 2 points for a win, with a bonus point for scoring 3+ goals.


Also, it's in the losing team's best interest to play for high variance, low expected value.


Is this data accurate? As a Turkish fan, I don't recall nor can find any records of Turkey losing to Switzerland by a 7-0 margin in 1998.

Edit: Also sadly, Turkey didn't make it to the World Cup in '98.


The only thing I could find was potentially a 7-0 match for Turkey vs. Korea in the 1954 World Cup, which was played in Switzerland? Hard to say, since the data seems to come from:

http://www.world-results.net/

and is a paywalled API.



Perhaps this includes qualifying matches.


The only World Cup qualifier that I could find (going back to 1990) are the play-off games between Turkey and Switzerland in '06. Switzerland wins the first game 2-0, and Turkey wins the second one 4-2.

Nothing as dramatic as a 7-0 margin as suggested by the article's data though.


Given the following:

1) the psychological pressure put on Brazil's players to win the cup on their home turf by their local fans, worldwide fans, media and history of the team.

2) the injury of their most valuable offensive player.

3) the non-participation of their most valuable defensive player.

4) the failure of their coach with other international teams.

5) the general attitude of Brazilian football towards "Zogo Bonito", giving importance to offense and neglecting defense, against the very well organized team like the Germans.

it was pretty obvious that the Germans would dominate the game.

The shocking part for me was not the score. It was the psychology of the Brazilian defenders. They quickly lost their nerve. These players are supposed to be of world class, having played in top clubs and knowing how to handle pressure. I was wrong, apparently.


I think you meant "Jogo Bonito". Yeah, our J sounds like Zh / ʒ.


From the handicap article:

[Among semifinalist teams, only Argentina has maintained ball possession more often than Germany, and nobody makes more short passes per game than the Germans. Germany’s approach is to patiently work the ball into the opponent’s territory, passing it around until its players can create a high-percentage scoring opportunity. Brazil, on the other hand, loves to dribble the ball and create chances by taking on defenders in one-on-one situations.]

http://fivethirtyeight.com/datalab/world-cup-semifinal-crib-...

So here's what happens in nearly every sport. A team focused on individual brilliance and passionate "opportunities" gets clobbered by a disciplined opponent who remains patient. Situations like this often turn into a rout.


Clearly the model does not claim to predict the future or else we'd all be putting money on the games. I don't think a justification or "eating crow" is necessary as the match was obviously an anomaly.


Reading this article (and lots more around world cup time) written about football by North Americans, I've noticed that there's a different way of thinking about teams and countries than in Europe (or the English speaking part at least).

The biggest one that jumps out at me is using the singular to describe teams. We generally describe a team as 'they' rather than 'it'. So we are more likely to see 'Germany’s win will also affect their odds in the World Cup final' than the way it's written in the article.

Has anyone else found any other Americanisms?


> Has anyone else found any other Americanisms?

Offense (US) is Attack (elsewhere).

home-field advantage (US) is home advantage (elsewhere)

"Some of the goals that Brazil keeper Julio Cesar allowed were unavoidable, but he was not exactly Tim Howard in net." (US)

would be [approx]

"Some of the goals that Brazil keeper Julio Cesar conceded were unavoidable, but he was not exactly Tim Howard in goal." (elsewhere)

Away from the article, the most jarring thing for a football (US: soccer) fan exposed to the US media is the use of "tie" instead of "draw". Of course there is also the US concept of an "assist" which has in fact now been embraced by the wider football community.

As well as differences in specific terminology, there is a broader and difficult to pin down difference in the "feel" of US coverage of football. To someone used to traditional football coverage it feels alien, maybe basketball or baseball translated into football rather than native football.

Don't get me wrong, I enjoy American sport and love the rapid fire humour and passion of American sport journalism in all media.


> "Some of the goals that Brazil keeper Julio Cesar conceded were unavoidable, but he was not exactly Tim Howard in goal." (elsewhere)

There's another plural/singular one. It would more likely be 'in goals', or more colloquially, 'in sticks'.

> the use of "tie" instead of "draw"

That reminds me of another one, how the score is communicated. Americans say something like 'Germany won seven to one', I'd say 'Germany won seven one'. This then gets quite confusing for Gaelic football where we say things like 'Mayo won three twelve to one seven' (3 goals and 12 points to 1 goal and seven points)


In goals? I don't think so. Perhaps you are confusing local colloquialisms with standard British English?


That seems to be the more used phrase. Maybe it's an Irish thing, we like to be a bit different too :)


I'm guessing you're from the UK? This isn't specific to teams. Any group is referred to in the plural in the UK and the singular in the US. For example, "the police is" vs "the police are".


Right, and as an American when I read British English it's not the use of "their" vs "its" that strikes me as 'foreign', it's the fact that as you point out, the names of groups take plural verbs in British and singular verbs in American usage.


I'm from the midwest (USA) and I'm not sure what the rule is but here is how I would say it.

The City Council is … The class is … The police are … The firefighters are …

The phrase "the police is" sounds so foreign to me, that I am almost certain I've never heard nor said it.


Almost, I'm from Ireland.

I hadn't noticed the wider use of singular groups previously, but that does make sense when you point it out. I suppose it was just more obvious because we rarely see American analysis of football outside of the world cup.


If you don't understand what you're modelling, you won't create an accurate model.

For programmers, we create an approximation to the real world and then refine the model as we (or the users) come across the imperfections.

Unfortunately, Nate probably still doesn't know how much he doesn't know.

> But there was almost certainly some bad luck for Brazil. It had more shots than Germany in the match

Comparing number of shots (or possession) is a bad metric if the playing styles are different.


Is statistical analysis useless for soccer? No - Australia was unlikely to win the world cup.

Is the 538 model a good enough one? No - when we have two events it thinks are extremely unlikely (this and Netherlands-Spain 5-1), we should suspect that there's something going on it's not capturing.


I mean at the broad level, sure, stats apply. But even so, not many would have predicted Spain going out at the group stages, or Holland making it to the semis.

Football is a highly unpredictable sport because individual contributions can completely change the course of a match. All it takes for a minnow to beat a big team is some organized defending and a lucky goal, like Switzerland vs Spain at the last WC.

In a sport where the margins of victory are small, it is difficult to apply statistical analysis correctly.


The second assertion is a dangerous leap - an outlier event when the sample is small may, in fact, just be an outlier - ie, there isn't a prognosticator on the planet who called those two results. If anything, trying to "re fit" the model to account for these outcomes as part of a Taleb-esque fat tails approach seems more dangerous to me.


Nate Silver stops making adjustments to the model as soon as the results fit the headline.


Actually the prediction is kind of awkward.

Germany was the clear favorite: better team, better individual players in most positions (especially in the crucial midfield) and better strategic education.


It seems the actual article points to the contrary: that the prediction model is generally good, but can get unlucky, and has trouble with edge cases.


If I'm not mistaken, Germany used SAP (the big German ERP company) and big data for help in decisions :)


Looked like a thrown match to me tbh

Don't think models need fitting to this kind of result


Gotta adore the whitewashing priorities.


Brazil looked like a pack of nincompoops scurrying around without sense. Well deserved win for Germany!



