Since Friday, besides being angry at not having forecast the likely result, I have been asking myself how this happened. I thought it required further analysis. What about modes?
Well, the difference between modes is one clear reason why I -- and others, I guess -- were misled. Contrary to what we were expecting based on previous referendums and elections, the web polls' estimates of the Leave side -- what we could call the "status quo ante" side -- were higher than the telephone polls' estimates. As we can see in the following graph, during the campaign, web opt-in polls tended to put "Leave" at 50%. Telephone polls' estimates were well behind at the beginning of the campaign -- five points lower -- and around two points lower at the end. On average, taking into account change over the campaign, web polls tended to put the Leave side 3.3 points higher than telephone polls.
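For readers who wonder what "taking into account change over the campaign" means concretely, here is a minimal sketch of how such a mode gap could be estimated: a simple regression of the published Leave share on the mode and the fieldwork date. The data, variable names and specification below are illustrative assumptions on my part, not the actual model behind the figures.

```python
# Minimal sketch: estimating the web-vs-telephone gap in Leave estimates
# while controlling for movement over the campaign.
# The poll numbers here are made up for illustration only.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical poll-level data: one row per published poll.
polls = pd.DataFrame({
    "leave": [50.0, 45.0, 47.0, 51.0, 46.0, 49.0],  # % Leave among decided respondents
    "web":   [1, 0, 0, 1, 0, 1],                    # 1 = web opt-in, 0 = telephone
    "day":   [5, 6, 20, 35, 40, 55],                # days since the start of the campaign
})

# leave_i = b0 + b1*web_i + b2*day_i + e_i
# b1 is the average web-minus-telephone gap, net of the trend over the campaign.
model = smf.ols("leave ~ web + day", data=polls).fit()
print(model.params["web"])  # estimated mode gap, in percentage points
```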
However, there is substantial variation between estimates at the end of the campaign. Some web polls estimated support for Leave as high as 55% before Jo Cox's death and as low as 45% after. The situation is similar for telephone polls: after estimating an advantage for Leave before Jo Cox's death, their estimates went under the 50% line following it. On average, in the polls, Jo Cox's death is followed by a drop of 2.9 points in support for Leave.
Three conclusions arise from this analysis. First, the web polls generally gave a better estimate of the situation than the telephone polls. We tended to believe that web polls generally give higher estimates for the more liberal side (as in the Scottish referendum of 2014, the UK 2015 general election and many Canadian elections). However, in the 2012 US presidential campaign, web polls tended to produce lower estimates for Obama than telephone and IVR polls (see here). We may now conclude that web polls generally give different estimates during the campaign, but that they are not necessarily always biased in the same political direction.
Second, it is possible that people who were for Leave were even less likely to say so after Jo Cox's death. In short, maybe Jo Cox's death did not influence the vote itself but only the tendency to reveal a vote for Leave. This would give weight to the idea that social desirability -- or the spiral of silence -- played a role in this campaign and that it was the Leave side that tended to hide its preferences.
Third, and very importantly, this is the third time in a row in the UK -- Scotland 2014, UK 2015, Brexit 2016 -- that the estimates from the different modes differ substantially during the campaign but converge towards similar average estimates at the end. This is mind-boggling. I know that some researchers and pollsters explain it by "herding", but there is no proof of that -- quite the reverse, in fact. Although the averages tend to be similar, there is much variation in the estimates at the end of the campaign; it seems unlikely that pollsters using the same mode would have agreed on an average. In France in 2002 (see Durand et al., 2004), some pollsters told me how they "chose" the estimates they published. It would be "interesting" to get some insider information in the British case as well.
With a non-proportional attribution of non-disclosers
Now, when I attribute 67% of non-disclosers to the Leave side, as I did in my last post and as I should have done all along, the web polls' estimates show the Leave side ahead throughout the campaign. By contrast, the telephone polls did so only at the end. With this non-proportional attribution, however, both modes give a very good estimate of the final vote, although some pollsters fared better than others.
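For concreteness, the reallocation itself is simple arithmetic. Here is a minimal sketch with made-up poll numbers; only the 67%/33% split comes from the discussion above.

```python
# Minimal sketch of a non-proportional attribution of non-disclosers.
# The 67/33 split follows the text; the poll figures are invented for illustration.
def attribute_non_disclosers(leave, remain, non_disclosers, share_to_leave=0.67):
    """Reallocate non-disclosers (undecided/refusals) before comparing to the final vote."""
    adj_leave = leave + share_to_leave * non_disclosers
    adj_remain = remain + (1 - share_to_leave) * non_disclosers
    return adj_leave, adj_remain

# Example: a poll publishing Leave 44%, Remain 46%, with 10% non-disclosers.
print(attribute_non_disclosers(44.0, 46.0, 10.0))  # -> (50.7, 49.3)
```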
Pollsters conduct polls and their estimates are published in the media. The role of researchers is to analyze the polls and report on their likely bias. However, when the bias is not systematic, this is very difficult to do. We need to understand when and why polls go wrong, and when and why there are differences in estimates according to the mode of administration. In order to do so, however, we need to know where the numbers we are working with come from and how they are compiled, weighted and adjusted. Since polls may influence the vote in some situations, perhaps the data of published polls should be made available to researchers during electoral campaigns. Sturgis et al.'s analysis of the UK 2015 polls is a good example of what can be done when researchers have access to the data.