Thursday, November 5, 2020

US 2020 - no 8. Two days after: did all the polls miss the target?

Hi,

 

My last message, titled "I think Biden will win", was posted on Monday at midday. It included the polls published before 11 a.m. From 11 a.m. Monday to Tuesday morning, 16 new polls were published on different sites. These polls confirmed that support for Trump was increasing. Therefore, I tweeted on Tuesday that I still thought that Biden would win but that the results would likely be closer than expected.

Now that most of the votes are counted, we can compare the polls and the vote. As I am writing this post, the counted vote gives 50.4% to Biden and 47.9% to Trump. Please note that the proportion of votes for Biden is likely to increase as counting continues, so the performance of the pollsters will eventually appear better than it does now. In all the graphs, I use the proportional support for each candidate (support for one candidate over the sum of the two) in order to compare like with like: some pollsters do not publish estimates of support for small candidates, or even the proportion of undecided, so the published figures are not directly comparable. Computed the same way, the figures I use for the reported vote are 51.3% for Biden and 48.7% for Trump.
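To make the conversion concrete, here is a minimal sketch of the proportional support calculation applied to the counted vote quoted above. The function name is only illustrative; it is not taken from the software used for the graphs.

```python
from typing import Tuple


def proportional_support(biden: float, trump: float) -> Tuple[float, float]:
    """Return each candidate's share of the two-candidate total, in percent."""
    total = biden + trump
    if total <= 0:
        raise ValueError("combined support must be positive")
    return 100 * biden / total, 100 * trump / total


# Counted vote at the time of writing: 50.4% Biden, 47.9% Trump.
biden_share, trump_share = proportional_support(50.4, 47.9)
print(f"Biden {biden_share:.1f}%, Trump {trump_share:.1f}%")
# prints "Biden 51.3%, Trump 48.7%"
```

The same function can be applied to each poll's published estimates, so that a poll reporting undecided respondents or small candidates becomes comparable with one that does not.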

The first graph shows the trends in support for the two candidates when I integrate the 223 polls published since September 1st. As we can see, collectively, the polls missed the target, even if we take into account the late increase in support for Trump. However, as the dots close to the election results show, some polls seem to have performed better than others.



What about modes?

The following graph shows the estimates according to the main mode of administration used by each pollster. As we can see, not all the polls were wrong. The two upper lines portray the trends traced by IVR polls and the dotted middle line, the trends traced by live interviewer polls. The trends tend to show that these polls were generally more accurate, and within the margin of error. These polls account for only 20% of the polls overall (14% live interviewer and 6% IVR) but they account for almost all the polls that performed better. Not all the telephone polls performed well and not all the web polls performed badly. However, there is a rather strong relationship between mode of administration and estimates of support. In addition, the telephone polls portrayed changes in support that were missed by the web polls. Therefore, a global average of the polls that does not take mode into account is misleading.


Since IVR pollsters cannot call cell phones, they complete their samples using another mode, usually web, among cell-only users. Other pollsters also combine modes, usually live telephone and web. In the preceding graph, we attributed the mode according to the main mode used. In the following graph, we show the trends according to the use of mixed modes. As the graph shows, there is indeed a tendency for polls using mixed modes -- the upper dotted line -- to be more accurate.



Conclusion

The proportion of web polls went from 50% in 2016 to 80% now. The use of averages that do not take modes into account -- by aggregators for example -- contributed to the impression that all the polls had been wrong. I do not know whether or how this can be fixed for the next election, but I think it ought to be.

The fact that more pollsters resorted to mixed modes is good news in my view. It seems difficult to represent the whole population of the United States using only one mode. Web modes cannot reach close to 15% of the population, which does not have internet access. IVR polls cannot reach cell-only phone users without resorting to another mode. The live interviewer polls can reach almost everybody but they are expensive.

What are the factors that explain differences between modes? It is difficult to partial out the impact of the cluster of methods that come with each mode and, sometimes, with each pollster. Modes vary in their sampling source, in their coverage and in their weighting. They also differ in terms of the length of the questionnaires they use, the way they ask their questions, the order of the vote intention question, etc. It remains that the modes that resort to probabilistic or quasi-probabilistic recruitment tended to perform better. The reports show that some web pollsters are trying to improve their methods to integrate some randomness but the estimates for this election show that there is still work to do. The AAPOR committee that will examine the poll performance will no doubt look at all these features.

Most of the methodology reports provided by pollsters were good enough to get sufficient information on what they do, which is good news. There are seven pollsters for which it has been impossible to find any methodological report. Some of them are academic pollsters or work for established media. This is difficult to understand.

Anybody outside the U.S. would be very surprised, I believe, by the proliferation of pollsters. There are 54 pollsters who have published polls since September 1st. Among them, 21 conducted only one poll and eight conducted two polls. Five of them appeared in the last week of the campaign. I checked and did not find any difference between their estimates and the estimates of more established pollsters. However, it is more difficult to find information about some of these pollsters and to check on their work. This may become a concern when electoral campaigns are heated. Some may be tempted to publish biased or fake polls because they think it may help their preferred candidate.



We thank Luis Patricio Pena Ibarra for his meticulous work in entering the data and gathering the related information to help me conduct these analyses.

Methodology for the analyses presented in this post.

1) The trends are produced by a local regression procedure (Loess). This procedure gives more weight to estimates that are closer to each other and less to outliers. It is, however, rather dependent on where the series starts. For example, I started this series with the polls conducted after August 1st. This means that all the polls conducted since then influence the trends. If I start on September 1st, the trends are somewhat different because of the absence of the influence of the polls conducted in August. I try to balance the need to have enough polls for analysis and the importance of not having old information influence the trends too much.

2) Every poll is positioned on the middle of its fieldwork, not at the end of it. This seems more appropriate since the information has been gathered over a period of time and reflects the variation in opinion during that period.

3) All the tracking polls are included only once for each period. A poll conducted on three days is included once every three days. In this way, we only include independent data. This is appropriate in statistical terms.

4) The data used comes from the answers to the question about voting intention.

5) For non-Americans, note that, in the USA, IVR ("robopolls") cannot be used to call cell phones. The U.S.A. seems to be the only country that has this restriction.
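Points 1 to 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the graphs: the function names are hypothetical, the dates and values are made up, and the smoother is a bare local linear fit with tricube weights, without the outlier down-weighting step that a full Loess procedure applies.

```python
from datetime import date, timedelta


def midpoint(start, end):
    """Point 2: date a poll at the middle of its fieldwork."""
    return start + timedelta(days=(end - start).days // 2)


def non_overlapping(releases):
    """Point 3: keep only tracking-poll releases whose fieldwork
    does not overlap the previously kept release."""
    kept = []
    for start, end, value in sorted(releases, key=lambda r: r[1]):
        if not kept or start > kept[-1][1]:
            kept.append((start, end, value))
    return kept


def local_fit(xs, ys, x0, frac=0.5):
    """Point 1 (simplified): weighted linear fit on the nearest
    fraction `frac` of the points, evaluated at x0."""
    k = max(2, int(round(frac * len(xs))))
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    # Slightly widen the window so the farthest point keeps a positive weight.
    h = 1.001 * max(abs(x - x0) for x, _ in nearest) or 1.0
    w = [(1 - (abs(x - x0) / h) ** 3) ** 3 for x, _ in nearest]
    sw = sum(w)
    mx = sum(wi * x for wi, (x, _) in zip(w, nearest)) / sw
    my = sum(wi * y for wi, (_, y) in zip(w, nearest)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, nearest))
    if sxx == 0:
        return my  # all neighbours share the same x: fall back to the mean
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, nearest))
    return my + sxy / sxx * (x0 - mx)


# Made-up three-day rolling tracker released daily (illustrative values only).
first = date(2020, 10, 1)
tracker = [(first + timedelta(days=i), first + timedelta(days=i + 2), 52.0)
           for i in range(7)]
kept = non_overlapping(tracker)
mid_dates = [midpoint(s, e) for s, e, _ in kept]

# Made-up series lying exactly on a line, as a sanity check of the smoother.
days = list(range(10))
support = [2 * d + 1 for d in days]
fitted = local_fit(days, support, 4.5)
```

In this made-up example, the seven daily releases are reduced to three independent ones (fieldwork ending October 3, 6 and 9), each dated at the middle of its fieldwork. The `frac` parameter plays the role of the smoothing span: a larger value lets older polls influence the trend more, which is the trade-off described in point 1.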

Monday, November 2, 2020

I think Biden will win

 Hi,

I rarely commit myself to predicting election results. This time, I am quite confident. And I know I will feel very bad on the 4th -- or when the final results are known -- if I am wrong... Of course, I examined only the national polls but if Biden ends with fewer electoral votes and a larger proportion of the popular vote than Trump, it would mean that the system really needs to be changed.

I first present the graphs and then, I explain why I think Biden will win. I included all the polls published before 11 this morning. Here is the first graph with all the polls. It shows that there might be a slight decline in support for Biden recently. If this decline is confirmed, however, he would end with the same support as at the beginning of September. As we will see in the next graph, not all the modes estimate that there is such a decline. The difference between support for Biden and support for Trump is still at about eight points. Just before the election in 2016, the difference between support for Clinton and for Trump was closer to five points, using the same methodology.


Let us now examine the graph by mode of administration. We still observe a difference in trends by mode. However, with the integration of the recent polls, the difference is not as large. Live telephone and IVR polls see a recent increase in support for Trump. There are not many IVR polls, so any new poll may modify the trends.

In summary, according to the web polls (close to 80% of the polls), proportional support for Trump has stabilized at around 45% since mid-October. Some of these polls have higher estimates, at around 48%, others, lower estimates, at 42%. According to live interviewer polls (15% of the polls), support for Trump has increased recently, to around 47%. The only polls that estimate the difference in support within the margin of error are the IVR polls (6% of the polls), and they do not estimate that this support is still increasing.


 

Why I think Biden is likely to win

First, the AAPOR report on the 2016 polls has shown that most of the national polls conducted during the last week before the 2016 election were within the margin of error. It identified the lack of weighting by education in the state polls as a possible explanation for their not so good performance. The pollsters seem to have adjusted for that. In addition, some pollsters weight for additional variables like household size, the presence of children, or the vote cast at the last election. The committee concluded that a lower turnout among Clinton supporters was a likely explanation for the difference between the polls and the vote in some states. This does not seem to be happening in this election. The turnout will likely be much higher than in 2016.

Second, a polling miss can come from two main factors: the poll methodology and people changing their minds (or not revealing their preference). In the 1980s and 1990s, the pollsters used very similar methodologies. Therefore, methodological factors could be a likely explanation for polling misses. Right now, the methods used are so diverse that it is unlikely that methods would lead to a polling miss. We can also exclude a last minute shift -- unless something incredible happens after I finish writing this post. Even then, so many people have already voted and cannot change their vote that a major shift is almost impossible. A last minute shift in voting intention has very rarely been proven. Sturgis -- Report of the British Polling Council -- checked for a late campaign shift to explain the poll failure in the UK in 2015 and concluded that no such movement occurred. The only case I know of is the Quebec 2018 polling miss, where there was a substantial move from one of the two major parties to the other during the last days of the campaign (Durand and Blais, 2020).

Now, there is one question left. How come IVR polls show such a difference with other polls? Could they be right and all the others wrong? IVR polls have often performed very well. However, in all these cases, their estimates were not far from the estimates produced by the other modes. In this election, after closing the gap with other polls at mid-campaign, they end with estimates that are four points higher than web polls and more than two points higher than live interviewer polls. There are unfortunately not enough polls using mainly this mode to see if it is due to the pollster or to the mode itself.

In summary, since September 1st, we identified 207 national polls conducted by 54 different pollsters with various methodologies. Only 12 of these polls are mainly IVR polls and 30 are mainly live interviewer polls; 16% of the polls combine different modes. It appears to me very unlikely that the large majority of these polls and pollsters are wrong. My only caveat is that we need to keep in mind that web polls seem to have difficulty detecting change.

Conclusion

For Donald Trump to win, it would be necessary that more than 50 different pollsters with different methodologies be wrong. This is highly unlikely. Electoral polls are often the best polls that you can get because pollsters know that their estimates will be compared with the final results (and with their competitors). Their performance testifies to their competence. They are therefore very careful and they try to improve their methods all the time. However, there are many "one-shot" pollsters in this election (four new ones last week). I checked on their estimates. They are similar to those of established pollsters. Finally, my previous blog post showed that, even when we "load" the estimates in favor of Trump to simulate a "shy Trump" effect, we are still led to conclude that Biden is sufficiently ahead to win.

