Ah! les sondages: Can the polls be wrong?

Hi,

For this blog post, one week before the election, I will first update the analyses from the last post with the most recent data. However, since many people ask whether the polls can be wrong, as in 2022 for example, I reviewed all the methodological information provided by the pollsters. I will go through the possible reasons that could lead polls to miss the target and conclude on this question.

Voting intentions from September 1st to October 27

Let us first look at how voting intentions have changed recently. As other analysts have shown, there has been continuous movement in favor of Donald Trump. Both candidates are now in a real dead heat. We cannot even speak of a margin of error: all the polls conducted last week vary between 49% and 51% for one or the other candidate, with four of the 11 polls at 50% on the two-party share.
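For readers unfamiliar with the two-party share, here is a minimal sketch of how it can be computed from a poll's raw numbers. The function name and the example figures are hypothetical and are not taken from any poll discussed here.

```python
def two_party_share(dem: float, rep: float) -> tuple[float, float]:
    """Return (dem, rep) in percent, renormalized over the two candidates only."""
    total = dem + rep
    if total <= 0:
        raise ValueError("the two candidates must have positive combined support")
    return 100 * dem / total, 100 * rep / total


# Hypothetical poll: 48% Harris, 47% Trump, 5% other or undecided.
harris, trump = two_party_share(48, 47)
print(f"Harris {harris:.1f}%, Trump {trump:.1f}%")
```

Dropping the undecided and third-party respondents in this way makes polls with different proportions of undecideds comparable with each other and with the final vote.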

If we look at voting intentions by combination of mode of administration and sampling source, there is not much difference left between methodologies in the recent polls. Except for one outlier that put Harris at 53% (on the two-party share), all the other polls estimate the two candidates' shares to be approximately equal.

The following table presents, for each week since September 1st, the number of polls, the distribution of the methodology used, and the number of polls favoring each candidate. During the last week, 7 of the 11 polls were either favorable to Donald Trump (3) or tied (4), 7 of them using web opt-in, a mode that has been more favorable to the Democrats on average since the beginning of September and in the 2020 US election.

Therefore, we conclude that the two candidates are "nose-to-nose", which means that, if support for Donald Trump still tends to be underestimated, he is in fact ahead in voting intentions at the national level.

Can the polls go wrong?

Now, let's play devil's advocate and see if there are reasons to believe that polls will be wrong AND overestimate the Republican Party or its candidates as in 2022. How could this be possible?

I need to clarify a few things. First, we tend to conclude that polls went wrong when they did not forecast the right winner. This is what happened in the 2016 US election. However, in the 2020 election, polls were generally further from the target than in 2016 (they are considered to have been the worst since 2008), but since the winner was correctly predicted, the general public tended to conclude that the polls did not get it wrong. In that election, there was a substantial overestimation of the vote for Joe Biden. The polls that used web opt-in methodology were more likely to be far from the target, on average, with a final forecast of 55% for Joe Biden on the two-party share. He got 52.2%.
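To put a number on that 2020 miss, the short sketch below computes the error on the two-party share and the corresponding error on the lead, using the 55% forecast and the 52.2% result quoted above. The function names are illustrative only.

```python
def share_error(forecast: float, actual: float) -> float:
    """Error on one candidate's two-party share, in percentage points."""
    return forecast - actual


def lead_error(forecast: float, actual: float) -> float:
    """Error on the lead over the opponent, given two-party shares in percent."""
    forecast_lead = forecast - (100 - forecast)
    actual_lead = actual - (100 - actual)
    return forecast_lead - actual_lead


biden_forecast, biden_actual = 55.0, 52.2
print(round(share_error(biden_forecast, biden_actual), 1))
print(round(lead_error(biden_forecast, biden_actual), 1))
```

On the two-party share, an error on the lead is always twice the error on one candidate's share, which is worth keeping in mind when comparing figures reported in different ways.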

Second, in general, the polls are more likely to overestimate the more progressive side for a number of reasons. Empirically, it seems that more progressive people tend to be easier to contact and tend to cooperate more with pollsters.

Third, polls may miss the target because people change their minds at the last minute. This happened in the 2018 Quebec election (Durand and Blais, 2020). To conclude that the polls failed, three conditions must be met: a) almost all the polls err in the same direction; b) the difference between the polls and the vote is significant; c) we can rule out a substantial movement in the electorate during the last days of the campaign. In short, the polls may miss the target, but it is a failure if and only if the miss is not due to voters changing their minds.
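The first two conditions can be checked from the polls and the result alone, as in the rough sketch below. The thresholds (90% of polls erring in the same direction, a mean error of at least 2 points) are assumptions chosen for illustration, not established criteria, and the third condition cannot be coded this way since it requires evidence such as reinterviews of respondents.

```python
def same_direction_and_large(errors, threshold=2.0, same_direction_share=0.9):
    """Check conditions a) and b) for a list of poll errors.

    errors: poll estimate minus actual result, in points, one value per poll.
    threshold and same_direction_share are illustrative assumptions.
    """
    if not errors:
        raise ValueError("at least one poll is required")
    over = sum(e > 0 for e in errors)
    under = sum(e < 0 for e in errors)
    # a) almost all polls err in the same direction
    same_direction = max(over, under) / len(errors) >= same_direction_share
    # b) the average difference with the vote is large
    mean_error = sum(errors) / len(errors)
    return same_direction and abs(mean_error) >= threshold


# Hypothetical sets of errors, not real polls.
print(same_direction_and_large([3.0, 2.5, 4.0, 1.5, 3.5]))
print(same_direction_and_large([1.0, -1.5, 0.5, -0.5]))
```

Even when this check returns True, the miss only counts as a failure of the polls if a late change in voters' intentions can be ruled out.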

Now, let us look at why the polls could be wrong

a) After the 2016 election, all the pollsters were told that they should adjust their results according to education level (see Kennedy et al., 2017, 2018). However, self-declared education level is a tricky variable: people tend to boost their level of education. This means that when pollsters weight/adjust their estimates, not all the respondents who should be weighted up are.

If people with a lower education level are more likely to vote for Donald Trump -- this is what we see in the polls -- weighting by education level would tend to lead to an overestimation of the support for Donald Trump. However, most pollsters weighted for education in 2020 and the overestimation of the Democrats was even worse than in 2016.
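The mechanism can be illustrated with a small deterministic sketch. Every number below is hypothetical, and the sketch assumes that the population targets used for weighting reflect true education levels while respondents' answers do not. Under these assumptions, some lower-education respondents land in the higher-education cell, raise Trump's support within that cell, and the weighted estimate ends up above his true support.

```python
def weighted_estimate(pop_low, sample_low, boost_rate, support_low, support_high):
    """Candidate support after weighting on self-reported education.

    pop_low       true share of lower-education voters in the electorate
    sample_low    true share of lower-education respondents in the raw sample
    boost_rate    share of lower-education respondents reporting a higher level
    support_low   candidate support among true lower-education voters
    support_high  candidate support among true higher-education voters
    """
    misreporters = sample_low * boost_rate
    reported_high = (1 - sample_low) + misreporters
    # The reported higher-education cell mixes true members with misreporters.
    support_reported_high = (
        (1 - sample_low) * support_high + misreporters * support_low
    ) / reported_high
    # The reported lower-education cell contains only true members.
    return pop_low * support_low + (1 - pop_low) * support_reported_high


# Hypothetical values: 60% lower-education electorate, 40% in the raw sample,
# candidate support of 60% (lower education) and 40% (higher education).
true_support = 0.6 * 0.6 + 0.4 * 0.4
honest = weighted_estimate(0.6, 0.4, 0.0, 0.6, 0.4)
boosted = weighted_estimate(0.6, 0.4, 0.2, 0.6, 0.4)
print(f"true {true_support:.3f}, honest answers {honest:.3f}, boosted answers {boosted:.3f}")
```

With honest answers the weighting recovers the true value. With 20% of lower-education respondents boosting their level, the estimate moves up by roughly one point in this toy setting.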

b) Many pollsters use registered voter files as their sampling source or filter out respondents who are not registered. What if there is a movement to register people that is stronger on one side than the other? Then this side may be underestimated. In short, if the movement to register is stronger on the Democrats' side, they may be slightly underestimated.

c) The likely voter models at the individual level: Though I have shown for the US (2016 and 2020) that using a likely voter model does not seem to have a significant impact on the estimates, it does not mean that it cannot happen. The way these models are computed is not always transparent. They are based on one or more questions regarding the respondent's intention to vote, to which some pollsters also add information about prior participation in elections. I remember the first Obama election in 2008, where some pollsters revised their likely voter models in view of the specificity of that election and of the possible change in the composition of the electorate. They thought that the pool of voters would be different, given that Barack Obama was the first Black major-party nominee in history and there was high enthusiasm for his candidacy. If the same happens this time, it could impact the estimates. However, I do not see indications in the methodological notices that likely voter models have been revised.

d) The likely voters at the collective level: Some pollsters make assumptions regarding the proportion of Whites, Blacks and Latinos who will vote. These estimates seem to vary somewhat between pollsters. If the estimate of the proportion of Black voters is low, it can underestimate Harris's support. However, given the proportion of Blacks in the voting population, it would likely not change much in the national estimates. Some pollsters also make assumptions regarding the proportion of Democrats, Republicans and Independents in the voting population. These assumptions seem to vary between pollsters.
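A quick sensitivity check shows why a different assumption about the share of Black voters moves the national figure only slightly. The group shares and support levels below are made up for the illustration and do not come from any pollster.

```python
def national_share(group_shares, group_support):
    """Weighted average of candidate support across turnout groups."""
    if abs(sum(group_shares.values()) - 1.0) > 1e-9:
        raise ValueError("group shares must sum to 1")
    return sum(group_shares[g] * group_support[g] for g in group_shares)


# Hypothetical support for Harris within each group.
support = {"Black": 0.85, "other": 0.46}

low = national_share({"Black": 0.11, "other": 0.89}, support)
high = national_share({"Black": 0.13, "other": 0.87}, support)
print(f"{100 * low:.1f}% vs {100 * high:.1f}%, gap {100 * (high - low):.1f} points")
```

In this toy case, moving the assumed share of Black voters by two points shifts the national estimate by less than one point.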

e) Estimates of the Hispanic and Black vote are generally based on very small samples, too small for the stated purpose. In short, if one adjusts for the proportion of Latinos and Blacks who will vote and uses the estimates of their vote in the final estimates, it may lead to estimates that are less accurate, probably an underestimation of the vote for Harris.
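To give an order of magnitude for the small-sample problem, the sketch below uses the textbook margin of error for a proportion under simple random sampling. This is an assumption that real polls, with weighting and design effects, do not fully meet, so actual margins are typically wider. The sample sizes are hypothetical.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a 95% confidence interval, in percentage points.

    Assumes simple random sampling; p=0.5 gives the widest interval.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 100 * z * math.sqrt(p * (1 - p) / n)


full_sample = margin_of_error(1000)  # hypothetical national sample
subgroup = margin_of_error(100)      # hypothetical subgroup within that sample
print(f"n=1000: +/-{full_sample:.1f} points, n=100: +/-{subgroup:.1f} points")
```

A subgroup of about 100 respondents carries a margin of roughly ten points, more than three times that of the full sample.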

f) Weighting by voter recall. Some pollsters use voter recall to adjust their estimates, a practice that is common in Europe. The pollsters who used voter recall in 2020 were among those whose estimates were outside the margin of error in the last 10 days of the campaign. They systematically underestimated the vote for Donald Trump. However, research shows that, if the support for a candidate is going up compared to the previous election, his support will be underestimated by this practice; if it is decreasing, it will be overestimated. Since the support for Donald Trump seems to have increased compared to 2020, his support could be underestimated by the pollsters who use this practice.

In conclusion

I may be wrong, but I do not see any practice that applies to all pollsters and would undoubtedly lead to a bad estimation of the vote. The only possibility left would be a late campaign swing, a very rare occurrence in elections, or undecided respondents being much more likely to vote for one candidate. However, the proportion of undecideds is very low in all the polls, so it is unlikely that it could lead to a biased estimate.

Pollsters should be prepared to reinterview the respondents of their last polls in order to make sure that they can understand what happened if there is a polling miss. People will not accept an explanation based on a change in voting intentions if the pollsters cannot prove that such a change indeed occurred. This means that pollsters should consider asking the respondents of their last pre-election poll whether they may contact them again after the election for a very short survey, and they should be prepared to conduct these reinterviews in the days following the election. With such a close call, whatever happens, pollsters may be considered to have failed.

Monday, October 28, 2024