Ah! les sondages: US2020 no 4. Who can we trust?

Hi,

In my fourth post on the polls of this election, 13 days before the election and two days before the second debate, I tried to go deeper into differences between modes, but also between pollsters.

First, let us have a look at the graphs.

The first graph shows the trends in proportional support for Biden and Trump. We need to use this "proportional support", that is, each candidate's share of the sum of the two, because the proportion of undecideds varies substantially by pollster and by mode. It has ranged from 0% to 12% since September 1st. On average, it is 2.9% for IVR-mixed mode polls, 3.2% for live interviewer polls and 4.6% for web polls.
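To make the arithmetic concrete, here is a minimal Python sketch of the proportional support calculation. The function name and the poll figures are hypothetical, chosen only for illustration; they do not come from any poll discussed in this post.

```python
def proportional_support(biden, trump):
    """Return each candidate's share of the two-candidate total, in percent."""
    total = biden + trump
    if total <= 0:
        raise ValueError("the two candidates need a positive combined share")
    return 100 * biden / total, 100 * trump / total

# Hypothetical poll: Biden 50%, Trump 40%, 10% undecided or other.
biden_share, trump_share = proportional_support(50, 40)

# Gap between the two candidates, in points of proportional support.
lead = biden_share - trump_share
```

Because the two shares always sum to 100, polls with many undecideds and polls with few become comparable on the same scale.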

So here are the trends since September 1st. The difference between the support for the two candidates has increased somewhat, from eight points at the beginning of September to a bit more than 10 points, on average. Biden is clearly ahead in all the polls, except for a few outliers that estimate that the two candidates are very close, even in October. There are also outliers that estimate the difference between the two candidates at more than 20 points. We will come back to this later on.

Let us now look at the trends by mode.

As we can see, there remains a difference in trends by mode of administration, but the discrepancy between modes is becoming smaller. We are heading towards a situation that is very similar to 2016, that is, no difference between web and live interviewer polls (estimate at 45%) and estimates of IVR polls that are higher for Trump (around 46%). However, we see some "outliers", particularly among the web polls. We also still see that the web polls trace a much more stable trend than the telephone polls. In addition, the decline in support for Trump after the first debate now appears very clearly, whatever the mode used.

Of polls and pollsters

Some people -- all people? -- tend to do some "cherry-picking" of polls. They look only at the polls that are good news for their preferred candidate. However, some pollsters' estimates are quite different from the majority of the other estimates.

There were 42 pollsters conducting polls since the beginning of September. Of those, 19 (42%) conducted only one poll. These "one-shot" pollsters are much less likely to comply with AAPOR's transparency guidelines. For six of them, we do not even know the mode of administration used. One pollster states "The survey included mobile phones, landlines and online" as the only information on the mode of administration. For most of them, we do not have the question wording. One pollster asks: "Would you vote for President Trump or for Joe Biden?", a clearly biased question. So we need to be careful about these pollsters' estimates.

The following Box and Whiskers plot shows which polls are confirmed statistical outliers and how far these outliers are from the other polls. The Survey Monkey poll conducted from September 30 to October 1st estimates the difference between the proportional support for the two main candidates at 21.5 points. This is an extreme outlier. One AP-NORC poll and one Opinium poll, both conducted from October 8 to October 12, are also outliers, with an estimated difference of more than 17 points. The Survey Monkey poll estimated Trump's support at 39%, and the AP-NORC and Opinium polls at 41%, while the median of the other web polls is 45%. We cannot really rely on these polls because they are too far from the other polls.

At the other end of the spectrum, one Harris poll, conducted between September 22 and 24, and one IBD/TIPP poll, conducted from September 30 to October 1st, are also outliers: they estimate proportional support for Trump at close to 49%.

The Box and Whiskers plot (information regarding Box and Whiskers plots can be found here) also confirms that the median -- the horizontal line in the middle -- of IVR-mixed mode polls is higher than the median of the polls using other modes.
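For readers curious about how a box plot flags outliers, the following Python sketch applies the conventional Tukey rule: a value more than 1.5 times the interquartile range (IQR) outside the box is an outlier, and one more than 3 times the IQR outside is an extreme outlier. These thresholds are an assumption about the plotting software, the function name is hypothetical, and the gaps listed are illustrative values rather than the actual poll data.

```python
from statistics import quantiles

def tukey_outliers(values, k_mild=1.5, k_extreme=3.0):
    """Classify values lying outside the box-plot whiskers (Tukey's rule)."""
    # Quartiles delimit the box; the IQR is the height of the box.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    flagged = {}
    for v in values:
        # Distance of v outside the box (negative when v is inside it).
        distance = max(q1 - v, v - q3)
        if distance > k_extreme * iqr:
            flagged[v] = "extreme"
        elif distance > k_mild * iqr:
            flagged[v] = "outlier"
    return flagged

# Hypothetical Biden-minus-Trump gaps, in points of proportional support.
gaps = [8.0, 9.0, 9.5, 10.0, 10.0, 10.5, 11.0, 11.5, 12.0, 17.5, 21.5]
flags = tukey_outliers(gaps)
```

With these made-up values, the 17.5-point gap is flagged as an outlier and the 21.5-point gap as an extreme outlier, while everything between 8 and 12 points stays inside the whiskers.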

Why is that? A number of reasons may explain the differences between modes and between pollsters, besides the sources of samples. Among them is variation in question wording. In the vote intention question, some pollsters mention the party for all the candidates, others do not. Some mention the vice-presidential candidates, others do not. Some mention the minor candidates, others do not. In addition, telephone polls tend to use a "leaning question" asked of those who say they are undecided. This tends to reduce the proportion of undecideds. In some very good reports, pollsters reveal the vote of the leaners, which is very relevant information. Leaners tend to support Trump more than the respondents who state their preference right away. This tends to confirm a possible "shy Trump" tendency. Some pollsters rotate the names of the two main candidates -- either Trump or Biden is presented first in the question --, others do not. Unfortunately, information about the use of a leaning question and of rotation is not often mentioned in methodological reports.

The number and type of questions asked before the vote intention question may also have a substantial impact, since these questions may influence the answers to the subsequent vote intention question by drawing attention to some issues. The 2020 election may be a good occasion to look at all these methodological features to assess whether they have an impact on estimation. However, for this, we will need the cooperation of the pollsters.

In conclusion

In 2016, the difference between Clinton and Trump was smaller (7 points) at about the same time, that is, before the third debate (see my blog post of October 28, 2016 here). The main difference is that in 2016, some pollsters had estimates of Trump and Clinton at par or very close, particularly close to the end of the campaign. In addition, support for Clinton started to decline just before the third debate. In this campaign, since September 1st, one IVR poll put Trump very slightly ahead of Biden in mid-September and two polls put the two candidates close to each other at the beginning of October. Since then, polls have been quite unanimous in putting Biden systematically and clearly ahead of Trump. Of course, we know that this is not necessarily enough for Biden to win, because of the electoral system and ... because there are still two weeks to go. Meanwhile, we should not "jump to conclusions" when we see polls that fit our view but differ from other polls conducted at the same time, and even more so when those other polls were conducted using the same mode.

I thank Luis Patricio Pena Ibarra for his meticulous work in entering the data and the related information, which helped me conduct these analyses.

Methodology for the analyses presented in this post.

1) The trends are produced by a local regression procedure (Loess). This procedure gives more weight to estimates that are closer to each other and less to outliers. It is however rather dependent on where the series starts. For example, I started this series with the polls conducted after August 1st. This means that all the polls conducted since then influence the trends. If I start on September 1st, the trends are somewhat different because of the absence of the influence of the polls conducted in August. I try to balance the need to have enough polls for analysis and the importance of not having old information influence the trends too much.

2) Every poll is positioned at the midpoint of its fieldwork, not at the end of it. This seems more appropriate since the information has been gathered over a period of time and reflects the variation in opinion during that period.

3) All the tracking polls are included only once for each period. A poll conducted over three days is included once every three days. In this way, we only include independent data. This is appropriate in statistical terms.

4) The data used comes from the answers to the question about voting intention. Undecideds (non-disclosers) are attributed proportionally -- for now.

5) For non-Americans, note that, in the USA, IVR cannot be used to call cell phones. The U.S.A. seems to be the only country that has this restriction.
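Items 1 to 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `frac` parameter and all the figures are hypothetical, and the `loess` function is a simplified local linear fit with tricube weights, without the extra down-weighting of outliers that full Loess implementations usually offer. It is not the software used to produce the graphs in this post.

```python
from datetime import date

def midpoint(start, end):
    """Middle of the fieldwork period (item 2); half-days round down."""
    return start + (end - start) / 2

def independent_releases(releases):
    """Keep releases of ONE tracking poll whose fieldwork does not overlap (item 3)."""
    kept = []
    for start, end in sorted(releases):
        # Keep a release only if it starts after the last kept one ended.
        if not kept or start > kept[-1][1]:
            kept.append((start, end))
    return kept

def loess(xs, ys, frac=0.5):
    """Simplified Loess (item 1): weighted local linear fit evaluated at each x."""
    k = max(2, int(round(frac * len(xs))))  # neighbours used in each local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
        # Tricube weights: nearby polls count more, polls beyond h count zero.
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Hypothetical three-day tracker that publishes a new rolling estimate every day.
tracker = [(date(2020, 10, d), date(2020, 10, d + 2)) for d in range(1, 8)]
kept = independent_releases(tracker)
mid_dates = [midpoint(start, end) for start, end in kept]

# Hypothetical Trump proportional support for the retained releases.
days = [(m - mid_dates[0]).days for m in mid_dates]
trend = loess(days, [45.5, 45.0, 44.5], frac=1.0)
```

Of the seven daily releases in this made-up tracker, only three non-overlapping ones are kept, each dated at the middle of its three days of fieldwork. Changing the first date fed to `loess` changes the fitted values near the start of the series, which is the sensitivity to the starting point described in item 1.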

Wednesday, October 21, 2020
