Wednesday, October 14, 2020

US 2020 - no 3: Let's talk about methods

 Hi,

In this message, I update the graphs presented in my preceding posts showing the different trends by mode of administration. I perform this analysis for the polls conducted since August 1st, in order to allow comparison with my previous post, and for those conducted since September 1st -- we now have enough polls to limit the analysis to more recent ones. I also add a graph showing the trend in the difference between the two main candidates. The analyses are based on the proportion of Trump voters out of the total of Trump and Biden voters. This is the only way to obtain comparable estimates.
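Concretely, the two-party proportion used here can be computed as below (a minimal sketch with made-up poll numbers, not the author's actual code; note that any proportional allocation of undecideds leaves this ratio unchanged, which is what makes it comparable across polls):

```python
def two_party_share(trump: float, biden: float) -> float:
    """Trump's share of the two-party vote: Trump / (Trump + Biden), in %.

    Undecideds are left out of both numerator and denominator, so the
    ratio is comparable across polls reporting different proportions
    of undecided respondents.
    """
    return 100 * trump / (trump + biden)

# Hypothetical poll: Trump 42%, Biden 52%, 6% undecided
print(round(two_party_share(42, 52), 1))  # -> 44.7
```

Allocating the 6% of undecideds proportionally (about 2.7 points to Trump, 3.3 to Biden) would yield exactly the same 44.7% two-party share.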

I also include a section on the methods used by the pollsters. This will help clarify what the modes of administration actually entail.

Recent trends by mode

The next two figures show the trends in support for Donald Trump by mode of administration. The first graph uses all the polls published since August 1st (n=192), the second one, since September 1st (n=120). We could not find the mode of administration for two polls.

The first graph confirms the tendency observed in my preceding posts. The telephone polls using live interviewers (the dotted line) and the Interactive Voice Response (IVR) polls (the upper line) both show that support for Trump increased until reaching a plateau at the beginning of September and has been decreasing since. The web polls (the straight pink line in the middle) do not detect such a movement. They account for 77% of the polls and therefore carry substantial weight in the graphs shown by most aggregators.

The graph also shows that, after the debate -- the vertical dotted line -- there is much variation in the estimates of the different web polls. They place Trump support between 39% -- the lowest estimate -- and 48%. Most of these polls put proportional support for Trump at around 45%.


Before drawing conclusions too early, I refer you to a similar graph in my post at the end of October 2016 here. In 2016, the same downward trend for Trump occurred after the first debate, but he was able to come back afterwards.

If we now examine only the polls conducted after September 1st, here is what we get. It is rather clear that the IVR polls detected a downward trend at least a week before the debate. It is also clear that the web polls do not detect any change. In addition, the IVR polls' estimates are now very close to the web polls' estimates, while telephone polls -- the dotted line -- put Trump support lower than the other polls. However, we need to be careful because there are not that many live-interviewer and IVR polls. Since September 1st, they account for 12 percent and 9 percent of the published polls, respectively.


Another way of showing the same information is to examine the difference in support between Trump and Biden. The following graph shows that the gap between the two candidates has been growing steadily since the beginning of September. It went from 2% to 10% according to IVR polls and from 7% to 13% according to live-interviewer polls. For web polls, however, it stayed almost stable at around 10%.


Let's talk about methods
How can we explain these differences? We need to examine how the different polls are conducted in order to at least propose explanatory hypotheses. We must first realize that our classification into three modes hides substantial differences between pollsters using the same mode of administration.

Pollsters who use live interviewers may have different sampling frames. In the Appendix of the AAPOR report on the polls of the 2016 election, we noticed that the proportion of cell phones in the samples varied from 25% to 75%. This information is rarely provided by pollsters. However, one pollster reports that its samples include 80% cell phones. We may think that people who cannot be reached by cell phone have specific characteristics related to vote intention. However, the 2016 analyses did not show any impact of the proportion of cell phones in the samples on the estimates.

IVR polls in the U.S. should probably be renamed mixed-mode polls. I have grouped in this category all the pollsters who use Interactive Voice Response -- also called robopolls -- to recruit at least part of their respondents. However, because automated calls to cell phones are prohibited in the U.S., pollsters using IVR complete their samples with other sources of respondents. In 2016, their samples usually combined 80% automated calls to landline phones and 20% cell-only respondents recruited via web opt-in panels. In 2020, there is much more variation, and respondents reached via automated calls to landline phones are often a minority.

Web polls also vary considerably in the methods used to recruit respondents. It is not always easy to find information on the samples. Opt-in panels may not be the norm anymore. Some pollsters complete their opt-in panels with river sampling. Some use platforms like Lucid, where respondents are customers of various major retailers. Some use ads on cell phone apps. Some combine different sources of respondents in order to balance the biases that may be associated with each source.

Why don't web polls detect movement where other modes do? It may be due to more homogeneous pools of respondents. However, we will need more information from the web pollsters to assess whether different modes of recruitment explain the observed difficulty in detecting movements. In 2016, we did not find any significant difference between the polls that used river sampling and the others. The difficulty of web polls in detecting movement has been observed in elections in other countries. Web pollsters will have to work on -- and are probably already working on -- finding the source of the problem and eventually correcting it. Of course, they could also be right, but this nonetheless needs to be analysed. It is of utmost importance when web polls account for two thirds of the published polls.

In conclusion

Modes of administration are only one of the differences between pollsters. In a way, we assume that the more variation we have, the more likely the biases will compensate for each other and the average estimates will be reliable. However, if one mode is very dominant, and there are differences between modes, it becomes more difficult to conclude on what is happening. Transparency becomes even more essential.

I am also working on question wording and will return to this topic in a future post. There is much variation in question wordings. Unfortunately, it is difficult to find the wording used by about a third of the pollsters who have published estimates since the beginning of September. It is therefore difficult to assess whether different wordings lead to different estimates.


I thank Luis Patricio Pena Ibarra for his meticulous work entering the data and the related information that helped me conduct these analyses.

 

Methodology for the analyses presented in this post.

1) The trends are produced by a local regression procedure (Loess). This procedure gives more weight to estimates that are closer to each other and less to outliers. It is, however, rather dependent on where the series starts. For example, I started this series with the polls conducted after August 1st, which means that all the polls conducted since then influence the trends. If I start on September 1st, the trends are somewhat different because the polls conducted in August no longer exert any influence. I try to balance the need to have enough polls for the analysis against the importance of not letting old information influence the trends too much.

2) Every poll is positioned at the middle of its fieldwork period, not at the end. This seems more appropriate since the information was gathered over a period of time and reflects the variation in opinion during that period.

3) All the tracking polls are included only once per period. A poll conducted over three days is included once every three days. In this way, we include only independent data, which is appropriate in statistical terms.

4) The data used come from the answers to the question about voting intention. Undecideds (non-disclosers) are allocated proportionally -- for now.

5) For non-Americans, note that, in the U.S.A., IVR cannot be used to call cell phones. The U.S.A. seems to be the only country with this restriction.
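Points 1) and 2) above can be illustrated with a short sketch (a simplified, degree-0 version of the Loess idea with a tricube kernel and invented daily estimates; the actual procedure fits a local weighted regression rather than a weighted mean):

```python
from datetime import date


def fieldwork_midpoint(start: date, end: date) -> date:
    """Position a poll at the middle of its fieldwork period (point 2)."""
    return start + (end - start) / 2


def loess_point(x0, xs, ys, bandwidth):
    """Locally weighted estimate at x0 with a tricube kernel (point 1).

    Estimates close to x0 receive high weight, distant ones little or
    none -- which is also why the starting date of the series matters:
    every point within the bandwidth pulls on the local estimate.
    """
    num = den = 0.0
    for x, y in zip(xs, ys):
        u = abs(x - x0) / bandwidth
        if u < 1:
            w = (1 - u ** 3) ** 3  # tricube weight
            num += w * y
            den += w
    return num / den if den else float("nan")


# Invented daily two-party estimates (day of campaign, Trump %)
days = [1, 3, 5, 8, 10, 12, 15]
pct = [46.0, 45.5, 45.8, 44.9, 44.2, 44.0, 43.5]

print(fieldwork_midpoint(date(2020, 10, 1), date(2020, 10, 5)))  # -> 2020-10-03
print(round(loess_point(15, days, pct, bandwidth=7), 2))
```

With these invented numbers, the smoothed estimate at the end of the series is lower than at the start, mirroring the downward trend in the noisy daily values.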

8 comments:

  1. Many thanks, dear Claire. One small comment. It would be very good to have data on the development of preferences in some specific segments of the voting population. The reactions of individual subsamples to new issues in the last weeks could be really different. Do we have a segmentation into Trump strong supporters (partisans), Trump weak supporters, Biden strong supporters (partisans) and Biden weak supporters? Or another segmentation? All the best and THANKS Hynek

    Replies
    1. There are some pollsters who publish these. It would be too much work for me to look at it, unfortunately. I think that if you look at the YouGov poll reports, you will find interesting data.

      Best,

  2. Hello, this is very interesting. Do you have data on the evolution of landline use since 2016? I believe I read in one of your blog posts that users who have both types of lines could be overrepresented, but I imagine this bias might not be the same in 2020 if more urban users have abandoned traditional landline service.

    Replies
    1. Thank you for the comment. Indeed, fewer and fewer people in the United States have both types of phones, but in my view the problem of the overlap between landline and cell phone samples remains for telephone polls. It is hard to understand how they weight all of that.

  3. 1. Is it possible to have confidence intervals for the estimated lines?
    2. A substantive question: Are there good data on swing voters? How stable is Trump's camp?

    Replies
    1. There is no real confidence interval with local regression. It is possible to include a kind of "fake" confidence interval, but I can do it only in R and with a tricube kernel instead of Epanechnikov. Much work, so I decided it was not worth the effort. I may try to do it before the end of the campaign just to see what it leads to.
      Data on swing voters? Perhaps you could get that from the USC-Dornsife poll (the data are available to researchers). They have a panel. If you look at it, please tell me...

  4. Thank you for the reply. I have another question: Will you do an analysis with a non-proportional allocation of the undecideds, as you usually do? It is said that there are far fewer undecideds in this election. Is that true?

  5. I am thinking of doing that analysis, but it is very difficult because polling firms do not always provide information on the proportion of undecideds, and the proportions sometimes vary considerably depending on how the question is asked and the mode of administration.
