Ah! les sondages: Tomorrow, the French presidential election

First, a word about a new feature of this blog for this election: this post is a collaboration with P.J. Fournier (Qc125.com), which allows us to combine our expertise.

And so, today’s topic: what should we expect from the French presidential election tomorrow?

Let us first examine the polls published in recent weeks. The first graph shows the change in support for the five main candidates since the beginning of March. It shows what other researchers, pollsters and media alike have shown. Support for Macron and Le Pen has varied a little since the beginning of the campaign: support for Macron is quite stable, while support for Le Pen has decreased somewhat. Support for Fillon has been stable lately, while Hamon and Mélenchon have exchanged support. As a result, Mélenchon is now tied with Fillon at 20%. These estimates depend on all the polls published since the beginning of March.

If we use only the polls conducted since the beginning of April, the analysis is less dependent on older polls. Is the picture different? Not really. The conclusions are the same. All the polls show Macron slightly ahead of Le Pen, with Fillon and Mélenchon tied for 3rd and 4th place.

And what about the margin of error?

If we take the margin of error of each poll, we could conclude, as some have done, that “anything can happen”, because support for the four main candidates is usually within the margin of error of individual polls. This interpretation is not adequate. The margin of error for all the polls combined is much smaller. In a way, it is as if we combined all the samples, or took into account the fact that the current estimates depend on the previous ones. It is not as if “anything” can happen. Since the method used in the preceding graphs does not show the margin of error of the estimated regression lines, the following graph shows it more clearly.
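To make the point concrete, here is a minimal sketch in Python comparing the margin of error of one poll with that of several polls pooled together. It assumes simple random sampling and the usual 95% normal approximation, which quota-based polls do not strictly satisfy. The sample sizes and the 23% share are hypothetical values chosen for illustration, not figures taken from the polls discussed here.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error (as a proportion) for a share p among n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical illustration: one poll of 1,000 respondents
# versus ten such polls treated as a single pooled sample.
single = margin_of_error(0.23, 1000)
pooled = margin_of_error(0.23, 10 * 1000)

print(f"single poll:      +/- {100 * single:.1f} points")
print(f"ten polls pooled: +/- {100 * pooled:.1f} points")
```

Under these assumptions, the pooled margin is smaller than the single-poll margin by a factor of the square root of ten, which is why two candidates can be within the margin of error of every individual poll and still be distinguishable once the polls are combined.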

Since there has been a lot of movement in the last two weeks, it is possible to estimate the margin of error for all the polls published during that period, as if they had all been conducted at the same time. However, the estimate gives more weight to the most recent polls (using a squared weighting) in order to compensate. It is a conservative estimate of the margin of error. The following graph gives a visual portrait of the confidence interval for each candidate’s support. It takes into account all the polls conducted from April 8 to 21. The intervals are not large because there is not much variation between the estimates of the various pollsters, owing to the methodology they use (see later in this post).
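One way to sketch such a recency-weighted estimate in Python is shown below. It is not necessarily the exact procedure used for the graph: the weight of each respondent is assumed here to be the square of the poll's recency rank (1 for the oldest day, larger for more recent days), the precision is approximated with Kish's effective sample size under simple random sampling, and the poll figures are hypothetical.

```python
import math

def weighted_estimate(polls, z=1.96):
    """polls: list of (recency_rank, share, sample_size).

    Returns (mean, low, high) for the recency-weighted share.
    """
    # Each respondent in a poll gets weight rank**2 (assumed meaning of
    # "squared weighting"); a poll's total weight is n * rank**2.
    sum_w = sum(n * rank ** 2 for rank, _, n in polls)
    sum_w2 = sum(n * rank ** 4 for rank, _, n in polls)
    mean = sum(n * rank ** 2 * share for rank, share, n in polls) / sum_w
    # Kish effective sample size: unequal weights reduce the usable sample.
    n_eff = sum_w ** 2 / sum_w2
    moe = z * math.sqrt(mean * (1 - mean) / n_eff)
    return mean, mean - moe, mean + moe

# Hypothetical polls: (recency rank, share for one candidate, sample size).
example = [(1, 0.240, 1000), (2, 0.230, 1500), (3, 0.245, 2000)]
mean, low, high = weighted_estimate(example)
print(f"{100 * mean:.1f}% (95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```

Because unequal weights shrink the effective sample size, the resulting interval is wider than a plain pooling of all respondents would give, which is one sense in which such an estimate is conservative.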

What can we conclude? If the polls are reliable, it is impossible to be sure who, between Macron and Le Pen, will finish first, because statistically they have equal support (the confidence intervals overlap). The same is true of Fillon and Mélenchon. However, the graph also shows that the confidence intervals for Fillon and Le Pen overlap slightly. Le Pen could get a score as low as 20.7% while Fillon could go as high as 21%, i.e., he could finish second. In contrast, Mélenchon is significantly lower than the two leading candidates, Macron and Le Pen. In short, if we rely on the polls, the two candidates most likely to finish in the top two are Macron and Le Pen, but the possibility of a Macron-Fillon 2nd round also exists. We have tried other hypotheses using different periods and weights, and we get similar results.
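The overlap reasoning used above can be written as a small check in Python. Only two bounds come from the text (Le Pen's lower bound of 20.7 and Fillon's upper bound of 21.0); the other two bounds are hypothetical placeholders, not values read off the graph.

```python
def intervals_overlap(a, b):
    """True if the closed intervals a=(low, high) and b=(low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Lower bound for Le Pen and upper bound for Fillon are from the text;
# the remaining bounds are hypothetical.
le_pen = (20.7, 23.5)   # upper bound hypothetical
fillon = (18.5, 21.0)   # lower bound hypothetical

print(intervals_overlap(le_pen, fillon))  # True: 20.7 <= 21.0
```

Overlapping intervals are a rough screen rather than a formal test of the difference between two shares, so a slight overlap like this one signals a possibility, not an even chance.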

Are polls reliable?

This is “the question”. To answer it, one has to rely on two types of information: methodology and history.

First, methodological information. What is interesting in the French situation is that the pollsters have to file methodological information and their data with the Commission des sondages, a government body. The Commission’s experts can check the data and decide whether the estimates match the data for each poll. The methodological “Notices” are available for everybody to consult on the Commission’s web site here: http://www.commission-des-sondages.fr/notices/.

One author of this blog, Claire Durand, examined these files for the 2002 presidential election (see: https://academic.oup.com/poq/article/68/4/602/1884181/The-Polls-in-the-2002-French-Presidential-Election#29033826) and for the 2007 election (see: https://doi.org/10.1093/ijpor/edn029). In 2002, when respondents were asked whom they had voted for at the preceding election, only around 5% reported having voted for Jean-Marie Le Pen, the extreme-right candidate who had received 15% at the 1997 election. In 2007, JM Le Pen was overestimated at 14% while he received 10% of the vote. Just after the first round, only 3% to 7% of respondents reported having voted for him. In short, support for JM Le Pen was largely underreported, and this caused problems in the estimation of the vote. It was necessary to weight those who declared having voted for Le Pen by a factor of 2 to 3, or even more. When we examine the files at the Commission des sondages this year, the situation is very different. Support for Marine Le Pen in the 2012 election is only slightly underreported. Support for the two main candidates, Hollande and Sarkozy, tends to be overreported, which is usual for this type of question.
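The size of the correction needed for JM Le Pen follows from simple arithmetic, sketched here in Python. This is a simplification: the function name is hypothetical, and actual pollsters adjust all candidates jointly rather than one at a time. The 15% and 5% figures are the ones quoted above.

```python
def past_vote_weight(actual_share, recalled_share):
    """Factor by which respondents recalling a vote for a candidate must be
    weighted so that the recalled share matches the actual result."""
    if recalled_share <= 0:
        raise ValueError("recalled_share must be positive")
    return actual_share / recalled_share

# 2002 files: about 5% recalled voting for JM Le Pen, who had received 15%.
print(past_vote_weight(15.0, 5.0))  # 3.0
```

With weights of this size, each respondent who admits to a past Le Pen vote counts for three, so a handful of atypical respondents can move the estimate considerably. When recall is close to the actual result, as it is this year, the factor stays near 1 and the adjustment has little effect.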

This year, pollsters like Opinion Way and IFOP present the results of their polls before and after weighting and adjustment, both for the whole sample and for those who say they are sure to vote. These figures show that weighting and adjustment have a very small impact on the estimates. The information from the other pollsters is less detailed, but what we could consult allows us to be confident that the samples are quite representative socio-demographically and socio-politically. It is thus possible to conclude that a catastrophe like that of 2002 is not likely to occur this year. Reports of past vote are very accurate. This could be due to the fact that most polls are now self-administered (web polls) and also to the fact that support for Marine Le Pen is much less “shameful” than support for her father.

However, adjustment based on reported past vote, which is used in France (and also by many pollsters in the UK), will normally tend to underestimate the candidates whose share is increasing compared to the previous election and to overestimate those whose share is decreasing. This is because reports of past vote are not very reliable, and become even less reliable as time passes. In short, people tend to align their recollection with their current voting intention (see: http://surveyinsights.org/?p=3543).

Given this information and the fact that the data are checked by the Commission’s experts, herding (as occurred in the French election of 2002, where five out of six estimates of support for Jospin were similar) would be almost impossible and could be detected. The low variance in the estimates is likely due to adjustment using recall of previous votes, a procedure that mathematically reduces variance.

Now a word about history. Historically, in the 1st round of French presidential elections, right-wing candidates tend to be underestimated and left-wing candidates overestimated. In addition, in elections in general, support for small candidates tends to be overestimated, either because their supporters tend to vote less or because they finally decide to cast a “strategic” vote for one of the candidates in the lead. What happened in 2012? The vote for Hollande and Sarkozy was estimated almost perfectly (https://fr.wikipedia.org/wiki/Liste_de_sondages_sur_l%27%C3%A9lection_pr%C3%A9sidentielle_fran%C3%A7aise_de_2012#Avril_2012). However, the vote for Marine Le Pen was slightly underestimated (one to two points lower than her final score of 17.9%) and support for Mélenchon was substantially overestimated, at 14-15%, while he received 11.1% of the vote.

What are the consequences for tomorrow’s election? Le Pen, who has slightly increased her support compared to 2012 and who has been historically underestimated, could be underestimated on Sunday. As for Fillon, it is more difficult to reach a conclusion. He does not have a positive image in the media, and right-wing candidates tend to be underestimated. However, since he has less support than Sarkozy had at the preceding election, adjustment using vote recall could lead to an overestimation. If these two possible effects cancel out, his support is very well estimated. Mélenchon and Hamon are likely overestimated: Mélenchon because he is on the “far left” and was substantially overestimated in 2012, Hamon because small candidates tend to be overestimated. And what about Macron? For Macron, we have no reliable information that would allow us to determine whether his support is adequately estimated.

In conclusion, unless there is a catastrophe like that of 2002, which is very unlikely given the methodological information provided, the most probable outcome is a Macron-Le Pen 2nd round. All the analyses lead to this conclusion. A poll conducted in the middle of last week showed Le Pen, Fillon and Mélenchon tied, but subsequent polls did not show similar estimates. In fact, the most recent polls all showed increased support for Macron. The second possibility would be a Macron-Fillon 2nd round. It is not likely, but it cannot be totally excluded. Why not Macron-Mélenchon? In addition to what has already been mentioned, we should add that Brexit reminded us that “old” people win votes, and “old” people support Fillon more than Mélenchon. Is it possible that the polls go wrong? In such an election, it is possible, but it is unlikely.


Saturday, April 22, 2017
