The CCMG has subsequently released a statement on the accuracy of the count from last week's presidential elections and concluded thus:
Now that the Electoral Commission of Zambia (ECZ) has declared results for the 2016 presidential elections, CCMG affirms that its PVT estimates for the presidential election are consistent with the ECZ’s official results. All stakeholders, particularly political parties, that participated in the election should have confidence in the ECZ’s presidential results.
What is a PVT and how is it carried out in practice? And what does it mean for PVT estimates to be consistent with ECZ's official results?
In theory, a PVT works like this: a trusted entity (like CCMG) places agents at polling stations across the country. These agents observe all facets of the voting process, from the voting itself to the counting of ballots afterwards. The key idea is this: as the Electoral Commission collects and aggregates counted votes at polling stations across the country, the CCMG does the same (hence the phrase "parallel tabulation"). At the end of the process, the Electoral Commission announces its final tally and so does the CCMG. The two totals should ideally be the same.
In practice, however, the CCMG cannot manage to send agents to every one of the country's polling stations (which numbered 7,700 last week). They have to pick a "representative sample" of the 7,700 polling stations. Alternatively, a PVT can be conducted across all polling stations (called a Comprehensive PVT). But quite apart from being prohibitively costly, a Comprehensive PVT is, surprisingly, more prone to systematic errors than simply sampling polling stations (read more here).
In picking its sample of polling stations last week, the CCMG used simple random sampling (SRS) techniques. But to guard against the possibility that the SRS only picked polling stations in one part of the country, the CCMG employed a version of Clustered SRS so that polling stations were sampled from every province, district and constituency. The final sample contained 1,001 polling stations whose distribution largely mirrored the actual distribution of polling stations across the country. For example, since the Copperbelt holds 14% of all polling stations in the country, 14% of the polling stations in the CCMG's sample also came from the Copperbelt. Using random sampling ensures that the characteristics of the polling stations in the sample (skill set of polling agents, availability of electricity, etc.) are representative of the characteristics across all 7,700 polling stations. We don't want to pick only polling stations where there is electricity, because these might not be representative of all polling stations in the country.
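To make the idea of a sample that mirrors the national distribution concrete, here is a minimal sketch in Python. It is not the CCMG's actual procedure: the function name `proportional_sample`, the two-region layout and the station labels are all invented for illustration. It only assumes that each region receives a share of the 1,001 sampled stations equal to its share of the 7,700 stations, with a simple random draw inside each region.

```python
import random

def proportional_sample(stations_by_region, sample_size, seed=0):
    """Draw a simple random sample inside each region, sized in proportion
    to that region's share of all polling stations (hypothetical helper)."""
    rng = random.Random(seed)
    total = sum(len(s) for s in stations_by_region.values())
    sample = {}
    for region, stations in stations_by_region.items():
        # Rounding means the pieces may not always add up to sample_size exactly.
        k = round(sample_size * len(stations) / total)
        sample[region] = rng.sample(stations, k)
    return sample

# Invented layout: 7,700 stations, 14% of them (1,078) on the Copperbelt.
stations = {
    "Copperbelt": [f"CB-{i}" for i in range(1078)],
    "Elsewhere": [f"EL-{i}" for i in range(6622)],
}

sample = proportional_sample(stations, 1001)
sample_total = sum(len(v) for v in sample.values())
copperbelt_share = len(sample["Copperbelt"]) / sample_total
print(f"Copperbelt share of sample: {copperbelt_share:.1%}")
```

A real design would apply the same logic at the province, district and constituency levels, and would need a rule for handling the rounding leftovers.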
Since the CCMG worked with a sample of 1,001 stations and not the entire population of 7,700 stations, the CCMG's pronouncement on the election (for example, what percentage of the vote went to candidate X) is called an estimate. And this estimate comes with a margin of error precisely because we are working with a sample and not the full population. (Technical note: The margin of error comes from the idea that if CCMG repeated this process of randomly picking 1,001 polling stations many times, they'd obtain a different estimate each time. The collection of these different estimates would form a distribution from which we could work out the margin of error.) Ideally you want the margin of error to be as small as possible.
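The technical note can be illustrated with a small simulation. This is only a sketch under loud assumptions: the station-level vote shares are made up (drawn from a bell curve and clipped to lie between 0 and 1), every station is given equal weight, and the margin of error is taken as 1.96 times the spread of the repeated estimates, which is one common convention. The numbers it produces are not meant to reproduce the CCMG's figures.

```python
import random
import statistics

rng = random.Random(42)

# Made-up population: one candidate's vote share at each of 7,700 stations.
population = [min(1.0, max(0.0, rng.gauss(0.5, 0.25))) for _ in range(7700)]

def one_estimate(sample_size):
    """Pick stations at random (without replacement) and average their shares."""
    return statistics.mean(rng.sample(population, sample_size))

# Repeat the "pick 1,001 stations" exercise many times.
estimates = [one_estimate(1001) for _ in range(500)]

# The spread of those estimates gives the margin of error
# (1.96 standard deviations is the usual 95% convention, assumed here).
margin_of_error = 1.96 * statistics.stdev(estimates)

print(f"typical estimate: {statistics.mean(estimates):.3f}")
print(f"margin of error:  +/-{margin_of_error:.3f}")
```

Increasing the sample size in `one_estimate` shrinks the spread of the estimates, which is why a larger sample yields a smaller margin of error.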
Table 1 below provides estimates of vote shares for each candidate from CCMG's PVT in addition to ECZ's official vote shares from last week's elections. The table also provides margins of error.
Table 1
An ECZ vote share is consistent with CCMG's vote share if the ECZ share falls within CCMG's estimate plus or minus the margin of error. For instance, Edgar Lungu of the Patriotic Front was declared by the ECZ to have obtained 50.4% of valid votes cast (see the second column). The CCMG's PVT, based on 1,001 polling stations, estimated Mr. Lungu's share at 50.2% with a margin of error of +/-2.5%. This implies a lower bound of 47.7% and an upper bound of 52.7% of the vote share. ECZ's announced share for Mr. Lungu falls within this range and is therefore consistent with CCMG's estimate.
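The consistency check is simple arithmetic, and can be written out as follows. The function name `is_consistent` is just a label chosen for this illustration; the figures are the ones quoted above for Mr. Lungu.

```python
def is_consistent(official_share, pvt_estimate, margin_of_error):
    """True if the official share lies inside estimate +/- margin of error."""
    lower = pvt_estimate - margin_of_error
    upper = pvt_estimate + margin_of_error
    return lower <= official_share <= upper

# ECZ share 50.4%, PVT estimate 50.2%, margin of error 2.5 percentage points.
lungu_consistent = is_consistent(50.4, 50.2, 2.5)
print(lungu_consistent)
```

Had the ECZ announced a share outside the 47.7% to 52.7% band, the same function would have returned False.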
Now, there is an important proviso to this process that I have not seen anyone else raise in the discussion of the CCMG's PVT this past week. It involves measurement error, which is different from the margin of error. The margin of error is a given in any process where you use a sample to make statements about a population. In making such a statement, there is, however, an assumption that what you are measuring (e.g. the vote count) is measured properly. In practice, you are always going to commit some type of measurement error, but the idea is that such errors should be small and not systematic. An illustration: suppose there are two groups of people, A and B, trying to figure out the average weight of Lusaka residents. Group A is going to use a random sample, and Group B will visit every one of Lusaka's residents and take their weights. Each group is given a scale to measure the weight. But it turns out that both scales were manufactured with the same defect: they add 2 kilos to the actual weight. If the sampling is done right (i.e. margin of error is minimized), the estimate of the average from Group A and the one from Group B will be consistent with one another, but both will be about 2 kilos too high because of the systematic measurement error. So if there is measurement error and it's systematic, agreement between the PVT and the official count will not be informative about the voting process, just as the average weights from our defective scales will not be informative about the actual average weight in Lusaka.
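The scales example can also be simulated. Everything here is invented for the illustration: a population of 20,000 residents, true weights drawn from a bell curve centred on 65 kg, and a sample of 1,000 for Group A. The only feature taken from the example is the shared 2-kilo defect.

```python
import random
import statistics

rng = random.Random(7)
BIAS_KG = 2.0  # both scales read 2 kilos too heavy

# Invented population of true weights, in kilograms.
true_weights = [rng.gauss(65, 12) for _ in range(20000)]
# What either defective scale would actually display for each resident.
measured = [w + BIAS_KG for w in true_weights]

true_mean = statistics.mean(true_weights)
group_b = statistics.mean(measured)                    # weighs everyone
group_a = statistics.mean(rng.sample(measured, 1000))  # weighs a random sample

print(f"true average:     {true_mean:.1f} kg")
print(f"Group A (sample): {group_a:.1f} kg")
print(f"Group B (census): {group_b:.1f} kg")
```

Groups A and B land close to each other, yet both sit roughly 2 kilos above the true average. Agreement between the two therefore says nothing about whether the scales themselves were right.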
Lastly, according to CCMG's PVT, Mr. Lungu's vote share could range anywhere from 47.7% to 52.7% of valid votes cast. This PVT tells us that there is likely a range of vote shares where Mr. Lungu did not get the majority needed to prevent a run-off (from 47.7% to 50%) and there is likely a range where he did (50% + 1 to 52.7%). Note that this conclusion holds even when there is zero measurement error, which the CCMG assumes here. I don't know what to make of this last point or if it's even important at all.