Malaria in high-transmission endemic areas of sub-Saharan Africa (SSA) is characterised by vast diversity of the Plasmodium falciparum parasites from the perspective of antigenic variation (Chen et al., 2011; Day et al., 2017; Otto et al., 2019; Ruybal-Pesántez et al., 2022; Ruybal-Pesántez et al., 2017). As with other hosts of hyper-variable pathogens (Futse et al., 2008), children experiencing clinical episodes of malaria eventually become immune to disease but not to infection. This results in a large reservoir of chronic asymptomatic infections, in hosts of all ages, sustaining transmission to mosquitos. Given the goal of malaria eradication by 2050, it is therefore of interest to examine how the parasite population changes following perturbation by major intervention efforts, both in terms of its size and underlying population genetics.
So, what do we mean by the parasite population size in the case of P. falciparum and how do we measure it? Parasite prevalence, detected by microscopy or more sensitive molecular diagnostics (e.g. PCR), describes the proportion of infected human hosts. Studies of P. falciparum genetic diversity have shown that the majority of people in high-transmission endemic areas harbour diverse multiclonal infections measured as the complexity or multiplicity of infection (MOI) (e.g. Anderson et al., 2000; Paul et al., 1995; Smith et al., 1999; Sumner et al., 2021) with complex population dynamics (Bruce et al., 2000; Farnert et al., 1997). These genetic data indicate much larger parasite population sizes than observed by prevalence of infection alone. Thus, from an ecological perspective, we can consider a human host as a patch carrying a number of ‘antigenically distinct infections’ of P. falciparum. The sum of these antigenically distinct infections over all sampled hosts provides us with a census of the parasite count of relevance to monitoring and evaluating malaria interventions. We refer to this census population size hereafter simply as population size but make clear that this measure is distinct from effective population size (Ne) as measured by neutral variation. This count can be scaled from the host sample to the larger denominator of a host population in the area of interest.
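The census count described above can be illustrated with a minimal sketch. The host identifiers, MOI values, and the linear scaling to a larger host population are hypothetical and chosen only to show the arithmetic; the scaling step assumes the sampled hosts are representative of the population of interest.

```python
from typing import Mapping


def census_population_size(moi_by_host: Mapping[str, int]) -> int:
    """Sum the antigenically distinct infections (MOI) over all sampled hosts.

    Uninfected hosts are included with MOI = 0, so they add nothing to the count.
    """
    return sum(moi_by_host.values())


def prevalence(moi_by_host: Mapping[str, int]) -> float:
    """Proportion of sampled hosts carrying at least one infection."""
    infected = sum(1 for moi in moi_by_host.values() if moi > 0)
    return infected / len(moi_by_host)


def scale_to_population(census: int, n_sampled: int, n_population: int) -> float:
    """Scale the sample census linearly to a larger host population (assumption)."""
    if n_sampled <= 0:
        raise ValueError("n_sampled must be positive")
    return census * n_population / n_sampled


# Hypothetical sample of four hosts.
sampled = {"host_a": 3, "host_b": 0, "host_c": 1, "host_d": 5}

census = census_population_size(sampled)
prev = prevalence(sampled)
scaled = scale_to_population(census, len(sampled), 1000)
```

In this toy sample three of four hosts are infected, yet the census counts nine infections, which is the sense in which prevalence alone understates parasite population size.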
Diversity in P. falciparum single copy surface antigen genes such as circumsporozoite protein (csp), merozoite surface protein 1 (msp1) or 2 (msp2), and apical membrane antigen 1 (ama1) has been widely used to measure MOI (e.g. Falk et al., 2006; Lerch et al., 2017; Nelson et al., 2019). These genes have become part of newer genetic panels (e.g. Paragon v1 [Tessema et al., 2022] and AMPLseq v1 [LaVerriere et al., 2022]) specifically for MOI determination. Typically, MOI is reported as the maximum number of alleles or single locus haplotypes present at the most diverse of these antigen-encoding loci rather than the number of unique multilocus haplotypes of these genes combined, as it is challenging to accurately reconstruct or phase these haplotypes in hosts with an MOI>3 (Lerch et al., 2019). Each of these genes is under balancing selection with a few geographically common haplotypes and many very rare haplotypes in moderate- to high-transmission settings (Markwalter et al., 2022; Sumner et al., 2021). Where there is a high probability of co-occurrence of two or more common single locus haplotypes in a host, genotyping each of these single copy antigen genes alone will underestimate MOI. Single nucleotide polymorphism (SNP) panels have been used to detect the presence of multiclonal infections, but they estimate MOI with limited reliability for the highly complex infections typical of high transmission, even with the use of computational methods (Labbé et al., 2023).
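A small sketch may help show why the "maximum number of alleles at the most diverse locus" convention can underestimate MOI when common haplotypes co-occur. The clone genotypes and haplotype labels below are invented for illustration and do not come from any dataset.

```python
from typing import Dict, List, Set


def moi_from_single_copy_loci(alleles_by_locus: Dict[str, Set[str]]) -> int:
    """MOI reported as the largest number of distinct haplotypes at any one locus."""
    return max((len(alleles) for alleles in alleles_by_locus.values()), default=0)


def observed_alleles(clones: List[Dict[str, str]]) -> Dict[str, Set[str]]:
    """Collapse the clones in one host into the set of haplotypes seen per locus.

    Genotyping a mixed infection only reveals these per-locus sets, not
    which haplotypes belong together in the same clone.
    """
    observed: Dict[str, Set[str]] = {}
    for clone in clones:
        for locus, haplotype in clone.items():
            observed.setdefault(locus, set()).add(haplotype)
    return observed


# Hypothetical host with three distinct clones that share common haplotypes.
clones = [
    {"csp": "c1", "msp2": "m1", "ama1": "a1"},
    {"csp": "c1", "msp2": "m2", "ama1": "a1"},
    {"csp": "c2", "msp2": "m2", "ama1": "a1"},
]

true_moi = len(clones)
estimated_moi = moi_from_single_copy_loci(observed_alleles(clones))
```

Here no single locus carries more than two haplotypes, so the estimate is 2 although three clones are present.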
As an alternative to genotyping single copy antigen genes and biallelic SNP panels to estimate MOI, we have proposed the use of a fingerprinting methodology known as varcoding to genotype the hyper-diverse var multigene family (~50-60 var genes per haploid genome) (Day et al., 2025). This method employs an ~450 bp region of a var gene, known as a DBLα tag, encoding the immunogenic Duffy-binding-like alpha (DBLα) domain of P. falciparum erythrocyte membrane protein 1 (PfEMP1), the major surface antigen of the blood stages (Zhang and Deitsch, 2022). Bioinformatic analyses of a large database of exon 1 sequences of var genes showed a predominantly 1-to-1 DBLα-var relationship, such that each DBLα tag typically represents a unique var gene, especially in high transmission (Tan et al., 2023). The extensive diversity of DBLα tags, together with the very low percentage of var genes shared between parasites (Chen et al., 2011; Day et al., 2017; Ruybal-Pesántez et al., 2022; Ruybal-Pesántez et al., 2017), facilitates measuring MOI by amplifying, pooling, sequencing, and counting the number of unique DBLα tags (or DBLα types) in a host (Ruybal-Pesántez et al., 2022; Tiedje et al., 2022). From a single PCR with degenerate primers and amplicon sequencing, the method specifically counts the most diverse DBLα types, designated non-upsA, per infection to arrive at a metric we call MOIvar. The method is not based on assigning haplotypes. Instead, it exploits the fact that var repertoires are non-overlapping, especially in high transmission, and assumes a set number of non-upsA types per genome, based on repeated sampling of 3D7 control isolates to account for PCR sampling errors, to calculate MOIvar (Ghansah et al., 2023; Ruybal-Pesántez et al., 2022; Tiedje et al., 2022). Consequently, rather than looking at the diversity of a single copy antigen-encoding gene like csp, msp2, or ama1 to calculate MOI, by varcoding we are looking at sets of up to 45 non-upsA DBLα types per genome.
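The counting logic of a fixed-repertoire MOIvar calculation can be sketched as follows. This is a simplified illustration, not the published pipeline: the constant of 45 types per genome is taken from the "up to 45" figure above, rounding up is an assumption about how partial repertoires are treated, and the type identifiers are hypothetical.

```python
import math
from typing import Iterable

# Assumption: a fixed number of non-upsA DBLα types per genome.
NON_UPSA_TYPES_PER_GENOME = 45


def count_unique_types(type_ids: Iterable[str]) -> int:
    """Number of distinct non-upsA DBLα types detected in one isolate."""
    return len(set(type_ids))


def moi_var_fixed(n_types: int, types_per_genome: int = NON_UPSA_TYPES_PER_GENOME) -> int:
    """Convert a count of unique non-upsA types into an MOI estimate.

    Assumes non-overlapping repertoires, so every block of
    `types_per_genome` types (or part of one) counts as one genome.
    """
    if n_types < 0:
        raise ValueError("n_types must be non-negative")
    return math.ceil(n_types / types_per_genome)


# Hypothetical isolate with 100 distinct types (duplicate reads collapse to one).
isolate_types = [f"type_{i}" for i in range(100)] + ["type_0", "type_1"]
n_types = count_unique_types(isolate_types)
estimate = moi_var_fixed(n_types)
```

Because the divisor is fixed, an isolate in which fewer types per genome were amplified than assumed would be assigned too low an MOI under this sketch.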
Prior work has shown that varcoding is more sensitive for measuring MOI in high transmission, where there is an extremely high prevalence of multiclonal infections that cannot be accurately phased with either biallelic SNP panels (Ghansah et al., 2023; Labbé et al., 2023; Tessema et al., 2022; Watson et al., 2021) or combinations of single copy antigen genes (Sumner et al., 2021).
Here, we report an investigation of changes in parasite census population size and structure through two sequential malaria control interventions between 2012 and 2017 in Bongo District, located in the Upper East Region of northern Ghana, a country that is one of the 12 highest burden countries in Africa (World Health Organization, 2022). We present a novel Bayesian modification to the published varcoding approach (Ghansah et al., 2023; Ruybal-Pesántez et al., 2022; Tiedje et al., 2022) that takes into account under-sampling of non-upsA DBLα types in an isolate to estimate MOIvar and therefore population size. We document P. falciparum prevalence, as well as var diversity and population structure, from baseline in 2012 through a major perturbation by a short-term indoor residual spraying (IRS) campaign managed under operational conditions, which reduced transmission intensity by >90% as measured by the entomological inoculation rate (EIR) and decreased parasite prevalence by ~40-50% (Tiedje et al., 2022). Next, we followed what happened to parasite population size more than two years after the IRS intervention was discontinued and seasonal malaria chemoprevention (SMC) was introduced for children between the ages of 3 and 59 months (i.e. <5 years) (Wagman et al., 2018). Detectable changes in parasite population size were seen as a consequence of the IRS intervention, but this quantity had rebounded rapidly by 32 months after the intervention ceased. Overall, throughout the IRS, SMC, and subsequent rebound, the parasite population in humans remained large in size and retained the var population genetic characteristics of high transmission (i.e. high var diversity, low var repertoire overlap), demonstrating the overall resilience of the species to significant short-term perturbations.
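The details of the Bayesian modification are not given in this section, so the following is only a generic illustration of how under-sampling of types could be handled in a Bayesian way; it should not be read as the method itself. It assumes a hypothetical distribution for the number of types detected from a single genome, non-overlapping repertoires, independent sampling of each genome, and a uniform prior over MOI. The toy distribution uses very small numbers for readability.

```python
from typing import Dict


def convolve(p: Dict[int, float], q: Dict[int, float]) -> Dict[int, float]:
    """Distribution of the sum of two independent counts."""
    out: Dict[int, float] = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * qb
    return out


def moi_posterior(
    n_observed: int, single_genome_pmf: Dict[int, float], max_moi: int
) -> Dict[int, float]:
    """Posterior P(MOI = k | n_observed) for k = 1..max_moi under a uniform prior.

    The likelihood for MOI = k is the k-fold convolution of the
    single-genome distribution, evaluated at the observed count.
    """
    likelihoods = []
    dist = dict(single_genome_pmf)
    for _ in range(max_moi):
        likelihoods.append(dist.get(n_observed, 0.0))
        dist = convolve(dist, single_genome_pmf)
    total = sum(likelihoods)
    if total == 0:
        raise ValueError("observed count is impossible under the assumed model")
    return {k: lik / total for k, lik in enumerate(likelihoods, start=1)}


# Hypothetical: one genome yields 2 or 3 detected types with equal probability.
toy_pmf = {2: 0.5, 3: 0.5}
posterior = moi_posterior(n_observed=6, single_genome_pmf=toy_pmf, max_moi=3)
map_moi = max(posterior, key=posterior.get)
```

With six observed types, the toy model cannot distinguish with certainty between two well-sampled genomes and three under-sampled ones, and the posterior expresses that uncertainty instead of committing to a single fixed divisor.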