Phylodynamic approaches to studying avian influenza virus

ABSTRACT Avian influenza viruses can cause severe disease in domestic and wild birds and are a pandemic threat. Phylodynamics is the study of how epidemiological, evolutionary, and immunological processes can interact to shape viral phylogenies. This review summarizes how phylodynamic methods have and could contribute to the study of avian influenza viruses. Specifically, we assess how phylodynamics can be used to examine viral spread within and between wild or domestic bird populations at various geographical scales, identify factors associated with virus dispersal, and determine the order and timing of virus lineage movement between geographic regions or poultry production systems. We discuss factors that can complicate the interpretation of phylodynamic results and identify how future methodological developments could contribute to improved control of the virus.

Phylodynamics studies how epidemiological, immunological, and evolutionary processes shape viral genetic diversity. Approaches developed within this framework can help recover viral dispersal patterns and evolutionary processes, even when virus genomic data is sampled relatively sparsely from an infected population (Grenfell et al., 2004;Volz et al., 2013;Rife et al., 2017). Time-scaled phylogenies can be inferred using molecular clock models, which quantify the rate of genetic change over time and therefore enable phylogenetic branch lengths to be expressed as units of time rather than as nucleotide substitutions per site (Drummond et al., 2006;Pybus & Rambaut, 2009) (Box 1). RNA viruses, including AIVs, typically have exceptionally short generation times, high evolutionary rates, and large population sizes (Duffy, 2018;Wille & Holmes, 2020). Consequently, genetic substitutions in viral genomes often occur on similar time scales as transmission events between hosts. Hence, it is possible to reconstruct outbreak dynamics from time-scaled phylogenies, as they contain a "molecular footprint" of viral spread (Grenfell et al., 2004;Lemey et al., 2009;Pybus & Rambaut, 2009). When genome sampling location is available, we can use phylodynamic techniques to reconstruct the geographical distribution of viral lineages ("phylogeography"), thereby revealing valuable information about viral spread and factors associated with faster or more frequent viral lineage movement events (Lemey et al., 2010;Faria et al., 2011;Gill et al., 2016).
The most common tools for phylodynamic analyses employ a Bayesian Markov Chain Monte Carlo (MCMC) framework to efficiently explore highly complex models involving many different parameters (Drummond & Rambaut, 2007;Bouckaert et al., 2014). A Bayesian framework has several advantages compared to the maximum likelihood or parsimonybased approaches. Firstly, Bayesian approaches allow for the incorporation of multiple sources of data or prior knowledge (e.g. divergence times, substitution rates) (Alfaro & Holder, 2006;Baele et al., 2017;Chakraborty et al., 2021). Perhaps more importantly, such approaches generate posterior distributions of phylogenetic trees, thus allowing uncertainty in parameter estimates to be captured (this has been reviewed extensively elsewhere, (e.g. Faria et al., 2011;Volz et al., 2013;Gill et al., 2016;Rasmussen & Grünwald, 2020;Dellicour et al., 2021). Maximum likelihood (ML)-based methods are more limited in scope of possible analyses, but are often less computationally intensive than the more popular Bayesian approaches (Baele et al., 2018;Sagulenko et al., 2018;Ishikawa et al., 2019). This can be beneficial when dealing with large datasets, limited computed resources, or when faster but lower complexity models are appropriate to inform emergency responses. Such ML phylodynamic methods typically use a single ML tree, enabling faster time-to-answer compared to Bayesian phylodynamic inference.
Recent trending decreases in the cost and time required to generate and analyse virus genetic data have led to rapid innovation within the field of phylodynamics and a subsequent increase in popularity (Gill et al., 2016;Rife et al., 2017;Grubaugh et al., 2019;Cardona-Ospina et al., 2021). This review highlights how phylodynamic methods have and may continue to aid the study of AIV spatiotemporal dispersal. We first explore recent studies that use phylodynamics to infer AIV dispersal dynamics within or between wild birds and domestic poultry populations at various geographical scales and discuss factors that can complicate the generation of reliable conclusions. Specifically, we discuss the inference of factors associated with AIV transmission, the order and timing of transmission and lineage dispersal events during an outbreak, and how viral lineages can move between different regions and sectors of poultry production systems. We then consider future challenges and opportunities for using phylodynamic approaches within AIV research.

Tracing viral incursions
Phylodynamics has helped identify global movements of AIV lineages in wild bird populations (Baele et al., 2018;Zhang et al., 2023). Many studies have used time-calibrated phylogenies (Box 1) to estimate the sampling time and location of unobserved viral ancestors, and hence reconstruct the timings and origins of viral (particularly HPAIV) incursions into different wild bird populations (e.g. Lee et al., 2018;Hill et al., 2019;Zhang, Fan et al., 2020;Beerens et al., 2021;Xie et al., 2022). For instance, one study showed that HPAIV H5N8 infections in wild birds present in the Netherlands in late 2020 were likely introductions from wild birds in Egypt and not elsewhere in Europe, as was anticipated based on proximity . This finding demonstrates how phylogenetic approaches can provide information on AIV spread that might not be detectable using standard epidemiological analyses based on reported cases.
Complications in interpreting where virus lineages originate can arise in instances where gene segments from multiple genetically diverse AIVs reassort during co-infection (Araujo et al., 2018;Wille & Holmes, 2020;Verhagen, Fouchier et al., 2021). Gene segments acquired from each different parent virus typically require analysis using separate phylogenies to capture the different evolutionary histories of each parent lineage (Lu et al., 2014). Accordingly, reassortant sequences are sometimes represented using a phylogenetic network instead of a single tree (Frost et al., 2015;Stolz et al., 2022) or, more commonly, removed, with only one gene segment (usually haemagglutinin (HA) (Heaton et al., 2013)) selected for analyses. The latter approach limits our understanding of AIV diffusion (Lu et al., 2014;Parvin et al., 2014;Frost et al., 2015;Venkatesh et al., 2018). Although reassortant sequences could perhaps be better accommodated in AIV phylodynamic analyses using a similar approach to that detailed in Müller et al., (2020), which was used to explicitly infer reassortment rates in different human influenza virus lineages, the complexity of this method likely prevents it being used widely (Müller et al., 2020).

Geographic structuring of genetic diversity
Phylodynamic methods have been used to investigate whether AIV lineages found in wild birds are structured according to avian flyways and host ecology (Hurt et al., 2014;Araujo et al., 2018;Mine et al., 2019;Sharshov et al., 2019;Verhagen et al., 2020;Zhang et al., 2023). Several studies indicate that avian hosts within geographically isolated regions sometimes harbour AIV segment lineages that are seemingly relatively distinct from other sampled lineages circulating globally (Hansbro et al., 2010;Hurt et al., 2014;Araujo et al., 2018;Wille et al., 2022). For instance, one study detected an H11N2 lineage in Adélie penguins (Pygoscelis adeliae) in Antarctica in 2013 that likely diverged approximately between the 1960s-1980s from the most closely related AIV sequences worldwide (Hurt et al., 2014). However, it is difficult to determine whether these findings represent true lineage geographic isolation or if lineages only appear geographically structured due to a scarcity of samples from nearby locations that may nevertheless be epidemiologically linked (e.g. southern Chile, Argentina). Other analyses reveal that AIV lineages can be shared between distant regions (Bahl et al., 2009;zu Dohna et al., 2009;Mine et al., 2019;Sharshov et al., 2019;Verhagen et al., 2020;Caliendo et al., 2022). For instance, recent discrete phylogeographic analyses (Box 1) based on the neuraminidase (NA) indicated likely intercontinental AIV dispersal between North America and Eurasia .
Viral phylogeography is the reconstruction of viral movement between geographic locations using virus genetic sequences. Phylogeographic approaches can be categorized based on whether the modelled locations are discrete (e.g. country, administrative district) or continuous (geographic coordinates) (Lemey et al., 2009;Faria et al., 2011).
Discrete trait analyses are sometimes used out of necessity, for instance, when precise geographic sampling coordinates are unavailable (De Maio et al., 2015;Hill et al., 2015;Lycett et al., 2019). However, these approaches are usefully applied when sequences cluster naturally by geographic location because viral movement is affected by geographical (e.g. oceans, mountains) and/or political (e.g. borders) barriers (Alkhamis et al., 2015;De Maio et al., 2015;. Several software packages (e.g. TreeTime , PastML (Ishikawa et al., 2019)) have been developed that enable reconstruction of ancestral characters within a maximum-likelihood framework. The most commonly used software packages for phylodynamic inference, BEAST (Drummond & Rambaut, 2007) and BEAST2 (Bouckaert et al., 2014), rely on a Bayesian inference framework and offer several different approaches for phylogeography using discrete traits. We summarize different maximum likelihood and Bayesian approaches below.
Maximum likelihood ancestral character reconstruction: Maximum likelihood approaches can also be used to estimate the most likely states of discrete traits (e.g. which country or host species) at internal nodes of phylogenies (Cunningham et al., 1998;Schmidt & von Haeseler, 2009;Hadfield et al., 2018;Sagulenko et al., 2018;Ishikawa et al., 2019). These approaches focus on first estimating the most likely phylogeny given the data (i.e. the maximum likelihood tree), and then estimating the most likely history of discrete states at each node (Cunningham et al., 1998;Schmidt & von Haeseler, 2009;Sagulenko et al., 2018;Ishikawa et al., 2019). Branch length or evolutionary time is accounted for, such that geographic movements are more likely to occur on longer branches. This is in contrast to Bayesian approaches implemented in BEAST and BEAST2, which estimate the history of discrete traits for each tree in the posterior tree distribution and hence can more fully account for phylogenetic uncertainty but as a result can be significantly slower (Faria et al., 2011;Volz et al., 2013;Gill et al., 2016;Dellicour et al., 2021).
Discrete trait analysis: In one common approach (often known as "discrete trait analysis", whilst being only one of several approaches for phylogeographic inference using discrete traits), phylogenetic branch locations are estimated using continuous-time Markov chains, i.e. modelled as moving instantaneously at specific rates between a fixed number of discrete locations (Lemey et al., 2009;Faria et al., 2011). ( Figure 1A). A frequently applied extension of this model known as Bayesian stochastic search variable selection (BSSVS) attempts to limit the number of possible recovered transitions between pairs of locations to those that adequately explain the phylogenetic diffusion process (Lemey et al., 2009). Biased sampling can affect the statistical inference of discrete trait analyses as the relative sampling intensities of different discrete locations or traits affect the estimates of viral movements (De Maio et al., 2015;Layan et al., 2023).
Structured coalescent approaches: Structured coalescent approaches, including those implemented in MASCOT (Müller et al., 2018) and BASTA (De Maio et al., 2015) within BEAST 2, explicitly model ancestry within and movement between discrete subpopulations, known as "demes" ( Figure 1A) De Maio et al., 2015;Müller et al., 2018). These methods can be less susceptible to sampling bias than classical coalescent models in certain instances (De Maio et al., 2015;Müller et al., 2018;Layan et al., 2023). However, structured coalescent phylogeographic models are typically more computationally demanding than discrete trait phylogeographic models, and thus are mostly applied to small numbers of demes (De Maio et al., 2015;Müller et al., 2018;Layan et al., 2023). Additionally, although population size can vary between demes, structured coalescent models assume that virus population size in each deme remains constant, which may be less appropriate for investigating AIV lineage expansions in naïve host populations or AIVs with strong seasonality in transmission (De Maio et al., 2015;Layan et al., 2023). Both discrete trait analyses and structured coalescent approaches can be used to model virus movement between other discrete traits (Faria et al., 2011;De Maio et al., 2015), such as host species ( Figure  1B) (Ren et al., 2016;Hicks et al., 2020). Furthermore, generalized linear model (GLM) extensions of these methods (Lemey et al., 2012Faria et al., 2013;Müller et al., 2019) can be used to evaluate covariates of virus movement, such as avian host density at each location or geographic distance between pairs of locations.
Multitype birth-death models: Like structured coalescent approaches, multitype birth-death (MTBD) models (e.g. as implemented in the BEAST 2 package bdmm (Kühnert et al., 2016)) are able to explicitly model transmission within structured populations, and represent an extension of the birth-death models described in Box 2. MTBD models involve the estimation of time-varying "birth" rates (transmission rate), "death" rates (rate of becoming non-infectious or death) for each of several discrete subpopulations, and "per-lineage migration rate" (changes in the subpopulation of an individual due to migration) (FitzJohn, 2012;Kühnert et al., 2016;Barido-Sottani et al., 2020). Unlike the structured coalescent approaches, which assume a constant population size within demes, the MTBD allows for population size to change through time (Stadler et al., 2015;Seidel et al., 2020).
Continuous phylogeography: In the continuous phylogeographic model, virus lineage movement is modelled between geographical coordinates (e.g. latitude and longitude) ( Figure 1C & D) (Lemey et al., 2009(Lemey et al., , 2010. Virus lineage movement is most commonly modelled using a relaxed random walk model (Lemey et al., 2010), which allows the rate of viral dispersal to vary across the phylogeny. To evaluate spatiotemporal covariates associated with virus dispersal route or velocity, we can conduct post-hoc analysis using the R package "seraphim" (Dellicour et al., 2016).
Sampling biases (i.e. disproportionate sampling of virus genomes compared to true infection prevalence) can strongly impact phylogeographic analyses, with under-sampled locations more likely to be inferred as sinks when using discrete trait approaches (De Maio et al., 2015;Kalkauskas et al., 2021;Layan et al., 2023). The international spread of AIV has been predominately inferred from virus genetic data Figure 1. Conceptual representation of common phylogeographic methods. A: Time-calibrated phylogeny estimated with a phylogeographic model with discrete traits. The inferred locations of the ancestral internal nodes (squares) are estimated from the set of discrete locations predefined at the tips (i.e. locations of sampled sequences; circles). B: Discrete trait (e.g. location or host species) analysis. Here, arrows indicate statistically significant virus lineage transitions between bird types. Arrow thickness corresponds to the inferred viral flow rate. C: Time-calibrated phylogeny estimated with continuous phylogeographic inference. The inferred geographic coordinates of the internal nodes (i.e. unsampled virus ancestor) can differ from the geographical sampling coordinates of the sequences at the tips. Both internal nodes and tips are coloured by location. D: Continuous phylogeographic reconstruction using information contained in estimated phylogenies such as C, in which the dots represent the internal and external nodes of the time-scaled phylogeny, coloured according to time. The curvature direction of the lines between dots indicates the inferred direction of viral movement. Coloured polygons represent the statistical uncertainty of the inferred internal node locations, which is derived from a posterior tree distribution. collected in North America and northern Europe . Wild bird sampling remains relatively limited in several areas of Central and South America, Africa, the Middle East, and polar regions, which may result in these regions being overestimated as sink locations for virus lineages (Hurt et al., 2014;Araujo et al., 2018;Fusaro et al., 2019;Naguib et al., 2019;Kalonda et al., 2020;Hill et al., 2021;Lo et al., 2022). Accordingly, phylodynamic investigations of the global spread of AIV would benefit from increased AIV genomic surveillance in undersampled locations and host species.

Migratory flyways and mixing zones
Over 50 billion birds are estimated to migrate annually, with most flyways linking breeding grounds at high latitudes and over-wintering sites at low latitudes (Boere et al., 2006;Bahl et al., 2013). Several different phylodynamic studies using discrete trait analyses (Box 1) report that AIV lineage movement rates are greater within flyways than between flyways in Asia (Central Asian, East Asian)  and in North America (Central, Mississippi, Atlantic) Fries et al., 2015;Li et al., 2018). Continuous phylogeographic analyses (Box 1), in which virus lineage movement is modelled as a diffusion process between geographical coordinates (Lemey et al., 2009(Lemey et al., , 2010, demonstrated that H5N1 movement throughout Asia andRussia (1996-2011) correlates with the geographical extent of known flyways (Trovão et al., 2015). While a crude simplification to capture avian movement trends, "flyways" appear a crucial facilitator of long-distance AIV spread in wild birds.
Several studies indicate that AIV transmission between birds during congregation at breeding grounds or staging areas where multiple flyways overlap ("mixing zones") may shape the subsequent global dissemination of AIV lineages (Ramey et al., 2010;Gerloff et al., 2013;Huang et al., 2014;Lee et al., 2015;Venkatesh et al., 2018;Mine et al., 2019;Gass et al., 2023). For example, Mine et al. (2019) determined that AIVs sampled in mixing zones in Mongolia and Siberia from wild birds that use different flyways were often phylogenetically intermixed. The study used the Bayesian tip-association significance testing (BaTS) software, which implements post-hoc statistical tests on time-calibrated trees to determine if sequences significantly cluster by a trait, such as sampling location. This approach enables rapid assessment of whether viral genetic diversity at tree tips is geographically structured, but, unlike discrete or continuous phylogeographic approaches, cannot be used to estimate full histories of virus lineage movement. The above-mentioned phylogenetic clustering patterns reported in Mine et al. (2019) are consistent with birds sharing AIVs in mixing zones before dissemination along different flyways, hence enabling long-distance diffusion of viral lineages between multiple continents or regions . Consequently, the authors hypothesized that cross-flyway viral diffusion may have contributed to the simultaneous outbreaks of H5N6 HPAIVs in East Asia and Europe in 2017-2018 . Similar findings have been observed in mixing zones in the Nile Delta and the Republic of Georgia (Gerloff et al., 2013;Venkatesh et al., 2018).
Several phylodynamic studies have shown high AIV genetic diversity in some mixing zones (Ramey et al., 2010;Gerloff et al., 2013;Venkatesh et al., 2018Venkatesh et al., , 2020Mine et al., 2019). High-density congregation of wild birds associated with different flyways at mixing zones may provide ideal conditions for AIV reassortment by increasing the probability of co-infections with diverse genotypes and subtypes Mine et al., 2019). However, further research is required to robustly determine if reassortant AIV genotypes exist at higher frequencies in mixing zones than in non-mixing zones. While phylodynamic approaches have been valuable in demonstrating the role of flyways and mixing zones in structuring AIV dispersal and reassortment patterns, several challenges remain. The scarcity of both wild bird migration data and wild bird sampling in some regions (particularly low-and middleincome countries) limits our ability to fully characterize how migratory flyways impact AIV dispersal patterns globally (Takekawa et al., 2010;Palm et al., 2015;Fusaro et al., 2019;Mine et al., 2019;Yong et al., 2021;Zhang et al., 2023). Secondly, using environmental and flyway data averaged over months or years limits our ability to understand the impacts of changes in weather or environmental conditions on wild bird-mediated AIV dispersal patterns (Kirby et al., 2008;Vandegrift et al., 2010;Iwamura et al., 2013;Bahl et al., 2016;Sullivan et al., 2018).

Roles of different host taxa
Several studies employing phylodynamic methods demonstrate how movement variation between wild birds from different taxonomic orders might impact AIV dispersal patterns (Ramey et al., 2010;Miller et al., 2011;Wille et al., 2011;Hall et al., 2013;Hill et al., 2022;Gass et al., 2023). In the boreal and temperate territories of the Northern Hemisphere, migratory Charadriiformes (e.g. shorebirds, gulls) are more abundant and undertake long-distance migrations across oceans more frequently than migratory Anseriformes (e.g. ducks, geese) (Ramey et al., 2010;Miller et al., 2011;Wille et al., 2011;Hall et al., 2013;Hill et al., 2022;Gass et al., 2023). Consequently, it has been suggested that Charadriiformes may facilitate intercontinental AIV transmission in these areas more than Anseriformes Gaidet et al., 2012;Gaidet, 2016;Rimondi et al., 2018;Hoye et al., 2021;Wille et al., 2023). Phylodynamic support for this hypothesis exists in the seemingly more frequent detection of inter-hemispheric reassortment events between American and Eurasian lineages associated with Charadriiformes compared to Anseriformes (Bahl et al., 2009;Ramey et al., 2010;Wille et al., 2011;Van Borm et al., 2012;Hall et al., 2013;Lang et al., 2016). Conversely, in areas such as the tropical regions of West Africa, there is a high abundance of long-distance migratory Anseriformes but a low abundance of Charadriiformes migrants. There, Anseriformes are instead thought to be the primary drivers of virus movement between regions and continents Gaidet, 2016).
Discrete trait analyses (Box 1) have been used to explore how wild bird species with contrasting migration behaviours may differently influence AIV spread . For instance, one study suggested that in Egypt and adjacent Black Sea-Mediterranean countries, local migrant species (common shelduck (Tadorna tadorna)) generally contributed to local AIV amplification, while longer distance migrants (northern shoveler (Spatula clypeata) and northern pintail (Anas acuta)) carried AIV lineages over longer distances . Careful analyses and interpretation of results regarding the contribution of different species to AIV dispersal is critical because virus genome sequences are often biased towards hunted species (e.g. game birds) or those that are easily detected when they die (e.g. larger inland birds such as mute swans (Cygnus olor)) (Runstadler et al., 2013;Lebarbenchon et al., 2015;Bahl et al., 2016;Beerens et al., 2021;McBride et al., 2021;Hill et al., 2022). The existence of within-species variation in movement ecology, both between juveniles and adults  and between resident and migratory populations from the same species (Lisovski et al., 2018), further complicates generalization of host species traits associated with long-distance AIV spread. The relative frequency of viral cross-species transmission events can be estimated using phylodynamic tools such as Markov jumps counting in combination with discrete trait phylogeography (Minin & Suchard, 2008a, b;Faria et al., 2011) (Box 1). Markov jump counting approaches enable the counting of the expected number of transitions between modelled discrete traits, such as country or host species, along phylogenetic branches (Minin & Suchard, 2008a, b;Faria et al., 2011). For example, Markov jumps counting analyses were used to infer that North America, east and southeast Asia are hotspots for cross-species transmission between wild and domestic birds (Ren et al., 2016). Likewise, one large study of a globally sampled H9 AIV dataset used both discrete trait analyses and Markov jump counting to identify spatial asymmetry in the geographical areas where wild-todomestic and/or domestic-to-wild viral transmission most often occurred (Bahl et al., 2016). These methods must be used cautiously because Markov jumps analyses are very sensitive to disproportionate sampling from each group relative to true virus prevalence (Layan et al., 2023). This is particularly problematic for AIV, where genome sequences are almost always more frequently available from poultry relative to wild birds (Bahl et al., 2016;Ren et al., 2016;Yang, Xie et al., 2019).

Between wild and domestic birds
The relative rates of transmission between domestic and wild bird populations can also be investigated using structured coalescent approaches and multitype birth-death models, which may be less sensitive biased sampling (Box 1) (De Maio et al., 2015;Grear et al., 2017;Yang, Müller, et al., 2019;Guinat et al., 2022). Such methods can allow the transition rates between populations or demes to vary depending on the direction or estimate a single rate regardless of the directionality (De Maio et al., 2015;Kühnert et al., 2016;Müller et al., 2018;Barido-Sottani et al., 2020). One study of a 2014 HPAIV outbreak in North American domestic poultry, caused by wild bird lineage spillover, inferred that minimal viral movement had occurred between wild and domestic birds over the subsequent course of the domestic outbreak (Grear et al., 2017). Furthermore, estimates of basic reproductive number (R 0 ; the expected number of susceptible individuals in a naïve population infected by one infected hostsee Box 2) indicated that the poultry outbreak size was stable (R 0 ≈ 1) (Grear et al., 2017). Taken together, these findings suggested that the poultry outbreak was largely self-sustaining (Grear et al., 2017).
Several studies indicate that AIV transmission between wild and domestic birds can vary seasonally (Alarcon et al., 2018;Ferenczi et al., 2021;Gonzales et al., 2021;Zhang et al., 2021;Liang, Nissen, et al., 2021). Recurrent temporal peaks in cross-species transmission events can be investigated using phylodynamic tools that infer population demographic history (Frost & Volz, 2010;Karcher et al., 2020) (Box 2). For example, one study observed that the effective population size of clade 2.3.4.4b H5N8 viruses in Chinese poultry, as estimated by a skyride coalescent model (Gill et al., 2013) (Box 2), increased during winter 2020/21, as previously observed for H5 lineages/clades during winter 2013/14 and 2016/ 17 (Zhang et al., 2021). Seasonal increases in viral effective population size and other phylogenetic evidence, such as estimated dates of incursion of new lineages, were interpreted as consistent with immigrating wild birds recurrently introducing new H5 viruses into domestic poultry (Zhang et al., 2021).

Box 2.
Past population dynamics can be estimated through phylodynamic inference by analysing how virus effective population size (N e ) changes over time. Briefly, the effective population size represents the size of an idealized viral outbreak (i.e. one without selection or population structure) that experiences the same level of genetic drift as the studied population (Magiorkinis et al., 2013). N e estimates are affected by transmission rates and therefore almost always cannot be scaled directly to the number of infected individuals (Frost & Volz, 2010). Furthermore, uneven sampling strategies such as focused sampling during transmission peaks can bias the estimation of effective population size (Karcher et al., 2020;Parag et al., 2020;Cappello & Palacios, 2022). Nevertheless, estimates of N e can provide important information about the viral outbreak dynamics, including capturing seasonality or the possible efficacy of interventions in reducing subsequent outbreak size (Frost & Volz, 2010;Rife et al., 2017;Drummond et al., 2005) (Figure 2A).
Methods that are commonly employed to estimate population size trajectories over an epidemic include the Bayesian skyline (Drummond et al., 2005), the skyride (Minin et al., 2008), the birth-death skyline , and the skygrid (Gill et al., 2013).
Several approaches estimate population size history alongside other tree parameters based on the principle that when the population size is small, sampled viruses are more likely to share a common ancestor in the very recent past, and therefore lineages coalesce (join) faster (Drummond et al., 2005). Therefore, a faster rate of branch coalescence suggests a comparatively smaller population at that time point (Drummond et al., 2005). The Bayesian skyline model requires pre-determination of the number of points at which effective population size can change, generating estimates that are summaries of multiple step-wise changes (Drummond et al., 2005). The skyride model does not require predefined points, and introduces a method of temporal smoothing based on the assumption that N e is correlated across successive coalescent intervals (Minin et al., 2008). The skygrid model modifies and extends the skyride model by allowing the N e trajectory to change at specific time-points pre-specified by the user (Gill et al., 2013). A further extension of the skygrid model implements a generalized linear model (GLM) to test how time-varying covariates, such as monthly temperature, are associated with temporal changes in N e in a method known as "skygrid-GLM" (Gill et al., 2016). The birth-death skyline (Figure 2) provides an alternative method to these coalescent-based approaches (e.g. Bayesian skyline, skyride, and skygrid) . While in the Bayesian skyline approach only the effective population size changes over time, the birth-death skyline model infers temporal changes in transmission, death/recovery, and sampling rates at discrete intervals alongside variation in population size . Birth-death models infer a forward-in-time process, starting with the common ancestor of all sampled infections, followed by bifurcating new lineages representing observed and unobserved transmission events, thus generating a tree . In contrast, the coalescent is modelled backwards-in-time starting from the sampled sequences in the present time until the most recent common ancestor of the sample in the past, with the internal nodes representing the merging or coalescing of two lineages into their most recent common ancestor (i.e. coalescent event) (Drummond et al., 2005;Frost & Volz, 2010).
Compared to coalescent approaches, birth-death skyline models can more easily explicitly infer the number of expected secondary infections caused by an infected individual ( Figure 2B), either as a time-varying average (R e : effective reproduction number), or at the start of an outbreak when all individuals are considered as a completely susceptible population (R 0 : basic reproduction number) (Stadler et al., 2012Volz et al., 2013). Changes in effective reproduction number can reflect depletion of the susceptible population as well as control interventions, with R e > 1 indicating a growing epidemic, R e < 1 a declining epidemic, and R e ≈ 1 that the outbreak is stable Domingo, 2020;Gostic et al., 2021).

Factors and settings associated with viral movement
To identify factors associated with higher rates of cross-species AIV transmission or lineage dispersal, phylodynamic analyses previously relied primarily on post-hoc literature searches for relevant events that occurred concurrently with reconstructed viral movements (Vijaykrishna et al., 2013). For example, one study suggested that higher rainfall might increase H10 subtype AIV spillover from wild aquatic birds to poultry in Australia because inferred spill-over events occurred more frequently during periods of increased rainfall in the region (Vijaykrishna et al., 2013). More recently, the incorporation of GLMs into discrete trait analyses (frequently referred to as "DTA-GLMs") (Box 1) (Lemey et al., 2012Gill et al., 2016) has enabled hypotheses to be formally tested during the phylogenetic estimation process (Beard et al., 2014). For instance, a DTA-GLM showed that H5N1 spillover from wild birds to poultry in Egypt tended to occur in locations with a higher density of birds and humans, higher elevation, and several meteorological variables (Magee et al., 2015).
To implement effective infection control, it is necessary to understand the settings in which AIVs move between wild and domestic birds. AIV outbreaks frequently occur in Asian wetlands or rice fields where free-grazing ducks are reared at high density and in close contact with wild birds, suggesting that these areas may support transmission between wild and domestic birds (Hulse-Post et al., 2005;Martin et al., 2011;Cappelle et al., 2014;Prosser et al., 2016;Sullivan et al., 2018). Several phylodynamic analyses have supported this epidemiological finding, for example, indicating that the spread and maintenance of H5N8 lineages in the Republic of Korea were positively associated with regions of high wild waterfowl immigration and domestic duck density (Hill et al., 2015). Low precision of metadata on sampled locations of virus sequences often limits analyses to considering predictors at district-level scales, making it harder to account for local heterogeneity in ecological suitability for wild birds or farming intensity (Hill et al., 2015;Bahl et al., 2016;Hill et al., 2020).
Furthermore, these methods have not been frequently applied in regions such as Africa and the Middle East where AIV genomic surveillance is limited (Naguib et al., 2019;Ayala et al., 2020;Kalonda et al., 2020).

Between avian and mammalian species
HPAIVs and LPAIVs can occasionally infect humans and therefore are a public health threat (Webster et al., 1992;Li et al., 2019;Lycett et al., 2019). Epidemiological studies suggest that most infected patients had recent exposure to live poultry or had visited live-bird markets, rather than exposure to wild birds or non-avian species (Zhou et al., 2013;Yu et al., 2014;Yang et al., 2017;Li et al., 2019;Oliver et al., 2022). Phylogenetic analyses of virus genomes from humans and birds have been frequently used to support that poultry were the likely origin Joseph et al., 2017;Yang et al., 2017;Zhang et al., 2021). Sporadic AIV transmission from birds to mammals can, in theory, select for variants that have increased transmissibility in mammals (Balzli et al., 2016;Bourret, 2018;Nuñez & Ross, 2019;Zhao et al., 2019) and therefore potentially increase pandemic risk (Nelson et al., 2012;Bourret, 2018;Bravo-Vasquez et al., 2020). Phylogenetic analyses have identified multiple AIV lineages associated with cross-species transmission from birds to swine or other mammals (Bodewes et al., 2016;Ramey et al., 2017;Zhao et al., 2019;Chauhan & Gordon, 2021;Rijks et al., 2021). These transmission events sometimes result in sustained transmission, particularly within swine populations (Nelson et al., 2015;Bourret, 2018). For example, a phylodynamic study used timecalibrated phylogenies to determine that H10N7 viruses that caused significant mortality in harbour seals (Phoca vitulina) in 2014 in Europe were closely related to various avian-origin H10N7 viruses detected in wild birds in the Netherlands (Bodewes et al., 2015(Bodewes et al., , 2016. More recently, H5Nx clade 2.3.4.4b viruses in red foxes and wild birds in Europe (the Netherlands) were found to be closely genetically related (Rijks et al., 2021). Our understanding of the frequency of spill-over to, and dissemination within, mammalian species is limited by the sparseness of virus genomic surveillance in these taxa (Runstadler et al., 2013;Bodewes et al., 2016;Ren et al., 2016).

HPAIV origins and diversification
HPAIVs pose a severe problem for the poultry industry, and understanding their origins could help predict and prevent future outbreaks (Nuñez & Ross, 2019;Beerens et al., 2021). Time-calibrated phylogenies of HA show that multiple H5 and H7 HPAIV genotypes have independently evolved from ancestral LPAIV viruses, potentially facilitated by high poultry density and contact rates within intensive farming systems (Monne et al., 2014;Dhingra et al., 2018;Seekings et al., 2018;Escalera-Zamudio et al., 2020). Phylogeographic investigations have tracked the geographic origins of multiple HPAIV genotypes, and in rare cases, even suggested a possible source farm (Monne et al., 2014;Dhingra et al., 2018;Seekings et al., 2018).

Geographic spread within poultry populations
Discrete trait analyses (Box 1) have frequently been employed to reconstruct AIV spatial spread within domestic poultry populations (Jin et al., 2014;Bahl et al., 2016;Harvey et al., 2021). For example, southern China was shown to be an epicentre for the national spread of H5N6 HPAIV in poultry from 2013 to 2017  and was identified as a potential source of a national wave of H9N2 infections in domestic birds (Jin et al., 2014).
Some studies have employed structured coalescent models (Box 1) to reconstruct AIV spatial spread in domestic poultry Hicks et al., 2020). Such analyses have shown that viral diffusion tended to occur within, rather than between, North American states during an H5N2 outbreak (Hicks et al., 2020). However, as previously described, sampling bias or the presence of unsampled locations (Box 1) can be problematic for discrete trait phylogeographic models, and results must be interpreted with caution given regional differences in AIV genomic surveillance capacity (De Maio et al., 2015;Layan et al., 2023). LPAIV surveillance is particularly challenging because infections rarely cause severe disease in poultry and can easily be missed (Hurt et al., 2014;Parvin et al., 2020). Although HPAIV is likely easier to detect due to its higher pathogenicity, in some countries farmers may avoid reporting HPAIV cases for fear that birds will be culled without financial compensation (Chattopadhyay et al., 2018;Parvin et al., 2020;Moyen et al., 2021;Ripa et al., 2021). Phylogeographic analysis can be performed using continuous models (Box 1) in which the spread of viruses is modelled using geographical coordinates (Lemey et al., 2009(Lemey et al., , 2010. Continuous phylogeographic analyses can be preferable to discrete analyses when it is beneficial to understand virus spread in both sampled and intermediate unsampled locations (Box 1) (Lemey et al., 2009(Lemey et al., , 2010. Continuous phylogeographic analysis indicated that the HPAIV H5N1 movement in Java in 2003 was characterized by short-range dispersal events interspersed with occasional long-range movements (Lam et al., 2012). Similar approaches were also used to show that short-distance viral movement was more common than long-distance movements during Italy's 2016-17 HPAI H5N8 epidemic, with the first outbreak wave generally restricted to the north-eastern areas of the country (Harvey et al., 2021). Biased sampling between geographical locations can result in a failure to determine the true origin of the outbreak and in the underestimation of viral diffusion rates into an oversampled region from an undersampled area when using continuous phylogeographic methods (Hill et al., 2021;Kalkauskas et al., 2021). The incorporation of sequence-free samples from affected yet unsampled areas may somewhat alleviate the effect of sampling bias in continuous phylogeographic analyses (as proposed in Kalkauskas et al. (2021)). However, this approach requires a prior understanding of the spatial distribution of outbreaks.

Drivers of dispersal
Understanding which species or breeds are most important for maintaining AIV within poultry systems can allow for the preferential targeting of surveillance and control efforts toward certain host types (Hill et al., 2015;Barman et al., 2017;Hicks et al., 2020;Youk et al., 2020;Harvey et al., 2021). A study using Markov jump and reward analysis showed that, in live bird markets in the Republic of Korea, the transmission rate of H9N2 from domestic ducks to chickens was higher than the rate in the opposite direction (Youk et al., 2020). Similarly, structured coalescent approaches (Box 1) demonstrated that the H5N2 viral transmission rates from layer chicken to turkey populations were higher than the reverse during a 2014-2015 outbreak in North America (Hicks et al., 2020).
Phylogeographic analyses helped identify factors associated with AIV spread in domestic birds (Yang, Müller, et al., 2019;Dellicour, Lequime, et al., 2020;Hicks et al., 2020), thus highlighting which components of the production system may be most affected in future outbreaks. DTA-GLMs (Box 1) are commonly used for this purpose and have identified several economic and agricultural factors significantly associated with AIV dispersal in domestic birds in China (such as poultry population density, freight transportation, the number of markets selling poultry or poultry products, and high human density) (Lu et al., 2017). When both precise location and detailed geographical environmental data are available, we can also investigate factors associated with viral dispersal in a continuous phylogeographic analysis using the R package "seraphim" (Dellicour et al., 2016) (Box 1). For example, this approach identified weak support for the association of several spatial factors (e.g. human, chicken and duck population densities, inaccessibility, savanna) with heterogeneity in lineage dispersal velocity of H5N1 in the Mekong region (Cambodia, Laos, Thailand, Vietnam) (Dellicour, Lemey, et al., 2020).
The impact of commercial poultry movement networks on AIV dissemination in domestic birds represents an important, but particularly challenging, area of research (Lu et al., 2017;Yang et al., 2020;Moyen et al., 2021). Poultry trading records are difficult to obtain on a large scale, especially in lowand middle-income countries where poultry production systems can be highly complex and dynamic, and where these data are not already routinely collected (Yang et al., 2020;Moyen et al., 2021). As such, most phylogeographic studies have used proxies for poultry trade networks (Lu et al., 2017;Yang et al., 2020). For example, one study used DTA-GLMs (Box 1) to show that a proxy network of poultry transportation in China, as determined by a gravity model built from metrics of domestic poultry production and egg production, was positively associated with the large-scale movement of three AIV subtypes (H5N1, H7N9, and H5N6) (Yang et al., 2020). Greater availability of accurate poultry trade data would likely allow for more rigorous assessment of the impact of poultry movements on AIV dispersal (Parvin et al., 2020;Yang et al., 2020;Moyen et al., 2021).

Across borders
Although AIV dispersal in domestic poultry typically occurs within countries, several phylodynamic studies have investigated lineage movements across borders (Melville & Shortridge, 2006;Yang, Müller et al., 2019;Yang, Xie et al., 2019). The occurrence of long-distance trade-facilitated AIV movements was highlighted by a 2004 report that linked the 1500 km spread of HPAIV H5N1 from Lanzhou (Gansu Province, Northwest China) to Lhasa (Tibetan Autonomous Region) to the transport of domestic birds (Melville & Shortridge, 2006). More recently, studies have begun to explore larger-scale factors that can drive interprovincial or international virus movement (Yang, Müller et al., 2019;Yang, Xie, et al., 2019). Through a GLM extension of a structured coalescent model (Box 1), one study identified annual levels of international live poultry trade between countries and national poultry production as predictors of cross-border H9N2 virus movement in domestic Asian birds (Yang, Chowdury et al., 2019). Illegal trade may also drive the international spread of AIV. However, phylodynamic investigations are limited in their ability to investigate the impact of illegal bird trade on the global dispersal of AIV, as this predictor cannot be easily quantified Yang, Müller, et al., 2019;Yang, Xie, et al., 2019).

Evaluating control efforts
In addition to helping guide the design of novel measures aimed at tackling AIV spread, phylodynamic analyses have been used to evaluate the efficacy of previously implemented control measures (Lee et al., 2014;Nickbakhsh et al., 2016;Kwon et al., 2020;Chakraborty et al., 2022). By incorporating time-varying predictors into a GLM extension of a structured coalescent phylogeographic model (Box 1), one study showed that duck culling in France likely reduced the spread of HPAIV H5N8 between French administrative divisions (Chakraborty et al., 2022). Several studies have also employed phylodynamic methods that infer past population demographic dynamics to explore the effectiveness of control measures (Lee et al., 2014;Kwon et al., 2020). For example, Kwon et al. (2020) argued that an observed fall in H5N1 virus effective population size in Bangladesh over 3 years was due to reduced virus prevalence in domestic birds following the concurrent introduction of wide-scale vaccination, although alternative explanations are possible (Kwon et al., 2020). Likewise, one study used estimates of effective reproduction numbers (Box 2) to infer that HPAI-targeted control measures (e.g. culling of infected flocks, pre-emptive culling of neighbouring flocks) introduced in Italy in 2000 successfully slowed the epidemic growth of a novel HPAI outbreak but not that of its LPAI progenitor lineage (Nickbakhsh et al., 2016).

Outlook
Whilst this review highlights where phylodynamic analyses have contributed significantly to our understanding of AIV over the last few decades, we identify several areas that can aid future progress of the field. First, we can benefit from the wide range of new approaches developed and extended during the COVID-19 pandemic. For example, methods that enable accessible, "real-time" and easily scalable incorporation of sequences into viral phylogenies (e.g. Nextstrain (Hadfield et al., 2018)) could be used more extensively to support surveillance, and methods that incorporate host travel history within phylogeographic analyses could be explored to accommodate bird migration histories .
Despite many recent methodological advances, formally integrating different data types within phylodynamic analyses remains a key challenge (Frost et al., 2015;Baele et al., 2017). Approaches that can better integrate temporally varying environmental and demographic data alongside genetic sequences would be extremely valuable to study how bird movements drive AIV lineage spread. Ideally, such models would be developed to handle a broad array of new data types, including bird tracking data from mobile global positioning systems (e.g. GPS-3G-Bluetooth technologies developed to investigate bird spatial behaviour (Yu et al., 2022)), satellite imagery of high-risk locations for AIV transmission between wild and domestic species, and poultry trade data. In locations where it is difficult for authorities to access trading records comprehensively, purpose-built apps could facilitate collection of more complete poultry trade network data for use in such new models (Ravindran, 2021;Grubaugh et al., 2019;Lycett et al., 2019;Cardona-Ospina et al., 2021). The development of methods that effectively accommodate reassortment in phylodynamic analyses in a user-friendly fashion would help better understand AIV spread (Frost et al., 2015;Lycett et al., 2019;Verhagen, Fouchier, et al., 2021).
It is essential to improve AIV genome sequence and metadata availability (Kalkauskas et al., 2021;Layan et al., 2023). Missing metadata (such as date of collection, host type or species, precise location for wild birds, or production context for domestic poultry) reduces the value of virus genomes within phylodynamics. Likewise, as discussed throughout this review, sampling biases increase the risk of drawing incorrect conclusions from phylodynamic analyses. It is therefore important to grow capacity for AIV genomic monitoring in currently under-represented countries and sampling from understudied species.
Finally, tackling challenges related to feasibility and resource-intensity is essential for enhancing phylodynamic studies of AIVs. At present, and particularly when dealing with large genomic datasets, a significant level of technical expertise and robust computing infrastructure is often required to formulate appropriate phylodynamic models, execute them, and interpret their results (Duchene et al., 2018;Sagulenko et al., 2018;Attwood et al., 2022). As such, the continued development of faster tools (e.g. recently introduced ML methods Ishikawa et al., 2019)) and the advancement of computational packages that allow existing tools to make more effective use of available computer hardware (e.g. BEAGLE (Ayres et al., 2012)) are critical, especially in more resource-constrained environments and where results are intended to inform emergency responses Baele et al., 2019). The use of cloud computing in phylodynamic analyses, which was critical for handling the unprecedently large genomic datasets produced the COVID-19 pandemic, could be beneficial in helping researchers handle increased numbers of genome sequences. Greater investment in appropriate training and computational infrastructure in many lower-and middle-income countries would improve global accessibility of phylodynamic approaches (Rife et al., 2017;Hill et al., 2021;Attwood et al., 2022).

Conclusions
AIVs can severely harm domestic and wild birds, and their effective control in birds can help protect the health of humans and other mammalian species. Phylodynamic approaches can be insightful in reconstructing the spatiotemporal dispersal of AIVs, with models capable of analysing viral diffusion within and between different host populations and locations. However, limitations in phylodynamic models exist when key metadata are missing, virus genomic sampling is uneven, and for analysing reassortant viruses. Addressing these challenges will be important to further fulfil the potential of phylodynamic analyses to improve human and animal health.

Disclosure Statement
No potential conflict of interest was reported by the authors.