The Avian Pathogenic Escherichia coli (APEC) pathotype is comprised of multiple distinct, independent genotypes

ABSTRACT Avian Pathogenic E. coli (APEC) is the causative agent of avian colibacillosis, resulting in economic losses to the poultry industry through morbidity, mortality and carcass condemnation, and impacts the welfare of poultry. Colibacillosis remains a complex disease to manage, hampered by diagnostic and classification strategies for E. coli that are inadequate for defining APEC. However, increased accessibility of whole genome sequencing (WGS) technology has enabled phylogenetic approaches to be applied to the classification of E. coli and genomic characterization of the most common APEC serotypes associated with colibacillosis: O1, O2 and O78. These approaches have demonstrated that the O78 serotype is representative of two distinct APEC lineages, ST-23 in phylogroup C and ST-117 in phylogroup G. The O1 and O2 serotypes belong to a third lineage comprised of three sub-populations in phylogroup B2: ST-95, ST-140 and ST-428/ST-429. The frequency with which these genotypes are associated with colibacillosis implicates them as the predominant APEC populations and distinct from those causing incidental or opportunistic infections. The fact that these are disparate clusters from multiple phylogroups suggests that these lineages may have become adapted to the poultry niche independently. WGS studies have highlighted the limitations of traditional APEC classification and can now provide a path towards a robust and more meaningful definition of the APEC pathotype. Future studies should focus on characterizing individual APEC populations in detail and using this information to develop improved diagnostics and interventions.


Introduction
Avian Pathogenic E. coli (APEC) are a subset of Extraintestinal Pathogenic E. coli (ExPEC) and are the foremost cause of extra-intestinal bacterial infection in the poultry industry (Dho-Moulin & Fairbrother, 1999;Guabiraba & Schouler, 2015). APEC infection causes elevated morbidity and mortality in poultry flocks, and consequent economic losses in the broiler, layer, game, and turkey sectors (Kemmett et al., 2013;Guabiraba & Schouler, 2015;Huja et al., 2015). Economic losses within the Dutch poultry industry due to APEC were estimated to be €0.4-3.7 million in 2013 alone (Landman & van Eck, 2015), and are reported to account for multi-million USD losses in the turkey industry.
APEC is also one of several common bacterial pathogens causing morbidity and mortality in ducks and geese (Lima Barbieri et al., 2017), a cause for concern within the waterfowl-breeding industry in China with a population of 20-30 billion ducks per year (Li et al., 2016). Furthermore, wild ducks and geese and game birds can be carriers of APEC and may serve as potential transmission vectors between reservoirs (Díaz-Sánchez et al., 2012;Elmberg et al., 2017). APEC therefore poses a global threat to food security and avian welfare (La Ragione & Woodward, 2002).
Traditionally, APEC has been thought of as a secondary pathogen following antecedent viral or Mycoplasma infection (Mellata, 2013). In the last decade, APEC has become recognized as a primary pathogen in avian hosts (Collingwood et al., 2014), causing colibacillosis, a syndrome characterized by a range of localized and systemic infections including respiratory tract infection, pericarditis, perihepatitis, airsacculitis and septicaemia (Sadeyen et al., 2014). APEC may also cause avian cellulitis, an issue commonly associated with broiler chickens and characterized by subcutaneous, necrotic plaques and inflammation of the overlying chicken skin (Poulsen et al., 2018), initiated by infection of scratches and wounds. Whilst cellulitis does not often result in clinical disease or decreased performance, it does result in carcass condemnation and major economic losses. Additionally, APEC causes salpingitis and peritonitis, common pathological manifestations in layer hens often caused by ascending infection of the reproductive tract, most commonly during the late lay period (Van Goor et al., 2020). Stress or injury may exacerbate colibacillosis signs (Dho-Moulin & Fairbrother, 1999;Mellata et al., 2003;Alber et al., 2020). The reduced use of antimicrobials in livestock production, which followed the European Commission ban on using antimicrobials as growth promoters in feed in 2006 (EC Regulation No. 1831), may have contributed to the proliferation of pathogenic lineages and emergence of APEC as a primary pathogen in poultry. Furthermore, global intensification of poultry production and husbandry practices over the last two decades, as well as the expansion of free-range production systems, may have also contributed to increased incidence of colibacillosis (Guabiraba & Schouler, 2015).
Despite the considerable economic impact of APEC due to treatment, mortality and carcass condemnation (Fancher et al., 2020), a thorough understanding of the APEC pathotype has remained elusive, due in part to the sheer genetic diversity of APEC populations and inadequate diagnostic and typing strategies (Schouler et al., 2012;Collingwood et al., 2014;Cordoni et al., 2016;Ronco et al., 2017). Furthermore, APEC have been implicated as a zoonotic risk due to the high genetic similarity between certain APEC isolates and E. coli causing urinary tract infections (UTI) in humans (Maluta et al., 2014), and the fact that APEC isolates are capable of causing UTI and meningitis in mouse infection models (Jakobsen et al., 2010;Tivendale et al., 2010). APEC are also a major concern for the transmission of antimicrobial resistance (AMR) genes (Maluta et al., 2014). A vastly improved understanding of the population structure and identification of differential risks of specific genotypes/lineages is essential for the development of effective control and intervention strategies to mitigate the economic impact of colibacillosis.

APEC classification in the pre-genomic era has contributed to a poorly defined APEC pathotype
Pathogenic E. coli strains are often classified into pathotypes which are identified using acronyms (Robins-Browne et al., 2016). However, the nomenclature of pathotypes which have been discovered at different points in time is not consistent. For example, certain pathotypes are defined by the target organ or tissue, such as uropathogenic E. coli (UPEC), the presence of a specific gene or combination of genes (Shiga toxin-encoding E. coli, STEC) or, in the case of APEC, the infected host (Denamur et al., 2020). The myriad of existing pathotypes and the emergence of hybrid and sub-pathotypes has therefore complicated the classification of E. coli. More accurate definitions of pathotypes are therefore required. Whole genome sequencing (WGS) has the potential to inform on phylogeny, population dynamics, and molecular epidemiology of E. coli from different niches and should be the basis for improving obscure pathotype definitions such as APEC beyond isolation from avian hosts.
APEC classification is traditionally based on serotyping and virulence genotyping, with the gold standard for designation of a pathogenic E. coli isolate as APEC being confirmation of virulence in day-old chicks or embryos (Dho & Lafont, 1982;Awad et al., 2020). However, this is now rarely performed due to a lack of suitable facilities and ethical concerns.
Serotyping based on the detection of somatic O-antigens using PCR or antisera has been used to type E. coli implicated in colibacillosis, and this has been the traditional diagnostic method in laboratories for detection of APEC (La Ragione & Woodward, 2002;Ewers et al., 2004;Schouler et al., 2012;Joensen et al., 2015). However, serotyping only allows for the classification of a limited number of APEC isolates. The most frequent approach for identifying and designating APEC is by PCR-typing of E. coli recovered from chickens presenting with clinical signs, targeting 10-15 virulence factors, with isolates containing ≥5 virulence determinants designated as APEC (De Carli et al., 2015;Schouler et al., 2012;Subedi et al., 2018). Coupled with epidemiological studies, these approaches have facilitated the identification of the most common APEC serotypes and virulence determinants implicated in colibacillosis. Globally, the most prevalent O-antigen serotypes of APEC are O78, O2 and O1, estimated to account for 80% of APEC isolates (Huja et al., 2015). Other observed serotypes include O5, O18, O35, O109 and O115 (Schouler et al., 2012;Kathayat et al., 2018). Many studies have attempted to define the virulence genes associated with APEC; these markers predominantly include virulence-associated genes such as iss, iucD, hlyE/F, ompT, iroN, iutA, irp2, papC, cva/cvi, and tsh, amongst others (Subedi et al., 2018;Cummins et al., 2019). Whilst PCR assays targeting these genetic markers are commonly used, they are limited in defining pathogenic E. coli in avian hosts. For example, longitudinal analysis of APEC and virulence-associated gene carriage performed by Kemmett et al. (2013) showed that 17.9% of E. coli isolates obtained from commercial broiler chickens showing signs of colibacillosis encoded none of the targeted virulence-associated genes (astA, iss, irp2, iucD, papC, tsh, cvi, vat, sitA, ibeA).
Moreover, this study found that four virulence-associated genes, astA, vat, iss and tsh, were not associated with systemic E. coli. A subsequent study also found that E. coli isolated from lesions typical of colibacillosis encoded between zero and seven of the target virulence-associated genes. Evidently, the suitability of these markers for discriminating these groups is questionable and may be confounded by secondary or coinfections by otherwise commensal E. coli.
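The threshold rule described above (an isolate is designated APEC when it carries at least five of the targeted virulence determinants) can be sketched in a few lines of Python. This is a minimal illustration, not a published assay: the ten-gene panel is an assumption drawn from the genes named in the text (with hlyF chosen from "hlyE/F" and cvi from "cva/cvi"), and the isolate names and gene sets are invented.

```python
# Hypothetical 10-gene panel assembled from markers named in the text.
# Real panels differ between studies; this one is an assumption.
PANEL = ("iss", "iucD", "hlyF", "ompT", "iroN",
         "iutA", "irp2", "papC", "cvi", "tsh")


def count_markers(detected, panel=PANEL):
    """Number of panel genes detected in an isolate."""
    return len(set(detected) & set(panel))


def designate_apec(detected, panel=PANEL, min_markers=5):
    """Apply the '>= 5 virulence determinants' rule described in the text."""
    return count_markers(detected, panel) >= min_markers


# Invented example isolates (gene sets are illustrative only).
isolates = {
    "isolate_A": {"iss", "iucD", "ompT", "iroN", "iutA", "tsh"},
    "isolate_B": {"iss", "papC"},
    "isolate_C": set(),  # recovered from a lesion but carrying no panel genes
}

calls = {name: designate_apec(genes) for name, genes in isolates.items()}
for name, is_apec in calls.items():
    print(name, count_markers(isolates[name]), is_apec)
```

The hypothetical isolate_C shows the weakness discussed above: an isolate recovered from a colibacillosis lesion that carries none of the panel genes is not designated APEC by this rule, whatever its phylogenetic background.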
Many of these virulence factors are encoded on large conjugative IncFII/IncFIB hybrid plasmids (Johnson et al., 2006). These plasmids often encode Colicin V, a compound produced by certain Enterobacteriaceae that is lethal to sensitive strains (Waters & Crosa, 1991). Early APEC studies have directly linked ColV-IncF plasmids with the ability to cause disease in production birds, highlighting their importance in pathobiology of APEC (Ginns et al., 2000;Gibbs et al., 2003;Ewers et al., 2004;Johnson et al., 2006). However, this plasmid family is not universally present nor unique for APEC strains, and is strongly associated with generalist ExPEC genotypes (Azam et al., 2019).
Whilst effective for the detection and differentiation of APEC, serotyping and virulotyping approaches allow for an extremely broad definition of APEC as a pathotype, considering there are no single, or combinations of, virulence determinants that have been shown to be exclusively associated with all APEC strains (Collingwood et al., 2014;Azam et al., 2019).
It is thought that the reservoir of APEC is the avian gut (Kemmett et al., 2013;de Oliveira et al., 2015), and that asymptomatic carriage of these potentially virulent E. coli precedes the development of systemic infection. The great diversity of isolates apparently implicated in colibacillosis may be due to a high incidence of opportunistic infection by otherwise commensal E. coli strains derived from the gut (Collingwood et al., 2014). Birds may become susceptible to opportunistic infection due to immune suppression, stress injury, the onset of sexual maturity and poor hygiene (Baghbanzadeh & Decuypere, 2008;Kemmett et al., 2013;Collingwood et al., 2014). Defining all E. coli recovered from infected birds as APEC conflates bystander isolates with virulent E. coli causing primary infection, thereby increasing the diversity of E. coli genotypes associated with colibacillosis and complicating the understanding of APEC populations. It has been suggested that to term such a wide array of isolates, including opportunists and bystander isolates, as APEC is flawed (Collingwood et al., 2014), and distorts any meaningful definition of the pathotype.
The improved accessibility of WGS has led to an unprecedented accumulation of genomic data which facilitates phylogenomic approaches to classify E. coli and provides crucial insights into the definition of APEC. WGS data can be used for in silico typing of E. coli for classic genotyping schemes such as O- and H-antigen typing, multi-locus sequence typing (MLST), and fimH allele typing (Wirth et al., 2006;Joensen et al., 2015;Beghain et al., 2018). WGS also permits typing with even greater resolution using techniques such as core genome MLST or whole-genome MLST and allows accurate inference of the evolutionary history of the isolates (Kovanen et al., 2014;Cody et al., 2017). Therefore, WGS allows traditional typing of APEC based on serotyping and virulotyping to be placed in context with high-resolution typing and phylogenetic data. Whilst many epidemiological and functional studies of APEC and poultry-associated E. coli have been performed in the genomic era (De Carli et al., 2015;Logue et al., 2017;Ibrahim et al., 2019;Sarowska et al., 2019;Jørgensen et al., 2019), APEC classification is still largely rooted in serotyping, and there is a lack of studies that unify traditional typing methods with higher-resolution, genomic approaches.
In addition to serotyping, approaches such as pulsed-field gel electrophoresis (PFGE) and MLST have been used to study clonality of APEC isolates implicated in colibacillosis outbreaks (Hussein et al., 2013), phylogenetic relatedness of APEC and ExPEC (Moulin-Schouleur et al., 2007), as well as to monitor the long-term prevalence of certain strains (Kemmett et al., 2013). MLST has been particularly useful in identifying dominant sequence types implicated in multiple outbreaks and has revealed the extent of the diversity of isolates associated with colibacillosis and poultry carriage (Kemmett et al., 2013). The initial revision of the Clermont scheme and recent identification of phylogroup G enabled improved phylogenetic classification of E. coli (Clermont et al., 2013, 2019). Phylotyping shows a high correlation with MLST and offers an improved indication of isolate relatedness (Saha et al., 2020). Yet studies utilizing the updated method are scarce, and even these studies are now outdated with the identification of the poultry-associated phylogroup G.
These improved molecular typing approaches have facilitated greater characterization of APEC and identification of individual genotypes of concern (Maluta et al., 2014), identification of specific gene subsets (Knöbl et al., 2012), and discrimination of avian pathogenic strains from avian faecal E. coli (AFEC) and other niches. Relatively few studies utilize WGS approaches to examine the phylogeny of APEC; however, this approach is starting to become more widely used and allows classification by serotyping, MLST, and phylogrouping to be unified in order to accurately define predominant APEC populations (Papouskova et al., 2020). This more complex approach is required to understand APEC virulence, transmission and population dynamics.

Beyond serotyping: understanding APEC phylogeny and delineating the APEC pathotype
The fact that O78, O1 and O2 serotypes are the most frequently detected serotypes amongst putative APEC isolates recovered from cases of colibacillosis (Mellata et al., 2003;Lynne et al., 2012;Kim et al., 2020;Hu et al., 2020) suggests a discrete set of lineages is responsible for most infections. Early genome sequencing studies focussed on these serotypes (Johnson et al., 2006;Johnson et al., 2007;Mangiamele et al., 2013). An important sequencing study by Dziva et al. (2013) revealed that O78 and O1 APEC strains displayed significant genetic diversity from one another and indicated that different core-genome types may be adapted to cause the same avian disease via distinct mechanisms. Large-scale WGS studies on APEC isolate collections have since been employed (Cordoni et al., 2016;Ronco et al., 2017;Jørgensen et al., 2019;Cummins et al., 2019) and have allowed identification of factors that contribute to the predominance of these genotypes in the poultry niche (Manges et al., 2019); however, further work is required to define the relationship between different APEC types more accurately, beyond mere serotyping. In spite of comparative genomic studies highlighting the genetic diversity of the most common APEC types (Dziva et al., 2013;Cordoni et al., 2016), functional studies have often used isolates representative of only one APEC type (Alber et al., 2020), and typing approaches based on possession of virulence genes do not adequately account for phylogenetic differences (Subedi et al., 2018). This review seeks to highlight the phylogenetic relationship of predominant APEC types, paving the way for an improved definition of APEC and more targeted interventions.

Serotype O78 APEC are comprised of two distinct lineages
The O78 serotype is widely reported to be the most common APEC serotype (Dziva & Stevens, 2008;Lynne et al., 2012;Mangiamele et al., 2013;Huja et al., 2015;Cordoni et al., 2016), and this has been confirmed in genomic epidemiological studies (Ronco et al., 2017;Ibrahim et al., 2019). O78 serotype strains are also associated with diarrhoeal disease in cattle and calves (Ewers et al., 2004), and have been isolated from sheep (Babai et al., 2006), and reported to cause septicaemia in humans (Ron et al., 1991). The archetypal APEC O78 serotype isolates that were initially sequenced as representatives of this APEC group were determined to be represented by two related sequence types, ST-23 and ST-88 in phylogroup C, both of which belong to the ST-23 clonal complex (Mangiamele et al., 2013;Huja et al., 2015;Denamur et al., 2020). APEC O78 ST-88 strains encode the H9 flagella antigen whereas APEC O78 ST-23 strains encode the H4 flagella antigens, revealing a degree of heterogeneity within this group that cannot be identified by O-serotyping alone (Mangiamele et al., 2013;Dziva et al., 2013). In 2017, it was reported that ST-117 APEC, a sequence type that has been sporadically observed in APEC studies prior to 2014, had spread throughout Nordic broiler production and was implicated in large outbreaks of colibacillosis (Ronco et al., 2017). The majority of these isolates encode the O78:H4 antigens (Ronco et al., 2017). Subsequent epidemiological studies demonstrated that ST-117 belongs to the recently identified phylogroup G (Clermont et al., 2019) and can encode a variety of O-antigens in place of O78 (Kim et al., 2017). Epidemiologic data suggest that phylogroup G is dominated by isolates belonging to the ST-117 clonal complex and is associated with high virulence in a mouse sepsis model, and multidrug resistance (Clermont et al., 2019). 
Phylogroup G shows a remarkable divergence from neighbouring phylogroups, B2 and F (Figure 1), suggesting a high degree of adaption to the poultry niche. Whilst some phylogroup G/ST-117 isolates have the O78 antigen in common with O78/Phylogroup C isolates belonging to ST-23 and ST-88, genomic data show these two groups are phylogenetically distinct (Clermont et al., 2019;Denamur et al., 2020). This is typified by the fact that ST-23 and ST-88 APEC in phylogroup C encode the O78 antigen exclusively and exhibit a degree of variation in their H-antigen, encoding either H9, H19 or H4 (Figure 1). Conversely, ST-117 strains in phylogroup G exhibit a degree of flexibility in their O-antigen and encode a conserved H4 flagella antigen (Mangiamele et al., 2013;Dziva et al., 2013;Kim et al., 2017). These differences suggest a heavy selective pressure favouring the maintenance of the O78 antigen in the ST-23/Phylogroup C lineage and the H4 antigen in ST-117/Phylogroup G isolates.
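The point that O78 isolates resolve into separate lineages once sequence type is considered can be illustrated with a small lookup table. The sketch below is an assumption-laden simplification: the sequence type to phylogroup mapping is taken from the lineages named in this review, while the isolate records, field names and lineage labels are invented for illustration.

```python
# Sequence type -> (phylogroup, lineage label), as named in this review.
# The labels themselves are hypothetical shorthand.
PREDOMINANT_LINEAGES = {
    23: ("C", "ST-23 complex"),
    88: ("C", "ST-23 complex"),
    117: ("G", "ST-117"),
    95: ("B2", "ST-95"),
    140: ("B2", "ST-140"),
    428: ("B2", "ST-428/429"),
    429: ("B2", "ST-428/429"),
}


def assign_lineage(sequence_type):
    """Return (phylogroup, lineage) or None if the ST is not in the table."""
    return PREDOMINANT_LINEAGES.get(sequence_type)


# Invented isolates sharing the same O:H serotype but differing in ST.
isolates = [
    {"name": "isolate_1", "serotype": "O78:H4", "st": 23},
    {"name": "isolate_2", "serotype": "O78:H4", "st": 117},
    {"name": "isolate_3", "serotype": "O2:H5", "st": 140},
    {"name": "isolate_4", "serotype": "O78:H4", "st": 9999},  # unlisted ST
]

for iso in isolates:
    print(iso["name"], iso["serotype"], assign_lineage(iso["st"]))
```

In this toy data set, isolate_1 and isolate_2 share the O78:H4 serotype yet map to phylogroups C and G respectively, which is the distinction serotyping alone cannot make. An ST absent from the table returns None rather than a guess.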

Serotypes O1 and O2 are representative of three APEC sub-lineages in phylogroup B2
APEC isolates encoding either the O1 or O2 antigens are the second and third most common serotypes associated with colibacillosis (Huja et al., 2015). In addition to APEC infections, O1 and O2 serotypes are frequently implicated in human bloodstream and urinary tract infections, and neonatal meningitis (Delannoy et al., 2017). Genomic-based studies have shown that these serotypes exclusively reside within the diverse B2 phylogroup and form at least three distinct sub-groups (Ge et al., 2014;Ciesielczuk et al., 2016). These sub-groups can be differentiated on the basis of sequence type: ST-95, ST-140 and ST-428/429, with the ST-95 lineage recognized as the most prevalent (Ge et al., 2014). APEC isolates belonging to one of these lineages often encode either the O2 or O1 antigen, but also can encode others such as O18, O25 and O45. Perhaps a more consistent marker of this APEC lineage is the H-antigen that they encode. ST-95 isolates encode either the H7 or the H4 antigen, independent of O-antigen type (Cummins et al., 2019;Denamur et al., 2020). The ST-140 subgroup appears to exclusively encode the H5 antigen (Ge et al., 2014).
Figure legend (fragment): The number of APEC-associated virulence genes encoded by each genome was determined by screening genomes against a custom database using Abricate (https://github.com/tseemann/abricate) using an 80% minimum sequence identity threshold. Presence of a virulence gene is denoted by the teal-coloured box on the heatmap; absence is denoted by blank space. There are no virulence-associated genes that are suitable as discriminatory markers for these predominant APEC genotypes.
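The presence/absence scoring described for the virulence gene heatmap (a gene counts as present when a hit meets the 80% minimum sequence identity threshold) can be sketched as follows. This is not the authors' pipeline and it does not parse real Abricate output: the hit records are invented (genome, gene, percent identity) tuples, and the function and variable names are hypothetical.

```python
MIN_IDENTITY = 80.0  # percent identity threshold stated in the text


def presence_absence(hits, genes, min_identity=MIN_IDENTITY):
    """Build a genome -> {gene: 0/1} matrix from (genome, gene, identity) hits.

    Only genomes that appear in `hits` get a row; a genome whose hits all
    fall below the threshold gets a row of zeros.
    """
    matrix = {}
    for genome, gene, identity in hits:
        row = matrix.setdefault(genome, {g: 0 for g in genes})
        if gene in row and identity >= min_identity:
            row[gene] = 1
    return matrix


# Invented screening hits for illustration only.
hits = [
    ("genome_1", "iss", 99.1),
    ("genome_1", "tsh", 78.5),   # below threshold, scored absent
    ("genome_2", "iroN", 85.0),
    ("genome_2", "iss", 80.0),   # exactly at threshold, scored present
]
genes = ("iss", "iroN", "tsh")

matrix = presence_absence(hits, genes)
gene_counts = {genome: sum(row.values()) for genome, row in matrix.items()}
print(matrix)
print(gene_counts)
```

Summing each row gives the per-genome count of virulence-associated genes, and the rows themselves correspond to the filled and blank cells of a presence/absence heatmap.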

Predominant APEC populations are comprised of distinct, independent genotypes which can colonize multiple hosts
The improved resolution afforded by WGS studies has shed light on the true complexity of APEC populations and revealed the existence of at least three independent APEC lineages outlined here; these are ST-23 in phylogroup C, ST-117 in phylogroup G, and ST-95, ST-140 and ST-428/429 in phylogroup B2. Conceivably, these predominant APEC groups may have distinct mechanisms associated with avian pathogenicity. Commonly used virulotyping techniques have been unable to identify factors exclusively associated with APEC because this approach inherently does not consider phylogenetic distance of these lineages. In the face of selective pressure encountered in the poultry niche, such as the poultry immune system, vaccination, competition, and antimicrobial compounds, it appears APEC isolates can only exhibit a finite range of O- and H-antigens, thereby striking a balance between immunological diversity and functional capability. Remarkably, APEC lineages that are phylogenetically diverse and distinct from one another appear to often encode identical somatic and flagella antigens; this is most obviously observed between ST-23 in phylogroup C and ST-117 in phylogroup G which typically encode the O78:H4 antigens (Figure 1). Further work is required to determine whether the O78:H4 antigens encoded by these lineages are due to a shared common ancestor of these sequence types and whether conservation of this antigen combination is due to a functional significance. There are numerous instances of different, atypical O- and H-antigens encoded by isolates belonging to these lineages, such as O78:H9 in phylogroup C and O24:H4 in phylogroup G (Figure 1). These are likely the result of independent pathoadaptive mutations, and future work should also be undertaken to characterize the extent of these independent adaptions of these predominant APEC genotypes.
Ultimately, WGS data show that serotype markers are insufficient to accurately detect and differentiate APEC lineages, and future studies should focus on utilizing such data to develop more accurate diagnostic tools and efficacious vaccines.

The future utility of WGS to further characterize APEC
Identification and characterization of novel genotypes in APEC
WGS is increasingly used by regulatory and public health agencies to facilitate detection, investigation and control of pathogens, particularly foodborne bacteria. Routine surveillance of food production animals affords the opportunity to characterize bacteria implicated in outbreaks and/or pathogenic reservoirs in a single workflow that is both cost-effective and rapid (Apruzzese et al., 2019). Increased global WGS data on APEC arising from these schemes have subsequently enabled the detection of more cryptic avian-associated genotypes, particularly those that exhibit sero-diversity, but a conserved core-genome type 2016; Azam et al., 2020). Perusal of ST-48 APEC genomes in Enterobase illustrates that ST-48 is a phylogroup A lineage that exhibits extensive O- and H-antigen sero-diversity (Zhou et al., 2020; Table S1), thereby masking the true incidence of this lineage in historical studies classifying by serotype alone. Furthermore, according to traditional virulotyping techniques, there is doubt as to whether ST-48 would be considered as APEC because isolates have been shown not to encode many of the archetypal virulence factors associated with APEC (Dissanyake et al., 2014). In view of the frequency with which ST-48 has been associated with colibacillosis cases (Maluta et al., 2014;Cordoni et al., 2016;Azam et al., 2020), greater consideration is required as to whether isolates from this lineage should be considered as APEC or could be more accurately described as opportunists or bystanders. Future experimental studies should aim to elucidate the pathogenicity of ST-48.
It is expected that establishment of surveillance networks using WGS (Allard et al., 2016;Gerner-Smidt et al., 2019;Chattaway et al., 2019), and the concurrent development of large data repositories, will allow research centres around the world access to sequence data with ease. This allows a comparison of pandemic APEC clones causing geographically restricted colibacillosis outbreaks with APEC from other parts of the world for rapid assessment of emergence and transmission potential (Knöbl et al., 2012).

WGS can inform the zoonotic risk of APEC
APEC are frequently cited as potential zoonotic pathogens due to the high similarity of certain APEC strains with UPEC/ExPEC isolates implicated in human disease (Khairy et al., 2019;Jørgensen et al., 2019). Colonization of extra-intestinal niches necessitates specific bacterial adaptions, which are often conferred to APEC and ExPEC by virulence determinants. Given these niche requirements are similar between animal species, similarities in virulence determinants between ExPEC causing disease in different animal hosts are to be expected (Singer, 2015). This is especially true when the expected serotypes and virulence factors encoded by APEC are reduced to a minimally predictive set (Johnson et al., 2008). Whilst high gene similarity suggests zoonotic potential, the transmission of APEC from poultry to a human host has not been demonstrated. Transmission is hypothesized to be indirect via the consumption of retail poultry products which subsequently cause foodborne urinary infections (Liu et al., 2018;Saidenberg et al., 2020). Isolation sources of predominant APEC and ExPEC sequence types are shown in Figure 2, which indicates that all predominant APEC genotypes are capable of colonizing multiple hosts and thereby pose a zoonotic risk (Table S2). For example, WGS data indicate that ST-117 is a predominant APEC lineage, but it is also reported in human UTIs (Ronco et al., 2017) and fatal sepsis in humans, and has been recovered from dairy cows (Kim et al., 2017). However, it is only a subset of genotypes consistently associated with chicken infection that displays significant overlap with human-associated ExPEC (Jørgensen et al., 2019). These genotypes include ST-95 belonging to phylogroup B2 (Jørgensen et al., 2019), and ST-69 belonging to phylogroup D (Hornsey et al., 2019).
Based on source association (Figure 2), these genotypes are more commonly isolated from human hosts than poultry and could justifiably be classified under the umbrella of ExPEC rather than APEC. Of particular concern are potentially zoonotic APEC strains that are multi-drug resistant (Liu et al., 2018;Jørgensen et al., 2019).
Transition of bacterial pathogens between different host species is often accompanied by adaptive mutations or horizontal acquisition of genetic elements derived from host-specific gene pools, which facilitate host switching (Sheppard et al., 2018). WGS has the potential to elucidate whether there are any genetic signals associated with host-adaption in multi-host genotypes (Sheppard et al., 2013;Wheeler et al., 2018;O'Boyle et al., 2020), thus allowing the zoonotic aspect of APEC to be investigated in greater detail. WGS data for other bacterial pathogens have been used to infer evolutionary history and examine the frequency and timing of multi-directional host-switching events (Richardson et al., 2018). Such approaches can be employed to approximately date the emergence of predominant APEC lineages as well as the divergence of the ST-140 and ST-428/429 lineages which appear to be more restricted to the avian host (Figure 2). This approach holds great promise for outbreak source attribution but, crucially, can shed light on the specific genetic factors that determine success in different host species and niches.
There is also the potential for drug-resistant, commensal E. coli to reside asymptomatically in the chicken gut reservoir and subsequently serve as a source of human infection (Thorsteinsdottir et al., 2010). Longitudinal WGS studies sampling the broiler chicken gut and faeces for E. coli, with a focus on commensal isolates, would greatly aid in determining the importance of the chicken gut as a reservoir of pathogenic E. coli impacting on human health.

WGS can improve understanding of vertical transmission of APEC
The poultry gut is potentially the primary reservoir of APEC (Kemmett et al., 2013) but the extent of vertical transmission of APEC from breeders to progeny is unknown. Broiler production in many countries relies on breeding pyramids wherein pedigree chickens and great-grandparent stock are used to produce grandparent and parent stock which are often imported between countries as one-day-old chicks (Dierikx et al., 2013). Previous studies have indicated that E. coli from grandparent flocks have the potential to be transmitted vertically through such breeding pyramids to broilers and thereby allow dissemination of E. coli between countries (Giovanardi et al., 2005). Extended-spectrum beta-lactamase (ESBL)-and plasmidic AmpC (pAmpC)-producing E. coli have been observed at all levels of broiler production suggesting vertical spread from parents to progeny. Studies concerning the vertical transmission of E. coli in broilers therefore have the tendency to limit investigation to the transmission of ESBL/pAmpC in broilers rather than dominant E. coli genotypes implicated in pathogenicity.
Genetically similar plasmids have been detected in ESBL-producing E. coli of different MLST types isolated longitudinally from breeders, meconium of progeny broilers, and broilers before slaughter, suggesting these plasmids are derived from nucleus poultry flocks (Zurfluh et al., 2014). Similarly, the occurrence of one extended-spectrum cephalosporin clone has been demonstrated in all levels of the Swedish production pyramid and suggests this clone was introduced through imported breeding stock and vertical transmission through the production pyramid (Nilsson et al., 2014).
There are compelling findings consistent with vertical transmission of ESBL-harbouring plasmids and E. coli through the broiler production pyramid (Pires-Dos-Santos et al., 2013;Nilsson et al., 2014;Zurfluh et al., 2014;Daehre et al., 2018;Oikarainen et al., 2019). However, studies have also shown that with routine disinfection procedures, bacterial contamination is significantly reduced (Dierikx et al., 2013;Projahn et al., 2017). Therefore, the incidence of direct vertical transmission, from infected breeder oviducts to progeny in ovo, is likely to be negligible. Projahn et al. (2017) suggest that transmission of E. coli to progeny is "pseudo-vertical"; a mix of different transmission routes such as faecal contamination of eggs as well as dust and surfaces within the hatchery. Horizontal transfer of E. coli within the hatchery must also be considered. Horizontal transfer of ESBL producers between consecutively fattened flocks is possible despite cleaning and disinfection procedures. Day-old chicks colonized with ESBL/pAmpC could therefore contaminate the hatchery environment and impact on subsequent flocks and facilitate this pseudo-vertical spread.
Questions remain as to whether APEC, particularly the predominant APEC genotypes outlined in this review, can be transmitted vertically. Most longitudinal studies concerning vertical transmission utilized antibiotic selection to determine the spread of resistance genes, and typing methods such as MLVA, random amplified polymorphic DNA, and PFGE which do not provide high enough resolution to infer clonal spread of genotypes. Consequently, there is a distinct lack of data describing the E. coli genotypes implicated in vertical transmission.
One of the few studies using WGS identified ST-429, designated here as a predominant APEC lineage, as one of the sequence types implicated in vertical transmission (Oikarainen et al., 2019). In modern broiler production, newly hatched chicks have a very low microbial load. They are deprived of maternal contact and their pioneer microbiome is derived from the hatchery environment (Gilroy et al., 2018). Questions remain as to whether APEC carriage is due to vertical transmission or acquired from the hatchery from consecutively fattened flocks. It is conceivable that pathogenic salpingitis-causing E. coli could be passed vertically, or pseudo-vertically, to cause mortality in day-old chicks, especially in those without a functional microbiome which thereby lack the capacity for APEC colonization resistance via competitive exclusion. On the other hand, the hatchery environment is an ideal bottleneck for the spread of pathogenic strains, particularly those with a strong biofilm-forming capacity, which promotes a stepwise, indirect transmission chain.
Detailed WGS-based longitudinal studies are required to further evaluate the relative contributions of direct and pseudo-vertical transmission pathways of E. coli in broiler production with a specific emphasis on predominant APEC populations. Whilst genotyping techniques will also form the basis of target surveillance of pathogenic lineages and allow the sources and transmission routes of E. coli to be investigated, rapid sequencing-based detection and diagnostic methods such as long-read based sequencing (e.g. Oxford Nanopore, PacBio) and rapid, isothermal amplification offer promise in surveillance of APEC in flocks and breeding pyramids (Romero & Cook, 2018).
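As an illustration of the resolution that WGS offers over banding-pattern typing methods, the following minimal sketch counts pairwise SNP differences between aligned core-genome sites and flags pairs of isolates that fall within a distance cut-off. The isolate names, sequences and the cut-off of two SNPs are hypothetical values chosen for illustration; appropriate thresholds are study-specific and are not taken from the work cited above.

```python
from itertools import combinations

def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences, skipping gaps and Ns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignored = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignored and b not in ignored
    )

# Hypothetical aligned core-genome SNP sites for three isolates (illustrative only).
alignment = {
    "breeder_1": "ACGTACGTAC",
    "chick_1": "ACGTACGTAT",
    "broiler_9": "TCGAACCTGT",
}

# Assumed cut-off for flagging a possible clonal link; not a validated value.
THRESHOLD = 2

# Pairwise distances between every pair of isolates.
distances = {
    (a, b): snp_distance(alignment[a], alignment[b])
    for a, b in combinations(sorted(alignment), 2)
}

# Pairs close enough to be worth investigating as a possible transmission link.
candidate_links = [pair for pair, d in distances.items() if d <= THRESHOLD]
```

A small distance alone does not establish the direction or route of transmission; in practice such pairs would be interpreted alongside sampling dates and flock metadata.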

WGS can improve surveillance of AMR and aid the characterization of gene transmission dynamics
The poultry gut microbiome is a significant global reservoir of AMR genes (Cordoni et al., 2016;Hornsey et al., 2019;Cummins et al., 2019). Within the poultry gastrointestinal tract, bacteria exist at high densities thus facilitating intra- and inter-species, as well as inter-genera, horizontal transfer of resistance genes (Cordoni et al., 2016;Card et al., 2017). APEC are often resistant to a range of antimicrobial compounds including tetracyclines, chloramphenicol, sulphonamides, fluoroquinolones and β-lactams (Awad et al., 2020). The genes conferring resistance are frequently encoded on integrons and plasmids, as well as other mobile genetic elements, which can be passed between permissive APEC strains and other species (Hornsey et al., 2019;Cummins et al., 2019). Class 1 integrons in particular play a vital role in the capture and expression of gene islands that encode resistance to antimicrobials and heavy metals (Cummins et al., 2019;Zingali et al., 2020). These genetic features are considered markers of multidrug resistance and may serve as major drivers of AMR (Zingali et al., 2020). Therefore, determining the extent of class 1 integron presence in broiler production and other intensive farming settings is crucial to containing the impact of AMR.
Many studies have demonstrated a close association between antibiotic use in food animal production and AMR in humans (Montoro-Dasi et al., 2020). Antimicrobial misuse in livestock is considered a threat to the clinical utility of antibiotics in both animals and humans (Vanderhaeghen & Dewulf, 2017); consequently, there has been a drive to avoid the use of highest priority clinically important antimicrobials (HP-CIAs) in food production animals to preserve their clinical efficacy (Lhermie et al., 2020). Unfortunately, in many disease conditions, adequate alternatives to HP-CIAs are lacking (Lhermie et al., 2020).
WGS will improve surveillance of AMR within poultry pathogens, providing a greater understanding of the transmission of resistant bacteria and AMR genes throughout the food-chain. For example, WGS data will allow the presence, localization and co-occurrence of AMR genes to be assessed. If multiple transmissible genes are present on mobile genetic elements, there is the potential for AMR gene acquisition in a genetic transmission event (Hornsey et al., 2019). This information can inform and support risk assessment activities (Collineau et al., 2019). Increasing affordability of next-generation sequencing for diagnostic laboratories offers an opportunity to harmonize testing and susceptibility practices and develop integrated surveillance systems of AMR. Such initiatives focussed on animal pathogens already exist and hold promise for improving AMR monitoring in livestock and in linking data to surveillance systems in humans (Allard et al., 2016;de Jong et al., 2018;Ceric et al., 2019).
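To make the idea of assessing localization and co-occurrence concrete, the sketch below counts how many isolates carry each pair of AMR genes on the same contig, which is one simple proxy for genes that could be co-transferred on a single mobile element. The table of hits is hypothetical example data, standing in for the output of whichever AMR gene detection tool is used; contig co-location in a short-read assembly is only suggestive of physical linkage.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical AMR gene hits as (isolate, contig, gene); illustrative values only.
hits = [
    ("isolate_A", "contig_1", "blaCTX-M-1"),
    ("isolate_A", "contig_1", "sul2"),
    ("isolate_A", "contig_7", "tet(A)"),
    ("isolate_B", "contig_3", "blaCTX-M-1"),
    ("isolate_B", "contig_3", "sul2"),
    ("isolate_B", "contig_3", "tet(A)"),
]

def co_located_pairs(hits):
    """Count, per gene pair, the isolates in which both genes share a contig."""
    genes_by_contig = defaultdict(set)
    for isolate, contig, gene in hits:
        genes_by_contig[(isolate, contig)].add(gene)

    isolates_by_pair = defaultdict(set)
    for (isolate, _contig), genes in genes_by_contig.items():
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(genes), 2):
            isolates_by_pair[pair].add(isolate)

    return {pair: len(isolates) for pair, isolates in isolates_by_pair.items()}

pair_counts = co_located_pairs(hits)
```

In this toy data set the beta-lactamase and sulphonamide resistance genes share a contig in both isolates, whereas the tetracycline resistance gene is co-located with them in only one.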
Recent years have seen a trend in WGS approaches becoming more widely used in attempts to characterize multi-drug resistance in APEC and infer potential zoonotic transmission (Sarowska et al., 2019;Cummins et al., 2019). This has led to the identification of concerning zoonotic genotypes that have a high AMR burden, such as the ST131-H22 sub-lineage and ST-69, multi-host-adapted genotypes (Figure 2) that are often reported to encode multiple resistance genes including those conferring resistance to colistin, a last-resort antibiotic (Hornsey et al., 2019;Saidenberg et al., 2020). For instance, whilst ST-131 is a relatively minor APEC type, this lineage is notorious for its extensively multidrug-resistant ST131-H30 pandemic sub-lineage which causes extra-intestinal infection in humans (Forde et al., 2019). Recently, Saidenberg et al. (2020) reported that the O25:H4-ST131-H22 sub-lineage isolated from poultry with colibacillosis appeared to be capable of being transmitted to humans directly, thus posing a direct risk to poultry consumers.
There has been a tendency for WGS approaches to focus on multi-drug resistant isolates; sequence repositories are heavily populated with genomes encoding a high proportion of AMR genes as a result. Much of our knowledge and assumptions on the prevalence and evolution of AMR in food production systems relate to these pathogenic organisms with a high burden of AMR and may not consider the wider population of E. coli in the chicken microbiome (de Oliveira et al., 2015). Whilst longitudinal studies examining the carriage of APEC and commensal E. coli in the chicken gut exist (Kemmett et al., 2013), the role of commensal isolates as a reservoir of AMR genes is not fully explored. WGS has the potential to aid characterization of within-host transmission dynamics of AMR genes between different lineages, be they pathogenic, commensal, multi-host-adapted or host-restricted.
The increasing number of genomic epidemiology studies coupled with phenotype and antimicrobial usage data will enable a thorough investigation of patterns of AMR in APEC. This will allow the global drivers of AMR to be assessed, including the impact of geographic location as a driver of drug-resistant E. coli, in a similar manner to that determined in human pathogenic E. coli (Ingle et al., 2018). These data will also inform on how the genomic background of APEC strains shapes the rate and mechanisms of AMR.

Future Perspectives
Source attribution of zoonoses, and identification of pathogenic variants derived from animal host reservoirs that represent a threat to human health, is an important aspect of infection control (Lupolova et al., 2017;Munck et al., 2020). However, there are inherent challenges in source attribution of pathogenic strains with broad host ranges. It can be expected that the number of genome sequences generated from both diagnostic and research settings will continue to rapidly accumulate. Consequently, there is a need to adopt effective computational approaches to predict host sources of human ExPEC infections and genotypes derived from animal reservoirs that pose a threat to human health (Lupolova et al., 2019).
Machine learning (ML) can be applied to interrogate bacterial WGS data and develop models that identify cryptic patterns associated with genetic variations in isolates from different host reservoirs to predict source host and zoonotic potential (Lupolova et al., 2019). ML methods in combination with core-genome phylogenetic reconstruction have previously been used to predict isolate source in Salmonella Typhimurium subpopulations (Lupolova et al., 2017;Branchu et al., 2018); such an approach holds great promise for investigating whether there is evidence of host adaptation in the multi-host APEC genotypes, as well as for improved discrimination of APEC from AFEC.
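The general principle of predicting source host from genomic features can be sketched with a very simple classifier. The example below fits a Bernoulli naive Bayes model to binary accessory-gene presence/absence profiles and assigns a new isolate to the most probable host. This is a deliberately minimal stand-in chosen for clarity, not the method used in the studies cited above; the profiles, gene columns and host labels are hypothetical, and a real analysis would require many isolates, held-out validation and correction for population structure.

```python
import math

def train_bernoulli_nb(profiles, labels):
    """Fit a Bernoulli naive Bayes model on binary gene presence/absence profiles."""
    classes = sorted(set(labels))
    n_features = len(profiles[0])
    model = {}
    for c in classes:
        rows = [p for p, y in zip(profiles, labels) if y == c]
        # Laplace smoothing keeps estimated probabilities away from 0 and 1.
        probs = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        log_prior = math.log(len(rows) / len(profiles))
        model[c] = (log_prior, probs)
    return model

def predict(model, profile):
    """Return the host label with the highest log-posterior for one profile."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(
            math.log(p if x else 1 - p) for x, p in zip(profile, probs)
        )
    return max(model, key=score)

# Hypothetical training data: rows are isolates, columns are accessory genes (1 = present).
profiles = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
labels = ["poultry", "poultry", "poultry", "human", "human", "human"]

model = train_bernoulli_nb(profiles, labels)
predicted_host = predict(model, [1, 1, 0, 0])
```

The per-gene probabilities stored in the model also indicate which features drive a prediction, which is the kind of information that could point to candidate host-associated genes for follow-up.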
APEC causes a diverse array of syndromes ranging from common respiratory infections to severe pericarditis, perihepatitis, omphalitis, septicaemia and cellulitis. Each of these syndromes involves infection of distinct tissues. ML may also allow the tissue tropism of APEC sub-lineages to be predicted and patho-variants predisposed to cause one of these conditions to be investigated, for example by differentiating genetic features associated with isolates causing cellulitis from those associated with isolates causing respiratory infections. For such studies, stringent recording of clinical features, post-mortem examination findings and metadata linked to APEC isolates will be required.
This review has highlighted the extent of the diversity of predominant APEC lineages. It is likely that these lineages have become adapted to the poultry niche independently and therefore possess distinct genetic traits related to colonization and pathogenicity. Improved functional annotation of APEC genomes, particularly of lineage-specific genetic features, using strategies such as transposon mutagenesis followed by fundamental host interaction investigation is required for future WGS-based investigations to be of value (Eckert et al., 2011;Dziva et al., 2013). Given the reported role plasmids often play in APEC virulence (Gibbs et al., 2003;Ewers et al., 2004;de Oliveira et al., 2015), there is also a need for improved consideration of the role of extrachromosomal DNA in the patho-biology of APEC, which is often neglected in WGS-based studies due to difficulties in accurate reconstruction from short-read sequence data (Arredondo-Alonso et al., 2017). Comparatively few studies have undertaken sequencing and characterization of the most prominent APEC plasmids such as the ColV and ColBM/IncFIB and the IncFIC, IncFIIA, IncI1, IncP, IncB/O, and IncN plasmids (Mellata et al., 2010). Long-read sequencing platforms will facilitate investigation of the promiscuity of these plasmids, the fitness burdens and advantages they confer, and their chromosomal association and compatibility, all of which are key to further understanding the APEC pathotype.
Accurate portrayal of APEC populations using WGS will facilitate the development of improved diagnostics, epidemiology tools and, crucially, novel interventions (Mageiros et al., 2021). Limitations on the use of antimicrobials in poultry production have made the vaccine-based control of avian colibacillosis highly desirable. However, some vaccines do not confer adequate cross protection against heterologous APEC strains or else the capability of cross protection against minor APEC isolates is unknown (La Ragione et al., 2013;Sadeyen et al., 2015;Ebrahimi-Nik et al., 2018;Hu et al., 2020). This represents a major obstacle in controlling avian colibacillosis. A greater understanding of the vaccine target populations will allow the development of more targeted vaccines through reverse vaccinology and for cross-protection efficacy to be evaluated appropriately.

Conclusion
WGS is only just beginning to facilitate delineation of the APEC pathotype. The genotypes identified herein only account for a proportion of the myriad of strains recovered from afflicted chickens (Pires-Dos-Santos et al., 2013;Collingwood et al., 2014;Ibrahim et al., 2019;Azam et al., 2020), so it is possible that there are further, more cryptic genotypes predisposed to pathogenicity in poultry, beyond opportunist strains. Further characterization of the APEC pathotype using WGS approaches is therefore essential as it will inform on all aspects of APEC research. For instance, most functional studies characterizing APEC pathogenicity or host response use a very limited range of APEC strains, the genetic backgrounds of which are often unknown (Dziva & Stevens, 2008;Lynne et al., 2012;Alber et al., 2020). Given the fact that predominant APEC are phylogenetically distant from each other, conclusions are often drawn in these studies that may only be true for specific APEC sub-lineages.
The detail gleaned from APEC comparative genomic and epidemiologic studies has provided unparalleled insight into APEC populations. WGS has allowed the heterogeneous APEC pathotype to be interrogated and has facilitated the designation of multiple, distinct genotypes as the predominant APEC types causing avian colibacillosis.

Disclosure statement
No potential conflict of interest is reported by the authors.