Association mapping in rice: basic concepts and perspectives for molecular breeding

ABSTRACT In the last decade, association mapping (AM) has become a well-established method to detect genes and quantitative trait loci (QTLs) associated with agronomically important traits. The identification of a large number of single nucleotide polymorphisms (SNPs) from genome sequencing and the concurrent development of high-throughput genotyping platforms have led to AM being widely used for a range of crops. These technologies have been used in rice (Oryza sativa) to explore the abundant diversity and there is enormous potential to identify novel QTLs for traits of interest. Due to the availability of cost-effective high-throughput SNP genotyping methods and rapid developments in rice genomics, it is inevitable that these AM approaches will become more popular in the future, especially in the context of genome-wide association studies (GWASs). In this paper, we review the fundamental concepts, critical considerations and limitations of AM focusing on rice, and reiterate the importance of accurate phenotypic data. We also include a section about connecting GWAS to molecular breeding, covering practical considerations for breeders that are required to use GWAS results in actual rice molecular breeding programs and that have not received adequate attention in the scientific literature.


Introduction
The development of quantitative trait loci (QTL) mapping methods to identify genes or QTLs controlling quantitative traits was a landmark achievement in plant genetics research in the late 1980s (Doerge, 2002;Mohan et al., 1997;Semagn, Bjornstad, & Xu, 2010). Since then literally thousands of research papers have reported the identification of genes or QTLs for important traits across a diverse range of plant species. This wealth of genetic information was built on the foundation of decades of research in crop molecular genetics and genomics.
The selection and enrichment of QTLs within breeding material was the inevitable next step from QTL discovery to application, and there have been numerous reports of successful tracking of genes and QTLs within breeding programs. DNA (or molecular) markers have enabled selection of major genes or QTLs for critical or important traits in a process called marker-assisted selection (MAS), which has revolutionized plant breeding. DNA markers are used as tools by breeders to improve the accuracy or efficiency of selection (Dwivedi et al., 2007;Xu & Crouch, 2008) as well as cost and time efficiency. Apart from MAS, DNA markers have numerous applications including DNA fingerprinting, genetic diversity analysis and parental characterization (Collard & Mackill, 2008). Despite the availability of information, the extent of actual product development using MAS in rice breeding is limited to a few major QTLs for biotic and abiotic stress tolerance (Gregorio, Islam, Vergara, & Thirumeni, 2013;Singh et al., 2015b).
Recent developments in genotyping platforms and systems to implement molecular breeding schemes offer new tools for modern rice breeders (Thomson, 2014). Single nucleotide polymorphism (SNP) markers are biallelic, co-dominant markers which are abundant in the genome (Mammadov, Aggarwal, Buyyarapu, & Kumpatla, 2012). Insertion-deletion (Indel) markers are also prevalent and can be screened using high-throughput genotyping platforms (Misra et al., 2017;Yonemaru et al., 2015). These platforms are cost-efficient, can handle large sample sizes and provide fast data turn-around time. The availability of genome sequences and genomic resources continues to provide a wealth of SNP and Indel markers which will undoubtedly be the marker type of choice for decades to come (McCouch et al., 2010).
Rice was the first crop genome to be sequenced (Goff et al., 2002;Yu et al., 2002;International Rice Genome Sequencing Project, 2005), and is uniquely poised as a major food crop and a model crop species, with abundant genetic diversity and publicly available genomic and phenotypic resources. Genomics resources include several de novo genome builds, thousands of re-sequenced genomes, numerous SNP resources and validated high-throughput SNP genotyping platforms. Developments in next-generation sequencing technologies have led to a wealth of genome sequence data that was not conceivable only a few years ago (Edwards, Batley, & Snowdon, 2013;Van, Rastogi, Kim, & Lee, 2013;Varshney & Dubey, 2009). While the japonica accession Nipponbare is still the gold standard reference genome, several other representative rice varieties from all major subspecies were subsequently assembled de novo, including IR64, DJ123 (Schatz et al., 2014) and Shuhui498 (Du et al., 2017). Recently, 3000 rice genomes were re-sequenced and a resulting set of over 29 million SNPs has been characterized and made accessible (3,000 rice genomes project, 2014;Alexandrov et al., 2015).
A number of excellent general review articles have been written about AM in crops (Buckler, Gupta et al., 2014;Hamblin, Buckler, & Jannink, 2011;Ingvarsson & Street, 2011;Mackay & Powell, 2007;Myles et al., 2009;Rafalski, 2002, 2010;Zhang, Zhong, Shahid, & Tong, 2016;Zhu et al., 2008). In this article, we review the fundamental concepts and use of AM specifically focusing on rice, a globally important food crop. We also discuss applied research activities and relevant topics that are required to move from QTL detection to actual molecular breeding, which has rarely been discussed in the scientific literature to date. In our view, consideration of the latter topic is critical to ensure that AM is fully integrated with rice molecular breeding programs in the future.

Approaches to QTL mapping
Traditional QTL mapping utilizes biparental (BP) mapping populations from controlled crosses (Zhu et al., 2008), which we refer to as biparental population QTL mapping (BPQM) in this paper. The selected parents usually differ for traits of interest, and the resulting mapping population should segregate for these traits, although transgressive segregation can be detected even when the parents do not differ for the trait (Collard, Jahufer, Brouwer, & Pang, 2005;Mackay, Stone, & Ayroles, 2009). BP mapping populations include F2, backcross (BC), recombinant inbred lines (RILs) and near-isogenic lines (NILs). F2 and BC1 populations only require two generations to develop. Although they are the simplest population types, these are not homozygous (i.e. 'fixed') and cannot be repeatedly phenotyped. On the other hand, more complex populations, such as NILs and RILs, usually require seven or eight generations to develop, but can be repeatedly phenotyped over multiple years and across environments. Chromosome segment substitution lines (CSSLs) contain chromosome segments from the donor parent in the recurrent parent background. These secondary mapping populations are required to facilitate more comprehensive analysis of target QTLs (Yano, 2001). More recently in rice, multi-parent populations called multi-parent advanced generation intercross (MAGIC; Bandillo et al., 2013), CSSLs (Bessho-Uehara et al., 2017;Ogawa et al., 2016) and nested-association mapping populations (Fragoso et al., 2017) have been used for QTL mapping.
Constructing linkage maps is necessary to identify chromosomal locations and effects of genes and QTLs associated with traits of interest. A linkage map (or genetic map) shows the relative position and genetic distances between markers or genes along chromosomes (Collard et al., 2005). QTL mapping is based on the co-segregation of QTLs and DNA markers. Conceptually, this is based on the principle of chromosome recombination which occurs during meiosis. The principle of QTL detection is based on the association between phenotype and the genotype of markers. The mapping population is partitioned into groups based on the marker each individual carries. A significant difference between the phenotypic means of the groups indicates that the specific marker used is linked to a QTL controlling the trait (Collard et al., 2005).
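The single-marker principle described above (partition the population by marker genotype, then compare the phenotypic means of the groups) can be illustrated with a minimal sketch. The function name, the 'A'/'B' genotype coding and the plant height values are hypothetical and chosen only for illustration; Welch's t statistic is used here as one simple choice of test, not as the method of any particular study.

```python
import math
from statistics import mean, variance

def single_marker_test(genotypes, phenotypes):
    """Welch's t statistic comparing phenotype means of two marker classes.

    genotypes:  list of 'A' or 'B' (homozygous marker classes, e.g. in RILs)
    phenotypes: list of trait values in the same order as genotypes
    """
    group_a = [p for g, p in zip(genotypes, phenotypes) if g == "A"]
    group_b = [p for g, p in zip(genotypes, phenotypes) if g == "B"]
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Standard error of the difference, allowing unequal group variances
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return mean_a, mean_b, (mean_a - mean_b) / se

# Hypothetical data: eight lines scored at one marker and for plant height (cm)
genotypes = ["A", "A", "A", "A", "B", "B", "B", "B"]
heights = [100.0, 102.0, 98.0, 100.0, 110.0, 112.0, 108.0, 110.0]
mean_a, mean_b, t_stat = single_marker_test(genotypes, heights)
print(mean_a, mean_b, round(t_stat, 2))
```

A large absolute t value suggests that the marker is linked to a QTL for the trait; in a real analysis, the statistic would be converted to a P value and the test repeated for every marker on the linkage map.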
In contrast to BPQM, association analysis uses diverse accessions from germplasm collections of varieties, landraces or breeding material referred to as a 'panel' (also 'diversity panel' or 'association panel'). Identifying novel QTLs from these panels is the most important use of GWAS for breeding. QTL identification is performed by examining the associations of the markers with the trait that can be explained by the 'linkage disequilibrium' (LD) between markers and polymorphisms across a set of diverse germplasm (Zhu et al., 2008). QTLs are identified based on historical recombination events between SNPs and QTLs at the population level (Nordborg & Tavaré, 2002). AM is based on the principle that, over multiple generations of recombination, markers that are tightly linked to genes for the trait of interest will generally remain associated with the trait.
Population sizes used in preliminary genetic mapping studies generally range from 100 to 250 individuals (Collard et al., 2005). A larger population size (approximately >500) is required for the analysis of QTLs having small effects on the target trait. Population sizes used for AM are usually larger than BPQM and, depending on the population structure and diversity, generally require several hundred individuals to identify QTLs.

Comparison between BPQM and AM
Both BPQM and AM are based on the co-segregation of DNA markers with traits of interest (Zhu et al., 2008). While the development of a mapping population is required for linkage analysis, AM usually uses diverse populations, or individuals with contrasting geographical origin (Lipka et al., 2015). Therefore, AM requires less time and resources because phenotypic data are sometimes available for the populations (i.e. panels) that are used for analysis. There is no need to perform controlled crosses to develop mapping populations, although a large number of markers (in the order of thousands or tens of thousands) are required compared to only a few hundred markers for BPQM.
The mapping populations used in BPQM include limited recombination events resulting from fewer generations to establish these populations (i.e. two generations for F2 populations and six to ten generations for RILs). For this reason, QTLs usually span 10 to 20 cM intervals. In contrast, AM utilizes populations which have undergone many generations of recombination since domestication and therefore, in general, only markers that are physically located close to the QTL will be detected as significant. This also explains why linkage analysis has a lower resolution of QTL detection compared to AM (Flint-Garcia, Thornsberry, & Buckler, 2003). As the size of the interval for localizing the QTL decreases, the number of individuals required to detect at least one recombinant in the region of interest increases, as does the number of molecular markers necessary to detect recombination events.
BPQM uncovers only a small portion of the genetic architecture for a trait because only alleles that differ between the two parental lines will segregate, while AM provides an alternative route to identifying QTLs that have effects across a broader range of germplasm. On the other hand, BPQM can lead to the discovery of very rare alleles provided the donor accession carries them, while in AM, alleles below a certain minor allele frequency (usually a MAF of < 5%) will be filtered out and hence rare QTLs are usually not detectable. By highlighting the differences between traditional BPQM and AM, it becomes clear that these methods are complementary to each other (Figure 1).

Key concepts in AM
The main purpose of AM is to dissect complex traits and identify QTLs (Zhu et al., 2008). QTLs detected using AM (also called 'signals', 'peaks' or 'hits') are usually represented using 'Manhattan plots' which show the association of markers with the trait along a chromosome. The y-axis indicates -log10(P value) for the association plotted against the SNPs along each chromosome on the x-axis, so the map positions of all markers used must be known (Figure 2).
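As a minimal sketch of how the coordinates of a Manhattan plot are derived, the snippet below converts per-SNP P values into -log10(P) and orders them by map position. The function name and the three example records are hypothetical; actual plotting would normally be done by the GWAS software itself.

```python
import math

def manhattan_points(results):
    """Convert (chromosome, position_bp, p_value) records into plot coordinates.

    Returns a list of (chromosome, position_bp, -log10(p)) sorted by map
    position, which is why every marker needs a known position.
    """
    points = [(chrom, pos, -math.log10(p)) for chrom, pos, p in results]
    return sorted(points, key=lambda rec: (rec[0], rec[1]))

# Hypothetical association results for three SNPs
results = [(3, 1_500_000, 1e-8), (1, 200_000, 0.05), (3, 900_000, 1e-3)]
points = manhattan_points(results)
for chrom, pos, score in points:
    print(chrom, pos, round(score, 2))
```

Because of the logarithmic transformation, smaller P values give taller points, so a strongly associated region appears as a peak.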

Linkage disequilibrium (LD)
The fundamental basis of detecting QTLs using AM is LD. LD is the 'non-random association of alleles at two or more different loci' in a population (Flint-Garcia et al., 2003;Slatkin, 2008). It measures the strength of correlation between markers caused by their shared genetic history. It is a characteristic of pairs of SNPs that describes the degree to which an allele of one SNP is inherited or correlated with an allele of another SNP within a population (Bush & Moore, 2012). The terms LD mapping and AM are often used interchangeably.
AM is dependent upon the extent of LD across the genome. Thus, the extent of LD should be known before AM can be performed. Two markers that are in LD show non-random association between alleles, but do not necessarily correlate with a particular phenotype. An association is declared when there is significant covariance between a marker polymorphism and a trait of interest. This is the basis of identifying QTLs associated with markers by AM (Soto-Cerda & Cloutier, 2012). It is important to note that LD is not the same as physical linkage; many allelic variants that are very close to each other may have low LD either due to recombination or because the variants are not at equal frequencies. It is also worth noting that SNPs in LD can be located on different chromosomes. If LD decays within a short distance, mapping resolution is expected to be high, but a larger number of markers are required (Rafalski, 2002). If LD extends over a long distance, mapping resolution will be low, but a relatively small number of markers are required.
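The pairwise LD statistics that are commonly reported (D, D' and r^2) can be computed directly from haplotype counts. The sketch below assumes phased, biallelic data coded 0/1 (a reasonable simplification for inbred rice lines); the function name and the example haplotypes are hypothetical.

```python
def pairwise_ld(haplotypes):
    """Compute D, D' and r^2 for two biallelic loci.

    haplotypes: list of (allele_at_locus1, allele_at_locus2) tuples coded 0/1,
    one tuple per phased haplotype.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n               # allele 1 frequency, locus 1
    p_b = sum(h[1] for h in haplotypes) / n               # allele 1 frequency, locus 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b
    # D' scales D by its maximum possible value given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical example 1: alleles at equal frequencies, a few recombinants
equal_freq = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
# Hypothetical example 2: no recombinants, but very unequal allele frequencies
unequal_freq = [(1, 1)] * 1 + [(0, 1)] * 4 + [(0, 0)] * 5

print(pairwise_ld(equal_freq))
print(pairwise_ld(unequal_freq))
```

In the second example D' equals 1 while r^2 is only about 0.11, which illustrates the point made above that closely linked variants can show low LD (as measured by r^2) simply because their allele frequencies differ.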
In practice, combinations of closely linked, adjacent SNPs or 'haplotypes' are used to characterize a QTL region or allele of a gene rather than a single SNP (Buntjer, Sørensen, & Peleman, 2005;Rafalski, 2002). Generally, about 5-15 SNP markers per locus are usually sufficient to characterize haplotypes in crops (Famoso et al., 2011). Haplotypes based on multiple SNPs are generally multi-allelic, in contrast to a single SNP which is biallelic. Variations in SNP haplotypes are of interest to breeders in order to identify genomic regions under selection. Recently in rice, efforts have been initiated to compile data and investigate haplotype blocks in breeding material (Yamamoto et al., 2010), or SNP haplotypes for genes or QTLs (Yonemaru, Ebana, & Yano, 2014).
One of the first estimates of LD in O. sativa was ~100 kb based on the region around xa5, a recessive gene conferring resistance to bacterial leaf blight (Garris, McCouch, & Kresovich, 2003). Targeting six genomic regions on chromosomes 1 and 4 and unlinked background SNPs, LD was estimated at ~75 kb in indica, ~150 kb in tropical japonica and >500 kb in temperate japonica (Mather et al., 2007). Huang et al. (2010) estimated that genome-wide LD rates were ~123 kb for indica and ~167 kb for japonica subspecies. These estimates (< 1 cM) represent a significant improvement in comparison with the confidence interval of QTLs detected by BPQM.
It is worth noting that these average rates of LD reflect ~10,000 years of historical recombination. In a well-designed breeding program, these rates will be higher due to the crossing structure and use of elite x elite crosses. Based on the estimated LD in rice, a minimum of 5000 markers should theoretically be sufficient to cover the ~400 Mb rice genome using the estimate of 75 kb (i.e. 400,000/75 = 5333 markers) as proposed by Courtois et al. (2013). However, filtering for monomorphic markers and low allele frequencies may lead to sub-optimal densities in practice, and therefore genotyping at higher densities is recommended.
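The marker-number arithmetic above can be repeated for other LD estimates with a short sketch. The function name is hypothetical; the genome size and the LD values are those quoted in the text, and the result is a theoretical minimum assuming evenly spaced, fully informative markers.

```python
def min_markers(genome_size_kb, ld_decay_kb):
    """Approximate minimum number of evenly spaced markers (one per LD block)."""
    return round(genome_size_kb / ld_decay_kb)

rice_genome_kb = 400_000  # ~400 Mb, as used in the text

# LD estimates (kb) quoted in the text from Mather et al. (2007) and Huang et al. (2010)
ld_estimates_kb = {
    "indica (Mather et al.)": 75,
    "tropical japonica (Mather et al.)": 150,
    "indica (Huang et al.)": 123,
    "japonica (Huang et al.)": 167,
}
marker_counts = {name: min_markers(rice_genome_kb, ld)
                 for name, ld in ld_estimates_kb.items()}
for name, count in marker_counts.items():
    print(name, count)
```

The shorter the LD decay distance, the more markers are needed, which is consistent with the recommendation to genotype at higher densities than the theoretical minimum.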

Analysis methods for AM
In early reports of AM in crops, relatively simple statistical association tests (e.g. general linear models for normally distributed traits or non-parametric tests) were performed (Li & Zhu, 2013), analogous to the initial use of single-marker analysis in BPQM. More advanced methods for AM were subsequently developed, which have become routinely used in crops (Lipka et al., 2015).
The effect of population structure (referred to as 'Q') must be accounted for when performing AM. The most commonly used approach to assess the level of population structure is to use marker information to detect subgroups within the experimental population. This is sometimes called 'structured association' and pieces of information on population structure are considered as fixed effects and used as cofactors in the analysis. Using this method, a set of random markers is used to infer the structure of the population as well as the ancestry of the panel. Common methods used to calculate population structure include: (1) using a computer program called STRUCTURE (Pritchard, Stephens, & Donnelly, 2000) or (2) principal component analysis (PCA) (Price et al., 2006). One of the main advantages of PCA is that the computational analysis is considerably simpler.
It was subsequently determined that taking into account the level of genetic relatedness (called 'kinship'; referred to as 'K') improved the accuracy of AM (Yu et al., 2006). In this landmark paper, mixed linear models (MLM) were used to include information on both population structure and kinship (i.e. the 'Q + K' mixed model), an approach that was superior in terms of reducing the false positive rate while maintaining statistical power (Zhao et al., 2007). This has now become commonly used in crops and algorithms have been streamlined to improve the efficiency (i.e. speed) of the 'data crunching'. The most commonly used approaches in rice include efficient mixed model association (EMMA; Kang et al., 2008), EMMA eXpedited (EMMAX; Kang, Sul, & Service et al., 2010), compressed MLM and population parameters previously determined (P3D) (Zhang et al., 2010). Recently, even more advanced methods have been developed (e.g. Settlement of MLM Under ...).

Figure 2. Manhattan plots for flowering date (a) and panicle length (b). Analysis was performed using TASSEL software Version 4.1.34 (Bradbury et al., 2007). Results indicate a major locus for flowering time on chromosome 3 and multiple QTLs for panicle length, typical of quantitative traits.
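For reference, the 'Q + K' mixed model discussed in this section is commonly written in the following general form. The notation is the conventional one and is given here as a sketch; the exact scaling of K and the variance assumptions differ among software implementations.

```latex
\[
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{S}\boldsymbol{\alpha}
             + \mathbf{Q}\mathbf{v} + \mathbf{Z}\mathbf{u} + \mathbf{e},
  \qquad
  \mathbf{u} \sim N\!\left(\mathbf{0},\, \sigma^2_g \mathbf{K}\right),
  \qquad
  \mathbf{e} \sim N\!\left(\mathbf{0},\, \sigma^2_e \mathbf{I}\right)
\]
```

Here y is the vector of phenotypes, β contains fixed effects other than the marker and population structure, α is the effect of the SNP being tested, v contains the population structure (Q) effects, u is the vector of random polygenic effects whose covariance is proportional to the kinship matrix K, and e is the residual; X, S, Q and Z are the corresponding incidence matrices.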
False positive signals are an inherent problem in GWAS, which is often referred to as the problem of 'multiple testing' because a large number of markers are tested and a P value is generated for each individual test (reviewed by Balding (2006) and Gupta et al. (2014)). Therefore, the cumulative possibilities of false positive results are large when all of the multiple tests are considered. Among the methods to define significance levels (i.e. that will reduce the chance of false positives but increase the chance of false negatives and vice versa), the Bonferroni correction and the false discovery rate (FDR; Benjamini & Hochberg, 1995) are most commonly used.
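The two multiple-testing approaches named above can be sketched in a few lines. The function names and the ten example P values are hypothetical; a real GWAS would involve thousands to millions of tests, and many analysis packages provide these corrections directly.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests

def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of tests declared significant by the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical P values from ten marker-trait tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

threshold = bonferroni_threshold(0.05, len(p_values))
bonferroni_hits = [i for i, p in enumerate(p_values) if p <= threshold]
fdr_hits = benjamini_hochberg(p_values, fdr=0.05)
print(threshold, bonferroni_hits, fdr_hits)
```

In this example the Bonferroni correction retains one marker and the FDR procedure retains two, reflecting the trade-off between false positives and false negatives described above.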

Practical considerations: scale of AM
There are two broad categories of AM: (1) candidate gene (CG) analysis and (2) GWAS. While CG analysis is a hypothesis-driven approach based on prior studies about genes involved in the trait of interest, GWAS is a more comprehensive approach which does not require any initial information about the genetic control of a trait of interest (Zhu et al., 2008). Due to the increasing use of GWAS over CG, the term is now used interchangeably with AM.
Genotyping is a critical component of GWAS experiments, and two methods are usually employed. Genotyping by sequencing (GBS) is a cheaper way to generate medium to high marker densities (i.e. 10,000-100,000 filtered data points per sample) (Elshire et al., 2011;He et al., 2014). Although GBS has been optimized and successfully applied in rice (Table 1), it requires considerable wet-lab and computational expertise for library preparation and bioinformatics data analysis, respectively. Fixed rice arrays (i.e. 'SNP chips') are easier to use and several options for different marker densities are available (Chen et al., 2014a;McCouch et al., 2010;Singh et al., 2015a). At present, the cost of genotyping with rice arrays is generally about US$30-US$50/sample. User-friendly, open-access software packages to perform AM, such as TASSEL (Bradbury et al., 2007) and GAPIT (Lipka et al., 2012;Tang et al., 2016), have been commonly used in rice and allow data formatting, data filtering, data visualization for quality control and advanced analyses.

Limitations of AM
There are several inherent limitations of AM which researchers must consider (reviewed by Korte & Farlow, 2013) when interpreting results, especially for downstream applications. Some of these limitations are based on the fundamental architecture of complex quantitative traits and are briefly discussed below.

Rare variants
Previous research has indicated that combinations of rare alleles are usually involved in controlling complex traits. The power to detect rare alleles, however, is low especially since markers used in GWAS are often excluded based on the MAF (i.e. < 0.05), since the statistical methods used for AM are not reliable for very low MAF. Furthermore, these variants may also be in strong/complete LD with noncausative rare variants (e.g. specific SNPs within an individual which have nothing to do with the target trait, but appear to be associated). This is referred to as a 'synthetic association' (Korte & Farlow, 2013). Importantly, indirect associations may be detected due to LD between multiple factors affecting a single trait, especially caused by adaptation (Platt, Vilhjálmsson, & Nordborg, 2010). Thus, conventional QTL mapping approaches still are and will likely remain more effective in discovering rare alleles. The deliberate inclusion of lines derived from BPs with parents containing rare alleles (if known) to supplement the panel and the use of 'joint-linkage association mapping' are two approaches to improve the detection for rare variants.
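The MAF filter mentioned above can be sketched as follows. The 0/1/2 genotype coding (copies of the alternate allele), the use of None for missing calls, the function names and the example SNPs are all assumptions made for illustration.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP; genotypes are coded as 0, 1 or 2 copies of the alternate allele.

    Missing calls are given as None and ignored.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)

def filter_by_maf(snp_matrix, threshold=0.05):
    """Keep SNPs (dict: name -> genotype list) whose MAF is at least the threshold."""
    return {name: calls for name, calls in snp_matrix.items()
            if minor_allele_frequency(calls) >= threshold}

# Hypothetical panel of 40 inbred accessions scored at three SNPs
snps = {
    "snp_common": [0] * 20 + [2] * 20,             # MAF = 0.50
    "snp_rare": [0] * 39 + [2],                    # MAF = 0.025, removed
    "snp_missing": [0] * 30 + [2] * 8 + [None] * 2,
}
kept = filter_by_maf(snps, threshold=0.05)
print(sorted(kept))
```

The rare SNP is discarded before any association test is run, which is why a QTL tagged only by such a marker cannot be detected in the panel.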

Small-effect QTLs
Many complex traits are controlled by multiple loci, many of which are small-effect QTLs (Holland, 2007). Their effects may simply be too small to be reliably detected, which is a problem when using any QTL mapping method. Detection of these small-effect QTLs can be improved by using a larger population size and accurate phenotypic measurements (i.e. to increase heritability).

Ascertainment bias
This refers to the fact that SNPs are pre-selected during the SNP development stage prior to their use for association analysis. This is relevant for array-based SNP genotyping platforms, which are usually derived by sequencing a small number of accessions for SNP discovery (McCouch et al., 2010;Myles et al., 2009). Therefore, SNPs that are associated with a trait might not be included in SNP chips, and therefore cannot be detected. Furthermore, fixed arrays might be designed to be maximally informative across certain subgroups (e.g. indica), but might detect many monomorphic markers when deployed across panels of a different subgroup (e.g. japonica) and hence may not be useful.

Genetic heterogeneity
This can occur when different genes controlling a single trait under investigation are included in the panel (i.e. the mechanism of genetic control differs within the population and causes confounding results because the ability to detect associations is lower). This can be understood by considering biology: plants originating from different areas may have evolved different mechanisms of adaptation. Genetic heterogeneity is present in rice (i.e. within O. sativa), because the indica and japonica varietal groups have evolved through independent domestication processes (Ikehashi, 2009). Genetic heterogeneity reduces the strength of the association between markers and traits.

Will AM always 'land' on genes?
One interesting finding from several GWAS experiments in animal and plant species is that the SNPs at the QTL peak are not always within the functional gene. For example, in Arabidopsis, the highest associated SNPs were not the causal SNPs for the vernalization-response gene (FRI); this was attributed to complexities due to LD with specific alleles and population structure (Atwell et al., 2010). Similar observations have been reported in independent GWASs in rice (Huang et al., 2010;Yano et al., 2016) and may be confusing for researchers who are familiar with BPQM. In the pioneering work done by Yano et al. (2016), this issue of undetectable causal SNPs was addressed by using GWAS based on whole genome sequencing, followed by the screening of candidate genes based on the estimated effect of nucleotide polymorphisms.

AM in rice
A range of AM studies have been reported for rice in the last decade (reviewed by Zhang et al., 2016). Associations with important agronomic and morphological traits have been the focus of most studies, while others analyzed yield-related, abiotic and biotic stress tolerance, reproductive and metabolic traits (Table 2). Simple sequence repeats (SSRs) were initially used in some of the earliest reports of AM in rice (Agrama, Eizenga, & Yan, 2007;Borba et al., 2010). SNPs have now clearly become the marker of choice for GWAS in rice (McCouch et al., 2010). In this section, we briefly review some of the seminal GWAS papers in rice.
A landmark GWAS was reported by Huang et al. (2010), who investigated 14 agronomic traits (morphological, yield components, grain quality and agronomic or physiological) in a panel consisting of 517 landraces and detected many novel QTLs with relatively small effects. The panel was re-sequenced to identify a large number of SNPs, which permitted a high resolution for AM. Of the 3.6 million SNPs, 167,514 were located in coding regions of >25,000 annotated genes, which was important to investigate potential functional effects of the detected SNP signals. Due to the larger sample size and genetic diversity, Huang and co-workers focused on the indica subset of the panel. Putative QTLs explained about 36% of the phenotypic variation and QTLs for six traits were located close to known genes (i.e. OsC1, Rc, ALK, Waxy, qSW5, GS3) that had been previously characterized.
Zhao et al. (2011) conducted a large-scale GWAS, exploring 34 traits (including morphological, agronomic, yield component, stress tolerance, seed/grain morphology and quality) using a 44k SNP array platform across a panel of 413 diverse accessions including indica, aus, tropical and temperate japonica, and aromatic accessions. A large number of QTLs were detected for all traits, including many signals located near the locations of known major genes for the relevant trait. A major finding was that significant genetic heterogeneity was associated with subpopulation structure. In other words, different QTLs were observed when the subpopulations were analyzed separately compared to when the entire diversity panel was used. Another important finding was the influence of environmental effects. Flowering time was investigated in three locations and revealed season specificity, even for well-known major genes such as Hd1. Furthermore, some of the strongest signals were relatively far away from known candidate genes, which was attributed to ascertainment bias.
Famoso et al. (2011) combined GWAS with BPQM for investigating aluminum tolerance, using the same panel as Zhao et al. (2011). They discovered that a large component of tolerance was due to subpopulation structure, and several subpopulation-specific QTLs for this trait were detected. Importantly, a detailed haplotype and sequence analysis was performed around the candidate gene Nrat1, which indicated a large-effect QTL (explaining 40% of the phenotypic variation within the aus subpopulation) and three non-synonymous mutations within Nrat1 that were predictive of aluminum sensitivity (Famoso et al., 2011).
Huang, Zhao and Wei et al. (2012) extended their previous GWAS results and focused on flowering time and yield-related traits using >1.3 million SNPs (>700,000 for the indica subset and >490,000 for the japonica subset). In this study, a larger panel of accessions (n = 950, including 508 indica and 383 japonica) was used. Furthermore, detailed gene annotation, expression data and genetic variation were integrated to refine identification of candidate genes and potential causal polymorphisms for the target traits. Novel QTLs were detected for the traits investigated and the authors identified unknown loci associated with well-characterized traits such as flowering time, which were not identified in their previous report.
Recently, a high-resolution open-access resource for GWAS in rice has been made available to rice researchers. The AM panel consists of 1568 diverse accessions including indica, aus, tropical and temperate japonica, and aromatic accessions, comprising two separate rice diversity panels (RDPs). The genotypic data were generated using a high-density rice array (HDRA) with 700,000 SNPs (i.e. approx. 1 SNP every 540 bp) and a suite of bioinformatics tools including a GWAS viewer, allele finder and Genome Browser were developed to assist in data interpretation.

From GWAS to molecular breeding
The large number of published QTLs or GWAS signals would imply that there are thousands of trait markers (i.e. highly predictive of target traits) available for breeders to use in selection programs. Most rice breeders, however, would argue that this is certainly not the case. Several authors have addressed the issue of why only a few reports of QTL discovery from GWAS result in actual use in MAS. Gupta et al. (2014) and Zhang et al. (2016) suggested that the high FDR was the most likely reason for a lack of application of GWAS signals in breeding.
We believe that the lack of validation of QTL-marker trait associations is a major factor explaining the lack of applied outcomes from GWASs. It is generally accepted that BPQM research needs to be validated in a range of genetic backgrounds and different environments (i.e. field testing) prior to deployment (Nicholas, 2006;Xu & Crouch, 2008). Due to high false discovery rates in AM approaches, there is an even greater need to verify marker-trait associations arising from AM experiments. In other words, the utility of a marker to accurately predict trait phenotype needs to be verified (referred to as 'marker-trait validation'). For example, Breseghello and Sorrells (2006) suggested that this could simply be performed by developing BPs and confirming this at the F2 or F3 stage. In our experience, readily available breeding populations (e.g. F2, BC, RILs or elite material) should be used to validate GWAS results in rice. Other methods involving gene expression of candidate genes can be used if the objective is to identify the causal gene for the trait, but this step is not required for breeding.
For large-scale routine screening in breeding populations, GWAS-based markers need to be converted to low-plex high-throughput marker assays (e.g. KASP, Taqman, Fluidigm) and considerable work is required to design and verify these assays. KASP assays currently range from about US$0.10-US$0.36/marker data point (Semagn, Babu, Hearne, & Olsen, 2014). Verification is typically performed for the markers using 48 to 96 samples, which contain known donors and recipients as well as a representative collection of the breeding germplasm. This tests whether the assay is robust in a variety of genetic backgrounds and whether it reliably scores favorable and unfavorable alleles in known backgrounds. Breeders should also be informed by molecular geneticists about details regarding the markers that have been developed (i.e. reliability in terms of selection accuracy for a trait, as well as sensitivity, specificity and effectiveness in different genetic backgrounds). A versatile analysis tool called 'SNP Seek' (http://snp-seek.irri.org) was developed in conjunction with the 3000 rice genomes (Alexandrov et al., 2015) and provides a valuable resource for the validation of markers located in QTL peaks. Single highly significant SNPs or linked SNP clusters/haplotypes of significant association can be queried, analyzed for allele frequencies on a subpopulation-specific basis and put into context with global variation within the region of interest, all of which can greatly aid in the development of low-plex assays.
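The marker performance measures mentioned above (selection accuracy, sensitivity and specificity) can be calculated from a validation set in which each line has both a marker call and an observed phenotype. The sketch below uses hypothetical counts for a 48-sample set and assumes that both phenotype classes are present; the function name and the boolean coding are illustrative only.

```python
def marker_validation_metrics(marker_calls, phenotypes):
    """Compare marker predictions with observed phenotypes in a validation set.

    marker_calls: list of bool, True if the line carries the favorable allele
    phenotypes:   list of bool, True if the line shows the desired phenotype
    """
    pairs = list(zip(marker_calls, phenotypes))
    tp = sum(1 for m, p in pairs if m and p)          # favorable allele, desired phenotype
    tn = sum(1 for m, p in pairs if not m and not p)  # unfavorable allele, undesired phenotype
    fp = sum(1 for m, p in pairs if m and not p)
    fn = sum(1 for m, p in pairs if not m and p)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical 48-sample validation set: 24 lines with and 24 without the phenotype
phenotypes = [True] * 24 + [False] * 24
marker_calls = [True] * 20 + [False] * 4 + [False] * 21 + [True] * 3
metrics = marker_validation_metrics(marker_calls, phenotypes)
print({name: round(value, 3) for name, value in metrics.items()})
```

Reporting these values separately for each genetic background tested would give breeders a direct indication of how far the marker can be trusted in their own material.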

Phenotyping: critical considerations and new high-throughput phenotyping methods
The importance of accurate trait data for GWAS has been previously emphasized by several researchers (Myles et al., 2009;Rafalski, 2010;Zhu et al., 2008) and we reiterate the importance of phenotypic data for the success of any QTL mapping experiment. Proper experimental design and generation of high-quality phenotypic data are absolutely critical. In practice, accurate phenotyping of panels may be more complicated than molecular geneticists realize. In rice, for example, ordinal scales are widely used for trait characterization because they are commonly used by rice breeders (for example, the IRRI standard evaluation system; IRRI, 2014). Ideally, trait scoring should not rely on subjective rating scales (Poland & Nelson, 2011). In order to properly characterize quantitative traits, reliable phenotyping methods based on quantitative measurements are required to accurately dissect genetic variation (Cobb, DeClerck, Greenberg, Clark, & McCouch, 2013). Broad- or narrow-sense heritability should be calculated for each trait in order to understand the proportion of genetic variance that has been explained by the detected QTLs.
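The heritability calculation can be sketched as follows. This minimal Python example estimates broad-sense heritability on an entry-mean basis, H2 = Vg / (Vg + Ve / r), from a balanced, replicated trial in a single environment using one-way ANOVA mean squares. The single-environment design, the function name and the example values are assumptions made for illustration; multi-environment trials would require additional variance components (e.g. G x E).

```python
def entry_mean_heritability(plots):
    """Broad-sense heritability on an entry-mean basis (balanced data).

    plots: dict mapping genotype name -> list of replicate plot values.
    Uses one-way ANOVA mean squares: Vg = (MSg - MSe) / r, Ve = MSe.
    """
    rep_counts = {len(values) for values in plots.values()}
    if len(rep_counts) != 1:
        raise ValueError("this sketch assumes the same number of replicates per genotype")
    r = rep_counts.pop()
    g = len(plots)
    if g < 2 or r < 2:
        raise ValueError("need at least two genotypes and two replicates")

    means = {name: sum(values) / r for name, values in plots.items()}
    grand_mean = sum(means.values()) / g

    # Mean square among genotypes and mean square error (within genotypes).
    ms_genotype = r * sum((m - grand_mean) ** 2 for m in means.values()) / (g - 1)
    ms_error = sum(
        (x - means[name]) ** 2 for name, values in plots.items() for x in values
    ) / (g * (r - 1))

    var_genotype = max((ms_genotype - ms_error) / r, 0.0)  # truncate negative estimates
    denominator = var_genotype + ms_error / r
    return var_genotype / denominator if denominator > 0 else float("nan")


# Illustrative values only (three lines, two replicates each).
trial = {"line_A": [10.0, 12.0], "line_B": [14.0, 16.0], "line_C": [18.0, 20.0]}
h2 = entry_mean_heritability(trial)
```

For real panels, which are often unbalanced, variance components are more commonly estimated with mixed models (e.g. REML) than with the ANOVA shortcut used here.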
Furthermore, GWAS experiments often do not consider genotype by environment (G x E) interactions. Breeders are well aware that G x E interactions may complicate trait phenotyping. Variation usually exists even within and between controlled environment greenhouse trials and certainly occurs between years, seasons and environments in field trials (Atlin, Kleinknecht, Singh, & Piepho, 2011). Further improvement of phenotyping can be achieved under controlled environment conditions, which is useful for reliable and precise phenotyping of plant responses to abiotic stresses (Negin & Moshelion, 2017). However, transferability of controlled environment observations to actual field conditions is difficult for some traits such as drought (Passioura, 2012). In these cases, completely new phenotyping methods are required.
Plant phenomics for accurate high-throughput phenotyping has undergone rapid development in the last decade. Both ground-based proximal sensing and aerial remote sensing systems are now routinely used for field phenomics. Rapid, GPS-guided, high-throughput semi- or fully automated phenomics systems have been developed for quantitative measurement of above-ground biomass, stem and canopy attributes, photosynthesis and pigment content (Simko, Hayes, & Furbank, 2016), flowering (Guo, Fukatsu, & Ninomiya, 2015), abiotic stress responses (Cobb et al., 2013), pathogenesis (Mahlein, 2016), leaf traits (Yang et al., 2015) and agronomic traits (Duan, Chapman, Guo, & Zheng, 2017). Advancements in aerial vehicle engineering, automation, sensor-based imaging, software capability, data storage and analytical capacity underpin this phenomics revolution. Ground-based platforms have the advantage of generating high-resolution data, but they cannot screen large populations simultaneously, which is critical for many applications.
Remote sensing platforms such as unmanned aerial vehicles (UAVs) and satellites generate data with different spatial resolution and spectral coverage. Manned aircraft are also used for aerial remote sensing. Most of the commercially available phenomics systems can be deployed directly for many crops, though crop-specific features may demand considerable system optimization. UAV-based aerial phenomics platforms are the most commonly used systems for germplasm development and crop research (Watanabe et al., 2017). A range of RGB (red, green, blue), multi-spectral, hyperspectral, thermal-fluorescence, 3D, LIDAR and SONAR sensors are available for plant phenotyping, and many of them can be integrated with UAVs depending on their payload capacity. Rapid progress in automated sensor-based imaging and image-processing software solutions has become the cornerstone of field phenomics (Cobb et al., 2013).
The success of high-throughput phenomics depends on the accuracy and precision of data collected (Cobb et al., 2013). Accuracy can be improved with the use of known standards, while the stability of the object sensing system determines the precision, which is important to reduce error variance. Unfortunately, increasing the throughput may compromise accuracy and precision, and this necessitates crop-specific system optimization. Despite the best efforts to maximize accuracy and precision, primary sensory data, which are collected as images over a relatively long period of time, carry a significant level of genetic, environmental and experimental noise. Developing algorithms for converting the primary sensory data to useful phenotypic data for performance prediction of breeding lines can thus be quite demanding.
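The distinction between accuracy and precision can be made concrete with a minimal Python sketch: repeated sensor readings of a reference object with a known value are summarized as bias (a measure of accuracy) and standard deviation (a measure of precision). The function name, the reference value and the readings are hypothetical.

```python
import statistics


def accuracy_and_precision(readings, reference_value):
    """Summarize repeated readings of a known standard.

    bias: mean reading minus the reference value (closer to 0 = more accurate).
    spread: sample standard deviation of the readings (smaller = more precise).
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings to estimate precision")
    bias = statistics.mean(readings) - reference_value
    spread = statistics.stdev(readings)
    return bias, spread


# Illustrative values only: four readings of a standard whose true value is 10.0.
bias, spread = accuracy_and_precision([10.2, 9.8, 10.1, 9.9], 10.0)
```

A system can be precise without being accurate (small spread, large bias), which is why calibration against known standards and checks of sensor stability address different problems.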
Measuring phenotype during the entire crop cycle requires the collection of a very large volume of data. This makes storage, processing and management of phenomics data challenging. Successful high-throughput field phenotyping is contingent on several factors (Negin & Moshelion, 2017). These include the type of the phenomics platform used, sensor systems with the required spectral coverage, computational and analytical capacity to convert sensory data to useful phenotypic data and database management capability. Identifying the appropriate phenotypic parameters that can be quantified accurately, rapidly and cost-effectively using a carefully designed experiment established with an appropriate genetic population is also vital.

Closer integration with breeding is needed
Over the last decade, GWAS results have been poorly utilized in breeding programs, which has created an 'application gap' (Collard, Raghavan, & Islam, 2017). Molecular geneticists, who typically perform most of the gene discovery work and who are the main users of GWAS, are often not well aligned with breeding programs. The objectives of molecular geneticists are to elucidate and begin to unravel the physiological basis of phenotypic variation at the gene or molecular level. While these are valid and noble endeavors, most of the knowledge obtained from these efforts is not needed in actual breeding programs.
In order to align gene discovery programs utilizing GWAS and breeding programs, discovery efforts need to be more closely integrated with actual breeding activities. Traits that gene discovery groups work on should be the same as those used to set breeding priorities. A review of the literature suggests that this is seldom the case (examples in Table 1) and that gene discovery work and breeding are rarely connected in the public sector. Often morphological or physiological traits are selected for gene discovery because they are easily measured in controlled environments with high accuracy, rather than because of their relevance for breeding. These secondary traits may be utilized, provided they are more heritable than, and highly correlated with, the trait of interest, such as yield. Some notable exceptions were studies to specifically identify loci associated with field blast resistance (Raboin et al., 2016;Zhu et al., 2016). The challenge is that high priority complex traits for breeders (e.g. yield) often defy reductionist descriptions using component traits due to complex interactions involving genetics, physiology and the physical environment. When attempts are made to measure complex traits, often the phenotyping methods employed by molecular geneticists are limited to controlled environments like growth chambers and greenhouses, where the correlation with field conditions is either low or unknown. Loci discovered in this way may be highly reliable for a specific assay (e.g. greenhouse-based disease or abiotic stress tolerance test) but there would be little value in using markers developed for these loci if the assay was not an effective predictor of field performance. In private sector breeding organizations, gene discovery and breeding teams are structured in a unified product development pipeline, to efficiently evaluate germplasm and identify new alleles of large effect that can be easily deployed in a breeding program (Eathington, Crosbie, Edwards, Reiter, & Bull, 2007).
In rice and other cereals, however, there are traits (notably disease resistance, abiotic stress tolerance and quality traits) for which single genes of large effect suitable for MAS are well known (Collard & Mackill, 2008;Das, Patra, & Baek, 2017). Any breeding program interested in doing MAS for these traits will need to prioritize the loci of interest. Particularly for traits where reliable trait markers still need to be developed, GWAS using breeding germplasm is suited to inform marker development priorities by quickly identifying which loci are both polymorphic and at high frequency in the breeding germplasm and thus would have high utility rates for MAS. In this context, GWAS results can also provide useful information by identifying loci that are already fixed in the breeding program (i.e. indicated by the lack of a QTL peak where one might be expected). If a locus is fixed for the favorable allele due to selection, it has no utility for MAS within the breeding program and the expenditure of marker development resources on it can be avoided. For example, one important finding by Begum et al. (2015), who used a panel comprised of elite breeding lines, was that major genes (e.g. for flowering time) were often not segregating in elite indica rice breeding germplasm, presumably because fixation had already occurred. This information sheds light on the genetic basis of flowering time among the breeding lines being used, and can thus help direct future strategy for further modifications of flowering time within the breeding program.
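The idea of checking whether a locus is fixed or still segregating in breeding germplasm can be sketched in Python as follows. Genotypes are assumed to be coded as the number of copies of the favorable allele (0, 1 or 2), with None for missing calls. The function name, the locus names and the 0.05 frequency threshold are hypothetical choices for illustration only.

```python
def classify_loci(genotypes, min_maf=0.05):
    """Classify loci by favorable-allele frequency in a breeding panel.

    genotypes: dict mapping locus name -> list of calls per line, coded as
    copies of the favorable allele (0, 1, 2) or None when missing.
    min_maf: hypothetical threshold below which the rarer allele is
    treated as absent, i.e. the locus is considered fixed.
    """
    result = {}
    for locus, calls in genotypes.items():
        observed = [c for c in calls if c is not None]
        if not observed:
            result[locus] = (None, "no_data")
            continue
        freq = sum(observed) / (2 * len(observed))  # favorable-allele frequency
        if freq > 1 - min_maf:
            status = "fixed_favorable"    # no marker needed for selection
        elif freq < min_maf:
            status = "fixed_unfavorable"  # favorable allele must come from a donor
        else:
            status = "segregating"        # candidate for MAS within the program
        result[locus] = (freq, status)
    return result


# Illustrative calls only; locus names are placeholders.
panel = {
    "locus_1": [2, 2, 2, 2],
    "locus_2": [0, 2, 0, 2, None],
    "locus_3": [0, 0, 0, 0],
}
classified = classify_loci(panel)
```

Only loci classified as segregating would justify investment in marker development for selection within the existing germplasm.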
Likewise, if a new disease or specific trait suddenly becomes important to a breeding program, GWAS can not only help to determine if the existing breeding lines contain loci easily amenable to MAS, but also be used to screen panels of exotic germplasm in the hope of finding large effect genes/QTLs (provided they are present at reasonable allele frequencies). However, GWAS is of limited use to a pre-breeding program when the genetic variants of large effect for a trait are already well known. In these cases, gene discovery efforts are focused on searching for rare alleles of large effect, which GWAS is unable to detect. In these situations, a more effective strategy is to screen a large number of exotic lines to identify donors and create BPs with each donor. Such populations can then be targeted for fine mapping and marker development such that NILs in elite backgrounds can be deployed in the breeding program using marker-assisted forward breeding.
There has been some research on using GWAS results as covariates in genomic selection (GS) models in a breeding program to potentially increase prediction accuracies (Spindel et al., 2016). GS is a new molecular breeding method that was originally developed in animals (Meuwissen, Hayes, & Goddard, 2001) and has recently been implemented in crops (Heffner, Sorrells, & Jannink, 2009). The method is based on using high-density marker data and advanced computational methods to predict trait performance (in contrast to MAS, which is based on using a single or small number of target alleles). This area of research urgently needs more attention and could represent a critical application of GWAS to breeding programs.
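One simple way to picture the use of GWAS results as covariates in a GS model is a ridge-regression model in which GWAS-significant markers are fitted as unpenalized fixed effects while the remaining genome-wide markers are shrunk toward zero. The Python sketch below illustrates this idea under stated assumptions; it is not necessarily the model used by Spindel et al. (2016), and the function names, the penalty value and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: increase the penalty or drop collinear markers")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - partial) / aug[i][i]
    return x


def fit_gs_with_fixed_markers(fixed, background, y, penalty):
    """Ridge regression with unpenalized GWAS markers (illustrative only).

    fixed: per-line genotype codes at GWAS-significant markers.
    background: per-line genotype codes at the remaining markers.
    Returns (intercept and fixed-marker effects, shrunken background effects).
    """
    n_fixed = 1 + len(fixed[0])  # intercept plus GWAS-derived markers
    design = [[1.0] + list(f) + list(b) for f, b in zip(fixed, background)]
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    for k in range(n_fixed, p):
        xtx[k][k] += penalty  # penalize only the background markers
    coef = solve_linear_system(xtx, xty)
    return coef[:n_fixed], coef[n_fixed:]


# Toy data only: six lines, one GWAS marker, two background markers.
fixed = [[0], [2], [0], [2], [0], [2]]
background = [[0, 0], [0, 2], [2, 0], [2, 2], [0, 2], [2, 0]]
y = [1.0, 4.5, 2.0, 5.5, 0.5, 6.0]
fixed_effects, shrunken_effects = fit_gs_with_fixed_markers(fixed, background, y, penalty=1.0)
```

With real data, the number of background markers far exceeds the number of lines, so the penalty (usually chosen by cross-validation or estimated from variance components) is essential, and dedicated mixed-model software would normally replace this hand-written solver.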

Conclusion and future perspective
AM has become an increasingly popular approach in crop genetics to understand the architecture of quantitative traits and to identify QTLs controlling important traits. This method complements traditional linkage-based mapping methods based on BPs. By combining GWAS with functional genomics, it is inevitable that the growing use of AM in rice will lead to the identification of new QTLs and candidate genes in the future (Yano et al., 2016). It is likely that future progress using AM approaches will be based on two major factors: (1) greater integration with functional analysis or gene annotation data (i.e. 'post-GWAS research', Zhang, Bailey, & Lupien, 2014); and (2) improvements in statistical and computational methods (e.g. approaches using Bayesian methods, haplotypes and SNP imputation). Further advances in rice genome re-sequencing (3,000 rice genomes project, 2014) and developments in mammalian and other model species will surely have a major impact on GWAS (Tak & Farnham, 2015;Visscher et al., 2017). Current research in AM includes methods to combine GWAS results from multiple studies (e.g. meta-analysis), strategies to account for G x E and tests for epistatic interactions. User-friendly analytical tools and genomics resources will always need to be improved and further developed.
In the meantime, researchers engaged in applied AM activities need to focus on fundamental factors such as population size, marker density and accounting for population structure. Accurate and relevant phenotypic data will always remain one of the critical factors for success in terms of integrating results with breeding. Ultimately, a holistic and integrated approach (i.e. 'systems genetics') for GWAS would be the ideal configuration. Currently, integration and relevance to actual rice breeding programs remain a great challenge. The application of GWAS using breeding panels composed of elite breeding germplasm needs further research. Collectively, this could enable the outstanding recent achievements in rice genomics to be utilized in the development of new and improved rice varieties using molecular breeding.