Findings made in gene panel to whole genome sequencing: data, knowledge, ethics - and consequences?

ABSTRACT Introduction: Improvements in sequencing technologies have helped to refine diagnosis and patient stratification via molecular genetic testing for a number of conditions. Consequently, sequencing has increasingly entered clinical routine. Reduced cost, combined with enhanced throughput has helped to place sequencing also in the commercial market thus moving beyond particular indications. Diverse kinds of sequencing approaches are applied, ranging from gene panel to whole-genome sequencing. All these have proven successful in the identification of causal and therapeutically relevant alterations to the benefit of patients. However, a number of technical and ethical issues induce challenges that require their appreciation, societal discussion and consensual decision. Areas covered: In the following paper, advantages and disadvantages of different DNA sequencing strategies towards their application within and outside a clinical context are discussed particularly in the light of the incidence and impact genetic findings have at the personal as well as societal level. Expert commentary: We regard the comprehensive education of citizens about these challenges a prerequisite to reach a societal consensus on the exploitation of the huge opportunities while not neglecting the potential and real dangers that are associated with the resulting data.


Types of alterations and relevance to disease
In living organisms, DNA is comprised of a two-stranded antiparallel string of four different deoxynucleotides, where each base in one strand pairs with its complementary base in the opposite strand [1]. This structure is the basis for the relative stability of the genetically encoded information. A number of cellular mechanisms have evolved to maintain the identity of the genome sequence [2]. However, alterations inevitably occur due to intrinsic components [3] for example errors induced by the replication machinery with wild-type [4] or mutated DNA polymerases [5]. Extrinsic factors further add to genetic variation. These are, for example, infections with pathogens via integration of viral DNA [6] or environmental impacts, like carcinogens (e.g. tobacco) and UV-radiation [7]. Together, these mechanisms substantially contribute to evolutionary processes, however, also to the development of disease. Two lineages of cells are affected by mutations: Alterations in the germ line, if tolerated, are passed on to the next generation whereas mutations in somatic cells are specific just for the affected individual and are not propagated. While cellular genomes are affected in both lineages, only mutations in the germ line are true genetic alterations since somatic aberrations are not inherited and thus not genetic. Yet, the term genetic alteration will be used in the remainder of this article, irrespective of the cell types affected.
Mutations in the germ line may be beneficial acting as drivers of evolutionary processes via positive selection, are neutral, or may even have detrimental consequences and are commonly negatively selected. Disease-causing mutations that are propagated in the germ line are mostly associated with Mendelian disorders [8] while germ line mutations have thus far been causally linked to cancer only for a few dozen genes, including TP53 [9], BRCA1/2 [10], APC [11], and some others [12]. Instead, tumor diseases are mostly caused by mutations occurring in somatic cells during the lifetime of an individual [13]. However, mutations in somatic cells are increasingly associated also with noncancer diseases [14]. Different kinds of alterations occur affecting either single positions in the string of nucleotides within DNA (so-called single nucleotide variants [SNV]), or change the larger structure (structural variations [SV]) and copy number (copy number variations [CNV]) of the DNA. SNVs may alter the sequence of the encoded protein in single amino acid residues, introduce stop codons, or affect expression, splicing or stability of the mRNA. Alterations summarized as SV may affect larger stretches of sequence and include smaller or larger deletions, duplications and amplifications, inversions, and translocations. Many SVs and CNVs are large, for example, trisomy of chromosome 21 in Down's syndrome [15] or amplification of the ERBB2 gene [16]. However, even large SVs may also occur without obvious pathological consequences [17].

Discovery phases in genome analysis
The genome is made up of approximately 3 billion base pairs, while the protein-coding genes only cover 2-3% of the chromosomes ( Table 1). The remaining sequences are repetitive, thought to be mostly regulatory, or to have other functions many of which are still unexplored. Several types of 'discovery phases' have been carried out after completion of the first human genome sequence in 2001 [18,19] to unravel and better understand the level of divergence in genome sequences: The 1000 Genomes Project [20] aimed at establishing catalogs of natural variation. The Cancer Genome Atlas Network [21] as well as the International Cancer Genome Consortium [22] has been out to uncover somatic mutations that are causally related to tumor diseases. On top, numerous studies have unraveled causal mutations for many monogenic disorders (see OMIM [8] for examples) and other disease conditions.
The 1000 Genomes Project discovered that variation is rather the rule than an exception at SNV [24], SV as well as at CNV levels [17]. The genomes of any two human beings differ in about 3 million base positions. Approximately, 24-30 variants in every genome have annotated implications in rare diseases [24] and some 240 genes (and pseudogenes) appear to be even dispensable for life [17]. The Cancer Genome projects found large numbers of somatic mutations across the tumor entities analyzed [7]. On top of such SNVs, the diversity of larger genomic alterations is huge, many of which are associated with disease. Gene fusions after translocations and many other genomic rearrangements (SV and CNV) are common in cancer diseases and often driver events of tumorigenesis [25,26].
The frequency of mutations and chromosomal abnormalities that are identified in individuals complicates the interpretation of cancer genomes. Not even all bona fide driver mutations necessarily lead to tumor formation. A recent study found that normal human skin is composed of a patchwork of cells carrying different numbers and kinds of mutations, many of which have been classified as strong drivers of oncogenesis [27]. Despite these findings, development of melanomas luckily appears to be a rather rare event, underpinning the robustness of the biological system at the organismal level. Hence, the consequences of a finding, that is, the identification of a tumor-associated mutation, are not always clear. Treatment of a druggable mutation only makes sense once a tumor has indeed been diagnosed. Somatic mutations appear to accumulate in the course of time even in 'unperturbed' conditions, for example without UV-irradiation [28]. The variation within individuals appears to be huge, complicating the distinction between healthy and diseased states.
Further, a particular mutation within an oncogene may have different effects depending on the tissue the tumor had originated from. Prahallad and coworkers showed that the BRAF V600E mutation, a potent driver in malignant melanoma, does not comprise a useful molecular target in monotherapy of colorectal carcinoma (CRC) [29]. While the Ras-RAF-ERK cascade drives melanoma, the PI3K-AKT signaling pathway is even activated when BRAF is inhibited in CRC [29]. The DNA sequence alone does obviously not inform about the signaling pathways and networks that are active in a particular cell or tissue type.
On top, the interpretation of findings regarding causality versus correlation may require more investigation since the presence of a particular variant does not necessarily inform on its pathogenicity or penetrance in the context of a genetic disorder. Neither the annotation of the genome nor of gene-disease associations appear to be final [30]. Genotype-phenotype databases are thus in constant need for updated information on the pathogenicity of variants [31]. This process is strictly dependent on the respective state-of-the-art in disease research, rendering genetic analysis a kind of moving target [32]. Recent studies further suggest that a considerable fraction of disease-causing mutations is not contained in the protein-coding part of the genome but rather in the many regulatory sequences governing expression levels [33]. Hence, the focus needs to be expanded from altered proteins to capturing also abnormal times and places of gene expression. These observations complicate interpretation of variants and the decision which findings to report and which not. Lastly, these discoveries impact also the kind of sequencing analysis that is required to find all relevant genetic events in the background of irrelevant or even unwanted findings.

Sequencing approaches
Next-generation sequencing has been a major advance over classical Sanger sequencing. The latter is less sensitive [34] and has lower throughput coupled with higher cost when applied for the analysis of many and/or large genes [35]. A wide spectrum of instruments has entered the market, now covering the full range from low throughput to genome-scale sequencing. Enrichment technologies have been developed to select particular regions of the genome to be sequenced [36,37]. PCR- or selection-based technologies enrich single amplicons up to whole exomes, mostly focusing on the best annotated protein-coding part of the genome. This makes up just a few percent of the total genomic DNA content (Table 1). The drastic decrease in complexity of the target sequences compared to the whole genome is accompanied by a complementary increase in sequencing depth when the numbers of reads are kept constant. Exome sequencing is commonly performed to mean coverages of 80-150× per base while sequence analysis of gene panels allows for mean coverages >500× or even much higher when only one to a few thousand genes are analyzed. Higher coverage in general results in a higher likelihood to identify rare variants [38]. However, the outcome of targeted sequencing strictly depends on a number of issues. A first is the quality of genome and gene annotation. True genes that had not been annotated in the genome assembly used for target selection are not represented in the resulting exome enrichment or gene panel product. Genes that are not regarded as relevant in a particular condition are not represented in dedicated gene panels where the search space is mostly restricted to the 'usual suspects.' The target list of gene panels should be frequently revised since more recent discoveries might have newly established associations of additional genes for the particular condition this panel should characterize. Coverage is further affected by the parameters set in target selection algorithms. These impact the levels of repeat structures, GC content or other sequence features being allowed or disregarded. If not corrected, exons not matching the selection criteria may not be represented. For example, Figure 1 shows a commercial breast cancer gene panel where the highly disease-relevant exon 10 in the PIK3CA gene is not amplified. The oncogenic hot-spot mutations E542K and E545K within that exon can, therefore, not be detected using this panel. Technical biases, as for example introduced by the sequencing chemistry, are further causes of potential false-negative findings [23,39,40].
Table 1 (only a fragment of the final row is preserved): .001-~0.5; 1->1000; selected**; >500×. WGS: Whole genome sequencing. In targeted sequencing approaches, the size of the analyzed fraction of the genome depends on the choice of exome enrichment technology (for Exome-seq) and the selection of target genes (Panel-seq), respectively. *Depending on sequencing technology and sequencing depth (fold coverage per base pair), some regions will be underrepresented (e.g. GC-rich sequences) [23]. **Coverage of gene structures is not necessarily complete: Splice variants with alternative promoters, alternative exons, alternative polyadenylation, etc. and their respective level of coverage depend on the quality of genome/gene annotation. Some regions may not be covered depending on target selection algorithm and the design of enrichment technology (e.g. see Figure 1).
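The inverse relation between target size and sequencing depth at a constant number of reads can be illustrated with a small calculation. The following is a minimal sketch only: the function name, the read number, the read length, and the panel size are hypothetical, and a perfect on-target rate is assumed, which real enrichment does not achieve.

```python
def mean_coverage(n_reads: int, read_length: int, target_bp: float,
                  on_target: float = 1.0) -> float:
    """Mean fold coverage = sequenced on-target bases / size of the target."""
    return n_reads * read_length * on_target / target_bp


GENOME_BP = 3e9  # approximate genome size as stated in the text

# Target sizes; the panel fraction is a hypothetical example value.
targets = {
    "WGS": GENOME_BP,
    "exome": 0.025 * GENOME_BP,   # protein-coding part, 2-3% of the genome
    "panel": 0.0001 * GENOME_BP,  # hypothetical panel covering 0.01%
}

# Hypothetical run: the same number of reads is spent on each target.
N_READS, READ_LEN = 10_000_000, 150

coverage = {name: mean_coverage(N_READS, READ_LEN, bp)
            for name, bp in targets.items()}

for name, cov in coverage.items():
    print(f"{name}: {cov:.1f}x mean coverage")
```

Under these assumptions the same run that yields well below 1× on a whole genome yields a depth in the thousands on a small panel, which is the 'complementary increase' referred to above.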
Whole genome sequencing (WGS) is the only method suited to obtain an almost comprehensive portrait of the genetic and, mostly relevant in cancer, somatic makeup of a particular individual or patient. Yet, WGS is not devoid of biases either. Some are inherent to WGS and are mostly reflected in the lack of coverage for a number of loci also in the reference genome. Note: The current reference genome is comprised of >800 scaffolds and >1400 contigs [41]. In contrast, the human genome sequence would ideally consist of 24 contigs, that is the 22 autosomes plus the X and Y sex chromosomes. In total, over 160,000,000 bases are estimated to be missing in the reference genome [41]. Most genomes have been sequenced using technologies that are in use around the world. Hence, clinical WGS likely runs into the same problems as those that are reflected in the reference genome.
Sequencing technologies have led to an increase in throughput by a factor of >10^6 while the cost per genome equivalent has dropped by almost the same rate in the past 15 years [42]. Sequencing of several genomes per machine and day is now feasible (e.g. using Illumina X Ten technology) finally approaching the '$1000 genome' that had been envisioned already 10 years ago [43,44]. However, sequencing depth influences the fraction of the genome that can be reliably analyzed and which mutations can or cannot be detected (i.e. potential false-negative findings). While a 30× WGS coverage appeared to be standard for (cancer) genome sequencing in the past [45], recent simulations suggest that sequencing coverage should be increased [46]. Higher coverages help to distinguish better between sequencing errors and true mutations. Low cellularity [47] and high heterogeneity [48,49] of many tumors add to this problem.
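Why depth matters for variants present in only a fraction of the sequenced cells can be made concrete with a simple binomial model. This is a sketch under strong assumptions that are not taken from the cited simulations [46]: exactly `depth` reads cover the position, each read independently carries the variant with probability `allele_fraction`, sequencing errors are ignored, and a variant counts as detectable once a hypothetical minimum number of supporting reads is reached.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float,
                          min_alt_reads: int = 3) -> float:
    """Probability of observing at least `min_alt_reads` variant-supporting
    reads at a position covered by `depth` reads (binomial model)."""
    p_too_few = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction)**(depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_too_few


# A subclonal variant at 10% allele fraction, e.g. in a tumor sample with
# low cellularity, compared at different depths (hypothetical values).
for depth in (30, 100, 500):
    p = detection_probability(depth, allele_fraction=0.10)
    print(f"{depth}x: P(detect) = {p:.3f}")
```

In this simplified picture the chance of missing a low-fraction variant shrinks quickly with depth; real callers additionally have to separate such signals from sequencing errors, which raises the required depth further.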

The 'targetome' - what to sequence?
Clinical genetic testing should ideally uncover all types of alterations that are relevant for the respective condition. Focus had been on mutations, called SNVs, for long. The identity and number of SNVs that need to be analyzed depend on the biomedical indication that shall be tackled by genetic analysis. Some diseases are monogenic [50]. There, sequencing of a single gene would suffice to make an unambiguous diagnosis.  Other diseases may be caused or are modulated by aberrations in many genes, necessitating analysis of larger numbers of genes [51]. However, for quite many diseases the causal genes and gene defects have not been fully discovered. This is particularly true when many variants add just slightly to the respective clinical phenotype [52]. The depth of genetic testing should thus ideally be wide at least until the causal genes have been fully identified.
An increasing number of chromosomal abnormalities (i.e. SV) have been discovered, including CNVs, translocations, inversions, and catastrophic rearrangements affecting larger parts of or even whole chromosomes [53]. Some of these are associated with disease; however, chromosomal variability seems to be frequent also in normal genomes [54]. Mechanistically, these abnormalities involve double strand breaks and their repair, the latter often being accomplished by error-prone nonhomologous end joining [2]. The abnormal joining of DNA ends may lead to expression of oncogenic fusion proteins. These are commonly produced via splicing of chimeric transcripts [25,55]. Common amplicon sequencing can exclusively analyze normal gene structures since the PCR primers must anneal at both sides of the respective target sequence to amplify a product. Aberrations outside the amplicon are not detected either. Exome sequencing is only able to discover those fusion events that had occurred within enriched target sequences, that is, within coding exons. However, chromosomal break points are mostly located in the much larger introns or are even intergenic. Transcriptome analysis via RNA-seq is thus superior to gene panel or exome sequencing methods for detection of gene fusions [56]. However, the combination of WGS + RNA-seq with subsequent split-read analysis of sequencing reads [57] appears to be ideal for unbiased detection of break points at the genomic level [58] as well as their transcriptomic consequences [59]. It should be noted that chimeric RNAs of two genes are not necessarily associated with genomic rearrangements and/or disease, and some seem to have biological relevance [60,61].
Analysis of RNA (e.g. mRNA, miRNA, lncRNA) has the additional advantage of informing on expression levels of the respective genes. Since alterations in gene regulation (gene dosage [62], epigenetics [63], monoallelic gene expression [64], mutations in regulatory elements [33]) have major impact on many disease conditions, expression data add complementary information to SNV and SV analysis. Only analysis of DNA and RNA is thus able to fully inform on chromosomal break points, the consequences at the levels of gene expression, and about the sequences of encoded proteins [59].

Bioinformatic analysis
Sequence analysis of small gene panels basically requires a mapping file with information on the sequence positions covered in that panel, some compute power, and software for mapping and visualization. The technical challenges are minor, also because the base redundancy, that is the coverage at a particular position is large. This permits a reliable assessment of mutations. In contrast, the amount of data that is generated in WGS is huge. One genome sequenced to about 30× coverage is the equivalent to 100 Gb in 660 million reads. These are stored in a 70 Gb FASTQ file (numbers for one lane of an Illumina XTen with 2 × 150 nt reads). The sheer size of such data sets poses challenges in data storage, transfer, and processing infrastructures. In WGS and in exome sequencing, the individual reads are mapped to the reference genome and then several quality checks are performed that may necessitate subsequent dealing with biases, like removal of duplicates, filtering of guanine oxidation events, and correction of GC bias. Impacted by the kinds of biases and the depth of sequence coverage, the outcome of variant calling may vary depending on the algorithm that is applied [46]. Substantial bioinformatic hurdles thus persist in analysis of such large data sets and for nonspecialist laboratories. Similar to developments in the field of microarrays, however, potent software will certainly become available in the future, enabling even biologists to carry out analysis of large sequence data sets. With clinical sequencing coming into reach, the importance of simple bioinformatics solutions has been realized also in the commercial sector. Companies like IBM (WATSON e.g. [65]) and SAP (HANA e.g. [66]) are in the process of developing assessment solutions 'from sequence mapper to bedside.' These have huge potential to further stimulate utilization of clinical sequencing for the benefit of patients.
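The data volume quoted above can be checked with simple arithmetic. The sketch below uses only the read number and read length given in the text plus the approximate genome size of 3 billion base pairs; the size of the FASTQ file is not modeled because it depends on quality-score encoding and compression.

```python
N_READS = 660_000_000   # reads per genome, as stated in the text
READ_LENGTH = 150       # nt per read (2 x 150 nt paired-end run)
GENOME_BP = 3e9         # approximate size of the human genome

total_bases = N_READS * READ_LENGTH    # sequenced bases
fold_coverage = total_bases / GENOME_BP

print(f"{total_bases / 1e9:.0f} Gb sequenced, ~{fold_coverage:.0f}x mean coverage")
```

The result, 99 Gb and roughly 33× mean coverage, is consistent with the rounded figures of 100 Gb and about 30× given above.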

Dealing with findings outside the direct disease context
Clinical sequencing is carried out mostly to uncover causal [67] and/or therapy-relevant [68] alterations in the context of a particular indication, like a genetic disorder or cancer. However, every individual is carrier for a number of genetic disorders she/he could pass on to the next generation or might even develop in some future time [17,24]. In consequence, comprehensive analysis of anyone's genome may well determine variants outside the original disease context [69]. Such variants, when detected, were termed 'incidental' findings for long, based on the concept that comprehensive genetic testing would incidentally uncover alterations on top of such that are intentionally searched for. Reports on true incidental findings seem to be rare [70] affecting, for example, misattributed paternity, gender, or ethnicity. Since such findings must be expected, the term 'additional' findings would better reflect the lack of real incidence. We will use the term additional throughout. Increasingly, variants outside a particular disease context are actively searched for in selected genes [71]. There, the term 'secondary' finding should fit best as these variants are intentionally detected in addition to the ones with direct relevance for a given indication. Some of those additional and secondary findings may be of direct utility to the patient. This is when diseases related to such variations can be prevented, detected early, or treated better. In consequence, it could be even unethical to not report such variants, once identified, provided that there are no other reasons that would advise against them being reported, such as lack of significance, low penetrance, or patient preference. Nondisclosure of clinically relevant additional findings made in the diagnostic context might also have legal implications. An X-ray image had been taken from a patient suffering from some disease. This image showed an observable lung tumor which was not reported by the physician. 
In this particular case [72], the physician likely missed this true incidental finding of the adenocarcinoma. The judges decided that the physician had to discover and then report this clinically relevant finding in the light of his professional expertise. Furthermore, findings with direct utility might still be passed on to at-risk relatives, or be reported toward mental or emotional preparation. Yet, the ethical debate on which findings to report or not to disclose goes on [73], also with regard to genetic analysis within academic studies, and is addressed further below.
The incidence of additional and secondary findings is a key question in clinical sequencing as are the potential consequences. The American College of Medical Genetics and Genomics (ACMG) developed a positive list of genes relevant for several different conditions [74]. The college recommended routine testing of these genes which has resulted in a controversial debate on potential benefits and risks of secondary, and even more so of additional findings in general [75][76][77][78][79][80][81]. Recently, the Clinical Sequencing Exploratory Research Consortium Tumor Working Group recommended reporting of additional findings of germ line variants/mutations having been identified in gene panel sequencing [12], also beyond the list published by the ACMG. The application of positive lists results in the potential identification of mutations present in all listed genes within the individuals being tested. The benefits should be obvious as many or even all potential causal variants are detected for the indications covered. However, there are points of concern. All genes and mutations, for example, in the ACMG list have high penetrance and pathogenicity. Any finding should thus have some benefit for the individual because a specific action would follow. However, geneticists are faced with the dilemma that neither the penetrance nor the pathogenicity are so clear for many other aberrations [32]. This is because biology is not always binary. For example, should a mutation leading to disease in 20% of affected individuals be reported? Would this decision be impacted by the severity of the condition that might be induced? What consequences would knowledge of the genotype have for the affected individual, or for his close relatives? What if information about the genotype of an individual was leaked out into the public, to employers, or insurance companies? Genetic testing of children and return of findings is yet another complex and highly sensitive issue. 
In many countries, such as the United States and Germany, predictive genetic testing of asymptomatic children is only permissible if they are at risk of childhood-onset conditions [82,83]. In contrast, 'testing for adult-onset conditions generally should be deferred unless an intervention initiated in childhood may reduce morbidity or mortality' [82]. Thus, only findings having clinical relevance in childhood are permitted to be reported. By inference, any additional findings that would not become clinically relevant before adulthood (e.g. causal changes in the huntingtin gene indicative of Huntington's disease with an onset at >50 years) should not be disclosed. Recontacting of tested individuals at maturity is oftentimes not envisioned and would likely be technically difficult in many cases. However, this point might raise concerns in the light of nondisclosure of clinically relevant additional findings and the connected ethical and potentially even legal issues discussed above. This appears to be an open issue that may need to be tackled in the future.

Are filters a solution?
Technically, sequencing and subsequent bioinformatics analysis allow for different kinds of filters to be set, thereby restricting the search space to intended and secondary findings (Figure 2). In gene panel sequencing, such filters are 'hard-encoded' in the form of positive lists. Positive lists, like the one from the ACMG, are generated and then utilized to specify the target sequences of genes that are regarded relevant for particular indications. Design of PCR primers or hybridization probes is based on such lists, rendering the making of additional findings impossible in genes that are not covered in the respective panel. The other extreme is WGS, where the complete genome is sequenced. Filtering is done only once the individual reads have been mapped to the genome. Also there, positive lists are commonly applied. However, this reduction of search space is likely not primarily done to avoid additional findings but rather due to limited capacities in subsequent variant validation and interpretation processes. Hence, in cancer, mostly somatic mutations are selected, adding only a few genes with known germ line contributions [12]. In consequence, the incidence of true additional findings seems to be low in general [70]. Exome sequencing poses a special case, as there filtering is applied first during target selection for development of the exon panel (i.e. hard-encoded filtering). The resulting sequencing data pass a second, soft-encoded filter during bioinformatics analysis. In the light of these constraints and similar to WGS, the making of additional findings seems to depend on an active search for such variants [71]. However, any filtering is strongly affected by scientific progress. Genes and gene products that had not been druggable at some time may eventually become treatable. Scientific progress thus necessitates a frequent revisiting of filters. Keeping the filtering conditions constant might otherwise even result in wrong treatment decisions.
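A soft-encoded filter of the kind described above can be sketched in a few lines. The example is illustrative only: the variant records, the function name, and both positive lists are hypothetical and do not reproduce the ACMG list or any commercial panel; the gene symbols are simply ones mentioned elsewhere in this article.

```python
def apply_positive_list(variants, positive_list):
    """Keep only variant calls located in genes on the positive list."""
    allowed = set(positive_list)
    return [v for v in variants if v["gene"] in allowed]


# Hypothetical variant calls after mapping and variant calling.
calls = [
    {"gene": "TP53", "change": "variant_1"},
    {"gene": "PIK3CA", "change": "variant_2"},
    {"gene": "GENE_X", "change": "variant_3"},  # gene outside any list
]

# Two versions of a hypothetical positive list: the revised version adds a
# gene that has become clinically relevant in the meantime.
list_v1 = ["TP53", "BRCA1", "BRCA2"]
list_v2 = list_v1 + ["PIK3CA"]

reported_v1 = apply_positive_list(calls, list_v1)
reported_v2 = apply_positive_list(calls, list_v2)

print([v["gene"] for v in reported_v1])
print([v["gene"] for v in reported_v2])
```

The same set of calls yields different reports depending on the version of the list, which is why constant filtering conditions may become outdated. In contrast to a hard-encoded panel, the unreported calls remain in the data and can be revisited.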
Open questions include which genes and potential mutations should be tested and reported back to the patient outside the original indication and how to achieve this with a societal consensus [84]. The ACMG had taken into account the penetrance and proven pathogenicity of mutations and only regarded actionable genes in their initial positive list of 57 genes having established biomedical impact [74]. This list has since been revised now covering 56 genes [85]. However, the longer the positive lists become, the higher is the complexity of findings and potential implications that need to be regarded in counseling. This will pose an increasing challenge on the counseling geneticists since specialists in one particular disease context are then supposed to potentially interpret up to genome-wide data and decide on whether or not to report findings affecting all kinds of diseases back to the respective individuals [86]. While first studies suggest that affected individuals might not undergo major crises [87] or could even benefit from better risk information [88,89], potential longtime implications regarding insurance, employment, and stigmatization [90] require further attention.

Dealing with germ line variants
It did not really come as a surprise when germ line mutations predisposing for genetic diseases were found in tumor patients considering that such mutations had frequently been found also in 'normal' individuals [91]. Hence, the ethical challenge of additional findings is a principal one. In cancer sequencing, the parallel analysis of blood DNA can be exploited to reduce the search space by automated filtering of any germ line variants. This is achieved by subtracting the germ line from the tumor genome of the same individual. Detection of germ line variants could thus be fully excluded. However, detection of germ line mutations in a few but highly relevant genes, like the TP53 and BRCA1/2 genes, should be mandatory also in the tumor context. These genes causally contribute to particular tumor types when mutated in the germ line. Tumor-only sequencing [12] poses the challenge that somatic and germ line variants are both identified, shifting the interpretation of driver events to the downstream analysis of sequencing data. There, the identification of germ line variants outside the original indication is vastly increased [92,93] since filtering for just the somatic mutations is not feasible. Application of positive lists might be a way to avoid additional findings there.
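The subtraction of the germ line from the tumor genome, combined with mandatory retention of germ line variants in a few selected genes, can be sketched as follows. All names, positions, and alleles are hypothetical, and real pipelines compare calls with far more care (e.g. regarding coverage and allele fractions in both samples).

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str


def subtract_germline(tumor, blood, keep_germline_genes=frozenset()):
    """Split tumor calls into somatic variants (absent from blood) and
    germ line variants that are retained because their gene is listed."""
    blood_set = set(blood)
    somatic = [v for v in tumor if v not in blood_set]
    retained_germline = [v for v in tumor
                         if v in blood_set and v.gene in keep_germline_genes]
    return somatic, retained_germline


# Hypothetical calls; coordinates are placeholders, not real variant positions.
germline_brca1 = Variant("chr17", 1000, "G", "A", "BRCA1")
germline_other = Variant("chr4", 2000, "C", "T", "GENE_X")
somatic_pik3ca = Variant("chr3", 3000, "G", "A", "PIK3CA")

blood_calls = [germline_brca1, germline_other]
tumor_calls = [germline_brca1, germline_other, somatic_pik3ca]

somatic, retained = subtract_germline(
    tumor_calls, blood_calls, keep_germline_genes={"TP53", "BRCA1", "BRCA2"}
)
print(somatic)
print(retained)
```

In this sketch the germ line variant in the hypothetical gene `GENE_X` is never reported, whereas the BRCA1 variant is kept despite being germ line. In tumor-only sequencing `blood_calls` would be empty, so every call would pass as 'somatic', which illustrates why additional findings increase there.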
Genetic variants with no relation to the specific condition are to be expected in higher numbers in genetic testing of other than tumor diseases [14], in principle. However, trios of unaffected parents and an affected child are commonly analyzed for homozygous or compound heterozygous mutations just occurring in the offspring [94,95]. Further, sequencing of distantly related affected individuals from larger pedigrees is applied to uncover autosomal dominant events. In all these conditions, filtering can be effectively applied, again reducing the search space. In larger family studies or sequencing of several affected individuals, findings related to other diseases inevitably occur, however, are filtered out as these are unlikely to co-segregate with the phenotype [96].
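The trio-based reduction of search space mentioned above can be illustrated with a minimal filter for recessive candidates. This is a sketch under simplifying assumptions: genotypes are given as counts of the alternative allele (0, 1, or 2), both parents are unaffected, and the record layout, gene names, and variant identifiers are hypothetical.

```python
from collections import defaultdict


def recessive_candidates(variants):
    """Return (homozygous, compound_het) candidates from trio genotypes.

    homozygous: variants homozygous in the child with both parents carriers.
    compound_het: genes in which the child carries at least one heterozygous
    variant inherited only from the mother and one only from the father.
    """
    homozygous = [v["id"] for v in variants
                  if v["child"] == 2 and v["mother"] == 1 and v["father"] == 1]

    by_gene = defaultdict(lambda: {"maternal": [], "paternal": []})
    for v in variants:
        if v["child"] != 1:
            continue
        if v["mother"] == 1 and v["father"] == 0:
            by_gene[v["gene"]]["maternal"].append(v["id"])
        elif v["father"] == 1 and v["mother"] == 0:
            by_gene[v["gene"]]["paternal"].append(v["id"])

    compound_het = {gene: (d["maternal"], d["paternal"])
                    for gene, d in by_gene.items()
                    if d["maternal"] and d["paternal"]}
    return homozygous, compound_het


# Hypothetical trio genotypes (alternative allele counts).
trio = [
    {"id": "v1", "gene": "GENE_A", "child": 2, "mother": 1, "father": 1},
    {"id": "v2", "gene": "GENE_B", "child": 1, "mother": 1, "father": 0},
    {"id": "v3", "gene": "GENE_B", "child": 1, "mother": 0, "father": 1},
    {"id": "v4", "gene": "GENE_C", "child": 1, "mother": 1, "father": 0},
]

homozygous, compound_het = recessive_candidates(trio)
print(homozygous)
print(compound_het)
```

The single heterozygous variant in the hypothetical `GENE_C` is dropped, as the child shares it with an unaffected parent and carries no second hit in that gene. Most carrier variants unrelated to the phenotype are removed in the same way.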

An informed consent?
Ethically speaking, different disclosure positions toward additional findings have emerged. These range from more paternalistic/professional-judgment-driven to more patient-preference-based approaches [73]. The professional model was particularly popular in the early ethical literature. There, the decision whether or not to disclose was left at the discretion of medical professionals and had been based on their judgment about clinical utility. More up-to-date approaches incorporate the patients' judgment about what is meaningful to them and hence operate on a broader definition of net benefit than pure clinical utility [97,98]. The concept of shared decision-making has since evolved as the 'gold standard' in other areas of medical practice where value-laden decisions need to be made and seems to be becoming accepted also in the debate about returning of results from genetic testing. Just like in other medical fields, shared decision-making involves a coalescence of patient values and professional expertise. The complexity of potential secondary findings that could be identified, for example, by application of positive lists demands a robust informed consent process. In practice, involving patients in choosing their 'disclosure profile' requires preparing them for making informed choices. This requires provision of sufficient information for eliciting patient preferences regarding whether to learn about certain categories of information while refraining from others. Along these lines, the role of professional judgment in genetics and genomics remains indispensable [99] as this enables the patient to make an informed decision. However, the professional's responsibility is becoming an ever more time-consuming and potentially Herculean task in the light of the breadth of potential findings and clinical consequences.
Prior to genetic testing in the medical context, an individual needs to be counseled on the opportunities and potential risks of the outcome. This shall result in the patient consenting to the testing process as well as to the consequences the data might induce [100,101]. The patient will likely have the current disease condition in focus and take for granted that the potential risks that she or he should consent to are directly connected to related findings and their immediate consequences for her- or himself. Counseling should be repeated once the data from genetic testing have been analyzed. There, the individual should be informed about identified alterations and the potential consequences, with emphasis on the original indication, which is likely the patient's primary concern. However, a genome sequence may have prospective value also for other conditions, once the development of a particular disease can be predicted. Here, an ethical signpost has been proposed [103] that refers to the framework of analytic validity, clinical validity, clinical utility, and ethical, legal, and social issues [102], with the aim of communicating these aspects of secondary and additional findings transparently. This pertains, for example, to the ramifications of genetic test results for insurance eligibility. Antidiscrimination laws are in place in many countries (for example, in the United States and Germany) that aim to protect people from potential discrimination, for example, by health insurers and employers, on the basis of DNA information. However, these do not fully cover life insurance, disability insurance, and long-term care insurance (GINA, the Genetic Information Nondiscrimination Act [104], and §18 of the so-called Gendiagnostikgesetz [105]).
The latter public law requires disclosure of relevant genetic data, if known to the affected individual, at the request of an insurance company when a life insurance policy is written that would exceed a certain limit. This law might have far-reaching consequences for an individual who had tested positive for a causal aberration, for example, in the huntingtin gene, at some previous time and who had afterward received this information. While the Huntington example may be an extreme case, no threshold regarding penetrance or pathogenicity has been defined in that law, leaving it to the insurance company to decide on the potential consequences of results from genetic testing.
Currently, first experiences with the reporting of secondary findings are being gathered, and no greater than minimal harm has been reported [106]. Still, it could be questioned whether every individual would, or even could, consider all possible consequences that disease-relevant findings in their genome might induce. This should be even more true in a situation of acute stress, when that individual is faced with the potential diagnosis of a serious disease, which is the reason for clinical testing. Hence, new procedural approaches to consent are being explored that depart from the traditional informed consent toward a 'staged consent': a brief mention of additional findings at the time of initial consent, followed by more detailed consent once reportable results have been identified [107].
The past, the present, and the future?
An individual's genome is relatively stable during a lifetime. The state of the art in scientific research is not. While a large number of genes have been causally related to diseases, new discoveries about disease associations and causalities are published almost every day. Hence, the information that is encoded in a genome changes with time (note: here, the term information is not meant as a proxy for DNA or protein sequence, but rather as the knowledge of variants and their specific consequences). Protection of an individual's personal genome sequence is thus a key issue in ensuring privacy, in view of the long-term implications genetic data might have. This can be realized only if everyone involved takes responsibility in his or her field of action for handling genomic information. This applies to nonphysician scientists as well as to medical professionals, but it also calls for institutional frameworks and governance to protect the genomic privacy of research subjects and patients. Lastly, the patients or individuals who are tested have to take responsibility for their own data.
Different countries and societies appear to be dealing differently with the protection of personal genomes [108]. Sharing of genetic data in clouds [109] or electronic patient records poses new challenges to data protection. The potential for leakage of data, combined with the lack of control over the interpretation of potential implications, should raise concern [110]. Full data protection cannot realistically be guaranteed unless all patient information, including the genetic data, is kept in a safe environment. Otherwise, the basis for a truly 'informed' consent should be called into question. The legal basis for disclosure of results should be harmonized across societies [111], also in view of data sharing and cloud solutions. Globalization of data flows poses additional challenges when personal data cross borders and jurisdictions [112].
As long as no agreed common standards are in place, at least the scientific community is called upon to set its own standards for data sharing in national and international consortia by way of self-regulation. Examples are the Framework for Responsible Data Sharing of the Global Alliance for Genomics and Health [113] and the code of conduct for researchers working with genomic data of the EURAT consortium in Germany [114].
Genetic screening is likely to increase vastly in the coming years [115,116]. Projects [20,108] and companies, like Arivale Inc. (https://www.arivale.com/, accessed 28 June 2016) and Human Longevity Inc. (http://www.humanlongevity.com/, accessed 28 June 2016), have begun massive sequencing of individuals outside a particular disease context. There, filtering of genetic data is intentionally not done. On the contrary, the complete information that is contained shall be uncovered and exploited. The resulting genetic data are either placed directly on the Internet or kept within the companies as part of their business model, respectively. Both approaches have huge potential: for the benefit of science in the former case, and for companies and, hopefully, also the tested individual in the latter. Only in this broad context could the application of negative lists make sense. These would comprise genes that should explicitly not be analyzed and could contain genes like huntingtin, which, with CAG triplet expansion, causes Huntington's disease [117], and/or APOE, where the E4 allele predisposes to late-onset Alzheimer disease [118]. Any such list of disease-related genes and mutations therein is, however, also a moving target. While a clear clinical significance has been established for some mutations, the level of evidence is not always so clear [119,120]. Furthermore, the identity of the genes within such a negative list depends on the scientific state of the art at the time. Development of new therapies for particular genes and mutations should be reflected in frequent revision, particularly of negative lists, as newly identified druggable events should even lead to such genes being moved from a negative to a positive list.
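How negative and positive lists act as filters, and how a gene might move between them, can be sketched as follows. This is an illustration under stated assumptions, not a description of any existing pipeline: the function names, the finding records, and the genes GENE_X and GENE_Y are hypothetical, while HTT and APOE serve as the gene symbols for the two examples named above.

```python
def reportable_findings(findings, negative_list, positive_list=None):
    """Drop findings in genes on the negative list. If a positive list is
    given, additionally keep only findings in genes on that list."""
    kept = []
    for finding in findings:
        gene = finding["gene"]
        if gene in negative_list:
            continue  # explicitly not to be analyzed or reported
        if positive_list is not None and gene not in positive_list:
            continue  # not among the genes agreed to be reported
        kept.append(finding)
    return kept

def move_to_positive(gene, negative_list, positive_list):
    """Return revised copies of both lists, e.g. after a new therapy
    has made findings in this gene actionable (a hypothetical revision)."""
    return negative_list - {gene}, positive_list | {gene}

# Illustrative lists; their content is an assumption and would need regular revision.
negative = frozenset({"HTT", "APOE"})
positive = frozenset({"GENE_X"})

findings = [
    {"gene": "HTT", "variant": "hypothetical_1"},
    {"gene": "GENE_X", "variant": "hypothetical_2"},
    {"gene": "GENE_Y", "variant": "hypothetical_3"},
]

open_report = reportable_findings(findings, negative)
restricted_report = reportable_findings(findings, negative, positive)
revised_negative, revised_positive = move_to_positive("HTT", negative, positive)
```

Keeping the lists as immutable, separately revised sets makes it explicit which version of a list was applied to a given report, which matters because the same data may yield different reportable findings after a revision.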
Deep analysis of many individuals' genomes, in combination with public or commercial storage, raises concerns regarding re-identification [121] and potential misuse of data. It is less clear what consent and data-protection standards commercial providers would subscribe to when marketing whole-genome information to their consumers. These developments put the concept of an 'informed consent' in question. Alternative consent models, like open consent [122], might be better suited to match the potentially wide distribution of genome data. Provided that open consent is still intended to require some level of understanding, it places quite an educational burden on the patient or participant. Accordingly, the Personal Genome Project, which operates on an open consent model, requires all participants to pass an exam testing their knowledge of genomic science and privacy issues and to agree to forgo the privacy and confidentiality of their genomic data and personal health records [123]. It will be interesting to follow the progress of genetic testing in the diverse settings over the coming years [124].

Expert commentary
On the one hand, only WGS is able to pick up all variation that might be causally related to disease, while gene panel sequencing, in particular, can only be as comprehensive as the state of the art in genome and disease annotation was at the time the panel was designed. On the other hand, the very high sequence coverage that is feasible only with gene panel sequencing is advantageous for detecting the complete set of mutations, including those that are rare in a tissue because of low cellularity or high heterogeneity, particularly in tumor diseases. Sequencing throughput is steadily increasing at stable or even reduced cost, permitting higher sequence coverage also for WGS. Ever-improved algorithms are being developed for analysis and interpretation of the resulting data. Hence, the benefits of WGS over other approaches will eventually make the comprehensive genome analysis of individuals and patients out-compete the analysis of only particular subsets of the genome. In developed countries, genetic testing of patients will become routine clinical practice at least for some conditions [125]. However, the SHIVA study found that, for heavily pretreated tumor patients, the physician's best choice was not inferior to off-label use of targeted drugs with respect to the end point of progression-free survival [126]. A critical assessment of clinical utility and cost-effectiveness should thus precede potential implementation of reimbursement. A kind of gold rush on the clinical market might otherwise evolve. An analogous hype in the private sector should be avoided as well. Ethical and legal issues, like additional findings and data privacy, must be tackled to protect individuals and societies, also in the future, from negative implications that could otherwise be caused by the abuse of genetic information. Legislation appears to be disparate at present [111] but should be harmonized in view of globalized data flows.
Furthermore, the linking of clinical data with social media seems to be on the rise [127,128] and surely holds a lot of promise. However, the consequences might not always be simple to foresee [129]. Stigmatization and discrimination appear to be realistic threats if genetic data are leaked to the public [110]. A societal consensus on the breadth of genetic information that could lawfully be disclosed requires wide awareness of the opportunities but also of the associated challenges. Possibly the greatest threat lies in the prospective value of the information that is encoded in anyone's genome, where the information content will definitely change with time. It is impossible now to predict which consequences future knowledge will impose on the individual who has been tested. This alone makes personal genetic data highly sensitive. Proper education of societies and their individuals (patients, healthy persons, and test providers) is indispensable for responsible and well-informed handling of genetic data. It appears that this process is still in its infancy.

Five-year view
Patient sequencing will become routine clinical practice in the developed countries, particularly as sequencing throughput will further increase at decreased cost. Huge expectations regarding the benefits of genetic testing lead to an increasing demand in the population and drive these developments. The 'patient in the cloud,' connecting medical records with genetic data, will have been established and will be in use in a growing number of countries. Companies will see a huge market for technologies, data analysis tools, and the resulting patient data itself [124]. Ethical and, very importantly, legal issues must be solved in parallel to avoid uncontrolled spreading of genetic data, particularly via the Internet, and potential abuse of personal information. Social media will be used to exchange genetic information, also by those to whom this information belongs, that is, the ones who have been sequenced. In our view, it is almost impossible to predict how this field will impact individuals and societies as a whole. Since information flows are global, we are somewhat pessimistic with regard to adverse and unlawful utilization of genetic data; already now, medical records are valued more highly by hackers than credit card information [130].

Key issues
• Technological developments enable clinical application of targeted gene panel and exome sequencing as well as untargeted whole genome sequencing.
• The sequencing methods are associated with particular pros and cons regarding the depth of resulting data, the bioinformatic hurdles that (still) have to be overcome to analyze the data, the information content, the potential to make additional findings, and many more.
• Hard-encoded (target selection for gene panel and exome sequencing) and soft-encoded (WGS and exome sequencing) filters help to reduce the search space, thus focusing the bioinformatics effort for data analysis on information of direct medical relevance and controlling the incidence of additional and secondary findings.
• It is ethically debatable whether the suppression of all possible additional findings would be desirable because some might result in a net benefit for the individual.
• The impact of particular genotypes cannot always be predicted at the time of genetic testing since the scientific state of the art is a moving target. New discoveries may newly attach disease associations to particular variants or even detach them. Hence, the information from a genetic test done today will be different years from now even though the data remain the same.
• Counseling of individuals undergoing genetic testing is highly complex because of the breadth of very personal information that may (mostly with gene panel sequencing) or must (in WGS) be expected to be revealed. Informed consent of these individuals is a prerequisite before genetic testing may be carried out.
• Societies change with the globalization also of data and information flows. The control (or rather the lack of it) over genetic data might put the concept of an informed consent into question. Robust educational efforts at the patient and provider level as well as institutional governance on privacy protection are needed to safeguard informational self-determination.
• The level of data protection will be handled differently in different countries. However, harmonization should still be attempted to guarantee maximal protection of the individuals and their data while ensuring lawful exploitation for the benefit of future generations.
• While deep and up to genome-wide genetic testing will be increasingly useful for the benefit of individuals, the risk of abuse of those data is real. Those who have undergone genetic testing should also consider these risks when deciding to spread their personal genetic data in social media and in clouds.

Funding
This paper was not funded.