Hermaphroditic freshwater mussel Anodonta cygnea does not have supernumerary open reading frames in the mitogenome

Abstract. The complete mitogenome of Anodonta cygnea is 15,613 bp long. This compact, circular molecule contains the set of 37 genes typical of invertebrate mitogenomes, in the same order and orientation as in the maternally inherited genomes of other bivalves from the same subfamily. There are only two unassigned regions longer than 200 bp (266 bp and 274 bp) and no indication of any supernumerary open reading frames.

Anodonta cygnea (Linnaeus, 1758) is a freshwater mussel of the family Unionidae, distributed in Eurasian waters. The family comprises several hundred species, most of which are found in North America. Unionids are usually gonochoristic and carry two distinct mitochondrial lineages (M and F), inherited under the doubly uniparental inheritance (DUI) system (Skibinski et al. 1994; Zouros et al. 1994). This system has been operating faithfully in freshwater mussels for a long time, leading to extreme divergence of the two mitogenomes (Hoeh et al. 2002). Gender-specific anonymous open reading frames (FORF in the F mitogenome and MORF in the M mitogenome) have been described (Doucet-Beaupré et al. 2010). A few species with secondary hermaphroditism have been described, and among the North American mussels these have always lost the divergent, paternally inherited mitogenome. Substantial structural changes were also found in their FORF (now denoted HORF) (Breton et al. 2011).
Here we announce, for the first time, the mitogenome of a European hermaphroditic species from the same family. We were unable to find a distinct paternally inherited mitogenome in the sperm of this species, so we assume that the announced mitogenome is the only one present.
The sample was taken in July 2009 from a pond in Hamrzysko village, central Poland. Identification to species level was based on diagnostic morphological characters (Piechocki and Dyduch-Falniowska 1993). The specimen is stored under voucher number 328 in the local collection at the University of Szczecin. The taxonomic identity was confirmed by comparing the barcoding cox1 sequence with the references (Bogan and Roe 2008).
The sequencing strategy followed the previously published three-step protocol (Soroka and Burzyński 2010). Two parts of the mitogenome were amplified with universal primers and sequenced. Species-specific long-range primers were used to amplify the rest of the mitogenome. The LR-PCR products were sequenced by primer walking. The complete mitogenome was assembled in gap4 from the Staden package (Staden et al. 2001). Annotations followed the established pipeline (Zbawicka et al. 2007) and were manually curated by comparison with the mitogenome of A. anatina (Soroka and Burzyński 2015).
The sequence has been deposited in GenBank under accession number MG385135. Comparative phylogenetic analysis was performed (Figure 1). The protein sequences encoded by the mitogenome differ from those of the closest relative (A. anatina F mitogenome) by approximately 10% (average p-distance, calculated in MEGA7 (Kumar et al. 2016)). No additional ORFs could be identified. In particular, the region containing FORF in the A. anatina F mitogenome and HORF in the Utterbackia imbecillis and Lasmigona compressa mitogenomes is much shorter and does not contain any ORF of appreciable length in A. cygnea.
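The p-distance quoted above is the proportion of aligned sites at which two sequences differ. The following is a minimal sketch of that calculation, not the MEGA7 implementation. The function names, the set of characters treated as missing, the averaging over per-protein pairs, and the toy sequences are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def p_distance(seq_a: str, seq_b: str, missing: str = "-?X") -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or missing-data character
    (assumed here to be '-', '?' or 'X') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_p_distance(pairs: Iterable[Tuple[str, str]]) -> float:
    """Unweighted mean of p-distances over several aligned sequence pairs."""
    distances = [p_distance(a, b) for a, b in pairs]
    if not distances:
        raise ValueError("no sequence pairs given")
    return sum(distances) / len(distances)


# Toy aligned protein fragments (made up, not taken from either mitogenome).
toy_pairs = [("MKLV", "MKIV"), ("AAAA", "AAAA")]
print(mean_p_distance(toy_pairs))
```

Whether the average is taken per protein, as in this sketch, or over a concatenated alignment changes the result slightly; the text does not specify which was used.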
Of the three cases of secondary hermaphroditism covered by the presented data set, the A. cygnea case seems to be the only one without the HORF and also the oldest one (Figure 1; Mitchell et al. 2016). It can be concluded that after the loss of DUI, the supernumerary ORFs can eventually degenerate. This reinforces the hypothesis that gender-specific mitochondrial ORFs are involved in sex determination in these animals (Breton et al. 2011).

Figure 1. BEAST (Bouckaert et al. 2014) was used to reconstruct the phylogeny. All the records were downloaded, reoriented to a common origin and aligned using ClustalW (Larkin et al. 2007). Since these genomes have the same structure and similar gene lengths, the only alignment ambiguities concerned the unassigned regions. However, these were inconsistent and had no influence on the final phylogeny, because columns with missing data were completely eliminated. The optimal model of sequence evolution (GTR + G with a relaxed lognormal clock), matching the observed pattern of substitutions, was selected as previously described (Burzyński et al. 2017). The MCMC chains were run in quadruplicate for 20 × 10^6 generations to reach an ESS of at least 300 for each parameter. The four runs converged, so the final tree samples were combined using LogCombiner. The maximum clade credibility tree was generated using TreeAnnotator. The tree was visualized in FigTree (Rambaut 2009), and the root of the tree was scaled to match that of the recently published mitogenomic analysis (Burzyński et al. 2017). All nodes have posterior probabilities of 1.0, except for the ones indicated. The node bars represent the 95% CI of node heights.
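Two of the preprocessing steps described above are easy to illustrate: reorienting circular genomes to a common origin, and eliminating every alignment column that contains missing data. The sketch below is not the authors' pipeline. The function names, the zero-based origin index, and the choice of '-', '?' and 'N' as missing-data characters are assumptions.

```python
from typing import Dict


def rotate_to_origin(genome: str, origin: int) -> str:
    """Rotate a circular sequence so that it starts at index `origin` (0-based)."""
    if not 0 <= origin < len(genome):
        raise ValueError("origin must be a valid index into the genome")
    return genome[origin:] + genome[:origin]


def drop_incomplete_columns(alignment: Dict[str, str],
                            missing: str = "-?Nn") -> Dict[str, str]:
    """Remove every column in which at least one sequence has missing data."""
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Transpose the alignment into columns and keep only the complete ones.
    kept = [col for col in zip(*alignment.values())
            if not any(char in missing for char in col)]
    return {name: "".join(col[i] for col in kept)
            for i, name in enumerate(alignment)}


# Toy data (made up): columns 3 and 4 each contain a gap and are removed.
toy = {"a": "ACG-T", "b": "AC-GT", "c": "ACGGT"}
print(rotate_to_origin("ACGTAC", 2))
print(drop_incomplete_columns(toy))
```

Because whole columns are removed, ambiguously aligned stretches such as the unassigned regions contribute nothing to the tree whenever any genome has a gap there.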