35-bp deletion in ABCG2 gene: mini-review and report on two herds of Bulgarian dairy synthetic population sheep breed

Abstract
A mutation in the ovine ABCG2 gene could influence the fat content and milk yield in sheep. The aim of this study was to identify a 35-bp deletion/insertion in intron 5 of the ABCG2 gene in two herds of the Bulgarian Dairy Synthetic Population (BDSP) breed. The first herd was raised at the Agricultural Institute in Shumen and included 116 ewes. The second one was bred at the Institute of Animal Science – Kostinbrod and included 68 ewes. Genomic DNA was extracted from a total of 184 blood samples. The allelic variants were determined by polymerase chain reaction (PCR) amplification with a specific primer set. The results revealed the presence of the wild-type allele “+” with a frequency of 0.53 and the mutant allele “-” with a frequency of 0.47 in BDSP - Shumen. In the animals from BDSP - Kostinbrod, the wild-type allele “+” had a frequency of 0.71 and the mutant allele “-”, a frequency of 0.29. All three possible genotypes were identified in both herds. In BDSP - Shumen, the wild-type genotype “+/+” had a frequency of 0.22; the heterozygous genotype “+/-”, 0.63; and the mutant genotype “-/-”, 0.15. In BDSP - Kostinbrod, the wild-type genotype “+/+” had a frequency of 0.55; the heterozygous genotype “+/-”, 0.32; and the mutant genotype “-/-”, 0.13. Neither herd was consistent with the Hardy-Weinberg equilibrium (HWE): p = 0.01 in BDSP - Shumen and p = 0.05 in BDSP - Kostinbrod.


Introduction
In the last century, the improvement of animal breeds was limited to selection based on phenotypic traits. Advances in molecular DNA technologies in recent decades have opened up many opportunities for genetic amelioration. The application of molecular markers associated with economically significant traits in farm animals is an innovative approach that gives accurate results and can significantly speed up and facilitate the selection process. DNA markers allow both the genetic improvement of farm breeds and the preservation of local breeds of farm animals [1]. Marker-assisted selection (MAS) is a powerful tool for genetic improvement of farm animals by means of direct selection of genes and regions of the genome associated with productive traits [2]. MAS is a method for early and precise selection, especially in milk production. Its main advantage over conventional selection is that it allows the selection of males with desired traits [3].
Studies of genetic markers can provide information about polymorphism at different loci. Such data may allow the detection and identification of genes that influence economically important traits in farm animals and, furthermore, the evaluation of the genetic status of populations and breeds [4]. To improve complex phenotypic traits such as milk productivity, it is essential that marker-assisted selection be incorporated into conventional breeding practices. The first step of this process is studying the genetic diversity of candidate genes associated with economically important traits [5]. In the last two decades, different candidate genes associated with milk production have been studied: DGAT1 (diacylglycerol acyltransferase), β-lactoglobulin, prolactin, ATP-binding cassette subfamily G member 2 (ABCG2), leptin, signal transducer and activator of transcription (STAT5A), and the FATPs (SLC27A) family of genes encoding fatty acid transport proteins [6,7]. These genes have been investigated mostly in dairy cattle but, in recent years, they have come into focus in dairy sheep breeding as well.
In Bulgaria, sheep farming has been traditional for thousands of years [8]. The size of the sheep population has decreased significantly over the last two decades, but sheep breeding is still an important sector of the national economy [9,10]. More than 30 sheep breeds are bred in Bulgaria, and over 88% of the available ewes are dairy. The most numerous breed is the Bulgarian Dairy Synthetic Population (BDSP), which was registered in 2005. BDSP is a composite sheep breed created through continuous hybridization aimed at high milk yield and prolificacy with a heterosis effect. The foundation stock consisted of Merino, half-Merino and Bulgarian dairy ewes, and of rams from highly productive dairy breeds: East Friesian (EF), Awassi (AW), and the Bulgarian Blackhead Pleven (BP) and Stara Zagora (SZ) [11,12].
Genetic improvement in sheep is often considered less effective than in other animal species, for which molecularly assisted breeding schemes are implemented in some countries. Therefore, genes that can serve as markers influencing productive traits in sheep have received attention [13,14]. Sheep's milk is richer in fats, proteins and minerals than goat's and cow's milk, which makes it preferred for processing. Some genetic factors affect milk production and its composition. The study of genes influencing milk productivity is an opportunity to solve the problem of limited sheep milk production in the future [15].

Mini review
The membrane-associated protein encoded by the ABCG2 gene (ATP-binding cassette subfamily G member 2) is an ABC transporter that has been studied as a breast cancer resistance protein in humans. ABCG2 belongs to a protein family of transmembrane drug transporters and actively extrudes various drugs, carcinogens and dietary toxins from cells in the intestine, liver and other organs. Its gene is expressed in several tissues, including the mammary gland, with the highest expression in brain, gastrointestinal tract and placental tissues, and it is thought to be important in xenobiotic protection. The expression of the protein increases significantly during the lactation period, and the protein is responsible for the secretion of some xenobiotics and vitamin K3 [16,17]. ABCG2 has been suggested to be involved in cholesterol transport in milk. ABCG2 also plays an important role in mammary gland differentiation and branching of the mammary ductal epithelial system [18]. It is responsible for transporting various molecules across cell membranes and for limiting the exposure of various tissues and organs to certain drugs and natural compounds [19,20]. ABCG2 is frequently expressed in stem cells, where it plays a role in the defense of various cells and tissues against xenotoxins and/or endotoxins [21,22].
There are studies on mutations in this gene associated with milk yield, protein and fat percentage, and somatic cell count (SCC) [23].
In cattle, the ABCG2 gene is located on chromosome 6 of the Bos taurus genome and contains a quantitative trait locus (QTL) with a large effect on milk production traits [24,25]. Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis of exon 14 of the ABCG2 gene in Karan Fries cattle revealed two alleles (A and C) and three genotypes: AA, AC and CC. The genotype frequencies of AA, AC and CC were 0.83, 0.15 and 0.02, while the allelic frequencies of A and C were 0.91 and 0.09, respectively [3]. The AA genotype was found at a higher frequency (0.97) in Iranian Holstein cattle [17], and a study of Sahiwal cattle in India found only the AA genotype [6]. A study of the polymorphism in exon 14 of the ABCG2 gene in two Turkish cattle breeds, South Anatolian Red (SAR) and East Anatolian Red (EAR), found greater diversity: two alleles (A and C) and three genotypes in SAR (with frequencies of 0.50 for AA, 0.28 for AC and 0.22 for CC) and two genotypes in EAR (0.62 for AA and 0.38 for CC) [26]. Ron et al. [24] studied a total of 35 breeds of Bos taurus (European cattle) and Bos indicus (zebu) and suggested that the A allele is that of the wild ancestor (aurochs), since the C allele is found only in Bos taurus breeds and may have appeared 200,000 years ago when Bos taurus and Bos indicus cattle split. ABCG2 alleles were found to be fixed in Red Chittagong cattle and in Indian cattle breeds [27,28].
ABCG2 is one of the genes influencing milk composition in different cattle breeds [28]. A SNP within intron 7 of the ABCG2 gene showed a significant genetic effect on milk fatty acid composition in dairy cattle, indicating its potential functions in milk fatty acid synthesis and metabolism [29]. The AA genotype in exon 14 of the ABCG2 gene has a significant effect on the breeding value for mean milk fat percentage [3]. Yu et al. [22] reported an association between ABCG2 gene polymorphisms and somatic cell count (SCC) in Holstein cattle, while Asadollahpour et al. [30], conversely, reported that genetic polymorphisms in the ABCG2 gene had no effect on SCC. Komisarek and Dorynek [16] reported a significant effect of ABCG2 gene polymorphisms on estimated breeding values for milk fat production traits in Polish Holstein-Friesian cattle. Soltani-Ghombavani et al. [17] and Alim et al. [18] found genetic effects of the ABCG2 polymorphism on milk production traits in Holstein cattle. Cohen-Zinder et al. [31] reported a SNP encoding a substitution of tyrosine-581 with serine (Y581S) in the ABCG2 transporter, whose frequency increased through selection for a higher percentage of milk fat and protein in a population of Israeli Holstein cattle.
In domestic sheep (Ovis aries L.), the ABCG2 gene has been studied as a candidate gene associated with milk production, and it is assumed to play a role in cholesterol transport in milk [14,32]. The ABCG2 gene is located on chromosome 6 in the Ovis aries genome. It consists of 20 exons separated by 19 introns, and the level of its expression increases during lactation [33,34]. The 35-bp deletion/insertion (c.683-80_46del), which was the focus of this study, was identified in intron 5. Since introns are non-coding DNA sequences, changes in these regions have no effect on the amino acid sequence. However, introns can carry transcriptional regulatory elements. They can also be a source of non-coding RNA and play a role in alternative splicing [23,33]. According to some authors, this mutation had an effect on the number of somatic cells in a dairy sheep population and therefore on milk quality [32].
Regarding the association between genotypes of the ABCG2 gene and milk production traits in sheep, Árnyasi et al. [35] sequenced exons 1-16, an 833 bp region in intron 4, a 611 bp region in intron 5, a 380 bp region in intron 6 and a 610 bp region downstream of the 3′UTR. The authors reported that the c.683-80_46del variant had a significant effect on somatic cell score (SCS) in their research.
The aim of the present study was to detect a 35 bp insertion/deletion in the ABCG2 gene and to estimate the genetic diversity of this locus in 184 animals from two herds of the Bulgarian Dairy Synthetic Population breed: one herd was raised in the experimental flock at the Institute of Animal Science in Kostinbrod and the other at the Agricultural Institute in Shumen. Both institutes are part of the Agricultural Academy in Sofia, Bulgaria.

Animals
In the present study, a total of 184 ewes of the Bulgarian Dairy Synthetic Population breed were tested for a 35-bp insertion/deletion in intron 5 of the ABCG2 gene. One hundred and sixteen (116) of them belonged to the herd of the Agricultural Institute in Shumen (Bulgaria) and sixty-eight (68), to the experimental herd of the Institute of Animal Science in Kostinbrod (Bulgaria). Blood samples were collected from the jugular vein in 3-mL vacuum tubes containing ethylenediaminetetraacetic acid (EDTA) as an anticoagulant. The investigation was carried out in the Laboratory of Genetics of the Agronomy Faculty, University of Forestry (Sofia, Bulgaria).

DNA extraction
As previously described [36], the blood samples were stored at −20 °C until DNA extraction. Genomic DNA was isolated manually using the Illustra Blood GenomicPrep DNA Purification Kit (GE Healthcare, UK), according to the manufacturer's instructions. The DNA concentration of each sample was determined with a Biodrop spectrophotometer. The quantity of the obtained DNA was about 10-50 ng, and its quality was checked on a 1% agarose gel (Healthcare) prepared with Tris-acetate-EDTA (TAE) buffer (Jena Bioscience).

Polymerase chain reaction (PCR) amplification
The amplification of intron 5 of the ABCG2 gene was implemented as described by Oner et al. [37]. The primer set used for the amplification was as follows:
The sizes of the PCR products for the ABCG2 gene were determined on a 3% agarose gel using GeneRuler™ Ladder, 50 bp (Thermo), supplied with 1 mL of 6x DNA Loading Dye, and stained with 10000x RedGel™ Nucleic Acid Stain (Biotium). The obtained results were observed under ultraviolet light.
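The reading of a gel lane can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expected fragment sizes (267 bp for the "+" allele and 232 bp for the "-" allele) are those given in the Results, while the function name `call_genotype` and the sizing tolerance `tolerance_bp` are hypothetical and not part of the protocol of Oner et al. [37].

```python
# Expected fragment sizes (bp), as reported in the Results section.
WILD_TYPE_BP = 267  # "+" allele, no deletion
DELETION_BP = 232   # "-" allele, 267 bp minus the 35-bp deletion


def call_genotype(band_sizes_bp, tolerance_bp=10):
    """Return '+/+', '+/-' or '-/-' from the band sizes read in one lane.

    tolerance_bp is a hypothetical allowance for sizing bands against
    the 50 bp ladder; it is an assumption, not a value from the paper.
    """
    has_wild = any(abs(s - WILD_TYPE_BP) <= tolerance_bp for s in band_sizes_bp)
    has_del = any(abs(s - DELETION_BP) <= tolerance_bp for s in band_sizes_bp)
    if has_wild and has_del:
        return "+/-"  # both fragments present: heterozygote
    if has_wild:
        return "+/+"  # only the 267 bp fragment
    if has_del:
        return "-/-"  # only the 232 bp fragment
    raise ValueError("no band matches either expected fragment size")
```

Because the two fragments differ by 35 bp, any tolerance below about 17 bp keeps the two size windows from overlapping.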

Statistical analysis
Statistical analysis was performed using the statistical functions in Excel 2013. The allelic and genotypic frequencies of the ABCG2 gene were estimated using the simple gene counting method (Falconer and Mackay, 1996). The expected and observed genotypic frequencies were compared using the χ² test. A population was considered to be consistent with the Hardy-Weinberg equilibrium when the value of p was > 0.05.
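The gene counting method and the χ² comparison of observed and expected genotype counts can be sketched as follows. This is a minimal illustration, not the Excel workflow used in the study: the function names are hypothetical, and the use of one degree of freedom (three genotype classes, minus one, minus one estimated allele frequency) is an assumption, since the degrees of freedom are not stated in the text. The counts in the usage note are invented for illustration and are not data from either herd.

```python
import math


def allele_frequencies(n_pp, n_pm, n_mm):
    """Gene counting: every animal carries two alleles at the locus."""
    total_alleles = 2 * (n_pp + n_pm + n_mm)
    p = (2 * n_pp + n_pm) / total_alleles  # frequency of the "+" allele
    return p, 1.0 - p                      # (p, q)


def hwe_chi_square(n_pp, n_pm, n_mm):
    """Chi-square test of observed genotype counts against HWE expectations.

    Returns (chi2, p_value). The p-value assumes 1 degree of freedom,
    for which the chi-square survival function equals erfc(sqrt(chi2 / 2)).
    """
    n = n_pp + n_pm + n_mm
    p, q = allele_frequencies(n_pp, n_pm, n_mm)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_pp, n_pm, n_mm)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, hypothetical counts of 10 "+/+", 80 "+/-" and 10 "-/-" animals give p = q = 0.5, expected counts of 25, 50 and 25, and a χ² of 36, that is, a strong heterozygote excess relative to Hardy-Weinberg expectations.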

Results
After DNA extraction, 184 samples were purified, with a mean DNA concentration of 14.3 ng/µL, and the quality of the DNA was tested on a 1% agarose gel. The results of the PCR amplifications revealed the presence of two alleles in both tested herds (Figure 1). The fragment that represented the wild-type allele "+" had a size of 267 bp, and the fragment with the 35-bp deletion, which represented the mutant allele "-", had a size of 232 bp. In BDSP - Shumen, the wild-type allele "+" had a frequency of 0.53 and the mutant allele "-", a frequency of 0.47. All three possible genotypes were observed: homozygous for the wild-type allele "+/+" with a frequency of 0.22, heterozygous "+/-" with a frequency of 0.63 and homozygous for the mutant allele "-/-" with a frequency of 0.15. In BDSP - Kostinbrod, the "+" allele had a frequency of 0.71 and the "-" allele, a frequency of 0.29. The three genotypes "+/+", "+/-" and "-/-" were observed, with frequencies of 0.55, 0.32 and 0.13, respectively. In BDSP - Shumen, the values of observed heterozygosity (Ho) and expected heterozygosity (He) were 0.629 and 0.498, respectively. The coefficient of inbreeding was −0.263. There was a statistically significant difference (p = 0.01) between Ho and He, and the population was not found to be in Hardy-Weinberg equilibrium (Table 1). In BDSP - Kostinbrod, Ho was 0.324 and He was 0.412. The coefficient of inbreeding was 0.214. There was a statistically significant (p = 0.05) departure from the Hardy-Weinberg equilibrium in the Kostinbrod herd.
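The heterozygosity and inbreeding values above are mutually consistent, as the following short sketch shows. It assumes the standard definitions for a biallelic locus, expected heterozygosity 2pq and inbreeding coefficient 1 − Ho/He, which the text does not state explicitly; the function names are hypothetical. The inputs are the rounded values reported above, so agreement is checked to three decimals only.

```python
def expected_heterozygosity(p, q):
    """Expected heterozygosity at a biallelic locus under HWE: 2pq."""
    return 2 * p * q


def inbreeding_coefficient(h_obs, h_exp):
    """Inbreeding coefficient as 1 - Ho/He (negative means heterozygote excess)."""
    return 1 - h_obs / h_exp


# Rounded values as reported in the Results for each herd.
reported = {
    "Shumen": {"p": 0.53, "q": 0.47, "Ho": 0.629, "He": 0.498},
    "Kostinbrod": {"p": 0.71, "q": 0.29, "Ho": 0.324, "He": 0.412},
}

for herd, v in reported.items():
    he = expected_heterozygosity(v["p"], v["q"])
    f = inbreeding_coefficient(v["Ho"], v["He"])
    print(f"{herd}: He = {he:.3f}, F = {f:.3f}")
```

With these inputs the sketch gives an expected heterozygosity of 0.498 and 0.412 and an inbreeding coefficient of −0.263 and 0.214 for Shumen and Kostinbrod, respectively, matching the reported figures.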
An interesting comparison can be made between the two studied groups. Although the two herds belong to the same breed, there were significant differences in the allelic and genotypic frequencies. In BDSP - Shumen, all three possible genotypes were observed, with predominance of the heterozygous genotype "+/-". In the other group, the predominant genotype was the homozygous wild-type genotype "+/+". In BDSP - Shumen, the distribution of alleles was almost equal, while in BDSP - Kostinbrod the wild-type allele "+" had a higher frequency. However, in both herds the genotype homozygous for the mutant allele had a lower frequency than the other two genotypes.
The value of the inbreeding coefficient was negative in BDSP - Shumen. This indicates that the selection program applied in this herd has managed to maintain a high level of heterozygosity and a low level of inbreeding for the investigated locus. Although all three possible genotypes were detected in BDSP - Kostinbrod, the inbreeding coefficient was positive (0.214), which indicates a tendency towards heterozygote deficiency in the tested herd. We would recommend rearrangement of the breeding individuals and better management of the implemented selection. These results can serve as a solid basis for future studies related to both genetic diversity and the phenotypic expression of productive traits.

Discussion
In our first study of the ABCG2 gene in Bulgarian sheep breeds, we tested a total of 42 animals from 6 breeds and found polymorphism in only two of them, the Caucasian Merino and Karakachan breeds. The animals from the other breeds, Askanian Merino, Karnobat Merino, Northeast Bulgarian Merino and Ile de France, were monomorphic [36].
In another study, we tested 90 animals from three Merino sheep breeds raised in Bulgaria: Askanian Merino, Caucasian Merino and Karnobat Merino. In contrast to the present study, the frequency of the mutant allele "-" was higher in all breeds. The observed frequency of the "-/-" genotype was significantly higher in all three breeds than in this study. The highest Ho, 0.371, was observed in Karnobat Merino [22].
A limited number of studies have been conducted worldwide on the ABCG2 gene in sheep. The results of the present study also differ substantially from the results published for 100 animals from three populations of the Turkish sheep breed Kıvırcık, in which the "-" allele was predominant, with a frequency between 0.50 and 0.65, and the mutant genotype "-/-" had the highest frequency, 0.5 [38].
The results obtained in BDSP are similar to those in some reports. In Hungary, the analysis of 75 purebred Gyimesi Racka animals and 310 Awassi sheep, purebreds and their crosses, showed the presence of two alleles and all three possible genotypes in the same region of the gene. In both studied populations, the wild-type allele "+" was predominant, as in the group of animals we tested. The genotype frequencies for the Awassi breed were 0.39 for +/+, 0.46 for +/- and 0.15 for -/-. For Gyimesi Racka, the genotype frequencies were 0.35 for +/+, 0.49 for +/- and 0.16 for -/-. The frequencies of the wild-type and heterozygous genotypes differ significantly from the results obtained in the present study, while only the results for the homozygous mutant genotype were closer to ours. The scientific team found a relation between the "-" allele and higher somatic cell count (SCC) [35].
Hofmannová et al. [23] identified all three genotypes in a study of sheep of the Lacaune and East Friesian breeds. In Lacaune, the predominant allele was the allele with the deletion (0.694), whereas in East Friesian it was the allele without the deletion (0.784). The results for the East Friesian breed were close to those obtained in our study. This was not surprising, given that East Friesian is one of the breeds that took part in the formation of the Bulgarian Dairy Synthetic Population breed. The authors also established a connection between the c.683-80_46del mutation in intron 5 of the ABCG2 gene and SCC in dairy sheep populations.
In our previous study of 30 ewes of the Bulgarian Dairy Synthetic Population breed from the Institute of Animal Science - Kostinbrod, both alleles were identified, with the mutant allele predominant at a frequency of 0.68, but only two genotypes: heterozygous, with a frequency of 0.63, and mutant homozygous, with a frequency of 0.37. The relationship between the two genotypes and milk yield was studied, but there was no statistically significant difference between them, probably due to the small number of animals studied [39].
The genetic diversity established in the studied region of intron 5 of the ABCG2 gene in sheep from both herds of the Bulgarian Dairy Synthetic Population breed calls for further research to clarify the relationship between milk production traits and this gene.

Conclusions
Significant genetic diversity was found in intron 5 of the ABCG2 gene in both herds of the Bulgarian Dairy Synthetic Population breed. In both herds of this dairy breed, two alleles and three genotypes were established, and the wild-type allele "+" was predominant. The study of genetic diversity is the first crucial step in marker-assisted selection. After identifying different genetic variants in a locus, our future studies will focus on the association of genotypes and specific phenotypic traits.

Table 1. Allele and genotype frequencies, observed and expected heterozygosity, and coefficient of inbreeding in two BDSP herds, in Shumen (S) and in Kostinbrod (K).

Data availability statement
The data that support the findings of this study are available from the corresponding author, M. Bozhilova-Sakova, upon reasonable request.

Disclosure statement
No potential conflict of interest was reported by the authors.