Rapid detection of type II diabetes mellitus in Saudi patients via simultaneous screening of multiple SNPs

Abstract
A large-scale DNA-based assay, namely SNaPshot, was developed and validated for the determination of allelic polymorphisms of genes associated with type II diabetes mellitus (T2D) in Saudi patients. The new approach is an alternative to phenotyping and physiological diagnosis. The assay covers single nucleotide polymorphisms (SNPs) of the seven most commonly implicated genes, namely GNB3, GCK, SLC30A8, KCNJ11, TCF7L2, CDKN2A/B and HNF4A. The study showed that the new genotyping assay is rapid, easy to perform and robust for screening high-risk members of T2D families. The assay can be applied in genetic testing as a tool for identifying significant genetic factors that help predict diabetes at early stages of life and thus avoid subsequent complications. It can also be used to evaluate the gene risk variants in populations of other geographic origins.


Introduction
Diabetes mellitus is now considered one of the main threats to human health. Several factors contribute to the development of type II diabetes (T2D). Factors such as dietary habits and exercise are extremely important in coping with the disease [1,2]. However, T2D also has a strong hereditary component. Several gene mutations have been correlated with T2D risk [1,2]. Many of them interact with environmental factors and with other genes to increase the risk of developing T2D [3].
If the prevalence of T2D continues to increase at the current rate, the global burden of this disease is predicted to rise to 366 million patients in 2030 [4,5]. Records indicate that, in Saudi Arabia, about 25% of the urban population and 19.5% of the rural population are diabetic [6]. Regional differences in the prevalence of T2D indicate that the northern (27.9%) and eastern (26.4%) provinces experience higher rates than the southern region (18.2%), where the rural lifestyle is common [6,7]. Saudi Arabia has recently been paying more attention to the risk of T2D in terms of healthcare and medical research [8]. The Middle East region has not been spared from this scourge and is currently among the worst-hit [4]. Consanguineous marriage in the Arabic-speaking countries has been reported as the main reason for such an increased prevalence of T2D [9]. Saquib et al. [10] indicated that future studies on T2D should aim to conduct more clinical trials, particularly in the area of prevention.
The candidate genes associated with T2D and their SNP (single nucleotide polymorphism) markers utilized in the present study are described below. They include one reference SNP marker, rs4812829, which is strongly linked to increased susceptibility to T2D in South Asia and Japan [11-14]. This marker is located at 20q13 in the intronic region of the HNF4A (hepatocyte nuclear factor 4 alpha) gene, which encodes a transcription factor that binds DNA as a homodimer [15]. HNF4A controls the expression of several genes, including HNF1A (hepatocyte nuclear factor 1 alpha), another transcription factor that regulates hepatic gene expression [16-18]. Mutations in the HNF4A gene have previously been shown to cause maturity-onset diabetes of the young, type 1 (MODY1) [19], thus functionally linking this region to diabetes. The cyclin-dependent kinase inhibitor-2A/B (CDKN2A/B) gene has been reported as a candidate gene for T2D, with rs10965250 located near it. This SNP was previously studied in a Saudi population [13,14]. Another SNP marker, rs12255372, is located within the transcription factor 7-like 2 (TCF7L2) gene and has also been linked to T2D in Tunisian [20], Lebanese [21], Palestinian [22], Moroccan [23] and Egyptian [24] patients. The association between T2D and the rs5215 marker in the potassium inwardly-rectifying channel, subfamily J, member 11 (KCNJ11) gene was demonstrated in a Saudi Arabian study [13]. In addition, the rs13266634 marker, mapped within the zinc transporter protein member 8 (SLC30A8) gene, is associated with T2D [25]. This SNP is also known as the Arg325Trp (or R325W) variant, where the C allele encodes arginine (R) and the T allele encodes tryptophan (W). The glucokinase (GCK) gene promoter polymorphism, -30 G > A (rs1799884), has also been associated with T2D. This SNP is one of four relatively common SNPs reported to represent a risk of T2D in the epidemiological study on the insulin resistance syndrome (DESIR), a prospective study of 3877 Caucasian participants [26]. Finally, the rs5443 marker, located in the G-protein beta3 subunit (GNB3) gene and commonly known as the C825T variant, has been linked to a number of metabolic conditions including obesity, coronary artery disease, insulin resistance and T2D [27].
Several molecular methods have previously been developed for the prediction and genotyping of T2D carriers. They include polymerase chain reaction (PCR) followed by restriction fragment length polymorphism (PCR-RFLP) [28,29], allele-specific PCR (AS-PCR) [30], real-time-based allele-specific extension assays [31,32] and Sanger sequencing of PCR products. Although reliable, these methods are low- to medium-throughput approaches. As an alternative, we have developed and evaluated a rapid, sensitive and low-cost high-throughput molecular screen that interrogates up to ten SNPs at a time using a primer extension-based method, namely the SNaPshot (or mini-sequencing) panel. This method permits simultaneous analysis of several SNPs from numerous samples of T2D patients in a relatively short time. The new assay is customized for the Saudi population.

Sampling and DNA extraction
Forty-six T2D patients recruited from the Endocrinology clinics of King Abdulaziz University Hospital (KAUH), Jeddah, KSA, were included in this study. Diagnosis was based on the ADA criteria (fasting plasma glucose ≥7 mmol/L, 2-h plasma glucose after a glucose challenge ≥11.1 mmol/L or HbA1c ≥6.5%), and blood samples were collected. Forty-two control samples were also collected from healthy age- and gender-matched volunteers. The collected blood samples were used for DNA extraction with a DNA blood mini kit (QIAamp, Qiagen, Inc.) following the manufacturer's instructions. The concentration of DNA was estimated using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific Inc.), and DNA samples were kept at -20 °C until use.

Design of regular PCR primers and single-base extension primers
SNP markers associated with the genes GNB3, GCK, SLC30A8, KCNJ11, TCF7L2, CDKN2A/B and HNF4A were selected after consulting the Ensembl (http://www.ensembl.org/index) and dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/) databases. Primers used for PCR were designed using the BatchPrimer3 v1.0 software (http://probes.pw.usda.gov/batchprimer3/) [33] with a few modifications to the default parameters. After testing under different PCR conditions, primer sizes were set between 21 and 24 nt, melting temperatures between 55 °C and 65 °C and GC contents between 40% and 60%. The conditions were also modified to avoid template mispriming, self-complementarity (mainly in the 3′ region), and hairpin and autodimer formation [32]. Sequences of the forward and reverse primers used in PCR and the lengths of the amplified products are shown in Table 1. Seven single-base extension (SBE) primers for the SNaPshot panel were designed according to the SNP locations so that each primer terminates at the 5′ end of its SNP site. SBE primers were constructed to be between 31 and 72 nucleotides in length so that the peaks recovered by capillary electrophoresis do not overlap. The SBE primer sequences are shown in Table 2.
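The design window stated above (21-24 nt, melting temperature 55-65 °C, GC content 40-60%) can be checked with a short script. The following is only a minimal sketch: it uses a simple GC-based melting temperature approximation rather than the calculation performed by BatchPrimer3, it does not test for mispriming, hairpins or dimers, and the primer sequence shown is a made-up example, not one of the primers in Table 1.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a primer as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def approx_tm(seq: str) -> float:
    """Approximate melting temperature (deg C) for primers longer than 13 nt.

    Assumption: the basic GC-based formula Tm = 64.9 + 41 * (G + C - 16.4) / N
    is used here for illustration; BatchPrimer3 may use a different model.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def passes_design_window(seq: str) -> bool:
    """Check length, Tm and GC content against the window stated in the text."""
    return (
        21 <= len(seq) <= 24
        and 55.0 <= approx_tm(seq) <= 65.0
        and 40.0 <= gc_content(seq) <= 60.0
    )


# Hypothetical 22-nt primer, used only to exercise the functions.
example_primer = "AGCTGACCTGAAGGCTCTGACA"
print(len(example_primer), round(gc_content(example_primer), 1),
      round(approx_tm(example_primer), 1), passes_design_window(example_primer))
```

A check of this kind is useful only as a first filter; candidates that pass would still need to be screened for secondary structure and tested empirically, as described in the text.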

PCR
PCR was performed with 0.2 pmol of each forward and reverse primer and 2X GoTaq® Green Master Mix (Promega, Inc.). PCR (10 µL) was performed in a Veriti™ 96-well thermal cycler, Applied Biosystems® (Life Technologies, CA, USA) using the following conditions: denaturation at 95 °C for 3 min, followed by 35 cycles of 95 °C for 30 s, 64 °C for 30 s and 72 °C for 30 s, and a final extension step of 72 °C for 7 min. The amplicons were analyzed in agarose gels (2%) stained with SYBR® Safe Nucleic Acid Gel stain (Invitrogen Inc.). PCR products of expected sizes (643-697 bp) were purified with a PCR purification kit (QIAquick, Qiagen, Inc.) to remove excess dNTPs and primers.

SNaPshot reaction preparation
SNaPshot analysis was performed using the ABI Prism SNaPshot™ Multiplex kit (Applied Biosystems, Life Technologies, CA, USA). One picomole of SBE primers (1.5 µL) was added to 3.5 µL of purified multiplex PCR products (30 ng) and 5 µL of the SNaPshot™ Multiplex kit to reach a final reaction volume of 10 µL. The SNaPshot reaction was run on a Veriti™ 96-well thermal cycler (Applied Biosystems, Life Technologies, CA, USA) under the following conditions: 25 cycles of 96 °C for 10 s, 48 °C for 10 s and 60 °C for 45 s. At the end of cycling, the SNaPshot reaction was treated with 1 unit of shrimp alkaline phosphatase (GE Healthcare, Little Chalfont, UK) at 37 °C for 60 min, followed by 75 °C for 15 min to inactivate the enzyme.

SNaPshot analysis
For fragment analysis, 1 µL of the purified SNaPshot reaction was mixed with 9.5 µL of Hi-Di formamide and 0.5 µL of GeneScan™ LIZ120™ internal size standard (Applied Biosystems, Inc.), and the mixture was denatured at 95 °C for 5 min in a Veriti™ 96-well thermal cycler (Applied Biosystems, Inc.). The fragments were then analyzed in a 3500 Genetic Analyzer (Applied Biosystems, Inc.) using POP-7™ polymer (Applied Biosystems, Inc.). The conditions during fragment analysis were as follows: injection (15 s), pre-run (3 min), run time (15 min) and data delay (4 min). The results were analyzed using GeneMapper Software v4.1 (Applied Biosystems, Inc.).

SNaPshot validation
To check the accuracy and reproducibility of the SNaPshot assay, DNA samples of 20 patients and 20 control individuals were also genotyped for the seven genes by Sanger sequencing. When the results did not match, both the direct sequencing and the SNaPshot analyses were repeated to resolve the discrepancy and confirm 100% concordance.
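The comparison between the two methods amounts to a per-sample, per-SNP concordance check. The sketch below illustrates one way this could be tabulated; the function names, the sample identifiers and the genotype calls are all hypothetical and are not data from this study. It assumes both methods report genotypes on the same strand and treats the two alleles as unordered.

```python
def normalize(genotype: str) -> str:
    """Make a genotype order-insensitive, e.g. 'GA' and 'AG' both become 'AG'."""
    return "".join(sorted(genotype.upper()))


def concordance(snapshot_calls: dict, sanger_calls: dict):
    """Compare genotype calls from two methods.

    Both arguments map sample ID -> {SNP ID -> genotype string}.
    Returns (fraction of concordant calls, list of discordant (sample, SNP) pairs).
    Assumption: calls from both methods refer to the same DNA strand.
    """
    mismatches = []
    compared = 0
    for sample in sorted(set(snapshot_calls) & set(sanger_calls)):
        for snp, call in snapshot_calls[sample].items():
            if snp not in sanger_calls[sample]:
                continue  # SNP not sequenced for this sample
            compared += 1
            if normalize(call) != normalize(sanger_calls[sample][snp]):
                mismatches.append((sample, snp))
    rate = (compared - len(mismatches)) / compared if compared else float("nan")
    return rate, mismatches


# Hypothetical calls for two samples and two SNPs (illustration only).
snapshot = {"P01": {"rs5443": "GA", "rs5215": "TT"},
            "C01": {"rs5443": "GG", "rs5215": "CT"}}
sanger = {"P01": {"rs5443": "AG", "rs5215": "TT"},
          "C01": {"rs5443": "GG", "rs5215": "CC"}}
rate, to_repeat = concordance(snapshot, sanger)
print(rate, to_repeat)
```

The list of discordant pairs identifies the samples for which both analyses would be repeated.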

Results and discussion
In this study, a multiplex SNaPshot assay was developed to detect seven SNPs associated with T2D in individuals who are at risk of the disease. The multiplex SNaPshot method was chosen for SNPs that could distinguish the most common alleles. PCR was optimized to identify the best primer pairs in single reactions (Figure 1) and so avoid undesirable amplicon formation that could invalidate test results. Additional tests were carried out to decrease the reaction volume without losing quality or sensitivity. The SNaPshot reaction was then performed by adding seven different SBE primers, each of which hybridizes immediately adjacent to its chosen SNP. The incorporation of fluorescently labeled ddNTPs, each with a different color, at the 3′ end of the SBE primers allowed us to identify the type of mutation. Add-on tails were attached to each primer to secure differential migration of peaks during capillary electrophoresis. The specificity of both PCR and the SNaPshot reaction was confirmed for individual SNPs before multiplexing. Capillary sequencing confirmed that the designated SNPs were properly selected and correctly analyzed. It is important to note that three of the seven SBE primers were re-designed, as they originally did not migrate exactly as expected, probably because of the differential weights of the associated tails (purine or pyrimidine). At this point, an SNP panel was created using GeneMapper software. With this panel, samples were easily examined based on the exact migration of each SBE primer and, consequently, each allele was identified. To avoid misinterpretation of heterozygous samples, each peak in a heterozygous sample should measure roughly half the size of the larger (homozygous) peaks. SNaPshot genotyping was done for T2D patients and controls, and the allele frequencies of the SNPs belonging to the seven genes are shown in Table 3.
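Allele frequencies such as those reported in Table 3 follow directly from the genotype calls: each individual contributes two alleles per SNP, so the frequency of an allele is its count divided by twice the number of genotyped individuals. The sketch below shows this calculation on made-up genotype lists; it is not the authors' analysis script and the genotypes are not study data.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one biallelic SNP.

    `genotypes` is a list of two-letter genotype strings such as 'GA' or 'CC'.
    Each individual contributes two alleles to the total.
    """
    counts = Counter()
    for genotype in genotypes:
        if len(genotype) != 2:
            raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
        counts.update(genotype.upper())
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical genotype calls for one SNP in a small group (illustration only).
patients = ["CC", "CT", "CC", "TT"]
print(allele_frequencies(patients))
```

Running the same function separately on the patient and control groups gives the per-group frequencies that are compared for each SNP.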
The results shed light on the genotypes that are associated with risk of the disease versus those found in individuals who are unlikely to be T2D patients (controls). The alleles of the six genes GNB3, GCK, SLC30A8, KCNJ11, TCF7L2 and CDKN2A/B that indicate a high risk of the disease are A/G/C/T/T/A, respectively. On the other hand, the G/A/T/C/G/G alleles of the same genes, respectively, indicate a low risk of the disease, as their frequencies were higher in the control individuals. It is important to note that the homozygous states of alleles A and T of genes CDKN2A/B and SLC30A8 are exclusive markers for the patients and the healthy individuals, respectively. Other genotypes indicate an intermediate risk of the disease based on the allele frequencies. No particular allele of the HNF4A gene is likely to be a marker of the disease.
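As an illustration of how the high-risk alleles listed above could be tallied for one individual, the sketch below counts them across the six genes. The unweighted count is an assumption made here for illustration only; it is not a risk score proposed or validated in this study. The example genotype is the one reported in the text for the patient shown in Figure 2.

```python
# High-risk alleles as stated in the text (official gene symbols used as keys).
RISK_ALLELES = {
    "GNB3": "A",
    "GCK": "G",
    "SLC30A8": "C",
    "KCNJ11": "T",
    "TCF7L2": "T",
    "CDKN2A/B": "A",
}


def risk_allele_count(genotypes: dict) -> int:
    """Count high-risk alleles (0-2 per gene) over the six genes.

    Assumption: a simple unweighted sum, used only to illustrate the idea.
    """
    return sum(
        genotypes[gene].upper().count(allele)
        for gene, allele in RISK_ALLELES.items()
    )


# Genotype GA/GA/CC/TT/GT/AA of the patient described for Figure 2.
patient = {
    "GNB3": "GA",
    "GCK": "GA",
    "SLC30A8": "CC",
    "KCNJ11": "TT",
    "TCF7L2": "GT",
    "CDKN2A/B": "AA",
}
print(risk_allele_count(patient), "of", 2 * len(RISK_ALLELES))
```

For this patient the count is 9 of a possible 12, with the three homozygous high-risk genotypes contributing two alleles each.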
A typical GeneMapper electrophoregram is shown in Figure 2, which represents the pattern of a given T2D patient with the genotype GA/GA/CC/TT/GT/AA for the SNPs of the six genes GNB3, GCK, SLC30A8, KCNJ11, TCF7L2 and CDKN2A/B, respectively. This genotype has high frequencies of the alleles associated with a high risk of the disease, especially those in the homozygous state, e.g. SLC30A8, KCNJ11 and CDKN2A/B.
Figure 3. Electrophoregrams generated via Sanger sequencing to confirm SNaPshot fragment analysis of seven SNPs of a subject whose genotype based on capillary electrophoresis is GG/GG/CT/TT/GT/GA/TT for SNPs rs5443, rs1799884, rs13266634, rs5215, rs12255372, rs10965250 and rs4812829, respectively. These SNPs belong to genes GNB3, GCK, SLC30A8, KCNJ11, TCF7L2, CDKN2A/B and HNF4A, respectively. Note that the pattern of the CDKN2A/B gene is shown in the reverse order of the gene sequence.
To validate the data, Sanger sequencing was applied to 40 of the samples genotyped by capillary electrophoresis. As an example, the Sanger sequencing data in Figure 3 for the seven SNPs validated the results of the GeneMapper electrophoregram shown in Figure 2. Overall, the genotypes of the seven genes obtained by Sanger sequencing in these 40 samples agreed fully with those of the SNaPshot analysis.
This study can be considered the first of its kind to explore the SNaPshot method for genotyping T2D SNPs. Our efforts targeted the development of a multiplex SNaPshot reaction capable of identifying 14 alleles of seven genes. The SNaPshot panel assay that we used is based on simultaneous amplification of seven fragments combined with SBE. The new assay requires no gel electrophoresis and reduces the processing time while maintaining quality. No amplification problems arose during the multiplex PCR, indicating that the technique of the new assay is satisfactory. We speculate that the success of the new protocol lies mainly in the selection of the proper SNPs within the T2D-related genes and the proper design of the regular and SBE primers. In addition, selecting the proper region within the genes for the initial PCR increases the successful recovery of target amplicons. Furthermore, checking for primary and possible secondary structure formation reduced the risk of unwanted products due to hairpins and primer dimers [34,35]. The choice of SBE primer orientation on the genes is a challenge, as it should take into account any other known polymorphisms within the SBE primer sequences. SBE primers are generally designed to hybridize to the gene's sense strand [36,37].
More recently, 41 genes previously associated with T2D and obesity were studied in a Kazakh population [38]. The results indicated that some of these SNPs might have a considerable effect on the predisposition to diabetes. The authors also indicated that T2D is more frequently diagnosed in men at a younger age. Although the SNaPshot assay is considered a good biomarker-based tool for the diagnosis of T2D, prognostic biomarkers were recently shown to be more beneficial for the early prevention of diabetes mellitus among pre-diabetic individuals [39].

Conclusions
Overall, the new SNaPshot genotyping assay was shown to be suitable for detecting T2D-related mutations in the selected genes. The presented panel was successfully established, optimized and applied, and it gave reproducible results, indicating that it is suitable for large-scale screening. We conclude that the multiplex SNaPshot assay is practical, robust, flexible and cost-effective, and represents a good alternative to Sanger sequencing that can be applied in regular screening. The new SNaPshot assay may be the only option for genotyping alleles when no commercial genotyping kit is available. The assay covers a number of SNP loci with two or more alleles, from which the risk of T2D in tested individuals can be estimated. Finally, the data presented in this study can facilitate the establishment of a new database of genotyped individuals to monitor the progress of T2D within families that have a history of the disease.

Disclosure statement
No potential conflict of interest was reported by the authors.

Ethics statement
Ethical approval for this research project was granted by the biomedical ethics research committee of KAUH. All participants signed institutional informed consent forms for the proposed work.