Discovery of Novel DNA Variants in Jordanians Population by Re-Genotyping Affymetrix DMET Arrays Data Using DNA Sequencing | OMICS International
ISSN: 2168-9547
Molecular Biology: Open Access

Discovery of Novel DNA Variants in Jordanians Population by Re-Genotyping Affymetrix DMET Arrays Data Using DNA Sequencing

Marzooq Ammar AL*

Faculty of Graduate Studies, Jordan University of Science and Technology, Manama, Bahrain

*Corresponding Author:
Marzooq Ammar AL, M.Sc.
Faculty of Graduate Studies
Jordan University of Science and Technology, Manama, Bahrain
Tel: 0097338898925
E-Mail: [email protected]

Received March 13, 2015; Accepted June 24, 2015; Published June 30, 2015

Citation: Marzooq Ammar AL (2015) Discovery of Novel DNA Variants in Jordanians Population by Re-Genotyping Affymetrix DMET Arrays Data Using DNA Sequencing. Mol Biol 4:126. doi:10.4172/2168-9547.1000126

Copyright: © 2015 Marzooq Ammar AL, This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

The Affymetrix DMET™ Plus platform (Affymetrix, Santa Clara, CA, USA) is a GeneChip array on which 1,936 markers can be genotyped in a sample at once. The 1,936 markers are distributed across 225 genes. Thirty-nine genes on the chip belong to the phase I–II drug metabolism, disposition and drug transport gene families. These genes are involved in metabolizing some of the most widely prescribed anticancer drugs in the world, including aromatase inhibitors, tamoxifen and thiopurines. In a high-throughput GeneChip array based on hybridization with allele-specific probes, genotyping errors are common, which limits the technology's applications; in addition, missing calls (No-calls) for many SNPs on the chip emerge as a larger and more serious problem in high-throughput genotyping methods. This study focuses on re-genotyping these No-call genotypes by DNA sequencing in order to maintain the size of sample sets already genotyped on the Affymetrix DMET™ Plus platform. Sixty-six different variations were identified, 39 of which had No-call genotypes; our re-genotyping increased the call rate from 89.08% to 95.56%. Furthermore, 8 of the identified variations had not been reported previously: c.-209T>G in UGT2B7 and -698C>A in CYP1A2 as promoter variants; c.252G>T in SLC22A6 and c.1356T>C in SLC15A1 as silent mutations; c.1277+69C>T and c.1277+82C>T in SLC22A1 as intronic variants associated with each other; and, most importantly, two missense (non-synonymous) variants in the NAT2 gene: D20N (c.58G>A) and G11S (c.31G>A).

Abbreviations

PHBC: Princess Haya Biotechnology Center; PCR: Polymerase Chain Reaction; RBC: Red Blood Cells; SDS: Sodium Dodecyl Sulphate; SE: Sodium EDTA; SNP: Single Nucleotide Polymorphism; TE: Tris-EDTA; DNA: Deoxyribonucleic Acid; EDTA: Ethylenediaminetetraacetic Acid; TBE: Tris-Borate-EDTA; dNTPs: Deoxynucleoside Triphosphates; bp: Base Pair; COX2: Cyclooxygenase Isoenzyme 2; DMET: Drug Metabolizing Enzymes and Transporters; ADME: Absorption, Distribution, Metabolism, and Excretion; ABCB1: ATP-Binding Cassette Sub-Family B Member 1; ABCG2: ATP-Binding Cassette Sub-Family G Member 2; VKORC1: Vitamin K Epoxide Reductase Complex Subunit 1; FMO2: Dimethylaniline Monooxygenase [N-Oxide-Forming] 2; SLC22A1: Solute Carrier Family 22 Member 1; SLC15A1: Solute Carrier Family 15 Member 1; SLC15A2: Solute Carrier Family 15 Member 2; SLC22A6: Solute Carrier Family 22 Member 6; CYP1A1: Cytochrome P450, Family 1, Subfamily A, Polypeptide 1; CYP1A2: Cytochrome P450, Family 1, Subfamily A, Polypeptide 2; TPMT: Thiopurine Methyltransferase; NAT2: N-Acetyltransferase 2; CDA: Cytidine Deaminase; UGT1A1: UDP-Glucuronosyltransferase 1-1; UGT2B7: UDP-Glucuronosyltransferase 2B7; μl: Microliter; Nm: Nano liter; pmol: Picomole; ml: Milliliter; g: Gram; °C: Degrees Celsius; 3′UTR: 3′ Untranslated Region; D or Asp: Aspartic Acid; S or Ser: Serine; G or Gly: Glycine; N or Asn: Asparagine

Introduction

The therapeutic efficacy of most drugs is influenced by a number of different factors that in part include age, weight and concurrent drug use [1]. These factors may vary between patients [2]. In addition, fixed parameters such as gender and human genome sequence variation can contribute. This genetic variation underlies every individual’s response to drugs [3]. The vast majority of the enzymes involved in drug metabolism are highly polymorphic [4] and allele frequencies of low-activity variants often differ by population [5]. Consequently, their activity may differ depending upon an individual’s genotype(s). For example, drugs may be metabolized more slowly in individuals who are carriers of a genetic polymorphism that results in a decreased or null activity of a given enzyme. These individuals are at particular risk for adverse drug reactions (ADRs) or therapeutic failure [6]. Conversely, drug therapy could be ineffective if the drug is metabolized too rapidly. Genetically determined variation particularly impacts drugs with narrow therapeutic indices, increasing the risk for the development of ADRs [7].

In common complex diseases, multiple genes contribute to the phenotype, each with a small effect. To identify these genetic factors, pharmacogenomic approaches now include microarrays and high-throughput automated DNA sequencing. These newer approaches are more efficient and sensitive, allow multiplexed analysis of mutations and variants in several genes, and are therefore more predictive for complex diseases in which multiple genetic factors are involved [8].

The Affymetrix® DMET™ Plus platform (DMET stands for Drug Metabolizing Enzymes and Transporters) is one such technique; it enables highly multiplexed genotyping of known polymorphisms in Absorption, Distribution, Metabolism, and Elimination (ADME)-related genes on a single array [9].

Recently, PHBC has been using the Affymetrix® DMET™ Plus platform (Affymetrix, Santa Clara, CA, USA) to genotype 1,936 variants (1,931 single nucleotide polymorphisms, SNPs, and 5 copy number variations, CNVs). The 1,936 markers are drawn from a total of 225 ADME-related genes [1]. These genes are involved in metabolizing some of the most widely prescribed anticancer drugs in the world, including aromatase inhibitors, tamoxifen and thiopurines [10,11]. The array uses molecular inversion probes (MIPs) that amplify and hybridize independently of genomic sequences, as well as universal primers and tag sequences [12]. The platform uses a single-sample genotype calling method that compares each marker to an expected signal distribution defined by large training sets at Affymetrix® [10]. Genotyping of the 1,936 markers on the chip followed the standard laboratory procedures described in the Targeted Genotyping (TG) System user guide developed by Affymetrix® (http://www.affymetrix.com).

In a high-throughput GeneChip array based on hybridization with allele-specific probes, genotyping errors are common, which limits the application of the technology; in addition, missing calls (No-calls) for many SNPs on the chip emerge as a larger and more serious problem in high-throughput genotyping methods [10]. Several factors contribute to No-call SNPs on the chip, including microarray manufacturing, stringent probe hybridization conditions, and poor quality of DNA samples. These factors strongly affect the final analysis of the genotyping results and frequently lead to the genotypes for many SNPs being discarded [10]. Eliminating genotype data because of No-call SNPs leads to unexpectedly high percentages of heterozygosity for many SNPs on the chip and to departure from Hardy-Weinberg equilibrium [13]. This study was therefore undertaken with the following aims:

• To confirm by DNA sequencing the No-call alleles generated from Affymetrix DMET™ Plus microarray data, using DNA samples from the same individuals whose DNA was previously run on the Affymetrix® DMET™ Plus platform.

• To investigate the extended effect of the No-call alleles on the general population allele frequencies for these SNPs, and to re-evaluate our cut-off and QC standards for filtering our Affymetrix DMET™ Plus microarray data.

• To discover new DNA variants that could be unique to the Jordanian population and to evaluate their role in drug efficacy and adverse reactions.
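
The link between discarded No-call genotypes and departure from Hardy-Weinberg equilibrium can be illustrated with a small sketch. The function below is a standard one-degree-of-freedom chi-square test written for illustration only; it is not the test used in this study, and the genotype counts in the example are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) for departure from Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes at a biallelic SNP and
    returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: a SNP in equilibrium, and the same SNP after
# No-calls have removed 15 genotypes from each homozygous class.
full = hwe_chi_square(25, 50, 25)
after_no_calls = hwe_chi_square(10, 50, 10)
```

In the second call the heterozygotes are over-represented relative to expectation, which is the pattern described above when No-call genotypes are simply dropped.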

Literature Review

Pharmacogenetics and pharmacogenomics overview

The study of the relationship between genetics and therapeutic drugs is usually called pharmacogenetics or pharmacogenomics [14]. Pharmacogenetics investigates adverse drug reactions at the genetic level, for example through genome-wide association studies (GWAS) that seek to explain specific adverse drug effects in patients [15,16].

Pharmacogenetics as a term covers the mechanisms of drug action as well as the genetic predisposition to adverse reactions [10,17].

Pharmacogenomic studies are directly linked to population genetics, which quantitatively investigates genetic variation in populations and how that variation is maintained, inherited from generation to generation, and fixed in a given population. This information is very helpful in pharmacogenomic studies and aids in highlighting genetic mutations and polymorphisms associated with adverse drug reactions [1,2,18].

Population genetics research has contributed enormously to molecular genetic testing by providing disease allele frequencies for many diseases in specific populations, which has helped define the target genes and variants to be screened for in a given country or ethnic group [19]. Population genetics has also established allele frequencies for many variants; these data have been used in risk calculations and in assessing genetic susceptibility to complex diseases [1], and have led to large-scale testing of many people and improved the medical services provided to them [2].

Variation in drug response and adverse reactions across different populations, and within the same community, also suggests underlying genetic variation. This is evident in the slow-metabolizer phenotype seen for a given drug in certain populations, whereas other populations metabolize the same drug faster [18,19].

Phenotypes showing poor and ultra-rapid metabolism have been documented at the genetic level, where variation in the genes encoding drug metabolizing enzymes and transporters (DMET) [20] has been characterized and shown to be reproducible [21].

Drug response is complex: several genes, mutations and polymorphisms together shape drug efficacy [22]. Environmental factors, diet and other factors add to this complexity [8,23].

Race-based genetic susceptibility in relation to drug therapy has also been documented: White Americans and African Americans with congestive heart failure respond differently to enalapril, an angiotensin-converting enzyme inhibitor [24-26]. In conclusion, personalized medicine is becoming a reality because of these findings of genetic variation and its association with phenotypic drug response and adverse reactions [2].

Genetic variants and drug metabolism: Genetic variation in the human genome can occur at the level of single nucleotides or at the chromosome level. Cytogenetic rearrangements such as large deletions, inversions and duplications occur in the genome and may be associated with disease or may simply be normal variants. The single nucleotide polymorphism (SNP) is one of the most common types of genome variation; SNPs occur in coding regions of genes, regulatory regions, introns and splice junctions, and any two genomes differ at about 0.1% of nucleotide sites [27,28].

Most of the enzymes involved in drug metabolism are highly polymorphic; their allele frequencies vary between populations, and their activities differ between individuals [4-7,29,30].

Drugs with narrow therapeutic indices (TD50/ED50) (Figure 2.1) are the category that receives the most attention in genetic research. Today, molecular diagnostics are very helpful in choosing the right drug and the optimal dosage for patients [31].

Figure 2.1: The therapeutic index of a drug is the ratio of the dose that causes toxicity to the dose that causes a clinically effective response in a population of individuals, where TD50 is the dose of drug that causes a toxic response in 50% of the population and ED50 is the dose of drug that is therapeutically effective in 50% of the population (Craig and Stitzel, 2003).

The use of genetic profiles to individualize drug therapy is the aim of personalized medicine [32]. A study by the Food and Drug Administration [33] shows that at least 121 approved drugs carry genomic biomarker information in their labels (Figure 2.2). Applying molecular testing in pharmacogenetics helps to reduce the death rate, improves quality of life, and has a positive economic impact [28].

Figure 2.2: Number of drugs that were approved with pharmacogenomic information in their drug labels during each 10-year period from 1945-2005. During the 60 years covered by this analysis, 121 drugs were approved that have genomic biomarker information in current product labeling (Frueh et al., 2008).

Response rates range from about 25% for cancer drugs to about 80% for analgesic COX2 inhibitors, a spread attributed to genetic and environmental heterogeneity [31].

Haplotypes: The set of variants carried together on a single chromosome is called a haplotype [34]; each such set of polymorphisms in phase is passed on from generation to generation [29].

In population genetics, haplotype frequencies and variation have been used successfully in several applications, including pharmacogenomics, estimation of migration rates, genetic demography and human evolutionary history [35-37]. More importantly, haplotype studies have contributed to the discovery of many genetic diseases by mapping the causative genes [38], and have become more powerful than studies based on single markers [35,39,40]. Haplotypes can be determined either experimentally or from family-based studies [41-44].

Haplotype block structures, with their tagging SNPs, have been developed for hundreds of genomic regions mapped across all chromosomes; some have been documented and linked with drug-response phenotypes, including ultra-rapid (UM), extensive (EM), intermediate (IM) and poor (PM) metabolizer status [46] (Figure 2.3).

Figure 2.3: Any two copies of the human genome differ from one another by approximately 0.1% of nucleotide sites. In this example, most of the DNA sequence is identical in these chromosomes, but there are three nucleotides where variation occurs. A pattern of DNA sequence variation defines a haplotype (Catanzaro and Labb, 2009).

Historical overview of drug metabolizing, eliminating and transporter (DMET) genes and pharmacogenetics

The effect of genetic factors on drug response was first observed in the 1950s, and serious study began in the 1970s when Robert L. Smith and colleagues studied the metabolism and pharmacokinetics (PK) of debrisoquine (an antihypertensive drug) by taking the drug themselves and monitoring its metabolism [47]. Population studies show that about 6–10% of Caucasians are poor metabolizers of debrisoquine, a phenotype linked to genetic variation in the CYP2D6 gene (Figure 2.4) [48]. Genotype–phenotype studies of CYP2D6 variants are now performed using debrisoquine or dextromethorphan as surrogate probe substrates [49].

Figure 2.4: Pharmacogenetics of debrisoquine and nortriptyline. The activity of the CYP2D6 enzyme is measured by the metabolic ratio (MR), which is the ratio of the amount of a substrate drug (debrisoquine) to the amount of its metabolic product in urine after a standard dose of the drug. High ratios show poor conversion due to low enzyme activity (Wadelius and Pirmohamed, 2007). The graph shows observed ratios for a range of patients. The same enzyme is largely responsible for phase I catabolism of the antidepressant drug nortriptyline. Depending on the CYP2D6 phenotype as measured by the metabolic ratio, patients require different doses of nortriptyline (Strachan and Read, 2011).

Allelic variants important to drug treatment outcomes have been found in the genes encoding the enzymes and transporters involved in drug pharmacokinetics, that is, absorption, distribution, metabolism and excretion (ADME), including the phase I and phase II enzymes [8,12].

Most phase I reactions are catalyzed by the cytochrome P450 (CYP) enzymes [52,53]. The CYP1, CYP2 and CYP3 families catalyze most phase I reactions of drugs, and over 75% of prescribed drugs are metabolized mainly by three subfamilies: CYP3A, CYP2D6 and CYP2C [54]. N-acetyltransferases 1 and 2 (NAT1 and NAT2), thiopurine S-methyltransferase (TPMT), and the uridine diphosphate glucuronosyltransferases (UGT) are major enzymes in phase II reactions [55].

The CYP3A family metabolizes over 37% of drugs, while CYP2C9, CYP2D6, CYP2C19, CYP1A2, CYP2C8, CYP2A6 and CYP2E1 metabolize 17%, 15%, 10%, 9%, 6%, 4% and 2%, respectively [56]. Polymorphisms in the CYP2C9, CYP2C19 and CYP2D6 genes are among the best established in relation to drug therapy [52]. The FDA currently approves molecular testing for many genes and has also approved labeling of many drugs with a strong recommendation to genotype patients before they are given certain drugs such as atomoxetine, thioridazine, voriconazole and irinotecan [16].

Microarrays

Several technologies, such as RFLP and DNA sequencing, have been used to identify single-gene mutations with pharmacogenetic applications [57]. More recently, microarrays, high-throughput automated DNA sequencing and genotyping, informatics, and mass spectrometry have been widely used in high-throughput pharmacogenetic studies [58].

The first commercial SNP array was released in 1996 by Affymetrix (Santa Clara, CA) and targeted about 1,500 human SNPs [59] out of the millions of SNPs characterized in the human genome [60]. It was followed by arrays from Illumina (San Diego, CA) and NimbleGen (Madison, WI), all of which are commercially available. Several statistical and computational tools have been developed to make the analysis of chip data more efficient [61,62].

Affymetrix® DMET™ Plus platform

The Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) offers a complete profile of 1,936 markers in 225 genes focused on drug metabolism. Genes included on the chip encompass transcription regulators, selected drug targets and several classes of drug transporters [13,17]. The polymorphisms and mutations on DMET Plus arrays were chosen because of their pharmacogenetic applications and their association with clinical outcome in patients [11,63]; these polymorphisms and their clinical applications have been discussed and approved by the pharmaceutical industry and academia (ADME Consortium; https://pharmaadme.org/). The DMET Plus Assay Panel has been tested across a minimum of 1,200 individuals from multiple populations, including 715 DNA samples from Caucasian, African, Japanese and Chinese populations from the International HapMap Consortium [20]. Key genes on the chip include VKORC1, cytochrome P450 (CYP) 2C9 and CYP4F2 [64], as well as CYP3A4 and CYP2C19 (Figure 2.5) [65].

Figure 2.5: Allelic frequencies of CYP2C19 variants across populations. The *1/*3 variant (noted in red) is found at much greater frequency in Chinese and Japanese populations. This information can help inform recruitment for trials examining this variant: if there is an indication that *1/*3 causes a PK issue, a very large trial would be needed if patients were recruited from anywhere other than Chinese or Japanese populations (Brandt et al., 2007).

Affymetrix® DMET™ Plus platform data analysis: DMET™ Console software is widely used for analyzing Affymetrix® DMET™ Plus platform data. The software translates the genotype results into star nomenclature [71], the style routinely used in pharmacogenomic research. DMET™ Console software is equipped with preset, analytically validated clusters for each SNP on the chip [20], which facilitates the translation of genotypes into clinical application [9].

The DMET case-control data analysis workflow [13] is summarized in the following steps and shown in Figure 2.7.

Figure 2.6: Molecular Inversion Probe (MIP). Each MIP is a 120 bp oligonucleotide with a unique gap fill for the SNP of interest. Each probe contains a unique tag (barcode) sequence corresponding to the interrogated SNP (Ji and Welc, 2009).

Figure 2.7: Workflow of a clinical bioinformatics experiment, from sample collection to data analysis, showing the flow of data in a typical DMET case-control data analysis.

- Collection of biological samples in preparation for the microarray experiments [13].

- Generation of raw microarray data (CEL data) [13].

- DMET data preprocessing: DMET Console software produces a table summarizing the detected SNPs and linking them to each sample.

- Detection of SNPs and assessment of their significance [13].

No-call values

Advances in high-throughput microarray technologies have made it possible to study and genotype millions of single nucleotide polymorphisms (SNPs) [72,73]. Several methods have been developed to detect genotyping errors or to remove their effects on analyses [74-76].

High-density SNP microarray chips have been implemented more successfully for humans than for other species, owing to the availability of the human genome sequence; genotyping error rates range from 0.05% to 5% [55]. For chips for other species, such as cattle microarrays, genotyping error rates can reach 20% (Su, 2005).

Most often, when a single SNP genotype is scored incorrectly, it prevents building an accurate haplotype, which leads to all of the data generated for that sample being ignored and thus to the loss of a large portion of the data (Yu et al., 2009). Genotyping errors and missing SNP genotypes are evident in the Phase I and Phase II HapMap data, where less than 20% of the data that failed quality control (QC) did so because of genotyping errors, while more than 65% of the markers showed missing data in over 20% of individuals [10,77]. Various methods for handling missing values in a dataset have been developed, including deletion, insertion of the sample mean, and linear regression [78]. Each approach presents problems inherent either in the method itself or in the pattern of missing data [79]. Correcting these data is expensive and can be done in two ways: repeating the genotyping, or adapting the data analysis tools to accommodate the missing data [80-83].
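
As a rough illustration of the per-marker missingness filtering discussed above, the sketch below computes the fraction of No-calls for each marker and the overall call rate. The data layout (a dictionary of marker names mapped to lists of calls, with `None` for a No-call), the marker names and the 20% default threshold are assumptions made for this example, not the format or settings of DMET Console.

```python
def marker_missing_rates(genotypes):
    """Fraction of No-calls (None) per marker."""
    return {marker: sum(call is None for call in calls) / len(calls)
            for marker, calls in genotypes.items()}

def overall_call_rate(genotypes):
    """Fraction of all genotype slots that received a call."""
    total = sum(len(calls) for calls in genotypes.values())
    called = sum(1 for calls in genotypes.values()
                 for call in calls if call is not None)
    return called / total

def markers_over_threshold(genotypes, max_missing=0.20):
    """Markers whose missing rate exceeds the chosen threshold."""
    rates = marker_missing_rates(genotypes)
    return sorted(m for m, rate in rates.items() if rate > max_missing)

# Hypothetical toy data: two markers typed in five samples.
toy = {
    "snp_a": ["AA", "AG", None, "GG", None],
    "snp_b": ["CC", "CT", "CT", "TT", "CC"],
}
flagged = markers_over_threshold(toy)
rate = overall_call_rate(toy)
```

Re-genotyping fills some of the `None` entries, so the same functions can be run before and after to see how the call rate changes.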

Large studies on the accuracy of haplotype inference methods have been carried out [38,84], but until recently there has been little investigation of the effect of missing data on these techniques [85].

There are two major causes of No-call genotypes. The first is poor DNA quality, which results in fluorescence signals that are not sufficiently strong relative to background. The second arises when an observation, i.e., a read-out of fluorescence signals, cannot be assigned clearly to any of the genotype clusters and is therefore subject to the 'No-call' procedure [10] (Figure 2.8).

Figure 2.8: The clustering results based on a one-marker-at-a-time method. Values on the X-axis and Y-axis are normalized signal intensities of two alternative alleles (A and C). Estimated genotypes “AA”, “AC”, and “CC” are indicated by symbols “0”, “1”, and “2”, respectively. Question marks represent missing values (i.e. No-calls) (Yu et al., 2009).
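
The two causes of a No-call described above (weak signal, and a signal that falls between clusters) can be sketched with a toy nearest-centroid caller. This is only an illustration of the idea; it is not the Affymetrix calling algorithm, and the cluster centres and both thresholds are hypothetical values chosen for the example.

```python
import math

# Hypothetical cluster centres in (allele A, allele C) normalized-intensity
# space, using the genotype codes from Figure 2.8: 0 = AA, 1 = AC, 2 = CC.
CENTROIDS = {0: (1.0, 0.0), 1: (0.5, 0.5), 2: (0.0, 1.0)}

def call_genotype(a_signal, c_signal, max_distance=0.2, min_total=0.3):
    """Return 0, 1 or 2, or None for a No-call.

    A No-call is returned when the total signal is too weak (assumed
    threshold min_total) or when the point is farther than max_distance
    from every cluster centre.
    """
    if a_signal + c_signal < min_total:
        return None  # cause 1: signal not strong enough over background
    point = (a_signal, c_signal)
    distances = {g: math.dist(point, centre) for g, centre in CENTROIDS.items()}
    best = min(distances, key=distances.get)
    if distances[best] > max_distance:
        return None  # cause 2: cannot be assigned clearly to any cluster
    return best
```

A point such as (0.75, 0.25) sits midway between the AA and AC centres and is left uncalled, which corresponds to the question marks in the figure.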

In addition, many types of genetic variation, such as rare SNPs and short deletions and insertions, are not detected well by genotyping technologies. For this reason a second genotyping technique should be applied, since a No-call may signal a novel SNP or another type of variation [86].

Materials and Methods

Sampling frame

Jordanian individuals donated blood samples from which DNA was prepared for the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA), with the aim of determining the pharmacogenomic profile of ADME genes in the Jordanian population. The goal is to define the frequency of any pathological haplotypes, with the hope of establishing molecular diagnostics for personalized medicine in Jordan.

Inclusion and exclusion criteria

Some of the samples yielded No-call genotypes at specific SNPs when run on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) at PHBC. The affected SNPs lie in important genes, including CDA, FMO2, SLC22A6, UGT1A1, UGT2B7, CYP1A1, SLC22A1, NAT2, ABCB1, ABCG2, SLC15A2, VKORC1, CYP1A2, SLC15A1 and TPMT. Validation of copy number variants and pseudogenes is excluded from this study.

Sampling procedure

All available samples that met our criteria were included in the study.

Sample size calculation

Sample size needed to satisfy the objectives was calculated using the following equation:

$$n = \left(\frac{SD \times t}{\Delta}\right)^{2}$$

where SD is the standard deviation, assumed to be 17.12 based on a study carried out at the University of Jordan; t is the critical value of the t-test at α (type I error) of 0.05; and Δ is the meaningful difference to be observed among different genotypes. The calculated sample size was 44, but we increased it to 101 to account for potential missing data.
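
The calculation can be sketched as follows. The text gives SD and the resulting n but not the values of t or Δ, so the critical value 1.96 used here is an assumption (the large-sample two-sided value at α = 0.05), and the Δ it produces is back-calculated from n = 44 rather than taken from the study.

```python
import math

def required_n(sd, t, delta):
    """Sample size from n = ((SD * t) / delta)^2 (not rounded)."""
    return ((sd * t) / delta) ** 2

def implied_delta(sd, t, n):
    """Meaningful difference implied by a given n under the same formula."""
    return (sd * t) / math.sqrt(n)

SD = 17.12      # from the text
T_CRIT = 1.96   # assumption: large-sample two-sided critical value, alpha = 0.05

# Difference implied by the reported sample size of 44 (about 5.06).
delta = implied_delta(SD, T_CRIT, 44)
n_check = required_n(SD, T_CRIT, delta)
```

Under these assumptions a reported n of 44 corresponds to a meaningful difference of roughly 5 units; a different choice of t would shift that figure.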

Hypothesis to be treated

The Jordanian population has its own private genetic variations in ADME genes that influence the effective therapeutic dosage of numerous drugs. No-call genotypes for several SNPs limit the Affymetrix® DMET™ Plus technology. Novel SNPs at the DMET probe positions might be the cause of the No-call results; for this reason we carried out extensive re-genotyping of the No-calls using DNA sequencing.

Institutional research board (IRB) approval

The study has been approved by the Institutional Research Board (IRB) (Appendix D).

Subjects

A total of 101 control DNA samples donated by Jordanian individuals were included in our study. These samples had been used to generate the chip data (Affymetrix DMET™ Plus platform) at PHBC by professional geneticists, and the data were then investigated in this study. Fifty samples had been collected at PHBC since 2010 and already had chip data; 51 samples were collected for this work (Figure 3.2). All volunteers completed a consent form in Arabic (Appendix E).

Figure 3.1: The PCR amplification protocol.

Figure 3.2: Summary of study design. (*) marks the steps performed by PHBC, after which the completed product was obtained for inclusion in our study.

Primer design

Initially, the No-call SNP genotypes were identified and their locations on the corresponding genes were determined. Two database websites, http://www.ensembl.org/index.html and http://www.ncbi.nlm.nih.gov/, were used in this process. The online Primer3 software (http://primer3.ut.ee/) was used to design PCR primers flanking the No-call SNPs obtained from the Affymetrix® DMET™ Plus platform. Primers were designed carefully so that, where possible, more than one No-call SNP fell within the same amplified region. All primers were synthesized at PHBC.
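
One way to decide which No-call SNPs can share an amplicon is a simple greedy grouping of their coordinates, sketched below. This is an illustration only, not the procedure used with Primer3 in this study; the coordinates and the 500 bp maximum span are hypothetical.

```python
def group_snps_into_amplicons(positions, max_span=500):
    """Greedily group SNP coordinates (same gene or chromosome) so that
    each group spans at most max_span bases and could be covered by a
    single PCR product.
    """
    groups = []
    for pos in sorted(positions):
        # Extend the current group while the new SNP stays within max_span
        # of the group's first SNP; otherwise start a new group.
        if groups and pos - groups[-1][0] <= max_span:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups

# Hypothetical coordinates of five No-call SNPs in one gene.
amplicons = group_snps_into_amplicons([3100, 1000, 1450, 1200, 3000])
```

Each group would then be passed to primer design as one target region, with extra flanking sequence added on both sides so that the primers themselves do not overlap the SNPs.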

Methods

The workflow includes sample collection, DNA extraction, polymerase chain reaction (PCR), gel electrophoresis, purification, DNA sequencing, sequencing analysis and statistical analysis.

Sample collection: After informed consent was signed, 3 ml venous blood samples were collected by venipuncture into 3 ml EDTA tubes from each volunteer, and the tubes were mixed thoroughly so that the EDTA would prevent clotting. This was done in accordance with the University Review Committee for Research on Humans at Jordan University of Science and Technology. The blood samples were stored at 4°C until genomic DNA was extracted from each.

DNA Extraction: Genomic DNA from a total of 51 blood samples was isolated from peripheral blood by using a commercially available kit (Qiagen, Germany).

DNA extraction procedure

1. For each 300 μl sample volume, 900 μl of RBC Lysis Solution was added to a sterile 1.5 ml microcentrifuge tube.

2. The tube of blood was gently rocked until thoroughly mixed; then transferred to the tube containing the RBC Lysis Solution. The tube was inverted 5–6 times to mix.

3. The mixture was incubated for 10 minutes at room temperature (inverted 2–3 times during the incubation) to lyse the red blood cells, and then was centrifuged at 13,000–16,000 × g for 20 seconds at room temperature.

4. As much of the supernatant as possible was removed and discarded without disturbing the visible white pellet. Approximately 10–20 μl of residual liquid remained in the 1.5 ml tube.

5. The tube was vortexed vigorously until the white blood cells were resuspended (10–15 seconds) to obtain efficient cell lysis.

6. About 300 μl of Cell Lysis Solution was added to the tube containing the resuspended cells. The solution was pipetted 5–6 times to lyse the white blood cells until the solution became very viscous. If clumps of cells were visible after mixing, the solution was incubated at 37°C until the clumps were disrupted.

7. About 100 μl of Protein Precipitation Solution was added to the nuclear lysate and vortexed vigorously for 10–20 seconds.

8. The mixture was centrifuged at 13,000–16,000 × g for 3 minutes at room temperature. A dark brown protein pellet was visible.

9. The supernatant was transferred to a clean 1.5 ml microcentrifuge tube containing 300 μl of room-temperature isopropanol.

Note: Some supernatant may remain in the original tube containing the protein pellet. This residual liquid was left in the tube to avoid cross-contamination of the DNA solution with the precipitated protein.

10. The solution was gently mixed by inversion until the white thread-like strands of DNA formed a visible mass.

11. The mixture was centrifuged at 13,000–16,000 × g for 1 minute at room temperature. The DNA was visible as a small white pellet.

12. The supernatant was decanted and one sample volume of room temperature 70% ethanol was added to the DNA. The tube was gently inverted several times to wash the DNA pellet and then step 11 was repeated.

13. Ethanol was carefully aspirated using either a drawn Pasteur pipette or a sequencing pipette tip. The DNA pellet was very loose at this point and care was taken to avoid aspirating the pellet into the pipette. The tube was inverted onto a clean absorbent paper and the pellet was air-dried for 10–15 minutes.

14. About 100 μl of DNA Rehydration Solution was added to the tube and the DNA was rehydrated by incubation at 65°C for 1 hour. The solution was periodically mixed by gently tapping the tube. Alternatively, the DNA was rehydrated by incubating the solution overnight at room temperature or at 4°C.

15. DNA was stored at 2–8°C.

Quantification of isolated genomic DNA: The DNA yield was measured using a NanoDrop 1000 spectrophotometer (Thermo Scientific). After mixing, 1 μl of each sample was transferred onto the lower measurement pedestal of the instrument, and the concentration of each sample was measured in ng/μl using the operating software on the computer. The TE buffer used for rehydration after DNA extraction served as the blank.
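The arithmetic behind a spectrophotometric reading can be sketched as follows. This is a minimal illustration, not the instrument's software: it assumes the widely used conversion of 50 ng/μl of double-stranded DNA per A260 unit (10 mm path length), and the function names and example readings are hypothetical.

```python
# Assumption: 1.0 A260 unit (10 mm path) ~ 50 ng/ul of double-stranded DNA.
DSDNA_FACTOR_NG_PER_UL = 50.0


def dna_concentration(a260: float, dilution: float = 1.0) -> float:
    """Return the dsDNA concentration in ng/ul from a path-normalised A260."""
    if a260 < 0 or dilution <= 0:
        raise ValueError("A260 must be >= 0 and dilution must be > 0")
    return a260 * DSDNA_FACTOR_NG_PER_UL * dilution


def total_yield_ug(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Return the total DNA mass in micrograms for a given sample volume."""
    return conc_ng_per_ul * volume_ul / 1000.0


def purity_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 ratio, a common indicator of protein carry-over."""
    if a280 <= 0:
        raise ValueError("A280 must be > 0")
    return a260 / a280


if __name__ == "__main__":
    conc = dna_concentration(2.0)           # hypothetical reading
    print(conc, total_yield_ug(conc, 100))  # 100 ul rehydration volume
```

With the 100 μl rehydration volume used in step 14, a hypothetical reading of 100 ng/μl would correspond to 10 μg of DNA in the tube.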

Primer optimization: We performed gradient and touchdown PCR reactions using the DNA samples to define the best annealing temperature for each primer pair (Table 3.1).

Table 3.1: PCR Primers, Tm and Product Size.

Polymerase chain reaction (PCR): Twenty pairs of primers were tested and used to amplify genomic DNA from the 101 DNA samples. Primer sequences and their expected product sizes are listed in Table 3.1 and Figure 3.3.

For each primer pair, amplification was carried out using GoTaq® Green Master Mix (Promega, USA). Each PCR reaction was carried out in a total volume of 26 μl containing 12.5 μl of PCR master mix (Promega, USA), 8.5 μl of nuclease-free water, 1.3 μl (5–10 pmol/μl) of each primer (Table 3.1) and 2–3 μl of DNA. The amplification protocol was: initial denaturation for 10 minutes at 95°C, followed by 35 cycles of denaturation at 95°C for 30 seconds, annealing for 30 seconds at the primer-specific temperature determined in the optimization step (Table 3.1), and extension at 72°C for 45 seconds, with a final extension of 10 minutes at 72°C (S1000™ Thermal Cycler, Bio-Rad) (Figure 3.1).
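The per-reaction volumes and cycling times given above lend themselves to a small calculation when many reactions are set up at once. The sketch below scales the stated volumes into a shared premix (template DNA is added per tube) and sums the programmed hold times. The 10% pipetting overage and all function names are assumptions made for illustration; they are not taken from the protocol.

```python
# Per-reaction volumes (ul) as stated in the text; template DNA (2-3 ul)
# is added to each tube separately and is therefore not part of the premix.
PER_REACTION_UL = {
    "GoTaq Green Master Mix": 12.5,
    "nuclease-free water": 8.5,
    "forward primer": 1.3,
    "reverse primer": 1.3,
}


def premix_per_tube() -> float:
    """Volume of premix (ul) to dispense into each tube before adding DNA."""
    return round(sum(PER_REACTION_UL.values()), 2)


def premix_volumes(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale each component for n reactions plus a pipetting overage.

    The 10% default overage is a hypothetical convenience value.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


def hold_time_minutes(cycles: int = 35, denature_s: int = 30, anneal_s: int = 30,
                      extend_s: int = 45, initial_s: int = 600,
                      final_s: int = 600) -> float:
    """Sum of programmed hold times in minutes (ramping time is ignored)."""
    return (initial_s + cycles * (denature_s + anneal_s + extend_s) + final_s) / 60.0


if __name__ == "__main__":
    print(premix_per_tube())
    print(premix_volumes(10))
    print(hold_time_minutes())
```

Ramping between temperatures adds to the programmed hold time, so the actual run on a given thermal cycler will be longer than the figure returned by `hold_time_minutes`.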

Figure 3.3.1:

Gel electrophoresis: 5 μl of each PCR product was loaded into the wells of a 2% agarose gel, alongside 5 μl of a 100 bp ladder (Fermentas). The gel was run in 1X Tris-borate-EDTA (TBE) running buffer at a constant voltage (140 V) for 40 minutes. 5 μl of 10 mg/ml ethidium bromide (Bio-Rad, USA), an intercalating agent, was added to the buffer to stain the DNA for visualization under a UV transilluminator equipped with a gel documentation system (Bio-Rad, USA). The approximate PCR product size was determined by comparing the bands with the known bands of the ladder.

PCR product purification: Before sequencing, the PCR products were purified to remove impurities such as nucleotides, unincorporated primers, enzyme and Mg2+. For this purpose, the EZ-10 Spin® Column PCR Purification Kit (Bio Basic, Canada) was used according to the manufacturer's instructions. The PCR product concentration was measured with a NanoDrop 1000 (Thermo Scientific, Massachusetts, USA).
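Sizing a band against a ladder is usually done by eye, but the same comparison can be expressed numerically. The sketch below uses the common semi-log approximation, in which log10 of fragment size is roughly linear in migration distance between two neighbouring ladder bands. The migration distances are hypothetical values chosen for illustration, not measurements from this study.

```python
import math


def estimate_size_bp(distance_mm: float, ladder: list) -> float:
    """Estimate fragment size by semi-log interpolation between ladder bands.

    ladder: list of (size_bp, migration_distance_mm) tuples with distinct
    distances. Larger fragments migrate shorter distances.
    """
    bands = sorted(ladder, key=lambda band: band[1])  # order by distance
    for (size_a, dist_a), (size_b, dist_b) in zip(bands, bands[1:]):
        if dist_a <= distance_mm <= dist_b:
            frac = (distance_mm - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + frac * (
                math.log10(size_b) - math.log10(size_a)
            )
            return 10 ** log_size
    raise ValueError("band lies outside the range covered by the ladder")


if __name__ == "__main__":
    # Hypothetical migration distances for a few bands of a 100 bp ladder.
    ladder = [(1000, 20.0), (500, 30.0), (400, 33.0), (100, 50.0)]
    print(round(estimate_size_bp(31.5, ladder)))
```

An estimate of this kind only confirms that a product is close to its expected size; the exact sequence and length come from the sequencing step.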

Procedure for purification of PCR product

1. The PCR reaction mixture was transferred to a 1.5 ml microcentrifuge tube and 5 volumes of Binding Buffer III was added to it.

2. The column was placed into a 2.0 ml collection tube and the resulting mixture was transferred to the column. The column was left to stand at room temperature for 2 minutes and was then spun at 8000 rpm for 1 minute.

3. The flow-through was discarded. About 500 μl of Wash Solution was added to the column, which was spun at 10,000 × g for 1 minute. The flow-through was discarded and the column was placed back into the same collection tube.

4. Another 500 μl of Wash Solution was added to the column, which was spun at 10,000 × g for 1 minute. The flow-through was discarded and the column was spun once more to remove residual Wash Solution.

5. The column was then transferred to a clean 1.5 ml microtube. About 30–50 μl of Elution Buffer was added onto the center of the column membrane. The column was incubated at 50°C for 2 minutes and then spun at 10,000 × g for 1 minute.

The purified PCR product was then ready to be used in the sequencing PCR, or was kept at −20°C until needed.
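The purification steps above give one spin in rpm (8000 rpm) and the others in × g. The two are related through the rotor radius by the standard formula RCF = 1.118 × 10⁻⁵ × r (cm) × rpm². The sketch below applies it; the 7 cm radius is a hypothetical value, since the centrifuge and rotor are not specified.

```python
import math

# Standard constant for RCF = K * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius must be > 0")
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be >= 0 and radius must be > 0")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius = 7.0  # hypothetical rotor radius in cm
    print(round(rcf_from_rpm(8000, radius)))   # RCF of the 8000 rpm spin
    print(round(rpm_from_rcf(10000, radius)))  # rpm needed for 10,000 x g
```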

PCR sequencing: Depending on the visualized purified PCR product, 1–3 μl of the purified product was combined with 1 μl of R-R mix (as provided in the BigDye® Terminator v3.1 Cycle Sequencing Kit; Applied Biosystems, Foster City, CA, USA), 5 pmol of forward or reverse primer, 4 μl of BigDye Terminator v1.1/v3.1 sequencing buffer (5X), and 4 μl of nuclease-free water. The reaction tubes were then loaded into a thermal cycler and run through the PCR-sequencing amplification protocol.

Cleaning of the PCR sequencing: The PCR-sequencing product was then cleaned using a specially designed kit (NucleoSEQ®, supplied by Applied Bio, USA), according to the manufacturer’s instructions:

After nuclease-free water was added to the agarose columns, the columns were centrifuged for 3 minutes at 1500 rpm. The PCR product was then loaded at the center of the column and centrifuged under the same conditions to elute the product into a sterile microfuge tube. Finally, 50 μl of nuclease-free water was added to each tube and mixed. Subsequently, 20 μl of the cleaned sequencing PCR product was loaded onto the 3130xl Genetic Analyzer capillary electrophoresis system (Applied Biosystems, Foster City, California, USA).

Analysis of the samples’ electropherograms: The resulting sequences were aligned against the standard reference sequence for each primer product, according to the sample inclusion criteria. The differences noted in each sample were reported by SNP position on the Affymetrix® DMET™ microarray. The alignments were also examined for any sequence features that might cause the No-call SNP genotypes, screened for other variations, and used to validate the DMET GeneChip against Call genotypes.

Statistical analysis: Call rates were measured for each marker included in this study, before and after re-genotyping. The error rate was also determined through the validation process (Figure 3.2,3.3).
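The comparison of a sample read against its reference can be illustrated with a short sketch. It assumes the read has already been aligned to the reference (equal length, no gaps) and that heterozygous positions were recorded as IUPAC ambiguity codes, which is a common base-calling convention rather than something stated in the text. The function names and sequences are hypothetical.

```python
# IUPAC codes for single bases and two-base (heterozygous) mixtures.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}


def genotype(base_call: str):
    """Translate one base call into a two-allele genotype string.

    Returns None for calls that cannot be resolved (for example 'N').
    """
    alleles = IUPAC.get(base_call.upper())
    if alleles is None:
        return None
    return alleles * 2 if len(alleles) == 1 else alleles


def find_variants(reference: str, read: str) -> list:
    """List (1-based position, reference base, genotype) where the read
    differs from a pre-aligned reference of the same length."""
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for pos, (ref_base, call) in enumerate(
        zip(reference.upper(), read.upper()), start=1
    ):
        called = genotype(call)
        if called != ref_base * 2:
            variants.append((pos, ref_base, called))
    return variants


if __name__ == "__main__":
    # Hypothetical fragment: heterozygous G/A at position 3.
    print(find_variants("ACGT", "ACRT"))
```

Positions returned with a genotype of `None` are those where the read itself could not be resolved, so they would remain No-calls after sequencing.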
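The call rate and validation error rate described here reduce to simple proportions. The sketch below shows one way to compute them, treating a No-call as `None`. The genotype lists are toy values for illustration only and are not data from this study.

```python
def call_rate(genotypes) -> float:
    """Percentage of genotypes that are called (not None)."""
    calls = list(genotypes)
    if not calls:
        raise ValueError("no genotypes supplied")
    called = sum(1 for g in calls if g is not None)
    return 100.0 * called / len(calls)


def merge_calls(array_calls, sequencing_calls) -> list:
    """Fill array No-calls with the sequencing genotype where one exists."""
    return [a if a is not None else s
            for a, s in zip(array_calls, sequencing_calls)]


def discordance_rate(array_calls, sequencing_calls) -> float:
    """Percentage of positions called by both methods that disagree."""
    pairs = [(a, s) for a, s in zip(array_calls, sequencing_calls)
             if a is not None and s is not None]
    if not pairs:
        raise ValueError("no positions called by both methods")
    return 100.0 * sum(1 for a, s in pairs if a != s) / len(pairs)


if __name__ == "__main__":
    # Toy example: four samples at one marker.
    array = ["AA", None, "AG", None]
    sequencing = ["AA", "GG", "AG", None]
    print(call_rate(array))
    print(call_rate(merge_calls(array, sequencing)))
    print(discordance_rate(array, sequencing))
```

In this toy case the call rate rises from 50% to 75% after merging, and the two methods agree at every position they both call.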

Figure 3.3.2:

Figure 3.3.3:

Figure 3.3.4:

Figure 3.3.5:

Figure 3.3.6:

Figure 3.3.7:

Results

Study’s participants

A total of 101 DNA samples from healthy Jordanians were collected and hybridized on the Affymetrix® DMET™ Plus platform (Affymetrix, Santa Clara, CA, USA). Sixty-six variants were detected, 39 of which were No-call target SNPs. These SNPs and others (such as SNPs that appeared in the sequencing results and have probes on the Affymetrix® DMET™ Plus platform) were used to complete the missing genotypes and/or to validate the Affymetrix® GeneChip. Novel variations in these regions are also of considerable interest, because the regions lie in important genes. Blood samples were collected, and genomic DNA was isolated from peripheral blood lymphocytes. PCR was performed to amplify the regions of interest for use in genetic tests.

Polymerase chain reaction (PCR) results

More than 500 PCR reactions were carried out using pairs of primers to amplify 20 different regions in DMET genes. PCR was followed by 2% agarose gel electrophoresis to visualize the PCR product, check the size of the amplified target regions, and ensure that there was no contamination. The amplified product size for SLC22A1-region1 was 528 bp (Figure 4.1), for SLC22A1-region2 was 556 bp (Figure 4.2), for ABCG2-region1 was 660 bp (Figure 4.3), for CYP1A2-region1 was 785 bp (Figure 4.4), for TPMT-region1 was 500 bp (Figure 4.5), for TPMT-region2 was 510 bp (Figure 4.6), for VKORC1-region2 was 567 bp (Figure 4.7), for SLC15A1-region1 was 495 bp (Figure 4.8), for CYP1A1-region1 was 433 bp (Figure 4.9), for FMO2-region1 was 519 bp (Figure 4.10), for NAT2-region1 was 878 bp (Figure 4.11), for SLC15A2-region1 was 429 bp (Figure 4.12), for CDA-region1 was 477 bp (Figure 4.13), for UGT1A1-region1 was 599 bp (Figure 4.14), for UGT2B7-region1 was 562 bp (Figure 4.15), for SLC22A6-region1 was 558 bp (Figure 4.16), for ABCB1-region1 was 578 bp (Figure 4.17), for ABCB1-region2 was 620 bp (Figure 4.18), for ABCB1-region3 was 549 bp (Figure 4.19) and for ABCB1-region4 was 580 bp (Figure 4.20).

Figure 4.1: The separation of PCR product of SLC22A1-region1 by 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-12: the 528bp amplified products for samples that have No-call data in AM_14350 and AM_14348 probe ID codes.

Figure 4.2: The separation of the PCR product of SLC22A1-region2 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-12: the 556bp amplified products for samples that have No-call data in AM_14361 and AM_14363 probe ID codes.

Figure 4.3: The separation of the PCR product of ABCG2-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-13: the 660bp amplified products for samples that have No-call data in AM_13689 probe ID code.

Figure 4.4: The separation of the PCR product of CYP1A2-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-5: the 785bp amplified products for samples that have No-call data in AM_10783, AM_10785 and AM_10784 probe ID codes.

Figure 4.5: The separation of the PCR product of TPMT-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-13: the 500 bp amplified products for samples that have No-call data in AM_13973 probe ID code.

Figure 4.6: The separation of the PCR product of TPMT-region2 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-6: the 510bp amplified products for samples that have No-call data in AM_13980 probe ID code.

Figure 4.7: The separation of the PCR product of VKORC1-region2 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-2: the 567bp amplified products for samples that have No-call data in AM_11045 probe ID code.

Figure 4.8: The separation of the PCR products of SLC15A1-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-13: the 480bp amplified products for samples that have No-call data in AM_10650 and AM_10647 probe ID codes.

Figure 4.9: The separation of the PCR products of CYP1A1-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-6: the 433bp amplified products for samples that have No-call data in AM_10768, AM_10769, AM_10766 and AM_10771 probe ID codes.

Figure 4.10: The separation of the PCR products of FMO2-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-6: the 519bp amplified products for samples that have No-call data in AM_11959 and AM_11958 probe ID codes.

Figure 4.11: The separation of the PCR products of NAT2-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-9: the 878bp amplified products for samples that have No-call data in AM_14998, AM_15000, AM_15001, AM_15005, AM_15006, AM_15007, AM_15008 and AM_15010 probe ID codes.

Figure 4.12: The separation of the PCR products of SLC15A2-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-6: the 429 bp amplified products for samples that have No-call data in AM_13301 probe ID code.

Figure 4.13: The separation of the PCR products of CDA-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-18: the 477bp amplified products for samples that have No-call data in AM_11499 probe ID code.

Figure 4.14: The separation of the PCR products of UGT1A1-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-14: the 599bp amplified products for samples that have No-call data in AM_13067, AM_13070 and AM_13068 probe ID codes.

Figure 4.15: The separation of the PCR products of UGT2B7-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-15: the 562bp amplified products for samples that have No-call data in AM_13458 and AM_13459 probe ID codes.

Figure 4.16: The separation of the PCR products of SLC22A6-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-6: the 558bp amplified products for samples that have No-call data in AM_10347 probe ID code.

Figure 4.17: The separation of the PCR products of ABCB1-region1 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-9: the 578bp amplified products for samples that have No-call data in AM_14592 probe ID code.

Figure 4.18: The separation of the PCR products of ABCB1-region2 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-5: the 620bp amplified products for samples that have No-call data in AM_14609 and AM_14612 probe ID codes.

Figure 4.19: The separation of the PCR products of ABCB1-region3 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-8: the 549bp amplified products for samples that have No-call data in AM_14628 probe ID code.

Figure 4.20: The separation of the PCR products of ABCB1-region4 using 2% agarose gel. Lane M: 100bp ladder. Lane –ve: the negative control. Lanes 1-6: the 580bp amplified products for samples that have No-call data in AM_14633 probe ID code.

DNA sequencing results

After the PCR products were visualized on gels, the remaining amplified products were purified and amplified again in the sequencing PCR, using either the forward or the reverse primer that had been used in the PCR. DNA was sequenced on a 3130/3130xl Genetic Analyzer (Applied Biosystems).

A total of 64 variations were identified in 20 different amplified regions of 15 ADME genes among the 101 samples. Thirty-nine of them were the targets of this study, appearing as No-call genotypes on the Affymetrix DMET™ Plus platform (Affymetrix, Santa Clara, CA, USA) in some samples and as Call genotypes in others. Table 4.1 lists these SNPs with their identifiers in the Ensembl and NCBI databases, their common names (which describe their location and type), and the number of reactions in our study that covered No-call samples. Figures 4.21–4.53 show the corresponding genotype differences (partial sequencing electropherograms) relative to the reference sequences; we used the Ensembl database (http://asia.ensembl.org/index.html). Accordingly, the call rate for these 39 No-call probes improved from 89.08% to 95.56% (Table 4.7), which also gives the details for each marker.
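Because either the forward or the reverse primer could be used for sequencing, a read obtained with the reverse primer shows the complement of the coding-strand bases. For example, Figure 4.21 reports c.79A>C in CDA as a T→G change in the reverse-primer read. A minimal sketch of that conversion follows; the function names are hypothetical.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the base-by-base complement of a sequence."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the opposite strand read 5'->3'."""
    return complement(seq)[::-1]


def to_coding_strand(observed_ref: str, observed_alt: str) -> tuple:
    """Convert a single-base change seen in a reverse-primer read into the
    equivalent change on the coding strand."""
    return complement(observed_ref), complement(observed_alt)


if __name__ == "__main__":
    # T->G observed with the reverse primer corresponds to A>C (c.79A>C).
    print(to_coding_strand("T", "G"))
    print(reverse_complement("ATGC"))
```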

Figure 4.21: Partial sequencing electropherogram of the CDA gene using primers that cover one of the No-call SNPs, showing c.79A>C (K27Q), which has probe ID code AM_11499 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note: the reverse primer was used in the sequencing PCR step. A. The wild-type allele shows a conserved T at position c.79A in sample DM-10-024. The vertical arrow points to the conserved nucleotide T (red peak). B. CDA allele showing a heterozygous T→G in sample DM-10-023. The vertical arrow points to the changed nucleotide T→G (red/black peak). C. CDA allele showing a homozygous T→G in sample DM-10-015 at position c.79A. The vertical arrow points to the changed nucleotide G (black peak). D. Part of the reference sequence of the CDA gene exon; the base in red font marks the location of the nucleotide change c.79A. The reference CDA gene sequence is taken from the Ensembl database, where this SNP is listed as rs2072671: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=1:20915201-20916201;v=rs2072671;vdb=variation;vf=1645282

Figure 4.22: Partial sequencing electropherogram of the UGT2B7 gene using primers that cover one of the No-call SNPs, showing c.327G>A (promoter), which has probe ID code AM_13458 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved A at position c.327G in sample DM-10-035. The vertical arrow points to the conserved nucleotide A (green peak). B. UGT2B7 allele showing a heterozygous A→G in sample DM-10-001. The vertical arrow points to the changed nucleotide A→G (green/black peak). C. UGT2B7 allele showing a homozygous A→G in sample DM-12-052 at position c.327G. The vertical arrow points to the changed nucleotide G (black peak). D. Part of the reference sequence of the UGT2B7 gene exon; the base in red font marks the location of the nucleotide change c.327G. The reference UGT2B7 gene sequence is taken from the Ensembl database, where this SNP is listed as rs7662029: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=4:69961412-69962412;v=rs7662029;vdb=variation;vf=5391360

Figure 4.23: Partial sequencing electropherogram of the UGT2B7 gene using primers that cover one of the No-call SNPs, showing c.161C>T (promoter), which has probe ID code AM_13459 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position c.161C in sample DM-10-026. The vertical arrow points to the conserved nucleotide T (red peak). B. UGT2B7 allele showing a heterozygous T→C in sample DM-12-024. The vertical arrow points to the changed nucleotide T→C (red/blue peak). C. UGT2B7 allele showing a homozygous T→C in sample DM-12-012 at position c.161C. The vertical arrow points to the changed nucleotide C (blue peak). D. Part of the reference sequence of the UGT2B7 gene exon; the base in red font marks the location of the nucleotide change at position c.161C. The reference UGT2B7 gene sequence is taken from the Ensembl database, where this SNP is listed as rs7668258: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=4:69961412-69962412;v=rs7662029;vdb=variation;vf=5391360

Figure 4.24: Partial sequencing electropherogram of the UGT1A1 gene using primers that cover one of the No-call SNPs, showing c.*211C>T (3’UTR), which has probe ID code AM_13067 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note: the reverse primer was used in the sequencing PCR step. A. The wild-type allele shows a conserved G at position c.*211C in sample DM-10-046. The vertical arrow points to the conserved nucleotide G (black peak). B. UGT1A1 allele showing a heterozygous G→A in sample DM-12-027. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. UGT1A1 allele showing a homozygous G→A in sample DM-12-024 at position c.*211C. The vertical arrow points to the changed nucleotide A (green peak). D. Part of the reference sequence of the UGT1A1 gene 3’UTR; the base in red font marks the location of the nucleotide change at position c.*211C. The reference UGT1A1 gene sequence is taken from the Ensembl database, where this SNP is listed as rs10929303: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000241635;r=2:234526291-234681956;t=ENST00000373450

Figure 4.25: Partial sequencing electropherogram of the UGT1A1 gene using primers that cover one of the No-call SNPs, showing c.*440C>G (3’UTR), which has probe ID code AM_13070 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved C at position c.*440C in sample DM-12-024. The vertical arrow points to the conserved nucleotide C (blue peak). B. UGT1A1 allele showing a heterozygous C→G in sample DM-10-007. The vertical arrow points to the changed nucleotide C→G (blue/black peak). C. UGT1A1 allele showing a homozygous C→G in sample DM-10-019 at position c.*440C. The vertical arrow points to the changed nucleotide G (black peak). D. Part of the reference sequence of the UGT1A1 gene 3’UTR; the base in red font marks the location of the nucleotide change at position c.*440C. The reference UGT1A1 gene sequence is taken from the Ensembl database, where this SNP is listed as rs8330: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000241635;r=2:234526291-234681956;t=ENST00000373450

Figure 4.26: Partial sequencing electropherogram of the UGT1A1 gene using primers that cover one of the No-call SNPs, showing c.*339C>G (3’UTR), which has probe ID code AM_13068 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note: the reverse primer was used in the sequencing PCR step. A. UGT1A1 allele showing a heterozygous G→C in sample DM-10-015. The vertical arrow points to the changed nucleotide G→C (black/blue peak). B. UGT1A1 allele showing a homozygous G→C in sample DM-10-035 at position c.*339C. The vertical arrow points to the changed nucleotide C (blue peak). C. Part of the reference sequence of the UGT1A1 gene 3’UTR; the base in red font marks the location of the nucleotide change at position c.*339C. The reference UGT1A1 gene sequence is taken from the Ensembl database, where this SNP is listed as rs1042640: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000241635;r=2:234526291-234681956;t=ENST00000373450

Figure 4.27: Partial sequencing electropherogram of the SLC15A2 gene using primers that cover one of the No-call SNPs, showing c.1161A>G (A387A), which has probe ID code AM_13301 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved A at position c.1161 in sample DM-12-047. The vertical arrow points to the conserved nucleotide A (green peak). B. SLC15A2 allele showing a heterozygous A→G in sample DM-10-015. The vertical arrow points to the changed nucleotide A→G (green/black peak). C. SLC15A2 allele showing a homozygous A→G in sample DM-10-060 at position c.1161A. The vertical arrow points to the changed nucleotide G (black peak). D. Part of the reference sequence of the SLC15A2 gene exon; the base in red font marks the location of the nucleotide change at position c.1161A. The reference SLC15A2 gene sequence is taken from the Ensembl database, where this SNP is listed as rs1143670: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000163406;r=3:121612936-121662949;t=ENST00000489711

Figure 4.28: Partial sequencing electropherogram of the FMO2 gene using primers that cover one of the No-call SNPs, showing 23238T>C (X472Q), which has probe ID code AM_11958 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position 23238 in sample DM-12-052. The vertical arrow points to the conserved nucleotide T (red peak). B. FMO2 allele showing a heterozygous T→C in sample DM-10-054. The vertical arrow points to the changed nucleotide T→C (red/blue peak). C. Part of the reference sequence of the FMO2 gene exon; the base in red font marks the location of the nucleotide change at position 23238T. The reference FMO2 gene sequence is taken from the Ensembl database, where this SNP is listed as rs6661174: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000094963;r=1:171154347-171181822;t=ENST00000441535

Figure 4.29: Partial sequencing electropherogram of the SLC22A6 gene using primers that cover one of the No-call SNPs, showing c.149G>A (R50H), which has probe ID code AM_10347 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved G at position c.149G in sample DM-10-031. The vertical arrow points to the conserved nucleotide G (black peak). B. Part of the reference sequence of the SLC22A6 gene exon; the base in red font marks the location of the nucleotide change at position c.149G. The reference SLC22A6 gene sequence is taken from the Ensembl database, where this SNP is listed as rs45564337: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000197901;r=11:62703857-62752455;t=ENST00000377871

Figure 4.30: Partial sequencing electropherogram of the TPMT gene using primers that cover one of the No-call SNPs, showing c.719A>G (Y240C), which has probe ID code AM_13973 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved A at position c.719A in sample DM-10-027. The vertical arrow points to the conserved nucleotide A (green peak). B. TPMT allele showing a heterozygous A→G in sample DM-10-017. The vertical arrow points to the changed nucleotide A→G (green/black peak). C. Part of the reference sequence of the TPMT gene exon; the base in red font marks the location of the nucleotide change at position c.719A. The reference TPMT gene sequence is taken from the Ensembl database, where this SNP is listed as rs1142345: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000197901;r

Figure 4.31: Partial sequencing electropherogram of the TPMT gene using primers that cover one of the No-call SNPs, showing c.460G>A (A154T), which has probe ID code AM_13980 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved G at position c.460G in sample DM-10-017. The vertical arrow points to the conserved nucleotide G (black peak). B. TPMT allele showing a heterozygous G→A in sample DM-10-037. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. Part of the reference sequence of the TPMT gene exon; the base in red font marks the location of the nucleotide change at position c.460G. The reference TPMT gene sequence is taken from the Ensembl database, where this SNP is listed as rs1800460: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000197901;r=11:62703857-62752455;t=ENST00000377871

Figure 4.32: Partial sequencing electropherogram of the SLC22A1 gene using primers that cover one of the No-call SNPs, showing c.181C>T (R61C), which has probe ID code AM_14348 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved C at position c.181C in sample DM-10-060. The vertical arrow points to the conserved nucleotide C (blue peak). B. SLC22A1 allele showing a heterozygous C→T in sample DM-12-035. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. SLC22A1 allele showing a homozygous C→T in sample DM-10-015 at position c.181C. The vertical arrow points to the changed nucleotide T (red peak). D. Part of the reference sequence of the SLC22A1 gene exon; the base in red font marks the location of the nucleotide change at position c.181C. The reference SLC22A1 gene sequence is taken from the Ensembl database, where this SNP is listed as rs12208357: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=6:160542648-160543648;v=rs12208357;vdb=variation;vf=8477839

Figure 4.33: Partial sequencing electropherogram of the SLC22A1 gene using primers that cover one of the No-call SNPs, showing c.262T>C (C88R), which has probe ID code AM_14350 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position c.262T in sample DM-12-035. The vertical arrow points to the conserved nucleotide T (red peak). B. Part of the reference sequence of the SLC22A1 gene exon; the base in red font marks the location of the nucleotide change at position c.262T. The reference SLC22A1 gene sequence is taken from the Ensembl database, where this SNP is listed as rs55918055: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=6:160542648-160543648;v=rs12208357;vdb=variation;vf=8477839

Figure 4.34: Partial sequencing electropherogram of the SLC22A1 gene using primers that cover two of the No-call SNPs, showing c.1201G>A (G401S) (blue vertical arrow) and c.1222G>A (red vertical arrow), which have probe ID codes AM_14361 and AM_14363, respectively, on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved G at positions c.1201G and c.1222G in sample DM-11-015. The vertical arrows point to the conserved nucleotide G (black peak). B. Part of the reference sequence of the SLC22A1 gene exon; the bases in blue and red font mark the locations of the nucleotide changes at positions c.1201G and c.1222G. The reference SLC22A1 gene sequence is taken from the Ensembl database, where these SNPs are listed as rs45512393 and rs628031: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.35: Partial sequencing electropherogram of the CYP1A2 gene using primers that cover two of the No-call SNPs, showing -739T>G (promoter) (blue vertical arrow) and -729C>T (promoter) (red vertical arrow), which have probe ID codes AM_10783 and AM_10784, respectively, on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position -739T and a conserved C at position -729C in sample DM-10-030. The vertical arrows point to the conserved nucleotides T (red peak) and C (blue peak). B. CYP1A2 allele showing a heterozygous T→C in sample DM-10-029. The vertical arrow points to the changed nucleotide T→C (red/blue peak), whereas the second SNP remains wild type. C. Part of the reference sequence of the CYP1A2*1K gene promoter; the bases in blue and red font mark the locations of the nucleotide changes at positions -739T and -729C. The reference CYP1A2 gene sequence is taken from the Ensembl database, where these SNPs are listed as rs2069526 and rs12720461: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000140505;r=15:75041185-75048543;t=ENST00000343932

Figure 4.36: Partial sequencing electropherogram of the CYP1A2 gene using primers that cover one of the No-call SNPs, showing -163C>A (promoter), which has probe ID code AM_10785 on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved C at position -163C in sample DM-10-030. The vertical arrow points to the conserved nucleotide C (blue peak). B. CYP1A2 allele showing a homozygous C→A in sample DM-10-015 at position -163C. The vertical arrow points to the changed nucleotide A (green peak). C. Part of the reference sequence of the CYP1A2 gene promoter; the base in red font marks the location of the nucleotide change at position -163C. The reference CYP1A2 gene sequence is taken from the Ensembl database, where this SNP is listed as rs762551: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000140505;r=15:75041185-75048543;t=ENST00000343932

Figure 4.37: Partial sequencing electropherogram of the CYP1A1 gene, using primers that cover three of the No-call SNPs, showing 2460C>A>T (green vertical arrow), 2454A>G (blue vertical arrow) and 2460C>A (red vertical arrow), which have probe IDs AM_10766, AM_10768 and AM_10769, respectively, in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. The wild-type allele shows a conserved G at position 2460C, a conserved T at position 2454A and a conserved G at position 2460C in sample DM-12-024. Vertical arrows point to the conserved nucleotides G (black peaks) and T (red peak). B. CYP1A1 allele showing a heterozygous T→C and G→T (second and third SNPs, respectively) in sample DM-12-010, whereas the first SNP remains wild type. Vertical arrows point to the changed nucleotides in these SNPs. C. Part of the reference sequence of the CYP1A1 gene exon; the bases in green, blue and red font mark the locations of the nucleotide changes at positions 2460C, 2454A and 2460C. The reference CYP1A1 gene sequence above is from the Ensembl database, where these SNPs are listed under rs41279188, rs1048943 and rs1799814: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000140465;r=15:75011883-75017951;t=ENST00000379727
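
Because the read in Figure 4.37 comes from the reverse primer, the trace shows the strand complementary to the one used in the SNP names, which is why 2454A>G appears as T→C and 2460C>A as G→T. The following is a minimal sketch of that conversion; the helper names are hypothetical and are not part of the study's workflow.

```python
# Illustrative helpers only; names are assumptions, not the authors' tooling.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def change_on_reverse_read(ref, alt):
    """Express a single-base change ref>alt as it appears on a reverse-primer read."""
    return COMPLEMENT[ref.upper()], COMPLEMENT[alt.upper()]


# SNP names taken from the caption of Figure 4.37
snp_2454 = change_on_reverse_read("A", "G")  # 2454A>G
snp_2460 = change_on_reverse_read("C", "A")  # 2460C>A
print(snp_2454, snp_2460)
```

The same conversion applies to the other captions that note the use of the reverse primer.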

Figure 4.38: Partial sequencing electropherogram of the CYP1A1 gene, using primers that cover one of the No-call SNPs, showing 2345insT, which has probe ID AM_10771 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows the conserved sequence, in which the T insertion of this variant is absent at position 2345insT, in sample DM-10-015. The vertical arrow points to the conserved nucleotides without the T insertion. B. Part of the reference sequence of the CYP1A1 gene exon; the base in red font marks the location of the nucleotide change at position 2345insT. The reference CYP1A1 gene sequence above is from the Ensembl database, where this variant is listed under rs72547510: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=15:75012594-75013593;v=rs72547510;vdb=variation;vf=16826816

Figure 4.39: Partial sequencing electropherogram of the NAT2 gene, using primers that cover one of the No-call SNPs, showing c.190C>T (R64W), which has probe ID AM_14998 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved C at position c.190C in sample DM-10-025. The vertical arrow points to the conserved nucleotide C (blue peak). B. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.190C. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs1805158: http://asia.ensembl.org/Homo_sapiens/Variation/Sequence?db=core;r=8:18257203-18258203;v=rs1805158;vdb=variation;vf=1367155

Figure 4.40: Partial sequencing electropherogram of the NAT2 gene, using primers that cover one of the No-call SNPs, showing c.282C>T (Y94Y), which has probe ID AM_15000 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved C at position c.282C in sample DM-12-010. The vertical arrow points to the conserved nucleotide C (blue peak). B. NAT2 allele showing a heterozygous C→T in sample DM-10-025. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. NAT2 allele showing a homozygous C→T at position c.282C in sample DM-10-019. The vertical arrow points to the changed nucleotide T (red peak). D. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.282C. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs1041983: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479
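
Panels A to C of Figure 4.40 illustrate the three genotype classes read from a trace: a single reference peak, a double peak, and a single variant peak. A minimal sketch of that decision rule follows, assuming the bases seen at the SNP position have already been read off the electropherogram; the function name and the returned labels are illustrative and do not describe the software used in the study.

```python
def classify_genotype(observed_bases, ref, alt):
    """Classify a SNP position from the base(s) seen at that position in a trace.

    observed_bases: string of the peak bases at the position, e.g. "C" or "CT".
    ref, alt: reference and variant base from the SNP name (c.282C>T -> "C", "T").
    """
    bases = {b.upper() for b in observed_bases}
    ref, alt = ref.upper(), alt.upper()
    if bases == {ref}:
        return "wild type"
    if bases == {ref, alt}:
        return "heterozygous"
    if bases == {alt}:
        return "homozygous variant"
    # Anything else (third base, empty read) should be inspected by hand.
    return "unexpected"


# c.282C>T as described for panels A, B and C of Figure 4.40
panel_a = classify_genotype("C", "C", "T")   # single blue peak
panel_b = classify_genotype("CT", "C", "T")  # blue/red double peak
panel_c = classify_genotype("T", "C", "T")   # single red peak
print(panel_a, panel_b, panel_c)
```

For reads obtained with the reverse primer, the reference and variant bases would first have to be complemented.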

Figure 4.41: Partial sequencing electropherogram of the NAT2 gene, using primers that cover one of the No-call SNPs, showing c.341T>C (I114T), which has probe ID AM_15001 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position c.341T in sample DM-10-007. The vertical arrow points to the conserved nucleotide T (red peak). B. NAT2 allele showing a heterozygous T→C in sample DM-10-025. The vertical arrow points to the changed nucleotide T→C (red/blue peak). C. NAT2 allele showing a homozygous T→C at position c.341T in sample DM-10-057. The vertical arrow points to the changed nucleotide C (blue peak). D. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.341T. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs1801280: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Figure 4.42: Partial sequencing electropherogram of the NAT2 gene, using primers that cover one of the No-call SNPs, showing c.434A>C (Q145P), which has probe ID AM_15005 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved A at position c.434A in sample DM-10-002. The vertical arrow points to the conserved nucleotide A (green peak). B. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.434A. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs72554616: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Figure 4.43: Partial sequencing electropherogram of the NAT2 gene, using primers that cover two of the No-call SNPs, showing c.481C>T (L161L) (blue vertical arrow) and c.499G>A (E167K) (red vertical arrow), which have probe IDs AM_15006 and AM_15007, respectively, in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position c.481C and a conserved G at position c.499G in sample DM-10-019. Vertical arrows point to the conserved nucleotides T (red peak) and G (black peak). B. NAT2 allele showing a heterozygous T→C (red/blue peak) in the first SNP in sample DM-10-028, whereas the second SNP remains wild type. Vertical arrows point to the nucleotides at these SNPs. C. NAT2 allele showing a homozygous T→C at position c.481C in sample DM-10-035. The vertical arrow points to the changed nucleotide T (red peak); the second SNP is again wild type. D. Part of the reference sequence of the NAT2 gene exon; the bases in blue and red font mark the locations of the nucleotide changes at positions c.481C and c.499G. The reference NAT2 gene sequence above is from the Ensembl database, where these SNPs are listed under rs1799929 and rs72554617: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Figure 4.44: Partial sequencing electropherogram of the NAT2 gene, using primers that cover one of the No-call SNPs, showing c.590G>A (R197Q), which has probe ID AM_15008 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved G at position c.590G in sample DM-10-035. The vertical arrow points to the conserved nucleotide G (black peak). B. NAT2 allele showing a heterozygous G→A in sample DM-10-055. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. NAT2 allele showing a homozygous G→A at position c.590G in sample DM-10-019. The vertical arrow points to the changed nucleotide A (green peak). D. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.590G. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs1799930: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Figure 4.45: Partial sequencing electropherogram of the NAT2 gene, using primers that cover one of the No-call SNPs, showing c.803A>G (K268R), which has probe ID AM_15010 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved G at position c.803A>G in sample DM-10-035. The vertical arrow points to the conserved nucleotide G (black peak). B. NAT2 allele showing a heterozygous G→A in sample DM-10-020. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. NAT2 allele showing a homozygous G→A at position c.803A>G in sample DM-10-025. The vertical arrow points to the changed nucleotide A (green peak). D. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.803A>G. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs1208: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Figure 4.46: Partial sequencing electropherogram of the SLC15A1 gene, using primers that cover two of the No-call SNPs, showing c.1347T>C (A449A) (blue vertical arrow) and c.1375C>T (R459C) (red vertical arrow), which have probe IDs AM_10650 and AM_10647, respectively, in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. The wild-type allele shows a conserved T at position c.1347T and a conserved C at position c.1375C in sample DM-10-043. Vertical arrows point to the conserved nucleotides T (red peak) and C (blue peak). B. SLC15A1 allele showing a heterozygous T→C (red/blue peak) in the first SNP in sample DM-10-002, whereas the second SNP remains wild type. Vertical arrows point to the nucleotides at these SNPs. C. SLC15A1 allele showing a homozygous T→C at position c.1347T in sample DM-10-037. The vertical arrow points to the changed nucleotide C (blue peak); the second SNP is again wild type. D. Part of the reference sequence of the SLC15A1 gene exon; the bases in blue and red font mark the locations of the nucleotide changes at positions c.1347T and c.1375C. The reference SLC15A1 gene sequence above is from the Ensembl database, where these SNPs are listed under rs1339067 and rs2274827: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000088386;r=13:99336055-99404908;t=ENST00000376503

Figure 4.47: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover one of the No-call SNPs, showing c.1554+24C>T, which has probe ID AM_14609 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). A. ABCB1 allele showing a heterozygous A→G in sample DM-12-035. The vertical arrow points to the changed nucleotide A→G (green/black peak). B. ABCB1 allele showing a homozygous G→A at position c.1554+24C in sample DM-12-023. The vertical arrow points to the changed nucleotide A (green peak). C. Part of the reference sequence of the ABCB1 gene intron; the base in red font marks the location of the nucleotide change at position c.1554+24C. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs2235033: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.48: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover one of the No-call SNPs, showing c.1236C>T (G412G), which has probe ID AM_14612 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. ABCB1 allele showing a heterozygous A→G in sample DM-10-022. The vertical arrow points to the changed nucleotide A→G (green/black peak). B. ABCB1 allele showing a homozygous G→A at position c.1236C in sample DM-10-017. The vertical arrow points to the changed nucleotide A (green peak). C. Part of the reference sequence of the ABCB1 gene exon; the base in red font marks the location of the nucleotide change at position c.1236C. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs1128503: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.49: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover one of the No-call SNPs, showing c.-129T>C, which has probe ID AM_14633 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. ABCB1 allele showing a heterozygous A→G in sample DM-10-001. The vertical arrow points to the changed nucleotide A→G (green/black peak) at position c.-129T. B. Part of the reference sequence of the ABCB1 gene 5′UTR; the base in red font marks the location of the nucleotide change at position c.-129T. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs3213619: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.50: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover one of the No-call SNPs, showing c.61A>G (N21D), which has probe ID AM_14628 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. ABCB1 allele showing a heterozygous T→C in sample DM-11-013. The vertical arrow points to the changed nucleotide T→C (red/blue peak). B. Part of the reference sequence of the ABCB1 gene exon; the base in red font marks the location of the nucleotide change at position c.61A. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs9282564: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.51: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover one of the No-call SNPs, showing c.2677G>T>A (A893S or A893T), which has probe ID AM_14592 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. ABCB1 allele showing a heterozygous C→A in sample DM-10-014. The vertical arrow points to the changed nucleotide C→A (blue/black peak) at position c.2677G. B. Part of the reference sequence of the ABCB1 gene exon; the base in red font marks the location of the nucleotide change at position c.2677G. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs2032582: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.52: Partial sequencing electropherogram of the ABCG2 gene, using primers that cover one of the No-call SNPs, showing c.376C>T (Q126X), which has probe ID AM_13689 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. The wild-type allele shows a conserved G at position c.376C in sample DM-10-035. The vertical arrow points to the conserved nucleotide G (black peak). B. Part of the reference sequence of the ABCG2 gene exon; the base in red font marks the location of the nucleotide change at position c.376C. The reference ABCG2 gene sequence above is from the Ensembl database, where this SNP is listed under rs72552713: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000118777;r=4:89011416-89152474;t=ENST00000237612

Figure 4.53: Partial sequencing electropherogram of the VKORC1 gene, using primers that cover one of the No-call SNPs, showing c.196G>A (V66M), which has probe ID AM_11044 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). Note that the reverse primer was used in the PCR sequencing step. A. The wild-type allele shows a conserved C at position c.196G in sample DM-12-047. The vertical arrow points to the conserved nucleotide C (blue peak). B. Part of the reference sequence of the VKORC1 gene exon; the base in red font marks the location of the nucleotide change at position c.196G. The reference VKORC1 gene sequence above is from the Ensembl database, where this SNP is listed under rs72547529: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000167397;r=16:31102163-31107301;t=ENST00000394975

Table 4.1: Summary of the 39 variations identified in our study that were No-call data in some of the samples on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). The table lists the rs number according to the Ensembl database, the AM number according to the Affymetrix® DMET™ Plus platform, the common name of each SNP and the number of sequencing reactions for each SNP.

Table 4.1b: Summary of the 39 variations identified in our study that were No-call data in some of the samples on the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA). The table lists the rs number according to the Ensembl database, the AM number according to the Affymetrix® DMET™ Plus platform, the common name of each SNP and the number of sequencing reactions for each SNP.

Furthermore, 273 No-call genotypes in the Affymetrix® DMET™ Plus GeneChips were resolved by sequencing (Appendix E). Of these, approximately 47.25% were wild-type genotypes, 37% were heterozygous and 15.75% were homozygous for the variant.
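
As a quick consistency check, the reported percentages can be converted back into approximate genotype counts. The counts below are back-calculated from the rounded percentages in the text and are therefore an assumption; the actual per-genotype counts are those given in Appendix E.

```python
# Back-calculation from the percentages quoted in the text (not source data).
total_no_calls = 273
shares_percent = {
    "wild type": 47.25,
    "heterozygous": 37.0,
    "homozygous variant": 15.75,
}

# Nearest whole number of genotypes implied by each percentage
implied_counts = {
    genotype: round(total_no_calls * pct / 100)
    for genotype, pct in shares_percent.items()
}

for genotype, count in implied_counts.items():
    print(f"{genotype}: {count} ({100 * count / total_no_calls:.2f}%)")
print("sum:", sum(implied_counts.values()))
```

The implied counts add up to the stated total of 273, so the three percentages are mutually consistent.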

Additionally, 19 of the variations identified were already reported in the database. Ten of them are existing markers in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) (Table 4.2) and were also used to validate the GeneChip by comparing the sequencing results with their Call genotypes on the chips. These SNPs are shown in Figures 4.54 to 4.63, respectively, which present their genotypes as electropherograms obtained by direct sequencing. The other reported SNPs are listed in Table 4.3 and shown in Figures 4.65 to 4.72, respectively, which present the alternative genotypes found in the samples included in our study.

Figure 4.54: Partial sequencing electropherogram of the FMO2 gene, using primers that cover one of the No-call SNPs; it also shows c.*60A>G, which has probe ID AM_11959 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved A at position c.*60A in sample DM-12-025. The vertical arrow points to the conserved nucleotide A (green peak). B. FMO2 allele showing a heterozygous A→G in sample DM-10-054. The vertical arrow points to the changed nucleotide A→G (green/black peak). C. FMO2 allele showing a homozygous A→G at position c.*60A in sample DM-10-015. The vertical arrow points to the changed nucleotide G (black peak). D. Part of the reference sequence of the FMO2 gene 3′UTR; the base in red font marks the location of the nucleotide change at position c.*60A. The reference FMO2 gene sequence above is from the Ensembl database, where this SNP is listed under rs2020869: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000094963;r=1:171154347-171181822;t=ENST00000441535

Figure 4.55: Partial sequencing electropherogram of the TPMT gene, using primers that cover one of the No-call SNPs; it also shows c.474C>T (I158I), which has probe ID AM_13979 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved C at position c.474C in sample DM-10-043. The vertical arrow points to the conserved nucleotide C (blue peak). B. TPMT allele showing a heterozygous C→T in sample DM-10-037. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. Part of the reference sequence of the TPMT gene exon; the base in red font marks the location of the nucleotide change at position c.474C. The reference TPMT gene sequence above is from the Ensembl database, where this SNP is listed under rs2842934: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000197901;r=11:62703857-62752455;t=ENST00000377871

Figure 4.56: Partial sequencing electropherogram of the SLC22A1 gene, using primers that cover one of the No-call SNPs; it also shows c.156T>C (S52S), which has probe ID AM_14347 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved T at position c.156T in sample DM-10-015. The vertical arrow points to the conserved nucleotide T (red peak). B. SLC22A1 allele showing a heterozygous T→C in sample DM-10-060. The vertical arrow points to the changed nucleotide T→C (red/blue peak). C. Part of the reference sequence of the SLC22A1 gene exon; the base in red font marks the location of the nucleotide change at position c.156T. The reference SLC22A1 gene sequence above is from the Ensembl database, where this SNP is listed under rs1867351: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.57: Partial sequencing electropherogram of the SLC22A1 gene, using primers that cover one of the No-call SNPs; it also shows c.41C>T (S14F), which has probe ID AM_14342 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved C at position c.41C in sample DM-12-047. The vertical arrow points to the conserved nucleotide C (blue peak). B. SLC22A1 allele showing a heterozygous C→T in sample DM-10-060. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. Part of the reference sequence of the SLC22A1 gene exon; the base in red font marks the location of the nucleotide change at position c.41C. The reference SLC22A1 gene sequence above is from the Ensembl database, where this SNP is listed under rs34447885: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.58: Partial sequencing electropherogram of the SLC22A1 gene, using primers that cover one of the No-call SNPs; it also shows c.1260_1262delGAT (M420del), which has probe ID AM_14368 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a variant used to validate the technique against the Call data. A. The wild-type allele shows the sequence at position c.1260_1262delGAT (M420del) without the GAT deletion in sample DM-12-010. The horizontal red box marks the conserved nucleotides. B. SLC22A1 allele showing a heterozygous GAT deletion (GAT deleted in one allele) in sample DM-10-060. The horizontal red box marks the deletion, which causes overlapping peaks in the electropherogram because the two alleles are shifted out of register downstream of the deletion. C. Part of the reference sequence of the SLC22A1 gene exon; the bases in red font mark the location of the nucleotide change at position c.1260_1262delGAT (M420del). The reference SLC22A1 gene sequence above is from the Ensembl database, where this variant is listed under rs72552763: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.59: Partial sequencing electropherogram of the NAT2 gene, using primers that cover No-call SNPs; it also shows c.364G>A (D122N), which has probe ID AM_15002 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved G at position c.364G in sample DM-10-020. The vertical arrow points to the conserved nucleotide G (black peak). B. Part of the reference sequence of the NAT2 gene exon; the base in red font marks the location of the nucleotide change at position c.364G. The reference NAT2 gene sequence above is from the Ensembl database, where this SNP is listed under rs4986996: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Figure 4.60: Partial sequencing electropherogram of the SLC15A1 gene, using primers that cover No-call SNPs; it also shows c.1352C>A (T451N), which has probe ID AM_10648 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved C at position c.1352C in sample DM-10-017. The vertical arrow points to the conserved nucleotide C (blue peak). B. SLC15A1 allele showing a heterozygous C→A in sample DM-10-037. The vertical arrow points to the changed nucleotide C→A (blue/green peak) at position c.1352C. C. Part of the reference sequence of the SLC15A1 gene exon; the base in red font marks the location of the nucleotide change at position c.1352C. The reference SLC15A1 gene sequence above is from the Ensembl database, where this SNP is listed under rs8187838: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000088386;r=13:99336055-99404908;t=ENST00000376503

Figure 4.61: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover No-call SNPs; it also shows c.1350+44C>T, which has probe ID AM_14610 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. Note that the reverse primer was used in the PCR sequencing step. A. The wild-type allele shows a conserved G at position c.1350+44C in sample DM-12-023. The vertical arrow points to the conserved nucleotide G (black peak). B. ABCB1 allele showing a heterozygous G→A in sample DM-12-035. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. Part of the reference sequence of the ABCB1 gene exon; the base in red font marks the location of the nucleotide change at position c.1350+44C. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs2032588: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.62: Partial sequencing electropherogram of the ABCB1 gene, using primers that cover No-call SNPs; it also shows c.-1G>A, which has probe ID AM_14631 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. A. The wild-type allele shows a conserved C at position c.-1G in sample DM-11-013. The vertical arrow points to the conserved nucleotide C (blue peak). B. Part of the reference sequence of the ABCB1 gene 5′UTR; the base in red font marks the location of the nucleotide change at position c.-1G. The reference ABCB1 gene sequence above is from the Ensembl database, where this SNP is listed under rs2214102: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000085563;r=7:87133175-87342611;t=ENST00000265724

Figure 4.63: Partial sequencing electropherogram of the VKORC1 gene, using primers that cover No-call SNPs; it also shows c.174-136C>T (5′UTR), which has probe ID AM_11045 in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) and is a SNP used to validate the technique against the Call data. Note that the reverse primer was used in the PCR sequencing step. A. VKORC1 allele showing a heterozygous G→A in sample DM-12-047. The vertical arrow points to the changed nucleotide G→A (black/green peak). B. Part of the reference sequence of the VKORC1 gene 5′UTR; the base in red font marks the location of the nucleotide change at position c.174-136C. The reference VKORC1 gene sequence above is from the Ensembl database, where this SNP is listed under rs9934438: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000167397;r=16:31102163-31107301;t=ENST00000394975

Figure 4.64: Partial sequencing electropherogram of the CDA gene, using primers that cover one of the No-call SNPs; it also shows other SNPs reported in the database, here the one with code rs12059454. Note that reverse primers were used in the PCR sequencing step. A. The wild-type allele shows a conserved C in sample DM-10-024. The vertical arrow points to the conserved nucleotide C (blue peak). B. CDA allele showing a heterozygous C→T in sample DM-10-001. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. Part of the reference sequence of the CDA gene intron; the base in red font marks the location of the nucleotide change. The reference CDA gene sequence above is from the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000158825;r=1:20915441-20945401;t=ENST00000375071

Figure 4.65: Partial sequencing electropherogram of the CDA gene, using primers that cover one of the No-call SNPs; it also shows other SNPs reported in the database, here the one with code rs12731069. Note that reverse primers were used in the PCR sequencing step. A. The wild-type allele shows a conserved C in sample DM-10-047. The vertical arrow points to the conserved nucleotide C (blue peak). B. CDA allele showing a heterozygous C→T in sample DM-10-046. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. Part of the reference sequence of the CDA gene intron; the base in red font marks the location of the nucleotide change. The reference CDA gene sequence above is from the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000158825;r=1:20915441-20945401;t=ENST00000375071

Figure 4.66: Partial sequencing electropherogram of the UGT2B7 gene, using primers that cover one of the No-call SNPs; it also shows other SNPs reported in the database, here the two with codes rs73823859 and rs7668282, respectively. A. In sample DM-10-040, the second SNP is wild type, showing a conserved T; the vertical red arrow points to the conserved nucleotide T (red peak). The first SNP in the same sample shows a heterozygous G→A; the vertical blue arrow points to the changed nucleotide G→A (black/green peak). B. UGT2B7 allele showing a heterozygous T→C at the first SNP in sample DM-10-016; the vertical blue arrow points to the changed nucleotide T→C (red/blue peak). The second SNP shows a conserved G in sample DM-10-047; the vertical red arrow points to the conserved nucleotide G (black peak) in the same sample. C. Part of the reference sequence of the UGT2B7 gene promoter; the bases in blue and red font mark the locations of the nucleotide changes at these SNP positions. The reference UGT2B7 gene sequence above is from the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000171234;r=4:69917081-69978705;t=ENST00000305231

Figure 4.67: Partial sequencing electropherogram of the SLC15A2 gene using primers that cover one of the No-call SNPs. It also shows other SNPs that are reported in the database; one of these, rs3762819, is shown here. A. The wild-type allele shows a conserved G in sample DM-12-025. The vertical arrow points to the conserved nucleotide G (black peak). B. SLC15A2 allele showing a heterozygous G→A in sample DM-12-010. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. SLC15A2 allele showing a homozygous G→A in sample DM-10-060. The vertical arrow points to the changed nucleotide A (green peak). D. Part of the reference sequence of the SLC15A2 gene intron; the base pair in red font marks the location of the nucleotide change. The reference SLC15A2 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000163406;r=3:121612936-121662949;t=ENST00000489711

Figure 4.68: Partial sequencing electropherogram of the SLC15A2 gene using primers that cover one of the No-call SNPs. It also shows other SNPs that are reported in the database; one of these, rs1882002, is shown here. A. The wild-type allele shows a conserved C in sample DM-10-047. The vertical arrow points to the conserved nucleotide C (blue peak). B. SLC15A2 allele showing a heterozygous C→T in sample DM-10-029. The vertical arrow points to the changed nucleotide C→T (blue/red peak). C. SLC15A2 allele showing a homozygous C→T in sample DM-10-060. The vertical arrow points to the changed nucleotide T (red peak). D. Part of the reference sequence of the SLC15A2 gene intron; the base pair in red font marks the location of the nucleotide change. The reference SLC15A2 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000163406;r=3:121612936-121662949;t=ENST00000489711

Figure 4.69: Partial sequencing electropherogram of the FMO2 gene using primers that cover one of the No-call SNPs. It also shows other SNPs that are reported in the database; one of these, rs28369911, is shown here. A. The wild-type allele shows a conserved G in sample DM-10-015. The vertical arrow points to the conserved nucleotide G (black peak). B. FMO2 allele showing a heterozygous G→T in sample DM-10-054. The vertical arrow points to the changed nucleotide G→T (black/red peak). C. Part of the reference sequence of the FMO2 gene intron; the base pair in red font marks the location of the nucleotide change. The reference FMO2 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000094963;r=1:171154347-171181822;t=ENST00000441535

Figure 4.70: Partial sequencing electropherogram of the SLC22A1 gene using primers that cover one of the No-call SNPs. It also shows other variants that are reported in the database; one of these, rs113569197, is shown here. A. SLC22A1 allele showing a heterozygous TGGTAAGT→- (deletion of TGGTAAGT in one allele) in sample DM-10-025. The horizontal red box marks this deletion variant, which causes overlapping peaks in the electropherogram because of the shift between the two alleles. B. C. SLC22A1 allele showing a homozygous TGGTAAGT deletion in sample DM-10-031. The vertical red arrow points to the position of the deleted segment. D. Part of the reference sequence of the SLC22A1 gene splice donor; the base pairs in red font mark the location of the nucleotide change. The reference SLC22A1 gene sequence is according to the Ensembl database, where this variant is listed as rs113569197: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.71: Partial sequencing electropherogram of the SLC22A1 gene using primers that cover one of the No-call SNPs. It also shows other SNPs that are reported in the database; one of these, rs9457843, is shown here. Note that the reverse primer was used in the PCR sequencing step. A. The wild-type allele shows a conserved G in sample DM-10-043. The vertical arrow points to the conserved nucleotide G (black peak). B. SLC22A1 allele showing a heterozygous G→A in sample DM-12-046. The vertical arrow points to the changed nucleotide G→A (black/green peak). C. Part of the reference sequence of the SLC22A1 gene intron; the base pair in red font marks the location of the nucleotide change. The reference SLC22A1 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.72: Partial sequencing electropherogram of the UGT2B7 gene using primers that cover one of the No-call SNPs. It also shows a variation that is new and not reported in the database: c.-209T>G. A. The wild-type allele shows a conserved G in sample DM-12-035. The vertical arrow points to the conserved nucleotide G (black peak) at position c.-209. B. UGT2B7 allele showing a heterozygous G→T in sample DM-12-035. The vertical arrow points to the changed nucleotide G→T (black/red peak) at the same position. C. Part of the reference sequence of the UGT2B7 gene promoter; the base pair in red font marks the location of the nucleotide change at position c.-209. The reference UGT2B7 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000171234;r=4:69917081-69978705;t=ENST00000305231

Table 4.2: Summary of the 10 reported SNPs that were identified and used to validate the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) used on 101 Jordanian samples at PHBC. The table shows the SNP codes and their common names with the probe IDs on this GeneChip.

Table 4.3: Summary of the 9 reported SNPs that were identified. The table shows the SNP codes with their gene, region, position and variants.

Our sequencing screen also found new variations at 8 different positions that are not reported in the Ensembl and NCBI databases (Table 4.4). Two of them affect the protein through an amino acid change. c.-209T > G is a UGT2B7 variant located in the promoter region that changes G to T at position c.-209 (Figure 4.72). Sample DM-12-035 was heterozygous (GT), while the other screened samples had the wild-type genotype (GG). We do not yet know whether this new variation influences the protein level.

Table 4.4: Summary of the 8 non-reported variants that were identified. The table shows each variant with its gene, type, position, the samples in which it was identified and its nomenclature.

c.252G > T is a SLC22A6 variant located in exon 1 that changes G to T at position c.252 (Figure 4.73). Sample DM-12-053 was heterozygous (GT), while the other screened samples had the wild-type genotype (GG). This variation does not change the amino acid, CCG (Ser) → CCT (Ser), at codon 84 (synonymous mutation).

c.1277+69C > T and c.1277+82C > T are SLC22A1 variants located close to each other in intron 7 that change C to T at positions c.1277+69 and c.1277+82 (Figure 4.74). Samples DM-12-027, DM-12-021, DM-12-028, DM-12-043 and DM-12-060 were heterozygous (CT) for both variants, which appear to be associated with each other, and DM-12-025 was heterozygous for the 2nd variant only. The other screened samples had the wild-type genotype (CC) for both variants.

-698C > A is a CYP1A2 variant located in the promoter region that changes C to A at position -698 (Figure 4.75). Samples DM-12-026 and DM-10-015 were heterozygous (CA), while the other screened samples had the wild-type genotype (CC). We do not yet know whether this new variation influences the protein level.

Figure 4.73: Partial sequencing electropherogram of the SLC22A6 gene using primers that cover one of the No-call SNPs. It also shows a variation that is new and not reported in the database: c.252G>T, shown here as a silent mutation. A. The wild-type allele shows a conserved G in sample DM-10-053. The vertical arrow points to the conserved nucleotide G (black peak) at position c.252. B. SLC22A6 allele showing a heterozygous G→T in sample DM-10-053. The vertical arrow points to the changed nucleotide G→T (black/red peak) at the same position. C. Part of the reference sequence of SLC22A6 gene exon 1; the base pair in red font marks the location of the nucleotide change c.252G. The reference SLC22A6 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000197901;r=11:62703857-62752455;t=ENST00000377871

Figure 4.74: Partial sequencing electropherogram of the SLC22A1 gene using primers that cover one of the No-call SNPs. It also shows variations that are new and not reported in the database: c.1277+69C>T and c.1277+82C>T, shown here as intronic variants. A. The wild-type allele shows a conserved G in sample DM-10-043. The blue and red vertical arrows point to the conserved nucleotide G (black peak) at positions c.1277+69 and c.1277+82 respectively. B. SLC22A1 allele showing a heterozygous G→A in sample DM-10-043. The blue and red vertical arrows point to the changed nucleotide G→A (black/green peak) at the same positions. C. Part of the reference sequence of the SLC22A1 gene intron; the base pair in blue font marks the location of the nucleotide change at c.1277+69, and the base pair in red font marks the location of the nucleotide change at c.1277+82. The reference SLC22A1 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000175003;r=6:160542821-160579750;t=ENST00000366963

Figure 4.75: Partial sequencing electropherogram of the CYP1A2 gene using primers that cover one of the No-call SNPs. It also shows a variation that is new and not reported in the database: -698C>G (promoter), shown here. A. The wild-type allele shows a conserved C in sample DM-10-015. The vertical arrow points to the conserved nucleotide C (blue peak) at position -698. B. CYP1A2 allele showing a heterozygous C→G in sample DM-10-026. The vertical arrow points to the changed nucleotide C→G (blue/black peak) at the same position. C. Part of the reference sequence of the CYP1A2 gene promoter; the base pair in red font marks the location of the nucleotide change at position -698. The reference CYP1A2 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000140505;r=15:75041185-75048543;t=ENST00000343932

c.1356T > C is a SLC15A1 variant located in exon 17 that changes T to C at position c.1356 (Figure 4.76). Sample DM-10-040 was heterozygous (TC), while the other screened samples had the wild-type genotype (CC). This variation does not change the amino acid, GAC (Asp) → GAT (Asp), at codon 453 (synonymous mutation).

c.58G > A is a NAT2 variant located in exon 2 that changes G to A at position c.58 (Figure 4.77). Sample DM-10-005 was heterozygous (GA), while the other screened samples had the wild-type genotype (GG). This variation changes the amino acid GAC (Asp) → AAC (Asn) at codon 20 (non-synonymous mutation).

Figure 4.76: Partial sequencing electropherogram of the SLC15A1 gene using primers that cover one of the No-call SNPs. It also shows a variation that is new and not reported in the database: c.1356T>C, shown here as a silent mutation. A. The wild-type allele shows a conserved C in sample DM-10-040. The vertical arrow points to the conserved nucleotide C (blue peak) at position c.1356. B. SLC15A1 allele showing a heterozygous C→T in sample DM-10-040. The vertical arrow points to the changed nucleotide C→T (blue/red peak) at the same position. C. Part of the reference sequence of SLC15A1 gene exon 17; the base pair in red font marks the location of the nucleotide change at position c.1356. The reference SLC15A1 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000088386;r=13:99336055-99404908;t=ENST00000376503

Figure 4.77: Partial sequencing electropherogram of the NAT2 gene using primers that cover one of the No-call SNPs. It also shows a variation that is new and not reported in the database: c.58G>A (D20N), shown here as a missense mutation. A. The wild-type allele shows a conserved G in sample DM-10-005. The vertical arrow points to the conserved nucleotide G (black peak) at position c.58. B. NAT2 allele showing a heterozygous G→A in sample DM-10-005. The vertical arrow points to the changed nucleotide G→A (black/green peak) at the same position. C. Part of the reference sequence of NAT2 gene exon 2; the base pair in red font marks the location of the nucleotide change at position c.58. The reference NAT2 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

c.31G > A is a NAT2 variant located in exon 2 that changes G to A at position c.31 (Figure 4.78). Samples DM-10-005, DM-10-002, DM-10-014 and DM-10-057 were heterozygous (GA), while the other screened samples had the wild-type genotype (GG). This variation changes the amino acid GGC (Gly) → AGC (Ser) at codon 11 (non-synonymous mutation).
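The codon numbers and amino acid changes quoted for the two NAT2 variants can be checked with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the codon table contains just the four codons quoted in the text rather than the full genetic code.

```python
# Illustrative check of the NAT2 codon arithmetic described in the text.
# The table holds only the four codons quoted above (standard genetic code).
CODON_TO_AA = {"GAC": "Asp", "AAC": "Asn", "GGC": "Gly", "AGC": "Ser"}


def codon_number(cdna_pos):
    """Return the 1-based codon index for a 1-based coding (c.) position."""
    return (cdna_pos - 1) // 3 + 1


def codon_offset(cdna_pos):
    """Return the 0-based position of the base inside its codon (0, 1 or 2)."""
    return (cdna_pos - 1) % 3


def classify(cdna_pos, ref_codon, alt_codon):
    """Return (codon number, reference aa, variant aa, variant class)."""
    # The two codons must differ exactly at the base the variant touches.
    changed = [i for i in range(3) if ref_codon[i] != alt_codon[i]]
    if changed != [codon_offset(cdna_pos)]:
        raise ValueError("codons do not differ at the expected base")
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "missense"
    return codon_number(cdna_pos), ref_aa, alt_aa, kind


nat2_c58 = classify(58, "GAC", "AAC")   # c.58G>A
nat2_c31 = classify(31, "GGC", "AGC")   # c.31G>A
print(nat2_c58)
print(nat2_c31)
```

Under these assumptions both variants fall on the first base of their codon, giving codon 20 (Asp to Asn) and codon 11 (Gly to Ser), in agreement with the D20N and G11S names used in the text.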

Figure 4.78: Partial sequencing electropherogram of the NAT2 gene using primers that cover one of the No-call SNPs. It also shows a variation that is new and not reported in the database: c.31G>A (G11S), shown here as a missense mutation. A. The wild-type allele shows a conserved G in sample DM-10-057. The vertical arrow points to the conserved nucleotide G (black peak) at position c.31. B. NAT2 allele showing a heterozygous G→A in sample DM-10-057. The vertical arrow points to the changed nucleotide G→A (black/green peak) at the same position. C. Part of the reference sequence of NAT2 gene exon 2; the base pair in red font marks the location of the nucleotide change at position c.31. The reference NAT2 gene sequence is according to the Ensembl database: http://asia.ensembl.org/Homo_sapiens/Transcript/Exons?db=core;g=ENSG00000156006;r=8:18248755-18258728;t=ENST00000286479

Data Analysis for validation

This study also had a validation purpose, which was addressed using three sources: call genotypes that lie close to No-call genotypes and share the same sequencing reactions (Appendix B); SNPs that appear in the reactions and have probe markers in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) (Table 4.5); and direct sequencing of call genotypes (Table 4.6). Together these gave about 375 call genotypes, of which 371 (98.93%) were confirmed and 4 were wrong calls (error rate 1.07%) (Figures 4.1-4.78, Tables 4.1-4.7).
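The concordance and error rate follow directly from the two counts reported in the validation, 375 validated call genotypes and 4 wrong calls. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def concordance(total_calls, wrong_calls):
    """Return (confirmed count, % confirmed, % error) for validated genotypes."""
    confirmed = total_calls - wrong_calls
    pct_confirmed = 100.0 * confirmed / total_calls
    pct_error = 100.0 * wrong_calls / total_calls
    return confirmed, pct_confirmed, pct_error


# Counts taken from the validation described above.
confirmed, pct_confirmed, pct_error = concordance(375, 4)
print(confirmed, round(pct_confirmed, 2), round(pct_error, 2))
```

With these inputs the confirmed count is 371, which rounds to 98.93% concordance and a 1.07% error rate.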

Table 4.5a: Validation results for SNPs that have probe markers in the Affymetrix® DMET™ Plus platform.

Table 4.5b: Validation results for SNPs that have probe markers in the Affymetrix® DMET™ Plus platform.

Table 4.6: Validation through direct sequencing.

Table 4.7: Call rate improvement details for each marker probe.

Discussion

In high-throughput GeneChip arrays based on hybridization with allele-specific probes, genotyping errors are very common, which limits the application of the technology. In addition, No-call genotypes for many SNPs on the chip emerge as a bigger and more serious problem in high-throughput genotyping methods such as the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA), which is currently used at PHBC to genotype 1936 SNPs in Jordanian samples.

A total of 101 samples were genotyped with this GeneChip, which targets the most important gene variations for pharmacogenomic studies, in order to determine the haplotypes of the major reported SNP clusters. These include clusters in genes that metabolize the most widely prescribed anticancer drugs in the world, including aromatase inhibitors, tamoxifen and the thiopurines.

Accordingly, we genotyped these No-call data by direct sequencing, which gave us an opportunity to complete some of the SNP clusters (haplotypes) and thereby determine whether the phenotype of each study sample is UM, EM, IM or PM metabolizer. For validation purposes, we also re-genotyped some of the SNPs that were originally reported as call genotypes in the Affymetrix® DMET™ Plus platform experiments.

Twenty different genomic regions in 15 different ADME genes (NAT2, ABCB1, ABCG2, SLC15A1, SLC15A2, SLC22A1, SLC22A6, UGT2B7, UGT1A1, CDA, FMO2, CYP1A1, CYP1A2, TPMT and VKORC1) were amplified by PCR and sequenced. We identified 66 different SNPs; 39 of these variations belong to markers that had No-call genotypes in some samples in the Affymetrix® DMET™ Plus results. For all of these SNPs, no new variants were found at the SNP positions; every genotype was either the reference or the variant allele originally targeted by the Affymetrix® DMET™ Plus chip. In sequencing a total of 273 No-calls, we found no biological reason for the No-call, which suggests other problems.

On the other hand, we found that some samples with No-call genotypes have other SNPs close to the original SNP. These SNPs lie within 25 bp upstream or downstream of the No-call SNPs. Fifteen of the 39 SNPs tested have such neighboring SNPs (Figure 3.2) near the probes of the No-call SNPs (Figure 5.1B). The proximity of a target SNP to other variants might cause some of the No-call results (Figure 5.1C) [63]. For example, in the CYP1A1 gene we found 3 No-call SNPs located very close to each other (Figure 5.1A). Several samples have No-call genotypes at all 3 of these SNPs, which weakens the intensity of the fluorescence signals over the background and prevents the genotype from being assigned. SNPs located in proximity to each other might explain some of the missing data; however, other causes, including poor-quality DNA samples, problems with the cut-off percentage of the chip data and triallelic variations, have also been reported [87].
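The proximity check described above, flagging target SNPs that have another variant within 25 bp upstream or downstream, can be sketched as follows. The function names and the coordinates are hypothetical and chosen only to mimic a cluster of three SNPs within a 9-bp range; they are not the study's actual positions.

```python
def neighbours_within(target_pos, variant_positions, window=25):
    """Return the other variant positions within `window` bp of the target."""
    return sorted(
        p for p in variant_positions
        if p != target_pos and abs(p - target_pos) <= window
    )


def flag_crowded_targets(target_positions, variant_positions, window=25):
    """Map each target SNP that has close neighbours to those neighbours."""
    flagged = {}
    for target in target_positions:
        close = neighbours_within(target, variant_positions, window)
        if close:
            flagged[target] = close
    return flagged


# Hypothetical coordinates: three clustered targets and one isolated target.
targets = [1000, 1004, 1008, 5000]
all_variants = targets + [5100]   # 5100 lies 100 bp from 5000, outside the window
crowded = flag_crowded_targets(targets, all_variants)
print(crowded)
```

In this toy input the three clustered targets flag one another, while the isolated target is not flagged because its nearest variant lies outside the 25-bp window.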

Our validation study shows that, among the No-call genotypes, 47.25% were wild type, 37% were heterozygous and 15.75% were homozygous variant. In the call genotype validation, by contrast, 87.35% were wild type, 9.57% were heterozygous and 3.08% were homozygous variant (Figure 5.2).

Figure 5.1: Close or neighboring SNPs may be one reason for missing genotype output. A. In CYP1A1 there are 3 SNPs within the same 9-base-pair range, and all of these variants have a large number of No-call genotypes. B. Molecular Inversion Probe (MIP). Each MIP is a 120-bp oligonucleotide with a unique gap fill for the SNP of interest; each probe contains two homologous sequences that are complementary to the DNA template flanking the SNP of interest (Ji and Welc, 2009).

Figure 5.2: Histogram comparing Call and No-call genotypes and alleles (major/minor); the bias is clear in both genotypes (A) and alleles (B).
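The genotype percentages reported for the No-call and call groups can be converted into variant allele fractions, which is one way to express the allele-level bias. The sketch below assumes biallelic SNPs and pools all genotypes in a group; the function name is hypothetical, and the result is an illustration rather than a value reported by the study.

```python
def variant_allele_fraction(het_pct, hom_variant_pct):
    """Fraction of variant alleles, given genotype percentages.

    Each heterozygote carries one variant allele out of two, and each
    homozygous variant carries two out of two (biallelic SNPs assumed).
    """
    return (het_pct / 2.0 + hom_variant_pct) / 100.0


# Genotype percentages quoted in the text.
no_call_fraction = variant_allele_fraction(37.0, 15.75)
call_fraction = variant_allele_fraction(9.57, 3.08)
print(round(no_call_fraction, 4), round(call_fraction, 4))
```

Under these assumptions the variant allele makes up roughly a third of the alleles among re-genotyped No-calls but under a tenth among validated calls, which is consistent with the bias described for genotypes and alleles.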

One way to overcome this problem is to increase the cut-off percentage for the chip, although this would decrease the number of samples. Re-genotyping of the No-calls is a practical way to reach final conclusions for the core ADME genes.
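The trade-off between a stricter cut-off and the number of retained samples can be illustrated with a small filter on per-sample call rate. Everything in this sketch is hypothetical: the sample names, the genotype lists and the "NoCall" label are invented for illustration, the 87% threshold mirrors the PHBC QC cut-off reported in this study, and 95% is an arbitrary stricter value.

```python
def call_rate(genotypes, no_call_label="NoCall"):
    """Percentage of markers in one sample that received a genotype call."""
    called = sum(1 for g in genotypes if g != no_call_label)
    return 100.0 * called / len(genotypes)


def samples_passing(sample_genotypes, cutoff_pct):
    """Names of samples whose call rate meets or exceeds the cut-off."""
    return [
        name for name, genotypes in sample_genotypes.items()
        if call_rate(genotypes) >= cutoff_pct
    ]


# Hypothetical toy data: ten markers per sample.
toy = {
    "S1": ["AA"] * 10,                  # 100% call rate
    "S2": ["AA"] * 9 + ["NoCall"],      # 90% call rate
    "S3": ["AA"] * 8 + ["NoCall"] * 2,  # 80% call rate
}
kept_at_87 = samples_passing(toy, 87.0)
kept_at_95 = samples_passing(toy, 95.0)
print(kept_at_87, kept_at_95)
```

Raising the threshold removes the samples with more missing genotypes, at the cost of a smaller sample set, which is the compromise the paragraph above describes.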

Besides genotyping the No-call SNPs, this study had another purpose: validating this new technique, which is still a research tool and not intended for diagnostic or clinical purposes. Of 375 call genotypes, the DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) scored 371 correctly (98.93%) with 4 errors (1.07%), which is above the accepted error rate (less than 1%) [10,88,89]. The bias in No-call genotypes/alleles and the error rate above 1% may result from the QC cut-off used at PHBC (87%). Therefore, we suggest that the current QC cut-off for call rate should be increased, while the cut-off for genotyping error rate can be reduced accordingly [10].

Furthermore, the study identified 8 novel variants located in these important genes. c.-209T > G in UGT2B7 and -698C > A in CYP1A2 are non-reported promoter variants located close to SNPs for which the Affymetrix® DMET™ gene chip has probes. c.252G > T in SLC22A6 and c.1356T > C in SLC15A1 are non-reported exonic variants; both are silent (synonymous) mutations with no effect at the protein level. c.1277+69C > T and c.1277+82C > T in SLC22A1 are intronic variants that are associated with each other in the same 5 samples. They may be markers for Jordanians, and more investigation is needed to determine which subpopulation they are associated with.

The most important novel variants discovered in our study are c.58G > A and c.31G > A in the NAT2 gene, both non-synonymous (missense) variants (the 1st is D20N and the 2nd is G11S). NAT2 encodes an enzyme that activates or deactivates arylamine and hydrazine drugs and carcinogens [90]. Polymorphisms in this gene are associated with higher incidences of cancer and drug toxicity [91]. These non-reported variants recur in 4 different samples (four alleles for the 1st and one allele for the 2nd) and appear associated with each other in one sample, which may be recorded as a new haplotype.

Therefore, these non-reported variants can change the whole haplotype and the phenotypic conclusion that the chip determined. This will become clear after a cloning procedure is performed to determine on which allele each variant is located. For instance, in sample DM-10-005 there are two new non-synonymous variants in the coding region of the NAT2 gene that change 2 amino acids. According to the reported SNPs and the allele-specific hybridization of the DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA), the haplotype is *5/*5 and the phenotype of this sample is SA (slow acetylator). When the new variants are included, however, this conclusion is uncertain, because their association with other SNPs and with each other is not yet known (Table 5.1).

Table 5.1: Novel variants and the current haplotype/phenotype of their samples.

Conclusion

• This study identified 66 variations in 15 ADME genes (CDA, FMO2, SLC22A6, UGT1A1, UGT2B7, CYP1A1, SLC22A1, NAT2, ABCB1, ABCG2, SLC15A2, VKORC1, CYP1A2, SLC15A1 and TPMT).

• 39 of the variations had No-call genotypes in the Affymetrix DMET™ Plus platform (Affymetrix, Santa Clara, CA, USA). Re-genotyping improved call rates in these markers from 89.08% to 95.56% in 101 samples, which will help to reach more accurate conclusions about Jordanian haplotypes in these genes. All of the SNPs with No-call genotypes in this study (39 SNPs) are reported; thus our study, which included about 273 No-call genotypes, did not find any relation between the missing genotypes in the high-throughput GeneChip and novel variations at the positions where the probes bind. We also found that 15 of these SNPs are neighbored by other variations upstream or downstream, which may be one reason for the No-call output.

• Among these variations, 8 were non-reported: c.-209T > G in UGT2B7 and -698C > A in CYP1A2 as promoter variants; c.252G > T in SLC22A6 and c.1356T > C in SLC15A1 as silent (synonymous) variants; c.1277+69C > T and c.1277+82C > T in SLC22A1 as intronic variants associated with each other; and, most importantly, two missense (non-synonymous) variants in the NAT2 gene: D20N (c.58G > A) and G11S (c.31G > A).

• The other variations were previously reported, so some of them were used to validate the Affymetrix DMET™ Plus platform.

• The error rate was 1.07%, which was above the accepted rate (less than 1%), and there was a clear bias between call and No-call genotypes (after re-genotyping). This can be eliminated only by increasing the current cut-off.

Recommendations

After analyzing and reviewing the results of the current study, I recommend the following:

• Increase the current QC cut-off percentage for the chips to decrease error rates and biased conclusions.

• Focus on the core markers in the Affymetrix® DMET™ Plus platform (Affymetrix®, Santa Clara, CA, USA) as a priority. This will help in managing and benefiting from this high-throughput technique.

• Eliminate DNA quality factors that increase No-call rates in the same samples; this will also decrease the number of missing genotypes, even after the cut-off percentage is increased.

• Use a cloning procedure to determine on which alleles the novel variants lie, in order to obtain new haplotypes that may change the final conclusions for Jordanian phenotypic groups for ADME genes.

• Continue to screen other DMET genes (especially genes associated with ADRs in Jordanians) by direct sequencing of the Affymetrix® DMET™ Plus platform data, to fill the gaps in the No-call genotypes and determine the phenotypes.

• Extend this study to include Jordanians of different ethnic origins (Circassians, Chechens, Bedouins, Gypsies, etc.) as sub-population studies, to see how these groups differ from the main population groups.

• Study the eight novel mutations identified here to determine whether they are neutral polymorphisms or clinically relevant mutations.

• To that end, further analysis (such as functional analysis and splicing assays) is required to investigate their effects at both the RNA and protein levels.

• Continue screening DMET genes, even outside the Affymetrix® DMET™ Plus platform data project, where these genes appear highly polymorphic in Jordanians. This is very important for discovering differences between Jordanians and other populations that influence drug responses and therapies.

• Testing drug metabolism, elimination and transporter genes with the DMET™ Plus platform is very helpful for identifying individuals with suspected ADRs or non-response, so association studies can be included in the same project.

References
