Analysis of Human Androgen Receptor Polymorphism Using Fluorescent Loop-Hybrid Mobility Shift Technique


Introduction
Polymorphic CAG repeats in the human androgen receptor gene (HUMARA), located at Xq12, have been used as a highly informative genetic marker for human female tumor cells. Because one X chromosome is randomly inactivated, the heavily methylated state of one of the two heterozygous alleles provides a useful tool, together with methylation-sensitive restriction enzymes, for examining the clonality of tumor cells in female cancer patients [1-3]. A simple method for determining the allelic status of the CAG repeats by slab-gel electrophoresis would therefore be valuable. We previously developed the loop-hybrid mobility shift (LH-MS) technique for detecting hot-spot mutations at oncogene loci relevant to targeted therapy, such as EGFR codon 858, BRAF codon 600 and KRAS codons 12 and 13 [4,5]. Mutated alleles were differentiated from the wild type by the mobility shift of the loop-hybrids (LH). Because these genes have unique sequences, PCR produced simple bands, and the LH bands of the mutant alleles were shifted from those of the wild type and detected unambiguously. In contrast, when relatively short dinucleotide repeat polymorphisms in UGT1A1, namely (TA)6 vs. (TA)7, were genotyped using the LH-MS technique, the PCR product of the segment containing the repeats produced several confounding bands on polyacrylamide gel electrophoresis (PAGE). With a Cy5-labeled LH probe, however, simple fluorescent LH bands were detected that were uniquely associated with the variant alleles of the dinucleotide repeats [6]. In this study, we show that a fluorescent LH-MS technique facilitates the detection of far more complex genotypic variants of CAG repeats, ranging from 17 to 31 repeats. The technique can detect a difference of even a single repeat unit and reveals heterozygosity at this locus in 87% of the female study population.

DNA
Blood DNA samples from an anonymized healthy adult Japanese population (39 males and 46 females), obtained with informed consent [6], were used. Tumor DNA had previously been obtained with informed consent from frozen tumor tissue collected from 215 colorectal cancer patients [7], of whom 91 were female, and was used for clonality testing of the tumors. This study was approved by the Internal Review Board of the Kanagawa Cancer Center, Yokohama, Japan.

Restriction enzymes
HpaII (50000 U/ml, NEB, Ipswich, MA), a methylation-sensitive restriction enzyme, and MspI (20000 U/ml, NEB), a methylation-insensitive isoschizomer, both of which cleave CCGG, were used. Genomic DNA (50 ng/100 µl) was column-purified and concentrated (Zymo Research, Irvine, CA) to yield 8 µl of purified DNA solution. Purified DNA (6.2 µl) was treated with 1 µl of HpaII or MspI together with 0.8 µl of 10× buffer for 24-48 h at 37°C. Two restriction sites were present in the amplicon used in this study. One microliter of the digestion mixture was used directly as template in a 10-µl PCR reaction mixture, and 0.8 µl of the eluted DNA was used as untreated control template DNA.
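The volumes above imply that the digested and untreated templates carry similar amounts of DNA into PCR. The following is a minimal sketch of that arithmetic, not part of the original protocol. It assumes complete recovery of the 50 ng input from the column and no loss during digestion; the function and variable names are illustrative only.

```python
def template_ng(input_ng, eluate_ul, used_ul):
    """DNA mass (ng) in `used_ul` of an eluate, assuming uniform concentration."""
    return input_ng * used_ul / eluate_ul

# Volumes and masses from the text; 100% column recovery is an ASSUMPTION.
INPUT_NG = 50.0    # genomic DNA loaded on the column
ELUATE_UL = 8.0    # purified DNA solution
DNA_UL = 6.2       # purified DNA taken into the digest
ENZYME_UL = 1.0    # HpaII or MspI
BUFFER_UL = 0.8    # 10x buffer

digest_ul = DNA_UL + ENZYME_UL + BUFFER_UL             # total digest volume
buffer_fold = 10 * BUFFER_UL / digest_ul               # final buffer strength (x)
digest_ng = template_ng(INPUT_NG, ELUATE_UL, DNA_UL)   # DNA mass in the digest
digested_template_ng = digest_ng * 1.0 / digest_ul     # 1 ul of digest into PCR
control_template_ng = template_ng(INPUT_NG, ELUATE_UL, 0.8)  # 0.8 ul of eluate

print(f"digest volume: {digest_ul:.1f} ul at {buffer_fold:.1f}x buffer")
print(f"digested template: {digested_template_ng:.2f} ng per PCR")
print(f"untreated control: {control_template_ng:.2f} ng per PCR")
```

Under these assumptions the 8-µl digest is at 1× buffer, and the two templates differ by only a few percent in DNA mass (roughly 4.8 ng versus 5.0 ng), which is consistent with using 0.8 µl of eluate as the untreated control.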

Primers and LH probes
Primer sequences and LH probe sequences (Fw- and Rv-types) are given in Table 1. LH probes were labeled with Cy5 at the 5'-end. The human androgen receptor gene, HUMARA, resides at Xq12. The genomic sequence used in this study was derived from nucleotides 67544032-67730619 of the human X chromosome sequence (GenBank accession number NC_000023, http://www.ncbi.nlm.nih.gov/nuccore/NC_000023.11). The polymorphic CAG repeat is located in exon 1. The LH probes were designed to produce loop-hybrids in which the variable (CTG)n repeat and six neighboring nucleotides loop out from the antisense strand of the amplified polymerase chain reaction (PCR) product after hybridization with the Fw-type LH-CTG probe, or from the sense strand after hybridization with the Rv-type LH-CAG probe. The LH probes, which are shorter than the amplicon, are filled in by polymerase extension within the hybrid to yield the complete LH form. In the Rv-type LH probe, both the polymorphic CAG repeats and the other, stable repeat, (CAG)6, were deleted, so that the LH generated by hybridization of the Rv-type LH probe with the sense strand was assumed to form two loops 15 bp apart.

Fluorescent LH mobility shift technique
A previously described LH protocol [4] was modified as follows. PCR was performed using AccuPrime Taq polymerase (ThermoFisher, Waltham, MA) together with the primer pairs (Table 1) and the DNA template described above, under the following conditions: 94°C for 4 min, followed by 40 cycles of 94°C for 15 s, 55°C for 15 s, and 68°C for 45 s. LH probe (0.6 µl of 200 nM stock) was combined with 4.5 µl of PCR product, and the LH was generated by denaturation at 94°C for 4 min, followed by 55°C for 15 s and 68°C for 4 min. LH product (1.5 µl) was separated on a 10% precast polyacrylamide gel (6 cm long, ATTO Co., Tokyo, Japan) by electrophoresis at 25 mA in Tris-glycine running buffer (37.5 mM Tris, 288 mM glycine) for 30 min. The gel was then stained with SYBR Green I (ThermoFisher) for 8 min and visualized using a laser scanner (STORM860, GE Healthcare, Little Chalfont, UK) at 450 nm excitation with a 520 nm long-pass (LP) filter for SYBR Green I, or at 635 nm excitation with a 650 nm LP filter for Cy5 fluorescence [6].
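For readers who want to program a thermal cycler or check the reagent arithmetic, the conditions above can be written down as data. This is only an illustrative sketch: the variable and function names are hypothetical, the hold times exclude ramping between temperatures, and the probe concentration assumes that the 0.6 µl of probe and 4.5 µl of PCR product are the only components of the hybridization mix.

```python
# Thermal programs transcribed from the text: (temperature in deg C, hold in s).
PCR_INITIAL = [(94, 240)]
PCR_CYCLE = [(94, 15), (55, 15), (68, 45)]
PCR_CYCLES = 40
LH_PROGRAM = [(94, 240), (55, 15), (68, 240)]

def hold_seconds(steps, repeats=1):
    """Sum of programmed hold times; ramping time is not included."""
    return repeats * sum(seconds for _, seconds in steps)

def final_conc_nM(stock_nM, stock_ul, other_ul):
    """Concentration after mixing `stock_ul` of stock with `other_ul` of diluent."""
    return stock_nM * stock_ul / (stock_ul + other_ul)

pcr_minutes = (hold_seconds(PCR_INITIAL)
               + hold_seconds(PCR_CYCLE, PCR_CYCLES)) / 60
lh_minutes = hold_seconds(LH_PROGRAM) / 60
probe_nM = final_conc_nM(200, 0.6, 4.5)

print(f"PCR hold time: {pcr_minutes:.0f} min")
print(f"LH formation hold time: {lh_minutes:.2f} min")
print(f"probe in hybridization mix: {probe_nM:.1f} nM")
```

With these numbers the programmed hold time is 54 min for PCR and a little over 8 min for LH formation, and the probe is present at roughly 24 nM during hybridization.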

Cloning and sequencing
To determine the actual repeat length, PCR products were cloned and sequenced. PCR products were ligated into vector pCR2.1 using a TOPO TA cloning kit (ThermoFisher) and then transformed into One Shot TOP10 electrocompetent Escherichia coli according to the manufacturer's instructions. Plasmid DNA was extracted from 20 colonies using CloneChecker (ThermoFisher) and the DNA was amplified using Phi29 DNA polymerase (TempliPhi, GE Healthcare) for sequencing using a capillary sequencer (3130 Genetic Analyzer, ThermoFisher). After the CAG repeat numbers had been determined, diluted plasmid DNA was used as a template and the segment containing CAG repeats was PCR-amplified with the specified primers (Table 1), and analyzed using the fluorescent LH-MS technique to assign these actual CAG repeat numbers to the relevant LH band positions.
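Assigning a repeat number from a set of sequenced clones can be sketched as follows. This is not the authors' software. It assumes that the polymorphic repeat is the longest uninterrupted run of CAG units in each read, which holds when it is longer than the stable (CAG)6 repeat, and it takes the most frequent value among clones as the allele's repeat number. The function names and the toy sequences, including their flanking bases, are hypothetical.

```python
import re
from collections import Counter

def longest_cag_run(seq):
    """Number of units in the longest uninterrupted (CAG)n tract of `seq`."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

def consensus_repeat(clone_seqs):
    """Most frequent repeat number among clones, plus the full tally."""
    tally = Counter(longest_cag_run(s) for s in clone_seqs)
    return tally.most_common(1)[0][0], tally

def toy_clone(n):
    """Hypothetical read: made-up flanks around a (CAG)n tract."""
    return "ACGT" + "CAG" * n + "CAAGAGACT"

# Toy data: 18 clones with 21 repeats and 2 slippage-like clones with 20.
clones = [toy_clone(21)] * 18 + [toy_clone(20)] * 2
repeat_number, tally = consensus_repeat(clones)
print(repeat_number, dict(tally))
```

Taking the majority value means that a minority of clones with one repeat fewer does not change the assignment.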

CAG length polymorphism in human male DNA
For simplicity, CAG repeat length polymorphism in HUMARA exon 1 was first analyzed in male DNA. PAGE analysis of PCR products from the exon 1 region containing the CAG repeats showed various extra bands in addition to bands of the expected sizes, as is usually observed for amplicons containing highly repeated sequences [8]. The fluorescent LH-MS technique circumvents this problem of complex band patterns caused by short repeats. After hybridization with the fluorescent Fw-type LH-CTG probe, the polymorphic CTG repeats and six neighboring nucleotides (CAG, TTG) looped out from the antisense strand of the LH, and the LH was visualized specifically as a single Cy5-fluorescent band on the gel. Among the 39 male DNA samples examined with the fluorescent LH-MS technique, an array of 15 samples was obtained whose LH bands were positioned in consecutive order, showing a stepwise shift from one LH band to the next (Figure 1A). CAG repeat numbers were determined by cloning and sequencing the PCR products from these male DNA samples. The repeat number in the cloned plasmids was largely consistent, but a small fraction of clones carried one repeat unit fewer, probably owing to polymerase slippage during PCR [9]. Such slippage-derived shorter PCR products were considered to account for the weak-intensity bands accompanying the LH bands of male DNA (Figure 1B). These weak-intensity bands were regarded as irrelevant for HUMARA genotyping.
The LH bands of the arrayed male DNA samples were assigned CAG repeat numbers from 17 to 31 and used as a standard size marker for estimating CAG repeat lengths with the fluorescent LH-MS technique. In the standard array of LH bands, the migration shift from 2n to 2n+1 repeats was always larger than the shift from 2n-1 to 2n when the Fw-type LH-CTG probe was used (Figures 1A and 1C). Conversely, when the Rv-type LH-CAG probe was used, the migration shift from 2n-1 to 2n was larger than the shift from 2n to 2n+1 (Figures 1B and 1C). It follows that, when genotyping female DNA that is heterozygous for alleles differing by one CAG repeat unit, a difference between 2n and 2n+1 can be resolved with the Fw-type LH-CTG probe, whereas a difference between 2n-1 and 2n can be resolved with the Rv-type LH-CAG probe (Figure 1C). Standard LH size ladder markers were prepared separately for even-numbered (18-30) and odd-numbered (17-31) repeats, with either the Fw-type LH-CTG probe or the Rv-type LH-CAG probe.
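The parity rule above can be stated compactly: for two alleles one repeat apart, the Fw-type probe gives the wider separation when the shorter allele is even (2n vs. 2n+1), and the Rv-type probe when the shorter allele is odd (2n-1 vs. 2n). The short sketch below encodes only that rule and the two ladder ranges; the function name is hypothetical and the code is an illustration rather than part of the published method.

```python
# Ladder repeat numbers as described in the text.
EVEN_LADDER = list(range(18, 31, 2))   # 18, 20, ..., 30
ODD_LADDER = list(range(17, 32, 2))    # 17, 19, ..., 31

def probe_for_one_unit_pair(a, b):
    """Probe expected to give the wider band separation for alleles one repeat apart."""
    lo, hi = sorted((a, b))
    if hi - lo != 1:
        raise ValueError("alleles must differ by exactly one repeat unit")
    # Fw-type: wider shift for 2n -> 2n+1 (shorter allele even).
    # Rv-type: wider shift for 2n-1 -> 2n (shorter allele odd).
    return "Fw-type LH-CTG" if lo % 2 == 0 else "Rv-type LH-CAG"

for pair in [(20, 21), (21, 22), (24, 23)]:
    print(pair, "->", probe_for_one_unit_pair(*pair))
```

For alleles that differ by two or more repeats, either probe is expected to separate the bands, so the function deliberately rejects such pairs instead of choosing a probe.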

Genotyping HUMARA alleles in female DNA
In contrast to male samples, most female samples showed two LH bands, which indicated an allelic difference in CAG repeats and were sized using the standard LH size ladder markers (Figure 2A). Heterozygous cases with a difference of one repeat unit also showed two distinct LH bands with either the Fw-type LH-CTG or the Rv-type LH-CAG probe, depending on the CAG repeat composition. Female DNA was considered homozygous only when both probes produced a single LH band (Figure 2B).
In the healthy female Japanese population, 87% (40/46) were heterozygous, of whom 22% (9/40) showed an allelic difference of one repeat unit. Based on the allelic frequency distribution (Figure 2C) for the population examined, including males, the expected rate of homozygosity (10.8%) was close to the observed rate (13%) in the 46 female DNA samples, indicating that nearly 90% of female cases would be informative.
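The expected homozygosity quoted above is conventionally the sum of squared allele frequencies, which is the chance that two independently drawn X chromosomes carry the same repeat number. The sketch below shows that calculation. The per-allele counts behind Figure 2C are not reproduced here, so `toy_counts` is a hypothetical distribution used only to exercise the function, and all names are illustrative.

```python
def expected_homozygosity(allele_counts):
    """Sum of squared allele frequencies (random-mating expectation)."""
    total = sum(allele_counts.values())
    return sum((n / total) ** 2 for n in allele_counts.values())

# Chromosomes sampled in the text: one X per male, two per female.
n_chromosomes = 39 * 1 + 46 * 2

# Observed rates reported in the text.
observed_heterozygous = 40 / 46
observed_homozygous = 1 - observed_heterozygous

# HYPOTHETICAL counts, not the Figure 2C data: repeat number -> chromosomes.
toy_counts = {20: 10, 21: 20, 22: 30, 23: 25, 24: 15}
toy_expected = expected_homozygosity(toy_counts)

print(f"chromosomes sampled: {n_chromosomes}")
print(f"observed homozygosity: {observed_homozygous:.1%}")
print(f"toy expected homozygosity: {toy_expected:.1%}")
```

The toy distribution has only five alleles and therefore gives a much higher expected homozygosity than the 10.8% reported for the 15-allele range observed in this study; the more evenly the alleles are spread, the lower the sum of squares.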

A clonality test of female tumor DNA
Because one X chromosome is randomly inactivated in female cells during early embryogenesis [10,11], the methylated CpG sites near the polymorphic CAG site of HUMARA on the inactive X chromosome of the incipient tumor cell may be maintained and passed on to the replicating tumor cells in female patients. In contrast, infiltrating macrophages and lymphocytes may be randomly methylated at the same locus. Female tumor DNA with informative heterozygous CAG repeats in HUMARA was digested with the methylation-sensitive restriction enzyme HpaII, and the undigested, methylated allele was then amplified by PCR and analyzed with the fluorescent LH-MS technique. In the representative female cases shown, one of the two LH bands remained undigested by HpaII, indicating non-random methylation of these alleles and a putative clonal origin of the tumor cells (Figures 3A and 3B).
In one male tumor DNA sample, the single LH band was lost following HpaII digestion, consistent with a single active, unmethylated X chromosome. However, another male tumor DNA sample showed an LH band that was only partially digested following HpaII treatment (Figure 3B). This may reflect an alteration at the restriction sites in a subpopulation of the tumor cells.
The female tumor DNA in which HpaII digested only one allele was also treated with the methylation-insensitive enzyme MspI. Unexpectedly, the same HpaII-insensitive allele was also MspI-insensitive (Figures 3B and 3C), although the LH band appeared less intense, probably owing to partial digestion. These results may suggest unusual methylation at the restriction sites in a subpopulation of the tumor cells, making one allele insensitive to both HpaII and MspI. Since the DNA after PCR amplification was completely digested by MspI (data not shown), genetic alterations at the sites can probably be excluded.
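The way the HpaII lanes are read in this section can be summarized as a small decision rule. The sketch below is a deliberate simplification and not the authors' scoring procedure: it treats each lane as the set of repeat numbers seen as LH bands and ignores band intensity, so partial digestion, as seen in one male sample, is not represented. The function name and category labels are hypothetical.

```python
def interpret_hpaii(untreated, hpaii):
    """Classify an HpaII result from the sets of repeat numbers seen as LH bands."""
    untreated, hpaii = set(untreated), set(hpaii)
    if len(untreated) < 2:
        # A single band (male, or homozygous female) cannot show allelic skewing.
        return "uninformative"
    if len(hpaii) == 1 and hpaii <= untreated:
        # One allele survives digestion: non-random methylation.
        return "putative clonal"
    if hpaii == untreated:
        # Both alleles survive: methylation appears random across cells.
        return "no evidence of clonality"
    return "unclassified"

print(interpret_hpaii({21, 24}, {24}))      # heterozygous, one band retained
print(interpret_hpaii({21, 24}, {21, 24}))  # heterozygous, both bands retained
print(interpret_hpaii({22}, set()))         # single allele, band lost
```

A scoring scheme for real gels would need a quantitative threshold on band intensity, which the text does not specify and which is therefore left out here.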

Discussion
For each male DNA sample, a unique LH band was observed that apparently represented one of the highly polymorphic CAG repeat alleles. The LH band positions shifted regularly as the CAG repeat length in the loop of the LH increased through the odd numbers from 17 to 31 or through the even numbers from 18 to 30. However, comparison of the odd- and even-numbered repeat ladders showed that the LH bands for even-numbered repeats did not lie midway between the neighboring LH bands for odd-numbered repeats but were displaced from the middle. Loops of CAG repeats formed with the Rv-type LH-CAG probe (and loops of CTG repeats formed with the Fw-type LH-CTG probe) may generate rod-like structures through hydrogen bonding between G and C residues situated periodically along the loop. At their distal ends, the rod-like loops of even-numbered repeats may take a slightly different form from those of odd-numbered repeats [12]. This may account for the regularly shifted migration patterns of the LH bands formed by even-numbered and by odd-numbered repeats, which are slightly displaced relative to each other.
Polymerase slippage occurs in repeated sequences at rates that depend on the nature of the polymerase [13]. In the present study, the extent of slippage-induced variation during PCR of the CAG repeats appears to be low, given the low intensity of the single extra band consistently observed alongside the major LH band for male DNA. These less intense bands, probably due to polymerase slippage during PCR, may represent an artifact. Among the LH bands produced with the Fw-type LH-CTG probe, the less intense bands were detectable for odd-numbered CAG repeats because the LH bands for 2n repeats are widely separated from those for 2n+1 repeats. The accompanying bands for even-numbered CAG repeats would be obscured by the narrower separation between the LH bands for 2n-1 and 2n repeats. Similarly, among the LH bands produced with the Rv-type LH-CAG probe, the less intense accompanying bands were detectable for even-numbered CAG repeats, whereas they would be obscured for odd-numbered repeats. Despite these weak accompanying bands, the allelic status of heterozygous normal female DNA could be assessed, even for alleles differing by only one repeat unit. However, caution may be required when analyzing tumor cells with microsatellite instability caused by mismatch repair defects [14], since such tumor cells may develop multiple alleles through variation in repeat length during proliferation within the tumor population.
One of the two alleles at the HUMARA locus in the female tumor DNA examined appeared to be undigested by both HpaII and MspI, at least in a subpopulation of the tumor cells. This observation suggests that methylation in the tumor DNA might not be restricted to the internal C residue of CCGG but could also occur at the external C residue of the restriction sites in the amplicon, making these sites insensitive to both HpaII and MspI [15]. For HpaII treatment, DNA needs to be derived from pure tumor cells, because DNA from infiltrating lymphocytes or fibroblasts could increase ambiguity owing to the random X-chromosome methylation profiles of these non-tumor cells. Since the CRC tumor DNA samples used in the present study were derived from frozen tissue blocks, the exact proportion of tumor cells was unknown. Tumor DNA isolated from tissue sections by laser capture microdissection, or similar technologies, should therefore be used in future studies of tumor clonality with the HUMARA assay and fluorescent LH-MS technique described here.
In conclusion, the fluorescent LH-MS technique allows CAG repeat length polymorphism in HUMARA to be determined precisely over the range of 17-31 repeats, down to a difference of one repeat unit. Furthermore, the use of methylation-sensitive restriction enzymes together with the fluorescent LH-MS technique in HUMARA analysis provides a simple technology for clonality testing of various tumors in almost 90% of female patients. The fluorescent LH-MS technique could also be extended to forensic or laboratory analyses for the identification and discrimination of human DNA samples.