ISSN: 2161-1041
Hereditary Genetics: Current Research

Research Note: Evaluation of Resequencing Technologies Parameters for CNV Genotyping

Sergio Ivan Román-Ponce1,2,3*, Alessandro Bagnato2 and Theo Meuwissen3

1Centro Nacional de Investigación en Fisiología y Mejoramiento Animal, Instituto Nacional de Investigaciones Forestales Agrícolas y Pecuarias, México

2Università degli Studi di Milano. Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza Alimentare, Via Celoria 10. 20133 Milano, Italia

3Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Norway

*Corresponding Author:
Dr. Sergio Ivan Roman Ponce
Centro Nacional de Investigación en
Fisiología y Mejoramiento Animal, Instituto
Nacional de Investigaciones Forestales
Agrícolas y Pecuarias, México
Tel: 01 800 088 22 22 0 (55) 38 71 87 00

Received date: March 16, 2016; Accepted date: July 15, 2016; Published date: July 17, 2016

Citation: Román-Ponce SI, Bagnato A, Meuwissen T (2016) Research Note: Evaluation of Resequencing Technologies Parameters for CNV Genotyping. Hereditary Genet 5:170. doi:10.4172/2161-1041.1000170

Copyright: © 2016 Ponce SIR, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Abstract

Whole genome (re)sequencing provides new opportunities to discover Copy Number Variation (CNV) in the genome. Due to the continuous reduction in sequencing costs, it has become the principal methodology to detect CNV in livestock. One parameter that increases the genotyping cost is the depth of coverage during sequencing. The main aim of this note was to assess how CNV identification varies with the depth of coverage and the read-length used in genome sequencing. The results indicate that sequences obtained with short read-lengths require less depth of coverage than those obtained with long read-lengths. In addition, small CNV require deeper coverage to be detected. These results can reduce discovery and genotyping costs, since sequencing technologies with short read-lengths are often less costly. Finally, a general formula was derived to optimize the sequencing costs.

Keywords

Copy number variation; Depth of coverage; Livestock; Read-length

Introduction

Copy Number Variation (CNV) represents a significant source of genetic diversity in mammals, covering ~12% of the genome [1], and it has been shown to be associated with phenotypes (diseases/traits) in humans [2]. Next-Generation Sequencing (NGS) technology allows for whole genome (re)sequencing at very low costs per sequence and provides a wealth of information to tackle genetic problems, such as the identification of the molecular basis of complex traits that are difficult to study with conventional approaches [3]. To discover (detect, validate and characterize) or genotype CNV across the whole genome, array Comparative Genomic Hybridization (aCGH) has so far been the most widely used technique. In aCGH experiments, genomic DNA samples are co-hybridized on the same oligonucleotide array, and the differences in genomic variation relative to the reference sample lead to CNV detection [4].

Currently, some studies that identified CNV using aCGH in cattle [5,6], chicken [7], swine [8] and goat [9] are available. The sequencing effort and its cost represent an important limit to the identification of CNV in livestock populations. One of the parameters that strongly affects the genotyping cost is the coverage of the sequencing. The main aim of this note is to assess the effects of the depth of coverage (X) and the read-length of the sequencer (RL) on the accuracy of the estimate of the number of copies present in a CNV. The parameter values considered are intended to represent the most common resequencing technologies available.

First, we need to know the number of reads (Nr) of the sequences, which is calculated as

$$N_r = \frac{X \cdot L_g}{RL}$$

where X is the depth of coverage, Lg is the genome length and RL is the read-length of the sequencer.

The number of reads that fall within a CNV (K) is a function of the number of tandem repeats (copies) within the CNV (n), of the size of each copy of the CNV (S), of the RL and of the Lg, and it is calculated as follows:

$$K = N_r \cdot \frac{n \cdot S - RL}{L_g} \quad (1)$$
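As a rough numerical check, the sketch below assumes that Nr = X·Lg/RL and that equation (1) takes the form K = Nr·(n·S − RL)/Lg. These forms are assumptions inferred from the definitions above; for n = 1 they reproduce the formula-based counts listed in Table 1. The function names are illustrative only.

```python
def number_of_reads(coverage, genome_length, read_length):
    """Nr = X * Lg / RL: reads needed to cover the genome X times."""
    return coverage * genome_length / read_length


def expected_reads_in_cnv(coverage, genome_length, read_length,
                          copy_size, n_copies=1):
    """Assumed form of equation (1): K = Nr * (n*S - RL) / Lg."""
    n_reads = number_of_reads(coverage, genome_length, read_length)
    return n_reads * (n_copies * copy_size - read_length) / genome_length


LG = 2_344_000_000  # bovine genome length in bp, as given in the text

for rl in (30, 90, 150, 300):
    k = expected_reads_in_cnv(coverage=10, genome_length=LG,
                              read_length=rl, copy_size=1_600)
    print(f"RL={rl:>3} bp, X=10, S=1.6 kb -> K = {k:.1f}")
```

Under these assumptions Lg cancels out, so that K = X·(n·S − RL)/RL: the expected count depends only on the coverage, the read-length and the length of the CNV region.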

Assuming that K follows a Poisson distribution (Klambauer), the variance of K is

$$\mathrm{Var}(K) = K \quad (2)$$

Finally, the coefficient of variation of the number of counts (CV) is:

$$CV(K) = \frac{\sqrt{\mathrm{Var}(K)}}{K} = \frac{1}{\sqrt{K}} \quad (3)$$

which is used as a measure of the accuracy of the estimates of K.
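The following minimal sketch tabulates the coefficient of variation for the read-lengths and coverages considered in this note. It assumes a Poisson count (so that the variance equals the mean and CV = 1/√K) and an expected count of K = X·(n·S − RL)/RL; both are assumptions of this illustration, and `cv_percent` is a hypothetical helper name.

```python
import math


def cv_percent(coverage, read_length, copy_size, n_copies=1):
    """CV(K) in percent, assuming Var(K) = K (Poisson), so CV = 1/sqrt(K)."""
    # Assumed expected count: K = X * (n*S - RL) / RL.
    k = coverage * (n_copies * copy_size - read_length) / read_length
    return 100.0 / math.sqrt(k)


copy_size = 1_600  # bp, the smallest CNV size used in the in silico test
for rl in (30, 90, 150, 300):
    row = ", ".join(f"X={x}: {cv_percent(x, rl, copy_size):5.2f}%"
                    for x in (10, 20, 30))
    print(f"RL={rl:>3} bp  {row}")
```

Because K grows with X and shrinks as RL grows, the printed CV falls along each row and rises down each column.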

The input parameters used to assess the accuracy of K were: the length of the bovine genome (Lg = 2,344 megabases) as reported previously [10], the CNV size (S = 1 to 200 kb) according to the results of Fadista [5], the read-length (RL = 30, 90, 150 and 300 bp) and the depth of coverage of the sequence (X = 10, 20 and 30).

To evaluate the number of times that random fragments with the size of one read-length (RL = 30, 90, 150 and 300 bp) fell inside one CNV, three different CNV sizes were considered: 1.6, 105.5 and 220.1 kb per copy. The fragments were extracted randomly in silico from the bovine genome sequence, and their number corresponds to the number of reads of the genome. The proportion of fragments inside a CNV (E) was estimated as follows:

$$E = \frac{F_{CNV}}{N_r}$$

where FCNV represents the number of counts of read fragments inside a CNV and Nr is the number of reads.
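The in silico sampling can be sketched as follows. This is a simplified illustration, not the procedure actually used in the study: it assumes that read start positions are drawn uniformly along a linear genome and that a read counts only when it lies entirely inside the CNV. The genome and CNV sizes are toy values chosen to run quickly, and all names are hypothetical.

```python
import random


def simulate_reads_in_cnv(genome_length, read_length, coverage,
                          cnv_start, cnv_length, seed=0):
    """Count uniformly placed reads that fall entirely inside one CNV region."""
    rng = random.Random(seed)
    n_reads = coverage * genome_length // read_length  # Nr = X * Lg / RL
    last_start = genome_length - read_length           # last valid read start
    cnv_last_start = cnv_start + cnv_length - read_length
    inside = 0
    for _ in range(n_reads):
        start = rng.randint(0, last_start)  # inclusive on both ends
        if cnv_start <= start <= cnv_last_start:
            inside += 1
    return inside, n_reads


# Toy values (hypothetical; not the bovine genome).
f_cnv, n_reads = simulate_reads_in_cnv(
    genome_length=1_000_000, read_length=100, coverage=10,
    cnv_start=400_000, cnv_length=10_000,
)
proportion = f_cnv / n_reads
print(f"F_CNV = {f_cnv}, Nr = {n_reads}, E = {proportion:.4%}")
```

With these toy values the count should land close to X·(S − RL)/RL = 990, up to sampling noise.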

The coefficient of variation of K is a function of the read-length and of the depth of coverage. The CV(K) decreases with shorter reads and deeper coverage, as shown in Figure 1.


Figure 1: Coefficient of variation (%) estimated for the detection of Copy Number Variation through whole genome sequencing with four read-lengths (RL) and three depths of coverage (X).

Additionally, when the CNV length increases the coefficient of variation decreases, independently of the depth of coverage and of the read-lengths tested here. The number of fragments included in a CNV extracted in silico differs only marginally from the prediction given by formula (1) (Table 1).

| RLa (bp) | Coverage (X) | Nrb (n) | FCNVc 1.6 kb | FCNVc 105.5 kb | FCNVc 220.1 kb | FCNVd 1.6 kb | FCNVd 105.5 kb | FCNVd 220.1 kb |
|---|---|---|---|---|---|---|---|---|
| 30 | 10 | 47,139,530 | 528 | 35,096 | 73,463 | 523.3 | 35,156.7 | 73,356.7 |
| 30 | 20 | 94,279,060 | 1,036 | 70,387 | 147,046 | 1,046.7 | 70,313.3 | 146,713.3 |
| 30 | 30 | 141,418,590 | 1,585 | 105,490 | 220,616 | 1,570.0 | 105,470.0 | 220,070.0 |
| 90 | 10 | 15,713,180 | 166 | 11,781 | 24,192 | 167.8 | 11,712.2 | 24,445.6 |
| 90 | 20 | 31,426,360 | 317 | 23,555 | 48,735 | 335.6 | 23,424.4 | 48,891.1 |
| 90 | 30 | 47,139,540 | 484 | 35,386 | 73,085 | 503.3 | 35,136.7 | 73,336.7 |
| 150 | 10 | 9,427,910 | 121 | 6,902 | 14,638 | 96.7 | 7,023.3 | 14,663.3 |
| 150 | 20 | 18,855,820 | 221 | 12,887 | 29,037 | 193.3 | 14,046.7 | 29,326.7 |
| 150 | 30 | 28,283,730 | 336 | 20,842 | 43,743 | 290.0 | 21,070.0 | 43,990.0 |
| 300 | 10 | 4,713,950 | 50 | 3,533 | 7,255 | 43.3 | 3,506.7 | 7,326.7 |
| 300 | 20 | 9,427,900 | 95 | 7,072 | 14,456 | 86.7 | 7,013.3 | 14,653.3 |
| 300 | 30 | 14,141,850 | 144 | 10,563 | 21,859 | 130.0 | 10,520.0 | 21,980.0 |

Table 1: Estimated number of copies present in one copy number variation (CNV) when varying the read-length, the depth of coverage and the size of the CNV.

The in silico experiment was repeated and the results did not change because the read-length was a constant. The proportions of reads inside a small (1.6 kb), a medium (105.5 kb) or a large (220.1 kb) CNV (0.001%, 0.075% and 0.155%, respectively) were the same at 10X, 20X and 30X, which shows that the depth of coverage did not affect the expected copy number estimate of the genotyping, only the accuracy of this estimate.
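The stability of these proportions across coverages can be checked directly from the counts reported in Table 1. The short sketch below copies the RL = 30 bp rows from the table and computes the proportion of in silico fragments inside each CNV for every coverage; the variable names are illustrative.

```python
# In silico counts for RL = 30 bp, copied from Table 1:
# coverage -> (number of reads, counts for the 1.6, 105.5 and 220.1 kb CNV)
table_rl30 = {
    10: (47_139_530, (528, 35_096, 73_463)),
    20: (94_279_060, (1_036, 70_387, 147_046)),
    30: (141_418_590, (1_585, 105_490, 220_616)),
}
cnv_sizes_kb = (1.6, 105.5, 220.1)

# Proportion of fragments inside each CNV, per coverage.
proportions = {
    x: [count / n_reads for count in counts]
    for x, (n_reads, counts) in table_rl30.items()
}
for x, values in proportions.items():
    cells = ", ".join(f"{s} kb: {e:.4%}" for s, e in zip(cnv_sizes_kb, values))
    print(f"{x}X  {cells}")
```

Counts and number of reads both scale with X, so their ratio stays essentially constant for a given CNV size, while the absolute count (and hence the precision of the estimate) increases with coverage.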

In cattle, the average size of CNV is 72.3 kb, with a median of 16.7 kb (Min = 1.7 kb; Max = 2,031 kb) [5]. The detection and genotyping of CNV by sequencing depends on the read-length of the sequencer and the size of the CNV. Accounting for these parameters is necessary to determine the required depth of coverage in order to minimize the cost of genotyping by whole-genome or targeted (re)sequencing. If whole genome (re)sequencing is used, deep coverage is recommended to permit accurate genotyping of even the smallest CNV. However, when a certain region is sequenced to detect one or several CNV, formula (1) can be used to optimize the depth of coverage and increase the accuracy of the CNV genotyping. Its application is not restricted to cattle; it can also be used in other organisms where the state of knowledge is less advanced, in order to optimize the economic effort.
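One way to use the formula for planning is to invert it and ask which depth of coverage is needed to reach a target coefficient of variation. The sketch below assumes a Poisson count with CV = 1/√K and K = X·(n·S − RL)/RL, which gives X = RL/(CV²·(n·S − RL)). The 5% target and the function name are hypothetical choices for illustration; the CNV sizes are the minimum, median and mean reported for cattle.

```python
def required_coverage(target_cv, read_length, copy_size, n_copies=1):
    """Depth X at which CV(K) equals target_cv (target_cv given as a fraction)."""
    effective_length = n_copies * copy_size - read_length
    if effective_length <= 0:
        raise ValueError("read length must be shorter than the CNV region")
    # CV = 1/sqrt(K) and K = X * (n*S - RL) / RL  =>  X = RL / (CV^2 * (n*S - RL))
    return read_length / (target_cv ** 2 * effective_length)


# Hypothetical target: a CV of 5% on the read count.
for size_bp in (1_700, 16_700, 72_300):  # min, median and mean CNV size in cattle
    for rl in (30, 300):
        x = required_coverage(0.05, rl, size_bp)
        print(f"S={size_bp / 1000:>5.1f} kb, RL={rl:>3} bp -> X >= {x:.2f}")
```

Under these assumptions the required depth grows with the read-length and shrinks with the CNV size, which is consistent with the recommendation of deeper coverage for the smallest CNV.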

An inherent problem of NGS data is the considerable read-mapping ambiguity [11]. Several methods to detect CNV are based on read depths, which are assumed to follow a Poisson distribution. Recently, several completely sequenced genomes were examined, and the Poisson distribution assumption was found to be violated by some NGS technologies [12]. Despite this, the results of this study show that sequences obtained with shorter read-lengths require less depth of coverage, and that deeper coverage is required when small CNV are searched for. Based on this conclusion, the advantage for the scientific community is that technologies with shorter read-lengths also tend to be less costly.

Acknowledgment

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 222664 (“Quantomics”).

Disclaimer

This Publication reflects only the author’s views and the European Community is not liable for any use that may be made of the information contained herein.

References
