ISSN: 2153-0602
Journal of Data Mining in Genomics & Proteomics

Measuring Inequalities in Gene Co-expression Networks of HIV-1 Infection Using the Lorenz Curve and Gini Coefficient

Chuang Ma1,2, Sheng-He Huang1,3* and Yanhong Zhou3

1Saban Research Institute of Children’s Hospital Los Angeles and the University of Southern California, Los Angeles, USA

2School of Plant Sciences, University of Arizona, Tucson, AZ, USA

3Hubei Bioinformatics and Molecular Imaging Key Laboratory, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China

*Corresponding Author:
Sheng-He Huang
Saban Research Institute of Children’s Hospital Los Angeles
and the University of Southern California
Los Angeles, USA
Tel: 213-440-2528
E-mail: [email protected]

Received date: December 02, 2013; Accepted date: January 27, 2014; Published date: January 30, 2014

Citation: Ma C, Huang SH, Zhou Y (2014) Measuring Inequalities in Gene Coexpression Networks of HIV-1 Infection Using the Lorenz Curve and Gini Coefficient. J Data Mining Genomics Proteomics 5:148. doi: 10.4172/2153-0602.1000148

Copyright: © 2014 Ma C, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


The Gini methodology is a family of mathematical models that describe various relations within or between variables [1,2]. Its basic concept is the Gini coefficient (also known as the Gini index or Gini ratio), which measures the inequality of a distribution (e.g., income) on a scale from 0 (complete equality) to 1 (complete inequality) and has been widely used in economics to quantify income inequality within a country [3,4]. Because it can analyze data with both normalized and non-normalized distributions [2], the Gini coefficient and its derived statistical algorithms have been applied in disciplines as diverse as social science, chemistry and engineering. Recently, the Gini methodology has also been introduced to biology for inferring transcriptional regulatory relationships from gene expression data [5] and for exploring the symbiosis and pathogenesis of human immunodeficiency virus type 1 (HIV-1) infection [6].

HIV-1 is the virus that can cause acquired immunodeficiency syndrome (AIDS), which leads to thousands of deaths per year worldwide because effective vaccines and a cure are lacking. As a powerful systems biology approach, gene co-expression networks (GCNs) have recently been applied to investigate the molecular mechanisms of HIV-1 infection by organizing genes into a network in which two genes with similar expression patterns are connected by an edge [6-8]. An in-depth statistical analysis of HIV-related network properties should help to discover new biomarkers and signatures of HIV-1 infection. Here we applied the Gini methodology to explore inequalities in GCNs constructed with 943 genes differentially expressed in human lymphatic tissues of uninfected subjects and of infected patients at different stages of HIV-1 infection (the acute, the asymptomatic, and the AIDS stages). More details about the microarray data generation and normalization, and about the selection of differentially expressed genes, can be found in Xu et al. [9]. To construct the GCNs, the similarity of expression patterns between two genes was measured with the Pearson correlation coefficient (PCC). Two genes were connected in a GCN if the significance level (p-value) of their PCC was lower than 0.05. The p-values were estimated with a permutation method that shuffles the gene expression data in the microarray dataset.
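As an illustration only, the network construction described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`pearson`, `permutation_pvalue`, `build_gcn`), the two-sided test, the choice to shuffle the sample order of one gene, and the number of permutations are hypothetical choices that the text does not specify.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(x, y, n_perm=200, rng=None):
    """Two-sided permutation p-value for the PCC of x and y.

    Assumption: the null distribution is built by shuffling the sample
    order of y; the source only says the expression data were shuffled.
    """
    rng = rng or random.Random(0)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so ties with the observed value count as hits.
        if abs(pearson(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def build_gcn(expression, alpha=0.05, n_perm=200, seed=0):
    """Return {(gene_a, gene_b): PCC} for pairs whose p-value is below alpha.

    `expression` maps a gene identifier to its expression values across
    samples. The sign of the stored PCC separates positive from negative
    co-expression edges.
    """
    rng = random.Random(seed)
    genes = sorted(expression)
    edges = {}
    for a in range(len(genes)):
        for b in range(a + 1, len(genes)):
            x, y = expression[genes[a]], expression[genes[b]]
            if permutation_pvalue(x, y, n_perm, rng) < alpha:
                edges[(genes[a], genes[b])] = pearson(x, y)
    return edges
```

The pairwise loop is quadratic in the number of genes, so a full run over 943 genes with many permutations would be slow in pure Python; the sketch favors readability over speed.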

Application of Gini Coefficient to Quantify the Connectivity Inequality (CI) in the GCN

Although the connectivity distributions give some insight into how genes are connected in the GCN (Figure 1A), they fail to characterize the connectivity of the whole network quantitatively, which makes it difficult to compare two GCNs constructed for different biological conditions. Here the connectivity inequality (CI) is introduced to summarize the distribution of gene connectivity across the whole network using the Gini coefficient algorithm. The CI can be represented graphically with the Lorenz curve, a two-dimensional plot of the cumulative fraction p of the number of genes in the network versus the cumulative fraction L(p) of the total connectivity contributed by these genes. The closer the Lorenz curve lies to the diagonal line, the more equally the genes in the network are connected. The Gini coefficient is equal to one minus twice the area under the Lorenz curve, and can be computed with the formula [10]:


Figure 1: The connectivity inequality in the gene co-expression networks (GCNs) of HIV-1 infection.
(A) The connectivity distributions of GCNs for uninfected subjects and patients at different stages of HIV-1 infection.
(B) The Lorenz curves of the connectivity distributions in the four GCNs. p represents the cumulative fraction of the number of genes in the network; L(p) denotes the cumulative fraction of connectivity from these genes.
(C) Gini coefficient of connectivity in the four GCNs.
(D) Gini share of positive and negative connectivity in the four GCNs.
(E) Gini correlation of positive and negative connectivity in the four GCNs.

$$G = \frac{1}{n}\left(n + 1 - 2\,\frac{\sum_{i=1}^{n}(n+1-i)\,X_{(i)}}{\sum_{i=1}^{n}X_{(i)}}\right),$$

where n is the number of genes in the network and X(i) is the ith connectivity value sorted in increasing order, 0 ≤ X(1) ≤ X(2) ≤ … ≤ X(n). We observed that the Lorenz curve of the GCN at the AIDS stage deviates more markedly from the diagonal line than those of the other three GCNs (Figure 1B). At the same time, the Gini coefficient of the GCN at the AIDS stage is much higher than those of the other three GCNs (Figure 1C). These results indicate dramatic changes in transcriptional regulation at the last stage of HIV infection.
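The Lorenz curve and Gini coefficient described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the closed-form expression in `gini_sorted` is a standard formula over sorted values that is assumed here because it agrees with "one minus twice the area under the Lorenz curve" when the curve is piecewise linear.

```python
def lorenz_curve(connectivity):
    """Return Lorenz curve points (p, L(p)) for non-negative connectivity values.

    p is the cumulative fraction of genes (sorted by increasing connectivity)
    and L(p) is the cumulative fraction of total connectivity they hold.
    """
    values = sorted(connectivity)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one gene and a positive total connectivity")
    points = [(0.0, 0.0)]
    running = 0.0
    for i, value in enumerate(values, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def gini_from_lorenz(points):
    """Gini coefficient as 1 - 2 * (trapezoidal area under the Lorenz curve)."""
    area = 0.0
    for (p0, l0), (p1, l1) in zip(points, points[1:]):
        area += (p1 - p0) * (l0 + l1) / 2.0
    return 1.0 - 2.0 * area


def gini_sorted(connectivity):
    """Gini coefficient from sorted values X(1) <= ... <= X(n).

    Uses G = (n + 1 - 2 * sum((n + 1 - i) * X(i)) / sum(X(i))) / n,
    an assumed standard form that equals gini_from_lorenz on the same data.
    """
    values = sorted(connectivity)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one gene and a positive total connectivity")
    weighted = sum((n + 1 - i) * value for i, value in enumerate(values, start=1))
    return (n + 1 - 2.0 * weighted / total) / n
```

With a finite number of genes the largest attainable value is (n − 1)/n rather than exactly 1, which is worth keeping in mind when comparing networks of very different sizes.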

Application of Gini Coefficient to Estimate the Contribution of Positive and Negative Connectivity to the Connectivity Inequality (CI)

In the GCN, the connectivity of a gene is composed of positive and negative connectivity, which represent its connections to other genes with positive and negative PCC values, respectively. The contribution of positive and negative connectivity to the overall inequality of connectivity in the network is defined based on the decomposition of the Gini coefficient (1):

$$CI = S_p\,CI_p\,\tau(X_p, X) + S_n\,CI_n\,\tau(X_n, X),$$

where CIp and CIn are the inequalities of positive and negative connectivity, respectively. Sp and Sn (Sn = 1 − Sp) are two Gini share measures that represent the percentages of positive and negative connections in the whole network, respectively. τ(Xp, X) and τ(Xn, X) are two Gini correlation coefficients ranging from -1 to 1 that indicate the contribution of positive and negative connectivity to the CI, respectively. As shown in Figure 1D, the Gini share of positive connectivity in all four networks is remarkably higher than that of negative connectivity, indicating that positive regulation is the dominant relation in the networks for uninfected subjects and for patients at different stages of HIV-1 infection. Interestingly, positive regulation was enhanced at the first two stages of HIV-1 infection, whereas negative regulation was enhanced at the AIDS stage. From the uninfected state to the AIDS stage, the Gini correlation of negative connectivity changes more significantly than that of positive connectivity (Figure 1E), indicating that positive and negative co-expression associations might play different roles in the pathogenesis of HIV infection.
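The decomposition into share, component inequality and Gini correlation can be sketched numerically. The code below is a hedged illustration that assumes the standard covariance-based decomposition by source (often attributed to Lerman and Yitzhaki), in which the Gini coefficient is 2·cov(X, F(X))/mean(X) and F is the rank-based empirical distribution. The function names and the use of average ranks for ties are assumptions, not details given in the text.

```python
def average_ranks(values):
    """Ranks 1..n, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def empirical_cdf(values):
    """Rank-based empirical distribution F evaluated at each observation."""
    n = len(values)
    return [r / n for r in average_ranks(values)]


def gini_cov(values):
    """Gini coefficient as 2 * cov(X, F(X)) / mean(X)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return 2.0 * covariance(values, empirical_cdf(values)) / mean


def gini_correlation(component, total):
    """tau(X_k, X) = cov(X_k, F(X)) / cov(X_k, F(X_k))."""
    denominator = covariance(component, empirical_cdf(component))
    if denominator == 0:
        return 0.0  # a constant component has no defined Gini correlation
    return covariance(component, empirical_cdf(total)) / denominator


def decompose_ci(positive, negative):
    """Split CI of total connectivity into positive and negative parts.

    `positive` and `negative` hold, per gene, the number of edges with
    positive and negative PCC. Returns (CI, parts), where each part
    stores share S, component Gini, tau, and the product S * Gini * tau.
    """
    total = [p + q for p, q in zip(positive, negative)]
    mean_total = sum(total) / len(total)
    parts = {}
    for name, component in (("positive", positive), ("negative", negative)):
        share = (sum(component) / len(component)) / mean_total
        g = gini_cov(component)
        tau = gini_correlation(component, total)
        parts[name] = {"share": share, "gini": g, "tau": tau,
                       "contribution": share * g * tau}
    return gini_cov(total), parts
```

Under these definitions the two contributions add up to the CI of the total connectivity exactly, which gives a convenient sanity check on any implementation.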

Application of Gini Coefficient to Measure the Inequality of Edge Weights in GCNs

Besides the connectivity, the edge weights (i.e., correlation values) in the GCNs also changed during HIV-1 infection. For a given gene i, the change in correlation strength can be calculated using the differential co-expression (dC) measure with the formula [12]:

$$dC_i = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(x_{ij} - y_{ij}\right)^2},$$

where gene i connects to m genes in the two networks, and $x_{ij}$ and $y_{ij}$ represent the correlation values between genes i and j in the two networks, respectively. In this study, we observed differences in the inequality of edge weights between the GCNs of HIV-1 infection (Figure 2). At the acute and asymptomatic stages of HIV-1 infection, the edge weights are more equal than those in the network for uninfected subjects. However, the edge weights become dramatically unequal in the network for patients at the AIDS stage (Figure 2).

On this basis, a novel measure, "delta Gini", was introduced to capture the difference in the inequality of edge weights between two networks. Although delta Gini and dC were significantly correlated in most network comparisons (except AIDS vs. Uninfected) (Figure 3), delta Gini provided additional information about the changes of edge weights between two networks. First, delta Gini ranges from -1 to 1, with positive values indicating that the inequality of edge weights has increased and negative values indicating that it has decreased. Second, delta Gini is useful for identifying candidate biomarkers of HIV-1 that rank low by dC. For instance, MRC1 is a mannose receptor that interacts with several HIV proteins to promote viral spread [13-15]; it has a delta Gini value of -0.44 (rank=2) and a dC value of 0.96 (rank=173) when comparing the networks constructed for patients at the AIDS stage and for uninfected subjects. Similarly, PPFIBP1, which plays roles in HIV-1 replication, also has a high rank of delta Gini (value=-0.42; rank=3) but a low rank of dC (value=0.89; rank=283). MDM4 is another representative example, showing a positive and high-ranked delta Gini (value=0.36; rank=22) but a low-ranked dC (value=0.92; rank=235). This gene was recently demonstrated to be a direct calpain substrate that plays roles in HIV-induced neuronal damage [16]. The detailed values of delta Gini and dC for all comparisons of GCNs of HIV-1 infection are listed in Supplemental Table 1.
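To make the contrast between the two measures concrete, the sketch below computes both for one gene. It rests on assumptions that the text does not confirm: dC is taken as the root mean square difference of the correlation values over the m shared neighbors, and delta Gini is taken as the Gini coefficient of the absolute edge weights in the second network minus that in the first. The function names are hypothetical.

```python
import math


def gini(values):
    """Gini coefficient of non-negative values (0.0 if they sum to zero)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((n + 1 - i) * v for i, v in enumerate(ordered, start=1))
    return (n + 1 - 2.0 * weighted / total) / n


def differential_coexpression(x, y):
    """Assumed dC: root mean square difference of correlation values.

    x[j] and y[j] are the correlations between the gene of interest and its
    j-th neighbor in the first and second network, respectively.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def delta_gini(x, y):
    """Assumed delta Gini: inequality of |weights| in network 2 minus network 1.

    Positive values mean the edge weights became more unequal; the result
    lies between -1 and 1 because each Gini coefficient lies in [0, 1).
    """
    first = gini([abs(a) for a in x])
    second = gini([abs(b) for b in y])
    return second - first
```

Under these assumptions the two measures respond to different things: dC grows with the size of the pairwise changes, whereas delta Gini reacts to how the weights are redistributed among neighbors, so a gene can rank very differently on each.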

No. Uninfected/Acute Uninfected/Asympt Uninfected/AIDS Acute/Asympt Acute/AIDS Asympt/AIDS
1 NM_002604 NM_006275 BE856807 Z00008 N45309 AU118882
2 J04162 BC025250 NM_002438 AF234255 NM_002612 U52913
3 AW169973 AI743534 NM_003622 NM_014668 AL554245 NM_002612
4 AB040957 AW169973 AL574194 BE886225 AI803181 NM_003622
5 AI760944 AK055572 BF592034 AW294722 S67238 AK095698
6 AI650364 NM_002759 NM_016164 L21961 AK095698 AK000776
7 NM_006482 AI650364 AI655611 BI598831 AA541622 NM_016164
8 AK000776 AU146963 BC038422 AL049337 AW027333 AL574194
9 NM_000570 AI554909 NM_017631 BF673779 BF340228 AA053711
10 AI148006 AI926479 AI803181 U52914 NM_014710 U52914

Table 1: List of top 10 genes with largest changes of edge weights between two compared networks.


Figure 2: The inequality of edge weights in gene co-expression networks of HIV-1 infection. “Uninf.” represents the network for uninfected subjects.


Figure 3: The dot plots of delta Gini and dC for comparing GCNs of HIV-1 infection. Each dot represents a gene. “PCC” denotes the Pearson correlation coefficient. The p-value of PCC was calculated using the “cor.test” function in the R programming language.

These results indicate that the Gini algorithm could serve as a complementary approach to dC for comparing the differences between two GCNs.

