Received date: September 20, 2016; Accepted date: October 11, 2016; Published date: October 18, 2016
Citation: Szwarcwald CL, de Souza Júnior PRB, Pati Pascom AR, da Costa Ferreira Júnior O (2016) Results from a Method for Estimating HIV Incidence Based on the First Cd4 Count among Treatment-Naïve Cases: Brazil, 2004-2013. J AIDS Clin Res 7:627. doi:10.4172/2155-6113.1000627
Copyright: © 2016 Szwarcwald CL, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background: This paper introduces a method to estimate HIV incidence in Brazil using surveillance data. The interest is to estimate the annual lag (time from infection to reporting) distribution among incident cases in a given year with observations arising from a right-truncated version of the distribution.
Methods: For each treatment-naïve HIV case aged 15 years and over reported from 2004 to 2015, we estimated the time since infection based on a statistical model that relates the first CD4 count to time since infection. Under the assumption that the lag distribution is expressed by a logistic probability distribution, we estimated HIV incidence as the upper limiting value of the logistic function. Since this approach requires at least eight observations per year, to estimate HIV incidence in recent years (2009-2013), we used linear regression models to estimate the missing observations for these years due to truncation in 2015. Using this approach, HIV incidence was estimated from 2004 to 2013, separately for males and females.
Results: In 2013, HIV incidence among people aged 15 years and over was estimated to be 44827 (95% CI: 41143-47987), 32459 men (95% CI: 29775-34642) and 12368 women (95% CI: 11368-13345). Results from 2004-2013 have shown an increase among men and a slow decrease among women. The estimated proportion of cases reported less than one year after infection increased from 24.6% to 35.2% among men, and from 35.1% to 45.8% among women. Among men who became infected from 2004 to 2013, 35.6% of the cases were not reported by 2015; among women, 23.2%.
Conclusion: The delay between HIV infection and diagnosis is of concern. Designing interventions to motivate testing is essential, especially among most-at-risk groups, as the faster HIV infected cases are linked to care, the faster the HIV incidence curve will turn downward.
Surveillance; HIV incidence; Estimation; CD4 count; Depletion model; Diagnosis delay; Brazil
Over the past two decades, antiretroviral therapy has brought enormous health benefits to people living with HIV/AIDS (PLWHA), with a remarkable increase in survival. To ensure effective implementation of prompt access to treatment and other prevention interventions, it is essential to have estimates of HIV incidence that describe the current dynamics of the epidemic [2,3]. Trends in HIV incidence indicate the degree to which HIV transmission is controlled and which groups are most at risk for HIV infection, and help to identify the emergence of new sub-epidemics in the general population.
HIV incidence is therefore the most valuable indicator for epidemiological surveillance, both for planning prevention activities and for monitoring the effectiveness of ongoing interventions. Despite its relevant role in surveillance, estimating the annual number of new infections remains challenging in many countries [7,8].
Historically, calculation of HIV incidence has been based on reported AIDS cases, using back-calculation models of AIDS incidence, with the assumption that temporal trends in AIDS incidence reflect past trends in HIV incidence. However, the expansion of antiretroviral therapy has lengthened the time to the onset of AIDS, making inferences about HIV incidence based on reported AIDS cases very limited. More recently, updated back-calculation models relying on new HIV diagnoses have been applied for HIV incidence estimation [11-15]. One of the difficulties faced is distinguishing the contribution of changes in testing patterns to the trends in reported HIV cases.
Another approach for estimating HIV incidence is to measure the rate of seroconversion in a cohort of HIV-uninfected individuals at risk, followed over time. Apart from the difficulties in implementing this type of approach, this methodology is subject to selection bias among individuals who agree to participate and who remain in the study, and is affected by changes in risk behavior over time [8,17].
HIV incidence can also be calculated based on the change in HIV prevalence estimated at two points in time. The assumption underlying this methodology is that the number of new infections is the number of prevalent cases at the second point in time minus the number of cases who survived between the two time points. This is the basis of the methodology used by UNAIDS to estimate key HIV indicators such as the number of people living with HIV, new infections, and AIDS deaths. Limitations of this approach stem both from prevalence estimates that are subject to sample survey errors, especially for concentrated epidemics, and from outdated mortality estimates, given the expansion of antiretroviral therapy and the resultant increase in survival.
In the late 1990s, laboratory tests were developed to estimate HIV incidence in cross-sectional studies. The algorithms are based on laboratory assays that identify whether infections are recent, that is, whether they occurred within a certain period of time after HIV infection. The main advantage of this type of approach is the use of a single blood sample collected at one point in time, as in cross-sectional surveys, which does not require follow-up of subjects as in cohort studies.
This method has been widely used to estimate the incidence of HIV in several countries and different epidemiological settings [3,23-26], including HIV incidence estimation in two Brazilian cities. However, validation studies have consistently shown that assay-based estimates of HIV incidence vary according to which assay is used to identify recent infections. Recommendations have recently been made to use multi-assay algorithms that include multiple biomarkers, including viral load and CD4 count, to identify recent HIV infections and thus provide more accurate HIV incidence estimates.
Lately, methods based on the first CD4 count after HIV diagnosis have been developed to estimate HIV incidence in the United Kingdom [4,15], France, Brazil and the United States [13,14]. The main assumption of these models is that among antiretroviral therapy (ART)-naïve individuals, CD4 cell counts decrease over time and the time since infection can be ascertained by applying an estimated rate of CD4 count decline. Although it is well known that a small proportion of HIV cases experience preservation of CD4 counts, these approaches have the advantage of using a CD4 count back-calculation model to estimate HIV incidence with routinely available data.
In this study, we propose a new method to estimate HIV incidence in Brazil in recent years. The method is based on the first CD4 count after HIV diagnosis among all treatment-naïve HIV infected cases reported to the Ministry of Health in the time-period 2004-2013.
The information source is Brazil’s Laboratory Tests Control System (SISCEL), which is the national laboratory-based information system created to monitor CD4+/CD8+ T lymphocyte counts and HIV viral load. SISCEL is managed centrally by technical staff of the Brazilian Department of Sexually Transmitted Diseases, AIDS and Viral Hepatitis. Data include some characteristics of the patient, such as age, sex, and municipality of residence.
In the present study, we analyzed the SISCEL database after removing patients’ identifiers. All treatment-naïve HIV infected cases 15 years or older who underwent a CD4 count for evaluation of treatment indication in the period 2004-2015 were included in the analysis. The project was approved by the Ethics Committee of the Oswaldo Cruz Foundation, Ministry of Health, Brazil.
CD4 depletion model
The method is based on a statistical model proposed in an earlier work by Lodi et al. that relates the first CD4 count to time of HIV infection through a linear mixed model:
$\sqrt{CD4_1} = b_0 + b_1 t$ (1)
where t is the time from HIV infection to date of first CD4 count, $CD4_1$ is the first CD4 count, and the slope ($b_1$) and the intercept ($b_0$) are random variables following normal distributions. In model (1), the mean values and the standard deviations of the slope and the intercept were estimated separately for combinations of sex, quartile of age at infection, and risk group.
To calculate the time since HIV infection among reported cases in Brazil, we started from the premise that among antiretroviral therapy (ART)-naïve individuals, CD4 cell counts decrease over time and that the time since infection can be estimated by applying a CD4 count depletion model, developed as an adaptation of the Lodi et al. model (1).
To calculate the intercept (b0), as SISCEL does not have information about risk group, we adapted the Lodi et al. model and calculated the intercept by sex and age group (defined by quartiles of age at first CD4 count), using the distribution of risk group among AIDS cases in Brazil to weight the b0 estimates within each sex and age group. To calculate the slope (b1), we considered all treatment-naïve cases reported to SISCEL with a first CD4 count greater than or equal to 500 and followed up for at least one year before starting antiretroviral treatment. The slope (b1) was estimated by sex and age group as the ratio between the difference in the square root of the CD4 counts and the time between the first CD4 count and the last CD4 count before treatment (Table 1).
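The slope calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the tuple layout and the sample records are hypothetical, and averaging per-patient ratios within a group is one possible reading of "the ratio", not necessarily the aggregation the authors used.

```python
import math

def sqrt_cd4_slope(first_cd4, last_cd4, years_between):
    """Change in sqrt(CD4) per year between the first and the last pre-treatment count."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (math.sqrt(last_cd4) - math.sqrt(first_cd4)) / years_between

def mean_slope(records, min_first_cd4=500, min_followup_years=1.0):
    """Average slope over eligible cases (first CD4 >= 500, followed >= 1 year).

    Assumption: per-patient ratios are averaged; the paper does not spell this out.
    """
    slopes = [
        sqrt_cd4_slope(first, last, years)
        for first, last, years in records
        if first >= min_first_cd4 and years >= min_followup_years
    ]
    if not slopes:
        raise ValueError("no eligible records")
    return sum(slopes) / len(slopes)

# Hypothetical records: (first CD4, last CD4 before treatment, years between them).
records = [
    (625, 400, 2.5),   # eligible: slope = (20 - 25) / 2.5 = -2.0
    (900, 625, 2.0),   # eligible: slope = (25 - 30) / 2.0 = -2.5
    (450, 300, 3.0),   # excluded: first CD4 below 500
    (576, 576, 0.5),   # excluded: followed for less than one year
]
b1_demo = mean_slope(records)
```

In practice the same calculation would be run once per sex and age group to fill the slope column of Table 1.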
Then, for each treatment-naïve HIV infected case aged 15 years and over reported to SISCEL, we estimated the time since infection (t) based on the linear model coefficients by sex and age group given in Table 1. To account for cases tested in the private sector, we weighted the SISCEL database with weights inversely proportional to the coverage of private health insurance by geographical area of residence.
Distribution of time since infection (t) among reported cases
For the analysis, the time since infection was expressed as the number of years from infection to the first CD4 count, and the values of the variable t derived from the CD4 count depletion model were aggregated in intervals of years since infection (t<1; 1≤t<2; 2≤t<3; …; 19≤t<20; t≥20 years). In particular, if the square root of the first CD4 count was greater than or equal to b0, we estimated the year of infection as the same as the year of SISCEL reporting (t<1).
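As a minimal sketch of this step, the depletion model can be inverted for t and the result mapped to a lag interval. The coefficient values below are hypothetical placeholders rather than the Table 1 estimates, and the function names are illustrative.

```python
import math

def years_since_infection(first_cd4, b0, b1):
    """Invert sqrt(CD4) = b0 + b1 * t for t, with b1 < 0 (CD4 declines over time).

    If sqrt(CD4) >= b0 the case is treated as infected in the reporting year (t = 0).
    """
    if b1 >= 0:
        raise ValueError("b1 must be negative")
    t = (math.sqrt(first_cd4) - b0) / b1
    return max(t, 0.0)

def lag_interval(t, max_years=20):
    """Interval index j: j=1 for t<1, j=2 for 1<=t<2, ..., j=21 for t>=20."""
    return min(int(math.floor(t)), max_years) + 1

# Hypothetical coefficients for one sex and age group (not taken from Table 1).
b0_demo, b1_demo = 25.0, -1.5
t_demo = years_since_infection(400, b0_demo, b1_demo)  # (20 - 25) / -1.5, about 3.33 years
j_demo = lag_interval(t_demo)                          # falls in the interval 3<=t<4
```

A case reported in year u with interval index j would then be assigned to infection year k = u - j + 1, which is the bookkeeping used in Table 2.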
|Sex||Age at first CD4 count|
|Sex||Age at first CD4 count|
* Adapted from the Lodi et al. model  for application in Brazil
** Estimated from Brazilian data considering all treatment-naïve cases reported to SISCEL with a first CD4 count greater than or equal to 500 and followed up for at least one year before starting antiretroviral treatment.
Table 1: Coefficients of the CD4 depletion model applied to SISCEL data.
The annual frequency distribution of the time (years) from infection to the first CD4 count (t) among reported cases from 2004 to 2015 is represented mathematically in Table 2, where $y_{j,u}$ is the number of cases infected in year k and reported in year u=k+j-1, for u ranging from 2004 to 2015 and j from 1 to 21 (j=21 representing t ≥ 20). For each year u, $Y_u$ is the total number of reported cases.
HIV incidence estimation
In the context of studying SISCEL reported cases, individuals are observed at the time of an event subsequent to infection (reporting), and at that time the lag (time between infection and reporting) is ascertained using the CD4 depletion model (Table 1). The annual lag distribution among reported cases from 2004 to 2015 is known. However, to estimate HIV incidence in a given year, the interest is in estimating the annual lag distribution among all incident cases of that year, with observations arising from a right-truncated version of the distribution.
For the analysis, observations are the cumulative sums of cases reported to SISCEL less than one year after infection, less than two years after infection, and so on. That is, let $X_{k,k+j-1}$ be the total number of cases infected in year k and reported to SISCEL by year u=k+j-1. For example, for year 2004, $X_{2004,2015}$ is the sum of cases infected in 2004 and reported in 2004, 2005, …, 2015, and HIV incidence is the upper limiting value of the cumulative sums.
Let $I_k$ be HIV incidence in year k. Then,
$X_{k,k+j-1} = I_k \cdot P_k(t<j)$, for k=2004, …, 2015, j=1, 2, …, $L_k$, with $L_k = 2015-k+1$, where $P_k(t<j)$ is the probability that a case infected in year k is reported to SISCEL less than j years after infection. (2)
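The cumulative observations for one infection year can be assembled from the Table 2 counts as sketched below. The dictionary layout keyed by (j, u) and the sample counts are illustrative assumptions, not SISCEL data.

```python
def cumulative_reports(y, k, last_year=2015):
    """Return [X_{k,k}, X_{k,k+1}, ..., X_{k,last_year}] for infection year k.

    y maps (j, u) -> number of cases reported in year u with lag interval j,
    so cases infected in year k contribute through the cells (j, k + j - 1).
    """
    totals = []
    running = 0.0
    for j in range(1, last_year - k + 2):      # j = 1, ..., L_k with L_k = last_year - k + 1
        running += y.get((j, k + j - 1), 0.0)
        totals.append(running)
    return totals

# Hypothetical counts for cases infected in 2013, plus one unrelated cell.
y_demo = {(1, 2013): 100, (2, 2014): 40, (3, 2015): 20, (1, 2014): 999}
X_2013 = cumulative_reports(y_demo, 2013)
```

Only the diagonal cells (1, 2013), (2, 2014) and (3, 2015) belong to infection year 2013, so the cell (1, 2014) is ignored here; it belongs to infection year 2014.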
Under the assumption that the probabilities of reporting less than j years after infection are expressed by a logistic probability distribution, we estimated HIV incidence as the upper limiting value of the logistic function. For year 2004, separately for males and females, we estimated HIV incidence through an iterative procedure described in the supplementary methodology section.
As SISCEL data are available from 2004 to 2015, the maximum number of observations available for fitting the logistic probability distribution function is 12. To check the possibility of using the same approach for years after 2004, we applied the procedure for 2004 with a smaller number of observations, i.e., truncating the distribution at t<11, t<10, t<9, etc. The results showed that the estimated HIV incidence loses accuracy (relative differences greater than 1%) with fewer than 8 observations; that is, the procedure could only be applied directly from 2004 to 2008. Therefore, to estimate HIV incidence in recent years (2009-2013), we estimated the missing observations for these years by extrapolating, for each interval of time since infection, the number of cases to be reported in the years 2016 to 2020 based on the 12-year series of reported data.
To this end, with the mathematical notation used in Table 2, for each fixed j varying from 4 to 12, we fitted linear regression models to the observed values $y_{j,u}$ (u=2004, …, 2015) with year of reporting as the independent variable. The predicted observations in years u=2016, …, 2020 correspond to the estimated number of cases infected in year k=u-j+1 and expected to be reported in year u. For example, for year 2009, we estimated the number of cases infected in 2009 and expected to be reported in 2016 ($y_{8,2016}$) based on the 12 observations $y_{8,u}$, for u ranging from 2004 to 2015, arranged along line j=8 in Table 2.
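A minimal sketch of this extrapolation for a single row j of Table 2 is shown below, using an ordinary least-squares line fitted with NumPy. The counts are synthetic, and centering the years before fitting is a numerical convenience added here, not something described by the authors.

```python
import numpy as np

def extrapolate_lag_row(report_years, counts, future_years):
    """Fit counts = a + b * year by least squares and predict future reporting years."""
    years = np.asarray(report_years, dtype=float)
    center = years.mean()                       # center years for numerical stability
    slope, intercept = np.polyfit(years - center, np.asarray(counts, dtype=float), deg=1)
    return {int(u): float(intercept + slope * (u - center)) for u in future_years}

# Synthetic 12-year series for one lag interval (e.g. j = 8), not SISCEL data.
report_years = list(range(2004, 2016))
demo_counts = [1000 + 50 * (u - 2004) for u in report_years]
predicted = extrapolate_lag_row(report_years, demo_counts, range(2016, 2021))
```

With j=8, the value predicted for 2016 would stand in for the cases infected in 2009 and reported in 2016, the value for 2017 for those infected in 2010, and so on.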
After completing the missing data to obtain at least 8 observations per year from 2009 to 2013, we estimated HIV incidence from 2004 to 2013, separately for males and females, fitting the logistic probability distribution to 2005-2013 data with the same standard deviation estimated for 2004, as described in the supplementary methodology section. Confidence interval estimates for HIV incidence were obtained by combining the uncertainties associated with each modeling component: the linear model intercept of equation (A.3) of the supplementary methodology section and the linear regression estimates of the number of cases expected to be reported between 2016 and 2020.
Using the data mentioned above, the method permits the simultaneous estimation of HIV incidence, proportion of cases reported in the same year of HIV infection and proportion of cases reported until 2015, from which we established the changes in testing patterns from 2004 to 2013.
|Intervals of t(years)||j||Year of Reporting|
Table 2: Mathematical representation of the frequency distribution of time (t) since infection among reported cases, 2004-2015.
Regarding the trends of HIV incidence, equation (2) with j=1 shows that $X_{k,k} = I_k \cdot P_k(t<1)$. Therefore, the trend in HIV incidence is the same as that of the cases diagnosed in the year of HIV infection if the proportion of cases diagnosed in the year of infection remains constant throughout the period.
After weighting SISCEL data to take into account CD4 measures in private laboratories, the total number of reported treatment-naïve HIV infected cases 15 years or older from 2004 to 2015 was 511,328, with 323,055 males and 188,273 females.
For each treatment-naïve case aged 15 years and over reported to SISCEL, we applied the CD4 depletion model by age group and sex (Table 1) and estimated the time between infection and the first CD4 count.
Results of the application of the proposed procedure for men and women HIV infected in year 2004 are presented in Table 3. The comparison between the observed distribution and the probabilities estimated by the logistic probability distribution shows the goodness of fit of the logistic distribution to the data.
In Table 4, we present the number of reported cases from 2004 to 2015, separately for males and females, according to year of reporting and intervals of time since infection ascertained by the depletion model. For each lag interval, the predicted values of the number of cases expected to be reported in years 2016 to 2020 were used to estimate observations originally missing due to truncation in 2015. For example, the predicted observations $y_{4,2016}$, $y_{5,2017}$, $y_{6,2018}$, $y_{7,2019}$ and $y_{8,2020}$ were used to complete the 2013 dataset and estimate HIV incidence in that year.
Estimates of HIV incidence according to sex from 2004 to 2013 are presented in Table 5. In 2013, HIV incidence among people aged 15 years and over was estimated to be 44827 (95% CI: 41143-47987), 32459 men (95% CI: 29775-34642) and 12368 women (95% CI: 11368-13345). Based on the Brazilian population data (15 years old and over), the HIV incidence rate was 43.5 (95% CI: 39.9-46.5) per 100,000 men, 15.9 (95% CI: 14.6-17.1) per 100,000 women and 29.4 (95% CI: 27.0-31.5) per 100,000 population aged 15 years and over.
Results also presented in Table 5 show that HIV incidence increased among men and slowly decreased among women in the period 2004-2013. The male-female incidence ratio was 1.53 in 2004 and increased to 2.62 in 2013. The estimated proportion of cases having a first CD4 count less than one year after infection increased from 24.6% in 2004 to 35.2% in 2013 among men, and from 35.1% to 45.8% among women. The estimated average time since infection shortened from 6.1 to 4.6 years among men, and from 4.4 to 3.4 years among women, from 2004 to 2013. Although men show a faster decline, the average time to SISCEL reporting is consistently higher among males. Among men who became HIV infected between 2004 and 2013, 35.6% of the cases were not reported to SISCEL by 2015; among women, 23.2%. The corresponding proportion for the total number of cases is 31.5%.
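The headline figures above can be cross-checked with simple arithmetic. The short sketch below uses only numbers quoted in the text and in Table 3 (the 2004 estimates by sex); the variable names are illustrative.

```python
# Estimated incident cases by sex, as quoted in the text (2013) and Table 3 (2004).
men_2004, women_2004 = 20825, 13578
men_2013, women_2013 = 32459, 12368

total_2013 = men_2013 + women_2013        # total incidence in 2013
ratio_2004 = men_2004 / women_2004        # male-female incidence ratio, 2004
ratio_2013 = men_2013 / women_2013        # male-female incidence ratio, 2013
share_men_2013 = men_2013 / total_2013    # share of new infections among men, 2013

print(total_2013, round(ratio_2004, 2), round(ratio_2013, 2), round(share_men_2013, 2))
```

The sex-specific estimates add up to the reported total, and the ratios reproduce the 1.53 and 2.62 quoted above.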
In this paper, we proposed a practical method for estimation of HIV incidence in Brazil using surveillance data. The method is based on the first CD4 count among treatment-naïve reported cases and a statistical model that relates the first CD4 count to time since HIV infection. One advantage of the approach here proposed is that the method is based on recent trends of new HIV diagnoses using regularly available data.
As SISCEL is based on reimbursement of each lab examination by the Brazilian government, the system coverage of CD4 count exams performed in public laboratories is considered complete. However, SISCEL does not include HIV cases tested in the private sector. To bypass this difficulty, we assumed that the proportion of cases tested in private labs would be the proportion of people who have private health insurance in their area of residence. Using this approach, the proportion of people tested in private labs was estimated at 35%, similar to the proportion of cases that are on antiretroviral treatment (100% in the public sector) and are not included in SISCEL.
One limitation of SISCEL is that it does not collect information on exposure category, restricting the current HIV incidence analyses to available variables, such as age group, sex, race, and area of residence (region, state, and municipality). However, as the Department of STD, AIDS and Viral Hepatitis routinely links SISCEL to the National System of Notified HIV/AIDS cases, it is possible to obtain exposure category for at least 70% of the SISCEL cases, enabling the estimation of HIV incidence by exposure category with the use of imputation methods. Moreover, any other information available in SISCEL related to the time since infection, such as viral load or patient's condition (symptomatic/asymptomatic), that could be used to refine our estimates [36,37] will be further investigated.
|Lag (t) Interval (years)||Males||Females|
|Observed cumulative cases||Observed truncated distribution||Estimated probabilities (logistic distribution)||Observed cumulative cases||Observed proportions||Estimated probabilities (logistic distribution)|
|HIV Incidence estimate||20825||Sum of square differences||0.00049||13578||Sum of square differences||0.00033|
Table 3: Estimates of HIV incidence separately for males and females using the proposed approach Brazil, 2004.
|Intervals of t (years)||3≤ t<4||4≤ t<5||5≤ t<6||6≤ t<7||7≤ t<8||8≤ t<9||9≤ t<10||10≤ t<11||11≤ t<12|
|Year of Reporting||Males|
|Year of Reporting||Females|
*For each interval of time since infection, we estimated the predicted observation (2016-2020) by linear regression models based on the 12 year series of reported data.
Table 4: Reported (2004-2015) and predicted observations* (2016-2020) according to intervals of time since infection (t).
|Year of Infection||Males||Females|
|Estimated HIV Incidence||95% CI||Average time since infection (years)||% Reported in the same year of infection||Not Reported until 2015||Estimated HIV Incidence||95% CI||Average time since infection (years)||% Reported in the same year of infection||Not Reported until 2015|
Table 5: Estimates of HIV incidence and other indicators by year of infection and sex, Brazil, 2004-2013.
In order to use the CD4 depletion model in Brazil, we adapted the model originally proposed by Lodi et al. and calculated specific CD4 count decline rates by sex and age at diagnosis (first CD4 count) using Brazilian data. In general, the uncertainties associated with HIV incidence estimates derived from this method could be underestimated due to some restrictive assumptions, for example, that the mean square root of the CD4 count is linearly related to the time since infection, and that the number of cases to be reported to SISCEL increases or decreases linearly in future years.
Although the assumption that the square root of CD4 count is linearly related to time since infection has frequently been used in different settings [12,14,15], other types of functions have also been proposed [38,39]. In addition, Rice and collaborators have suggested that besides age at diagnosis, ethnicity and region of birth are significant predictors of the rate of CD4 decrease. The sensitivity of the estimates to changes in the underlying assumptions of the CD4 depletion model will be examined in future studies.
The application of the CD4 depletion model to reported cases made it possible to estimate HIV incidence in Brazil and to identify changes in testing patterns. Based on fitting a logistic distribution to the cumulative number of cases classified by intervals of time since infection, the procedure used in this study represents a simplification compared to the previous model, since it requires extrapolation of reported data over 5 years only. The proposed approach allowed us to estimate HIV incidence and 95% confidence limits by sex in the period 2004-2013, and the results are consistent with previous findings.
For year 2013, our estimate of HIV incidence (44827) is similar to the UNAIDS estimate (44000) for the same year, but almost twice the estimate of the Global Burden of Disease (GBD) group (24187). Reasons for differences in the UNAIDS and Murray et al. estimates have been discussed before [42,43]. In the case of Brazil, the most likely explanation is the influence of the GBD 2013 mortality estimate, with 10217 AIDS deaths. According to the Mortality Information System, 12564 AIDS deaths occurred in 2013, but it is believed that this number is still underestimated due to misclassification of the causes of death.
The application of the method in Brazil showed a delay of approximately 4.3 years (51 months) between infection and date of first CD4 count. The delay is a little shorter among women, probably due to the Ministry of Health policy of HIV testing during antenatal care.
In France, a similar model was used for estimating HIV incidence based on the estimated time from infection to diagnosis using HIV surveillance data. The results are comparable to those obtained in Brazil: the estimated mean time since HIV infection ranged from 37 months among men who have sex with men (MSM) to 53 months among heterosexual men, with an intermediate delay for women. In the United States of America, HIV cases diagnosed in 2011 had been infected, on average, 5.6 years before their diagnosis. Despite the shortening of delays from infection to first CD4 count from 2004 to 2013 (Table 5), for both men and women, the delay remains excessively long.
A growing number of studies have shown evidence that wide and early initiation of antiretroviral therapy can reduce the level of HIV incidence in the population [10,47]. The ‘treatment as prevention’ (TASP) policy is based on the premise that the rate of new HIV infections can be reduced by increasing the frequency of HIV testing to diagnose people living with HIV in early stages and initiate antiretroviral treatment regardless of CD4 count or viral load. In Brazil, difficulty in early detection of HIV infection impairs the benefits of TASP, adopted since 2014, and the spread of HIV transmitted by undiagnosed individuals may actually be the main factor driving the HIV/AIDS epidemic. In fact, the results of this study showed that there are around 45000 new HIV infections per year among people aged 15 and over in Brazil and that only 38% are diagnosed in the same year of infection.
HIV incidence estimates showed an increase among men and a slow decrease among women from 2004 to 2013, with men currently accounting for 72% of new infections. Although we did not analyze data by exposure category, the most likely explanation is an increasing rate among men who have sex with men (MSM) and a decreasing rate among injection drug users, who were a major contributor to the expansion of heterosexual transmission in the 1990s in Brazil.
During more than 30 years of the HIV/AIDS epidemic, Brazil has maintained a concentrated epidemic, with an HIV prevalence rate of less than 1% in the general population. Despite earlier concerns about an increase in heterosexual cases and “feminization” of the epidemic, it seems the current scenario is the predominance of HIV infection among MSM, similar to data from many other countries [49,50]. In a study conducted in two Brazilian cities in 2013 using laboratory tests to distinguish recent from long-term infections, the estimated incidence rate was greater than 1% among MSM.
In this paper, we introduced a method to calculate HIV incidence in Brazil. As it is based on the first CD4 count and date of first CD4 count, the approach is applicable to all countries that monitor these data among HIV infected cases. The method permits the simultaneous estimation of HIV incidence, the proportion of HIV cases diagnosed in the same year of infection and the proportion of undiagnosed cases. The application of the method in Brazil showed a decreasing trend in the time between HIV infection and the first CD4 count in the period 2004-2013, but the delay is still too long (4.3 years). Recent government policies are focused on increasing uptake of HIV testing with expansion of the offer in health units and home-based testing. Therefore, an additional contribution of the proposed method is its potential use to monitor and evaluate the impact of these new HIV testing strategies.
Under the assumption that the probability of reporting less than j years after infection is expressed by a logistic probability distribution with mean μ and standard deviation σ, we have:
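(A.1) $F(j; \mu, \sigma) = P(t<j) = \dfrac{1}{1 + e^{-\lambda (j-\mu)}}$, where $\lambda = \dfrac{\pi}{\sigma\sqrt{3}}$, and π is the number pi ~3.1416.
So, the cumulative number of cases reported to SISCEL can be written as:
(A.2) $X_{k,k+j-1} = I_k \cdot F(j; \mu_k, \sigma_k)$, and
(A.3) $\dfrac{1}{X_{k,k+j-1}} = \dfrac{1}{I_k} + \theta_k e^{-\lambda_k j}$, for $\theta_k = \dfrac{e^{\lambda_k \mu_k}}{I_k}$ and $\lambda_k = \dfrac{\pi}{\sigma_k\sqrt{3}}$.
If we use equation (A.3) with known λ (or σ), we can estimate HIV incidence in year k ($I_k$) by fitting a linear regression model to the inverse of the observed values $X_{k,k+j-1}$, with $e^{-\lambda_k j}$ as the independent variable. The inverse of the intercept will be the estimated incidence in year k.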
For year 2004, separately for males and females, we estimated σ, $I_{2004}$, and θ by using an iterative procedure with an initial guess for σ to generate successive approximations to a solution. For each σ, we used the regression model (A.3) to estimate the mean (μ) and equation (A.2) to estimate the probabilities of reporting less than j years after infection. The selected σ was the one that minimized the sum of squared differences between the observed truncated distribution and the probabilities of reporting less than j years after infection estimated by the logistic probability distribution. The estimate of HIV incidence was given by the inverse of the intercept in the linear regression model (A.3) corresponding to the selected σ. Confidence intervals for HIV incidence were based on 95% confidence intervals for the linear model intercept.
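The procedure above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: a plain grid search over candidate σ values stands in for the iterative scheme, the "observed distribution" is taken to be the cumulative counts divided by the fitted incidence (an assumption), confidence intervals are omitted, and the data are synthetic. All function and variable names are hypothetical.

```python
import numpy as np

def fit_given_sigma(X, sigma):
    """Fit regression (A.3) for one candidate sigma; return (incidence, mu, sse) or None."""
    X = np.asarray(X, dtype=float)
    j = np.arange(1, len(X) + 1)
    lam = np.pi / (sigma * np.sqrt(3.0))            # logistic rate implied by sigma
    z = np.exp(-lam * j)                            # independent variable in (A.3)
    theta, alpha = np.polyfit(z, 1.0 / X, deg=1)    # slope theta, intercept alpha = 1 / I
    if alpha <= 0 or theta <= 0:
        return None                                 # no valid logistic curve for this sigma
    incidence = 1.0 / alpha
    mu = np.log(theta / alpha) / lam                # from theta = exp(lam * mu) / I
    fitted = 1.0 / (1.0 + np.exp(-lam * (j - mu)))  # logistic probabilities, as in (A.1)
    sse = float(np.sum((X / incidence - fitted) ** 2))
    return incidence, mu, sse

def estimate_incidence(X, sigma_grid):
    """Pick the sigma with the smallest sum of squared differences."""
    best = None
    for sigma in sigma_grid:
        result = fit_given_sigma(X, sigma)
        if result is not None and (best is None or result[2] < best[3]):
            best = (sigma, *result)
    return best  # (sigma, incidence, mu, sse)

# Synthetic check: 12 cumulative observations generated from a known logistic curve.
true_I, true_mu, true_sigma = 20000.0, 4.0, 3.0
lam_true = np.pi / (true_sigma * np.sqrt(3.0))
lags = np.arange(1, 13)
X_demo = true_I / (1.0 + np.exp(-lam_true * (lags - true_mu)))
sigma_grid = [1.0 + 0.25 * i for i in range(21)]    # candidates 1.0, 1.25, ..., 6.0
sigma_hat, incidence_hat, mu_hat, sse_hat = estimate_incidence(X_demo, sigma_grid)
```

On noise-free synthetic data the search recovers the generating parameters; with real counts the minimum would be approximate, and a finer grid or a one-dimensional optimizer could replace the coarse grid used here.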
The project was approved by the Ethics Committee of the Oswaldo Cruz Foundation, Ministry of Health, Brazil (Protocol 485.175).
This study has received funding and technical support from the Centers for Disease Control and Prevention, Global AIDS Program, Brazil.
CLS and PRBSJ designed the research. CLS and OCFJ collaborated in the writing of the manuscript. PRBSJ and ARPP performed the statistical analyses and participated in the interpretation of results. All authors revised the manuscript before submission.
We thank Drs. Rick Song and Aristides Barbosa Junior for their constructive comments, which led to a substantial improvement of this paper.