Self-report of STI symptoms, inconsistent condom use and condom nonuse are poor predictors of STI prevalence among men who have sex with men.

Background: Biological testing for sexually transmitted infections (STIs) is constrained by the difficulty of sample collection and high testing costs, so self-reports are often used to predict STI status. The validity of self-reports among populations at risk of STIs has not been clearly established. The objective of this paper is to assess the validity of self-reported ‘STI symptoms’, self-reported ‘recent condom non-use’ and ‘inconsistent condom use’ in comparison with laboratory diagnosed STIs among men who have sex with men (MSM) in India.
  
Methods: Data were drawn from a cross-sectional Integrated Behavioural and Biological Assessment survey conducted among MSM between 2005 and 2007 in India. Sensitivity, specificity and predictive values were calculated to assess the validity of self-reported ‘STI symptoms’, ‘recent condom non-use’ and ‘inconsistent condom use’ against laboratory diagnosed STIs (syphilis/Neisseria gonorrhoeae/Chlamydia trachomatis). Multiple logistic regression models were used to identify population characteristics that were predictive of concordant self-reporting.
Results: Of 3895 MSM surveyed, 14.3% were diagnosed with any STI, while 8.3% and 3% reported any past and any current STI symptom, respectively. Recent condom non-use and inconsistent condom use were reported by 43.1% and 77.6% of respondents, respectively. Self-reported STI symptoms showed very low sensitivity (5-13) in predicting laboratory diagnosis of STIs. Self-reported inconsistent condom use and recent condom non-use showed higher sensitivity than self-reported STI symptoms (50-74.4), but were less specific (21-52.9). Combined self-reports showed relatively higher sensitivity (52.3-77.9) and low specificity (18.9-51.8). Overall, self-reports showed very high negative predictive value (84.4-87.9) and low positive predictive value (12.4-15.7). Education grade more than 12 [AOR: 3.2 (CI 1.7-5.9)] and STI/HIV information exposure [AOR: 1.4 (CI 1.0-2.0)] were predictive of concordant self-reporting of STI symptoms and inconsistent condom use, respectively. Knowledge about STIs [AOR: 1.4 (CI 0.9-2.2)] and education grade more than 12 [AOR: 2.5 (CI 1.2-5.3)] were predictive of concordant self-reporting of symptoms/risk.
Conclusions: Self-reports of STI symptoms, recent condom non-use and inconsistent condom use were not reliable in predicting the true STI status of MSM, which highlights the limited validity of self-reports collected at different levels in the programme setting. The study identified MSM education status, STI/HIV knowledge and information exposure as predictors of concordance of self-reported ‘symptoms’ and ‘inconsistent condom use’ with STI laboratory diagnosis; these could be used in future surveys to improve the validity of self-reports.

Bio-behavioural surveys and behavioural surveys routinely collect data on behaviours related to the risk of acquiring HIV and other STIs, such as consistent condom use and recent condom non-use. It is well accepted that condom use significantly decreases STI risk in a population, and a relationship between them is assumed. Conversely, 'inconsistent condom use' over time and 'recent condom non-use', as indicators of risky behaviour, could predict the risk of STI acquisition and consequently might serve as indirect markers for estimating STI burden.
However, there are factors associated with self-reporting of condom use and non-use which may limit their use in predicting STI infection status [16]. Social desirability bias, which leads respondents to under-report stigmatized behaviours or over-report expected 'good' behaviours, as well as the sensitivity of survey instruments in capturing risk behaviours, may influence the reliability of self-reported sexual risk behaviours [17,18]. Studies conducted among high risk clients and FSWs have documented discordance between self-reported condom non-use and incidence of STIs [16,19,20]. In contrast, a study among the general population in India found that a combination of self-reported risk behaviours and STI symptoms predicted true STI status better, while individually they were poorly predictive of true STI status [21].
As HIV/STI prevention programmes increasingly focus on MSM, a periodic evaluation of STI prevalence and high risk behaviours among them is desirable [22]. Research on MSM has tended to use self-reported risk behaviours without certainty about their validity [23,24]. While a few studies have documented discordance between self-reported STI symptoms and the serological status of MSM, a study conducted among Indian MSM has noted concordance between their self-reported risk behaviours and serological reports of STIs [25][26][27].
In light of this background of data highlighting both concordance and discordance of 'self-reports' with laboratory diagnosis of STI in different populations, our study explores the relationship of self-reported STI symptoms and risk behaviours with laboratory diagnosed STI among MSM, using a dataset from the Integrated Behavioural and Biological Assessment (IBBA) surveys among MSM conducted in three high prevalence states in India.
We hypothesised that self-reported inconsistent condom use and recent condom non-use, when taken in conjunction with self-reports of STI symptoms, could add to the predictive value of self-reported STI symptoms in measuring true STI status. We also studied the factors that were predictive of the concordance of self-reports of STI symptoms and self-reports of inconsistent condom use with laboratory diagnosis. The study specifically aims to assess the validity of self-reported 'STI symptoms', self-reported 'recent condom non-use' and 'inconsistent condom use' in comparison with laboratory diagnosed STIs.

Design, setting and sample
Data on behavioural and biological indicators of STIs collected as part of the first round of the IBBA, conducted between 2005 and 2007 among 3895 MSM respondents from Andhra Pradesh (n=1621), Tamil Nadu (n=1621) and Maharashtra (n=653), were included in this analysis [28].
The overall objectives of the IBBA project for programme evaluation purposes were: (1) to measure the major outcomes of the Avahan India AIDS initiative by collecting bio-behavioural and programme coverage trend data in populations targeted by the interventions; (2) to provide an additional source of size estimates for populations targeted by the project in IBBA districts; and (3) to make information available for use in transmission dynamics models and provide evidence of Avahan's impact. As part of these objectives, two rounds of bio-behavioural surveys were done, in 2005-2007 and 2009, among different high risk groups in three states of India.
The survey used a two-stage cluster sampling design with time location clusters (TLC) as primary sampling units, except in East Godavari district of Andhra Pradesh, where fixed location clusters were additionally used. In TLC sampling, a sampling frame was developed through a mapping and listing exercise that drew on all existing information from all existing sources. The sites (venues) across the district where MSM could be accessed were mapped, and information was collected on their hours of operation and the approximate number of eligible respondents available at different times of the day on different days of the week (three-hour time segments, entire days or nights). Based on these identified sites and information, a time-location sampling frame consisting of venue/time slots was constructed. In the first stage, a systematic random sample of primary sampling units/clusters (i.e. venue/time slots) was chosen with probability proportional to size. Then, from the selected clusters, survey respondents were selected randomly among all eligible respondents available during the selected time interval. A quick listing was made at the site using easily identifiable characteristics such as the colour of clothing. If the desired sample size was not achieved in a TLC, a new TLC was selected to achieve that sample, so that no TLC was selected a second time.
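The two-stage selection described above can be sketched in a few lines of Python. This is a minimal illustration of systematic sampling with probability proportional to size followed by simple random selection within a cluster, not the IBBA field procedure itself; the function names, the size measures and the seed are all hypothetical.

```python
import random


def pps_systematic_sample(sizes, n_clusters, rng=None):
    """Stage 1: pick cluster indices by systematic PPS sampling.

    `sizes` holds a size measure per venue/time slot (hypothetical here:
    the approximate number of eligible respondents in that slot).
    """
    rng = rng if rng is not None else random.Random()
    total = sum(sizes)
    if n_clusters <= 0 or total <= 0:
        raise ValueError("need at least one cluster and a positive total size")
    interval = total / n_clusters          # sampling interval
    start = rng.uniform(0, interval)       # random start within first interval
    selected = []
    cumulative = 0.0
    idx = 0
    for k in range(n_clusters):
        target = start + k * interval
        # Walk the cumulative size list until the target falls inside a cluster.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def sample_respondents(eligible, n_respondents, rng=None):
    """Stage 2: simple random sample of eligible respondents in one cluster."""
    rng = rng if rng is not None else random.Random()
    return rng.sample(eligible, min(n_respondents, len(eligible)))


if __name__ == "__main__":
    rng = random.Random(2005)              # arbitrary seed for reproducibility
    slot_sizes = [12, 30, 8, 25, 15]       # hypothetical venue/time slot sizes
    print(pps_systematic_sample(slot_sizes, 2, rng))
```

Note that in this sketch a slot larger than the sampling interval can be hit more than once; the survey description states that no TLC was selected twice, and the sketch does not attempt to reproduce how that was enforced.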
The basic survey eligibility criteria included men aged ≥18 years who had manual/oral/anal sex with another man in exchange for cash/kind in the last one month. In Tamil Nadu, men aged ≥18 years who had anal sex with another man in the past one month were enrolled. Our analysis focused on assessing the validity of self-reporting against STI laboratory diagnosis within MSM. Since the methods of sampling, behavioural data collection and laboratory testing were standardized and were the same in all three surveys, the variation in eligibility criteria between the surveys conducted in Tamil Nadu and the other states was not considered especially significant for our analysis, as long as the samples represented the MSM population.
Face-to-face interviews were conducted by trained field workers in the local language of the state, using a structured questionnaire that included questions on socio-demographic characteristics, sexual behaviours and programme exposures. Interviews were conducted in private locations hired specifically for the purpose. Blood and urine samples were collected for HIV/STI diagnosis. No rectal samples were collected. The Rapid Plasma Reagin (RPR) test was used to diagnose syphilis, which was confirmed by the Treponema pallidum Haemagglutination Assay (TPHA). Neisseria gonorrhoeae (NG) and Chlamydia trachomatis (CT) infections were diagnosed using the APTIMA Combo 2 (AC2) nucleic acid amplification test on urine samples. The survey was approved by the ethics committees of the participating institutes of the Indian Council of Medical Research (ICMR) and by Family Health International's (FHI) Protection of Human Subjects Committee. Informed consent was obtained from all respondents, and in the case of illiterate respondents it was administered in the presence of a witness. A detailed description of the survey methodology of IBBA has been published earlier [29].

Measures
A few important variables were constructed for this analysis and are outlined below.

STI/HIV information exposure:
Exposure to STI/HIV information was measured by asking the respondents if they had received STI/HIV information from a peer educator or outreach worker in the past one year.

Knowledge of STI:
This variable was derived from responses to the following: (1) knows that MSM are at higher risk of being infected with HIV/STI; (2) spontaneously describes any one of the following STI symptoms (genital or anal ulcer/sore, rectal discharge, pain on defecation, burning pain on urination, urethral discharge, and pain in the groin).

STI treatment history:
This variable was measured by asking the respondents whether or not they had received free medicines for STI from any NGO/programme in the past one year or had visited any NGO/programme clinic in the past one year.

Self-reported "past" and "current" STI symptoms:
Defined as having any one of six symptoms (genital or anal ulcer/sore, rectal discharge, pain on defecation, burning pain on urination, urethral discharge, pain in the groin) at least once in the 'past one year' or 'currently', as reported by the respondent (spontaneously or when prompted).

Self-reported recent condom non-use:
Respondent's report of not using a condom at last sexual intercourse with any one of their clients/partners (regular male partner, commercial male/hijra client, non-commercial male/hijra partner, commercial and regular female partner).

Self-reported inconsistent condom use:
Respondent's report of not using a condom for every sexual encounter with a client/partner (regular male partner, commercial male/hijra client, non-commercial male/hijra partner, commercial and regular female partner).

Laboratory diagnosed syphilis:
A reactive serum rapid plasma reagin (RPR) test confirmed by the Treponema pallidum haemagglutination assay (TPHA).

Laboratory diagnosed NG:
A positive diagnosis of NG using the APTIMA Combo 2 (AC2) nucleic acid amplification test on urine samples.

Laboratory diagnosed CT:
A positive diagnosis of CT using the APTIMA Combo 2 (AC2) nucleic acid amplification test on urine samples.

Laboratory diagnosed Any STI:
A combined variable of a reactive syphilis test or a positive diagnosis of N. gonorrhoeae or C. trachomatis.

Statistical analyses
Descriptive statistics were used to assess the prevalence of laboratory diagnosed individual and any STIs and of self-reported STI symptoms. The chi-square test was used to assess the significance of bivariate relationships between demographic characteristics of MSM and self-reports. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated using standard formulae to assess the validity of self-reported STI symptoms (past/current), inconsistent condom use and recent condom non-use in relation to laboratory diagnosed STIs.
Three multiple logistic regression models were used to predict concordant self-reporting in relation to laboratory diagnosed STI (any STI). Self-reporting was considered concordant when a positive self-report matched a positive laboratory diagnosis or when a negative self-report matched a negative test result. Model 1 examined the predictors of concordance of self-reported current STI symptoms and laboratory diagnosed STIs. The dependent variable was created by matching current self-reported STI symptoms with laboratory diagnosed STI (coded as 1 if matched and 0 if not matched). Model 2 examined the predictors of concordance of self-reported inconsistent condom use and laboratory diagnosed STIs. The dependent variable was created by matching self-reported inconsistent condom use with laboratory diagnosed STIs (coded as 1 if matched and 0 if not matched).
Model 3 used a dependent variable inclusive of both "concordant self-report of current STI symptom with laboratory diagnosed STI" and "concordant self-report of inconsistent condom use with laboratory diagnosed STI", termed "concordant self-reported symptom/risk", and examined its predictors. This integrated model was created to check whether its predictors were consistent with those identified from Model 1 and Model 2. The dependent variable in Model 3 was created by combining the dependent variables of Model 1 and Model 2 (coded as 1 if both or either one of them was concordant and 0 if both were discordant).
In addition, a separate regression model was developed to identify the predictors of "False Negative" self-reports of current STI symptoms. The dependent variable was created by coding "False Negative" self-reports as 1 and "True Positive and/or True Negative" self-reports as 0.
In all the regression models, the independent variables included were age, education status, duration of sexual exposure, marital status, substance use, knowledge of STI, STI treatment history and STI/HIV information exposure of MSM. Adjusted odds ratios (AOR) were calculated, with significance set at p<0.05. All statistical calculations were conducted after adjusting for sampling differences by applying sample weights. STATA/SE version 12.0 was used for all analyses.

Prevalence of self-reported STI symptoms and laboratory diagnosed STIs among MSM
As Table 1 shows, 453 (8.3%) respondents reported ever experiencing any STI symptoms (past or current). Respondents experiencing any STI symptoms in the past (451) were twice as many as those who reported current STI symptoms (165). While few reported urethral or rectal discharge, currently or in the past, genital/anal ulcers were reported by a relatively larger proportion at both time points. Laboratory diagnosis indicated a high prevalence of syphilis (12.8%) and a low prevalence of urethral NG (1.2%) and CT (0.3%).
The state-wise prevalence of any STI was 14.7%, 11.6% and 14.4%.

Prevalence of self-reported inconsistent condom use and recent condom non-use
Self-reported inconsistent condom use (77.6%) was almost twice as common as self-reported recent condom non-use (43.1%) (data not shown in table).
Table 2 shows that the percentages of MSM reporting past and current STI symptoms were similar across groups defined by age, self-identity, literacy status, marital status and occupation. However, the percentages reporting recent condom non-use and inconsistent condom use were higher among MSM who were married or literate and lower among male sex workers. MSM who identified as Kothis had higher self-reporting of inconsistent condom use and lower reporting of recent condom non-use. Bivariate analysis showed that occupation, self-identity, marital status and literacy significantly distinguished MSM who self-reported recent condom non-use and inconsistent condom use.

Validity of self-reports
Table 3 shows that the sensitivity of self-reported STI symptoms against laboratory diagnosis was low (5-13) while specificity was high (88.6-95.8). Self-reported inconsistent condom use and recent condom non-use showed greater sensitivity than STI symptoms (50-74.4) but were less specific (21-52.9). A combination of self-reported STI symptoms and recent condom non-use had a sensitivity ranging from 52.3 to 77.9 and a specificity ranging from 18.9 to 51.8.
Overall, both kinds of self-reports showed a high NPV (84.4-87.9) and a low PPV (12.4-15.7). When assessed by self-reports of STI symptoms, false negative reports numbered 445-448 (11-12%), but when assessed by self-reports of inconsistent condom use and condom non-use, false negative reports fell to 131-256 (3-6%).
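The four validity measures reported in this section follow the standard definitions based on a 2x2 cross-tabulation of self-report against laboratory diagnosis, with the laboratory result as the gold standard. The sketch below is a minimal illustration of those formulae; the function name and the counts in the usage example are hypothetical and are not taken from Table 3.

```python
def validity_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp: self-report positive, laboratory positive
    fp: self-report positive, laboratory negative
    fn: self-report negative, laboratory positive
    tn: self-report negative, laboratory negative
    """
    def ratio(numerator, denominator):
        # Avoid ZeroDivisionError when a margin of the table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # positives reported among lab positives
        "specificity": ratio(tn, tn + fp),  # negatives reported among lab negatives
        "ppv": ratio(tp, tp + fp),          # lab positives among positive reports
        "npv": ratio(tn, tn + fn),          # lab negatives among negative reports
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the pattern of low
    # sensitivity and PPV alongside high specificity and NPV.
    for name, value in validity_measures(tp=10, fp=50, fn=90, tn=850).items():
        print(f"{name}: {value:.3f}")
```

With counts of this shape, few infected respondents report symptoms (low sensitivity), yet most negative reports are correct simply because most respondents are uninfected (high NPV).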

Predictors of concordance between self-reported STI and laboratory diagnosed STI
Model-1 in Table 4 shows that MSM with education grade 1-5 [AOR: 1.6 (CI 1.1-2.4)] and education grade more than 12 [AOR: 3.2 (CI 1.7-5.9)] were more likely to give concordant reports of current STI symptoms than illiterate MSM. The model also showed that MSM treated for STI in the past were less likely to give concordant reports of current STI symptoms [AOR: 0.7 (0.5-0. ...
The fourth regression model, which assessed the predictors of "False Negative" self-reports of current STI symptoms, indicated that MSM with higher education status, with grade 6-12 [AOR 0.6 (CI 0.3-0.9)] and grade more than 12 [AOR 0.2 (CI 0.1-0.5)], were less likely to report false negatives. Additionally MSM with longer duration of sex work ...

Table 3: Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of self-reported STI symptoms, recent condom non-use and inconsistent condom use for laboratory-diagnosed STIs among MSM, Andhra Pradesh, Tamil Nadu and Maharashtra, India.
Note: STI symptoms refer to any of the following: genital/anal ulcer or sore, rectal discharge, pain on defecation, burning pain on urination, urethral discharge, groin swelling. Sensitivity (SEN) denotes the proportion of MSM with any laboratory-diagnosed STI who reported current STI symptoms. Specificity (SP) denotes the proportion of MSM without any laboratory-diagnosed STI who did not report current STI symptoms. PPV and NPV denote positive predictive value and negative predictive value, respectively. Laboratory diagnosis was used as the gold standard for calculating values; N=3895.

Table 4: Predictors of concordance of self-reports with laboratory diagnosis of STI among MSM, Andhra Pradesh, Tamil Nadu and Maharashtra, India (columns: characteristics, n, Model-1, Model-2, Model-3).
Note: # For the percentage value, the numerator consists of concordance between self-report and laboratory diagnosis, both when the laboratory diagnosis is positive and when it is negative. The denominator consists of MSM defined by characteristics. All % values were derived after applying sample weights. Overall N=3895, which included missing values. *p<.05; CI: confidence interval; AOR: adjusted odds ratio. The dependent variable in Model 1 was created by matching "self-reported current STI symptom" with "laboratory diagnosed STIs" (coded as 1 if matching and 0 if not matching). The dependent variable in Model 2 was created by matching "self-reported inconsistent condom use" with "laboratory diagnosed STI" (coded as 1 if matching and 0 if not matching). The dependent variable in Model 3, "concordant self-reported symptom/risk", was created by combining the dependent variables of Model 1 and Model 2 (coded as 1 if both or either one of them was concordant and 0 if both were discordant).
[Data not shown in Table].
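The coding of the dependent variables for Models 1 to 3 and for the false-negative model can be written out explicitly. The following is a minimal sketch under stated assumptions: the function names are hypothetical, each respondent is represented by boolean flags, and false-positive reports are returned as `None` in the false-negative model because the text does not say how they were coded.

```python
def concordant(self_report_positive, lab_positive):
    """Models 1 and 2: 1 if the self-report matches the laboratory result."""
    return int(self_report_positive == lab_positive)


def model3_outcome(symptom_reported, inconsistent_use_reported, lab_positive):
    """Model 3: 1 if either or both self-reports are concordant, 0 if both
    are discordant with the laboratory diagnosis."""
    symptom_match = concordant(symptom_reported, lab_positive)
    condom_match = concordant(inconsistent_use_reported, lab_positive)
    return int(bool(symptom_match or condom_match))


def false_negative_outcome(symptom_reported, lab_positive):
    """Fourth model: 1 for a false negative, 0 for a true positive or true
    negative. False positives return None (assumption: treated as excluded)."""
    if lab_positive and not symptom_reported:
        return 1
    if symptom_reported == lab_positive:
        return 0
    return None


if __name__ == "__main__":
    # Hypothetical respondent: no symptoms reported, inconsistent condom
    # use reported, laboratory diagnosis positive.
    print(concordant(False, True))            # Model 1 outcome
    print(concordant(True, True))             # Model 2 outcome
    print(model3_outcome(False, True, True))  # Model 3 outcome
    print(false_negative_outcome(False, True))
```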

Discussion
This well-conducted, large-scale community-based survey among MSM, with a high prevalence of syphilis (12%) and lower rates of urethral NG and CT, revealed that self-reported STI symptoms, whether 'current' or in the 'past', had very low sensitivity in predicting laboratory diagnosed STI (syphilis/NG/CT). The study identified population characteristics that influence the concordance of self-reported STI symptoms and inconsistent condom use with laboratory diagnosed STIs.
Overall, the rates of self-reported past STI symptoms were higher than those of current STI symptoms, which may reflect a recent decrease in STI prevalence due to improved treatment under prevention programmes, as well as perceived stigma in reporting current symptoms to the investigator. It may also be due to the longer time frame covered by past STI symptoms (1 year) compared with current STI symptoms (1 month), which increases the probability of a positive report in the former. The high level of self-reported past STI symptoms noted among male sex workers in this same population strengthens this argument.
Less than one-tenth of the respondents who had a laboratory diagnosed STI reported ever experiencing any STI symptoms, which was reflected in the low sensitivity of self-reported STI symptoms. However, the high specificity of self-reported STI symptoms indicates that MSM more correctly reported their STI (disease)-free status. Unlike the PPV, the NPV was found to be very high for all self-reports (84.4-87.9), indicating that MSM were more likely to accurately report disease-free status than infection status, and safe sex behaviours ('true-negative' self-reports) than risky behaviours. Similar findings have also been reported among African-American female adolescents by Harrington et al. [7].
In contrast, the low PPVs of all self-reports indicate that MSM were more likely to report incorrectly their STI status and condom non-use/inconsistent use ('false-positive' self-reports). Our study thus indicates that self-reported STI symptoms cannot be considered a surrogate for assessing the actual prevalence of STIs, as noted in some other studies [12,30]. It also raises concerns about the utility of STI-related information collected through routine surveys such as the Behavioural Sentinel Surveillance, National Family Health Survey and Reproductive Health Survey, where researchers may be tempted to equate the prevalence of self-reported symptoms of STI with the actual prevalence of STIs [31,32].
In assessing the relationship between self-reported risk behaviour and STI status, we found that self-reported inconsistent condom use (considered individually or in combination with self-reported current STI symptoms) had relatively high sensitivity in predicting laboratory diagnosed STIs. However, the specificity (18.9-21) and PPVs (12.4-12.7) were low, once again highlighting that self-reports of risk behaviours may not be adequate predictors of STI status in spite of their high sensitivity, even when combined with self-reported symptoms.
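Why a self-report can combine a high NPV with a low PPV follows from how predictive values depend on prevalence. The sketch below applies Bayes' theorem to a given sensitivity, specificity and prevalence. The prevalence of 0.143 is the overall figure reported in this paper; pairing a sensitivity of 0.13 with a specificity of 0.886 is an assumption made purely for illustration, since the paper reports ranges rather than matched pairs, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary indicator via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Illustrative inputs only: see the assumptions stated above.
    ppv, npv = predictive_values(sensitivity=0.13, specificity=0.886,
                                 prevalence=0.143)
    print(f"PPV={ppv:.3f}, NPV={npv:.3f}")
```

Under these illustrative inputs most positive self-reports come from uninfected respondents, because uninfected respondents greatly outnumber infected ones, while most negative self-reports are correct for the same reason.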
Regression analysis showed that the factors associated with concordance of self-reported current STI symptoms and inconsistent condom use were related to education status in general and knowledge of STIs in particular. The predictors identified in the integrated Model-3 were also consistent with the findings of Model-1 and Model-2, which strengthens our study findings. A similar association of education status and knowledge of STI with concordant reporting of symptoms and risk behaviour has been identified in the past by Plummer in Tanzania and Hong in China, respectively [12,33]. Additionally, concordant reporting of risk behaviours, noted among substance users in this study, has also been documented among drug users in many studies conducted in the United States; the underlying reasons need to be explored further through focused qualitative studies [34][35][36]. The predictors of "False Negative" self-reports of current STI symptoms were similar to the predictors of overall concordance of self-reports of current STI symptoms, in relation to "False Negatives" and "False Positives", which highlighted the consistency of the predictors identified in our study.
We also recognize that the low sensitivity and concordance of self-reported STI symptoms with laboratory diagnosis in this study may be due to the fact that many STIs are largely asymptomatic. Additionally, the prevalence of any STI (14.31%) in this population was driven by syphilis, and the prevalence of NG (which is most likely to be symptomatic) was very low (0.3%), which may have led to such results.
Our findings thus have significant implications for the validity of self-reported data collected routinely in intervention settings and during surveillance, which are used widely by various organizations to estimate levels of risk and STI burden in the target population. Self-reported STI symptoms, often captured as syndromes, also serve as a cornerstone of programme evaluation and disease screening, and we wish to highlight the possible limitations in the validity of these reports. Our findings, along with those of other studies, could be used to reassess existing data as well as to drive implementation of measures to ensure the validity of symptomatic/syndromic data collected at different levels in the programme setting [11,37,38]. Deployment of innovative interviewing techniques, such as audio computer-assisted self-interviewing (ACASI), in field-based surveys may help in addressing stigma and social desirability bias [4]. Collection of self-reported data in future large-scale surveys could be improved by using the predictors of concordant reporting identified in our and other studies [21].

Study limitations
Although the IBBA survey offered respondents a physical examination by trained doctors, there was a high refusal rate, making it difficult to validate the self-reported symptoms as a proxy for laboratory diagnosis. Another limitation was that this survey collected limited information on general STI symptoms, and hence we were unable to match individual self-reports of specific STI symptoms with specific laboratory-diagnosed STIs to check consistency.
This study restricted the laboratory diagnosis of NG and CT to urethral infections; rectal/pharyngeal samples were not collected. Consequently, any rectal/pharyngeal NG/CT infections were not detected. It can be argued that diagnosing these infections in addition could have increased the prevalence of laboratory diagnosed STIs and might have led to different and possibly improved associations between self-reported symptoms and laboratory diagnosed STI. However, as specific self-reported 'rectal' and 'anal' symptoms were low in this study (0.2% and 0.7% respectively), we believe that including these would not have had a significant impact on the study findings.
Finally, while some might wonder whether the difference in eligibility (inclusion) criteria between Tamil Nadu and the other states may affect the study, we feel that this is largely extraneous to the purpose of our analysis. The behavioural data used for our analysis, as well as the methods for sample collection and laboratory diagnosis, were standardized and uniformly applied across all the survey sites. We wish to emphasise that, whatever the risk or prevalence, our analysis is geared towards ascertaining the relationship between reporting of symptoms and/or risk and the presence of laboratory diagnosed STI. Having said this, we do recognise that there is still some possibility that the differences in inclusion criteria might have led to differences in STI risk or prevalence across the states. However, the data show that overall STI prevalence was not very different in each of the states (a maximum of 3% difference). Similarly, the proportion of MSM identifying themselves as Kothis (who primarily practise receptive anal sex) was also similar across states (with a maximum of 1% difference). Thus, while an effect is plausible, the differences in inclusion criteria may not have substantially affected the analysis presented in this paper. Combining the data from three states yielded a large dataset and provided a rare opportunity for comprehensively assessing the self-reporting of MSM across three high focus states in India.

Conclusions
The findings from our analysis of a bio-behavioural survey among MSM in three states of India suggest that self-reported STI symptoms, alone or combined with risk behaviours like inconsistent condom use, are not valid predictors of STI status. Thus the use of self-reported data on STI symptoms or syndromes alone, from behavioural surveys or programme data, for advocacy, planning, or assessment of burden or impact of interventions would not be accurate.
Our data reaffirm that laboratory-based investigations with the highest sensitivity and specificity for identifying STIs should remain the cornerstone of STI prevalence surveys, even in populations with a moderate/low prevalence of STIs, in spite of the cost. The predictors of concordant self-reporting identified in this study highlight that the background educational level and intervention exposure status of the high risk population surveyed need to be considered and used in designing survey instruments with improved sensitivity for future large-scale surveys.