Untargeted Lipidomic Profiling of Human Plasma Reveals Differences due to Race, Gender and Smoking Status

Volume 4 • Issue 1 • 1000131

Abstract
Lipidomic profiling can link genetic factors and exposures to risks of chronic diseases. Using untargeted liquid chromatography-Fourier Transform mass spectrometry (LC-FTMS), we explored differences in 3,579 lipidomic features in human plasma from 158 non-fasting subjects, pooled by race, gender and smoking status. Significant associations with race (23 features), smoking status (9 features) and gender (2 features) were detected with analysis of variance (ANOVA)-based permutation tests. Identities of several features were confirmed as plasmalogens (vinyl-ether phospholipids) that were present at 2-fold greater concentrations in black subjects. Other putative features, based on accurate masses, were more abundant in white subjects, namely, dihomo-γ-linolenoyl ethanolamide (DGLEA), an endogenous endocannabinoid receptor agonist, and a glycerophosphocholine [PC(16:0/18:1)]. After adjustment for race, multivariable linear regression models showed that gender was significantly associated with levels of plasmalogens and DGLEA and that consumption of animal fat was marginally associated with concentrations of plasmalogens. Interestingly, BMI did not explain additional variability in any race-adjusted model. Since plasmalogens are antioxidants that are generally regarded as health-promoting and DGLEA is an agonist of the cannabinoid receptor, our findings that these molecules differ substantially between black and white Americans and between men and women could have health implications. The concentration of cotinine was greatly elevated in smoking subjects and 6 features with m/z values suggestive of phospholipids or sphingomyelins were present at significantly lower concentrations in smokers.

Introduction
Since lipids are essential to functional membranes, energy storage and signaling [1,2], lipidomics provides an avenue for linking important biological processes with disease states. Indeed, differences in lipid profiles have been reported in investigations of cancer [3][4][5][6], diabetes [7], Alzheimer's disease [8,9] and cardiovascular disease [10,11]. Such studies increasingly rely on high-resolution mass spectrometry (MS) platforms that can detect thousands of lipidomic features in plasma while simultaneously providing accurate masses for annotation [12].
Given strong associations between blood-lipid levels and chronic diseases, it is surprising that baseline lipidomic profiles have not been reported across fundamental population characteristics such as race and gender as well as lifestyle factors such as smoking. Here, we used untargeted Fourier Transform (FT) MS to obtain lipidomic profiles containing over 3,000 features detected in plasma from healthy American subjects stratified by race (black and white), gender and smoking status. Race was the strongest classifying factor (23 significant features) followed by smoking status (9 features) and gender (2 features). Identities assigned to race-discriminating features included several plasmalogens (ether phospholipids containing fatty alcohols with vinyl-ether linkages in the sn-1 position and fatty acids with ester linkages in the sn-2 position) that were more abundant in black subjects. Tentative assignments, based on accurate masses, pointed to greater concentrations in white subjects of an endogenous endocannabinoid receptor agonist and a choline. Several unidentified features, with masses suggestive of phospholipids or sphingomyelins, were present at lower concentrations in smoking subjects. Since all of these lipids are physiologically important and some have been associated with chronic diseases, our results suggest that young American adults may be predisposed to diseases because of differing lipid concentrations associated with race, gender and smoking.

Plasma samples
Non-fasting blood samples were obtained in heparin from 158 healthy subjects (78 males and 80 females), representing a subset from a previous study conducted by the corresponding author under an approved human-subjects protocol [13]. Within a few hours of collection, plasma was separated from red blood cells by centrifugation. The red cells were washed with an equal volume of PBS, and the wash was added to the plasma, thereby diluting it. Plasma samples were frozen and stored at -80°C for approximately 13 years prior to being aliquoted and pooled by combining aliquots from 4 to 6 subjects stratified by race, gender and smoking status. (Pooling was required by our institutional review board to ensure anonymity of subjects.) A quality control sample was prepared by pooling 100 µl of each of the 35 pooled samples.

Demographics, smoking and dietary assessment
Demographic characteristics, including race, age, height and weight, were obtained with a standardized questionnaire at the time of phlebotomy. Smoking status was based upon current smoking (yes/no). A semi-quantitative food-frequency questionnaire containing 131 items was used to evaluate average daily consumption of fat (animal, vegetable and cholesterol) over the past six months for each individual [14,15]. All dietary-intake values were compiled at the Channing Laboratory, Harvard Medical School [16,17].

Extraction of lipids
Lipids were extracted as described previously [18]. Briefly, 100 µl of plasma was thawed on ice and then mixed with 3 ml of chloroform:methanol (2:1, v/v) and 900 µl of phosphate buffered saline (PBS). After vortexing, the mixture was centrifuged at 2000×g for 5 minutes. The bottom layer was collected, dried under N2, and dissolved in 100 µl chloroform. Extracts were stored at -80°C before LC-MS analysis.

LC-MS analysis
Liquid chromatography-MS analysis was performed with a Surveyor LC system coupled to an LTQ-FTMS, containing a heated electrospray ionization (ESI) source (Thermo Fisher Scientific, Waltham, MA). The MS was operated in both positive (ESI+) and negative (ESI-) ionization modes with data collected from m/z 100 to 1200. For LC separation, a Luna C5 column (4.6×50 mm, 100 Å, 5 µm, Phenomenex, Los Angeles, CA) was selected with column and autosampler temperatures maintained at 25°C and 4°C, respectively. The C5 column was selected to elute all potential lipids in the samples, including hydrophobic triacylglycerides and cholesteryl esters. Injection volumes were 20 µl and 25 µl for ESI+ and ESI- ionization, respectively. Mobile phases contained 0.1% formic acid for ESI+ ionization and 0.1% ammonium hydroxide for ESI- ionization. The column was eluted with a gradient of mobile phase A (methanol:50 mM ammonium formate 5:95) and mobile phase B (isopropanol:methanol:50 mM ammonium formate 60:35:5) as follows: 100% A for 5 minutes at 0.1 ml/min; 0-100% B over 15 minutes at 0.4 ml/min; 100% B for 5 minutes at 0.5 ml/min; 0-100% A for 5 minutes at 0.4 ml/min. Blank and QC samples were analyzed after 7 or 8 experimental samples to wash the column and monitor stability.
The MS was tuned with the following high-abundance lipids in several structural classes: tuning in positive mode employed LPC (16:0), LPE (18:1), PC (36:4), PE (34:1) and TAG (58:5) and tuning in negative mode employed FA (16:0), FA (20:4), PI (24:1) and PG (34:1). Several FTMS parameters, namely, mass resolution, maximum injection time, and maximum number of ions collected for each scan, were optimized for sensitivity while maintaining a mass resolution of 100,000. The following settings were used: vaporizer temperature, 280°C; sheath and auxiliary gases, 35 and 15 (arbitrary units); spray voltage, 3.5 kV; capillary temperature, 350°C; capillary voltage, 10 V; tube-lens voltage, 120 V; maximum injection time, 1000 ms; maximum number of ions collected for each scan, 5×10^5. Mass calibration was carried out with a standard LTQ calibration mixture (Thermo Scientific, Waltham, MA). For untargeted analyses, a full scan was used for the FTMS with a mass resolution of 100,000, and data were recorded in centroid mode. To study structures of discriminating features, tandem MS/MS analyses were performed with the linear ion trap in low-resolution mode with a CID voltage of 30 V. Accurate masses were calculated using the lipid calculator (http://pharmacology.ucdenver.edu/lipidcalc/) and then extracted with a mass tolerance of 10 ppm in the total ion chromatogram (TIC).
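The 10 ppm mass tolerance used for extracting accurate masses can be illustrated with a minimal Python sketch. The function names and the example m/z values are hypothetical and chosen only for illustration; they are not taken from the study's software or data.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def mz_window(theoretical_mz: float, tol_ppm: float = 10.0) -> tuple[float, float]:
    """Lower and upper m/z bounds of the extraction window for a given tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


if __name__ == "__main__":
    theoretical = 760.5851  # illustrative value only
    for observed in (760.5900, 760.6000):
        print(observed, round(ppm_error(observed, theoretical), 2),
              within_tolerance(observed, theoretical))
```

Because the tolerance is relative, the absolute width of the extraction window grows with m/z: 10 ppm corresponds to ±0.001 at m/z 100 and ±0.012 at m/z 1200, the limits of the scan range.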

Quantitation of analytes
Because PBS had been added to plasma (as erythrocyte washes) at the time of phlebotomy, volumes of diluted plasma varied across the pooled samples in our investigation. Thus, rather than quantifying peaks of unknown lipidomic features relative to internal standards, quantitation was based on dividing each peak intensity by the sum of all peak intensities detected in each pooled sample [19,20].
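The normalization described above can be sketched as follows. This is a minimal illustration, assuming a features-by-samples intensity matrix; the function name is hypothetical, and the default scale factor of 10,000 follows the data-processing description in this paper.

```python
import numpy as np


def normalize_to_total_intensity(intensities, scale: float = 10_000.0) -> np.ndarray:
    """Divide each peak intensity by the summed intensity of its sample.

    `intensities` is a 2-D array with features in rows and pooled samples
    in columns (an assumed layout). Each column of the result sums to `scale`.
    """
    x = np.asarray(intensities, dtype=float)
    totals = x.sum(axis=0, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("every sample needs a positive total intensity")
    return x / totals * scale


if __name__ == "__main__":
    # Hypothetical data: the second sample is the first one diluted 2-fold.
    raw = np.array([[200.0, 100.0],
                    [600.0, 300.0],
                    [1200.0, 600.0]])
    print(normalize_to_total_intensity(raw))
```

In this toy example the two columns become identical after normalization, which is the property that makes the approach insensitive to the variable dilution of the pooled plasma. The trade-off is that each value is a relative abundance rather than an absolute concentration.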

Data collection and processing
Data were collected continuously over the 30-minute LC separation using Xcalibur software (Thermo Fisher Scientific). The raw data were converted to mzXML data format using ProteoWizard software (Spielberg Family Center for Applied Proteomics, Los Angeles, CA). Peak detection, retention time correction and alignment were processed on the XCMS platform (http://xcmsserver.nutr.berkeley.edu/). All data-collection parameters were set to the "HPLC Orbitrap" default values (centWave feature detection, loess non-linear retention time alignment, 0.5 minimum fraction of samples in one group to be a valid group, P-value thresholds=0.05, isotopic ppm error=5, m/z absolute error=0.015) except the following: maximal tolerated m/z deviation in consecutive scans=3.5 ppm; width of overlapping m/z slices (mzwid)=0.005; retention time window (bw)=15 s; minimum peak width=20; maximum peak width=80. Lists of retention times (RT), m/z values and peak intensities were exported to an Excel spreadsheet for processing. As noted previously, the intensity of each peak was normalized to the sum of total intensities in each sample and was then multiplied by 10,000 for statistical analysis.
Mass accuracy and stability of the method were estimated from repeated analysis of 8 ion peaks representing lipids detected in the quality control sample that covered large ranges of masses, intensities and retention times. As shown in Supplemental Information, Table S1, mass accuracies were less than 6 ppm and coefficients of variation of retention times and peak intensities were 0.10%-0.56% and 4.08%-24.47%, respectively.
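The coefficients of variation reported for the quality control sample can be computed as in the following minimal sketch. The function name and the example intensities are hypothetical; the use of the sample standard deviation (n - 1 denominator) is an assumption, since the paper does not state which estimator was used.

```python
import numpy as np


def coefficient_of_variation(values) -> float:
    """Coefficient of variation in percent: sample SD divided by the mean."""
    v = np.asarray(values, dtype=float)
    if v.size < 2:
        raise ValueError("at least two replicate measurements are required")
    return float(v.std(ddof=1) / v.mean() * 100.0)


if __name__ == "__main__":
    # Hypothetical peak intensities of one lipid across repeated QC injections.
    qc_intensities = [100.0, 110.0, 90.0]
    print(coefficient_of_variation(qc_intensities))
```

The same function applies to retention times; only the input changes.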

Statistical analysis
Because plasma samples were pooled for the current investigation (4-6 subjects per pooled sample), mean values were used for statistical analyses. A combination of univariate and multivariate statistical models was used to investigate discriminating features. First, two-tailed Student's t-tests were performed to screen for discriminating features by race, gender and smoking status. Then, analysis of variance (ANOVA) methods were applied using the R platform with significance determined using a non-parametric permutation test with 10,000 observations [21]. P-values were adjusted for multiple comparisons with the Benjamini-Hochberg (BH) method to control the false discovery rate (FDR) [22]. After application of the BH method, 34 significant features were detected.
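The logic of a permutation-based ANOVA followed by BH adjustment can be sketched as below. This is an illustrative Python simplification, not the R code used in the study: it assumes a one-way design with a single grouping factor, whereas the actual models may have included several factors, and all function names and data are hypothetical.

```python
import numpy as np


def f_statistic(values, labels) -> float:
    """One-way ANOVA F statistic for `values` grouped by `labels`."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    groups = [values[labels == g] for g in np.unique(labels)]
    grand_mean = values.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p_value(values, labels, n_perm: int = 10_000, seed: int = 0) -> float:
    """P-value from shuffling group labels and recomputing the F statistic."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = f_statistic(values, labels)
    exceed = sum(
        f_statistic(values, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    # Adding 1 to numerator and denominator avoids reporting a P-value of zero.
    return (exceed + 1) / (n_perm + 1)


def benjamini_hochberg(p_values) -> np.ndarray:
    """BH-adjusted P-values, returned in the order of the input."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest P-value downwards.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted


if __name__ == "__main__":
    # Hypothetical normalized intensities of one feature in 10 pooled samples.
    values = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0, 5.1, 4.9, 5.05, 4.95]
    labels = ["a"] * 5 + ["b"] * 5
    print(permutation_p_value(values, labels, n_perm=2000))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In practice the permutation P-value would be computed once per feature, and the full vector of P-values would then be passed to the BH step.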
After putative identification of discriminating features

Structural identification of discriminating features
Preliminary identification relied upon matching accurate masses from FTMS (with a mass tolerance of 10 ppm) with entries in the Human Metabolome Database (HMDB) (http://www.hmdb.ca/), the Structure Database of Lipid Maps (LMSD) (http://www.lipidmaps.org) and the Metabolite and Tandem MS Database (METLIN) (http://metlin.scripps.edu/). Since human plasma rarely contains lipids with odd-numbered fatty acyl chains, matches representing odd-numbered acyl chains were removed. Other filtering rules were constructed based on relative abundances of signals representing molecular ions and their common adducts, as determined from analyses of our training set of 193 lipid species (Supplemental Information, Table S2). Additional structural information was derived from MS/MS analysis and comparisons with reference standards (Supplemental Information, Section 2).

Table 1 lists summary statistics for the subjects represented by the 35 pooled plasma samples in the current investigation. Subjects were young, with mean ages of 26 years and 25 years for black and white participants, respectively. The mean BMI for black subjects (28.9 kg/m^2) was significantly greater than that of white subjects (24.1 kg/m^2) and black subjects had significantly higher consumption of all forms of fat. Also, smokers consumed significantly more dietary fats than nonsmokers.

Discriminating lipidomic features
To screen for differences associated with race, gender and smoking status, ANOVA models were obtained for each m/z feature (2,862 in ESI+ mode and 717 in ESI- mode) and random permutation tests were performed to establish P-values. The significance of each feature for a given comparison was determined by its P-value after BH correction for false discovery (P-values were truncated at 10^-8). As summarized in Table 2 and Figure 2, a total of 34 discriminating features was detected (23 for race, 9 for smoking status and 2 for gender).
Sixteen of these features were putatively identified by combinations of accurate mass, retention time, MS/MS fragment ions and reference standards (details are given in Supplemental Information, Section 2). The sole non-lipid feature was identified as cotinine (m/z 177.10246), a metabolite of nicotine that was 262 times more abundant in smokers than in nonsmokers. The other tentatively identified features were all lipids that significantly discriminated for race. These race-discriminating lipids included several plasmalogens, putative DGLEA and putative PC (16:0/18:1).
Multivariable linear regression models were used to investigate whether levels of the race-related lipids were affected by gender, BMI or consumption of fat as recorded by 6-month dietary recall. According to the R^2 values of the regression models (Table 3), race accounted for 68.0% of the variation in summed plasmalogen levels, gender for 6.1% and consumption of animal fat for 2.2% (vegetable fat was not associated with plasmalogen levels). The model for putative DGLEA showed that race accounted for 50% of the variation and that race, gender and their interaction jointly explained 62%, with white males having 7 times higher concentrations than black females. Race was the only significant predictor for putative PC (16:0/18:1) and explained 45% of the variance. With race in each model, BMI did not significantly contribute to explained variability.
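One way to attribute explained variance to individual predictors, as in the R^2 breakdown above, is to fit nested ordinary least-squares models and record the gain in R^2 as each predictor is added. The sketch below assumes this sequential approach; the paper does not state how the percentages were partitioned, and the function name and data are hypothetical.

```python
import numpy as np


def r_squared(y, predictors) -> float:
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float((residuals ** 2).sum())
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical pooled samples: indicator variables for race and gender.
    race = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    gender = np.array([0, 1, 0, 1, 0, 1, 0, 1])
    level = 2.0 * race + 0.5 * gender  # noise-free toy response

    r2_race = r_squared(level, [race])
    r2_full = r_squared(level, [race, gender])
    print(f"race alone: {r2_race:.3f}")
    print(f"added by gender: {r2_full - r2_race:.3f}")
```

With this scheme, the share credited to a predictor depends on the order of entry whenever predictors are correlated, so the increments should be read as conditional on the variables already in the model.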

Discussion
Using untargeted lipidomics with LC-FTMS, plasma-lipid changes related to race, gender, and smoking status were detected in healthy young American adults. The fact that baseline concentrations of these lipids differ between black and white subjects could be relevant to interpretation of findings that chronic diseases are more prevalent in black Americans than white Americans [26][27][28].
Most of the race-discriminating lipids were plasmalogens that were present at 2-fold higher levels in black subjects. Plasmalogens are required for membrane integrity and messaging [29,30] and serve as free radical scavengers [3,31,32]. Thus, these lipids are generally regarded as health promoting and several plasmalogens were recently detected at significantly lower concentrations in subjects with pancreatic cancer than in control subjects [6]. On the other hand, some oxidation products of plasmalogens can be toxic [33][34][35][36]. Since animal fat is the major source of plasmalogens in Western diets [37], the observed differences could reflect higher dietary intake of animal fat in black and male subjects (Table 1). Indeed, dietary consumption of plasmalogens increased plasma levels of these lipids in rats [37]. Self-reported consumption of animal fat (but not vegetable fat) explained a small amount of the variability of plasmalogen concentrations in our subjects (Table 3) after adjusting for race and gender. The fact that race was a much stronger predictor of plasmalogen levels than animal fat in our study could reflect imprecision in the dietary assessment of fat consumption and the aggregation of subjects by race/gender pooling. Higher plasmalogen levels in black and male subjects could also point to differential plasmalogen biosynthesis, possibly related to peroxisome activity [29,30]. Although BMI was significantly greater in black subjects, it is noteworthy that BMI did not explain additional variability of identified features after adjustment for race and gender in multivariable models.
Putative DGLEA, which was found at 4-fold higher concentrations in white subjects, is an endocannabinoid that binds to receptors (CB1 or CB2) that are also the targets of tetrahydrocannabinol, the principal active component of marijuana [23]. Upon activation of at least one of these receptors, specific physiological short-range events can be triggered, including neurotransmitter release. Effects of these reactions include analgesia, increased appetite and neural tissue development [38]. Although the endocannabinoid system and its effects are not well understood, disruption of this system has been implicated in metabolic syndrome and accumulation of excess visceral fat [39]. Since the corresponding acid (DGLA) has been shown to have minimal differences across racial groups [40], a differentiating event may occur in the pathway between DGLA and the transformation to an ethanolamide.
Our untargeted lipid profiling discovered 6 features in the mass range between 740 and 790 Da that were present at lower concentrations in smoking subjects. Since this mass range is characteristic of phospholipids or sphingomyelins, our results lend support to the hypothesis that smoking interferes with metabolism of these lipid classes as suggested by targeted profiling of serum samples from smokers and non-smokers by Wang-Sattler et al. [41]. Interestingly, we also found that the level of the PC plasmalogen, PC(P-18:0/22:6), which was associated with race in our study, was approximately 33% lower in smokers, compared with non-smokers, consistent with the targeted study [41]. The two features with m/z 567.38180 and 568.38515 (Table 2) were highly correlated with cotinine (Spearman r=0.928 and 0.948, respectively), suggesting that they are metabolites or reaction products of tobacco.
Because we used archived plasma from a previous investigation [13], our study has several limitations. First, it was necessary to pool the specimens, and thereby anonymize subjects' identities, while retaining testable factors (race, gender and smoking status). Although pooling is generally undesirable for small studies and could have reduced our ability to detect significant differences in population characteristics, those features that differed between races and genders (DGLEA and plasmalogens) are unlikely to be false positives [42]. Second, the blood samples had been obtained from non-fasting subjects, and this could have affected profiles of plasma lipids irrespective of race/gender/smoking status. Third, the blood sampling protocol employed heparinized plasma, and differences in concentrations of numerous lipids have been observed across blood samples collected with different anticoagulants, including heparin [43].
Thus, the 6 plasmalogens and putative DGLEA and PC (16:0/18:1) should be interpreted as lipidomic features that differed significantly between races and genders in samples of heparinized plasma obtained from non-fasting subjects and stored for a prolonged period at -80°C. Finally, as noted previously, archived plasma samples from the 158 individual subjects in the original investigation had been diluted with varying volumes of erythrocyte washes. This effectively precluded quantitation based on internal standards and motivated us to normalize individual features by the sum of all detected peaks. While this method of quantitation could also have reduced precision, and with it the ability to detect discriminating features, it should not have generated false positives.