Development of Zwitterionic Hydrophilic Liquid Chromatography (ZIC-HILIC-MS) Metabolomics Method for Shotgun Analysis of Human Urine

Urine is a product of the body’s metabolism, and the majority of the metabolic products exiting via the renal system are rendered polar in order to be water soluble. Resolution of urinary metabolites for metabolomic studies requires the development of HPLC separation techniques that match this feature of biological chemistry. ZIC-HILIC is an ideal candidate for resolving such metabolites where reversed phase is unable to give adequate separation. Metabolomic data then have to be processed by shotgun multivariate analysis to sift through thousands of analytes and their variables, such as ion intensity.


Introduction
Urinary metabolomic profiling is now widely reported, with numerous methods developed for analyzing urine to ascertain changes due to drug or disease processes. Urine has been regarded as an important bio-fluid for such analysis as it contains information on numerous compounds that are associated with many diseases, e.g. amino acids, peptides, purines, pyrimidines, inorganic ions, organic acids and lipids [1][2][3][4][5][6][7]. Urine is also a preferred bio-fluid because sample collection is simple and non-invasive and because its composition is a truly variable report of a dynamic metabolism, whilst other key body fluids such as blood are kept within homeostatic boundaries. However, accurate and reproducible simultaneous analysis of multiple urinary molecules is critical precisely because urine is such a dynamic fluid reflective of metabolism: concentrations can vary hugely with the state of hydration or water intake, and ketones can vary according to diet and not just diabetes. Nevertheless, despite these considerations urinary metabolomics can be extremely important for disease diagnosis and biomarker studies [8][9][10][11].
Due to the complex constitution of urine, the study of small molecules within the urine requires a platform that can resolve and characterize the maximum number of analytes with the minimal amount of sample preparation artifacts. The majority of urinary analytes are polar in nature, hence high performance liquid chromatography-mass spectrometry (HPLC-MS) based metabolomics is widely used for urinary analyses [12][13][14]. Commonly, reversed phase high performance liquid chromatography (RP HPLC) is employed for urinary metabolomics studies due to its versatile nature. Latterly, normal phase chromatographies are being reintroduced in order to resolve polar analytes that are less amenable to separation by RP HPLC; these are therefore more efficient for urinary metabolomics. The ZIC-HILIC (zwitterionic hydrophilic interaction liquid chromatography) mode of chromatography can be defined as one based on the combination of hydrophilic stationary phases and hydrophobic, mostly organic mobile phases. The ZIC-HILIC stationary phase used in the current work contains a covalently bonded, zwitterionic sulfobetaine group as the functional polar group bonded to the silica. With an aqueous-organic mobile phase a water-rich layer is established within the stationary phase. Partitioning of solutes between the eluent and this water-rich layer leads to their separation. This process is exothermic and is dependent on various factors such as the acidity or basicity of the solutes, dipole interactions and hydrogen bonding. Thus, in ZIC-HILIC, the stationary phase adds an extra dimension of separation mechanism to analyte retention. However, buffer or salts are required in the mobile phase to disrupt the interactions for successful elution. The use of buffers with high salt concentration is not recommended for methods based on mass-spectrometric detection.
However, with ZIC-HILIC lower concentration of buffers can be used as the electrostatic interaction effect is lowered by ionic-groups present on the stationary phase. The mechanism of HILIC partitioning and retention is still not fully understood and work is ongoing in this area by Sequant [15].
In this study, the relatively new separation technique of ZIC-HILIC has been optimised to separate urinary metabolites that were detected using Ion Trap (IT) and Time of Flight (ToF) mass spectrometry.

Methods
Patient samples
30 mL of urine was collected from healthy volunteers. The samples were then aliquoted into smaller volumes and randomized to prevent bias. The urine samples were taken from volunteers with no history of any medical drug intake in the seven days before the sample collection. For the sample integrity study, pregnancy urine samples archived between 1996 and 2000 as part of a prenatal diagnosis service at a large leading teaching hospital, collected from women between 13 and 18 weeks of gestation (based on last menstrual period), were used. These were early morning urine samples which were split into 1 mL aliquots and stored at -20°C. The individually coded samples were later matched with birth outcome and/or cytogenetic data following amniocentesis. Permission to carry out anonymized studies for these samples was granted by the ethics committee of Middlesex University (Natural Sciences Ethical approval number 347). All aliquots were stored at -20°C until further sample preparation and analysis.

Sample storage and preparation
Frozen urine samples were thawed at room temperature before analysis and four aliquots (500 µL) were prepared in Fisherbrand® (Loughborough, UK) Eppendorf tubes. An aliquot (500 µL) of each urine sample was mixed with deionized water (1:2, v/v) (resistivity of 18.1 MΩ cm). The diluted urine samples were transferred to Vivaspin 2® (Sartorius Mechatronics, Surrey, UK) centrifugal concentrators with a 3 kDa molecular weight cut-off filter made of polyethersulfone (PES) membrane. The samples were centrifuged for 30 min at 200.35 g. A small volume (100 µL) of the filtrate was transferred to an HPLC vial and diluted with aqueous mobile phase A.
Stationary phase and optimization of the mobile phase
A 2.1 mm × 150 mm, 5 µm ZIC® HILIC (Sequant, Umeå, Sweden) column was used in this study. For HILIC separation, 50 mM ammonium acetate was prepared by dissolving ammonium acetate in deionized water. The aqueous ammonium acetate was mixed with acetonitrile (95:5, v/v). This was used as mobile phase 'A'. Eluent 'B' was composed of a mixture of 50 mM aqueous ammonium acetate, water and acetonitrile (50:45:5, v/v). A mixture of acetonitrile and deionized water (95:5, v/v) was used for the auto sampler needle wash. The mobile phase was tested and optimized with respect to buffers, pH and organic solvents. The buffer salts tested were ammonium acetate (pH 5.8), ammonium formate (pH 8.2) and sodium formate (pH 2.5). A 10 mL solution of 10 mM creatinine was made in deionized water. An aliquot (100 µL) of creatinine standard was mixed with 1 mL of mobile phase 'A' and 500 µL of urine. Acetonitrile and methanol were tested as organic phases. The urine profiles obtained were compared to identify the conditions that gave the best resolution and peak shape. The effect of flow rate on retention time and peak width was tested using flow rates of 0.1, 0.2, 0.25, 0.3, 0.4 and 0.5 mL min⁻¹.

Stationary phase conditioning
The column was conditioned prior to injection of urine sample or standards for 1 h using a mixture of water and acetonitrile (50:50, v/v) at a flow rate of 0.2 mL min⁻¹.

Preparation of MS tuning solution
The tuning solution was made by mixing 250 mL of 0.1% aqueous trifluoroacetic acid and 250 mL of 10 mM aqueous sodium hydroxide. The pH of the solution was adjusted to 3.5 using 10 mM aqueous sodium hydroxide and the final volume was recorded. The trifluoroacetic acid-sodium hydroxide solution was mixed with an equal volume of acetonitrile.

Mass spectrometer tuning
The IT-ToF mass spectrometer was tuned every day before use in order to calibrate the detector. The tune solution was infused into the mass spectrometer using an integrated syringe pump on the IT-ToF instrument. The nebulising gas flow rate was set to 1.5 mL min⁻¹ and the CDL temperature was set to 200°C. The heat block was set to 200°C and argon was used as drying gas at 105 kPa. The detector voltage was set to 1.6 kV. Ions in the range m/z 150-1250 were scanned in both positive and negative mode using an event time setting of 137 ms scan⁻¹. Ion accumulation time was set to 20 ms. The output signal was allowed to stabilize and 3-4 min of averaged spectra were recorded. These data were used to calibrate the mass spectrometer in both positive and negative ion mode.

HILIC conditions
A 2.1 mm × 150 mm, 5 µm ZIC® HILIC (Sequant, Umeå, Sweden) column was used. The flow rate was maintained at 0.2 mL min⁻¹. The gradient program used is detailed in Table 1. The column was equilibrated for 5 min at a flow rate of 0.4 mL min⁻¹ with 5% aqueous solvent.

Matrix effect
In order to study the effect of sample matrix on analyte retention and ion intensity, the urine was spiked with standards and the results were compared against non-spiked standards. 10 mM adenosine and inosine were made in HPLC grade deionized water. 100 µL of each standard was added to 1 mL of mobile phase 'A' and 500 µL of urine.

Sample integrity study
A study was carried out to determine the stability of samples stored at -20°C in comparison to those kept at 4°C. A pool of 10 urine samples from weeks 13 to 18 of normal pregnancy was split into two groups, which were stored at -20°C and 4°C for nine months. All 12 samples were randomized using the Minitab statistical software package and analyzed in a single batch at the end of the nine months of storage.

Reproducibility study
This study was carried out to determine the reproducibility of retention time and peak intensity of metabolites in urine over 24 h at ambient temperature. A urine sample from a healthy male volunteer was split into five aliquots (2 mL). The experiment was carried out in duplicate: once with injection of mobile phase between every five samples and once without any injections of mobile phase between samples.

Freeze-thaw study
A freeze thaw study was carried out over five days for five freeze and thaw cycles using urine sample from a healthy volunteer. Urine was split into six 2 mL aliquots. The samples were run as a single batch each day and frozen at -20°C immediately after the analysis of last sample.

Validation of methodology and profiling solutions software
In order to validate the methodology and profiling solutions software a study was carried out using different concentrations of paracetamol. Paracetamol was chosen in this experiment as the urine sample from the healthy volunteer was known not to contain the drug. A 10 mL stock solution of 10 mM paracetamol was made using deionized water. This was further diluted to give solutions of paracetamol of concentrations: 1 mM, 0.1 mM, 10 µM, 0.1 µM, 1 nM and 10 pM.
A urine sample from a healthy male, treated as described earlier, was used as the matrix into which these different concentrations were spiked. Ten urine samples were spiked with 100 µL paracetamol standards (two samples spiked with each concentration of paracetamol). Ten urine samples were used as normal/control samples. A pooled QC of all paracetamol concentrations and all control samples was made (x6). An external standard consisting of toluene, adenosine, allantoin, caffeine and uracil was used in order to monitor the instrument conditions and the reproducibility. These external standards were selected for HILIC studies because of their characteristic presence or absence in urine: the in-house mixture was prepared because adenosine, caffeine and uracil are urinary analytes whereas toluene and allantoin are non-urinary analytes under normal conditions. By using this mixture the system conditions were monitored, as it reflected the separation and ionization consistency throughout the day. All spiked and control samples were randomized using Minitab and were blinded. The experiment was carried out with and without injection of mobile phase at the start and end of each sample analysis.
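The randomization and blinding step can be illustrated with a short sketch. The study used Minitab; the Python below is only a stand-in for the same idea, and the sample labels, the blinded code format (S01, S02, ...) and the seed are hypothetical.

```python
import random


def randomise_run_order(sample_ids, seed=0):
    """Shuffle samples and assign blinded codes in injection order.

    Returns (run_order, key): run_order is the list of blinded codes in
    the order they are injected; key maps each code back to its sample.
    """
    rng = random.Random(seed)  # fixed seed so the run list can be regenerated
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    key = {f"S{i + 1:02d}": sid for i, sid in enumerate(shuffled)}
    return list(key), key


# Hypothetical labels: ten spiked and ten control urine samples.
spiked = [f"spiked_{i:02d}" for i in range(1, 11)]
controls = [f"control_{i:02d}" for i in range(1, 11)]
run_order, key = randomise_run_order(spiked + controls, seed=42)
```

The analyst works from `run_order` only; `key` is held back until the multivariate analysis is complete.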

Data pre-treatment: extraction of MS chromatogram
In order to carry out statistical analysis, the mass spectrometry data was exported to a third party software. This was achieved by transforming the total ion chromatograms into numerical version because total ion chromatograms are graphs with complex information that can be difficult to interpret. However by converting them to a numerical data set, values can be assigned to ionization of each m/z value at a particular retention time. This transformation of data from a total ion chromatogram to a numerical matrix was done using Profiling Solutions® software (Shimadzu, UK). The time-aligned high mass accuracy MS data was exported as an aligned matrix to Umetrics SIMCA P+. All the data were scaled and mean-centered in order to standardize the coefficients when exported to SIMCA P+.
All the positive as well as negative ions were included for this transformation. In order to control the bin width when identifying centroid ions, an ion m/z tolerance was set to +/-25 mDa. It was found that the ion mass accuracy is not affected if a wider tolerance is used. The retention time alignment of ions across the data set was limited to 0.2 min in order to separate isomer peaks (if any) with at least 0.2 min retention difference. The ion intensity threshold of 20,000 was set in order to reduce background noise, so that centroid intensities below 20,000 were not included in the data. In an effort to reduce interference from background noise a retention time window was set for extraction. Quality control (QC) samples were injected before and after the sample analysis in order to check reproducibility. The QC samples contained a representative pool of all the samples included in the experiment on the day. Thus, during data pre-treatment only 70%+ of the total pooled QC ions were included in the extracted matrix. The percentage of relative standard deviation in ion response and retention time shift was set to 15% and 5% respectively. A two-column file matrix was generated for each sample which contained all the ions detected at various retention times throughout the whole data file.
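The extraction settings above can be sketched as simple filters. This is not the Profiling Solutions implementation; it is a minimal illustration in which the function names are hypothetical and the 70% criterion is assumed to mean that an ion must be detected in at least 70% of the pooled QC injections.

```python
from statistics import mean, stdev

MZ_TOL = 0.025         # +/-25 mDa m/z tolerance (from the text)
RT_TOL = 0.2           # min; peaks at least 0.2 min apart stay separate
INTENSITY_MIN = 20000  # centroid intensity threshold
QC_PRESENCE = 0.70     # assumed: fraction of QC injections containing the ion
MAX_RSD = 15.0         # % RSD allowed for ion response in the QCs


def same_feature(mz_a, rt_a, mz_b, rt_b):
    """True if two centroids fall in the same m/z and retention-time bin."""
    return abs(mz_a - mz_b) <= MZ_TOL and abs(rt_a - rt_b) < RT_TOL


def keep_feature(qc_intensities):
    """Apply the intensity, QC-presence and RSD filters to one ion."""
    present = [x for x in qc_intensities if x >= INTENSITY_MIN]
    if len(present) < 2:
        return False  # an RSD cannot be estimated from fewer than two values
    if len(present) / len(qc_intensities) < QC_PRESENCE:
        return False
    rsd = 100.0 * stdev(present) / mean(present)
    return rsd <= MAX_RSD


stable = keep_feature([100000, 105000, 98000, 102000, 101000, 99000])
sporadic = keep_feature([100000, 0, 0, 0, 98000, 0])
noisy = keep_feature([100000, 50000, 150000, 100000, 60000, 140000])
```

With these hypothetical QC intensities only the first ion would be retained; the second fails the presence filter and the third fails the RSD filter.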

HILIC mobile phase selection
The aim of the HPLC analyses was to separate as many analytes in the sample as possible. From Figure 1, it can be seen that the separation factor may be affected when acetonitrile (ACN) is used as the organic solvent compared to methanol. However, the peak resolution improves with the use of acetonitrile. The separation factor (α), to some extent, and the resolution factor (Rs) can be seen to increase with the use of ACN in place of methanol. The resolution is better with a long chromatographic total run time for metabolomic profiling of urine [16]. The urinary profile obtained using UV detection at 254 nm did not show a significant difference in the number of HPLC-peaks detected using either ACN or methanol. However, the peaks within the profile were better resolved using acetonitrile (Figure 1).
ACN, when used as the organic component, did show a shift in peaks that indicated longer retention than with methanol. HPLC solvents are more desirable if they have low viscosity and appropriate solvent strength, as they do not cause a build-up of column pressure. On the Hildebrand scale for the HILIC mode of chromatography, which is similar to normal phase chromatography, methanol acts as a stronger solvent than acetonitrile for non-strongly H-bonding organics. The retention time (tR or RT) and retention factor (k') increase as the solvent strength of the mobile phase decreases. This result confirms the nature of the HILIC column used, where ACN, being a weaker solvent than methanol for HILIC, tends to provide a much higher increase in retention [15]. The results were not conclusive but indicated that acetonitrile enhanced resolution and increased retention times. ACN can act as an acceptor of hydrogen bonds and furthermore provides stronger dipole-dipole interactions. Liu et al. (2009) suggested that protic polar solvents like methanol can compete with the polar sites on the stationary phase and in turn disturb the formation of the aqueous layer that is essential for the HILIC partition mechanism [17]. Because of this disturbance of the aqueous layer, retention with methanol could be poor, as is evident from the results, since analytes can form hydrogen bonds easily [18].
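The quantities compared in this section (k', α and Rs) follow from standard chromatographic definitions. A minimal sketch is given below; the retention times, void time and peak widths are hypothetical values chosen only to show the arithmetic, not measurements from this study.

```python
def retention_factor(t_r, t_0):
    """k' = (tR - t0) / t0, where t0 is the column void (dead) time."""
    return (t_r - t_0) / t_0


def separation_factor(k_first, k_second):
    """alpha = k'2 / k'1, with peak 2 the later-eluting peak."""
    return k_second / k_first


def resolution(t_r1, t_r2, w1, w2):
    """Rs = 2 (tR2 - tR1) / (w1 + w2), using baseline peak widths."""
    return 2.0 * (t_r2 - t_r1) / (w1 + w2)


# Hypothetical peak pair, all times and widths in minutes.
t0 = 1.5
k1 = retention_factor(6.0, t0)      # 3.0
k2 = retention_factor(7.5, t0)      # 4.0
alpha = separation_factor(k1, k2)   # about 1.33
rs = resolution(6.0, 7.5, 0.8, 1.0)  # about 1.67
```

A weaker eluent such as ACN lengthens tR for both peaks, which raises k'; Rs improves only if the gap between the peaks grows faster than their widths.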

Optimization of buffer and pH selection
The peak shape for creatinine spiked in urine samples was found to be better using 50 mM ammonium acetate as a buffer in the mobile phase. 0.1% formic acid gave good HPLC-peak shape, but the intensity at 254 nm was lower than that obtained with ammonium acetate as a buffer. Ammonium formate as buffer salt did not give better resolution but gave rise to peak fronting (Figure 2). Common buffers used for HILIC studies include ammonium salts of volatile acids that are also compatible with MS [15]. Trifluoroacetic acid, phosphate buffers and citrate buffers are not ideal or compatible with MS and hence are not recommended for LC-MS studies. Buffers with very high pKa values like borate and diethylamine were not used as the resultant mobile phase pH was too basic for urine studies. Two of the main ammonium salt containing buffers, i.e. ammonium formate and ammonium acetate, were selected along with formic acid to provide an acidic mobile phase for comparison. A low pH is capable of suppressing ionisation of weakly acidic analytes and hence increasing their retention time [19]. The zwitterionic (ZIC) HILIC column used in this experiment operates optimally between pH 2 and 8 [15]. The ammonium acetate at pH 5.8 may have induced polarity in creatinine (pKa 5.02) nearly as much as 0.1% formic acid at pH 2.5 (Figure 2a), leading to nearly similar retention. However, maximum retention was obtained by using ammonium formate as a buffer (pH 8.2). The peak shape and ultraviolet (UV) response were found to be better using ammonium acetate than the other two buffers. At pH values below the pKa, ionisation of creatinine may take place due to the presence of two ionisable hydrogens and one nitrogen in its structure (Figure 2b). Charged solutes are more hydrophilic than their neutral forms and thus more retained in HILIC mode [20]. At low pH these hydrogens may be lost, leading to a change in RT compared to neutral creatinine.
This may have reduced retention on RPLC, but increased retention on HILIC, as is evident from the results. The composition of the aqueous mobile phase can affect the buffering capacity [21]. The pKa of weak bases decreases with an increase in the organic component of the mobile phase whereas the pKa of weak acids increases. In HILIC a high organic environment is used and hence such pKa shifts play a significant role in the protonation of acids or bases. Purine and pyrimidine bases and nucleosides can be separated using gradient elution with decreasing concentration of acetonitrile in buffered aqueous-organic mobile phase on sulfobetaine ZIC-HILIC columns [22]. The compounds elute in the order of decreasing hydrophobicity, in agreement with generally observed HILIC behaviour [23]. Besides the common partition mechanism in HILIC chromatography, the electrostatic interactions between charged analytes and the stationary phase play a major part in the separation of creatinine [24]. The hydrophilic interaction and the ionic interaction were kept constant as the same pH buffer was maintained in both mobile phase solvents, and thus the variation in retention mainly depended on the variation of the charge state of both the analytes and the stationary phase. The peak shapes and retention time reproducibility tend to improve if a cation like ammonium is present in the buffer. Ammonium's higher affinity towards the ionised silanol results in better peak shapes. The ammonium salts surround the charged groups, lowering the electrostatic interaction, but better retention could be observed when the electrostatic repulsion with the stationary phase is higher. This high electrostatic repulsion may be caused by hydrophilic interactions achieved by nearly perfect orientation of the analyte [25]. Alpert et al. (2008) described this combination of mechanisms as electrostatic repulsion-hydrophilic interaction chromatography (ERLIC) [26].
For non-targeted metabolomics it is often impossible to know the pKa values of compounds in the sample and hence buffer selection may be based more on a trial-and-error approach. An ideal buffer can be selected based on criteria like minimum ion suppression, maximum ionisation and resolution.
Ionisable acidic or basic analytes may show a large shift in retention following a small variation in mobile phase pH [27]. The pH shift may affect the partition mechanism between solvents and analyte. At lower pH, the ionised analyte may not be retained well in RPLC. Thus, the use of buffers to maintain a suitable pH throughout the HPLC experiment is desirable.
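The dependence of charge state on pH discussed in this section can be estimated with the Henderson-Hasselbalch relationship. The sketch below assumes a single ionisable group and uses the creatinine pKa quoted in the text (5.02) as if it were the value of the conjugate acid in purely aqueous solution; in the acetonitrile-rich eluents used for HILIC the apparent pKa and pH both shift, so the numbers are indicative only.

```python
def fraction_protonated_base(ph, pka):
    """Fraction of a weak base present as its protonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def fraction_deprotonated_acid(ph, pka):
    """Fraction of a weak acid present as its deprotonated (charged) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


CREATININE_PKA = 5.02  # value quoted in the text; aqueous value assumed

# Mobile phase pH values tested in this work.
charged = {ph: fraction_protonated_base(ph, CREATININE_PKA)
           for ph in (2.5, 5.8, 8.2)}
```

Under these assumptions the charged fraction falls steeply within about two pH units either side of the pKa, which is why a small drift in mobile phase pH near an analyte's pKa can move its retention time substantially.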

Flow rate
The flow rate experiment was aimed at determining the flow rate that gave maximum resolution with minimum background noise attributable to mobile phase components. The ideal condition for urinary profiling can be considered as getting as many analytes separated as possible in a minimum time and without compromising the sensitivity or operation of the mass spectrometer. A very high flow rate of solvents into the mass spectrometer is most likely to cause higher background noise, subsequently reducing the sensitivity. An increased amount of injected sample or solvent salts could also cause damage to the detector due to the nature of the analyte. As expected, when a flow rate of 0.1 mL min⁻¹ was used the retention time was longer for major analytes within the urine profile. However, the peaks were too broad, with poor resolution. The peak resolution and symmetry increased with increasing flow rate and, as expected, the analysis time decreased (Figure 3b). The Van Deemter plot (Figure 3a) showed the flow rate of 0.21 mL min⁻¹ to be the optimal linear velocity for the chosen stationary phase, which confirms the manufacturer's recommended flow rate [15].
From Figure 3b it is evident that by increasing the flow rate from 0.1 to 0.5 mL min⁻¹ the retention time decreases by almost 50%. No literature supporting such a finding for HILIC has been reported, but several other authors have confirmed that an increase in flow rate could decrease the retention significantly [28][29][30].
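The Van Deemter analysis referred to above can be sketched as follows. Plate height is estimated from a peak's retention time and half-height width for the 150 mm column, and the curve H = A + B/F + C·F is written in terms of flow rate F, which is proportional to linear velocity for a fixed column bore. The coefficients A, B and C below are hypothetical and were chosen only so that the minimum falls at 0.21 mL min⁻¹; they are not fitted values from this study.

```python
import math

COLUMN_LENGTH_MM = 150.0  # column length stated in the Methods


def plate_height_mm(t_r, w_half, length_mm=COLUMN_LENGTH_MM):
    """H = L / N with N = 5.54 (tR / w_half)^2 (half-height method)."""
    n_plates = 5.54 * (t_r / w_half) ** 2
    return length_mm / n_plates


def van_deemter(flow, a, b, c):
    """Plate height as a function of flow rate: H = A + B/F + C*F."""
    return a + b / flow + c * flow


def optimum_flow(b, c):
    """Flow rate at which dH/dF = 0, i.e. F_opt = sqrt(B / C)."""
    return math.sqrt(b / c)


# Hypothetical coefficients (H in mm, F in mL/min).
A, B, C = 0.010, 0.00441, 0.10
f_opt = optimum_flow(B, C)
h_min = van_deemter(f_opt, A, B, C)
tested = [0.1, 0.2, 0.25, 0.3, 0.4, 0.5]  # flow rates used in this work
heights = [van_deemter(f, A, B, C) for f in tested]
```

In practice A, B and C would be obtained by fitting the measured plate heights at the six tested flow rates; the closed-form optimum then follows from the fitted B and C.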

Matrix effect
Spiking adenosine and inosine into urine affected the HPLC-peak shapes of these analytes (Figure 4a). The retention times decreased for both analytes compared to pure standard adenosine and inosine. Interference by other analytes in the sample matrix resulted in splitting of HPLC-peaks. Also, the ionisation of both standards was suppressed (Figure 4b). The presence of various endogenous substances within any biological matrix may substantially interfere with analysis using HPLC. It may change the retention times of the standards used, due to the presence of salts and other ionisable molecules. Co-elution of molecules or elution of isobaric molecules in the matrix is often an issue when analysing urine samples using HPLC-MS [31]. Although simple dilution of urine samples has been suggested to be enough to make them compatible with LC-MS analysis [32], it is not recommended for metabolomic studies because this procedure does not remove unwanted larger molecules. When the urine samples are pre-treated, there may be traces of small molecules and salts within the matrix that can generate parent ions and even tend to form adducts with the analytes of interest. Depending on the nature of the analyte and its m/z value, this can lead either to ion suppression, by competing with the analyte of interest, or to ion enhancement, due to a high signal at the same retention time as the analyte. The matrix effect was important to analyse as it showed how much shift in retention time should be expected post-profiling if identification of any markers was to be made. It was also important to note the effects on signal enhancement as well as suppression for this semi-quantitative methodology. The UV detector would hence not suffice for cross-referencing biomarkers in urine with standards, as RT cannot be used as a reliable parameter.
Thus, the use of IT-ToF to match the metabolite's ion profile with that of a standard is a more reliable approach for biomarker identification and confirmation.
The resolution of peaks of pure standards (peak c and d) was different than when adenosine and inosine were spiked in the urine sample (peak a and b). The resolution between peaks a and b was 2.49 whereas the resolution between peaks c and d was 1.42. The UV response was noticeably lower for both inosine and adenosine when spiked in the urine sample.
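The matrix effect described above is often summarised as a percentage change in response and a retention time shift relative to the neat standard. The sketch below shows that arithmetic only; the peak responses and retention times are hypothetical placeholders, not values measured in this study.

```python
def matrix_effect_percent(response_in_matrix, response_neat):
    """Percentage change in response caused by the matrix.

    Negative values indicate suppression, positive values enhancement.
    """
    return (response_in_matrix / response_neat - 1.0) * 100.0


def retention_shift(rt_in_matrix, rt_neat):
    """Retention time shift (min) of the spiked analyte vs the neat standard."""
    return rt_in_matrix - rt_neat


# Hypothetical responses (arbitrary units) and retention times (min).
effect = matrix_effect_percent(60000.0, 100000.0)  # about -40 (suppression)
shift = retention_shift(5.6, 6.0)                  # about -0.4 (earlier elution)
```

A tolerance based on the observed shift could then be applied when matching urinary peaks against standards by retention time.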

Sample integrity study
The sample integrity at different temperatures was tested in order to determine the optimum storage conditions for urine samples. The urine samples had been frozen at -20°C until the analysis. However, for this particular experiment the sample storage was tested at 4°C and -20°C. The reduction in intensity of ions at 4°C was attributed primarily to degradation of the analytes. The ion matrix was extracted and then exported to SIMCA P+ for multivariate analysis. The PCA (principal component analysis) plot generated (Figure 5a), with R2X(1) and R2X(2) eigenvalues of 0.52 and 0.20 respectively, explained 72% of the total variation. The X-axis represents t[1], the score for PC1 (principal component 1), showing the maximum variation in the dataset. The Y-axis represents t[2], the score for PC2 (principal component 2), showing the maximum variation not seen from PC1 alone. From Figure 5a, it is clear that the 12 data points, each representing a sample, are separated into two distinct groups. It was clearly seen from the PCA-X plot that there was a distinct variation between samples stored at 4°C and -20°C. Thus, PC1 could be representative of temperature or any other factor that is indirectly affected by temperature. It is evident from analysis of PC2 that samples stored at -20°C show a larger spread than those at 4°C. Since the samples were collected from different subjects, variation between urine samples was expected irrespective of their storage conditions. The results within a single group showed more deviation at 4°C than -20°C (Figure 5a). The loading plots for the separation showed that the majority of the compounds that differentiate a sample stored at -20°C from those stored at 4°C were strongly retained on the HILIC column used (Figure 5b). The metabolites present in the samples at 4°C and -20°C were found to be variable. A PCA plot for samples stored at either 4°C or -20°C shows a distinct separation between the two groups.
These loadings explain the dominating correlation structure of the X matrix. The plot shows how the variables vary in relation to each other: the ones that provide similar information (i.e. cluster) and the ones that are not correlated at all and not explained by p1 or p2 (the ones close to 0). Variables near each other or clustered are positively correlated and variables opposite to each other are negatively correlated. The variables or m/z values that are located at 90° from each other are almost uncorrelated in these two components. Temperature control plays an important role in the analysis of unstable analytes [33]. The degradation of analytes in the biological matrix is significantly slower at 4°C than at room temperature [34]. For instance, Venneri and Del Rio (2004) did not observe any significant differences in the concentrations of 3,4-dihydroxyphenylglycol stored at 4°C or -20°C. It is hence advisable to establish the stability profiles for the analytes in their matrix at the temperature considered for their storage [35]. According to Pihl et al. (2010) some analytes may be stable at the lower temperature due to much lower activity of biological matrix proteins [36]. Thus, one of the underlying factors behind the clustering of samples at 4°C could be speculated to be the change in intensity of samples due to temperature.
PLS-DA (partial least squares discriminant analysis) plots were used to understand differences that may be seen within the 24 h injections at ambient temperature. The PLS-DA scatter plot for the study with blank injections showed a major variation between samples over 24 h (Figure 6a). However, the subsequent study, in which samples were injected without any intermediate blanks, showed better reproducibility in the system over the 24 h period (Figure 6b).
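The PCA scores (t), loadings (p) and explained variation (R2X) discussed in the sample integrity study can be illustrated with a small, dependency-free sketch. It autoscales the ion matrix (mean-centring and unit variance) and extracts the first component with a NIPALS-style iteration. The study itself used SIMCA P+; the data matrix below is hypothetical (six samples, three ions, two storage groups) and only shows the mechanics.

```python
def autoscale(rows):
    """Mean-centre each column and scale it to unit variance."""
    n, m = len(rows), len(rows[0])
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Sample standard deviation; constant columns fall back to 1.0.
    sds = [(sum((v - mu) ** 2 for v in c) / (n - 1)) ** 0.5 or 1.0
           for c, mu in zip(cols, means)]
    return [[(row[j] - means[j]) / sds[j] for j in range(m)] for row in rows]


def first_component(x, n_iter=500, tol=1e-12):
    """Return scores t, loadings p and R2X for the first component."""
    n, m = len(x), len(x[0])
    t = [row[0] for row in x]  # start from the first (non-constant) column
    p = [0.0] * m
    for _ in range(n_iter):
        tt = sum(v * v for v in t)
        p = [sum(t[i] * x[i][j] for i in range(n)) / tt for j in range(m)]
        norm = sum(v * v for v in p) ** 0.5
        p = [v / norm for v in p]  # loadings have unit length
        t_new = [sum(x[i][j] * p[j] for j in range(m)) for i in range(n)]
        converged = sum((a - b) ** 2 for a, b in zip(t, t_new)) < tol
        t = t_new
        if converged:
            break
    ss_total = sum(v * v for row in x for v in row)
    r2x = sum(v * v for v in t) / ss_total  # fraction of variation explained
    return t, p, r2x


# Hypothetical ion intensities: rows 0-2 one storage group, rows 3-5 the other.
data = [[10.0, 5.0, 1.0], [11.0, 5.5, 1.2], [9.0, 4.6, 0.9],
        [4.0, 2.0, 1.1], [5.0, 2.4, 0.8], [4.5, 2.1, 1.0]]
scores, loadings, r2x = first_component(autoscale(data))
```

Samples from the two groups receive scores of opposite sign on the first component, and the two ions that differ between groups carry the largest loadings, which is the pattern read from Figures 5a and 5b. The cumulative figure quoted in the text is the sum of the per-component values, 0.52 + 0.20 = 0.72.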

Reproducibility study
The PLS-DA plot for reproducibility was expected to show all the sample injections as a tight cluster indicating minimum variation. However, the initial PLS-DA analysis indicated that although the results within each set were tightly clustered, there was underlying variation along PC1 (Figure 6a). After six sample injections, a blank sample was injected in order to re-equilibrate the HPLC stationary phase before the start of the second analysis cycle. Variation along PC2 was not seen between analysis cycles two and three, but there was variation across PC2 between analysis cycles one and two and between cycles three and four. The variation between the first six injections and the next 12 injections may be due to the lack of column conditioning. When urine samples are injected the stationary phase requires a few injections in order to reach stability [37]. The variation after the third cycle may be due to a lack of reproducibility in the method. In order to investigate the cause of the variation seen between each set of six samples, another run was carried out without injection of a blank after each set. The PLS-DA analysis showed distinct clustering of 23 samples with one outlier. The results demonstrate that the injection of a blank between urine samples created a change in stationary phase chemistry, leading to a change in the analysis or in the ionisation capacity of the analytes. No literature suggests a change in column chemistry or conditioning as a result of blank mobile phase injection. However, since these results suggest its role in the underlying variation between each set of six injections, further in-depth investigations could provide an insight into the poorly understood HILIC mechanism.

Freeze-thaw study
The variations among samples after each freeze and thaw cycle were seen using PLS-DA analysis (Figure 7). The data showed more variation across both principal components t[1] and t[2] with each additional freeze-thaw cycle. An odd outlier was noted among the samples that were analysed after the second freeze-thaw cycle. It was noted that variation within the set of samples increased after each freeze-thaw cycle.
In any clinical analysis a frozen sample may be thawed more than once. The sample aliquot, or the sample itself, is refrozen after being thawed at room temperature. The change in temperature, sometimes over a short span, may affect the metabolites present within the sample. This study was carried out in order to determine how much (if any) degradation or change in composition of urine samples took place after each freeze-thaw cycle. The urine samples were subjected to five freeze-thaw cycles and variations between their ion matrices were studied using multivariate analysis. The PLS-DA analysis of six samples and five freeze-thaw cycles showed that all but one sample was within the 95% confidence interval of Hotelling's T². PC1 is likely to represent temperature and any other underlying variable that is indirectly changed due to temperature change, and PC2 represents the second maximum variation among the variables. The samples from the first three freeze-thaw cycles were well clustered compared to freeze-thaw cycles four and five (Figure 7). This indicates there is an increase in variation within the sample after more than three freeze-thaw cycles. There could be numerous underlying factors to consider in this particular experiment and hence no single variable can be definitely narrowed down in order to explain the variation caused by an increasing number of freeze-thaw cycles.
Figure 6a: PLS-DA analysis for reproducibility of the urine samples injected in four cycles over 24 h. The X-axis represents t[1], the score for PC1, showing maximum variation in the dataset. The Y-axis represents t[2], the score for PC2, showing maximum variation not seen from PC1 alone.

Method and extraction software validation study
A PLS-DA analysis of the samples showed clear differences between normal (control) samples and samples spiked with paracetamol at different concentrations. The methodology and the extraction software allowed successful detection of even the lowest concentration of paracetamol and differentiated it from controls. On day two, larger variation was seen among the paracetamol-spiked samples than among the controls (Figure 8a). On day one, however, the controls varied more among themselves than the paracetamol samples did.
Detection sensitivity is one of the important factors to investigate when developing an analytical method for biomarker screening. The sensitivity of the MS was therefore tested together with the ability of Profiling Solutions to generate ion matrices and of multivariate analysis in SIMCA P+ to distinguish variation. The PLS-DA plotted for all samples showed that all concentrations of paracetamol were successfully detected: SIMCA P+ distinguished samples containing paracetamol at concentrations down to the fM range from controls. Figure 8a shows separation of normal versus paracetamol-spiked samples along the X-axis, i.e. PC1, which thus represents the presence or absence of paracetamol. PC2 represents ionisation or sensitivity, as it shows the spread of paracetamol concentration. Figure 8b shows a similar separation; however, the spread across PC2 is larger for paracetamol-containing samples than in Figure 8a. The spread along PC2 clearly depends on differences in the degree of ionisation of each sample.
The liquid chromatography-mass spectrometry parameters and the data extraction software parameters tested were able to demonstrate the presence or absence of paracetamol in the urine samples. The detection sensitivity achieved on the IT-ToF was in the fM range of the analyte. The spectra were successfully extracted and qualitative differences between the two sets of samples were demonstrated using SIMCA P+ multivariate analysis. The optimum HILIC mobile phase was found to be acetonitrile with ammonium buffer (pH 5.8).

Final method parameters for metabolomics analysis
Instrumentation and conditions: A Shimadzu® LC-20AD liquid chromatography system hyphenated to a Shimadzu IT-ToF mass spectrometer equipped with an electrospray ionisation source was used. The LC system comprised a model DGU-20A3 degasser, a model SIL-20A autosampler, model LC-20AB pumps and a model CTO-20A thermostated column oven (40°C to 60°C). Sample aliquots of 2 µL were injected onto the column. Profile data for both
Stationary phase conditioning: The column was conditioned for 1 h prior to injection of urine samples or standards using a mixture of water and acetonitrile (50:50, v/v) at a flow rate of 0.2 mL min⁻¹. Additional conditioning was provided by 12 injections of a pooled urine QC prior to any sample analysis.
Mass spectrometer tuning: The IT-ToF mass spectrometer was tuned every day before use in order to calibrate the detector. The tune solution was infused into the mass spectrometer using the integrated syringe pump on the IT-ToF instrument. The nebulising gas flow rate was set to 1.5 mL min⁻¹ and the CDL temperature to 200°C. The heat block was set to 200°C and argon was used as drying gas at 105 kPa. The detector voltage was set to 1.6 kV. Ions in the range m/z 150-1250 were scanned in both positive and negative mode using an event time of 137 msec scan⁻¹. The ion accumulation time was set to 20 msec. The output signal was allowed to stabilise and 3-4 min of averaged spectra were recorded. These data were used to calibrate the mass spectrometer in both positive and negative ion mode.

Hydrophilic interaction liquid chromatography (HILIC) conditions: A 2.1 mm × 150 mm, 5 µm ZIC® HILIC (Sequant, Umeå, Sweden) column was used. The flow rate was maintained at 0.2 mL min⁻¹. The gradient program used is detailed in Table 1. The column was equilibrated for 5 min at a flow rate of 0.4 mL min⁻¹ with 5% aqueous solvent. A 50 mM ammonium acetate solution was prepared by dissolving ammonium acetate in deionised water. The aqueous ammonium acetate was mixed with acetonitrile (95:5, v/v) and used as mobile phase 'A'. Eluent 'B' was composed of a mixture of 50 mM aqueous ammonium acetate:
Figure 7: PLS-DA plot showing variation between the samples analysed after each freeze-thaw cycle. The five freeze-thaw cycles showed variation within the 95% confidence limit of Hotelling's T2, with one outlier seen in freeze-thaw cycle 2. The X-axis is t[1], the score for PC1, which captures the maximum variation in the dataset. The Y-axis is t[2], the score for PC2, which captures the maximum variation not explained by PC1. The coloured key identifies the freeze-thaw cycle after which the samples were analysed.
Data pre-treatment: extraction of MS chromatogram: In order to carry out statistical analysis, the mass spectrometry data were exported to third-party software. Total ion chromatograms are graphs containing complex information that can be difficult to interpret, so they were first transformed into numerical form: in a numerical data set, an ion intensity value can be assigned to each m/z value at a particular retention time. This transformation from total ion chromatogram to numerical matrix was performed using Profiling Solutions® software (Shimadzu, UK). The time-aligned, high mass accuracy MS data were exported as an aligned matrix to Umetrics SIMCA P+.
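The conversion of a chromatogram into a numerical matrix can be illustrated with a minimal sketch. This is not the Profiling Solutions algorithm, whose internals are not described here. It simply sums centroid intensities into fixed m/z and retention-time bins and aligns several samples on the shared set of bins. The function names, the (m/z, retention time, intensity) peak layout and the bin widths of 0.05 Da and 0.2 min are assumptions made only for illustration.

```python
from collections import defaultdict


def bin_peaks(peaks, mz_width=0.05, rt_width=0.2):
    """Sum centroid intensities into (m/z bin, retention-time bin) cells.

    peaks: iterable of (mz, rt_min, intensity) tuples for one sample.
    Returns a dict mapping (mz_bin_index, rt_bin_index) -> summed intensity.
    Bin widths are illustrative assumptions, not instrument settings.
    """
    cells = defaultdict(float)
    for mz, rt, intensity in peaks:
        key = (int(mz // mz_width), int(rt // rt_width))
        cells[key] += intensity
    return dict(cells)


def build_matrix(samples, **kwargs):
    """Align several samples on a shared, sorted set of bins.

    samples: dict of sample name -> list of peaks.
    Returns (bin keys, dict of sample name -> row of intensities);
    a bin that is absent from a sample is filled with 0.0.
    """
    binned = {name: bin_peaks(p, **kwargs) for name, p in samples.items()}
    keys = sorted({k for cells in binned.values() for k in cells})
    rows = {
        name: [cells.get(k, 0.0) for k in keys]
        for name, cells in binned.items()
    }
    return keys, rows
```

Fixed bins are the simplest choice but can split a peak that straddles a bin edge; dedicated software typically uses tolerance-based matching instead.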
All data were mean-centred and scaled in order to standardise the coefficients when exported to SIMCA P+; that is, the mean of each variable was subtracted from all of its values after the data had been extracted as described below in this section.
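A minimal sketch of mean-centring and scaling an ion matrix is given here for illustration. The text does not state which scaling was applied in SIMCA P+, so unit-variance scaling (division by the column standard deviation) is an assumption, and the function name is hypothetical.

```python
from statistics import mean, stdev


def autoscale(matrix):
    """Mean-centre each column and divide by its sample standard deviation.

    matrix: list of rows (one per sample), each a list of ion intensities;
    at least two rows are required to compute a standard deviation.
    Unit-variance scaling is an assumption; columns with zero variance
    are centred only, to avoid division by zero.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else x - m for x, m, s in zip(row, means, sds)]
        for row in matrix
    ]
```

After this step every variable has a mean of zero, so intense ions no longer dominate the model purely because of their magnitude.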
All positive and negative ions were included in this transformation. To control the bin width when identifying centroid ions, the ion m/z tolerance was set to ±25 mDa. It was found that the ion mass accuracy is not affected if a wider tolerance is used. The retention time alignment of ions across the data set was limited to 0.2 min, so that isomer peaks (if any) separated by at least 0.2 min in retention time were kept apart. An ion intensity threshold of 20,000 was set to reduce background noise, so that centroids with intensities below 20,000 were not included in the data. To further reduce interference from background noise, a retention time window was set for extraction. During the method validation and software validation experiments, QC samples were injected before and after the sample analysis in order to check reproducibility. The QC samples contained a representative pool of all the samples in that particular experiment. During data pre-treatment, only ions present in at least 70% of the pooled QC injections were included in the extracted matrix. The permitted relative standard deviation in ion response and in retention time shift was set to 15% and 5%, respectively. A two-column file matrix was generated for each sample, containing all the ions detected at the various retention times throughout the whole data file.
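The QC-based filtering described above can be sketched as follows. This is an illustration under stated assumptions, not the Profiling Solutions implementation. The function names and input layout are hypothetical, the 70% figure is read as the minimum fraction of pooled QC injections in which an ion must be detected, and the 5% limit on retention time shift is omitted for brevity.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) of a list of positive numbers."""
    return 100.0 * stdev(values) / mean(values)


def filter_ions(qc_intensities, threshold=20_000, min_presence=0.70,
                max_rsd=15.0):
    """Keep ions that are reproducibly detected across pooled-QC injections.

    qc_intensities: dict of ion id -> list of intensities, one per QC
                    injection (0 where the ion was not detected).
    An ion is kept if it exceeds the intensity threshold in at least
    min_presence of the injections and the RSD of those detected
    intensities does not exceed max_rsd.
    """
    kept = []
    for ion, values in qc_intensities.items():
        detected = [v for v in values if v >= threshold]
        presence = len(detected) / len(values)
        # An RSD needs at least two detected values.
        if presence < min_presence or len(detected) < 2:
            continue
        if rsd_percent(detected) <= max_rsd:
            kept.append(ion)
    return kept
```

For example, an ion detected in only two of four QC injections would be discarded as incidental, however intense it was in those two injections.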

Conclusion
Urinary metabolomics is an intriguing area for biomarker research, as the processes by which products of body systems and pathologies are broken down to small molecules are complicated by the need to make them water soluble (for example by conjugation) so that they can be excreted [38,39].
Resolution and analysis of the complex array of small molecules in urine is ideally conducted by liquid chromatography and mass spectrometry, preferably using normal phase separations such as HILIC. After resolution and detection, analysis of the identified molecules has to take into account the robustness of the molecule(s) in question if they are to be used in routine clinical analysis. However, ZIC-HILIC mass spectrometry separations are so data rich that methodical analysis of each resolved molecule is almost impossible with realistic resources. Shotgun multivariate analysis bypasses much of this and looks for patterns in the identified variables (i.e. retention time, m/z values). However, such multivariate analysis highlights only differences; it does not show whether these differences are robust and due to a reliable metabolic marker, or due to variability in the technique arising from storage and temperature variables in sample handling.
Here, we found that temperature plays a vital role in the storage and analysis of small molecules if meaningful data are to be generated. The results suggest that urine samples should not be analysed after more than three freeze-thaw cycles, as they degrade rapidly with these temperature changes. Samples must be stored at -20°C or lower in order to maintain sample integrity. The factors that affect the ionisation of metabolites stored at different temperatures, or subjected to freeze-thaw cycles, were not studied in this project, as the aim was to understand the extent of these effects, not their causes. More extensive research could help to explain the degradation process and the underlying factors that affect the sample. The injection of mobile phase between sample analyses could affect the retention mechanism and hence induce unknown systematic errors in HILIC analysis. The HILIC separation was observed to be reproducible when the stationary phase environment was conditioned with urine samples instead of being cleaned with strong mobile phase (a common practice in HPLC). Maintenance of the HILIC column by column cleaning is still recommended, but the HILIC mechanism, the role of column conditioning and its effect on reproducibility should be investigated.
Data extraction and data pre-treatment play a key role in separating robust, real markers from variable biological and instrumental background signals that do not consistently distinguish the pathophysiology being examined. Here, this was achieved by using QC and external standards. Requiring ions to be reproducible within the QC (here 70%) can filter out incidental ions. The QC ion incidence threshold can be varied based on the raw data extracted from the mass spectra. Thus, the final method developed was sensitive, robust and appropriate for repetitive analysis of samples in shotgun metabolomics.