The Probability of Inconstancy in Assessment of Cardiac Function Post-Myocardial Infarction in Mice.

In the present study, we explore the inherent variability that leads to overlaps in cardiac functional parameters between control and post-myocardial infarction (MI) mice. Heart failure was induced by Left Coronary Artery (LCA) ligation in mice. Average Ejection Fraction (EF) measured by echocardiography was lower in MI mice than in controls, but exhibited a higher Standard Deviation (SD) and Standard Error (SEM), notably in 2D mode. Fractional Shortening (FS) showed a higher degree of overlap between MI and control mice even though the mean values were significantly different. Hemodynamic measurements of EF resulted in greater SD, SEM, 95% confidence intervals, and effect size. In comparing echocardiography at different time points, mean EF and FS were consistent, but individual animals showed apparent fluctuations, which were more pronounced in MI than in control mice. Hemodynamic measurements showed more complexity in data collection in mice in vivo. MI size showed variability that correlated with the severity of cardiac dysfunction. These studies show that there is inherent variability in functional cardiac parameters after induction of heart failure by MI in mice. Analysis of these parameters by traditional statistical methods is insufficient, and we propose a more robust statistical analysis for proper data interpretation.


This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction
Left Ventricular (LV) Ejection Fraction (EF) and Myocardial Infarct (MI) size are the most important indicators of cardiac function in coronary ischemic heart failure [1,2]. Echocardiography and hemodynamic analyses are the most popular tools to detect heart function, and infarct size can be measured by echocardiography and direct digital photography [3,4]. Traditionally, heart failure is defined as an EF of less than 40% in human patients, but there is no standard definition for rodents [5,6]. Conventionally, if the EF post-MI is significantly lower than in the control group, then the generation of a heart failure model is assumed [7]. Additionally, if heart function in MI mice is significantly improved after treatment, as determined by EF, the treatment may be reported as effective [8].
Statisticians have stressed that the P-value is not an absolute indicator of significance between two means [9,10]. The P-value is sensitive to multiple factors, including cohort size, where a high or low n can be manipulated to yield a low P-value [11,12]. To date, there is no standard definition of heart failure in mice. Changes in cardiac function are influenced not only by the size of the infarct but also by infarct location, anesthesia, echocardiographic probe, ventricular catheter position, the animal's body temperature, amount of bleeding, and operator experience [13,14]. Small animal surgery, echocardiography, and hemodynamic experiments are complex and prone to variability, as the small size of the mouse heart makes reproducibility difficult [14,15]. Most studies report the average EF or Fractional Shortening (FS) with Standard Deviation (SD) or Standard Error (SEM); however, the inherent variability of MI in mice is not captured by this practice. This, in addition to the lack of a standard definition of heart failure in mice, makes it difficult to draw conclusions about cardiac dysfunction and/or therapeutic improvement.
In the present study, we examine large cohorts of mice with and without MI to determine the level of variability in EF, FS, and MI size using 2D and M mode echocardiography and hemodynamic assessment. By reporting all animals examined, we show the range of EF and FS values that can be obtained and demonstrate the overlap between MI and control groups of mice. We also explore the correlation and margin of deviation between MI size and LV function. By reporting a more exhaustive set of statistical parameters (beyond mean +/− SEM and P-value), we demonstrate that a more powerful data set and comparisons can be generated. Overall, this study aims to provide a frame of reference for researchers to improve data collection and analysis of cardiac function in mice post-MI.

Animal protocol
All procedures were performed according to the recommendations in the Guide for the Care and Use of Laboratory Animals (Department of Health and Human Services publication number NIH 78-23, 1996) and were approved by the Icahn School of Medicine at Mount Sinai Animal Care and Use Committee. Male C57BL6J mice, 10 to 12 weeks old and weighing 25-30 g, were used. Seventy mice underwent Left Coronary Artery (LCA) ligation to induce MI and 45 survived. Twenty-eight mice underwent a sham operation as controls and all survived. No animals were omitted from data analysis. Surgical procedures have been previously described [16]. Briefly, animals were anesthetized intraperitoneally with 0.06 mL per mouse of a KAX mixture (ketamine 1 mL × 100 mg/mL, acepromazine 0.1 mL × 10 mg/mL, xylazine 0.1 mL × 20 mg/mL and 1 mL of 0.9% normal saline). After thoracotomy, ligation of the LCA was performed with a 7-0 silk suture. Successful LCA ligation was verified by visual inspection of the apex color. The chest was closed with a 6-0 silk suture and the skin was closed with 4-0 silk sutures. Post-surgical monitoring was continued until the animal regained consciousness. Eight mice died after LCA ligation (5 due to large MI area and bleeding, 3 because of additional anesthesia). Thirteen mice died 1-3 days after surgery due to heart failure. Four mice were removed from the study because they showed signs of humane endpoints, such as continual coma, cyanosis, and recumbency, after 3 days. Antibiotics were not given, but no apparent infection developed in surviving animals during the course of the study or at the time of autopsy. All mice were housed under identical conditions and were given water and food ad libitum.

In vivo hemodynamics
Pressure-Volume (PV) loops of the LV were acquired and analyzed as previously described [14,21]. Briefly, hemodynamic measurements were performed using a 1.2 Fr PV conductance catheter (Scisense). Mice were injected intraperitoneally with urethane (1 g/kg), etomidate (10 mg/kg) and morphine (1 mg/kg) before inhalation of 5% (vol) isoflurane, then mechanically ventilated and maintained at 0.5-1% (vol) isoflurane during the surgical procedure. The right carotid artery was exposed and isolated via a midline incision. Three sutures (7-0 silk) were placed under the left common carotid artery. The top one was tied and the bottom one was pulled toward the heart to prevent bleeding. A small incision was made in the middle of the carotid artery, and the catheter was then introduced and advanced down the ascending aorta, through the aortic valve, and into the LV. A small incision up the diaphragm was made, followed by transient (1-2 s) occlusion of the thoracic vena cava to decrease venous return during the recording of hemodynamics. The left jugular vein was cannulated with a 24 G catheter connected to a 1 cc syringe with ~0.8 mL saline and 0.1% heparin. Subsequently, parallel conductance was determined by a 50 μL injection of 0.9% saline into the jugular vein. PV loops were recorded in 35 MI mice, of which 8 were excluded due to technical issues such as bleeding or poor catheter positioning. PV loops were recorded in 15 control mice and all were included.

MI by direct imaging
Hearts were harvested after the final hemodynamic measurement. The hearts were perfused with 10 mL cold PBS and 1% heparin. The LV was cut from the root of the pulmonary artery to the ventricular apex and photographed. MI size (% area) was calculated as the average of the infarct areas of the whole heart, the left half of the LV, and the right half of the LV, quantified using Image-Pro software (Media Cybernetics, Bethesda, MD, USA) [21].

Statistics
Variables are expressed as mean ± Standard Error of the Mean (SEM). One-way ANOVA was used for the time course of heart function, and Student's t-test was used for all other statistical comparisons between experimental groups, using GraphPad Prism software. P-values <0.05 were considered statistically significant. Cohen's effect size was calculated by dividing the difference between two means by the standard deviation (0.2-0.3 is considered a small effect and 0.8 or higher is considered a large effect).
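As a minimal sketch of the two quantities defined above, the following Python snippet computes the SEM and Cohen's effect size for two groups. The text does not specify which standard deviation is used in the denominator; the pooled sample standard deviation is assumed here. The function names and the EF values are hypothetical and are not data from this study.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def cohens_d(group_a, group_b):
    """Difference between means divided by the pooled sample SD (assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical EF values (%), for illustration only; not data from this study.
control_ef = [60.0, 62.0, 64.0, 66.0, 68.0]
mi_ef = [30.0, 35.0, 40.0, 45.0, 50.0]

print(round(sem(control_ef), 3))
print(round(cohens_d(control_ef, mi_ef), 3))
```

With the thresholds quoted above, a value of 0.8 or higher returned by `cohens_d` would be read as a large effect.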

Mean values and overlapping of LV functions in M mode and 2D mode echocardiography post-MI
In general, cardiac function was significantly worse in MI mice than in controls. In the MI group, the LV anterior wall thickness was decreased and the LV internal diastolic and systolic diameters were increased (Table 1). The mean values of EF and FS were significantly lower in the MI group compared to controls (P<0.01) (Figures 1A and 1B). However, scatter plot analysis of these data revealed that 23% and 26% of MI mice had overlapping EF and FS with control mice, respectively (Figure 1C). The EF determined by M mode was much higher than that determined by 2D mode for both MI and control mice (Figure 1D). As evidenced by the scatter plots, 2D mode resulted in more variation in EF and FS than M mode in both MI and control groups, even though the standard error was very low owing to the high animal number.
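The overlap percentages reported here can be illustrated with a short sketch. The exact overlap criterion is not spelled out in the text, so the snippet assumes that an MI mouse overlaps with the control group when its value is at or above the lowest control value. The function name and the EF values are hypothetical.

```python
def overlap_percentage(mi_values, control_values):
    """Percent of MI values at or above the lowest control value (assumed criterion)."""
    lowest_control = min(control_values)
    overlapping = [v for v in mi_values if v >= lowest_control]
    return 100.0 * len(overlapping) / len(mi_values)


# Hypothetical EF values (%), for illustration only; not data from this study.
control_ef = [55.0, 60.0, 63.0, 67.0]
mi_ef = [25.0, 32.0, 41.0, 48.0, 56.0, 58.0, 30.0, 44.0]

# Two of the eight hypothetical MI values (56 and 58) reach the control range.
print(overlap_percentage(mi_ef, control_ef))
```

Other criteria, for example counting control animals that fall below the highest MI value, would give different percentages, so the definition used should be stated alongside the number.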

Time course variation of heart function with 2D mode
2D mode echocardiography was performed on control and MI mice at different time points. 39% (7/18) of control and 29% (13/45) of MI mice had fluctuations of 5-10% in EF, and 17% (3/18) of control and 22% (10/45) of MI mice had fluctuations >10% in EF over a one-month interval in time course measurements. There were similar discrepancies in FS values for both control and MI groups (Figure 2). Analysis of EF and FS in individual mice over time demonstrates the range of values that can be obtained, notwithstanding a virtually identical average between time points with a P-value >0.05. For EF, the difference between the highest and lowest values was 16% in control and 45% in MI mice (Figures 2A and 2C). For FS, the difference was 20% and 35% in control and MI mice, respectively (Figures 2B and 2D). This occurred despite the fact that the standard error of the mean was very small (<2%) (Figure 1B).
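The per-animal fluctuation categories used in this section can be sketched as follows. The snippet assumes that a fluctuation is the absolute difference in EF, in percentage points, between two measurements of the same mouse one month apart. The function names, bin labels, and EF pairs are hypothetical.

```python
from collections import Counter


def classify_fluctuation(ef_first, ef_second):
    """Bin the absolute change in EF (percentage points) between two time points."""
    change = abs(ef_second - ef_first)
    if change > 10:
        return ">10"
    if change >= 5:
        return "5-10"
    return "<5"


def fluctuation_summary(pairs):
    """Return the percentage of animals falling in each fluctuation bin."""
    counts = Counter(classify_fluctuation(a, b) for a, b in pairs)
    return {
        label: 100.0 * counts[label] / len(pairs)
        for label in ("<5", "5-10", ">10")
    }


# Hypothetical (EF at first visit, EF one month later) pairs, illustration only.
ef_pairs = [(40.0, 42.0), (35.0, 42.0), (50.0, 38.0), (45.0, 44.0)]
print(fluctuation_summary(ef_pairs))
```

Reporting these bins per group shows individual instability that a stable group mean and a small SEM would otherwise hide.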

Variation of cardiac function associated with MI size
MI size was determined by echocardiography. Figure 3 shows that EF and FS negatively correlated with MI size (P<0.01). However, there were variations greater than 20% in EF and FS using either M or 2D mode at most MI size points. Figure 4 shows some factors that typically cause this variation. The echo probe level has a significant impact on M mode data; a 1 mm difference in placement can result in 15-20% variations in EF (Figure 4A). Another cause of variation in 2D mode echocardiographic measurement of EF is an unclear internal edge of the LV and end of the outflow tract (aortic valve) (Figure 4B). In M mode echocardiography, the choice of posterior border can be difficult in some cases due to hypertrophic papillary muscle (Figure 4C). Regarding MI size, the determination of the left and right points of U-type LV length and the border point of infarction is complicated, as the infarct wall shows gradients in thickness (Figure 4D). Figure 5A shows that the average MI size 2-3 months after surgery is consistent by echocardiography, but the scatter plot reveals that MI size fluctuated by 5-10% in 36% of mice between the 2- and 3-month time points, while 10% of mice showed >10% fluctuation in MI size in this same period (Figure 5B). One of the common causes of inconsistency in MI size is the angle of the echocardiographic probe in relation to the LV long axis. Small or moderate MIs induced by occlusion of the small branch of the LCA could potentially be overlooked if the transducer probe was placed in the standard view position on the long axis because the infarct occurred on the left side of the LV (Figures 5C-5E).

Variation in hemodynamics and digital photo of MI
PV loop data showed significant differences in ± dP/dt, Tau, ventricle volume, stroke volume, and LV weight in the MI group compared to controls (Table 2). Figure 6 shows PV loop examples of steady-state, saline-treated, transient occlusion (OCC) of the thoracic vena cava, and end-systolic PV relationships (ESPVR) in control and MI mice. The average EF was significantly decreased in MI compared to control mice (P<0.01), but scatter plot analysis of these data revealed that 41% of EF values overlapped between the MI and control groups (Figures 7A and 7B). EF negatively correlated with MI size measured by direct imaging, even though the EF varied by 30-40% for most MI size points. A positive correlation was found between MI size measured by echocardiography and by direct imaging. However, there was an apparent discrepancy in MI size between these two methods, as the R² value was only 0.475 (Figure 7C). As shown in Figure 7D, the heart is oval in shape; therefore, it is difficult to measure MI size as a percentage of heart surface area from a regular digital photo. When the heart is opened to assess MI size, curling of the infarcted wall, due to wall thinning as a direct consequence of MI, could result in a smaller MI size (Figure 7D; a and b). ESPVR values had similar variability to EF in both the control and MI groups (Figures 8A and 8B) even though the ESPVR positively correlated with EF and max dP/dt (P<0.01) (Figures 8C and 8D).
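The agreement between two methods of measuring MI size can be summarized with an R² value, as in Figure 7C. The sketch below assumes a simple linear relationship, for which R² equals the squared Pearson correlation coefficient. The function name and the paired MI sizes are hypothetical.

```python
import statistics


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two paired series."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical paired MI sizes (% area), for illustration only.
mi_echo = [20.0, 30.0, 40.0, 50.0]
mi_photo = [25.0, 20.0, 45.0, 40.0]

print(round(r_squared(mi_echo, mi_photo), 3))
```

An R² near 0.5 means that roughly half of the variance in one measurement is accounted for by the other, which is why two methods can correlate positively and still disagree substantially for individual hearts.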

Discussion
In this study, we examined EF, FS, and MI size in a large cohort of mice post-MI by echocardiography and hemodynamics in order to determine the range of values typically observed with these methods. In addition to reporting the mean, standard error, and P-value, we analyzed the data by multiple statistical methods in order to gauge a truthful comparison between functional measurements. The inherent variability in induction of MI in small rodents results in high variability in functional parameters. The data from the 70 mice presented in this study can be used as a reference index for subsequent heart failure studies in small rodents, as our large sample size and robust statistical analysis could provide a frame of reference for a typical data set.
In the examination and assessment of heart failure models in mice, several parameters are obtained, such as EF, FS, end-diastolic volume, and end-systolic volume [22,23]. A significant difference in these parameters, as evaluated by traditional hypothesis testing, provides evidence of defective cardiac function. In our study, the averages of all key functional parameters were significantly lower in MI than in control mice; a P-value <0.01 signifies that, if there were no true difference between the groups, a difference in means at least as large as the one observed would be expected less than 1% of the time. It does not indicate whether the value from an individual MI mouse can be distinguished from that of a control mouse. Therefore, if only the average is considered, one could conclude that, based on the data presented in this paper, a successful MI model of heart failure was established using LCA ligation. However, there is inherent variability between mice, as is evidenced by the presentation of our data as scatter plots. For example, if a mouse has an FS of 52% as obtained by M mode echocardiography, it would be impossible to determine if this mouse has heart failure, as 15% of MI mice had FS ≥ 52% while 17% of control mice had FS ≤ 52%. We therefore could not be sure that a change in EF in an individual mouse is due to experimental injury or medical treatment (Figure 2).
In the present study, we examine the variability that occurs in assessment of MI by echocardiography and PV loop. Cardiac function data in mice post-MI are variable between laboratories [24,25]; in some instances, the EF can be as low as 21% with a standard error of less than 1% even in a group of 5-7 mice [8,26]. Our data show that EF values measured by M mode are higher than those measured by 2D mode echocardiography, but there is a higher chance of obtaining P-values <0.01 due to low deviation and a small scatter range. FS values measured in M mode are lower overall but have a higher overlap rate and SD/mean ratio. Manual analysis of LV volume can be adapted to different pathophysiology, but is usually accompanied by a higher deviation (Figures 2C and 4B). 2D mode echocardiography had the lowest percentage overlap rate and a higher effect size between the MI and control groups (Table 3). However, it also produced a larger range of values (42%) and a higher SD/mean ratio (32%) in EF in MI mice; thus, the probability of data inconstancy is higher in 2D mode than in M mode, particularly with small sample sizes, even if the P-value is <0.01. PV loops are an invasive process and it is not possible to repeat them in the same animal. The geometry of the heart changes post-MI and the aneurysm makes the loops irregular, resulting in variation of EF. Compared to echocardiography, EF determined by PV loops had a higher effect size, SD/mean ratio, 95% confidence intervals, range of scatter, and percentage overlap rate (Table 3). This means that if the animal number is small, it will be difficult to draw conclusions from this type of data [11].
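The variability measures compared in this paragraph (SD/mean ratio, range of scatter, and 95% confidence interval) can be computed for any group as sketched below. The confidence interval uses a normal approximation, which is an assumption made here for simplicity; for small cohorts a t-distribution critical value would give a wider and more appropriate interval. The function name, dictionary keys, and EF values are hypothetical.

```python
import math
import statistics


def variability_summary(values):
    """Return mean, SD/mean ratio (%), range, and an approximate 95% CI of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(len(values))
    # Normal approximation (z is about 1.96); an assumption, see the note above.
    z = statistics.NormalDist().inv_cdf(0.975)
    return {
        "mean": mean,
        "sd_over_mean_pct": 100.0 * sd / mean,
        "range": max(values) - min(values),
        "ci95": (mean - z * sem, mean + z * sem),
    }


# Hypothetical EF values (%) for an MI group, for illustration only.
mi_ef = [30.0, 35.0, 40.0, 45.0, 50.0]
summary = variability_summary(mi_ef)
for key, value in summary.items():
    print(key, value)
```

Placing these values next to the mean and P-value for each method makes it easier to see when a significant difference between means coexists with wide individual scatter.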
In summary, we have characterized a large cohort of control and MI mice using multiple methods to evaluate functional parameters. We have provided a robust statistical analysis of EF and FS in 2D and M mode echocardiography and hemodynamic analysis by PV loops. We suggest that researchers use more than one method to determine EF and FS, and if echocardiography is used, the mode should be clearly stated (as evidenced by the different values obtained with 2D and M mode). The data should also be analyzed beyond statistical significance, and authors should report 95% confidence intervals and the percentage overlap rate between experimental groups, as these parameters demonstrate the variability of the data better than the P-value alone. We also recommend that authors report how many animals were excluded from the study (and the reasons why) and how many animals died during the procedures, as these are not always provided. Our study has two limitations: 1) The variability in functional parameters following a treatment post-MI was not examined; and 2) We did not pursue any technical reasons for variability in MI size in mice. We are currently addressing this latter limitation by exploring the diversity of LCA in relation to MI size in mice.

Acknowledgments
This work was supported in part by grants from the National Institutes of Health: R01HL078731, R01HL083156, R01HL093183, R01HL088434 and P20HL100396 (RJH) and the Transatlantic Leducq Foundation (RJH). DKC is supported by a fellowship from the American Heart Association (15POST25090116).

Table 2 Hemodynamic measurement of left ventricle function in mice post-MI.
Table 3 Variance in EF and FS as measured by echocardiography and hemodynamics.