Improving Detection in Managing Health and Medical Care with Data Analytics

The purpose of this paper is to show the need for quality movement practices in addressing problems associated with public health. We show both the need for and the application of quality monitoring, and in particular of multivariate quality concepts, to reduce the costs of operating public health programs and to control the dynamic behavior of public health systems such as water treatment. Such systems call for concepts associated with multivariate methods and autocorrelated time series. One of the most difficult problems involves the use of multivariate quality control in the scientific and health environment. Multiple diagnoses require the use of advanced techniques to arrive at solutions to complex medical and health problems. We explore these problems.

Business and Economics Journal


Introduction
Public health and water quality management involve the leveraging of channel-wide integration to better serve public needs. Increases in productivity and improvements in quality control will follow when public health managers implement and coordinate quality management activities upstream. Public health management should recognize anew that quality control and quality assurance require two duties to be undertaken. First, quality control refers to the process whereby measures are taken to make sure that defects in services are not part of the final output and that the output meets quality and acceptable health standards. Second, quality assurance entails overseeing all aspects of the service, including design, development, service, installation, and documentation. The quality movement is the field that ensures that management maintains the standards set and continually improves the quality of the output. The quality movement (Lee, et al. [1]) offers users sound lessons that can be very powerful in addressing public health problems. Instead of final, end-of-service inspection, the quality movement emphasizes prevention, total quality management, source inspection, process control and continuous improvement. These are all ingredients of successful and effective ways to manage and mitigate the risks in public health applications such as water quality control [2][3][4].
We introduce the philosophy and methods of quality improvement to achieve the best results in health service operations. This paper focuses on supply chain planning with quality control in an environment with multiple service centers and multiple customers. We first discuss the need for quality planning in the supply chain environment, focusing on where the notion of statistical process (or quality) control (SPC or SQC) fits and why it is so vital to the performance of public health programs in the global environment. In turn, we introduce and discuss the need for more sophisticated methods to ensure that quality and improvement are maintained in public health processes, including water treatment systems.
While public health programs are crucial to the general health of society, these health systems must be sustained by both preventative and emergency measures. Zhang, et al. [5] propose several sophisticated SPC strategies for an environment where service flows continue over time. Their study presents principal-agent models regarding the consumer's quality evaluation and the supplier's quality prevention level decisions. Studies such as this may produce results not heretofore examined by the practitioners of SPC in public health and water quality. In addition, threats to water quality are real and many, and measures must be developed to indicate when water quality and similar processes are not operating in an efficient and productive manner. These measures include those of SPC, which indicate when risks are present in the inspection processes of water treatment and public health programs. Since public health programs are increasingly globalized, these SPC measures must be strategically incorporated into inspection and monitoring programs, and the choice of the particular SPC procedures is critical in developing an optimal plan.
Most SPC methodologies assume steady-state process behavior, where the influence of dynamic behavior either does not exist or is ignored. The focus is on the control of only one variable at a time, and practice distinguishes between Phase I (analysis of historical data) and Phase II (monitoring of quality levels). Specifically, SPC controls for changes in the measure of location, the measure of dispersion, or both. SPC procedures as practiced in each phase may disturb the flow of the service production process and operations. In recent years, SPC methodologies that address processes characterized by more than one variable have been emerging. The purpose of the next section is to review the basic univariate procedures and to observe how one improves the performance of SPC in Phase II by considering run length performance.

Univariate (Shewhart) Control Charts
A Shewhart control chart, which is the central foundation of univariate SPC, has one major shortcoming. This control chart considers only the last data point and does not carry a memory of the previous data. As a result, small changes in the mean of a random variable are not likely to be detected rapidly. Exponentially weighted moving average (EWMA) charts improve upon the detection of small process shifts. Rapid detection of relatively small changes in the quality characteristic of interest and ease of computation through recursive equations are some of the many good properties that make the EWMA chart attractive.
The EWMA chart achieves faster detection of small changes in the mean. It is used extensively in time series modeling and forecasting for processes with gradual drift. The EWMA provides a forecast of where the process will be at the next instant of time. It thus provides a mechanism for dynamic process control [6]. Later, examples of these methods will be analyzed.
The EWMA is a statistic for monitoring the process that averages the data in a way that gives exponentially less weight to data as they are further removed in time. The procedures for developing EWMA control charts give details on implementing this type of Phase I system. Montgomery [7] contains the development of the models for finding the control limits for the univariate charts, which need not be discussed further at this point.
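As a minimal sketch of the EWMA statistic described above, the following Python fragment computes the recursion z_i = λx_i + (1 − λ)z_{i−1} together with time-varying control limits for independent observations. The function names, the smoothing constant λ = 0.2 and the limit width of three standard deviations are illustrative assumptions, not values taken from the references.

```python
import math

def ewma_chart(data, target, sigma, lam=0.2, width=3.0):
    """Return a list of (z_i, lcl_i, ucl_i) tuples for an EWMA chart."""
    points = []
    z = target  # the EWMA is started at the in-control target
    for i, x in enumerate(data, start=1):
        z = lam * x + (1.0 - lam) * z
        # Exact (time-varying) standard deviation of z_i for independent data.
        sd_z = sigma * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i)))
        points.append((z, target - width * sd_z, target + width * sd_z))
    return points

def first_signal(points):
    """Return the 1-based index of the first point outside its limits, or None."""
    for i, (z, lcl, ucl) in enumerate(points, start=1):
        if z < lcl or z > ucl:
            return i
    return None

# Hypothetical data: a sustained shift of two standard deviations.
shifted = [2.0] * 10
points = ewma_chart(shifted, target=0.0, sigma=1.0)
signal_at = first_signal(points)
```

In this hypothetical series no single observation exceeds a three-sigma Shewhart limit, yet the EWMA signals at the third observation because it accumulates the evidence carried by successive points.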
In many situations, the sample size used for process control is n = 1; that is, the sample consists of an individual unit [7]. In such a situation, the individuals control chart is used. The control chart for individuals uses the moving range of two successive observations to estimate the process variability. Such small samples may lead to false signals and may also increase the likelihood of Type II errors, i.e., the error of leaving a process alone when it should be stopped and a search for the malfunction should be implemented. Public health models were further explored in detail.
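The individuals chart described above can be sketched as follows. The process standard deviation is estimated from the average moving range of successive observations divided by the constant d2 = 1.128 for ranges of two observations. The function name and the data are hypothetical and serve only as an illustration.

```python
def individuals_limits(x, width=3.0):
    """Return (lcl, center, ucl) for an individuals chart with n = 1."""
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(x) / len(x)
    sigma_hat = mr_bar / 1.128  # d2 constant for a moving range of span 2
    return center - width * sigma_hat, center, center + width * sigma_hat

# Hypothetical individual measurements.
readings = [10.0, 12.0, 11.0, 13.0, 12.0]
lcl, center, ucl = individuals_limits(readings)
```

With only a handful of observations the estimate of variability is itself very uncertain, which is one reason why limits built from such small samples deserve caution.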
Often, in public health and water treatment programs, the distinction between Phases I and II is not clear. Sonesson, et al. [8] pointed out problems and issues related to statistically based evaluations. Researchers often did not examine the average run length (ARL) of a proposed method over a variety of alternative process shifts. The ARL performance of a proposed method or program must be evaluated both for the in-control state and for the shift in the service process that the proposed detection program is designed to detect. If the system is not optimized, misplaced control limits may result. The system for detection of quality shifts is then sub-optimized, and better techniques should be sought. In the next section, we introduce methods and their possible use in processes having dynamic inputs [9].
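To make the ARL criterion concrete, the sketch below evaluates the ARL of a Shewhart chart for individual observations under the assumption of independent, normally distributed data, in which case the ARL is the reciprocal of the probability that a single point signals. The function names are illustrative, and the shift is expressed in units of the process standard deviation.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def shewhart_arl(shift, width=3.0):
    """ARL of a Shewhart individuals chart for a mean shift given in sigma units."""
    p_signal = 1.0 - normal_cdf(width - shift) + normal_cdf(-width - shift)
    return 1.0 / p_signal

arl_in_control = shewhart_arl(0.0)   # roughly 370 points between false alarms
arl_one_sigma = shewhart_arl(1.0)    # roughly 44 points to detect a 1-sigma shift
```

Evaluating the function over a range of shifts, rather than at a single shift, is the kind of comparison across alternative process shifts that the paragraph above calls for.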
Alwan [10] found that more than 85% of the process control applications studied resulted in charts with possibly misplaced control limits. In many instances, the misplaced control limits result from the autocorrelation of the process observations, which violates a basic assumption often associated with the Shewhart chart. Autocorrelation of process observations has been reported in many industries, including cast steel [10], wastewater treatment plants [11], chemical process industries [12] and many other service industries and programs. Several models have been proposed to monitor processes with autocorrelated observations. Alwan, et al. [13] suggest using an autoregressive integrated moving average (ARIMA) residuals chart, which they referred to as a special cause chart. For subsample control applications, Alwan, et al. [14] describe a fixed limit control chart, where the original observations are plotted with control limit distances determined by the variance of the subsample mean series. Montgomery and Mastrangelo [12] use an adaptive exponentially weighted moving average (EWMA) centerline approach, where the control limits are adaptive in nature and determined by a smoothed estimate of process variability. Lu, et al. [15] investigate the steady-state ARL of cumulative sum (CUSUM), EWMA, and Shewhart control charts for autocorrelated data modeled as a first-order autoregressive process plus an additional random error term. Last, et al. [16] consider quality monitoring by feedback adjustment.
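The residuals (special cause) chart idea can be sketched in a simplified setting. The fragment below assumes a first-order autoregressive process, fits the autoregressive coefficient by least squares, and applies three-sigma limits to the one-step-ahead residuals. This is a minimal illustration rather than the full ARIMA identification procedure of the cited authors; the function names, the simulated series and the injected disturbance are hypothetical.

```python
import random

def simulate_ar1(n, phi, seed=0):
    """Simulate a zero-mean AR(1) series with unit-variance normal shocks."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def ar1_residual_chart(x, width=3.0):
    """Fit AR(1) by least squares and flag residuals beyond width * s."""
    m = sum(x) / len(x)
    d = [v - m for v in x]
    num = sum(a * b for a, b in zip(d[1:], d[:-1]))
    den = sum(a * a for a in d[:-1])
    phi_hat = num / den
    # resid[j] is the one-step-ahead residual at time index j + 1.
    resid = [d[t] - phi_hat * d[t - 1] for t in range(1, len(d))]
    s = (sum(e * e for e in resid) / (len(resid) - 1)) ** 0.5
    flagged = [t for t, e in zip(range(1, len(d)), resid) if abs(e) > width * s]
    return phi_hat, s, flagged

series = simulate_ar1(500, phi=0.7)
series[300] += 10.0  # hypothetical one-period disturbance
phi_hat, s, flagged = ar1_residual_chart(series)
```

Because the residuals of a correctly specified model are approximately uncorrelated, the usual Shewhart limits become meaningful again once they are applied to the residuals rather than to the raw observations.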
A problem with all these control models is that the estimate of the process variance is sensitive to outliers. If assignable causes are present in the data used to fit the model, the model may be incorrectly identified and the estimators of the model parameters may be biased, resulting in loose or invalid control limits [17]. To justify the use of these methods, researchers have made the assumption that a period of "clean data" exists from which to estimate control limits. Therefore, methods are needed to assure that parameter estimates are free of contamination from assignable causes of variation. Intervention analysis, with an iterative identification of outliers, has been proposed for this purpose. The reader interested in more detail should see Alwan [18], Atienza, et al. [19] and Box, et al. [20]. Atienza, et al. [19] recommend the use of a control procedure based on an intervention test statistic, λ, and show that their procedure is more sensitive than ARIMA residual charts for process applications with high levels of positive autocorrelation. They limit their investigation of intervention analysis, however, to the detection of a single level disturbance in a process with high levels of first-order autocorrelation. Wright, et al. [21] propose a joint estimation method capable of detecting outliers in an autocorrelated process where the available data are limited to as few as 9 to 25 process observations. Since intervention analysis is crucial to model identification and estimation, we investigate varying levels of autocorrelation, autoregressive and moving average processes, different types of disturbances, and multiple process disturbances.
The ARIMA and intervention models are appropriate for autocorrelated processes whose input streams are closely controlled. However, there are quality applications, which we refer to as "dynamic input processes," where this is not a valid assumption. The treatment of wastewater is one example of a dynamic process that must accommodate highly fluctuating input conditions. In the health care sector, the modeling of emergency room service must also deal with highly variable inputs. The dynamic nature of the input creates an additional source of variability in the system, namely the time series structure of the process input. For these applications, modeling the dynamic relationship between process inputs and outputs can be used to obtain improved process monitoring and control, as discussed by Alwan [18]. West, et al. [22] proposed the following transfer function model to solve problems having dynamic behavior. If a process quality characteristic, $z_t$, has a time series structure, an ARIMA model of the following general form can represent the undisturbed or natural process variation:

$$\Phi(B)\,\alpha(B)\,z_t = \theta(B)\,a_t \qquad (1)$$

In Equation (1), $B$ is the backshift operator, $\Phi(B)$ and $\theta(B)$ are the autoregressive and moving average polynomials in $B$, and $\alpha(B) = (1 - B)^{d_1}(1 - B^s)^{d_2}$, where $d = d_1 + s d_2$. This quantity is a polynomial in $B$ that expresses the degree of differencing required to achieve a stationary series and accounts for any seasonal pattern in the time series. Finally, $a_t$ is a white noise series with distribution $N(0, \sigma_a^2)$. This model is described by Chen and Liu [23,24]. If the series $z_t$ is contaminated by periods of external disturbances to the process, the ARIMA model may be incorrectly specified, the variability of the residuals overestimated, and the resulting control limits incorrectly placed.
The following transfer function model of Box, et al. [25] describes the observed quality characteristic, $y_t$, as a function of three sources of variability:

$$y_t = v(B)\,x_{t-b} + \frac{\omega(B)}{\delta(B)}\,I_t + \frac{\theta(B)}{\Phi(B)}\,a_t \qquad (10)$$

The first term, $v(B)x_{t-b}$, is the dynamic input term and represents an impulse response function, $v(B)$, applied to the input $x_{t-b}$ with a lag of $b$ time periods. If a dynamic relationship between the input and output time series exists, lagged values of process inputs can be modeled, resulting in a considerable reduction of unexplained variance. The second term, $(\omega(B)/\delta(B))I_t$, is the intervention term and identifies periods of time when assignable causes are present in the process. Here, $I_t$ is an indicator variable with a value of zero when the process is undisturbed and a value of one when a disturbance is present in the process. See, for example, Box, et al. [26] for the development of the transfer function term and for details of the intervention term. The rational coefficient of $I_t$ is a ratio of polynomials that defines the nature of the disturbance, as detailed in Box, et al. [26]. The third term, $(\theta(B)/\Phi(B))a_t$, is the basic ARIMA model of the undisturbed process from Equation (1). We refer to Equation (10) as the "transfer function" model throughout this paper.
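The benefit of modeling the dynamic input term can be illustrated with a deliberately simplified simulation. The sketch assumes that the impulse response reduces to a single gain applied to the input two periods earlier, that no intervention is present, and that the noise is white. All names, the gain of 2.0 and the lag of 2 are hypothetical choices made only for illustration.

```python
import random

def simulate_dynamic_process(n, gain=2.0, lag=2, seed=1):
    """Simulate an autocorrelated input x and an output y driven by lagged x."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(0.8 * x[-1] + rng.gauss(0.0, 1.0))  # fluctuating input stream
    y = []
    for t in range(n):
        driven = gain * x[t - lag] if t >= lag else 0.0
        y.append(driven + rng.gauss(0.0, 1.0))
    return x, y

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def fit_lagged_gain(x, y, lag):
    """Least-squares estimate of the gain on x[t - lag]; lag must be at least 1."""
    xs, ys = x[:-lag], y[lag:]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    gain_hat = sxy / sxx
    residuals = [(b - my) - gain_hat * (a - mx) for a, b in zip(xs, ys)]
    return gain_hat, residuals

x, y = simulate_dynamic_process(400)
gain_hat, residuals = fit_lagged_gain(x, y, lag=2)
explained = 1.0 - sample_variance(residuals) / sample_variance(y[2:])
```

In this simulated setting most of the variance of the output is explained by the lagged input, so control limits placed on the residuals are far tighter than limits placed on the raw output.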
Different types of disturbances can be modeled by the proper design of the intervention term. The two most common disturbances for quality applications are a point disturbance, with an impact observed for only a single time period, and a step disturbance, with an impact persisting undiminished through several subsequent observations. The point disturbance is modeled as an additive outlier (AO). An AO impacts the observed process at one observation. The AO is modeled in the form

$$\frac{\omega(B)}{\delta(B)} = \omega_0$$

where $\omega_0$ is a constant. A step disturbance to the process is modeled as a level-shift outlier (a form of innovational outlier or IO) in the form

$$\frac{\omega(B)}{\delta(B)} = \frac{\omega_0}{1 - B}$$

so that a pulse in $I_t$ at the onset of the disturbance produces a shift that persists in all subsequent observations.
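The two disturbance types can be sketched as effects added to an otherwise undisturbed series. The fragment assumes that the disturbance is triggered by a unit pulse at a known time t0, and it generates the step by accumulating that pulse, which is one common way of writing a level shift. The function names and the magnitude 1.5 are hypothetical.

```python
def pulse(n, t0):
    """Indicator series equal to one at time t0 and zero elsewhere."""
    return [1.0 if t == t0 else 0.0 for t in range(n)]

def additive_outlier(n, t0, omega0):
    """Point disturbance: the impact is observed at a single period only."""
    return [omega0 * i for i in pulse(n, t0)]

def level_shift(n, t0, omega0):
    """Step disturbance: the pulse is accumulated, so the impact persists."""
    effect, running = [], 0.0
    for i in pulse(n, t0):
        running = running + omega0 * i
        effect.append(running)
    return effect

ao = additive_outlier(6, t0=2, omega0=1.5)  # [0.0, 0.0, 1.5, 0.0, 0.0, 0.0]
ls = level_shift(6, t0=2, omega0=1.5)       # [0.0, 0.0, 1.5, 1.5, 1.5, 1.5]
```

Adding either effect to a simulated undisturbed series gives a simple test bed for checking whether a proposed detection procedure distinguishes the two cases.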
Chang, et al. [27] extended the concepts of Box, et al. [25] to an iterative method for detecting the location and nature of outliers at unknown points in the time series. The above researchers defined procedures for detecting innovational outliers and additive outliers and for jointly estimating time series parameters. Their work also demonstrates the need for future study of the nature of outliers.

Multivariate Control Charts
Multivariate analyses utilize the additional information due to the relationships among the variables, and these concepts may be used to develop more efficient control charts than several univariate control charts operated simultaneously. The most popular multivariate SPC charts are Hotelling's $T^2$ [28] and the multivariate exponentially weighted moving average (MEWMA) [29]. The multivariate control chart for the process mean is based heavily upon Hotelling's $T^2$ distribution, which was introduced by Hotelling [30]. Other approaches, such as a control ellipse for two related variables and the method of principal components, are introduced by Jackson [31] and Jackson [32]. A straightforward multivariate extension of the univariate EWMA control chart, the multivariate EWMA (MEWMA) control chart, was developed by Lowry, et al. [33] and Lowry, et al. [34].
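The MEWMA vector is defined as

$$Z_i = \Lambda X_i + (I - \Lambda) Z_{i-1}$$

where $I$ is the identity matrix, $Z_i$ is the $i$th EWMA vector, $X_i$ is the $i$th (average) observation vector, $i = 1, 2, \ldots, n$, and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$ is the weighting matrix. The plotting statistic is

$$T_i^2 = Z_i' \, \Sigma_{Z_i}^{-1} \, Z_i$$

Lowry and Montgomery [34] showed that the $(k, l)$ element of the covariance matrix of the $i$th EWMA, $\Sigma_{Z_i}$, is

$$\Sigma_{Z_i}(k, l) = \lambda_k \lambda_l \, \frac{1 - (1 - \lambda_k)^i (1 - \lambda_l)^i}{\lambda_k + \lambda_l - \lambda_k \lambda_l} \, \sigma_{k,l}$$

where $\sigma_{k,l}$ is the $(k, l)$ element of $\Sigma$, the covariance matrix of the $X$'s. If $\lambda_1 = \lambda_2 = \cdots = \lambda_p = \lambda$, then the above expression simplifies to

$$\Sigma_{Z_i} = \frac{\lambda}{2 - \lambda} \left[ 1 - (1 - \lambda)^{2i} \right] \Sigma$$

where $\Sigma$ is the covariance matrix of the input data.
There is a further simplification. When $i$ becomes large, the covariance matrix may be expressed as

$$\Sigma_{Z_i} = \frac{\lambda}{2 - \lambda} \, \Sigma$$

Montgomery and Wadsworth [35] suggested a multivariate control chart for process dispersion based on the sample generalized variance, $|S|$, the determinant of the sample covariance matrix. In the next section, we explore how multivariate methods improve process control in the supply chain.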
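A minimal sketch of the MEWMA plotting statistic follows for the bivariate case with a common weight λ for both variables. It assumes that the observations have already been centered on the in-control mean and that the covariance matrix of the observations is known. The function names are hypothetical, and the control limit is left to the user because it depends on the chosen in-control ARL.

```python
def invert_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mewma_statistics(observations, sigma, lam=0.1):
    """Return the MEWMA statistic T_i^2 for each centered bivariate observation."""
    sigma_inv = invert_2x2(sigma)
    z = [0.0, 0.0]
    stats = []
    for i, x in enumerate(observations, start=1):
        z = [lam * x[k] + (1.0 - lam) * z[k] for k in range(2)]
        # Exact covariance of Z_i is scale * sigma when both weights equal lam.
        scale = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        quad = sum(z[k] * sigma_inv[k][l] * z[l] for k in range(2) for l in range(2))
        stats.append(quad / scale)
    return stats

correlated = [[1.0, 0.5], [0.5, 1.0]]
with_pattern = mewma_statistics([(1.0, 1.0)], correlated, lam=0.2)[0]
against_pattern = mewma_statistics([(1.0, -1.0)], correlated, lam=0.2)[0]
```

In this hypothetical example an observation that moves against the positive correlation between the two variables yields a statistic of 4.0, three times the value obtained when both variables move together, even though each variable deviates by the same amount.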
1. In the bivariate case the representation is elliptical.
2. You can maintain a specific probability of a Type I error (the α risk).
3. The determination of whether the process is out of or in control is a single control limit.
Currently, there is a gap between theory and practice, and this is the subject of this manuscript. Many practitioners and decision-makers have difficulty interpreting multivariate process control applications, although the book by Montgomery addresses many of the problems of understanding not discussed in the technical literature noted before. For example, the scale on multivariate charts is unrelated to the scale of any of the variables, and an out-of-control signal does not reveal which variable (or combination of variables) causes the signal.
Often one determines whether to use a univariate or multivariate chart by constructing and interpreting a correlation matrix of the pertinent variables. If the correlation coefficients are greater than 0.1, you can assume the variables correlate, and it is appropriate to construct a multivariate quality control chart.
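The screening rule above can be sketched as follows. The fragment computes the Pearson correlation matrix of several measured variables and reports whether any pair exceeds the 0.1 threshold. Comparing the absolute value of the correlation with the threshold is an assumption made here so that negative correlations are also detected; the function names and the data are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def correlation_matrix(columns):
    p = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(p)] for i in range(p)]

def suggests_multivariate_chart(columns, threshold=0.1):
    """True if any pair of variables has |r| above the threshold."""
    r = correlation_matrix(columns)
    p = len(columns)
    return any(abs(r[i][j]) > threshold for i in range(p) for j in range(i + 1, p))

related = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
unrelated = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0]]
```

For the two hypothetical data sets, `suggests_multivariate_chart(related)` is true and `suggests_multivariate_chart(unrelated)` is false.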
The development of information technology enables the collection of large-size data bases with high dimensions and short sampling time intervals at low cost. Computational complexity is now relatively simple for on-line computer-aided processes. In turn, monitoring results by automatic procedures produces a new focus for quality management. The new focus is on fitting the new environment. SPC now requires methods to monitor multivariate and serially correlated processes existing in many time series of public health and water treatment programs.
SPC emphasizes the properties of control for decision making while it ignores the complex issues of process parameter estimation. Estimation is less important for Shewhart control charts for serially independent processes because different estimators of the process parameters have nearly the same effect on the average run length (ARL). For processes having serial correlation, estimation becomes the key to the correct construction of control charts. Adopting workable estimators is then an important issue.
In the past, researchers studied SPC for serially correlated processes and SPC for multivariate processes separately. Research on quality control charts for correlated processes focused on univariate processes. Box, et al. [25] and Berthouex, et al. [11] noticed and discussed the correlated observations in production processes. Alwan, et al. [13] proposed a general approach to monitor the residuals of univariate autocorrelated time series, where the systematic patterns are filtered out and the special changes are more exposed. Other studies include Montgomery and Friedman [46], Harris, et al. [47], Montgomery, et al. [12], Maragah, et al. [48], Wardell, et al. [49], Lu, et al. [15], West, et al. [22], West, et al. [50] and English, et al. [51]. Pan, et al. [52] suggested a state space methodology for the control of autocorrelated processes. Further, additional technologies implemented by Testik [53], Yang, et al. [54] and Yeh, et al. [9] provide newer methods for enabling better multivariate process control (MPC) methods.
In Alwan and Roberts' approach, a time series is separated into two parts that are monitored in two charts. One is the common-cause chart and the other is the special-cause chart. The common-cause chart essentially accounts for the process's systematic variation, which is represented by an autoregressive integrated moving average (ARIMA) model, while the special-cause chart is for detecting assignable causes that appear in the residuals of the ARIMA model. That is, the special-cause chart is designed as a Shewhart-type chart to monitor the residuals filtered and whitened from the autocorrelated process (with known or estimated parameters). In this analysis, the authors suggest methods available in conventional quality control software (e.g., Minitab), namely the multivariate $T^2$ and generalized variance control charts. These multivariate charts show how several variables jointly influence a process or outcome. For example, you can use multivariate control charts to investigate how the tensile strength and diameter of a fiber affect the quality of fabric, or in any similar application. If the data include correlated variables, the use of separate control charts is misleading because the variables jointly affect the process. If you use separate univariate control charts in a multivariate situation, the Type I error and the probability of a point correctly plotting in control are not equal to their expected values. The distortion of these values increases with the number of measurement variables. In the next section, we will consider an illustration.
1. Multivariate charts simultaneously monitor correlated variables. To monitor more than one variable using univariate charts, you need to create a univariate chart for each variable.
2. The scale on multivariate control charts is unrelated to the scale of the individual variables.
3. Out of control signals in multivariate charts do not reveal which variable or combination of variables caused the signal.
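The points in the list above can be illustrated with a small sketch of Hotelling's $T^2$ for a single bivariate observation. The fragment assumes that the in-control mean vector and covariance matrix are known, in which case the statistic is compared with a chi-square limit; for two variables that limit has the closed form −2 ln(α). The function names, the correlation of 0.9 and the observations are hypothetical.

```python
import math

def hotelling_t2(x, mean, cov):
    """T^2 = (x - mean)' cov^{-1} (x - mean) for a bivariate observation."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def chi_square_limit_2df(alpha):
    """Upper alpha quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

mean = (0.0, 0.0)
cov = [[1.0, 0.9], [0.9, 1.0]]
limit = chi_square_limit_2df(0.0027)            # about 11.83
t2_against = hotelling_t2((2.0, -2.0), mean, cov)  # both variables within +/- 3 sigma
t2_along = hotelling_t2((2.0, 2.0), mean, cov)
```

In this example the observation (2, −2) lies inside the three-sigma limits of both univariate charts yet produces a $T^2$ of 80, far above the limit, while (2, 2) remains in control. The single number signals a problem without indicating which variable is responsible, and its scale bears no relation to the units of either variable.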
Whenever the variables are correlated, multivariate control charts will achieve superior performance. A correlation matrix will show whether the variables are cross-correlated. As we noted above, if the variables are cross-correlated, the use of separate control charts is misleading because the variables jointly affect the process. If you use separate univariate control charts in a multivariate situation, the Type I error and the probability of a point correctly plotting in control are not equal to their expected values. The distortion of these values increases with the number of measurement variables. Stated differently, the results of univariate analysis are biased.
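The distortion of the Type I error can be quantified in the simplest case. Assuming, purely for illustration, that the p variables are independent and that each univariate chart has false alarm probability α, the probability that at least one chart signals while the process is in control is 1 − (1 − α)^p. The function name is hypothetical.

```python
def joint_false_alarm_rate(alpha, p):
    """Probability that at least one of p independent charts signals falsely."""
    return 1.0 - (1.0 - alpha) ** p

single_chart = joint_false_alarm_rate(0.0027, 1)   # 0.0027
ten_charts = joint_false_alarm_rate(0.0027, 10)    # about 0.0267
```

With ten variables the overall false alarm rate is roughly ten times the nominal value. When the variables are correlated the exact figure differs, but the joint rate still departs from the nominal α of each chart.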
When one finds an out-of-control point in a multivariate control chart, the solution is often not very simple. An out-of-control point will not easily indicate which or how many of the variables give evidence of a special cause. When one finds out-of-control points, one may wish to create separate univariate charts to investigate each variable. However, one must interpret these charts with great caution since they do not account for the multivariate nature of the process data. Last, there are additional tools that can aid data analysts in identifying the causes of processes being out of control. The "G and H" charts [55] provide for monitoring the number of cases between hospital-acquired infections and other adverse events. Many of these methods are now included in various commercial quality control software packages.
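As a sketch of the G chart idea, the fragment below computes three-sigma limits for the number of cases observed between successive adverse events. It assumes that these counts follow a geometric distribution starting at zero, for which the variance equals m(m + 1) when the mean is m. Commercial packages may use other conventions, such as probability-based limits, so the formula, the function name and the data here are illustrative assumptions.

```python
import math

def g_chart_limits(counts, width=3.0):
    """Return (lcl, center, ucl) for counts of cases between rare events."""
    g_bar = sum(counts) / len(counts)
    sd = math.sqrt(g_bar * (g_bar + 1.0))  # geometric: variance = mean * (mean + 1)
    return max(0.0, g_bar - width * sd), g_bar, g_bar + width * sd

# Hypothetical numbers of cases between successive infections.
cases_between_infections = [10, 20, 30]
lcl, center, ucl = g_chart_limits(cases_between_infections)
```

Because the geometric distribution is strongly skewed, the lower limit computed in this way is usually zero, so an unusually short gap between events does not produce a signal below a three-sigma limit.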

Conclusions and Suggestions
We discussed control chart usage and illustrated why better procedures are available to supply chain managers. For example, we illustrated methods developed by Alwan and Roberts utilizing residual chart analysis. Later, we explored methods such as the West, et al. transfer function application and the traditional multivariate Hotelling $T^2$ chart to monitor multivariate and multivariate serially correlated processes (those with dynamic inputs). The scheme can be viewed as a generalization of Alwan and Roberts' special-cause approach to multivariate cases. The guidelines and procedures for the construction of vector autoregressive (VAR) residual charts are detailed in this paper. Molnau, et al. [56] produce a method for calculating the ARL for multivariate exponentially weighted moving average charts. Mastrangelo, et al. [45] simulated a VAR process for SPC purposes. However, a general study of VAR residual charts has heretofore not been reported. In addition, more recent studies by Kalagonda, et al. [39,40] and Jarrett, et al. [42][43][44] indicate additional ways in which one can improve upon the multivariate methods currently available in commercial quality control software such as Minitab® and others. These newer techniques provide more statistically accurate and efficient methods for determining when processes are in or out of control in the multivariate environment. When these methods become commercially available, practitioners should be able to implement these new statistical algorithms for multivariate process control (MPC) charts, using the ARL measure to control and improve output.
These new methods provide MPC charts that focus on the average run length. The purpose is to indicate how useful these techniques are in the supply chain environment, where processes are multivariate, dynamic or both. Simple SPC charts, though very useful in simple environments, may have limited use in public health. In any event, future research should focus on exploring the characteristics of public health systems and finding the best model to implement quality planning and improvement programs. Multivariate analysis should provide many of the new tools for improving health and water quality. The costs of stoppages and threats to the public health will diminish when managers explore the usefulness of the multivariate methods noted before. Last, quality analysts must be trained, retrained and continually trained in those methods that best fit the supply chain environment. Simple Shewhart methods are no longer sufficient to manage in the global environment of public health. The intensive use of automatic data acquisition systems and of computing for process monitoring has led to an increased occurrence of monitoring processes that utilize statistical process control. These analyses are performed almost exclusively with multivariate methodologies. Often, today, analysts utilize G charts when they wish to monitor the number of opportunities or, in many cases, the number of days between rare events such as infections or surgical complications. Examples include wrong-site surgeries, patient falls, infection outbreaks, accidental needle sticks and harmful medication errors. Last, mathematical modelers in recent years have made great strides in predicting rare events. This modeling approach may show promise in explaining and identifying rare events and is likely to produce newer and better methods for improved quality control.
Novak [57][58][59] shows methods for treating rare events in many applications that have statistical properties similar to those in public health.