
^{1}Applied Mathematics Institute, Jilin University of Finance and Economics, Changchun, Jilin, 130117, China

^{2}Department of Statistics, University of Missouri, 146 Middlebush Hall, Columbia, Missouri, 65211, USA

- *Corresponding Author:
- Jianguo Sun

Department of Statistics

University of Missouri

146 Middlebush Hall

Columbia, Missouri, USA

**E-mail:**[email protected]

**Received Date:** July 31, 2010; **Accepted Date:** September 17, 2010; **Published Date:** September 27, 2010

**Citation:** Li Y, Shchy A, Sun J (2010) Nonparametric Treatment Comparison for Current Status Data. J Biomet Biostat 1:102. doi:10.4172/2155-6180.1000102

**Copyright:** © 2010 Li Y, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Current status data occur in many studies; in this case, each subject is observed only once [3,10]. Furthermore, the distributions of observation times may differ between subjects in different treatment groups. This paper focuses on current status recurrent event data, which concern the occurrence rates of recurrent events such as disease infections, and discusses the nonparametric comparison of several treatment groups. Two new test procedures are proposed, and a simulation study shows that they are more efficient than the existing ones. An illustrative example on lung tumors is provided.

Current status data; Nonparametric Treatment Comparison; Recurrent Event Studies; Unequal Observation

Current status data occur in many studies such as cross-sectional studies, demographic studies, sample surveys and tumorigenicity experiments [3,5,6,9]. In this case, each subject is observed only once and no information is available on subjects between their entry times and observation times. Furthermore, the distributions of observation times may be different for subjects in different treatment groups. In this paper, we will consider such data arising from recurrent event studies that concern occurrence rates of certain recurrent events such as hospitalization and disease infection. For these current status recurrent event data, only the number of the recurrent events of interest that have occurred before the observation time is known and, in particular, the times at which the events occur are unknown.

A typical example of current status data arises from cross-sectional studies, which are often used in, for example, demographic studies or sample surveys. In these cases, the recurrent event of interest could be giving birth, getting married, or changing jobs. Tumorigenicity experiments are another area that often yields current status data. In these situations, the time until tumor onset is usually of interest, and a comparison of different treatments with respect to the rates of tumor development is often required. The tumor onset time, however, is often not directly observable. Instead, only the death time of each animal in the study and either the status of tumor onset at the death time or the number of tumors developed by the death time are observed. For the treatment comparison here, an important factor that should be taken into account is the animal death time, which serves as the observation time and could depend on the treatments. A comparison that does not account for differences in animal death times could overestimate or underestimate the treatment difference [6,8,10].

A number of authors have considered the analysis of current status data. For example, Diamond and McDonald [5] discussed the data arising from demographic studies and Dinse and Lagakos [6] and Hoel and Walburg [7] provided some methods for the analysis of the data given by tumorigenicity experiments. Several methods have been proposed for nonparametric treatment comparison based on current status data and these include the procedures given in Andersen and Ronn [1] and Sun and Kalbfleisch [12]. Also current status data can be regarded as a special case of interval-censored failure time data or panel count data and some nonparametric comparison approaches have been proposed for these situations [4]. However, most of these existing procedures only apply to situations where the distributions of observation times are identical across different treatment groups. One exception that considered the case where the distributions may be different was given by Sun [10]. In the following, two efficient procedures are presented that allow different observation time distributions.

The remainder of the paper is organized as follows. We begin by introducing some notation and briefly reviewing the procedures proposed in Sun [10]. Two new procedures are then presented in Section 3 and their asymptotic distributions are given. One procedure, which is much simpler, is designed for the situation in which the observation times for all subjects under study follow the same distribution, whereas the other allows the distributions of observation times to be different or to depend on treatments. Section 4 gives some results from a simulation study conducted to assess the performance of the proposed procedures in practical situations. An illustrative example from a tumorigenicity experiment is also provided in Section 4. Section 5 contains some discussion and concluding remarks.

Consider a recurrent event study that consists of n independent subjects and in which each subject receives one of p + 1 different treatments. For subject i, let *N _{i}(t)* denote the total number of occurrences of the recurrent event of interest up to time t and define Z

Several procedures are available for testing H_{0}. For example, Sun [10] suggested using the following test statistic

assuming that the distributions of the T_{i}’s are identical for subjects in different treatment groups, where . Of course, in practice, the distributions of the observation times T_{i}’s may depend on the treatment indicators Z_{i}’s. To take this into account, Sun [10] proposed first to model this dependence by the proportional hazards model λ_{i}(t |Z_{i} )= λ_{0}(t)e^{1}Z_{i} for the hazard function of T_{i} [2]. Here λ_{0}(t) denotes an unknown baseline hazard function and is a *p*-dimensional vector of unknown regression parameters.

Note that for the T_{i}’s, one has the complete failure time data and thus one can easily estimate and the baseline cumulative hazard function by the partial likelihood estimate and the Breslow estimate, respectively. Given these estimates, Sun [10] proposed to apply the statistic

for testing H_{0}, where . Furthermore, he showed that the statistic asymptotically follows a multivariate normal distribution with mean zero under the null hypothesis H_{0}. In the next section, two more efficient procedures are presented.

In this section, motivated by the two test procedures discussed in the previous section, we present two new procedures for testing H_{0}. For this, define µ(t) = E{N_{i}(t)} under H_{0} and let µ̂(t) denote the isotonic regression estimate of µ(t) [11,13]. To test H_{0}, first suppose that all observation times T_{i}’s follow the same distribution or the distribution of the T_{i}’s is independent of the Z_{i}’s. Then by following the statistic , we propose to apply the statistic

It can be easily shown that under H_{0}, the distribution of U_{1} can be asymptotically approximated by the multivariate normal distribution with mean zero and the covariance matrix

Thus one can test H_{0} by using X_{1} = whose distribution can be asymptotically approximated by the χ² distribution with *p* degrees of freedom.
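The isotonic regression estimate of µ(t) used above can be computed with the pool-adjacent-violators algorithm. The following is only a minimal sketch, assuming a least-squares fit of the observed counts N_{i}(T_{i}) against the observation times T_{i} under a nondecreasing constraint; the function name `isotonic_mean_estimate` and its argument names are hypothetical and do not come from the paper.

```python
def isotonic_mean_estimate(times, counts):
    """Nondecreasing least-squares fit of counts against times (PAVA).

    Illustrative sketch only: returns the fitted value at each subject's
    own observation time, in the input order.
    """
    # Pool tied observation times first: (sum of counts, number of subjects).
    pooled = {}
    for t, c in zip(times, counts):
        total, size = pooled.get(t, (0.0, 0))
        pooled[t] = (total + c, size + 1)

    # Each block is [distinct times covered, sum of counts, number of subjects].
    blocks = []
    for t in sorted(pooled):
        total, size = pooled[t]
        blocks.append([[t], total, size])
        # Merge backwards while adjacent block means decrease.
        while (len(blocks) > 1
               and blocks[-2][1] / blocks[-2][2] > blocks[-1][1] / blocks[-1][2]):
            ts, total, size = blocks.pop()
            blocks[-1][0].extend(ts)
            blocks[-1][1] += total
            blocks[-1][2] += size

    # Every time in a block receives the block mean as its fitted value.
    fitted_at = {}
    for ts, total, size in blocks:
        for t in ts:
            fitted_at[t] = total / size
    return [fitted_at[t] for t in times]
```

Subtracting the fitted values from the observed counts gives centered responses of the kind the proposed statistics are built on; the statistics themselves are not reproduced here.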

Now we consider the general situation where the distribution of the T_{i}’s may depend on the Z_{i}’s. For this, we assume that the dependence can be described by model (1) as in Sun [10]. Let be defined as before, the partial likelihood estimate of given by the solution to the partial likelihood score equation

where To test H_{0}, we propose the following test statistic

It can be seen that the key difference between the existing test statistics reviewed in the previous section and the proposed test statistics is that, unlike the former, the latter employ the centered response process N_{i}(t) − µ̂(t), thus reducing variance and gaining efficiency. The idea has been used by, for example, Sun [14], among others.

To describe the asymptotic distribution of U_{2}(), let A()and B()Define

and

Also define

and

i = 1, ..., n. Then one can prove that under H_{0}, the distribution of U_{2} () can be asymptotically approximated by a multivariate normal distribution with mean 0 and covariance matrix

Here *I* denotes the *p* ×* p* identity matrix and

The proof follows arguments similar to those used in Sun [10] and is omitted. It follows that the test of hypothesis H_{0} can be carried out by using the statistic X_{2}, whose distribution can be asymptotically approximated by the χ² distribution with *p* degrees of freedom.

A simulation study was conducted to assess the performance of the two test procedures presented in the previous section in practical situations. In the study, we considered the two sample comparison problem (*p* = 1) and took Z_{i} equal to 0 or 1 with probability q. Note that in the design of a study, the sample sizes for two treatment groups are usually set to be equal or close to each other, but in practice, they may be different. We investigated situations with q = 0.50, 0.67 and 0.80. To generate current status data, we first generated the potential number of events from the Poisson distribution with mean 2 and then generated the occurrence times of the events from the uniform distribution. The current status data were thus given by determining how many events have occurred before the observation time generated either from the uniform distribution or exponential distribution with the hazard function given in (1). The results given below are based on 1000 replications.
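The data-generating mechanism described above can be sketched as follows. This is a hedged illustration, not the authors' simulation code: the time horizon `tau`, the assignment Z_{i} = 1 with probability q, the constant baseline hazard of 1, and the coefficient `coef = 0.5` in the hazard of the observation time are all assumptions made here for concreteness, and every function and parameter name is hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_current_status(n, q, tau=1.0, coef=0.5, unequal=True, seed=None):
    """Return a list of (z, t_obs, count) triples, one per subject.

    Assumptions (not stated in the paper): events are uniform on (0, tau),
    Z = 1 with probability q, and the baseline hazard of T is constant at 1.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = 1 if rng.random() < q else 0
        # Potential number of events: Poisson with mean 2, as in the text.
        n_events = poisson_draw(rng, 2.0)
        event_times = [rng.uniform(0.0, tau) for _ in range(n_events)]
        if unequal:
            # Exponential observation time with hazard exp(coef * z).
            t_obs = rng.expovariate(math.exp(coef * z))
        else:
            # Same uniform observation-time distribution in both groups.
            t_obs = rng.uniform(0.0, tau)
        # Only the number of events before the observation time is recorded.
        count = sum(1 for s in event_times if s <= t_obs)
        data.append((z, t_obs, count))
    return data
```

For example, `simulate_current_status(100, 0.5, seed=1)` yields one replication with n = 100; repeating the call with different seeds gives the replications over which size and power are estimated.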

**Table 1** presents the estimated sizes of the two test procedures proposed in Section 3 at the nominal significance level 0.05 with total sample size n = 100 or 200. Both procedures appear to give the proper size. The estimated powers of the two procedures are given in **Table 2**. Here we took λ_{0}(t) = e and = 0.5. For comparison, we also estimated and included in **Table 2** the powers of the two test procedures given in Sun [10] and based on statistics and , respectively. These two procedures are denoted by X^{*}_{1} and X^{*}_{2} and given in brackets in the table. The results indicate that each new procedure has greater power than its existing counterpart in every setting considered, and that the procedure based on X_{2} has better power than that based on X_{1}, as expected. Also as expected, the power increases with the sample size, and more balanced sample sizes between the two treatment groups give greater power.

| Sample percentage | Procedure X_{1}, n = 100 | Procedure X_{1}, n = 200 | Procedure X_{2}, n = 100 | Procedure X_{2}, n = 200 |
| --- | --- | --- | --- | --- |
| q = 50% | 0.057 | 0.051 | 0.047 | 0.052 |
| q = 67% | 0.054 | 0.048 | 0.052 | 0.050 |
| q = 80% | 0.041 | 0.047 | 0.054 | 0.052 |

**Table 1:** Estimated size of the proposed test procedures.
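As a rough check on what "proper size" means with 1000 replications, the Monte Carlo standard error of an estimated rejection rate can be computed from the binomial variance. The snippet below is a sketch under the assumption that the replications are independent; the two-standard-error band is a conventional rule of thumb used here for illustration, not a criterion stated in the paper.

```python
import math

nominal = 0.05        # nominal significance level used in the study
replications = 1000   # number of simulated data sets per setting

# Binomial standard error of an estimated rejection rate at the nominal level.
se = math.sqrt(nominal * (1.0 - nominal) / replications)

# Estimated sizes from Table 1, read row by row.
table1 = [0.057, 0.051, 0.047, 0.052,
          0.054, 0.048, 0.052, 0.050,
          0.041, 0.047, 0.054, 0.052]

# Flag whether each entry lies within two standard errors of the nominal level.
within = [abs(size - nominal) <= 2.0 * se for size in table1]
print(f"standard error = {se:.4f}; all within 2 SE: {all(within)}")
```

The standard error is about 0.007, so every entry of Table 1 lies within two standard errors of 0.05.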

| Sample percentage | Procedure X_{1} (X^{*}_{1}), n = 100 | Procedure X_{1} (X^{*}_{1}), n = 200 | Procedure X_{2} (X^{*}_{2}), n = 100 | Procedure X_{2} (X^{*}_{2}), n = 200 |
| --- | --- | --- | --- | --- |
| q = 50% | 0.439 (0.325) | 0.452 (0.342) | 0.703 (0.679) | 0.718 (0.701) |
| q = 67% | 0.413 (0.316) | 0.441 (0.338) | 0.698 (0.676) | 0.711 (0.688) |
| q = 80% | 0.389 (0.282) | 0.426 (0.323) | 0.678 (0.661) | 0.711 (0.672) |

**Table 2:** Estimated power of the proposed test procedures.

To illustrate the two test procedures given in the previous section, we applied them to the current status data on lung tumors described in Hoel and Walburg [7]. The data arose from a tumorigenicity experiment on 144 male RFM mice and involve two treatments, a conventional environment (96 mice) and a germfree environment (48 mice). For each mouse, the observation consists of its death time, which serves as the observation time, and an indicator of the presence or absence of a lung tumor at death. One of the objectives of the study was to compare the lung tumor incidence rates of the two groups. As shown in Sun [10], the death or observation times in these data are quite different between the two treatment groups. That is, we have unequal observation.

For the comparison of the lung tumor incidence rates, define Z_{i} = 0 if the ith animal was in the conventional environment and 1 otherwise. The application of the two test procedures described in the previous sections yielded X_{1} = 8.2549 and X_{2} = 3.9704 with corresponding p-values of 0.0041 and 0.0463, respectively, for testing no difference in the lung tumor incidence rates between the two groups. The results suggest that the lung tumor incidence rates of the two treatment groups were significantly different and that the animals in the germfree environment had a higher incidence than those in the conventional environment. The results also indicate that when the observation times are unequal, one needs to be careful, as the procedure that assumes equal observation tends to overestimate the treatment difference. These conclusions are similar to those obtained by Sun [10], which gave p-values of 0.0009 and 0.028 for the same comparison problem by using the test procedures based on the statistics and , respectively.
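The reported p-values can be checked against the χ² approximation with *p* = 1 degree of freedom, which applies here because two groups are compared. The snippet below is a small sketch that uses the identity P(χ²_{1} > x) = erfc(√(x/2)); the function name `chi2_df1_pvalue` is hypothetical.

```python
import math


def chi2_df1_pvalue(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    Uses P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))


# Test statistics reported for the lung tumor data.
for name, value in [("X_1", 8.2549), ("X_2", 3.9704)]:
    print(f"{name} = {value}: p-value = {chi2_df1_pvalue(value):.4f}")
```

For more than two treatment groups (*p* > 1), a general χ² tail function would be needed in place of this one-degree-of-freedom shortcut.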

This paper discussed the nonparametric treatment comparison problem based on current status recurrent event data, which often arise in, among others, cross-sectional studies and sample surveys concerned with the occurrence rates of recurrent events of interest. For this problem, a few procedures have been developed under the assumption that the observation time follows the same distribution for all subjects under study [1,6]. However, the assumption may not hold in practice, as seen in the example discussed in Section 4. We developed two new nonparametric test procedures, one of which does not require the assumption, and showed by simulation that they are more efficient than the corresponding existing procedures.

As mentioned above, the current status data discussed here are a special case of panel count data [4,11], and thus the comparison problem discussed here could also arise for panel count data. It is worth noting, however, that the observation processes for the two types of data are quite different. For current status data, the observation process involves only a single time variable, while the observation process for panel count data has to be described by a point process and is thus much more complicated. The focus of this paper has been on recurrent events. If the event can occur only once, current status data become a special case of what is commonly referred to as interval-censored failure time data [11]. Like panel count data, interval-censored failure time data involve more than one observation time point for each study subject and thus also have much more complex observation processes. For both panel count data and interval-censored failure time data, it would be useful to develop nonparametric test procedures for treatment comparison that allow different observation processes for subjects in different treatment groups.

A limitation of the proposed test procedures, as well as of most existing procedures, is that the recurrent event process of interest and the observation process are assumed to be independent given treatments. In some situations, this is not true. An example is given by a tumorigenicity experiment concerning tumors that are between nonlethal and lethal. In this case, the tumor occurrence rate and the animal death time are correlated, and thus their relationship has to be taken into account in the comparison. In general, one then says that the censoring or observation time is informative, and different procedures that take the relationship into account have to be developed for treatment comparison.

The authors wish to thank two reviewers for their many helpful and thoughtful comments and suggestions that greatly improved the paper.

1. Andersen PK, Ronn BB (1995) A nonparametric test for comparing two samples where all observations are either left- or right-censored. Biometrics 51: 323-329.
2. Cox DR (1972) Regression models and life-tables (with discussion). J R Stat Soc Series B 34: 187-220.
3. Datta S, Sundaram R (2006) Nonparametric estimation of stage occupation probabilities in a multistage model with current status data. Biometrics 62: 829-837.
4. Deng D, Fang HB (2009) On nonparametric maximum likelihood estimations of multivariate distribution function based on interval-censored data. Commun Stat Theory Methods 38: 54-74.
5. Diamond ID, McDonald JW (1991) The analysis of current status data. In: Demographic Applications of Event History Analysis. Trussell J, Hankinson R, Tilton J (eds.), Oxford University Press, Oxford, UK.
6. Dinse GE, Lagakos SW (1983) Regression analysis of tumor prevalence data. Appl Stat 32: 236-248.
7. Hoel DG, Walburg HE (1972) Statistical analysis of survival experiments. J Natl Cancer Inst 49: 361-372.
8. Lagakos SW, Louis TA (1988) Use of tumor lethality to interpret tumorigenicity experiments lacking cause-of-death data. Appl Stat 37: 169-179.
9. Rai SN (1997) On semi-parametric models in occult tumour experiments. Biom J 39: 909-918.
10. Sun J (1999) A nonparametric test for current status data with unequal censoring. J R Stat Soc Series B Stat Methodol 61: 243-250.
11. Sun J (2006) The statistical analysis of interval-censored failure time data. Springer Science + Business Media Inc., USA.
12. Sun J, Kalbfleisch JD (1993) The analysis of current status data on point processes. J Am Stat Assoc 88: 1449-1454.
13. Sun J, Kalbfleisch JD (1995) Estimation of the mean function of point processes based on panel count data. Stat Sin 5: 279-290.
14. Sun Y (2010) Estimation of semiparametric regression model with longitudinal data. Lifetime Data Anal 16: 271-298.
