
**Richard Charnigo ^{*} and Cidambi Srinivasan**

Department of Statistics, University of Kentucky, Lexington KY 40506-0027

**^{*}Corresponding Author:**
Richard Charnigo

Department of Statistics

University of Kentucky

Lexington KY 40506-0027

**E-mail:** [email protected]

**Received Date:** March 28, 2011; **Accepted Date:** March 30, 2011; **Published Date:** March 31, 2011

**Citation:** Charnigo R, Srinivasan C (2011) Estimating multiple derivatives simultaneously: What is optimal? J Biomet Biostat 2:102e. doi:10.4172/2155-6180.1000102e

**Copyright:** © 2011 Charnigo R. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.



Nonparametric regression techniques including kernel smoothing [1], spline smoothing [2], and local regression [3] are useful for estimating a mean response function *µ*(x) in the statistical model Y_{i} = *µ*(x_{i}) + ε_{i} when one is unwilling to assume that *µ*(x) is linear (or polynomial of higher but known degree) in the covariate *x*. These same techniques can also be employed to estimate one or more derivatives of *µ*(x). While the techniques differ in their details, they have a common underlying theme. One specifies a covariate value x_{0} and estimates *µ*(x) or one of its derivatives at x_{0} by solving an optimization problem that is localized to a neighborhood of x_{0}, in that only observations with covariate values inside the neighborhood contribute substantively to the solution. For example, the simplest incarnation of this theme is to define the estimate of *µ*(x_{0}) to be the average of all responses Y_{i} for which |x_{i}-x_{0}| is sufficiently small. As one slides x_{0} through a continuum of all possible covariate values, an estimated mean response or derivative is then traced out. Selecting the neighborhood size is a crucial implementation decision to which much literature has been devoted [4].
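The local-averaging idea just described can be sketched in a few lines. The code below is our own illustration (function and variable names are hypothetical), assuming a simulated sine curve observed with Gaussian noise:

```python
import numpy as np

def local_average(x0, x, y, h):
    """Estimate mu(x0) by averaging the responses Y_i whose covariates
    satisfy |x_i - x0| < h, i.e., fall inside the neighborhood of x0."""
    mask = np.abs(x - x0) < h
    return y[mask].mean()

# Simulated data: mu(x) = sin(x) observed with noise at 200 design points.
rng = np.random.default_rng(0)
x = np.linspace(0, np.pi, 200)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

# Slide x0 through a grid of covariate values to trace out the estimate.
grid = np.linspace(0.2, np.pi - 0.2, 50)
mu_hat = np.array([local_average(x0, x, y, h=0.15) for x0 in grid])
```

Shrinking the bandwidth h reduces bias but raises variance, which is precisely the neighborhood-size trade-off noted above.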

Under mild conditions, including appropriate dependence of the neighborhood size on the sample size n, Stone [5] established that local regression yields an optimal convergence rate of n^{-(J+1-k)/(2J+3)} in estimating *µ*^{(k)}(x) for 0 ≤ k ≤ J when *µ*(x) has (*J*+1) bounded derivatives. However, optimality may be defined even more stringently than the attainment of a particular convergence rate. For instance, optimality may entail minimizing mean square error or an asymptotic approximation thereto. Yet, kernel and local regression estimators of *µ*^{(k)}(x) with minimal mean square error are not the k^{th} order derivatives of kernel and local regression estimators of *µ*(x) with minimal mean square error [6,7]. While the existing literature thus provides guidance on the optimal estimation of *µ*(x) by itself, or of *µ*'(x) by itself, it does not elucidate what is optimal for the simultaneous estimation of *µ*(x) and *µ*'(x) or, more generally, the simultaneous estimation of *µ*(x) and all of its derivatives up to order J. Here we clarify that by simultaneous we refer not merely to the explicit estimation of multiple derivatives in a single data analysis but also to the requirement that the estimates of *µ*(x) and *µ*'(x) honor the same functional relationship as *µ*(x) and *µ*'(x) themselves, namely that the latter is the derivative of the former. Charnigo and Srinivasan [8] have termed this requirement "self-consistency".
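To make the local regression machinery concrete, a minimal local polynomial fit can be written as a weighted least squares problem. In this sketch (our own code, with hypothetical names), k! times the k-th fitted coefficient estimates the k-th derivative at x_{0}, so a single local fit yields several derivative estimates at once:

```python
import math
import numpy as np

def local_poly(x0, x, y, h, degree=2):
    """Weighted least squares fit of a polynomial centered at x0;
    k! times the k-th coefficient estimates the k-th derivative of mu."""
    u = (x - x0) / h
    w = np.maximum(1 - u ** 2, 0)  # Epanechnikov-style weights, zero outside |u| < 1
    X = np.vander(x - x0, degree + 1, increasing=True)
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return np.array([math.factorial(k) * beta[k] for k in range(degree + 1)])

# Noiseless quadratic check: mu(x) = 2 + 3x + x^2, so at x0 = 1 the true
# values of (mu, mu', mu'') are (6, 5, 2).
x = np.linspace(0, 2, 100)
y = 2 + 3 * x + x ** 2
est = local_poly(1.0, x, y, h=0.5)
```

On the noiseless quadratic above, the degree-2 local fit recovers (6, 5, 2) exactly, because the true function lies in the local model space.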

There are several practical applications in which estimating a mean response function and its derivatives may help to address important scientific questions. These applications include the modeling of:

- Human height [9], for which the first derivative is the growth rate and the second derivative can be employed to delineate time intervals over which growth is speeding up or slowing down;

- Kidney function for a lupus nephritis patient [10], for which the first derivative quantifies the progress of the disease and the second derivative can be used to delineate time intervals over which the disease is progressing unstably;

- Scattering profiles of submicroscopic nanoparticles [11], for which the mean response function and its derivatives may be employed like "fingerprints" to identify nanoparticles of unknown size or structure given existing results for nanoparticles of known size and structure; and,

- Raman spectra of bulk materials [4], for which the mean response function and its derivatives may likewise be used to identify materials of unknown chemical composition. The Raman spectrum application may be particularly interesting to readers of this journal because of its potential to detect impurities as part of a quality control process in pharmaceutical production [12] and its potential to complement existing mammography and ultrasound technology for the noninvasive diagnosis of breast cancer via the detection of calcified lesions [13].

In some of these practical applications, one may reach contradictory scientific conclusions if *µ*(x) and its derivatives are not estimated simultaneously. For example, Charnigo and Srinivasan [8] illustrated, in the human height application, the consequences of an estimated first derivative that does not equal the derivative of the estimated mean response. Employing local regression, they found that the estimated first derivative for one child had a local maximum at 10.5 years, suggesting that the child's growth spurt peaked at 10.5 years. On the other hand, the estimated second derivative for that same child was nonzero at 10.5 years; its closest zero was at 10.1 years, translating to a discrepancy of five months in pinpointing the peak of the growth spurt. While there is inherently some uncertainty about when the growth spurt peaked, acquiring two different estimates from a single data analysis is unsettling. The preceding illustration thus demonstrates that insisting upon optimal estimation of *µ*(x) by itself, of *µ*'(x) by itself, of *µ*''(x) by itself, and similarly for higher order derivatives may lead to incoherent scientific conclusions.

We therefore perceive the need for a new criterion by which optimality may be defined when multiple derivatives are estimated simultaneously. Such a criterion would evaluate a family of self-consistent estimators of *µ*(x), *µ*'(x), *µ*''(x), and so forth rather than evaluating each estimator by itself. Ideally, this criterion would favor good estimation of several derivatives over excellent estimation of one derivative accompanied by poor estimation of the remaining derivatives. Hence, a family of estimators deemed optimal by such a criterion would not be anticipated to include, for example, an estimator that minimized the mean square error in estimating *µ*(x); the derivatives of such an estimator would likely be too undersmoothed to serve as good proxies for the derivatives of *µ*(x). Likewise, an optimal family would not be anticipated to include an estimator that minimized the mean square error in estimating a high order derivative of *µ*(x); the antiderivatives of such an estimator would likely be too oversmoothed to serve as good proxies for the lower order derivatives and for *µ*(x) itself.
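The notion of a self-consistent family can be made concrete with a toy construction (our own illustration using a global polynomial fit, not the estimator of [8]): fit one smooth curve and define each derivative estimate as the corresponding derivative of that single fit. Self-consistency then holds by construction, so, for instance, a local maximum of the estimated first derivative coincides exactly with a zero of the estimated second derivative:

```python
import numpy as np

# Simulated data: mu(x) = sin(2*pi*x) observed with noise.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 300)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.05, size=x.size)

# One smooth fit; every derivative estimate is a derivative of this same fit.
mu_hat = np.polynomial.Polynomial.fit(x, y, deg=9)
mu1_hat = mu_hat.deriv(1)  # estimated first derivative
mu2_hat = mu_hat.deriv(2)  # estimated second derivative
```

The open question raised in this editorial is not how to build such a family but which self-consistent family, at which level of smoothing, an optimality criterion should single out.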

What, then, might such a criterion look like? One idea would be to consider the sum of mean square errors over all derivatives being estimated, with the rationale that the sum could be minimized only if each derivative were well estimated. Yet, because the summand for the highest order derivative might be much larger than any other summand, the sum might overemphasize the estimation of *µ*^{(J)}(x) and thereby lead to oversmoothing in the estimation of *µ*(x) and its lower order derivatives. A less naive idea would be to consider a weighted sum of mean square errors, with weights *a*_{0} through *a*_{J}, or the mean square error of a weighted sum of derivative estimators. Either way, a sensible specification of *a*_{0} through *a*_{J} would be required for the criterion to serve its intended purpose. In light of Stone's [5] theory, one might think to let a_{k} escalate in proportion to n^{(J+1-k)/(2J+3)} as the sample size n increased. However, prescribing *a*_{k} = c_{k} n^{(J+1-k)/(2J+3)} with c_{k} not dependent on n only reduces the question of specifying *a*_{0} through *a*_{J} to the problem of choosing *c*_{0} through *c*_{J}. One might imagine that c_{0} = c_{1} = **. . .** = c_{J} = 1 would be a natural default choice and least vulnerable to criticism for appearing ad hoc, but whether such a choice would allow the criterion to serve its intended purpose is unclear.
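As a small numerical sketch of the weighting scheme just discussed (function name hypothetical), the prescription a_{k} = c_{k} n^{(J+1-k)/(2J+3)} with the default c_{k} = 1 gives weights that grow with n and decrease in k, so lower order derivatives are weighted more heavily:

```python
def stone_weights(n, J, c=None):
    """Weights a_k = c_k * n^((J+1-k)/(2J+3)) for k = 0, ..., J,
    following the rate-based scaling suggested by Stone's theory."""
    if c is None:
        c = [1.0] * (J + 1)  # the natural default c_0 = ... = c_J = 1
    return [c[k] * n ** ((J + 1 - k) / (2 * J + 3)) for k in range(J + 1)]

weights = stone_weights(n=1000, J=2)  # one weight per derivative order
```

Whether equalizing contributions in this way actually balances the estimation of the several derivatives is, as the text notes, an open question.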

The considerations in the preceding paragraph are, of course, predicated on the belief that *µ*(x) is (*J* + 1)-times differentiable. A still greater challenge remains in formulating a criterion by which optimality may be defined when *µ*(x) is infinitely differentiable and all of its derivatives are to be estimated simultaneously. Such a criterion is motivated by the recognition that, although one may not envisage practical applications in which estimates of all derivatives are required, there may exist practical applications in which the number of derivative estimates required is not known a priori. For example, in the Raman spectrum application, a researcher may first examine an estimate of *µ*(x). If the estimate of *µ*(x) reveals the chemical composition of the material, then the researcher may stop. Otherwise, the researcher may examine an estimate of *µ*'(x). This process may continue, with the researcher subsequently examining estimates of *µ*''(x) and higher order derivatives, until the researcher either knows the chemical composition of the material or regards the higher order derivative estimates as so noisy that he or she is simply forced to guess at the chemical composition. Charnigo, Hall and Srinivasan [4] provide an example in which an estimate of *µ*'(x) leads to successful identification of a sample of cerium bastnasite. In general, then, the number of derivative estimates to be examined may not be known a priori. Unfortunately, since most nonparametric regression techniques make no provision for the estimation of infinitely many derivatives, there is little theory to inform the construction of a criterion by which optimality may be defined when *µ*(x) is infinitely differentiable and all of its derivatives are to be estimated simultaneously.
We thus conclude the present editorial by calling for additional research on the simultaneous estimation of a mean response function and its derivatives when the mean response function is infinitely differentiable.

This material is based upon work supported by the National Science Foundation under Grant No. DMS-0706857. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

- Härdle W (1990) Applied nonparametric regression. Cambridge University Press, Cambridge.
- Wahba G (1990) Spline models for observational data. SIAM, Philadelphia.
- Loader C (1999) Local regression and likelihood. Springer-Verlag, New York.
- Charnigo R, Hall B, Srinivasan C (2011) A generalized Cp criterion for derivative estimation. Technometrics, tentatively accepted.
- Stone CJ (1980) Optimal rates of convergence for nonparametric estimators. Annals of Statistics 8: 1348-1360.
- Müller HG, Stadtmüller U, Schmitt T (1987) Bandwidth choice and confidence intervals for derivatives of noisy data. Biometrika 74: 743-749.
- Fan J, Gijbels I (1995) Data-driven bandwidth selection in local polynomial fitting: Variable bandwidth and spatial adaptation. J R Statist Soc B 57: 371-394.
- Charnigo R, Srinivasan C (2011) Self-consistent estimation of mean response functions and their derivatives. Canadian Journal of Statistics, in press.
- Ramsay JO, Silverman BW (2002) Applied functional data analysis: Methods and case studies. Springer-Verlag, New York.
- Ramsay JO, Silverman BW (1997) Functional data analysis. Springer-Verlag, New York.
- Charnigo R, Francoeur M, Kenkel P, Mengüç MP, Hall B, et al. (2011) Estimating quantitative features of nanoparticles using multiple derivatives of scattering profiles. J Quant Spectrosc Radiat Transf 112: 1369-1382.
- Matousek P, Parker AW (2006) Bulk Raman analysis of pharmaceutical tablets. Applied Spectroscopy 60: 1353-1357.
- Matousek P, Stone N (2007) Prospects for the diagnosis of breast cancer by noninvasive probing of calcifications using transmission Raman spectroscopy. J Biomed Opt 12: 024008.

