
**Bohning D ^{1*} and Viwatwongkasem C^{2}**

^{1}Southampton Statistical Sciences Research Institute, University of Southampton, Southampton SO17 1BJ, UK

^{2}Department of Biostatistics, Faculty of Public Health Mahidol University, Bangkok, Thailand

- *Corresponding Author:
- Bohning D

Southampton Statistical Sciences Research Institute, University of Southampton

Southampton SO17 1BJ, UK

**Tel:** 00 44 (0)78 7951 9129

**Fax:** 00 44 (0) 23 8059 3216

**E-mail:**[email protected]

**Received date:** February 05, 2015; **Accepted date:** June 04, 2015; **Published date:** June 11, 2015

**Citation:** Bohning D, Viwatwongkasem C (2015) A Miscellaneous Note on the Equivalence of Two Poisson Likelihoods. J Biomet Biostat 6: 231. doi: 10.4172/2155-6180.1000231

**Copyright:** © 2015 Bohning D, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


This note shows that the concept of an offset, frequently introduced in Poisson regression models to cope with rate-type data, can be treated simply with a regular Poisson regression model. Hence Poisson regression models requiring an offset can be fitted as ordinary Poisson regression models. Some illustrations are provided, and it is discussed how this result came about.

**Keywords:** Poisson likelihood; Offset; Log-link

This paper is anecdotal in character and tells a story of what can happen in the life of a university professor. An exam question was set involving a Poisson regression model with an offset term. Most students found the solution that the professor had also provided as the official solution (to the exam office), except for one student, who ignored the offset term and instead treated the person-time not accounted for by events as failures. This solution was initially marked as wrong, as one would evidently not proceed along this path. However, as the student correctly pointed out, the results he/she obtained were identical to those of his/her colleagues, which had been marked as correct. The professor sat down to work out what was going on and, after two days, not without the interruptions typical of daily university life, found that the unorthodox solution also has a right to exist, as both likelihoods could be shown to be identical. The purpose of this note is to share this experience.

We consider the data constellation as provided in **Table 1**. This is done entirely for illustrative reasons and does not limit the generality of the major finding. Here a count (number of events, say) Y_{i} is considered for a binary exposure E_{i} and observation time at risk T_{i}, where i indicates the unit and runs from 1 to n. The person-time T_{i} is considered non-random, whereas Y_{i} is a discrete random variable. This situation is commonly denoted as a rate-type problem with rate

λ_{i}=E(Y_{i} )/T_{i}.

This rate leads to a natural modelling as follows

log E(Y_{i})=log T_{i}+log λ_{i} (1)

The term log T_{i} is a covariate with known coefficient (equal to one) and is called the offset, whereas log λ_{i}=β^{T}z_{i} contains the linear predictor with covariate vector z_{i} for the i^{th} unit. For the situation of **Table 1**, with z_{i}=(1, E_{i})^{T}, the model would simply be

log E(Y_{i} )=log T_{i}+β_{0}+β_{1} E_{i},

unit i | Y_{i} | E_{i} | T_{i} |
---|---|---|---|
1 | 2 | 1 | 3 |
2 | 4 | 1 | 5 |
3 | 6 | 1 | 7 |
4 | 3 | 0 | 4 |
5 | 4 | 0 | 5 |
6 | 1 | 0 | 3 |

**Table 1:** Hypothetical data for six units arising from a cohort study with response count Y, binary exposure E and person-time T.

where β_{1} is the log-risk ratio, usually the parameter of interest [1]. If one assumes that Y_{i} is Poisson with mean E(Y_{i}) given by (1), then the associated log-likelihood is given by

log L_{1}=Σ_{i=1}^{n} [-T_{i}λ_{i}+Y_{i} log λ_{i}] (2)

where we have ignored parameter-independent terms. Note that the full log-likelihood (including parameter-independent parts) is log L_{1}+Σ_{i=1}^{n} [Y_{i} log T_{i}-log(Y_{i}!)]. Many packages exist that can fit offset models such as (1). For the data of **Table 1** we use STATA (2013) to yield the output given in **Figure 1** [2].
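As a small illustration, the offset model for Table 1 can be evaluated by hand. The sketch below assumes that the kernel of the log-likelihood under model (1) is Σ [-T_{i}λ_{i}+Y_{i} log λ_{i}] with log λ_{i}=β_{0}+β_{1}E_{i}, and it uses the fact that, with a single binary covariate, the maximum is attained at the group-wise rates (events divided by person-time). The function names are illustrative only; this is not a reproduction of the STATA output.

```python
import math

# Table 1 as (Y_i, E_i, T_i): events, binary exposure, person-time.
TABLE1 = [(2, 1, 3), (4, 1, 5), (6, 1, 7), (3, 0, 4), (4, 0, 5), (1, 0, 3)]


def log_l1(beta0, beta1, rows=TABLE1):
    """Kernel of the offset-model log-likelihood: sum(-T*lam + Y*log(lam))."""
    total = 0.0
    for y, e, t in rows:
        log_lam = beta0 + beta1 * e  # log(lambda_i) from the linear predictor
        total += -t * math.exp(log_lam) + y * log_lam
    return total


def group_rate(exposure, rows=TABLE1):
    """Events per unit of person-time within one exposure group."""
    events = sum(y for y, e, t in rows if e == exposure)
    person_time = sum(t for y, e, t in rows if e == exposure)
    return events / person_time


# With one binary covariate the maximum is available in closed form.
beta0_hat = math.log(group_rate(0))
beta1_hat = math.log(group_rate(1)) - beta0_hat

print(f"beta0 = {beta0_hat:.4f}, beta1 = {beta1_hat:.4f}")
print(f"log L1 at the maximum = {log_l1(beta0_hat, beta1_hat):.4f}")
```

Here exp(beta1_hat) is the ratio of the two group rates, (12/15)/(8/12).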

Consider a second problem where we observe, for each unit i, a binary variate X_{ij}, where j=1, …, T_{i}. In other words, X_{ij}=1 if the event of interest occurs and X_{ij}=0 otherwise. For unit i, let Y_{i}=Σ_{j=1}^{T_{i}} X_{ij} denote the count of positives, so that T_{i}-Y_{i} is the number of zeros. Let p_{ij} denote the probability of an event in the i^{th} subject at the j^{th} occasion. Then the clustered likelihood

Π_{i=1}^{n} Π_{j=1}^{T_{i}} p_{ij}^{X_{ij}} (1-p_{ij})^{1-X_{ij}}

arises, assuming independence over subjects and occasions. This simplifies further if the event probability is constant over occasions, p_{ij}=p_{i}:

Π_{i=1}^{n} p_{i}^{Y_{i}} (1-p_{i})^{T_{i}-Y_{i}}

Let us make the unconventional assumption, purely for the mathematical convenience of showing the equivalence, that X_{ij} follows a Poisson distribution with mean λ_{i}. In other words, we assume that

log E(X_{ij})=log λ_{i} (3)

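for i=1, …, n and j=1, …, T_{i}, and X_{ij} ∼ Po(λ_{i}). In particular, this implies P(X_{ij}=1)=exp(-λ_{i})λ_{i} and P(X_{ij}=0)=exp(-λ_{i}); these take the place of p_{i} and 1-p_{i} in the clustered likelihood, although as Poisson probabilities they do not sum to one. Then the following associated log-likelihood (corresponding also to the full log-likelihood) occurs:

log L_{2}=Σ_{i=1}^{n} [Y_{i} log(exp(-λ_{i})λ_{i})+(T_{i}-Y_{i}) log(exp(-λ_{i}))] (4)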

Model (3) can be fitted by regressing the 2n-vector of n 1s and n 0s, (1, …, 1, 0, …, 0)^{T}, on the 2n-vector(s) of associated covariates, for example (E_{1}, …, E_{n}, E_{1}, …, E_{n})^{T} in the situation of **Table 2**, using as frequency weight vector the 2n-vector (Y_{1}, …, Y_{n}, T_{1}-Y_{1}, …, T_{n}-Y_{n})^{T}. This can be done easily in STATA and yields the output in **Figure 2**. Note that model (3) does not involve any offset. It is clear from **Figures 1 and 2** that the results of the analysis for models (1) and (3) are identical. We show in the following that this is not accidental (as it might have been had it occurred only in this special data set) but is far more general in nature.

unit i | E_{i} | X_{ij} | f_{i} |
---|---|---|---|
1 | 1 | 1 | 2 |
2 | 1 | 1 | 4 |
3 | 1 | 1 | 6 |
4 | 0 | 1 | 3 |
5 | 0 | 1 | 4 |
6 | 0 | 1 | 1 |
1 | 1 | 0 | 1 |
2 | 1 | 0 | 1 |
3 | 1 | 0 | 1 |
4 | 0 | 0 | 1 |
5 | 0 | 0 | 1 |
6 | 0 | 0 | 2 |

**Table 2:** The data of Table 1 reorganized. For the first set of n observations, f_{i}=Y_{i}=#{X_{ij}=1} denotes, for each unit i, the count of 1s in the binary set X_{i1}, …, X_{iT_{i}}. For the second set of n observations, f_{i}=T_{i}-Y_{i}=#{X_{ij}=0} denotes, for each unit i, the number of 0s.
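The reorganization from Table 1 to Table 2 is mechanical and can be sketched in a few lines. The helper name `expand` is hypothetical; it simply emits, for every unit, one row with response 1 and frequency Y_{i}, followed by one row with response 0 and frequency T_{i}-Y_{i}.

```python
# Table 1 as (unit, Y_i, E_i, T_i).
TABLE1 = [(1, 2, 1, 3), (2, 4, 1, 5), (3, 6, 1, 7),
          (4, 3, 0, 4), (5, 4, 0, 5), (6, 1, 0, 3)]


def expand(rows):
    """Return Table 2 rows (unit, E_i, X, f): first the 1s, then the 0s."""
    ones = [(unit, e, 1, y) for unit, y, e, t in rows]       # f = Y_i
    zeros = [(unit, e, 0, t - y) for unit, y, e, t in rows]  # f = T_i - Y_i
    return ones + zeros


TABLE2 = expand(TABLE1)
for row in TABLE2:
    print(*row, sep=" | ")
```

The frequencies add up to the total person-time, since each unit contributes Y_{i}+(T_{i}-Y_{i})=T_{i}.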

We have the following result:

**Theorem 1:**

log L_{1}=log L_{2}.

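Proof: We start with the clustered log-likelihood (4) and show that it is identical to the log-likelihood (2):

log L_{2}=Σ_{i=1}^{n} [Y_{i} log(exp(-λ_{i})λ_{i})+(T_{i}-Y_{i}) log(exp(-λ_{i}))]

=Σ_{i=1}^{n} [-Y_{i}λ_{i}+Y_{i} log λ_{i}-(T_{i}-Y_{i})λ_{i}]

=Σ_{i=1}^{n} [-T_{i}λ_{i}+Y_{i} log λ_{i}]=log L_{1},

which ends the proof.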
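The identity can also be checked numerically. The sketch below assumes log L_{1}=Σ [-T_{i}λ_{i}+Y_{i} log λ_{i}] and takes log L_{2} as the frequency-weighted sum of Poisson log-probabilities at 1 (weight Y_{i}) and at 0 (weight T_{i}-Y_{i}). The rates λ_{i} are drawn at random for each unit, so no particular linear predictor is imposed; the seed and the sampling range are arbitrary choices.

```python
import math
import random

# Counts and person-times from Table 1.
Y = [2, 4, 6, 3, 4, 1]
T = [3, 5, 7, 4, 5, 3]


def log_l1(lam):
    """Offset-model kernel: sum(-T_i*lam_i + Y_i*log(lam_i))."""
    return sum(-t * l + y * math.log(l) for y, t, l in zip(Y, T, lam))


def log_poisson(x, lam):
    """Log of the Poisson probability mass function at x."""
    return -lam + x * math.log(lam) - math.lgamma(x + 1)


def log_l2(lam):
    """Clustered version: Y_i ones and T_i - Y_i zeros, each Poisson(lam_i)."""
    return sum(y * log_poisson(1, l) + (t - y) * log_poisson(0, l)
               for y, t, l in zip(Y, T, lam))


rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    lam = [rng.uniform(0.05, 5.0) for _ in Y]  # arbitrary unit-specific rates
    max_gap = max(max_gap, abs(log_l1(lam) - log_l2(lam)))

print(f"largest |log L1 - log L2| over 1000 draws: {max_gap:.2e}")
```

Any remaining difference is floating-point rounding only.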

Note that this result, although mathematically almost trivial, is quite general and does not depend in any way on the form of the linear predictor. Note also that the full log-likelihoods (including all terms) are not identical, as **Figures 1 and 2** also indicate. Typically, the full log-likelihood corresponding to model (1) will be larger, assuming Y_{i} ≤ T_{i}, since

Y_{i} log T_{i}≥ Y_{i} log Y_{i}≥ log(Y_{i}!)

for i=1, … , n.
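For the six units of Table 1, the size of this gap can be computed directly. The sketch assumes that the full log-likelihood of model (1) exceeds its kernel by Σ [Y_{i} log T_{i}-log(Y_{i}!)], while the clustered log-likelihood has no such extra term because 0!=1!=1. It uses `math.lgamma(y + 1)` for log(y!).

```python
import math

# Counts and person-times from Table 1.
Y = [2, 4, 6, 3, 4, 1]
T = [3, 5, 7, 4, 5, 3]


def gap_term(y, t):
    """Y_i*log(T_i) - log(Y_i!), the parameter-free part of model (1)."""
    return y * math.log(t) - math.lgamma(y + 1)


terms = [gap_term(y, t) for y, t in zip(Y, T)]
total_gap = sum(terms)

# Check the chain Y*log(T) >= Y*log(Y) >= log(Y!) unit by unit.
chain_holds = all(
    y * math.log(t) >= y * math.log(y) >= math.lgamma(y + 1)
    for y, t in zip(Y, T)
)

print("per-unit terms:", [round(g, 4) for g in terms])
print("total gap:", round(total_gap, 4), "| chain holds:", chain_holds)
```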

• We would like to mention that the result does not require T_{i} to be an integer. The log-likelihood remains well-defined even if T_{i} is real-valued, as might occur with rate data or otherwise. All that is required is that T_{i} ≥ Y_{i}, which is only a question of scaling for T.

• The result does not require a special form of linear predictor. However, it does not generalize beyond the log-link, which is typical for Poisson regression or log-linear modelling.
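One reading of the scaling remark is sketched below with hypothetical, non-integer person-times recorded in years, some of which are smaller than the event counts. Re-expressing T in months makes every weight T_{i}-Y_{i} non-negative; the rate ratio is unaffected, and only the intercept shifts by log 12. The data values and the helper names are assumptions made for illustration, and the estimates use the closed form available for a single binary exposure.

```python
import math

# Hypothetical units (Y_i, E_i, T_i in years); several have T_i < Y_i.
ROWS_YEARS = [(3, 1, 1.3), (2, 1, 2.25), (1, 0, 0.5), (2, 0, 3.7)]


def rescale(rows, factor):
    """Express person-time in a finer unit (e.g. factor 12: years to months)."""
    return [(y, e, t * factor) for y, e, t in rows]


def estimates(rows):
    """Closed-form (beta0, beta1) for a single binary exposure."""
    rate = {}
    for group in (0, 1):
        events = sum(y for y, e, t in rows if e == group)
        time = sum(t for y, e, t in rows if e == group)
        rate[group] = events / time
    beta0 = math.log(rate[0])
    return beta0, math.log(rate[1]) - beta0


rows_months = rescale(ROWS_YEARS, 12)
zero_weights = [t - y for y, e, t in rows_months]  # weights of the 0-rows

b_years = estimates(ROWS_YEARS)
b_months = estimates(rows_months)

print("weights T_i - Y_i in months:", zero_weights)
print("beta1 (years, months):", b_years[1], b_months[1])
print("beta0 shift:", b_years[0] - b_months[0], "vs log(12) =", math.log(12))
```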

The result might have more curiosity value than impact, but is interesting in itself. It means, for example, that we can fit offset models without paying attention to the special offset variate in (1), and simply use the conventional model

log E(X_{ij})=log λ_{i}, (5)

where, for each unit i, X_{ij}=1 exactly Y_{i} times and X_{ij}=0 the remaining T_{i}-Y_{i} times. The question arises how this result can be used. We see at least two applications:

• It can be used to fit Poisson-offset models in packages that do not provide the offset-option.

• It may be used to check the computational correctness of the offset option if the latter is available.
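Both applications can be illustrated with a small hand-written fitting routine. The sketch below is a plain Newton-Raphson solver for a two-parameter Poisson regression with log-link, frequency weights and an optional offset; it is an assumption made for illustration and not the algorithm of STATA or of any other package. It is run once on Table 1 with offset log T_{i}, and once on the reorganized data with frequency weights and no offset.

```python
import math

# Table 1 as (Y_i, E_i, T_i).
TABLE1 = [(2, 1, 3), (4, 1, 5), (6, 1, 7), (3, 0, 4), (4, 0, 5), (1, 0, 3)]


def fit_poisson(rows, iterations=50, tol=1e-12):
    """Newton-Raphson for log E(response) = offset + b0 + b1*exposure.

    rows: (response, exposure, weight, offset) tuples.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        s0 = s1 = h00 = h01 = h11 = 0.0
        for x, e, w, off in rows:
            mu = math.exp(off + b0 + b1 * e)
            s0 += w * (x - mu)        # score for b0
            s1 += w * (x - mu) * e    # score for b1
            h00 += w * mu             # information matrix entries
            h01 += w * mu * e
            h11 += w * mu * e * e
        det = h00 * h11 - h01 * h01
        d0 = (h11 * s0 - h01 * s1) / det
        d1 = (h00 * s1 - h01 * s0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Model (1): one row per unit, weight 1, offset log(T_i).
offset_rows = [(y, e, 1.0, math.log(t)) for y, e, t in TABLE1]

# Model (3)/(5): responses 1 and 0 with frequency weights, no offset.
expanded_rows = ([(1, e, y, 0.0) for y, e, t in TABLE1]
                 + [(0, e, t - y, 0.0) for y, e, t in TABLE1])

offset_fit = fit_poisson(offset_rows)
expanded_fit = fit_poisson(expanded_rows)

print("with offset:   ", offset_fit)
print("without offset:", expanded_fit)
```

Agreement of the two fits is what the equivalence predicts; a disagreement would point to a problem in one of the two computations.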

However, the most important take-home message might be that one needs to be careful when deciding on the correctness of a solution provided by an unorthodox-thinking student. Finally, we wish to mention that the student received full marks, despite remaining doubts that he/she fully understood the depth of the equivalence.

- Cameron AC, Trivedi PK (1998) Regression Analysis of Count Data. Cambridge, Cambridge University Press.
- StataCorp (2013) Stata Statistical Software: Release 13. College Station, TX: StataCorp LP.
