Method 
Basic notion 
Exposure effects 
Strength 
Limitation 
Ratio estimator (RE) 
the RE is appropriate when there is only one IV 
RD, RR, OR 
simple estimation method
with a single binary IV and no other confounders, 2SLS = RE 

Two-stage least squares (2SLS) 
linear models without making parametric assumptions on the error terms
for multiple IVs, the IV estimator is a weighted average of the ratio estimators 
the estimator is similar to that of classical regression 
natural starting point of IV analysis
the estimate is asymptotically unbiased
widely used for binary exposure and outcome and provides the exposure effect on risk difference scale
unlike the RE, it is able to adjust for measured confounders 
may show biased results in binary cases or with nonlinear models
for multiple IVs, the 2SLS estimator is biased, so the limited information maximum likelihood (LIML) method would be an alternative
for smaller sample sizes, the LIML estimator is more efficient and consistent than 2SLS
IV and 2SLS are special cases of GMM; however, both yield the same results when the error variance is homoscedastic 
Linear probability models (LPM) 
applied to a binary outcome, exposure, and IV; the data are modelled using linear functions
for a single binary IV, the estimator is equivalent to the RE 
RD 
simple to estimate and interpret as the regression coefficients
the RD is consistent for the ACE 
predicted probabilities sometimes fall outside the 0–1 range, and for rare outcomes they may become negative
assumes the marginal/incremental effect of exposure remains constant, which is logically impossible for a binary outcome 
Two-stage predictor substitution (2SPS) 
the rote extension of the linear IV models to nonlinear models
targets a marginal (population-averaged) odds ratio
it mimics 2SLS
nonlinear least squares is used to estimate the parameters
for a linear model, 2SPS = 2SLS 
RD, RR, OR 
suitable for nonlinear association between exposure and outcome 
in practice, 2SPS in nonlinear model does not always yield consistent exposure effects on the outcome
the parameter estimation process is more difficult than for 2SLS
under a logistic regression model, 2SPS may not provide the causal OR

Two-stage residual inclusion (2SRI) 
includes the estimated unobservable confounder (residual) from the first stage as an additional variable along with the exposure in the second-stage model
also called the control function estimator
under a linear model, 2SRI = 2SLS = 2SPS 
RD, RR, OR 
yields consistent estimates for linear and nonlinear models
performs better than 2SPS
possible to apply in the specific case of a binary exposure with a binary or count outcome
for a log-linear model in stage two, the 2SRI estimator provides the CRR 
it may give biased estimates when there is strong unmeasured confounding, as is usually the case in an IV analysis
under a logistic regression model, the 2SRI estimator may not provide the causal OR
generally requires the exposure to be continuous, rather than binary, discrete, or censored 
Two-stage logistic regression (2SLR) 
used when the outcome and exposure are binary and the interest is in estimating the OR
fully parametric; maximum likelihood is used to estimate the parameters 
OR 
parallel to 2SLS, using LRM in both stages instead of linear models 
if the first-stage logistic model is not correctly specified, the second-stage parameter estimates might be biased
the estimator does not provide the COR 
Three-stage least squares (3SLS) 
an extension of 2SLS, but unlike 2SLS, all coefficients are estimated simultaneously; requires three steps
if the errors in the two equations of 2SLS are correlated, 3SLS can be a suitable alternative 
RD, RR 
more information is used and hence the estimators are likely to be more efficient than 2SLS 
more vulnerable to a misspecification of the error terms
very rarely applied in epidemiologic studies
estimation process is more complicated than 2SLS
3SLS becomes inconsistent if the errors are heteroskedastic 
Structural mean models (SMM) 
SMMs use IVs via G-estimation and involve the assumption of conditional mean independence
additive SMMs use continuous outcomes and multiplicative SMMs use positive-valued outcomes
the MSMM assumes a log-linear model to measure the risk ratio
the LSMM assumes a logistic regression model, which is fitted by maximum likelihood 
RD, RR, OR 
it relaxes several of the modelling restrictions (constant treatment effects) required by the ratio estimator/two-stage methods
can be used in the case of time-dependent instruments, exposures, and confounders
provides average treatment effects for the treated subjects 
the assumption of no effect modification is impossible to verify
with a binary outcome, additive SMMs and the MSMM suffer from the limitations of linear and log-linear models (e.g., predicted response probabilities may fall outside the interval [0, 1]) 
Generalized method of moments (GMM) 
a nonlinear analogue of 2SLS
the standard IV (2SLS) estimator is a special case of a GMM estimator
makes assumptions about the moments of the error term
allows estimation of parameters in overidentified models (number of IVs greater than the number of exposure variables)
the parameters are estimated in an iterative process 
RD, RR, OR 
it requires specification only of certain moment conditions
applicable for the linear and nonlinear models
nonlinear GMM estimator is asymptotically more efficient than 2SLS
more robust and less sensitive to parametric conditions
works better than 2SLR when exposure and outcome are binary
in case of heteroskedasticity, this is more efficient than the linear IV estimators

the GMM estimator with a logistic regression model is not consistent for the COR due to the noncollapsibility of the OR 
Bivariate probit models (BPM) 
a two-stage method, but unlike 2SLS, BPMs model the probabilities directly, and these are restricted to [0, 1]
full information maximum likelihood is used to estimate the parameters
accounts for the correlation between the errors 
Probit coefficient* 
for binary outcome and exposure, BPMs perform better than linear IV methods
the BPM estimator has no direct interpretation like the OR. However, multiplying a probit coefficient by approximately 1.6 gives an approximation to the corresponding logistic regression coefficient (log OR) 
when the distribution of the error terms is not normal or the average probability of the outcome variable is close to one or zero, the BPM estimator may not be consistent for the ACE 
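
Two equivalences stated in the table, 2SLS = RE for a single binary IV without covariates and 2SRI = 2SLS under a linear model, can be checked numerically. The following is a minimal sketch on simulated data, not an implementation taken from any of the cited methods: the data-generating model, the coefficient values, and all function names (`ols`, `ratio_estimator`, `two_stage_least_squares`, `two_stage_residual_inclusion`, `simulate`) are assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def ratio_estimator(z, x, y):
    """Ratio (Wald) estimator for a single binary IV z:
    difference in mean outcome divided by difference in mean exposure."""
    dy = y[z == 1].mean() - y[z == 0].mean()
    dx = x[z == 1].mean() - x[z == 0].mean()
    return dy / dx


def two_stage_least_squares(z, x, y):
    """2SLS: regress x on z, then regress y on the fitted exposure."""
    Z = np.column_stack([np.ones_like(x), z])
    x_hat = Z @ ols(Z, x)                      # first-stage fitted values
    X2 = np.column_stack([np.ones_like(x), x_hat])
    return ols(X2, y)[1]                       # coefficient on fitted exposure


def two_stage_residual_inclusion(z, x, y):
    """2SRI with linear models in both stages: include the first-stage
    residual next to the observed exposure in the second stage."""
    Z = np.column_stack([np.ones_like(x), z])
    resid = x - Z @ ols(Z, x)                  # first-stage residual
    X2 = np.column_stack([np.ones_like(x), x, resid])
    return ols(X2, y)[1]                       # coefficient on observed exposure


def simulate(n=5000, effect=2.0, seed=0):
    """Hypothetical linear data: u is an unmeasured confounder of x and y,
    z is a binary instrument that affects y only through x."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)
    z = rng.integers(0, 2, size=n).astype(float)
    x = 0.8 * z + u + rng.normal(size=n)
    y = effect * x + 1.5 * u + rng.normal(size=n)
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate()
    naive = ols(np.column_stack([np.ones_like(x), x]), y)[1]
    print("naive OLS:", naive)
    print("RE:       ", ratio_estimator(z, x, y))
    print("2SLS:     ", two_stage_least_squares(z, x, y))
    print("2SRI:     ", two_stage_residual_inclusion(z, x, y))
```

In this linear setting the three IV estimates agree up to floating-point error, while the naive regression of the outcome on the exposure is pulled away from the simulated effect by the unmeasured confounder. The agreement is specific to linear models; with a logistic or log-linear second stage, 2SPS and 2SRI generally differ, as the table notes.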