Received Date: September 04, 2014; Accepted Date: June 16, 2015; Published Date: June 23, 2015
Citation: Oyeronke Alaba O, Olaomi JO (2015) Geo-Additive Modelling of Family Size in Nigeria. J Biom Biostat 6: 237. doi:10.4172/2155-6180.1000237
Copyright: © 2015 Oyeronke Alaba O, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
We used the 2013 Nigeria Demographic and Health Survey (NDHS) data to investigate the determinants of family size in Nigeria using a geo-additive model. The model simultaneously measures fixed, nonlinear, spatial and random effects. The fixed effects of categorical covariates were modelled using a diffuse prior, the nonlinear effects of continuous variables using P-splines with a second-order random walk prior, the spatial effects using Markov random field priors, and the random effects of community and household using exchangeable normal priors. The negative binomial distribution was used to handle overdispersion of the dependent variable. Inference was fully Bayesian. Results showed a declining effect on family size of mother's secondary and higher education, Yoruba tribe, Christianity, family planning, delivery by caesarean section and having a partner with secondary education. Family size is positively associated with age at first birth, number of daughters in a household, being gainfully employed, being married and living with a partner, and community and household effects.
Bayesian inference; Geo-additive model; Family size; Nigeria; Negative binomial
Studies have shown that, before the 19th and 20th centuries, family size was related to ecological features and means of subsistence. Family size was an indicator of the wealth of a farmer (farming being the main occupation) in West Africa. Agricultural families are characterized by large extended families. However, because of economic changes and technology, traditional family systems are no longer totally dependent on agriculture. Various sources support the contention that the family has changed as a result of industrialization and urbanization. Demographers have shown great concern about how many children it is ideal for an average family or individual to have. Such information has been of great importance for understanding trends in fertility. Considerable evidence from economically advanced countries has documented family size as a strategy to foster economic development and the social well-being of the citizenry. The household and family are the most fundamental socioeconomic institutions in human society. However, the family size mechanism is undoubtedly conditioned by the cultural, political and socio-economic setting. Another line of thought is that the factors influencing family-size desires fall into five categories: the costs and benefits of children; opportunity costs of childbearing; tastes and personal preferences; income and wealth; and childbearing itself. The dominant trend in most developed countries is a steady decline in household size from around 5 members in the middle of the 19th century to between 2 and 3 in 1990. Between 1960 and 2013, family size in the USA dropped from 3.67 to 3.12 (www.statista.com). There is still a long way to go in Nigeria, where the ideal number of children is 6.5 for all women and 7.1 for currently married women. Only 9% of women think three or fewer children is ideal (NDHS, 2013). Family size and total number of children ever born are used interchangeably in this work.
However, the pattern of family size still remains a puzzle for demographers in the industrial world.
Model-based analyses are becoming important sources of global information, largely because of the absence of reliable national-level empirical data in most sub-Saharan African countries. Family size has attracted many researchers. Keller used the chi-square test to study the determinants of family size. Oppong used an ordinal scale to assess the degree of approval of closure of the conjugal family. Snyder employed multiple linear regression and a simultaneous equation model to investigate the economic determinants of family size. Parity progression probabilities were utilized to examine how offspring affect family size. McCarthy and Oni used a two-stage analysis to investigate desired family size, which they opined is affected by five categories of factors. Other studies [12-14] utilized multiple linear regression with socio-economic determinants of men's ideal family size preference, observed linkages between the religious composition of unions and fertility behaviour, and identified age at family formation as a significant factor of family size using a cross-cultural comparison, respectively. Murphy and Wang employed simulation methods and a multi-level model to investigate how successive generations affect the number of children born. Adsera used a two-tailed test to explain the relationship between family size and factors associated with religion.
Despite the linear, nonlinear, spatial and random effects that exist among some variables, models that capture them simultaneously for family size are still lacking or scarce in the literature. It is therefore imperative to answer this question: what are the effects of fixed, nonlinear, spatial and unobserved heterogeneity on family size (a count variable) within the Bayesian context using a geo-additive model?
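The model is given as

$$\eta_r = f_1(x_{r1}) + \dots + f_k(x_{rk}) + f_{spat}(s_r) + u_r'\gamma + b_g \quad (1)$$

where
$f_i$, $i = 1, \dots, k$, are the nonlinear effects of the metrical or continuous covariates $x$
$f_{spat}$ is the spatially correlated effect of location $s_r$
$u'\gamma$ is the fixed effect of the categorical variables
$b_g$, $g \in G$, are uncorrelated (unstructured) random effects to model unobserved heterogeneity.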
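Each smooth function is approximated by a linear combination of B-spline basis functions,

$$f(x) = \sum_{t} \alpha_t B_t(x) \quad (2)$$

where $B_t(x)$ are B-splines and the coefficients $\alpha_t$ are defined to follow a first-order or second-order random walk prior. The second-order random walk is given as

$$\alpha_t = 2\alpha_{t-1} - \alpha_{t-2} + u_t \quad (3)$$

with Gaussian errors $u_t \sim N(0, \tau^2)$, where the variance $\tau^2$ controls the smoothness of $f$. This variance is estimated jointly with the coefficients of the basis functions by assigning it a weakly informative inverse Gamma prior, $\tau^2 \sim IG(\varepsilon, \varepsilon)$. A suitable diffuse prior is assumed for the fixed effects of the categorical covariates, given as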
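The second-order random walk prior on the spline coefficients can be written as a quadratic penalty: the prior density is proportional to $\exp(-\alpha' K \alpha / (2\tau^2))$, where $K = D'D$ and $D$ takes second differences. The sketch below builds $K$ with NumPy for a hypothetical basis of six coefficients (the number of basis functions is not stated in the text, and the function name is illustrative) and shows that a linear coefficient sequence is left unpenalised while an oscillating one is not.

```python
import numpy as np

def rw2_penalty(n_coef: int) -> np.ndarray:
    """Penalty matrix K = D'D of a second-order random walk on n_coef coefficients."""
    # D is the (n_coef - 2) x n_coef second-difference matrix with rows (1, -2, 1).
    D = np.diff(np.eye(n_coef), n=2, axis=0)
    return D.T @ D

# Hypothetical basis size, chosen only for illustration.
K = rw2_penalty(6)

# A linear sequence has zero second differences, so its penalty is zero.
linear_trend = np.arange(6, dtype=float)
penalty_linear = float(linear_trend @ K @ linear_trend)

# An oscillating sequence has large second differences and is penalised.
wiggly = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
penalty_wiggly = float(wiggly @ K @ wiggly)
```

Because $K$ has rank two less than its dimension, the prior is improper in the direction of constant and linear trends, which is why a smaller $\tau^2$ pulls the fitted function towards a straight line.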
$$p(\gamma) \propto \text{const} \quad (4)$$
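The spatial effects follow Markov random field priors

$$f_{spat}(i) \mid f_{spat}(j),\ j \neq i \ \sim\ N\left(\frac{1}{N_i}\sum_{j \in \partial_i} f_{spat}(j),\ \frac{\tau_{spat}^2}{N_i}\right) \quad (5)$$

where
$N_i$ is the number of adjacent sites and $\partial_i$ is the set of neighbours of site $i$
$\tau_{spat}^2$ is the spatial variance, which controls the spatial smoothness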
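Under a Markov random field prior, the effect of a site given all others is normal, with mean equal to the average effect of its neighbours and variance equal to the spatial variance divided by the number of neighbours. A minimal sketch on a hypothetical four-site map follows; the site labels, adjacency structure, effect values and the value of the spatial variance are all made up for illustration and are not taken from the Nigerian state map.

```python
def mrf_conditional(site, neighbours, f_spat, tau2_spat):
    """Conditional mean and variance of one site's effect given its neighbours."""
    nbrs = neighbours[site]
    n_i = len(nbrs)                                # number of adjacent sites
    mean = sum(f_spat[j] for j in nbrs) / n_i      # average of neighbouring effects
    var = tau2_spat / n_i                          # more neighbours -> tighter prior
    return mean, var

# Hypothetical adjacency list and current spatial effects.
neighbours = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}
f_spat = {"A": 0.10, "B": -0.20, "C": 0.30, "D": 0.00}

mean_b, var_b = mrf_conditional("B", neighbours, f_spat, tau2_spat=0.003)
```

Sites with many neighbours are therefore smoothed more strongly towards the local average than sites on the edge of the map.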
The random effects $b_g$ were modelled from exchangeable normal priors, $b_g \sim N(0, \tau_b^2)$, where $\tau_b^2$ is the variance that accounts for overdispersion and heterogeneity.
We assigned highly dispersed but proper priors to all variance components. An inverse Gamma distribution with hyperparameters a and b is chosen, such that $\tau^2 \sim IG(a, b)$. Standard choices of hyperparameters are a=1 and b=0.005, or a=b=0.001 (which is close to Jeffreys' non-informative prior) [18,20]. These values can be varied to assess how sensitive the results are to the choice of hyperparameters.
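To see what the two standard hyperparameter choices imply, the sketch below evaluates the inverse Gamma density in the shape and scale parameterisation, $p(\tau^2) \propto (\tau^2)^{-a-1} \exp(-b/\tau^2)$, and its mode $b/(a+1)$. This parameterisation is an assumption about the convention intended in the text, and the function names are illustrative.

```python
import math

def inv_gamma_logpdf(x: float, a: float, b: float) -> float:
    """Log density of IG(a, b) with shape a and scale b, evaluated at x > 0."""
    return a * math.log(b) - math.lgamma(a) - (a + 1.0) * math.log(x) - b / x

def inv_gamma_mode(a: float, b: float) -> float:
    """Mode of IG(a, b)."""
    return b / (a + 1.0)

# The two standard choices quoted in the text.
choices = {"a=1, b=0.005": (1.0, 0.005), "a=b=0.001": (0.001, 0.001)}
modes = {label: inv_gamma_mode(a, b) for label, (a, b) in choices.items()}

# Log prior density of a small variance under each choice, for comparison.
log_density_at_001 = {label: inv_gamma_logpdf(0.01, a, b)
                      for label, (a, b) in choices.items()}
```

Both priors place their mode at a very small variance but have heavy right tails, which is what makes them weakly informative in practice.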
Let $\alpha = (f, f_{spat})$, let $\tau$ represent the vector of all variance components and let $\beta$ be the vector of fixed effects parameters. The posterior probability distribution is then

$$p(\alpha, \tau, \beta \mid y) \propto p(y \mid \alpha, \beta, \tau)\, p(\alpha)\, p(\beta)\, p(\tau) \quad (6)$$

where
$p(y \mid \alpha, \tau, \beta)$ is the likelihood function of the data given the parameters of the model (based on the dependent variable)
$p(\alpha)\, p(\beta)\, p(\tau)$ are the prior densities of all the parameters
The Deviance Information Criterion (DIC) is employed for comparison of the models:

$$DIC = \bar{D} + p_D \quad (7)$$

where
$\bar{D}$ is the posterior mean of the deviance
$p_D$ is the effective number of parameters (not equal to degrees of freedom)
Small values of $\bar{D}$ and $p_D$ indicate a better-fitting and a more parsimonious model, respectively. The model with the lowest DIC is the best. The unknown posterior distribution is estimated within the Bayesian framework using Markov chain Monte Carlo (MCMC) simulation from the full conditionals.
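Since the effective number of parameters is the posterior mean deviance minus the deviance at the posterior mean of the parameters, the DIC can equivalently be computed as the deviance at the posterior mean plus twice pD. The sketch below checks this identity against the M1 and M2 rows of Table 1, assuming that the first numeric column there is the deviance at the posterior mean; the column labels did not survive in the table, so this reading is an assumption, although the reported numbers are consistent with it.

```python
def dic_from_plugin_deviance(deviance_at_mean: float, p_d: float) -> float:
    """DIC = D(theta_bar) + 2 * pD, equivalent to D_bar + pD."""
    return deviance_at_mean + 2.0 * p_d

# (assumed deviance at posterior mean, pD, reported DIC) for M1 and M2 in Table 1.
table1 = {
    "M1": (21808.642, 24.659, 21857.960),
    "M2": (21728.681, 47.053, 21822.787),
}

recomputed = {model: dic_from_plugin_deviance(dev, p_d)
              for model, (dev, p_d, _) in table1.items()}

# Among these two rows only, the model with the lower reported DIC is preferred.
best_of_two = min(table1, key=lambda model: table1[model][2])
```

The comparison here covers only the two rows whose values are available; the text reports M4 as the overall best model.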
The data used for this study were drawn from the Nigeria Demographic and Health Survey (NDHS) for 2013 (www.measuredhs.com). The 2013 NDHS was conducted by the National Population Commission (NPC) with funding support from the U.S. Agency for International Development (USAID), the United Nations Population Fund (UNFPA) and the United Kingdom Department for International Development (DFID). Technical support was provided by ICF International. The 2013 NDHS sample was selected using a three-stage stratified design consisting of 904 clusters, 372 in urban areas and 532 in rural areas. In the 2013 NDHS dataset, 40,320 households were selected, out of which 38,522 were interviewed. In the interviewed households, 39,902 women of childbearing age (15–49 years) and 18,229 men were found eligible for the interview. This represents a response rate of 99% for households, 98% for women and 95% for men. This study is based on the survey data with all participant identifiers removed. Although the comprehensive and detailed dataset covers many covariates on population and health issues in Nigeria, we focused on the total number of children ever born as the dependent variable. The total number of children ever born has mean 4.35, variance 6.786, skewness 0.828 and range 17 (Figure 1). The data are overdispersed, since the variance exceeds the mean. Equidispersion is often a mirage in real-life studies, and inappropriate imposition of Poisson regression will underestimate the standard errors and overstate the significance of regression parameters. The negative binomial distribution has been suggested as an alternative to Poisson regression when the data are overdispersed [24-26].
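The overdispersion claim can be checked directly from the reported summary statistics. The sketch below computes the variance-to-mean ratio, which equals 1 under a Poisson model, and a rough method-of-moments value for the negative binomial size parameter under the common parameterisation $Var(y) = \mu + \mu^2 / r$. This is a marginal check that ignores covariates, and the parameterisation is an assumption; it is not the estimate produced by the fitted model.

```python
# Summary statistics reported for total children ever born.
mean_ceb = 4.35
var_ceb = 6.786

# Poisson implies variance == mean, i.e. a ratio of 1.
dispersion_ratio = var_ceb / mean_ceb

# Method of moments for the negative binomial size r under Var(y) = mu + mu^2 / r.
nb_size = mean_ceb ** 2 / (var_ceb - mean_ceb)

is_overdispersed = dispersion_ratio > 1.0
```

A ratio of about 1.56 means the variance is roughly 56% larger than a Poisson model would allow, which motivates the negative binomial likelihood.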
The socio-economic variables used as explanatory variables for family size are educational attainment (EDUAT), body mass index (BMI), ethnicity (ETHNI), age at first birth (AGEFB), marital status (MARST), religion (RELIG), place of residence (RESID), wealth index (WEIND), family planning (FAMPL), number of daughters (DAUGH), number of dead children (CHDEA), method of delivery (DELIV), work status (WOKST), region (REGIO), state, community (COMUN), household (HHOLD) and partner education (PATED).
The model is specified as

$$\log(\mu_i) = w_i'\gamma + f(\mathrm{AGEFB}_i) + f(\mathrm{BMI}_i) + f_{spat}(s_i) + b_{i1} + b_{i2} \quad (8)$$

where
$\mu$ is the mean number of children ever born per woman
$w'\gamma$ is the vector of fixed effects of the categorical covariates EDUAT, RELIG, MARST, RESID, WEIND, ETHNI, FAMPL, DAUGH, CHDEA, DELIV, WOKST, REGIO and PATED
$f(\mathrm{AGEFB})$ and $f(\mathrm{BMI})$ are the unknown smooth functions of AGEFB and BMI, which are continuous and nonlinear
$f_{spat}$ is the spatial effect
$b_{i1}$ and $b_{i2}$ are the community and household effects, respectively
We considered four models to investigate the best approach to modelling family size in (8). In the first model (M1), all the categorical variables, AGEFB and BMI were treated as fixed, so that their effects were estimated linearly. We used effect coding for all the categorical variables. In the second model (M2), we added the spatial effect to determine the magnitude of family size across the states. In the third model (M3), we introduced the unobserved random effects of household and community, while the fourth model (M4) combines the linear effects of the categorical variables, the nonlinear effects of the continuous variables, the spatial effect and the unobserved random community and household effects. The four models were implemented in BayesX version 2.1. We carried out 15000 iterations, with the first 2000 discarded as a burn-in sample, and thinned the remaining 13000 by keeping every 10th iteration for parameter estimation. Convergence and mixing were monitored by plotting the sampling paths and autocorrelations. Sensitivity analysis was carried out by varying the hyperparameters. The choices considered were a=1 and b=0.005, a=b=0.005, and a=b=0.001 (default). We report the results for the last of these, as the results were less sensitive to variation in the choice of hyperparameters.
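The sampling scheme described above leaves 1300 draws per parameter. The sketch below reproduces the bookkeeping with NumPy on a simulated stand-in chain (independent normal draws, not output from BayesX) and includes a simple lag-1 autocorrelation helper of the kind used to judge mixing; the helper name is illustrative.

```python
import numpy as np

n_iter, burn_in, thin = 15000, 2000, 10

# Stand-in for the sampling path of one parameter; a real chain would come from the sampler.
rng = np.random.default_rng(0)
chain = rng.normal(size=n_iter)

# Drop the burn-in, then keep every 10th draw of the remaining 13000.
kept = chain[burn_in::thin]

def lag1_autocorr(x: np.ndarray) -> float:
    """Sample lag-1 autocorrelation of a chain."""
    centred = x - x.mean()
    return float(centred[:-1] @ centred[1:] / (centred @ centred))

n_kept = kept.size
rho1 = lag1_autocorr(kept)
```

For a well-mixing chain the lag-1 autocorrelation of the thinned draws should be close to zero; values near 1 would suggest a longer run or heavier thinning.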
Presentation and discussion of results
The primary outcomes of the four models are summarized in Table 1. Model 1 gave a parsimonious model with 24.659 effective parameters, while the best of the negative binomial models, based on the lowest DIC of 21173.041, is M4. The regression coefficients were very similar in the other three models. Precision is enhanced in M4; we therefore present the results of M4, which gave the best fit. Results of the posterior negative binomial regression are given in Table 2. Regional differences are evident from the results: women from the North Eastern, South Eastern and South Southern parts tend to have more children. Women in urban areas show a desire for large family size, which contradicts the documented literature. Education of mothers at the higher level is inversely related to having a large family size, with a mean of -0.123. Low education (primary) showed a desire for more children, with a mean of 0.0845 [29,30]. Women from the Ibo and Hausa ethnic groups tend to have more children than Yoruba women. The middle-class wealth index showed a desire for more children, with a mean of 0.006, while the richer and richest wealth indices showed a reducing effect, with a mean of -0.009. Religion plays a significant role in family size: Christianity reduces the desire for a large family size, which can be further explained by the fact that modern Christianity encourages monogamy. Astonishingly, Islam, which encourages polygamy, showed a declining effect on family size, with a mean of -0.013. The negative effect of family planning on family size is well documented. This study further supports the reducing effect of family planning on family size, with a mean of -0.039. It is not surprising that a married woman who lives with her spouse is at a higher risk of having more children, as shown in our result with a mean value of 0.090.
A positive relationship exists between partners having only primary education and large family size, while the relationship is negative for partners who have secondary school education [3,12]. However, in our results, partners with higher education showed a positive association. This may be explained by the fact that higher education can be associated with a higher income to cater for more children. Mothers who gave birth by caesarean section or who have lost at least one child do not desire a large family. The desire for more children is high for women who have only daughters. In fact, Ali concluded that until women have at least one son, the family size is incomplete.
|Model||Deviance||pD||DIC|
|M1: All variables fixed||21808.642||24.659||21857.960|
|M2: All variables fixed + spatial effect||21728.681||47.053||21822.787|
|M3: All variables fixed + spatial + community + household effect|
|M4: All categorical fixed + nonlinear of continuous variable + spatial + community + household effect|
Table 1: Summary of diagnostic accuracy of the four models.
|Variable||Mean||Std. dev.||95% CI|
|Region|
|North Central (ref.)||0|
|North East||0.001||0.024||(-0.046, 0.050)|
|North West||-0.019||0.025||(-0.074, 0.029)|
|South East||0.045||0.027||(-0.004, 0.100)|
|South West||-0.004||0.026||(-0.054, 0.050)|
|South South||0.024||0.022||(-0.016, 0.070)|
|Place of Residence|
|Mother’s Educational Attainment|
|No education (ref.)||0|
|Primary||0.085||0.006||( 0.072, 0.097)*|
|Other ethnic groups (ref.)||0|
|Middle Class||0.006||0.005||(-0.004, 0.016)|
|No method (ref.)||0|
|Married and living with partner||0.091||0.008||( 0.076, 0.106)*|
|No education (ref.)||0|
|Mother’s Working Status|
|Not Working (ref.)||0|
|Working||0.058||0.003||( 0.052, 0.064)*|
|Mode of Delivery|
|Normal delivery (ref.)||0|
|Caesarean section||-0.049||0.012||(-0.072, -0.026)**|
|Sex of Children|
|Daughters||0.277||0.003||( 0.271, 0.283)*|
|The continuous variables|
|Age at first birth||0.006||0.005||( 0.001, 0.019)*|
|Body mass index||0.001||0.001||( 0.003, 0.004)*|
|The spatial variable|
|States (36) and FCT||0.003||0.001||( 0.001, 0.006)*|
|The Random effect|
|Community||0.005||0.001||( 0.004, 0.006)*|
|Household||0.001||0.001||( 0.001, 0.002)*|
Table 2: Posterior estimates of M4 within 95% credible interval (CI).
At the 95% credible interval (CI), the fixed effects with significant negative estimates are higher education of mother (-0.148, -0.097), secondary education of mother, Yoruba tribe, Christianity, family planning, partner's secondary education, caesarean section and child death (-0.219, -0.208), while the significant positive estimates are urban residence (0.003, 0.023), mother's primary education, being married and living with partner, mother working and having only daughters.
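An effect is read as significant here when its 95% credible interval excludes zero. A minimal sketch of that rule, applied to a few intervals copied from Table 2, is given below; the function name and the dictionary layout are illustrative only.

```python
def classify(lower: float, upper: float) -> str:
    """Label an effect by whether its credible interval lies above, below or across zero."""
    if lower > 0:
        return "positive"
    if upper < 0:
        return "negative"
    return "not significant"

# 95% credible intervals taken from Table 2.
intervals = {
    "Primary education (mother)": (0.072, 0.097),
    "Caesarean section": (-0.072, -0.026),
    "South East": (-0.004, 0.100),
    "Middle Class": (-0.004, 0.016),
}

labels = {name: classify(lo, hi) for name, (lo, hi) in intervals.items()}
```

Under this rule the South East and middle-class estimates, although positive on average, cannot be distinguished from zero.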
The posterior nonlinear effects of BMI and age at first birth (in years) on family size were positive, with mean values of 0.001 and 0.006, respectively (Table 2). The posterior results showed that women who are obese (BMI > 35.2) have a strong desire for large family size (Figure 2).
There is a decline in the desire for large family size as women grow older, as depicted in Figure 3. However, the 95% CIs for BMI and age at first birth showed significant positive effects, with (0.003, 0.004) and (0.001, 0.019), respectively. Women in Yobe, Kano, Benue, Edo and Bayelsa have significantly higher tendencies to have more children, while women in Kebbi, Niger, Kwara, Oyo, Osun, Ekiti and Lagos showed significantly lower tendencies to have a large family size (Figure 4). There is an overall spatial effect on family size. Household and community effects were also positive and significant in explaining family size.
This study revealed that education, ethnic group, religion, use of family planning, marrying a partner who is educated, loss of at least one child and giving birth by caesarean section explain low family size.
We appreciate the permission granted by www.measuredhs.com to use the Nigerian Demographic Health Survey (NDHS) 2013 data.