Prediction of Longterm Pattern and its Extreme Event Frequency of Rainfall in Dire Dawa Region, Eastern Ethiopia
Rediat Takele^{*} and Solomon Gebretsidik  
Jigjiga University, Jigjiga, Somali regional state, Ethiopia  
Corresponding Author :  Rediat Takele Jigjiga University, Jigjiga Somali regional state, Ethiopia Tel: 0917634724 Email: [email protected] 

Received February 28, 2015; Accepted March 20, 2015; Published March 23, 2015  
Citation: Takele R, Gebretsidik S (2015) Prediction of Longterm Pattern and its Extreme Event Frequency of Rainfall in Dire Dawa Region, Eastern Ethiopia. J Climatol Weather Forecasting 3:130. doi:10.4172/23322594.1000130  
Copyright: ©2015 Takele R, et al. This is an openaccess article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.  
Related article at Pubmed Scholar Google 
Visit for more related articles at Journal of Climatology & Weather Forecasting
Rainfall is the most critical and key variable both in atmospheric and hydrological cycle. Its patterns usually have spatial and temporal variability. Its variability is assumed to be the main cause for the frequently occurring climate extreme events such as drought and flood. In this study, spectrum analysis, crossspectral analysis as well as seasonal autoregressive integrated moving average (SARIMA) model were used in intending to predict the pattern and its extreme event frequency of rainfall in Dire Dawa Region based on data obtained from Dire Dawa and adjacent stations: Dengego and Haramaya. The result indicated that the amount of rainfall at Dengego and Haramaya are more or less the same on average in all seasons and much higher than that of Dire Dawa during last 30 year study period. The variability of annual rainfall in Dire Dawa during the study period is a bit larger than neighboring station’s rainfall (Dengego and Haramaya), indicating that climate instability is high in Dire Dawa than other nearby stations. The result also indicates that relatively there is a tendency of increasing pattern in average annual rainfall of Dire Dawa in forecasted period. In the region, the rainfall extreme event like flood predicts or inferred to be recurring at about 4.17 years. Moreover, the rainfall periodicities of Dengego and Dire Dawa are found to be more likely associated, which implies that rainfall extreme event frequency in two districts is statistical significantly associated. It is proposed that assessing the factors that cause the fluctuation in the pattern and frequency of rainfall distribution in the region is needed to be study.
Keywords  
Extreme event; Climate variability; Periodicities; SARIMA; Spectral analysis  
Introduction  
Located within the tropics, Ethiopia has great geographical diversity with high and rugged mountainous, flattopped plateaus, deep gorges, etc. Its altitudinal range lies between 120 m below sea level and 4600 m above sea level [1]. The differences in altitude and relief create a large variation in climate in various regions of the country [2]. Several regions receive rainfall throughout the year, but in some regions rainfall is seasonal and low making irrigation necessary [3]. Particularly, Semiarid regions receive very small, irregular, and unreliable rainfall. Rainfall is the most critical and key variable both in atmospheric and hydrological cycle. Its patterns usually have spatial and temporal variability. These variabilities affect agricultural production, water supply, transportation, environment and urban planning and the existence of its people. Its variability assumed to be the main cause for the frequently occurring climate extreme events such as drought and flood.  
Ethiopia is one of the countries whose economy is highly dependent on rainfed agriculture and also facing recurring cycles of flood and drought. Current climate variability is imposing a significant challenge to Ethiopia in general and Dire Dawa in particular. Metrologically, Dire Dawa Administration is characterized by an arid and semiarid climate, thus, receives low and erratic rainfall [4]. Despite the fact that, sloppy topography of the Dire Dawa administration are surrounded by the neighboring mountainous areas (Haramaya and Dengego). These are the main catchments for contributing to the disaster flood and drought event in Dire Dawa. Flood disasters are occurring more frequently in the urban area, and are having an ever more dramatic impact in terms of the costs on lives, livelihoods and environmental resources. The May 1984 catastrophic flood claimed 42 people, and property worth 10 million birr was lost. On August 2006 claimed 650 people, displaced 35,000 people, and caused direct damage of huge infrastructure estimated at 100 million Ethiopian Birr and indirect damage of similar magnitude. On April 2010 though there was the property of worth 28 million birr had been lost. On the other hand, the area has often experienced and affected by frequent and prolonged drought claiming the lives of millions of people. The impacts of the 2004 and 2005 droughts incident posed food shortage up to 85% of the rural population [5].  
Methods of prediction of rainfall extreme events have often been based on studies of physical effects of rainfall or on statistical studies of rainfall time series [2,3,6]. The study in various region of African shows that rainfall has Periodic tendencies [7,8]. A study made by Mersha and Seifu [9,10] on rainfall cyclicity over selected stations in Ethiopia shows that there appears to be periodic tendency in the annual rainfall series, particularly, Gode, Dire Dawa, Jigjiga, Negelle, and Debre Zeit station. Haile [11] found that drought occur at every 68 years in the semiarid regions of Ethiopia. Tsegay [12] on his study the occurrences of drought and frequencies of rainfall suggest that drought occur every 35 years in Eastern Ethiopia. On the other side, the study by Yilma and Alamerew et al. [2,3] reach the conclusion of rainfall extreme event recurs in Addis Ababa region between 10 to 11 years.  
Awareness about the characteristics of the rainfall over an area such as the source, quantity, variability, distribution and the frequency of rainfall is essential for efficient control and management of water resources, the implication in utilization and associated problems. The impacts of climate variability and climate change are evident on almost all socioeconomic sectors. The output of this study can play an important role in the design of hydrological structures in the area of study and help the policy makers in improve their decisions by taking into consideration the available and future water resources.  
Materials and Methods  
Data and variable of the study  
A time series of monthly 30 years rainfall data in mm for the period January, 1984 to January, 2014 collected by the National Meteorological Agency of Ethiopia were used in the study. The data were collected from the synoptic stations of Dire Dawa and adjacent station: Dengego and Haramaya for further investigation.  
Methodology  
The development of climatology as a science has given rise to growing statistical applications on climatic information. Conducting investigations using standard statistical methodologies is an essential step in the development of climatology [13]. For instance, time series analysis is used in order to evaluate the temporal and spatial behavior of rainfall [14]. In this study a univariate BoxJenkins Methods, in particular, Seasonal Autoregressive Integrated Moving Average (SARIMA) methods and spectral analysis and crossspectral analysis are employed [15].  
Seasonal autoregressive integrated moving average (SARIMA) modeling is one of the most widely implemented methods for analyzing univariate time series data that exhibits a seasonal variation such as temperature and rainfall, which have strong components corresponding to seasons. Hence, the natural variability of many physical, biological, and economic processes tends to match with seasonal fluctuations [16]. The seasonal autoregressive integrated moving average model, denoted by SARIMA (p,d,q)*(P,D,Q)_{s}, as in Shumay and Stoffer [17] and is given by:  
Φ_{P}(B^{S}) = α+ ΘQ(BS)θ(B)w_{t} (1)  
where wt is the usual Gaussian white noise process. The ordinary autoregressive and moving average components are represented by polynomials and θ (B) of orders p and q, respectively and the seasonal autoregressive and moving average components by Φ_{P} (B^{S}) and Θ_{Q}(B^{S}) of orders P and Q. The ordinary and seasonal difference components can be written as: and  
Test for stationary assumption  
Before developing a BoxJenkins modeling process, it is important to check whether the data under study meets basic assumptions such as stationary. A time series is said to be stationary if there is no systematic change in mean (no trend), if there is no systematic change in variance and if periodic variations have been removed. Many testing procedures for testing stationary are employed in the literature.  
Time plot: This procedure may reveals seasonality, trends either in the mean level or the variance of the series, long term cycles, and so on. If any such patterns are present, then these are signs of nonstationary.  
The correlagram test: test based on the plot its sample autocorrelation function and sample partial autocorrelation function. Both function used to reveal important information regarding given series stationary. Wei [18] states that if the sample ACF decays very slowly, which indicates that series is not stationary.  
The augmented dickeyfuller (ADF) test: This test first poses the null hypothesis that the given time series has a unit root, which means that the time series is nonstationary. That is:  
H_{0}: Φ=1 (has a unit root) Vs H_{1}: Φ < 1 (has root outside the unit circle)  
To use the Augmented DickeyFuller test the following test equation will be used:  
(2)  
Where:∇x_{t} is the first differenced value of series , wt is the error term, is the jth lagged first differenced of values of are parameters to be estimated  
Variance comparisons: The characteristics of the sample variances associated with different orders of differencing can provide a useful means of deciding the appropriate order of differencing [19] cited also in Alamarew et al. [3]. The sample variances will decrease until a stationary sequence has been found.  
Building seasonal ARIMA models  
To identify a perfect ARIMA model for a particular time series data, Box and Jenkins [16] proposed a methodology that consists of four phases: i) Model identification; ii) parameters Estimation; iii) Diagnostic checking for the identified model and iv) Forecasting.  
Model identification: The purpose of the identification stage is to determine the differencing required achieving stationarity and also the order of both the seasonal and the non seasonal AR and MA operators for the residual series. The autocorrelations function (ACF) and the partial autocorrelation functions (PACF) are the two most useful tools in any attempt at time series model identification. To identify the order of model parameters, we examine the ACF and PACF based on the theoretical pattern using Tables 1 and 2 as summarized in Shumway and Stoffer [17].  
Parameter estimation: After choosing the most appropriate order of the model, the next step is to estimate the model parameters by using several estimation procedures like maximum likelihood methods and least square methods.  
In time series analysis, there may be several adequate models that can be used to represent a given data set, and hence, information criteria are used for model comparison. The AIC and SBC is a mathematical selection criterion of model building by select the model that gives the minimum of the AIC and SBC defined by Shumway and Stoffer [17]:  
(3)  
Where: denotes the maximum likelihood estimator for the error variance and kis the order of seasonal and nonseasonal autoregressive and moving average parameters to be estimated in the model, that is, according to Wei [18], k=p + q + P+ Q + 1 and n is the number of observations in the series.  
Diagnostic checking: After fitting a provisional time series model, we can assess its adequacy in various ways. Most diagnostic tests deal with the residual assumptions in order to determine whether the residuals from fitted model are independent, have a constant variance, and are normally distributed. Several diagnostic statistics and plots of the residuals can be used to examine the goodness of fit of the tentative model to the time series data.  
The first approaches that can be used to evaluate the adequacy of a model are the plot of the errors over time, which can be written Shumway and Stoffer [17]:  
(4)  
Where is the onestepahead prediction of based on the fitted model and is the estimated onestepahead error variance. If visual inspections of the errors reveal that they are randomly distributed over time, then we have a good model.  
The autocorrelations function (ACF) of the series can also be used to examine whether the residual of the fitted model is white noise or not. Under the null hypothesis that residual follows a white noise process, roughly 95% of the autocorrelation coefficient should fall within the range ± 1.96/√T [20].  
LjungBox Q (LBQ) test: It can be employed to check independence of residual instead of visual inspection of the sample autocorrelations. This test uses the null hypothesis that  
Ho: and thetest statistic is calculated as Ljung and Box [21].  
(5)  
Where: n′ = (n − d) , is the number of observations in the original time series, μ^{2}_{j} is the sample autocorrelation of the residuals at lag j and d is the degree of non seasonal differencing used to transform the original time series values into stationary time series. If a model is correctly specified, residuals should be uncorrelated and Q(r) should be small (p value should be large).  
Runs test: The runs test can be used to decide if a data set is from a random process. The first step in the runs test is to compute the sequential differences (Yi − Yi−1). If Yi > Yi−1a 1 (one) is assigned for an observation and a 0 (zero) otherwise. let T be the number of observations, Ta be the number above the mean, Tb be the number below the mean and R be the observed number of runs. Then the test statistic for this test is given as Cromwell et al. [22]:  
(6)  
where:  
The test of series randomness is rejected if the calculated ZN value exceeds the selected critical value obtained from the standard normal distribution table.  
Turning point test: A turning point means when the series changes from increasing to decreasing or vice versa. That is, X_{t−1}<X_{t}> X_{t+1} or X_{t−1}>X_{t}< X_{t+1}. Let T=the number of turning points in an n period series. The hypothesis of residual whitenoise should be rejected if the absolute value of N_{T}(eqn.(7)) >N_{T(1−α/2)} , where N_{T(1−α/2)} is the (1α/2) quartile of standard normal distribution Cromwell et al. [22]:  
(7)  
where  
Test for normality of the residuals: The ShapiroWilk Test, histogram and QQ plot of a data set give information related to normality. The null hypothesis for any test of normality is that the data are normally distributed. The ShapiroWilk statistic “W”, which used as a base for the test of normality of residual in ShapiroWilks test, is given as [23]:  
(8)  
Procedure for checking of homosecdasticity  
GoldfeldQuandt test: If there is a change in variance (heteroscedasticity) of residuals, a transformation is necessary for the data. For this test:  
If test statistic: is smaller than the critical value , the residuals are assumed to be homoscedastic.  
BreuschPagan test: This involves applying ordinary leastsquare (OLS) to:  
and calculating the regression sum of squares (RSS). The test statistic is: Reject the null hypothesis of homoscedasticity: if :  
If the constant variance and normality assumptions are not true, they are often reasonably well satisfied when the observations are transformed by a BoxCox transformation [17]:  
(9)  
If the selected model is inadequate, the threestep model building process is typically repeated several times until a satisfactory model is finally obtained. The final selected model can then be used for prediction purposes [18].  
Forecasting: The last step in time series modeling is forecasting.  
There are two kinds of forecasts: sample period forecasts and postsample period forecasts. The former are used to develop confidence in the model and the latter to generate genuine desired forecasts. In forecasting, the goal is to predict future values of a time series, x_{t+m}, m =1, 2,... based on the data collected to the present, x={x_{t}, x_{t−1},..., x_{1}}. Then explicit forms of the model for the observation x_{t+m} generated by the ARIMA process may be expressed as follows (Box and Jenkins) [16]:  
(10)  
Forecasting accuracy measures  
Once forecasts are made they can be evaluated if the actual values of the series to be forecasted are observed. To assess the precision of the forecasts, prediction interval can be written as [17]:  
, Where is chosen to get the desired degree of confidence. Where: Mean square prediction error,  
There are also some measurements of the accuracy of forecasts, which evaluate the accuracy by comparing with observed value are Mean Square Error (MSE); Mean Absolute Error (MAE); Mean Absolute Percentage Error (MAPE); Theil’s inequality coefficient (UStatistics). The scaling of Uis such that it will always lie between 0 and 1. If U=0, there is a perfect fit; if U=1 the predictive performance is as bad as it possibly could be. Moreover, in evaluating the performance of the forecasting models, the lower the RMSE, MAE, MAPE, the better the forecasting accuracy [24].  
Spectral analysis: Quite often, hydrologic phenomenon depicts cyclic and stochastic processes. The Spectral analysis based on periodogram method has dominated hydrology for years because of underlying periodicities in hydrologic processes [25]. The main objective of spectral analysis is to estimate and study the spectrum. It is therefore concerned with estimating the unknown spectrum of the process from the data and quantifying the relative importance of different frequency bands to the variance of the process [26]. The spectrum of a time series is the distribution of variance of the series as a function of frequency. The area under the curve in spectrum is proportional to the variance. Hence, we can be estimating the power spectrum by the raw periodogram method as [17]:  
(11)  
where and the frequencies (w_{j}) are related to the size (n) by:  
By considering the fact that the periodogram estimates (eqn.11) are independent and identically approximated as χ^{2} distributed with vdegree of freedom, the approximate 100(1 )% confidence interval for the spectrum:  
(12)  
The raw periodogram is a wildly fluctuating estimate of the spectrum with high variance. For a stable estimate, the periodogram must be smoothed. Bloomfield [27] recommends the Daniell window as a smoothing filter for generating an estimated spectrum from the periodogram. The smoothed periodogram is defined as [17]:  
Where the weights hk> 0 satisfy,  
(13)  
The result in eqn. (13) can be rearranged to obtain an approximate 100(1 − α) % confidence interval for the best estimator of spectrum, f(ω) of the form  
(14)  
Cross spectral Analysis: Crossspectrum analysis examines the relationship between a pair of series. The main objective of crossspectrum analysis is to determine the existence of correlation between two variables and to simultaneously separate each of two time series in to its harmonic component. The principal tool that measures the strength of relation between two series in crossspectral analysis is squared coherence function defined as [17]:  
(15)  
Then spectral Matrix of a Bivariate Process is given by:  
(16)  
An estimate of the above spectral matrix is similar to the eqn. (13) defined in power spectra. Then an estimate of the squared coherence between two series, x_{t} and yt is given by:  
(17)  
At a particular significance level (α ), the approximate critical value that must be exceeded for the original squared coherence to be able to reject = 0 at a specified frequency is [17]:  
(18)  
Result and Discussions  
The statistical software package used for most of the analysis is by the help of R15.10.1 and for some test and graphs; however, SAS9.2 was used.  
Descriptive analysis  
Here it can be described rainfall of Dire Dawa, Haramaya and Dengego station and mainly focused on analyzing monthly as well as annually rainfall data of Dire Dawa station recorded by the National Meteorological Agency of Ethiopia (NMAE). From descriptive statistics result, it can be observed that the mean annual rainfall of Dire Dawa, Dengego and Haramaya are 611 mm, 774 mm and 772 mm, respectively. In Dire Dawa, the minimum annual rainfall of 228.3 mm was recorded during the year 1984 and the maximum of 719.7 mm in the year 1997. The amount of rainfall at Dengego and Haramaya are more or less the same on average in all seasons, and is much higher than that of Dire Dawa over the study period. NMA [28] documented that a series with coefficient of variation(CV) less than 0.20 can be considered as less variable, between 0.20 and 0.30 is moderately variable, and greater than 0.30 is highly variable. The monthly rainfall data in all stations show that there is high variability since their CV is greater than 0.30. However, relatively the monthly variability of rainfall of Dire Dawa and Dengego are similar. When we see the overall variability, the high variability occurs during months of dry seasons while the low variability is observed during the rainy season in all stations. The variability of annual rainfall in Dire Dawa during the last 30 year period is a bit larger than neighboring stations. This may indicate that climate instability is high in Dire Dawa than other stations.  
Testing stationary assumption  
Visual Inspection: Regardless of which technique is used, the first step in any time series analysis is to plot the observed values against time. The pattern of the time series plot in Figure 1 does not show any systematic upward and downward change about the mean. This indicates that the series is nonseasonally stationary. Moreover, the sample autocorrelation function (SACF) plot in Figure 2, the presence of seasonality behavior and seasonally nonstationarity of the mean monthly rainfall series is clear. Because it shows strong seasonal wave pattern at the multiple of seasonal intervals(s=12) and declining slowly while nonseasonal lags are relatively decaying quite rapidly. This can be interpreted as a 6month seasonal pattern that cycles between summer when there is little to no rainfall, and winter when rain is at its peak.  
Therefore, from the actual series plot and sample autocorrelation plot (Figures 1 and 2) we observed that our series has seasonal variation, and seasonal differencing is needed to make it stationary.  
Unit root test: This test first impose that the mean monthly rainfall series has a unit root, versus the series is stationary. We test the null hypothesis using the Augmented DickeyFuller (ADF) testto know whether our series are stationary or not at level or after differencing. From Figure 1, we observe that the rainfall series of Dire Dawa doesn’t have a trend and potentially slow turn around zero. Therefore, we can use the test equation given in equation (2). For our series, maximum lag lengths (P) of the ADF test to remove serial correlation from the residuals of the regressions were selected as 12. The results of ADF test for the mean monthly rainfall series of Dire Dawa at level (without differencing), after first regular and seasonal differencing shows that, the ADF test statistic for the original and first regular differencing monthly rainfall series are greater than the critical values at 1%, 5% and 10% significance levels. This implies that, the null hypothesis, which has a unit root, for the series should not be rejected at all significance levels. This figure further confirms that original series as well as series obtained after first regular differencing are not stationary. However, as also suggested by visual analysis, after first seasonal differencing, the computed ADF test statistic are smaller than the critical values at 1%, 5% and 10% significant levels. This leads to the rejection of the test that there is a unitroot problem.  
Variance Comparison: Increase in the differencing order tends to increase the variance indicating over differencing [29]. In this study the following results were obtained:  
Var( ∇x_{t} )=5127.7, Var(x_{t})=3252.7, Var( ∇_{12}x_{t})=2704.76, and (∇_{12}^{2} ) Var x_{t} =7981.4 values. From these it is clear that: Var( ∇x_{t}) >Var(x_{t}) and Var (∇_{12}^{2} ) x_{t}>Var(x_{t}) >Var( ∇_{12}x_{t}). These results again suggest that nonseasonal first differencing ( ∇ x_{t} ) has been overdifferenced and hence the original series is nonseasonally stationary. The first seasonal differencing would rather be important (Figure 3).  
Consequently, all tests for stationarity seem to agree and suggest that the firstseasonal differencing of the series make stationarity around a constant mean, which is approximately zero and calculated its standard deviation of 52.05 mm (Figure 4).  
Moreover, to determine whether stationarity has been achieved, either by trend removal or by differencing, one may examine the autocorrelation function (ACF) and partial autocorrelation function (PACF) of series [30]. The sequence corresponding to a stationary process should converge quite rapidly to zero as the value of the lag increases. From this point of view, Sample autocorrelation and partial autocorrelation plot shown in Figure 5 are in support of monthly rainfall series stationary after having first seasonal difference.  
To test whether residual from fitted model come from normally distributed series, we use histogram, QQplot of the residual and Shaprowilks test. The histogram and QQplot of the residual shown in Figure 6a, show skewed rather than normally distribution. Shipiro Wilks test also confirms this fact since it results in Pvalue of 7.03e 09, which is smaller than 0.05. This implies that the residuals from the fitted models do not come from a normal distribution. This therefore, suggests some kind of transformation for our monthly rainfall series to achieve normality. By applying BoxCox transformation defined in eqn.(9) to the monthly rainfall series, the normality assumptions of residual will be achieved using optimum valueλ = 0.5, which maximizes the likelihood function with various iteration by the help of SAS 9.2software. Figure 6b reveals that after this transformation the problem of nonnormality seems dealt with. The Shapiro–Wilk test result based on test statistic defined on eqn.(8) (pvalue=0.397) is also in support of the normality of transformed series.  
Since normality is not fulfilled for the original series, three stages model building is performed with help of Sample ACF and PACF plot for the square root transformed rainfall data of Dire Dawa.  
Model building for monthly rainfall of dire dawa  
Since monthly rainfall data has seasonality variation, the univariate Seasonal Authoregressive Integrated Moving Average methodology is used to model rainfall series of Dire Dawa region.  
Model identification: Once the degree of differencing has been determined, the autoregressive and movingaverage orders are selected by examining the sample autocorrelations and sample partial autocorrelations function. Thus, it considered the SACF and SPACF shown in Figure 5 above and Tables 1 and 2 [17] as a guide and preliminary values of p, q, P and Q are chosen. Because we are dealing with estimates, it will not always be clear whether the sample ACF or PACF is tailing off or cutting off. Also, two models that are seemingly different can actually be very similar. With this in mind, we should not worry about being so precise at this stage of the model fitting. At this stage, a few preliminary values of p, q, P and Q should be at hand, and we can start estimating the parameters. First, concentrating on the seasonal lags, the characteristics of the ACF and PACF of our transformed series in Figure 2 tend to show a strong peak at h=12 in the autocorrelation function, combined with peaks at h=12, 24, 36 in the partial autocorrelation function. Hence it appears that either: the ACF is cutting off after lag 12 and the PACF is tailing off in the seasonal lags, or the ACF and PACF are both tailing off in the seasonal lags. Consequently, either (i) a seasonal moving average of order Q=1, or (ii) due to the fact that both the ACF and PACF may be tailing off at the seasonal lags, perhaps both components, P=1 and Q=1, are needed. To identify the betweenseason model, we focus the lags h=1, 2…11 and identify order based on Table 1. First, we set the ACF to be tailingoff and the PACF to be cutoff after lag 5, we identify p=5 and q=0. Also it is possible to think of the PACF to be tailingoff and the ACF to cutoff after lag 5, leading to identify P=0 and Q=5. Fitting the models suggested by these observations, it can be obtained:  
SARIMA (0, 0, 5) × (0, 1, 1)_{12}; SARIMA (5, 0, 0) × (0, 1, 1)_{12}  
SARIMA (0, 0, 5) × (1, 1, 1)_{12}; SARIMA (5, 0, 0) × (1, 1, 1)_{12}  
Parameter estimation: Here the maximum likelihood estimation methods for transformed total monthly rainfall are used to estimate the model parameters and the results are summarized in Table 3, which displays the list of the parameters for each temporally entertained model. For each model parameter, the table presents the estimated value, standard error, t value, AIC, SBC and variance for the estimate. As indicated by Shumway et al. [17], parameters must differ significantly from zero and all significant parameters must be included in the model. The result indicated that the seasonal and nonseasonal moving averages as well as nonseasonal autoregressive parameters are all significant since their pvalues is smaller than 0.05 and should be retained in the model. However, the constant (μ), nonseasonal autoregressive parameters and seasonal autoregressive parameters (φ_{12}) in all selected models are insignificant; hence it should be omitted from the model. Moreover, the correlation of the parameter estimates is also used to assess the extent to which collinearity may have influenced the results. If two parameter estimates are very highly correlated, you might consider dropping one of them from the model. However, in our case, correlation between seasonal moving average and nonseasonal autoregressive parameter is 0.079, which indicate that less correlation is observed between parameter estimates.  
Finally, the AIC, SBC and variance of the estimate in Table 3 are computed according to eqn. (3) and used to compare selected models fit best to the monthly rainfall series. From this, SARIMA (5, 0, 0)*(0, 1, 1)_{12} model has lower AIC, SBC and variance of estimate than other model, so it fits the transformed monthly rainfall series of Dire Dawa better than other, and can be used for further analysis.  
Diagnostic checking: If the model fits the rainfall data well, the residuals of the fitted model are random or white noise [31]. Inspection of the time plot of the standardized residuals for square root transformed rainfall series in Figure 7, which can be obtain by help of eqn.(4) above, shows no clear patterns (trend or seasonality behavior).  
Under the test that residual follows a white noise process, roughly 95% of the residual autocorrelations should fall within the range of ±1.96/√360 . It is clear, as shown in Figure 7 that there is no pattern in the residuals autocorrelation plot for the selected model, which means there is no autocorrelation coefficient which lies outside the two standard errors significantly for the fitted models. The Ljlung Qstatistic again for each group of six lags are computed using Eq. (5) and the results are summarized in Table 4 below and Figure 6, indicate that the pvalue is greater than 0.05 for all lags, which implies that the white noise hypothesis is not rejected. Moreover, two alternative methods, namely runs and turning point tests, are applied to check the independence assumption of residuals of the fitted model. The fitted model was found to be consistent with the independence assumption for both methods since their test statistic is smaller than the critical values. Thus, those results also closely follow with the result plot of the standardized residuals in Figure 7 and Table 4.  
The results related to the normality of residuals using Shapiro–Wilk tests indicate the residuals of the fitted models is normally distributed since pvalue is greater than 0.05. In addition to these tests, Figure 6b shows the histograms and QQplot of the residuals. As expected, the curves significantly reflect a normal distribution. Moreover, all calculated values are found to be smaller than the respective critical values for GoldfeldQuandt (GQ) Test and Breusch and Pagan(BP) test, in the test of homoscedasticity of the residual, which indicate that the residuals has constant variance.  
Therefore, from above all result it can be concluded that the hypothesis that the residuals are white noise cannot be rejected indicating that the fitted model is adequate. That is, SARIMA (5,0,0) × (0,1,1)_{12} model is adequate for modeling the square root transformed monthly rainfall series of Dire Dawa region.  
Forecasting: Since the model diagnostic tests show that all the parameter estimates are significant and the residual series is white noise, the estimation and diagnostic checking stages of the modeling process are complete. We can now proceed to forecasting the rainfall series with fitted SARIMA (5, 0, 0)*(0, 1, 1)_{12} model. According to eqn. (1) above, the SARIMA (5, 0, 0) × (0, 1, 1)_{12} model can be written as:  
This equation (19) can also be multiplied out and rewritten in a form that is used in forecasting as:  
Where and , This equation further reexpressed as:  
After substituting the estimated parameter values in Eq. (21) above, we obtain the following equation:  
Forecasts are to be made at the origin, t=January, 2014 for lead times m= 1, …, 24 for the total monthly rainfall series in the coming 24month. Then the onestep ahead forecast at the origin, t=Jan, 2014, which give us forecast of actual series for the month of January, 2012 are given by:  
By the help of eqn.(23), it can be obtain 24 month(2years)steps a head forecast and prediction interval for total monthly rainfall series of Dire Dawa, as presented in Figure 8. Results reveal that there is a tendency of relatively increasing in average pattern of monthly and annual rainfall over the forecast period from January 2014 to January, 2016.  
Forecasting accuracy evaluation: If the fitted SARIMA (5, 0, 0)*(0, 1, 1)_{12} model has to perform well in forecasting, the forecast error will be relatively small. The accuracy of forecasts is usually measured using root mean square error (RMSE), mean absolute error (MAE), Mean absolute percentage error (MAPE) and Theil’s inequality coefficient (TheilU). The result shows that the Mean Absolute Percentage Error (MAPE) turn out to be 3.56%, which is relatively less than 4% and Theil’s inequality coefficient (Ustatistic) turn out to be 0.018, which is relatively close to zero. Besides this result, the bias and variance proportion are also very small, which are 0.047 and 0.001, respectively. Thus, measures indicate that the forecasting inaccuracy is low.  
To assess the outofsample forecasting ability of the model it is advisable to retain some observations at the end of the sample period. Therefore, with the holdout last monthly rainfall values from Jan, 2013 to Jan, 2014 for evaluating forecasting accuracy and model validation, The RMSE, MAE, MAPE values are less than 5% and TheilU statistics is close to zero, which indicates the difference between the actual value andthe predicted value is very small. That is, the predictive powers of the models are better and suitable for mstep ahead forecasting.  
Spectral analysis result  
Power spectral Analysis is applied for annual and monthly rainfall series of Dire Dawa region. The first step in estimation of the spectrum by the periodogram method is subtraction of the sample mean and removing any obvious trend. Figure 9 show that raw periodogram of annual rainfall of Dire Dawa, in each point of it represents the variance of the series contributed by a frequency range centered at the point. Since raw periodogram is a wildly fluctuating estimate of the spectrum with high variance, the periodogram must be smoothed to ensure stable estimate. Proper amount of smoothing is somewhat subjective and depends on the characteristics of the data, generally, it is a good idea to try several bandwidths that seem to be compatible with the general overall shape of the spectrum, as suggested by the periodogram [17]. Considering the tradeoff in smoothness, stability and resolution in selecting widths of modified Daniel filters, which leads to span of length, c (1,1) and 15 as a suggested value for smoothing the annual and monthly rainfall of Dire Dawa station respectively (Figure 10). The result is displayed in Figure 2a2c, which provide a sensible compromise between the noisy version, shown in Figure 1a1c, and a more heavily smoothed spectrum, which might lose some of unnecessary peaks in estimation of spectrum. The width of the center mark on the 95% confidence interval indicator indicates the band width [32] cited also in Alemrew et al. [3]. The bandwidth, is equal to 0.1 cycles per year and 0.0417 cycles per month for the spectral estimator. This bandwidth means we are assuming a relatively constant spectrum over about 20% and 8.3% for monthly and annual rainfall with entire frequency interval (0, 0.5), respectively. By using degrees of freedom (df) of 6 and 30 for annual and monthly rainfall respectively, it can constructed approximate 100(1α) %confidence intervals (eqn.(14)) for spectral density for the frequency bands identified as having the maximum power, as shown in Table 5 above. If the lower confidence limit for the spectral value is greater than the baseline level at some predetermined level of significance, we may claim that frequency value as a statistically significant peak [17]. As we observed from Table 5, an approximate 95% confidence interval for the spectrum fS(0.37) is [39.34, 255.75], which is again too wide to be of much use, but we do notice that the lower value 39.34 is higher than any other periodogram ordinate, so it is safe to say that this value is statistically significant. Similarly, an approximate 95% confidence interval for the spectrum fS (0.083) is [6.17, 17.25].  
The smoothed spectra shown in Figure 11 and summarized in Table 3 provide a sensible compromise between the noisy version, shown in Figure 9, and a more heavily smoothed spectrum, which might lose some of the peaks. As displayed in the Figure, the peak with an average period of 4.17 years, contributing the highest percent to total variance of the annual rainfall series of Dire Dawa, which corresponds to the existence of “strong” peak at a frequency band centered at 0.24 cycles per year. This indicates that the periodicity of 4.17 years the most dominant in the annual rainfall of Dire Dawa. This can be attributed to the El Nino phenomenon. By looking for the other large spectral ordinate, it can be identified peaks of 16.67, 2.86 and 2.27 year periodicities.  
Moreover, the monthly rainfall data are also used and their spectrum will exhibit narrow and sharp peaks at seasonal frequencies, as shown in Figure 12. For such type of data, commonly occurred periodic oscillation: SemiAnnual Oscillation (SAO), Annual Oscillation (AO), QuasiBiennial Oscillation (QBO) and ElNino/Southern Oscillation (ENSO) are approximately obtained by 47 months, 1014 months, 2232 months, and 4066 months, respectively [33]. The peak with an average period of 12 month contributes a higher percent to total variance of the monthly rainfall series. This indicates that the most dominant peak observed in monthly rainfall of Dire Dawa is annual Oscillation (AO). There is also a subsidiary peak corresponding to a periodicity of 4 month and 6 month with associated frequency band centered around 0.25 and 0.167 cycles per month, respectively. These can also be attributed to the SemiAnnual Oscillation (SAO).  
Using spectral analysis there have been observed prominent periodicities of 1.82.6 years [34], 2.73.3 years [35] and 2.58 years [36]. The study result by Perterbough [37] various regions of African rainfall indicate that there is no apparent statistically significant spectral peak in most cases. However, opposite of this result obtained by Nicholson [38], who discovered a statistically significant spectral peaks of approximately 2.3, 2.8, 3.6 and 5–6 years is observed in different regions of African rainfall. Moreover, Perterbough [37] study and our result opposed each other. By conducted a detailed power spectrum analysis of African rainfall series, Nicholson et al. [39] found that quasiperiodicities clustered in four bands at 2.2–2.4, 2.6–2.8, 3.3– 3.8 and 5.06.3 years, common throughout equatorial and southern Africa. Ayoade [26] detected periodicities of 26 years while Adejuwon et al. [40] detected prominent periodicities of between 2 and 8 years cycles and less prominent periodicities of 16, 18 and 21 years cycles both in study of precipitation pattern of Southern Nigeria. Alamerew et al. [3] using spectral analysis method found that the dominant cycle with periodicity of 11.24 year for the annual rainfall of Addis Ababa, but they didn’t test the statistical significance of spectral peak.  
Many study shows that Ethiopian rainfall strongly related to the ENSO phenomena via rainfall extreme event like drought and flood [4,11,41]. The ocean–atmosphere interaction is believed to be a dominant factor in the weather processes in Eastern Ethiopia and contributes the largest share of the total annual rainfall the region. With this interaction, many studies were used for estimating possible future rainfall frequency of extreme events of the area. Yilma [2] found that a dominant cycle for the annual rainfall of Addis Ababa is 10.22 years. Alemrew et al. [3] inferred that drought recurs in Addis Ababa between 10 to 11 years. Haile [11] also found that drought occur every 68 years in the semiarid regions of Ethiopia. The study by Tsegay [12] based on the frequencies of rainfall deviation from the average suggest that drought occur every three to five years in Eastern Ethiopia. The possibility of using prominent peaks in the spectrum to predict the longrange behavior of the rainfall is attractive [42]. Oduro et al. [6] predict that a rainfall extremes event in Ghana recurs every 5.6 years. Stringer [43] also estimated that the climatic events recur every 2 to 2.5 year in Africa.  
Crossspectra analysis result  
The crossspectrum analysis is used to provide a means of determining the contributions of fluctuations in various frequency bands to covariance quantities such as the fluctuations in rainfall series at different nearby stations.  
Figure 13 shows the squared coherence between rainfall series of Dire Dawa station with Haramaya and Dengego stations with L_{h} =15 and 30 degree of freedom and F2,30(0.05)=3.34 at 5% significance level. Hence, the hypothesis of no coherence is rejected for the values of estimated square coherence (eqn.17) that exceed C0.05=0.16 (eqn.18). As it can be observed in the same Figure that, at lower seasonal frequency, the Dire Dawa rainfall characteristics are significantly associated with both Haramaya and Dengego rainfall, which mean there is strongly coherent. However, the degree of relatedness or coherence is higher in Dengego than Haramaya rainfall to the rainfall of Dire Dawa.  
Conclusion of the Study  
In this study, 30 years annual and monthly rainfall records were analyzed, using data from Dire Dawa region and neighboring area in the Eastern Ethiopia, mainly to study the rainfall pattern and its extreme event frequency of Dire Dawa. Based on the overall results of the research, the following conclusions could be drawn:  
âž£ A time series model for monthly rainfall series of Dire Dawa was adjusted, processed, diagnostically checked and lastly SARIMA(5,0,0)*(0,1,1)_{12} model is established and adequately be used to forecast 2 years monthly rainfall values. Further results reveal that there is a tendency of relatively increasing in average pattern of monthly and annual rainfall over the forecast period from January 2014 to January, 2016.  
âž£ Rainfall extreme event like flood predict or inferred to be recurring at about 4.17 years in Dire Dawa region.  
âž£ The annual rainfall of Dire Dawa region found to dominate with El Nino phenomenon. On other hand, Annual oscillations dominate monthly rainfall of the area.  
âž£ Rainfall extreme event like flood and drought highly association between Dire Dawa and Dengego area.  
According to the result in this study, the pattern and frequency in rainfall extreme event is increasing. So, every effort should be made to minimize such risks as deaths, loss of property and damage to infrastructure, associated with rainfall extreme event like drought and flood, crop failure by preplanning and make decision against adverse effects of it on the region. Moreover, more study need to assess the factors that cause the change in the pattern of rainfall distribution in the region.  
Acknowledgements  
I am indebted to Dr. Solomon Harrar, University of Kentucky for his comments and suggestion and am also grateful to the National meteorological Agency (NMA) of Ethiopia for providing the data for study.  
References  

Table 1  Table 2  Table 3  Table 4  Table 5 
Figure 1  Figure 2  Figure 3  Figure 4  Figure 5 
Figure 6  Figure 7  Figure 8  Figure 9 
Figure 10  Figure 11  Figure 12  Figure 13 