Establishing Flow-Catchment Interactions as Means of Regionalising Flow Characteristics of the Save Catchment in Zimbabwe

The research aimed to determine areas of equal hydrology in the Save catchment for the purpose of estimating flow characteristics in ungauged catchments. Data quality control was first performed on flow station data in order to identify the best stations that could be used in formulating the catchment flow characteristics. The flow characteristics derived were mean annual runoff; coefficient of variation of runoff; base flow index; coefficient of variation of low flow; coefficient of variation of high flow; and coefficient of variation of monthly wet season flow. Literature was then used to select the catchment characteristics most likely to influence runoff. Redundancy analysis was done to correlate sub zone and runoff characteristics so that the nature and significance of the relationships could be determined. Cluster analysis was then used to combine the selected catchment characteristics into natural groups of similar characteristics. The clusters were validated statistically using canonical variate analysis. The results showed that the significant catchment characteristics are the catchment rainfall (r=0.96); the proportion of each subzone under grasslands (r=0.56); the mean annual evaporation rate (r=-0.74); the coefficient of variation of mean annual rainfall (r=-0.55); and the catchment area (r=-0.53). The Save catchment as a whole had three different clusters and was found to be 60% similar. Validation showed that the catchment characteristics used in the study could explain only 39% of the variation in flow characteristics in their clusters.


Introduction
The demands on Zimbabwe's water resources from the different sectors of the national economy have been increasing over the last two decades, especially from agriculture, the mining sector and growing cities. It is therefore necessary for Zimbabwe to have an efficient water resources management system.
To facilitate efficient and informed management of water resources, data on the flow characteristics of river systems are a prerequisite, since water resources planning and management depend on the availability of flow data. The systematic measurement of river flow in a watershed is used to obtain flow statistics for each station, which are extremely important for the design of engineering works and for the evaluation, planning and management of water resources. However, the high implementation, operation and maintenance costs of hydrological networks make it difficult for developing countries like Zimbabwe to have a comprehensive network in place [1,2]. This is compounded by the decline in technical and human capacity in hydrology, as shown by the reduction in the number of meteorological stations in Africa during the past 30 years [3]. Even if funds and human resources were made available for the extension of hydrological networks, it would take between 10 and 30 years before adequate data were collected. An even spatial distribution of hydrological stations would also be difficult to establish because some of the sites are remote and inaccessible [4]. This situation makes it imperative to develop methods for predicting flow characteristics at ungauged stations. One such technique is regionalisation [5][6][7][8].
Regionalisation is the process which aims to estimate runoff in ungauged catchments [6,8,9] by deriving empirical relationships between flow and catchment characteristics [10,11]. In other words, multivariate regression techniques are used to derive linear relationships between the physical characteristics of a catchment and the natural flow regime [12,13]. These relationships, which are generated using data from a large number of gauged catchments, can then be used to estimate the flow regime of ungauged catchments [13,14]. If there is a gauged catchment with climate and physiographical characteristics similar to those of the ungauged catchment, then it may be possible to consider the catchments as analogous. The observed flow record of the analogue catchment can then be scaled, or transposed, to produce a synthetic flow record for the ungauged catchment [6,13].
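As a rough illustration of the regression idea described above, the following Python sketch fits a straight line between one catchment characteristic and one flow characteristic for a set of gauged catchments, and then applies it to an ungauged catchment. It is a minimal sketch only: the text refers to multivariate regression, whereas this example uses a single predictor for brevity, and the function names and all numerical values are hypothetical and are not taken from the Save catchment data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_new):
    """Apply the fitted line to a new (ungauged) catchment."""
    return slope * x_new + intercept


# Hypothetical gauged catchments (illustrative values only):
# mean annual rainfall (mm) and mean annual runoff (mm).
gauged_rainfall = [500.0, 600.0, 700.0, 800.0, 900.0]
gauged_runoff = [50.0, 80.0, 110.0, 140.0, 170.0]

slope, intercept = fit_line(gauged_rainfall, gauged_runoff)

# Hypothetical ungauged catchment with known rainfall only.
ungauged_rainfall = 750.0
estimated_runoff = predict(slope, intercept, ungauged_rainfall)
print(f"estimated mean annual runoff: {estimated_runoff:.1f} mm")
```

In practice the fit would use several catchment characteristics at once and would only be applied to ungauged catchments that resemble the gauged ones used to derive it.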
This research therefore aimed to determine areas of equal hydrology in the Save catchment for estimating flow characteristics in ungauged catchments. The main question the research sought to answer was whether there are significant relationships between flow and catchment characteristics in the Save catchment.

Study area
The Save catchment is located between latitudes 18°S and 21°S and longitudes 30°E and 33°E and occupies the south-eastern parts of Zimbabwe (Figure 1). The Save River flows in a south-easterly direction, enters Mozambique and eventually reaches the Indian Ocean. Most of the rivers in the Save catchment flow in a south-easterly direction and eventually join the Save River in Zimbabwe, except those on the north-eastern margins of the catchment, which flow directly into Mozambique. The Save catchment has a total area of 48 925 km². The catchment has four major dams (Figure 1). These are Osborne dam to the north-east, Rusape dam to the north, Ruti dam to the mid-west and Siya dam to the south-west. The catchment has eight sub-catchments, namely Odzi, Budzi, Pungwe, Upper Save, Macheke, Lower Save East, Lower Save West and Devure.
The units of study used in the paper are sub zones. Sub zones are micro-catchments, and within the Save catchment there are a total of 25. These are used by the Zimbabwe National Water Authority (ZINWA) for micro-level water resources management.

Methods
The research performed data quality control on the hydrological gauging station data obtained from ZINWA. The first part of the data quality control procedure involved selecting stations that were not downstream of dams. Water abstraction permit records from ZINWA were also used to exclude stations with significant permitted abstractions upstream, which can alter the natural flow of rivers. The next step was to determine further whether a station's records were useful, and double mass curve analysis was done for stations in the same sub zone. The principle of double mass analysis is to plot accumulated values of a station's runoff data against accumulated values of the average of other stations in the area for the same period of time. Inhomogeneities in the time series (in particular jumps) can then be investigated; these may originate, for example, from a change of observer, a change of location of the station or even a breakdown of the monitoring station.
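The double mass principle described above can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure used in the study: the function names are hypothetical, the runoff values are invented, and the break in the record is placed after the third year purely to show how a change in slope appears.

```python
from itertools import accumulate


def reference_series(other_stations):
    """Year-by-year average of the other stations in the area."""
    return [sum(values) / len(values) for values in zip(*other_stations)]


def double_mass_points(station, reference):
    """Pairs of (accumulated reference, accumulated station) values."""
    return list(zip(accumulate(reference), accumulate(station)))


def slope_between(points, start, end):
    """Slope of the double mass curve between two point indices."""
    (x0, y0), (x1, y1) = points[start], points[end]
    return (y1 - y0) / (x1 - x0)


# Hypothetical annual runoff for two neighbouring stations (illustrative only).
others = [
    [90.0, 110.0, 70.0, 90.0, 100.0, 80.0],
    [110.0, 130.0, 90.0, 110.0, 120.0, 100.0],
]
# Hypothetical station under test: its ratio to the reference doubles
# after the third year, mimicking a jump in the record.
station = [50.0, 60.0, 40.0, 100.0, 110.0, 90.0]

reference = reference_series(others)
points = double_mass_points(station, reference)

early_slope = slope_between(points, 0, 2)  # first three years
late_slope = slope_between(points, 2, 5)   # last three years
print(f"early slope: {early_slope:.2f}, late slope: {late_slope:.2f}")
```

A homogeneous record plots as an approximately straight line, so the two slopes would be similar; a marked difference between them points to a jump that should be investigated before the station is used.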
Data quality control provided information on the best stations to use in the actual study. The data from these stations were then used to derive flow characteristics used in the study and these were: mean annual runoff (mr); coefficient of variation of runoff (mr.cv); base flow index (bfi); coefficient of variation of low flow (lcv); coefficient of variation of high flow (hcv); coefficient of variation of monthly wet season flow (wcv).
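For readers unfamiliar with these indices, the sketch below shows one common way of computing three of them in Python. The paper does not state its exact formulas, so the definitions are assumptions: the coefficient of variation is taken as the sample standard deviation divided by the mean, and the base flow index as the ratio of base flow volume to total flow volume. The baseflow separation step is not shown, and all values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


def base_flow_index(baseflow, total_flow):
    """Ratio of base flow volume to total flow volume (assumed definition)."""
    return sum(baseflow) / sum(total_flow)


# Hypothetical annual runoff series for one station (illustrative only).
annual_runoff = [10.0, 20.0, 30.0]
mr = mean(annual_runoff)                         # mean annual runoff
mr_cv = coefficient_of_variation(annual_runoff)  # coefficient of variation of runoff

# Hypothetical flow series with an already separated base flow component.
total_flow = [4.0, 10.0, 6.0]
baseflow = [2.0, 3.0, 3.0]
bfi = base_flow_index(baseflow, total_flow)

print(f"mr={mr:.1f}, mr.cv={mr_cv:.2f}, bfi={bfi:.2f}")
```

The same coefficient of variation function could be applied to low flow, high flow and monthly wet season flow series to obtain lcv, hcv and wcv, given a rule for extracting those series.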
Redundancy analysis was then done to correlate sub zone characteristics with runoff characteristics in each sub zone. Redundancy analysis was chosen because it is suitable for determining the significant effects of different combinations of (environmental) sub zone characteristics on different flow characteristics. It was also chosen because most variables in this study have some form of linear relationship with one another, e.g. runoff increases with increasing precipitation and slope, and with decreasing evaporation rates [18].
The statistical significance of each selected variable was determined by the Monte Carlo permutation test. The number of permutations to be carried out for each test had to be chosen, and 199 permutations were used in this study. In this way, the Monte Carlo permutation test was also used as a tool for hypothesis testing [18].
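The logic of a Monte Carlo permutation test with 199 permutations can be sketched as follows. This is a simplified illustration rather than the test used in the study: redundancy analysis software typically permutes against an F-type statistic, whereas this sketch uses the absolute Pearson correlation between one sub zone characteristic and one flow characteristic. The function names, the random seed and the data are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=199, seed=0):
    """Share of statistics at least as extreme as the observed one,
    counting the observed data as one of the permutations."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between x and y
        if abs(pearson_r(x, shuffled)) >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical sub zone data (illustrative only).
rainfall = [450.0, 520.0, 600.0, 640.0, 700.0, 780.0, 850.0, 900.0]
runoff = [30.0, 45.0, 70.0, 75.0, 95.0, 120.0, 140.0, 150.0]

p_value = permutation_p_value(rainfall, runoff)
print(f"permutation p-value: {p_value:.3f}")
```

With 199 permutations the smallest attainable p-value is 1/200 = 0.005, which is fine enough to test at the 5% significance level.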
A correlation biplot was obtained from the percentage significance values, and the length of the arrow for each catchment characteristic was a measure of its fit (R) with the ordination diagram.
The main reason for performing redundancy analysis was the forward selection of variables that significantly determine the flow characteristics in the Save catchment. Cluster analysis then followed. Cluster analysis is a multivariate statistical procedure for detecting natural groupings in data. Classification by cluster analysis is based on placing objects into more or less homogeneous groups, in such a manner that the relationship between groups is revealed. It is useful for classifying groups or objects and is more objective than subjective [19,20].
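A minimal sketch of how cluster analysis groups objects is given below. The paper does not state which clustering algorithm or linkage rule was used, so agglomerative clustering with average linkage and Euclidean distance is assumed here purely for illustration. The function names are hypothetical and the six points stand in for sub zones described by two standardised characteristics.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical sub zones, two characteristics each (illustrative only).
sub_zones = [
    (0.0, 0.0), (0.1, 0.2),    # group near the origin
    (5.0, 5.0), (5.2, 4.9),    # second group
    (10.0, 0.0), (10.1, 0.3),  # third group
]

groups = agglomerate(sub_zones, n_clusters=3)
print(groups)
```

Recording the distance at each merge would give the steps of a dendrogram, where a sudden jump between successive merges suggests a natural number of clusters.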
The results of cluster analysis were validated to ascertain whether they are hydrologically sensible. Validation was done statistically using canonical variate analysis, which is the same as Fisher's linear discriminant analysis. It shows the extent to which sub zone characteristics in each cluster explain the variation in flow characteristics in their respective clusters.
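As a brief reminder of what Fisher's linear discriminant analysis computes, the standard criterion can be written as follows. The notation is introduced here for illustration and does not appear in the original study: $C_k$ is the set of sub zones in cluster $k$, $g$ is the number of clusters, $n_k$ and $\bar{x}_k$ are the size and mean vector of cluster $k$, $\bar{x}$ is the overall mean vector, and $S_W$ and $S_B$ are the within-cluster and between-cluster scatter matrices.

\[
S_W = \sum_{k=1}^{g} \sum_{i \in C_k} (x_i - \bar{x}_k)(x_i - \bar{x}_k)^{\top},
\qquad
S_B = \sum_{k=1}^{g} n_k\, (\bar{x}_k - \bar{x})(\bar{x}_k - \bar{x})^{\top}
\]

\[
J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w},
\qquad
S_W^{-1} S_B\, w = \lambda\, w
\]

The canonical variates are the directions $w$ that maximise $J(w)$, obtained as eigenvectors of $S_W^{-1} S_B$. With $g$ clusters there are at most $g-1$ such directions, so three clusters give at most two canonical axes.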

Results
Only correlation coefficients significant at the 5% significance level are presented. Redundancy analysis results show that mean annual rainfall (r=0.96) and the proportion of each subzone under grasslands (r=0.56) are positively correlated with flow characteristics. The mean annual evaporation rate (r=-0.74), the coefficient of variation of mean annual rainfall (r=-0.55) and the catchment area (r=-0.53) are negatively correlated with flow characteristics. The distribution of correlation coefficients between sub zone characteristics and their ordination axes is shown on the biplot (Figure 2).
The lengths of the arrows representing the environmental variables on the biplot (Figure 2) are proportional to the magnitude of the correlation coefficient. Thus, mean annual rainfall (r) has the longest arrow and a correlation coefficient of r=0.96 with the first ordination axis. Mean annual evaporation (e) has the longest arrow on the negative side of the first ordination axis, with a correlation coefficient of r=-0.74.
Cluster analysis showed that, considering the sub zone characteristics used in the study, the Save catchment is 60% similar (Figure 3). Figure 3 shows that almost 60% of the catchment's sub zones fit well in one cluster. A sudden jump in the steps of Figure 3 suggests a difference, and hence that a new cluster needs to be formed. The branches of the figure suggest that the appropriate number of clusters for the Save catchment is three. Figure 3 shows that the following Save catchment sub zones are similar at 95% similarity level: -

Discussion
The fact that the Save catchment is 60% similar in terms of catchment characteristics (Figure 3) means that the catchment characteristics selected by redundancy analysis are shared across its sub zones at a similarity level of about 60%. This implies that sub zones in the same cluster have more or less the same hydrological response and hence similar flow characteristics. This is supported by Gan et al., Rees et al. and Merz and Loschl [6,11,14], who say that if there is a gauged catchment with climate and physiographical characteristics similar to those of the ungauged catchment, then it may be possible to consider the catchments as analogous.
Rainfall, the major input factor in the hydrological cycle, has the strongest positive association (r=0.96) with river runoff characteristics. This might also mean that rainfall is the main factor determining the different clusters observed in the catchment. The sub zones in clusters 2 and 3 most likely receive rainfall patterns different from those in the other cluster.
The proportion of each subzone under grasslands was shown to be positively correlated with river discharge characteristics (r=0.56). This suggests that, in the Save catchment, grass cover promotes surface runoff that ends up in the river channel more than other landcover/landuse types such as woodlands and croplands. This is in contrast with other authors, who found grasslands to be negatively correlated with runoff characteristics because many areas in Zimbabwe dominated by grasslands contain dambos, which hold water and promote high evaporation rates [5,16,20,21].
The mean annual evaporation rate (r=-0.74) was shown to be the catchment characteristic most strongly inversely related to runoff characteristics in the Save catchment. This is plausible because other authors report that evaporation is an important component of the water budget in Zimbabwe, where about 90% of the annual rainfall returns to the atmosphere through this process [5,23]. Evaporation processes, including interception, play a controlling role in runoff generation [22], and interception is a major driver of the magnitude and speed of catchment response to rainfall, especially for semi-arid catchments with limited rainfall frequency and depth, and especially for smaller storm events [23,24].
The coefficient of variation of mean annual rainfall (r=-0.55) is inversely related to flow characteristics. This suggests that rainfall variability in the Save catchment is dominated by negative departures, with the catchment receiving below-average rainfall more often than above-average rainfall. This is consistent with [25][26][27], who suggested that areal annual rainfall in Zimbabwe had declined by 10% between 1900 and 1994 and that there was some evidence of progressive desiccation and increased rainfall variability.
The catchment area (r=-0.53) shows that as the sub zone size increases, the discharge characteristics are reduced. This might be because in larger catchments surface runoff travels longer distances to the rivers, and in the process some can be lost to infiltration and evaporation. The same result could also be due to the shape of the sub zones: most of them, especially in the western half of the Save catchment, are elongated rather than circular. Elongated catchments have been shown to have a longer discharge lag time than circular ones in response to rainfall input, thereby exposing the rainwater to evaporation and infiltration for longer periods [16,24].
The fact that the catchment characteristics used in the study can significantly explain only 39% of the variance observed in flow characteristics means that the remaining 61% of unexplained variance may be due to random hydrological behaviour which cannot be discerned, or to limitations in the data used in the study. For example, instead of using a 90 m × 90 m digital elevation model in the calculation of slope indices, it has been shown in some regionalisation studies that the use of triangulated irregular networks (TINs) yields finer results [28]. Also, instead of using the proportion of each sub zone under different lithologies, their permeability attributes could have been used to obtain finer results [5].

Conclusion
In conclusion, the research found significant correlations between flow characteristics and some catchment characteristics. The catchment characteristics found to be significant are rainfall (r=0.96); the proportion of each subzone under grasslands (r=0.56); the mean annual evaporation rate (r=-0.74); the coefficient of variation of mean annual rainfall (r=-0.55); and the catchment area (r=-0.53).
The whole of the Save catchment was found to be 60% similar. Canonical variate analysis, done to validate the created clusters, revealed that the sub zone characteristics used in the study explain a total of 39% of the variation in flow characteristics in their clusters. This means that there is a need for future work aimed at identifying more sub zone characteristics that could significantly explain variations in flow characteristics, and at developing new methods of estimating flow characteristics of ungauged catchments which produce more robust results.