Received date: March 06, 2014; Accepted date: April 21, 2014; Published date: April 23, 2014
Citation: Tefera M, Mola M, Jemaneh G, Doyore F (2014) Application of Data Mining Techniques to Predict Urinary Fistula Surgical Repair Outcome: The Case of Addis Ababa Fistula Hospital, Addis Ababa, Ethiopia. J Health Med Informat 5:153. doi: 10.4172/2157-7420.1000153
Copyright: © 2014 Tefera M, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background: Maternal outcomes are good in most countries of the developed world, while the same is not true in many developing countries. The likelihood of incontinence occurring after successful surgical repair makes predicting the outcome of urinary fistula surgical repair important for decision making during treatment and follow-up. Therefore, this research aims to apply data mining techniques to build a model that can assist in predicting the surgical outcome of urinary fistula repair based on clinical assessments done just before surgical repair.
Methods: The six-step hybrid knowledge discovery process model is used as a framework for the overall activities in the study. A total of 15,961 instances of women who underwent urinary fistula repair at Addis Ababa Fistula Hospital are used for both predictive association rule extraction and predictive model building. The Apriori algorithm is used to extract association rules, while the classification algorithms J48, PART, Naïve Bayes and multinomial logistic regression are used to build predictive models. Support and confidence are used as interestingness measures for association rules, while the area under the WROC curve and then the area under the ROC curve for each specific outcome are used, in that order, to compare the performance of the models from the predictive algorithms.
Results: Predictive association rules from Apriori have shown frequent co-occurrence of less severe injury with a cured outcome. The predictive model from the PART-M2-C0.05-Q1 scheme has shown an area under the WROC curve of 0.742. The area under the ROC curve for the residual outcome (ROCResidual=0.822) from this algorithm is better than that from Naïve Bayes and logistic regression, while the areas under the ROC curves for the other outcomes are greater than those of the model from J48.
Conclusion: A predictive model is developed with the use of PART-M2-C0.05-Q1. The predictive association rules and the predictive model built with data mining techniques can assist in predicting urinary fistula surgical repair outcome. In particular, the model is better at detecting the residual outcome than the logistic regression model.
Vaginal fistula; Rectal fistula; Data mining; Hospital data; Ethiopia
Since the 1990s, the social and economic structure of the world has changed from an industrial and product-oriented environment to an information- and knowledge-dependent one. Rapid growth of information technologies and their integration with digital networks, software, and database systems are the main characteristics of the information and knowledge society. The explosive growth in raw data accumulation in turn widened the gap between raw data that is not yet analyzed and meaningful information available for decision making. Because of the high volume of data, summarizing it with simple quantitative models became a great challenge for the information age. Turning data into information and information into knowledge led to a demand for specialized tools to view and analyze the data.
Following that, data mining was applied to summarize a large volume of data on maternal health problems. Maternal outcomes are good in most countries of the developed world, while the same is not true in many developing and resource-poor countries. This disparity in maternal outcomes can easily be seen from the maternal mortality rate and the lifetime risk of maternal death. For instance, the 2008 estimate of the maternal mortality ratio for developed regions is 14 per 100,000 live births, while it is 290 per 100,000 live births for developing regions. Along the same lines, the lifetime risk of maternal death is 1 per 4,300 births for developed regions, while it is 1 per 120 births for developing regions. The corresponding figures for Sub-Saharan Africa rise to a Maternal Mortality Ratio (MMR) of 640 per 100,000 live births and a lifetime risk of maternal death of 1 per 31 births.
Generally, throughout the world, half a million women die from complications of pregnancy or childbirth every year, and most of these deaths occur in resource-poor countries. In 2008 alone, an estimated 358,000 maternal deaths occurred worldwide because of complications related to pregnancy and childbirth, of which developing countries accounted for 99%. Furthermore, the analysis of the maternal mortality data for Sub-Saharan Africa and South Asia alone has shown that 87% of the global maternal deaths occurred in countries of these regions.
A fistula is an abnormal opening between the vagina and the bladder, i.e. vesico vaginal fistula (VVF), which is the most common type and dominates the clinical presentations, and/or between the vagina and the rectum, i.e. recto vaginal fistula (RVF) [4-6].
Despite its devastating effects, the exact prevalence of obstetric fistula is unknown, although it is estimated to affect thousands of women in developing countries. The most frequently reported global estimate is that approximately two million women have untreated fistula in Asia and sub-Saharan Africa alone and that an additional 50,000 to 100,000 women develop obstetric fistula each year. In Ethiopia, obstetric fistula is also a health challenge, with 9,000 women affected each year.
The purpose of the research is, therefore, to apply data mining techniques and build a model that maps clinical examination attributes to the outcome of surgical repair for urinary fistula. This research will also compare the performance of multinomial logistic regression with that of decision trees, decision rules and Naïve Bayes so as to come up with a model with a relatively higher area under the ROC (Receiver Operating Characteristic) curve. To this end, this research will try to answer the following questions:
1. What values of predictive factors (attributes) are associated with each outcome of urinary fistula repair?
2. Would it be possible to draw association rules among the attributes and the classes of urinary fistula surgical repair outcomes?
3. Can models from other algorithms predict urinary fistula surgical repair outcome with better sensitivity and specificity expressed as area under ROC curve than logistic regression?
The hybrid model (six-step KDP model) is chosen as the framework to guide the overall activities in the current study. The hybrid process model was selected because it combines the best features of the CRISP-DM and KDD methodologies and identifies and describes several explicit feedback loops that are helpful in attaining the research objectives. The hybrid methodology involves six steps (Figure 1).
The Weka GUI chooser
The Weka GUI chooser provides a starting point for launching Weka’s main GUI applications and supporting tools. It gives access to Weka’s four main applications: Explorer, Experimenter, Knowledge Flow and Simple CLI.
Classifier accuracy measures
Using the same dataset to derive a classifier or predictor and then to estimate the accuracy of the resulting learned model results in misleading, overoptimistic estimates due to overspecialization of the learning algorithm to the data. Instead, the classifier is built on a training set and then applied to a separate test set, and the number of instances that were assigned to their actual classes and to different classes by the classifier is counted, a process whose result is effectively represented by a confusion matrix.
A confusion matrix is a useful tool for analyzing how well a classifier recognizes the classes. For m classes, an entry CMi,j in the first m rows and m columns indicates the number of tuples of class i that were labeled by the classifier as class j.
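The definition of CMi,j above can be illustrated with a minimal Python sketch. The function name, the toy label lists and the class order are illustrative assumptions and are not taken from the study's data.

```python
def confusion_matrix(actual, predicted, labels):
    """Return an m x m matrix; entry [i][j] counts class-i instances labeled as class j."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical test-set labels, for illustration only.
labels = ["cured", "stress", "failed", "residual"]
actual = ["cured", "cured", "stress", "failed", "residual", "cured"]
predicted = ["cured", "stress", "stress", "cured", "residual", "cured"]

cm = confusion_matrix(actual, predicted, labels)
# Accuracy is the share of instances on the main diagonal.
accuracy = sum(cm[k][k] for k in range(len(labels))) / len(actual)
```

Rows correspond to actual classes and columns to predicted classes, so off-diagonal entries show which outcomes the classifier confuses with one another.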
Receiver operating characteristic curve
ROC analysis is used to determine which classifier performs best for a given subject, and it is becoming a widely used tool in the evaluation of medical tests.
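As a rough sketch of how the per-outcome ROC area and a weighted ROC (WROC) area could be computed, the Python below treats each outcome one-vs-rest and averages the areas weighted by class frequency. That reading of WROC is an assumption, and the scores and labels are hypothetical.

```python
def auc_one_vs_rest(scores, actual, positive):
    """AUC as the probability that a positive instance outranks a negative one (ties count 0.5)."""
    pos = [s for s, a in zip(scores, actual) if a == positive]
    neg = [s for s, a in zip(scores, actual) if a != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_auc(class_scores, actual):
    """Average the one-vs-rest AUCs, weighting each class by its share of instances."""
    total = len(actual)
    return sum(
        (actual.count(c) / total) * auc_one_vs_rest(scores, actual, c)
        for c, scores in class_scores.items()
    )


# Hypothetical predicted class probabilities for six instances.
actual = ["cured", "cured", "stress", "residual", "cured", "stress"]
class_scores = {
    "cured": [0.9, 0.6, 0.3, 0.2, 0.7, 0.65],
    "stress": [0.05, 0.3, 0.5, 0.2, 0.2, 0.3],
    "residual": [0.05, 0.1, 0.2, 0.6, 0.1, 0.05],
}

auc_cured = auc_one_vs_rest(class_scores["cured"], actual, "cured")
wroc = weighted_auc(class_scores, actual)
```

An area of 0.5 corresponds to ranking no better than chance, and an area of 1.0 to perfect separation of the outcome from the rest.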
The data on the treatment of obstetric fistula victims were obtained from an internal application. The database has attributes designed to store information on the social and demographic background, obstetric and medical history, preoperative care, and operation date of the victim who comes to the hospital seeking treatment. The datasets in the Access files were exported to Excel files whose size amounted to 10.5 MB before any processing was done on them. Data in electronic format are preferred to the manual records held on more than 35,000 victims that the hospital has treated over the past 38 years. Therefore, and because of the short period of time given for the study, only the 19,929 instances found in the Access database were considered. Finally, permission was obtained to analyze the dataset for the objectives specified in this research.
The 63 attributes left in the dataset were organized under five general headings: social and demographic variables, medical and obstetric history, preoperative care, operation date, and postoperative course. Socio-demographic attributes indicate the social and demographic background of these women with childbirth injuries. Attributes found under this general heading are serial number, age at marriage, age at causative delivery, current age, height (cm), weight (kg), parity, number of living children, days to AAFH (Addis Ababa Fistula Hospital) on foot, days to AAFH by transport, educational status, marital status, accompanying person, distance to the nearest health facility, source of information, and how many days before the woman could walk. The second group of attributes is found under medical and obstetric history. Values of attributes such as antenatal care, duration of incontinence (months), number of previous repairs done at other hospital, cause of fistula, other illness, duration of labour, place of delivery, mode of delivery, fetal outcome, other major illness, and menstruation history are recorded for each case. The third group of attributes, found under preoperative care, comprises pre-operative stay (days), antibiotic given pre-operatively, type of antibiotics, pre-operative care provided, and nerve and musculoskeletal injury. The fourth group consists of attributes whose values are recorded on the operation date.
These attributes include anesthesia, approach for urinary fistula repair, circumcision, type of procedure (repair), number of urinary fistula, type of urinary fistula (site), VVF length, VVF width, scarring, bladder size, status of bladder neck, status of urethra, status of ureters, ureteric catheters, bladder fistula closure, graft, flaps, RVF location (rectal-injury type), RVF length, RVF width, rectal fistula closure (layers), sphincter status, intra-operative complications, duration of surgery, surgery outcome urinary, and surgery outcome bowel.
Finally, information is captured during the post-operative course. The attributes recorded during this course are transfusion, antibiotics post-operative type, pack in (days), postoperative complications, duration of bladder urethral catheters, and total length of stay.
Attribute subset selection
The major criterion for selecting an attribute set at this initial stage is whether each attribute is relevant to the data mining objective. Two Crows Corporation also asserts that usefulness to the data mining objective is the major criterion in selecting attributes at the initial stage.
Because the selected attributes all take nominal values, Weka's chi-squared attribute evaluator (ChiSquaredAttributeEval) is also used to rank the attributes by their chi-square statistics. The distribution of each attribute's values in the dataset is then examined to identify errors and to discern whether missing values exist.
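The chi-square ranking can be sketched in plain Python as Pearson's chi-square statistic of each nominal attribute against the class. This is an illustrative reimplementation under that assumption, not Weka's code, and the small attribute columns are hypothetical.

```python
from collections import Counter


def chi_square(attribute_values, class_values):
    """Pearson chi-square statistic between a nominal attribute and the class."""
    n = len(class_values)
    attr_counts = Counter(attribute_values)
    class_counts = Counter(class_values)
    observed = Counter(zip(attribute_values, class_values))
    stat = 0.0
    for a, a_count in attr_counts.items():
        for c, c_count in class_counts.items():
            expected = a_count * c_count / n  # count expected under independence
            stat += (observed[(a, c)] - expected) ** 2 / expected
    return stat


# Hypothetical columns: one attribute tracks the outcome, the other does not.
outcome = ["cured", "cured", "failed", "failed"]
columns = {
    "scarring": ["none", "none", "severe", "severe"],
    "bladder size": ["good", "small", "good", "small"],
}

# Rank attributes from most to least associated with the class.
ranking = sorted(columns, key=lambda name: chi_square(columns[name], outcome), reverse=True)
```

A larger statistic indicates a stronger departure from independence between the attribute and the outcome, so such attributes rank higher.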
Selection of instances
Building a predictive model for victims of urinary fistula requires selecting instances in which no additional type of fistula is identified. Out of the 19,929 victims, 15,961 were affected by urinary fistula (VVF) and have undergone urinary fistula repair. Instances with missing values for the outcome class are not useful for predictive model building because classification algorithms learn from how instances were classified under the different classes. As this study uses classification algorithms for predictive model building, the 220 records without class information are removed from subsequent analyses. The remaining dataset then has 15,741 records, each with an outcome in one of the outcome categories. Thus, the statistical summaries of attributes relevant to the data mining objectives are based on these 15,741 records.
Exploratory data analysis
The attribute’s description, data type, unit of measure and list of values or range of values are described. The frequency tables for the selected attributes show the original distribution of values of attributes in instances of the dataset before any preprocessing is done on the dataset.
Number of previous repairs at other hospital
This attribute shows the number of previous repairs done at other hospitals. It is a nominal-valued attribute and takes values such as 1, 2, 3, >3, not applicable, and no information (Table 1).
|Number of previous repairs in AAFH database||Frequency||Percent|
Table 1: Statistical summary for the number of previous repairs at other hospital.
Type of urinary fistula
This attribute indicates the site at which the fistula has occurred. Like the other attributes, it takes valid nominal values such as Urethral, Circumferential, Combined, Juxta-urethral, Mid Vaginal, Juxta-cervical, Vesico uterine, Vault, Ureteric, Torn urethra, Absent urethra, No bladder, Other, and no information (Table 2).
|Type of urinary fistula||Frequency||Percent|
|Valid values together with inconsistencies as a result of discrepancy in data representations||Circumferential||574||3.65|
Table 2: Statistical summary for type of urinary fistula as presented in AAFH database.
VVF length is a measure of fistula size that indicates the length of the fistula in centimeters. It takes only a limited, pre-specified number of values, so the attribute is treated as nominal. The values of this attribute are 1, 2, 3, 4, 5, >5 (Table 3).
|Errors/noises (<, 11, 22, 6)||4||0.02|
Table 3: Statistical summary for the distribution of VVF length as presented in AAFH database.
VVF width is the second measure of fistula size and indicates the width of the fistula in centimeters. It takes only a limited, pre-specified number of values, so the attribute is treated as nominal. The values of this attribute are 1, 2, 3, 4, 5, >5 (Table 4).
Table 4: Statistical summary for the distribution of VVF width presented in AAFH database.
Scarring is an attribute used to rank the amount of the scarring around the fistula. The values are nominal and they can be severe, mild, moderate, none, obliterated vagina (Table 5).
Table 5: Statistical summary for the distribution of the type of scarring as presented in AAFH database.
Bladder size indicates the size of the bladder in terms of its volume, expressed with nominal values such as small, good, fair, none, and no information (Table 6).
Table 6: Statistical summary for the type of bladder size as presented in AAFH database.
Status of bladder neck
This attribute indicates the level of the effect of obstruction on the bladder neck. Its values are complete destruction, partially damaged, intact, no information, and not applicable. Only 3.36% of the instances have no value, and no errors were made when entering values in the field (Table 7).
|Status of Bladder Neck||Frequency||Percent|
Table 7: Summary statistics of bladder status as presented in AAFH database.
Status of urethra
It is an attribute used to indicate the level of the effect of obstruction on the urethra. The values to this attribute are complete destruction, partially damaged, intact, no information, not applicable (Table 8).
|Status of Urethra||Frequency||Percent|
Table 8: Summary statistics distribution of status of urethra as presented in AAFH database.
Number of fistula
This attribute records the number of fistulas at different sites. It is considered nominal because the values none and >3 cannot be treated as numeric. The values a particular record can take are pre-specified as 1, 2, 3, >3, and none (Table 9).
|Number of fistula||Frequency||Percent|
Table 9: Summary Statistics for number of fistula repaired as presented in AAFH database.
Status of ureters
This attribute records how the ureters are affected. It takes one of three nominal values: one outside, both inside, and both outside (Table 10).
|Status of Ureters||Frequency||Percent|
Table 10: Statistical summary for status of ureters as presented in AAFH database.
Surgical outcome of urinary fistula repair
This attribute indicates the restoration of urinary continence after surgical intervention. Its valid values are cured, failed, stress, and residual. Missing values were handled in each case by replacing them with the most frequent value (Table 11).
|Surgical outcome of urinary fistula repair||Frequency||Percent|
|Values entered in the fields||Abscess draneige only||1||0.01|
|Other Missed fistula Specify||1||0.01|
|Other Specify half cured||1||0.01|
|Other Specify big residual||1||0.01|
|Other Specify Big residual||1||0.01|
|Other Specify Ileal conduit||1||0.01|
|Ureteric fistula not Managed||1||0.01|
|VVF cured but ureteric||1||0.01|
|VVF Cured but Ureteric||1||0.01|
Table 11: Statistical summary for surgical outcome of urinary fistula repair as presented in AAFH database.
Noise refers to a random error, mostly characterized by a deviation from the valid values of the attribute. Errors in nominal-valued attributes are resolved with the methods used for handling missing values. First, the erroneous values are removed manually, and then they are replaced by the modal value (Table 12).
|Attributes||Errors/noises||Frequency||Handling mechanism (manual)|
|Type of urinary fistula||>||1||Replaced manually with the frequent value.|
|VVF length||<, 11, 22, 6||One for each||Replaced by the most frequent value|
Table 12: Noises identified and corrected in the attributes selected for the study.
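The noise handling described above (remove invalid entries, then substitute the modal value) can be sketched in Python as follows. The study reports doing this manually, so the function and the short example column are illustrative assumptions; only the valid value set and the noise symbols come from the tables.

```python
from collections import Counter


def replace_with_mode(values, valid_values):
    """Replace any entry outside the valid set with the most frequent valid value."""
    valid = [v for v in values if v in valid_values]
    mode = Counter(valid).most_common(1)[0][0]
    return [v if v in valid_values else mode for v in values]


# Valid VVF length values as listed in the text; "<", "11" and "22" are noise.
valid_lengths = {"1", "2", "3", "4", "5", ">5"}
vvf_length = ["2", "1", "<", "2", "11", "3", "2", "22"]  # hypothetical column

cleaned = replace_with_mode(vvf_length, valid_lengths)
```

The mode is computed from valid entries only, so a noisy value can never become the replacement.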
The two possible causes of the inconsistencies detected in the fields of the selected attributes are human error in data entry and a database design in which attribute fields have no predefined values. The problem with inconsistencies is that they reduce the quality of the final model and make learning difficult for the algorithms. Discrepancies were detected while extracting statistical summaries of attribute values. Although the manual form used in actual data collection specifies the valid values of the attributes, invalid values were entered in the database. Han and Kamber also state that knowledge about the properties of the data can be used in detecting discrepancies that may exist in databases (Tables 13 and 14).
|Attributes||Frequency||Identified Inconsistency and Handling Mechanism used|
|Type of urinary fistula||2||Vesico vaginal & Visico vaginal replaced manually with the frequent value (mid vaginal).|
|13||Replace Juxta-Urethral with Juxta-urethral|
|3||Replace Absent Urethra with Absent urethra|
|2||Replace no bladder with No bladder|
|12||Replace Juxtra-cervical with Juxta-cervical|
|1||Replace Visico vaginal with Vesico vaginal|
|60||Replace torn urethra with Torn urethra|
|9||Replace Mid vaginal with Mid Vaginal|
|774||Replace Juxta-uretral with Juxta-urethral|
|Surgical outcome of urinary fistula repair||1||no Change replaced by No Change|
|15||improved replaced by Improved|
|4||stress replaced by Stress|
|1||No chage replaced by No Change|
|2||No change replaced by No Change|
|2||Broken replaced by Failed|
|1||Other specify big residual replaced by Residual|
|1||Other specify Big residual replaced by Residual|
|VVF width||41||>=6 replaced by the more general concept i.e. >5|
Table 13: Inconsistencies identified and resolved in the attributes selected for the study.
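Many of the replacements in Table 13 amount to a lookup from each inconsistent spelling to its canonical value. The Python sketch below encodes a subset of them. The dictionary and function names are illustrative, and the study applied these corrections manually rather than with code.

```python
# Subset of the Table 13 corrections for "type of urinary fistula".
FISTULA_TYPE_FIXES = {
    "Juxta-Urethral": "Juxta-urethral",
    "Juxta-uretral": "Juxta-urethral",
    "Juxtra-cervical": "Juxta-cervical",
    "Absent Urethra": "Absent urethra",
    "no bladder": "No bladder",
    "torn urethra": "Torn urethra",
    "Mid vaginal": "Mid Vaginal",
}

# Subset of the Table 13 corrections for the surgical outcome attribute.
OUTCOME_FIXES = {
    "no Change": "No Change",
    "No chage": "No Change",
    "No change": "No Change",
    "improved": "Improved",
    "stress": "Stress",
    "Broken": "Failed",
    "Other specify big residual": "Residual",
    "Other specify Big residual": "Residual",
}


def normalise(value, fixes):
    """Trim whitespace and map a known inconsistent spelling to its canonical form."""
    trimmed = value.strip()
    return fixes.get(trimmed, trimmed)


fixed_types = [normalise(v, FISTULA_TYPE_FIXES) for v in ["Juxta-uretral", "torn urethra", "Vault"]]
```

Values that are not in the lookup pass through unchanged, which keeps already valid entries intact.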
|Number of instances||15546|
|Number of attributes||11|
|Number of classes||4 (Cured, Stress, Failed, Residual)|
|Size of the data||2MB|
Table 14: Final summary of the dataset constructed ready for experiments with the use of algorithms.
Description of preprocessed and prepared data
Different activities were performed on the dataset with the objective of making it suitable for the data mining algorithms and producing a representative model. A large number of instances and a large number of attributes were removed (Table 14).
Experimentation, in this study, represents the data mining step in the six-step hybrid KDP model, where five data mining algorithms (including the association algorithm) are applied to the dataset. It comprises association rule mining experiments, which extract association rules from attribute values of urinary fistula assessment, and predictive model building experiments, which build a model for predicting the outcome of urinary fistula surgical repair. The experiments with different classification algorithms are intended to build a predictive model of urinary fistula surgical repair outcome with relatively better sensitivity and specificity than the others.
All the experiments discussed in the subsequent sections are carried out on 15,546 instances and 11 attributes. The attribute set includes “previous repairs at other hospital”, “type of urinary fistula”, “VVF length”, “VVF width”, “bladder size”, “status of bladder neck”, “status of urethra”, “scarring”, “status of ureters”, “number of fistula” and “surgical outcome of urinary fistula repair”. The last attribute in the list is the class attribute, which is mandatory in developing predictive models. In order to build predictive models for urinary fistula surgical repair outcome, four different algorithms were used: J48, PART, Naïve Bayes, and multinomial logistic regression. In 10-fold cross validation, one of the test options in Weka, the dataset is split into 10 equal parts; each part is used once for testing while the remaining nine are used for training. The Explorer window is opened with the “Explorer” button of the Weka GUI chooser.
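The 10-fold splitting idea can be sketched in Python as below. This is a plain, unstratified split written for illustration. The function name and the seed are assumptions, and Weka's own fold assignment may differ.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train, test) index lists so that every instance is tested exactly once."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 15546 is the number of instances in the prepared dataset.
splits = list(k_fold_indices(15546, k=10))
fold_sizes = sorted({len(test) for _, test in splits})
```

Because 15,546 is not divisible by 10, the folds here hold either 1,554 or 1,555 instances.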
Experimentation with Apriori Algorithm to Discover Association Rules
The association rule mining algorithm Apriori is used to identify attribute values that co-occur with urinary fistula surgical repair outcome (Tables 15 and 16).
|CAR||If enabled, class association rules are mined instead of general association rules.||Boolean|
|numRules||The required number of rules||Numeric|
|metricType||Type of metric by which to sort rules such as confidence, lift, leverage, conviction.||Nominal|
|minMetric||Minimum metric score. Consider only rules with scores higher than the specified value. Minimum confidence by default is 0.9||Numeric|
|Delta||The delta by which the minimum support is decreased in each iteration (default: 0.05).||Numeric|
|lowerBoundMinSupport||Lower bound for minimum support (default: 0.1)||Numeric|
Table 15: Summary of Apriori Parameters.
|Minimum support||Minimum confidence|
Table 16: Number of rules (in each cell).
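Support and confidence, the two interestingness measures used for the rules, can be computed for a single class association rule as in the Python sketch below. It uses the common definitions (support as the share of all records matching both sides, confidence as the share of antecedent matches that also match the consequent). The five records are hypothetical and do not reproduce the study's figures.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    matches_antecedent = [
        r for r in records if all(r.get(k) == v for k, v in antecedent.items())
    ]
    matches_both = [
        r for r in matches_antecedent if all(r.get(k) == v for k, v in consequent.items())
    ]
    support = len(matches_both) / len(records)
    confidence = len(matches_both) / len(matches_antecedent) if matches_antecedent else 0.0
    return support, confidence


# Hypothetical records, for illustration only.
records = [
    {"number of fistula": "1", "surgery outcome": "cured"},
    {"number of fistula": "1", "surgery outcome": "cured"},
    {"number of fistula": "1", "surgery outcome": "stress"},
    {"number of fistula": "2", "surgery outcome": "cured"},
    {"number of fistula": "1", "surgery outcome": "cured"},
]

support, confidence = support_and_confidence(
    records, {"number of fistula": "1"}, {"surgery outcome": "cured"}
)
```

With class association rules enabled (the CAR parameter), the consequent is restricted to the class attribute, as in this example.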
Association rules by the number of fistula
The number of fistula is a characteristic identified by counting the fistulas that occurred at different sites of the birth canal and bladder. Of the 29 best rules obtained by eliminating redundant ones, only two have an antecedent that starts with the number of fistula (Table 17).
|1||number of fistula=1||surgery outcome=cured||79%||93.08%|
|2||number of fistula=1, status of ureters=Both inside||surgery outcome=cured||80%||88.30%|
Table 17: Association rules by the number of fistula.
Association rules by the number of previous repairs at other hospitals
The number of repairs at other hospitals is one of the predictors of the outcome of urinary fistula surgical repair. It indicates the number of repair attempts that have been made but have not enabled the victim to regain complete continence (Table 18).
|1||number of previous repairs at other hospitals=not applicable||surgery outcome=cured||78%||82.08%|
|2||number of previous repairs at other hospitals=not applicable, number of fistula=1||surgery outcome=cured||79%||76.69%|
|3||number of previous repairs at other hospitals=not applicable, number of fistula=1, status of ureters=Both inside||surgery outcome=cured||79%||73.06%|
|4||number of previous repairs at other hospitals=not applicable, status of ureters=Both inside||surgery outcome=cured||79%||77.69%|
Table 18: Association rules by the number of previous repairs at other hospitals.
Association rules by the status of ureters
It was discussed in the literature that obstruction of labour affects multiple organ systems, one of which is ureters. Obstruction of labour may affect only one ureter or both ureters (Table 19).
|1||status of ureters=Both inside||surgery outcome=cured||79%||94.38%|
Table 19: Association rules by the status of ureters.
Association rules by the status of urethra
Solbjorg Sjoveian, Siri Vangen, Denis Mukwege and Mathias Onsrud stated that published reports indicate the degree of involvement of the urethra (status of urethra) as one of the main prognostic factors for surgical outcome (Table 20).
|1||Status of urethra=intact||surgery outcome=cured||86%||69.53%|
|2||Status of urethra=intact, number of fistula=1||surgery outcome=cured||86%||65.49%|
|3||Status of urethra=intact, number of fistula=1, status of ureters=Both inside||surgery outcome=cured||87%||62.27%|
|4||Status of urethra=intact, status of ureters=Both inside||surgery outcome=cured||86%||65.35%|
Table 20: Association rules by the status of urethra.
Association rules by status of bladder neck
Status of bladder neck ranks, on a nominal scale, the degree of injury that the obstruction has caused to the bladder neck. Clinical assessment at the outpatient department or immediately before surgical repair reveals the status of the bladder neck (Table 21).
|1||status of bladder neck=intact||surgery outcome=cured||87%||61.88%|
|2||status of bladder neck=intact, number of fistula=1||surgery outcome=cured||88%||58.29%|
|3||status of bladder neck=intact, status of ureters=Both inside, number of fistula=1||surgery outcome=cured||88%||55.53%|
|4||status of bladder neck=intact, status of ureters=Both inside||surgery outcome=cured||88%||58.27%|
|5||status of bladder neck=intact, Status of urethra=intact||surgery outcome=cured||88%||58.71%|
|6||status of bladder neck=intact, Status of urethra=intact, number of fistula=1||surgery outcome=cured||88%||55.90%|
|7||status of bladder neck=intact, status of ureters=Both inside, Status of urethra=intact, number of fistula=1||surgery outcome=cured||89%||53.21%|
|8||status of bladder neck=intact, status of ureters=Both inside, Status of urethra=intact||surgery outcome=cured||89%||55.60%|
Table 21: Association rules by the status of bladder neck.
Association rules by scarring
Scarring refers to fibrosis or dead tissue around the fistula margins. If it exists, it may vary from minimal, when the fistula margins are soft and mobile, to extreme, when the fistula margins are rigid and fixed. For a fresh fistula, scarring will be none (Table 22).
|1||scaring=none, status of bladder neck=intact||surgery outcome=cured||92%||29.09%|
|2||scaring=none, status of bladder neck=intact, number of fistula=1||surgery outcome=cured||92%||27.88%|
|3||scaring=none, status of bladder neck=intact, status of ureters=Both inside||surgery outcome=cured||92%||28.07%|
|4||scaring=none, status of bladder neck=intact, Status of urethra=intact||surgery outcome=cured||92%||28.17%|
|5||scaring=none, status of bladder neck=intact, Status of urethra=intact||surgery outcome=cured||92%||27.14%|
|6||scaring=none, status of bladder neck=intact, status of ureters=Both inside, Status of urethra=intact||surgery outcome=cured||92%||27.19%|
|7||scaring=none, Status of urethra=intact||surgery outcome=cured||91%||31.23%|
|8||scaring=none, Status of urethra=intact, number of fistula=1||surgery outcome=cured||91%||29.98%|
|9||scaring=none, status of ureters=Both inside, Status of urethra=intact, number of fistula=1||surgery outcome=cured||91%||29.08%|
|10||scaring=none, status of ureters=Both inside, Status of urethra=intact||surgery outcome=cured||91%||30.18%|
Table 22: Association rules by the scarring around the fistula.
Experimentation for Predictive Model Building
Developing a predictive model on datasets with high class imbalance and multiple classes requires some way of countering the imbalance (Figure 2).
Experimentation with J48 Algorithm
J48 is Weka’s implementation of the C4.5 algorithm, which can work on multi-valued attributes. As observed in the data description, the attributes that affect the surgical repair outcome of urinary fistula are multi-valued. In addition to using the default parameter settings of the algorithm to build a predictive model with J48, an attempt was made to find a better classifier by varying its important parameters (Table 23).
|binarySplits||Whether to use binary splits on nominal attributes when building trees||Boolean|
|confidenceFactor||The confidence factor used for pruning (smaller values incur more pruning)||Numeric|
|minNumObj||The minimum number of instances per leaf||Numeric|
|subtreeRaising||Whether to consider the subtree raising operation||Boolean|
|Unpruned||Whether pruning is performed||Boolean|
Table 23: Summary of the J48 classifier parameters.
The binarySplits parameter is set to “False” by default. If this value is changed to “True”, the generated model is forced to be a binary decision tree rather than a general decision tree. The confidence factor sets a limit so that the algorithm does more or less pruning; its default value is 0.25. For the confidence factor to take effect, the unpruned parameter must be set to “False”. The subtree raising parameter is set to “True” by default, which allows a subtree to be raised to replace its parent node during pruning.
After building four predictive models by modifying the parameters of J48, it was observed that the performances of the models are not the same. Thus, as indicated in the methodology, these models are compared on the basis of their performance measures.
The first comparison is made between experiments 1, 2, and 3. The common feature of these experiments is that they all return pruned trees. The second and third experiments resulted in a predictive accuracy of 79.24% with a WROC area of 0.50, which shows that these experiments have very low sensitivity and specificity. The greatest sensitivity and specificity among these experiments is observed in experiment one, with a WROC area of 0.568 (Table 24).
Table 24: Experimentation with J48 by modifying its parameters before SMOTE.
The second comparison is between the unpruned model from the fourth experiment and the model from the first experiment. The model from the unpruned J48 scheme resulted in 75.87% accuracy and a WROC area of 0.665. This model is better in WROC area but lower in accuracy than the model from the first experiment. Because the unpruned J48 showed the best area under the ROC curve in the previous experiments, further experimentation is done using the unpruned J48 after SMOTE (Synthetic Minority Over-sampling Technique) is applied (Table 25).
Table 25: Experimentation with J48-U-M2 after successive SMOTEs.
Because sensitivity and specificity are more important than the overall accuracy of a classifier in clinical and medical fields, models are better compared based on WROC area. Another challenge with the use of SMOTE, however, is deciding where to set the threshold. Here, the researcher has taken 300% SMOTE as the threshold: although accuracy continues to decrease and WROC area continues to increase with further SMOTE, oversampling the minority classes beyond the third experiment would leave the previously majority classes under-represented (Figures 3 and 4).
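The idea behind SMOTE can be sketched as follows. This is a minimal illustration for numeric attributes only, reading "300%" as three synthetic instances per original minority instance; both points are assumptions, and the function `smote` and the sample points are hypothetical rather than the Weka filter used in the study.

```python
import random


def smote(minority, percentage=300, k=2, seed=0):
    """Create synthetic minority points by interpolating towards near neighbours."""
    rng = random.Random(seed)
    per_instance = percentage // 100  # 300% -> 3 synthetic points per original
    synthetic = []
    for x in minority:
        # k nearest neighbours of x within the minority class (squared Euclidean distance)
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbours = others[:k]
        for _ in range(per_instance):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


if __name__ == "__main__":
    minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (2.5, 2.5)]  # hypothetical points
    new_points = smote(minority, percentage=300)
    print(len(minority), "original ->", len(minority) + len(new_points), "after 300% SMOTE")
```

Because every synthetic point lies between existing minority points, repeated oversampling enlarges the minority class without adding new information, which is one reason a stopping threshold has to be chosen.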
Experimentation with PART Algorithm
The PART algorithm extracts rules, and for this reason it is categorized under classification by rule induction. The rules are put together to give a complete set of rules. PART has almost the same set of parameters as the J48 algorithm, and these can be adjusted to build a better model from a dataset (Table 26).
|BinarySplits||Whether to use binary splits on nominal attributes when building the partial trees||Boolean|
|ConfidenceFactor||The confidence factor used for pruning (smaller values incur more pruning)||Numeric|
|MinNumObj||The minimum number of instances per rule.||Numeric|
|ReducedErrorPruning||Whether reduced-error pruning is used instead of C4.5 pruning.||Boolean|
|Unpruned||Whether pruning is performed||Boolean|
Table 26: Summary of the PART rule learner parameters.
The second and third experiments were done by decreasing the confidence factor to 0.1 and 0.05, respectively. Decreasing the confidence factor enforces more pruning. The fourth experiment shows the results of setting the unpruned parameter to “True” while keeping the default values of the other parameters. The last experiment was done by applying reduced-error pruning, i.e. setting the value of this parameter to “True”. Performance measures such as accuracy, WROC and the number of rules are better in the third experiment than in the other experiments. Therefore, the model from the third experiment, i.e. PART-M2-C0.05-Q1, with an accuracy of 78.66% and a WROC of 0.728, is better than the others.
Schemes discussed in Table 27 are experiments performed before applying SMOTE. An additional comparison of the performance measures of the classifier from the best scheme after SMOTE has been applied shows a continuous decrease in accuracy and a continuous increase in area under the ROC curve. The results of PART-M2-C0.05-Q1 after successive SMOTEs are shown in Table 28.
|5||PART-R -M 2-N3-Q1||78.37 %||78.4%||66%||0.721|
Table 27: Experimentation with PART rule learner by modifying its parameters.
Table 28: Experimentation with PART-M2-C0.05-Q1 after successive SMOTEs.
Experimentation with Naïve Bayes algorithm
Bayesian methods are based on probability theory. The Naïve Bayes algorithm assumes that the attributes are independent of one another given the class. The class of a new instance is then computed by multiplying the probabilities of the values the instance takes under each attribute (Tables 29-31).
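The multiplication of per-attribute probabilities described for Naïve Bayes can be sketched in a few lines. The records, attribute names and the use of Laplace smoothing below are assumptions made for illustration; they are not taken from the hospital dataset or from the study's Weka configuration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute, class) -> Counter of values
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            value_counts[(attr, label)][value] += 1
    return class_counts, value_counts


def predict_proba(instance, class_counts, value_counts, domains):
    """Multiply the prior by one conditional probability per attribute, then normalise."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for attr, value in instance.items():
            # Laplace smoothing avoids zero probabilities for unseen values
            seen = value_counts[(attr, label)][value]
            score *= (seen + 1) / (count + len(domains[attr]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


if __name__ == "__main__":
    # Hypothetical toy records, not real patient data.
    rows = [
        {"Scarring": "None", "Urethra": "Intact"},
        {"Scarring": "None", "Urethra": "Intact"},
        {"Scarring": "Mild", "Urethra": "Intact"},
        {"Scarring": "Severe", "Urethra": "Damaged"},
        {"Scarring": "Mild", "Urethra": "Damaged"},
    ]
    labels = ["Cured", "Cured", "Cured", "Failed", "Failed"]
    domains = {"Scarring": {"None", "Mild", "Severe"}, "Urethra": {"Intact", "Damaged"}}
    model = train_naive_bayes(rows, labels)
    print(predict_proba({"Scarring": "None", "Urethra": "Intact"}, *model, domains))
```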
|Display Model In Old Format||Use old format for model output. The old format is better when there are many class values. The new format is better when there are fewer classes and many attributes.||Boolean|
Table 29: Summary of the Naïve Bayes classifier parameter.
Table 30: Experimentation with Naïve Bayes classifier by modifying its parameter.
Table 31: Experimentation with Naïve Bayes-O after successive SMOTEs.
The most important parameter in relation to this study is Display Model In Old Format. However, there are also other parameters that can be adjusted according to the needs of the data used in different research areas. Table 29 shows the description of the parameter and the type of values it takes. The default value of this parameter is “False”. The researcher has set this value to “True”, as displaying the model in the old format is recommended for outputting the classifier’s result when the class has many values.
Experimentation with logistic regression
In traditional statistics, logistic regression is applicable only in cases where the outcome attribute is binary. In Weka, logistic regression can perform learning on a dataset with multiple outcome classes. As urinary fistula surgical repair can result in more than two outcome classes, experiments were done with multinomial logistic regression. When there is much collinearity among the attributes of a dataset, a ridge estimator is used to limit the range of values that the coefficients of the regression function can assume.
The experiments shown in Table 32 were performed to develop a model with higher performance measures by changing the ridge parameter from 10⁻⁸ to values as small as 10⁻¹⁰ and as large as 10⁻⁴. The default value of the ridge parameter in logistic regression is 10⁻⁸. When there is much collinearity, a very small ridge value still allows the coefficients for the values of each attribute to be estimated. All the models from logistic regression have shown 79.4% accuracy and an area under the WROC curve of 0.762. Since the experiments do not differ in performance, the default scheme (Logistic-R1.0E-8-M-1) is selected.
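How a ridge value limits the size of a regression coefficient can be shown with a deliberately small sketch: a binary logistic model with a single coefficient, fitted by gradient descent on the negative log-likelihood plus ridge times the squared coefficient. This is an illustrative simplification, not the multinomial implementation in Weka; the function name, the data and the step settings are all hypothetical.

```python
import math


def fit_logistic_1d(xs, ys, ridge=1e-8, lr=0.1, steps=2000):
    """Gradient descent on negative log-likelihood + ridge * w**2 for one coefficient."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the negative log-likelihood: sum of (predicted - actual) * x
        grad = sum((1.0 / (1.0 + math.exp(-w * x)) - y) * x for x, y in zip(xs, ys))
        grad += 2.0 * ridge * w  # derivative of the ridge penalty
        w -= lr * grad
    return w


if __name__ == "__main__":
    # Hypothetical, perfectly separable data: without a penalty the coefficient keeps growing.
    xs = [-2.0, -1.0, 1.0, 2.0]
    ys = [0, 0, 1, 1]
    for ridge in (1e-8, 1e-4, 1.0):
        print(f"ridge={ridge:g}: coefficient={fit_logistic_1d(xs, ys, ridge):.3f}")
```

With a very small ridge the fit is almost unpenalized, whereas a larger ridge holds the coefficient closer to zero.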
Table 32: Experimentation with logistic regression by modifying its ridge parameter.
As with the effect of successive SMOTE observed in Naïve Bayes-O, a decrease in the performance of the logistic regression model is observed when SMOTE is increased successively from 100% to 500%. After 300% SMOTE, the model from Logistic-R1.0E-8-M-1 has an accuracy of 76.8% and an area under the WROC curve of 0.752. Comparison of the performance measures of the models before and after SMOTE shows that the models before SMOTE are better in both predictive accuracy and area under the WROC curve (Table 33).
Table 33: Experimentation with Logistic-R1.0E-8-M-1 after successive SMOTEs.
Findings from the classification algorithms
The researcher experimented with four algorithms, namely J48, PART, Naïve Bayes, and logistic regression, with the purpose of developing a model for urinary fistula surgical repair outcome. Under each algorithm, multiple schemes were tested for their ability to predict outcomes with better sensitivity and specificity, as expressed by WROC. This measure was selected as the basis for comparing the performance of schemes because accuracy alone is not a good criterion for selecting models in medical areas. The last activity is to compare the best scheme from each algorithm with the best schemes from the other algorithms.
At first glance, Table 34 suggests that logistic regression is better than the others in area under the WROC curve. Closer investigation of the models based on the area under the ROC curve for each outcome class, as shown in Table 35, reveals that logistic regression is relatively insensitive to the “residual” outcome of urinary fistula repair (ROCResidual=0.669). The same drawback is observed in Naïve Bayes-O (ROCResidual=0.677). Although PART-M2-C0.05-Q1 gives up considerable ROC area for failed outcomes compared to the logistic regression and Naïve Bayes models, PART-M2-C0.05-Q1 with no SMOTE is more sensitive to the residual outcome than the models from logistic regression and Naïve Bayes. An additional comparison of each outcome’s ROC area with J48-U-M2 after 300% SMOTE shows that PART-M2-C0.05-Q1 with no SMOTE is better in all the ROC areas except the ROC area for the residual outcome (Table 35). For these reasons, it can be inferred that the PART-M2-C0.05-Q1 scheme after 300% SMOTE is relatively better than the models from the other schemes (Figure 5).
Table 34: Measures of performance of models from best schemes of the different algorithms based on area under the WROC curve.
Table 35: Area under the ROC curve for each outcome in the models which have greater weighted area under the ROC curve (WROC).
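The relation between the per-outcome ROC areas and the weighted area (WROC) can be sketched as below, assuming the weighted value is the average of the one-vs-rest ROC areas weighted by the number of instances in each class. The helper names and all numbers in the example are hypothetical and do not reproduce Tables 34 or 35.

```python
def auc_one_vs_rest(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def weighted_auc(per_class_auc, class_counts):
    """Average the per-class AUCs, weighting each class by its number of instances."""
    total = sum(class_counts.values())
    return sum(per_class_auc[c] * class_counts[c] for c in per_class_auc) / total


if __name__ == "__main__":
    # Hypothetical per-class areas and class sizes.
    areas = {"Cured": 0.75, "Stress": 0.70, "Residual": 0.80, "Failed": 0.65}
    counts = {"Cured": 800, "Stress": 120, "Residual": 30, "Failed": 50}
    print(f"weighted area = {weighted_auc(areas, counts):.3f}")
```

Because the weights follow class size, a large class dominates the weighted value, which is why a model can have a high WROC and still be weak on a rare outcome such as “residual”.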
In classification or prediction tasks, the accuracy of the resulting model is measured either in terms of the percentage of instances correctly classified or in terms of the “error rate”. The classification error rate on a pre-classified test set is commonly used as an estimate of the expected error rate when classifying new records. To make the procedure valid, 10-fold cross-validation is used, so that the model is built and tested 10 times. The errors from each test are averaged to give the average error rate of the model. The classification error rate for the selected model is 23.8%, which means that, on average over the tests, the model assigned about 23.8% of the test instances to a class other than their actual class. Several reasons may account for the error rate of the models. First, algorithms differ in their capability, as observed from the comparisons of performance measures. Second, attributes in the preoperative, operative and postoperative course that are not included in the study might have influenced the outcome. In fact, a particular victim regains her continence not because of the clinical examination but because of the treatments and the surgical repair.
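The averaging of errors over 10 folds can be sketched as follows. For brevity the "model" is a stand-in that always predicts the majority class of the training part, and the folds are not stratified; both are simplifying assumptions, and the function name and labels are hypothetical.

```python
import random
from collections import Counter


def k_fold_error(labels, k=10, seed=1):
    """Average error rate of a majority-class predictor over k folds (needs len(labels) >= k)."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint test sets
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train = [labels[i] for i in indices if i not in held_out]
        majority = Counter(train).most_common(1)[0][0]  # stand-in "model"
        wrong = sum(1 for i in test_idx if labels[i] != majority)
        fold_errors.append(wrong / len(test_idx))
    return sum(fold_errors) / k, folds


if __name__ == "__main__":
    labels = ["Cured"] * 80 + ["Failed"] * 20  # hypothetical class distribution
    error, _ = k_fold_error(labels)
    print(f"average error rate over 10 folds: {error:.3f}")
```

Each instance is used for testing exactly once and for training nine times, so the averaged error is an estimate of how the model would behave on records it has not seen.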
Analysis of classification rules from PART-M2-C0.05-Q1
The PART rule learner with the specified scheme resulted in 262 rules. Listing all the rules here would be cumbersome; thus, the rules that are highly predictive are selected based on their success ratio and discussed as the findings of this study. The success ratio of a rule is given in parentheses at the end of the rule. The first number in the parentheses is the number of instances in the rule. If a rule is not pure (that is, not all of its instances are in the same class), the number of misclassified cases is also given after a slash (/). The greater the first number relative to the number after the slash, the greater the chance that the rule correctly predicts the class it indicates.
Classification rules predicting cure after surgical repair
The same interpretation applies to the classification rules that the researcher has selected and presented in the tables below. For example, rule number one in Table 36 shows that a new instance with (Status of Urethra=Intact AND Status of bladder neck=Intact AND Scarring=None AND Length=1) has a 93.82% chance of being cured after surgical repair and a 6.18% chance of not being cured. The second rule shows that if the length increases by one, keeping the other measures the same, the likelihood of being cured after surgical repair decreases to 93.35%.
|Rule No||“IF” Part||“Then” part||Success ratio||%|
|1||Status of Urethra = Intact AND Status of bladder neck = Intact AND Scarring = None AND Length = 1||Cures||(1746.0/115.0)||93.82|
|2||Status of Urethra = Intact AND Status of bladder neck = Intact AND Scarring = None AND Length = 2||Cures||(1656.0/118.0)||93.35|
|3||Status of Urethra = Intact AND Scarring = Mild AND Type of urinary fistula =Juxta-cervical AND Length = 2||Cures||(312.0/10.0)||96.89|
|4||Status of bladder neck = Intact AND Scarring = None AND Type of urinary fistula = Vault||Cures||(45.0/1.0)||97.83|
|5||Status of bladder neck = Partially Damaged AND Type of urinary fistula = Juxta-cervical AND Scarring = Mild||Cures||(40.0/3.0)||93.02|
|6||Status of bladder neck = Intact AND Bladder size = No information AND Scarring = Moderate AND No of Prev Repair Other Hospital = No Information||Cures||(35.0/2.0)||94.59|
|7||Status of bladder neck = Partially Damaged AND Bladder size = Fair AND Scarring = None||Cures||(18.0/1.0)||94.74|
Table 36: Classification rules predicting cure for a surgical repair.
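The percentages in the last column of Table 36 can be reproduced from the two numbers in the success ratio column. The formula used below, first number divided by the sum of both numbers, is inferred from the tabulated values rather than stated in the text, and the helper name `success_ratio` is hypothetical.

```python
def success_ratio(first: float, second: float) -> float:
    """Percentage as tabulated: first / (first + second) * 100, rounded to 2 decimals."""
    return round(100.0 * first / (first + second), 2)


# (first, second) pairs copied from the "Success ratio" column of Table 36
TABLE_36 = {1: (1746.0, 115.0), 2: (1656.0, 118.0), 3: (312.0, 10.0), 4: (45.0, 1.0)}

if __name__ == "__main__":
    for rule_no, (first, second) in TABLE_36.items():
        print(f"rule {rule_no}: {success_ratio(first, second)}%")
```

Weka's rule output is usually read as (instances covered/instances misclassified), under which the rate would be (first − second)/first; the sketch follows the arithmetic that matches the table.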
Classification rules for predicting stress incontinence after surgical repair
Each rule in Table 37 should be taken independently, and no relationship should be inferred among these rules. A rule can be applied when a new instance takes the attribute values indicated by that rule. All the rules shown in the table cover a small number of instances in the dataset; however, stress incontinence is observed in a large proportion of the instances to which the rules apply.
|Rule No||“IF” Part||“Then” part||Success ratio||%|
|1||Type of urinary fistula = Circumferential AND Status of Ureters = Both Inside AND Length = >5 AND Status of Urethra = Partial Damage|
|2||Length = 2 AND Type of urinary fistula = Juxta-urethral AND Scarring = Moderate AND Width = 2 AND No of Prev Repair Other Hospital = Not applicable AND Status of bladder neck = Complete Destruction|
|3||Status of bladder neck = Partially Damaged AND Type of urinary fistula = Circumferential AND Status of Ureters = Both Inside AND Length = >5 AND Status of Urethra = Partial Damage|
|4||Length = 3 AND Status of Urethra = Intact AND Width = 4||Stress||(16.0/2.0)||88.89|
Table 37: Classification rules for predicting stress incontinence after a surgical repair.
Classification rules for predicting failure after a surgical repair
Each rule in Tables 38 and 39 should be taken independently, and no relationship should be inferred among these rules.
|Rule No||“IF” Part||“Then” part||Success ratio||%|
|1||Type of urinary fistula = Combined AND Status of bladder neck = Partially Damaged AND Width = 3 AND Status of Urethra = Complete Destruction||Residual||(7.0/1.0)||87.50|
|2||Type of urinary fistula = Combined AND Status of bladder neck = Partially Damaged AND Number of fistula = 2 AND Status of Urethra = Partial Damage AND Length = 4||Residual||(8.0/1.0)||88.89|
Table 38: Classification rules for predicting residual incontinence after a surgical repair.
|Rule No||“IF” Part||“Then” part||Success ratio||%|
|1||Type of urinary fistula = Absent urethra AND Bladder size = Small||Fails||(12.0/2.0)||85.71|
|2||Type of urinary fistula = Combined AND Status of bladder neck = Partially Damaged AND Bladder size = Small AND Length = 5.0 AND Status of Urethra = Partial Damage|
Table 39: Classification rules for predicting failure after a surgical repair.
The instances in the dataset include the victims’ identifying information, health information and all other services provided by the hospital. Beyond the explicit importance and use of this information in the therapeutic process, research such as this thesis also makes use of it. However, the use of this medical information for research and other purposes raises ethical issues such as patients’ privacy and confidentiality. This research is intended as a professional contribution to assist obstetric fistula treatment and does not attempt to harm anybody in any way. Identifying information was removed from the dataset to protect the privacy and confidentiality of the victims treated in the center and of those now on treatment. Ethical clearance was obtained from the research and ethics committee of the School of Public Health of Addis Ababa University to carry out the study and analyze the dataset.
Prediction of the outcomes of urinary fistula surgical repair is of paramount importance both for surgical decision making and for the special post-operative care that particular victims may require. Browning A has indicated the purpose of predicting which victims are more likely to suffer post-repair complications because of residual outcome. According to him, identifying these victims makes it possible to tailor surgical techniques to try to decrease the complication rate and to have the surgery done by a more experienced fistula surgeon. The results from predictive models could also be used in post-operative consultations with a victim who has undergone repair surgery.
Association rules were extracted from the clean dataset using the Apriori algorithm, which showed attribute values that frequently co-occur with specific classes. All of the rules showed that less severe injury co-occurs more with the “cured” outcome than with any other outcome. Conversely, stress, residual, and failed surgical outcomes may occur in cases of more severe injury. Moreover, the addition of an attribute value decreases the coverage of rules indicating the cured surgical outcome, which means that an instance with an additional injury has a lower chance of cure than a victim with only one injury of the same type.
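Support and confidence, the two interestingness measures used for the association rules, can be computed as in the sketch below. The transactions are hypothetical "attribute=value" sets invented for illustration, not records from the hospital dataset, and the helper names are assumptions.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


if __name__ == "__main__":
    # Hypothetical records encoded as sets of attribute=value items.
    records = [
        {"Scarring=None", "Urethra=Intact", "Outcome=Cured"},
        {"Scarring=None", "Urethra=Intact", "Outcome=Cured"},
        {"Scarring=None", "Urethra=Damaged", "Outcome=Cured"},
        {"Scarring=Severe", "Urethra=Damaged", "Outcome=Failed"},
    ]
    print(support(records, {"Scarring=None", "Outcome=Cured"}))
    print(support(records, {"Scarring=None", "Urethra=Intact", "Outcome=Cured"}))
    print(confidence(records, {"Scarring=None"}, {"Outcome=Cured"}))
```

Adding an item to a rule can only keep or reduce the number of records that match it, which is the coverage effect noted for rules with an additional attribute value.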
The study has shown the necessity of experimenting with as many classification algorithms as possible before picking a single algorithm for prediction. On the way to the major objective, i.e. developing a predictive model, the performances of models from the best schemes of the J48, PART, and Naïve Bayes algorithms were compared with the performance of the best scheme from logistic regression. The comparison revealed that PART-M2-C0.05-Q1 after 300% SMOTE predicts better than logistic regression in ROCResidual. The model that the PART-M2-C0.05-Q1 scheme learns after 300% SMOTE is better in area under the ROC curve for the residual outcome than Naïve Bayes and logistic regression, and better than J48 in the ROC area for the other outcome classes.
PART-M2-C0.05-Q1 after 300% SMOTE, which resulted in 76.81% accuracy and a weighted area under the ROC curve of 0.742, was used to build the predictive model. At first glance these performance measures seem very low compared to the very high accuracy, sensitivity and specificity needed in surgical decision making. However, given that the prediction disregards the preoperative care provided, the intra-operative complexities that may occur during surgery, and the post-operative care and complexities, this level of accuracy and ROC area is encouraging.
To sum up, consultation with domain experts on the rules and models that remained after objective evaluation also confirms that an increase in the severity of the fistula diminishes the chance of being cured after surgical repair. Less severity, on the other hand, favors “cure” as an outcome. This shows that the findings of this research agree with existing knowledge on urinary fistula surgical repair outcome.
Before the data were used for predictive model building and association rule mining, a number of preprocessing and preparation steps were carried out. The activities that resulted in clean data are cleaning for errors and handling missing values. As indicated in the summary statistics during data preparation, the dataset has some erroneous entries that could be prevented by predefining the values a particular attribute can take. In addition, because of the holistic treatment that the hospital provides to victims of obstetric fistula and birth-tract injuries, the database was made to include all the variables for all the different types of injuries. Thus, variables that apply to one injury type are not applicable to another. These attribute values make it difficult to extract meaningful knowledge from the database. One solution to this problem could be to create different forms and tables to record victims based on the type of surgical repair performed. Important benefits of this solution are easier report generation with simple statistical tools and less effort spent filling in non-applicable attribute values when the case is only of a specific type.
The predictive model can assist urinary fistula surgical repair outcome prediction with the given levels of accuracy and weighted area under the ROC curve. The model can also be used to provide post-operative advice and consultation to a victim who has already undergone surgical repair. With the development of a small knowledge base system, the usability of the model can extend to the time of actual surgery by making the system available on small hand-held portable computers. But before moving to the construction of a knowledge base system (KBS) that contains knowledge of the domain area as depicted by the model obtained, the researcher would like to give some recommendations about the attribute values captured. First, the entry of errors into columns of the database should be prevented by predefining the valid values each attribute can take. Second, to eliminate values that are inapplicable to a particular case, it would be better to capture data based on the type of surgical intervention needed by the victims who come for treatment.
Finally, it has been observed that classification algorithms differ in the performance of the models they build. With the short period of time given to this research, it was not possible to experiment with more than four algorithms for predictive model building. Therefore, to come up with a model that may perform better than the model used here to extract predictive rules, classification algorithms such as support vector machines (SVM), multi-layer perceptrons (MLP) and many others can be tried. This will help to compare the performances of those models with the model from this research, and to move on to the level of deployment.
The authors declared that they have no competing interests.
Minale Tefera wrote the proposal, participated in data collection, analyzed the data and drafted the paper. Mr. Getachew Jemaneh and Dr. Mitike Mola approved the proposal with some revisions, participated in data collection and analysis, commented on the analysis and improved the first draft. All the three authors and Feleke Doyore revised subsequent drafts of the paper. Feleke Doyore prepared this manuscript for publication.
Our earnest gratitude goes to the Health and Medical Sciences College, Addis Ababa University, for proper review and approval of this paper. We would also like to extend our gratitude to the data collectors for their patience in gathering this meaningful information. Our special thanks are also extended to Addis Ababa University for financial support for this study.