Predictive Analytics for Planning Inspections of Linear Sewer Infrastructure in Guelph, ON

City: Guelph, Ontario, Canada
Organization: School of Engineering, University of Guelph, Guelph, ON
Project Start Date: Not specified (possibly as early as 2012, after inspection dataset was completed)
Project End Date: 12 May 2014 (first published)
Reference: Harvey, R. R., & McBean, E. A. (2014). Comparing the utility of decision trees and support vector machines when planning inspections of linear sewer infrastructure. Journal of Hydroinformatics, 16(6), 1265–1279. https://doi.org/10.2166/hydro.2014.007

Harvey, R. (2015). An Introduction to Asset Management Tools for Municipal Water, Wastewater and Stormwater Systems. Credit Valley Conservation. https://cvc.ca/wp-content/uploads/2016/09/Appendix-E-Asset-Management-Tools-for-Municipal-Water-Systems.pdf
Problem: Individual pipes in a sanitary sewer system often remain in service beyond their design life. Aging pipes with unaddressed cracks, holes and fractures can leak untreated wastewater into the environment (exfiltration), posing contamination threats to the natural environment and the municipal water supply. Studies have shown that about 90% of the City of Guelph’s drinking water wells contain at least one sewage-derived contaminant. Groundwater infiltration into defective sanitary sewers can also overload the system during wet periods, increase wastewater treatment costs and increase the failure probability of adjacent infrastructure (e.g., roads). The collapse of a sanitary sewer can have severe consequences, including service disruptions, environmental contamination and higher costs for emergency repairs.

Municipalities can proactively mitigate failure risk by conducting condition assessments of sewer pipes, primarily through closed-circuit television (CCTV) visual inspection. However, this method is expensive and time-consuming, so inspections are usually limited to small segments of an entire system. There is therefore a need to predict which sanitary sewers are most likely to be in poor condition and should be prioritized for CCTV inspection. There is also a need to learn which pipe attributes are most indicative of a pipe's structural condition in Guelph.
Technical Solution: The study built and compared two models (a support vector machine and a decision tree classifier) that predict the structural condition of pipes in the sanitary sewer inspection records from 2008 to 2011, in order to determine which method is more accurate.

Data science methods used:

1. Stratified random sampling was used to partition the dataset into training, evaluation and test sets (70:10:20 ratio).
2. Transformation to binary classification – the dataset was transformed from a five-class to a two-class output format in order to address imbalanced data issues.
3. Decision trees and support vector machines were used to classify pipe condition from inspection data attributes (10 inputs).
4. K-fold cross-validation was used to train and test the algorithms.
5. A confusion matrix was used to evaluate the predictive performance of the two models.
6. A receiver operating characteristic (ROC) curve was used to visualize and compare the predictive performance of each method’s model.
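The partitioning in step 1 can be sketched as follows. This is a minimal illustration in plain Python, not the study's code: the function name `stratified_split`, the fixed seed, the rounding rule and the example labels are all assumptions made for the sketch.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=0):
    """Return (train, evaluation, test) index lists.

    Each class is shuffled and split separately, so every subset keeps
    roughly the same class proportions as the full dataset.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, evaluation, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_eval = round(fractions[1] * len(indices))
        train += indices[:n_train]
        evaluation += indices[n_train:n_train + n_eval]
        test += indices[n_train + n_eval:]  # remainder goes to the test set
    return train, evaluation, test


# Hypothetical labels: 80 "good" pipes and 20 "bad" pipes.
labels = ["good"] * 80 + ["bad"] * 20
train, evaluation, test = stratified_split(labels)
print(len(train), len(evaluation), len(test))
```

Splitting within each class, rather than over the whole dataset at once, is what keeps the rarer "bad" pipes represented in all three subsets.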
Datasets Used:
  • Dataset 1: Records of Sanitary Sewer Pipe Inspections (n=1,825), City of Guelph, 2008 to 2011
Outcome:
Pre Solution Performance:
Prior to the study, there were few publications on the validity of predictions made at the individual pipe level. Decision trees had also not previously been used to predict pipe condition from already available data.

Post Solution Performance:
Decision trees were found to be a simple and effective method for predicting sanitary sewer pipe condition from available inspection records, with an overall test accuracy of 76% in this study. The decision tree model outperformed the support vector machine model. The resulting decision tree also provided insight into how pipe parameters influence pipe deterioration: pipe age was the first determinant, followed by burial depth.
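Overall accuracy and a single ROC point (true positive rate against false positive rate) can both be read off a confusion matrix. The sketch below assumes "bad" is the positive class; the function names and the five example predictions are hypothetical and are not results from the study.

```python
def confusion_counts(actual, predicted, positive="bad"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(tp, fp, tn, fn):
    """Derive accuracy and one ROC point from the four counts.

    Assumes both classes occur at least once, so no denominator is zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tpr": tp / (tp + fn),  # share of 'bad' pipes correctly flagged
        "fpr": fp / (fp + tn),  # share of 'good' pipes wrongly flagged
    }


# Made-up labels, for illustration only.
actual = ["bad", "bad", "good", "good", "good"]
predicted = ["bad", "good", "good", "bad", "good"]
counts = confusion_counts(actual, predicted)
metrics = summarize(*counts)
print(counts, metrics)
```

Sweeping a classifier's decision threshold and plotting the resulting (fpr, tpr) pairs traces out the ROC curve used to compare the two models.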

The decision tree model gives the municipality an opportunity to learn from its existing inspection data which sanitary sewers are most likely to be in poor condition, allowing for better planning and a higher yield of ‘bad’ pipes found during future condition inspections.
In a 2015 Credit Valley Conservation publication by one of the study researchers, it was mentioned that a data analytic tool was being developed to aid municipalities with CCTV inspection and guide future efforts.
Issues that arose: Unbalanced data was the primary issue. The sewer inspection dataset was class imbalanced since there were few Class 4 pipes (five-class format), which would result in the number of pipes in poor condition being underestimated. This issue was mitigated by transforming the dataset into a binary classification (classes 1-3 are “good”, classes 4-5 are “bad”).
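The five-class to two-class transformation described above can be sketched in a few lines. The function name `to_binary` and the list of grades are hypothetical; only the mapping itself (classes 1-3 "good", classes 4-5 "bad") comes from the study description.

```python
from collections import Counter


def to_binary(grade):
    """Map a five-class structural grade (1-5) to a two-class label."""
    if grade not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected grade: {grade!r}")
    return "good" if grade <= 3 else "bad"


# Made-up grades, for illustration only.
grades = [1, 1, 2, 3, 3, 4, 5, 2, 1, 5]
five_class_counts = Counter(grades)
two_class_counts = Counter(to_binary(g) for g in grades)
print(five_class_counts)
print(two_class_counts)
```

Merging the sparse upper grades into one "bad" class gives the classifier more examples of the condition it most needs to detect.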
Status: Terminated. No indication that the study findings were actually implemented successfully.
Entered by: October 30, 2020: Ana Brankovan, ana.brankovan@mail.utoronto.ca


CEM1002,
Civil Engineering, University of Toronto
Contact: msf@eil.utoronto.ca