Early Detection of Properties at Risk of Blight Using Spatiotemporal Data

City:

Cincinnati, Ohio, United States of America

Organization:

The Department of Buildings & Inspections, City of Cincinnati

Center for Data Science and Public Policy, University of Chicago

Project Start Date:

2015

Project End Date:

Ongoing; the project continues to date

Reference:

Blancas Reyes, Eduardo, et al. (2016). “Early Detection of Properties at Risk of Blight Using Spatiotemporal Data.” Retrieved from Semantic Scholar: https://www.semanticscholar.org/paper/Early-detection-of-properties-at-risk-of-blight-Reyes-Helsby/ed39b2f58777466c20ea85b1948800f8e377fe90#paper-header

Problem:

In the City of Cincinnati, blight is currently addressed through reactive building inspections and code enforcement. Inspectors respond to citizen complaints and then work with property owners to bring their buildings into compliance with regulations. Under this process, for homes that will eventually become vacant, inspectors receive a complaint in only about 25% of cases. A large fraction of at-risk homes is therefore unknown to the building inspectors who are trying to reduce neighborhood blight.

Technical Solution:

The solution is a supervised machine learning model. The team used geographic data from the city and historical data on home inspections to train a model that provides proactive suggestions for property inspections aimed at catching blight early. The model generates a ranked list of properties, which is used to determine which properties should be inspected.

They used Python’s scikit-learn package to train the model with the following classifiers: AdaBoost, Random Forest, Extra Trees, Gradient Boosting, Logistic Regression, and Support Vector Machines.
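
As an illustration only, the following sketch shows what training and ranking with a few of these scikit-learn classifiers could look like; the feature matrix, labels, and model settings are hypothetical placeholders, not the project’s actual pipeline.

    # Hypothetical sketch: train a few of the classifiers named above and rank
    # parcels by predicted risk of a near-term code violation.
    # The feature matrix X and label vector y are placeholders, not project data.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20))          # placeholder parcel-level features
    y = rng.integers(0, 2, size=1000)        # placeholder violation labels

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    models = {
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "gradient_boosting": GradientBoostingClassifier(random_state=0),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }

    for name, model in models.items():
        model.fit(X_train, y_train)

    # Rank held-out parcels by predicted probability of a violation (one model shown).
    risk = models["random_forest"].predict_proba(X_test)[:, 1]
    ranked_parcels = np.argsort(-risk)       # highest-risk parcels first
    print(ranked_parcels[:10])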

They present a predictive approach for prioritizing city inspections as a tool to identify and prevent urban blight in the city of Cincinnati.

The model is built on parcel-level and spatiotemporal features and predicts whether a home is at risk of a building code violation in the near future.

Goal: to find properties at risk of code violations as efficiently as possible.

Datasets Used:

The following are the datasets used to train their predictive model.

The datasets span different time intervals, but coverage is densest for 2012-2015, so the analysis concentrates on that period (a sketch of combining these sources appears after the list).

·       Inspections dataset from the Department of Buildings & Inspections

·       Cincinnati Area Geographic Information System provided by Hamilton County

·       Property Taxes dataset from Hamilton County

·       Census data

·       311 Service Requests

·       Building permits

·       Crime incidents dataset recorded by Cincinnati Police Department

·       Fire Department dataset

·       Property Sales dataset
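
As an illustration only, the sketch below joins two such parcel-level sources on a shared parcel identifier; the file names and columns (parcel_id, violation, etc.) are assumptions, not the project’s actual schema.

    # Hypothetical sketch: combine parcel-level datasets on a shared parcel identifier.
    # File names and column names are assumptions, not the project's actual schema.
    import pandas as pd

    inspections = pd.read_csv("inspections.csv")    # e.g. parcel_id, inspection_date, violation
    taxes = pd.read_csv("property_taxes.csv")       # e.g. parcel_id, assessed_value, delinquent

    # Left-join tax attributes onto the inspection history by parcel.
    parcels = inspections.merge(taxes, on="parcel_id", how="left")

    # Example parcel-level feature: count of prior violations per parcel.
    prior_violations = (
        parcels.groupby("parcel_id")["violation"].sum().rename("prior_violations")
    )
    print(prior_violations.head())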

Outcome:

Pre-model: around 6,000 inspections take place in Cincinnati every year, covering roughly 4% of the city's properties. Only 60% of inspected properties are found to have some type of building code violation, and, as noted above, only about 25% of homes that eventually become vacant ever generate a complaint.

 

Using this model, the city can increase the precision of its building inspections from 60% to 70%. The model can also be used by other city agencies. For example, several community development corporations are active in Cincinnati, purchasing and renovating blighted properties to increase the attractiveness of their neighborhoods.
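
The 60% and 70% figures are precision values: the share of inspected (or top-ranked) properties that actually have a violation. Below is a minimal sketch of computing precision among the top-k ranked properties, with placeholder labels and risk scores standing in for real inspection outcomes.

    # Hypothetical sketch: precision among the k highest-risk parcels,
    # i.e. the fraction of recommended inspections that find a violation.
    import numpy as np

    def precision_at_k(y_true, risk_scores, k):
        top_k = np.argsort(-risk_scores)[:k]    # indices of the k highest-risk parcels
        return y_true[top_k].mean()             # share of those with an actual violation

    # Placeholder labels and scores stand in for real inspection outcomes.
    rng = np.random.default_rng(1)
    y_true = rng.integers(0, 2, size=500)
    risk_scores = rng.random(500)
    print(precision_at_k(y_true, risk_scores, k=50))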

Issues that arose:

·      There is a need to continuously train and evaluate models as new data comes into the system.

 

·      One important potential improvement is geocoding more addresses, especially for the Sales datasets. Here, they lost a considerable amount of data, which could be a source of bias in the current model.

 

·      They could improve the feature selection process. Features are currently selected based only on their spatiotemporal parameters. A potentially better approach would be to use a feature selection algorithm to identify non-informative features (see the sketch after this list).

 

·      Ethical considerations: since the labels they are using come from a biased inspection process (only 27% of all parcels in the city have ever been inspected), acting on the model without further evaluation could have unintended consequences and raise ethical issues.
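
As an illustration of the feature selection point above, the sketch below uses scikit-learn’s recursive feature elimination; the choice of selector and the placeholder data are assumptions, not the project’s actual method.

    # Hypothetical sketch: drop non-informative features with recursive feature
    # elimination (RFE). The specific selector is an illustrative choice only.
    import numpy as np
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)
    X = rng.normal(size=(1000, 20))          # placeholder parcel-level features
    y = rng.integers(0, 2, size=1000)        # placeholder violation labels

    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
    selector.fit(X, y)
    kept = np.where(selector.support_)[0]    # indices of retained features
    print(kept)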

Status:

Ongoing effort

Entered by:

September 28, 2019: Lorena Camargo, Lorena.camargo@mail.utoronto.ca



CEM1002,
Civil Engineering, University of Toronto
Contact: msf@eil.utoronto.ca