Energy and Environment
Predictive Enforcement of Hazardous Waste Regulations
The Environmental Protection Agency (EPA) regularly inspects facilities that handle hazardous materials. Of the 1,500 inspections the EPA conducts annually, approximately 30-40% find a violation. The process for prioritizing inspections varies between regions: some inspections are prescribed (for example, large quantity generators are supposed to be inspected at least once every five years), some are chosen based on national priorities (a certain chemical or industry might be of interest), and the remainder are chosen by regional managers, who rely on their domain knowledge to select facilities. The EPA wants to adopt a data-driven approach to targeting inspections, using historical inspection data to predict the risk of severe violations.
Using EPA data on reporting, monitoring, and enforcement, DSaPP developed and evaluated predictive models to identify likely violators. We evaluated our best models with temporal cross-validation and found that, measured by precision in the top 5% of predictions, our model performed nearly twice as well as the baseline. As a result, the EPA will be able to rank potential violators, better allocate inspection resources, and maximize the impact of each investigation to keep America’s air and water clean. This improvement is projected to correspond to an additional reduction of 620,000 tons of pollution every year. DSaPP’s predictive model will also serve as a proof of concept for how the EPA can use predictive analytics in the future.
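For readers unfamiliar with the evaluation terms used above, the following is a minimal sketch of precision in the top fraction of ranked predictions and of a temporal train/test split. It is an illustration only, not DSaPP's actual pipeline: the function names, the `year` field, and the record layout are all hypothetical assumptions.

```python
import math


def precision_at_top_fraction(scores, labels, fraction=0.05):
    """Share of true violations among the highest-scored facilities.

    scores:   predicted risk per facility (higher means riskier)
    labels:   1 if an inspection found a violation, else 0
    fraction: portion of the ranked list to inspect (0.05 = top 5%)
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one facility is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Always evaluate at least one facility, even for tiny lists.
    k = max(1, math.ceil(len(scores) * fraction))
    # Rank facilities from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def temporal_splits(records, test_years):
    """Yield (year, train, test) so that training data always precedes test data.

    records: list of dicts with a hypothetical 'year' key (assumed layout).
    Years with no earlier training data or no test data are skipped.
    """
    for year in test_years:
        train = [r for r in records if r["year"] < year]
        test = [r for r in records if r["year"] == year]
        if train and test:
            yield year, train, test
```

Splitting by time rather than at random mirrors how such a model would be used in practice: it is trained only on past inspections and scored on later ones, so the evaluation does not leak information from the future. Comparing `precision_at_top_fraction` for the model's scores against the same metric for a baseline ranking gives the kind of comparison reported above.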