Developing a clustering-based empirical Bayes analysis method for hotspot identification

PI: Yinhai Wang (UW), yinhai@uw.edu
Dates: 12/16/2015 – 12/15/2016
Status: Completed
Project Information
Final Technical Report

The identification of sites with promise, also known as crash hotspots or hazardous locations, is the first step in the overall safety management process. One widely applied approach to this task is the empirical Bayes (EB) method, which is described and recommended in the Highway Safety Manual (2010) for roadway safety management. The EB method can correct for regression-to-the-mean bias and refine the predicted mean of an entity.
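To make the EB adjustment concrete, the following minimal sketch uses the weighted-average form commonly associated with the Highway Safety Manual: the estimate blends the crash count predicted by a safety performance function with the count observed at the site, using a weight derived from the overdispersion parameter. The function name, argument names, and the example values are illustrative assumptions and are not taken from the project report.

```python
def eb_expected_crashes(observed, predicted, overdispersion):
    """Empirical Bayes estimate of expected crashes at one site.

    observed       -- crashes counted at the site over the study period
    predicted      -- crashes predicted by a safety performance function
                      for the same period (assumed to be given)
    overdispersion -- overdispersion parameter k of the negative binomial
                      model behind the prediction (assumed to be given)
    """
    if observed < 0 or predicted < 0 or overdispersion < 0:
        raise ValueError("inputs must be non-negative")
    # A larger k * predicted means a less reliable prediction,
    # so more weight shifts to the observed count.
    weight = 1.0 / (1.0 + overdispersion * predicted)
    return weight * predicted + (1.0 - weight) * observed


# Hypothetical site: 9 observed crashes, 4 predicted, k = 0.25.
estimate = eb_expected_crashes(observed=9, predicted=4.0, overdispersion=0.25)
print(estimate)  # 6.5
```

In this hypothetical case the weight is 0.5, so the estimate of 6.5 lies halfway between the prediction and the observation, which is how the method pulls an unusually high count back toward the reference-group mean.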

Although the EB method has several advantages, a few issues with the methodology may limit its widespread application. First, the accuracy of the EB method depends largely on the selection of the reference population, or grouping of similar sites, and the definition of “similar” is a somewhat open question. When estimating the safety performance function, crash data are often collected from different geographic locations to ensure a sample size adequate for valid statistical estimation. As a result, the aggregated crash data often contain heterogeneity. When conducting an EB analysis, the reference group must be similar to the target group in terms of geometric design, traffic volumes, etc. Manually identifying such a reference group is a rather time-consuming task for transportation safety analysts, whose time could be better spent elsewhere. Second, the EB procedure is relatively complicated and requires a transportation safety analyst with considerable training and experience to implement it for a safety evaluation. Thus, the training investment required to prepare analysts to undertake EB evaluations can be a barrier. As a result, quick-and-dirty conventional evaluation methods may be applied as a matter of convenience, and these may produce questionable results.

Given that the specification of correct reference groups is critical for the accuracy of the EB methodology, the primary objective of this research is to examine different clustering algorithms and develop a complete procedure to automatically identify appropriate reference groups for the EB analysis. We will develop an analysis tool to automate implementation of the clustering-based EB method. The purpose of this JavaScript-based tool is to provide a user-friendly platform that transportation safety analysts can easily operate to identify sites with potential for safety treatments.
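As a rough illustration of how clustering could supply reference groups automatically, the sketch below groups sites by their features with a plain k-means loop and then treats the other members of a site's cluster as its reference group. This summary does not say which clustering algorithms the project examined, so k-means, the feature choice (scaled traffic volume and lane count), the function names, and the sample data are all assumptions made for illustration. Python is used here for brevity, although the project tool itself is described as JavaScript-based.

```python
from math import dist


def kmeans_labels(points, k, max_iter=100):
    """Assign each point to one of k clusters using Lloyd's algorithm.

    Initial centroids are taken at evenly spaced positions in the input
    so that the illustration is deterministic (an assumption; real
    analyses often use randomized or k-means++ initialization).
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    centroids = [points[i * n // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def reference_group(labels, target):
    """Indices of the other sites that share the target site's cluster."""
    return [i for i, lab in enumerate(labels) if lab == labels[target] and i != target]


# Hypothetical, pre-scaled site features: (traffic volume, lane count).
sites = [(2.0, 1.0), (2.5, 1.2), (1.8, 0.9), (9.0, 4.0), (9.5, 4.2), (8.7, 3.8)]
labels = kmeans_labels(sites, k=2)
print(labels)                      # [0, 0, 0, 1, 1, 1]
print(reference_group(labels, 4))  # [3, 5]
```

In practice the features would need to be standardized before clustering, and the number of clusters would have to be chosen and validated. Both steps are left out of this sketch.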