Vision Zero Analysis at the Regional Scale


Across the world, Vision Zero is a powerful and popular framework to conceptualize the safety of our cities' streets. It connects data to planning actions and recenters our conversations around designing cities for people. A critical task for cities considering any type of Vision Zero Plan is to identify which locations pose the highest risk of dangerous traffic collisions, so that they can be prioritized for improvements. To this end, a common strategy is to develop a High Injury Network (HIN), which identifies the streets with the largest concentration of collisions where a victim was killed or severely injured (KSI).

HINs are typically developed for a single city at a time. Consequently, our understanding of collision trends is often very focused, and we rarely have a chance to see what is going on across multiple jurisdictional boundaries. I have always been curious about what a regional HIN could show us, so, as part of recent projects, I have developed a series of Python-based tools to automate the development of robust, regional HINs. This article shares the details of a personal project: a regional HIN for the San Francisco Bay Area (Alameda, San Mateo, Santa Clara, and San Francisco Counties).

Breaking Down the Goals of a HIN

A HIN, at its core, is a prioritization analysis. Your goal as the HIN developer is to identify the highest priority locations along a network for further analysis and safety improvements. However, the HIN also serves as a communication tool: it shows that improving a small percentage of a city's streets can address the majority of KSI collisions occurring on the network.

For example, one of the first statistics presented on San Francisco's Vision Zero page is:

In San Francisco, more than 70 percent of severe and fatal traffic injuries occur on just 12 percent of city streets. 

This statistic focuses the problem definition of San Francisco's Vision Zero program.

So, some possible goals of a HIN could then be formulated as:

  • Identify the subset of the network where most collisions occur (>50%).
  • Limit the cumulative size of the network to a manageable subset. This subset should be realistically improvable over a short to intermediate time horizon (3-10 years). Many HINs identified as part of Vision Zero efforts cover less than 20% of the network, though this threshold is debatable.

Metrics Matter: Developing a Multimodal Index

Urban planners familiar with prioritization analysis and performance measures understand the importance of establishing good metrics to connect data to desired outcomes. Developing an effective HIN quickly at an appreciable scale requires metrics that can be combined into a single index.

The approach I chose for the San Francisco Bay Area was the following:

  • Collect 5 years (2011-2016) of collision data for San Francisco, San Mateo, Alameda, and Santa Clara Counties from TIMS. Filter out collisions on highways.
  • Weight collision points by severity.
  • Associate weighted collision densities with a study network using kernel density estimation (KDE).
  • Compute percentile scores for bicycle, pedestrian, and vehicle collisions on the associated density values.
  • Compute a final index score combining percentile ranks of the associated density values.
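The last two steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the tooling used for the Bay Area HIN: the function names are hypothetical, the density values are made up, and averaging the per-mode percentile ranks is an assumption, since the article does not state how the ranks are combined.

```python
from bisect import bisect_right


def percentile_ranks(values):
    """Return, for each value, the share of all values that are <= it."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def combined_index(density_by_mode):
    """Average the per-mode percentile ranks for each network segment.

    Averaging is an assumption made for this sketch.
    """
    ranks = [percentile_ranks(vals) for vals in density_by_mode.values()]
    return [sum(per_segment) / len(per_segment) for per_segment in zip(*ranks)]


# Made-up severity-weighted densities sampled on four segments, one list per mode.
densities = {
    "bicycle":    [0.0, 1.0, 2.0, 3.0],
    "pedestrian": [3.0, 1.0, 2.0, 0.0],
    "vehicle":    [0.0, 0.0, 4.0, 1.0],
}
index = combined_index(densities)
top_segment = index.index(max(index))  # position of the highest-scoring segment
```

Working with percentile ranks rather than raw densities keeps one mode with large absolute values (typically vehicle collisions) from dominating the combined score.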

An unconventional aspect of this analysis is the use of KDE as a basis for a collision index. Typically, collision rates or indices are derived by spatially joining collisions to a study network. This ultimately runs into two problems, and both originate from trying to force a point pattern onto a predefined study network.

  • Modifiable Linear Unit Problem (MLUP): This is essentially the Modifiable Areal Unit Problem (MAUP) applied to networks. If you associate collisions with predefined segments, you can run into aggregation bias, because it is hard to define a non-arbitrary network segmentation. A commonly accepted way to prepare a study network is to break it at intersections, but this typically gives downtown grids much shorter segments than rural roads. Even if you normalize by length or volume, the resulting crash rates still depend on how you aggregated collisions to each segment.
  • Segments, Not Intersections: Conventional collision analysis associates each collision with a single segment to avoid double counting. This helps with later aggregations, but it forces collisions at intersections to be divided amongst the intersection's approaches. Each collision is implicitly attributed to one approach, not to the intersection itself.
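A toy example makes the segmentation issue concrete. The sketch below uses made-up collision positions along a hypothetical 1 km corridor and computes length-normalized crash rates under two different segmentations; `crash_rates` is an illustrative helper, not part of any tool described here.

```python
def crash_rates(positions_m, breaks_m):
    """Collisions per km for each segment between consecutive breakpoints."""
    rates = []
    for start, end in zip(breaks_m, breaks_m[1:]):
        count = sum(start <= p < end for p in positions_m)
        rates.append(count / ((end - start) / 1000))
    return rates


# Made-up collision positions (metres from the corridor start), clustered near 500 m.
collisions_m = [100, 450, 480, 520, 550, 900]

# Segmentation A: one break at 500 m, which splits the cluster in half.
rates_a = crash_rates(collisions_m, [0, 500, 1000])

# Segmentation B: breaks at 400 m and 600 m, which isolates the cluster.
rates_b = crash_rates(collisions_m, [0, 400, 600, 1000])
```

With the same six collisions, segmentation A reports two segments with identical rates (6 per km each), while segmentation B reports a hot spot of roughly 20 per km between two quiet segments. Normalizing by length does not remove the dependence on where the breaks fall.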

KDE addresses these issues by using an estimation methodology specifically designed for point pattern analysis. KDE is powerful enough that it underlies some effective clustering algorithms, and many transportation studies use it to produce collision heat maps, which provide a quick visual understanding of collision clusters. However, a heat map is often where most applications of KDE stop, and the underlying statistics are rarely incorporated into a transportation analysis directly. In essence, KDE offers a compromise: it identifies high injury segments and intersections without having to examine them separately.
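For reference, a generic severity-weighted planar kernel density estimate can be written as below. The notation is introduced here for illustration only: s is a location, s_i and w_i are the location and severity weight of collision i, h is the bandwidth, and K is the kernel function. The article does not state which kernel, bandwidth, or weights were used.

```latex
\hat{\lambda}(s) = \sum_{i=1}^{n} \frac{w_i}{h^{2}} \, K\!\left( \frac{\lVert s - s_i \rVert}{h} \right)
```

Because every collision contributes to every nearby location, a collision at an intersection raises the estimated density on all of the approaches at once.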

Using KDE to generate a HIN requires sampling the resulting heat map values on small network segments. This has both cartographic and statistical benefits:

  • Cartographically, associating density values to a network provides options for visualization that are in a vector format. In addition, by focusing on the street network, it limits the visualization to what transportation planners want to communicate.
  • Statistically, at large scales, raster analysis tends to have many low values across a high-resolution raster, which skews distributions towards low values. When heat maps are sampled to a network, the distributions are generally closer to normal. Sampling also captures intersection effects that would not have been detected with a single segment association.
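The sampling idea can be sketched with the standard library alone. This is a simplified stand-in, under several assumptions: it evaluates a Gaussian kernel directly at sample points instead of building a raster heat map first, the severity weights and the 50 m bandwidth are hypothetical, and the streets are straight lines in a projected coordinate system measured in metres.

```python
import math

# Hypothetical severity weights; the article does not list the values it used.
SEVERITY_WEIGHTS = {"fatal": 3.0, "severe": 2.0, "other": 1.0}


def gaussian_density(x, y, collisions, bandwidth):
    """Severity-weighted planar Gaussian kernel density at point (x, y)."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for cx, cy, severity in collisions:
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        total += SEVERITY_WEIGHTS[severity] * norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total


def sample_points(start, end, spacing):
    """Midpoints of equal-length pieces, each at most `spacing` long, along a straight street."""
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    pieces = max(1, math.ceil(length / spacing))
    return [(x0 + (x1 - x0) * (i + 0.5) / pieces,
             y0 + (y1 - y0) * (i + 0.5) / pieces) for i in range(pieces)]


# Two made-up streets crossing at (0, 0), with collisions clustered at the intersection.
collisions = [(0, 0, "fatal"), (5, 0, "severe"), (0, -5, "other"), (180, 0, "other")]
main_street = ((-200, 0), (200, 0))
cross_street = ((0, -200), (0, 200))

main_densities = [gaussian_density(x, y, collisions, bandwidth=50.0)
                  for x, y in sample_points(*main_street, spacing=100.0)]
cross_densities = [gaussian_density(x, y, collisions, bandwidth=50.0)
                   for x, y in sample_points(*cross_street, spacing=100.0)]
```

In this toy case the pieces of both streets nearest the intersection receive the highest densities, even though a conventional spatial join would have assigned each collision to only one of the two streets.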

Collision Accumulation

The final collision index determines the order in which we add street segments to the HIN. This approach allows for a flexible framework to identify parts of a network to prioritize for improvements or further study. For example, while most HINs are based purely on raw collision information, there is little to stop the inclusion of other variables, such as those that relate to equity. The HIN derived from the collision index at 60% of Bay Area KSI collisions is shown below.

[Map: Bay Area HIN derived from the collision index at 60% of KSI collisions]

Now here is the same map, but with a bias towards an equity metric derived from MTC's Communities of Concern.

[Map: Bay Area HIN with a bias towards an equity metric derived from MTC's Communities of Concern]
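The article does not describe how the equity bias was applied, so the following is only one hypothetical way to do it: multiply the collision index of segments inside an equity priority area by a fixed factor, capped at 1.0. The function name, the `equity_bias` value, and the scores are all assumptions made for illustration.

```python
def biased_index(collision_index, in_equity_area, equity_bias=0.25):
    """Raise the score of segments flagged as inside an equity area, capped at 1.0."""
    return [min(1.0, score * (1.0 + equity_bias)) if flagged else score
            for score, flagged in zip(collision_index, in_equity_area)]


# Made-up index scores for three segments; the last two fall inside an equity area.
scores = [0.90, 0.80, 0.60]
flags = [False, True, True]

adjusted = biased_index(scores, flags)
# Segment positions ordered from highest to lowest adjusted score.
ranking = sorted(range(len(scores)), key=lambda i: adjusted[i], reverse=True)
```

Here the second segment moves ahead of the first once the bias is applied, which changes the order in which segments would enter the HIN.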

While I expected KSI collisions to be concentrated at the scale of the Bay Area, the degree of that concentration was surprising. The chart below shows the relationship between the cumulative length of the Bay Area HIN (excluding highways / paths) and the cumulative number of KSI collisions. It demonstrates that KSI collisions are highly concentrated on a small portion of the Bay Area’s streets: approximately 60% of all KSI collisions occur on 8% of the network.

[Chart: cumulative length of the Bay Area HIN versus cumulative number of KSI collisions]
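The accumulation behind a chart like this can be sketched as follows. It is a minimal illustration with made-up data: segments are added in descending index order until a target share of KSI collisions is reached. It assumes each segment already carries a KSI count, and the field names (`index`, `length`, `ksi`) and the `accumulate_hin` helper are hypothetical.

```python
def accumulate_hin(segments, ksi_target=0.60):
    """Add segments in descending index order until the KSI share target is met.

    Returns the selected segments, their share of total network length,
    and their share of all KSI collisions.
    """
    total_length = sum(s["length"] for s in segments)
    total_ksi = sum(s["ksi"] for s in segments)
    selected, length, ksi = [], 0.0, 0
    for seg in sorted(segments, key=lambda s: s["index"], reverse=True):
        if ksi >= ksi_target * total_ksi:
            break  # target reached; remaining segments stay off the HIN
        selected.append(seg)
        length += seg["length"]
        ksi += seg["ksi"]
    return selected, length / total_length, ksi / total_ksi


# Made-up network: length in km, ksi = KSI collisions associated with the segment.
segments = [
    {"id": "A", "index": 0.95, "length": 0.5, "ksi": 6},
    {"id": "B", "index": 0.90, "length": 0.5, "ksi": 4},
    {"id": "C", "index": 0.60, "length": 2.0, "ksi": 3},
    {"id": "D", "index": 0.30, "length": 3.0, "ksi": 2},
    {"id": "E", "index": 0.10, "length": 4.0, "ksi": 1},
]
hin, length_share, ksi_share = accumulate_hin(segments)
```

In this toy network, segments A and B alone cover 62.5% of KSI collisions on 10% of the network length. Recording the two cumulative shares after each added segment yields the points of the accumulation curve.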

The video below helps visualize this process on a map by showing the HIN at different collision accumulation percentages.

Ideas for Future Work

Vision Zero is more than just a plan; it is a framework that can be applied in multiple planning contexts and projects. In the future, I plan to focus on both communication and analysis strategies that can scale and fit into a scenario-oriented framework to address key planning uncertainties. Some future ideas include:

  • Identify potential similarity metrics to be used in a spatial clustering analysis (Mean Shift, DBSCAN, etc.) to identify preliminary groups of collisions. These collision clusters can be used in cross tabulations aimed at identifying trends in "why" or "what" caused collisions.
  • Integrate emerging data sources into future analysis from vendors such as EcopiaTech, Mapillary, and Mobileye that use advances in computer vision to create insight from photos and aerial imagery.
  • Connect summaries identifying collision causes and trends to potential improvements on key corridors. Based on the typical cross section, develop 3D models of potential improvements with a previously developed production process built on Python and CityEngine. The goal would be to connect abstract data analysis to concrete recommendations that are specific to a location's collision profile.

Limitations & Key Considerations

This process is not perfect by any means. The methodology is intended as a robust quick-response tool for generating HINs, so that safety considerations can be integrated into a more diverse portfolio of project types, such as corridor studies, studies for smaller cities, and regional-scale projects where conventional methods require a high degree of data preparation. The approach used for the Bay Area HIN, for example, does not normalize for exposure, and I did not smooth out the network with a dissolve. There was also no attempt at a rolling window analysis, because segmentation treatments at that scale can be difficult with many curvilinear streets (working on that). There are also better KDE methods for networks that could have been used for this project, but I chose simplicity for this exercise. Revisions to the tools used for this analysis since this article's publication make such adjustments easy to apply at scale, though (barriers in KDE estimation, etc.). Regardless, it provides a quick-response methodology to prioritize locations for further consideration and potential improvements.

Everything about this is great! Really enjoyed reading, and appreciated the nod to limitations with snapping data points to road segments. For my thesis I tried to run statistics on road design features and severe collisions to attempt a better understanding of the correlation among crashes and the built environment. I couldn't quite get it to link up (mostly because a granular inventory on road design features doesn't exist), but it would be really cool if someone could figure that out! Anyway, loved your post.

Fascinating. At a 20,000 ft level, this network seems to match the high-frequency transit network. That is an important factor to consider when recommending interventions. What can be done to simultaneously improve safety, particularly of vulnerable users of the ROW, without imposing a huge burden on the transit operators? I would argue that safety and transit performance can be improved, but it will come at the cost of things like on-street parking or general purpose lanes.

My P+W friends have been working on the Austin version of this. It's a really novel idea! One of the biggest "issues" at play for any of these pieces is converting insights into design solutions. So what if you can identify seriously dangerous intersections, then what? Can you convince neighborhoods and public officials to invest in new designs from there? Do you use a carrot or a stick? They're hard to answer. On the analytics side, how much metadata can you connect to these maps? Curious if you can aggregate other elements of data to the streets / neighborhoods, such as the speed limit, poverty, demographics, etc.? This is essentially an epidemiological study, and as such, the more metadata, the better, especially for building models!

David, this is very interesting! We've been working on similar things over here, as you know, and one thing I'm curious about with your approach is whether you have considered the differences between this approach and using a network KDE, rather than the planar KDE you've implemented here. Any thoughts?

Kindred spirits! I'm inspired by your code, but it looks like we have different intents. When you get back, we can chat, but I'll run a subset of the TIMS-hosted SWITRS data through this routine to be ready for it. I don't currently approach Time of Day with that routine, but I could see it being very valuable for collision analysis. Your observation about the Modifiable Linear Unit Problem is right on, and I'm regularly thinking about the modifiable temporal unit problem (meaningless or meaningful temporal bins). That's a lot of the intent/focus of the routine I have. 'Is ESRI evaluating rolling temporal statistics for tools like kernel densities as well (similar to pandas rolling functions)? Also I was curious if a temporal iterator in ModelBuilder was feasible (so you could time enable every tool theoretically; I felt like iterator field values was close)?' Not in an official, core software capacity (not yet, anyway!). Sorry, can't reach that YouTube video. I'll send a PM with my contact info.
