Data-Informed Planning in California

March 14, 2017

Alta recently completed a draft of California’s first statewide bicycle and pedestrian plan. This document provides strategies and actions to support bicycling and walking, and a major theme throughout is improving active transportation data collection and data systems for the state. In support of this guidance, we developed a baseline data collection methodology for the state that covers demand, safety, and infrastructure data.


One exciting source of demand data available to us comes from permanent bicycle and pedestrian counters that have already been installed in the San Francisco and San Diego regions. To help understand how future permanent count data systems should be structured and developed, we analyzed the data from these existing counters to identify “factor groups,” or groups of locations with similar traffic patterns (for more information on this concept, see the Portland State University Guide to Bicycle & Pedestrian Count Programs).

Analysis of Historical Counts

The count data was collected using induction loops to count bicyclists and passive infrared sensors to count pedestrians. You can read more about these technologies and others in our “Innovation in Bicycle and Pedestrian Counts” white paper. To group the count locations, we first calculated how the traffic volumes vary over the course of a weekday and over the course of a week at each site. We then applied a technique known as k-means clustering to these patterns, which groups count locations with similar characteristics.
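To make the two steps above concrete, here is a minimal sketch of the general idea, not the exact procedure we ran. It turns each site's hourly counts into a share-of-day profile and then groups the profiles with a small k-means loop. The site names, the synthetic counts, and the helper functions are all hypothetical, and the sketch uses farthest-point seeding instead of the more common random initialization so that the result is repeatable.

```python
def to_profile(counts):
    """Convert raw counts into shares of the total, so busy and quiet sites compare."""
    total = sum(counts)
    return [c / total for c in counts]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_profile(profiles):
    """Element-wise mean of a list of profiles."""
    return [sum(vals) / len(profiles) for vals in zip(*profiles)]

def farthest_point_seeds(profiles, k):
    """Deterministic seeding: start with the first profile, then repeatedly
    add the profile that is farthest from every seed chosen so far."""
    seeds = [profiles[0]]
    while len(seeds) < k:
        seeds.append(max(profiles, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return seeds

def kmeans(profiles, k, max_iter=100):
    """Alternate assignment and centroid updates until labels stop changing."""
    centroids = farthest_point_seeds(profiles, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = mean_profile(members)
    return labels, centroids

def synthetic_day(peak_hours, scale):
    """Hypothetical hourly counts: a flat base plus a bump around each peak hour."""
    return [
        scale * (1 + sum(max(0, 4 - abs(h - p)) for p in peak_hours))
        for h in range(24)
    ]

# Hypothetical sites: two with AM/PM peaks, two with a single midday peak.
sites = {
    "downtown_a": synthetic_day([8, 17], scale=30),
    "downtown_b": synthetic_day([8, 18], scale=12),
    "path_a": synthetic_day([13], scale=20),
    "path_b": synthetic_day([12], scale=8),
}
profiles = [to_profile(counts) for counts in sites.values()]
labels, centroids = kmeans(profiles, k=2)
for name, label in zip(sites, labels):
    print(name, "-> group", label)
```

Because the profiles are shares and not raw volumes, the two downtown sites land in the same group even though one carries far more traffic than the other. The same functions work on 7-value day-of-week profiles.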

For the hourly weekday patterns, we found four groupings for both bicycle and pedestrian traffic. Notably, we don’t see strong evidence of AM/PM peaking at any of the observed pedestrian count locations. The key difference between these groups is when the “peak” begins and how long it extends through the day. For the bicycle counts, we see a dominant AM/PM peak at locations in downtown San Francisco, while locations on bike paths outside of the central cities tend to have less peaking activity on weekdays.



For the day-of-week traffic patterns, there are generally three broad categories: locations with more activity on weekends than on weekdays (“recreational”), locations with roughly equal activity on weekdays and weekends (“mixed”), and locations with more activity on weekdays than on weekends (“utilitarian”). The observed bicycling activity is best described by these three groups, while for the pedestrian count locations we also found a fourth group of locations with vastly higher walking rates on weekends than on weekdays.



The observed utilitarian cycling sites are primarily in the urban core of San Francisco (and Emeryville), and the mixed sites are in the less dense urban areas of San Francisco and San Diego. The recreational sites are primarily along multi-use paths on the edges of the cities and along the coast in San Diego County.
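The three day-of-week categories described above can be illustrated with a simple weekend-to-weekday ratio. This is only a sketch: the groups in our analysis came from clustering, not from fixed cutoffs, so the 0.8 and 1.2 thresholds, the function names, and the sample volumes below are all assumptions chosen for illustration.

```python
def weekend_weekday_index(daily_counts):
    """Ratio of the average weekend day to the average weekday.

    daily_counts holds 7 average daily volumes ordered Monday through Sunday.
    """
    if len(daily_counts) != 7:
        raise ValueError("expected 7 daily volumes, Monday through Sunday")
    weekday_avg = sum(daily_counts[:5]) / 5
    weekend_avg = sum(daily_counts[5:]) / 2
    return weekend_avg / weekday_avg

def classify(daily_counts, low=0.8, high=1.2):
    """Label a site using hypothetical cutoffs on the weekend/weekday index."""
    index = weekend_weekday_index(daily_counts)
    if index > high:
        return "recreational"
    if index < low:
        return "utilitarian"
    return "mixed"

# Hypothetical average daily volumes, Monday through Sunday.
urban_core = [900, 950, 940, 930, 880, 400, 350]
neighborhood = [300, 310, 305, 295, 320, 330, 290]
coastal_path = [200, 190, 210, 205, 220, 600, 650]

for name, counts in [("urban_core", urban_core),
                     ("neighborhood", neighborhood),
                     ("coastal_path", coastal_path)]:
    print(name, classify(counts))
```

A ratio like this is a quick way to sanity-check which group a new count location might belong to before enough data exists to include it in a clustering run.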

How can this be applied?

The primary reason we develop factor groups is to make short-duration counts more effective. When we conduct “peak hour[1]” counts, as suggested by the National Bicycle & Pedestrian Documentation Project, or 24-hour counts, we need a way to make these counts comparable to each other. As we’ve seen here in California, the patterns of bicycling and walking activity can vary dramatically between locations. If we only have a glimpse into the overall volume, as we get with short-duration counts, it is useful to be able to make an informed judgment as to how the observed count period relates to the rest of the day, week, or year.

For instance, let’s consider what would happen if we were to take a bicycle count somewhere in downtown San Francisco and somewhere in Carlsbad between 4 and 6 PM on a weekday. The hourly factor groups tell us that those two hours of cyclists make up a larger proportion of the overall day’s traffic in San Francisco than at the Carlsbad site, so expanding both counts with the same factor would overstate the daily total in San Francisco relative to Carlsbad. Similarly, if we assumed that every day of the week at these sites was interchangeable with the weekday we counted, we would underestimate the total weekly bicycle traffic at the Carlsbad site, assuming it follows the recreational pattern seen along the San Diego coast, and overestimate it at the San Francisco site, where activity drops on weekends. By getting a deeper understanding of how these patterns vary across space, however, we can calculate more accurate total traffic volume estimates.
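The time-of-day part of this example can be worked through with a few lines of arithmetic. The sketch below assumes we already know what share of a weekday's traffic falls between 4 and 6 PM for each factor group. The group names, the shares (25% and 12.5%), and the count of 120 cyclists are all made-up numbers for illustration, not results from the California data.

```python
def expand_to_daily(observed_count, share_of_day):
    """Estimate a full-day volume from a short count.

    share_of_day is the fraction of daily traffic that the factor group
    typically carries during the hours that were counted.
    """
    if not 0 < share_of_day <= 1:
        raise ValueError("share_of_day must be greater than 0 and at most 1")
    return observed_count / share_of_day

# Hypothetical share of weekday traffic that occurs between 4 and 6 PM.
PM_SHARE = {
    "downtown_commute": 0.25,   # strong PM peak
    "coastal_path": 0.125,      # activity spread across the day
}

observed = 120  # cyclists counted from 4 to 6 PM at each site (made up)

downtown_daily = expand_to_daily(observed, PM_SHARE["downtown_commute"])
coastal_daily = expand_to_daily(observed, PM_SHARE["coastal_path"])

# Expanding the downtown count with the wrong group's share doubles the estimate.
downtown_misapplied = expand_to_daily(observed, PM_SHARE["coastal_path"])

print("downtown estimate:", downtown_daily)
print("coastal path estimate:", coastal_daily)
print("downtown with the coastal share:", downtown_misapplied)
```

With these assumed shares, the same two-hour count implies 480 daily cyclists downtown and 960 on the coastal path, which is why matching each short count to the right factor group matters. The same division applies when expanding a single day to a week using day-of-week shares.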

Read more about California’s first statewide bicycle and pedestrian plan.

[1] Which, as we’ve seen, might not actually be the peak period.