Leveraging Social Media Data to Map Road Traffic Crashes

by Arianna Legovini, Robert Marty, Sveta Milusheva, Guadalupe Bedoya

Challenge

Road traffic crashes are among the world’s most pressing public health challenges. Crashes are the leading cause of death for people aged 5-29 and the 8th leading cause of death across all ages. Crash data offer a tool to guide efforts to enhance road safety: they make it possible to identify the riskiest road segments and to evaluate interventions aimed at improving safety. However, crash data are rare where crashes are most common. Low- and middle-income countries (LMICs) account for 92% of traffic deaths, yet administrative records in LMICs often undercount crashes and, when data do exist, they are often recorded on paper.

Solution

The rise of smartphones and social media has empowered citizens to voice concerns. Kenyans in particular have leveraged these technologies to voice concerns about their country’s notable challenges with traffic and crashes. @Ma3Route, an X account and platform for crowdsourcing transport information, has 1.5 million followers and is used by many across Kenya to report traffic jams and crashes—see the example posts below:

Leveraging the Twitter (now X) API, with access facilitated through the Development Data Partnership, the Smart and Safe Kenya Transport (smarTTrans) team at the World Bank has worked to translate crowdsourced crash reports from @Ma3Route into structured data on crashes for Nairobi, Kenya’s capital and largest city. Using the API, the team queried over 1 million tweets from @Ma3Route since its start in 2012 and developed algorithms to (1) determine whether a tweet reports a crash, (2) geolocate the crash based on the text of the tweet (few people had geolocation enabled), and (3) group crash reports into individual crashes.
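The three steps above can be illustrated with a minimal sketch. It is not the team's actual algorithm (the paper describes that); it assumes a simple keyword match for step 1, a lookup against a tiny hand-made gazetteer for step 2, and a greedy time-and-distance rule for step 3. The keyword list, the place names with their approximate coordinates, the thresholds, and all function names are hypothetical and chosen only for illustration.

```python
import math
from datetime import datetime, timedelta

# Hypothetical keyword list and gazetteer (approximate coordinates),
# used for illustration only; not the team's actual resources.
CRASH_WORDS = {"accident", "crash", "collision", "overturned"}
GAZETTEER = {
    "thika road": (-1.2195, 36.8880),
    "mombasa road": (-1.3230, 36.8500),
}


def is_crash_report(text):
    """Step 1: flag a tweet as a crash report if it contains a crash keyword."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return bool(set(cleaned.split()) & CRASH_WORDS)


def geolocate(text):
    """Step 2: return coordinates of the first gazetteer place named in the text."""
    lowered = text.lower()
    for name, coords in GAZETTEER.items():
        if name in lowered:
            return coords
    return None


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def group_reports(reports, max_km=0.5, max_gap=timedelta(hours=4)):
    """Step 3: greedily merge reports that are close in both space and time."""
    crashes = []
    for report in sorted(reports, key=lambda r: r["time"]):
        for crash in crashes:
            last = crash[-1]
            close_in_time = report["time"] - last["time"] <= max_gap
            close_in_space = haversine_km(report["coords"], last["coords"]) <= max_km
            if close_in_time and close_in_space:
                crash.append(report)
                break
        else:
            # No existing crash matched, so this report starts a new one.
            crashes.append([report])
    return crashes


def tweets_to_crashes(tweets):
    """Run all three steps on a list of {'text': str, 'time': datetime} dicts."""
    reports = []
    for tweet in tweets:
        if not is_crash_report(tweet["text"]):
            continue
        coords = geolocate(tweet["text"])
        if coords is None:
            continue  # a crash report that cannot be located is dropped
        reports.append({"time": tweet["time"], "coords": coords, "text": tweet["text"]})
    return group_reports(reports)
```

Calling `tweets_to_crashes` on a list of tweets returns a list of crashes, each of which is a list of the reports assigned to it. A keyword match and a fixed gazetteer are deliberately crude stand-ins: real tweets use slang, misspellings, and landmarks rather than road names, which is why step 2 in particular needs a more careful approach in practice.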

After developing the algorithms, the team worked to groundtruth the crowdsourced crash reports—partnering with a delivery company (Sendy) to confirm the presence of a crash at the location determined by the algorithm (the paper here describes the groundtruthing and algorithms). In parallel, the team has worked to digitize administrative records of crashes—enabling comparisons between administrative records and crowdsourced reports (see the blog here).

Figure 1. Number of Crashes from 2020 through 2023

Impact

In total, these efforts have produced data on the time and location of 30,000 crashes across Nairobi, which the team has recently released publicly. By making the dataset public, we hope that others will creatively leverage the data to improve road safety outcomes. Students from local universities, NGOs, and individual researchers have already started to use these data to help generate more knowledge and potential solutions to the road safety challenge in Kenya.

Mapping crashes has enabled the identification of blackspots and high-risk corridors in Nairobi. For example, the above map shows crashes concentrated in Nairobi’s Central Business District as well as along major roads, such as Thika Road.
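One simple way to surface candidate blackspots from geolocated crashes is to count crashes per grid cell and rank the cells. The sketch below assumes crashes are available as (latitude, longitude) pairs; the cell size and the function names are hypothetical, and this is an illustration of the general idea rather than the method the team used.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.005):
    """Map a point to a grid cell index; 0.005 degrees is roughly 550 m near the equator."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def top_blackspots(crashes, n=3, cell_deg=0.005):
    """Return the n grid cells with the most crashes as (cell, count) pairs."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in crashes)
    return counts.most_common(n)
```

Fixed grid cells are easy to compute but can split a single hotspot across a cell boundary; a road-segment-based count or a kernel density estimate would be natural alternatives.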

The team has since used crash data, derived both from @Ma3Route and from administrative records, for subsequent analysis. For example, our working paper shows that Kenya’s curfew policies in response to COVID-19 increased crashes in the hours before the curfew, as people rushed home.

Altogether, these analyses demonstrate how crash data can be used to target interventions, evaluate policies, and understand the larger consequences of crashes. Moreover, as digital technologies continue to spread in lower-income countries, we hope this example of leveraging crowdsourced data to inform policy can serve as a model for other contexts. The Development Data Partnership is pivotal in making the data from these technologies more widely available, which makes it possible to generate these analyses and support more effective policies in the countries where we work.


Read more

https://blogs.worldbank.org/en/opendata/newly-released-dataset-maps-30-000-road-crashes-in-nairobi-using?