Mapping Activity in West Bank#
Analysis Goals#
The goal of this analysis is to map the activity in the West Bank using GPS data as a proxy for peope’s mobility.
Dataset#
The dataset being used in this analysisis is the Mapbox movement data. The description below was adapated from the original source:
The Mapbox Movement data set is based on de-identified underlying mobile device activity that grows and changes every day. Any mobility data set is inherently skewed between highly populated regions (urban and metro areas) and sparsely populated regions (rural areas). So instead of providing “raw counts” of measurements, a custom normalization process is applied. This process measures and smooths out the impact of otherwise unpredictable changes in mobile device usage and calculates an activity index, which is more appropriate for comparison across time spans.
Dataset Period#
The dataset covers period from January 2022 to December 2023. Note that there are some gaps where there is no data for some months.
Spatial Characteristics#
The data is provided at GPS coordinates (which are de-identified, jittered).
Temporal Characteristics#
The dataset we use in the analysis is based on activity generated over a a 24 hour time span.
Key Variables#
The variable we are using is activity_index as described above. The easiest way to think about is that its a measure of how many people are moving/visit a region over a specific time (in this case a 24 hour period). Thus, higher values of this index should correspond to alot of people visiting an area with
Prepare Geography DataFrames#
Show code cell source
# ================================================
# VIEW AND CHECK H3 INDEXES
# ================================================
m = gdf_wb_h3.explore(tiles="OpenStreetMap")
m
Get and Process Mapbox Movement Data#
Load and preprocess data#
#
Add geographic information to the Mapbox events#
Adm2 regions
H3 regions
# ================================================
# Add administrative regions to Mapbox events data
# ================================================
# keep only columns we need for admin geodataframe
gdf_adm2_wb = gdf_adm2_wb[['ADM1_EN', 'ADM2_EN', 'ADM2_PCODE', 'AREA_SQKM', 'geometry']]
gdf_mapbox_adm2 = add_geography_mapbox_events(df_mapbox, x_col="xlon", y_col="xlat", gdf_regions=gdf_adm2_wb)
# ================================================
# Add H3 indexes to Mapbox events data
# ================================================
gdf_mapbox_h3 = add_geography_mapbox_events(df_mapbox, x_col="xlon", y_col="xlat", gdf_regions=gdf_wb_h3)
Aggregate Activity by Region and Time#
Spatial aggregation units: Admin level 2 and H3 index
Temporal aggregation units: Month, week
Monthly aggregation at ADM2 level#
Generate a Activity Trends#
Summary statistics#
Show code cell source
cols = ["ADM2_EN", "date", "activity_index_sum"]
df_tmp = df_mapbox_monthly_adm2[cols]
df_tmp.sort_values(by=["activity_index_sum"], ascending=False)
ADM2_EN | date | activity_index_sum | |
---|---|---|---|
122 | Ramallah | 2022-03-01 | 15057.030766 |
131 | Ramallah | 2023-01-01 | 14612.023041 |
129 | Ramallah | 2022-10-01 | 14555.311015 |
125 | Ramallah | 2022-06-01 | 14282.661059 |
133 | Ramallah | 2023-03-01 | 13828.758482 |
... | ... | ... | ... |
29 | Tubas | 2022-10-01 | 578.932338 |
31 | Tubas | 2023-01-01 | 531.227671 |
28 | Tubas | 2022-09-01 | 511.841719 |
32 | Tubas | 2023-02-01 | 426.845737 |
30 | Tubas | 2022-11-01 | 342.297292 |
220 rows × 3 columns
West Bank monthly activity trends#
2022
2023
Activity trends by governorates#
Show code cell source
output_notebook()
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)
plot = get_line_chart(
df_mapbox_monthly_adm2,
"Monthly Activity Trend in Governorates of West Bank (2022-2023)",
"Source: Mapbox Movement",
subtitle="",
category="ADM2_EN",
measure="activity_index_sum",
date_col="date",
)
show(plot)
Activity trends by by H3 hexagonal grid#
Show code cell source
import folium
from folium.plugins import TimestampedGeoJson
import geopandas as gpd
import pandas as pd
import branca.colormap as cm
# Create a base map
m = folium.Map(location=[gdf.geometry.centroid.y.mean(), gdf.geometry.centroid.x.mean()], zoom_start=10)
# Create a color scale
# Clip data to 5th and 95th percentiles
lower_bound = np.percentile(gdf['activity_index_sum'], 5)
upper_bound = np.percentile(gdf['activity_index_sum'], 95)
colormap = cm.LinearColormap(
colors=['blue', 'lightblue', 'white', 'pink', 'red'],
vmin=lower_bound,
vmax=upper_bound
)
colormap.caption = 'Activity Index Sum'
colormap.add_to(m)
# Prepare features for TimestampedGeoJson
features = []
for idx, row in gdf.iterrows():
color = colormap(row['activity_index_sum'])
feature = {
'type': 'Feature',
'geometry': row['geometry'].__geo_interface__,
'properties': {
'time': row['date'].date().__str__(),
'style': {
'color': 'black',
'fillColor': color,
'fillOpacity': 0.7,
'weight': 1,
'opacity': 0.9
},
'id': row['h3_index'],
'activity_index_sum': row['activity_index_sum'],
'tooltip': f"H3 Index: {row['h3_index']}<br>Activity Sum: {row['activity_index_sum']}"
}
}
features.append(feature)
# Add TimestampedGeoJson to the map
TimestampedGeoJson(
{'type': 'FeatureCollection', 'features': features},
period='P1D',
add_last_point=True,
auto_play=False,
loop=False,
max_speed=1,
loop_button=True,
date_options='YYYY/MM/DD',
time_slider_drag_update=True
).add_to(m)
# Add administrative boundaries
gdf_adm = gdf_adm2_wb[['ADM2_EN', 'ADM2_PCODE', 'geometry']]
folium.GeoJson(
gdf_adm,
name='Admin2 boundary',
style_function=lambda x: {
'color': '#808080',
'weight': 2,
'fillOpacity': 0
}
).add_to(m)
# Add roads
gdf_roads = gdf_roads[['Class', 'Name', 'geometry']]
folium.GeoJson(
gdf_roads,
name='Roads',
style_function=lambda x: {
'color': '#D3D3D3',
'weight': 1
}
).add_to(m)
# Add layer control
folium.LayerControl().add_to(m)
# Save and display the map
m.save('wb-activity-evolution-2022-2023.html')
m # Display in Jupyter Notebook (if applicable)
/var/folders/k5/p4nvl2pj4gq2ks0j7qvrhwbh0000gp/T/ipykernel_18739/621941167.py:8: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.