Estimating Activity Through Point of Interest Visits Using Mobility Data

Estimating Activity Through Point of Interest Visits Using Mobility Data#

Analyzing the frequency of visits within or near points of interest, including residential, commercial, and industrial zones, has the potential to provide insights into the economic ramifications of the recent earthquakes near Nurdağı, Türkiye. In a manner akin to the now discontinued Google Community Mobility Reports, the following methodology aims to monitor fluctuations in mobility, quantified by visit counts, within a set of OpenStreetMap points of interest relative to a baseline relative to January 2023.

Data#

In this section, we import from the data sources, available either publicly or via Datasets. Please that data is a placeholder location where the data must be placed. When using non-public data, please carefully abide by your organization’s data privacy policy, data classification and the terms and conditions.

Area of Interest#

In this study, the area of interest is defined as the affected 11 provinces in Türkiye and Syria as shown below.

Make this Notebook Trusted to load map: File -> Trust Notebook

Points of Interest#

Using the Humanitarian OpenStreetMap’s Export Tool, the project team obtained OpenStreetMap landuse points of interest within a clipping boundary defined by the area of interest between Türkiye and Syria defined above.

POI = dask_geopandas.read_file(
    "../../data/external/export.hotosm.org/hotosm_worldbank_syrtur_points_of_interest_gpkg_uid_ 22612c83-de9e-4e7a-a796-163299eed008.zip!hotosm_worldbank_syrtur_points_of_interest.gpkg",
    npartitions=16,
).set_crs("EPSG:4326")

To illustrate, we visualize below the points of interest tagged with landuse=industrial.

POI[POI[TAG].isin(["industrial"])].compute().explore(color="blue")

Make this Notebook Trusted to load map: File -> Trust Notebook

Mobility Data#

Veraset Movement provides a panel of human mobility data, based on data collection of GPS-enabled devices location.

ddf = dd.read_parquet(
    f"../../data/final/panels/{PANEL}",
)

Calculating date from the date and time the points were collected.

ddf["date"] = dd.to_datetime(ddf["datetime"].dt.date)

Methodology#

In parallel with the now discontinued Google Community Mobility Reports, the outlined methodology aims to monitor variations in mobility, measured by the frequency of visits, within points of interest sourced from OpenStreetMap compared to a baseline. It’s important to note that the mobility data reflects a subset of the overall population within an area, specifically individuals who have activated the Location Services setting on their mobile devices. It is crucial to understand that this data does not represent total population density. Additionally, we highlight that this calculation is based on a spatial join approach, which determines whether a device has been detected within an area of interest at least once. This method, while straightforward, represents a simplified approach compared to more advanced techniques such as estimating stay locations and visits.

Utilizing Dask-GeoPandas, we conduct a spatial join between the device traces and points of interest. Subsequently, we perform spatiotemporal aggregation of device counts by H3 grid cells and daily intervals, as well as by landuse classifications such as residential areas.

gddf = dask_geopandas.from_dask_dataframe(
    ddf[["uid", "hex_id", "date"]],
    geometry=dask_geopandas.points_from_xy(ddf, "longitude", "latitude"),
).set_crs("EPSG:4326")

Next, the spatial join being executed (without H3) to calculate the device density for each spatial (H3 resolution 6) and for each temporal bin (daily).

result = (
    gddf.sjoin(POI, predicate="within")
    .groupby([TAG, "hex_id", "date"])["uid"]
    .nunique()
    .to_frame(name="count")
    .reset_index()
    .compute()
)

Finally, the results joined into administrative divisions.

result = result.merge(
    AOI[["hex_id", "ADM0_PCODE", "ADM1_PCODE", "ADM2_PCODE"]], on="hex_id"
).sort_values(["date"])

Results#

In this section, we visualize device count detected daily within each of the following OSM landuse classification.

FEATURES = [
    "residential",
    "commercial",
    "industrial",
    "education",
    "farmland",
    "construction",
]

Show code cell source Hide code cell source

def plot_visits(data, title="Points-of-Interest Visit Trends in Syria and Türkiye"):
    """Plot number of visits to OSM points of interest"""

    p = figure(
        title=title,
        width=650,
        height=600,
        x_axis_label="Date",
        x_axis_type="datetime",
        y_axis_label="Number of devices",
        y_axis_type="log",
        y_range=(0.75, 10**6),
        tools="pan,wheel_zoom,box_zoom,reset,save,box_select",
    )
    p.add_layout(Legend(), "right")
    p.add_layout(
        Title(
            text=f"Count of Devices Identified within OpenStreetMap '{TAG}' for Each 3-Day Time Interval",
            text_font_size="12pt",
            text_font_style="italic",
        ),
        "above",
    )
    p.add_layout(
        Title(
            text=f"Source: Veraset Movement. Creation date: {datetime.today().strftime('%d %B %Y')}. Feedback: datalab@worldbank.org.",
            text_font_size="10pt",
            text_font_style="italic",
        ),
        "below",
    )

    # plot spans
    p.renderers.extend(
        [
            Span(
                location=datetime(2023, 2, 6),
                dimension="height",
                line_color="grey",
                line_width=2,
                line_dash=(4, 4),
            ),
            Span(
                location=datetime(2023, 2, 20),
                dimension="height",
                line_color="grey",
                line_width=2,
                line_dash=(4, 4),
            ),
        ]
    )

    # plot lines
    renderers = []
    for column, color in zip(FEATURES, COLORS):
        try:
            r = p.line(
                data.index[:-1],
                data[column][:-1],
                legend_label=column,
                line_color=color,
                line_width=2,
            )
            renderers.append(r)
        except Exception:
            pass

    p.add_tools(
        HoverTool(
            tooltips="date: @x{%F}, devices: @y",
            formatters={"@x": "datetime"},
        )
    )

    p.legend.location = "bottom_left"
    p.legend.click_policy = "hide"
    p.title.text_font_size = "16pt"
    p.sizing_mode = "scale_both"
    return p

Through the aggregation of visit counts, we present a smoothed tally indicating the number of detected users within the entire area for each 3-day time interval.

data = (
    result.pivot_table("count", index=["date"], columns=[TAG], aggfunc="sum")
    .groupby(pd.Grouper(freq="3D"))
    .sum()
)

Loading BokehJS ...

By first-level administrative divisions#

Through the aggregation of visit counts, we present a smoothed tally indicating the number of detected users within each first-level administrative division and for each 3-day time period.

Syria#

Türkiye#

Limitations#

Limitations of using mobility data to estimate economic activity#

Warning

Sample Bias: The sampled population is composed of GPS-enabled devices drawn out from a longituginal mobility data panel. It is important to emphasize the sampled population is obtained via convenience sampling and that the mobility data panel represents only a subset of the total population in an area at a time, specifically only users that turned on location tracking on their mobile device. Thus, derived metrics do not represent the total population density.
Incomplete Coverage: Mobility data is typically collected from sources such as mobile phone networks, GPS devices, or transportation systems. These sources may not be representative of the entire population or all economic activities, leading to sample bias and potentially inaccurate estimations.Not all individuals or businesses have access to devices or services that generate mobility data. This can result in incomplete coverage and potential underrepresentation of certain demographic groups or economic sectors.
Lack of Contextual Information: Mobility data primarily captures movement patterns and geolocation information. It may lack other crucial contextual information, such as transactional data, business types, or specific economic activities, which are essential for accurate estimation of economic activity.

Limitations of using points of interest database from OpenStreetMap#