Analysing business activity trends in Turkiye#
Business Activity Trends is a crisis-triggered dataset released by Meta at the GADM admin 2 level. GADM shapefiles are slightly different from the official shapefiles used in this project, which come from UNOCHA via HDX. Two Business Activity Trends datasets are used in this analysis - the COVID-19 triggered dataset and the Turkiye earthquake triggered dataset.
This notebook shows the implementation of:
Visualizing a baseline of Business Activity one year prior to the earthquake (February 2022)
Visualizing changes in Business Activity trends after the earthquake by admin region and business vertical
Observing Business Activity Trends#
Business Activity from two datasets is used to compare this year’s change in activity to last year’s - the COVID-19 triggered dataset (2022) and the earthquake triggered dataset (2023).
Earthquake triggered business activity dataset#
The earthquake triggered Business Activity Trends dataset contains daily data from February 5, 2023, likely through May 2023, at the GADM admin 2 level. The activity quantile metric is used to measure changes in Business Activity.
Activity quantile (activity_quantile): The level of activity as a quantile relative to the baseline period. This is equivalent to the 7-day average of what the University of Bristol researchers call the aggregated probability integral transform metric (see this article in Nature Communications). It’s calculated by first computing the approximate quantiles (the midquantiles in the article) of each Page’s daily activity relative to their baseline activity. The quantiles are summed and the sum is then shifted, rescaled and variance-adjusted to follow a standard normal distribution. The adjusted sum is then probability transformed through a standard normal cumulative distribution function to get a value between 0 and 1. We then average this value over the last 7 days to smooth out daily fluctuations. We give this metric a quantile interpretation since it compares the daily activity to the distribution of daily activity within the baseline period, where a value around 0.5 is considered normal activity. This is a one-vote-per-Page metric that gives equal weight to all businesses and is not heavily influenced by businesses that post a lot. It is advised to use this metric, especially if robustness to outliers and numerical stability are important concerns.
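The aggregation described above can be sketched in a few lines. This is an illustrative reconstruction, not Meta's actual pipeline: `activity_quantile_for_day` is a hypothetical helper, and it assumes each Page contributes one baseline-relative quantile (approximately uniform on [0, 1] under normal activity) for the day; the 7-day smoothing would be applied on top of this.

```python
import math


def activity_quantile_for_day(page_quantiles: list[float]) -> float:
    """Aggregate per-Page baseline-relative quantiles into one value in (0, 1).

    Under 'normal' activity each quantile is ~Uniform(0, 1), so the sum of n of
    them has mean n/2 and variance n/12.  The sum is shifted, rescaled and
    variance-adjusted to follow a standard normal distribution, then
    probability-transformed through the standard normal CDF.
    """
    n = len(page_quantiles)
    s = sum(page_quantiles)
    z = (s - n / 2) / math.sqrt(n / 12)  # shift, rescale, variance-adjust
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF


# Pages sitting at their baseline median -> "normal" activity around 0.5
print(activity_quantile_for_day([0.5] * 200))  # 0.5
```

Because every Page contributes exactly one quantile to the sum, this is the one-vote-per-Page property mentioned above: a single very active Page cannot dominate the metric.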
COVID-19 triggered business activity dataset#
The COVID-19 triggered Business Activity Trends dataset contains daily data from March 1, 2020 to November 29, 2022 at the national level. The activity quantile is used to measure business activity trends in this dataset as well.
Difference between the COVID-19 Business Activity Trends dataset and the earthquake triggered Business Activity Trends dataset#
The difference between the All Business Verticals category in the COVID-19 Business Activity Trends dataset and the earthquake triggered Business Activity Trends dataset is that the latter does not include Public Good. The rest of the business verticals remain the same across the two datasets (detailed descriptions in the previous page). Another difference is the baseline: the earthquake triggered dataset uses the 90 days prior to the crisis as its baseline, while for the COVID-19 Business Activity Trends dataset the baseline is March 1, 2019.
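As a concrete illustration of the 90 day prior baseline, the window can be computed from the crisis date. This is a sketch: `baseline_window` is a hypothetical helper, it assumes the window is the 90 days immediately preceding the dataset start, and the dataset's exact window definition may differ.

```python
from datetime import date, timedelta


def baseline_window(crisis_start: date, days: int = 90) -> tuple[date, date]:
    """Return the first and last day of the `days` days preceding the crisis."""
    return crisis_start - timedelta(days=days), crisis_start - timedelta(days=1)


# Earthquake dataset: baseline is the 90 days before the data begins (Feb 5, 2023)
start, end = baseline_window(date(2023, 2, 5))
print(start, end)  # 2022-11-07 2023-02-04
```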
print(
f'Business Activity Trends has {len(businessActivity2023["polygon_name"].unique())} districts and the boundaries map from UNOCHA contains {len(turkey_adm2["adm2_en"].unique())} districts. {len(turkey_adm2["adm2_en"].unique())-len(businessActivity2023["polygon_name"].unique())} districts do not have data'
)
Business Activity Trends has 575 districts and the boundaries map from UNOCHA contains 948 districts. 373 districts do not have data
# Convert polygon names to upper case and strip accents to match the boundaries file
import unicodedata

businessActivity2023["polygon_name"] = businessActivity2023["polygon_name"].apply(
    lambda x: unicodedata.normalize("NFD", x.upper())
    .encode("ascii", "ignore")
    .decode("utf-8")
)
# Fix the Business Activity Trends admin names to match UNOCHA
# (assign the result back: Series.replace with inplace=True on a column
# selection is chained assignment and may not modify the DataFrame)
businessActivity2023["polygon_name"] = businessActivity2023["polygon_name"].replace(
    {
        "EYUP": "EYUPSULTAN",
        "ONDOKUZ MAYIS": "19 MAYIS",
        "KAZAN": "KAHRAMANKAZAN",
        "DOGUBEYAZIT": "DOGUBAYAZIT",
        "MUSTAFA KEMALPASA": "MUSTAFAKEMALPASA",
        "SULTAN KOCHISAR": "SEREFLIKOCHISAR",
        "SINCANLI": "SINANPASA",
        "AKKOY": "PAMUKKALE",
        "SULTAN KARAHISAR": "SEBINKARAHISAR",
    }
)
matched_districts = list(
set(businessActivity2023["polygon_name"].unique()).intersection(
set(turkey_adm2["adm2_en"].unique())
)
)
print(
list(
businessActivity2023[
~(businessActivity2023["polygon_name"].isin(matched_districts))
]["polygon_name"].unique()
)
)
print(
    "The above districts remain unmapped. Note: the unmapped Merkez districts are due to a bug in the GADM data. To account for this, we used the lat/long coordinates and joined with the UNOCHA shapefiles used in this project"
)
['MERKEZ']
The above districts remain unmapped. Note: the unmapped Merkez districts are due to a bug in the GADM data. To account for this, we used the lat/long coordinates and joined with the UNOCHA shapefiles used in this project
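The lat/long fallback described above is a spatial (point-in-polygon) join, which in practice a library call such as geopandas `sjoin` performs. The underlying test can be sketched in pure Python with ray casting; `point_in_polygon` is an illustrative helper, not the code used in this project.

```python
def point_in_polygon(x: float, y: float, polygon: list[tuple[float, float]]) -> bool:
    """Ray-casting test: count polygon-edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # odd number of crossings => inside
    return inside


# A district's lat/long point either falls inside a boundary polygon or it doesn't
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(point_in_polygon(0.5, 0.5, square))  # True
print(point_in_polygon(1.5, 0.5, square))  # False
```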
fig, axs = plt.subplots(2, 2, figsize=(24, 10), sharex=True, sharey=True)
ax = axs.flatten()
dates_of_interest = ["2023-02-05", "2023-02-12", "2023-02-19", "2023-02-26"]
images = []
for i, ds in enumerate(dates_of_interest):
    turkey_adm2.boundary.plot(ax=ax[i], edgecolor="#D3D3D3", linewidth=0.5)
    gdf[gdf["ds"] == ds][["activity_quantile", "geometry"]].plot(
        column="activity_quantile",
        ax=ax[i],
        legend=False,
        cmap="Spectral",
        vmin=0,
        vmax=1,
    )
    # GeoDataFrame.plot returns the Axes, not a mappable; the choropleth is the
    # last collection added to the axes (the first is the boundary lines)
    images.append(ax[i].collections[-1])
    ax[i].set_title(ds, fontsize=14)
    ax[i].title.set_position([0, 0])
    ax[i].set_xticks([])
    ax[i].set_yticks([])
    for side in ["top", "bottom", "right", "left"]:
        ax[i].spines[side].set_visible(False)
cbar = fig.colorbar(images[1], ax=axs)
suptitle = fig.suptitle(
    "Business Activity Trends (Admin 2 level compared to 90 day prior baseline)",
    fontsize=20,
    fontweight="bold",
)
suptitle.set_y(0.95)
suptitle.set_x(0.3)
# Share one color scale across all four panels
for im in images:
    im.set_clim(vmin=0, vmax=1)
    im.set_cmap("Spectral")
    im.set_norm(cbar.norm)
output_notebook()
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)
tabs = []
# Take the mean activity quantile for the entire country to allow comparison
# with last year's data
df = (
    businessActivity2023.groupby(["country", "business_vertical", "ds"])[
        "activity_quantile"
    ]
    .mean()
    .reset_index()
)
tabs.append(
TabPanel(
child=get_line_plot(
df,
"Business Activity in 2023",
"Source: Data for Good Meta",
earthquakes=True,
subtitle="National average post earthquake compared to 90 day prior baseline",
),
title="2023",
)
)
tabs.append(
TabPanel(
child=get_line_plot(
businessActivity2022,
"Business Activity in 2022 (National average post COVID-19)",
"Source: Data for Good Meta",
subtitle="National average post COVID-19 compared to pre-pandemic baseline",
),
title="2022",
)
)
tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)
Change in Business Activity by business verticals in affected areas#
Although the national average is useful for comparison with the 2022 baseline data available at the same level, to understand earthquake impact it is more informative to look at business verticals in the affected regions. For this we get the data for the areas of interest within Turkey and see how business activity changed in those regions.
We show the data here in two formats:
Daily changes in Business Activity
Weekly changes in Business Activity where the data is aggregated at a week level
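The weekly format can be produced by bucketing daily values by week start and averaging them. The sketch below shows in plain Python what a pandas `groupby`/`resample` would do; `weekly_mean` is an illustrative helper, not the project's actual aggregation code.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_mean(records: list[tuple[date, float]]) -> dict[date, float]:
    """Average daily activity_quantile values by the Monday of each week."""
    buckets: dict[date, list[float]] = defaultdict(list)
    for d, value in records:
        week_start = d - timedelta(days=d.weekday())  # snap to Monday
        buckets[week_start].append(value)
    return {week: sum(vs) / len(vs) for week, vs in sorted(buckets.items())}


daily = [(date(2023, 2, 6), 0.40), (date(2023, 2, 7), 0.60), (date(2023, 2, 13), 0.80)]
print(weekly_mean(daily))  # week of Feb 6 -> 0.5, week of Feb 13 -> 0.8
```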
nr_adm = len(aoi["pcode"].unique())
affected_adm2 = list(
    turkey_adm2[turkey_adm2["pcode"].isin(aoi["pcode"].unique())]["adm2_en"]
)
print(f"There are {nr_adm} admin-2 regions which are of interest")
There are 75 admin-2 regions which are of interest
output_notebook()
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)
tabs = []
for adm in affected_adm2:
df = gdf[gdf["adm2_en"] == adm]
tabs.append(
TabPanel(
child=get_line_plot(
df,
"Business Activity in Affected Areas",
"Source: Data for Good Meta",
earthquakes=True,
subtitle="GADM2 level average post earthquake compared to 90 day prior baseline",
),
title=adm.capitalize(),
)
)
tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)
output_notebook()
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)
tabs = []
for adm in affected_adm2:
df = week[week["adm2_en"] == adm]
tabs.append(
TabPanel(
child=get_line_plot(
df,
"Weekly Business Activity in Affected Areas",
"Source: Data for Good Meta",
earthquakes=True,
subtitle="GADM2 level average post earthquake compared to 90 day prior baseline",
),
title=adm.capitalize(),
)
)
tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)