In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from bokeh.plotting import figure, output_file, show, output_notebook
from bokeh.models import Span
from shapely.geometry import Point
import geopandas as gpd
import glob
from datetime import datetime
from bokeh.layouts import Row, column, gridplot
from bokeh.models import Title, Legend, TapTool, Range1d, Tabs, Panel
import matplotlib as mpl
from bokeh.core.validation import silence
from bokeh.core.validation.warnings import MISSING_RENDERERS, EMPTY_LAYOUT

# Set fonts for matplotlib
plt.rcParams['font.family'] = 'Arial'
plt.rcParams['font.size'] = 14

In [2]:
import bokeh
from bokeh.core.validation.warnings import EMPTY_LAYOUT
bokeh.core.validation.silence(EMPTY_LAYOUT, True)



In [3]:
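def convert_to_gdf(df):
    """Return a GeoDataFrame (EPSG:4326) with a point geometry built from each row's longitude and latitude."""
    geometry = [Point(xy) for xy in zip(df.longitude, df.latitude)]
    gdf = gpd.GeoDataFrame(df, crs="EPSG:4326", geometry=geometry)

    return gdf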

# Business Activity Trends

[Business Activity Trends](https://dataforgood.facebook.com/dfg/tools/business-activity-trends#methodology) is a crisis-triggered dataset released by Meta. This analysis uses the dataset triggered by COVID-19, which is released at the national level.

Business Activity Trends During Crisis uses data about posting activity on Facebook to measure how local businesses are affected by and recover from crisis events. Given the broad presence of small businesses on the Facebook platform, this dataset aims to provide timely estimates of global business activity without the common limitations of traditional data collection methods, such as scale, speed and nonstandardization. Because it is crisis-triggered, the dataset used here was created by Meta to support humanitarian relief following COVID-19 in Egypt. Details about this dataset can be found on [Meta's Data For Good page](https://dataforgood.facebook.com/dfg/tools/business-activity-trends).


## Data

The Business Activity Trends dataset was provided by [Meta](https://dataforgood.facebook.com/dfg/tools/business-activity-trends) through the proposal [Egypt Country Economic Monitor](https://portal.datapartnership.org/readableproposal/427) of the [Development Data Partnership](https://datapartnership.org). The data consisted of daily business activity quantile information at a GADM 2 level broken down by business vertical. Each cell (row) of the dataset contains data on the daily activity within a polygon-vertical combination. [GADM shapefiles](https://gadm.org/) are slightly different from the official shapefiles used in this project from [Humanitarian Data Exchange](https://data.humdata.org/dataset/cod-ab-tur).


**The COVID-19-triggered Business Activity Trends dataset contains daily data from March 1, 2020 to November 29, 2022 at the national level. Business activity in this dataset is measured by the activity quantile.**


**Population Sample**
The Business Activity Trends During Crisis dataset uses a static sample of businesses’ Facebook Pages for each crisis defined at each crisis date. It does not take into account new Pages businesses created during the crisis, nor does it exclude Pages removed during the crisis. The sample for each crisis is defined as Facebook Pages that meet the following criteria:
* Have an admin
* Have monthly activity as of the crisis start date
* Were created at least 90 days prior to the crisis start date
* List a physical location
* Are associated with a business as defined by internal business Page classifiers
* Represent a local business according to business vertical categories (which excludes large companies, for example)
* Pass Facebook’s internal quality control measures such as filtering for spam and duplicate Pages



**Business Verticals**
The business verticals are categories determined by the admins of the Facebook Business Page. 

* *All*: Refers to all businesses in the polygon. This includes all of the following categories except public good, because the activity of public good Pages tends to differ from other businesses during crises.
* *Grocery and convenience stores*: Retailers that sell everyday consumable goods including food (typically unprepared foods and ingredients) and a limited range of household goods (like toilet paper). These can include grocery stores, convenience stores, pharmacies and general stores.
* *Retail*: Retail other than grocery and convenience stores such as auto dealers, home goods stores, personal goods stores and general merchandise/big-box stores like Walmart
* *Restaurants*: Businesses that sell prepared food and beverages for on-premise or off-premise dining
* *Local events*: Events, activities and businesses that sell real-life experiences, such as amusement parks, bowling alleys, concert venues and social clubs
* *Professional services*: Services driven by demand from an individual event such as a legal need or health issue that require high customization. Providers usually have an advanced degree or certification and are considered experts and “knowledge workers.” Examples include CPAs, lawyers, medical professionals, architects.
* *Business and utility services*: Business offering business-to-business services like construction, office cleaning, advertising and marketing, and business software solutions. Utility services offer commodity services like electric, phone, internet, water and energy.
* *Home services*: Services driven by demand from an individual event at home such as plumbing or electrical work. Examples include home repairs, photographers, cleaning, mechanics, plumbers, electricians, landscapers, interior decorators.
* *Lifestyle services*: Specific to beauty, care and fitness services. These businesses offer standardized services that are part of a customer's regular routines. Examples include gyms, salons, barbers, and nonmedical and noneducational supervision, like childcare nurseries and pet care.
* *Travel*: Businesses that provide or sell transportation or accommodation services, such as airlines, hotels, car rentals and tour operators
* *Manufacturing*: Businesses that manufacture durable goods (like furniture and cars) or consumable goods (like food and personal goods) and have no or limited business-to-customer sales
* *Public good*: Includes government agencies, nonprofits and religious organizations


## Methodology

This method for understanding local economic activity was first described by the University of Bristol team and published in [Nature Communications](https://www.nature.com/articles/s41467-020-15405-7) {cite:p}`Eyre2020`. Business activity is measured by the volume of posts made by business Pages on Facebook on a daily basis, where a post is defined broadly to include posts, stories and reels created by the business Page anywhere on Facebook. In practice, almost all posts are either made on the business Page itself or in Facebook Groups.

For each crisis event, a baseline posting pattern is established using the 90 days prior to the event start date. Meta then measures the daily posting activity relative to the expected posting activity based on the baseline period. Individual business Page activity is then aggregated by business vertical (proxy for economic sector) and by GADM administrative polygons geographically. 

The business activity is measured through activity quantiles. This is equivalent to the 7-day average of what University of Bristol researchers call the [aggregated probability integral transform metric](https://www.nature.com/articles/s41467-020-15405-7). It is calculated by first computing the approximate quantiles (the midquantiles in the article) of each Page’s daily activity relative to their baseline activity. The quantiles are summed and the sum is then shifted, rescaled and variance-adjusted to follow a standard normal distribution. The adjusted sum is then probability transformed through a standard normal cumulative distribution function to get a value between 0 and 1. Following this, the average of this value over the last 7 days is obtained to smooth out daily fluctuations. This metric is given a quantile interpretation since it compares the daily activity to the distribution of daily activity within the baseline period, where a value around 0.5 is considered normal activity. *This is a one-vote-per-Page metric that gives equal weight to all businesses and is not heavily influenced by businesses that post a lot.* 
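The steps above can be sketched in a few lines of Python. This is a simplified illustration, not Meta's implementation: it assumes each Page's midquantile is roughly uniform on [0, 1] (so a sum over n Pages has mean n/2 and variance n/12), whereas the published method applies a more careful variance adjustment. The function names and the toy inputs are hypothetical.

```python
import math


def midquantile(baseline, value):
    """Share of baseline days strictly below `value`, plus half the share equal to it."""
    below = sum(1 for b in baseline if b < value)
    equal = sum(1 for b in baseline if b == value)
    return (below + 0.5 * equal) / len(baseline)


def daily_activity(baselines, counts):
    """Combine one day's post counts (one per Page) into a value between 0 and 1."""
    n = len(counts)
    q_sum = sum(midquantile(b, c) for b, c in zip(baselines, counts))
    # Simplifying assumption: each midquantile is roughly Uniform(0, 1),
    # so the sum is shifted by n/2 and rescaled by sqrt(n/12).
    z = (q_sum - n / 2) / math.sqrt(n / 12)
    # Standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def activity_quantile(baselines, counts_by_day):
    """Trailing mean (up to 7 days) of the daily values."""
    daily = [daily_activity(baselines, counts) for counts in counts_by_day]
    smoothed = []
    for d in range(len(daily)):
        window = daily[max(0, d - 6): d + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Toy example: four Pages whose baseline daily post counts were 0, 1 and 2
baselines = [[0, 1, 2]] * 4
typical = activity_quantile(baselines, [[1, 1, 1, 1]] * 10)  # posting at the baseline median
quiet = activity_quantile(baselines, [[0, 0, 0, 0]] * 10)    # no posts at all
```

In this toy example, Pages posting at their baseline median yield values at 0.5, while Pages that stop posting yield values below 0.5. Because every Page contributes a single quantile, a Page that posts very frequently carries no more weight than any other.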

The full technical details of the methodology used for this dataset can be found in the [white paper](https://scontent-iad3-2.xx.fbcdn.net/v/t39.8562-6/313431392_1209469252938025_9085357585007907228_n.pdf?_nc_cat=100&ccb=1-7&_nc_sid=ae5e01&_nc_ohc=XYjhPigfKDwAX-PRwOp&_nc_ht=scontent-iad3-2.xx&oh=00_AfAXU8Aylea13vEKHZoffq3qBQw2TVadXDPcKp40Ib5Ziw&oe=6428FDCD) authored by researchers from Meta.


## Implementation

Once the data was obtained from the Meta Data For Good portal, the polygons were transformed to align with the shapefiles provided by UNOCHA. More details can be found in the attached notebook. 
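One possible way to do this alignment is a point-in-polygon spatial join: each record's latitude and longitude is turned into a point and matched to the administrative polygon that contains it. The sketch below is an assumption about how such a step could look, not a copy of the attached notebook. The two square polygons, the coordinates and the `ADM2_EN` column name are hypothetical stand-ins for the UNOCHA shapefile.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, Polygon

# Hypothetical stand-in for the administrative boundaries
admin = gpd.GeoDataFrame(
    {"ADM2_EN": ["West", "East"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

# Hypothetical activity records carrying a representative point per polygon
records = pd.DataFrame(
    {
        "polygon_name": ["a", "b"],
        "longitude": [0.5, 1.5],
        "latitude": [0.5, 0.5],
        "activity_quantile": [0.4, 0.6],
    }
)
points = gpd.GeoDataFrame(
    records,
    geometry=[Point(xy) for xy in zip(records.longitude, records.latitude)],
    crs="EPSG:4326",
)

# Attach to each record the attributes of the polygon that contains its point
joined = gpd.sjoin(points, admin, how="left", predicate="within")
```

Both layers share the same coordinate reference system (EPSG:4326) before the join. With `how="left"`, records that fall outside every polygon are kept with missing attributes rather than dropped, which makes unmatched records easy to inspect.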


## Limitations

One of the biggest limitations of this dataset is that it is based entirely on Facebook users, so it may not represent the Egyptian population evenly (Palen & Anderson, 2016). The methodology uses posts on Facebook business Pages and Groups to estimate changes in business activity. This framework is best used to see how quickly businesses have recovered from a crisis event, in this case the COVID-19 pandemic (Eyre et al., 2020). It relies on the assumption that businesses tend to publish more posts when they are open and fewer when they are closed. Analysing the aggregated posting activity of a group of businesses over time therefore makes it possible to infer when they are open or closed.


In [4]:
egypt_adm2 = gpd.read_file('../../data/shapefiles/egy_admbnda_adm2_capmas_20170421/egy_admbnda_adm2_capmas_20170421.shp')
egypt_adm1 = gpd.read_file('../../data/shapefiles/egy_admbnda_adm1_capmas_20170421/egy_admbnda_adm1_capmas_20170421.shp')

In [46]:
all_files = glob.glob('../../data/business_activity_trends/3661471785186399_2020-03-01_2022-11-30_csv/*.csv')

# Read every CSV in the folder and stack them into a single DataFrame
li = [pd.read_csv(file) for file in all_files]

businessActivity = pd.concat(li, axis=0)
# Keep only the records for Egypt
businessActivity = businessActivity[businessActivity['country']=='EG'].copy()

In [47]:
# convert the date column to datetime (the data is already filtered to Egypt above)
businessActivity['ds'] = pd.to_datetime(businessActivity['ds'])

In [48]:
business_verticals = list(businessActivity['business_vertical'].unique())

print(f'COVID-19 Business Activity Trends has the following business verticals {business_verticals}')

COVID-19 Business Activity Trends has the following business verticals ['Home Services', 'Manufacturing', 'Local Events', 'Grocery & Convenience Stores', 'Business & Utility Services', 'Public Good', 'Travel', 'Lifestyle Services', 'Restaurants', 'Retail', 'Professional Services', 'All']


In [20]:
# define color palette
color_palette = [  '#4E79A7',  # Blue
    '#F28E2B',  # Orange
    '#E15759',  # Red
    '#76B7B2',  # Teal
    '#59A14F',  # Green
    '#EDC948',  # Yellow
    '#B07AA1',  # Purple
    '#FF9DA7',  # Pink
    '#9C755F',  # Brown
    '#BAB0AC',  # Gray
    '#7C7C7C',  # Dark gray
    '#6B4C9A',  # Violet
    '#D55E00',  # Orange-red
    '#CC61B0',  # Magenta
    '#0072B2',  # Bright blue
    '#329262',  # Peacock green
    '#9E5B5A',  # Brick red
    '#636363',  # Medium gray
    '#CD9C00',  # Gold
    '#5D69B1',  # Medium blue
]

In [39]:
bokeh.core.validation.silence(EMPTY_LAYOUT, True)

def get_line_plot(businessActivity, title, source, earthquakes=False, subtitle=None):
    """Plot one activity-quantile line per business vertical.

    `earthquakes` and `subtitle` are accepted for compatibility with existing
    calls but are not used.
    """
    p2 = figure(x_axis_type='datetime', width=1000, height=400, toolbar_location='above')
    p2.add_layout(Legend(), "right")

    for idx, business_vertical in enumerate(businessActivity['business_vertical'].unique()):
        df = businessActivity[businessActivity['business_vertical']==business_vertical][['ds', 'activity_quantile']].reset_index(drop=True)
        # Cycle through the palette so additional verticals cannot raise an IndexError
        line_color = color_palette[idx % len(color_palette)]
        p2.line(df['ds'], df['activity_quantile'], line_width=2, line_color=line_color, legend_label=business_vertical)

    # Clicking a legend entry hides or shows the corresponding line
    p2.legend.click_policy = 'hide'
    if title is not None:
        p2.title = title

    # Show the data source as a caption below the plot
    if source is not None:
        p2.add_layout(Title(text=source, text_font_size="10pt", text_font_style="normal"), "below")

    return p2



In [52]:
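# Monthly mean of the activity quantile per business vertical.
# Selecting the numeric column avoids errors from averaging text columns in recent pandas versions.
df = businessActivity.groupby([pd.Grouper(key='ds', freq='M'), 'business_vertical'])['activity_quantile'].mean().reset_index()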

In [57]:
output_notebook()
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)

tabs = []

tabs.append(
    Panel(
        child=get_line_plot(
            df,
            "Business Activity (National average post 1st March 2020)",
            "Source: Data for Good Meta",
            subtitle='National average post COVID-19 compared to pre-pandemic baseline',
        ),
        title='2022',
    )
)

tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)

```{figure} ../../docs/images/logo.png
---
height: 0px
---
Please note that the above figure is interactive. The user can scroll to the right to view the legend. By clicking the entries in the legend, the user can turn the trend lines 'off' and 'on' to focus on specific trends of interest. The user can also use the control panel in the upper-right of the graph to zoom in on a specific part of the graph or save the graph for downloading.
```

**Observations:** In general, business activity trends in retail, manufacturing, grocery, business, and lifestyle services have been slightly declining from 2020 through 2022, while other sectors have remained relatively stable. 

## References

{cite:empty}`Palen2016-gp`

```{bibliography}
:filter: docname in docnames
:style: unsrt
```