Commuting Zones in Lebanon

Commuting zones represent the areas where people spend most of their time and conduct most of their economic activity. These areas of economic integration are independent from political boundaries and can illustrate how economic communities and commute patterns transcend regional boundaries.

Data

Features of Commuting Zones maps

  • Updated several times a year using dynamic data from aggregated Facebook user home and work locations, and thus not limited by new census data results

  • Built using a standard approach/methodology for the entire globe

  • Contains metrics calculated to yield basic demographic, infrastructure and economic measures for each commuting zone

  • Created using well-researched graph-theory methods such as Louvain algorithm clustering and Voronoi shapes

  • Available as full polygon shapefiles so that they can be used for a wide range of geospatial analyses

  • Agnostic of traditional administrative boundaries (except for sensitive countries, which are removed manually)

Population sample: Commuting Zones draws from a sample of people who use the Facebook mobile app and have enabled the Location Services setting. More info on Location Services on Facebook can be found here.

Spatial methods: The Commuting Zones spatial shapes are built by defining a community network. We start with a node for each city or town where people who use Facebook live. We then define the weight of the edge between two places i and j using this formula:

(#residents moving from i to j + #residents moving from j to i) / (#residents in i + #residents in j)

We then reduce the complexity of this graph using a Louvain clustering algorithm. Once we have a simplified network/graph, we use Voronoi shapes to define the polygon for the commuting zone. Here is a more detailed walk-through of how commuting zones are generated.
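The edge-weight formula above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used to build the dataset: the function name `edge_weight`, the place names and all counts are hypothetical.

```python
from collections import defaultdict


def edge_weight(moves_between, residents_i, residents_j):
    """Commuters in both directions divided by the combined residents of i and j."""
    total_residents = residents_i + residents_j
    if total_residents == 0:
        return 0.0
    return moves_between / total_residents


# Hypothetical resident counts per place and directed commute counts.
residents = {"A": 1000, "B": 500, "C": 2000}
moves = {("A", "B"): 120, ("B", "A"): 30, ("A", "C"): 10, ("C", "A"): 20}

# Sum both directions so that each unordered pair (i, j) gets a single total.
flows = defaultdict(int)
for (origin, destination), count in moves.items():
    flows[tuple(sorted((origin, destination)))] += count

# One weight per undirected edge of the community network.
weights = {
    (i, j): edge_weight(total, residents[i], residents[j])
    for (i, j), total in flows.items()
}

for pair, weight in sorted(weights.items()):
    print(pair, round(weight, 4))
```

A weighted graph of this kind is what a community-detection step such as Louvain clustering would take as input; that step is not shown here.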

Temporal span: Commuting zones are rebuilt at most every 3 months, when the commuting zone shapes are generated using the previous few weeks of Location Services data.

Minimum counts/size: To generate a commuting zone, at least 50 people must be estimated to live within its boundary, and the zone must measure at least 1 kilometer by 1 kilometer.
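As a rough sketch, the two minimums could be applied as a simple filter. The field names, the sample zones and the reading of "1 kilometer by 1 kilometer" as an area of at least 1 square kilometer are all assumptions made for illustration.

```python
MIN_POPULATION = 50   # people estimated to live within the zone
MIN_AREA_KM2 = 1.0    # assumption: "1 km by 1 km" is read as an area of 1 km²


def meets_minimums(population, area_km2):
    """Return True when a candidate zone satisfies both thresholds."""
    return population >= MIN_POPULATION and area_km2 >= MIN_AREA_KM2


# Hypothetical candidate zones.
candidates = [
    {"zone": "Z1", "population": 1200, "area_km2": 35.0},
    {"zone": "Z2", "population": 40, "area_km2": 12.0},   # too few people
    {"zone": "Z3", "population": 300, "area_km2": 0.4},   # too small
]

kept = [
    z["zone"] for z in candidates if meets_minimums(z["population"], z["area_km2"])
]
print(kept)
```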

Descriptions of columns

  • Estimated population (win_population): Estimated population within the zone (calculated from the publicly available Facebook High-Resolution Population Density Maps or WorldPop estimates). These population estimates are provided as counts per grid tile on the earth’s surface. We map each tile to a commuting zone, then sum the population of all tiles within the commuting zone polygon defined in the geometry field. We then winsorize the bottom and top 5% of commuting zones: population counts below the 5th percentile are replaced by the 5th percentile value, and those above the 95th percentile are replaced by the 95th percentile value. All commuting zones have a population of at least 50.

  • Estimated road length (win_roads_km): Estimated length of roads within the zone in kilometers (calculated from the publicly available OpenStreetMap data plus roads we’ve detected that are missing from it). We calculate this value by summing the length of all roads within the commuting zone polygon defined in the geometry field. We then winsorize the bottom and top 5% of commuting zones: road lengths below the 5th percentile are replaced by the 5th percentile value, and those above the 95th percentile are replaced by the 95th percentile value.

  • Area (area): Area of commuting zone in square kilometers. All commuting zones are at least 1 kilometer by 1 kilometer.

  • Restaurant/Bar count (restaurant_bar_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.

  • Library count (library_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.

  • Grocery/Food count (grocery_food_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.

  • Education count (education_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.

  • Local business count (local_business_locations_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.

  • Parks count (parks_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.

  • Health services count (health_services_count): A scaled score from 0 to 1,000, calculated by counting all the Facebook Pages of this type within each commuting zone globally. We then winsorize those counts: values below the 2.5th percentile are replaced by the 2.5th percentile value, and those above the 97.5th percentile are replaced by the 97.5th percentile value. The winsorized counts are then scaled linearly between 0 and 1,000, so that the smallest winsorized value scores 0 and the largest scores 1,000.
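The winsorizing and scaling steps described in the column definitions above can be illustrated with a short, dependency-free sketch. The helper names, the sample Page counts and the linear-interpolation percentile rule are assumptions made for illustration; the exact percentile method used to build the dataset is not stated here.

```python
def percentile(sorted_values, q):
    """Percentile q (0-100) of a sorted list, using linear interpolation (an assumption)."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def winsorize(values, lower_q, upper_q):
    """Clip values to the [lower_q, upper_q] percentile range."""
    ordered = sorted(values)
    low = percentile(ordered, lower_q)
    high = percentile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]


def scale_0_to_1000(values):
    """Map the smallest value to 0 and the largest to 1,000, linearly in between."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [1000 * (v - low) / (high - low) for v in values]


# Hypothetical Page counts per commuting zone, with one extreme outlier.
page_counts = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 500]

clipped = winsorize(page_counts, 2.5, 97.5)   # amenity columns use 2.5% / 97.5%
scores = scale_0_to_1000(clipped)

print([round(s, 1) for s in scores])
```

Calling `winsorize(values, 5, 95)` without the scaling step corresponds to the treatment described for `win_population` and `win_roads_km`.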

import matplotlib.pyplot as plt

# `commuting_zones` is assumed to be a GeoDataFrame loaded earlier,
# with `cz_gen_ds` parsed as datetimes.
fig, axs = plt.subplots(1, 3, figsize=(20, 8))
ax = axs.flatten()

# One panel per year, sharing a color scale so the panels are comparable.
for idx, year in enumerate([2021, 2022, 2023]):
    # Draw the color-bar legend only on the last panel.
    show_legend = idx == 2
    commuting_zones[commuting_zones["cz_gen_ds"].dt.year == year].plot(
        column="win_population", ax=ax[idx], vmin=0, vmax=4500000, legend=show_legend
    )
    ax[idx].set_title(year)
    # Hide spines, ticks and tick labels on every panel.
    ax[idx].axis("off")

plt.suptitle("Estimated population counts")
Figure: Estimated population counts by commuting zone (panels: 2021, 2022, 2023).
commuting_zones[["cz_gen_ds", "win_population"]].groupby("cz_gen_ds").sum()
cz_gen_ds     win_population
2020-04-10             67239
2021-03-01           1881941
2021-04-07           6925978
2022-08-07           6970833
2023-03-05          11366693
fig, axs = plt.subplots(1, 3, figsize=(20, 8))
ax = axs.flatten()

# One panel per year, sharing a color scale so the panels are comparable.
for idx, year in enumerate([2021, 2022, 2023]):
    # Draw the color-bar legend only on the last panel.
    show_legend = idx == 2
    commuting_zones[commuting_zones["cz_gen_ds"].dt.year == year].plot(
        column="win_roads_km", ax=ax[idx], vmin=0, vmax=11000, legend=show_legend
    )
    ax[idx].set_title(year)
    # Hide spines, ticks and tick labels on every panel.
    ax[idx].axis("off")

plt.suptitle("Estimated road length in km")
Figure: Estimated road length in km by commuting zone (panels: 2021, 2022, 2023).
fig, axs = plt.subplots(2, 3, figsize=(12, 10))
ax = axs.flatten()

# Amenity score columns to map, one panel each.
amenity_columns = [
    "restaurant_bar_count",
    "library_count",
    "grocery_food_count",
    "education_count",
    "local_business_locations_count",
    "health_services_count",
]

for idx, column in enumerate(amenity_columns):
    # Draw the color-bar legend only on the last panel.
    show_legend = idx == len(amenity_columns) - 1
    commuting_zones.plot(column=column, ax=ax[idx], vmin=0, vmax=1000, legend=show_legend)
    ax[idx].set_title(column.replace("_", " ").capitalize())
    ax[idx].axis("off")

# Set the figure title once, after all panels are drawn.
plt.suptitle("Number of Amenities in Commuting Zones")
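Figure: Number of Amenities in Commuting Zones (panels: Restaurant bar count, Library count, Grocery food count, Education count, Local business locations count, Health services count).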

Observations

  • The area close to the Syrian border appears to have a high level of economic activity and a large population.