Introducing the Data Partnership on GitHub

by Gabriel Stefanini Vicente

Since its launch in 2019, when the Inter-American Development Bank and the International Monetary Fund joined the initiative, the Development Data Partnership has scaled up fast: more than 100 projects have been brought to life, supported by partnerships with more than 20 companies.

From its inception, the Partnership has seen growing interest and success in using third-party data for economic development, especially where data scarcity poses extra challenges. Today, motivated by the Sustainable Development Goals, we are supporting teams working on confronting the COVID-19 second wave, mobility indicators during the COVID-19 pandemic, a COVID-19 agent-based model, locust spread in Africa, estimating access to employment opportunities, accessibility, road safety, gender inequality, the digital divide, food security and many others.

The key to expanding this success is to keep building on the Partnership’s pillars: legal foundation, responsible data use, multi-disciplinary teams, secure architecture, data partnership management and data goods.

As more data partners and contributors join the initiative and more methodologies, results and derivative works are created, feeding them back to the community, with curation by the Partnership, should create a virtuous circle where common challenges, resources and expertise are shared and improved upon.

To achieve this goal and to give the community a home, we are introducing the Development Data Partnership on GitHub!

On GitHub, the Partnership welcomes specialists, researchers, data scientists and contributors to collaborate, find code examples, open issues, start discussions and learn from colleagues by example.

Another component is our content curation. As more and more projects join the community, you will find more examples and applications of third-party data’s potential for international development.

Documentation

For each Data Partner, the Partnership has compiled a selection of tutorials and examples in the corresponding section of the Documentation.

Through the Development Data Partnership, you are now a contributor with access to a trove of data documentation, code snippets, collaboration tools and compute resources, all designed to help you get started on your project more quickly and effectively.

We are not reinventing the wheel; inspired by the literate programming paradigm, the documentation offers a centralized place for references and examples as well as details about the datasets, such as how to access the data (Getting Access), terms and conditions of the master data license, Summary and Examples.

In addition, we are bringing the power of Jupyter Book to create a beautiful rendition of notebooks that reads like a book, improving discoverability and dissemination of knowledge.

Projects

Getting Access

Once you have submitted your proposal to the Partnership and the data partner has approved it, this section details how you should expect to get access to the data.

Even before your project takes shape, this section helps you estimate what kind of resources will be required from your team.

Summary

The rationale for a data summary is to give a sneak peek into the datasets, even before submitting a project proposal and exploring the data yourself.

For example, when exploring the mobile data opportunities, you will be able to see the number of records by geographic region and by day, so you can evaluate the quality of each data partner's sampling. In addition, you will be able to compare those metrics between different partners, possibly deciding on one dataset or a combination of datasets.
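
As a rough illustration of the kind of summary described above, the sketch below counts records per partner, region and day, then totals them per partner for comparison. The records, the field names (`partner`, `region`, `day`) and the helper functions are hypothetical and are not taken from the Partnership's documentation.

```python
from collections import Counter

# Hypothetical records: one dict per observation. The field names
# ("partner", "region", "day") are assumptions made for illustration.
records = [
    {"partner": "partner_a", "region": "north", "day": "2021-01-01"},
    {"partner": "partner_a", "region": "north", "day": "2021-01-01"},
    {"partner": "partner_a", "region": "south", "day": "2021-01-02"},
    {"partner": "partner_b", "region": "north", "day": "2021-01-01"},
]


def count_records(rows):
    """Count records for each (partner, region, day) combination."""
    return Counter((r["partner"], r["region"], r["day"]) for r in rows)


def totals_by_partner(counts):
    """Sum the per-region, per-day counts into one total per partner."""
    totals = Counter()
    for (partner, _region, _day), n in counts.items():
        totals[partner] += n
    return totals


counts = count_records(records)
totals = totals_by_partner(counts)

# Print the per-region, per-day counts in a stable order.
for key in sorted(counts):
    print(key, counts[key])
```

Comparing `totals` across partners, or the per-day counts within one region, gives a first impression of how evenly each dataset samples the area of interest.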

Examples and Snippets

Taking inspiration from the Don’t repeat yourself (DRY) principle, examples, snippets and facilitators are at the heart of why the documentation was brought to life.

One example of a facilitator created by the Data Partnership is the one for handling Facebook.

Unfortunately, no API is available for downloading from Facebook GeoInsights. Fortunately, the Data Partnership supports an easy alternative to programmatically download the files via the Facebook class in the datapartnership Python package.

After installing the package, you will be able to create a login session and download the files programmatically from your notebook. See more about Facebook on the repository.

Tips and Tricks

As the name suggests, this section is where you can find tips and guides on how to set up common tools in projects, such as AWS, Python and environment variables.
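
As one illustration of the environment-variable setup mentioned above, here is a minimal sketch that reads AWS settings from the environment and fails early if a required one is missing. The variable names are the standard AWS ones; the `load_aws_settings` helper and the fallback region are assumptions for this sketch, not part of the Partnership's tooling.

```python
import os

# Standard AWS environment variable names. The helper below is hypothetical.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def load_aws_settings(env=None):
    """Read AWS settings from environment variables, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "access_key_id": env["AWS_ACCESS_KEY_ID"],
        "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        # The region is optional here; "us-east-1" is an arbitrary fallback for the sketch.
        "region": env.get("AWS_DEFAULT_REGION", "us-east-1"),
    }
```

Keeping credentials in environment variables rather than in notebooks means they are less likely to be committed to a shared repository by accident.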

Collaboration

The Partnership is not only about licensing and data; it is also about sharing code and knowledge and building a community. That is why we take advantage of everything GitHub has to offer.

You are more than welcome to share your own contributions. Certainly your team’s efforts will be much appreciated and recognized by your peers and the public.