About Data Commons

Publicly available data from open sources (census.gov, NOAA, data.gov, etc.) are vital resources for students and researchers in a variety of disciplines. Unfortunately, processing these datasets is often tedious and cumbersome. Each organization follows its own conventions for encoding datasets, so combining data from different sources requires mapping common entities (city, county, etc.) and reconciling different types of keys and identifiers. This process is time-consuming and increases the likelihood of methodological errors.

Data Commons attempts to synthesize a single graph from these different data sources. It links references to the same entity (a city, county, organization, etc.) across different datasets to a single node on the graph, so that users can access data about a particular entity, aggregated from different sources, without cleaning or joining the data themselves. Like the Web, Data Commons is open: any user can contribute datasets or build applications powered by the graph. In the long term, we hope the data contained within Data Commons will be useful to students, researchers, and enthusiasts across different disciplines. Though we've "jump-started" the graph with data from publicly available sources (CDC, US Census, FBI, etc.), we encourage you to join and contribute.
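To make the idea of linking entities concrete, here is a minimal sketch in Python of how two datasets that identify the same county differently might be merged onto one node. It is an illustration only, not the Data Commons implementation or API: the lookup tables, the node ID format, the field names, and the values are all made up for this example.

```python
from collections import defaultdict

# Hypothetical lookup tables: each source's own key maps to one shared node ID.
FIPS_TO_NODE = {"06085": "county/santa_clara"}
NAME_TO_NODE = {"Santa Clara County, CA": "county/santa_clara"}


def build_graph(census_rows, health_rows):
    """Attach facts from both sources to the node each row refers to."""
    graph = defaultdict(dict)
    for row in census_rows:
        # This source identifies counties by FIPS code.
        node = FIPS_TO_NODE[row["fips"]]
        graph[node]["population"] = row["population"]
    for row in health_rows:
        # This source identifies counties by display name.
        node = NAME_TO_NODE[row["county_name"]]
        graph[node]["clinic_count"] = row["clinic_count"]
    return dict(graph)


# Placeholder values, not real statistics.
census_rows = [{"fips": "06085", "population": 1000}]
health_rows = [{"county_name": "Santa Clara County, CA", "clinic_count": 5}]

graph = build_graph(census_rows, health_rows)
print(graph["county/santa_clara"])
```

Once both sources resolve to the same node, a user can read every fact about the county from one place instead of joining the two tables by hand.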

See also