Explore the data using one of our visualization tools (Scatter, Map,
Timeline) or the Place Explorer, use the APIs via Python notebooks, or build your own
REST-based application. Explore the thousands of variables in the Statistical Variable Explorer. Contribute to the project on GitHub.
New update
We have launched a new Data Download Tool that allows you to easily download statistical data for a large number of places with just a few button clicks.
We have continuously added data to the Data Commons graph over the last 4 years. As of April 2022, the graph contains:
1.4 trillion triples
3 billion time series
2.9 million places
100,000 variables
Separately, the Biomedical Data Commons includes 200,000 variables and 850 billion triples.
Building connections
Data Commons covers many topics, from Demographics and Economics to
Emissions and the Climate. Aggregating data from multiple data sets makes it much easier to build connections across them. Here are some of our favorite data excursions,
some alarming, some sad, but always illuminating.
Climate change
Max projected summer temperatures for US counties (RCP 4.5) (source: NASA)
Climate change is not just about reducing carbon emissions. It is also
about adaptation to the change that is already happening. The change in
temperature is not as simple as 1.5°C vs 2°C vs 2.5°C. These are global
averages aggregated over a period of time. At every one of those levels,
there will be places that become much hotter and places that become
colder. The timing of peak temperatures also changes.
Explore what temperatures might be according to the CCSM4 model:
The progression in Europe over time (2040, 2045, 2050)
Heart condition vs. max projected summer temperature for US counties (RCP 4.5) (source: CDC, NASA)
The heart condition and temperature scatter plots the expected
peak temperatures (in 30 years) in a county against the fraction of people
suffering from coronary ailments. Note the outliers in the upper right
quadrant: counties like Todd County, SD and Oglala Lakota
County, SD, with a high incidence of coronary disease, can also expect
some of the highest temperature rises², something we need to prepare for.
Water withdrawal trends in California (source: USGS)
Water for crops, animals and humans could well be one of
the resources most at risk from climate change. California and
the Southwest are some of the biggest consumers of water. However,
we can see that utilization is improving. California, in
particular, has seen irrigation water consumption go down while
agricultural yields have increased. Household water
consumption has stayed flat over the last 30+ years, while population has gone up substantially.
However, digging deeper, we find that in Imperial County (the third-highest county in terms of water
consumption), the use of groundwater has risen sharply even though overall water consumption, including surface water
use, has gone down.
Fraction of positive Covid-19 cases vs. Fraction of uninsured across US counties (source: US Census, New York Times)
As many insightful articles from the New York Times and others pointed
out, Covid-19 affected African American communities disproportionately.
Unfortunately, Covid-19 prevalence is correlated³ with many other
indicators. For example, we see that Covid-19 infection rates are highly
correlated with the fraction of the population that is uninsured, the fraction of the population in poverty, the fraction of the population on
food stamps, etc.
Of course, these are just correlations. This Colab
notebook digs deeper, performing a causal analysis to discover the
variables most causally predictive of Covid-19 occurrence and
morbidity.
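The kind of correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis in the Colab notebook: the `pearson` helper and the two lists are hypothetical, and the numbers are made-up placeholders rather than values from the Data Commons graph.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values for five hypothetical counties (NOT real data).
uninsured_fraction = [0.05, 0.08, 0.12, 0.15, 0.20]
positive_case_fraction = [0.020, 0.030, 0.035, 0.050, 0.060]

r = pearson(uninsured_fraction, positive_case_fraction)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 only says the two fractions rise together across counties; it does not say that one causes the other, which is why a separate causal analysis is needed.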
Income and other inequalities
Prevalence of obesity vs. Fraction of population in poverty (source: US Census, CDC)
Studies from the CDC and
others have shown a correlation³ between obesity and poverty. In this Colab
notebook, we explore the relation between poverty, unemployment and
obesity. Unfortunately, many other medical conditions are inversely
correlated with economic well-being.
Explore the relation between these variables for counties across the US:
1. RCP 2.6 (optimistic) represents a stringent mitigation scenario, while RCP 8.5 (pessimistic) represents a scenario with very high greenhouse gas emissions. Source: IPCC