Launch of the New Data.gov Catalog

On February 5, 2021, we will launch a new version of the Data.gov catalog.

The new catalog is the culmination of many months of work updating the behind-the-scenes functioning of the Data.gov catalog, which automatically harvests from over 1,000 federal, state, and local open data sources to provide a comprehensive catalog of open government data. The new catalog runs an updated version of CKAN, the open source technology behind the Data.gov catalog, and should improve the process of automatically updating Data.gov with the most recent datasets.

Most users will not notice any differences in the new catalog. At first, the front page of the catalog will show a small reduction in total datasets. That reflects the deletion of outdated datasets from harvest sources that are no longer maintained, mostly at the state and local level. As we have noted in the past, the front page number of total datasets changes frequently because many harvest sources are checked every day for updates. Also, the catalog counts each collection of datasets as “1” in the total number. The validation and update of all 1,098 harvest sources should result in an improved user experience with a higher degree of accuracy and fewer broken links. In the coming months, we will deploy additional features to continue to improve the quality of the Data.gov catalog.

COVID-19 is Complex, as is COVID-19 Open Data

This post was originally published on the Healthdata.gov blog by Kristen Honey, Chief Data Scientist and Senior Advisor to the Assistant Secretary for Health (ASH), HHS; Amy Gleason, Data Strategy and Execution Workgroup Lead, U.S. Digital Service; and Kevin Duvall, Deputy Chief Data Officer (CDO), Office of the CDO, HHS.

 

Due to demand, please access the data files here:
– Download: https://healthdata.gov/sites/default/files/reported_hospital_capacity_admissions_facility-level_weekly_average_timeseries_20201207.csv
– Data Dictionary: https://healthdata.gov/covid-19-reported-patient-impact-and-hospital-capacity-facility-data-dictionary
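
For readers who want a first look at the file before opening the data dictionary, the following is a minimal sketch in Python using only the standard library. It reports the column names and the number of data rows without assuming any particular column. The function names are illustrative, the UTF-8 encoding is an assumption, and the download link may have moved since this post was published.

```python
import csv
import io
import urllib.request

# URL copied from the download link above; it may no longer resolve.
DATA_URL = (
    "https://healthdata.gov/sites/default/files/"
    "reported_hospital_capacity_admissions_facility-level_weekly_average_timeseries_20201207.csv"
)


def summarize_csv(text_stream):
    """Return (column_names, row_count) for CSV text that has a header row."""
    reader = csv.reader(text_stream)
    header = next(reader, [])           # first row holds the column names
    row_count = sum(1 for _ in reader)  # remaining rows are data records
    return header, row_count


def fetch_and_summarize(url=DATA_URL):
    """Download the file and summarize it (requires network access)."""
    with urllib.request.urlopen(url) as response:
        # Assumption: the file is UTF-8 encoded.
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        return summarize_csv(text)
```

Calling `fetch_and_summarize()` returns the header list, which can then be checked against the data dictionary linked above.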

Today, the U.S. Department of Health and Human Services (HHS) published COVID-19 hospital data at the facility level for each week, going back to August 1, 2020. This information was previously aggregated by state. These new data and their quality are paramount to the U.S. pandemic response. COVID-19 models, analyses, and predictive analytics can only be as good as the data they ingest. Sharing these new hospital COVID-19 capacity data at the facility level with the public creates an opportunity to improve data quality.

This data is tremendously complex and is the result of substantial ongoing efforts by hospital, state, and Federal personnel who collaborate to meet daily data reporting requirements. This ongoing collaboration has steadily improved data quality and reporting consistency in recent weeks, even as the content of hospital capacity data has become more complex.

We opted not to let the perfect be the enemy of the good, so these datasets will have imperfections. To continue improving the quality of data, we welcome your feedback. When more people access and use the data, we have more collective ability to identify gaps, errors, or other problems with these COVID-19 datasets. All stakeholders (local community coordinators, data scientists, data journalists, and healthcare researchers) are encouraged to download, analyze, and study the datasets, and to share feedback with HealthData.gov.

We also encourage you to visit and contribute to Frequently Asked Questions (FAQs) about this hospital-level data. These living FAQs are crowdsourced from the public, so your contributions will improve these FAQs over time.

Commitment to Transparency and Open Data

By opening COVID-19 datasets, our collective goal is to accelerate scientific and public health insights and shorten the time it takes for COVID-19 information and solutions to save lives. Our HHS efforts are enabling a data-driven ecosystem for everyone. Based on public feedback and interaction with the COVID-19 datasets, this open-data ecosystem will evolve while guided by these principles: Transparency. Sharing. Privacy. Security. Community.

Today’s unprecedented pandemic demands near-real-time data sharing across government, across diverse sectors, and with the public. HHS is aggressively responding to this call to action by responsibly unlocking information wherever the societal benefits of data access outweigh the potential risks. This is a collaborative effort led by the HHS Office of the Chief Data Officer, within the Office of the Chief Information Officer, in close coordination with the Centers for Disease Control and Prevention (CDC), local/state/tribal governments, the HHS Office of the Assistant Secretary for Health (OASH), the HHS Office of the Assistant Secretary for Preparedness and Response (ASPR), and the White House U.S. Digital Service.

Beyond the whole-of-government response, COVID-19 necessitates a whole-of-America response. Your public input so far has enabled HHS to identify and prioritize information of highest value. We appreciate the hard-working journalists, data scientists, academic healthcare researchers, business innovators, and Americans who are giving time, energy, and insights to help us responsibly open data related to the novel coronavirus.

The data reporting is one part of the whole-of-government response and is intended to ensure that every patient requiring hospitalization receives the care they need. Patients should not be discouraged from seeking hospital care based on their interpretation of the data. Hospitals have protocols in place to keep patients safe from exposure and to ensure all patients are prioritized for care. By working together across sectors, we are harnessing all capabilities and resources to unleash the power of U.S. data for the COVID-19 response.

Together, we can combat COVID-19.

Disclaimer: Please visit the CDC’s website on COVID-19 for the most up-to-date information and COVID-19 guidance.

Improving Access to Older Adult Health Data for Timely Use Amid COVID-19 and Beyond

Today marks the launch of the Older Adults Health Data Collection – a new resource cataloging over 200 Federal datasets previously available on Data.gov related to the health of older Americans. This centralized location will assist experts from academia, industry, government, civil society, and the public in accessing datasets from various Federal agencies and across a range of health-related issues (e.g. health status, health risks and behaviors, and health care) to advance our collective knowledge and understanding of the health of older adults.

Analyzing data on the older adult population in the time periods before, during, and after the pandemic is an important step toward a better understanding of both the health of this age group and the immediate and long-term impacts of the pandemic. The Older Adults Health Data Collection contains datasets that capture outcomes directly related to COVID-19 as well as others that do not, and the number of COVID-19 related datasets in the collection is anticipated to grow over time. Datasets without COVID-19-specific data elements still provide important context on the health of older adults before the pandemic and are likely to incorporate COVID-19 data elements in their data collection procedures in the months and years to come.

Demographic trends underscore just how vital it will be to understand this growing population group. By 2030, one in five Americans will be 65 years and older. Over the next 40 years, the number of Americans 65 years and older will almost double. [1] COVID-19 has also significantly impacted this age group, as underscored by evidence that older adults have experienced the highest hospitalization and morbidity rates from the pandemic. [2] This is partly the result of older adults having high rates of chronic diseases, such as heart disease, type 2 diabetes, arthritis, and cancer, which are the Nation’s leading drivers of illness, disability, deaths, and health care costs.

[Figure: Projected number of older adults in the United States (source 1)]

[Figure: Coronavirus disease 2019 (COVID-19)-associated hospitalization rates (source 2)]

This new resource aligns with the intent of the Foundations for Evidence-Based Policymaking Act of 2018 and directives in the Federal Data Strategy 2020 Action Plan, both of which encourage increased public access to Government data, collaboration with non-Government entities, interagency collaboration, and protection of data security and confidentiality. This data collection also supports the efforts of the Federal Interagency Forum on Aging Related Statistics, which aims to improve the quality and utility of data on the aging population. Two of the Aging Forum’s goals are encouraging cross-national research on the aging population and promoting communication among data producers, researchers, and public policymakers. Additional non-health and non-Federal datasets related to older adults are available at agingstats.gov.

As the first national open data site, Data.gov is the ideal platform to provide a wide variety of stakeholders with access to the Older Adults Health Data Collection. Data in this collection has been made available in open formats while ensuring privacy and security, consistent with all applicable laws, regulations, and policies governing data use, disclosure, and sharing. By reducing the time spent searching for data on the health of older adults from hours to seconds, the Older Adults Health Data Collection serves as a helpful resource to put Federal data to work on behalf of Americans.

To access the Older Adults Health Data Collection, follow this link: https://catalog.data.gov/group/older-adults-health-data
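
Because the Data.gov catalog runs on CKAN, the same collection can likely be listed programmatically. The sketch below assumes the catalog still exposes the standard CKAN action API under `/api/3/action/` and that the group name matches the slug in the link above; the helper function names are illustrative only.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the catalog serves the standard CKAN action API at this path.
CATALOG_API = "https://catalog.data.gov/api/3/action/package_search"


def build_group_query(group="older-adults-health-data", rows=10):
    """Build a package_search URL filtered to one CKAN group."""
    params = {"fq": f"groups:{group}", "rows": rows}
    return f"{CATALOG_API}?{urllib.parse.urlencode(params)}"


def dataset_titles(api_response):
    """Extract dataset titles from a parsed package_search response."""
    return [pkg["title"] for pkg in api_response["result"]["results"]]


def fetch_titles(group="older-adults-health-data", rows=10):
    """Query the catalog and return dataset titles (requires network access)."""
    with urllib.request.urlopen(build_group_query(group, rows)) as response:
        return dataset_titles(json.load(response))
```

Calling `fetch_titles()` should return up to ten dataset titles from the collection, provided the API is reachable and the group name is unchanged.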

Sources

  1. U.S. Census Bureau, 2017 National Population Projections.
  2. Garg S, Kim L, Whitaker M, et al. Hospitalization Rates and Characteristics of Patients Hospitalized with Laboratory-Confirmed Coronavirus Disease 2019 — COVID-NET, 14 States, March 1–30, 2020. MMWR Morb Mortal Wkly Rep 2020;69:458–464. DOI: http://dx.doi.org/10.15585/mmwr.mm6915e3

 

Vijeth Iyengar, PhD, is a Policy Advisor at the White House Office of Science and Technology Policy

Mark C. Bicket, MD, PhD, is a White House Fellow at the White House Office of Science and Technology Policy

 

Hourly Electric Grid Monitor reports new information on U.S. electricity demand, net generation, and interchange collected by the U.S. Energy Information Administration

To a federal statistical agency like the U.S. Energy Information Administration (EIA), there’s nothing more satisfying than providing needed information that can facilitate more informed analysis and policy decisions on a national and regional level. EIA recently launched its new Hourly Electric Grid Monitor, a redesigned and enhanced version of EIA’s existing U.S. Electric System Operating Data website.  The data for the Hourly Electric Grid Monitor come from the Form EIA-930, Hourly and Daily Balancing Authority Operations Report, which collects hourly electricity demand, forecast demand, net generation, and interchange data from the 65 electricity balancing authorities that operate the electric grid in the Lower 48 states.  The Hourly Electric Grid Monitor incorporates two new data elements: hourly electricity generation by energy source and hourly subregional demand. The new website also provides new and more flexible options for visualizing the data and allows users to create custom dashboards that can be saved and shared.

Although electric system balancing authorities covering most of the United States have released real-time information on grid operations since the late 1990s, EIA’s Hourly Electric Grid Monitor expands the availability of data to the entire contiguous 48 states, and makes it available in a consistent format from a single source.

Among other applications, the data can be used to provide timely information on electric system recovery after power interruptions and to help evaluate the effects of renewable energy, smart grid, and demand-response programs on power system operations.  The tool allows you to visualize and analyze:

  • Total U.S. and regional electricity demand on an hourly basis
  • The varied mix of energy sources used to generate electricity at different times and locations
  • The hourly flow of electricity between electric systems
  • The wide variety in electric systems’ daily demand shapes and the seasonality of daily demand patterns
  • The extent to which electric systems rely on internal and external sources of supply to meet their demand
  • Potential stress on electric systems when actual demand significantly exceeds forecasted demand
  • Total hourly flows of electricity with Canada and Mexico
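
As an illustration of the demand-versus-forecast comparison in the list above, here is a minimal sketch in Python. It is not EIA's method: the record layout, the field names, and the 5% threshold are all assumptions made for the example, and real Form EIA-930 data would first need to be loaded into this shape.

```python
from dataclasses import dataclass


@dataclass
class HourlyReading:
    """Hypothetical record for one balancing authority and one hour."""
    hour: str           # timestamp label, e.g. "2020-07-01T15"
    demand_mw: float    # actual demand in megawatts
    forecast_mw: float  # forecast demand for the same hour in megawatts


def flag_stress_hours(readings, threshold=0.05):
    """Return readings where demand exceeds forecast by more than `threshold`.

    The threshold is a fraction of the forecast (0.05 means 5%); this value
    is an arbitrary assumption, not an EIA definition of stress.
    """
    flagged = []
    for reading in readings:
        if reading.forecast_mw <= 0:
            continue  # skip rows without a usable forecast
        overshoot = (reading.demand_mw - reading.forecast_mw) / reading.forecast_mw
        if overshoot > threshold:
            flagged.append(reading)
    return flagged
```

Raising or lowering `threshold` changes how sensitive the check is; a suitable value would depend on the system being studied.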

Have fun exploring!

Your Open Data Story

Open government data enables us to create tools that deliver insights on topics ranging from education and health to entrepreneurship and foreign aid. Families reviewing college options can compare tuition, graduation rates, and potential post-graduation salaries using the College Scorecard, an application built with Department of Education data. By analyzing CDC data on infant mortality and the USDA’s Food Access Research Atlas, researchers can study the relationship between a city’s infant mortality rates and citizens’ access to healthy food options. Add data from the U.S. Census Bureau, and those same researchers can tell a story about how a family’s annual household income may determine a newborn’s health outlook. A maker of personalized key chains can use the Social Security Administration’s Popular Baby Names dataset to predict which names on key chains will sell the most in each U.S. state. In California, for example, Noah was the most popular male name and Sophia the most popular female name in 2014 and 2015. How much U.S. foreign aid was allocated to Mexico for fiscal year 2015? Foreignassistance.gov has the answer.

Data.gov invites you to share your open data stories as you explore or download specific open government datasets. Doing so will provide feedback to government agencies about which datasets are in high demand and which ones need improvement. It will also help the Data.gov team curate open data topics and special features, including coverage of open data events and hackathons.

Share your open data story here