Navigating a Path Forward Through Health Data Uncertainty

SHADAC Staff
August 18, 2025

Introduction

For many decades, modern U.S. governance has been influenced by the concept of “evidence-based policymaking” — a philosophy that government can produce better results if public policy is crafted using data and research evidence to meet its goals.[1],[2] This model of policymaking hinges on the availability of relevant data and research findings that can be used as evidence to guide decision making, which is why the federal government has developed a sophisticated infrastructure for collecting and publishing various forms of public data.

Developments in recent months, however, have raised worries that the federal government could scale back its long-term leadership role in public data collection and publication. While there are many types of public data collected and published by the government, one such category is data that come from the numerous surveys conducted by a panoply of government agencies; these data, collected using varied survey methods, help answer a wide range of questions of concern to both the public and private sectors. 

For instance, the Substance Abuse and Mental Health Services Administration (SAMHSA) conducts the National Survey on Drug Use and Health (NSDUH) to produce estimates of the prevalence of certain mental health conditions and forms of substance use, which have been critical public health concerns for many years. Similarly, the U.S. Centers for Disease Control and Prevention funds and assists states in running the Behavioral Risk Factor Surveillance System (BRFSS) survey, which is used to monitor population health across a number of measures, including smoking, obesity, and cancer screening rates.

For the purposes of evidence-based policy, a primary concern about reductions of federal investments in collection and publication of survey data is a potential for the creation of “data gaps” that could hamper the policymaking process. Such data gaps could occur if the federal government cancels a handful of surveys or suppresses specific measures that serve as the sole sources of data on particular topics. In some cases, gaps could emerge for certain topics even if only one unique survey were to be canceled.

Background on Federal Data Stewardship

Since the founding of the United States, the collection of public data has been an important function of the federal government, with the U.S. Constitution itself directing the government to conduct a population census every 10 years. The results of the decennial U.S. census are used for a variety of government purposes, such as to decide on apportionment of representation in the U.S. Congress for the states, and to guide the distribution of funds for public programs that are based on population. 

Over time, however, the federal government has taken an expanding role in the collection and publication of other kinds of data as well. A famous example, reported on a monthly basis, is the unemployment data published by the U.S. Bureau of Labor Statistics, which it began compiling in 1940 (soon after the Great Depression) through the Current Population Survey.[3] These data are designed to serve two purposes. First, they can be used by the federal government itself to help guide public programs and policymaking, especially as it took a more assertive role in economic and social policy in response to the Depression. Second, because the data are reported publicly rather than hoarded for use only by the government, they became a public good that businesses, charities, and others could use to monitor and respond to economic conditions, playing a key role in building a robust economy overall.

The example of unemployment data is simply a case study, illustrating the rationale underpinning the role of the federal government as the longtime premier data steward for the United States as a whole. There are many reasons the federal government is well-suited to this role, such as:

  • Authority: Under federalism, the United States has multiple levels of government — federal, state, local — each with its own authorities and jurisdictions. While states, for instance, collect and publish a variety of different data, they typically focus within their own boundaries. The federal government, in contrast, holds cross-state authorities and jurisdiction across all U.S. territory. As such, the federal government can collect data in multiple states and publish those data in ways that are comparable across boundaries.

     

  • Resources: By virtue of its larger tax base and greater borrowing power, the federal government has more financial resources to engage in data collection efforts that can be expensive, such as fielding large surveys. Additionally, the federal government has built a sizeable administrative state with civil servants who have unique expertise in data collection and analysis; the federal government can also leverage its prestige to tap the expertise of others outside of government as public service. These superior resources can also result in efficiencies (i.e., economies of scale), as it is generally more cost-effective to conduct one large survey than 50 individual state surveys, for example.

     

  • Trust: While not unique to the federal government, data collection initiatives sponsored by agencies at any level of government are designed to benefit the public interest, rather than private interests. That focus on the public, rather than private, interest can help to engender trust, providing government-sponsored data collection efforts a competitive advantage over privately sponsored efforts. For instance, an individual may be more likely to respond to a survey about health by a known and trusted government agency than a similar survey sponsored by a health insurance company or a pharmaceutical company.

Despite this history and the advantages of federal data collection, recent developments have raised uncertainty about whether the federal government remains committed to playing the same data stewardship role as it has historically.

SHADAC’s State Alternatives for Health Data Continuity Project

In response to uncertainty about whether the federal government may eliminate or scale back important surveys, SHADAC has undertaken a new project called State Alternatives for Health Data Continuity, supported by the Robert Wood Johnson Foundation. 

This project includes two main components: 

  • Data scan — SHADAC is conducting a review of federal surveys that are focused on health topics, such as the U.S. Centers for Disease Control and Prevention’s Behavioral Risk Factor Surveillance System survey, which produces estimates on health status, health-risk behaviors, and other topics, as well as broader surveys that include important health-related measures. We are also reviewing survey documentation for information on health-related question domains and measurement concepts, and assessing where surveys include similar measures and where their measures are unique. The purpose of this effort is mainly to identify where data gaps could arise if the federal government were to eliminate or scale back particular surveys.

     

  • Expert interviews — SHADAC is also seeking to understand the potential implications of data gaps that could occur, and considering options for filling data gaps. We will interview researchers and government analysts who work with federal survey data to produce evidence for informing health policymaking, as well as representatives of other key stakeholder groups, such as organizations with experience in conducting state-level surveys, and foundations that fund research using health surveys. In addition to helping to understand the risks associated with data gaps that could come about from reductions in federal investments in health-related surveys, we will assess options for alternative data collection initiatives for filling potential data gaps.
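The overlap check described in the data scan can be illustrated with a minimal sketch. It assumes a hand-built inventory mapping each survey to the measures it collects; the survey names, the measure lists, and the function names `sources_by_measure` and `gaps_if_cancelled` are hypothetical placeholders, not findings or tools from the project.

```python
from collections import defaultdict

# Hypothetical inventory: survey name -> set of measures it covers.
# Names and coverage are placeholders, not results of the data scan.
SURVEY_MEASURES = {
    "Survey A": {"smoking", "obesity", "cancer screening"},
    "Survey B": {"smoking", "substance use", "mental health"},
    "Survey C": {"obesity", "insurance coverage"},
}


def sources_by_measure(inventory):
    """Invert the inventory: measure -> sorted list of surveys collecting it."""
    index = defaultdict(list)
    for survey, measures in inventory.items():
        for measure in measures:
            index[measure].append(survey)
    return {measure: sorted(surveys) for measure, surveys in index.items()}


def gaps_if_cancelled(inventory, cancelled):
    """Return measures left with no source if the `cancelled` surveys ended."""
    cancelled = set(cancelled)
    index = sources_by_measure(inventory)
    return sorted(
        measure
        for measure, surveys in index.items()
        if set(surveys) <= cancelled  # every source of this measure is gone
    )


if __name__ == "__main__":
    print(gaps_if_cancelled(SURVEY_MEASURES, ["Survey B"]))
```

In this toy inventory, a measure collected by two surveys survives the loss of either one, while a measure with a single source becomes a gap as soon as that survey ends. A real scan would also need to judge whether similarly named measures are truly comparable, which this sketch does not attempt.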

This project is focused on federal health surveys, as well as some broader federal surveys with important health-related measures. While there are many sources of federal data, including health data, it is important to acknowledge the unique role that surveys play.

For some questions, a survey is often the most practical or feasible way to collect data. As an example, because the U.S. health insurance landscape is made up of a variety of different forms of private and public health insurance coverage that can be regulated at varying levels of government (e.g., federal, state), there is no single U.S. agency that keeps administrative data on the particulars of everyone’s health plans, such as premiums, cost-sharing, networks, etc. In that case, a health insurance survey collecting those data from a small, representative sample of the population is likely to be a more practical option than having a single federal agency establish and maintain a database for the full U.S. population.

And, for other questions, collecting data from individuals using a survey questionnaire is really the only realistic way to collect such data. For example, there is simply no administrative data set (and no way to create one) that has data on individuals’ personal experiences relating to health, such as satisfaction and experiences with health care providers or health insurance plans. While one could collect qualitative data through interviews or focus groups on individuals’ experiences, a survey is the only method to collect these data on a larger scale and with methods to ensure they are representative of the population overall.

Contingency Planning to Anticipate and Fill Data Gaps

Part of the purpose of the Health Data Continuity project is to help individuals and organizations involved in the policymaking process plan how to respond if reductions in federal health-related survey infrastructure were to result in data gaps. Our aim is for the results of this project to be useful for any interested stakeholders, including elected and appointed public officials, government agencies, public health and health policy researchers, health-focused private sector and non-profit organizations, and anyone with a stake in health policy.

However, we also believe that individuals and organizations can and should begin contingency planning before potential data gaps present themselves. 

This can be done in a three-step process:

  1. Anticipate gaps and identify risks — First, determine what federal data sources you or your organization use and how those data are used; then, identify what gaps and risks could arise if those data became unavailable. Your use of a federal health-related survey may be direct, such as using BRFSS data to identify the demographic subpopulations with the highest rates of tobacco use in order to direct smoking-cessation initiatives. Or it may be indirect, such as using Census Bureau population estimates derived from the American Community Survey (ACS) to weight data from a survey you conduct yourself and use in public reports.

    Depending on the priorities and purposes of your organization, it may be possible to move forward without filling certain data gaps if they don’t pertain directly to work that is core to your mission. However, if the potential disappearance of certain data would seriously threaten your ability to conduct work core to your mission, then you should investigate options for filling potential data gaps. In addition to the relative size of the risks (i.e., the level of danger) posed by different potential data gaps, another consideration may be the time horizon. For instance, your organization may wish to prioritize contingency planning for potential data gaps that could arise in the near term, while taking additional time to investigate the risks and potential solutions for longer-term areas of uncertainty.

     

  2. Scan for alternative secondary data sources — Our modern world is swimming in data — sometimes the data we need, but sometimes not. If your data risk assessment from Step One determines you would need to fill data gaps, the most cost-effective and efficient alternative is often using another source of secondary data, meaning data that someone else has already collected. 

    Because of the vast data collection infrastructure that the federal government has built to support its evidence-based policymaking processes, there may be other federal data sources that you or your organization could use to fill gaps if one data source is discontinued. In some instances, you may be able to fill data gaps using surveys similar to those you are backfilling. But there are other options as well, such as data that the federal government collects and makes public in the course of running public programs. 

    In other cases, you may need to consider alternatives outside the federal government entirely. For instance, it is possible that state or local government agencies conduct surveys that include the data you need, or they may have administrative data that could fill the gaps. Additionally, there may be non-profit or private sector data sources that could fill your data gaps, such as health surveys conducted by foundations or survey research firms — though, unlike government data sources that are often made available publicly at little or no cost, these other survey data stewards may charge a fee to obtain their data.

     

  3. Consider primary data collection — While generally not a first choice, due to the higher costs, you or your organization could engage in your own primary data collection to fill data gaps. There are numerous ways to go about this. The best option will usually be driven by factors such as your expertise, resources, and potential partners, among others. For instance, if you have the relevant expertise and staffing, your organization could perhaps conduct a small survey yourselves. Conversely, if you have only limited expertise in survey data collection but have the financial resources to outsource the work, you could engage a survey vendor to help you design and field a survey. 

    Additionally, you may consider whether there are partners with whom you or your organization could collaborate. For instance, another organization may already operate a survey that includes most of the data you need, minus a few gaps. In that case, you might be able to persuade them to add questions to their survey in exchange for financial or other support to defray the costs of their survey. Or, if no partner with an existing survey is available, you or your organization may be able to build a new group with two or more partners to collaboratively fund, develop, and field a survey.
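The three steps above can be sketched as a small triage routine. This is only an illustration under stated assumptions: the `Dependency` fields, the three-level priority scale, and the example entries are hypothetical, and a real assessment would weigh more factors than three yes/no flags.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    """One data source your work relies on (all fields are illustrative)."""
    source: str                  # data source you or your organization use
    use: str                     # how the data are used
    core_to_mission: bool        # would losing it threaten core work?
    near_term: bool              # could the gap arise in the near term?
    secondary_alternative: bool  # did the Step Two scan find a substitute?


def priority(dep: Dependency) -> int:
    """Hypothetical scale: 1 = plan now, 2 = plan later, 3 = may leave unfilled."""
    if not dep.core_to_mission:
        return 3
    return 1 if dep.near_term else 2


def next_step(dep: Dependency) -> str:
    """Map a dependency to the step of the process it points toward."""
    if not dep.core_to_mission:
        return "accept gap"
    if dep.secondary_alternative:
        return "use secondary source"
    return "consider primary data collection"


def plan(deps):
    """Order dependencies by priority and attach a suggested next step."""
    ordered = sorted(deps, key=priority)  # stable sort keeps ties in input order
    return [(d.source, priority(d), next_step(d)) for d in ordered]


# Placeholder entries, not real assessments of any survey.
EXAMPLE = [
    Dependency("Survey C", "background context in reports", False, True, False),
    Dependency("Survey B", "weight an in-house survey", True, False, True),
    Dependency("Survey A", "target smoking-cessation outreach", True, True, False),
]

if __name__ == "__main__":
    for row in plan(EXAMPLE):
        print(row)
```

Keeping the priority and the next step as separate functions mirrors the text: how urgent a gap is depends on mission and time horizon, while how to fill it depends on whether a secondary source exists.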

Conclusion

The longtime leadership of the federal government in collecting and publishing health-related data from surveys has played a crucial role in furthering evidence-based policymaking. However, this new era of uncertainty about the future of these data sources means that individuals and organizations who have historically used those data need to engage in contingency planning. 

With our State Alternatives for Health Data Continuity project, SHADAC is developing resources to help anyone involved in the health policy process consider their options for identifying and filling gaps in important measures of health. We hope to begin publishing resources on our website this year (2025). In the meantime, if you have questions about the project, please contact us.


References

[1] Baron, J. (2018). A Brief History of Evidence-Based Policy. The ANNALS of the American Academy of Political and Social Science, 678(1), 40-50. https://doi.org/10.1177/0002716218763128

[2] Brownson, R. C., Chriqui, J. F., & Stamatakis, K. A. (2009). Understanding Evidence-Based Public Health Policy. American Journal of Public Health, 99(9), 1576–1583. https://pmc.ncbi.nlm.nih.gov/articles/PMC2724448/

[3] U.S. Bureau of Labor Statistics. (2015). Labor Force Statistics from the Current Population Survey: How the Government Measures Unemployment. https://www.bls.gov/cps/cps_htgm.htm