The Transformed Medicaid Statistical Information System (T-MSIS) is the largest national database of current Medicaid and Children’s Health Insurance Program (CHIP) beneficiary information collected from all U.S. states, most territories (Puerto Rico, Guam, and the Virgin Islands), and the District of Columbia (DC).[1] T-MSIS data are critical for monitoring and evaluating the utilization of Medicaid and CHIP, which together provide health insurance coverage to almost 77 million people.[2]
Due to their size and complexity, T-MSIS data files are challenging to use directly for research and analytic purposes. To optimize these files for health services research, the Centers for Medicare and Medicaid Services (CMS) repackages them into a user-friendly, research-ready format called T-MSIS Analytic Files (TAF) Research Identifiable Files (RIF). One such file, the Annual Demographic and Eligibility (DE) file, contains national demographic data, including race and ethnicity information for Medicaid and CHIP beneficiaries. This information is vital for assessing enrollment, access to services, and quality of care across racial and ethnic groups in the Medicaid/CHIP population, whose members are particularly vulnerable due to limited income, physical and cognitive disabilities, old age, complex medical conditions, housing insecurity, and other social, economic, behavioral, and health needs.
To guide researchers and other consumers in their use of T-MSIS data, CMS produces data quality assessments of the completeness of race and ethnicity data along with other data such as enrollment, claims, expenditures, and service use. The Data Quality (DQ) assessments for race and ethnicity data have been posted for data years 2014 through 2023 and indicate varying levels of “concern” regarding race and ethnicity data completeness. CMS provides Preliminary files (which include data that are not fully mature because state submissions are incomplete or in progress) and Release 1 files (fully mature data recommended for analysis).[3] In some years, if data were corrected or quality improved, CMS releases an additional version of the data called Release 2. Each file version has its own DQ assessment.[4]
While the completeness of race and ethnicity data reported to CMS has historically been inconsistent across the states, territories, and DC, SHADAC has been monitoring the quality of these data over time. This blog explores the 2023 TAF Data Release 1, the most recent T-MSIS race and ethnicity data for which a DQ assessment is available. This year’s release shows a slight improvement in data quality compared to the 2022 release, as we detail below.
Overall, since the 2019 T-MSIS race and ethnicity data release, data quality has varied from year to year but has improved slightly, with more states having usable data and more states moving from the High Concern category to the Medium Concern or Low Concern categories. We describe these trends further in this blog (see Table 3), along with a brief analysis of data quality trends that we plan to continue monitoring in future T-MSIS file releases.
Evaluation of T-MSIS Race and Ethnicity Data
DQ assessments for each year and data version of T-MSIS data are housed in the Data Quality Atlas (DQ Atlas), an online evaluation tool developed as a companion to T-MSIS data.[5] The DQ Atlas assesses T-MSIS race and ethnicity data using two criteria: the percentage of beneficiaries with missing race and/or ethnicity values in the TAF; and the number of race/ethnicity categories (out of five) that differ by more than ten percentage points between the TAF and American Community Survey (ACS) data.
Taken together, these two criteria indicate the level of “concern” (i.e., reliability) for states’ T-MSIS race/ethnicity data. To construct the external ACS benchmark for evaluating T-MSIS data, creators of the DQ Atlas combine race and ethnicity categories in the ACS to mirror race and ethnicity categories reported in the TAF (see Table 1). More information about the evaluation of T-MSIS race and ethnicity data is available in the DQ Atlas’ Background and Methods Resource.
Five “concern” categories appear in the DQ Atlas: Low Concern, Medium Concern, High Concern, Unusable, and Unclassified. States with substantial missing race/ethnicity data or race/ethnicity data that are inconsistent with the ACS—a premier source of demographic data—are grouped into either the High Concern or Unusable categories, whereas states with relatively complete race/ethnicity data or race/ethnicity data that align with ACS estimates are grouped into either the Low Concern or Medium Concern categories. The Unclassified category includes states for which benchmark data are incomplete or unavailable for a given data year and version.
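As a rough illustration of the two criteria described above, the following Python sketch computes the percentage of beneficiaries with missing race/ethnicity values and the number of categories whose TAF share differs from an ACS benchmark by more than ten percentage points. It is not the DQ Atlas implementation: the function name, the `group_1` through `group_5` labels, and all numbers are made up for illustration, and the choice to compute TAF shares among beneficiaries with non-missing values is an assumption that should be checked against the Background and Methods Resource.

```python
from typing import Mapping, Tuple


def race_ethnicity_criteria(
    taf_counts: Mapping[str, int],
    acs_percents: Mapping[str, float],
    threshold_points: float = 10.0,
) -> Tuple[float, int]:
    """Return (percent missing, number of categories that differ from the ACS).

    taf_counts holds beneficiary counts per category plus a "missing" key.
    acs_percents holds the benchmark share (0-100) for each category.
    """
    total = sum(taf_counts.values())
    if total == 0:
        raise ValueError("taf_counts must contain at least one beneficiary")

    missing = taf_counts.get("missing", 0)
    pct_missing = 100.0 * missing / total

    # Assumption: TAF shares are computed among beneficiaries with known values.
    known = total - missing
    n_differ = 0
    for category, acs_pct in acs_percents.items():
        taf_pct = 100.0 * taf_counts.get(category, 0) / known if known else 0.0
        if abs(taf_pct - acs_pct) > threshold_points:
            n_differ += 1
    return pct_missing, n_differ


# Made-up counts and benchmark shares for one hypothetical state.
example_taf = {
    "group_1": 500, "group_2": 200, "group_3": 100,
    "group_4": 50, "group_5": 50, "missing": 100,
}
example_acs = {
    "group_1": 40.0, "group_2": 25.0, "group_3": 20.0,
    "group_4": 10.0, "group_5": 5.0,
}
pct_missing, n_differ = race_ethnicity_criteria(example_taf, example_acs)
print(f"{pct_missing:.1f}% missing, {n_differ} category(ies) differ from the ACS")
```

In this made-up example, 10% of beneficiaries have missing values and one of the five categories differs from the benchmark by more than ten percentage points.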
Table 1. Crosswalk of Race and Ethnicity Variables between the TAF and ACS
Source: Medicaid.gov. (n.d.). DQ Atlas: Background and methods resource [PDF file]. Available from https://www.medicaid.gov/dq-atlas/downloads/background-and-methods/TAF-DQ-Race-Ethnicity.pdf. Accessed February 15, 2026.
Long description found in the ‘Methods’ section of the source document.
Quality Assessment by State
Table 2 shows the Race and Ethnicity DQ Assessments for the 2023 TAF Data Release 1. The categorization criteria used to determine the levels of concern for the 2023 TAF Release 1 data are the same as those used to assess T-MSIS data from previous years and versions. Seventeen states received a rating of “Low Concern,” 22 states (including Puerto Rico [PR]) fell into the “Medium Concern” category, and 12 states (including DC) received a rating of “High Concern.”
Most of the “Medium Concern” states (20 of 22) fell into the subcategory with the higher percentage range of missing race/ethnicity data (10% – <20%). A similar pattern can be seen among the “High Concern” states, most of which (8 of 12) fell into the subcategory with the highest percentage range of missing race/ethnicity data (20% – <50%).
Finally, one state (Utah) received an “Unusable” rating, meaning it was missing at least 50% (≥50%) of race/ethnicity data. Guam and the Virgin Islands (VI) are categorized as “Unclassified” in the 2023 TAF Data Release 1 due to insufficient or incomplete data, and do not appear in Table 2.
Table 2. Race and Ethnicity Data Quality Assessment, 2023 T-MSIS Analytic File (TAF) Data Release 1
| Data Quality Assessment | Percent of Beneficiaries with Missing Race/Ethnicity Values | Number of Race/Ethnicity Categories Differing by More Than 10 Percentage Points Between TAF and ACS | Number of States | States |
|---|---|---|---|---|
| Low Concern | <10% | 0 | 17 | AK, CO, DE, ID, MI, MO, NE, NV, NH, NM, NC, ND, OH, OK, PA, SD, WA |
| Medium Concern | <10% | 1 or 2 | 2 | KS, VA |
| Medium Concern | 10% – <20% | 0 or 1 | 20 | AL, AR, CA, FL, GA, IL, IN, ME, MD, MN, MS, MT, NJ, OR, PR, TN, TX, VT, WV, WI |
| High Concern | <10% | 3 or more | - | - |
| High Concern | 10% – <20% | 2 or more | 4 | AZ, KY, LA, RI |
| High Concern | 20% – <50% | Any value | 8 | CT, DC, HI, IA, MA, NY, SC, WY |
| Unusable | ≥50% | Any value | 1 | UT |
Table 2 Notes: *T-MSIS includes all 50 states, the District of Columbia (DC), and the U.S. territories of Guam, Puerto Rico (PR), and the Virgin Islands (VI). However, a DQ assessment is not available for Guam and VI in the 2023 TAF (Data Version: Release 1) due to incomplete/unavailable data.
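The thresholds in Table 2 can be read as a small decision rule. The Python sketch below is one way to express it, not CMS code: the function name `classify_concern` is hypothetical, exactly 50% missing is treated as “Unusable” following the “at least 50%” wording in the text, and returning “Unclassified” when an input is unavailable is a simplifying assumption.

```python
from typing import Optional


def classify_concern(
    pct_missing: Optional[float],
    n_categories_differ: Optional[int],
) -> str:
    """Map the two Table 2 criteria to a concern level.

    pct_missing: percent of beneficiaries with missing race/ethnicity (0-100).
    n_categories_differ: categories (0-5) differing from the ACS by >10 points.
    """
    # Assumption: a missing input stands in for unavailable benchmark data.
    if pct_missing is None or n_categories_differ is None:
        return "Unclassified"

    if pct_missing >= 50:
        return "Unusable"
    if pct_missing >= 20:
        return "High Concern"  # 20% - <50%, any number of categories
    if pct_missing >= 10:
        # 10% - <20%: 0 or 1 categories is Medium, 2 or more is High
        return "Medium Concern" if n_categories_differ <= 1 else "High Concern"

    # <10% missing: the rating depends only on the ACS comparison
    if n_categories_differ == 0:
        return "Low Concern"
    if n_categories_differ <= 2:
        return "Medium Concern"
    return "High Concern"


# Made-up inputs, one per row of Table 2.
for pct, n in [(5, 0), (5, 2), (15, 1), (5, 3), (15, 2), (30, 0), (60, 0)]:
    print(f"{pct}% missing, {n} categories differ -> {classify_concern(pct, n)}")
```

Checking missingness from the highest band down keeps each branch aligned with a single row of the table, which makes the rule easy to audit against Table 2.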
Despite ongoing state-level variation in the completeness of race and ethnicity data reported to CMS, SHADAC researchers have noted a slight trend toward better-quality data overall since they began tracking these assessments in 2019, most noticeably in the declining number of states with unusable data. As seen in Table 3, however, improvement has not been linear.
For example, the number of states with data of “High Concern” decreased by two from 2022 to 2023, with Oregon and Tennessee moving from the “High Concern” category to the “Medium Concern” category. The 2023 race/ethnicity TAF data from 12 states received a rating of “High Concern,” compared to 14 states’ data in 2022 and 11 states’ data in 2021. The number of states with data of “Medium Concern” stayed the same from 2022 to 2023, at 22 states. The number of states with data of “Low Concern” increased by two from 2022 to 2023, with Colorado and Idaho moving into the category from the “Medium Concern” category. Only one state (Utah) received an “Unusable” data assessment, the same as in 2022.
Table 3. Race and Ethnicity Data Quality Over Time: State Counts of 2019–2023 T-MSIS Analytic File (TAF) Data Releases
| Data Quality Assessment | 2019 | 2020 | 2021 | 2022 | 2023 |
|---|---|---|---|---|---|
| Low Concern | 15 | 15 | 16 | 15 | 17 |
| Medium Concern | 14 | 17 | 22 | 22 | 22 |
| High Concern | 17 | 16 | 11 | 14 | 12 |
| Unusable | 5 | 4 | 3 | 1 | 1 |
Sources: SHADAC. (January 11, 2022). Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files Updated December 2021 – Features 2019 Data. Available from https://www.shadac.org/news/updated-raceeth-data-tmsis-files-2019data. Accessed March 27, 2026; SHADAC. (January 11, 2023). Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files: 2020 Data Assessment. Available from https://www.shadac.org/news/raceethnicity-data-cms-medicaid-t-msis-analytic-files-2020-data-assessment. Accessed March 27, 2026; SHADAC. (December 6, 2023). Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files: 2021 Data Assessment. Available from https://www.shadac.org/news/raceethnicity-data-cms-medicaid-t-msis-analytic-files-2021-data-assessment. Accessed March 27, 2026; SHADAC. (November 26, 2024). Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files: 2022 Data Assessment. Available from https://www.shadac.org/news/race-ethnicity-data-tmsis-analytic-files-TAF-2022-data. Accessed March 27, 2026.
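The year-over-year comparisons discussed above can be reproduced directly from the counts in Table 3. The short Python sketch below does that arithmetic; the names `TABLE_3`, `change`, and `low_or_medium` are illustrative, and grouping “Low Concern” and “Medium Concern” together is simply a convenient summary used here, not a DQ Atlas measure.

```python
# State counts by concern level and data year, copied from Table 3.
TABLE_3 = {
    "Low Concern":    {2019: 15, 2020: 15, 2021: 16, 2022: 15, 2023: 17},
    "Medium Concern": {2019: 14, 2020: 17, 2021: 22, 2022: 22, 2023: 22},
    "High Concern":   {2019: 17, 2020: 16, 2021: 11, 2022: 14, 2023: 12},
    "Unusable":       {2019: 5,  2020: 4,  2021: 3,  2022: 1,  2023: 1},
}


def change(category: str, start_year: int, end_year: int) -> int:
    """Change in the number of states in a category between two data years."""
    return TABLE_3[category][end_year] - TABLE_3[category][start_year]


def low_or_medium(year: int) -> int:
    """Number of states rated Low Concern or Medium Concern in a data year."""
    return TABLE_3["Low Concern"][year] + TABLE_3["Medium Concern"][year]


for category in TABLE_3:
    print(f"{category}: {change(category, 2022, 2023):+d} from 2022 to 2023")

for year in (2019, 2020, 2021, 2022, 2023):
    print(f"{year}: {low_or_medium(year)} states rated Low or Medium Concern")
```

From 2022 to 2023 this gives +2 for Low Concern, 0 for Medium Concern, -2 for High Concern, and 0 for Unusable. The number of states rated Low or Medium Concern rises from 29 in 2019 to 39 in 2023, with a dip from 38 to 37 between 2021 and 2022, consistent with improvement that is not linear.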
Visualizing T-MSIS Data in the DQ Atlas
The DQ Atlas enables users to generate maps and tables that compare the quality of T-MSIS data between states across different topics, such as race/ethnicity, age, income, and gender. Visualizing T-MSIS data in this manner can help researchers quickly assess the completeness of a single variable as well as the relative completeness (or incompleteness) of certain variables compared to others. For example, in the 2023 TAF Data Release 1, all states and territories received a “Low Concern” rating for age data, whereas only 34 states and territories received a “Low Concern” rating for income data.
We have used this feature to generate a map of the race/ethnicity data quality assessments (see Figure 1).
Figure 1. 2023 TAF Data Release 1 Data Quality Assessments of Beneficiary Race/Ethnicity by U.S. State/Territory, Map
Notes: Green = low concern; yellow = medium concern; orange = high concern; red = unusable; grey = unclassified.
Source: Medicaid.gov. (n.d.). DQ Atlas: Race and Ethnicity [2023 Data set: Version: Release 1]. Available from https://www.medicaid.gov/dq-atlas/landing/topics/single/map?topic=g3m16&tafVersionId=50. Accessed February 15, 2026.
Looking Ahead
Increasingly, a wide range of voices from non-profits, health insurers, disability policy researchers, and state-based marketplaces have called for improving the collection of race, ethnicity, and language data, often with the goals of having more complete data and advancing health equity. CMS also supports efforts to improve the quality and availability of T-MSIS data, primarily through the TAF releases and data quality assessments, which allow for continued tracking of progress toward data quality and collection goals. In addition, CMS helps states improve their data collection at the front end by setting data quality compliance standards that states must meet to receive financial assistance, and it uses performance indicators (PIs) to evaluate state data quality at the conclusion of states’ annual collection and reporting efforts. Jointly, these strategies aim to improve how states and the federal government capture the diversity of the U.S. population that interacts with or is served by the Medicaid and CHIP programs.
Continued progress toward implementation of the Office of Management and Budget’s (OMB’s) Statistical Policy Directive No. 15 (i.e., SPD 15: Standards related to the collection of race and ethnicity data) is an encouraging step toward the goal of having more complete and granular race/ethnicity data. This directive aligns with available evidence on collecting race/ethnicity data, and with recommendations that are consistent with standards implemented by states leading in REL (race, ethnicity, language) data collection, such as California, Oregon, and New York. Importantly, SPD 15 explicitly states that these standards should serve as a minimum baseline from which to collect and provide more granular data.
While these standards are named as minimum reporting categories specifically for data collection throughout the federal government, once fully adopted by 2029 they are likely to shape data collection and reporting across all sectors, including state-level survey data, race/ethnicity data collected through the Medicaid application process, and more.
While there is encouraging progress in the evolving standards of race/ethnicity data collection, many states still report difficulties submitting their data because state eligibility systems, the Medicaid Management Information System (MMIS), and T-MSIS format race and ethnicity data differently. Before states submit data to T-MSIS, they must reformat and aggregate the data, which may affect the quality of what is submitted. One approach proposed by the Medicaid and CHIP Payment and Access Commission (MACPAC) to improve the collection and reporting of data is to provide states with an updated model application that uses evidence-based approaches to race and ethnicity questions to improve applicant response rates and data accuracy.
SHADAC plans to continue to monitor and track T-MSIS data quality assessments as they become available. To stay up to date with our latest publications, subscribe to our newsletter or follow our LinkedIn page. If you’re interested in race/ethnicity data collection and how it could be improved, check out the following publications from SHADAC and our partners:
- Considerations from SHADAC: Proposed Revisions to Federal Standards for Collecting Race/Ethnicity Data
- Collecting Race, Ethnicity, and Language (REL) Data on Medicaid Applications: 50-State Review Shows Wide Variation in How States Gather this Information
- Exploring Strategies to Fill Gaps in Medicaid Race, Ethnicity, and Language Data
Sources
[1] Medicaid.gov. Transformed Medicaid Statistical Information System (T-MSIS). Retrieved October 20, 2022, from https://www.medicaid.gov/medicaid/data-systems/macbis/transformed-medicaid-statistical-information-system-t-msis/index.html#
[2] Medicaid.gov. October 2025 Medicaid & CHIP Enrollment Data Highlights. Retrieved on February 15, 2026, from https://www.medicaid.gov/medicaid/program-information/medicaid-and-chip-enrollment-data/report-highlights/index.html
[3] ResDAC. (December 19, 2024). 2022 and 2023 T-MSIS Medicaid and CHIP Data Now Available. Retrieved February 15, 2026, from https://resdac.org/cms-news/2022-and-2023-t-msis-medicaid-and-chip-data-now-available
[4] ResDAC. (n.d.). T-MSIS Analytic File Inpatient. Retrieved February 15, 2026, from https://resdac.org/cms-data/files/taf-ip
[5] Saunders, H., & Chidambaram, P. (April 28, 2022). Medicaid Administrative Data: Challenges with Race, Ethnicity, and Other Demographic Variables. Kaiser Family Foundation. Retrieved October 31, 2022, from https://www.kff.org/medicaid/issue-brief/medicaid-administrative-data-challenges-with-race-ethnicity-and-other-demographic-variables/