Race/Ethnicity Data in CMS Medicaid (T-MSIS) Analytic Files Updated February 2020 – Features 2018 Data
Originally published August 2020; Updated February 2021:
As the coronavirus (COVID-19) crisis evolves, it has become increasingly clear that vulnerable subpopulations are being disproportionately impacted, both in terms of disease burden and the economic downturn. Medicaid has always played an essential role in providing coverage and care to vulnerable populations, particularly during economic downturns. As a result, the ability to evaluate enrollment, access to services, and quality of care by race and ethnicity subpopulations is critical.
The Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files (TAF) are an enhanced set of data on beneficiaries in Medicaid and the Children’s Health Insurance Program (CHIP). The Centers for Medicare and Medicaid Services (CMS) recently released updated information about the completeness of race/ethnicity information in the TAF. This includes information on data completeness for data years 2014-2019, as well as notes on the data versions (e.g., 2019 data are available, but are listed as preliminary and have not been assigned an assessment score). CMS classifies data in the TAF under various levels of “concern” to help guide researchers who are considering using the data.
Validation Using American Community Survey (ACS) Data
While past versions of the Data Quality (DQ) assessment were based only on the percentage of beneficiaries with missing race and/or ethnicity values, the new assessment also validates the data against an external benchmark, the U.S. Census Bureau's American Community Survey (ACS). The validation used the ACS 5-year Public Use Microdata Sample (PUMS) file to calculate the distribution of self-reported race for all individuals who reported having the following health insurance: “Medicaid, Medical Assistance, or any kind of government-assistance plan for those with low incomes or a disability” (2018 ACS Questionnaire). Categories of race in the ACS can be combined to mirror those reported in the TAF (see Table 1). For more information on the construction of the race/ethnicity codes in the TAF, we encourage you to read the excellent Background and Methods Resource published by CMS.
Table 1. Crosswalk of Race and Ethnicity Variables between the TAF and ACS
|Flag Value in TAF||Combination of Race and Hispanic Variables in ACS|
|7 = Hispanic, all races||Hispanic, all races|
|4 = American Indian and Alaska Native, non-Hispanic||American Indian alone; Alaska Native alone; American Indian and Alaska Native tribes specified, or American Indian or Alaska Native, non-specified and no other race|
- Native Hawaiian and other Pacific Islander alone
- Some other race alone
- Two or more races
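To make the benchmark calculation more concrete, here is a minimal Python sketch of a weighted race/ethnicity distribution among ACS respondents who report Medicaid coverage. Several things here are assumptions rather than CMS's actual method: the field names follow the ACS PUMS variables as we understand them (HINS4 for Medicaid coverage, HISP for Hispanic origin, RAC1P for race, PWGTP for the person weight), the records are made-up toy values, and only the two crosswalk rows for Hispanic and for American Indian and Alaska Native beneficiaries are coded. Every other combination is lumped into a placeholder group.

```python
from collections import defaultdict

# Toy person records (illustrative values only, not real ACS data).
# Assumed PUMS coding: HINS4 == 1 means covered by Medicaid or a similar
# plan; HISP == 1 means not Hispanic; RAC1P codes 3, 4, and 5 are the
# American Indian / Alaska Native codes; PWGTP is the person weight.
records = [
    {"HINS4": 1, "HISP": 1, "RAC1P": 3, "PWGTP": 20},
    {"HINS4": 1, "HISP": 2, "RAC1P": 1, "PWGTP": 30},
    {"HINS4": 1, "HISP": 1, "RAC1P": 1, "PWGTP": 50},
    {"HINS4": 2, "HISP": 1, "RAC1P": 5, "PWGTP": 40},  # no Medicaid coverage
]

AIAN_CODES = {3, 4, 5}


def taf_category(person):
    """Map one ACS record to a TAF-style category (partial crosswalk)."""
    if person["HISP"] != 1:
        return "Hispanic, all races"
    if person["RAC1P"] in AIAN_CODES:
        return "American Indian and Alaska Native, non-Hispanic"
    return "Other non-Hispanic (not coded in this sketch)"


def medicaid_distribution(people):
    """Weighted share of each category among Medicaid respondents."""
    totals = defaultdict(float)
    for person in people:
        if person["HINS4"] == 1:
            totals[taf_category(person)] += person["PWGTP"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: weight / grand_total for category, weight in totals.items()}


acs_shares = medicaid_distribution(records)
for category, share in acs_shares.items():
    print(f"{category}: {share:.1%}")
```

Applying the person weight matters because unweighted PUMS counts describe the sample, not the population the sample represents.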
Quality Assessment by State
Table 2 shows the Race and Ethnicity Data Quality Assessment for the 2018 TAF (Version: Release 1). The majority of states (32) were missing more than 10% of race/ethnicity data, and only about one-third received a “low concern” rating. Roughly half the states had more than one race/ethnicity category where the TAF differed from the ACS by more than 10% (not shown). The overall effect of adding this validation step was to lower the quality assessment rating of states that had more complete data (a low rate of missing values) but high discordance with the ACS.
Table 2. Race and Ethnicity Data Quality Assessment, 2018 T-MSIS Analytic File (TAF)
|Data quality assessment||Percent of beneficiaries with missing race/ethnicity values||Number of race/ethnicity categories where TAF differs from ACS by more than 10%||Number of states||States|
|Low Concern||<10%||0||17||AK, CA, DE, IL, IN, ME, MI, NV, NM, NC, ND, OH, OK, PA, SD, VA, WA|
|Medium Concern||<10%||1 or 2||2||CO, ID|
|Medium Concern||10-20%||0 or 1||12||AL, FL, GA, KY, MD, MN, MT, NH, NJ, TX, VT, WI|
|High Concern||<10%||3 or more||0||-|
|High Concern||10-20%||2 or more||9||AR, DC, HI, IA, KS, LA, MO, OR, WY|
|High Concern||20-50%||Any value||7||AZ, CT, MA, NY, SC, UT, WV|
|Unusable||>50%||Any value||4||MS, NE, RI, TN|
Notes: Though the T-MSIS includes all 50 states, the District of Columbia (D.C.), and the U.S. territories of Puerto Rico and the Virgin Islands, the latter two territories are excluded from the 2018 TAF (Data Version: Release 1) because they do not have 2018 Data Quality (DQ) Assessments or other associated information in the DQ Atlas and are therefore considered “unclassified.”
Source: Medicaid.gov. (n.d.). DQ Atlas: Race and Ethnicity [2018 data set]. Available from https://www.medicaid.gov/dq-atlas/landing/topics/single/map?topic=g3m16. Accessed February 8, 2020.
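The decision rules in Table 2 can be expressed as a short function. This is a sketch of how the two criteria combine, not CMS's own code. The function name is hypothetical, and the handling of values that fall exactly on a boundary (10%, 20%, or 50% missing) is an assumption, since the table does not say which band those values belong to.

```python
def concern_level(pct_missing, n_discordant):
    """Return the Table 2 rating for one state.

    pct_missing:  percent of beneficiaries with missing race/ethnicity (0-100)
    n_discordant: number of race/ethnicity categories where the TAF differs
                  from the ACS by more than 10%
    Assumption: exactly 10% and 20% fall in the 10-20% band, and exactly
    50% falls in the 20-50% band.
    """
    if pct_missing > 50:
        return "Unusable"
    if pct_missing > 20:
        return "High Concern"
    if pct_missing >= 10:
        return "High Concern" if n_discordant >= 2 else "Medium Concern"
    # Fewer than 10% missing: the rating depends only on discordance.
    if n_discordant == 0:
        return "Low Concern"
    if n_discordant <= 2:
        return "Medium Concern"
    return "High Concern"


# Hypothetical states: complete data can still be rated poorly when the
# distribution disagrees with the ACS benchmark.
print(concern_level(4.0, 0))   # Low Concern
print(concern_level(4.0, 3))   # High Concern
print(concern_level(15.0, 1))  # Medium Concern
print(concern_level(62.0, 0))  # Unusable
```

The second call shows the effect described above: a state with few missing values is still rated “high concern” once three or more categories diverge from the ACS.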
Using T-MSIS Data
CMS notes that some states may not have complete data on race and ethnicity because they follow guidance from the Office of Management and Budget (OMB) that establishes self-identification as the preferred means of obtaining this information, and not all beneficiaries disclose it. In addition, CMS's evaluation of these data looks only at “missingness” and not at other quality issues; for example, it does not flag states that fail to report certain race/ethnicity groups that comprise a substantial proportion of the state’s population. CMS encourages TAF users interested in using the race/ethnicity data for research to further assess their validity by comparing TAF distributions to external benchmarks, such as state-level data on race/ethnicity and Medicaid/CHIP enrollment from the American Community Survey (ACS).
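A benchmark comparison of this kind might look like the following Python sketch. The shares are invented for illustration, the function name is hypothetical, and the threshold is read here as 10 percentage points, which is an assumption about how “differs by more than 10%” is measured.

```python
def discordant_categories(taf_shares, acs_shares, threshold=10.0):
    """List categories whose TAF and ACS shares differ by more than
    `threshold` percentage points. A category missing from one source
    is treated as having a share of 0 in that source."""
    categories = set(taf_shares) | set(acs_shares)
    return sorted(
        category
        for category in categories
        if abs(taf_shares.get(category, 0.0) - acs_shares.get(category, 0.0)) > threshold
    )


# Hypothetical state-level shares in percent (not real TAF or ACS values).
taf = {
    "Hispanic, all races": 18.0,
    "American Indian and Alaska Native, non-Hispanic": 2.0,
    "All other groups": 80.0,
}
acs = {
    "Hispanic, all races": 35.0,
    "American Indian and Alaska Native, non-Hispanic": 3.0,
    "All other groups": 62.0,
}

flagged = discordant_categories(taf, acs)
print(f"{len(flagged)} discordant categories: {flagged}")
```

Treating an absent category as a zero share means that a group a state never reports still counts as discordant if it is large in the benchmark.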
Data Quality Atlas
CMS’s Data Quality Atlas provides an interactive tool where users can generate maps (see Figure 1 below), tables, and a variety of other data quality measures related to the TAF. The tool provides a quick way to assess the completeness of multiple variables that may be relevant to specific research questions. It also provides helpful context for the completeness of certain variables compared to other variables. For instance, while all states fall into the “low concern” category for the completeness of age data, over half of the states are classified as having “unusable” family income data.
Figure 1. Data Quality Assessments of Beneficiary Information by U.S. State/Territory: 2018 Race and Ethnicity Data
Notes: Green = low concern; yellow = medium concern; orange = high concern; red = unusable; white = unclassified.
Source: Medicaid.gov. (n.d.). DQ Atlas: Race and Ethnicity [2018 Data set: Version 1]. Available from https://www.medicaid.gov/dq-atlas/landing/topics/single/map?topic=g3m16. Accessed February 9, 2021.
Over time, the completeness and quality of data in the TAF are expected to continue to improve. In the meantime, CMS’s investments in making the strengths and limitations of these data transparent are critically important for guiding researchers.
States are also intently focused on understanding and addressing the completeness and quality of race and ethnicity data across their data systems in order to better understand and address persistent health disparities. SHADAC, with support from the Robert Wood Johnson Foundation (RWJF) for its State Health and Value Strategies (SHVS) initiative, has been closely tracking the extent to which states are reporting critical health equity measures related to the coronavirus pandemic and vaccination efforts.
SHADAC researchers have also been exploring strategies for filling gaps in Medicaid race, ethnicity, and language (REL) data, and are producing an issue brief that provides a 50-state summary of how Medicaid agencies collect these data through their application process.
1 Medicaid.gov. (n.d.). DQ Atlas: Background and methods resource [PDF file]. Available from https://www.medicaid.gov/dq-atlas/downloads/data_by_topic/TAF_DQ_Race_Ethnicity/TAF_DQ_Race_Ethnicity.pdf. Accessed February 9, 2021.
2 Medicaid.gov. (n.d.). Exploring data quality (DQ) assessments by topic. Available from https://www.medicaid.gov/dq-atlas/landing/topics/info. Accessed February 9, 2021.