Equitable Health Care Via Ethical and Provable De-Identification

Researchers will develop strategies to provide secure data from clinics serving vulnerable populations to policymakers and criminal justice partners, enabling them to make data-informed decisions that are representative, free of bias, and inclusive without breaching individuals’ privacy.

Funded by: CCI Northern Virginia Node

Project Investigators

Principal Investigator (PI): Evgenios Kornaropoulos, assistant professor, George Mason University School of Computing

Co-PI: Rebecca Sutter, term associate professor, George Mason University School of Nursing

Rationale and Background

George Mason University's Mason and Partners (MAP) clinics, a network of academic nurse-managed health facilities, serve uninsured and refugee populations.

The clinics don’t require a photo ID for free services such as health care, school physicals, mental health services, and substance-use treatment. 

This patient population differs from those captured in typical hospital, insurance, or federal datasets and has little to no data footprint. These individuals are also among the most vulnerable with respect to privacy. Mishandling their information could have life-changing repercussions, such as risk of deportation or stigma from substance abuse.

Yet their data could be key to policy decisions that lead toward more equitable health care systems.

This creates tension between officials' need to make data-informed decisions using information from medically underserved areas (MUAs) and the ethical obligation to protect the privacy of those in need.


Researchers will explore the possibilities and challenges that stem from the trade-off between utility and privacy when sharing de-identified data in a clinical setting that serves the most vulnerable. Research plan phases include:

  • The computer science team will visit clinics to access the tools, techniques, and protocols used for de-identification.
  • The team will categorize the approaches used in practice and formulate the leakage of these mechanisms.
  • The team will analyze and develop provable leakage-suppression techniques (and potentially lower bounds and impossibility results) for de-identification approaches used in a clinical setting.
  • The team will design experiments and write a conference article that summarizes the findings.

Projected Outcomes

Research results will help lay a foundation for strategies that are more inclusive of non-traditional data sources and for best practices to inform community-based clinics serving vulnerable populations via:

  • Analysis of current approaches for de-identification used in a clinical setting.
  • Guidelines on how to use de-identification mechanisms that comply with HIPAA regulations while providing a meaningful notion of privacy, along with an assessment of the privacy properties of free de-identification software.
  • Submission of study outcomes to a top-tier peer-reviewed conference for public availability.