The Hidden Danger: How AI Bias in Healthcare Can Harm Patients, and 15 Ways to Fix It
Introduction: The Promise and Peril of Intelligent Healthcare
Artificial intelligence (AI) is rapidly moving from science fiction into everyday clinical settings and public health systems, especially with its accelerated use during the COVID-19 pandemic. These sophisticated techniques, such as machine learning and deep learning, have enormous potential to revolutionize medicine and make care fairer and better overall. However, this powerful technology also carries a serious risk: if AI systems are not carefully designed, deployed, and monitored, they can deepen or reinforce existing health inequalities.
Health equity is not just about giving everyone the same services; it’s about achieving "the absence of unfair and avoidable or remediable differences in health among population groups". Achieving true health equity means intentionally focusing attention on the needs of populations who are most vulnerable to poor health outcomes due to social conditions, past injustices, or inherent bias—whether conscious or unconscious—in established structures and policies.
To address the tension between AI’s potential and its risks, a comprehensive study was undertaken to map out the specific problems—the "equity issues"—that AI introduces, and to identify the practical ways—the "strategies"—that developers, regulators, and users can employ to fix them.
Unpacking the Blueprint: Identifying AI’s Weak Points
This major investigation was conducted as a scoping review, a research method well suited to summarizing findings from a vast and diverse body of knowledge, including both academic journals and "gray literature" such as news articles and reports. The researchers searched numerous academic databases, news sources, and even the Food and Drug Administration (FDA) website for materials published between 2014 and 2021 that discussed AI, health equity, and related strategies.
After an exhaustive search, 660 documents were included in the final analysis. From these documents, the study authors identified 18 distinct equity issues and 15 concrete strategies proposed to address them.
To make sense of the vast array of issues, the researchers organized them using a four-step framework that mirrors the life cycle of any AI application:
Background Context: The systemic and structural factors that influence why and how a model is built.
Data Characteristics: The quality and quantity of the information used to teach the AI.
Model Design: The specific choices made when creating the algorithm (variables, fairness goals, etc.).
Deployment: How the model is evaluated, used, and maintained in the real world.
Phase 1: Background Context Issues (The Human Factor)
The very start of the AI development process—the environment and the people involved—can introduce bias.
Biased or Nonrepresentative Developers: If the development team lacks diversity in characteristics, experiences, and roles, it is likely to have blind spots and mismatched priorities, and it may overlook the needs of certain populations.
Diminished Accountability: When individuals are harmed by AI applications, the lack of clear developer accountability makes it difficult or impossible for them to get compensation or restitution.
Enabling Discrimination: In extreme cases, developers might use AI algorithms to intentionally discriminate against certain groups, either out of malice or for financial gain.
Phase 2: The Data Disaster (The Most Common Problems)
Issues related to Data Characteristics and Model Design were found to be the most common equity problems discussed in the literature. This makes sense, as AI is only as smart (and fair) as the information it is fed. Nearly two-thirds of the problems identified in the literature were related to data.
Limited Information on Population Characteristics: Data may lack necessary detail about people's traits, causing dissimilar groups to be lumped together (e.g., classifying race simply as "White" or "non-White"). This aggregation can hide specific needs and vulnerabilities.
Unrepresentative Data or Small Sample Sizes: If the training data doesn't adequately reflect the target population—especially vulnerable groups—the AI will perform poorly or inaccurately when used on those people, creating a critical source of disparity.
Bias Ingrained in Data: When the data used to train the AI reflects past discrimination or existing societal disparities (e.g., historically unequal access to care), the algorithm will incorporate and perpetuate those unfair patterns, making the problem worse.
Inclusion of Sensitive Variables: Including sensitive factors like race or income might cause the algorithm to discriminate directly on those factors, leading to unfair outcomes.
Exclusion of Sensitive Variables: Conversely, excluding sensitive information might reduce the algorithm's accuracy for some groups because it lacks the necessary explanatory power to account for systematic differences.
Limited Reporting of Information on Protected Groups: Without clear reporting on who was included in the training data and how the model performed for specific subgroups, it's impossible to know if or where the model has discriminatory impacts.
Phase 3: The Model Design Dilemma (The Choices)
This phase focuses on the technical decisions made when the AI model is built.
Algorithms Are Not Interpretable: Many powerful AI models (like deep learning networks) act as "black boxes." When we cannot understand why a model reached a specific decision, it is extremely difficult to evaluate whether that decision-making process was fair or equitable.
Optimizing Algorithm Accuracy and Fairness May Conflict: Developers often face a difficult trade-off: pursuing the highest possible accuracy may mean compromising on fairness constraints, and vice versa. This tension means that improving equity may come at the cost of lower overall accuracy for the population.
Ambiguity in and Conflict Among Conceptions of Equity: There is no single definition of fairness. Different conceptions of equity may be mutually exclusive, or they may require sensitive data to measure and enforce properly, as the short sketch below illustrates for two common fairness definitions.
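To make this concrete, here is a minimal sketch with made-up numbers (not drawn from the review) showing that a model can satisfy one common fairness definition, equal opportunity (equal true-positive rates across groups), while violating another, demographic parity (equal selection rates), whenever the groups' underlying base rates differ.

```python
# Illustrative only: toy labels and predictions for two hypothetical groups
# whose true base rates differ (0.50 for group A vs. 0.25 for group B).
group_a_true = [1, 1, 1, 1, 0, 0, 0, 0]
group_a_pred = [1, 1, 1, 1, 0, 0, 0, 0]
group_b_true = [1, 1, 0, 0, 0, 0, 0, 0]
group_b_pred = [1, 1, 0, 0, 0, 0, 0, 0]

def true_positive_rate(y_true, y_pred):
    """Share of actual positives the model flags (the equal-opportunity metric)."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged)

def selection_rate(y_pred):
    """Share of people the model flags at all (the demographic-parity metric)."""
    return sum(y_pred) / len(y_pred)

# Equal opportunity holds: both groups have a true-positive rate of 1.0 ...
print(true_positive_rate(group_a_true, group_a_pred),
      true_positive_rate(group_b_true, group_b_pred))
# ... but demographic parity is violated: selection rates are 0.50 vs. 0.25.
print(selection_rate(group_a_pred), selection_rate(group_b_pred))
```

Which of these definitions should take priority is a value judgment rather than a technical one, which is exactly the ambiguity this issue describes.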
Phase 4: Deployment Risks in the Real World
Once the model is built, how it is used can still introduce or exacerbate bias.
Proprietary Algorithms or Data Unavailable for Evaluation: If the underlying technology or data used to train the model is kept secret or proprietary, outside regulators and evaluators cannot effectively assess the risk of bias.
Overreliance on AI Applications: Users (like doctors or health system administrators) may blindly trust the algorithmic outputs, implementing decisions even when common sense or contrary evidence suggests otherwise. This trust can quickly perpetuate inherent biases baked into the model.
Underreliance on AI Applications: Conversely, users may dismiss the algorithm’s outputs if those outputs challenge their own existing biases, thereby perpetuating human discrimination.
Repurposing Existing AI Applications Outside Original Scope: Models are often built for one specific task or population. If they are reused for new populations or different functions without rigorous re-evaluation, they can bypass crucial safety checks designed for appropriate use.
Application Development or Implementation Is Rushed: Tight deadlines can force developers to cut corners, such as relying on low-quality data or skipping thorough validation, which significantly exacerbates equity issues.
Unequal Access to AI: If advanced AI tools are deployed more commonly in high-income areas, the benefits will naturally flow disproportionately to wealthier groups, amplifying existing health disparities.
The Solutions: 15 Strategies to Achieve Fairness
Fortunately, the literature did not just identify problems; it offered 15 actionable strategies that stakeholders can use to combat AI bias.
Solutions Focused on People and Policy (Background Context):
Foster Diversity: Create AI development teams with diverse backgrounds, experiences, and perspectives to increase awareness of equity concerns and reduce blind spots.
Train Developers and Users: Educate those who create and use AI on equity considerations and the ethical implications of the technology.
Engage the Broader Community: Involve the community and vulnerable groups from the very beginning (conception) through the end (post-deployment) to ensure that the AI prioritizes real-world equity concerns.
Improve Governance: Establish strong regulations and industry standards to align AI applications with social norms, including requirements for equity, safety, and transparency.
Solutions Focused on Data (Data Characteristics):
Improve Diversity, Quality, or Quantity of Data: Train models using large, diverse data sets that are fully representative of the population the AI will serve, and ensure the data contains all relevant features; a quick representativeness check like the sketch after this list can help flag gaps.
Exclude Sensitive Variables to Correct for Bias: Remove sensitive factors (like race or income) to prevent the model from discriminating directly on these characteristics.
Include Sensitive Variables to Correct for Bias: Alternatively, include sensitive variables to improve the model's accuracy, increase its explanatory power, and make it easier to test explicitly for inequitable impacts across subgroups.
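As one rough way to act on the data-diversity strategy above, the sketch below compares each subgroup's share of a hypothetical training set against its share of the target population and flags groups that are badly underrepresented. The group names, counts, and the 20% tolerance are illustrative assumptions, not values from the review.

```python
# Hypothetical counts: how many training examples each subgroup contributes,
# versus each subgroup's share of the target population (both illustrative).
training_counts = {"Group A": 7200, "Group B": 1900, "Group C": 400, "Group D": 500}
population_share = {"Group A": 0.60, "Group B": 0.20, "Group C": 0.12, "Group D": 0.08}

total = sum(training_counts.values())
for group, count in training_counts.items():
    train_share = count / total
    target = population_share[group]
    # Flag groups whose training share falls more than 20% below their population share.
    flag = "UNDERREPRESENTED" if train_share < 0.8 * target else "ok"
    print(f"{group}: {train_share:.1%} of training data vs {target:.1%} of population -> {flag}")
```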
Solutions Focused on the Algorithm (Model Design):
Enforce Fairness Goals: Define a specific fairness rule (or "norm") and program the model to enforce it, which might involve editing the input data or modifying the model’s objectives.
Improve Interpretability or Explainability of the Algorithm: Choose models that are inherently easy to understand (like decision trees), or build mechanisms that can explain the model’s complex decisions.
Evaluate Disparities in Model Performance: Test how the model performs across a wide range of specific subgroups (particularly those who might be disadvantaged) using multiple metrics (like accuracy, false-positive rate, and false-negative rate), then report the group-level results and revise the model accordingly (see the sketch after this list).
Use Equity-Focused Checklists, Guidelines, and Similar Tools: Incorporate standardized checklists into the development, review, and usage workflows to help developers and reviewers identify potential bias.
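As a minimal sketch of the disparity-evaluation strategy referenced above, the snippet below computes accuracy, false-positive rate, and false-negative rate separately for each subgroup. The labels, predictions, and group tags are placeholders; a real audit would span many more groups, metrics, and far larger samples.

```python
# Illustrative labels, predictions, and subgroup tags (placeholders, not real data).
# In this toy example the model happens to perform much worse for group B.
y_true = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
group  = ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]

def subgroup_report(y_true, y_pred, group):
    """Accuracy, false-positive rate, and false-negative rate per subgroup."""
    report = {}
    for g in sorted(set(group)):
        rows = [(t, p) for t, p, gg in zip(y_true, y_pred, group) if gg == g]
        tp = sum(1 for t, p in rows if t == 1 and p == 1)
        tn = sum(1 for t, p in rows if t == 0 and p == 0)
        fp = sum(1 for t, p in rows if t == 0 and p == 1)
        fn = sum(1 for t, p in rows if t == 1 and p == 0)
        report[g] = {
            "accuracy": (tp + tn) / len(rows),
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else None,
            "false_negative_rate": fn / (fn + tp) if (fn + tp) else None,
        }
    return report

for g, metrics in subgroup_report(y_true, y_pred, group).items():
    print(g, metrics)
```

Large gaps between groups on any of these metrics are a signal to revisit the data or the model before deployment.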
Solutions Focused on Use and Accountability (Deployment Practices):
Increase Model Reporting and Transparency: Provide comprehensive information about AI equity issues, for example by requiring standardized equity analyses and encouraging independent model reviews (see the sketch after this list).
Seek or Provide Restitution for Those Negatively Impacted by AI: Proactively offer compensation to those harmed by biased AI or create clear legal frameworks that allow individuals to seek restitution.
Avoid or Reduce Use of AI: If efforts to improve equity are unsuccessful or if the harm is too severe, stakeholders should consider discontinuing or reducing the use of the model entirely.
Provide Resources to Those with Less Access to AI: Address unequal access by subsidizing necessary infrastructure, developing educational programs, or otherwise ensuring disadvantaged groups can benefit from the technology.
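One lightweight way to put the reporting-and-transparency strategy above into practice is to publish a structured equity summary alongside the model, in the spirit of a "model card." The fields and values below are a hypothetical sketch, not a standardized or required schema.

```python
import json

# Hypothetical equity-focused model summary; every value is a placeholder.
model_report = {
    "model_name": "example-risk-predictor",
    "intended_use": "Flag patients for follow-up outreach; not for denying care.",
    "training_data": {
        "source": "example health-system records, 2015-2020 (hypothetical)",
        "known_gaps": ["rural patients underrepresented", "limited data on uninsured patients"],
    },
    "subgroup_performance": {
        "Group A": {"accuracy": 0.91, "false_negative_rate": 0.07},
        "Group B": {"accuracy": 0.84, "false_negative_rate": 0.15},
    },
    "known_limitations": "Not validated outside the original health system.",
    "contact_for_redress": "equity-review@example.org",
}

print(json.dumps(model_report, indent=2))
```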
Conclusion: A Small Set of Strategies for a Broad Range of Issues
A key finding of this systematic mapping is that a small number of strategies can effectively address a wide range of issues. The most frequently cited strategies are: improving data quality, evaluating performance disparities across groups, increasing model transparency, engaging the community, and improving governance. These five strategies cover aspects across all four stages of the AI lifecycle.
For instance, a developer concerned about a predictive model's fairness might start with these top five strategies. They could evaluate disparities in model performance and increase model reporting and transparency. They could also implement low-cost measures, such as reviewing their work against an equity-focused checklist like the Prediction Model Risk of Bias Assessment Tool (PROBAST).
By thoughtfully adopting complementary sets of strategies that cover the full spectrum of equity issues, AI developers and users have the power to mitigate the most pressing risks. While no system is perfect, and existing human decision-making is already flawed, by implementing these equity safeguards, AI models offer a genuine chance to improve fairness over the status quo.
In essence, governing AI for health equity is like building a massive, complex bridge across a dangerous chasm. The issues are the defects in the blueprints, the faulty materials (biased data), and the shoddy installation (rushed deployment). The strategies are the necessary quality control checks, the mandatory inspections (evaluating disparities), the use of stronger, more inclusive materials (diverse data), and the establishment of clear regulations (governance). You cannot just fix one pillar; you need a comprehensive, multi-layered approach to ensure that the bridge serves everyone equally, especially those who rely on it most.
Health Equity Advocates:
Dr. Camara Phyllis Jones: A family physician and epidemiologist, Dr. Jones is known for her influential work in defining, measuring, and addressing the impacts of racism on health. She uses allegories to explain complex topics like institutional racism, personally mediated racism, and internalized racism, aiming to "name racism" and catalyze a national campaign against it. She has held significant roles, including Medical Officer at the CDC and past President of the American Public Health Association.
Dr. David Satcher: The 16th U.S. Surgeon General (1998-2002), Dr. Satcher is a leading voice on public health policy and health equity, known for his work to eliminate racial and ethnic health disparities. He founded the Satcher Health Leadership Institute at the Morehouse School of Medicine, which focuses on public health leadership and eliminating disparities. He also served as the first African American director of the Centers for Disease Control and Prevention (CDC).
Dr. Marilyn Hughes Gaston: A pioneering pediatrician and public health advocate, Dr. Gaston dedicated her career to improving medical care for poor and minority families. Her groundbreaking 1986 study on sickle cell disease in newborns led to a national screening program and demonstrated the effectiveness of penicillin in preventing fatal infections, significantly changing the standard of care for the disease. She was also the first Black female physician to be appointed director of the Health Resources and Services Administration's Bureau of Primary Health Care.