AI Use in Healthcare: Ethical Framework for Diagnostic Applications

· Machine Learning · Sabanci University
AI EthicsHealthcare AIExplainabilityContestabilityDeep LearningGRAD-CAMSHAP

Supervisor: Dilara Keküllüoğlu

Overview

Deep learning models are increasingly deployed in healthcare diagnostics — examining chest X-rays, MRIs, and CT scans to identify diseases including pneumonia, cancer, and cardiovascular conditions. Systems like CheXNet have demonstrated performance comparable to radiologists on specific tasks. Yet high technical performance alone is insufficient for responsible clinical deployment: bias, lack of transparency, privacy vulnerabilities, and accountability gaps remain unresolved challenges that can directly harm patients.

This project conducts a systematic review of AI diagnostic systems and develops a comprehensive ethical design framework for their responsible use in clinical settings, with a particular focus on contestability as the mechanism through which human oversight becomes meaningful rather than nominal.

Capabilities and Risks of AI Diagnostic Systems

AI diagnostic systems offer genuine advantages — processing large volumes of medical data consistently, reducing diagnostic delays, and enabling early disease detection. However, four principal risk categories undermine responsible deployment:

RiskDescription
Data BiasTraining datasets lacking diversity perform poorly on underrepresented populations, producing unequal healthcare outcomes
Explainability GapBlack-box models make it difficult to understand how predictions are generated, reducing clinician trust and limiting clinical adoption
PrivacyMedical image data is vulnerable to re-identification attacks; sensitive patient information requires robust anonymization and access controls
AccountabilityWhen AI errors cause patient harm, tracing responsibility across developers, institutions, and clinicians is legally and ethically complex

Ethical Design Framework

Six complementary principles form an integrated framework for trustworthy clinical AI. Each principle addresses a specific identified risk while collectively ensuring the system is not only accurate but safe, transparent, and human-centered.

Contestability

Contestability enables patients and clinicians to challenge AI-generated decisions rather than passively accepting them. A diagnostic AI must provide sufficient information for meaningful disagreement — not merely surface-level outputs that appear convincing after the fact.

Effective contestability requires:

  • Clinician override capabilities with justification logging
  • Interpretable outputs: confidence scores, key diagnostic indicators, and alternative diagnoses
  • Feedback channels and audit logs enabling users to report disagreements and track decisions over time

Contestability enhances trust and accountability without compromising system performance — focusing on providing decision-relevant information rather than fully exposing complex model internals (Ploug & Holm, 2020).

Explainability

Explainability ensures AI decision-making is transparent and interpretable to clinicians and patients. Implementation integrates visual explanation methods — GRAD-CAM and SHAP outputs presented as heatmaps — showing which regions of a medical image drove a prediction. Explainability is the connective tissue between all other ethical principles: interpretable outputs make contestability, accountability, and answerability operationally effective.

Accountability

AI-assisted diagnoses must be fully traceable. The framework retains model outputs, confidence ratings, and system logs; physicians maintain final authority and can disregard AI recommendations; mechanisms for reviewing and auditing incorrect predictions are embedded in the system design.

Privacy

Medical image data requires anonymization so individuals cannot be re-identified, encryption during storage and processing, and access restricted to authorized users only — preventing misuse of sensitive patient information and ensuring legal and ethical compliance.

Fairness

The system monitors performance differences across patient groups — by age, gender, and background — and flags disparities for correction, ensuring equal diagnostic accuracy for all patients and preventing AI from amplifying existing healthcare inequalities.

Answerability

Every prediction is accompanied by a clear, structured explanation — including relevant image regions, confidence scores, and the main factors influencing the decision — so clinicians can meaningfully assess whether the output is reliable and appropriate in its specific clinical context.

© 2026 Hooshmand Rahmani. All rights reserved.