Appendix H: Organizational AI Governance Policy Template

Purpose

This appendix provides a comprehensive, customizable policy template for health departments and healthcare organizations deploying AI systems. Use this as a starting point for your organization’s AI governance framework.

Who should use this:

  • Health department leadership developing AI policies
  • Hospital C-suite establishing governance frameworks
  • Chief Medical Information Officers (CMIOs) implementing AI oversight
  • IRB/ethics committees reviewing AI deployments
  • Quality and safety officers monitoring AI systems

What’s included:

  • Complete policy template (ready to customize)
  • Section-by-section implementation guidance
  • Real-world examples from leading institutions
  • Regulatory alignment checklist (FDA, HIPAA, state laws)


Introduction: Why Your Organization Needs an AI Policy

The Regulatory Landscape (2024-2025)

The regulatory environment for health AI has evolved rapidly:

Federal:

  • FDA AI/ML Medical Device Action Plan (2021, updated 2024) - Requires monitoring plans, transparency, and bias testing for clinical decision support tools (U.S. Food and Drug Administration 2024)
  • Executive Order 14110 on Safe, Secure, and Trustworthy AI (October 2023) - Mandates safety testing for high-risk AI systems, including healthcare applications (White House 2023)
  • HIPAA Security Rule - Applies to AI systems processing PHI; requires risk assessments and safeguards (HHS 2024)

International:

  • EU AI Act (2024) - Classifies most health AI as “high-risk”; requires conformity assessments, documentation, and human oversight (European Parliament and Council 2024)
  • WHO Ethics & Governance of AI for Health (2021) - Six guiding principles: protect autonomy, promote well-being, ensure transparency, foster accountability, ensure equity, promote sustainability (WHO 2021)

State-Level (Examples):

  • California AB 2013 (2024) - Requires developers of generative AI systems to publicly disclose documentation of their training data (California AB 2013)
  • New York City Local Law 144 (2023) - Mandates bias audits for automated employment decision tools (NYC Local Law 144)

Without a policy, your organization is exposed to:

  ❌ Regulatory violations (FDA, HIPAA, state laws)
  ❌ Patient safety incidents
  ❌ Liability from algorithmic harm
  ❌ Reputational damage from bias or failures
  ❌ Inconsistent AI deployment practices


Template Overview

This template consists of 10 sections:

  1. Purpose and Scope - What this policy covers (including key definitions)
  2. Governance Structure - Who is responsible
  3. AI Acquisition and Procurement - Vendor evaluation
  4. Development and Validation - Internal AI development
  5. Deployment and Implementation - How AI is deployed
  6. Monitoring and Maintenance - Ongoing oversight
  7. Data Privacy and Security - Protecting patient data
  8. Fairness, Equity, and Bias - Ensuring equitable AI
  9. Incident Response and Accountability - When things go wrong
  10. Policy Compliance and Review - Keeping the policy current

Customization Instructions:

  • Text in [BRACKETS] should be replaced with your organization-specific information
  • Sections marked [OPTIONAL] can be omitted if not applicable
  • Add your organization’s logo, letterhead, and approval signatures


ORGANIZATIONAL AI GOVERNANCE POLICY

Document Control

Policy Name: Artificial Intelligence Governance Policy
Policy Number: [INSERT NUMBER]
Version: 1.0
Effective Date: [INSERT DATE]
Review Date: [Annual/Biannual]
Approval Authority: [CEO, Board of Directors, CMO]
Policy Owner: [Chief Medical Information Officer / Chief AI Officer]
Applies To: All staff, contractors, and vendors deploying or using AI systems

1. Purpose and Scope

1.1 Purpose

This policy establishes [ORGANIZATION NAME]’s framework for the responsible development, acquisition, deployment, and monitoring of artificial intelligence (AI) and machine learning (ML) systems in healthcare and public health applications.

Policy Objectives:

  • Ensure patient safety and clinical efficacy of AI systems
  • Protect patient privacy and data security
  • Promote fairness and equity in AI deployment
  • Establish accountability for AI-related decisions
  • Comply with applicable laws and regulations (FDA, HIPAA, state laws)
  • Maintain transparency and trust with patients and stakeholders

1.2 Scope

This policy applies to:

All AI/ML systems that:

  • Process protected health information (PHI)
  • Support clinical decision-making (diagnosis, treatment, triage)
  • Automate patient care workflows
  • Predict health outcomes or stratify risk
  • Allocate healthcare resources
  • Affect patient care delivery

All personnel involved in:

  • Acquiring or procuring AI systems
  • Developing or training AI models
  • Deploying AI systems in clinical or operational settings
  • Monitoring AI system performance
  • Making decisions informed by AI outputs

This policy does NOT apply to:

  ❌ General-purpose software without ML components (e.g., word processors, scheduling systems)
  ❌ Research AI projects not involving patient care (subject to separate IRB policies)
  ❌ Administrative AI with no patient impact (e.g., inventory management) [OPTIONAL]

1.3 Definitions

Artificial Intelligence (AI): Systems that perform tasks typically requiring human intelligence, including learning from data, pattern recognition, prediction, and decision support (Russell and Norvig 2020).

Machine Learning (ML): Subset of AI where systems learn from data without explicit programming (Goodfellow, Bengio, and Courville 2016).

High-Risk AI System: AI systems that directly impact patient safety, clinical outcomes, or healthcare access. Includes:

  • Diagnostic support tools
  • Treatment recommendation systems
  • Patient triage and risk stratification
  • Resource allocation algorithms
  • Clinical decision support systems (CDSS)

Clinical Validation: Process of evaluating AI system performance on clinical outcomes (not just technical metrics) in real-world settings (Nagendran et al. 2020).

Algorithmic Bias: Systematic and unfair discrimination in AI outputs based on race, ethnicity, sex, age, disability, or other protected characteristics (Obermeyer et al. 2019).


2. Governance Structure

2.1 AI Governance Committee

[ORGANIZATION NAME] shall establish an AI Governance Committee responsible for oversight of all AI systems.

Committee Composition (Minimum):

  • Chair: Chief Medical Information Officer (CMIO) or Chief AI Officer
  • Chief Medical Officer (CMO) or designee
  • Chief Information Security Officer (CISO) or designee
  • Quality and Patient Safety Officer
  • Chief Equity Officer or Health Equity Lead
  • Bioethicist or Ethics Committee representative
  • Legal Counsel (with expertise in health law and AI)
  • Clinical Champions (2-3 physician/nurse representatives from affected specialties)
  • [OPTIONAL] Patient Representative or Patient Advocacy Leader

Committee Responsibilities:

  1. Review and approve all AI system acquisitions and deployments
  2. Establish validation requirements and success criteria
  3. Monitor AI system performance and safety
  4. Investigate AI-related incidents and adverse events
  5. Ensure compliance with this policy and applicable regulations
  6. Update the policy annually based on new evidence and regulations
  7. Educate staff on responsible AI use

Meeting Frequency: Quarterly (minimum); ad-hoc for urgent reviews

Quorum: Majority of voting members (must include Chair, CMO, and Ethicist)

Documentation: All decisions recorded in meeting minutes; maintained for 7 years

2.2 Roles and Responsibilities

AI Governance Committee:

  • Policy development and updates
  • System-level oversight and approval
  • Incident response coordination

Clinical Department Leaders:

  • Identify AI use cases in their specialties
  • Ensure clinical staff training
  • Monitor impact on clinical workflows
  • Report performance issues or safety concerns

IT/Informatics Department:

  • Technical infrastructure for AI deployment
  • Data security and privacy protections
  • System integration and interoperability
  • Performance monitoring dashboards

Quality and Safety:

  • Monitor AI system safety metrics
  • Investigate adverse events
  • Track bias and equity metrics
  • Coordinate corrective actions

Legal and Compliance:

  • Regulatory compliance (FDA, HIPAA, state laws)
  • Vendor contract review
  • Liability and risk assessment
  • Documentation of compliance activities


3. AI Acquisition and Procurement

3.1 Vendor Evaluation Requirements

All AI systems acquired from external vendors must undergo evaluation using the AI Vendor Evaluation Checklist (Appendix G of this handbook).

Mandatory evaluation domains:

1. Technical Validation (Weight: 25%)

- [ ] External validation at ≥3 independent sites
- [ ] Peer-reviewed publication of validation results
- [ ] Independent researchers (not vendor employees) conducted validation
- [ ] Performance metrics reported (AUC, sensitivity, specificity, PPV)
- [ ] Generalizability evidence to [ORGANIZATION NAME]’s patient population

2. Clinical Safety (Weight: 25%)

- [ ] Safety testing and failure mode analysis conducted
- [ ] Prospective outcome studies demonstrating clinical benefit
- [ ] Alert burden quantified (false positive rate ≤20% for clinical alerts)
- [ ] Adverse event monitoring system in place
- [ ] Human factors testing with end-users

3. Fairness and Equity (Weight: 20%)

- [ ] Bias audit conducted by independent researchers
- [ ] Performance stratified by race, ethnicity, sex, age, socioeconomic status
- [ ] Training data demographics disclosed
- [ ] Plan for ongoing bias monitoring

4. Privacy and Security (Weight: 15%)

- [ ] HIPAA Business Associate Agreement (BAA) provided
- [ ] SOC 2 Type II certification or equivalent
- [ ] Encryption at rest and in transit (AES-256, TLS 1.3+)
- [ ] Data minimization principles applied
- [ ] Data retention and deletion policies disclosed

5. Workflow Integration (Weight: 10%)

- [ ] User research conducted with target users
- [ ] EHR integration demonstrated (for [ORGANIZATION’S EHR SYSTEM])
- [ ] Training program provided
- [ ] Technical support available (phone + email)

6. Business Viability (Weight: 5%)

- [ ] Company financially stable (≥3 years in business or well-funded)
- [ ] ≥3 customer references provided
- [ ] FDA clearance/approval (if applicable)
- [ ] Transparent pricing with no hidden costs

Scoring:

  • 8-10/10: Proceed with pilot deployment
  • 6-7.9/10: Conditional deployment with mitigation plans
  • <6/10: Do not deploy; request improvements or seek alternatives
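The weighted scoring above can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, and the domain scores shown are made-up inputs, but the weights and decision bands mirror Section 3.1.

```python
# Illustrative sketch of the Section 3.1 weighted vendor score.
# Function names and sample scores are hypothetical.

WEIGHTS = {
    "technical_validation": 0.25,
    "clinical_safety": 0.25,
    "fairness_equity": 0.20,
    "privacy_security": 0.15,
    "workflow_integration": 0.10,
    "business_viability": 0.05,
}

def overall_score(domain_scores: dict) -> float:
    """Combine per-domain scores (each 0-10) into a weighted 0-10 total."""
    return sum(WEIGHTS[d] * s for d, s in domain_scores.items())

def recommendation(score: float) -> str:
    """Map the weighted score to the decision bands above."""
    if score >= 8:
        return "Proceed with pilot deployment"
    if score >= 6:
        return "Conditional deployment with mitigation plans"
    return "Do not deploy"

scores = {
    "technical_validation": 9,
    "clinical_safety": 8,
    "fairness_equity": 7,
    "privacy_security": 9,
    "workflow_integration": 8,
    "business_viability": 6,
}
total = overall_score(scores)
print(round(total, 2), "->", recommendation(total))  # 8.1 -> Proceed with pilot deployment
```

Note how a mediocre fairness score (7/10) can still yield an overall "proceed" recommendation; committees may wish to add a floor rule (e.g., no domain below 6) on top of the weighted total.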

3.2 Procurement Process

Step 1: Needs Assessment (2-4 weeks)

  • Clinical department submits AI use case proposal
  • AI Governance Committee reviews and approves/rejects
  • If approved, proceed to vendor evaluation

Step 2: Vendor Evaluation (4-8 weeks)

  • Procurement team conducts RFP process
  • AI Governance Committee scores vendors using the evaluation checklist
  • Top 2-3 vendors invited for on-site demonstrations

Step 3: Pilot Agreement (2-4 weeks)

  • Legal reviews contract terms
  • Required contract clauses included (see Section 3.3)
  • Pilot duration: 3-6 months in limited scope

Step 4: Pilot Deployment (3-6 months)

  • Deploy in 1-2 units with intensive monitoring
  • Collect data on technical performance, clinical outcomes, user satisfaction
  • AI Governance Committee reviews pilot results

Step 5: Go/No-Go Decision

  • If pilot successful: Proceed to full deployment
  • If pilot unsuccessful: Terminate contract (no penalty per pilot clause)
  • Publish results (positive or negative) for transparency

3.3 Required Contract Clauses

All vendor contracts shall include:

Performance Guarantees:

Vendor guarantees AI system will achieve the following performance metrics
at [ORGANIZATION NAME]:
- [PRIMARY METRIC]: ≥ [THRESHOLD]
- False Positive Rate: ≤ [THRESHOLD]%
- User Satisfaction: ≥ [THRESHOLD]/5

If performance falls below these thresholds for [TIMEFRAME], [ORGANIZATION]
may terminate without penalty and receive full refund.

Fairness Requirements:

Vendor warrants AI system demonstrates equitable performance across demographic
groups (performance difference <10% across race, ethnicity, sex, age).

Vendor will provide quarterly bias audit reports. If disparate impact identified,
Vendor must remediate within [TIMEFRAME] or [ORGANIZATION] may terminate.

Data Privacy and Security:

Vendor agrees to:
- Sign HIPAA Business Associate Agreement before PHI access
- Encrypt data at rest (AES-256) and in transit (TLS 1.3+)
- Store data only in HIPAA-compliant infrastructure in [COUNTRY/REGION]
- Not use [ORGANIZATION] data for Vendor's R&D without written consent
- Delete all [ORGANIZATION] data within 30 days of contract termination
- Provide audit logs of data access quarterly

Validation Rights:

[ORGANIZATION] has the right to:
- Conduct independent validation studies
- Publish validation results (positive or negative)
- Access model performance dashboards in real-time
- Receive technical documentation for validation

Liability and Indemnification:

Vendor indemnifies [ORGANIZATION] for:
- Patient harm from AI errors (up to $[AMOUNT])
- Regulatory fines from Vendor non-compliance
- Data breaches from Vendor security failures

Vendor maintains professional liability insurance of ≥$[AMOUNT].

Termination Rights:

[ORGANIZATION] may terminate for:
- Cause (breach): Immediate termination, full refund
- Safety concerns: Immediate termination if patient safety risk
- Non-performance: If system fails performance guarantees
- Convenience: [X]-day notice, pro-rated refund

Upon termination, Vendor must delete all data and provide data export.

4. Development and Validation (Internal AI Development)

4.1 Applicability

This section applies when [ORGANIZATION NAME] develops AI systems internally (not acquired from vendors).

[OPTIONAL: Skip this section if organization does not plan internal AI development]

4.2 Development Requirements

IRB/Ethics Review:

  • All AI development involving patient data requires IRB approval
  • Expedited review for retrospective, de-identified data studies
  • Full board review for prospective deployment affecting patient care

Development Team Composition:

  • Multidisciplinary: data scientists + clinicians + ethicists
  • Clinical co-leads for all projects (not just consultants)
  • Health equity expert involvement from project start

Training Data Requirements:

  • Real patient outcomes, not synthetic or hypothetical cases (Ross and Swetlitz 2018)
  • Representative sample: diverse by race, ethnicity, age, sex, socioeconomic status
  • Document known biases in historical data
  • Data quality assessment before modeling
  • Minimum sample size: [e.g., 1,000 patients for rare outcomes, 10,000+ for common outcomes]

Validation Requirements:

Stage 1: Internal Validation

  • Hold-out test set (≥20% of data) never used during training or hyperparameter tuning
  • Temporal validation for time-series data (train on past, test on future)

Stage 2: External Validation

  • Test at ≥2 independent sites not involved in development
  • Diverse patient populations
  • Performance reported stratified by demographics
  • Validation conducted by independent researchers

Stage 3: Prospective Validation

  • Pilot deployment in controlled setting
  • Compare AI-assisted care to standard care
  • Measure clinical outcomes (not just AUC)
  • Monitor for unintended consequences

Documentation Requirements:

  • Model card documenting training data, performance, and limitations (Mitchell et al. 2019)
  • Code and data (if sharable) archived in institutional repository
  • Audit trail of all modeling decisions
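The temporal validation required in Stage 1 is a common failure point: randomly shuffled splits leak future information into training. A minimal sketch of a time-aware split, using hypothetical record fields and an illustrative cutoff date:

```python
from datetime import date

# Illustrative temporal split: train strictly on the past, test on the
# future. Record fields ("admit_date", "outcome") and the cutoff date
# are hypothetical.

records = [
    {"patient_id": 1, "admit_date": date(2022, 3, 1), "outcome": 0},
    {"patient_id": 2, "admit_date": date(2022, 11, 15), "outcome": 1},
    {"patient_id": 3, "admit_date": date(2023, 2, 10), "outcome": 0},
    {"patient_id": 4, "admit_date": date(2023, 8, 5), "outcome": 1},
]

def temporal_split(records, cutoff):
    """Never shuffle across time: everything before the cutoff trains,
    everything on or after it tests."""
    train = [r for r in records if r["admit_date"] < cutoff]
    test = [r for r in records if r["admit_date"] >= cutoff]
    return train, test

train, test = temporal_split(records, cutoff=date(2023, 1, 1))
print(len(train), "training records,", len(test), "test records")
```

In practice the cutoff should be chosen so the test period reflects current care patterns (e.g., the most recent 20% of encounters), satisfying both the hold-out-size and temporal requirements above.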

4.3 Fairness and Bias Testing

Mandatory bias audits before deployment:

  1. Subgroup Analysis:
    • Performance stratified by: race, ethnicity, sex, age, insurance type, socioeconomic status
    • Disparate impact test: Performance difference >10% across groups = FAIL
    • If failed: Implement bias mitigation before deployment
  2. Calibration Analysis:
    • Is model equally well-calibrated across demographic groups?
    • Use calibration plots by subgroup
  3. Feature Importance by Group:
    • Do different features drive predictions for different groups?
    • May indicate bias or population differences requiring clinical review
  4. Fairness Metrics:
    • Report multiple definitions: demographic parity, equalized odds, equal opportunity
    • No single metric is sufficient; review holistically (Rajkomar et al. 2018)

If bias detected:

  • Document in the model card
  • Implement mitigation: re-sampling, re-weighting, fairness constraints
  • If mitigation insufficient: DO NOT DEPLOY
  • Consider whether the problem is solvable with AI or requires systemic interventions
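The disparate impact test in Section 4.3 (performance difference >10% across groups = FAIL) reduces to a simple per-group comparison. A minimal sketch, with hypothetical group labels and made-up confusion-matrix counts:

```python
# Sketch of the Section 4.3 disparate impact test: compute sensitivity
# per demographic group and fail if the max-min gap exceeds 10
# percentage points. Group names and counts are illustrative.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives the model caught."""
    return true_pos / (true_pos + false_neg)

# (true positives, false negatives) by group -- hypothetical counts
by_group = {
    "group_a": (90, 10),   # sensitivity 0.90
    "group_b": (76, 24),   # sensitivity 0.76
}

rates = {g: sensitivity(tp, fn) for g, (tp, fn) in by_group.items()}
gap = max(rates.values()) - min(rates.values())
verdict = "FAIL" if gap > 0.10 else "PASS"
print({g: round(r, 2) for g, r in rates.items()}, "gap:", round(gap, 2), verdict)
```

The same pattern applies to AUC, specificity, or calibration error; in a real audit each metric would also carry confidence intervals, since small subgroups can show large gaps by chance alone.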


5. Deployment and Implementation

5.1 Pilot Deployment Requirements

All AI systems (vendor or internal) must undergo pilot before full deployment.

Pilot Scope:

  • 1-3 clinical units or departments
  • 3-6 month duration
  • Intensive monitoring and support

Pre-Deployment Checklist:

Technical:

- [ ] System integrated with EHR/IT infrastructure
- [ ] Performance monitoring dashboard operational
- [ ] Backup procedures in place for system failures
- [ ] Data security measures implemented and tested

Clinical:

- [ ] Clinical workflows documented and staff trained
- [ ] Alert thresholds configured and tested
- [ ] Escalation procedures defined (what to do if AI fails)
- [ ] Clinical oversight process established

Governance:

- [ ] AI Governance Committee approval obtained
- [ ] IRB approval (if applicable)
- [ ] Patient notification process defined (if applicable)
- [ ] Informed consent procedures (for high-risk systems)

5.2 User Training Requirements

All staff using AI systems must complete training:

Onboarding Training (Required before first use):

  • How the AI system works (conceptual understanding)
  • What the system does and does NOT do (limitations)
  • How to interpret AI outputs (scores, alerts, recommendations)
  • When to override AI recommendations (clinical judgment is paramount)
  • How to report issues or concerns

Continuing Education (Annual):

  • Review of system performance and any changes
  • Case studies: correct AI use and common errors
  • Updates to policies or procedures

Competency Assessment:

  • [OPTIONAL] Quiz or practical assessment before independent use
  • [OPTIONAL] Periodic competency re-assessment

5.3 Go-Live Process

Phase 1: Silent Mode (Weeks 1-2)

  • AI generates outputs, but they are NOT shown to clinicians
  • Verify technical stability, data quality, and integration
  • Identify and fix bugs before clinical use

Phase 2: Alert Mode (Weeks 3-4)

  • AI outputs shown to clinicians as informational
  • Clinicians provide feedback on usefulness and accuracy
  • Measure alert burden and response rates

Phase 3: Active Mode (Week 5+)

  • AI fully integrated into clinical workflow
  • Intensive monitoring continues
  • Weekly review by AI Governance Committee

Go/No-Go Decision (Months 3-6):

  • Pre-defined success criteria met? Proceed to full deployment
  • Partial success? Iterate and continue the pilot
  • Failure? Terminate and document lessons learned


6. Monitoring and Maintenance

6.1 Continuous Performance Monitoring

All deployed AI systems require ongoing monitoring:

Technical Performance Metrics (Monitored Continuously):

  • Prediction accuracy (AUC, sensitivity, specificity)
  • Alert rate (alerts per day per unit)
  • False positive and false negative rates
  • System uptime and latency
  • Data quality scores (completeness, validity)

Clinical Outcome Metrics (Monitored Monthly):

  • Patient outcomes (mortality, readmissions, complications)
  • Time to diagnosis or treatment
  • Length of stay
  • Clinician satisfaction and trust
  • Patient satisfaction (if applicable)

Fairness Metrics (Monitored Quarterly):

  • Performance stratified by demographics
  • Alert rates by demographic group
  • Clinical outcomes by demographic group
  • Disparate impact analysis

Monitoring Dashboards:

  • Real-time dashboard accessible to AI Governance Committee
  • Automated alerts for performance degradation >10%
  • Monthly summary reports to clinical leadership

6.2 Model Drift Detection

AI models degrade over time as populations and care practices change (Finlayson et al. 2021).

Drift Monitoring:

  • Data drift: Are input feature distributions changing?
  • Concept drift: Is the relationship between features and outcomes changing?
  • Performance drift: Is model accuracy decreasing?

Automated Drift Detection:

  • Compare current data distribution to training data (monthly)
  • Alert if distribution shift exceeds threshold (e.g., Kolmogorov-Smirnov test p<0.01)
  • Alert if performance declines >5% from baseline

Response to Drift:

  1. Investigate root cause (population change? documentation change? model failure?)
  2. If temporary: Continue monitoring
  3. If persistent: Retrain model with recent data
  4. If severe: Deactivate system until resolved
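The automated checks described above can be sketched in a few lines. This is a simplified illustration, not production monitoring code: the thresholds follow Section 6.2 (KS test p<0.01, performance decline >5%), while the function names and simulated data are assumptions.

```python
import numpy as np
from scipy import stats

# Sketch of monthly drift checks: a two-sample Kolmogorov-Smirnov test
# for data drift, and a relative-decline check for performance drift.
# Thresholds mirror Section 6.2; names and data are illustrative.

def data_drift_alert(train_sample, current_sample, p_threshold=0.01):
    """Alert when the KS test rejects 'same distribution' at p_threshold."""
    _, p_value = stats.ks_2samp(train_sample, current_sample)
    return p_value < p_threshold

def performance_drift_alert(baseline_auc, current_auc, max_decline=0.05):
    """Alert when relative AUC decline from baseline exceeds max_decline."""
    return (baseline_auc - current_auc) / baseline_auc > max_decline

rng = np.random.default_rng(42)
train_ages = rng.normal(55, 12, size=1000)    # feature at training time
shifted_ages = rng.normal(62, 12, size=1000)  # population has aged

print(data_drift_alert(train_ages, train_ages))    # False: identical data
print(data_drift_alert(train_ages, shifted_ages))  # True: clear shift
print(performance_drift_alert(0.85, 0.79))         # True: ~7% decline
```

A production pipeline would run such checks per feature, correct for multiple comparisons, and log results to the monitoring dashboard rather than printing them.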

6.3 Model Retraining

When to retrain:

  • Performance drift detected (>5% decline for >2 months)
  • Significant change in clinical practice or guidelines
  • New data available (e.g., an additional 12 months)
  • Regulatory requirement or vendor update

Retraining Process:

  • Full validation cycle required (same as initial deployment)
  • Validation cannot be skipped, even for “minor” updates
  • Document changes in the model card
  • Notify users of updates and any behavior changes

6.4 Adverse Event Reporting

AI-related adverse events must be reported:

Reportable Events:

  • Patient harm attributed to AI error or malfunction
  • Near-miss events (AI error caught before patient harm)
  • Systematic bias affecting patient subgroups
  • Privacy or security breaches involving AI system data

Reporting Process:

  1. Immediate report to AI Governance Committee Chair (within 24 hours)
  2. Investigation by Quality and Safety (within 7 days)
  3. Root cause analysis (within 30 days)
  4. Corrective actions implemented
  5. Report to FDA if device-related (within 30 days per FDA requirements) (FDA 2024)

Learning from Events:

  • Quarterly summary of all AI-related events
  • Lessons learned shared with clinical staff
  • Policy updates if systemic issues identified


7. Data Privacy and Security

7.1 HIPAA Compliance

All AI systems processing PHI must comply with the HIPAA Security Rule (HHS 2024).

Administrative Safeguards:

  • Designate an AI system security official
  • Workforce training on AI data security
  • Access controls (role-based, least privilege)
  • Audit logs of all data access (retained 6 years)

Physical Safeguards:

  • Secure data centers (if on-premise)
  • Workstation security policies
  • Device and media controls

Technical Safeguards:

  • Encryption at rest (AES-256 minimum)
  • Encryption in transit (TLS 1.3 minimum)
  • Multi-factor authentication for system access
  • Automatic logoff after inactivity

Risk Assessment:

  • Annual HIPAA risk assessment for all AI systems
  • Document risks and mitigation strategies
  • Test security controls annually

7.2 Data Minimization

Collect only data necessary for AI function (HIPAA minimum necessary rule).

Data Minimization Principles:

  • Define the minimum data set required for each AI system
  • Request access to a subset of EHR data (not the entire database)
  • De-identify data when possible (for non-clinical uses)
  • Limit data retention to the minimum required period

Example:

  • A readmission prediction model needs: demographics, diagnoses, medications, utilization
  • It does NOT need: clinical notes, radiology images, genetic data
  • Request only the necessary data fields

7.3 Data Retention and Deletion

Data Lifecycle Management:

Active Use: Data retained as long as the AI system is deployed and providing clinical value.

System Decommissioning: When an AI system is retired, all associated data must be:

  • Archived for regulatory requirements (e.g., 7 years for HIPAA)
  • Deleted securely after the retention period
  • Covered by a certificate of destruction (if vendor data)

Vendor Data:

  • Upon contract termination, vendor must delete all [ORGANIZATION] data within 30 days
  • Vendor must provide written certification of deletion
  • Audit vendor compliance if high-risk data


8. Fairness, Equity, and Bias

8.1 Commitment to Health Equity

[ORGANIZATION NAME] is committed to ensuring AI systems promote health equity and do not perpetuate or exacerbate health disparities.

Equity Principles:

  1. Fairness by Design: Equity considerations integrated from project inception
  2. Diverse Representation: Training data reflects patient population diversity
  3. Transparent Reporting: Performance stratified by demographics
  4. Continuous Monitoring: Ongoing bias surveillance
  5. Community Engagement: Affected communities involved in AI governance

8.2 Bias Detection and Mitigation

Pre-Deployment Bias Audit (Required):

Conducted by [Quality/Equity Department or External Auditor]:

  1. Data Representativeness:
    • Compare training data demographics to [ORGANIZATION]’s patient population
    • Identify underrepresented groups
    • Assess whether underrepresentation introduces bias
  2. Performance Equity:
    • AUC, sensitivity, specificity by: race, ethnicity, sex, age, insurance, zip code
    • Performance difference >10% = RED FLAG
    • Investigate cause and implement mitigation
  3. Outcome Equity:
    • Do AI recommendations lead to equitable care?
    • Example: Are referrals to specialists equally likely for Black and White patients with same AI score?
  4. Proxy Variable Analysis:
    • Does AI use variables correlated with race/ethnicity as proxies?
    • Example: Zip code, insurance type, hospital site
    • Review whether proxies introduce bias (Obermeyer et al. 2019)

Mitigation Strategies (if bias detected):

  • Re-sampling to balance training data
  • Re-weighting to emphasize underrepresented groups
  • Fairness constraints in model training
  • Post-processing to equalize outcomes across groups
  • If mitigation insufficient: DO NOT DEPLOY
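Of the strategies listed above, re-weighting is the simplest to illustrate. A minimal sketch using the common "balanced" scheme (weights inversely proportional to group frequency, the same formula scikit-learn uses for class_weight="balanced"); group labels and counts are hypothetical:

```python
from collections import Counter

# Sketch of re-weighting to emphasize underrepresented groups:
# weight_g = n_total / (n_groups * n_g), so each group contributes
# equal total weight during training. Labels are illustrative.

def balanced_group_weights(group_labels):
    counts = Counter(group_labels)
    n_total, n_groups = len(group_labels), len(counts)
    return {g: n_total / (n_groups * c) for g, c in counts.items()}

labels = ["majority"] * 800 + ["minority"] * 200
weights = balanced_group_weights(labels)
print(weights)  # {'majority': 0.625, 'minority': 2.5}
```

The per-record weights would then be passed to the training routine (e.g., a sample_weight argument). Re-weighting addresses representation imbalance only; it does not fix biased outcome labels, which may require the systemic interventions noted in Section 4.3.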

8.3 Ongoing Equity Monitoring

Quarterly Equity Dashboard:

Metrics reported to the AI Governance Committee:

  • AI system usage by patient demographics
  • Performance metrics by subgroup
  • Clinical outcomes by subgroup
  • Alert/referral rates by subgroup

Annual Equity Audit:

  • External auditor reviews all AI systems
  • Community stakeholder engagement sessions
  • Equity action plans for systems with disparities


9. Incident Response and Accountability

9.1 Incident Classification

Level 1 (Critical):

  • Patient death or serious injury attributed to AI
  • Systematic bias causing widespread harm
  • Major data breach (PHI of >500 patients)
  • FDA recall or regulatory action

Response: Immediate system deactivation, CEO notification, root cause analysis within 7 days

Level 2 (Major):

  • Patient harm requiring intervention (no permanent injury)
  • Performance degradation >20%
  • Minor data breach (<500 patients)
  • Vendor contract breach

Response: Urgent AI Governance Committee review, investigation within 14 days, corrective action within 30 days

Level 3 (Minor):

  • Near-miss events (caught before patient harm)
  • Performance degradation 10-20%
  • User complaints or usability issues
  • Technical failures (quickly resolved)

Response: Standard investigation, review at next quarterly meeting

9.2 Incident Response Procedure

Step 1: Detect and Report (0-24 hours)

  • Anyone can report an incident via [REPORTING SYSTEM]
  • On-call AI Governance Committee member notified
  • Immediate risk assessment: Continue, restrict, or deactivate the system?

Step 2: Investigate (1-7 days for Level 1; 1-14 days for Level 2)

  • Assemble investigation team (Quality, IT, Clinical, Legal)
  • Root cause analysis using established frameworks
  • Document timeline, contributing factors, and harm caused

Step 3: Implement Corrective Actions (7-30 days)

  • Immediate fixes to prevent recurrence
  • System updates or reconfiguration
  • Policy or procedure updates
  • Additional staff training

Step 4: Report (30 days)

  • Internal report to AI Governance Committee and leadership
  • External reporting if required:
    • FDA (medical device adverse events) (FDA 2024)
    • HHS (HIPAA breaches affecting >500 patients)
    • State health departments (if required)

Step 5: Learn and Improve (Ongoing)

  • Lessons learned shared organization-wide
  • Update policies and training
  • Contribute to field knowledge (de-identified case studies)

9.3 Accountability Framework

Who is accountable for AI decisions?

Clinical Decisions:

  • The clinician is ALWAYS accountable for clinical decisions
  • AI is a tool; the clinician retains responsibility
  • Clinicians may override AI recommendations with documentation
  • “AI made me do it” is NOT a valid defense

Organizational Accountability: [ORGANIZATION NAME] is accountable for:

  • Appropriate AI system selection and validation
  • Adequate training and support for users
  • Monitoring and maintenance
  • Response to identified problems

Vendor Accountability: Vendors are accountable for:

  • Accuracy of marketing claims
  • System performance as specified in contract
  • Security and privacy protections
  • Notification of known issues or limitations

Liability Insurance:

  • [ORGANIZATION] maintains professional and cyber liability insurance covering AI-related risks
  • Vendors required to maintain minimum insurance per contracts


10. Policy Compliance and Review

10.1 Compliance Monitoring

The Compliance Officer (designated by the AI Governance Committee) monitors that:

  • All AI acquisitions undergo required evaluation
  • Pilots are completed before full deployment
  • Monitoring dashboards are reviewed quarterly
  • Adverse events are reported and investigated
  • Training is completed by all users
  • Documentation is maintained per requirements

Audit Schedule: - Internal audit annually - External audit every 2 years [OPTIONAL]

10.2 Policy Violations

Violations of this policy may result in:

  • System deactivation until compliance achieved
  • Retraining required for involved staff
  • Disciplinary action per [ORGANIZATION] HR policies
  • Termination of vendor contracts
  • Regulatory reporting (if the violation causes patient harm)

10.3 Policy Review and Updates

Annual Review: The AI Governance Committee reviews this policy annually and updates it based on:

  • New regulations or guidance
  • Lessons learned from incidents
  • Advances in AI safety and fairness methods
  • Feedback from staff and patients

Version Control: - All policy versions archived - Changes documented with rationale - Staff notified of significant changes

Approval Process: - Updated policy requires approval by [CEO, CMO, Board] - Effective date after approval and staff notification


Implementation Guidance

Getting Started: 90-Day Implementation Plan

Month 1: Establish Governance

  • Weeks 1-2: Form AI Governance Committee
  • Weeks 3-4: Customize this policy template for your organization
  • Month 1 end: Policy approved by leadership

Month 2: Inventory and Assessment

  • Weeks 5-6: Inventory all AI systems currently in use
  • Weeks 7-8: Assess compliance with the new policy; identify gaps
  • Month 2 end: Prioritized list of remediation actions

Month 3: Training and Rollout

  • Weeks 9-10: Train AI Governance Committee and key staff
  • Weeks 11-12: Communicate policy to all staff; launch reporting system
  • Month 3 end: Policy fully operational

Resources for Implementation

Templates and Tools:

  • AI Vendor Evaluation Checklist (Appendix G)
  • Model Card Template (Mitchell et al. 2019)
  • Bias Audit Checklist (Section 8.2)
  • Incident Report Form (Section 9.2)

External Resources:

  • FDA AI/ML Medical Device Action Plan: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices
  • WHO AI Ethics Principles: https://www.who.int/publications/i/item/9789240029200
  • Coalition for Health AI (CHAI): https://www.coalitionforhealthai.org/

Expert Consultation: Consider engaging external consultants for:

  • Legal review of the policy (healthcare attorney with AI expertise)
  • Technical review (AI safety researcher)
  • Equity audit (health equity researcher)


Example Policies from Leading Institutions

Examples (De-identified)

Large Academic Medical Center (East Coast):

  • Established AI Oversight Committee (2022)
  • Requires external validation for all AI purchases
  • Mandatory 6-month pilot before full deployment
  • Quarterly equity audits
  • Result: 3 AI systems deployed successfully, 2 rejected after evaluation

State Health Department (West Coast):

  • Developed AI governance framework (2023)
  • Prioritizes equity (50% of evaluation score)
  • Community Advisory Board reviews AI proposals
  • Public dashboard showing AI system performance
  • Result: Increased public trust; a model for other states

Regional Health System (Midwest):

  • Conservative approach: only FDA-cleared AI devices
  • Extensive validation required (12-month pilot minimum)
  • Patient notification for all AI use
  • Result: Slow but safe deployment; zero AI-related adverse events


References

Federal Regulations and Guidance:

  • FDA. (2024). Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. U.S. Food and Drug Administration. Retrieved from https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices

  • White House. (2023). Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. Executive Order 14110. Retrieved from https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/

  • HHS. (2024). HIPAA Security Rule. U.S. Department of Health and Human Services. Retrieved from https://www.hhs.gov/hipaa/for-professionals/security/index.html

  • FDA. (2024). Medical Device Reporting (MDR): How to Report Medical Device Problems. Retrieved from https://www.fda.gov/medical-devices/medical-device-safety/medical-device-reporting-mdr-how-report-medical-device-problems

International Standards:

  • European Parliament and Council. (2024). Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act). Official Journal of the European Union. Retrieved from https://eur-lex.europa.eu/eli/reg/2024/1689/oj

  • WHO. (2021). Ethics and governance of artificial intelligence for health. World Health Organization, Geneva. ISBN: 978-92-4-002920-3. Retrieved from https://www.who.int/publications/i/item/9789240029200

State-Level Regulations:

  • California Assembly Bill 2013 (2024). Automated Decision Systems: Algorithmic Discrimination. California State Legislature.

  • New York City Local Law 144 (2023). Automated Employment Decision Tools. NYC Administrative Code.

Academic Literature - AI Validation and Safety:

  • Nagendran, M., Chen, Y., Lovejoy, C. A., et al. (2020). Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ, 368, m689. DOI: 10.1136/bmj.m689

  • Finlayson, S. G., Subbaswamy, A., Singh, K., et al. (2021). The Clinician and Dataset Shift in Artificial Intelligence. New England Journal of Medicine, 385(3), 283-286. DOI: 10.1056/NEJMc2104626

  • Ross, C., & Swetlitz, I. (2017). IBM’s Watson supercomputer recommended ‘unsafe and incorrect’ cancer treatments. STAT News. Retrieved from https://www.statnews.com/2017/07/25/ibm-watson-cancer-unsafe/

Academic Literature - Bias and Fairness:

  • Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453. DOI: 10.1126/science.aax2342

  • Rajkomar, A., Hardt, M., Howell, M. D., Corrado, G., & Chin, M. H. (2018). Ensuring Fairness in Machine Learning to Advance Health Equity. Annals of Internal Medicine, 169(12), 866-872. DOI: 10.7326/M18-1990

Documentation and Transparency:

  • Mitchell, M., Wu, S., Zaldivar, A., et al. (2019). Model Cards for Model Reporting. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* ’19), 220-229. DOI: 10.1145/3287560.3287596

General AI Resources:

  • Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). Pearson.

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

  • Tegomoh, B. (2025). The Public Health AI Handbook: A Practical Guide for Epidemiologists and Public Health Practitioners. Retrieved from https://github.com/BryanTegomoh/publichealth-ai-handbook


Appendix: Sample Monitoring Dashboard Metrics

Monthly AI System Performance Dashboard

System Name: [AI System Name]
Deployment Date: [Date]
Reporting Period: [Month/Year]
Status: 🟢 Normal / 🟡 Warning / 🔴 Critical

Technical Performance

| Metric | Current | Baseline | Change | Status |
|---|---|---|---|---|
| AUC | 0.78 | 0.80 | -2.5% | 🟡 |
| Sensitivity | 0.75 | 0.75 | 0% | 🟢 |
| Specificity | 0.82 | 0.85 | -3.5% | 🟡 |
| False Positive Rate | 18% | 15% | +3 pp | 🟡 |
| PPV | 0.68 | 0.70 | -2.9% | 🟡 |
| System Uptime | 99.2% | 99.5% | -0.3 pp | 🟢 |
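The Change and Status columns above can be generated automatically rather than filled in by hand. A minimal sketch follows; the warning and critical thresholds (2% and 5% relative decline) are illustrative assumptions, not values mandated by this policy, and should be set per metric in your monitoring plan:

```python
def metric_status(current: float, baseline: float,
                  higher_is_better: bool = True,
                  warn: float = 0.02, crit: float = 0.05) -> tuple[float, str]:
    """Return (relative change vs. baseline, status flag) for one metric.

    warn/crit are illustrative thresholds: a relative decline of >=2%
    flags YELLOW, >=5% flags RED. Tune these for your own dashboard.
    """
    change = (current - baseline) / baseline
    decline = -change if higher_is_better else change
    if decline >= crit:
        flag = "RED"
    elif decline >= warn:
        flag = "YELLOW"
    else:
        flag = "GREEN"
    return change, flag

# AUC row from the table: 0.78 current vs. 0.80 baseline
change, flag = metric_status(0.78, 0.80)
print(f"{change:+.1%} {flag}")  # -2.5% YELLOW
```

For metrics where lower is better (e.g., false positive rate), pass `higher_is_better=False` so an increase is treated as the decline.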

Clinical Outcomes

| Metric | Current | Baseline | Change | Status |
|---|---|---|---|---|
| 30-day Mortality | 8.2% | 8.5% | -3.5% | 🟢 |
| Length of Stay | 5.2 days | 5.5 days | -5.5% | 🟢 |
| Time to Treatment | 45 min | 52 min | -13.5% | 🟢 |

Equity Metrics

| Subgroup | AUC | Sensitivity | Specificity | Status |
|---|---|---|---|---|
| White | 0.79 | 0.76 | 0.83 | 🟢 |
| Black | 0.76 | 0.73 | 0.80 | 🟡 |
| Hispanic | 0.77 | 0.74 | 0.81 | 🟢 |
| Asian | 0.80 | 0.77 | 0.84 | 🟢 |
| Age <65 | 0.80 | 0.77 | 0.84 | 🟢 |
| Age ≥65 | 0.76 | 0.73 | 0.80 | 🟡 |

Disparate Impact Assessment: AUC difference between best- and worst-performing subgroups = 4 percentage points (0.80 vs 0.76; threshold: 10 percentage points). Status: 🟢 Acceptable disparity
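The best-versus-worst subgroup gap can be computed directly from the equity table. A minimal sketch, using the subgroup AUCs and the 10-percentage-point threshold from the dashboard above:

```python
# Subgroup AUCs from the equity table above
subgroup_auc = {
    "White": 0.79, "Black": 0.76, "Hispanic": 0.77, "Asian": 0.80,
    "Age <65": 0.80, "Age >=65": 0.76,
}

def disparity_check(scores: dict[str, float],
                    threshold_pp: float = 10.0) -> tuple[float, bool]:
    """Return (best-minus-worst gap in percentage points, within-threshold flag)."""
    best, worst = max(scores.values()), min(scores.values())
    gap_pp = (best - worst) * 100  # express the AUC gap in points
    return gap_pp, gap_pp <= threshold_pp

gap, acceptable = disparity_check(subgroup_auc)
print(f"Gap: {gap:.0f} pp, acceptable: {acceptable}")  # Gap: 4 pp, acceptable: True
```

The same check can be run per metric (sensitivity, specificity) and per reporting period, so a widening gap triggers the investigation called for in the action items.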

User Experience

| Metric | Current | Target | Status |
|---|---|---|---|
| Clinician Satisfaction | 3.8/5 | ≥4.0/5 | 🟡 |
| Alert Response Rate | 75% | ≥80% | 🟡 |
| Average Time Per Alert | 3.2 min | <5 min | 🟢 |

Incidents

  • Level 1: 0
  • Level 2: 0
  • Level 3: 2 (minor technical glitches, resolved)

Action Items

  1. 🟡 Investigate performance decline in Black patients (AUC 0.76 vs 0.79 for White patients)
  2. 🟡 Improve clinician satisfaction through workflow optimization
  3. 🟢 Continue monthly monitoring

Next Review: [Date]


END OF POLICY TEMPLATE

This policy template is provided as a starting point. Organizations should customize based on their specific needs, regulatory environment, and risk tolerance. Consult with legal counsel, AI experts, and clinical leadership before finalizing and implementing.