17  Emerging Technologies

Learning Objectives

By the end of this chapter, you will be able to:

  1. Identify emerging AI technologies with potential for transformative impact in public health
  2. Understand foundation models and their applications beyond language (multimodal, vision, scientific)
  3. Evaluate quantum computing potential for epidemiological simulations and optimization
  4. Assess federated learning for privacy-preserving multi-site public health research
  5. Explore edge AI and its role in resource-limited settings and real-time surveillance
  6. Understand digital twins for personalized medicine and population health modeling
  7. Recognize AI-powered drug discovery and vaccine development acceleration
  8. Evaluate causality and causal inference methods for public health decision-making
  9. Prepare for technology adoption by understanding readiness assessment and implementation pathways
  10. Anticipate risks and challenges associated with emerging technologies in healthcare

Estimated time: 60-75 minutes

Prerequisites:

  • Chapter 2: Just Enough AI to Be Dangerous - AI fundamentals
  • Chapter 19: Large Language Models in Public Health - Foundation model basics
  • Chapter 11: Privacy, Security, and Governance - Privacy technologies


17.1 What You’ll Learn

This chapter explores cutting-edge AI technologies on the horizon for public health, including:

  1. Foundation Models - Multimodal models, vision-language systems, scientific foundation models
  2. Federated Learning - Privacy-preserving collaborative learning across institutions
  3. Edge AI - On-device intelligence for resource-limited settings
  4. Quantum Computing - Potential for complex epidemiological modeling
  5. Digital Twins - Virtual representations for personalized and population health
  6. AI-Driven Drug Discovery - Accelerating therapeutic development
  7. Causal AI - Moving beyond correlation to understand causation
  8. Neuromorphic Computing - Brain-inspired hardware for efficient AI

17.2 1. Introduction: The Rapidly Evolving Landscape

17.2.1 The Pace of Change

AI capabilities are advancing at an unprecedented pace:

  • GPT-3 (2020): 175B parameters
  • GPT-4 (2023): Multimodal, reasoning capabilities
  • Claude 3 (2024): Long context, vision, analysis
  • What’s next? Agentic AI, world models, scientific reasoning

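Sevilla et al., 2022, arXiv show that the compute used to train notable AI systems has doubled approximately every 6 months since about 2010, much faster than the roughly two-year doubling associated with Moore’s Law.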

17.2.2 Why Public Health Should Pay Attention

Historical precedent: Technologies that seemed futuristic become routine:

  • 2010: “AI can’t diagnose diseases” → 2020: FDA-approved AI diagnostics
  • 2015: “LLMs are just autocomplete” → 2023: Clinical reasoning with ChatGPT/GPT-4
  • 2020: “mRNA vaccines unproven” → 2021: COVID-19 vaccine in record time

The opportunity: Early adopters who understand emerging technologies can:

  • Shape development toward public health needs
  • Prepare infrastructure and workforce
  • Influence policy and standards
  • Avoid being blindsided by rapid adoption


17.2.3 Technology Readiness Levels (TRL)

NASA’s Technology Readiness Level scale helps assess maturity:

TRL | Stage | Description | Example in Public Health AI
1-3 | Research | Basic principles, proof of concept | Quantum algorithms for epidemiology
4-6 | Development | Lab validation, prototype testing | Federated learning across 3 hospitals
7-9 | Deployment | System demonstration, operations | GPT-4 for clinical documentation

This chapter focuses on TRL 3-7 technologies: proven concepts moving toward practical deployment.


17.3 2. Foundation Models and Multimodal AI

17.3.1 Beyond Language: Multimodal Foundation Models

Definition: Models trained on multiple data types (text, images, audio, video, sensor data) that can understand and generate across modalities.

17.3.1.1 Examples:

GPT-4 Vision (OpenAI, 2023)

  • Input: Medical images + text questions
  • Output: Diagnostic descriptions, differential diagnoses
  • OpenAI GPT-4V

Med-PaLM M (Google, 2023)

  • Trained on medical imaging, genomics, clinical notes
  • Tu et al., 2023, arXiv
  • Performance: Approaching specialist level on multimodal tasks

CLIP for Medical Imaging (OpenAI adaptation)

  • Zero-shot classification of medical images
  • Natural language → image retrieval
  • Radford et al., 2021, ICML

17.3.2 Public Health Applications

17.3.2.1 1. Outbreak Investigation

Scenario: Foodborne illness outbreak

# Multimodal analysis combining:
# - Patient symptom descriptions (text)
# - Food images from social media (vision)
# - Geographic check-ins (spatial)
# - Temporal patterns (time series)

from openai import OpenAI
client = OpenAI()

# Analyze restaurant food image + patient symptoms
response = client.chat.completions.create(
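    # "gpt-4-vision-preview" was a preview name that has since been deprecated;
    # substitute a currently available vision-capable model
    model="gpt-4-vision-preview",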
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": """Analyze this restaurant meal for food safety concerns.
                    Patient reported: diarrhea, nausea 6 hours after eating.
                    What are potential pathogens and risk factors visible?"""
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/meal_photo.jpg"}
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Example output (illustrative only; real responses vary and must be verified by an investigator):

Analyzing the image, I observe:
- Undercooked chicken (pink interior visible)
- Cross-contamination risk: raw vegetables adjacent to meat
- Improper temperature control indicated by condensation

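Given 6-hour incubation:
- Most consistent: Toxin-mediated illness (Staphylococcus aureus, Bacillus cereus)
- Possible: Clostridium perfringens (typically 6-24 hours)
- Less likely: Salmonella (usually 12-72 hours), Norovirus (usually 12-48 hours)

Recommended actions:
1. Test food samples for bacterial toxins and Salmonella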
2. Interview other diners with similar exposure
3. Inspect restaurant kitchen practices

17.3.2.2 2. Telemedicine in Low-Resource Settings

Combine patient descriptions with photos for remote diagnosis:

# Dermatology diagnosis via WhatsApp in rural clinic
# Pseudocode: multimodal_model is a hypothetical placeholder, not a real library object
diagnosis = multimodal_model.analyze(
    image="skin_lesion.jpg",
    text="3-week history, itchy, spreading",
    patient_info={"age": 45, "location": "rural Kenya", "hiv_status": "unknown"}
)

# Output: Suspected fungal infection (tinea corporis)
# Recommended: Topical antifungal, HIV test if persistent

17.3.3 Scientific Foundation Models

Models trained specifically on scientific data:

17.3.3.1 Galactica (Meta, 2022)

Taylor et al., 2022, arXiv

  • Trained on 48M scientific papers, textbooks, knowledge bases
  • Can summarize papers, annotate molecules, write literature reviews
  • Controversy: Public demo withdrawn within days after it generated plausible but false content, including fabricated citations

17.3.3.2 BioGPT (Microsoft, 2022)

Luo et al., 2022, Briefings in Bioinformatics

  • Pre-trained on 15M PubMed abstracts
  • Outperforms GPT-3 on biomedical NLP tasks
  • Use cases: Literature search, clinical trial matching, adverse event detection

17.3.3.3 Med-PaLM 2 (Google, 2023)

Singhal et al., 2023, Nature

  • Roughly 85% accuracy on USMLE-style medical licensing exam questions
  • Safety-tuned for clinical applications
  • Multi-turn medical conversation capability

17.3.4 Implications for Public Health

Opportunities:

  • Rapid literature synthesis during emergencies (like COVID-19)
  • Multi-lingual health communication with vision context
  • Remote diagnostics combining images + patient history
  • Automated surveillance from diverse data sources (text, images, videos)

Challenges:

  • Hallucination risk - Plausible but incorrect information
  • Validation difficulty - How to verify multimodal outputs?
  • Computational cost - Large models expensive to run at scale
  • Bias amplification - Training data may lack diverse populations


17.4 3. Federated Learning for Privacy-Preserving Collaboration

17.4.1 The Problem: Data Silos in Public Health

Current reality:

  • Hospital A has 5,000 patient records
  • Hospital B has 7,000 patient records
  • Hospital C has 3,000 patient records

Traditional approach:

  1. Create data use agreements (6-18 months)
  2. De-identify and transfer data to central location
  3. Risk of re-identification
  4. Patient privacy concerns

Federated learning approach:

  1. Train model locally at each site
  2. Share only model updates (not data)
  3. Aggregate updates centrally
  4. No patient data leaves institution

17.4.2 How Federated Learning Works

McMahan et al., 2017, AISTATS introduced Federated Averaging (FedAvg):

┌─────────────────────────────────────────────────────────┐
│              Federated Learning Process                  │
├─────────────────────────────────────────────────────────┤
│                                                          │
│  Central Server                                          │
│       │                                                  │
│       ├──── Initial Model ────▶  Hospital A             │
│       │                          (Local Training)        │
│       ├──── Initial Model ────▶  Hospital B             │
│       │                          (Local Training)        │
│       └──── Initial Model ────▶  Hospital C             │
│                                  (Local Training)        │
│       ▲                                                  │
│       │                                                  │
│       ├──── Model Update ◀────  Hospital A              │
│       ├──── Model Update ◀────  Hospital B              │
│       └──── Model Update ◀────  Hospital C              │
│       │                                                  │
│  Aggregate & Update                                      │
│       │                                                  │
│       └──── Repeat for N rounds ────▶                   │
│                                                          │
└─────────────────────────────────────────────────────────┘
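The "Aggregate & Update" step in the diagram can be sketched in a few lines of NumPy. This is a minimal illustration of the sample-weighted averaging at the core of FedAvg, assuming each hospital returns its model as a list of per-layer arrays together with its local record count. The function name `fedavg_aggregate` and the toy weights are hypothetical; the record counts reuse the three hospitals from the example above.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Average per-layer weights, weighting each client by its sample count."""
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    aggregated = []
    for layer in range(n_layers):
        # Each client's contribution is proportional to its share of the data
        layer_avg = sum(
            (size / total) * weights[layer]
            for weights, size in zip(client_weights, client_sizes)
        )
        aggregated.append(layer_avg)
    return aggregated

# Toy example: a one-layer "model" with two weights per hospital (made-up values)
hospital_weights = [
    [np.array([1.0, 2.0])],  # Hospital A
    [np.array([3.0, 4.0])],  # Hospital B
    [np.array([5.0, 6.0])],  # Hospital C
]
hospital_sizes = [5000, 7000, 3000]

global_weights = fedavg_aggregate(hospital_weights, hospital_sizes)
print(global_weights[0])
```

Hospital B holds the most records, so the global weights are pulled toward its local model. Only the weight arrays and counts cross institutional boundaries in this sketch.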

17.4.3 Implementation Example

Using Flower framework:

import flwr as fl
from flwr.common import Metrics
import tensorflow as tf
from typing import List, Tuple

# Define model
def create_sepsis_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(20,)),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=['accuracy', tf.keras.metrics.AUC(name='auc')]
    )
    return model

# Client (Hospital)
class HospitalClient(fl.client.NumPyClient):
    def __init__(self, hospital_id: str):
        self.hospital_id = hospital_id
        self.model = create_sepsis_model()
        # Load local hospital data (load_hospital_data and
        # load_hospital_validation_data are site-specific placeholders)
        self.X_train, self.y_train = load_hospital_data(hospital_id)
        self.X_val, self.y_val = load_hospital_validation_data(hospital_id)

    def get_parameters(self, config):
        """Return model parameters"""
        return self.model.get_weights()

    def fit(self, parameters, config):
        """Train model on local data"""
        # Update model with parameters from server
        self.model.set_weights(parameters)

        # Train on local data
        history = self.model.fit(
            self.X_train, self.y_train,
            epochs=5,
            batch_size=32,
            validation_data=(self.X_val, self.y_val),
            verbose=0
        )

        # Return updated model parameters and metrics
        return (
            self.model.get_weights(),
            len(self.X_train),
            {"accuracy": float(history.history['accuracy'][-1])}
        )

    def evaluate(self, parameters, config):
        """Evaluate model on local validation data"""
        self.model.set_weights(parameters)
        loss, accuracy, auc = self.model.evaluate(
            self.X_val, self.y_val,
            verbose=0
        )
        # Cast to built-in floats so the metrics serialize cleanly
        return float(loss), len(self.X_val), {"accuracy": float(accuracy), "auc": float(auc)}

# Server: Aggregate metrics
def weighted_average(metrics: List[Tuple[int, Metrics]]) -> Metrics:
    """Weighted average of metrics from all hospitals"""
    # Multiply accuracy of each client by number of examples
    accuracies = [num_examples * m["accuracy"] for num_examples, m in metrics]
    examples = [num_examples for num_examples, _ in metrics]

    # Aggregate and return
    return {"accuracy": sum(accuracies) / sum(examples)}

# Start federated learning
def start_federated_learning():
    # Define strategy
    strategy = fl.server.strategy.FedAvg(
        fraction_fit=1.0,  # Sample 100% of available clients for training
        fraction_evaluate=1.0,  # Sample 100% for evaluation
        min_available_clients=3,  # Wait for 3 hospitals before starting
        evaluate_metrics_aggregation_fn=weighted_average,
    )

    # Start Flower server
    fl.server.start_server(
        server_address="0.0.0.0:8080",
        config=fl.server.ServerConfig(num_rounds=10),
        strategy=strategy,
    )

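# Each hospital runs this:
def hospital_participates(hospital_id: str):
    client = HospitalClient(hospital_id)
    # Note: start_numpy_client is deprecated in newer Flower releases;
    # check the Flower documentation for the entry point matching your version
    fl.client.start_numpy_client(
        server_address="central-server.health.gov:8080",
        client=client
    )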

# Usage:
# Hospital A: hospital_participates("hospital_a")
# Hospital B: hospital_participates("hospital_b")
# Hospital C: hospital_participates("hospital_c")

17.4.4 Real-World Applications

17.4.4.1 MELLODDY (2019-2022)

Machine Learning Ledger Orchestration for Drug Discovery: multi-partner federated learning for drug discovery

  • Partners: 10 pharmaceutical companies
  • Goal: Predict drug-target interactions
  • Data: 10M+ compounds (kept at each company)
  • Result: Better models than any single company could build alone

17.4.4.2 FeTS (Federated Tumor Segmentation)

Sheller et al., 2020, Scientific Reports

  • Partners: 30 international institutions
  • Task: Brain tumor segmentation from MRI
  • Finding: Federated model approached centralized performance without data sharing

17.4.4.3 Google’s Gboard Keyboard

McMahan & Ramage, 2017, Google AI Blog

  • Application: Next-word prediction
  • Scale: Millions of mobile devices
  • Privacy: User typing data never leaves device

17.4.5 Challenges and Solutions

Challenge | Solution Approach
Non-IID data (hospitals have different patient populations) | FedProx algorithm, personalized federated learning
Communication costs (slow networks) | Compressed updates, fewer rounds
Malicious participants (poisoning attacks) | Robust aggregation, Byzantine-tolerant methods
Privacy leakage (model updates can leak information) | Differential privacy: clip updates and add noise (e.g., DP-SGD)
Heterogeneous systems (different hardware) | Asynchronous federated learning
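To make the noise-adding row concrete, here is a minimal sketch of the clip-then-add-noise step a hospital could apply to its model update before sending it. The function name `privatize_update` and the default parameters are assumptions for illustration. It clips the whole update (as in federated variants of differential privacy) rather than per-example gradients as DP-SGD does, and it does not compute the resulting privacy budget.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip an update to a maximum L2 norm, then add Gaussian noise."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    # Scale down only if the update exceeds the clipping threshold
    scale = min(1.0, clip_norm / (norm + 1e-12))
    clipped = update * scale
    # Noise standard deviation is proportional to the clipping threshold
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

raw_update = np.array([3.0, 4.0])  # L2 norm = 5
clipped_only = privatize_update(raw_update, clip_norm=1.0, noise_multiplier=0.0)
noisy = privatize_update(raw_update, clip_norm=1.0, noise_multiplier=1.0,
                         rng=np.random.default_rng(0))
print(clipped_only, noisy)
```

Clipping bounds how much any single site can move the global model. The noise then masks what remains, at some cost in accuracy.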

17.5 4. Edge AI: Intelligence at the Point of Care

17.5.1 What is Edge AI?

Definition: Running AI models directly on devices (smartphones, tablets, IoT sensors) rather than in the cloud.

Why it matters for public health:

  • Low-latency: Instant results without internet delay
  • Privacy: Data never leaves device
  • Offline capability: Works in areas with poor connectivity
  • Cost: No cloud inference fees at scale
  • Bandwidth: Reduces data transmission needs

17.5.2 Technology Enablers

17.5.2.1 1. Model Compression

Techniques to make models smaller:

Quantization: Reduce precision (32-bit → 8-bit)

import tensorflow as tf

# Original model
model = tf.keras.models.load_model('sepsis_model.h5')

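# Convert to TensorFlow Lite with post-training (dynamic-range) quantization:
# weights are stored as 8-bit integers, activations stay in float
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Full integer (int8) quantization additionally requires setting
# converter.representative_dataset with calibration samples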

tflite_quant_model = converter.convert()

# Save quantized model
with open('sepsis_model_quantized.tflite', 'wb') as f:
    f.write(tflite_quant_model)

# Check size reduction
import os
original_size = os.path.getsize('sepsis_model.h5') / (1024**2)
compressed_size = os.path.getsize('sepsis_model_quantized.tflite') / (1024**2)

print(f"Original: {original_size:.2f} MB")
print(f"Compressed: {compressed_size:.2f} MB")
print(f"Reduction: {(1 - compressed_size/original_size)*100:.1f}%")
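What "32-bit → 8-bit" means numerically can be shown without any framework. The sketch below is a simplified affine (scale plus zero-point) quantizer written for illustration; it is not TensorFlow Lite's exact scheme, and the function names are hypothetical.

```python
import numpy as np

def quantize_int8(weights):
    """Map float32 weights to int8 using a scale and zero-point."""
    w_min, w_max = float(weights.min()), float(weights.max())
    # Fall back to 1.0 for constant tensors to avoid dividing by zero
    scale = (w_max - w_min) / 255.0 or 1.0
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=1000).astype(np.float32)

q, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(q, scale, zero_point)
max_error = float(np.max(np.abs(weights - restored)))

print(f"Storage: {weights.nbytes} bytes -> {q.nbytes} bytes")
print(f"Max round-trip error: {max_error:.5f} (step size {scale:.5f})")
```

Storage drops by a factor of four, and each weight moves by at most about one quantization step. Whether that error is acceptable has to be checked on validation data for the clinical task.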

Pruning: Remove unimportant connections

import tensorflow_model_optimization as tfmot

# Apply pruning
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude

# Define pruning schedule
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,
        final_sparsity=0.5,  # Remove 50% of weights
        begin_step=0,
        end_step=1000
    )
}

# Apply to model
model_for_pruning = prune_low_magnitude(model, **pruning_params)

# Train with pruning
model_for_pruning.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

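# The UpdatePruningStep callback is required; training fails without it.
# model, X_train, and y_train are assumed to be defined earlier.
model_for_pruning.fit(
    X_train, y_train,
    epochs=10,
    callbacks=[tfmot.sparsity.keras.UpdatePruningStep()]
)

# Remove the pruning wrappers before exporting the model
final_model = tfmot.sparsity.keras.strip_pruning(model_for_pruning)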
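The idea behind magnitude pruning is simple enough to show directly. This is a one-shot sketch in NumPy, assuming a single weight matrix and a target sparsity; the function name `magnitude_prune` is hypothetical, and library implementations prune gradually during training rather than in one step.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(sparsity * weights.size)  # Number of weights to remove
    if k == 0:
        return weights.copy()
    # Threshold is the k-th smallest absolute value
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

weights = np.array([0.1, -0.5, 0.05, 0.9, -0.02, 0.3])
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
```

The three weights closest to zero are removed and the larger ones are kept unchanged. The zeros only save memory or compute if the storage format or hardware can exploit sparsity.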

Knowledge Distillation: Train small model to mimic large model

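# Large teacher model
teacher_model = tf.keras.models.load_model('large_model.h5')

# Small student model (create_small_model is a placeholder to implement)
student_model = create_small_model()

# Distillation loss
# Assumes both models output multi-class logits (no final softmax)
# and y_true holds integer class labels
def distillation_loss(y_true, student_logits, teacher_logits, temperature=3.0, alpha=0.5):
    # Student matches teacher's temperature-softened predictions
    soft_loss = tf.keras.losses.KLDivergence()(
        tf.nn.softmax(teacher_logits / temperature),
        tf.nn.softmax(student_logits / temperature)
    ) * temperature ** 2  # Keeps gradient scale comparable across temperatures
    # Student also learns from ground truth
    hard_loss = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(
            y_true, student_logits, from_logits=True
        )
    )

    return alpha * soft_loss + (1 - alpha) * hard_loss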

17.5.2.2 2. Specialized Hardware

Mobile/Edge AI Chips:

  • Google Coral: 4 TOPS (trillion operations/sec), $25
  • NVIDIA Jetson Nano: 472 GFLOPS, $99
  • Apple Neural Engine: Built into iPhone/iPad
  • Qualcomm AI Engine: Built into Snapdragon chips


17.5.3 Public Health Applications

17.5.3.1 1. Point-of-Care Diagnostics

Example: Malaria Detection from Blood Smears

Poostchi et al., 2018, PeerJ

# Mobile app using TensorFlow Lite
import tensorflow as tf
import numpy as np
from PIL import Image

# Load quantized model
interpreter = tf.lite.Interpreter(model_path="malaria_detector.tflite")
interpreter.allocate_tensors()

# Get input/output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def diagnose_malaria(image_path):
    """
    Analyze blood smear image for malaria parasites
    Runs entirely on mobile device, no internet required
    """
    # Load and preprocess image
    img = Image.open(image_path).convert("RGB").resize((224, 224))  # Force 3 channels
    img_array = np.array(img).astype(np.float32) / 255.0
    img_array = np.expand_dims(img_array, axis=0)

    # Run inference
    interpreter.set_tensor(input_details[0]['index'], img_array)
    interpreter.invoke()
    prediction = interpreter.get_tensor(output_details[0]['index'])[0][0]

    # Interpret result
    result = {
        'infected': bool(prediction > 0.5),
        'confidence': float(prediction),
        'parasitemia': estimate_parasitemia(img_array) if prediction > 0.5 else 0,
        'recommendation': generate_recommendation(prediction)
    }

    return result

# Example usage (runs on smartphone)
result = diagnose_malaria("blood_smear_photo.jpg")
print(f"Malaria detected: {result['infected']}")
print(f"Confidence: {result['confidence']:.1%}")
print(f"Recommendation: {result['recommendation']}")

Impact:

  • Diagnosis time: 10 minutes → 30 seconds
  • Cost: $5 microscopy + technician → $0 marginal cost
  • Accessibility: Urban hospital → Rural clinic with smartphone
  • Scalability: Limited by trained technicians → Limited by smartphones

17.5.3.2 2. Wearable Disease Monitoring

Example: Fever Detection from Smartwatch

Miller et al., 2020, Nature Biomedical Engineering

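# Continuous temperature monitoring on wearable device
# (calibrate_baseline, send_alert, and log_to_cloud_when_connected
# are device-specific placeholders)
import numpy as np
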
class FeverDetector:
    """
    Runs on smartwatch to detect fever patterns
    Sends alert only when anomaly detected (preserves battery)
    """

    def __init__(self):
        self.baseline_temp = self.calibrate_baseline()
        self.hourly_temps = []

    def process_temperature_reading(self, temp, timestamp):
        """Process each temperature reading (every 5 minutes)"""
        self.hourly_temps.append(temp)

        # Keep last 12 readings (1 hour)
        if len(self.hourly_temps) > 12:
            self.hourly_temps.pop(0)

        # Check for fever pattern
        if len(self.hourly_temps) >= 6:  # At least 30 minutes of data
            fever_detected, confidence = self.detect_fever()

            if fever_detected and confidence > 0.8:
                self.send_alert(temp, confidence)
                # Log to phone app for physician review
                self.log_to_cloud_when_connected()

    def detect_fever(self):
        """Lightweight on-device fever detection"""
        # Simple but effective: Check if sustained elevation above baseline
        mean_temp = np.mean(self.hourly_temps)
        std_temp = np.std(self.hourly_temps)

        # Fever if:
        # 1. Mean temp > baseline + 1°C
        # 2. Low variance (sustained, not spike)
        fever = mean_temp > (self.baseline_temp + 1.0) and std_temp < 0.5
        # Clamp to [0, 1] so readings below baseline never give negative confidence
        confidence = max(0.0, min((mean_temp - self.baseline_temp) / 2.0, 1.0))

        return fever, confidence

17.5.3.3 3. Real-Time Syndromic Surveillance

Edge AI on mobile phones for symptom tracking:

  • Input: User-reported symptoms via app
  • Processing: On-device NLP to extract symptoms
  • Output: Aggregated signals (no PII sent to server)
  • Privacy: Individual data never leaves device
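The pipeline in the list above can be sketched as follows. This is a deliberately simple stand-in for on-device NLP: it uses keyword patterns instead of a trained model, does not handle negation ("no fever" would still count), and the lexicon, function names, and region code are all hypothetical. A real deployment would also need secure aggregation or added noise, since counts from a single device are not anonymous by themselves.

```python
import re
from collections import Counter

# Hypothetical symptom lexicon; a production app would use a validated vocabulary
SYMPTOM_PATTERNS = {
    "fever": r"\b(fever|febrile|high temperature)\b",
    "cough": r"\b(cough|coughing)\b",
    "diarrhea": r"\b(diarrhea|diarrhoea|loose stools?)\b",
}

def extract_symptoms(text):
    """Return the set of symptom categories mentioned in free text."""
    text = text.lower()
    return {name for name, pattern in SYMPTOM_PATTERNS.items()
            if re.search(pattern, text)}

def build_report(entries, region_code):
    """Aggregate symptom counts; the raw text is never included in the report."""
    counts = Counter()
    for entry in entries:
        counts.update(extract_symptoms(entry))
    return {"region": region_code, "counts": dict(counts)}

entries = [
    "I have had a fever and a bad cough since Monday",
    "Coughing all night",
    "Feeling fine today",
]
report = build_report(entries, region_code="REGION-01")
print(report)
```

Only the `report` dictionary would be transmitted. The free-text entries stay on the phone.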

17.5.4 Case Study: Project Premonition (Microsoft Research)

Monitoring zoonotic disease emergence

Components:

  • Smart mosquito trap: Edge AI for species identification
  • On-device processing: Classify mosquito species from images
  • Selective sampling: Only upload images of disease vectors
  • Network: Deploy in remote areas with poor connectivity

Results:

  • 99% reduction in data transmission (only relevant images sent)
  • Real-time identification without expert entomologist
  • Scalable to resource-limited settings


17.6 5. Quantum Computing for Public Health

17.6.1 What is Quantum Computing?

  • Classical bit: 0 or 1
  • Quantum bit (qubit): A superposition of 0 and 1 until it is measured

Key principles:

  • Superposition: A register of n qubits is described by 2^n amplitudes at once, though a measurement returns only one outcome
  • Entanglement: Qubits correlated in ways impossible classically
  • Quantum advantage: Large (sometimes exponential) speedups, but only for specific problems with suitable algorithms
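Superposition and entanglement can be illustrated with plain linear algebra, without a quantum SDK. The sketch below simulates one and two qubits as state vectors in NumPy; the variable names are illustrative, and the approach only works for a handful of qubits because the vector length doubles with each qubit added.

```python
import numpy as np

# Single-qubit basis state |0> and the Hadamard gate
zero = np.array([1.0, 0.0])
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# CNOT on two qubits (first qubit is the control), basis order |00>, |01>, |10>, |11>
CNOT = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Superposition: Hadamard puts the qubit in an equal mix of 0 and 1
plus = H @ zero
probs_single = np.abs(plus) ** 2

# Entanglement: superpose the first qubit, then apply CNOT to the pair
two_qubits = np.kron(plus, zero)
bell = CNOT @ two_qubits
probs_bell = np.abs(bell) ** 2

print("Single qubit P(0), P(1):", probs_single)
print("Two qubits P(00), P(01), P(10), P(11):", probs_bell)
```

In the entangled state only 00 and 11 are ever observed: measuring one qubit fixes the other, even though each outcome is individually random.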

17.6.2 Current Status (TRL 3-4)

Available quantum computers:

  • IBM Quantum: Up to 433 qubits (Osprey, announced 2022)
  • Google: 53 qubits (Sycamore, claimed quantum supremacy 2019)
  • Rigetti: 80 qubits
  • IonQ: 32 qubits (trapped ion)

Reality check: Still noisy, error-prone, limited to small problems. But improving rapidly.


17.6.3 Potential Applications in Public Health

17.6.3.1 1. Epidemic Simulation

Problem: Simulate disease spread through network of 1M people with 10 interactions each.

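Classical approach:

  • State space: 2^1,000,000 possible configurations
  • Computing the exact probability of every configuration is intractable even for supercomputers, so classical models rely on sampling (e.g., agent-based Monte Carlo simulation)

Quantum approach:

  • Encode population in qubits
  • Quantum gates simulate interactions
  • Measure to sample likely outcomes
  • Possibly efficient for certain classes of epidemic models, though no such advantage has been demonstrated yet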

Orus et al., 2019, arXiv - “Quantum computing for finance”

Example (conceptual):

# Using IBM Qiskit for epidemic modeling (simplified)
# Assumes Qiskit 1.x with the separate qiskit-aer package installed
import numpy as np
from qiskit import QuantumCircuit, QuantumRegister, ClassicalRegister, transpile
from qiskit_aer import AerSimulator

def create_epidemic_circuit(n_individuals, transmission_prob):
    """
    Create quantum circuit for epidemic simulation
    Each qubit represents one individual (0=susceptible, 1=infected)
    """
    # Create quantum and classical registers
    qr = QuantumRegister(n_individuals, 'person')
    cr = ClassicalRegister(n_individuals, 'outcome')
    circuit = QuantumCircuit(qr, cr)

    # Initialize: patient zero infected
    circuit.x(qr[0])

    # Rotation angle chosen so that an infected person infects the next one
    # with probability transmission_prob: sin^2(theta / 2) = p
    theta = 2 * np.arcsin(np.sqrt(transmission_prob))

    for i in range(n_individuals - 1):
        # Controlled rotation: infected person can infect susceptible neighbor
        circuit.cry(theta, qr[i], qr[i+1])

    # Measure all individuals
    circuit.measure(qr, cr)

    return circuit

# Run simulation on a local classical simulator
circuit = create_epidemic_circuit(n_individuals=10, transmission_prob=0.3)
backend = AerSimulator()
compiled = transpile(circuit, backend)  # Map gates to those the simulator supports
job = backend.run(compiled, shots=1000)
result = job.result()
counts = result.get_counts()

# Analyze results
print("Epidemic outcomes:")
for outcome, count in sorted(counts.items(), key=lambda x: x[1], reverse=True)[:5]:
    infected_count = outcome.count('1')
    probability = count / 1000
    print(f"  {infected_count} infected: {probability:.1%}")

Note: This is a simplified illustration. Practical quantum epidemiology models don’t exist yet (TRL 2-3).
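It helps to see how easily a classical computer handles the same toy model. The sketch below assumes the chain structure of the circuit above (each infected individual can infect only the next one, with probability p) and compares a Monte Carlo simulation with the closed-form distribution. The function names are hypothetical.

```python
import random

def simulate_chain(n_individuals, transmission_prob, rng):
    """Simulate one outbreak along a chain; return the number infected."""
    infected = 1  # Patient zero
    for _ in range(n_individuals - 1):
        if rng.random() < transmission_prob:
            infected += 1
        else:
            break  # Transmission failed, so the chain stops here
    return infected

def exact_distribution(n_individuals, p):
    """Closed-form probability of each final outbreak size."""
    dist = {k: p ** (k - 1) * (1 - p) for k in range(1, n_individuals)}
    dist[n_individuals] = p ** (n_individuals - 1)  # Everyone infected
    return dist

rng = random.Random(0)
shots = 10_000
sizes = [simulate_chain(10, 0.3, rng) for _ in range(shots)]
exact = exact_distribution(10, 0.3)

for k in range(1, 5):
    simulated = sizes.count(k) / shots
    print(f"  {k} infected: simulated {simulated:.1%}, exact {exact[k]:.1%}")
```

A model this simple gains nothing from quantum hardware. Any future advantage would have to come from models whose structure makes classical sampling genuinely hard.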

17.6.3.2 2. Drug Discovery Optimization

Problem: Find optimal drug candidate from 10^60 possible molecules.

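Quantum advantage: Grover’s algorithm can search an unsorted database of N items in about √N steps versus N steps classically. This is a quadratic, not exponential, speedup: for N = 10^60 it still requires on the order of 10^30 steps, so nearer-term value is more likely to come from quantum simulation of molecular chemistry.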

Cao et al., 2018, Chemical Reviews - “Quantum chemistry in the age of quantum computing”
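The scale of the search problem is easy to check with a few lines of arithmetic. The sketch below uses the standard estimate of about (π/4)·√N Grover iterations; the rate of one billion oracle queries per second is an assumption chosen only to give a sense of scale, not a property of any real device.

```python
import math

def grover_iterations(n_items):
    """Approximate number of Grover iterations needed to search n_items."""
    return math.floor(math.pi / 4 * math.sqrt(n_items))

n_molecules = 10 ** 60
quantum_queries = grover_iterations(n_molecules)

# Assumed (hypothetical) rate of 1e9 oracle queries per second
seconds = quantum_queries / 1e9
years = seconds / (365.25 * 24 * 3600)

print(f"Classical worst case: {n_molecules:.1e} queries")
print(f"Grover: {quantum_queries:.1e} queries")
print(f"Time at 1e9 queries/s: {years:.1e} years")
```

Even with the quadratic speedup, exhaustive search over the full chemical space stays out of reach, so unstructured search is not by itself a path to faster drug discovery.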

17.6.3.3 3. Optimization Problems

Vaccine distribution:

  • 1,000 clinics
  • 10,000 routes
  • Minimize cost + maximize coverage + prioritize high-risk areas

Quantum annealing: D-Wave systems already used for logistics optimization.
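Quantum annealers take problems expressed as a QUBO (quadratic unconstrained binary optimization). The sketch below shows that formulation for a toy version of the allocation problem: pick exactly k clinics to supply so that total benefit is maximized. The benefit scores, penalty weight, and function names are assumptions, and the problem is solved here by classical brute force rather than on annealing hardware.

```python
import itertools
import numpy as np

def build_qubo(benefits, k, penalty):
    """QUBO for: maximize total benefit subject to choosing exactly k clinics.

    Minimizes -sum(b_i * x_i) + penalty * (sum(x_i) - k)^2, dropping the constant term.
    """
    n = len(benefits)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = -benefits[i] + penalty * (1 - 2 * k)
        for j in range(i + 1, n):
            Q[i, j] = 2 * penalty
    return Q

def solve_brute_force(Q):
    """Try every binary assignment; feasible only for very small problems."""
    n = Q.shape[0]
    best_x, best_energy = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        energy = float(x @ Q @ x)
        if energy < best_energy:
            best_x, best_energy = bits, energy
    return best_x, best_energy

# Hypothetical benefit scores (e.g., coverage weighted by local risk) for 4 clinics
benefits = [5.0, 3.0, 8.0, 2.0]
Q = build_qubo(benefits, k=2, penalty=10.0)
best_x, best_energy = solve_brute_force(Q)
print("Selected clinics:", [i for i, bit in enumerate(best_x) if bit])
```

Brute force doubles in cost with every added clinic, which is why heuristic solvers, classical or quantum, are used at realistic scale. The penalty weight must exceed the largest single benefit, otherwise violating the "exactly k" constraint becomes profitable.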


17.6.4 When Will This Be Practical?

Realistic timeline:

  • 2025-2027: Small-scale demonstrations (10-100 qubits, low error)
  • 2028-2032: Early adopters use quantum-classical hybrid algorithms
  • 2033-2040: Practical quantum advantage for specific public health problems

Prepare now by:

  • Learning quantum algorithms fundamentals
  • Identifying problems with potential quantum speedup
  • Partnering with quantum computing companies
  • Training workforce in quantum programming (Qiskit, Cirq)


17.7 6. Digital Twins for Population Health

17.7.1 What are Digital Twins?

Definition: Virtual replica of a physical entity (person, hospital, city) that:

  • Mirrors real-time state
  • Simulates responses to interventions
  • Enables “what-if” scenario testing without risk

Origin: Aerospace (NASA, simulate spacecraft systems)

Evolution: Manufacturing → Healthcare → Population health


17.7.2 Types of Health Digital Twins

17.7.2.1 1. Individual Digital Twin

Virtual model of a single person’s health:

  • Inputs: EHR, genomics, wearables, imaging
  • Model: Physiological simulation + ML predictions
  • Outputs: Personalized risk scores, treatment recommendations

Niederer et al., 2021, Nature Reviews Cardiology

Example: Cardiovascular Digital Twin

class CardiovascularDigitalTwin:
    """
    Individual-level digital twin for cardiovascular health
    Integrates multiple data sources and predicts outcomes
    """

    def __init__(self, patient_id):
        self.patient_id = patient_id
        self.load_patient_data()

    def load_patient_data(self):
        """Load all available patient data"""
        self.demographics = load_ehr_demographics(self.patient_id)
        self.vitals = load_continuous_vitals(self.patient_id)  # From wearables
        self.labs = load_lab_results(self.patient_id)
        self.imaging = load_cardiac_imaging(self.patient_id)
        self.genetics = load_genomic_data(self.patient_id)

    def simulate_intervention(self, intervention, duration_days=365):
        """
        Simulate effect of intervention over time

        Args:
            intervention: dict with type and parameters
            duration_days: simulation duration

        Returns:
            Predicted outcomes at checkpoints
        """
        outcomes = []

        # Initialize state
        current_state = self.get_current_state()

        for day in range(duration_days):
            # Apply intervention effect ('combined' applies both components)
            if intervention['type'] in ('medication', 'combined'):
                med = intervention.get('medication', intervention)
                current_state = self.apply_medication_effect(
                    current_state,
                    med['drug'],
                    med['dose']
                )
            if intervention['type'] in ('lifestyle', 'combined'):
                changes = intervention.get('changes', intervention.get('lifestyle'))
                current_state = self.apply_lifestyle_change(
                    current_state,
                    changes
                )

            # Simulate physiological evolution
            current_state = self.advance_one_day(current_state)

            # Record outcomes at monthly checkpoints
            if day % 30 == 0:
                outcomes.append({
                    'day': day,
                    'blood_pressure': current_state['bp'],
                    'cholesterol': current_state['ldl'],
                    'cvd_risk': self.calculate_cvd_risk(current_state),
                    'quality_of_life': self.estimate_qol(current_state)
                })

        return outcomes

    def compare_treatments(self, treatment_options):
        """Compare multiple treatment strategies"""
        results = {}

        for name, intervention in treatment_options.items():
            outcomes = self.simulate_intervention(intervention)
            results[name] = {
                'final_cvd_risk': outcomes[-1]['cvd_risk'],
                # Baseline risk is computed from the current state, which has no 'cvd_risk' key
                'risk_reduction': self.calculate_cvd_risk(self.get_current_state()) - outcomes[-1]['cvd_risk'],
                'side_effects': self.estimate_side_effects(intervention),
                'cost': self.estimate_annual_cost(intervention)
            }

        return results

# Usage example
twin = CardiovascularDigitalTwin(patient_id="PAT123456")

# Define treatment options
treatments = {
    'statin_only': {
        'type': 'medication',
        'drug': 'atorvastatin',
        'dose': '40mg'
    },
    'statin_plus_lifestyle': {
        'type': 'combined',
        'medication': {'drug': 'atorvastatin', 'dose': '20mg'},
        'lifestyle': {'diet': 'mediterranean', 'exercise': '150min/week'}
    },
    'lifestyle_only': {
        'type': 'lifestyle',
        'changes': {'diet': 'mediterranean', 'exercise': '150min/week'}
    }
}

# Compare treatments
comparison = twin.compare_treatments(treatments)

# Print the comparison to support treatment selection
for treatment, results in comparison.items():
    print(f"{treatment}:")
    print(f"  CVD risk reduction: {results['risk_reduction']:.1%}")
    print(f"  Annual cost: ${results['cost']:,.0f}")
    print()

17.7.2.2 2. Hospital Digital Twin

Virtual replica of hospital operations:

  • Model: Patient flow, resource utilization, staff scheduling
  • Simulate: Surge capacity, new protocols, facility changes
  • Optimize: Bed allocation, OR scheduling, supply chain

Croatti et al., 2020, IEEE Access

17.7.2.3 3. City/Population Digital Twin

Urban health system simulation:

  • Inputs: Demographics, mobility, social networks, healthcare access
  • Model: Disease transmission, healthcare demand, intervention effects
  • Applications: Pandemic preparedness, health equity, resource allocation

Example: Singapore Virtual Twin

Virtual Singapore - 3D city model with:

  • Real-time sensor data
  • Population movement patterns
  • Healthcare facility locations
  • Environmental conditions

Use case: Simulate dengue outbreak response strategies
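The "what-if" logic of a population twin can be sketched with a basic compartmental model. The example below is a deliberately simplified SIR simulation with daily time steps. Dengue is vector-borne, so a realistic twin would model mosquitoes, mobility, and geography explicitly. All parameter values and the function name are assumptions for illustration only.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model; returns final compartments and peak infections."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    peak = i
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return {"s": s, "i": i, "r": r, "peak_infected": peak}

# Hypothetical scenarios: the intervention lowers the transmission rate
scenarios = {
    "no_intervention": {"beta": 0.40},
    "vector_control": {"beta": 0.25},
}

results = {
    name: simulate_sir(population=1_000_000, initial_infected=10,
                       beta=params["beta"], gamma=0.1, days=365)
    for name, params in scenarios.items()
}

for name, outcome in results.items():
    print(f"{name}: peak infected = {outcome['peak_infected']:,.0f}")
```

Running several scenarios against the same virtual population is the core of the digital-twin idea. Fidelity comes from replacing these assumed parameters with calibrated, continuously updated data.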


17.7.3 Challenges and Future Directions

Technical challenges:

  • Model accuracy: Digital twins only as good as underlying models
  • Data integration: Combining EHR, genomics, wearables, imaging
  • Computational cost: Real-time simulation of complex physiological systems
  • Validation: How to validate predictions without real-world experimentation?

Ethical challenges:

  • Consent: Do patients understand their digital twin is being used?
  • Access inequality: Will only wealthy have digital twins?
  • Determinism: Risk of treating predictions as certain outcomes
  • Unintended consequences: Optimization for one outcome may harm another


17.8 7. AI-Accelerated Drug and Vaccine Development

17.8.1 Traditional Timeline vs. AI-Accelerated

Traditional drug development: 10-15 years, $2.6B average cost

| Phase | Traditional | With AI | Improvement |
|---|---|---|---|
| Target identification | 2-3 years | 6-12 months | 3-4x faster |
| Lead discovery | 3-6 years | 1-2 years | 2-3x faster |
| Preclinical testing | 1-2 years | 6-12 months | 1.5-2x faster |
| Clinical trials | 6-7 years | 4-5 years | 1.2-1.5x faster |
| Total | 12-18 years | 6-9 years | 2x faster |
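The totals in the table follow from summing the per-phase ranges. The short check below reproduces that arithmetic and also shows where the remaining time goes. The dictionary layout and variable names are just one way to organize the numbers.

```python
# (low, high) duration in years for each phase, copied from the table
phases = {
    "Target identification": {"traditional": (2, 3), "with_ai": (0.5, 1)},
    "Lead discovery":        {"traditional": (3, 6), "with_ai": (1, 2)},
    "Preclinical testing":   {"traditional": (1, 2), "with_ai": (0.5, 1)},
    "Clinical trials":       {"traditional": (6, 7), "with_ai": (4, 5)},
}

def total(kind):
    """Sum the low and high ends of every phase for one scenario."""
    low = sum(p[kind][0] for p in phases.values())
    high = sum(p[kind][1] for p in phases.values())
    return low, high

traditional_total = total("traditional")
with_ai_total = total("with_ai")

# Share of the AI-accelerated timeline (low end) spent in clinical trials
clinical_share = phases["Clinical trials"]["with_ai"][0] / with_ai_total[0]

print("Traditional total (years):", traditional_total)
print("With AI total (years):", with_ai_total)
print(f"Clinical trials share of AI-accelerated timeline: {clinical_share:.0%}")
```

The calculation makes one point explicit: even with the table's optimistic speedups, clinical trials account for most of the remaining timeline, so gains in early discovery alone cannot halve development time.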

17.8.2 AI Techniques in Drug Discovery

17.8.2.1 1. Molecular Generation

Problem: Design molecules with desired properties (binds target, low toxicity, synthesizable)

AI approach: Generative models (VAE, GAN, Diffusion, Transformers)

Jiménez-Luna et al., 2021, Nature Machine Intelligence

Example: Using Transformer for molecule generation

# Simplified example using molecular SMILES notation
from rdkit import Chem
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Placeholder checkpoint so the example loads. Plain GPT-2 was not trained on
# chemistry and will rarely emit valid SMILES; in practice, swap in a model
# trained on molecules (e.g., MolGPT or a ChemBERTa-style model).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def is_valid_smiles(smiles):
    """Return True if RDKit can parse the string into a molecule."""
    return bool(smiles) and Chem.MolFromSmiles(smiles) is not None

def has_desired_properties(smiles, target_properties):
    """Placeholder filter: a real pipeline would call trained property
    predictors (binding, toxicity, bioavailability) here."""
    return True

def generate_drug_candidate(target_properties):
    """
    Generate candidate drug molecules with desired properties

    Args:
        target_properties: dict with keys like:
            - target_protein: "SARS-CoV-2 spike protein"
            - properties: ["high_binding_affinity", "low_toxicity", "oral_bioavailability"]

    Returns:
        List of SMILES strings that pass the validity and property filters
    """
    prompt = f"Generate molecule that binds to {target_properties['target_protein']} with properties: {', '.join(target_properties['properties'])}. SMILES:"

    # Generate
    inputs = tokenizer(prompt, return_tensors='pt')
    output = model.generate(
        **inputs,
        max_new_tokens=60,  # budget for generated tokens, independent of prompt length
        num_return_sequences=10,
        temperature=0.8,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

    # Decode only the newly generated tokens (drop the prompt)
    prompt_length = inputs['input_ids'].shape[1]
    molecules = [tokenizer.decode(seq[prompt_length:], skip_special_tokens=True).strip() for seq in output]

    # Filter for valid SMILES and desired properties
    valid_molecules = [m for m in molecules if is_valid_smiles(m) and has_desired_properties(m, target_properties)]

    return valid_molecules

# Example: Generate COVID-19 antiviral candidates
candidates = generate_drug_candidate({
    'target_protein': 'SARS-CoV-2 main protease',
    'properties': ['inhibits_protease', 'oral_bioavailable', 'low_toxicity']
})

print(f"Generated {len(candidates)} candidate molecules")
for i, smiles in enumerate(candidates[:3], 1):
    print(f"Candidate {i}: {smiles}")

17.8.2.2 2. Binding Affinity Prediction

Problem: Will drug bind to target protein strongly enough?

AI approach: Graph neural networks on 3D protein-ligand structures

Stärk et al., 2022, ICML (EquiBind); Lu et al., 2022, NeurIPS (TANKBind)

import torch
from torch_geometric.nn import GCNConv, global_mean_pool

class ProteinLigandBindingPredictor(torch.nn.Module):
    """
    Graph neural network to predict protein-ligand binding affinity
    """

    def __init__(self, num_node_features):
        super().__init__()
        self.conv1 = GCNConv(num_node_features, 128)
        self.conv2 = GCNConv(128, 64)
        self.conv3 = GCNConv(64, 32)
        self.fc = torch.nn.Linear(32, 1)

    def forward(self, data):
        x, edge_index, batch = data.x, data.edge_index, data.batch

        # Graph convolutions
        x = self.conv1(x, edge_index)
        x = torch.relu(x)
        x = self.conv2(x, edge_index)
        x = torch.relu(x)
        x = self.conv3(x, edge_index)

        # Global pooling
        x = global_mean_pool(x, batch)

        # Predict binding affinity
        binding_affinity = self.fc(x)

        return binding_affinity

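# Usage
model = ProteinLigandBindingPredictor(num_node_features=75)
model.eval()  # untrained here; train on measured affinities before trusting outputs

# Load protein-ligand complex as a torch_geometric Data object.
# load_complex_as_graph is a hypothetical helper (not part of torch_geometric)
# that must return node features (x) and edge_index for the complex.
protein_ligand_graph = load_complex_as_graph("spike_protein_drug_complex.pdb")

# Predict binding affinity
with torch.no_grad():
    predicted_affinity = model(protein_ligand_graph)
print(f"Predicted binding affinity: {predicted_affinity.item():.2f} kcal/mol")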

17.8.2.3 3. Toxicity Prediction

Problem: Eliminate toxic compounds early, because late-stage failures are very expensive

AI approach: Multi-task learning on toxicity endpoints

Mayr et al., 2018, Chemical Research in Toxicology

Toxicity endpoints: - Hepatotoxicity (liver damage) - Cardiotoxicity (heart issues) - Mutagenicity (DNA damage) - hERG inhibition (cardiac arrhythmia)


17.8.3 Success Stories

17.8.3.1 1. AlphaFold (DeepMind)

Jumper et al., 2021, Nature

  • Problem: Protein structure prediction from amino acid sequence
  • Traditional: X-ray crystallography (months-years, expensive)
  • AlphaFold: Computational prediction in minutes (atomic accuracy)
  • Impact: 200M+ protein structures predicted, openly available

Public health relevance: - Understand how viruses evolve (SARS-CoV-2 variants) - Design vaccines targeting conserved regions - Identify drug targets in neglected tropical diseases

17.8.3.2 2. Moderna mRNA COVID-19 Vaccine

AI-accelerated timeline:

  • January 11, 2020: SARS-CoV-2 sequence published
  • January 13, 2020: Moderna designed vaccine (AI selected mRNA sequence)
  • February 24, 2020: First clinical batch shipped to NIH
  • March 16, 2020: First human dose (Phase I)
  • December 18, 2020: FDA emergency use authorization

AI contributions: - Optimize mRNA sequence for protein expression - Predict immune response - Design clinical trials efficiently

17.8.3.3 3. Insilico Medicine: INS018_055

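First AI-discovered, AI-designed drug to reach Phase II clinical trials (Phase II began in 2023)

  • Target: Idiopathic pulmonary fibrosis
  • Timeline: About 18 months from target discovery to preclinical candidate nomination, and under 30 months to Phase I, as reported by the company (vs. typical 4-5 years)
  • Cost: ~$2.6M to reach a preclinical candidate, as reported by the company (vs. typical $20-40M)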
  • Method: Generative chemistry + reinforcement learning

17.8.4 Emerging Technologies: AI Lab Automation

Closed-loop discovery: AI designs experiments → robots perform → AI analyzes results → repeat

Burger et al., 2020, Nature - Mobile robot chemist

Components: 1. AI: Bayesian optimization to select next experiment 2. Robotics: Automated synthesis and testing 3. Analysis: High-throughput screening, mass spec, NMR 4. Iteration: Continuously improve based on results

Impact: 24/7 experimentation, 100x more experiments/year
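The design, run, analyze, repeat loop can be sketched in a few lines. To stay dependency-free, this sketch swaps Bayesian optimization for a simpler upper-confidence-bound (UCB) rule over a fixed list of candidate conditions. The function `closed_loop_search`, the `simulated_assay` stand-in for the robot, and the yield values are all hypothetical.

```python
import math
import random

def closed_loop_search(candidates, run_experiment, n_rounds, exploration=1.0):
    """Pick the next experiment with a UCB rule, run it, update, repeat."""
    counts = {c: 0 for c in candidates}
    means = {c: 0.0 for c in candidates}
    for t in range(1, n_rounds + 1):
        untried = [c for c in candidates if counts[c] == 0]
        if untried:
            choice = untried[0]                      # try every condition once
        else:
            # Favor conditions with high observed mean or few trials so far
            choice = max(candidates, key=lambda c: means[c]
                         + exploration * math.sqrt(math.log(t) / counts[c]))
        result = run_experiment(choice)              # the "robot" step
        counts[choice] += 1
        means[choice] += (result - means[choice]) / counts[choice]   # running mean
    return max(candidates, key=lambda c: means[c]), counts

# Hypothetical stand-in for a robotic assay: noisy yield at each temperature (°C)
rng = random.Random(1)
true_yield = {20: 0.2, 40: 0.5, 60: 0.8, 80: 0.6}

def simulated_assay(temperature):
    return true_yield[temperature] + rng.gauss(0, 0.05)

best, counts = closed_loop_search(list(true_yield), simulated_assay, n_rounds=60)
print("Best condition:", best)
print("Experiments per condition:", counts)
```

The loop spends most of its budget on promising conditions while still sampling the others occasionally, which is the behavior that lets an automated lab explore a large design space efficiently.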


17.9 8. Causal AI: Beyond Correlation

17.9.1 The Limitations of Predictive AI

Current AI: Excellent at finding correlations

Problem: Correlation ≠ Causation

Example: - Observation: Ice cream sales correlate with drowning deaths - Predictive AI: “High ice cream sales → High drowning risk” - Causality: Both caused by hot weather (confounding)

Public health consequences: - Wrong intervention: Ban ice cream → Drownings continue - Right intervention: Increase lifeguard patrols on hot days
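A small simulation makes the confounding visible. In the sketch below, temperature drives both ice cream sales and drownings, and neither affects the other. All coefficients and noise levels are made up for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs)
                           * sum((y - my) ** 2 for y in ys))

rng = random.Random(42)
days = []
for _ in range(2000):
    temp = rng.uniform(10, 35)                  # confounder: daily temperature (°C)
    ice_cream = 5 * temp + rng.gauss(0, 10)     # sales depend on temperature only
    drownings = 0.2 * temp + rng.gauss(0, 1)    # drownings depend on temperature only
    days.append((temp, ice_cream, drownings))

# Correlation across all days: strong, despite no causal link
overall = pearson([d[1] for d in days], [d[2] for d in days])

# Hold temperature roughly fixed by looking only at 24-26 °C days
mild_days = [d for d in days if 24 <= d[0] < 26]
within = pearson([d[1] for d in mild_days], [d[2] for d in mild_days])

print(f"Correlation across all days: {overall:.2f}")
print(f"Correlation on 24-26 °C days: {within:.2f}")
```

Once temperature is held roughly constant, the association largely disappears, which is what we expect when a third variable explains the correlation.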


17.9.2 What is Causal AI?

Goal: Understand cause-effect relationships, enabling: 1. Counterfactual reasoning: “What would have happened if…?” 2. Intervention planning: “What should we do to achieve X?” 3. Explanation: “Why did Y happen?”

Foundations: Pearl, 2009, Causality - Causal inference with directed acyclic graphs (DAGs)


17.9.3 The Causal Hierarchy

Pearl & Mackenzie, 2018, The Book of Why

| Level | Question | Example | AI Capability |
|---|---|---|---|
| Association | What if I see X? | Patients taking drug Y have better outcomes | ✅ Standard ML |
| Intervention | What if I do X? | What if we mandate drug Y? | ⚠️ Requires causal models |
| Counterfactual | What if I had done X? | Would patient survive if given drug Y instead of Z? | ❌ Hardest, requires causal model + assumptions |
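The gap between the first two levels ("seeing" vs. "doing") can be demonstrated with a simulated population. In this sketch, healthier patients are more likely to receive the drug, so the observed association overstates the drug's true effect. The data-generating numbers (a true effect of +0.10 on recovery probability) are assumptions built into the simulation.

```python
import random

def simulate(n, policy, seed=0):
    """policy=None: observational (clinicians choose); 0 or 1: do(drug=policy)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        healthy = rng.random() < 0.5                   # unobserved confounder
        if policy is None:
            # Healthier patients are more likely to be given the drug
            drug = rng.random() < (0.8 if healthy else 0.2)
        else:
            drug = bool(policy)                        # intervention overrides choice
        p_recover = 0.3 + 0.4 * healthy + 0.1 * drug   # true drug effect: +0.10
        rows.append((drug, rng.random() < p_recover))
    return rows

def recovery_rate(rows, drug_value):
    selected = [recovered for drug, recovered in rows if drug == drug_value]
    return sum(selected) / len(selected)

# Level 1 (association): compare treated and untreated as observed
observed = simulate(50_000, policy=None)
association = recovery_rate(observed, True) - recovery_rate(observed, False)

# Level 2 (intervention): assign the drug to everyone vs. no one
everyone = simulate(50_000, policy=1, seed=1)
no_one = simulate(50_000, policy=0, seed=2)
intervention = recovery_rate(everyone, True) - recovery_rate(no_one, False)

print(f"Observed difference (seeing): {association:.2f}")
print(f"Interventional difference (doing): {intervention:.2f}")
```

A standard predictive model trained on the observational rows would learn the inflated association. Only the interventional comparison recovers the effect that a drug mandate would actually deliver.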

17.9.4 Causal Inference Techniques

17.9.4.1 1. Randomized Controlled Trials (RCTs)

Gold standard: Randomly assign treatment → Breaks confounding

But: Expensive, slow, sometimes unethical

AI role: Design trials efficiently, predict who benefits most

17.9.4.2 2. Instrumental Variables

Method: Find a variable (the instrument) that affects treatment, is unrelated to unmeasured confounders, and affects the outcome only through treatment

Example: Geographic distance to specialist - Affects likelihood of treatment - Doesn’t directly affect outcome (except via treatment)

from econml.iv.dml import DMLIV
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
import numpy as np

np.random.seed(0)  # reproducible synthetic data

# Synthetic data
n = 10000
# Instrumental variable: distance to HIV specialist
distance = np.random.uniform(0, 100, n)
# Confounders: age, comorbidities
confounders = np.random.randn(n, 5)
# Treatment: receiving antiretroviral therapy
treatment = (distance < 50).astype(int) + 0.2 * confounders[:, 0] + np.random.randn(n) * 0.1
treatment = (treatment > 0.5).astype(int)
# Outcome: viral load suppression (true treatment effect = 0.5)
outcome = 0.5 * treatment + 0.3 * confounders[:, 0] + np.random.randn(n) * 0.1

# Estimate causal effect using instrumental variables
est = DMLIV(
    model_y_xw=RandomForestRegressor(),
    model_t_xw=RandomForestClassifier(),
    model_t_xwz=RandomForestClassifier(),
    discrete_treatment=True  # binary treatment, so the treatment models are classifiers
)

est.fit(Y=outcome, T=treatment, X=confounders, Z=distance.reshape(-1, 1), W=None)

# Estimate treatment effect
treatment_effect = est.effect(confounders)
print(f"Average treatment effect: {treatment_effect.mean():.3f}")
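The logic behind instrumental variables can also be checked by hand with the Wald estimator: the instrument's effect on the outcome divided by its effect on treatment. The sketch below uses its own simulated data with an unmeasured confounder `u` and a binary instrument `z` (for example, living near a specialist). The variable names and coefficients are illustrative assumptions.

```python
import random

rng = random.Random(7)
rows = []
for _ in range(100_000):
    u = rng.gauss(0, 1)                                   # unmeasured confounder
    z = rng.random() < 0.5                                # instrument
    t = (0.8 * z + 0.5 * u + rng.gauss(0, 0.5)) > 0.4     # treatment uptake
    y = 0.5 * t + 0.6 * u + rng.gauss(0, 0.5)             # true effect = 0.5
    rows.append((z, t, y))

def mean_where(index, flag, column):
    """Mean of one column over rows where rows[index] equals flag."""
    values = [row[column] for row in rows if row[index] == flag]
    return sum(values) / len(values)

# Naive comparison of treated vs. untreated is biased by u
naive = mean_where(1, True, 2) - mean_where(1, False, 2)

# Wald estimator: (effect of Z on Y) / (effect of Z on T)
reduced_form = mean_where(0, True, 2) - mean_where(0, False, 2)
first_stage = mean_where(0, True, 1) - mean_where(0, False, 1)
wald = reduced_form / first_stage

print(f"Naive estimate: {naive:.2f}")
print(f"Wald (IV) estimate: {wald:.2f}")
```

Because the instrument is independent of the confounder, the ratio recovers the true effect even though `u` is never observed. The estimate is only as good as the exclusion assumption: if distance affected outcomes through any path other than treatment, the ratio would be biased too.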

17.9.4.3 3. Propensity Score Matching

Idea: Match treated and untreated individuals with similar likelihood of treatment

from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors
import pandas as pd

def propensity_score_matching(data, treatment_col, outcome_col, covariate_cols):
    """
    Estimate treatment effect using 1:1 nearest-neighbor propensity score
    matching (with replacement)

    Args:
        data: DataFrame with patient data
        treatment_col: Name of treatment column (0/1)
        outcome_col: Name of outcome column
        covariate_cols: List of covariate columns

    Returns:
        Estimated average treatment effect on the treated (ATT)
    """
    data = data.copy()  # avoid mutating the caller's DataFrame

    # Estimate propensity scores
    X = data[covariate_cols]
    treatment = data[treatment_col]

    ps_model = LogisticRegression(max_iter=1000)
    ps_model.fit(X, treatment)
    data['propensity_score'] = ps_model.predict_proba(X)[:, 1]

    # Match treated to control
    treated = data[data[treatment_col] == 1]
    control = data[data[treatment_col] == 0]

    # Find nearest neighbor matches
    nn = NearestNeighbors(n_neighbors=1, metric='euclidean')
    nn.fit(control[['propensity_score']])

    distances, indices = nn.kneighbors(treated[['propensity_score']])

    # Calculate treatment effect: each treated patient vs. their matched control.
    # Matching treated units to controls estimates the effect among the treated,
    # not the population-wide average treatment effect.
    treated_outcomes = treated[outcome_col].values
    matched_control_outcomes = control.iloc[indices.flatten()][outcome_col].values

    att = (treated_outcomes - matched_control_outcomes).mean()

    return att

# Example: Effect of vaccination on hospitalization
# patient_data is assumed to be an already-loaded DataFrame with these columns
att = propensity_score_matching(
    data=patient_data,
    treatment_col='vaccinated',
    outcome_col='hospitalized',
    covariate_cols=['age', 'comorbidities', 'prior_infection']
)

# att is a difference in hospitalization probability, i.e., percentage points
print(f"Among the vaccinated, vaccination changes hospitalization risk by {att * 100:+.1f} percentage points")
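Matching should be followed by a balance check: if matched groups still differ on covariates, the effect estimate remains confounded. A common diagnostic is the standardized mean difference (SMD), with absolute values below roughly 0.1 often treated as acceptable. The sketch below computes it on made-up ages. The helper name and the data are illustrative.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(treated_values, control_values):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated_values) + variance(control_values)) / 2)
    return (mean(treated_values) - mean(control_values)) / pooled_sd

# Hypothetical ages, for illustration only
treated_age = [70, 65, 80, 75]
all_controls_age = [40, 50, 45, 60]        # before matching
matched_controls_age = [68, 66, 79, 76]    # after matching on propensity score

before = standardized_mean_difference(treated_age, all_controls_age)
after = standardized_mean_difference(treated_age, matched_controls_age)

print(f"SMD before matching: {before:.2f}")
print(f"SMD after matching: {after:.2f}")
```

Running this check for every covariate shows whether the propensity model has done its job before any treatment effect is reported.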

17.9.4.4 4. Causal Discovery from Data

Goal: Learn causal structure from observational data

Methods: - Constraint-based: PC algorithm, FCI - Score-based: GES, NOTEARS - Hybrid: Learn structure, then refine with domain knowledge

Spirtes et al., 2000, Causation, Prediction, and Search
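Constraint-based methods such as the PC algorithm rest on conditional independence tests. The sketch below shows the core step on a simulated chain (age → comorbidity → severity): age and severity are correlated, but the partial correlation given comorbidity is near zero, so a PC-style procedure would remove the direct age-severity edge. The data-generating coefficients are assumptions, and partial correlation is only one of several possible independence tests.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs)
                           * sum((y - my) ** 2 for y in ys))

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_zy = pearson(x, y), pearson(x, z), pearson(z, y)
    return (r_xy - r_xz * r_zy) / math.sqrt((1 - r_xz ** 2) * (1 - r_zy ** 2))

# Simulated chain: age -> comorbidity -> severity (standardized units)
rng = random.Random(3)
age = [rng.gauss(0, 1) for _ in range(5000)]
comorbidity = [0.8 * a + rng.gauss(0, 0.6) for a in age]
severity = [0.7 * c + rng.gauss(0, 0.7) for c in comorbidity]

marginal = pearson(age, severity)
conditional = partial_correlation(age, severity, comorbidity)

print(f"corr(age, severity): {marginal:.2f}")
print(f"corr(age, severity | comorbidity): {conditional:.2f}")
```

Independence tests can identify which edges to drop but often cannot orient the remaining ones, which is why discovered graphs are usually refined with domain knowledge.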

from causalnex.structure import StructureModel
from causalnex.structure.notears import from_pandas
import pandas as pd

# Learn causal structure from observational data
# Example: What causes COVID-19 severity?
covid_data = pd.DataFrame({
    'age': [...],
    'comorbidities': [...],
    'vaccination_status': [...],
    'viral_load': [...],
    'immune_response': [...],
    'severity': [...]
})

# Learn structure using NOTEARS algorithm
sm = from_pandas(covid_data)

# Visualize learned causal graph
sm.get_largest_subgraph().edges()
# Output: [('age', 'comorbidities'),
#          ('age', 'immune_response'),
#          ('comorbidities', 'severity'),
#          ('vaccination_status', 'immune_response'),
#          ('viral_load', 'severity'),
#          ('immune_response', 'severity')]

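# Refine with domain knowledge
if sm.has_edge('age', 'vaccination_status'):
    sm.remove_edge('age', 'vaccination_status')  # Age doesn't cause vaccination
sm.add_edge('vaccination_status', 'viral_load')  # Vaccine reduces viral load

# Estimate causal effects: fit a Bayesian network on the refined structure
# (CausalNex requires a DAG and discrete-valued columns)
from causalnex.network import BayesianNetwork
from causalnex.inference import InferenceEngine
bn = BayesianNetwork(sm)
bn = bn.fit_node_states(covid_data).fit_cpds(covid_data)
ie = InferenceEngine(bn)

# Query: Effect of vaccination on severity
ie.do_intervention('vaccination_status', 1)  # Intervene: set vaccination = 1
effect = ie.query()['severity']  # Distribution of severity under the intervention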

17.9.5 Applications in Public Health

17.9.5.1 1. Policy Evaluation

Question: Did mask mandates reduce COVID-19 transmission?

Challenge: States that implemented mandates differ from those that didn’t (confounding)

Causal approach: Difference-in-differences, synthetic control

Abadie & Gardeazabal, 2003, American Economic Review - Synthetic control methods
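Difference-in-differences compares the change over time in the group that adopted a policy with the change in a group that did not, under the assumption that both would have followed parallel trends without the policy. The numbers below are hypothetical weekly case rates, not real data.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical weekly cases per 100,000 population
mandate_states = {"pre": 120, "post": 90}       # adopted a mask mandate
comparison_states = {"pre": 100, "post": 95}    # did not

effect = difference_in_differences(
    mandate_states["pre"], mandate_states["post"],
    comparison_states["pre"], comparison_states["post"],
)
print(f"Estimated effect of mandate: {effect} cases per 100,000 per week")
```

A simple before-after comparison in mandate states would attribute the full drop of 30 to the policy. Subtracting the drop of 5 seen in comparison states removes the part explained by the shared background trend.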

17.9.5.2 2. Personalized Treatment

Question: Which patients benefit most from intervention X?

Causal ML: Heterogeneous treatment effects

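import numpy as np
from econml.dml import CausalForestDML

# Estimate individualized treatment effects
# outcomes, treatments, covariates: arrays from your study data (assumed loaded)
causal_forest = CausalForestDML()
causal_forest.fit(Y=outcomes, T=treatments, X=covariates)

# Predict treatment effect for new patient
# (same columns, in the same order, as covariates)
new_patient = np.array([[age, sex, comorbidities, ...]])
treatment_effect = causal_forest.effect(new_patient)[0]  # effect() returns one estimate per row

# threshold and the recommend_* functions are placeholders for clinical decision logic
if treatment_effect > threshold:
    recommend_treatment()
else:
    recommend_alternative()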
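The idea of heterogeneous treatment effects can be seen without any machine learning by estimating the effect separately within subgroups of a randomized trial. The sketch below simulates a trial in which the intervention helps older patients more. The effect sizes (+0.20 vs. +0.05) are assumptions built into the simulation. Causal forests generalize this idea by discovering the subgroups from many covariates.

```python
import random

rng = random.Random(11)
records = []
for _ in range(40_000):
    older = rng.random() < 0.5
    treated = rng.random() < 0.5                    # randomized assignment
    # Baseline 40% good outcomes; benefit depends on age group
    p_good = 0.4 + (0.20 if older else 0.05) * treated
    records.append((older, treated, rng.random() < p_good))

def subgroup_effect(is_older):
    """Difference in good-outcome rates (treated minus control) within one subgroup."""
    def rate(is_treated):
        outcomes = [good for older, treated, good in records
                    if older == is_older and treated == is_treated]
        return sum(outcomes) / len(outcomes)
    return rate(True) - rate(False)

effect_older = subgroup_effect(True)
effect_younger = subgroup_effect(False)

print(f"Estimated effect in older patients: {effect_older:.2f}")
print(f"Estimated effect in younger patients: {effect_younger:.2f}")
```

With observational rather than randomized data, the same subgroup comparison would also need confounding adjustment within each subgroup.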

17.9.5.3 3. Understanding Health Disparities

Question: Why do Black patients have worse COVID-19 outcomes?

Causal decomposition: - Direct effect of race (discrimination, bias) - Indirect effects via mediators (SES, comorbidities, healthcare access)

Method: Causal mediation analysis
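In the simplest linear setting, mediation analysis splits a total effect into a direct part and an indirect part that flows through a mediator (the product-of-coefficients approach). The sketch below uses a generic simulated exposure, mediator (for example, healthcare access), and outcome. It assumes linear relationships and no unmeasured confounding of the mediator-outcome relationship, and the coefficients are made up.

```python
import random

def cov(a, b):
    """Sample covariance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

rng = random.Random(5)
n = 20_000
exposure = [float(rng.random() < 0.5) for _ in range(n)]
mediator = [0.5 * x + rng.gauss(0, 1) for x in exposure]
outcome = [0.3 * x + 0.4 * m + rng.gauss(0, 1) for x, m in zip(exposure, mediator)]

sxx, smm, sxm = cov(exposure, exposure), cov(mediator, mediator), cov(exposure, mediator)
sxy, smy = cov(exposure, outcome), cov(mediator, outcome)

# Regression of mediator on exposure
a = sxm / sxx
# Regression of outcome on exposure and mediator (two-predictor least squares)
det = sxx * smm - sxm ** 2
direct = (sxy * smm - smy * sxm) / det     # exposure -> outcome, mediator held fixed
b = (smy * sxx - sxy * sxm) / det          # mediator -> outcome, exposure held fixed

indirect = a * b
total = sxy / sxx                          # regression of outcome on exposure alone

print(f"Direct effect: {direct:.2f}")
print(f"Indirect effect (via mediator): {indirect:.2f}")
print(f"Total effect: {total:.2f}")
```

In disparities research the decomposition matters for intervention design: a large indirect effect through access points to fixing access, while a large direct effect points to bias in how care is delivered.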


17.10 9. Preparing for Emerging Technologies

17.10.1 Technology Readiness Assessment

Questions to ask:

  1. What problem does this solve?
    • Is it a real problem for public health?
    • Do we have other solutions?
    • What’s the value add?
  2. What’s the current maturity? (Technology Readiness Level, TRL)
    • Proof of concept or production-ready?
    • Who else is using it?
    • What’s the timeline to practical deployment?
  3. What are the requirements?
    • Data infrastructure
    • Computational resources
    • Expertise and training
    • Budget and sustainability
  4. What are the risks?
    • Technical risks (doesn’t work as expected)
    • Ethical risks (bias, privacy, equity)
    • Organizational risks (workforce, change management)
    • Financial risks (cost overruns, vendor lock-in)

17.10.2 Building Organizational Capacity

17.10.2.1 1. Workforce Development

Roles needed: - AI-literate public health practitioners - Understand capabilities/limitations - Technical specialists - Implement and maintain systems - Translators - Bridge public health and AI communities - Ethicists - Navigate complex ethical terrain

Training approaches: - Upskill existing workforce (courses, workshops) - Hire AI talent into public health agencies - Partner with academic institutions - Engage AI consultants for specific projects

17.10.2.2 2. Infrastructure Investment

Computational infrastructure: - Cloud computing accounts (AWS, GCP, Azure) - High-performance computing clusters - Data storage and management systems - MLOps platforms

Data infrastructure: - Electronic disease surveillance systems - EHR integration capabilities - Data quality monitoring - Privacy-preserving data sharing

17.10.2.3 3. Partnerships and Collaborations

Internal: - IT departments - Legal and compliance teams - Leadership and decision-makers

External: - Academic research institutions - AI companies and vendors - Other public health agencies - Community stakeholders


17.10.3 Pilot Project Framework

Start small, learn fast, scale what works

17.10.3.1 Phase 1: Proof of Concept (3-6 months)

  • Select well-defined, limited-scope problem
  • Use existing data and tools where possible
  • Measure baseline performance
  • Set success criteria upfront

17.10.3.2 Phase 2: Pilot Implementation (6-12 months)

  • Deploy in controlled setting (1-2 sites)
  • Monitor performance closely
  • Collect user feedback
  • Iterate based on learnings

17.10.3.3 Phase 3: Scale and Sustain (12+ months)

  • Expand to additional sites
  • Integrate into workflows
  • Establish maintenance procedures
  • Plan for long-term sustainability

17.11 10. Risks and Ethical Considerations

17.11.1 Emerging Risks

17.11.1.1 1. Autonomous AI Systems

Concern: AI making decisions without human oversight

Example: Automated triage system that denies care

Mitigation: Human-in-the-loop, override mechanisms, audit trails
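These three mitigations can be combined in a thin wrapper around the model. In the sketch below, any recommendation that would withhold care, or any low-confidence recommendation, is routed to a clinician, and every decision is logged. The class and function names, the 0.9 confidence threshold, and the in-memory log are all illustrative assumptions. A production system would need persistent, tamper-evident storage.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class TriageDecision:
    patient_id: str
    model_recommendation: str
    confidence: float
    final_decision: str
    decided_by: str      # "model" or "human"
    timestamp: str

audit_log = []           # audit trail: one record per decision

def triage_with_oversight(patient_id, recommendation, confidence, human_review,
                          min_confidence=0.9):
    # Human-in-the-loop: the model may never deny care on its own
    needs_human = recommendation != "admit" or confidence < min_confidence
    if needs_human:
        # Override mechanism: the clinician's decision replaces the model's
        final, decided_by = human_review(patient_id, recommendation), "human"
    else:
        final, decided_by = recommendation, "model"
    decision = TriageDecision(patient_id, recommendation, confidence, final,
                              decided_by, datetime.now(timezone.utc).isoformat())
    audit_log.append(decision)
    return decision

def clinician_review(patient_id, recommendation):
    """Stand-in for a real review step; here the clinician always admits."""
    return "admit"

d1 = triage_with_oversight("p-001", "admit", 0.97, clinician_review)
d2 = triage_with_oversight("p-002", "defer", 0.99, clinician_review)
print(d1.decided_by, d1.final_decision)
print(d2.decided_by, d2.final_decision)
```

The key design choice is asymmetry: the model can speed up decisions that grant care, but decisions that restrict care always pass through a person.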

17.11.1.2 2. Synthetic Data and Deepfakes

Concern: Generated images/videos indistinguishable from real

Public health impact: - Fake health information going viral - Undermining trust in public health messaging - Fraudulent research data

Detection: AI-generated content detectors, watermarking, provenance tracking
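Provenance tracking can be as simple as attaching a cryptographic signature to official messages so that partners can detect tampering. The sketch below uses an HMAC from the Python standard library. The key, the function names, and the sample message are hypothetical. An HMAC requires verifiers to hold the same secret, so public verification would call for digital signatures or a content-provenance standard instead.

```python
import hashlib
import hmac

# Hypothetical key for illustration; real deployments need managed secrets
SIGNING_KEY = b"replace-with-a-managed-secret"

def sign_message(text):
    """Return a hex signature binding the text to the agency's key."""
    return hmac.new(SIGNING_KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()

def verify_message(text, signature):
    """True only if the text is unchanged since it was signed."""
    return hmac.compare_digest(sign_message(text), signature)

advisory = "Boil water before drinking until further notice."
signature = sign_message(advisory)

tampered = "No need to boil water."
print("Original verifies:", verify_message(advisory, signature))
print("Tampered verifies:", verify_message(tampered, signature))
```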

17.11.1.3 3. Dual-Use Technologies

Concern: Technologies with beneficial and harmful applications

Example: AI-designed pathogens (gain-of-function concerns)

Governance: Oversight, publication guidelines, biosecurity measures

17.11.1.4 4. Concentration of Power

Concern: Few tech companies control critical AI infrastructure

Risks: - Vendor lock-in - Lack of transparency - Profit-driven rather than health-driven - Geopolitical dependencies

Mitigation: Open-source alternatives, public AI infrastructure, regulation


17.11.2 Responsible Innovation Principles

Stilgoe et al., 2013, Research Policy - Responsible innovation framework

Four dimensions:

  1. Anticipation - Foresee potential impacts (intended and unintended)
  2. Reflexivity - Question assumptions and framings
  3. Inclusion - Engage diverse stakeholders
  4. Responsiveness - Adapt based on evidence and values

Applied to emerging AI:

  • Anticipate: What could go wrong? Who might be harmed?
  • Reflect: Are we solving the right problem? Whose interests are prioritized?
  • Include: Are affected communities involved in design?
  • Respond: Are we monitoring impacts and adjusting course?

17.12 11. Key Takeaways

Important: Essential Points

Technology Landscape:

  • AI capabilities are advancing quickly (training compute for the largest models has been doubling roughly every 6 months)
  • Multiple breakthrough technologies are moving from research to practice
  • Public health can shape development by engaging early

Most Promising Near-Term (2025-2028):

  1. Multimodal foundation models - Already deployable for clinical documentation, literature synthesis, multilingual health communication
  2. Federated learning - Enables multi-site collaboration without sharing raw data
  3. Edge AI - Point-of-care diagnostics on smartphones in resource-limited settings

Medium-Term (2028-2033):

  4. Digital twins - Personalized medicine and population health simulation
  5. AI drug discovery - Accelerating therapeutic development for neglected diseases
  6. Causal AI - Better understanding of “why” to inform interventions

Long-Term (2033+):

  7. Quantum computing - Potential for complex epidemiological modeling
  8. Neuromorphic computing - Brain-inspired, energy-efficient AI hardware

Key Preparation Steps:

  • Assess organizational readiness (infrastructure, workforce, partnerships)
  • Start with pilot projects on well-defined problems
  • Build partnerships with AI researchers and companies
  • Invest in workforce training and data infrastructure
  • Engage in responsible innovation (anticipate, reflect, include, respond)

Guiding Philosophy: > “The best way to predict the future is to invent it.” - Alan Kay

Public health should not be a passive recipient of AI technologies developed elsewhere. Engage, shape, and build the future you want to see.


17.13 12. Discussion Questions

  1. Technology assessment: Which emerging technology do you think will have the biggest impact on public health in the next 5 years? Why? What would need to happen to realize that impact?

  2. Equity concerns: How can we ensure that emerging AI technologies benefit all populations, not just wealthy countries/communities? What governance mechanisms are needed?

  3. Workforce implications: How should public health agencies prepare their workforce for rapid AI advancement? What skills will be most important? How do we avoid widening the gap between “AI-enabled” and traditional public health practice?

  4. Responsible innovation: What oversight mechanisms should govern development of powerful AI systems for public health? Who should be involved in decision-making? How do we balance innovation with precaution?

  5. Causal vs. predictive: When is understanding causality essential for public health decision-making? When is prediction sufficient? Give examples of each.

  6. Federated learning adoption: What are the biggest barriers to adopting federated learning for multi-site public health research? How can these be overcome?

  7. Quantum hype: Is quantum computing overhyped for public health applications? How should agencies balance preparation for quantum computing with more immediate needs?

  8. Digital twin ethics: What are the ethical implications of creating digital twins of individuals or populations? How should consent work? What safeguards are needed?


17.14 13. Hands-On Exercise

17.14.1 Exercise: Assess an Emerging Technology for Your Organization

Objective: Evaluate the readiness and potential impact of an emerging technology for a specific public health application.

Time: 60-90 minutes


17.14.1.1 Part 1: Technology Selection (15 min)

Choose one emerging technology from this chapter: - Multimodal foundation models - Federated learning - Edge AI - Digital twins - Causal AI

And one public health application: - Disease surveillance - Health equity intervention - Emergency response - Chronic disease management - Maternal/child health

Deliverable: One-paragraph description of your chosen technology-application pair.


17.14.1.2 Part 2: Readiness Assessment (30 min)

Evaluate organizational readiness across dimensions:

Technical Readiness: - What data infrastructure exists? - What computational resources are available? - What technical expertise does the team have?

Workforce Readiness: - Who would need training? - What skills are missing? - How long would training take?

Organizational Readiness: - Do stakeholders understand the technology? - Is leadership supportive? - Are there change management processes?

Financial Readiness: - What’s the estimated budget? - Are there funding sources? - What’s the expected ROI?

Deliverable: Readiness assessment table with scores (1-5) and justifications for each dimension.


17.14.1.3 Part 3: Pilot Project Design (30 min)

Design a 6-month pilot project to test the technology:

Problem Statement: - What specific problem will you address? - What are current limitations? - What’s the expected improvement?

Approach: - What’s the technical implementation plan? - What data will you use? - What metrics will you track?

Success Criteria: - What outcomes indicate success? - What’s the minimum viable result? - How will you measure impact?

Risks and Mitigations: - What could go wrong? - How will you mitigate risks? - What’s the contingency plan?

Deliverable: 2-page pilot project proposal.


17.14.1.4 Part 4: Ethical Analysis (15 min)

Apply responsible innovation framework:

Anticipate: - What unintended consequences might occur? - Who might be harmed? - What’s the worst-case scenario?

Reflect: - Are you solving the right problem? - Whose interests are prioritized? - What assumptions are you making?

Include: - Which stakeholders should be involved? - How will you incorporate community input? - Are affected populations represented?

Respond: - How will you monitor for problems? - What triggers would cause you to pause/stop? - How will you adapt based on feedback?

Deliverable: 1-page ethical analysis with mitigation strategies.


17.14.1.5 Bonus Challenge:

Create a 5-minute presentation pitching your pilot project to leadership. Include: - Problem statement - Technology overview (accessible to non-technical audience) - Expected benefits - Resource requirements - Timeline - Risk mitigation


17.15 14. Further Resources

17.15.1 📚 Books

Emerging Technologies: - Life 3.0 by Max Tegmark - Future of AI and humanity - The Master Algorithm by Pedro Domingos - Quest for universal learning algorithm - Prediction Machines by Agrawal, Gans, & Goldfarb - Economics of AI

Causal Inference: - The Book of Why by Judea Pearl - Causality for general audience 🎯 - Causal Inference: The Mixtape by Scott Cunningham - Free online textbook - Causality by Judea Pearl - Technical reference


17.15.2 📄 Key Papers

Foundation Models: - Bommasani et al., 2021, arXiv - On the opportunities and risks of foundation models 🎯 - Singhal et al., 2023, Nature - Large language models encode clinical knowledge

Federated Learning: - McMahan et al., 2017, AISTATS - Communication-efficient learning of deep networks from decentralized data 🎯 - Kaissis et al., 2020, Nature Machine Intelligence - Secure, privacy-preserving and federated machine learning in medical imaging

AI Drug Discovery: - Stokes et al., 2020, Cell - A deep learning approach to antibiotic discovery 🎯 - Jumper et al., 2021, Nature - Highly accurate protein structure prediction with AlphaFold

Causal AI: - Pearl, 2019, Communications of the ACM - The seven tools of causal inference 🎯 - Künzel et al., 2019, PNAS - Metalearners for estimating heterogeneous treatment effects


17.15.3 💻 Tools & Tutorials

Multimodal AI: - OpenAI GPT-4 API - Multimodal foundation model access - Hugging Face Transformers - Pre-trained models and fine-tuning - LangChain - Building applications with LLMs

Federated Learning: - Flower - Friendly federated learning framework 🎯 - TensorFlow Federated - Google’s FL framework - PySyft - Privacy-preserving ML

Edge AI: - TensorFlow Lite - Deploy models on mobile/edge devices - ONNX Runtime - Cross-platform inference - Core ML - iOS/macOS deployment

Causal Inference: - DoWhy - Microsoft causal inference library 🎯 - EconML - Heterogeneous treatment effects - CausalNex - Causal reasoning with Bayesian networks


17.15.4 🎓 Online Courses

Emerging Technologies: - Fast.ai - Practical deep learning (free) - Stanford CS324: Large Language Models - LLM foundations - Quantum Machine Learning - University of Toronto (edX)

Causal Inference: - Introduction to Causal Inference - Brady Neal (free course + textbook) 🎯 - A Crash Course in Causality - University of Pennsylvania (Coursera)


17.15.5 🎯 Communities & Forums

AI for Health: - ML4H (Machine Learning for Health) - Annual workshop at NeurIPS - AI for Global Health - Forum and resources - Healthcare AI Ethics Community - Stanford HAI

Federated Learning: - OpenMined Community - Privacy-preserving ML - Flower Community - FL practitioners


17.16 Looking Ahead

In Chapter 16, we’ll examine:

  • Global health AI equity and addressing the digital divide
  • Context-appropriate AI design for resource-limited settings
  • Algorithmic fairness across diverse populations
  • Capacity building strategies in low- and middle-income countries
  • Data governance in international health collaborations

The emerging technologies covered in this chapter have tremendous potential, but only if deployed equitably to benefit all populations.


Check Your Understanding

Test your knowledge of emerging AI technologies and their applications in public health. Each question builds on the key concepts from this chapter.

Note: Question 1

A public health agency is considering three emerging AI technologies for investment: multimodal foundation models for clinical documentation (TRL 7-8, commercially available), federated learning for multi-state surveillance collaboration (TRL 5-6, requires infrastructure development), and quantum computing for epidemic simulation (TRL 2-3, mostly research stage). With a limited budget and an 18-month timeline, which should be prioritized and why?

  a) Quantum computing, because it has the highest potential payoff and will position the agency as a technology leader
  b) Multimodal foundation models, because they’re production-ready, address immediate needs, and have proven value
  c) Federated learning, because it balances innovation with feasibility and solves a real collaborative challenge
  d) All three equally, using a diversified portfolio approach to hedge risk

Correct Answer: b) Multimodal foundation models, because they’re production-ready, address immediate needs, and have proven value

This question tests understanding of technology readiness levels (TRL) and practical decision-making for technology adoption—key themes emphasized in the chapter’s preparation section.

TRL Framework Application:

The chapter presents NASA’s TRL scale (1-9) and explicitly recommends focusing on TRL 3-7 technologies that are “proven concepts moving toward practical deployment.” Let’s analyze each option:

Option A: Quantum Computing (TRL 2-3) - Status: Basic research, proof-of-concept stage - Timeline: Chapter estimates practical quantum advantage for public health in 2033-2040 - Reality check: “Still noisy, error-prone, limited to small problems” - Risk: 18 months is far too short for research-stage technology to become operational - Chapter guidance: “Prepare now by learning fundamentals, identifying potential problems” but not for immediate deployment

Option B: Multimodal Foundation Models (TRL 7-8) - Status: Production-ready, commercially available (GPT-4 Vision, Med-PaLM 2) - Timeline: Deployable immediately, mature within 18 months - Proven value: Chapter cites multiple real-world applications (clinical documentation, literature synthesis, multilingual health communication) - Immediate benefit: Addresses current needs (documentation burden, communication challenges) - Success stories: Already deployed in healthcare settings

Option C: Federated Learning (TRL 5-6) - Status: Lab validation, prototype testing stage - Timeline: Feasible in 18 months with infrastructure investment - Real applications: Chapter cites MELLODDY, FeTS, Google Gboard - Challenge: “Requires infrastructure development”—significant setup cost - Value: Solves real problem (multi-site collaboration) but needs substantial groundwork

The Decision Framework:

The chapter’s “Technology Readiness Assessment” section provides clear guidance:

  1. What problem does this solve?
    • A (Quantum): Hypothetical future problems (complex simulation)
    • B (Multimodal): Current problems (documentation, communication, multilingual outreach)
    • C (Federated): Real problem (multi-site collaboration) but requires stakeholder alignment
  2. What’s the current maturity?
    • A: Research stage, not deployment-ready
    • B: Production-ready, commercially available
    • C: Prototype stage, needs development
  3. What’s the timeline to practical deployment?
    • A: 10-15+ years
    • B: Immediate to 6 months
    • C: 12-24 months with infrastructure
  4. What are the requirements?
    • A: Specialized quantum expertise, quantum hardware access, research partnerships
    • B: API access, integration work, user training (manageable)
    • C: Multi-site coordination, infrastructure build-out, data governance agreements

Given limited budget and 18-month timeline, Option B is the clear choice.

Why other options are wrong:

Option (a) commits the “shiny object” fallacy: prioritizing futuristic technology over practical needs. The chapter warns against this by emphasizing TRL assessment. Quantum computing’s “highest potential payoff” is speculative and, by the chapter’s own estimate, at least a decade away. “Technology leadership” means being effective with current technology, not gambling on research that may never materialize for your use case. The chapter’s quantum section explicitly positions this as long-term preparation, not near-term deployment.

Option (c) is more defensible than (a) but misweights timeline constraints. Federated learning could be valuable but requires: - Infrastructure development at each site - Data governance agreements across multiple jurisdictions - Technical expertise for implementation - Stakeholder buy-in from all participating sites

While feasible in 18 months with sufficient resources, the “limited budget” constraint makes this challenging. The chapter positions federated learning as “emerging” (not yet mainstream) for good reason—implementation overhead is substantial.

Option (d) violates the principle of focused resource allocation. “Diversified portfolio” makes sense for large R&D operations, not resource-limited public health agencies with 18-month timelines. The chapter’s “Start small, learn fast, scale what works” philosophy argues for focused pilots, not scattered investments. Spreading limited budget across three technologies (especially including one that’s pure research) means none get sufficient resources.

The chapter’s recommendation for near-term adoption (2025-2028):

The chapter explicitly states: “Most Promising Near-Term (2025-2028): 1. Multimodal foundation models - Already deployable for clinical documentation, literature synthesis, multilingual health communication”

This directly supports Option B.

Practical implementation path:

Months 1-3:

  • Pilot multimodal AI for clinical documentation (reduce burden)
  • Test multilingual health communication (COVID-19 vaccine messaging in multiple languages)
  • Demonstrate value with quick wins

Months 4-9:

  • Scale successful pilots
  • Integrate into workflows
  • Train staff

Months 10-18:

  • Full deployment
  • Measure outcomes
  • Iterate based on feedback

Meanwhile:

  • Monitor federated learning maturity for future adoption
  • Learn quantum fundamentals through training, not deployment

Real-world precedents:

The chapter cites GPT-4 and Med-PaLM 2 as already at “TRL 7-9: System demonstration, operations.” Healthcare organizations deploying these models now can see benefits quickly, which is exactly what a resource-constrained agency needs.

For practitioners:

The chapter’s message is clear: match technology selection to organizational capacity and timeline. Emerging technologies are exciting, but deployment decisions should prioritize:

  1. Proven value (demonstrated ROI)
  2. Maturity (production-ready)
  3. Timeline fit (achievable in your window)
  4. Resource requirements (manageable with your budget)

Multimodal foundation models satisfy all four criteria; quantum computing satisfies none; federated learning is borderline depending on resources and stakeholder readiness.
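The timeline-fit criterion can be made concrete with a small screening sketch. It uses only the deployment timelines quoted in the comparison above; the `Option` class, the `screen` function, and the three verdict labels are illustrative names introduced here, not something the chapter defines.

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    earliest_months: float  # optimistic time to practical deployment
    latest_months: float    # pessimistic time (inf = open-ended)

def screen(options, window_months):
    """Classify each option as 'fits', 'borderline', or 'out of window'."""
    verdicts = {}
    for opt in options:
        if opt.latest_months <= window_months:
            verdicts[opt.name] = "fits"
        elif opt.earliest_months <= window_months:
            verdicts[opt.name] = "borderline"
        else:
            verdicts[opt.name] = "out of window"
    return verdicts

# Timelines from the comparison above; quantum's "10-15+ years" is open-ended.
options = [
    Option("quantum computing", 120, float("inf")),
    Option("multimodal foundation models", 0, 6),
    Option("federated learning", 12, 24),
]

verdicts = screen(options, window_months=18)
for name, verdict in verdicts.items():
    print(f"{name}: {verdict}")
```

A real assessment would weigh all four criteria, not just the timeline, but even this single filter reproduces the ranking in the answer.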

The key lesson: resist technology FOMO (fear of missing out). The chapter’s responsible innovation framework emphasizes solving real problems with appropriate tools, not deploying cutting-edge technology for its own sake. Option B embodies this pragmatic approach.

NoteQuestion 2

A research consortium wants to train a tuberculosis prediction model using patient data from 20 hospitals across 5 countries. Privacy regulations prohibit data sharing, and several countries require that data never leave national borders. Which technical approach would BEST enable this collaboration while respecting data sovereignty?

  1. Have all hospitals de-identify data using k-anonymity (k=10) and share to a central cloud server for training
  2. Use federated learning where models train locally at each hospital and only aggregated model updates are shared internationally
  3. Create synthetic data at each hospital using GANs, share the synthetic datasets, and train centrally
  4. Build separate models at each hospital and manually combine predictions using ensemble methods

Correct Answer: b) Use federated learning where models train locally at each hospital and only aggregated model updates are shared internationally

This question tests understanding of federated learning’s core value proposition and when it’s the appropriate solution—a major theme in the chapter’s federated learning section.

The Scenario’s Constraints:

  1. Privacy regulations prohibit data sharing - Hard legal constraint
  2. Data sovereignty requirements - Data cannot leave national borders
  3. Multi-site collaboration needed - 20 hospitals across 5 countries
  4. Goal: Train a single, collaborative model

Federated Learning as the Solution:

The chapter presents federated learning explicitly for this scenario: “The Problem: Data Silos in Public Health.” The McMahan et al. (2017) FedAvg algorithm is designed precisely for this use case.

How Federated Learning Addresses Each Constraint:

Privacy regulations: Raw patient data is never shared; only model parameters (gradients/weights) are transmitted. The chapter emphasizes: “No patient data leaves institution.”

Data sovereignty: Training happens locally within each country’s borders. The central server (which could be in any country or neutral territory) only receives mathematical model updates, not patient data. This satisfies regulations requiring data stay in-country.

Multi-site collaboration: The chapter’s algorithm walkthrough shows exactly how this works:

  1. Central server initializes model
  2. Each hospital trains on local data
  3. Hospitals send model updates (not data)
  4. Server aggregates using FedAvg
  5. Repeat
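The five steps above can be sketched in a few lines of plain Python. This is a toy illustration, not the chapter's Flower implementation: the model is a one-variable linear regression, the three "hospital" datasets are made up, and `local_update` and `fed_avg` are names chosen here for the sketch.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """One site's local training: fit y = w*x + b by gradient descent."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum((w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return [w, b]

def fed_avg(updates, sizes):
    """Average site models, weighted by each site's number of examples."""
    total = sum(sizes)
    return [
        sum(u[i] * n for u, n in zip(updates, sizes)) / total
        for i in range(len(updates[0]))
    ]

# Hypothetical toy data: three sites, each holding points on y = 2x + 1.
sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(0.5, 2.0), (1.5, 4.0)],
]

global_model = [0.0, 0.0]                    # step 1: server initializes
for round_num in range(50):                  # step 5: repeat
    # steps 2-3: each site trains locally and returns only its weights
    updates = [local_update(global_model, d) for d in sites]
    # step 4: server aggregates with a size-weighted average
    global_model = fed_avg(updates, [len(d) for d in sites])

print(global_model)  # approaches [2.0, 1.0]
```

Only the two-element weight lists cross the site boundary; the `(x, y)` records never leave the `sites` entries, which is the property the sovereignty argument relies on.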

Scale: The chapter cites Google Gboard with “millions of mobile devices”—20 hospitals is entirely feasible.

Real-World Validation:

The chapter provides perfect precedents:

FeTS (Federated Tumor Segmentation):

  • 30 international institutions
  • Brain tumor segmentation
  • Finding: “Federated model approached centralized performance without data sharing”

This is essentially the same scenario: international collaboration, sensitive medical data, successful federated implementation.

MELLODDY:

  • 10 pharmaceutical companies
  • Drug discovery collaboration
  • 10M+ compounds kept at each company
  • Result: Better models than any single company alone

Why other options fail:

Option (a)—K-anonymity + central sharing:

Multiple problems:

  1. Still violates data sovereignty: Even de-identified data leaving national borders may violate regulations. Many data protection laws (GDPR, local regulations) restrict data transfer regardless of de-identification.

  2. K-anonymity insufficient: The chapter (and Chapter 11) emphasize that k-anonymity reduces but doesn’t eliminate re-identification risk. For 20 hospitals across diverse populations, achieving k=10 while preserving utility is challenging. Small hospitals or rare TB subtypes might not have enough patients.

  3. Data quality loss: K-anonymity requires generalization (exact ages → age ranges, precise locations → regions), which degrades predictive power for medical models.

  4. Trust issues: Hospitals/countries may not trust “de-identified” data, especially given the high-profile re-identification cases discussed in the chapter.
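The tension between k-anonymity and data utility can be seen in a small sketch. The records, the quasi-identifier choice, and the helper names below are hypothetical, and k=3 is used instead of k=10 only to keep the example short.

```python
from collections import Counter

def violates_k_anonymity(records, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in groups.items() if n < k}

def generalize_age(record, width=10):
    """Replace an exact age with a range such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical records from one small hospital
records = [
    {"age": 31, "region": "North", "tb_subtype": "pulmonary"},
    {"age": 34, "region": "North", "tb_subtype": "pulmonary"},
    {"age": 38, "region": "North", "tb_subtype": "MDR"},
    {"age": 52, "region": "South", "tb_subtype": "pulmonary"},
]
quasi = ["age", "region"]

raw_violations = violates_k_anonymity(records, quasi, k=3)
generalized = [generalize_age(r) for r in records]
remaining = violates_k_anonymity(generalized, quasi, k=3)

print(len(raw_violations))  # every exact (age, region) pair is unique
print(remaining)            # the lone South patient still stands out
```

Generalizing ages fixes the North group at the cost of precision, while the single South record still fails and would have to be suppressed or generalized further. That is the small-hospital problem described in item 2.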

Option (c)—Synthetic data:

Problems outlined in the chapter:

  1. Quality concerns: “Synthetic data may not capture rare events or complex relationships critical for medical prediction”

  2. Validation challenge: How do you validate synthetic data quality without comparing to real data (creating a chicken-and-egg problem)?

  3. Privacy leakage risk: GANs can leak training data if not properly implemented with differential privacy. The chapter discusses this in context of privacy-preserving techniques.

  4. Doesn’t solve sovereignty: Synthetic data still represents aggregated information from real patients. Some regulations may still restrict its transfer. And you need to share the synthetic data, which still crosses borders.

  5. Complexity: Each hospital generates synthetic data (requires expertise), shares internationally, then trains centrally. This is more complex than federated learning while providing weaker privacy guarantees.

Option (d)—Separate models + manual ensembling:

This doesn’t solve the collaboration problem:

  1. No knowledge transfer: Each hospital trains only on local data, missing the benefit of multi-site learning that motivated collaboration.

  2. Manual ensembling: “Manually combine predictions” doesn’t provide principled aggregation. How do you weight hospitals? How do you handle distribution shift?

  3. No collaborative improvement: Unlike federated learning where all sites benefit from collective training, this approach means small hospitals with limited data get worse models.

The chapter explicitly positions federated learning as the solution to exactly this problem: “Federated learning approach: 1. Train model locally at each site, 2. Share only model updates (not data), 3. Aggregate updates centrally, 4. No patient data leaves institution.”

Additional Considerations:

Enhanced Privacy:

The chapter discusses extensions:

  • Secure aggregation: Central server can’t see individual hospital updates
  • Differential privacy: Add noise to updates for formal privacy guarantees
  • Encrypted aggregation: Homomorphic encryption for extra security

For sensitive TB data, combining federated learning with differential privacy (DP-FL) provides formal, mathematically provable privacy guarantees whose strength depends on how much noise is added.
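One common pattern for adding differential privacy to model updates is to clip each update to a maximum norm and then add Gaussian noise scaled to that norm. The sketch below shows only that mechanical step; `max_norm` and `noise_multiplier` are illustrative parameters chosen here, and the sketch does not compute the resulting privacy budget, which also depends on the number of rounds and on how sites are sampled.

```python
import math
import random

def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def privatize(update, max_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise with std = noise_multiplier * max_norm."""
    clipped = clip_update(update, max_norm)
    sigma = noise_multiplier * max_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]

rng = random.Random(0)                 # seeded only to make the demo repeatable
raw_update = [3.0, 4.0]                # hypothetical update with L2 norm 5
clipped = clip_update(raw_update, max_norm=1.0)
noisy = privatize(raw_update, max_norm=1.0, noise_multiplier=0.5, rng=rng)

print(clipped)  # same direction, norm reduced to about 1
print(noisy)    # clipped update plus random noise
```

Clipping bounds how much any single site can move the model, which is what lets the added noise translate into a formal guarantee. More noise means stronger privacy and usually lower accuracy.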

Challenges and Mitigations:

The chapter’s challenges table shows federated learning is not without issues:

  • Non-IID data: Different patient populations across countries → Use FedProx
  • Communication costs: International data transfer can be expensive → Compress updates
  • Malicious participants: Potential for poisoning → Robust aggregation methods

But these are solvable technical problems, not fundamental blockers like sovereignty violations.
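As one illustration of robust aggregation, the sketch below compares a plain mean with a coordinate-wise median when one participant submits an extreme update. The coordinate-wise median is just one simple robust aggregator picked for this example; the chapter does not specify which method to use, and the update values are invented.

```python
from statistics import mean, median

def aggregate(updates, reducer):
    """Combine per-site updates coordinate by coordinate."""
    return [reducer(coords) for coords in zip(*updates)]

# Hypothetical model updates from four honest sites and one poisoned site
honest = [[0.9, -0.2], [1.0, -0.1], [1.1, 0.0], [1.0, -0.1]]
poisoned = [[100.0, 50.0]]
updates = honest + poisoned

mean_agg = aggregate(updates, mean)
median_agg = aggregate(updates, median)

print(mean_agg)    # dragged far from the honest updates
print(median_agg)  # stays at [1.0, -0.1]
```

The median ignores a minority of outliers, so a single poisoned site cannot steer the global model. The trade-off is that it also discards some legitimate signal when honest sites genuinely differ.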

Practical Implementation:

Using the chapter’s Flower framework example, implementation is straightforward:

  1. Each hospital runs HospitalClient code locally
  2. Central coordinating server runs FedAvg strategy
  3. HTTPS communication for model updates (lightweight—KB to MB, not GB)
  4. After 10-20 rounds, collaborative model complete

The chapter explicitly recommends this:

“Most Promising Near-Term (2025-2028): 2. Federated learning - Enables multi-site collaboration without data sharing”

This is presented as deployable NOW (TRL 5-6), not future research.

For practitioners:

International health collaborations increasingly face data sovereignty challenges. Federated learning is specifically designed for this:

  • Respects local data governance
  • Enables genuine collaboration (not just separate analyses)
  • Scales to many sites
  • Works with existing ML frameworks (TensorFlow, PyTorch)
  • Has successful healthcare precedents

The TB prediction scenario in the question is precisely the use case federated learning was invented for. The chapter’s extensive coverage and real-world examples make this the clear answer.

NoteQuestion 3

A rural health clinic in sub-Saharan Africa wants to implement AI-based malaria diagnosis from smartphone photos of blood smears. Internet connectivity is intermittent (available 2-3 hours/day), and cloud API costs are prohibitive at scale. Which approach would be MOST appropriate?

  1. Wait until reliable internet infrastructure is available, then deploy cloud-based AI via API calls
  2. Deploy edge AI with TensorFlow Lite models running entirely on smartphones, enabling offline diagnosis with no internet required
  3. Use SMS-based system to send images to central server during connectivity windows for batch processing
  4. Provide tablets with pre-downloaded training data so clinicians can train models locally as needed

Correct Answer: b) Deploy edge AI with TensorFlow Lite models running entirely on smartphones, enabling offline diagnosis with no internet required

This question tests understanding of edge AI’s value proposition and context-appropriate technology design—critical themes in the chapter’s Edge AI section and the Looking Ahead discussion of global health equity.

The Scenario’s Constraints:

  1. Rural, resource-limited setting
  2. Intermittent connectivity (2-3 hours/day available)
  3. Cost sensitivity (cloud APIs “prohibitive at scale”)
  4. Critical healthcare need (malaria diagnosis)
  5. Existing infrastructure (smartphones available)

Edge AI as the Solution:

The chapter dedicates an entire section to edge AI, defining it as: “Running AI models directly on devices (smartphones, tablets, IoT sensors) rather than in the cloud.”

Why Edge AI Matters for This Scenario:

The chapter lists exactly these benefits:

  • Offline capability: “Works in areas with poor connectivity” ✓
  • Cost: “No cloud inference fees at scale” ✓
  • Low-latency: “Instant results without internet delay” ✓
  • Privacy: “Data never leaves device” (bonus benefit)

Perfect Example in the Chapter:

The chapter provides EXACTLY this use case:

“Example: Malaria Detection from Blood Smears” with a complete TensorFlow Lite implementation:

def diagnose_malaria(image_path):
    """
    Analyze blood smear image for malaria parasites
    Runs entirely on mobile device, no internet required
    """

Impact metrics from chapter:

  • Diagnosis time: 10 minutes → 30 seconds
  • Cost: $5 microscopy + technician → $0 marginal cost
  • Accessibility: Urban hospital → Rural clinic with smartphone
  • Scalability: Limited by trained technicians → Limited by smartphones

This is precisely the rural clinic scenario in the question.

Technical Feasibility:

The chapter demonstrates model compression techniques:

1. Quantization: Reduce model weights from 32-bit to 8-bit precision

  • Original: 10MB → Compressed: 2.5MB (75% reduction)
  • Feasible for smartphone storage and processing

2. Example code provided:

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

3. Hardware availability:

  • “Google Coral: 4 TOPS, $25”
  • “Apple Neural Engine: Built into iPhone/iPad”
  • Modern smartphones can run these models
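The 75% figure follows directly from storing each weight in 1 byte instead of 4. The sketch below shows a generic affine int8 quantization in plain Python to make that arithmetic and the resulting rounding error visible. It is not TensorFlow Lite's actual scheme, which differs in its details, and the weights are invented.

```python
import array

def quantize_int8(weights):
    """Affine-quantize floats to int8; return (int8 array, scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0            # width of one int8 step
    zero_point = round(-128 - lo / scale)     # int8 value that represents 0.0
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return array.array("b", q), scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(v - zero_point) * scale for v in q]

weights = [i / 1000 - 0.5 for i in range(1000)]   # hypothetical layer weights
fp32 = array.array("f", weights)                   # 4 bytes per weight
q, scale, zp = quantize_int8(weights)              # 1 byte per weight

size_fp32 = fp32.itemsize * len(fp32)
size_int8 = q.itemsize * len(q)
max_error = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale, zp)))

print(size_fp32, size_int8)   # 4000 1000
print(max_error <= scale)     # True: error is bounded by one quantization step
```

The size drops by exactly 4x, and each weight moves by at most a fraction of one quantization step. Whether that loss of precision is acceptable for diagnosis still has to be checked by validating the compressed model.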

Real-World Precedent:

The chapter cites Project Premonition (Microsoft Research) using edge AI for mosquito species identification in remote areas—exactly the same context (rural, limited connectivity, disease surveillance).

Why Other Options Fail:

Option (a)—Wait for infrastructure:

This is the opposite of context-appropriate design. The chapter’s Looking Ahead section emphasizes: “Design for constraints - Offline, low-power, robust to poor data.”

Waiting for infrastructure means:

  • Delayed healthcare delivery (people die from untreated malaria)
  • No agency over solution (dependent on external infrastructure investment that may never come)
  • Missed opportunity to use available technology (smartphones are already there)

The chapter explicitly rejects this approach: “Prioritize impact where it’s needed most - AI should work for all populations,” not just those with perfect infrastructure.

Option (c)—SMS batch processing:

Multiple problems:

  1. Latency: Batch processing during 2-3 hour connectivity windows means multi-hour delays for diagnosis. Malaria diagnosis needs to be rapid for treatment decisions.

  2. Still requires connectivity: Depends on the intermittent internet, defeating the purpose. If internet is down, no diagnoses happen.

  3. Cost: SMS with image attachments (MMS) can be expensive, and still requires server-side processing (cloud costs).

  4. Scalability: The central server becomes a bottleneck. If 100 clinics queue images during the same connectivity window, the server may be overloaded.

  5. Privacy: Images are transmitted off-device, creating privacy risks.

The chapter presents edge AI specifically as superior to this client-server model for resource-constrained settings.

Option (d)—Local training:

This fundamentally misunderstands machine learning deployment:

  1. Training vs. inference: The chapter distinguishes these clearly. Inference (making predictions) is lightweight and suitable for smartphones. Training (learning from data) is computationally intensive and requires GPUs, large datasets, and expertise.

  2. Impractical: Clinicians aren’t ML engineers. “Train models locally as needed” is not feasible without extensive technical expertise, computational resources, and training data.

  3. Quality control: Locally trained models without validation could produce dangerous errors.

  4. Unnecessary: A well-trained malaria detection model deployed once works for all cases. No need for site-specific training.

The chapter’s edge AI section discusses deploying pre-trained models to devices, not local training.

Implementation Path (from chapter):

Model Development (Centralized):

  1. Train malaria detection model on large labeled dataset (hospitals/research institutions with connectivity)
  2. Validate on diverse blood smear types
  3. Compress using quantization for mobile deployment
  4. Package as TensorFlow Lite model

Deployment (Edge):

  1. Distribute .tflite model file to smartphones (one-time download or via SD card)
  2. Install simple app that loads model and processes images
  3. Clinicians photograph blood smears with smartphone camera
  4. App provides instant diagnosis on-device
  5. Results saved locally, uploaded to central database when connectivity available (optional, for surveillance)
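Step 5 is a store-and-forward pattern: write every result to local storage first, then push unsynced rows whenever a connection appears. A minimal sketch using Python's built-in `sqlite3` follows. The table layout, the `upload` callback, and the `is_online` check are hypothetical stand-ins for whatever the real app and surveillance server would use.

```python
import sqlite3

def init_db(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "id INTEGER PRIMARY KEY, patient_ref TEXT, label TEXT, "
        "confidence REAL, synced INTEGER DEFAULT 0)"
    )

def save_result(conn, patient_ref, label, confidence):
    """Always write locally first; diagnosis never waits on the network."""
    conn.execute(
        "INSERT INTO results (patient_ref, label, confidence) VALUES (?, ?, ?)",
        (patient_ref, label, confidence),
    )
    conn.commit()

def sync_pending(conn, upload, is_online):
    """Upload unsynced rows if a connection is available; return count sent."""
    if not is_online():
        return 0
    rows = conn.execute(
        "SELECT id, patient_ref, label, confidence FROM results WHERE synced = 0"
    ).fetchall()
    sent = 0
    for row_id, *payload in rows:
        if upload(payload):  # mark as synced only after a confirmed upload
            conn.execute("UPDATE results SET synced = 1 WHERE id = ?", (row_id,))
            sent += 1
    conn.commit()
    return sent

uploaded = []

def fake_upload(payload):
    """Hypothetical uploader that records the payload and reports success."""
    uploaded.append(payload)
    return True

conn = sqlite3.connect(":memory:")  # a real app would use a file on the device
init_db(conn)
save_result(conn, "anon-001", "parasitized", 0.97)
save_result(conn, "anon-002", "uninfected", 0.91)

offline_sent = sync_pending(conn, fake_upload, is_online=lambda: False)
online_sent = sync_pending(conn, fake_upload, is_online=lambda: True)
print(offline_sent, online_sent)  # 0 2
```

Because rows are flagged only after a successful upload, a connection that drops mid-sync simply leaves the remaining rows for the next connectivity window.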

Sustainability:

  • No ongoing connectivity costs
  • No cloud API fees (free after initial deployment)
  • Minimal maintenance (update model annually via occasional connectivity or physical media)
  • Scales to unlimited clinics (marginal cost near zero)

The Chapter’s Emphasis:

“Most Promising Near-Term (2025-2028): 3. Edge AI - Point-of-care diagnostics on smartphones in resource-limited settings”

This is explicitly presented as deployable NOW for exactly this use case.

Equity and Access:

The chapter’s Looking Ahead section connects to Chapter 16:

  • “Context-appropriate AI design for resource-limited settings”
  • “Design for constraints - Offline, low-power, robust to poor data”
  • “AI should work for all populations”

Edge AI embodies these principles. Waiting for infrastructure or depending on connectivity excludes those most in need.

For practitioners:

Rural health technology must be:

  1. Offline-capable: Cannot depend on connectivity
  2. Low-cost: Marginal costs must approach zero for sustainability
  3. Simple: Works with available infrastructure (smartphones, not specialized equipment)
  4. Robust: Functions in challenging environments (heat, dust, humidity)

Edge AI satisfies all four. Cloud-dependent solutions (options a, c) fail criteria 1-2. Local training (option d) fails criteria 2-3.

The chapter’s malaria diagnosis example isn’t hypothetical—it’s a blueprint for implementation. The question scenario is precisely what the technology was designed for, and the chapter provides the exact solution.

NoteQuestion 4

A public health researcher wants to understand whether a new community health worker program actually caused improved maternal health outcomes, or whether the correlation is due to confounding (e.g., wealthier communities got the program AND had better baseline health infrastructure). Standard predictive AI shows strong correlation between the program and outcomes. What approach would provide the STRONGEST causal evidence?

  1. Train a more sophisticated deep learning model with additional confounders to improve predictive accuracy
  2. Use causal inference methods like propensity score matching or instrumental variables to estimate treatment effect
  3. Increase sample size to make the correlation more statistically significant (p < 0.001)
  4. Run the predictive model on held-out test data to verify the correlation generalizes

Correct Answer: b) Use causal inference methods like propensity score matching or instrumental variables to estimate treatment effect

This question tests understanding of the fundamental distinction between correlation and causation—the central theme of the chapter’s Causal AI section—and when causal inference methods are necessary.

The Core Problem:

The chapter emphasizes: “Current AI: Excellent at finding correlations. Problem: Correlation ≠ Causation.”

The ice cream/drowning example illustrates this perfectly: predictive AI would identify the correlation and might recommend “ban ice cream to prevent drowning,” but causal understanding reveals hot weather as the common cause, leading to the correct intervention (increase lifeguards on hot days).

The Scenario’s Challenge:

The researcher faces classic confounding: communities that received the program may differ systematically from those that didn’t. Wealthy communities might:

  • Have better baseline health infrastructure
  • Have a more educated population
  • Have better nutrition and housing
  • Be more likely to advocate for and receive the program

A simple correlation between program and outcomes could reflect these pre-existing differences rather than program effectiveness.

Why Option B is Correct:

The chapter dedicates Section 8 to causal AI methods for exactly this problem. Two specific techniques mentioned:

1. Propensity Score Matching (chapter provides complete implementation):

The chapter shows matching treated and untreated communities with similar likelihood of receiving treatment:

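from sklearn.linear_model import LogisticRegression

def propensity_score_matching(data, treatment_col, outcome_col, covariate_cols):
    # Estimate propensity scores: P(treatment | covariates)
    X, treatment = data[covariate_cols], data[treatment_col]
    ps_model = LogisticRegression()
    ps_model.fit(X, treatment)

    # Match treated to control based on propensity
    # Calculate treatment effect
    ...  # matching and effect estimation elided in this excerpt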

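This creates “apples-to-apples” comparisons by pairing communities that were equally likely to get the program, where one received it and the other did not. The pairing approximates randomization, but only with respect to the covariates used to estimate the propensity score; unmeasured confounders remain a threat.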
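To make the matching step concrete, here is a self-contained sketch in plain Python. It is not the chapter's implementation: it swaps in a hand-written logistic regression and 1-nearest-neighbor matching with replacement, and the eight "communities" are invented so that the true program effect is known to be 2.

```python
import math

def propensity(w, xi):
    """P(treated | covariates) under logistic weights w = [bias, w1, ...]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 / (1 + math.exp(-z))

def fit_logistic(X, t, lr=0.5, steps=2000):
    """Plain gradient-ascent logistic regression (no regularization)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, ti in zip(X, t):
            err = ti - propensity(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def att_by_matching(X, t, y):
    """Average effect on the treated via nearest-propensity matching."""
    w = fit_logistic(X, t)
    ps = [propensity(w, xi) for xi in X]
    controls = [i for i, ti in enumerate(t) if ti == 0]
    diffs = []
    for i, ti in enumerate(t):
        if ti == 1:
            j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
            diffs.append(y[i] - y[j])
    return sum(diffs) / len(diffs)

# Hypothetical communities: wealth raises both program uptake and outcomes.
# Outcome = 10 + 5*wealth + 2*program, so the true program effect is 2.
wealth = [1, 1, 1, 1, 0, 0, 0, 0]
program = [1, 1, 1, 0, 1, 0, 0, 0]
outcome = [10 + 5 * u + 2 * p for u, p in zip(wealth, program)]

treated_y = [o for o, p in zip(outcome, program) if p == 1]
control_y = [o for o, p in zip(outcome, program) if p == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
att = att_by_matching([[u] for u in wealth], program, outcome)

print(naive)  # 4.5: inflated by the wealth confounder
print(att)    # 2.0: recovers the true effect after matching
```

The naive comparison more than doubles the apparent benefit because treated communities are mostly wealthy. Matching on the propensity score removes that imbalance, but only because wealth is measured and included as a covariate.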

2. Instrumental Variables (also in chapter):

The chapter explains finding a variable that affects treatment but not outcome (except through treatment). For example:

  • Instrument: Geographic proximity to NGO headquarters
  • Affects treatment: Closer communities more likely to receive program (logistics)
  • Doesn’t directly affect outcomes: Except via the program itself

The chapter provides code using EconML’s DMLIV for exactly this estimation.

Pearl’s Causal Hierarchy (from the chapter):

| Level | Question | AI Capability |
|---|---|---|
| Association | What if I see X? | ✓ Standard ML |
| Intervention | What if I do X? | ⚠️ Requires causal models |
| Counterfactual | What if I had done X? | ❌ Hardest |

The researcher needs intervention-level reasoning (“What if we deploy the program?”), not association (“Communities with program have better outcomes”). Standard ML only provides association.

Why Other Options Fail:

Option (a)—More sophisticated predictive model:

This is the “more data, more features, better model” fallacy. The chapter explicitly warns against this:

“Predictive AI: ‘High ice cream sales → High drowning risk’”

A deep learning model with more confounders will find better correlations but still cannot distinguish causation from confounding. You could add 1000 features and achieve 99% accuracy, but still not know if the program causes improvements.

The chapter states: “Limitations of Predictive AI—Correlation ≠ Causation.” No amount of predictive sophistication solves this.

Option (c)—Increase sample size for significance:

Statistical significance (p < 0.001) only means “the correlation is unlikely due to random chance.” It does NOT address confounding.

With larger sample size, you’ll be very confident there’s a correlation—but still won’t know if it’s causal. The chapter’s discussion of policy evaluation emphasizes you need causal methods, not just stronger correlations.

Example: With 1 million communities, you might have p < 10^-50 for the program-outcome correlation, but it could still be entirely due to wealth confounding.

Option (d)—Test set validation:

Held-out test data validates that the correlation generalizes (model isn’t overfitting). But it still only shows the pattern holds across different samples—not that the relationship is causal.

If wealthy communities in both training and test sets got the program, the correlation will generalize perfectly while still being confounded.

Real-World Application:

The chapter’s policy evaluation section addresses exactly this scenario:

“Question: Did mask mandates reduce COVID-19 transmission?
Challenge: States that implemented mandates differ from those that didn’t (confounding)
Causal approach: Difference-in-differences, synthetic control”

Replace “mask mandates” with “community health worker program” and “COVID transmission” with “maternal health outcomes”—same structure, same need for causal methods.
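Difference-in-differences, one of the approaches named in the quote, compares the change over time in program communities with the change in comparison communities. The sketch below applies it to invented maternal health figures; the numbers and variable names are purely illustrative, and the estimate is only valid if both groups would have followed parallel trends without the program.

```python
def mean(values):
    return sum(values) / len(values)

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical skilled-birth-attendance rates (%), before and after rollout
program_pre, program_post = [60, 64, 62], [72, 76, 74]
comparison_pre, comparison_post = [50, 54, 52], [55, 59, 57]

naive_gap = mean(program_post) - mean(comparison_post)
effect = diff_in_diff(program_pre, program_post, comparison_pre, comparison_post)

print(naive_gap)  # 17.0: mixes the program effect with the baseline gap
print(effect)     # 7.0: improvement beyond the background trend
```

The post-program gap of 17 points overstates the effect because program communities started 10 points ahead. Subtracting each group's own baseline and the shared background trend leaves 7 points attributable to the program under the parallel-trends assumption.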

Implementation Path:

Propensity Score Matching:

  1. Identify covariates that predict program assignment (wealth, education, baseline health)
  2. Estimate propensity scores for each community
  3. Match program communities to similar non-program communities
  4. Compare outcomes between matched pairs
  5. Estimate average treatment effect

Instrumental Variable:

  1. Find valid instrument (e.g., distance to NGO office)
  2. Use two-stage approach:
    • First stage: Instrument predicts program assignment
    • Second stage: Predicted assignment predicts outcomes
  3. Estimate causal effect from second stage
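With a binary instrument, the two-stage approach reduces to the Wald estimator: the instrument's effect on the outcome divided by its effect on program uptake. The sketch below uses that simpler estimator in place of the EconML DMLIV code the chapter refers to. The data are invented so that wealth is an unobserved confounder and the true program effect is 2.

```python
def mean(values):
    return sum(values) / len(values)

def wald_iv_estimate(z, t, y):
    """IV estimate for a binary instrument z.

    first_stage:  how much the instrument shifts program uptake
    reduced_form: how much the instrument shifts the outcome
    """
    first_stage = (mean([ti for zi, ti in zip(z, t) if zi == 1])
                   - mean([ti for zi, ti in zip(z, t) if zi == 0]))
    reduced_form = (mean([yi for zi, yi in zip(z, y) if zi == 1])
                    - mean([yi for zi, yi in zip(z, y) if zi == 0]))
    return reduced_form / first_stage

# Hypothetical communities. Wealth is unobserved by the analyst and is
# balanced across the instrument (proximity to the NGO office).
near_ngo = [1, 1, 1, 1, 0, 0, 0, 0]
wealth = [0, 0, 1, 1, 0, 0, 1, 1]
program = [1 if z or u else 0 for z, u in zip(near_ngo, wealth)]
outcome = [10 + 5 * u + 2 * p for u, p in zip(wealth, program)]

naive = (mean([o for o, p in zip(outcome, program) if p == 1])
         - mean([o for o, p in zip(outcome, program) if p == 0]))
iv_effect = wald_iv_estimate(near_ngo, program, outcome)

print(round(naive, 2))  # 5.33: biased upward by unobserved wealth
print(iv_effect)        # 2.0: recovers the true effect
```

The estimate is only as good as the instrument. If proximity to the NGO office also affected outcomes directly, for example through better roads, the division would no longer isolate the program's effect.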

The Chapter’s Key Message:

“Causal AI: Understanding cause-effect relationships, enabling:

  1. Counterfactual reasoning: ‘What would have happened if…?’
  2. Intervention planning: ‘What should we do to achieve X?’
  3. Explanation: ‘Why did Y happen?’”

For policy decisions (should we scale the program?), you need intervention-level reasoning. Predictive AI can’t provide this—causal inference can.

For practitioners:

When evaluating program effectiveness for policy decisions:

  • Association sufficient: Simple monitoring (is correlation still present?)
  • Causation required: Deciding to scale, modify, or terminate programs

The chapter emphasizes: “Public health consequences: Wrong intervention based on correlation vs. Right intervention based on causality.”

Deploying community health workers requires significant resources. The decision to scale nationwide needs causal evidence, not just predictive correlation. Option B provides that evidence through rigorous causal inference methods explicitly designed for this purpose.

NoteQuestion 5

A pharmaceutical research consortium wants to accelerate drug discovery for a neglected tropical disease. They have identified a protein target and need to generate novel drug candidates, predict binding affinity, and assess toxicity. Which AI-accelerated approach would provide the MOST comprehensive drug discovery pipeline?

  1. Use molecular generation models (VAE/GAN) to create candidates, then test all of them in wet lab experiments for binding and toxicity
  2. Use AlphaFold to predict target protein structure, then manually design candidates using traditional medicinal chemistry principles
  3. Deploy an integrated AI pipeline: generative models for candidate design, graph neural networks for binding prediction, and multi-task models for toxicity screening, then validate top candidates experimentally
  4. Use large language models to search PubMed for existing compounds that might bind the target, then repurpose them

Correct Answer: c) Deploy an integrated AI pipeline: generative models for candidate design, graph neural networks for binding prediction, and multi-task models for toxicity screening, then validate top candidates experimentally

This question tests understanding of the AI drug discovery ecosystem presented in the chapter and how different AI techniques work together in a modern discovery pipeline.

The Chapter’s AI Drug Discovery Framework:

Section 7 presents a multi-stage pipeline showing how AI accelerates each phase:

| Phase | Traditional | With AI | Improvement |
|---|---|---|---|
| Target identification | 2-3 years | 6-12 months | 3-4x faster |
| Lead discovery | 3-6 years | 1-2 years | 2-3x faster |
| Preclinical testing | 1-2 years | 6-12 months | 1.5-2x faster |

The chapter presents three specific AI techniques for the drug discovery process:

1. Molecular Generation (for candidate design):

The chapter shows transformer-based and GAN-based models generating novel molecules with desired properties:

def generate_drug_candidate(target_properties):
    # Generate novel molecules with:
    # - High binding affinity
    # - Low toxicity
    # - Oral bioavailability
    ...  # generation logic elided in this excerpt

This creates the candidate pool (thousands of potential molecules).

2. Binding Affinity Prediction (for filtering):

The chapter demonstrates graph neural networks for protein-ligand binding:

import torch

class ProteinLigandBindingPredictor(torch.nn.Module):
    # Predicts binding affinity in silico
    # Eliminates weak binders before synthesis
    ...  # layers and forward pass elided in this excerpt

This filters candidates from thousands to hundreds based on predicted binding strength.

3. Toxicity Prediction (for safety screening):

The chapter discusses multi-task learning for toxicity endpoints:

  • Hepatotoxicity (liver damage)
  • Cardiotoxicity (heart issues)
  • Mutagenicity (DNA damage)
  • hERG inhibition (cardiac arrhythmia)

This eliminates toxic compounds before animal testing, reducing attrition.

Why Option C is Correct:

Integrated Pipeline Workflow:

The chapter’s success stories (Insilico Medicine, MELLODDY) used exactly this integrated approach:

Stage 1: Generative Design

  • Generate 10,000 candidate molecules using VAE/GAN/Transformer
  • Constrain generation to target properties (binds specified protein, drug-like properties)

Stage 2: Binding Prediction

  • Use graph neural networks to predict binding affinity for all 10,000 candidates
  • Filter to top 500 with predicted strong binding

Stage 3: Toxicity Screening

  • Run multi-task toxicity models on 500 candidates
  • Eliminate compounds with predicted toxicity
  • Narrow to top 50-100 candidates

Stage 4: Experimental Validation

  • Synthesize top 50-100 candidates
  • Wet lab testing for binding (biochemical assay)
  • Toxicity testing (cell cultures, animal models)
  • Advance top 5-10 to further development

Efficiency Gains:

This pipeline reduces the experimental burden from testing 10,000 compounds to testing 50-100 high-probability candidates, a 100-200x reduction in lab work.
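The funnel can be sketched as a chain of filters. In the sketch below the two predictor callbacks stand in for the graph neural network and the multi-task toxicity model. The random scores, the 0.2 toxicity cutoff, and the function name are placeholders invented for illustration, not real model outputs or chapter-specified thresholds.

```python
import random

def shortlist(candidates, predict_binding, predict_toxicity,
              keep_binders=500, max_toxicity=0.2, keep_final=100):
    """Rank by predicted binding, drop predicted-toxic compounds, cap the list."""
    ranked = sorted(candidates, key=predict_binding, reverse=True)
    strong_binders = ranked[:keep_binders]                      # Stage 2
    safe = [c for c in strong_binders
            if predict_toxicity(c) <= max_toxicity]             # Stage 3
    return safe[:keep_final]                                    # to the wet lab

# Stage 1 stand-in: 10,000 "generated" candidates with placeholder scores
rng = random.Random(0)
candidates = [
    {"id": i, "binding": rng.random(), "toxicity": rng.random()}
    for i in range(10_000)
]

final = shortlist(
    candidates,
    predict_binding=lambda c: c["binding"],
    predict_toxicity=lambda c: c["toxicity"],
)
reduction = len(candidates) / len(final)
print(len(final), f"{reduction:.0f}x fewer compounds to synthesize")
```

Because the filters run in silico, the expensive synthesis and assay steps only see the shortlist. The quality of that shortlist depends entirely on how well the predictive models are validated.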

Real-World Validation:

The chapter cites Insilico Medicine’s INS018_055:

  • First AI-designed drug in Phase II trials
  • Timeline: 18 months from start to Phase I (vs. 4-5 years traditional)
  • Cost: ~$2.6M (vs. $20-40M traditional)
  • Method: “Generative chemistry + reinforcement learning”

This used an integrated pipeline, not any single technique.

Why Other Options Fail:

Option (a)—Generative models only, then full wet lab:

This uses AI for only one stage (generation) but then tests everything experimentally. Problems:

  1. Wasteful: Generating 10,000 candidates then synthesizing and testing all of them defeats the purpose of AI acceleration. The chapter emphasizes AI should reduce experimental burden.

  2. Expensive: Synthesizing 10,000 compounds costs millions. The chapter’s cost reduction ($2.6M vs. $20-40M) comes from using AI to filter before synthesis.

  3. Slow: Testing 10,000 compounds in wet lab takes years, negating AI’s speed advantage.

The chapter’s approach is generate → AI-filter → validate top candidates experimentally, not generate → test everything.

Option (b)—AlphaFold only, then manual design:

This misunderstands how different AI tools fit together:

  1. AlphaFold provides structure, not candidates: The chapter presents AlphaFold for protein structure prediction. This is useful for understanding the target, but doesn’t generate drug molecules.

  2. Manual design bottleneck: Traditional medicinal chemistry can design maybe 10-50 candidates per year. AI generative models can design thousands per week. The chapter’s timeline acceleration requires AI-scale generation.

  3. Ignores AI’s strengths: The chapter shows AI excels at exploring vast chemical space (10^60 possible molecules) that humans cannot manually navigate.

While AlphaFold is valuable (chapter mentions it for understanding targets in neglected diseases), it’s only one piece. You still need generative models for candidates and predictive models for screening.

Option (d)—LLM literature search for repurposing:

Drug repurposing is valuable but doesn’t address the scenario’s need for novel candidates:

  1. Limited scope: Repurposing searches existing drugs. For neglected tropical diseases, the chapter emphasizes there may be no existing drugs targeting these proteins (hence “neglected”).

  2. Not generative: LLMs searching literature don’t create new molecules—they find existing ones. The chapter’s generative models (VAE, GAN, Transformers) create novel chemical structures.

  3. Missed AI capabilities: The scenario explicitly calls for generating novel drug candidates, not just finding existing ones. The chapter’s AI drug discovery section focuses on de novo design, not just repurposing.

PubMed search is useful for background research, but the chapter’s drug discovery pipeline is about AI-designed molecules, not literature mining.

The Chapter’s Integration Theme:

The chapter emphasizes that different AI techniques work together:

AlphaFold (protein structure) + Generative models (molecule design) + Graph neural networks (binding prediction) + Multi-task learning (toxicity) = Complete pipeline

The success story section shows this integration:

  - AlphaFold: Understand target structure
  - Generative chemistry: Design binders
  - Predictive models: Filter for properties
  - Experimental validation: Confirm top hits
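The integration described here can be read as a filtering funnel. The sketch below is a minimal illustration in Python, not the chapter's or any vendor's implementation: every function and class name is hypothetical, each predictor returns random placeholder scores where a real system would call a trained model, and the thresholds are arbitrary assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical molecule record; a real pipeline would carry a structure."""
    name: str
    binding_score: float = 0.0  # placeholder: higher means stronger predicted binding
    toxicity_risk: float = 0.0  # placeholder: higher means more likely to be toxic


def generate_candidates(n):
    """Stand-in for a generative model (VAE, GAN, or Transformer)."""
    return [Candidate(name=f"mol-{i:04d}") for i in range(n)]


def predict_binding(candidates, rng):
    """Stand-in for a binding predictor such as a graph neural network."""
    for c in candidates:
        c.binding_score = rng.random()  # random placeholder, not a real prediction
    return candidates


def predict_toxicity(candidates, rng):
    """Stand-in for a multi-task toxicity model."""
    for c in candidates:
        c.toxicity_risk = rng.random()  # random placeholder, not a real prediction
    return candidates


def run_pipeline(n=1000, min_binding=0.8, max_toxicity=0.2, top_k=10, seed=0):
    """Generate, score, filter, and rank; thresholds are illustrative assumptions."""
    rng = random.Random(seed)
    generated = generate_candidates(n)
    scored = predict_binding(generated, rng)
    binders = [c for c in scored if c.binding_score >= min_binding]
    screened = predict_toxicity(binders, rng)
    safe = [c for c in screened if c.toxicity_risk <= max_toxicity]
    # The shortlist is what would go forward to experimental validation.
    shortlist = sorted(safe, key=lambda c: c.binding_score, reverse=True)[:top_k]
    return {
        "generated": len(generated),
        "binders": len(binders),
        "safe": len(safe),
        "shortlist": shortlist,
    }


result = run_pipeline()
print(result["generated"], result["binders"], result["safe"], len(result["shortlist"]))
```

The point of the sketch is structural: removing any one stage either leaves nothing to screen (no generation) or sends unfiltered candidates to the lab (no prediction), which is why a single tool cannot substitute for the whole pipeline.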

Practical Implementation:

For neglected tropical diseases, the chapter emphasizes AI can finally make drug discovery economically feasible:

Traditional approach: $2.6B average cost → Not economically viable for diseases affecting poor populations

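AI approach: ~$2.6M to Phase I → roughly 1000x below the traditional figure, which makes neglected diseases financially tractable. Note that the $2.6B figure is usually quoted for full development through approval, while this one stops at Phase I, so the ratio overstates the like-for-like saving.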

But this requires the full pipeline (option C), not just one AI technique.

For practitioners:

The chapter’s message is clear: AI drug discovery isn’t a single tool but an integrated ecosystem. Each stage uses the appropriate AI technique:

  - Generation: Transformers, GANs, VAEs
  - Binding: Graph neural networks, molecular docking
  - Toxicity: Multi-task learning, QSAR models
  - Optimization: Reinforcement learning, active learning

The Insilico Medicine case study shows that this integrated approach achieved a 2x timeline acceleration and a 10x cost reduction. For neglected diseases where economics prevented traditional drug development, this integrated AI pipeline enables discovery that was previously impossible.

Note: Question 6

A public health agency wants to prepare for emerging AI technologies over the next 5 years. They have limited budget and need to prioritize investments. According to the chapter’s responsible innovation framework, which preparation strategy would be MOST effective?

  a) Purchase expensive quantum computing access now to gain first-mover advantage before competitors
  b) Invest heavily in one breakthrough technology (e.g., digital twins) with all resources to achieve deep expertise
  c) Start small pilot projects on mature technologies (TRL 6-7), build workforce capacity, and establish partnerships with AI researchers while monitoring emerging technologies
  d) Wait until all emerging technologies are fully mature (TRL 9) and standards are established before investing

Correct Answer: c) Start small pilot projects on mature technologies (TRL 6-7), build workforce capacity, and establish partnerships with AI researchers while monitoring emerging technologies

This question tests understanding of the chapter’s technology readiness assessment framework, responsible innovation principles, and practical preparation strategies for public health agencies.

The Chapter’s Preparation Framework:

Section 9 (“Preparing for Emerging Technologies”) and Section 10 (“Responsible Innovation”) provide explicit guidance on how public health agencies should approach emerging AI.

Key Principles from the Chapter:

1. Technology Readiness Assessment:

The chapter presents NASA’s TRL scale:

  - TRL 1-3: Research (quantum computing for epidemiology)
  - TRL 4-6: Development (federated learning prototypes)
  - TRL 7-9: Deployment (GPT-4 for clinical documentation)

Chapter guidance: “This chapter focuses on TRL 3-7 technologies: proven concepts moving toward practical deployment.”
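As a quick illustration, the three TRL bands listed in this answer can be written as a simple lookup. The function name `trl_stage` and the lowercase labels are hypothetical; only the band boundaries (1-3, 4-6, 7-9) come from the text.

```python
def trl_stage(trl: int) -> str:
    """Map a Technology Readiness Level (1-9) to the band used in this answer."""
    if not 1 <= trl <= 9:
        raise ValueError("TRL must be between 1 and 9")
    if trl <= 3:
        return "research"
    if trl <= 6:
        return "development"
    return "deployment"


for level in (2, 5, 8):
    print(level, trl_stage(level))
```

A technology is often quoted as a range (for example TRL 6-7), in which case it can straddle two bands; the lookup only classifies a single level.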

2. Pilot Project Framework:

The chapter explicitly outlines “Start small, learn fast, scale what works”:

Phase 1: Proof of Concept (3-6 months)

  - Select well-defined, limited-scope problem
  - Use existing data and tools where possible
  - Set success criteria upfront

Phase 2: Pilot Implementation (6-12 months)

  - Deploy in controlled setting (1-2 sites)
  - Monitor performance closely
  - Iterate based on learnings

Phase 3: Scale and Sustain (12+ months)

  - Expand to additional sites
  - Integrate into workflows
  - Establish maintenance procedures

3. Workforce Development:

The chapter emphasizes building organizational capacity:

Roles needed:

  - AI-literate public health practitioners
  - Technical specialists
  - Translators (bridge public health and AI)
  - Ethicists

Training approaches:

  - Upskill existing workforce
  - Partner with academic institutions
  - Engage consultants for specific projects

4. Responsible Innovation Framework:

Stilgoe et al.’s four dimensions (chapter Section 10):

  1. Anticipation: Foresee potential impacts
  2. Reflexivity: Question assumptions
  3. Inclusion: Engage diverse stakeholders
  4. Responsiveness: Adapt based on evidence

Why Option C is Correct:

Option C embodies ALL of these principles in a balanced, practical strategy:

“Start small pilot projects on mature technologies (TRL 6-7)”

  - Aligned with chapter’s TRL focus
  - Near-term value (mature technologies work now)
  - Low risk (proven technologies)
  - Learning opportunity (gain hands-on experience)

“Build workforce capacity”

  - Addresses chapter’s emphasis on workforce as limiting factor
  - Sustainable (builds internal capability, not just vendor dependence)
  - Enables future technology adoption (trained staff can evaluate new technologies)

“Establish partnerships with AI researchers”

  - Chapter explicitly recommends this: “Partner with AI companies and vendors,” “Academic research institutions”
  - Access to expertise without full-time hiring costs
  - Influence technology development toward public health needs
  - Early awareness of emerging technologies

“While monitoring emerging technologies”

  - Responsible innovation’s “anticipation” dimension
  - Preparation without premature commitment
  - Example from chapter: “Prepare for quantum computing by learning fundamentals, identifying potential problems” (not deploying)

This is the chapter’s recommended balanced approach.

Real-World Validation:

The chapter’s success stories followed this pattern:

FeTS (Federated Tumor Segmentation):

  - Started with small pilot (brain tumor segmentation)
  - Built consortium of 30 institutions gradually
  - Established partnerships with research institutions
  - Technology was TRL 5-6 (development stage, not research or fully mature)

MELLODDY:

  - Started with 10 pharmaceutical partners
  - Pilot on drug discovery use case
  - Built expertise in federated learning
  - Scaled after demonstrating value

Neither jumped immediately to cutting-edge (TRL 1-3) technologies nor waited for complete maturity (TRL 9).

Why Other Options Fail:

Option (a)—Quantum computing first-mover advantage:

This violates multiple chapter principles:

  1. Wrong TRL: Quantum computing is TRL 2-3 (basic research). Chapter timeline: “2033-2040: Practical quantum advantage for specific public health problems.” With 5-year timeframe, this is too early.

  2. Expensive with no near-term ROI: “Purchase expensive quantum computing access” wastes limited budget on technology that won’t deliver value in 5 years.

  3. “First-mover advantage” fallacy: The chapter emphasizes public health should engage to shape development, not compete with tech companies. “First mover” makes sense for commercial products, not public health research.

  4. Neglects fundamentals: Chapter states quantum preparation should be “learning quantum algorithms fundamentals, identifying problems with potential quantum speedup”—not purchasing access.

Option (b)—All resources on one technology:

This is a high-risk, inflexible strategy that violates responsible innovation:

  1. Lack of diversification: What if digital twins don’t pan out for your use case? All resources wasted.

  2. Ignores technology evolution: Five years is a long time in AI. The chapter describes AI capabilities as doubling roughly every 6 months. Locking into one technology in 2025 means potentially missing better alternatives emerging in 2027.

  3. Violates “reflexivity”: Responsible innovation requires questioning assumptions and adapting. All-in commitment prevents adaptation.

  4. Organizational risk: The chapter warns about vendor lock-in and concentration of power. Going all-in on one technology (especially from one vendor) creates dependency.

The chapter never recommends this approach—instead showing how different technologies serve different needs (federated learning for privacy, edge AI for resource-limited settings, causal AI for policy evaluation).

Option (d)—Wait for TRL 9 maturity:

This is the opposite extreme—too conservative:

  1. Missed opportunity: By waiting for TRL 9 (full operational deployment), you miss the chance to shape technology development. The chapter emphasizes: “Early adopters who understand emerging technologies can: Shape development toward public health needs, Prepare infrastructure and workforce.”

  2. Always behind: By the time technology reaches TRL 9, it’s mature and commoditized. New technologies are already emerging. This creates perpetual catch-up.

  3. No workforce development: If you wait until technology is fully mature to start learning, your workforce is always behind. The chapter emphasizes upskilling existing workforce takes time.

  4. Contradicts chapter’s message: “Public health should not be a passive recipient of AI technologies developed elsewhere. Engage, shape, and build the future you want to see.”

The Chapter’s 5-Year Timeline Guidance:

The chapter’s “Most Promising Near-Term (2025-2028)” section recommends:

  1. Multimodal foundation models (TRL 7-8)
  2. Federated learning (TRL 5-6)
  3. Edge AI (TRL 6-7)

All fall between TRL 5 and TRL 8: neither cutting-edge research (TRL 1-3) nor fully mature standards (TRL 9), but proven concepts moving to deployment.

Practical 5-Year Plan (from chapter):

Years 1-2:

  - Pilot multimodal AI for documentation (TRL 7-8, immediate value)
  - Start federated learning pilot with 2-3 partner institutions (build expertise)
  - Train workforce on AI fundamentals and ethics
  - Establish partnerships with universities and AI companies

Years 3-4:

  - Scale successful pilots
  - Evaluate emerging technologies (digital twins, causal AI, edge AI)
  - Start pilots on newly mature technologies
  - Deepen workforce expertise

Years 5+:

  - Mature capabilities in proven technologies
  - Monitor quantum computing, neuromorphic computing (still research stage)
  - Continuous learning and adaptation

This balanced approach delivers near-term value (years 1-2) while building capacity for future adoption (years 3-5).

For practitioners:

The chapter’s preparation philosophy is:

  - Not too early: Don’t invest in TRL 1-3 technologies that won’t deliver in your timeframe
  - Not too late: Don’t wait for TRL 9 perfection; engage during development (TRL 4-7)
  - Build capacity: Technology is less limiting than workforce and organizational readiness
  - Diversify: Different technologies for different problems
  - Learn by doing: Pilots provide hands-on experience
  - Partner: Leverage external expertise while building internal capability
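The "not too early, not too late" rule can be sketched as a small triage function. This is an illustrative reading rather than a procedure the chapter defines: the function name `recommend_action`, the action labels, and the treatment of TRL 8-9 as "adopt" are assumptions. The TRL ranges in the example are the ones quoted in this answer.

```python
def recommend_action(trl_low: int, trl_high: int) -> str:
    """Suggest an investment posture for a technology quoted as a TRL range."""
    if not 1 <= trl_low <= trl_high <= 9:
        raise ValueError("expected 1 <= trl_low <= trl_high <= 9")
    if trl_high <= 3:
        return "monitor"  # not too early: learn fundamentals, do not buy access
    if trl_low >= 8:
        return "adopt"    # assumption: mature enough for routine deployment
    return "pilot"        # not too late: engage while the technology develops


# TRL ranges as quoted in this answer.
technologies = {
    "multimodal foundation models": (7, 8),
    "federated learning": (5, 6),
    "edge AI": (6, 7),
    "quantum computing": (2, 3),
}

for name, (low, high) in technologies.items():
    print(f"{name}: {recommend_action(low, high)}")
```

Under these assumptions the first three technologies come out as "pilot" and quantum computing as "monitor". TRL is only one input: the workforce and partnership items in the list above are not captured by a rule this simple.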

Option C embodies this balanced, practical, sustainable approach that the chapter explicitly recommends for public health agencies with limited budgets preparing for AI’s future.

17.17 Chapter Summary