Lecture 6 — Medical AI & Digital Twins (Van Rooij + Bontje)

Paper: Wang, W., He, F., Li, Y., Tang, S., Li, X., Xia, J., & Lv, Z. (2023). Data information processing of traffic digital twins in smart cities using edge intelligent federation learning. Information Processing & Management, 60(1), 103171.

Type: Thematic (orange). Two short half-lectures combined into one session:

  • Van Rooij: ML applied to medical / neuro data (classification and stratification).
  • Bontje: digital twins of urban traffic (sensor-driven simulations of cities).

Both halves share the Promises vs. Risks framing that has structured the thematic lectures.


Half A — AI in Medicine (Van Rooij)

Background

AI applications in healthcare have skyrocketed in the last decade:

  • AI tools to develop new medicines
  • AI tools to analyse wearables data
  • AI tools to diagnose patients
  • AI tools to investigate (neuro)biological underpinnings of disorders

Two methodological families

Method | Type | Purpose
Classification | Supervised ML | Predict a-priori group labels (patient vs. control, diagnosis) and identify which features are most predictive
Stratification | Unsupervised ML | Identify hidden / underlying subgroups within a population (data-driven phenotyping)
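
A minimal sketch of the contrast, using toy one-dimensional data. All values, names, and the choice of nearest-centroid and 2-means are illustrative assumptions, not the methods from the lecture. Classification learns from labels that are given in advance; stratification has to find groups without them.

```python
# Toy illustration only: invented 1-D feature values, not lecture data.
labelled = [
    (1.0, "control"), (1.2, "control"), (0.8, "control"), (1.1, "control"),
    (3.0, "patient"), (3.2, "patient"), (2.9, "patient"), (3.1, "patient"),
]


def fit_centroids(samples):
    """Supervised step: average the feature per a-priori label."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(value, centroids):
    """Predict the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def two_means(values, iterations=10):
    """Unsupervised step: split unlabelled values into two clusters (k-means, k=2)."""
    low, high = min(values), max(values)  # initial cluster centres
    for _ in range(iterations):
        cluster_low = [v for v in values if abs(v - low) <= abs(v - high)]
        cluster_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(cluster_low) / len(cluster_low)
        high = sum(cluster_high) / len(cluster_high)
    return sorted(cluster_low), sorted(cluster_high)


# Classification: labels are used for training, then a new case is predicted.
centroids = fit_centroids(labelled)
prediction = classify(2.8, centroids)

# Stratification: labels are thrown away; subgroups come from the data alone.
unlabelled = [value for value, _ in labelled]
subgroup_a, subgroup_b = two_means(unlabelled)
```

Here the two recovered subgroups happen to coincide with the original labels because the toy data is cleanly separated; in real stratification the subgroups need not match any existing diagnosis.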

Three worked examples

Topic 1 — Classifying ADHD patients from fMRI

  • Data: fMRI activation maps during an Inhibition task.
  • Classifier: Gaussian Process Classifier (a Bayesian ML method).
  • Outcome: patient diagnosis + a feature-weight map showing which voxels drove the prediction.
  • Performance: N = 700; accuracy 77.1%; sensitivity 75%; specificity 80%; ROC AUC ≈ 0.82.
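
To make the performance terms concrete, the sketch below computes accuracy, sensitivity, and specificity from a confusion matrix. The counts are hypothetical and chosen only to give round numbers; they are not the study's data, and ROC AUC is not computed here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total     # share of all subjects classified correctly
    sensitivity = tp / (tp + fn)     # share of true patients detected
    specificity = tn / (tn + fp)     # share of true controls correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts: 100 patients and 100 controls.
accuracy, sensitivity, specificity = classification_metrics(tp=75, fn=25, tn=80, fp=20)

# Accuracy is the prevalence-weighted average of sensitivity and specificity.
prevalence = (75 + 25) / 200
weighted = prevalence * sensitivity + (1 - prevalence) * specificity
```

Because accuracy depends on the mix of patients and controls, the same classifier can report a different accuracy on a differently composed sample, which is one reason sensitivity and specificity are reported alongside it.
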
Promises | Risks
Better classification of patients | Determinism / wrongful interpretation (a 77%-accurate model is not "the brain causes ADHD")
More insight into neurobiology of disorders | Inaccurate individual predictions
More insight into between-subject heterogeneity | Malignant use (e.g. by insurance companies)

Topic 2 — Predicting COVID-19 cases & deaths from demographics

  • Data: COVID-19 clinical data + community-level demographics.
  • Method: logistic regression.
  • Outcome: odds-ratios / risk estimates for community-level COVID cases and deaths.
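
In logistic regression, a coefficient β for a predictor corresponds to an odds ratio exp(β). For a single binary predictor this equals the cross-product ratio of the 2×2 table, which the sketch below computes. The counts and the function name `odds_ratio` are hypothetical, not taken from the COVID-19 study.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of being a case in the exposed group divided by the odds in the unexposed group."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical community counts: 30 of 100 exposed and 10 of 100 unexposed are cases.
ratio = odds_ratio(30, 70, 10, 90)

# The matching logistic-regression coefficient is the log of the odds ratio.
beta = math.log(ratio)
recovered = math.exp(beta)
```

An odds ratio above 1 describes an association in the data; on its own it says nothing about whether the predictor causes the outcome.
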
Promises | Risks
Better understanding of COVID-19 spread | Discrimination / inequality (some groups labelled "high-risk" → over-policing or exclusion)
Targeted interventions in at-risk populations | Wrongful attribution of causality (black-box confounders)
Improvement of healthcare system | Wrongful biological determinism
 | "Garbage in, garbage out": biased input data → biased policy

Topic 3 — Stratifying autism (ASD) subjects from brain structure

  • Data: structural morphometry of 53 brain segments.
  • Methods: normative modelling (deviation from a learned normative range) + spectral clustering.
  • Outcome: data-driven clusters of patients with distinct clinical profiles.
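
The "deviation from a normative range" idea can be sketched as a z-score against a reference cohort. This is a deliberately simplified assumption: real normative models typically also condition on covariates such as age and sex, and the reference values, the |z| > 2 cut-off, and the function name below are invented for illustration. Clustering would then run on each subject's profile of deviations across brain segments, which is not shown here.

```python
import statistics


def deviation_scores(reference_values, subject_values):
    """Express each subject value as a z-score relative to the reference cohort."""
    mean = statistics.mean(reference_values)
    sd = statistics.stdev(reference_values)  # sample standard deviation
    return [(value - mean) / sd for value in subject_values]


# Hypothetical measurements of one brain segment (arbitrary units).
reference = [10, 12, 11, 9, 13, 10, 11, 12]
subjects = [15.0, 11.5]

z_scores = deviation_scores(reference, subjects)

# Assumed cut-off: flag values more than two standard deviations from the norm.
atypical = [abs(z) > 2 for z in z_scores]
```
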
Promises | Risks
Insight into between-subject heterogeneity | Statistically spurious clusters that don't replicate
Targeted interventions based on subject profile | 

Van Rooij's wrap-up

  • Many different applications for AI in the medical field.
  • More data, better access, better methods.
  • Benefits to science and health are real, but so are the risks.
  • Understanding the AI methods themselves is necessary to mitigate the risks — black-box clinical decisions cannot be ethically justified.

Half B — Digital Twins of (Urban) Traffic (Bontje)

What is a digital twin?

A virtual, bi-directional model of a physical city that visualises urban processes in real time and supports planning, management, and decision-making. "Bi-directional" = the twin both reflects sensor data from the city and feeds decisions back to it.

Why for cities?

  • Cities are complex and constantly changing.
  • DT data makes that complexity visible.
  • DTs support faster and better-informed decisions — traffic, air pollution, accessibility, safety, livability.

Traffic specifically

  • Traffic is one of the most important urban systems.
  • Goal: predict and control traffic flow.
  • Use cases: safety and accessibility analysis; event planning / crowd management; scenario testing ("what if we close a road?").
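
Scenario testing of the "what if we close a road?" kind can be sketched as a shortest-path query on a road graph, run once with and once without the closed road. The four-node network, the travel times, and the function name `shortest_time` are invented for illustration; a real traffic DT would rely on calibrated simulation rather than static travel times.

```python
import heapq


def shortest_time(edges, start, goal):
    """Dijkstra's algorithm on an undirected graph; returns total minutes or None."""
    graph = {}
    for (u, v), minutes in edges.items():
        graph.setdefault(u, []).append((v, minutes))
        graph.setdefault(v, []).append((u, minutes))

    best = {start: 0}
    queue = [(0, start)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == goal:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry, a faster route was already found
        for neighbour, minutes in graph.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None  # goal unreachable


# Hypothetical road network: travel time in minutes per road segment.
roads = {("A", "B"): 4, ("B", "D"): 3, ("A", "C"): 5, ("C", "D"): 6}

baseline = shortest_time(roads, "A", "D")

# Scenario: close the road between B and D and re-run the same query.
roads_closed = {edge: t for edge, t in roads.items() if edge != ("B", "D")}
scenario = shortest_time(roads_closed, "A", "D")
```

Comparing `baseline` with `scenario` gives the extra travel time caused by the closure for this one origin-destination pair.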

The Dutch state of play (DMI programme)

  • No "network" of DTs yet, but lots of shared assets: 3D city models, dashboards, simulations, fieldlabs / pilots.
  • Aim: move from isolated pilots / closed systems toward open, modular, reusable systems — reusable building blocks, shared standards, "Digital Twin as a Service," a European Digital Twin Appstore.

Emerging techniques

  • 2D/3D visualisation.
  • AI and language models as analysis layers.
  • Dynamic sensor data with local AI — sensors collect real-time traffic data; local computing (edge processing) close to the street processes it; AI recognises traffic situations; the system feeds back to vehicles or traffic infrastructure.
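
The sense, process locally, recognise, feed back loop from the last bullet can be sketched as below. The `EdgeNode` class, the threshold, and the action names are hypothetical, and a simple moving-average rule stands in for the AI model that would recognise traffic situations in a real deployment.

```python
from collections import deque


class EdgeNode:
    """Hypothetical roadside unit that processes sensor counts locally."""

    def __init__(self, window=3, congested_above=20.0):
        self.recent = deque(maxlen=window)      # only a short local history is kept
        self.congested_above = congested_above  # assumed threshold, vehicles per minute

    def ingest(self, vehicles_per_minute):
        """Take one sensor reading and return (recognised state, feedback action)."""
        self.recent.append(vehicles_per_minute)
        average = sum(self.recent) / len(self.recent)
        # Stand-in for the AI step: a threshold on the moving average.
        state = "congested" if average > self.congested_above else "free_flow"
        # Feedback to infrastructure, e.g. a traffic light controller.
        action = "extend_green" if state == "congested" else "keep_plan"
        return state, action


node = EdgeNode()
decisions = [node.ingest(count) for count in [10, 12, 30, 35, 40]]
```

The raw counts never leave the node in this sketch; only the recognised state and the chosen action would be passed on.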

Challenges

  • Many twins are still pilots.
  • Expensive custom solutions.
  • Dependence on one software provider (vendor lock-in).
  • Need for standards.
  • Uncertainty in models.

SWOT analysis of DTs of urban traffic (Bontje's framing)

 | Positive | Negative
Internal | Strengths: scenario testing; better data-based decisions | Weaknesses: limited real-time traffic data; model uncertainty
External | Opportunities: national network of local DTs; reusable standards/modules; AI + sensor data | Threats: privacy risks; overreliance on models

IOS connections explicitly drawn by Bontje

  • Open Cities — test redevelopment scenarios before implementation; compare effects on movement, accessibility, safety, livability.
  • Behaviour and Institutions — embed cognitive models of pedestrians/cyclists in DTs to simulate how people perceive, decide and move (this is Bontje's PhD topic; bridges to L2 cognitive modelling).
  • Fair Transitions — DTs make redevelopment impacts visible across different user groups, supporting inclusive and transparent decisions.

Paper 7 — Wang et al. (2023): Edge intelligent federation learning for traffic DTs

⚠️ Reconstructed from the title and general knowledge — confirm against the paper.

What's in the title

  • Traffic digital twins in smart cities — same object Bontje describes.
  • Data information processing — the central problem: each car, sensor, intersection produces data; how do you fuse it into a coherent, real-time twin?
  • Edge intelligent federation learning — the methodological contribution. Federated learning trains a shared model across many local devices without moving raw data to a central server; "edge intelligent" puts the inference and partial training on the local sensor/device.

Why federated + edge for DTs?

  • Bandwidth: streaming all raw sensor data to a central server is infeasible at city scale.
  • Latency: traffic-control decisions need millisecond responses; round-tripping to the cloud is too slow.
  • Privacy: GDPR + the privacy concern Bontje flagged as a Threat in the SWOT — federation keeps raw data local; only gradient updates leave the device.
  • Resilience: a local intersection's model keeps working even if the central system is unreachable.

Likely paper structure

  • Architecture: device → edge → city layer.
  • Federation protocol: clients compute local gradients on local traffic data; aggregator combines them into a global model; global model is deployed back to clients.
  • Evaluation: prediction accuracy (traffic flow, congestion, accident risk) vs. centralised baselines, communication cost, latency, privacy guarantees.
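
The federation protocol described above can be sketched as a loop of local gradient computation, aggregation, and redeployment. This is a generic gradient-averaging illustration on a one-parameter toy model, not Wang et al.'s algorithm; the client data, learning rate, and function names are all assumptions.

```python
def local_gradient(weight, local_data):
    """Client side: mean-squared-error gradient for y = weight * x on local data only."""
    return sum(2 * (weight * x - y) * x for x, y in local_data) / len(local_data)


def aggregate(gradients, sample_counts):
    """Aggregator side: average the client gradients, weighted by local sample count."""
    total = sum(sample_counts)
    return sum(g * n for g, n in zip(gradients, sample_counts)) / total


# Hypothetical local datasets at three intersections; every pair follows y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
    [(1.5, 3.0)],
]
sample_counts = [len(data) for data in clients]

global_weight = 0.0
learning_rate = 0.01
for _ in range(200):
    # Each client receives the current global model and returns only a gradient.
    gradients = [local_gradient(global_weight, data) for data in clients]
    # The aggregator updates the global model, which is then redeployed.
    global_weight -= learning_rate * aggregate(gradients, sample_counts)
```

The raw `(x, y)` pairs are only ever read inside `local_gradient`; the aggregator sees gradients and sample counts, which is the property the privacy argument relies on.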

Why this paper closes the loop with L1's CDR

  • Stewardship of citizen-generated traffic data → federation/edge is how you operationalise stewardship at city scale.
  • Transparency / Truth → DTs are only legitimate if their predictions are auditable; federated systems can be more auditable than monolithic clouds but only if standards demand it.
  • Bontje's Threats (privacy risks, overreliance) → federated edge architecture mitigates the first but not the second.

Why this matters for an open society

L6 is methodologically diverse but normatively unified by the Promises × Risks lens:

  • Medical AI → Transitions & Wellbeing pillar (healthcare delivery), with Equity & Diversity stakes (discriminatory misuse, biological determinism).
  • Digital twins → Open Cities and Behaviour & Institutions platforms, with privacy and overreliance as Equity & Democracy threats.
  • The unifying argument across both halves: AI in high-stakes thematic domains demands method-literate citizens, scientists, and policymakers — you cannot evaluate the legitimacy of a clinical classifier or a smart-city DT without understanding the basics of the underlying algorithm. This is essentially the course's core normative claim, re-stated in a thematic key.

Likely essay-question angles

  1. "Distinguish classification from stratification in medical AI. Apply each to a concrete example from Van Rooij's lecture and discuss its risks."
  2. "What is a digital twin of a city? Use Bontje's SWOT to argue whether the Dutch DMI programme is more likely to advance or threaten the IOS pillar 'Open Cities'."
  3. "Wang et al. (2023) use federated edge learning for traffic DTs. Explain how this architecture mitigates some of Bontje's listed threats but not others. Tie back to Elliott's TRUST."
  4. "Compare medical AI and traffic DTs as deployments of opaque ML systems in high-stakes public domains. Which of Grimmelikhuijsen & Meijer's six threats is most acute for each?"

Quick self-test

  1. Difference between classification and stratification in ML — give one medical example of each.
  2. Three risks Van Rooij identifies for classification of psychiatric patients from neuroimaging.
  3. Definition of a digital twin — and what does "bi-directional" mean here?
  4. Bontje's SWOT: name one item in each of the four cells.
  5. Why federated learning + edge inference for traffic DTs? Three reasons.
  6. Which IOS platforms does each half of L6 most directly speak to?

Source slides

Bontje — Digital Twins

AIOS lecture6_Bontje_2026.pdf

Van Rooij — AI in Medicine

AIOS_lecture_AI-in-Medicine_2026_DvR.pdf