Lecture 6 — Medical AI & Digital Twins (Van Rooij + Bontje)
Paper: Wang, W., He, F., Li, Y., Tang, S., Li, X., Xia, J., & Lv, Z. (2023). Data information processing of traffic digital twins in smart cities using edge intelligent federation learning. Information Processing & Management, 60(1), 103171.
Type: Thematic (orange). Two short half-lectures combined into one session:
- Van Rooij: ML applied to medical / neuro data (classification and stratification).
- Bontje: digital twins of urban traffic (sensor-driven simulations of cities).
Both halves share the Promises vs. Risks framing that has structured the thematic lectures.
Half A — AI in Medicine (Van Rooij)
Background
AI applications in healthcare have skyrocketed in the last decade:
- AI tools to develop new medicines
- AI tools to analyse wearables data
- AI tools to diagnose patients
- AI tools to investigate (neuro)biological underpinnings of disorders
Two methodological families
| Method | Type | Purpose |
|---|---|---|
| Classification | Supervised ML | Predict a-priori group labels (patient vs. control, diagnosis) and identify which features are most predictive |
| Stratification | Unsupervised ML | Identify hidden / underlying subgroups within a population (data-driven phenotyping) |
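The contrast in the table can be made concrete with a toy sketch. It is not taken from the lecture: the one-dimensional values, the labels, and the helper names (`fit_centroids`, `classify`, `two_means`) are all made up for illustration, and a nearest-centroid rule and 2-means clustering stand in for the more sophisticated methods used in the worked examples. The point is only that classification needs the labels, while stratification groups the same data without them.

```python
# Toy 1-D illustration; all numbers are made up and carry no clinical meaning.
values = [1.0, 1.2, 0.8, 3.9, 4.1, 4.3]
labels = ["control", "control", "control", "patient", "patient", "patient"]


def mean(xs):
    return sum(xs) / len(xs)


def fit_centroids(xs, ys):
    """Supervised: use the known labels to compute one centroid per group."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: mean(members) for label, members in groups.items()}


def classify(x, centroids):
    """Predict the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(xs, iterations=10):
    """Unsupervised: 1-D k-means with k=2; the labels are never used."""
    centres = [min(xs), max(xs)]  # simple deterministic initialisation
    assignment = []
    for _ in range(iterations):
        assignment = [
            0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1 for x in xs
        ]
        for k in (0, 1):
            members = [x for x, a in zip(xs, assignment) if a == k]
            if members:
                centres[k] = mean(members)
    return assignment


centroids = fit_centroids(values, labels)
predicted = classify(3.5, centroids)  # supervised prediction for a new case
clusters = two_means(values)          # unsupervised grouping of the same data
print(predicted, clusters)
```

Note that the cluster ids (0 and 1) have no intrinsic meaning: deciding whether a data-driven subgroup corresponds to anything clinically relevant is a separate interpretive step.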
Three worked examples
Topic 1 — Classifying ADHD patients from fMRI
- Data: fMRI activation maps during an Inhibition task.
- Classifier: Gaussian Process Classifier (a Bayesian ML method).
- Outcome: patient diagnosis + a feature-weight map showing which voxels drove the prediction.
- Performance: N = 700; accuracy 77.1%; sensitivity 75%; specificity 80%; ROC AUC ≈ 0.82.
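As a reminder of how the reported performance figures relate to one another, here is a minimal sketch that computes accuracy, sensitivity and specificity from confusion-matrix counts. The counts are hypothetical and are not the study's actual confusion matrix; `confusion_metrics` is an illustrative helper name.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of patients correctly flagged
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
    }


# Hypothetical counts, NOT the confusion matrix of the ADHD study.
metrics = confusion_metrics(tp=30, fn=10, tn=40, fp=10)
print(metrics)
```

Accuracy is a weighted mix of sensitivity and specificity that depends on how many patients and controls are in the sample. ROC AUC cannot be computed from a single confusion matrix, because it summarises performance across all decision thresholds.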
| Promises | Risks |
|---|---|
| Better classification of patients | Determinism / wrongful interpretation (a 77%-accurate classifier does not show that "the brain causes ADHD") |
| More insight into neurobiology of disorders | Inaccurate individual predictions |
| More insight into between-subject heterogeneity | Malignant use (e.g. by insurance companies) |
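The "inaccurate individual predictions" risk can be illustrated with Bayes' rule. The sketch below takes the sensitivity (75%) and specificity (80%) reported above and asks how often a positive prediction would be correct at two prevalence levels. Both prevalence values are assumptions chosen for illustration, not figures from the lecture.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive prediction is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed prevalences: 5% (general-population screening) vs 50% (balanced sample).
ppv_screening = positive_predictive_value(0.75, 0.80, prevalence=0.05)
ppv_balanced = positive_predictive_value(0.75, 0.80, prevalence=0.50)
print(round(ppv_screening, 3), round(ppv_balanced, 3))
```

Under these assumptions, most positive predictions in the low-prevalence setting would be false alarms, even though the same classifier looks reasonable on a balanced research sample.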
Topic 2 — Predicting COVID-19 cases & deaths from demographics
- Data: COVID-19 clinical data + community-level demographics.
- Method: logistic regression.
- Outcome: odds-ratios / risk estimates for community-level COVID cases and deaths.
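To make "odds ratio" concrete: for a logistic regression with a single binary predictor, the fitted coefficient equals the log of the odds ratio from the corresponding 2×2 table. The sketch below uses hypothetical counts (not data from the COVID-19 study) and illustrative names to show the link between the table, the coefficient, and a predicted risk.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of being a case in the exposed group divided by the odds in the unexposed group."""
    return (exposed_cases / exposed_noncases) / (unexposed_cases / unexposed_noncases)


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


# Hypothetical 2x2 table: 40/100 cases in group A, 20/100 cases in group B.
or_a_vs_b = odds_ratio(40, 60, 20, 80)

# Equivalent logistic-regression parameters for a single binary predictor.
intercept = math.log(20 / 80)   # log-odds in the reference group (B)
beta = math.log(or_a_vs_b)      # coefficient = log odds ratio

risk_group_a = sigmoid(intercept + beta)  # predicted probability for group A
print(round(or_a_vs_b, 3), round(risk_group_a, 3))
```

An odds ratio describes an association in the data. Whether the grouping variable causes the difference is exactly the attribution problem listed under Risks below.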
| Promises | Risks |
|---|---|
| Better understanding of COVID-19 spread | Discrimination / inequality (some groups labelled "high-risk" → over-policing or exclusion) |
| Targeted interventions in at-risk populations | Wrongful attribution of causality (black-box confounders) |
| Improvement of healthcare system | Wrongful biological determinism |
| | "Garbage in, garbage out": biased input data → biased policy |
Topic 3 — Stratifying autism (ASD) subjects from brain structure
- Data: structural morphometry of 53 brain segments.
- Methods: normative modelling (deviation from a learned normative range) + spectral clustering.
- Outcome: data-driven clusters of patients with distinct clinical profiles.
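The "deviation from a learned normative range" step can be sketched as a z-score against a reference cohort. This is a deliberate simplification: normative models typically estimate the expected value as a function of covariates such as age and sex, whereas the sketch below uses one overall mean and standard deviation. All values, subject ids, and the threshold of 2 are hypothetical, and the subsequent spectral clustering of deviation profiles is omitted.

```python
from statistics import mean, stdev

# Hypothetical reference-cohort values for ONE brain-segment measure.
reference = [2.4, 2.5, 2.6, 2.5, 2.7, 2.3, 2.5]


def deviation_score(value, reference_values):
    """z-score of a subject's value relative to the reference distribution."""
    mu = mean(reference_values)
    sigma = stdev(reference_values)  # sample standard deviation
    return (value - mu) / sigma


# Hypothetical subjects measured on the same brain segment.
subjects = {"s1": 2.5, "s2": 3.1, "s3": 1.9}
z = {sid: deviation_score(v, reference) for sid, v in subjects.items()}

# Flag subjects lying more than 2 standard deviations from the norm (assumed cutoff).
flagged = [sid for sid, score in z.items() if abs(score) > 2]
print(z, flagged)
```

In the full approach, each subject would get one such deviation score per brain segment, and the resulting deviation profiles would be the input to clustering.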
| Promises | Risks |
|---|---|
| Insight into between-subject heterogeneity | Statistically spurious clusters that don't replicate |
| Targeted interventions based on subject profile | |
Van Rooij's wrap-up
- Many different applications for AI in the medical field.
- More data, better access, better methods.
- Benefits to science and health are real, but so are the risks.
- Understanding the AI methods themselves is necessary to mitigate the risks — black-box clinical decisions cannot be ethically justified.
Half B — Digital Twins of (Urban) Traffic (Bontje)
What is a digital twin?
A virtual, bi-directional model of a physical city that visualises urban processes in real time and supports planning, management, and decision-making. "Bi-directional" = the twin both reflects sensor data from the city and feeds decisions back to it.
Why for cities?
- Cities are complex and constantly changing.
- DT data makes that complexity visible.
- DTs support faster and better-informed decisions — traffic, air pollution, accessibility, safety, livability.
Traffic specifically
- Traffic is one of the most important urban systems.
- Goal: predict and control traffic flow.
- Use cases: safety and accessibility analysis; event planning / crowd management; scenario testing ("what if we close a road?").
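A "what if we close a road?" scenario can be sketched, in heavily simplified form, as a shortest-path comparison on a road graph before and after removing an edge. The network, the travel times, and the helper names are hypothetical. A real traffic twin would simulate flows and congestion rather than a single static route.

```python
import heapq


def shortest_travel_time(graph, source, target):
    """Dijkstra over a dict-of-dicts graph; returns None if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return None


def close_road(graph, a, b):
    """Return a copy of the network with the road between a and b removed in both directions."""
    closed = {node: dict(edges) for node, edges in graph.items()}
    closed.get(a, {}).pop(b, None)
    closed.get(b, {}).pop(a, None)
    return closed


# Hypothetical network; edge weights are travel times in minutes.
network = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "D": 5},
    "C": {"A": 2, "D": 3},
    "D": {"B": 5, "C": 3},
}

baseline = shortest_travel_time(network, "A", "D")
scenario = shortest_travel_time(close_road(network, "C", "D"), "A", "D")
print(baseline, scenario)
```

The original network is left untouched, so several scenarios can be compared against the same baseline.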
The Dutch state of play (DMI programme)
- No "network" of DTs yet, but lots of shared assets: 3D city models, dashboards, simulations, fieldlabs / pilots.
- Aim: move from isolated pilots / closed systems toward open, modular, reusable systems — reusable building blocks, shared standards, "Digital Twin as a Service," a European Digital Twin Appstore.
Emerging techniques
- 2D/3D visualisation.
- AI and language models as analysis layers.
- Dynamic sensor data with local AI — sensors collect real-time traffic data; local computing (edge processing) close to the street processes it; AI recognises traffic situations; the system feeds back to vehicles or traffic infrastructure.
Challenges
- Many twins are still pilots.
- Expensive custom solutions.
- Dependence on one software provider (vendor lock-in).
- Need for standards.
- Uncertainty in models.
SWOT analysis of DTs of urban traffic (Bontje's framing)
| | Positive | Negative |
|---|---|---|
| Internal | Strengths: scenario testing; better data-based decisions | Weaknesses: limited real-time traffic data; model uncertainty |
| External | Opportunities: national network of local DTs; reusable standards/modules; AI + sensor data | Threats: privacy risks; overreliance on models |
IOS connections explicitly drawn by Bontje
- Open Cities — test redevelopment scenarios before implementation; compare effects on movement, accessibility, safety, livability.
- Behaviour and Institutions — embed cognitive models of pedestrians/cyclists in DTs to simulate how people perceive, decide and move (this is Bontje's PhD topic; bridges to L2 cognitive modelling).
- Fair Transitions — DTs make redevelopment impacts visible across different user groups, supporting inclusive and transparent decisions.
Paper 7 — Wang et al. (2023): Edge intelligent federation learning for traffic DTs
⚠️ Reconstructed from the title and general knowledge — confirm against the paper.
What's in the title
- Traffic digital twins in smart cities — same object Bontje describes.
- Data information processing: the central problem. Each car, sensor, and intersection produces data; how do you fuse it into a coherent, real-time twin?
- Edge intelligent federation learning — the methodological contribution. Federated learning trains a shared model across many local devices without moving raw data to a central server; "edge intelligent" puts the inference and partial training on the local sensor/device.
Why federated + edge for DTs?
- Bandwidth: streaming all raw sensor data to a central server is infeasible at city scale.
- Latency: traffic-control decisions need millisecond responses; round-tripping to the cloud is too slow.
- Privacy: GDPR plus the privacy concern Bontje flagged as a Threat in the SWOT. Federation keeps raw data local; only model updates (gradients or weights) leave the device.
- Resilience: a local intersection's model keeps working even if the central system is unreachable.
Likely paper structure
- Architecture: device → edge → city layer.
- Federation protocol: clients compute local gradients on local traffic data; aggregator combines them into a global model; global model is deployed back to clients.
- Evaluation: prediction accuracy (traffic flow, congestion, accident risk) vs. centralised baselines, communication cost, latency, privacy guarantees.
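The federation protocol above can be sketched as a federated-averaging (FedAvg-style) loop. Whether Wang et al. use exactly this aggregation rule is not confirmed here. The linear model, the learning rate, the client data, and the function names are all assumptions for illustration. Each simulated client trains on its own data and returns only model weights, which the aggregator averages in proportion to client sample counts.

```python
def local_update(weights, data, lr=0.05, epochs=5):
    """Run a few gradient-descent steps on one client's private data (model: y = w*x + b)."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights, client_sizes):
    """Aggregate client models, weighting each by its number of samples."""
    total = sum(client_sizes)
    return [
        sum(wts[i] * size for wts, size in zip(client_weights, client_sizes)) / total
        for i in range(len(client_weights[0]))
    ]


# Hypothetical clients (e.g. three intersections); data follow y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(1.0, 3.0), (3.0, 7.0)],
    [(0.5, 2.0), (1.5, 4.0), (2.5, 6.0), (3.0, 7.0)],
]

global_model = [0.0, 0.0]
for _ in range(200):  # communication rounds
    # Each client starts from the current global model and trains locally.
    updates = [local_update(global_model, data) for data in clients]
    # Only the updated weights reach the aggregator, never the raw (x, y) pairs.
    global_model = federated_average(updates, [len(d) for d in clients])

print([round(p, 3) for p in global_model])
```

In this toy setting all clients share the same underlying relationship, so the global model recovers it. With heterogeneous client data, convergence behaviour is less straightforward, which is one reason evaluation against centralised baselines matters.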
Why this paper closes the loop with L1's CDR
- Stewardship of citizen-generated traffic data → federation/edge is how you operationalise stewardship at city scale.
- Transparency / Truth → DTs are only legitimate if their predictions are auditable; federated systems can be more auditable than monolithic clouds but only if standards demand it.
- Bontje's Threats (privacy risks, overreliance) → federated edge architecture mitigates the first but not the second.
Why this matters for an open society
L6 is methodologically diverse but normatively unified by the Promises × Risks lens:
- Medical AI → Transitions & Wellbeing pillar (healthcare delivery), with Equity & Diversity stakes (discriminatory misuse, biological determinism).
- Digital twins → Open Cities and Behaviour & Institutions platforms, with privacy and overreliance as Equity & Democracy threats.
- The unifying argument across both halves: AI in high-stakes thematic domains demands method-literate citizens, scientists, and policymakers — you cannot evaluate the legitimacy of a clinical classifier or a smart-city DT without understanding the basics of the underlying algorithm. This is essentially the course's core normative claim, re-stated in a thematic key.
Likely essay-question angles
- "Distinguish classification from stratification in medical AI. Apply each to a concrete example from Van Rooij's lecture and discuss its risks."
- "What is a digital twin of a city? Use Bontje's SWOT to argue whether the Dutch DMI programme is more likely to advance or threaten the IOS pillar 'Open Cities'."
- "Wang et al. (2023) use federated edge learning for traffic DTs. Explain how this architecture mitigates some of Bontje's listed threats but not others. Tie back to Elliott's TRUST."
- "Compare medical AI and traffic DTs as deployments of opaque ML systems in high-stakes public domains. Which of Grimmelikhuijsen & Meijer's six threats is most acute for each?"
Quick self-test
- Difference between classification and stratification in ML — give one medical example of each.
- Three risks Van Rooij identifies for classification of psychiatric patients from neuroimaging.
- Definition of a digital twin — and what does "bi-directional" mean here?
- Bontje's SWOT: name one item in each of the four cells.
- Why federated learning + edge inference for traffic DTs? Three reasons.
- Which IOS platforms does each half of L6 most directly speak to?
Source slides
Bontje — Digital Twins
AIOS lecture6_Bontje_2026.pdf