AIOS Flashcards

Cover the right column with your hand; try to answer before peeking.


Lecture 1 — Intro & IOS / Elliott CDR

Q A
Four bases of an open society? (1) Openness to diversity of knowledge; (2) openness to emancipatory movements & individual rights; (3) constitutional democracy & rule of law; (4) contestable markets & open borders.
Definition of "institutions"? The building blocks of society — the rules of the game: written rules + associated organisations, and unwritten rules & networks.
Formal vs. informal institution — one AI-relevant example of each? Formal: the EU AI Act. Informal: user expectations of how a chatbot should behave / journalistic norms about reporting AI-generated content.
Three IOS pillars? Democracy & Good Governance · Transitions & Wellbeing · Equity & Diversity.
Three lecture types in this course (and what each does)? Methodological (tools to study AI's impact) · Thematic (examples + mitigation) · Practical (workgroups for the grant).
What does TRUST stand for (Elliott et al. CDR)? Transparency, Responsibility, Understanding, Stewardship, Truth.
Three central tenets of CDR (Venn diagram)? Promoting Economic Transparency · Promoting Societal Wellbeing · Reducing Tech Impact on Environment.
What does the intersection of all three CDR tenets represent? Purpose & TRUST.
The "Elliott guiding question"? "If we permit AI to make life-changing decisions, what are the opportunity costs, data trade-offs, and implications for social, economic, technical, legal, and environmental systems?"
Why is CDR a corporate framework rather than state/individual? It locates responsibility for AI's societal impact with the firms deploying it, complementing (not replacing) regulators — a governance-by-organisation rather than governance-by-law approach.
How many IOS platforms in total, across the 3 pillars? 15.

Lecture 2 — Modeling individual decision making (Van Maanen) / Palada et al. 2016

Q A
Newell's four time scales of human action? Biological (μs–ms) · Cognitive (100 ms–10 s) · Rational (mins–hours) · Social (days–months).
Which time-scale band does Lecture 2 sit in? Cognitive (perception, memory, attention, simple decisions).
Three levels of integration of cognitive theory? Anecdotal (general intuitions) · Computational (hypothesised computations) · Algorithmic (hypothesised cognitive processes / formal models).
Which level gives better individual-difference predictions, and why? Algorithmic — it specifies a mechanism with fit-able parameters, so it can capture how each user differs (decay rate, drift rate, threshold).
Four parameters of the Linear Ballistic Accumulator (LBA)? Start point (U[0,A]) · Drift rate (N(v,s)) · Threshold (b) · Non-decision time (t₀).
How is a decision produced in the LBA? Each option has its own accumulator racing toward threshold; the fastest accumulator wins.
Which LBA parameter is moved by task difficulty, and which by speed/accuracy instruction? Difficulty → drift rate. Speed/accuracy instruction → threshold.
Mechanistic explanation for the speed–accuracy trade-off? Speed instruction lowers the threshold → decisions are made on less-accumulated, noisier evidence → more errors.
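ACT-R activation equation (gloss)? B_i = ln(Σ_{j=1}^{n} t_j^{−d}), where n is the number of past presentations of item i (frequency), t_j is the time since the j-th presentation (recency), and d is the decay rate. Activation rises with frequent, recent use and falls as d grows.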
What is "need probability" in the Anderson & Schooler (1991) memory model? The probability that an item will be required in the near future — drives the forget-vs-retain decision.
Palada et al. (2016) — task, manipulation, key result? Cloud-monitoring target detection (proxy for air-traffic control). Workload manipulated in 4 levels. As workload rises, RT drops, errors rise; LBA fit shows workload lowers both drift rate (task harder) and threshold (strategic).
How does the LBA support individual user-modelling? Each user's fitted parameters (cautiousness = threshold, processing efficiency = drift rate, execution time = non-decision time) can be used to classify them as a good, medium, or poor decision-maker.
Which IOS platforms does L2 most directly connect to? Future of Work (Transitions & Wellbeing) and In/Equality (Equity & Diversity).
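
The threshold and drift-rate cards above can be illustrated with a small simulation. This is a minimal sketch, not the fitting procedure used by Palada et al.: the function names and all parameter values (drift means 1.0 and 0.6, A = 0.5, s = 0.3, t0 = 0.2, thresholds 1.5 and 0.55) are assumptions chosen for illustration, and trials in which no accumulator has a positive rate are simply redrawn.

```python
import random
import statistics


def simulate_lba_trial(rng, drifts, b, A=0.5, s=0.3, t0=0.2):
    """Simulate one LBA trial; return (index of winning accumulator, RT)."""
    while True:
        finish_times = []
        for v in drifts:
            start = rng.uniform(0.0, A)   # start point ~ U[0, A]
            rate = rng.gauss(v, s)        # drift rate ~ N(v, s)
            # An accumulator with a non-positive rate never reaches threshold.
            finish_times.append((b - start) / rate if rate > 0 else float("inf"))
        best = min(finish_times)
        if best != float("inf"):
            # Fastest accumulator wins; add non-decision time t0.
            return finish_times.index(best), t0 + best


def summarise(b, n_trials=20000, seed=1):
    """Return (accuracy, mean RT) for threshold b; accumulator 0 is 'correct'."""
    rng = random.Random(seed)
    trials = [simulate_lba_trial(rng, drifts=(1.0, 0.6), b=b) for _ in range(n_trials)]
    accuracy = sum(1 for choice, _ in trials if choice == 0) / n_trials
    mean_rt = statistics.mean(rt for _, rt in trials)
    return accuracy, mean_rt


# Assumed values: a high threshold for "accuracy" instructions, a low one for "speed".
cautious_acc, cautious_rt = summarise(b=1.5)
fast_acc, fast_rt = summarise(b=0.55)
print(f"cautious: accuracy={cautious_acc:.3f}, mean RT={cautious_rt:.3f}")
print(f"fast:     accuracy={fast_acc:.3f}, mean RT={fast_rt:.3f}")
```

With the lower threshold, start-point noise makes up a larger share of the distance to threshold, so responses should come out faster and less accurate. Lowering the correct accumulator's mean drift instead would mimic a harder task.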

Lecture 3 — Autonomy / interaction with AI (Hortensius) / Rahwan Machine Behaviour

Q A
Rahwan et al. (2019) — working definition of AI agent? "Complex and simple algorithms used to make decisions."
Three reasons machine behaviour needs to be studied as its own field? (1) Ubiquity of AI; (2) complexity/opacity (often closed code/data); (3) beneficial AND detrimental effects.
Rahwan's three scales of machine behaviour? Individual machine · Collective machine · Hybrid human–machine.
Three modes of hybrid human–machine behaviour? Machines shape humans · Humans shape machines (engineering) · Human–machine co-behaviour (e.g., Tay).
Rahwan's four domains where machine behaviour matters? Democracy · Kinetics · Markets · Society.
Three vertices of the interdisciplinary triangle? Engineering of AI · Scientific study of behaviour · Study of impact of technology — machine behaviour bridges them.
What are digital traces (Rafaeli et al. 2019)? Records/logs of behaviour (Facebook likes, tweets, browsing, cookies). Contextual data: when, where, how long.
Three advantages of digital traces for psych research? (1) Bigger / different samples — beyond WEIRD; (2) detailed contextual measurement of behaviour-in-the-wild; (3) "digital dossier" reduces experimental demand bias.
Main bias of digital-trace data? Self-selection — which platform's users you observe shapes your conclusions.
Kosinski et al. (2013) — input, sample, method? Facebook likes (binary 1/0) of 58,466 US users (myPersonality app); SVD to 100 components → regression with 10-fold CV.
Three highest-accuracy predictions from Kosinski et al.? Ethnicity (Caucasian/African-American) AUC 0.95 · Gender 0.93 · Gay (male) 0.88.
Hinds & Joinson (2019) key finding? With ~300 Likes the algorithm matches a spouse's personality-prediction accuracy (~0.56), beating friends, family, cohabitants.
Matz et al. (2017) — psychological targeting result and key caveat? Trait-congruent ads ~1.4–1.8× higher conversion. Caveat: Eckles et al. 2018 letter — field targeting studies face internal-validity threats.
Kramer et al. (2014) "Facebook study" — what was manipulated and what was found? Newsfeed valence (more/fewer positive vs. negative posts) for ~689k users. Users' own posts shifted in the manipulated direction. Effect tiny (d≈0.001–0.02) but population enormous.
Why does the small effect in Kramer matter despite being tiny? At population scale (billions of users × continuous feed) the cumulative behavioural and democratic implications are large; also: ethics — no informed consent.
Cambridge Analytica connects which loop? Digital traces (Facebook likes) → psychographic prediction → psychological targeting → political behaviour (Brexit / 2016 election). The full Rahwan "Democracy" hybrid loop.

Lecture 4 — Emergence of collective patterns (Klein) / Douven-Hegselmann

Q A
Define emergence in one sentence. Complex social patterns arise from individual interactions and may have properties that do not immediately follow from individuals or their properties — "the sum is more than its parts."
Coleman's bathtub — four moves? (1) Macro → Situation → Actor (downward); (2) Actor → Selection → Action; (3) Action → Aggregation → Updated Macro; (4) methodological individualism: every macro explanation refers to individual agents.
Five analytic concepts for ABMs? Emergence · Path dependencies · Tipping points · Non-monotonicity · Direction of effect.
Four "directions of effect" between individual motives and emergent pattern? (1) Aligned (possibly stronger than individual motive — Schelling segregation); (2) Accidental; (3) Opposed (Adam Smith's invisible hand); (4) No individual counterpart (trends, flocking).
Hegselmann–Krause model — what is ε and how does it work? Each agent's confidence interval. The agent only listens to others whose opinion lies within ε of its own; new opinion = average over those neighbours.
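HK update rule (gloss)? pos_new^i = mean of {pos_old^j : |pos_old^j − pos_old^i| ≤ ε}, i.e. the average of all old positions (the agent's own included) that lie within ε of agent i's old position.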
What does the base HK model show about polarization? Polarization can emerge even when every agent is unbiased and rational — clusters drift outside one another's ε and stop talking.
Three agent types in Douven & Hegselmann (2021)? Free Riders (social only) · Truth Seekers (pos_new = (1−α)·pos_social + α·τ) · Campaigners (fixed position ρ).
Misinformation vs. disinformation? Misinformation: the aim is to make the public believe a falsehood. Disinformation: the aim is to impede or distract the public from believing a truth. Misinformation logically implies disinformation, but not vice versa.
Two counter-intuitive D&H findings without truth-seekers? (1) More extreme campaign positions hurt the campaigner (isolation outside ε); (2) more/stronger campaigners can impede their own campaign.
What flips when you add truth-seekers? A subtler campaign (ρ close to τ) outperforms a bold one, and adding more campaigners now helps the campaign.
Main limitations of the model (why it's qualitative not quantitative)? No real network structure · no substantive arguments exchanged · same ε for everyone · double-counting · heavily idealised.
Why is ABM the "AI tool" for L4 phenomena? Tipping points and path dependencies make analytic solutions intractable and intuitions unreliable; ABMs simulate the micro→macro link explicitly.
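
The base Hegselmann–Krause cards above can be sketched as a few lines of simulation. This is an illustrative sketch under assumptions, not the Douven & Hegselmann implementation: 100 agents with opinions drawn uniformly from [0, 1], synchronous updating, no truth-seekers or campaigners, and the function names and the two ε values are hypothetical choices.

```python
import random


def hk_step(opinions, eps):
    """One synchronous HK update: each agent averages all opinions within eps."""
    updated = []
    for x in opinions:
        peers = [y for y in opinions if abs(y - x) <= eps]  # includes the agent itself
        updated.append(sum(peers) / len(peers))
    return updated


def run_hk(opinions, eps, max_steps=1000, tol=1e-9):
    """Iterate until no opinion moves by more than tol (or max_steps is reached)."""
    for _ in range(max_steps):
        updated = hk_step(opinions, eps)
        if max(abs(a - b) for a, b in zip(updated, opinions)) < tol:
            return updated
        opinions = updated
    return opinions


def count_clusters(opinions, gap=1e-6):
    """Count groups of opinions separated by more than gap."""
    ordered = sorted(opinions)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > gap)


rng = random.Random(0)
start = [rng.random() for _ in range(100)]

narrow = run_hk(start, eps=0.1)   # small confidence interval
wide = run_hk(start, eps=0.5)     # large confidence interval
print("clusters with eps=0.1:", count_clusters(narrow))
print("clusters with eps=0.5:", count_clusters(wide))
```

Every agent follows the same unbiased averaging rule, yet the small-ε run ends in several separate opinion clusters while the large-ε run reaches consensus. This is the sense in which polarization is emergent here.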

Lecture 5 — Attitude & Linguistic Models (Van der Vegt) / threat assessment paper

Q A
What is NLP, in one sentence? Large-scale text analysis: quantify human language (text → numbers) and use linguistic features in AI models to predict outcomes.
Two NLP methods covered in L5? Supervised machine learning (e.g., Google Perspective API) · Dictionary-based NLP (curated word lists).
Six Perspective API measures? Toxicity · Severe toxicity · Identity attack · Insult · Profanity · Threat. (Each is scored 0–1; a score of 1 means 100% of people would be expected to agree that the label applies.)
Study 1 — data and N? 1,909,844 tweets @mentioning all Dutch party leaders (n=22) in 2022, via Twitter Academic API.
Study 1 — main results? Male politicians get higher toxicity/insults/profanity scores; no gender difference for threats; significant gender × ethnic-minority interactions — female ethnic-minority politicians receive the most threatening tweets.
Van der Vegt et al. (2023) — the methodological warning? Google Perspective API's "identity-attack" measure under-detects misogynistic content even though gender is in its definition. Using it uncritically under-counts abuse against women → mis-prioritised protection.
Two strengths and two weaknesses of dictionary-based NLP? Strengths: transparent, interpretable. Weaknesses: labour-intensive to curate; blind to context/irony/sarcasm.
Study 2 (Baele, Brace & Ging 2024) — input and method? 172-word expert-curated dictionary (violent verbs, weapons nouns, dehumanising nouns) applied to 11.7M posts from 33 incel-related platforms.
What is threat assessment? Estimating the risk of violence (plus seriousness and likelihood) by teams of police, mental-health pros and investigative psychologists, using structured professional-judgement tools.
What is CTAP-25? A 25-indicator Communications Threat Assessment Protocol checklist that yields a Level of Concern (Low/Medium/High). Indicators include threats, weapons references, end-of-tether language, homicidal ideation, divine-mission belief, "gut reaction," etc.
Three applications of AI in close protection? OSINT dashboards (collect/summarise) · Sentiment analysis tracking general attitude over time · Prioritisation models that rank incoming messages by predicted call-for-violence score.
Two civil-liberty risks of AI-augmented OSINT? (1) Proprietary biased models systematically under- or over-flag particular groups (Perspective example); (2) chilling effect on legitimate political speech if false-positive rate is non-trivial.
Which IOS pillars does L5 most directly connect to? Security in Open Societies (Democracy & Good Governance) and Equity & Diversity (algorithmic justice of moderation tools).
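
The dictionary-based method from the cards above can be sketched in a few lines. This is a toy illustration only: the five-word lexicon, the tokeniser and the function name are assumptions, not the 172-word dictionary from Baele, Brace & Ging, and no stemming is applied.

```python
import re

# Hypothetical mini-lexicon standing in for an expert-curated word list.
VIOLENT_TERMS = {"attack", "kill", "shoot", "gun", "knife"}


def dictionary_score(text, lexicon=VIOLENT_TERMS):
    """Return the share of tokens in text that appear in the lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


examples = [
    "He said he would attack them with a knife",
    "I would never attack anyone",
    "Lovely weather today",
]
for sentence in examples:
    print(f"{dictionary_score(sentence):.2f}  {sentence}")
```

The second sentence still scores above zero although it denies any violent intent. This shows both sides of the trade-off: the score is fully transparent (every hit can be traced to a word), but it is blind to negation, irony and context.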

Lecture 7 — Trust in AI / Grimmelikhuijsen-Meijer legitimacy

Q A
Three psychological components of trust (in an actor)? Competence/ability · Benevolence · Integrity.
Algorithm aversion vs. algorithm appreciation? Aversion (Dietvorst 2015): after seeing an algorithm err, people prefer humans even when the algorithm is on average better. Appreciation (Logg, Minson & Moore 2019): on novel unfamiliar tasks, people prefer algorithmic advice over equivalent human advice. Moderators: visibility of errors, domain familiarity, framing.
Three categories of legitimacy in public-admin theory? Input (who decided, who was represented) · Throughput (process: transparency, accountability, contestability) · Output (outcome quality: effective, fair).
Grimmelikhuijsen & Meijer's six threats to ADM legitimacy? (1) Reduced expertise / deskilling; (2) Opacity / lack of transparency; (3) Bias and unequal treatment; (4) Privacy infringement; (5) Reduced human oversight and accountability; (6) Erosion of public values.
What does "calibrated institutional response" mean? Each of the six threats requires a different institutional mitigation (audits for bias, XAI for opacity, human-in-the-loop for accountability, DPIAs for privacy, training for deskilling, democratic deliberation for value erosion). No single fix addresses all six.
The Dutch toeslagenaffaire — what was it and what did it bring down? The childcare-benefits fraud-detection scandal. The Belastingdienst's algorithmic risk model used ethnic-proxy variables (dual nationality, postal code) → systematic over-flagging of immigrant families → mass wrongful debt collection. Caused the resignation of the entire Rutte III cabinet in January 2021.
Which legitimacy threats showed up most in toeslagenaffaire? All six, but especially: bias (ethnic proxies), opacity (families couldn't see why flagged), accountability (no one was responsible), and public-value erosion ("hard line" optimisation crushed fairness).
How does L7 connect back to Elliott's TRUST framework (L1)? Grimmelikhuijsen & Meijer's threats are essentially failures of one or more TRUST letters: opacity = failed Transparency; accountability gap = failed Responsibility; bias = failed Truth; privacy infringement (data sprawl) = failed Stewardship.

Lecture 6 — Medical AI / Digital Twins (Van Rooij + Bontje) / Wang et al. 2023

Q A
Classification vs. stratification in medical ML? Classification = supervised, predict a-priori group labels (patient/control); identifies which features are most predictive. Stratification = unsupervised, identify hidden subgroups within a population (data-driven phenotyping).
Topic 1 example (Van Rooij) — task, method, performance? ADHD vs. controls from fMRI inhibition-task activation. Gaussian Process Classifier. Acc 77%, Sens 75%, Spec 80%, ROC AUC ~0.82.
Three risks of classifying psychiatric patients from neural data? Determinism / wrongful interpretation; inaccurate predictions; malignant use (e.g. insurers denying coverage).
Topic 2 (COVID-19 demographics) — risks? Discrimination/inequality, wrongful causal attribution (black-box confounders), biological determinism, "what you put in is what you get out."
Topic 3 (ASD stratification) — methods? Structural morphometry of 53 brain segments → normative modelling + spectral clustering → data-driven clusters with distinct clinical profiles.
Definition of a digital twin (Bontje)? A virtual, bi-directional model of the physical city; visualises urban processes in real time; supports planning/management/decision-making.
What does "bi-directional" mean for a DT? The twin both reflects sensor data from the city AND feeds decisions back to physical infrastructure (vehicles, traffic lights).
Three Dutch-DT current-state assets? 3D city models · Dashboards · Simulations / fieldlabs (i.e., DMI programme pilots).
DMI future goal? Move from isolated pilots to open modular reusable systems — reusable building blocks, shared standards, "Digital Twin as a Service," European Digital Twin Appstore.
Bontje's SWOT — one item per cell? Strengths: scenario testing. Weaknesses: model uncertainty. Opportunities: national DT network with reusable modules. Threats: privacy risks; overreliance on models.
Which IOS platforms does Bontje connect DTs to? Open Cities (redevelopment scenarios) · Behaviour & Institutions (embedded cognitive models of pedestrians/cyclists) · Fair Transitions (impact visible across user groups).
Why federated + edge learning for traffic DTs (Wang et al. 2023)? Bandwidth (can't stream all raw sensor data), latency (need ms responses), privacy (raw data stays local; only gradient updates leave the device), resilience (local model keeps working if cloud is down).
How does federated edge learning relate back to Elliott's TRUST? Operationalises Stewardship (data minimisation) at city scale, and can support Transparency/Truth if standards mandate auditability. Doesn't by itself solve overreliance on models.
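
The privacy argument in the federated-learning card above ("only gradient updates leave the device") can be made concrete with a toy example. This is a generic gradient-averaging sketch under assumptions, not the method of Wang et al.: the one-parameter model y ≈ w·x, the three clients' data, the learning rate and the function names are all hypothetical.

```python
def local_gradient(w, data):
    """Gradient of mean squared error for y ≈ w * x on one client's private data."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def federated_round(w, clients, lr=0.05):
    """One round: clients send gradients, the server averages them and updates w."""
    total = sum(len(data) for data in clients)
    # Only these per-client gradients reach the server; the raw (x, y) pairs stay local.
    avg_grad = sum(len(data) * local_gradient(w, data) for data in clients) / total
    return w - lr * avg_grad


# Hypothetical private datasets on three edge devices, all generated by y = 3x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0)],
    [(0.5, 1.5), (1.5, 4.5), (2.5, 7.5)],
]

w = 0.0
for _ in range(200):
    w = federated_round(w, clients)
print(f"learned w = {w:.4f}")
```

The server recovers the shared slope without ever seeing an individual observation. Weighting each gradient by the client's sample count keeps devices with more data from being under-represented.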

Lecture 8 — Synthesis (Van Rooij)

Q A
Van Rooij's four-level scaling? Human cognition (L2) · Human/AI psychology (L3, L7) · Human/AI networks (L4, L5) · Human/AI society (L1, L6).
Exam logistics? Fri 2026-05-29, 11:00–13:00, EDUC-BETA (Rupert D extra time). Laptops provided. 9 essay questions in 2 hours.
Mock-exam Q2 first answer (LBA + time pressure)? Threshold (response caution, boundary, b). Time pressure → people lower their threshold → faster, less accurate.
Mock-exam Q2 second answer (Palada UAV operators + clouds)? Drift rate. Clouds obscure target features → difficult to extract evidence → lower rate of accumulation → more missed targets.
Mock-exam Q3 first answer (tipping point definition)? Small changes in a parameter produce drastic, often non-continuous changes in the output. Examples: climate, revolutions, polarization phase shifts.
Mock-exam Q3 second answer (D&H free riders → truth seekers)? Free riders don't reduce the number of truth-seekers reaching truth, but have a slightly negative impact on truth-seekers' performance (MSE from truth) when campaigners are subtle — they get dragged away from truth by interacting with subtly-misled free riders.
Mock-exam Q4 — one key problem of AI for rare-event prediction (e.g. terror)? Base-rate / class-imbalance fallacy: even a 99%-accurate model produces overwhelmingly more false positives than true positives at a 1-in-a-million base rate; combine with biased measurement (van der Vegt 2023) and the false positives concentrate on marginalised groups. Tools should aid analysts, not replace them.
Mock-exam Q5 — which digital-traces statements are TRUE? A (bigger/non-WEIRD samples), C (detailed contextual measurement), D (ethical questions about consent and ownership). False: B (digital dossier is implicit+explicit, not only explicit) and E (digital traces still affected by self-selection / awareness biases).
One-sentence cross-lecture synthesis test? Pick an IOS platform → list which lectures speak to it → which paper supports each → state the unifying methodological move.
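
The base-rate argument in the Mock-exam Q4 card above can be checked with simple arithmetic. This is a sketch under stated assumptions: "99%-accurate" is read as 99% sensitivity and 99% specificity, the screened population of 10 million is hypothetical, and the function name is illustrative.

```python
def screening_outcomes(population, base_rate, sensitivity, specificity):
    """Return expected true positives, false positives and positive predictive value."""
    positives = population * base_rate
    negatives = population - positives
    true_pos = positives * sensitivity
    false_pos = negatives * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# Assumed scenario: 10 million people screened, 1-in-a-million base rate.
true_pos, false_pos, ppv = screening_outcomes(
    population=10_000_000, base_rate=1e-6, sensitivity=0.99, specificity=0.99
)
print(f"expected true positives:  {true_pos:.1f}")
print(f"expected false positives: {false_pos:.1f}")
print(f"share of flags that are correct: {ppv:.6f}")
```

Under these assumptions roughly 10 true positives are buried among roughly 100,000 false positives, so almost every flag is wrong. This is why such tools should support analysts rather than replace them.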