Lecture 3 — Autonomy & Psychology of Interaction with AI (Hortensius)

Paper: Rahwan, I., Cebrian, M., Obradovich, N., Bongard, J., Bonnefon, J.-F., Breazeal, C., … Wellman, M. (2019). Machine behaviour. Nature, 568, 477–486.

Type: Methodological. It moves from individual-level cognition to individual-machine interaction. (Framing this as "one step up in Newell's time-scale hierarchy from L2" is the course's own cross-lecture synthesis; Newell's bands come from L2, not from the Rahwan paper or the Hortensius slides.)

Must-know core → Minimum to Pass

Rahwan 2019 Machine Behaviour: 3 scales (individual/collective/hybrid) × 4 domains (Democracy/Kinetics/Markets/Society) · digital traces (3 advantages + self-selection bias) · Kosinski 2013 (likes→traits; computer≈spouse, Hinds & Joinson) · Kramer 2014 (tiny effect, huge population).

Lecture in one paragraph

AI systems are now everywhere; their effects on individual psychology (mood, attention, belief, behaviour) and on society are potentially large but are debated mostly without data. This lecture argues we need a new discipline, machine behaviour (Rahwan et al. 2019), to study AI systems the way ethology studies animals: as objects of empirical, quantitative, interdisciplinary inquiry. The two big enablers are (i) digital traces, the logs that internet platforms accumulate about us, and (ii) machine-learning analytics that turn those traces into predictions of psychological states. The lecture demonstrates this with two case studies: personality prediction from Facebook likes (Kosinski, Stillwell & Graepel 2013) and emotion contagion at scale (Kramer, Guillory & Hancock 2014, "the Facebook study").

Paper 3 — Rahwan et al. (2019), Machine Behaviour

Working definition

AI agents are "complex and simple algorithms used to make decisions."

Why a new field?

Ubiquity of algorithms/AI/technology.
Complexity and opacity of these systems (e.g., AI breast-cancer screening paper in Nature 2020 — "you have to pay to read details, can't get the data, can't get the code" — Salathé).
They have both beneficial and detrimental effects that we cannot reason about anecdotally.

The framework — three scales of machine behaviour

Individual machine behaviour — one algorithm
Collective machine behaviour — many algorithms interacting (e.g. algorithmic trading flash crashes)
Hybrid human–machine behaviour — humans and algorithms in mutual feedback

Within hybrid:
Machines shape humans (recommendation, addiction, mood)
Humans shape machines (engineering: choice of algorithm)
Human–machine co-behaviour (e.g. Microsoft Tay learning racism from users within a day)

Four domains where this matters (Rahwan's table)

Domain	Example questions
Democracy	News-ranking → filter bubbles? Algorithmic justice → racial discrimination in parole? Predictive policing → false convictions?
Kinetics	Autonomous vehicles → how aggressively to overtake? How to distribute risk between passenger and pedestrian? Autonomous weapons → respect proportionality? distinguish combatants from civilians?
Markets	Algorithmic trading → market manipulation / systemic crash risk? Algorithmic pricing → tacit collusion? price discrimination?
Society	Online dating → does matching use facial features? does it amplify or reduce homophily? Conversational robots → promote products to children? affect collective behaviours?

The interdisciplinary triangle

Engineering of AI ↔ Scientific study of behaviour ↔ Study of impact of technology, with machine behaviour as the central bridge generating new quantitative evidence, new scientific questions, and new engineering practices.

Why an essay-relevant frame

Rahwan's paper is the methodological hub for thinking about AI's social role. Almost any thematic question in this course (L5 linguistic threat assessment, L7 trust in algorithmic decisions, L6 medical AI) can be framed as machine behaviour at one of the three scales in one of the four domains.

Key concept 1 — Digital traces

(Rafaeli, Ashtar & Altman 2019, Current Directions in Psych Sci)

Digital traces are records (logs) of people's behaviour: Facebook likes, tweets, vlogs, YouTube history, cookies. They are collected by platforms, sensors and devices, and include contextual data about when, where and for how long the behaviour occurred.

Three advantages over traditional psych methods:
1. Bigger / different samples, beyond WEIRD (Western, Educated, Industrialised, Rich, Democratic).
2. Detailed, contextual recording of behaviour as it happens.
3. 'Digital dossier': implicit + explicit behaviour together, reducing experimental demand bias.

Main challenge: self-selection bias — Twitter vs. Facebook vs. TikTok users differ.

Five overlapping reasons social media data are valuable (Meshi, Tamir & Heekeren 2015, Trends Cogn Sci): externally valid, real-time continuous, extended-time, quantifiable, pairable with real-world behaviour. Proxies extractable from social media include: offline thoughts (I-statements), emotional states, social conformity, prosocial behaviour, curiosity (scroll time), personality traits, social network, social interaction.

Collection & analysis pipeline

Collect: self-report, API, web scraping, organisational collaboration (e.g. Kramer et al. 2014 with Facebook), smartphones/sensors/apps.
Analyse: traditional stats (correlation, ANOVA), automated tools (sentiment analysis), or deep learning.
Ethical layer: consent, privacy, data ownership, "surveillance capitalism" (Zuboff).
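
As a rough illustration of the "automated tools" step, the sketch below scores a post by the share of its words that appear in a positive or a negative word list. The lexicon, the function name `valence_rates` and the example post are all invented for illustration; real studies rely on validated dictionaries, and this does not reproduce any specific tool.

```python
import re

# Hypothetical toy lexicon; real studies use validated dictionaries.
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"sad", "awful", "hate", "bad"}

def valence_rates(post: str) -> tuple[float, float]:
    """Return (share of positive words, share of negative words) in a post."""
    words = re.findall(r"[a-z']+", post.lower())
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos / len(words), neg / len(words)

# "not a bad one" still counts "bad" as negative: word counting ignores negation.
pos_rate, neg_rate = valence_rates("I love this great day, not a bad one")
print(f"positive share = {pos_rate:.2f}, negative share = {neg_rate:.2f}")
```

The example post also shows the main weakness of plain word counting: it cannot see negation or irony, which is one reason such scores are treated as noisy proxies for emotional state.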

Key concept 2 — Personality prediction from digital traces

Kosinski, Stillwell & Graepel (2013) PNAS

N = 58,466 US Facebook users (myPersonality app).
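Input: binary vector of Facebook likes per user (1 = liked, 0 = not), mean ≈170 likes/user across 55,814 distinct Likes, giving a sparse user × Like matrix with roughly 10 million non-zero entries.
Pipeline: SVD to reduce the matrix to 100 components → regression on those components with 10-fold cross-validation (linear regression for numeric traits, logistic regression for dichotomous ones) → predicted traits.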
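
The pipeline can be sketched on synthetic data as follows. This is a minimal sketch, not the authors' code: the data are simulated (not the myPersonality sample), the matrix is tiny, only 10 components are kept instead of 100, and every variable name is an assumption made for illustration. It uses NumPy and a continuous trait with plain least-squares regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the user x Like matrix (NOT the myPersonality data).
n_users, n_likes, k = 500, 200, 10
trait = rng.normal(size=n_users)                 # hypothetical continuous trait
loadings = rng.normal(size=n_likes)              # how strongly each Like tracks the trait
logits = -2.0 + 0.8 * np.outer(trait, loadings)  # roughly 12% like rate at trait = 0
likes = (rng.random((n_users, n_likes)) < 1 / (1 + np.exp(-logits))).astype(float)

# Step 1: SVD, keeping the top-k components as user features.
# (Fitted on all users here for brevity; it never sees the trait values.)
centered = likes - likes.mean(axis=0)
U, S, _ = np.linalg.svd(centered, full_matrices=False)
features = U[:, :k] * S[:k]

# Step 2: linear regression with 10-fold cross-validation.
folds = np.array_split(rng.permutation(n_users), 10)
predicted = np.empty(n_users)
for test_idx in folds:
    train_mask = np.ones(n_users, dtype=bool)
    train_mask[test_idx] = False
    X_train = np.column_stack([np.ones(train_mask.sum()), features[train_mask]])
    coef, *_ = np.linalg.lstsq(X_train, trait[train_mask], rcond=None)
    X_test = np.column_stack([np.ones(len(test_idx)), features[test_idx]])
    predicted[test_idx] = X_test @ coef

# Step 3: score the out-of-fold predictions with Pearson r.
r = np.corrcoef(predicted, trait)[0, 1]
print(f"cross-validated Pearson r = {r:.2f}")
```

Each user's prediction comes from a model that never saw that user's trait value, which is what makes the reported correlation an out-of-sample estimate. The synthetic trait is far easier to recover than a real personality score, so the printed value says nothing about the accuracies in the paper.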

Predicted categorical variables (scored by area under the curve, AUC, where 0.5 is chance and 1.0 is perfect):

Variable	AUC
Caucasian vs African-American	0.95
Gender (male vs female)	0.93
Gay (male)	0.88
Democrat vs Republican	0.85
Christianity vs Islam	0.82
Smokes	0.73
Drinks alcohol	0.70
Uses drugs	0.65
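
To make the AUC scale concrete: AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by brute force over all pairs; the scores are made up for illustration and have nothing to do with the study's data.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up classifier scores, for illustration only.
smokers = [0.9, 0.7, 0.6, 0.4]
non_smokers = [0.8, 0.5, 0.3, 0.2]
example_auc = auc(smokers, non_smokers)
print(example_auc)  # 0.75
```

Here 12 of the 16 smoker/non-smoker pairs are ordered correctly, so the AUC is 0.75. A classifier that scored everyone identically would get 0.5, which is why 0.5 is the chance level.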

Predicted continuous variables (scored by Pearson correlation \( r \)):

Variable	Pearson \( r \)
Age	0.75
Extraversion	0.40
Intelligence	0.39
Big Five traits (in general)	0.30 to 0.43
Satisfaction with life	0.17

Watch the metric. Categorical traits use AUC; continuous traits use Pearson \( r \). The slide's lighter bars (for example 0.78 for intelligence and 0.75 for extraversion) show the questionnaire's test-retest reliability, which acts as the practical ceiling for prediction; they are not the Pearson \( r \). The genuine correlations are \( r = 0.39 \) and \( r = 0.40 \). Do not report the lighter-bar numbers as accuracies or as AUCs.
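
A short sketch may help keep the numbers apart. It defines Pearson \( r \) from scratch, then puts the two correlations quoted above next to their lighter-bar values, reading the latter as test-retest figures (an assumption about how to read the slide). The "share of ceiling" ratio is only an informal comparison chosen for this sketch, not a statistic reported in the paper.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# (Pearson r, lighter-bar value) as quoted in the notes.
traits = {"intelligence": (0.39, 0.78), "extraversion": (0.40, 0.75)}
for name, (r, ceiling) in traits.items():
    print(f"{name}: r = {r:.2f}, r^2 = {r * r:.2f}, share of ceiling = {r / ceiling:.2f}")
```

Squaring \( r \) gives the share of variance explained (about 0.15 and 0.16 here), which is a useful reminder that a correlation of 0.40 is far from an "accuracy of 40%".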

Key empirical follow-up — Hinds & Joinson 2019 CDPS: computer's average prediction accuracy from Likes (.56) ≈ spouse's accuracy (.58), beating family (.50), cohabitant (.45), friend (.45), work colleague (.27), and humans' average (.49). Conclusion: with enough Likes (≈300), an algorithm knows you about as well as your spouse does.

Psychological targeting (Matz et al. 2017 PNAS)

Pair predicted trait with congruent ad creative (e.g., extraverted ad to high-extraversion audience; "Dance like no one's watching" vs. "Beauty doesn't have to shout").
Two field studies (N = 3.1M and 84k reach). Targeted ads ~1.4–1.8× higher conversion than incongruent ads.
Caveat: Eckles, Gordon & Johnson (2018) PNAS letter argues field-targeting studies "face threats to internal validity" — selection of who-sees-what is confounded with treatment. Don't take effect sizes at face value.

Real-world stakes

Netflix tailoring artwork by genre preference / "taste community"
Airbnb 2015 patent: AI scoring of guest "trustworthiness" by scraping social media (incl. drugs/alcohol associations, "dark triad")
Cambridge Analytica / Facebook scandal — psychographic targeting in elections (Brexit, US 2016). This is the Black-Mirror example that makes Hortensius's "Palaeolithic brain, modern technologies" framing concrete.

Key concept 3 — Emotion contagion (Kramer, Guillory & Hancock 2014, PNAS)

"The Facebook Study"

Facebook experimentally manipulated the emotional valence of ~689,000 users' news-feeds for one week in 2012.
Condition A: reduced positive posts. Condition B: reduced negative posts.
Outcome: users whose feeds had fewer positive posts subsequently produced fewer positive words and more negative words in their own posts (and vice versa).
Effect size: tiny (Cohen's d ≈ 0.001–0.02), but the population is enormous → real-world influence is non-trivial.
The lecture notes a subtlety: this shows content contagion ("Emotion_P1 → Emotion_P2 via displayed content"), not necessarily felt emotion contagion in the strict offline sense.
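
The "tiny effect, huge population" point can be illustrated with a back-of-the-envelope calculation. For two equal groups, a standardized mean difference \( d \) gives an approximate test statistic \( z = d \sqrt{n/2} \), where \( n \) is the size of each group. The group sizes below are illustrative choices, not the study's actual cell sizes, and the function names are made up for this sketch.

```python
from math import sqrt, erfc

def two_sample_z(d: float, n_per_group: int) -> float:
    """Approximate z statistic for a standardized mean difference d, equal group sizes."""
    return d * sqrt(n_per_group / 2)

def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

d = 0.02  # upper end of the effect-size range quoted in the notes
for n in (100, 10_000, 150_000):  # illustrative group sizes, not the study's
    z = two_sample_z(d, n)
    print(f"n per group = {n:>7}: z = {z:5.2f}, p = {two_sided_p(z):.3g}")
```

The same effect that is invisible in a lab-sized sample becomes statistically unmistakable at platform scale. Statistical detectability is a separate question from practical importance, which depends on how many people and posts the shift is multiplied across.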

Why this paper detonated

No informed consent beyond Facebook's TOS → massive ethics backlash → ongoing debate about what counts as research vs. product testing.
Established empirically that platforms can causally manipulate emotional state at population scale — direct evidence for the "machines shape humans" leg of Rahwan's framework, and one of the strongest motivations for the IOS programme.

Why this matters for an open society

Hortensius's overall argument: every open-society pillar is psychologically mediated and therefore vulnerable to AI that knows the user better than the user knows themselves.

Democracy & Good Governance: psychographic targeting (Cambridge Analytica), filter bubbles, algorithmic injustice. Threatens constitutional democracy & rule of law.
Equity & Diversity: prediction models infer protected attributes (ethnicity, sexuality, religion) from innocuous data — even when users do not declare them — enabling redlining/discrimination. Threatens individual rights and equity-platform goals.
Transitions & Wellbeing: emotion contagion + addictive design ("attention economy") affect population mental health (especially adolescents). Threatens openness to emancipatory movements and wellbeing.

The discipline of machine behaviour (the paper) gives us the empirical lever to study these effects rather than just argue about them.

Likely essay-question angles

"Describe Rahwan et al.'s framework. Pick one of the four domains and explain why machine behaviour at the hybrid scale (rather than individual or collective) is the most policy-relevant level."
"Digital traces have three advantages and one major bias. Discuss using Kosinski et al. 2013 as a concrete example. What does the Hinds & Joinson 2019 result imply for individual privacy and corporate due-diligence (e.g., the Airbnb patent)?"
"The Facebook emotion-contagion study showed a statistically real but tiny effect. Why does the small effect size not settle the policy question?"
"How does the Cambridge Analytica case illustrate the loop between digital traces → psychological targeting → political behaviour? Which IOS pillar is most threatened?"

Quick self-test

State Rahwan et al.'s working definition of an AI agent. Name the three scales and the four domains.
What does the interdisciplinary triangle (Engineering / Behaviour / Impact) say about who has to collaborate?
List the three advantages and the main bias of digital-trace data.
In Kosinski et al. 2013, what is the input? What is the modelling pipeline? Roughly how well can the computer predict gender, sexuality, and Big Five extraversion?
What was manipulated in Kramer et al. 2014, what was measured, and why is "small effect, large population" the right framing for its policy implications?
Sketch how psychological targeting + Cambridge Analytica fits into Rahwan's Democracy domain.

Source slides

AIOS_lecture3_Hortensius.pdf