Lecture 5 — Attitude & Linguistic Models (Van der Vegt)

Paper: van der Vegt, I., Kleinberg, B., & Gill, P. (2023). Proceed with caution: on the use of computational linguistics in threat assessment. Journal of Policing, Intelligence and Counter Terrorism, 18(2), 231–239.

Type: Thematic (orange). Where L2–L4 were the methodological half of the course, L5 begins the application half. The methodology behind it is NLP / computational linguistics, and the application is understanding and preventing grievance-fuelled targeted violence.

Must-know core → Minimum to Pass

NLP = text→numbers · supervised ML (Perspective API: 6 measures) vs dictionary (transparent, context-blind) · Study 1: female + minority politicians most threatened; identity-attack measure under-counts misogyny (the caution) · Baele 2024 incel dictionary · CTAP-25 = human SPJ that NLP should augment, not replace.

Lecture in one paragraph

Grievance-fuelled targeted violence (Christchurch, Las Vegas, Capitol Hill, Dutch politician threats, etc.) leaves a linguistic paper trail. Threat-assessment teams (police, mental-health professionals, investigative psychologists) traditionally use structured-judgement tools like CTAP-25 to score behaviour and language for risk indicators. NLP lets us scale these analyses massively — from a single threatening letter to ~12 million incel posts or ~2 million tweets aimed at politicians. But the AI tools that enable this (e.g., Google Perspective) are not transparent and carry their own biases; the paper's title — Proceed with caution — is the lecture's bottom line.

The lecture connects to IOS pillars Security in Open Societies and Democracy & Good Governance.

Key concept 1 — Natural language processing (NLP)

Large-scale text analysis. Quantifying human language: text → numbers. Use these 'linguistic features' in AI models to predict outcomes.

Two NLP methods are demonstrated in the lecture:

A. Supervised machine learning (used in Study 1)

Annotated training texts (each with a 0–1 label, e.g. "toxicity = 0.9").
ML extracts linguistic features and learns a function text → label.
Apply trained model to new text → predicted score.
The lecture's worked example is Google Perspective API, with six measures (each 0–1, where 1 = "100% of people would agree the text is X"):
Toxicity
Severe toxicity
Identity attack
Insult
Profanity
Threat
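
The supervised pipeline in this subsection (annotated texts, a learned function text → label, a predicted score for new text) can be sketched with a toy bag-of-words logistic regression. This is only an illustration of the idea: Perspective's actual model is not public, and the training texts, labels, and the names `featurise`, `train` and `predict` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical annotated training texts: (text, label), 1.0 = toxic, 0.0 = not.
TRAIN = [
    ("you are a disgusting idiot", 1.0),
    ("shut up you stupid idiot", 1.0),
    ("nobody wants you here, disgusting", 1.0),
    ("thanks for the clear debate", 0.0),
    ("i disagree with this policy proposal", 0.0),
    ("good speech, clear and fair", 0.0),
]

def featurise(text):
    """Text -> numbers: lower-cased word counts (bag of words)."""
    return Counter(text.lower().replace(",", " ").split())

def predict(weights, bias, text):
    """Logistic function of the weighted word counts -> score in (0, 1)."""
    z = bias + sum(weights.get(w, 0.0) * c for w, c in featurise(text).items())
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, lr=0.5):
    """Fit one weight per word by stochastic gradient descent on log-loss."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            error = predict(weights, bias, text) - label
            for w, c in featurise(text).items():
                weights[w] = weights.get(w, 0.0) - lr * error * c
            bias -= lr * error
    return weights, bias

weights, bias = train(TRAIN)

# Apply the trained model to new, unseen texts.
toxic_score = predict(weights, bias, "what a stupid idiot")
benign_score = predict(weights, bias, "fair and clear proposal")
print(round(toxic_score, 2), round(benign_score, 2))
```

The model only knows what its annotators labelled: a word it never saw in training (here "what") contributes nothing to the score, which is one way blind spots in the training data carry over into the predictions.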

B. Dictionary-based NLP (used in Study 2)

Build a list of words that indicate a construct (e.g., HATE = {enemy, loathe, hatred, detest, despise}).
Search target text for word matches; report a percentage.
Pros: transparent, interpretable, no black-box ML.
Cons: labour-intensive to curate; context- and sarcasm-blind; rigid (no semantic similarity).
New developments: LLMs offer a more flexible alternative that may combine some advantages.
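
A minimal sketch of the dictionary approach, using the toy HATE list from the notes. The example sentences and the function names `tokenise` and `dictionary_percentage` are made up for illustration; real dictionaries are far longer and expert-curated.

```python
import re

# Toy dictionary taken from the notes.
HATE = {"enemy", "loathe", "hatred", "detest", "despise"}

def tokenise(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def dictionary_percentage(text, dictionary):
    """Percentage of tokens in `text` that exactly match a dictionary word."""
    tokens = tokenise(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return 100.0 * hits / len(tokens)

hostile = "I despise you and I loathe your party"
sarcastic = "Oh sure, I really detest free cake"
inflected = "He hates them and despises their views"

for text in (hostile, sarcastic, inflected):
    print(f"{dictionary_percentage(text, HATE):5.1f}%  {text}")
```

The three sentences show the trade-off listed above: every hit can be traced to a specific word, but the sarcastic sentence still scores above zero, and the third sentence scores zero because "hates" and "despises" are not exact matches for any dictionary entry.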

Study 1: Threats to Dutch politicians (from the slides, not the obligatory paper)

Source note: the obligatory paper (van der Vegt, Kleinberg & Gill 2023) is a short commentary, not an empirical study. Its actual content is four cautions (the data problem, the "utopia" of predicting violence, the base-rate fallacy, and the danger of closed-source tools) plus the VISOR-P checklist. Study 1 below, and the identity-attack examples, come from the lecture slides (a separate van der Vegt study, linked on the slides), so attribute them to the slides, not to the 2023 commentary.

Research questions

What is the prevalence and nature of abuse directed at politicians online in the Netherlands?
Is there an effect of gender and ethnic-minority status on the prevalence and nature of abuse?

Data

@Mentions of all Dutch party leaders (n = 22) on X/Twitter in 2022.
Total 1,909,844 tweets via the Twitter Academic API (academictwitteR package).

Method

Linguistic measures via Google Perspective API (the six 0–1 measures above).
Regression model with predictors: gender, ethnic-minority status, political position (economic and cultural stance), number of followers, number of tweets.

Results

Tweets aimed at male politicians scored higher than those aimed at female politicians on toxicity, severe toxicity, identity attacks, insults, and profanity. No significant gender difference for "threats."
Significant interactions between gender and ethnic-minority status for severe toxicity, identity attacks, profanity, and threats.
Female politicians from an ethnic-minority background received the most threatening tweets.
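
What a "gender × ethnic-minority interaction" means can be shown with a stripped-down version of the regression. With only the two binary predictors and their product in the model, the coefficients are simply differences between group means. The study's model also included political position, followers and tweet counts, so this is a simplification, and every number below is invented.

```python
from statistics import mean

# Hypothetical "threat" scores of tweets, grouped by the target politician's
# (is_female, is_minority) status. Invented numbers, not the study's data.
scores = {
    (0, 0): [0.10, 0.12, 0.11],
    (1, 0): [0.10, 0.11, 0.12],
    (0, 1): [0.13, 0.14, 0.12],
    (1, 1): [0.20, 0.22, 0.21],
}
cell = {group: mean(values) for group, values in scores.items()}

# Model: score = b0 + b1*female + b2*minority + b3*female*minority
b0 = cell[(0, 0)]                     # baseline: male, non-minority
b1 = cell[(1, 0)] - cell[(0, 0)]      # gender effect among non-minority targets
b2 = cell[(0, 1)] - cell[(0, 0)]      # minority effect among male targets
# Interaction: how much larger the gender gap is among minority targets.
b3 = (cell[(1, 1)] - cell[(0, 1)]) - (cell[(1, 0)] - cell[(0, 0)])

def predicted(female, minority):
    """Model prediction for one group; reproduces that group's mean."""
    return b0 + b1 * female + b2 * minority + b3 * female * minority

print(f"b1={b1:.3f}  b2={b2:.3f}  b3={b3:.3f}")
```

In this invented data the gender coefficient on its own is about zero while the interaction is clearly positive: the pattern of "no overall gender difference, but the highest scores for female minority politicians" is exactly what an interaction term is there to capture.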

The methodological-caution finding (from the slides)

The Perspective API's identity-attack measure mis-classifies misogynistic content. This example is from the lecture slides; it illustrates the kind of closed-source-tool risk the obligatory commentary warns about, though the commentary itself does not contain it. The lecture shows examples:

Dutch tweet	English gloss	Target gender	Identity-attack score
"hang jezelf aub op, vieze kanker jood…"	"Please hang yourself, dirty 'cancer' jew…"	M	0.84 ✓
"Smerige KutHoer, kogel in je kop…"	"Dirty cunt whore, bullet in the head"	F	0.11 ✗
"Rot jij maar lekker op naar je vaderland met je klote islam"	"Piss off to your home country with your damn islam"	M	0.85 ✓
"Domme muts… achterlijk schijtwijf…"	"Stupid bimbo… retarded crap-woman…"	F	0.06 ✗

The Perspective definition of identity attack explicitly includes gender, yet the model fails to detect gendered slurs. So the gender results above (lower identity-attack scores for tweets aimed at women, and possibly the "no gender difference in threats" result) may be partly a measurement artefact: the AI tool under-counts abuse aimed at women. This is the lecture's central methodological point: AI-driven content-moderation tools are deployed at scale across platforms, but they are proprietary, opaque, and biased, and using them uncritically can systematically mis-diagnose which groups are under attack.
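
The consequence of this miscalibration is easiest to see once a cut-off is applied, as a moderation or triage pipeline would do. The sketch below uses the four identity-attack scores from the table; the 0.5 threshold and the function name `flagged_share` are assumptions, and four hand-picked examples illustrate the problem rather than estimate its size.

```python
# Identity-attack scores of the four example tweets in the table above.
# All four are abusive on a human reading.
examples = [
    {"target": "M", "score": 0.84},
    {"target": "F", "score": 0.11},
    {"target": "M", "score": 0.85},
    {"target": "F", "score": 0.06},
]

THRESHOLD = 0.5  # assumed cut-off, not a value from the lecture

def flagged_share(rows, target, threshold=THRESHOLD):
    """Share of the abusive examples aimed at `target` that would be flagged."""
    group = [row for row in rows if row["target"] == target]
    return sum(row["score"] >= threshold for row in group) / len(group)

for target in ("M", "F"):
    print(target, flagged_share(examples, target))
```

Both tweets aimed at men clear the threshold and neither tweet aimed at women does, so any count of "identity attacks" built on this measure would miss the misogynistic examples entirely.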

Study 2 — Cross-platform analysis of incel language

(Baele, Brace & Ging 2024, Terrorism and Political Violence — a diachronic cross-platform analysis of incel violent-extremist language)

Research questions

Is the incel subculture a violent extremist ideology?
Has incel language grown more extreme over time?

Method: dictionary-based NLP

Incel violent-extremist dictionary = 172 words judged by experts on the "incelosphere":
Verbs unambiguously expressing acts of violence ("stab," "kill," "rape")
Nouns labelling weapons ("gun," "knife," "acid")
Nouns dehumanising the outgroup ("femoid"/"foid," "roasties," "curry")
Hit rate per post: fraction of post words matching the dictionary.

Data

33 platforms: incels.is, blackpill.club, neets.me, incels.net, wizchan.org, multiple subreddits (r/Braincels, r/IncelsExit, r/TheRedPill, r/FA30Plus, r/AntiFeminist, r/incel, r/BlackpillScience), 4chan/r9k, 9chan/leftcel, blogs, Telegram channels.
11,717,516 posts via custom scrapers.

Findings (sketch — the lecture shows time-series across platforms)

Violent-language hit rates have spiked at specific moments tied to platform shifts, and have grown on certain forums.
Dictionary methods are blunt but interpretable — every uptick can be traced to specific posts.
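
A sketch of how per-post hit rates can be rolled up into a time series while keeping every uptick traceable to the posts behind it. The posts, platform names and the function `monthly_summary` are hypothetical, and only a handful of the 172 dictionary entries are included.

```python
import re
from collections import defaultdict

# A few entries quoted in the notes; the real dictionary has 172 words.
DICTIONARY = {"stab", "kill", "gun", "knife", "acid"}

# Hypothetical posts: (month, platform, text).
POSTS = [
    ("2019-01", "forum_a", "bought a new game today"),
    ("2019-01", "forum_a", "he said he would kill time at the mall"),
    ("2019-02", "forum_a", "get a knife and stab them"),
    ("2019-02", "forum_b", "nothing ever changes for me"),
]

def hit_rate(text):
    """Return (fraction of words in the dictionary, list of matched words)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    matches = [token for token in tokens if token in DICTIONARY]
    rate = len(matches) / len(tokens) if tokens else 0.0
    return rate, matches

def monthly_summary(posts):
    """Mean hit rate per month, plus the posts and words that produced hits."""
    rates = defaultdict(list)
    evidence = defaultdict(list)
    for month, platform, text in posts:
        rate, matches = hit_rate(text)
        rates[month].append(rate)
        if matches:
            evidence[month].append((platform, text, matches))
    means = {month: sum(r) / len(r) for month, r in rates.items()}
    return means, dict(evidence)

means, evidence = monthly_summary(POSTS)
for month in sorted(means):
    print(month, round(means[month], 3), evidence.get(month, []))
```

The evidence list is what makes the method interpretable: the January hit turns out to be the idiom "kill time", a false alarm that an analyst can spot immediately because the matched word and its post are both on record.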

Discussion (limitations and benefits of dictionary methods)

Limitations: labour-intensive to maintain; not aware of context, irony, sarcasm.
Benefits: transparent and interpretable — no black box.
Future: LLMs may yield more flexible methods.

Key concept 2 — Threat assessment as a discipline

Threat assessment = estimate the risk of violence, plus outcomes like seriousness and likelihood.
Performed by teams: police, mental-health professionals, investigative psychologists.
Uses structured professional-judgement (SPJ) tools: not algorithms, but expert-curated checklists scored by trained analysts.

CTAP-25 (a 25-indicator communications threat-assessment protocol)

Twenty-five categories the analyst marks on each communication:

#	Indicator	#	Indicator
1	Threats	14	Sexually aggressive language or fantasies
2	Declaration of intention	15	Interest in attackers or violent extremism
3	Evidence of displacement	16	References to weapons
4	Extremes of anger	17	Known history of violence
5	Escalation in anger or increasing preoccupation	18	Homicidal ideation
6	Highly personal quest for justice	19	Delusion of loving relationship
7	Demands to change behaviour	20	Delusions of jealousy
8	Demand for money / apology	21	Belief in shared past or destiny
9	Prolific correspondence	22	Belief people are imposters or possessed
10	Awareness of personal details	23	Threat to personal integrity
11	History of intrusive behaviours	24	Belief in own divinity / divine mission
12	End-of-tether language	25	Gut reaction
13	Suicidal ideation

Aggregated to a Level of Concern: Low / Medium / High.

This is the human baseline NLP tools are meant to augment, not replace.

Key concept 3 — AI in close protection & surveillance

Promising applications

Basic administrative tasks
Open Source Intelligence (OSINT) analysis — "what is the threat towards a person? what is the source? what can we expect? there is so much information that humans can't read it all anymore."
Dashboards: collect / summarise / visualise / analyse social-media chatter; prioritise.
Sentiment analysis: track general attitude towards a public figure over time.
Prioritisation models: rank incoming messages by predicted call-for-violence score so analysts review the highest-risk first.
Topic / hashtag analysis: word clouds, e.g., the #wefpuppet conspiracy hashtag: 2,230+ posts, with politician X called a "puppet of the World Economic Forum" 253 times.
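
The prioritisation idea in this list reduces to sorting an inbox by a model score and handing analysts the top of the queue. A minimal sketch follows; the `Message` class, the `review_queue` function, the scores and the capacity of two are all hypothetical, and the scores are assumed to come from some upstream model.

```python
from dataclasses import dataclass

@dataclass
class Message:
    msg_id: str
    text: str
    score: float  # predicted call-for-violence score in [0, 1], from an upstream model

def review_queue(messages, capacity):
    """Return the `capacity` highest-scoring messages, highest first."""
    ranked = sorted(messages, key=lambda m: m.score, reverse=True)
    return ranked[:capacity]

inbox = [
    Message("m1", "(message text)", 0.12),
    Message("m2", "(message text)", 0.91),
    Message("m3", "(message text)", 0.47),
    Message("m4", "(message text)", 0.78),
]

# Analysts can read two messages right now: which two come first?
queue = review_queue(inbox, capacity=2)
print([m.msg_id for m in queue])
```

The ranking only decides the order in which humans read messages, not what happens to the sender. Anything the model scores too low never reaches the top of the queue, so a biased score translates directly into missed threats.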

Pressing challenges

Bias in proprietary models (Study 1)
Context-blindness in dictionary models (Study 2)
Black-box decisions in high-stakes domains
False-positive cost (innocent people surveilled or arrested)
False-negative cost (missed credible threats)

The base-rate problem (heavily emphasised on the slides — and it is mock Q4)

Predicting rare events (terror attacks) runs into the base-rate fallacy. Take a model that is 99% accurate (read here as 99% sensitivity and 99% specificity) and apply it to 100 million people where the true base rate is ~0.01% (about 10,000 people): it correctly flags about 9,900 of them but also produces ~999,900 false positives, so roughly 99% of "flagged" people are innocent. High accuracy/AUC does not make such a tool deployable. (This is exactly the answer to mock Q4 in L8; see also the answer skeleton in Minimum to Pass.)
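
The arithmetic behind the base-rate problem, worked through step by step. "99% accurate" is read here as 99% sensitivity and 99% specificity, which is an assumption; the variable names are illustrative.

```python
from fractions import Fraction

population = 100_000_000
base_rate = Fraction(1, 10_000)    # 0.01% of people are true cases
sensitivity = Fraction(99, 100)    # assumed: 99% of true cases get flagged
specificity = Fraction(99, 100)    # assumed: 99% of non-cases are cleared

true_cases = population * base_rate                # 10,000 people
non_cases = population - true_cases                # 99,990,000 people
true_positives = true_cases * sensitivity          # correctly flagged
false_positives = non_cases * (1 - specificity)    # innocent but flagged

# Precision: of everyone flagged, what share is a true case?
precision = true_positives / (true_positives + false_positives)

print(f"true positives:  {int(true_positives):,}")
print(f"false positives: {int(false_positives):,}")
print(f"share of flagged people who are true cases: {float(precision):.2%}")
```

Under these assumptions there are about 101 innocent people flagged for every true case, so fewer than 1 in 100 flags is correct even though the model is "99% accurate".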

Three modes of using AI in threat assessment (slide taxonomy)

The slides name three deployment modes, escalating in delegated agency:

Support — AI helps with admin / data gathering; humans decide.
Human-in-the-loop — AI prioritises/flags; a human reviews every consequential output.
Full replacement — AI decides autonomously (rejected here for high-stakes use).

Plus two named challenges worth a sentence: GenAI hallucination (LLMs generate plausible, not necessarily true, text and "don't know doubt") and the EU AI Act prohibition on AI-based individual crime-risk prediction.

Why this matters for an open society

Security in Open Societies (Democracy & Good Governance pillar): protecting public figures and minoritised politicians from harassment is a condition for contestable democracy. The Dutch-politician study shows that informal-institution abuse against female and minority politicians can drive them out of office, narrowing democratic participation.
Democracy & Good Governance more broadly: dashboards and OSINT enable proportionate threat assessment, but un-audited AI tooling can simultaneously under-protect women and minorities (because of biased models) and over-police peaceful expression (false positives).
Equity & Diversity: the identity-attack-measure miscalibration is itself a form of algorithmic injustice — the tool that's supposed to detect hate-speech against women fails to recognise the most common form of it.

Likely essay-question angles

"Contrast dictionary-based NLP with supervised ML for threat assessment. Use the politician-abuse and incel-language studies as worked examples. Which is more appropriate for a police investigative-psychologist's workflow, and why?"
"Van der Vegt et al. (2023) urge to 'proceed with caution.' Specify the caution: where did the Google Perspective API mis-classify? Why does this matter for which politicians get protected?"
"How does CTAP-25 differ from an algorithmic risk score? Argue whether NLP should replace or only augment structured professional judgement."
"AI-augmented OSINT enables real-time prioritisation of social-media threats. List two civil-liberty risks of deploying such tools at scale, and one design choice that mitigates each."

Quick self-test

Define NLP in one sentence.
Distinguish dictionary-based NLP from supervised ML; one strength and one weakness of each.
Six Perspective-API measures used in Study 1?
State the main empirical finding of Study 1 and the main methodological caveat the lecture slides attach to it.
How is the incel violent-extremist dictionary constructed? Why is dictionary-based NLP especially useful for interpretable analysis of an obscure subculture?
What is CTAP-25 and what does "Level of Concern" mean?
Two civil-liberty risks of AI-augmented OSINT?

Source slides

AIOS-lecture5_vdVegt_2026-iv.pdf