Bottom line up front: Across the closest reference classes I could find — small ideological forums, radical-flank movements with non-violent majorities, and private/semi-private extremist channels — the empirical record cuts strongly against LessWrong's permissive policy on specific calls for violence. Documented pipelines from forum discussion to real-world attacks are dense and well-traced (Iron March → Atomwaffen and ≥5 murders; 8chan → Christchurch, Poway, El Paso, Buffalo; Terrorgram → Bratislava and 34 other crimes per ProPublica/Frontline); documented cases of public counter-argument deradicalizing a would-be attacker are essentially absent from the peer-reviewed and government-investigation record. The honest "ambiguity" zone is narrow and concerns mostly deplatforming-migration effects, which argue for an enforced norm against advocacy of violence (the policy on essentially every comparable platform), not for explicit permission.
Tompkins's (2015) quantitative re-evaluation of NAVCO data found radical flanks correlate with decreased mobilisation (OR = 1.67, p<0.01) and higher state repression (β = 0.82, p<0.01). Simpson, Willer & Feinberg (PNAS Nexus 2022) found radical flanks can boost moderate factions by contrast — but only when moderates publicly denounce the radicals (their experiment showed radical-tactic exposure reduced identification with the radical faction from M=3.37 to M=2.08, t=13.03, p<0.001). The Ellefsen (2018, Qualitative Sociology) Quebec-FLQ case study and the 2025 Terrorism and Political Violence paper "Unraveling the Radical Flank Effect" both find that moderates protect their movements by "publicly denouncing violence, avoiding interactions with radicals, and signal[ing] to state authorities intent to de-escalate." LessWrong's current policy does roughly the opposite.
Schuurman, Lindekilde, Malthaner, O'Connor, Gill & Bouhana, Studies in Conflict & Terrorism (2019, 42:8), drawing on the EU PRIME project: "ties to online and offline radical milieus are critical to lone actors' adoption and maintenance of both the motive and capability to commit acts of terrorism."
The mechanism is reinforcement, not erosion. Subsequent leakage research (Meloy & Gill 2016, Journal of Threat Assessment and Management 3:1, n=111) found leakage behavior in 85% of the lone-actor sample: would-be attackers do broadcast intent, but engagement with the radical milieu sustains rather than dissuades them.
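To give a sense of the sampling uncertainty behind the 85% figure, here is a minimal sketch that computes a 95% Wilson score interval for a proportion of 0.85 observed in a sample of 111. The function name `wilson_interval` and the choice of the Wilson method are assumptions made for illustration; they are not taken from Meloy & Gill's paper.

```python
import math


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Figures from the text: leakage in 85% of a lone-actor sample of n=111.
low, high = wilson_interval(0.85, 111)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 0.77 to 0.90, so the qualitative claim that most lone actors leak intent does not hinge on the exact point estimate.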
Carthy et al. (Campbell Systematic Reviews, 2020) reviewed 19 mostly-RCT studies and concluded that evidence on "intent to act violently is inconclusive"; effects were limited to in-group/out-group attitudes, and "persuasion did not have a significant effect."
Bélanger et al. (Frontiers in Psychology) found high-need-for-closure individuals (precisely the subgroup most at radicalization risk) showed psychological reactance: counter-narratives can "produce the opposite of the desired effect and increase people's support for violent extremist groups."
RAND's evaluation of the Redirect Method (Helmus & Klein, RR-2813, 2018) could only measure exposure and click-through, not attitude change or behavioral outcomes, and explicitly acknowledged "a fundamental gap remains in the understanding of the effectiveness of such programs."
Across the major datasets — NCAVC Lone Offender Study (2019); Gill, Horgan & Deckert (n=119); Meloy & Gill TRAP-18 validation (n=111); Schuurman et al.'s pre-attack-behaviour codebook (198 variables, n=55); the NY AG's Buffalo report; the Royal Commission on Christchurch — the operative pre-attack mechanisms are leakage, fixation, identification, and pathway warning.
Where attacks are prevented, prevention is achieved by law-enforcement action triggered by leakage, not by community counter-argument changing the attacker's mind. The documented deradicalization successes (Life After Hate, EXIT-Germany, ISD's "Counter Conversations") are uniformly private, peer-mentored, long-term interventions by trained formers — not public forum debate.
PauseAI and Stop AI both enforced no-violence-talk norms; Stop AI's own statement says Moreno-Gama "joined the Stop AI public online forum, introduced himself, then asked, 'Will speaking about violence get me banned?' After he was given a firm 'yes,'" he stopped posting. He later acted alone, writing in a manifesto that "If I am going to advocate for others to kill and commit crimes, then I must lead by example and show that I am fully sincere in my message."
This is consistent with three different causal stories (suppression caused him to act alone instead of being deradicalized; suppression prevented him from finding collaborators for a more sophisticated attack; no causal effect). An n=1 cannot distinguish these. But the closest reference-class analogs (Iron March, Terrorgram, the Bratislava pipeline) all show permissive forums producing coordination and identification with predecessors — i.e., the alternative path is not low-risk.
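The point that a single case cannot separate the three causal stories can be sketched with Bayes' rule. In the sketch below, the hypothesis labels, the flat priors, and the likelihood value of 0.9 are all illustrative assumptions, not estimates from any source. The only thing the sketch demonstrates is that when every hypothesis predicts the observed sequence about equally well, the posterior equals the prior.

```python
def posterior(priors: dict[str, float], likelihoods: dict[str, float]) -> dict[str, float]:
    """Apply Bayes' rule over a discrete set of hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


# Hypothetical labels for the three causal stories in the text.
priors = {
    "suppression_displaced_him": 1 / 3,
    "suppression_blocked_collaborators": 1 / 3,
    "no_causal_effect": 1 / 3,
}
# Assumed: each story predicts "told no, went quiet, acted alone" equally well.
likelihoods = {h: 0.9 for h in priors}

post = posterior(priors, likelihoods)
for h, p in post.items():
    print(f"{h}: prior={priors[h]:.3f} posterior={p:.3f}")
```

Any evidence that would actually discriminate between the stories has to come from cases where the hypotheses predict different outcomes, which is why the argument turns to the reference-class analogs.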
Ribeiro et al. (CSCW 2021), studying the r/The_Donald and r/Incels migrations, found that users who follow a deplatformed community to its new home sometimes become more toxic per capita. Cima et al. (2024, arXiv 2401.11254), studying Reddit's "Great Ban," found that 15.6% of affected users left Reddit while 5% increased their toxicity by more than 70%. Chandrasekharan et al. (2017) and ADL/Squire's Bad Gateway (2023) found that deplatforming reduces overall hate-content production and audience.
The relevant comparison for LW is not "ban LW vs. permit violence on LW" but "enforce a no-violence-advocacy norm on LW vs. permit violence advocacy" — and on that comparison, the migration literature is silent or supportive of moderation.
Animal rights / SHAC. SHAC operated a moderated public-facing website that explicitly published "top 20 terror tactics" and identifying information about HLS-adjacent employees. Initially it framed itself as lawful and pre-cleared content with barristers. The actual outcome was an escalation to assault (Brian Cass beaten outside his home; Andrew Gay sprayed with chemicals on his doorstep), letter-bombs, secondary/tertiary harassment campaigns, and the SHAC-7 federal convictions for conspiracy to violate the Animal Enterprise Protection Act. Thirteen UK SHAC members were jailed in 2009 for between 15 months and 11 years.
The "we're a debate forum, the violence is separate" framing did not survive contact with operational reality — the website itself was found by US and UK courts to be the mechanism of incitement and coordination.
Earth First! / ELF split (1992). Earth First! adopted a public non-violence-against-persons code, which was the explicit precipitant of the Brighton split that founded the Earth Liberation Front. Property destruction continued (the FBI's Operation Backfire indicted 18 in 2006; the Vail arson; "the Family" cell carried out 40+ arsons 1996–2001), but the moderate-flank-with-clear-non-violence-norm strategy preserved Earth First!'s public legitimacy. The radical wing that did commit attacks split off rather than co-existing inside the moderated discussion space.
Anti-abortion movement / Army of God. The Army of God Manual and the "Defensive Action Statement" emerged from a 1988 Atlanta jail cell where Operation Rescue arrestees, housed together, could "spell out their preferred tactics" (per the SPLC's "Violence and the Anti-Abortion Movement"). The resulting decades of arson, attempted murder, and murder (David Gunn, John Britton and the Barretts in Pensacola, Barnett Slepian, the Atlanta and Birmingham bombings, Tiller in 2009, Colorado Springs Planned Parenthood in 2015) trace a clear path from a discussion-permissive subculture to leaderless-resistance violence. Mainstream pro-life organizations were forced into explicit denunciation, and the literature is unanimous that the Army of God's violence "alienated many in the larger anti-abortion movement": the radical-flank-as-poison-pill effect on the broader movement's political legitimacy.
Civil rights movement (the contrasting case). SNCC, SCLC, and CORE invested heavily in training in nonviolence (Lawson's Nashville workshops; SNCC's "Statements of Discipline"; SCLC's "Handbook for Freedom Army Recruits") and enforced it as an internal norm. Chenoweth & Stephan's NAVCO data (Why Civil Resistance Works, Columbia, 2011) subsequently showed that 53% of nonviolent campaigns succeeded vs. 26% of violent ones; Chenoweth's "3.5% rule" finds every campaign crossing that threshold was primarily nonviolent. The closest analogy to LW would have been if SCLC had said "we'll allow advocacy of violent direct action in our pamphlets and trust that disagree-votes will deradicalize Stokely Carmichael." They didn't, and the empirical record vindicates them.
Climate movement. Just Stop Oil, Extinction Rebellion, and the broader A22 network have explicit non-violence codes; ELF, Deep Green Resistance, and SLDT in France function as the radical flank. Social Change Lab's empirical work finds Just Stop Oil's nonviolent disruption increased support for moderate climate groups (positive radical-flank effect), but this entire structure depends on the nonviolent majority clearly distancing itself from any violent fringe. Carnegie Endowment's 2025 "Why Climate Sabotage Remains an Unlikely Strategy" review notes that movements unable to "rein in the activities of less principled members" risk being branded as terrorism, with the Chenoweth/Stephan dataset suggesting violence against humans would backfire.
Anti-AI movement. Stop AI's bifurcation (Reichstadter/Kirchner expelled from PauseAI in 2024; Kirchner missing after allegedly assaulting another organizer for proposing abandoning nonviolence) and PauseAI's tight enforcement against violence-talk are textbook moderate-flank-distancing behaviour. Moreno-Gama's path (joining PauseAI's Discord, with 34 posts over two years, none with explicit violence calls but one flagged "ambiguous"; joining Stop AI's forum; being told violence-talk gets banned; going quiet; then writing his manifesto and attacking) is consistent with the model that suppression of in-community advocacy doesn't trivially deradicalize, but it does not refute the case for such suppression either. Critically: he acted alone. There is no evidence of any coordinated cell forming.
Iron March (2011–2017). Functioned as a moderated public forum (per CTC West Point's analysis of the leaked SQL database, founder Alexander Slavros sent ~700 DMs and wrote 7,600 forum posts actively curating discussion). The forum had the LW theory of change available: there was constant debate and internal pushback (Heimbach later "flinched at the idea of TWP becoming another Atomwaffen"). It produced Atomwaffen Division, National Action, Antipodean Resistance, and Feuerkrieg Division. Three premeditated violent plots were disrupted while it was online; the majority of skull-mask-network terrorism came after its disappearance. In other words, the forum itself was a coordination and identity-formation engine, not a deradicalization one.
Terrorgram (2019–2024). A PBS/ProPublica investigation identified 35 linked crimes, including the Bratislava shooting; the US, UK, Canada, and Australia have designated it a terrorist organization. The Humber/Allison indictment provides direct prosecution evidence of forum-to-attack coordination.
764 / the Com. Per ISD Global (2025): "Between 2020 and 2025, 191 members of 764 (or members of affiliated groups) in 28 different countries have been arrested." The competitive-radicalization-inside-semi-private-channels pattern is the inverse of what LW's theory predicts should happen.
Incels (r/Incels, r/Braincels). Reddit allowed both for years; the eventual bans came after Elliot Rodger's spree and the 2018 Toronto van attack. Empirical evaluations (Chandrasekharan et al. 2017; Ribeiro et al. 2021; the Reddit "Great Ban" study) find mixed but on-net positive effects of community-level moderation on aggregate hate content; the migration cost is real but smaller than the within-platform benefit.
Reconquista Internet (the strongest counter-speech evidence). Garland et al. (EPJ Data Science 2022, n=1.1M tweets across 22 prominent German Twitter accounts) found that after RI emerged in April 2018, the proportion of hate speech in sampled conversations fell from ~30% to ~25%, while counter-speech rose from 13% to 22%. But the authors explicitly state: "we make no causal claims due to the complexity of discourse dynamics," and the outcome measured was the ratio of speech types, not behaviour of would-be attackers. This is the strongest signal in the empirical record for the counter-speech theory, and it does not speak to attack rates at all.
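The size of the Reconquista Internet shift reads differently depending on whether it is stated in percentage points or as a relative change. The sketch below works through both using the approximate shares quoted above; the helper name `changes` is a hypothetical one chosen for this illustration, and the inputs are the rounded figures from the text rather than the study's underlying data.

```python
def changes(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    point_change = after - before
    relative_change = point_change / before * 100
    return point_change, relative_change


# Approximate shares quoted in the text (percent of sampled conversations).
hate_pp, hate_rel = changes(30.0, 25.0)
counter_pp, counter_rel = changes(13.0, 22.0)

print(f"hate speech:    {hate_pp:+.1f} points ({hate_rel:+.1f}% relative)")
print(f"counter-speech: {counter_pp:+.1f} points ({counter_rel:+.1f}% relative)")
```

On these rounded inputs, hate speech falls by 5 points (about 17% relative) while counter-speech rises by 9 points (about 69% relative). Both figures describe the mix of speech, not the behaviour of any individual.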
Leakage is the single most prevalent pre-attack warning behavior (Meloy & Gill 2016 found leakage in 85% of their lone-actor sample of 111; Schuurman et al. found that ~96% of the NCAVC lone-offender sample produced writings intended to be viewed).
Permissive forums could in principle be a honeypot for early law-enforcement intervention — three premeditated violent plots were disrupted while Iron March was online.
But (a) the leakage literature (Kupper & Meloy's "Going Dark") shows attackers actually become less publicly active near the attack; (b) the LW model relies on community counter-argument, not LE intervention; and (c) the LW user base is unlikely to systematically report violence advocacy to authorities. So the leakage-benefit argument applies more to monitored permissive forums (the way the FBI used Iron March data) than to LW.
For your post:
Benchmarks that would change the recommendation: