Content is user-generated and unverified.

Do the Classic Sales-Psychology Effects Still Hold Up? A 2015–2025 Replication Audit

TL;DR

  • The persuasion / sequential-request effects (door-in-the-face, foot-in-the-door) and classic anchoring survived the replication crisis, but are SMALLER than the famous old papers implied. Door-in-the-face was directly replicated in 2021 with near-identical numbers (Genschow et al.: "compliance rates in our replication were similarly high as those Cialdini et al. (1975) found 45 years ago"), and numeric anchoring is among the most robustly replicated effects in all of psychology (meta-analytic d ≈ 0.82).
  • The "choice" and "pricing-trick" effects are the shaky ones. Choice overload (the jam study) has a meta-analytic average effect of "virtually zero" (Scheibehenne, Greifeneder & Todd 2010, N = 5,036), and the decoy effect collapsed in well-powered lab replications (Frederick et al. 2014; Yang & Lynn 2014), surviving only weakly — "roughly 1% change in preference" — in real-world transaction data (Devine et al. 2025).
  • Verdict for a sales playbook: lead with anchoring, door-in-the-face and foot-in-the-door (robust but modest); treat decoy, choice-overload and price-order tactics as "sometimes works, context-dependent," not laws of nature. The user is right to be skeptical of citing 1966/1975/2003 papers as if settled — but for these three effects there is strong 2014–2025 evidence to defend them, while for decoy/choice-overload/Ariely-SSN the modern evidence is genuinely weaker and should be flagged.

Key Findings (verdict per effect)

| Effect | Classic cite | Verdict (2015–2025) | Best recent evidence |
|---|---|---|---|
| Door-in-the-face | Cialdini 1975 | ROBUST but modest | Genschow et al. 2021 direct replication; Feeley et al. 2012 meta (r = .126 verbal) |
| Foot-in-the-door | Freedman & Fraser 1966 | ROBUST but small | Pascual & Guéguen 2005 meta; r ≈ .09–.17 |
| Anchoring (estimation) | Tversky & Kahneman 1974 | VERY ROBUST | Many Labs 1 (2014); Schley & Weingarten 2025 d = 0.82 |
| Anchoring (price/valuation) | Ariely et al. 2003 (SSN) | MIXED / CONTESTED | Fudenberg et al. 2012 & Maniadis et al. 2014 weak/failed; some succeed |
| Decoy / asymmetric dominance | Huber et al. 1982; Ariely | WEAK / context-bound | Frederick et al. 2014 & Yang & Lynn 2014 failures; Devine et al. 2025 ~1% |
| Choice overload (jam study) | Iyengar & Lepper 2000 | WEAK / FAILED as a main effect | Scheibehenne et al. 2010 ≈ 0; Chernev et al. 2015 moderators |
| Price presentation order | Suk, Lee & Lichtenstein 2012 | THIN evidence | Original only; no large independent replication found |
| Endowment effect | Thaler 1980; Kahneman et al. 1990 | ROBUST but bounded | Tunçel & Hammitt 2014 meta, WTA/WTP = 3.28 |

Details

1. Door-in-the-face (DITF) — start big, concede to a smaller request

Verdict: ROBUST, but the real-world effect is modest. This is the strongest "good news" story for defending a classic citation. Genschow, Westfal, Crusius et al. (2021), in the Journal of Personality and Social Psychology (vol. 120, no. 2, pp. e1–e7), ran a direct replication of Cialdini's 1975 original with 391 participants (≈5× the original sample). Asking passersby to chaperone juvenile delinquents on a zoo trip: 34% complied in the small-request-only condition vs. 51% in the large-then-small (DITF) condition — nearly identical to Cialdini's original rates. They concluded: "at least some social psychological findings can transcend a particular time, place, and population."
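To put the 34% vs. 51% compliance rates on a standard effect-size scale, the sketch below computes the risk difference, the odds ratio, and Cohen's h directly from the two proportions. It is only an illustration: the per-condition sample sizes are not given above, so no significance test is attempted, and the function names are my own.

```python
import math

def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def odds_ratio(p1: float, p2: float) -> float:
    """Odds of compliance under p1 divided by odds under p2."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))

# Compliance rates quoted above for the Genschow et al. (2021) replication.
ditf_rate = 0.51      # large-then-small request (DITF)
control_rate = 0.34   # small request only

risk_difference = ditf_rate - control_rate
print(f"Risk difference: {risk_difference:.2f}")
print(f"Odds ratio:      {odds_ratio(ditf_rate, control_rate):.2f}")
print(f"Cohen's h:       {cohens_h(ditf_rate, control_rate):.2f}")
```

By Cohen's conventional benchmarks an h in this range sits between "small" (0.2) and "medium" (0.5), which fits the "robust but modest" verdict.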

The meta-analytic picture is more sober. Feeley, Anker & Aloe (2012, Communication Monographs, 79(3), 316–343) meta-analyzed 117 studies (1975–2010): "an overall significant effect of the DITF strategy on verbal compliance (k = 78, r = .126), but an insignificant effect for behavioral compliance (k = 39, r = .052)" (95% CIs: verbal .08–.17; behavioral −.02–.12). Translation: DITF reliably gets people to say yes but is far weaker at producing actual money/behavior. It works better for prosocial requests, larger concessions, lower baseline compliance, and a personal connection. It has worked in retail field experiments (Ebster & Neumayr 2008, alpine cheese-selling, 375 consumers) and voter-turnout campaigns.
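Because this audit reports some effects as correlations (r) and others as standardized mean differences (d), a rough conversion helps when comparing them. The sketch below uses the standard textbook conversions, which assume two groups of equal size. Treat the outputs as approximate, since the underlying meta-analyses may have used different conversions.

```python
import math

def r_to_d(r: float) -> float:
    """Convert a correlation to Cohen's d (assumes equal group sizes)."""
    return 2 * r / math.sqrt(1 - r ** 2)

def d_to_r(d: float) -> float:
    """Convert Cohen's d to a correlation (assumes equal group sizes)."""
    return d / math.sqrt(d ** 2 + 4)

# Effect sizes quoted in this audit.
effects_r = {"DITF verbal": 0.126, "DITF behavioral": 0.052}
anchoring_d = 0.824

for label, r in effects_r.items():
    print(f"{label}: r = {r:.3f} -> d ~ {r_to_d(r):.2f}")
print(f"Anchoring: d = {anchoring_d:.3f} -> r ~ {d_to_r(anchoring_d):.2f}")
```

On a common scale, the anchoring estimate is roughly three times the DITF verbal-compliance effect, which is why the two land in different verdict categories despite both being "robust."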

2. Foot-in-the-door (FITD) — small request first, then a bigger one

Verdict: ROBUST but small. Successive meta-analyses (Beaman et al. 1983; Dillard et al. 1984; Pascual & Guéguen 2005) consistently find a real but small effect, around r = .09–.17 (Dillard et al. put both FITD and DITF at r ≈ .15–.17 even under optimal conditions). The effect is condition-dependent: it works best when the initial request is non-trivial (enough to shift self-perception), is performed actively, and carries no large external incentive. Online/e-commerce evidence exists: Guéguen's field experiments showed email/online FITD works, and a computer-mediated field study (n = 900 sports-store customers) found a two-step FITD condition produced more new customers than control. Caution: a 2024 nonprofit volunteer field experiment (500+ participants) found neither FITD nor gift-exchange beat a control with a compelling mission — FITD is not automatic.

3. Anchoring — split into two very different cases

(a) Classic numeric / estimation anchoring: VERY ROBUST. Arguably the best-replicated effect in social psychology. In Many Labs 1 (Klein et al. 2014, ~36 labs, ~5,000–6,000 participants), four anchoring tasks were replicated and were among the largest effects in the entire project — four of the five effects with Cohen's d > 1.0 were anchoring variants (item-level point-biserial r = .64–.91). The 2025 meta-analysis "50 Years of Anchoring" (Schley & Weingarten, SSRN) synthesized 2,603 effect sizes (1,283 directly comparing high vs. low anchors) and found a large effect, d = 0.824, 95% CI [0.765, 0.883] (I² = 93.6%), with "only a small reduction from publication-bias corrections." Röseler & Schütz's large meta-analysis (2022, "Open Anchoring Quest" dataset: 96 studies, N = 21,359, 88,914 trials) found no evidence of publication bias and no difference between published and unpublished effects — rare and reassuring.

Two caveats: (i) "incidental"/subliminal anchoring (random numbers from the environment) is fragile — the Critcher & Gilovich incidental-anchor item failed to replicate, and Many Labs 2 (2018) found a near-zero effect (d ≈ 0.04); (ii) Röseler et al. (2024, Meta-Psychology, >50,000 estimates) showed that individual susceptibility to anchoring cannot be measured reliably — anchoring is a robust situational effect, not a stable personality trait.

(b) Price/valuation anchoring — Ariely, Loewenstein & Prelec (2003) "coherent arbitrariness" / Social-Security-Number study: MIXED & CONTESTED. This is the citation the user should be most careful with. Fudenberg, Levine & Maniadis (2012, AEJ: Microeconomics) re-ran the Ariely manipulation and found "much weaker anchoring effects" on product valuations and "no anchoring effects" on lotteries. Maniadis, Tufano & List (2014, AER) re-ran Study 2 (aversive sounds) and reported a "failure to replicate" the strong effect — though Simonsohn (Data Colada) argued their data were actually statistically consistent with the original and merely underpowered. A 2019 Judgment and Decision Making paper concluded WTP anchoring is real but typically smaller than the original. Bottom line: the SSN/willingness-to-pay demonstration is not a settled result — citing it as proof is risky, even though general anchoring is rock-solid.

4. Decoy effect / asymmetric dominance ("the Economist subscription trick")

Verdict: WEAK / strongly context-bound. A genuine cautionary tale. Frederick, Lee & Baskin (2014, Journal of Marketing Research, "The Limits of Attraction") and Yang & Lynn (2014, JMR) ran many well-powered replications and largely failed to reproduce the attraction/decoy effect. Yang & Lynn reported only ~11 reliable effects out of 91 attempts ("significantly fewer than expected"). Their conclusion: the decoy effect is largely an artifact of "stylized" stimuli (two numeric attributes, text descriptions) and mostly vanishes with realistic, pictorial, multi-attribute products. Defenders (Huber, Payne & Puto 2014; Simonson 2014) replied that it holds when conditions are properly replicated, and some real-product follow-ups recover it (Lichters et al. 2017). The most ecologically valid recent test — Devine, Goulding, Harvey, Skatova & Otto (2025, npj Science of Learning, 10:60), analyzing 3.6 million UK grocery wine transactions — found the decoy effect does occur in the wild, but: "The strength of these effects was modest overall (roughly 1% change in preference) and ... depended on consumers' idiosyncratic histories of experience." So: real, but small and fragile — nothing like the dramatic flips in Ariely's anecdote.
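To get a feel for how small a "roughly 1%" shift is in practice, the sketch below estimates the sample size needed to detect it in a two-arm test. Two assumptions here are mine, not from the source: the 1% is read as one absolute percentage point, and the baseline choice share of 20% is hypothetical. The formula is the usual normal-approximation sample size for comparing two proportions.

```python
import math
from statistics import NormalDist

def n_per_arm(p_control: float, p_variant: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-arm sample size for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_variant) / 2
    pooled_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    unpooled_term = z_beta * math.sqrt(
        p_control * (1 - p_control) + p_variant * (1 - p_variant)
    )
    return math.ceil((pooled_term + unpooled_term) ** 2
                     / (p_variant - p_control) ** 2)

# Hypothetical baseline: 20% of shoppers pick the target option without a decoy.
baseline = 0.20
print("1-point shift:", n_per_arm(baseline, baseline + 0.01), "per arm")
print("5-point shift:", n_per_arm(baseline, baseline + 0.05), "per arm")
```

Under these assumptions a one-point shift needs tens of thousands of observations per arm, which helps explain why it shows up in millions of transactions but not in typical lab samples.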

5. Choice overload — Iyengar & Lepper (2000) "jam study"

Verdict: FAILED as a universal main effect; survives only as a conditional, moderated effect. Scheibehenne, Greifeneder & Todd (2010, Journal of Consumer Research, 37(3), 409–425) meta-analyzed 63 conditions from 50 experiments (N = 5,036) and found "a mean effect size of virtually zero but considerable variance between studies" — the headline "more choice hurts" claim did not replicate as a general law, and "no sufficient conditions could be identified" for a main effect. Chernev, Böckenholt & Goodman (2015, Journal of Consumer Psychology) meta-analyzed 99 observations (N = 7,202) and reconciled the picture: choice overload is real but conditional, appearing when four moderators are present — high choice-set complexity, high decision-task difficulty, high preference uncertainty, and an effort-minimizing decision goal. Practical translation: piling on options does not reliably backfire; it backfires only under specific, identifiable conditions. The question shifted from whether to when.

6. Price presentation order — Suk, Lee & Lichtenstein (2012)

Verdict: THIN / under-replicated. The original (Journal of Marketing Research, 49(5), 708–717) showed descending price order (high→low) shifts choices toward higher-priced options — including a real bar field study over 8 weeks that raised revenue per beer (about $0.24 more per beer sold). It is theoretically grounded in reference-dependence and price-quality inference, and has been cited and built upon, but I found no large independent direct replication of the high-to-low ordering effect. Adjacent, better-replicated work exists on related framing — e.g., Allard, Hardisty & Griffin (2019) differential price framing, conceptually replicated in a 2023 field study of 45,626 add-to-cart events (Köcher et al., Marketing Letters), which notably found the effect was "considerably less pronounced in actual purchase patterns" than in intentions. Treat price-order as plausible and theory-consistent, but not independently confirmed at scale.

7. Endowment effect (relevant to $1 trials / "ownership" feeling)

Verdict: ROBUST but bounded. The WTA/WTP gap is one of the most replicated findings in behavioral economics. Tunçel & Hammitt (2014, Journal of Environmental Economics and Management, 68(1), 175–187) meta-analyzed the literature and found an overall WTA/WTP ratio of 3.28 (largest for public/environmental goods, ~6.2; smaller for ordinary private goods). Boundary conditions matter: the disparity is smaller for ordinary market goods, for experienced traders, with incentive-compatible elicitation, and Plott & Zeiler (2005) argued part of it is a procedural artifact (subject misconceptions). Related ownership/effort effects remain solid — a 2026 meta-analysis of the IKEA effect (k = 55, N = 5,454) reports d = 0.57. For sales, the "$1 trial / free trial → ownership feeling" logic rests on a real, well-replicated foundation — with the caveat that experienced buyers and clean market settings shrink it.

Broader replication-crisis context

The Reproducibility Project: Psychology (Open Science Collaboration 2015, Science) replicated 100 studies: only 36% reached statistical significance (39% judged subjectively to have replicated), and replication effect sizes were about half the originals. Many Labs 1 (2014) replicated 10/13 effects; Many Labs 2 (2018, N = 15,305, 36 countries) replicated 14/28 (50%); Many Labs 3 replicated 3/10. Replication rates were somewhat higher in experimental economics (Camerer et al. 2016 ≈ 61%) and in social-science experiments published in Nature and Science (Camerer et al. 2018 ≈ 62%). The pattern that matters for sales psychology: classic cognitive/judgment effects (anchoring) and sequential-request compliance effects (DITF, FITD) held up far better than "social priming" and many flashier social-psychology effects. The pricing/choice effects (decoy, choice overload, price-order, valuation anchoring) sit in the contested middle.
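The Many Labs counts above rest on small numbers of effects, so the headline percentages carry wide uncertainty. As a rough illustration, the sketch below puts 95% Wilson score intervals around each count. It treats every replicated effect as an independent binomial trial, which is a simplification of mine because the effects in each project were hand-picked rather than randomly sampled.

```python
import math
from statistics import NormalDist

def wilson_interval(successes: int, trials: int,
                    confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p_hat + z ** 2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2)
    ) / denom
    return center - half_width, center + half_width

# Replication counts quoted above (effects replicated, effects attempted).
projects = {"Many Labs 1": (10, 13), "Many Labs 2": (14, 28), "Many Labs 3": (3, 10)}

for name, (replicated, attempted) in projects.items():
    low, high = wilson_interval(replicated, attempted)
    print(f"{name}: {replicated}/{attempted} = {replicated / attempted:.0%} "
          f"(95% CI {low:.0%} to {high:.0%})")
```

The intervals overlap substantially, so differences between the project-level rates should be read loosely.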

Recommendations

Tier 1 — cite confidently with recent evidence (defend against the skeptic):

  • Anchoring (numeric/price reference points): cite Many Labs 1 (2014) and Schley & Weingarten (2025, d = 0.82). The single safest effect to build pricing strategy on.
  • Door-in-the-face: cite Genschow et al. 2021 (the direct replication), not only Cialdini 1975. Be honest that it drives agreement more than completed payment (Feeley et al. 2012).
  • Endowment effect / ownership ($1 trials): cite Tunçel & Hammitt 2014; note it shrinks with experienced buyers.

Tier 2 — use, but state the conditions:

  • Foot-in-the-door: make the first request non-trivial and actively performed, with no large incentive attached; expect a modest effect (r ≈ .09–.17).
  • Choice overload: do not claim "fewer options always sells more." Frame it correctly — overload only bites under complexity/uncertainty (Chernev et al. 2015). Use it to justify simplifying genuinely complex decisions, not as a blanket rule.

Tier 3 — flag as weak; don't overclaim:

  • Decoy effect: present as "can nudge choice ~1% in real settings" (Devine et al. 2025), not a reliable revenue lever; acknowledge the 2014 replication failures.
  • Price presentation order: present as a plausible tactic worth A/B testing, not an established law.
  • Ariely SSN anchoring specifically: avoid citing the SSN study as proof; the valuation-anchoring literature is contested even though general anchoring is robust.

Benchmarks that would change these calls: a large pre-registered field replication of price-order or decoy effects showing a >5% conversion lift would upgrade them to Tier 2; a failed high-powered replication of Genschow-style DITF would downgrade DITF. For the user's own funnels: run A/B tests — effect sizes of r ≈ .10–.15 (DITF/FITD) mean you need large samples to detect them reliably, and lab effect sizes routinely overstate commercial impact.
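The "large samples" point can be made concrete with a power calculation. The sketch below uses the Fisher z approximation to estimate how many observations are needed to detect a correlation of a given size with a two-sided test. The alpha of 0.05 and power of 0.80 are conventional defaults I have assumed, not values from the studies cited here.

```python
import math
from statistics import NormalDist

def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate total N to detect correlation r (Fisher z approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)

# Effect sizes in the range this audit reports for DITF and FITD.
for r in (0.052, 0.10, 0.126, 0.15):
    print(f"r = {r:.3f}: about {n_for_correlation(r)} observations")
```

At r = .10 this comes to several hundred observations, and at the behavioral-compliance estimate of r = .052 it runs into the thousands, so a funnel test with a few dozen conversions per arm cannot settle whether a tactic works.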

Caveats

  • Several key sources are working papers/preprints (Schley & Weingarten 2025 on SSRN; Röseler & Schütz 2022 on OSF) — not yet final peer-reviewed versions; numbers may shift slightly. Note: "50 Years of Anchoring" is by Schley & Weingarten, not by Röseler (a common mis-attribution).
  • "Replication success" is defined differently across projects (statistical significance vs. effect-size overlap vs. subjective judgment), so headline percentages (36%, 50%) are not directly comparable.
  • Lab effect sizes overstate real-world commercial impact; the honest field numbers for a sales context are "modest" (DITF) to "~1%" (decoy).
  • I could not locate a large independent direct replication of Suk et al. (2012) price-order; absence of evidence is not evidence of absence, but it warrants caution.
  • Marketing/persuasion literature has historically shown publication bias toward positive results (noted by Feeley et al. 2012 and others); older "it works!" studies should be weighted accordingly. Encouragingly, the anchoring meta-analysis (Röseler & Schütz 2022) is a rare case where no publication bias was detected.