Manuscript: Peer Review 2027: Scenarios for Academic Publishing in the Age of AI
Authors: Kevin Munger, Bert N. Bakker, Adam J. Berinsky, Nathalie Giger, Andy Guess, Natascha Just, Regina Lawrence, Keren Tenenboim-Weinblatt, and Arnout van de Rijt
Date of Review: February 2026
This manuscript, produced collaboratively by a group of social science journal editors, uses a scenario-casting framework to examine how large language models (LLMs) are reshaping academic peer review and publishing. The authors articulate four possible equilibria — "Let it Rip," "Shut It Down," "Only Editors and Reviewers Get To Use AI," and "Increase AI Production, Increase Human Evaluation" — and advocate for the fourth as the most epistemically defensible and institutionally viable path. The paper is timely, practically grounded, and benefits from the unusual vantage point of its author team: these are working editors writing from direct editorial experience rather than researchers theorising from the outside.
The manuscript's most distinctive strength is precisely its authorship. The multi-editor, multi-disciplinary perspective lends it credibility that a sole-authored theoretical treatment would lack, and it is refreshing to encounter editorial testimony about observed submission patterns — the "OOOTP effect," the increase in superficially polished but methodologically thin manuscripts, the desk-rejection burden — presented as data rather than speculation. This kind of practitioner knowledge is valuable and underrepresented in the literature on peer review reform.
The historical framing in Section 2 is well-executed. The reminder that peer review is a remarkably recent and contingently assembled institution — that Nature required no external review until 1973, that Einstein found the practice offensive — helpfully dislodges the assumption that the current system represents a natural or optimal equilibrium. This creates useful conceptual space for reform.
The scenario-casting methodology is appropriate for the problem at hand. Unlike predictive modelling, scenarios do not require empirical certainty about how AI will develop; they instead clarify the decision space and surface the trade-offs implicit in different policy choices. The authors use the framework honestly, acknowledging that the "Let it Rip" scenario may represent where things are heading by default even if they regard it as undesirable.
The call in Section 5.4 for new streams of metascientific data — disclosure tracking, longitudinal surveys, editorial experiments — is one of the paper's most important practical contributions and one that has received insufficient attention in the broader literature.
1. Disciplinary scope and reader positioning. The authors explicitly restrict their analysis to social science (political science, sociology, communication). This is defensible, but it creates a tension the manuscript does not fully resolve. Much of the discussion of rising submission rates, LLM-generated spam, and reviewer strain applies equally or more acutely to STEM fields and to the biomedical literature. Conversely, several of the proposed remedies (editorial board expansion, human-centric review, resistance to automated evaluation) are advocated on epistemological grounds specific to the interpretive nature of social science. The manuscript would benefit either from a more explicit account of why STEM is different or from qualifying its claims more carefully, so that social science's challenges are not inadvertently positioned as universal.
2. The scenario framework is uneven. The four scenarios receive markedly different treatment. "Let it Rip" and "Increase AI Production, Increase Human Evaluation" are developed with care; "Shut It Down" receives a nuanced examination of both its merits and its practical unenforceability; but "Only Editors and Reviewers Get To Use AI" is treated almost dismissively — the authors acknowledge the obvious self-interest in this position and then move on. Given that some publishers are currently implementing tiered AI policies that functionally approximate this scenario (AI-assisted desk rejection by editorial staff, strict prohibition for external reviewers), a more sustained analysis of whether a genuinely asymmetric model might be defensible — not self-serving but structurally justified — would strengthen the paper.
3. The confidentiality argument is not fully integrated. Section 4.2 contains a compelling argument that reviewers' use of commercial LLMs violates manuscript confidentiality because unpublished content is transferred to a third-party system without author consent. This is arguably the most practically actionable ethical argument in the paper, and it applies with near-identical force to the authors' own recommended use of "journal-controlled AI tools" in Section 5.3. The manuscript attempts to resolve this by stipulating open-source, locally-run models, but the operational challenges of this recommendation (computational infrastructure, fine-tuning expertise, maintenance burden) are substantially underplayed. Journals, particularly society journals without institutional IT support, could not realistically implement this at scale without significant investment, a cost the manuscript does not address.
4. The equity discussion is underdeveloped relative to its importance. The paper notes gender gaps in AI adoption, the potential for AI to lower barriers for non-native English speakers, and the asymmetric risks facing junior scholars under ambiguous norms. These are significant equity dimensions, but they appear in scattered paragraphs rather than as an integrated analytical thread. The observation that senior tenured scholars may be freer to adopt AI maximally — having already established reputational capital — while junior scholars face greater career risk from disclosing AI use has profound implications for how disclosure norms are designed and enforced. This deserves dedicated treatment.
5. The absence of engagement with existing publisher infrastructure. The manuscript presents its recommendations largely in a policy vacuum. The Committee on Publication Ethics (COPE) has issued guidance on AI in peer review; major publishers including Elsevier, Springer Nature, and Wiley have published AI policies of varying sophistication; initiatives such as STM's work on research integrity and AI are ongoing. The paper's contribution would be sharpened considerably by positioning its recommendations relative to these existing frameworks — either building on them, critiquing them, or explaining what they fail to address.
Recommendation: Major Revision
This is a paper worth publishing in a venue concerned with the future of scholarly communication. Its practitioner perspective, its honest use of scenario methodology, and its call for metascientific data infrastructure make genuine contributions. However, the five major concerns identified above (unresolved disciplinary scope, uneven scenario treatment, the insufficiently examined confidentiality-versus-journal-AI tension, underdeveloped equity analysis, and lack of engagement with existing publisher-level governance) require substantive revision before the manuscript is ready for publication. The authors should also consider whether framing the paper as an op-ed-adjacent position paper or as a more rigorously evidenced scholarly argument would best serve their aims; it currently occupies an uncomfortable middle ground that revision could resolve in either direction.
This manuscript is unusual in that its authority derives substantially from the editorial experience of its authors rather than from primary research. This is a legitimate and valuable form of scholarly contribution, but it places particular demands on reviewers: we are in effect being asked to evaluate practitioner testimony and analytical argument rather than methodology and findings. I would encourage the editor to consider whether one or more additional reviewers with direct experience in academic publishing operations, potentially from the publisher or society journal side rather than exclusively the editor side, might provide complementary perspectives that the current review team cannot.
I would also flag that the manuscript's recommendation favouring open-source, locally-run AI tools for editorial screening, while defensible in principle, may be technically naive in ways that could undermine the practical uptake of the paper's broader recommendations. A reviewer with technical expertise in LLM deployment would be better placed than I am to evaluate whether this proposal is operationally realistic within the infrastructure constraints of small to mid-size society journals.
Finally, the paper was produced as the output of a named workshop. The editor may wish to consider whether the institutional context of its production — and the self-interest of working editors in shaping norms that govern their own workload and authority — should be disclosed more prominently in the final published version, in the spirit of the transparency norms the paper itself advocates.