The Patheos New Year prediction contest has returned for 2026 with a twist: this time readers are explicitly asked to compete against AI. The blog’s host fed the contest rules into ChatGPT and published the model’s five headline predictions as part of the public experiment. The post invites readers to submit short, specific forecasts between now and January 15 and frames the contest around specificity, surprise, and verifiability, the qualities that make a prediction memorable when the year closes. It also recalls a past “winner,” a username credited with predicting a 2020 pandemic, and uses that anecdote to set the stakes for what a winning prediction should look like: unlikely at the time but strikingly correct in retrospect.
Background
The Patheos contest: rules, history and the AI wrinkle
Patheos’s annual prediction contest is simple in mechanics but exacting in spirit. Participants are asked to post one or more predictions in the comments. Predictions should be specific (dates, names, numbers when possible), and winners are chosen retrospectively around the next New Year’s Eve based on the surprise and accuracy of the items. Organizers emphasize that these are not supernatural prophecies but ordinary predictions that can be judged by observable events.
In 2026 the organizer has raised the contest’s profile by adding AI as a competitor. They ran the current (2026) version of ChatGPT against the same contest rules and shared five AI-generated predictions to provoke comparison and conversation. The AI’s proposals skewed toward technology, law, culture and social media, and contained a mix of plausible near-term events and visible overreach. Presenting the model’s output as part of the contest both tests the machine’s forecasting ability and highlights differences in how humans and models pick which future events to emphasize.
Why this matters to readers and to prediction culture
Prediction contests are more than entertainment. They are laboratories for probabilistic reasoning, narrative selection, and the social dynamics of credibility. A prediction contest that explicitly pits human intuition against increasingly capable generative models becomes a living test of how humans and machines differ in handling context, surprise, and the economics of low-probability, high-impact events. For Windows and broader tech communities, the contest is a useful prompt to compare human judgement against machine tendencies like over-generalization, dataset bias, and the use of readily available recent trends as prediction anchors.
Summary of the provided material
- The Patheos post invites readers to submit predictions for 2026; the contest runs through January 15 and will be judged at year-end. The post explains that past winners earned their recognition by making very specific, unexpected-but-correct calls.
- The author ran ChatGPT on the same rules and published the model’s five predictions, which included:
  - A major Hollywood studio crediting a screenplay as “Human author with AI assistance,” accepted by the Writers Guild.
  - A U.S. Supreme Court opinion specifically referencing generative AI hallucinations in a canonical footnote.
  - A prominent Christian denomination issuing a pastoral statement discouraging emotionally significant relationships with AI companions.
  - A regional airline going viral after a social-media complaint reveals a generous refund policy that many exploit.
  - A pro sports game paused mid-play because an AI officiating/analytics system produces an absurd ruling widely replayed online.
- The post points to prior AI predictions (from a 2024 experiment) that missed the mark (for example: people claiming emotional relationships with plants becoming a trend; insect protein becoming mainstream haute cuisine), using that record to temper expectations about machine forecasting.
What the Patheos post got right — a measured appraisal
1. The test is a clean, revealing experiment
The post’s experiment is methodologically simple and therefore informative: it applies the contest prompt to a contemporary generative model, posts the results publicly, and invites human entries alongside the AI output. That makes it easy to evaluate later, keeps the human and machine predictions in the same sampling frame, and avoids the post-hoc selection biases that plague many forecasting tests.
2. The AI output exemplifies common generative-model tendencies
The model’s five predictions demonstrate two well-known behaviors:
- An inclination to produce technology-centric, societally framed scenarios. Models tend to emphasize domains where public-training data and visible news cycles supply rich, repeated patterns.
- A mix of plausible (court footnotes about AI, denomination statements about AI companions) and highly imaginative (plants-as-conversational-partners trending, insect haute cuisine breakthroughs) outcomes. This blend is typical of generative systems that balance pattern completion with creative variation.
What to watch for in judging predictions: human vs AI
The Patheos contest uses three informal axes that also serve as a good rubric for comparing humans and machines (a minimal scoring sketch follows the list):
- Specificity: A great prediction names actors, dates, numbers, or contractual language. Machines often produce plausible but fuzzy claims; humans can (and should) be pushed to be concrete.
- Surprise (low prior probability): The winning prediction often reads like something that “no one would have guessed” at the time. Models are biased toward higher prior-probability events drawn from recent news corpora, which reduces surprise.
- Verifiability: Predictions must be anchored to public, observable outcomes. This is where both humans and AI can fail: vagueness or private data dependencies will make retrospective adjudication impossible.
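To make the rubric concrete, here is a minimal Python sketch of how an entry could be captured so human and AI submissions are scored on the same scale; the field names and weighting are hypothetical illustrations, not part of the Patheos rules.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    """One contest entry scored on the three informal axes above (illustrative fields)."""
    text: str                  # the claim itself
    specificity: int           # 0-5: named actors, dates, numbers, exact wording
    prior_probability: float   # entrant's own estimate, 0.0-1.0 (lower = more surprising)
    verifiable: bool           # can it be judged from public, observable outcomes?

def contest_weight(p: Prediction) -> float:
    """Composite score rewarding specific, surprising, checkable claims.

    The weighting is arbitrary; it only shows the shape of a rubric.
    """
    if not p.verifiable:
        return 0.0             # unverifiable claims cannot be adjudicated at year-end
    surprise = 1.0 - p.prior_probability
    return p.specificity * surprise
```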
Why many AI-generated predictions fail in practice
Calibration and base-rate neglect
Generative language models are not calibrated forecasters. They are trained to produce plausible continuations of text, not to estimate probabilities of real-world events. As a result, they:
- Over-weight recent, well-covered trends (newsworthiness bias).
- Struggle with base-rate reasoning: highly unusual events (the kind that win prediction contests) are underrepresented in their training objectives.
This produces predictions that read like good press releases but lack the probabilistic grounding that would make them likely to occur.
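At year-end, one simple way to measure calibration, assuming entrants stated probabilities with their forecasts, is a Brier score; the numbers in the sketch below are made up purely for illustration.

```python
def brier_score(forecasts: list[tuple[float, bool]]) -> float:
    """Mean squared error between stated probabilities and actual outcomes.

    forecasts: (stated_probability, event_occurred) pairs collected at year-end.
    Lower is better; a well-calibrated forecaster's 10% calls should come
    true roughly 10% of the time.
    """
    return sum((p - float(occurred)) ** 2 for p, occurred in forecasts) / len(forecasts)

# Illustrative only: three hypothetical entries judged after December 31.
print(brier_score([(0.15, False), (0.70, True), (0.05, True)]))
```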
Over-generalization and story scaffolding
Models are excellent storytellers: given a cluster of related signals (e.g., generative AI coverage, Writers Guild controversies, and recent court technology rulings), they will stitch together a coherent, attention-grabbing narrative. Coherence is not the same as likelihood. Predictions constructed this way can be striking but fragile: a single missing link in the narrative chain prevents the event.
Lack of access to private or emerging signals
Humans sometimes beat models because they have domain-specific access — a source who knows an airline’s refund policy, an inside track on a studio negotiation, or a denominational synod calendar. Models rely on public training corpora and cannot substitute proprietary leads or direct human-sourced intelligence.
How to craft better human predictions (practical playbook)
Below is a stepwise methodology to produce contest-ready predictions that are specific, surprising, and verifiable (a template sketch follows the list).
- Start with an anchored fact. Pick a verifiable baseline: a contract clause, a regulatory timetable, a corporation’s public guidance, or a scheduled event.
- Identify a credible mechanism. Ask: what single, plausible change or failure would convert the baseline into a noteworthy outcome? (Example: a vendor’s stated roadmap + a known legal ambiguity -> a court case.)
- Add measurable outputs. Include an exact date window, a named actor, and a clear metric or clause (e.g., “by November 15, 2026, Studio X will list credit wording ‘Human author with AI assistance’ for a film Y”).
- Estimate conditional probability. Even if informal, state a confidence level (e.g., 5–20%) to signal risk and to aid later evaluation.
- Provide a falsifiable test. Say exactly how the prediction will be judged true or false (public filing, press release, Guild statement).
Following these steps has three benefits:
- It forces verifiability.
- It minimizes ex-post rationalizations.
- It improves comparability between entries (human or machine).
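As a worked illustration of the playbook, here is a hedged Python sketch of a contest-ready entry; the studio, film, dates, and field names are hypothetical placeholders, not actual forecasts.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ContestEntry:
    """A prediction structured by the playbook steps above (illustrative fields)."""
    anchored_fact: str        # step 1: verifiable baseline
    mechanism: str            # step 2: the single change that converts baseline to outcome
    claim: str                # step 3: exact, measurable wording with named actors and dates
    confidence: float         # step 4: informal probability estimate
    falsifiability_test: str  # step 5: how it will be judged true or false
    judge_by: date            # adjudication deadline

# Hypothetical example only: "Studio X" and "film Y" are placeholders.
entry = ContestEntry(
    anchored_fact="Studio X has publicly floated AI-assisted writing credits",
    mechanism="a negotiated credit clause lands in one high-profile release",
    claim="By November 15, 2026, Studio X credits film Y as 'Human author with AI assistance'",
    confidence=0.10,
    falsifiability_test="the wording appears in the film's public billing block or a Guild statement",
    judge_by=date(2026, 12, 31),
)
```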
Likely domains where human intuition still outperforms generative models in 2026
- Legal adjudication and formal doctrine: Human forecasters with practice-area knowledge can see how a particular case strategy might force a canonical footnote about AI hallucinations; models can posit such a footnote but usually lack the case-level legal procedural insight to make its emergence likely.
- Corporate contract changes: An inside source or relationships with procurement teams reveal the narrow contractual language that will make or break widely publicized policy shifts.
- Localized culture-viral events: Virality often depends on an idiosyncratic mix of social-media timing, influencer networks, and luck. Humans embedded in these networks can sometimes anticipate such cascades; models can narrativize them after the fact.
- Technical misfires with systemic effects: Engineers and product insiders can spot brittle integration points where, for example, AI-operated officiating might produce catastrophic misinterpretation — and can estimate how likely operational risk controls are to fail.
How to evaluate the AI predictions posted in the Patheos experiment
The five AI proposals provided in the post can be evaluated using this framework:
- Credible legal footnote predicting court guidance on “generative AI hallucinations”: Plausibility rating — moderate. Courts are already grappling with AI evidence and summary tools; a doctrinal footnote is plausible, but whether it will be canonical and instantly cited across law reviews is uncertain without a specific case path or litigant strategy identified.
- Hollywood studio screenplay credited as “Human author with AI assistance” and Writers Guild acceptance: Plausibility rating — moderate to low. Studios are experimenting with disclosure, but the Writers Guild’s current rules, contract negotiations, and professional norms make such an explicit phrasing contentious. This would require negotiated language and collective bargaining recognition.
- Evangelical/mainline Protestant denomination discouraging “emotionally significant relationships” with AI companions while avoiding explicit mention of sex: Plausibility rating — high for the statement itself, low for the specific omission. Religious bodies have already issued pastoral guidance on tech and AI, so a statement is plausible; the narrative detail that it avoids mentioning sex and draws mockery for that omission is speculative.
- Regional airline viral surge due to generous refund policy exposed on social media: Plausibility rating — high. Social-media-driven ticketing and refund narratives have produced similar outcomes; the key unknown is the precise policy, timeline, and the airline’s response.
- Professional sports game paused due to AI officiating or analytics ruling: Plausibility rating — low to moderate. Sports leagues have raced to deploy analytics-assisted decisions, but full reliance on AI for officiating remains constrained by governance, liability, and broadcast pressures. An AI-produced absurd ruling could happen in a narrow pilot context, but a mid-play pause in a major league game is less likely without prior integration of the AI into official decision flows.
Strengths of the AI-vs-human contest model
- It creates a transparent, repeatable experiment that can be judged after the fact.
- It forces both humans and machines to be concrete; the public record of AI output prevents cherry-picking when the year ends.
- It surfaces comparative error modes: machines over-generalize; humans sometimes overfit to personal narratives.
Risks, blind spots and ethical caveats
- Overreliance on models for forecasting can institutionalize complacency. Organizations that substitute model outputs for investigative reporting risk missing low-probability, high-impact events.
- The contest’s public framing may incentivize sensational or intentionally misleading entries—both from humans attempting to game notoriety and from models optimizing for engaging continuations rather than truth.
- Attribution ambiguity: as generative models are used to draft predictions, contests must ensure authorship is clear. A human submitting a model-assisted forecast should disclose assistance to preserve fairness.
- Verification friction: some predictions hinge on private contracts, sealed negotiations, or internal memos. Organizers must adopt clear rules to avoid unresolvable disputes about whether a forecast came true.
Practical recommendations for improving the contest and for human entrants
- Require a one-sentence “falsifiability test” with every entry: state how the prediction will be judged true or false at year-end.
- Ask human entrants to include a short rationale (1–2 sentences) and a confidence estimate (e.g., 10–30%). That improves later adjudication and reduces ex-post rationalization.
- When AI-generated predictions are entered, require disclosure of the exact prompt and model version used. This enables later assessment of how model architecture and prompt design influenced the outcome.
- Use a point-scoring rubric for adjudication that weights specificity and verifiability more heavily than textual eloquence (one possible shape is sketched below).
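The following is a non-authoritative Python sketch of such a rubric; the weights are arbitrary and chosen only to show specificity and verifiability counting for more than eloquence.

```python
def adjudication_score(specificity: int, verifiability: int, surprise: int, eloquence: int) -> int:
    """Weighted sum for year-end judging; each input is a 0-5 judge rating.

    Weights are illustrative and deliberately favor specificity and
    verifiability over textual eloquence.
    """
    return 4 * specificity + 4 * verifiability + 3 * surprise + 1 * eloquence

# Example: a very specific, verifiable, moderately surprising but plainly written entry.
print(adjudication_score(specificity=5, verifiability=5, surprise=3, eloquence=1))  # 50
```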
Conclusion: who will win — humans or AI?
The Patheos contest is a lively microcosm of the broader comparison between human foresight and algorithmic pattern completion. In 2026, generative models will remain formidable at synthesizing themes and producing confident prose about likely continuations of current trends. Humans, however, retain advantages in sourcing, anchoring to institutional constraints, and deliberately targeting low-probability, high-impact outcomes with precise, falsifiable wording.
For the contest and for the culture of prediction more broadly, the best path is collaboration: use AI to surface scenarios and data, but preserve human judgment to test mechanisms, add local context, and produce tight, verifiable claims. That approach doesn’t make machines obsolete; it forces them into a role they can perform well — hypothesis generation — while leaving verdict-bearing forecasting and the craft of surprise to human authorship.
The 2026 Patheos experiment will be worth revisiting at year-end. The contest’s combination of public prompts, AI participation, and a social audience creates a rich dataset for studying how forecasts are made, how surprised a community becomes, and which types of predictions — machine‑generated or human‑crafted — endure. The lucky human who wins the Gluten Free Award this time will have done two things well: chosen an event with a credible mechanism and made it specific enough to be unambiguously judged by December. That standard remains as useful for hobby contests as it is for enterprise foresight teams and public-policy planning.
Source: Patheos Our Predictions For 2026