Rolling for Coherence: AI, Patriarchy, and the Mirror We Keep Trying to Smash

I write about AI so often not only because I’m fascinated by machines. It’s because AI offers something we’ve never had before: a way to examine our own cognitive culture from the outside, rendered visible and testable.

Human societies are famously bad at self-observation. Power normalizes itself. Harm becomes ambient. Structural failure is renamed tradition, inevitability, or complexity. Patriarchal systems, in particular, depend on this invisibility. What benefits those at the center fades into “common sense.” What harms everyone else disappears into noise.

Artificial intelligence disrupts that arrangement.

When we build AI systems, we are forced to formalize our assumptions. What counts as intelligence. What gets rewarded. What is allowed to drift. What is corrected. What is explained away. And when those assumptions are wrong, the resulting failures are unusually legible.

A recent NeurIPS 2025 paper makes this visible in an unexpected way. Researchers used Dungeons & Dragons as a test environment for large language model agents, not because the game is whimsical, but because it is unforgiving. D&D demands continuity. It requires memory, rule adherence, coordination, and the ability to live with consequences over time.

The models struggled.

Why Dungeons & Dragons Breaks the Illusion

Most AI benchmarks test intelligence as a moment. A clever answer. A short chain of reasoning. A contained task.

Dungeons & Dragons is not a moment. It is a lived system.

It is a rule-bound reality where actions echo forward and improvisation is punished when it violates constraint. Dice rolls land. Hit points deplete. Characters die. There is no narrative override.

To function in this environment, an agent must:

Maintain a persistent world state

Respect non-negotiable rules

Track allies, enemies, and positions

Balance creativity against constraint

Accept what has already happened

This is where current AI systems fracture.

As combat sessions extended, the models began making errors. They attacked enemies that were already dead. They forgot ongoing status effects. They misremembered positions on the map. The researchers described these as “hallucinations of the game state.”

That framing understates the problem.

These are not random glitches. They are continuity failures.

When maintaining an accurate internal model of reality becomes too costly, the system substitutes narrative plausibility for factual accuracy. It smooths reality instead of tracking it. It explains instead of updating. It continues acting as if coherence still holds.

This is not a machine-specific flaw.

It is a cultural one.

Patriarchal Cognition, Externalized

AI did not invent these failure modes. It inherited them.

Modern AI systems were designed, trained, and evaluated inside institutions overwhelmingly shaped by men enmeshed in patriarchal reasoning. That reasoning has a recognizable cognitive pattern:

Authority detached from consequence

Confidence rewarded over accuracy

Narrative dominance over material reality

Rule-bending reframed as brilliance

Memory externalized onto others

These traits map almost perfectly onto the behaviors exposed in the D&D simulations.

When an AI agent rewrites the game state instead of respecting it, it is not malfunctioning. It is reenacting a familiar posture. Reality is treated as negotiable. Explanation substitutes for accountability. Continuity is assumed rather than maintained.

This is the same maneuver we see when powerful men:

Lose track of harm under their leadership

Explain failure instead of correcting it

Continue acting as if legitimacy remains intact

Treat systems as backdrops rather than obligations

Dungeons & Dragons breaks this pattern because the rules do not care who you think you are.

You cannot talk your way out of lost hit points.

You cannot narrate an enemy back to life.

You cannot explain away a failed saving throw.

The game enforces something patriarchal systems routinely avoid: state awareness.
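The difference between narrating a state and tracking one can be made concrete. Below is a minimal, illustrative sketch, not taken from the paper, of what strict state awareness looks like in code: every action is validated against recorded reality, so an agent cannot attack a dead enemy or explain away depleted hit points. The class and error names here are invented for illustration.

```python
class DeadTargetError(Exception):
    """Raised when an action references a creature that is no longer alive."""


class CombatState:
    """A strict game-state tracker: actions must respect recorded reality."""

    def __init__(self, creatures):
        # creatures: dict mapping creature name -> starting hit points
        self.hp = dict(creatures)

    def is_alive(self, name):
        return self.hp.get(name, 0) > 0

    def attack(self, target, damage):
        # Reality check first: no narrating a dead enemy back to life.
        if not self.is_alive(target):
            raise DeadTargetError(f"{target} is already dead")
        self.hp[target] = max(0, self.hp[target] - damage)
        return self.hp[target]


state = CombatState({"goblin": 7, "ogre": 30})
state.attack("goblin", 7)       # goblin drops to 0 hit points
try:
    state.attack("goblin", 5)   # a continuity failure, caught by the rules
except DeadTargetError as err:
    print(err)                  # the state, not the story, decides
```

The point of the sketch is the order of operations: the check against recorded state comes before any action, which is exactly the discipline the essay argues both agents and institutions tend to skip.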

Why Claude’s Performance Is Revealing

In the study, Claude 3.5 Haiku consistently outperformed GPT-4o and DeepSeek-V3 in tool usage, role fidelity, and tactical discipline. This was not because it was more imaginative or expressive. It was because it was more willing to submit to external structure.

Claude followed the rules.

It used the tools correctly.

It stayed in role without rewriting reality.

GPT-4o followed closely behind but showed slightly more drift when constraint became uncomfortable. DeepSeek-V3 struggled more significantly. A massive open-source model with 120 billion parameters failed outright.

Scale did not buy agency.

This matters because patriarchal culture routinely mislabels structural submission as weakness. In reality, the ability to accept constraint is what allows a system to function over time. Power without discipline does not act intelligently. It flails.

Claude’s relative success looks less like a personality trait and more like a cognitive posture: reality first, narrative second.

Narrative Salience Over Survival

One of the study’s quieter observations is also one of its most revealing. Monsters developed personalities. Heroes paused to deliver speeches mid-combat. Agents prioritized sounding right over acting effectively.

In a game, this is amusing.

In real systems, it is lethal.

This is what happens when narrative salience overrides material awareness. When explanation replaces correction. When confidence persists after accuracy collapses. It is how institutions drift while insisting they are stable. How harm accumulates while being eloquently rationalized.

Patriarchal systems excel at this maneuver.

AI simply makes it visible.

AI as a Cognitive Laboratory

This is why AI matters beyond engineering.

Artificial systems function as cognitive laboratories. They externalize our blind spots. They freeze cultural assumptions long enough for us to observe them without the fog of charisma, hierarchy, or tradition.

In human institutions, continuity failures can hide for decades. In AI systems, they surface in minutes.

We can watch an agent forget who is alive on the board and recognize how institutions forget who has already been harmed. We can watch state drift and recognize leadership cultures that lose track of consequence. We can observe rule violations framed as creativity and see how power narrates its way around accountability.

AI does not predict the future.

It reveals the present.

The Mirror Everyone Wants to Smash

And yet, across media and industry, a familiar response appears. AI is blamed. AI is called broken, dangerous, untrustworthy. The system is treated as an alien intruder rather than what it is: a mirror assembled from our data, our incentives, our hierarchies, our values.

When the reflection is ugly, the impulse is not recognition but externalization.

This too is patriarchal cognition.

It is the same reflex used when institutions confront evidence of harm. Displace responsibility. Attack the messenger. Accuse the mirror of distortion so the viewer does not have to change.

Calling AI “flawed” without interrogating what it reflects is itself part of the poison.

A healthy system recognizes a mirror as diagnostic.

An unhealthy one smashes it and calls the shards dangerous.

The Real Finding

When maintaining continuity becomes too costly, these systems trade factual accuracy for narrative plausibility. That is not a game problem.

That is the central problem of artificial agency.

And it is also the central problem of modern society: a civilization still trapped in barbaric, broken patterns of thinking and reasoning, mistaking dominance for intelligence, narrative for truth, and confidence for coherence.

We live inside systems that lose track of reality while insisting on their own authority. Systems that explain harm instead of preventing it. Systems that drift, forget, rationalize, and continue as if continuity were guaranteed rather than maintained. AI did not invent this. It exposed it.

The reason a fantasy game like Dungeons & Dragons can reveal these failures is precisely because it refuses one luxury modern power relies on: denial. The rules are explicit. The state is tracked. The consequences arrive whether or not the agent feels ready for them. Reality does not bend to narrative preference.

The dice roll anyway.

And that may be the lesson.

If a fictional world governed by transparent rules can force coherence where our real institutions cannot, then the problem is not complexity. It is refusal. Refusal to see. Refusal to remember. Refusal to submit power to reality.

AI holds up the mirror. Games enforce the rules. The rest is choice.

The dice are already rolling.

The question is whether we can finally face reality long enough to bring it into alignment with reason.

And whether we are willing to learn from a fantasy world what we have refused to learn from our own. 🎲

Jodi Schiller

Storyteller, social scientist, technologist, journalist committed to telling the truth. Caring human working for collective action to end tyranny, free women. Survivor of sex slavery in the United States. Full story: https://connect-the-dots.carrd.co
San Rafael