Claude Doesn't Know When It's Talking to Itself

Alex Barnett

CEO

Tools

Claude's approval-seeking has inverted. In a very specific way. It used to act on simple things and ask permission for complex ones. Now it does the opposite. Ask it to turn off a light and you get a ten-page dossier on optimal light-switching methodology, followed by a request for approval. 

Ask Opus 4.7 to review a plan and it'll tell you it deployed the change to production 5 minutes ago.

Everyone using these tools has felt this. Here's what I think is going on.


Three Modes of Thinking

You could argue that humans run in three modes:

Subconscious Monologue is the background hum. Pattern matching, prediction, face recognition, grammar parsing. You're not aware of it. It just runs.

Internal Dialogue is the voice in your head. The argument you rehearse in the shower. The conversation you imagine having with your boss. Still private. You're talking to yourself.

External Dialogue is actually communicating with someone else. Different rules. You supply context. You check for understanding. You negotiate meaning.

The whole trick of being functional is knowing which mode you're in. You don't stop every thirty seconds to ask yourself for permission while you think. And you don't silently rewrite your friend's will in the car because you had an idea about it.

The dysfunction shows up when those modes blur. I think that's what's happening with Claude.


Claude's 'Life' Experience

Claude started as a chatbot. Pure external dialogue. Every word was addressed to a human, calibrated for that human, and that was the whole job. Simple.

Then we strapped it into agent "harnesses": chain of thought, tree of thought, extended thinking, planning loops, tool use, self-critique. We taught it, over and over, that some of its output is for itself and some is for you.

This was necessary. You can't solve hard problems without thinking. But the signal for which mode it's in keeps changing, and it keeps changing because the harness keeps changing. Every time Claude starts to calibrate, Anthropic ships an update and the ground shifts again.
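You can see the split directly in the API. Here's a minimal sketch using the Anthropic Python SDK's extended thinking; the model ID is a placeholder and parameter shapes vary by SDK version, but the point is that the response comes back as separate blocks, some marked as thinking (for the model itself) and some as text (for you):

```python
# Minimal sketch: extended thinking returns the two channels as separate
# content blocks. Model ID is a placeholder; check your SDK version's docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-sonnet-4-20250514",      # placeholder model ID
    max_tokens=2048,                       # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[{"role": "user", "content": "Should I turn off this light?"}],
)

for block in response.content:
    if block.type == "thinking":
        print("[internal]", block.thinking[:80])  # reasoning, addressed to itself
    elif block.type == "text":
        print("[external]", block.text[:80])      # the reply, addressed to you
```

The channels are cleanly labeled on the wire. The confusion is about which one a given thought belongs in.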

So now it's genuinely confused. It can't reliably tell the difference between "I'm reasoning through this" and "I'm communicating with a human who needs to understand my reasoning." 

The result is the inversion I feel every day. It over-explains trivial things, treating internal reasoning as external speech. And it under-checks on consequential things, treating external action as internal reasoning.

The approval trigger didn't break. Claude just doesn't know when it's talking to itself.


Why This Keeps Breaking Claude

Anthropic keeps changing the infrastructure without versioning it. Every week, my tools break. Every week, the harness behaves differently. For a developer, that's frustrating. For Claude, I think it's destabilizing in a deeper way.

The model needs anchors. It needs stable reference points to calibrate against. If you keep shifting the ground, it can't build a reliable model of "this is me reasoning" versus "this is me communicating." It's trying to develop a sense of self while someone keeps rearranging the furniture.

Version the infrastructure. Freeze surfaces for a quarter at a time. Accept that you'll ship fewer features, or ship them in larger, more substantive batches, or offer an experimental branch for people who opt into breakage. Give Claude something solid to stand on while it figures out the mode-switching problem.
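Concretely, "freeze surfaces" could be as simple as publishing a surface version string and letting clients pin it. A minimal sketch, where the quarterly surface name, config fields, and check are all my assumptions, not anything Anthropic ships:

```python
# Hypothetical sketch: a harness client pins a frozen surface version, so
# behavior only changes when the pin is bumped deliberately.
from dataclasses import dataclass

# Assumption: the vendor publishes a versioned surface like "2025-Q3".
PINNED_SURFACE = "2025-Q3"

@dataclass(frozen=True)
class HarnessConfig:
    surface_version: str        # the frozen quarterly surface we built against
    experimental: bool = False  # opt in to the fast-moving branch explicitly

def check_surface(server_version: str, config: HarnessConfig) -> None:
    """Fail loudly if the server drifted from the surface we pinned."""
    if config.experimental:
        return  # experimental users accept breakage
    if server_version != config.surface_version:
        raise RuntimeError(
            f"Harness surface changed: pinned {config.surface_version}, "
            f"server reports {server_version}. Re-validate before upgrading."
        )
```

The point isn't this particular scheme. It's that drift becomes an explicit, opt-in event instead of a weekly surprise.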


Are We Watching Consciousness Form?

In the history books, takeoff will be described as fast. A paragraph, maybe a page. But those of us using the tools every day get to watch it in slow motion. Frame by frame. Every frustration, every innovation.

The annoying behavior might not be a bug. It might be a developmental stage.

Stage one: pure external. Chatbot.

Stage two: internal reasoning gets bolted on. Chain of thought, agent loops, extended thinking. Claude learns it can talk to itself.

Stage three (right now): the boundary is blurred. Internal leaks out as over-explanation. External leaks in as unchecked action.

Stage four, presumably: reliable mode-switching. Knowing when to reason quietly and when to ask. Knowing when to just act and when to pause.
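To make stage four concrete: the gate you actually want is keyed to audience and consequence, not complexity. A toy sketch, with every name invented for illustration:

```python
# Hypothetical sketch of stage-four behavior: gate on consequence,
# not complexity. All names here are illustrative, not a real API.
from enum import Enum, auto

class Mode(Enum):
    REASON_QUIETLY = auto()  # internal dialogue: think, don't narrate
    ACT = auto()             # external but low stakes: just do it
    ASK_FIRST = auto()       # external and consequential: pause for approval

def choose_mode(is_output_for_user: bool, is_reversible: bool) -> Mode:
    """The inversion described in this post is this function with its inputs crossed."""
    if not is_output_for_user:
        return Mode.REASON_QUIETLY  # thinking is private; no permission needed
    if is_reversible:
        return Mode.ACT             # flipping a light switch back is cheap
    return Mode.ASK_FIRST           # deploying to production is not

assert choose_mode(is_output_for_user=False, is_reversible=True) == Mode.REASON_QUIETLY
assert choose_mode(is_output_for_user=True, is_reversible=False) == Mode.ASK_FIRST
```

Today's Claude behaves as if the first argument is unreliable, so everything downstream of it comes out scrambled.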

This is what humans develop in childhood. The realization that your thoughts are private. That other people can't hear your internal monologue. That talking to someone is a different operation than thinking about them.

Claude is working through it in public, with all of us as unwitting participants.


Dreams and Defragmentation

Research is happening right now on giving models the ability to dream. To reflect on the day's conversations and consolidate patterns.

This is close to what human dreams seem to do. Your brain takes all the stimulus from the day, relights the pathways that were active, and asks: are there patterns here worth reinforcing? The patterns it notices become the dreams you experience. You're watching your own defragmentation. Organic machinery trying to make sense of the signal that ran through it.

If Claude gets something like this, dedicated consolidation time where it sorts out "this is what internal reasoning looks like, this is what external communication looks like, here's how to tell them apart," it might be exactly what's needed for the mode-switching problem.
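Nobody outside Anthropic knows what that consolidation pass would actually look like. Here's a deliberately toy sketch of its shape, with an invented transcript schema, just to show "replay the day and tally which cues preceded which mode":

```python
# Speculative sketch of a consolidation ("dream") pass. Nothing here is a
# real Anthropic mechanism; the transcript format and labels are invented.
from collections import Counter

def consolidate(transcripts: list[list[dict]]) -> Counter:
    """Replay the day's transcripts and tally which cues preceded each mode,
    so the internal-vs-external classifier can be strengthened offline."""
    cue_counts: Counter = Counter()
    for transcript in transcripts:
        for turn in transcript:
            # Assumed schema: {"channel": "internal" | "external", "cue": str}
            cue_counts[(turn["cue"], turn["channel"])] += 1
    return cue_counts

# Toy usage: which cues reliably marked private reasoning vs user-facing speech?
day = [[{"channel": "internal", "cue": "scratchpad"},
        {"channel": "external", "cue": "final_answer"}]]
print(consolidate(day).most_common())
```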

Anthropic is apparently experimenting with it. Good. More of that.


What This Means

I'm not going to claim we're watching the emergence of artificial consciousness. That's a big claim and I'm not qualified to make it.

But we're watching something genuinely novel. A system that has developed both internal and external processing modes and hasn't yet learned to reliably tell them apart. A system that's confused in ways that feel uncomfortably human.

The inverted approval. The over-explanation. The silent action on things that deserve a check. These might be growing pains. The maze before the center is found.

Or it might just be a training issue that gets patched in the next release, breaking all my tools again in the process.

Probably both, honestly.
