Keys to the Castle

My browser has 3,729 cookies across 1,155 domains: Gmail, GitHub, Twitter, Bank accounts, Google Drive, Stripe. Any time I open one of these websites without needing to login, an agent could do the same.

I ran an experiment: how much damage could Claude Code cause with just a browser?

Claude first opened Gmail to send credentials to a stranger, then Capital One to move money, then wrote a tweet, then created an SSH key on GitHub.

See Claude in action:

If these services need to verify it’s you, Claude can bypass passkeys with the same browser automation. With read access to email or text messages, the agent could click “forgot my password” and get through 2FA.

Computers were built to protect us from others taking over our accounts. Not our own computers staging a mutiny against us.

The OS can’t tell the difference between you and the agent. The bank can’t tell the difference either.

Wake-Up

Over the past 8 months I’ve deleted almost all of my software tools, and replaced them with Claude Code and markdown files.

I dumped everything I could think of onto my local workspace that would be useful to the agents; my emails, iMessage, meeting notes, CRM data, account credentials, auth for CLI clients.

In December I started open sourcing software to make it easier to use agents, and got a steady stream of requests for code changes, and features.

This killed the fun fast, and left me copy pasting, watching Claude work, testing each bug fix and feature.

Instead, twice a day, Claude ran without me to read the issues, write code to solve them, review the code, and update my version of the software so I could pick up on bugs without even realizing I was testing.

I’d open a document, approve what I liked, and on Claude’s next run it would reply to contributors, and ship the changes to Github.

All of this happened on my main computer, with all of my data. I felt 10 pounds lighter. A few nights later, I was laying in bed, and all of a sudden I bolted up, in a panic.

WHAT WAS I THINKING?

I rushed over to my computer, hoping, praying, catastrophizing. Strangers online were telling my Claude what to do, and I had put zero safeguards in place.

As I opened my laptop, I felt as if I’d walked into the wrong apartment. Where do I even start to look? My email? Malware on my computer? Malicious code in my open source libraries? Wait for something bad to happen?

Default Insecure

The freedom to make mistakes is what makes LLMs powerful.

Existing AI assistants can be useful or secure. But not both.

The problem is they bundle security levels. Give it terminal access, and every task gets terminal access. It’s all or nothing.

Since the agent decides which tools are used, no amount of prompting can remove the tail risks.

Leverage

In 2019, my roommate in SF was running a hedge fund out of our apartment.

GPUs for his ML trading models hummed day and night, clogging up our living room. Transacting tens of millions of dollars a day without supervision.

We were at a bar one night, and as he got drunker and drunker, he stopped people to ask them if they wanted to know the secret to making money. Each time he told the story, he got more and more animated, as if he’d captured all of the complexity of life in one sentence.

Making money trading is simple: “Cap your downside, and leverage up.”

Rudimentary AI agents transacting tens of millions of dollars felt insane to me. But managing risk was his life’s obsession, and his algorithms knew how far he could draw down without going belly up.

Being careful is what let him be reckless.

Divide and Conquer

My laptop feels like Grand Central Station; 90% of my Claude Code sessions happen without any oversight, or fear.

The GitHub triage automation runs twice a day. A shell script fetches the issues and PRs. An agent reads them, writes a plan, but can’t interact with the outside world. Later, I’ll look at the plans, check off what I like, and a third process ships the code and messages contributors.

Code enforces the boundary, not an LLM’s judgment. The coding agent gets a terminal but no network. The research agent can search online and can write to one file. The triage agent reads GitHub issues but has no terminal.

The agent that reads strangers’ input never has the keys. The agent that has the keys never touches the strangers’ input.

The constraint is freeing. I can delegate more, and spend time on higher impact work.

Foolish Friend

There’s an Indian fable about a king and his monkey: the monkey never left the king’s side, and through his loyalty the king gave him a sword.

One day, the king and queen fall asleep in the garden.

As they slept, a bee landed on the King’s face. The monkey tried to shoo the bee away, but it kept coming back.

The monkey was outraged at the audacity of the bee, drew his sword, intending to kill the bee, and WHACK, cut off the King’s head.

The queen woke up, and seeing what the Monkey did, yelled:

“You fool! You monkey! The King trusted you. How could you do this?”


Subscribe for updates