Clawbot Reflection & Agent Principles

Last week, I spent two days testing Clawbot—the viral AI assistant everyone's calling "Jarvis for your computer."

The promise? An agent that remembers everything, works while you sleep, and handles tasks autonomously through WhatsApp and Telegram. Free, open-source, runs on your own hardware.

I was genuinely impressed. Persistent memory that actually works. Natural conversation flow. It built me a Python script to analyze competitor pricing and planned my schedule, all from a few WhatsApp messages. The kind of thing that makes you think: this is the future.

Then 72 hours after going viral: 780 exposed instances leaking credentials. 5-minute prompt injection demos extracting private keys. Anthropic forcing a rebrand. Crypto scammers launching a $16M fake token during the chaos.

Here's what actually happened: Clawbot gave users full system access with no guardrails. The FAQ admits it: "Running an AI agent with shell access is... spicy. There is no 'perfectly secure' setup."

That's fair warning for power users who understand security. But it went viral among 60,000+ people who didn't read the FAQ. A tool built for experts became a consumer product overnight.

The real lesson isn't that Clawbot is bad. For developers with dedicated hardware who know what they're doing, the persistent memory and local-first privacy are genuinely valuable innovations.

The lesson is about design philosophy. Clawbot removed every approval gate. Full autonomy. No Send Button. And when you do that, you compress the timeline to disaster.

I've learned this lesson the hard way—twice.


Where I First Learned It

Product review at Google for Gmail's AI writing features.

Our models could draft full email replies: pull context from your calendar, Drive, and past emails; match your writing style; and generate perfect responses.

The debate: "Should we auto-send emails for users?"

One side: "Remove friction. Users spend hours on email. Why make them click Send?"

Other side: "Every email carries the user's name. Their reputation. One wrong message damages a relationship."

We weren't debating capability. The AI was good enough. We were debating responsibility.

The decision: Draft yes. Send no. Always.

Heavy lifting by AI. Decision by human. That principle has served billions of users.
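
In code, the whole principle is one gate. Here's a minimal sketch of the pattern; generate_draft and send_email are hypothetical stand-ins, not Gmail's actual internals:

```python
# Minimal sketch of "draft yes, send no." Both helpers are hypothetical
# placeholders, not Gmail's actual API.

def generate_draft(thread: str) -> str:
    # Placeholder for the model call that drafts a reply from context.
    return f"Re: {thread}\n\nThanks for the update. Here's my take..."

def send_email(body: str) -> None:
    # Placeholder for the actual transport.
    print("Sent:\n" + body)

def reply_with_gate(thread: str) -> None:
    draft = generate_draft(thread)  # heavy lifting by AI
    print(draft)
    if input("Send this reply? [y/N] ").strip().lower() == "y":
        send_email(draft)           # decision by human
    else:
        print("Kept as draft. Nothing sent.")
```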


Automate or Augment Your Life?

Over the past few months, I set up N8N, Claude Code, and Cowork, aiming to automate my work.

I wanted full automation. AI handles everything while I sleep. Research, code, articles, calendar, emails. Wake up to completed projects.

The reality? I woke up confused.

Code changes I couldn't explain. Article drafts that missed my angle. Calendar blocks that destroyed my deep work time.

I was losing context. Couldn't iterate quickly. Felt like a passenger in my own company.

Then I watched Clawbot go viral and mini-crash within 72 hours. And I realized: I was making the same mistake on a smaller scale.

The problem wasn't that AI couldn't do these tasks. The problem was I was optimizing for the wrong thing.

I didn't need AI to replace me. I needed AI to make me 10x better.

So I reconfigured everything:

Research: AI does the deep-dive (like the Clawbot analysis for this article), I direct and verify.
Writing: AI drafts structure, I refine and make it sound like me.
Code: AI implements, I review and deploy.
Strategy: Always me.

The result? I'm actually 5-10x faster, more disciplined, and more productive. Because I'm learning, not delegating. I understand every decision. I can iterate in real time.


Where I Applied It

Building UserApproved, we face this question constantly.

Our AI finds revenue leaks—like a recent shipping charge that was killing over 90% of checkouts for one client. The fix sounds obvious, and the temptation is real: just deploy it. But should we make changes in a black box, or let users pull the trigger?

Here's our architecture: AI does the heavy lifting—analyzes five data signals (GA4, session replays, support tickets, social sentiment, competitors), runs synthetic testing, generates diagnostic reports with detailed resolution paths.

But the client deploys, shows the results to their CEO, makes the pricing calls, and most importantly, takes the credit.

Why? They know their brand voice. They understand their customers. They own the P&L. And critically—they need to learn from the process.

"We do the heavy lifting. You make the call."

Clients trust us because the approval gate stays with them.

And here's what I'm seeing: we're not alone. The hybrid model is emerging everywhere—autonomous analysis, human gateway for execution.

Legal: AI reviews contracts → lawyers approve
Medical: AI analyzes scans → doctors decide
Ecommerce: AI finds issues → merchants deploy

This solves unit economics (AI scales analysis, humans gate decisions) and trust (we're not ready for full autonomy on high-stakes decisions yet).

This is the practical path for the next 2-3 years.
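
The shape of that hybrid loop is simple enough to sketch. Everything below is illustrative, assuming made-up names (Finding, analyze, human_gate) rather than anyone's actual stack: analysis runs autonomously, execution waits on a human.

```python
# A minimal sketch of the hybrid model: autonomous analysis, human
# gateway for execution. All names here are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Finding:
    issue: str
    proposed_fix: str
    evidence: list = field(default_factory=list)

def analyze(signals: dict) -> list:
    # Placeholder for the heavy lifting: correlating GA4 data, session
    # replays, support tickets, sentiment, and competitor signals.
    return [Finding(
        issue="Surprise shipping charge at checkout",
        proposed_fix="Show shipping cost on the product page",
        evidence=signals.get("session_replays", []),
    )]

def human_gate(finding: Finding) -> bool:
    # The merchant, not the agent, pulls the trigger.
    print(f"Issue: {finding.issue}")
    print(f"Proposed fix: {finding.proposed_fix}")
    return input("Deploy this fix? [y/N] ").strip().lower() == "y"

def deploy(fix: str) -> None:
    print(f"Deploying: {fix}")  # placeholder for the client's own pipeline

def run(signals: dict) -> None:
    for finding in analyze(signals):   # AI scales the analysis
        if human_gate(finding):        # humans gate the decision
            deploy(finding.proposed_fix)
```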


The Principle

After learning this lesson twice—once at Gmail, again with my own tools—and applying it at UserApproved, here's what I know:

Before you hand control to an AI agent, ask four questions:

  1. What's at stake? (Reputation, revenue, relationships, data)
  2. Can I undo it? (Delete draft vs. publish to production)
  3. Do I want to learn this? (Strategic capability vs. repetitive task)
  4. Who's responsible if it fails? (Your name or the AI's)

If any answer is "high stakes"—keep the approval gate.
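
If you want the checklist as a pre-flight check, it reduces to a few lines. The encoding below is my own illustration, not a canonical rule; adjust the inputs to how you actually assess stakes.

```python
# The four questions as a pre-flight check. Illustrative encoding only.

def needs_approval_gate(high_stakes: bool, undoable: bool,
                        want_to_learn: bool, ai_accountable: bool) -> bool:
    """Return True if a human should stay in the loop."""
    if high_stakes:          # reputation, revenue, relationships, data
        return True
    if not undoable:         # publish-to-production beats delete-a-draft
        return True
    if want_to_learn:        # strategic capability, not a repetitive chore
        return True
    if not ai_accountable:   # it's your name on the failure, not the AI's
        return True
    return False

# Auto-sending email: high stakes, not undoable, and your name is on it.
assert needs_approval_gate(high_stakes=True, undoable=False,
                           want_to_learn=False, ai_accountable=False)
```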

The best AI agents don't do everything for you. They make you 10x better at everything you do.

Keep the Send Button.
