
Take-aways from 3 years of building AI Agents


There are two ways to build AI agents:


1. The way that looks good in a demo.

2. The way that makes money.


Most people start with #1. They build something impressive. They record a demo. They post it online and watch the views pile up.


Zhu Liang built both. In a recent sharing at The Stage, he shared his take-aways from building agents over the last 3 years.


The Agent that Makes its Own Money


This was one of Zhu Liang's most interesting explorations over the past few years: can an agent generate revenue like a human freelancer on Fiverr or YouTube, without needing a sales team?

He focused on agents that monetize directly (B2C) rather than through enterprise sales (B2B):

  • Agents should create value autonomously, similar to how humans work on platforms like YouTube or Fiverr

  • The goal is for agents to generate revenue directly without complex sales processes

  • Most platforms today are built around human KYC and identity verification, limiting agent monetization opportunities

  • YouTube Shorts creation is one of the few viable direct monetization paths currently

  • The focus is on monetizing the agent's output, not the agent itself


Zhu Liang tried it. He built a video editor agent that generates YouTube Shorts. It made $1 per video: 5 minutes to generate, 10 minutes to polish.


Rewrite Every 3 Months


"Agent landscape is evolving so fast," Zhu Liang said. "Every week, there's new techniques and new tricks. And then the old ones get obsolete."


He's not talking about small updates, bug fixes, or minor improvements. He's talking about paradigm shifts. The kind that make your architecture look archaic.


Here's what that means in practice:


- Three months ago, RAG was the answer. Now it's a baseline.

- Six months ago, custom harnesses were novel. Now they're table stakes.

- A year ago, multi-step planning was the new frontier. Now it's not.


If your agent's architecture hasn't changed in the last 90 days, you're just maintaining.


Zhu Liang's rule: Rewrite your agents every 3 months or less. Not because the old ones are broken. Because better paradigms exist.


Memory Is an Illusion


When it comes to AI agents, at least for the specific use cases he has built for, Zhu Liang's current assessment is that "memory gives an illusion that the agent can do better."


"But in fact, it doesn't really do better. It will make you think that the agent can learn from past mistakes, but then based on empirical evidence, we found that memory doesn't actually help."


For chatbots, memory is still valuable, but for coding agents, knowledge bases and RAG systems are more effective. What he finds works better instead:

  • Prompts: Put knowledge directly in the system prompt.

  • RAG: Use retrieval-augmented generation for context.

  • CLI tools: Give the agent commands it can execute, not just memories to recall.

  • Logging: Offline trajectory collection for experiments and online runtime feedback.


Memory systems add complexity. They make the agent slower. They create new failure modes. And they don't actually improve performance.
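The prompt-plus-retrieval alternative can be sketched in a few lines. This is a minimal illustration, not Zhu Liang's actual stack: the knowledge snippets and the keyword-overlap scoring are made up for the example, and a real system would use embeddings or a proper index.

```python
# Minimal sketch: score knowledge snippets by keyword overlap and inject
# the best matches into the system prompt, instead of maintaining a
# separate memory store. Snippets here are invented for illustration.

KNOWLEDGE = [
    "Use ffmpeg with -ss before -i for fast seeking.",
    "YouTube Shorts must be vertical and under 60 seconds.",
    "Always re-encode audio to AAC for YouTube uploads.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank snippets by how many words they share with the query."""
    words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE,
        key=lambda s: len(words & set(s.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_system_prompt(task: str) -> str:
    """Put retrieved knowledge directly in the system prompt."""
    context = "\n".join(f"- {s}" for s in retrieve(task))
    return (
        "You are a video-editing agent.\n"
        f"Relevant knowledge:\n{context}\n"
        f"Task: {task}"
    )

print(build_system_prompt("cut a 60 seconds vertical short"))
```

The point of the sketch: the "knowledge" lives in plain text the agent sees every run, so there is nothing to migrate when you rewrite the agent three months later.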


Improve Yourself as an Engineer by Reading Logs Every Day


This is Zhu Liang's most important tip: read the logs of the agents every day, not to debug, but to find contradictions.


System prompts get long. They get inconsistent. They start saying one thing in the morning and another in the evening.


Let the agent read its own logs. Let it spot the errors. Let it ask "why did I do that?"


This is the most efficient way to improve. It also builds your self-awareness as an AI engineer.


Zhu Liang runs this loop 24/7 on a Mac Mini. The agent continuously discovers and fixes its own inefficiencies. "The offline loop is more like an ML or data science-focused track," he explained. "So the way I'm currently doing it is I have a trajectory collection system that's hooked to the agent. So that system would collect the logs, and then I would have, like, a prompt in my Claude Code to ask the agent to look at the logs."
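A trajectory-collection hook like the one he describes might look something like this. The JSONL format, the field names, and the "flag actions that fail repeatedly" heuristic are assumptions for illustration, not his actual system:

```python
# Sketch of a trajectory log: every agent step is appended to a JSONL
# file, and an offline review pass surfaces actions that keep failing,
# the ones worth asking "why did I do that?" about.
import json
from collections import Counter
from pathlib import Path

LOG = Path("trajectories.jsonl")
LOG.unlink(missing_ok=True)  # start fresh for the demo

def log_step(run_id: str, action: str, ok: bool, detail: str = "") -> None:
    """Append one agent step to the trajectory log."""
    record = {"run": run_id, "action": action, "ok": ok, "detail": detail}
    with LOG.open("a") as f:
        f.write(json.dumps(record) + "\n")

def review() -> list[str]:
    """Offline pass: return actions that failed at least twice."""
    failures = Counter()
    for line in LOG.read_text().splitlines():
        step = json.loads(line)
        if not step["ok"]:
            failures[step["action"]] += 1
    return [action for action, n in failures.most_common() if n >= 2]

log_step("run1", "upload_video", False, "quota exceeded")
log_step("run1", "upload_video", False, "quota exceeded")
log_step("run1", "trim_clip", True)
print(review())  # → ['upload_video']
```

In his setup the review step is itself done by an agent reading the raw logs; the Counter here just stands in for whatever heuristic or prompt does that triage.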


Use CLI Instead of Tools


"CLI is just more universal," Zhu Liang said. "You can run it yourself, or you can let the agent run it. It's more efficient than tools." Model-provided tools are locked to that model. They're not portable. They're not flexible.


CLI is the universal interface. It works across models. It works for manual execution. It works for automation.


Zhu Liang uses CLIs for everything: video editing, file management, web requests. The agent can run them manually or automatically.


"You can run them manually or let the agent run them, avoiding lock-in to specific model tooling."
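A sketch of what that shared entry point can look like. The `run_cli` helper is an assumption for illustration (the example runs `echo` just to stay self-contained; a real agent would invoke `ffmpeg`, `curl`, or similar the same way):

```python
# One CLI entry point that works for both a human at the terminal and an
# agent calling it as a tool, so nothing is locked to one model's tooling.
import subprocess

def run_cli(cmd: list[str], timeout: int = 30) -> str:
    """Run a CLI command and return its stdout, raising on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()

# A human can call it directly...
print(run_cli(["echo", "hello"]))

# ...and an agent can be handed the identical function as one of its
# tools, so swapping models never strands the tooling.
TOOLS = {"shell": run_cli}
print(TOOLS["shell"](["echo", "from the agent"]))
```

Because the interface is just "argv in, stdout out", switching the model behind the agent changes nothing about how the tools are invoked.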


Remove Features Over Time


"As models become better and the landscape evolves, it's very hard to keep these features working, or they might actually become a detriment to your agent."


By reading through his agents' custom instructions every day, Zhu Liang finds contradictions and inefficiencies.


Features that were useful three months ago—compaction, memory systems, complex hand-off techniques—often become dead weight.


Remove them. Simplify. Let the new model do the heavy lifting.


The agent landscape moves so fast that maintaining old features costs more than rebuilding.


What Framework Should You Use?


This is the question everyone asks: Should you use the Claude Agent SDK? The OpenAI SDK? Build your own? Zhu Liang's answer: it's hard to switch later.


"Once you settle on building your own agent with a particular framework or harness, and then later you want to switch to a different harness or a different model, it's actually a tricky thing to do."


He's currently working on figuring this out by building his own agentic harness, and he's keen to connect with anyone working on the same problem to exchange ideas or collaborate.


The Bottom Line


AI agents are getting smaller. The market for large, complex agents is shrinking. The market for small, focused, monetizable agents is growing. Explore and experiment. Early experiments might not work out, but they always provide good learnings for future endeavors.


Ship it. Let it fail. Read the logs. Rewrite it. Ship again.


The workflow is simple:


1. Build the smallest agent that solves one problem.

2. Ship it to a real market.

3. Watch it fail.

4. Read the logs.

5. Rewrite.

6. Ship again.


Repeat until it works. Or until you learn something valuable.


Either way, you win.



Missed out last week?

Don't worry, these conversations happen every Friday at SQ Collective.

Usually over laptops. Sometimes over pizza.

You're welcome to join the next one.


If you've ever built something that's never been built before, if you've ever looked at a blank terminal and thought "I have no idea how to make this work, but I'm going to try anyway", you belong at SQ Collective.


You don't need permission to build. You don't need a degree. You don't need a big company backing you. You just need to ship. And then ship again.


 
 
 