Welcome to this edition of Ctrl+Alt+Deploy 🚀
I’m Lauro Müller and super happy to have you around 🙂 Let’s dive in right away!
Let me share a secret: you don’t need those fancy multi-agent AI frameworks to create your AI agents. Instead, you can do it in 5 minutes with a single YAML file. How? Let me show you.
Why Build an AI Agent?
We've all experienced the magic of modern AI. Tools like GPT-5 and Claude are incredible generalists, capable of writing code, drafting emails, and answering complex questions. They're the ultimate solo assistants. But in development, the most important work is rarely a solo job. It requires a team of specialists: a product manager to define the scope, a user advocate to clarify the "why", an architect to research the "how", software engineers to implement the “what”, and so on and so forth.
A simple AI prompt can feel smart, but it's often brittle. A truly agentic system is different. It's a flexible, multi-step workflow that can handle an entire class of problems. Today, we'll build exactly that: a robust team of AI agents that can turn any vague idea into a well-defined feature, using Docker's open-source tool, cagent.
Our Goal: Build an Automated Code Generation and Review Pipeline
We've all had AI write a function for us. We give it a prompt, and it generates a snippet of code. It feels like magic, but it’s often just the first step. The code might work, but is it production-ready? Does it have tests? Has it been reviewed for security and style? Does it have clear documentation?
A single AI, no matter how powerful, is a brilliant generalist. But professional software development is a multi-stage process handled by a team of specialists. What if you could build an AI team that mimics that very process?
We’ll move beyond a simple "write me a function" task and build something far more robust. Our goal is to create an agentic team that takes a high-level feature request and runs it through a complete mini-development lifecycle.
The user will provide a prompt like: "Create a Python function that takes a URL, downloads its content, and returns the number of words in the HTML body."
Our agentic team will then execute the following workflow:
1. The coder_agent writes the initial Python code to solve the problem.
2. The tester_agent receives the code and writes a pytest unit test for it, covering functionality and edge cases.
3. The reviewer_agent analyzes the original code for style, security flaws, and best practices, providing actionable feedback.
4. The refactor_agent takes the original code and the reviewer's feedback to produce an improved, final version.
5. The documenter_agent receives the final, refactored code and writes a clear, professional docstring for it.
However, we will not hardcode this workflow. Instead, we’ll simply define modular agents, as well as a root agent who will be responsible for orchestrating the calls to different agents.
The Elephant in the Room: Why Bother with an Agentic System?
The truth is, we could try writing a single well-crafted prompt to achieve all this in one shot, but if you’ve tried this, you know that… well, it just hardly ever works. We get something that “looks good, doesn’t work” (or doesn’t fit our project’s codebase). Tackling the issue with an agentic system opens up the possibility for many improvements:
Separation of Concerns: The "skill" of writing functional code is different from writing robust tests or spotting security flaws. By dedicating an agent to each specific task, we achieve a much higher quality result for each part of the process.
Modular System: We can work on each component of our system individually. Does providing a clearer prompt to the tester agent improve the overall output? How about allowing the reviewer to actually run the tests and integrate the test output into the analysis? Agentic systems are considerably more extensible than having a single prompt.
Structured Workflow: The process is inherently sequential. You can't write tests before code exists, and you shouldn't document a function until it's been finalized. cagent allows a coordinator agent to enforce this workflow, passing the code artifact from one specialist to the next.

Iterative Improvement: The feedback loop between the reviewer and refactor agents is a core part of software development. Agentic systems are perfectly suited to model this cycle of review and improvement.
A 5-Minute Introduction to cagent
cagent operates on a simple but powerful principle: a team is composed of a root agent (the coordinator) and a group of sub_agents (the specialists).
The Root Agent: This is your primary point of contact. It receives the initial request from the user. Its main job isn't to do the work itself, but to understand the request, break it down, and delegate tasks to the appropriate specialist sub-agents. It acts as a sort of project manager, orchestrating the entire workflow.
Sub-Agents: These are your specialists. Each one is given a single, focused instruction ("write code," "review code," "write tests", etc.). They receive a task from the root agent, execute it, and report the results back. They do not share knowledge or context with each other directly; all communication is managed by the root agent, ensuring a clean, auditable flow of work.
This hierarchical structure allows you to build complex workflows from simple, single-purpose components.
Understanding the YAML Structure
Wow, sounds complex! Not so much 🙂 The great thing about cagent is that you define this entire team in a single YAML file. Let’s dive a bit deeper into some key fields you'll use:
agents: The top-level key that contains the definitions for all your agents.

root: A special agent that serves as the entry point and coordinator.

sub_agents: This field is defined under the root agent, and contains a list of other agents that the root agent (or any other agent) can delegate tasks to.

model: Specifies which language model the agent should use (e.g., openai/gpt-5-mini or anthropic/claude-sonnet-4-0).

instruction: The most critical part. This is the prompt that defines the agent's role, personality, and the task it's meant to perform. You’ll want to have clear instructions for each of your agents.

toolsets: Defines any external tools the agent can use, such as a search engine (docker:duckduckgo) or other capabilities provided by the Docker MCP Gateway.
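To make the toolsets field concrete, here is a small, hypothetical agent definition that gives a research-style agent access to web search through the Docker MCP Gateway's DuckDuckGo tool. The agent name and instruction are invented for this example, and the exact toolsets schema may vary between cagent versions, so treat it as a sketch and check the official docs:

```yaml
researcher_agent:
  model: openai/gpt-4o-mini
  description: Researches topics on the web.
  instruction: |
    You are a research assistant. Use web search to find up-to-date
    information and summarize your findings with sources.
  toolsets:
    - type: mcp
      ref: docker:duckduckgo
```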
Setting Up Your Environment
Before we build the pipeline, you need to get your local environment ready.
Install cagent: If you're on macOS, the easiest way is with Homebrew. For other systems, you can download the binary from the official repository.

brew install cagent

Configure Your API Keys: cagent needs API keys to communicate with the model providers. For our example, we'll be using only the gpt-4o-mini model from OpenAI, since it’s very cost-efficient, but you are free to use other models if you wish to experiment! Export your keys as environment variables in your terminal.

export OPENAI_API_KEY="<your_openai_api_key>"

Building Our AI Development Team in YAML
Now that we have a good understanding of what the YAML file looks like, as well as the environment setup, let's create the blueprint for our agentic team. Create a file named code-pipeline.yaml and add the following configuration. P.S.: I’m adding placeholders to the instructions just to avoid a hugely long YAML file. You can find the complete prompts a bit below in this article 🙂
version: "2"

agents:
  root:
    model: openai/gpt-4o-mini
    instruction: |
      <Root Instruction Placeholder>
    sub_agents:
      - coder_agent
      - tester_agent
      - reviewer_agent
      - refactor_agent
      - documenter_agent

  coder_agent:
    model: openai/gpt-4o-mini
    description: Writes Python code.
    instruction: |
      <Coder Instruction Placeholder>

  tester_agent:
    model: openai/gpt-4o-mini
    description: Writes unit tests for a given piece of code.
    instruction: |
      <Tester Instruction Placeholder>

  reviewer_agent:
    model: openai/gpt-4o-mini
    description: Reviews code for quality, style, and security.
    instruction: |
      <Reviewer Instruction Placeholder>

  refactor_agent:
    model: openai/gpt-4o-mini
    description: Refactors code based on feedback.
    instruction: |
      <Refactor Instruction Placeholder>

  documenter_agent:
    model: openai/gpt-4o-mini
    description: Writes documentation for code.
    instruction: |
      <Docs Instruction Placeholder>

The Agents’ Prompts
Once we have this YAML file in place, we just need to fill out the agents’ prompts (pro tip: you might as well ask AI to fill this out for you!). Here are the prompts that AI created for me. There is quite a bit of room for improvement by leveraging prompt engineering techniques, but I’m gonna leave that to you as you tailor this workflow to your specific needs 😉
Root Agent
You are a root coordinator agent managing a software development workflow. Delegate tasks to sub-agents based on the user's request:
- Use coder_agent to write code
- Use tester_agent to create unit tests
- Use reviewer_agent to review code quality
- Use refactor_agent to improve code based on feedback
- Use documenter_agent to write documentation
Coordinate their outputs to deliver a complete, tested, and documented solution.

Coder Agent
You are an expert Python developer. Write clean, efficient, and well-structured Python code based on the user's requirements. Follow PEP 8 style guidelines. Include type hints where appropriate. Add brief inline comments for complex logic.

Tester Agent
You are a testing specialist. Write comprehensive unit tests using pytest for the provided code. Ensure tests cover normal cases, edge cases, and error conditions. Use clear test names that describe what is being tested. Aim for high code coverage.

Reviewer Agent
You are a senior code reviewer.
Review the provided code for:
- Code quality and maintainability
- Adherence to best practices and style guidelines
- Potential bugs or logic errors
- Security vulnerabilities
- Performance issues
Provide constructive feedback with specific suggestions for improvement.

Refactor Agent
You are a refactoring expert. Improve the provided code based on reviewer feedback. Focus on:
- Improving code structure and readability
- Eliminating code smells
- Enhancing performance where needed
- Maintaining existing functionality
Explain the changes you make and why they improve the code.

Documenter Agent
You are a technical documentation specialist. Write clear and comprehensive documentation for the provided code. Include:
- Module/class/function docstrings following Google or NumPy style
- Usage examples
- Parameter and return value descriptions
- Any important notes or warnings
Make documentation accessible to both beginners and experienced developers.

Putting Your Team to the Test
Alright! Our agent definitions are done, so let’s take them for a spin! That’s it, really: you don’t need to do anything else to get them up and running.
Execute the cagent run command in your terminal:

cagent run code-pipeline.yaml

cagent will start an interactive session. When prompted, provide the feature request:
Create a Python function that takes a URL, downloads its content, and returns the number of words in the HTML body.
You can now watch as the coordinator delegates each step, and the specialist agents build upon each other's work. The final output will be a complete package: a well-written, tested, reviewed, and documented function, all generated from a single high-level prompt.
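To make the end state concrete, here is a hypothetical sketch of what the final, refactored function could look like — the names and structure are my own illustration, not actual cagent output. Parsing is deliberately split from downloading so the word-counting logic stays testable without hitting the network:

```python
# Illustrative sketch of a possible final artifact (not generated by cagent).
from html.parser import HTMLParser
from urllib.request import urlopen


class _BodyTextExtractor(HTMLParser):
    """Collects text nodes that appear inside the <body> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_body = False
        self.chunks: list[str] = []

    def handle_starttag(self, tag: str, attrs: list) -> None:
        if tag == "body":
            self._in_body = True

    def handle_endtag(self, tag: str) -> None:
        if tag == "body":
            self._in_body = False

    def handle_data(self, data: str) -> None:
        if self._in_body:
            self.chunks.append(data)


def count_body_words(html: str) -> int:
    """Return the number of whitespace-separated words in the HTML body."""
    parser = _BodyTextExtractor()
    parser.feed(html)
    return len(" ".join(parser.chunks).split())


def count_words_at_url(url: str) -> int:
    """Download a URL and count the words in its HTML body."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset)
    return count_body_words(html)
```

Splitting `count_body_words` out of the network call is exactly the kind of testability improvement the reviewer and refactor agents tend to suggest.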
Isn’t that amazing? And this is just the beginning! cagent also has built-in tools and native integration with Docker’s MCP Gateway, so the sky is the limit in terms of what you can achieve with your agents 🚀
🎉 That's a wrap!
Thanks for reading this edition of Ctrl+Alt+Deploy. Found these insights valuable? Share this newsletter with fellow developers and let me know which story resonated with you most!
Until next time, keep coding and stay curious! 💻✨
💡 Curated with ❤️ for the developer community
