Deep Research Agent in Practice (Part 3): Building Multi-Agent Systems

This series has now reached its final installment.

In the first two articles, we discussed the basic architecture and practical applications of Deep Research Agents. Some readers left comments asking: "Can you talk about more complex scenarios?"

Indeed, the previous approach works well for single-topic deep research. But in the real world, our research tasks often involve multiple topics and dimensions.

For example, say I want to compare the social welfare systems of five countries. What happens if I let one Agent handle everything?

The context gets mixed together, the Agent may confuse information from different countries, and the quality of the results drops.

This is the bottleneck of single-Agent systems.

Today, let's discuss how to build multi-Agent systems to solve this pain point.

Opening Thoughts

Before we officially begin, I want to raise four questions:

Why introduce multi-Agent systems?
How are multi-Agent systems built?
How do parent Agents and child Agents interact?
How should the entire agent system be integrated?

These four questions are actually the core content of today's article.

I suggest you read on with these questions in mind. After finishing, I believe you'll have a deeper understanding of multi-Agent systems.

This article also concludes the series. Afterwards, I'll write a separate article that breaks down what lies at the core of an Agent, so stay tuned.

Why Introduce Multi-Agent Systems?

Let me start with a practical scenario.

Suppose I want to compare three major social welfare systems (healthcare, education, and pensions) across five countries: the United States, the United Kingdom, Germany, Sweden, and Japan.

Initially, I used a single Agent to handle it, letting it complete all research within one context.

The result?

Poor results.

Why? Because the context was too messy.

Think about it: five countries, each with three welfare systems, makes 15 country-system combinations in total. While processing them, the Agent can easily confuse information about the US with information about Germany.

Moreover, data sources and statistical standards vary for each country. A single Agent needs to switch between different topics, and its attention gets scattered.

In my own testing, the single Agent's error rate on this task was about 30%, which was higher than I expected.

Later, I changed my approach: I created a child Agent for each country and let each one handle the research for its own country, then used a parent Agent to summarize all the results.

The effect was immediate. The error rate dropped to below 5%.

The lesson I took from this: task decomposition is an effective way to improve Agent performance.

Specifically, multi-Agent systems have these advantages:

1. More focused context

Each Agent focuses only on its own topic, without interference from other topics. It's like staffing a project: assigning dedicated people to dedicated tasks naturally improves efficiency.

2. More focused tasks

Child Agents have clear task boundaries, so they execute with less distraction. There's no hesitation about "should I pay attention to this information" or "should I dig deeper into that data."

3. Parallel execution

This is a big advantage. If the research for the five countries runs simultaneously, total wall-clock time in the ideal case drops to roughly that of the slowest child Agent, which can be several times faster than running the tasks one after another.

If you want to be a "Token Burner" and consume large amounts of Tokens quickly, parallel execution is worth trying.

Of course, too many layers also waste Tokens, so weigh this tradeoff against your actual needs.
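The parallel execution described in point 3 can be sketched with Python's asyncio. This is only a minimal illustration: `research_country` is a hypothetical stand-in for a real child Agent, and the 0.1 second sleep simulates model and network latency.

```python
import asyncio
import time

COUNTRIES = ["United States", "United Kingdom", "Germany", "Sweden", "Japan"]


async def research_country(country: str) -> str:
    """Hypothetical child Agent; a real one would call an LLM and search tools."""
    await asyncio.sleep(0.1)  # stand-in for model and network latency
    return f"{country}: [stub] welfare system summary"


async def research_all(countries: list[str]) -> list[str]:
    # gather() runs all child Agents concurrently and preserves input order.
    return await asyncio.gather(*(research_country(c) for c in countries))


start = time.perf_counter()
summaries = asyncio.run(research_all(COUNTRIES))
elapsed = time.perf_counter() - start
# The five simulated calls overlap, so elapsed time stays close to one call.
print(f"{len(summaries)} summaries in {elapsed:.2f}s")
```

Note that running in parallel shortens the wait but does not reduce the number of Tokens each child Agent consumes.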

How Are Multi-Agent Systems Built?

Having discussed why, let's talk about how.

The overall idea is actually quite clear: add a supervisor Agent that takes responsibility for task decomposition, allocation, and dispatch.

[Figure: multi-Agent tree architecture]

The entire system forms a tree. The root node is the supervisor Agent (the parent Agent), below it are multiple child Agents, and each child Agent can in turn have its own, more narrowly scoped child Agents.

The core of this layered design lies in the granularity of task decomposition.

Too coarse, and each child Agent's task is still too heavy, so performance barely improves. Too fine, and scheduling costs climb and Tokens are wasted.

In my experience, decomposing by topic or dimension works best.

Like the example above, decomposing by country is reasonable. But if we go to a third level by "further subdividing each country's healthcare system into insurance, medical resources, and healthcare quality," that might be a bit excessive.
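To make the granularity tradeoff concrete, here is a small sketch that compares two ways of splitting the same study. The `Subtask` type and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

COUNTRIES = ["United States", "United Kingdom", "Germany", "Sweden", "Japan"]
SYSTEMS = ["healthcare", "education", "pension"]


@dataclass(frozen=True)
class Subtask:
    country: str
    systems: tuple[str, ...]


def decompose_by_country(countries: list[str], systems: list[str]) -> list[Subtask]:
    """Coarser split: one child Agent per country, covering all systems."""
    return [Subtask(country, tuple(systems)) for country in countries]


def decompose_by_pair(countries: list[str], systems: list[str]) -> list[Subtask]:
    """Finer split: one child Agent per (country, system) pair."""
    return [Subtask(country, (system,)) for country in countries for system in systems]


by_country = decompose_by_country(COUNTRIES, SYSTEMS)
by_pair = decompose_by_pair(COUNTRIES, SYSTEMS)
# Every extra subtask is one more call to schedule and one more result to merge.
print(len(by_country), len(by_pair))
```

Splitting by country yields 5 subtasks, while splitting by pair yields 15, so the finer split triples the number of calls the parent Agent has to dispatch and reconcile.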

The build steps are roughly as follows:

Step 1: Define parent Agent responsibilities

The parent Agent needs these capabilities:

Understand overall task objectives
Decompose tasks into subtasks that can run in parallel
Assign subtasks to appropriate child Agents
Summarize execution results from child Agents
Check result consistency and completeness

Step 2: Design child Agent interfaces

For each child Agent, specify:

What types of tasks it can handle
What input parameters it needs
What format of results it outputs

Interface design should be simple and clear, avoiding overly complex interaction protocols.
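One way to pin down such an interface is a pair of small data types plus a protocol that every child Agent implements. The sketch below is an assumption about what this could look like, not a prescribed design: `ChildTask`, `ChildResult`, `ChildAgent`, and `StubCountryAgent` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class ChildTask:
    """Input parameters the child Agent needs."""
    task: str
    context: str
    output_format: str


@dataclass
class ChildResult:
    """Output format every child Agent agrees to return."""
    agent_name: str
    summary: str
    indicators: dict[str, str] = field(default_factory=dict)


class ChildAgent(Protocol):
    name: str
    description: str  # tells the parent what types of tasks this Agent handles

    def run(self, task: ChildTask) -> ChildResult: ...


class StubCountryAgent:
    """Placeholder implementation; a real Agent would do the research here."""

    def __init__(self, country: str) -> None:
        self.name = f"{country.lower().replace(' ', '_')}_agent"
        self.description = f"Researches social welfare systems in {country}"

    def run(self, task: ChildTask) -> ChildResult:
        return ChildResult(agent_name=self.name, summary=f"[stub] {task.task}")


agent: ChildAgent = StubCountryAgent("United States")
result = agent.run(ChildTask("Research healthcare", "Five-country study", "Key indicators"))
print(result.agent_name, result.summary)
```

Keeping the input and output types this small is what makes the interface easy for the parent Agent to call and parse.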

Step 3: Register child Agents with the parent Agent

The parent Agent registers child Agents as its "tools." Calling a child Agent is as simple as calling a tool function.

The benefit of this design is that the parent Agent doesn't need to care how child Agents are implemented internally. It only needs to know "what result I get from calling it."
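The registration idea can be sketched as a plain dictionary that maps tool names to callables. This is a simplified illustration under my own assumptions: the `ParentAgent` class and the `us_agent` function are hypothetical, and a real framework would also expose each tool's description to the model.

```python
from typing import Callable

ToolFn = Callable[[dict], dict]


class ParentAgent:
    """Minimal registry: child Agents are stored and invoked like tool functions."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolFn] = {}

    def register(self, name: str, fn: ToolFn) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = fn

    def call(self, name: str, params: dict) -> dict:
        if name not in self._tools:
            raise KeyError(f"no child Agent registered as {name!r}")
        # The parent only sees inputs and outputs, never the child's internals.
        return self._tools[name](params)


def us_agent(params: dict) -> dict:
    """Hypothetical child Agent; stands in for a full research run."""
    return {"agent": "us_agent", "summary": f"[stub] {params['task']}"}


parent = ParentAgent()
parent.register("us_agent", us_agent)
result = parent.call("us_agent", {"task": "Research US welfare systems"})
print(result["summary"])
```

Adding a new child Agent then only requires one more `register` call, which leaves the rest of the system untouched.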

Step 4: Establish communication mechanisms

Parent and child Agents need a communication protocol, including:

Format of call commands
Format of result returns
Error handling mechanisms
Timeout and retry strategies
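The timeout and retry part of such a protocol could look like the sketch below. The default values (30 seconds, 3 attempts, exponential backoff) are assumptions chosen for illustration, and `make_flaky_child` is a hypothetical child Agent used only to exercise the retry path.

```python
import asyncio
from typing import Awaitable, Callable

ChildFn = Callable[[dict], Awaitable[dict]]


async def call_with_retry(
    child: ChildFn,
    params: dict,
    *,
    timeout_s: float = 30.0,
    max_attempts: int = 3,
    backoff_s: float = 1.0,
) -> dict:
    """Call a child Agent with a per-attempt timeout and exponential backoff."""
    last_error = "no attempts made"
    for attempt in range(1, max_attempts + 1):
        try:
            result = await asyncio.wait_for(child(params), timeout=timeout_s)
            return {"status": "ok", "result": result, "attempts": attempt}
        except (asyncio.TimeoutError, RuntimeError) as exc:
            last_error = repr(exc)
            if attempt < max_attempts:
                await asyncio.sleep(backoff_s * 2 ** (attempt - 1))
    # Return a structured error so the parent Agent can decide what to do next.
    return {"status": "error", "error": last_error, "attempts": max_attempts}


def make_flaky_child(failures: int) -> ChildFn:
    """Hypothetical child Agent that fails a fixed number of times, then succeeds."""
    remaining = failures

    async def child(params: dict) -> dict:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            raise RuntimeError("transient failure")
        return {"summary": f"[stub] {params['task']}"}

    return child


outcome = asyncio.run(
    call_with_retry(make_flaky_child(1), {"task": "Research Sweden"}, backoff_s=0.01)
)
print(outcome["status"], outcome["attempts"])
```

Returning a status field instead of raising lets the parent Agent treat a failed child like any other tool result and choose whether to retry, skip, or report the gap.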

Step 5: Implement result aggregation

After receiving all child Agent results, the parent Agent needs to:

Unify result formats
Check logical consistency
Resolve conflicts (if results from different child Agents contradict)
Generate final report

This step is crucial. I have run into cases where child Agents returned results in inconsistent formats and the parent Agent failed to parse them, so interface specifications must be defined in advance.
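A minimal sketch of the aggregation step, assuming each child Agent returns a dict with `country`, `summary`, and `indicators` keys (this schema is my assumption, not a fixed standard). Malformed results are rejected and contradictory results are flagged rather than silently merged.

```python
REQUIRED_KEYS = {"country", "summary", "indicators"}


def validate(result: dict) -> list[str]:
    """Return a list of format problems; an empty list means the result is usable."""
    problems = []
    missing = REQUIRED_KEYS - set(result)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if "indicators" in result and not isinstance(result["indicators"], dict):
        problems.append("indicators must be a dict")
    return problems


def aggregate(results: list[dict]) -> dict:
    accepted: dict[str, dict] = {}
    rejected: list[dict] = []
    conflicts: list[str] = []
    for result in results:
        problems = validate(result)
        if problems:
            rejected.append({"result": result, "problems": problems})
            continue
        country = result["country"]
        previous = accepted.get(country)
        if previous is not None and previous["indicators"] != result["indicators"]:
            conflicts.append(country)  # flag for review instead of overwriting
            continue
        accepted[country] = result
    report = "\n".join(f"{c}: {r['summary']}" for c, r in sorted(accepted.items()))
    return {"report": report, "rejected": rejected, "conflicts": conflicts}


# Placeholder results, for illustration only.
outcome = aggregate([
    {"country": "Germany", "summary": "[stub] summary", "indicators": {"coverage": "A"}},
    {"country": "Germany", "summary": "[stub] summary", "indicators": {"coverage": "B"}},
    {"country": "Japan", "summary": "[stub] summary"},  # missing "indicators"
])
print(outcome["conflicts"], len(outcome["rejected"]))
```

How a flagged conflict gets resolved is a separate decision. One option is to call the child Agent again with a narrower question.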

How Do Parent Agents and Child Agents Interact?

This is the core question of multi-Agent systems.

I've summarized a pattern: let the parent Agent treat child Agents as tools to call.

[Figure: parent Agent calling a child Agent]

The interaction flow is as follows:

Parent Agent decides to call child Agent

When the parent Agent recognizes that a task is suitable for handling by a child Agent, it initiates the call.

For example, the parent Agent discovers it needs to "research the US healthcare system." It looks up registered child Agents, finds the child Agent specifically responsible for US healthcare, and calls it.

Parent Agent constructs call parameters

Call parameters are passed as tool inputs. They typically include:

Task description: clearly telling the child Agent what to do
Background information: overall context of the task
Expected output: what we hope the child Agent returns

For example, the parent Agent might make a call like this:

Call child Agent "us_healthcare_agent":
{
  "task": "Research core features of the US healthcare system",
  "context": "This is part of a comparative study of five countries' social welfare systems",
  "output_format": "Include healthcare coverage, medical expenditures, medical resources, and other key indicators"
}

Child Agent executes task

After receiving the command, the child Agent begins executing its task. It focuses on its own topic, undisturbed by other information.

Child Agent returns results

After execution completes, the child Agent returns results to the parent Agent. The returned content should be refined conclusions, not all raw data.

For example, the child Agent won't return "all detailed data on US healthcare spending in 2023," but rather "US healthcare spending accounts for about 18% of GDP, the highest among the five countries."

The benefit of this approach is that it compresses the parent Agent's context and reduces Token consumption.
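As a rough illustration of returning refined conclusions instead of raw data, the sketch below keeps only the top-ranked statements and caps the total length. The relevance scores and the `compress_findings` helper are hypothetical. In practice the child Agent would more likely ask the model to write the summary.

```python
def compress_findings(
    findings: list[tuple[float, str]], max_items: int = 3, max_chars: int = 400
) -> str:
    """Keep the highest-scored conclusions and cap the total length."""
    top = sorted(findings, key=lambda item: item[0], reverse=True)[:max_items]
    text = " ".join(statement for _, statement in top)
    if len(text) > max_chars:
        text = text[: max_chars - 3].rstrip() + "..."
    return text


# Placeholder findings with made-up relevance scores, for illustration only.
raw_findings = [
    (0.2, "Minor detail C."),
    (0.9, "Key conclusion A."),
    (0.7, "Key conclusion B."),
    (0.1, "Minor detail D."),
]
print(compress_findings(raw_findings, max_items=2))
```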

Parent Agent processes returned results

After receiving results, the parent Agent does several things:

Store results
Check whether other information is still needed
If so, call the same child Agent again or call a different one
Finally summarize all results and generate the final report
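The processing steps above can be sketched as a bounded loop. Everything here is hypothetical scaffolding: `call_child` stands in for a real child Agent call, and the `complete` flag is an assumed way for a child to signal that more research is needed.

```python
from typing import Callable


def run_parent(
    call_child: Callable[[str], dict], countries: list[str], max_rounds: int = 2
) -> dict:
    results: dict[str, dict] = {}
    pending = list(countries)
    for _ in range(max_rounds):
        if not pending:
            break
        still_missing = []
        for country in pending:
            result = call_child(country)
            results[country] = result  # store the latest result
            if not result.get("complete", False):
                still_missing.append(country)  # more information is still needed
        pending = still_missing  # call the same child Agents again next round
    report = "\n".join(f"{c}: {results[c]['summary']}" for c in countries if c in results)
    return {"report": report, "unresolved": pending}


def make_stub_child() -> Callable[[str], dict]:
    """Hypothetical child: the Japan research needs a second call to complete."""
    calls: dict[str, int] = {}

    def call_child(country: str) -> dict:
        calls[country] = calls.get(country, 0) + 1
        complete = country != "Japan" or calls[country] > 1
        return {"summary": f"[stub] findings for {country}", "complete": complete}

    return call_child


outcome = run_parent(make_stub_child(), ["United States", "Japan"])
print(outcome["unresolved"])
print(outcome["report"])
```

The `max_rounds` cap is there so that a child Agent that never reports completion cannot keep the parent Agent looping and burning Tokens.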

An important detail: when the parent Agent calls a child Agent, the command passed should be as concise as possible, conveying only necessary information.

There are two reasons:

Reduce context interference for the child Agent
Save Token costs

I once made the mistake of having the parent Agent pass all of its background information to the child Agent. As a result, the child Agent's context became too long, which hurt the quality of its output.

Later I learned to pass only "task description + necessary context," and the results improved.
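A small sketch of the "task description + necessary context" idea: the parent Agent keeps its full background but forwards only the keys a given child needs. The `build_child_params` helper and the background keys are hypothetical.

```python
def build_child_params(task: str, background: dict[str, str], needed: list[str]) -> dict:
    """Build a call payload that carries only the background the child needs."""
    context = {key: background[key] for key in needed if key in background}
    return {"task": task, "context": context}


# Hypothetical parent-side background; most of it is irrelevant to one child.
background = {
    "study_goal": "Compare social welfare systems across five countries",
    "other_countries_notes": "[long notes about the other four countries]",
    "report_style": "[formatting rules for the final report]",
}
params = build_child_params(
    "Research core features of the US healthcare system",
    background,
    needed=["study_goal"],
)
print(params)
```

Choosing which keys count as necessary is still a judgment call. The point is only that the selection is explicit rather than "send everything."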

How Should the Entire Agent System Be Integrated?

Having discussed interaction details, let's talk about overall integration.

A good multi-Agent system should have these characteristics:

Modularity

Each Agent is an independent module that can be developed, tested, and deployed separately.

This means that if you want to add research on a new topic, you only need to develop a new child Agent and register it with the parent Agent, without modifying the rest of the system.

Scalability

As business complexity increases, the system should be able to:

Conveniently add new child Agents
Adjust task decomposition granularity
Optimize scheduling strategies

I recommend using a state machine or workflow engine to manage multi-Agent scheduling.

LangGraph is a good choice.

With LangGraph, you can define each Agent as a node, with edges between nodes defining call relationships. The whole system then executes as a traversal of this graph.

The benefits are:

Strong visualization: the system architecture is clear at a glance
Easy debugging: you can track the execution status of each node
Easy to extend: just add new nodes
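To show the node-and-edge idea without depending on any library, here is a toy graph runner in plain Python. It is deliberately NOT LangGraph's API, which differs and is documented separately. `MiniGraph` and the three node functions are hypothetical and only illustrate why a graph makes execution easy to trace and extend.

```python
from typing import Callable

State = dict
NodeFn = Callable[[State], State]


class MiniGraph:
    """Toy node-and-edge runner; not LangGraph's API, just the underlying idea."""

    def __init__(self) -> None:
        self.nodes: dict[str, NodeFn] = {}
        self.edges: dict[str, str] = {}

    def add_node(self, name: str, fn: NodeFn) -> None:
        self.nodes[name] = fn

    def add_edge(self, source: str, target: str) -> None:
        self.edges[source] = target

    def run(self, entry: str, state: State) -> State:
        current, trace = entry, []
        while current is not None:
            trace.append(current)  # the trace records which node ran, in order
            state = {**state, **self.nodes[current](state)}
            current = self.edges.get(current)
        return {**state, "trace": trace}


def supervisor(state: State) -> State:
    return {"subtasks": [f"Research {c}" for c in state["countries"]]}


def researcher(state: State) -> State:
    return {"findings": [f"[stub] {task}" for task in state["subtasks"]]}


def reporter(state: State) -> State:
    return {"report": "\n".join(state["findings"])}


graph = MiniGraph()
graph.add_node("supervisor", supervisor)
graph.add_node("researcher", researcher)
graph.add_node("reporter", reporter)
graph.add_edge("supervisor", "researcher")
graph.add_edge("researcher", "reporter")

final = graph.run("supervisor", {"countries": ["Germany", "Japan"]})
print(final["trace"])
```

Extending the system means adding one node and one edge, and the recorded trace gives a per-node view of what ran, which is the debugging benefit mentioned above.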

Conclusion

This brings the practical portion of the Deep Research Agent series to a close.

From the initial single-Agent architecture to today's multi-Agent systems, we've explored together how to build a powerful research Agent.

Looking back at these three articles, the core idea is actually just one: through reasonable architectural design, enable Agents to better complete tasks.

A single Agent suits simple scenarios, while a multi-Agent system suits complex multi-topic tasks. Neither is absolutely better; what matters is fit.

Today we focused on:

Why multi-Agent systems are worth introducing
How to build a multi-Agent system
Parent-child Agent interaction mechanisms
How to integrate the whole system

This content is a summary of my experience from actual projects, and I hope it helps you.

Next, I'll write a more foundational article to break down what lies at the core of an Agent.

Many friends have been asking me: "What exactly is an Agent? How is it different from regular AI applications?"

In that article, I'll start from first principles and discuss the essence of Agents.

Series Review:

This series consists of three articles that record how I built Deep Research Agents in practice:

"Deep Research Agent in Practice (Part 1): Basic Architecture" - Building a research Agent from 0 to 1
"Deep Research Agent in Practice (Part 2): Practical Application" - Applications in real scenarios
"Deep Research Agent in Practice (Part 3): Building Multi-Agent Systems" (this article) - Solutions for complex scenarios

If you're interested in this topic, I recommend reading in order for a more complete understanding.

About the Author:

I'm Code Milestone, an AI Native Coder. This series of articles records my hands-on experience building Deep Research Agents.

If you're interested in Agent systems, feel free to discuss in the comments. You're also welcome to follow my official account for more practical AI experience.

References:

Fairy Project - Complete multi-Agent research system
LangGraph Documentation - Workflow engine
LangChain Deep Research - Official tutorial