What are the components of a deep research agent?
I am once again following Ivan’s tutorial. It seems like the only difference between his deep research agent and a normal agent is that the deep research agent has a to-do tool.
I think that gives a good starting point, but on its own it’s not enough to achieve deep research.
Off the top of my head, I think it’s not a bad idea to start from how humans do research (more specifically, my own research behavior back in my uni days).
starting from human research behavior
- How a deep research agent should work is probably similar to how a human does research
- the goal is usually to generate a report
- when we research something, we first think of a few bullet points (the outline / table of contents for the final report) to research → breadth.
- then, as we go into each topic, we follow links from the page we’re currently reading to the next one → depth. We might also have a few queries we want to send to the search engine at the same time → breadth
- then we write the report and polish it
components
breadth and depth control
- for depth:
- this is pretty much a recursive function, I think: we research something, then decide whether it’s in-depth enough for the goal.
- if we need more in-depth research, we ask follow-up questions and continue to the next round of research
- the agent should be able to study whatever it has found and ask a follow-up question. If it hasn’t reached its maximum depth, it continues to the next round of searching
- for breadth:
- the variation of the follow-up questions
- depth: how many rounds of follow-up question + research should we run
- breadth: how many follow-up questions we should ask for each research
- e.g., research topic is “AI agent”
- small breadth: search “what are AI agents”, “examples of AI agents”
- big breadth:
- “what are AI agents”
- “examples of AI agents”
- “AI agent architecture components”
- “multi-agent systems vs single agent”
- “tools used in AI agents”
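A minimal sketch of how depth and breadth could drive the recursion described above. The names `deep_research`, `fake_search` and `fake_follow_ups` are made up for illustration; in a real agent the two callbacks would be a search tool and an LLM call. Returning no follow-ups is how the callback says "this is in-depth enough".

```python
from typing import Callable, Dict, List


def deep_research(
    question: str,
    depth: int,
    breadth: int,
    search: Callable[[str], str],
    follow_ups: Callable[[str, str, int], List[str]],
) -> List[Dict[str, str]]:
    """Research `question`, then recurse into follow-up questions.

    depth   = how many more rounds of follow-up question + research to run
    breadth = how many follow-up questions to ask after each research step
    """
    finding = search(question)
    results = [{"question": question, "finding": finding}]
    if depth <= 0:
        return results  # maximum depth reached, stop here
    # An empty list from `follow_ups` means the finding is deep enough.
    for next_question in follow_ups(question, finding, breadth)[:breadth]:
        results.extend(
            deep_research(next_question, depth - 1, breadth, search, follow_ups)
        )
    return results


# Stand-ins so the sketch runs without a search API or an LLM.
def fake_search(query: str) -> str:
    return f"notes about {query}"


def fake_follow_ups(question: str, finding: str, n: int) -> List[str]:
    return [f"{question} / follow-up {i + 1}" for i in range(n)]


for item in deep_research("AI agent", 2, 2, fake_search, fake_follow_ups):
    print(item["question"])
```

The number of searches grows quickly with both knobs (1 + breadth + breadth² + ... for each extra round), which is probably why both need an explicit limit.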
memory (or state)
- throughout the research it should be able to remember what it has seen, like all the references or the URLs it has visited
- so we need a memory system which is for:
- deduplication: we don’t want to search for the same thing twice.
- final references, attribution.
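A rough sketch of what such a memory could look like, assuming an in-memory store is enough for now. `ResearchMemory` and its methods are hypothetical names, and the query normalization (lowercase, collapsed whitespace) is just one simple way to catch near-identical queries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ResearchMemory:
    """Remembers queries already sent and URLs already visited."""

    seen_queries: Set[str] = field(default_factory=set)
    # url -> title; dicts keep insertion order, so references stay in visit order
    sources: Dict[str, str] = field(default_factory=dict)

    @staticmethod
    def _normalize(text: str) -> str:
        # Assumption: case and extra whitespace do not change a query's meaning.
        return " ".join(text.lower().split())

    def should_search(self, query: str) -> bool:
        """Return True the first time a query is seen, False for repeats."""
        key = self._normalize(query)
        if key in self.seen_queries:
            return False
        self.seen_queries.add(key)
        return True

    def add_source(self, url: str, title: str) -> bool:
        """Record a visited URL; return False if it was already recorded."""
        if url in self.sources:
            return False
        self.sources[url] = title
        return True

    def references(self) -> List[str]:
        """Numbered reference list for the final report."""
        return [
            f"[{i}] {title} - {url}"
            for i, (url, title) in enumerate(self.sources.items(), start=1)
        ]


memory = ResearchMemory()
if memory.should_search("what are AI agents"):
    memory.add_source("https://example.com/agents", "Example page")
print(memory.references())
```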
query intake
- intake: it should be able to take in a user query
- clarification: and maybe ask the user a few follow-up questions if the initial query is not clear enough
- maybe some decisions: how deep, how wide?
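The intake step could be sketched as a small request object. Everything here is an assumption for illustration: the `ResearchRequest` name, the default depth and breadth, and especially the "fewer than four words means unclear" rule, which a real agent would likely replace with an LLM judgment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

DEFAULT_DEPTH = 2    # assumed default, not from any tutorial
DEFAULT_BREADTH = 3  # assumed default, not from any tutorial


@dataclass
class ResearchRequest:
    query: str
    audience: Optional[str] = None
    depth: Optional[int] = None
    breadth: Optional[int] = None

    def clarifying_questions(self) -> List[str]:
        """Questions to ask the user before planning starts."""
        questions = []
        if len(self.query.split()) < 4:  # crude stand-in for "not clear enough"
            questions.append("Can you describe what you want to learn in more detail?")
        if self.audience is None:
            questions.append("Who is the target audience for the report?")
        if self.depth is None or self.breadth is None:
            questions.append("How deep and how wide should the research go?")
        return questions

    def scope(self) -> Tuple[int, int]:
        """Return (depth, breadth), falling back to the defaults."""
        depth = DEFAULT_DEPTH if self.depth is None else self.depth
        breadth = DEFAULT_BREADTH if self.breadth is None else self.breadth
        return depth, breadth


request = ResearchRequest("AI agent")
for question in request.clarifying_questions():
    print(question)
```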
research planning
- this is where a human will decide roughly what should and should not be included in the final report.
- Basically we want to generate a table of contents for the research
researching
- the first stage is probably just collecting evidence based on the first outline
- basically just scraping the web.
- Either use a web search tool like Exa, Brave Search, or manually (still by agents, of course) spawn a Chrome instance to do research
- QUESTION: when the agent is doing recursive research and decides to research a follow-up question, how would this affect the initial table of contents? I imagine this table of contents / research outline would be updated as the research goes on
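One possible answer to the question above, as a sketch only: treat the outline as a tree and attach each follow-up question under the section that produced it. `Section` and `add_follow_up` are hypothetical names, and matching sections by exact title is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Section:
    """One entry in the research outline / table of contents."""

    title: str
    children: List["Section"] = field(default_factory=list)

    def find(self, title: str) -> Optional["Section"]:
        """Depth-first search for a section with this exact title."""
        if self.title == title:
            return self
        for child in self.children:
            hit = child.find(title)
            if hit is not None:
                return hit
        return None

    def add_follow_up(self, parent_title: str, follow_up: str) -> bool:
        """Attach a follow-up under its parent; False if missing or duplicate."""
        parent = self.find(parent_title)
        if parent is None or any(c.title == follow_up for c in parent.children):
            return False
        parent.children.append(Section(follow_up))
        return True

    def render(self, level: int = 0) -> List[str]:
        """Return the outline as indented bullet lines."""
        lines = ["  " * level + "- " + self.title]
        for child in self.children:
            lines.extend(child.render(level + 1))
        return lines


outline = Section("AI agent", [Section("what are AI agents"), Section("examples of AI agents")])
outline.add_follow_up("what are AI agents", "AI agent architecture components")
print("\n".join(outline.render()))
```

With this shape the initial table of contents is never rewritten, only extended, so the report writer can still decide later which follow-up sections deserve to stay.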
on the implementation level
agents
- I think essentially we need:
- one (main) agent loads a deep-research skill, triggered by research
- the deep-research skill includes a specific prompt like “understand research questions, expand, ask for clarification on query/research scope/final report expectation/target audience/etc”
- write down main points / goals in files for persistence
- main agent spawns individual agents for evidence collection for each research
- for now it’s just researching. We are just collecting evidence
- individual agents write evidence to files
- along the way, what has been researched is captured
- and we will have the main agent or another agent go through all the evidence and write a first draft of the report
- and then the main agent or another agent finds gaps in the draft, does further research, and polishes the report
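The flow above could be sketched like this, with plain functions standing in for the agents. Every name here (`collect_evidence`, `write_draft`, `find_gaps`, `run_research`, the `goal.md` and `evidence_NN.json` file names) is an assumption for illustration, and threads stand in for "spawning" sub-agents; no real agent framework is involved.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def collect_evidence(index: int, topic: str, workdir: Path) -> Path:
    """Stand-in for a sub-agent: research one topic, write evidence to a file."""
    evidence = {"topic": topic, "notes": [f"placeholder note about {topic}"]}
    path = workdir / f"evidence_{index:02d}.json"
    path.write_text(json.dumps(evidence, indent=2), encoding="utf-8")
    return path


def write_draft(workdir: Path) -> str:
    """Stand-in for the drafting agent: read all evidence files into one draft."""
    sections = []
    for path in sorted(workdir.glob("evidence_*.json")):
        evidence = json.loads(path.read_text(encoding="utf-8"))
        notes = "\n".join(f"- {note}" for note in evidence["notes"])
        sections.append(f"{evidence['topic']}\n{notes}")
    return "\n\n".join(sections)


def find_gaps(draft: str, topics: List[str]) -> List[str]:
    """Stand-in for the reviewing agent: topics the draft never mentions."""
    return [topic for topic in topics if topic not in draft]


def run_research(goal: str, topics: List[str], workdir: Path) -> str:
    # Main agent: persist the goal, then fan out one worker per topic.
    (workdir / "goal.md").write_text(goal, encoding="utf-8")
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(lambda item: collect_evidence(item[0], item[1], workdir),
                      enumerate(topics)))
    draft = write_draft(workdir)
    gaps = find_gaps(draft, topics)
    # Further research for any gaps, then redraft.
    for index, topic in enumerate(gaps, start=len(topics)):
        collect_evidence(index, topic, workdir)
    return write_draft(workdir) if gaps else draft


with tempfile.TemporaryDirectory() as tmp:
    print(run_research("Explain AI agents", ["what are AI agents", "examples of AI agents"], Path(tmp)))
```

Writing evidence to files rather than passing it through the main agent keeps the main agent's context small, and the files double as the persistent record of what has been researched.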