Welcome to Your AI Research Partner
The Deep Research Agent is a powerful tool designed to automate the entire research lifecycle. It transforms a single topic into a comprehensive, well-structured, and fully-cited report. This guide will walk you through its features and show you how to get the most out of your automated research assistant.
Why It Matters: Beyond the Filter Bubble
The Problem: Algorithmic Bias & Informational Silos
Traditional research workflows often rely on a single search engine or database. This creates a "filter bubble," where algorithmic biases can inadvertently hide critical information and diverse perspectives, leading to an incomplete or skewed understanding of a topic.
The Solution: Multi-Engine Synthesis with an AI Judge
This agent systematically overcomes this limitation. By generating separate reports from different search engines (e.g., a standard web search vs. an academic search) and then using a powerful LLM as an impartial "Judge," it identifies and merges the most valuable, unique insights from each source into a single, definitive report.
The Value Proposition: Higher-Fidelity Insights
This method produces a final analysis that is more comprehensive, objective, and robust than what any single source could provide. Furthermore, the **Scoring** utility provides direct feedback on report quality, helping you iteratively refine the research plan itself. By comparing how different engines cover a topic, you can identify and fill knowledge gaps, leading to a truly superior research outcome.
Core Features
Automated Planning
The AI first acts as an expert planner, breaking down your topic into logical objectives and sub-topics to ensure comprehensive coverage.
Deep Web Research
For each sub-topic, the AI performs a deep dive using modern search-enabled models, gathering relevant information from across the web.
Cited Synthesis
The system synthesizes all findings into a single report, ensuring every claim is backed by a citation and providing a clean, centralized reference list.
Engine Flexibility
Choose different LLM providers (like Claude, Perplexity, Gemini, and OpenAI) for different stages to optimize for quality and cost.
AI Judging
Generate multiple report versions, then have a final "AI Judge" analyze them to create a single, superior version that combines their strengths.
Automated Scoring
Quantify the quality of each report. The AI scores syntheses against the original plan on metrics like objective fulfillment, coverage, and insight.
How to Use: The Research Workflow
Step 1: Define Your Research Topic
This is your starting point. Provide a clear and concise research topic. The more specific your topic, the better the resulting plan will be.
- Enter your topic in the text area. Good examples include "The impact of quantum computing on modern cryptography" or "Sustainable urban planning strategies in coastal cities."
Two Options for Creating Your Research Plan:
Option A: AI-Generated Plan (Recommended)
- Select a Plan Generation Engine from the dropdown (Gemini, Claude, Perplexity, OpenAI, or DeepSeek)
- Choose Research Depth: Select the number of research questions:
- 2 - Quick overview (fastest)
- 3 - Standard (recommended for most topics)
- 5 - Comprehensive (deeper exploration)
- 7 - Deep dive (maximum detail)
- Click the Generate Research Plan button to let AI create a structured plan for you
Option B: Manual Plan Entry
- Check the box "I'll enter my own research plan" to skip AI generation
- Click Continue to Plan Editor to write your own plan manually
- Useful when you have a specific research structure in mind or want full control over the research questions
Step 2: Review and Refine the Plan
This is the most critical step for ensuring a high-quality outcome. The AI will generate a detailed plan, but you are the expert. You can now edit the plan directly in the text box.
- Review the Objectives: Do they align with your goals?
- Check the Subtopics: Are they relevant? Should any be added, removed, or rephrased?
- Adjust Search Queries: You can modify the suggested search queries under each subtopic to guide the research agent more effectively.
Tip: Taking a few minutes to refine the plan here can dramatically improve the final report's focus and relevance.
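For orientation, a plan in the editor generally lays out an objective, a set of subtopics, and suggested search queries under each one. The outline below is purely illustrative; the exact wording and layout produced by your chosen planning engine will differ.

```
Objective: Assess how quantum computing threatens modern cryptography
and what migration paths exist.

Subtopic 1: Vulnerable algorithms
  - Search: impact of Shor's algorithm on RSA and ECC
  - Search: symmetric cryptography resistance to Grover's algorithm

Subtopic 2: Post-quantum standards
  - Search: NIST post-quantum cryptography standardization status
  - Search: migration strategies to post-quantum cryptography
```

Editing any of these lines directly in the text box changes what the research agent will search for in Step 3.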
Step 3: Configure and Execute
Once you are satisfied with your research plan, configure the execution parameters and start the main process.
- Select a Search Engine: This model will perform the actual web research for each subtopic. Some engines have special parameters (like recency) you can adjust.
- Select a Synthesis Engine: This model will read all the research findings and write the final, cohesive report.
- Set Target Report Length: Provide an approximate word count for the final report.
- Click **Generate New Research**. The agent will now work in the background with real-time status updates showing progress through each step (analyzing subtopics, synthesizing, etc.). This initial run gathers all the necessary data and unlocks the advanced features below.
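Conceptually, the execution stage works through the plan as in the short sketch below. The function and method names are illustrative only and do not reflect the agent's actual internal API.

```python
# Conceptual sketch of the execution stage; all names here are hypothetical.
def run_research(plan, search_engine, synthesis_engine, target_words=2000):
    findings = []
    for subtopic in plan["subtopics"]:
        for query in subtopic["search_queries"]:
            # Each query is sent to the selected search-enabled model.
            findings.append(search_engine.research(query))
    # The synthesis engine reads every finding and writes one cited report.
    return synthesis_engine.synthesize(plan, findings, target_words=target_words)
```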
Visualizing the Workflow
The Full Research Lifecycle
The agent provides a seamless user experience, guiding you from the initial topic definition to the final, advanced stages of judging and scoring multiple AI-generated reports.
Sample Score Report
After generating reports, the AI scoring utility provides a detailed, quantitative breakdown of each version's performance, helping you objectively identify the highest-quality synthesis.
External Prompt Management
All LLM prompts are now stored externally in the prompts/ directory, allowing you to customize the agent's behavior without modifying the core code.
Why External Prompts?
- No Code Changes: Modify prompt templates by editing text files - no Python knowledge required
- Runtime Reload: Use the "Reload Prompts" button in the UI to apply changes instantly without restarting the server
- Experimentation Friendly: Test different prompt strategies quickly and iterate based on results
- Version Control: Track prompt changes in Git alongside your code
Available Prompt Templates
Search Engine Prompts
- research_claude.txt - Claude web search with citations
- research_perplexity.txt - Perplexity web search with citations
- research_tavily.txt - Tavily API documentation
- research_serpapi.txt - SerpApi API documentation
Processing Engine Prompts
Shared by Gemini, OpenAI, DeepSeek, Perplexity, Claude
- planning.txt - Generate research plans from topic
- synthesis.txt - Synthesize findings into report
- judge.txt - Merge multiple reports into superior one
- scoring.txt - Score reports against research plan
- parse_plan.txt - Parse text plan back to JSON
How to Reload Prompts
- Edit any .txt file in the prompts/ directory
- Save your changes
- Click the "Reload Prompts" button in the navigation bar (or restart the server)
- Verify success by checking the confirmation alert showing loaded prompts
Note: All prompt templates support Python .format() style placeholders like {variable_name}.
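For example, if a template contains placeholders such as {topic} or {target_words}, the agent fills them with Python's standard str.format(). The snippet below shows the mechanism in isolation; the placeholder names are illustrative, so check the shipped templates in prompts/ to see which variables each one actually expects.

```python
from pathlib import Path

# Load a prompt template and fill its placeholders with str.format().
# Placeholder names are illustrative; each template defines its own.
template = Path("prompts/synthesis.txt").read_text(encoding="utf-8")
prompt = template.format(
    topic="RAG System Design",
    findings="...collected research findings...",
    target_words=1500,
)
print(prompt[:500])  # preview the rendered prompt
```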
Demo: Multi-Engine Fusion Results
See the power of multi-engine synthesis with this real example using two different search engines to research RAG system design.
Research Topic & Plan
Topic: RAG System Design
Generated with 2 research questions for a broad overview. More questions = deeper research (you can choose 2, 3, 5, or 7 questions).
AI Judge Fusion Report
The AI Judge analyzed both source reports and created a superior synthesis that combines their strengths while preserving the best content from each.
Key Fusion Benefits:
- ✓ Preserves unique insights from both search engines
- ✓ Resolves contradictions by presenting multiple viewpoints
- ✓ Eliminates redundancy while maintaining citation density
- ✓ Creates coherent narrative with better structure
What to Observe
When comparing these three reports, notice how:
- Source Reports 1 & 2 may have different coverage, sources, and perspectives due to using different search engines
- Fusion Report (labeled "Fusion (2 sources)") combines the best elements from both, creating a more comprehensive and balanced analysis
- Citation density is maintained throughout - all claims remain well-supported
- Structure is improved with better flow and organization
Advanced Features & Tips
Recent Improvements (v1.2.4+)
Better Table Formatting
All synthesis engines now properly render markdown tables with clean borders, alternating row colors, and hover effects. Tables are automatically formatted with proper pipe (`|`) alignment.
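For example, a table written in standard pipe-delimited markdown, like the illustrative one below, now renders with clean borders and row styling:

```
| Engine     | Stage     |
|------------|-----------|
| Perplexity | Search    |
| Claude     | Synthesis |
| Gemini     | Judging   |
```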
Perplexity Citation Mapping Fixed
Reports generated using Perplexity as the search engine now have accurate citation mapping between in-text citations and the References section. The system properly handles Perplexity's unique citation format.
Adjustable Research Depth
Control how detailed your research plan is by choosing 2, 3, 5, or 7 research questions. More questions mean deeper exploration but longer research time.
Iteration with "Re-Synthesize"
After a report is generated, the **Re-Synthesize** button appears. This powerful feature allows you to re-write the final report using the *already-gathered research data*. This is incredibly useful for:
- Trying a different synthesis model to see if it produces a better narrative.
- Adjusting the report length without having to re-run the entire (and time-consuming) research process.
- Creating multiple versions of the report to compare before final judging.
Step 4 (Optional): Synthesis Management
After generating one or more syntheses, the **Synthesis Management** panel appears. This is your command center for comparing different versions and creating a definitive final report.
Scoring Reports: Quantifying Quality
The scoring utility provides an objective, AI-driven evaluation of your reports, allowing you to quickly compare their performance. The scorer uses the same AI-powered analysis tools (Draft Statistics and Citation Overlap Analysis) to inform its evaluation.
The Scoring Workflow:
- Generate Reports: Create one or more reports using the "Generate" or "Re-Synthesize" buttons.
- Select Reports to Score: In the management panel, use the checkboxes to select one or more reports.
- Choose an Engine: The "Judge Engine" dropdown is also used for scoring. Select the LLM you want to act as the evaluator.
- Click "Score Selected": A modal window will appear, showing a detailed score card for each selected report. You'll see an overall score and a breakdown across key metrics like Objective Fulfillment, Question Coverage, and Depth & Insight.
The AI Judge: Creating the Definitive Report
The judge's goal is to analyze multiple drafts and produce a single, superior report that best fulfills the original research plan's objectives. It intelligently merges content, resolves contradictions, and ensures the best information is included.
AI-Powered Analysis Tools:
The judge is assisted by two automated analysis tools that provide insights into the source reports:
- 📊 Draft Statistics: Automatically calculates quantitative metrics for each report including word count, total citations, unique citations, citation density (citations per 100 words), and citation coverage (percentage of paragraphs with citations). This helps identify which reports are most comprehensive and well-supported.
- 🔗 Citation Overlap Analysis: Analyzes which sources are cited by multiple reports (consensus sources) versus sources unique to each report. This helps the judge identify broadly-supported information versus unique insights that only one engine discovered.
These tools run automatically during judging and scoring, providing the AI with data-driven insights to make better fusion decisions.
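To make the Draft Statistics metrics concrete, the sketch below shows roughly how citation density and coverage can be computed from a report's text. It is a simplification for illustration, not the agent's actual implementation, and it assumes in-text citations appear as bracketed numbers like [12].

```python
import re

def draft_statistics(report_text: str) -> dict:
    """Rough sketch of draft metrics; not the agent's exact implementation."""
    words = report_text.split()
    citations = re.findall(r"\[(\d+)\]", report_text)  # assumes [n]-style markers
    paragraphs = [p for p in report_text.split("\n\n") if p.strip()]
    cited = [p for p in paragraphs if re.search(r"\[\d+\]", p)]
    return {
        "word_count": len(words),
        "total_citations": len(citations),
        "unique_citations": len(set(citations)),
        "citation_density": 100 * len(citations) / max(len(words), 1),
        "citation_coverage": 100 * len(cited) / max(len(paragraphs), 1),
    }
```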
How the Judge Works:
- Quality Filtering: The judge actively filters out low-quality content including unsubstantiated statistics, unsupported claims, vague filler statements, and redundant information.
- Selective Citations: Only sources that were actually cited in the report text are included in the References section. If the highest in-text citation is [48], only 48 sources will be listed.
- Citation Accuracy: Every statistic and factual claim in the judged report must have a proper citation marker [CITATION:X] attached to it.
- Quality Over Quantity: The judge is instructed that a shorter, well-supported report is better than a longer report filled with unverified claims.
- Tool-Guided Synthesis: The judge uses the citation overlap analysis to prioritize consensus sources (information verified by multiple engines) while preserving unique insights from individual reports.
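The selective-citation rule can be pictured as keeping only reference entries whose numbers actually appear in the text, as in the simplified sketch below (again, an illustration rather than the judge's real post-processing code).

```python
import re

def keep_cited_references(report_text: str, references: dict) -> dict:
    """Keep only reference entries whose numbers appear as [n] in the text.

    `references` maps citation numbers to source descriptions; this is a
    simplified illustration of the selective-citation idea.
    """
    used = {int(n) for n in re.findall(r"\[(\d+)\]", report_text)}
    return {num: src for num, src in references.items() if num in used}
```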
The Judging Workflow:
- Generate Multiple Versions: Use the **Re-Synthesize** button to create at least two reports to compare.
- Select Reports for Judging: Use the checkboxes to select two or more reports you want the judge to analyze.
- Choose a Judge Engine: Select the LLM you want to act as the final editor.
- Click "Judge Selected": Watch as real-time status updates show the judge's progress (analyzing reports, generating citation overlap statistics, synthesizing, post-processing). The AI will analyze the reports and create a new, consolidated version, which will be added to the list and displayed.
Note: Because the judge applies strict quality filtering, the resulting report may be shorter than the source reports. This is intentional - the judge removes filler, redundancy, and unsupported content while preserving only well-sourced, verifiable information.
Report Management
At the top of the displayed report, you have three options:
- Start New Topic: Clears the entire session (plan, research data, and all syntheses) and takes you back to Step 1.
- Copy HTML: Copies the raw HTML of the currently viewed report to your clipboard, perfect for pasting into a CMS or other web-based editors.
- Save as HTML: Downloads a complete, standalone HTML file of your currently viewed report for easy sharing and archiving.