The multi-agent research workflow that turns 50+ papers into a structured, grounded, bias-checked synthesis — inside NotebookLM.
You uploaded 50 papers. You asked NotebookLM to synthesize them. The answer was… fine. But it missed the real gaps. Here's why.
"I have 80 papers but NotebookLM only takes 50. I uploaded everything and the answers got vaguer, not sharper."
"My advisor says the gap is obvious. I've read every paper. I still can't see it. Comparing across 50 sources manually is impossible."
"Sometimes NotebookLM gives brilliant insights. Other times it hallucinates citations. I can't trust the output without re-reading everything."
"I don't know what I've already analyzed. Switching topics loses all context. Am I making progress or going in circles?"
A complete research operating system. Each layer has specialized prompts, tools, and quality controls. They loop until your synthesis is bulletproof.
Each workflow is a self-contained research engine. Pick the one that fits your time budget and paper count. All use the same OS v2 prompt library.
Pre-process 50 papers through automated quality gates. Extract the research question (RQ), methodology, findings, and limitations from each paper. Sort by relevance × rigor × recency. Distribute to Hub-Spoke notebooks with taxonomy alignment.
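The sort-and-distribute step can be pictured with a minimal Python sketch. Everything named here is an assumption for illustration: the `Paper` fields, scores pre-normalized to [0, 1], one Spoke per taxonomy topic, and a per-notebook cap of 50 sources (the limit quoted earlier on this page). It is not the system's actual implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    title: str
    topic: str        # hypothetical taxonomy label that decides the Spoke
    relevance: float  # all three scores assumed pre-normalized to [0, 1]
    rigor: float
    recency: float


def priority(paper: Paper) -> float:
    """Combined score: relevance x rigor x recency."""
    return paper.relevance * paper.rigor * paper.recency


def distribute(papers, cap=50):
    """Fill each topic's Spoke in priority order; park the rest in overflow."""
    spokes = defaultdict(list)
    overflow = []
    for paper in sorted(papers, key=priority, reverse=True):
        if len(spokes[paper.topic]) < cap:
            spokes[paper.topic].append(paper)
        else:
            overflow.append(paper)  # over the cap: candidate for another notebook
    return dict(spokes), overflow
```

Because the score is a product, a paper that is weak on any one axis sinks quickly, which is one plausible reading of "relevance × rigor × recency".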
Run structural gap analysis across 4 categories (theoretical, methodological, empirical, synthesis). Generate hypotheses at 3 risk levels per gap. Build a prioritized research tree with status tracking.
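One way to hold the research tree in code is sketched below. The four gap categories come from the description above; the risk-level names, the status values, and the integer `priority` field are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from enum import Enum


class GapCategory(Enum):
    THEORETICAL = "theoretical"
    METHODOLOGICAL = "methodological"
    EMPIRICAL = "empirical"
    SYNTHESIS = "synthesis"


class Risk(Enum):  # three levels; the names are assumed
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Status(Enum):  # assumed status values for tracking
    OPEN = "open"
    TESTING = "testing"
    RESOLVED = "resolved"


@dataclass
class Hypothesis:
    claim: str
    risk: Risk
    status: Status = Status.OPEN


@dataclass
class Gap:
    description: str
    category: GapCategory
    priority: int  # assumed convention: 1 = work on this first
    hypotheses: list[Hypothesis] = field(default_factory=list)


def unresolved(gaps):
    """Hypotheses still needing work, ordered by their gap's priority."""
    ordered = sorted(gaps, key=lambda gap: gap.priority)
    return [h for gap in ordered for h in gap.hypotheses
            if h.status is not Status.RESOLVED]
```

Calling `unresolved(gaps)` after each work session gives the list of branches that still need a pass.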
For each hypothesis: targeted querying → cross-Spoke validation → adversarial debate via Fighting Arena → grounding check against sources. Update research tree status. Iterate until all branches are resolved.
Bias audit across 5 dimensions. Confidence scoring with 6 factors per claim. Progress dashboard update. Weekly meta-reflection: "Am I making progress or going in circles?" Loop back to Phase 3 for unresolved branches.
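As a rough illustration of per-claim confidence scoring, the sketch below averages six factor scores and maps the result to the labels proven, probable, emerging, and unknown. The equal weighting, the [0, 1] scale, and the thresholds are all assumptions; the page does not name the six factors, so they are left unnamed here.

```python
from statistics import mean

# Assumed cut-offs, checked from highest to lowest.
THRESHOLDS = [(0.85, "proven"), (0.60, "probable"), (0.30, "emerging")]


def confidence_label(factor_scores):
    """Map six factor scores in [0, 1] to a confidence label."""
    if len(factor_scores) != 6:
        raise ValueError("expected exactly six factor scores")
    if not all(0.0 <= score <= 1.0 for score in factor_scores):
        raise ValueError("each factor score must lie in [0, 1]")
    overall = mean(factor_scores)  # equal weights: an assumption
    for threshold, label in THRESHOLDS:
        if overall >= threshold:
            return label
    return "unknown"
```

A weighted mean would be the obvious variation if some factors matter more than others for your field.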
From your research tree, select the most contested question. Each Spoke notebook generates a position using ONLY its sources. Advocate A argues for, Advocate B argues against, Advocate C reviews the raw evidence neutrally.
External LLM (Claude/GPT) moderates. Each advocate: acknowledge strongest opposing claim → attack weakest claim → present counter-evidence → state what would change their mind. Critic finds logical flaws in all positions.
Moderator identifies: what was resolved, what remains contested, what new questions emerged. Workshop notebook grounds every claim in actual sources. Final integrated position with confidence levels and caveats.
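The four-step advocate turn described above can be turned into a reusable prompt for the external moderator LLM. This is a minimal sketch: the function name, the argument names, and the exact wording of the prompt are hypothetical, and only the four steps themselves come from the workflow.

```python
# The four steps each advocate follows, in order.
ADVOCATE_STEPS = (
    "Acknowledge the strongest opposing claim.",
    "Attack the weakest claim.",
    "Present counter-evidence.",
    "State what evidence would change your mind.",
)


def build_turn_prompt(advocate, question, opposing_positions):
    """Assemble one advocate's debate turn as plain text.

    opposing_positions maps an advocate name to that advocate's position.
    """
    lines = [f"You are {advocate}. Contested question: {question}",
             "",
             "Opposing positions:"]
    lines += [f"- {name}: {text}" for name, text in opposing_positions.items()]
    lines += ["", "Respond in four numbered steps:"]
    lines += [f"{i}. {step}" for i, step in enumerate(ADVOCATE_STEPS, start=1)]
    return "\n".join(lines)
```

For example, `build_turn_prompt("Advocate A", "Does X improve Y?", {"Advocate B": "No, the effect vanishes in replications."})` yields text you can paste into the moderator chat before Advocate A's turn.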
Each prompt is battle-tested across the PERCEIVE → PLAN → ACT → EVALUATE cycle. Preview two of the most powerful below, then unlock the full library.
Scans your entire literature base across 4 gap categories. Finds what's missing that you didn't know to look for.
Produces a synthesis that explicitly marks what's proven, probable, emerging, and unknown — with citations per claim.
Quantified improvements across every dimension of the research process.
| Dimension | OS v1 (Typical Use) | OS v2 (This System) | Δ |
|---|---|---|---|
| Notebook Setup | 1 notebook, 50 sources crammed in | Hub + 3 Spokes + Workshop | ↑ 3x capacity |
| Gap Detection | Manual, ask once, hope for the best | 4-category automated structural analysis | ↑ 3-5x gaps found |
| Perspectives | Single LLM answer (confirmation bias) | Multi-agent adversarial debate | ↑ Bias reduction |
| Synthesis | Trust single output, no verification | 4-layer confidence + source grounding | ↑ 40-60% accuracy |
| Bias Checking | None | 5-type systematic audit | ↑ 70% catch rate |
| Progress | No tracking, no memory | Dashboard + weekly meta-reflection | ↑ Continuous |
| Time to Insight | Hours of manual comparison | Structured pipeline, smart routing | ↑ ~50% faster |
| Scalability | ~20 papers before quality degrades | 50-100+ papers with tiered architecture | ↑ 5x scale |
25+ battle-tested prompts · 3 complete workflows · Templates · The complete multi-agent research system.
Free. No spam. One email with the complete package.