Research OS v2 — June 2025

Stop Prompting.
Start Operating.

The multi-agent research workflow that turns 50+ papers into a structured, grounded, bias-checked synthesis — inside NotebookLM.

4 OS Layers · 25+ Prompts in Library · 3 Full Workflows · 50→100+ Paper Capacity

Why Your Research Hits a Wall

You uploaded 50 papers. You asked NotebookLM to synthesize them. The answer was… fine. But it missed the real gaps. Here's why.

📚

Literature Overload

"I have 80 papers but NotebookLM only takes 50. I uploaded everything and the answers got vaguer, not sharper."

🔍

Gap Detection Failure

"My advisor says the gap is obvious. I've read every paper. I still can't see it. Comparing across 50 sources manually is impossible."

📊

Unstable Synthesis

"Sometimes NotebookLM gives brilliant insights. Other times it hallucinates citations. I can't trust the output without re-reading everything."

🔄

No Progress Tracking

"I don't know what I've already analyzed. Switching topics loses all context. Am I making progress or going in circles?"

Research OS v2 — Four Layers

A complete research operating system. Each layer has specialized prompts, tools, and quality controls. They loop until your synthesis is bulletproof.

📡
PERCEIVE Layer 1
Multi-source ingestion → auto-parsing → quality assessment → version management. Feed the system 50+ papers with structured metadata extraction.
P1.1 paper-parse P1.2 quality-gate P1.3 version-diff
🗺️
PLAN Layer 2
Gap detection across 4 categories → hypothesis generation at 3 risk levels → research tree with priority queue and status tracking.
P2.1 gap-detect P2.3 hypothesis-factory P2.4 research-tree
ACT Layer 3
4-level iterative querying → multi-agent debate (Fighting Arena) → grounded synthesis with source verification. No hallucination survives.
P3.2 deep-dive P3.5 arena-debate P3.6 ground-check
📊
EVALUATE Layer 4
Systematic bias audit (5 types) → 6-dimension confidence scoring → progress dashboard → weekly meta-reflection loop.
P4.1 bias-audit P4.2 confidence-score P4.3 meta-reflect

Multi-Notebook Hub-Spoke Architecture

🏛️ MASTER HUB: Cross-domain Map · Taxonomy · Gap Tracker
📚 Spoke A: Theory · 15 sources
🔬 Spoke B: Methods · 12 sources
📋 Spoke C: Evidence · 12 sources
🔬 WORKSHOP: Grounded Synthesis Arena
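As a rough illustration of the hub-spoke split, the sketch below groups papers by a taxonomy label and keeps each Spoke under a per-notebook cap. The labels, the `assign_to_spokes` helper, and the cap constant are assumptions made for this example (the cap mirrors the 50-source figure quoted earlier); nothing here calls NotebookLM itself.

```python
from collections import defaultdict

# Assumption: per-notebook source cap, mirroring the 50-source limit quoted above.
MAX_SOURCES_PER_NOTEBOOK = 50


def assign_to_spokes(papers, cap=MAX_SOURCES_PER_NOTEBOOK):
    """Group (title, label) pairs into Spokes by label; return (spokes, overflow).

    The label is a hypothetical taxonomy tag such as "theory", "methods",
    or "evidence". Papers that would push a Spoke past the cap go to overflow.
    """
    spokes = defaultdict(list)
    overflow = []
    for title, label in papers:
        if len(spokes[label]) < cap:
            spokes[label].append(title)
        else:
            overflow.append(title)  # needs another Spoke or a stricter quality gate
    return dict(spokes), overflow


papers = [
    ("Paper 1", "theory"),
    ("Paper 2", "methods"),
    ("Paper 3", "theory"),
    ("Paper 4", "evidence"),
]
# A tiny cap makes the overflow behaviour visible in a four-paper example.
spokes, overflow = assign_to_spokes(papers, cap=1)
print(spokes)
print(overflow)
```

Anything that lands in `overflow` is a signal to open another Spoke or to filter harder before upload, rather than cramming more sources into one notebook.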

Three Complete Closed Loops

Each workflow is a self-contained research engine. Pick the one that fits your time budget and paper count. All use the same OS v2 prompt library.

01

OS v2 Full Pipeline — 50+ Papers

Complete research cycle from ingestion to validated synthesis. ~14 days.

Phase 1 — Perceive

Systematic Ingestion

Pre-process 50 papers through automated quality gates. Extract RQ, methodology, findings, limitations per paper. Sort by relevance × rigor × recency. Distribute to Hub-Spoke notebooks with taxonomy alignment.

P1.1 paper-parse P1.2 quality-gate Taxonomy template
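One way to picture the relevance × rigor × recency sort is a simple product score with a rigor cutoff standing in for the quality gate. This is a minimal sketch under assumptions: the 0-1 scales, the `min_rigor` threshold, and the `Paper` and `rank_papers` names are illustrative and do not come from the prompt library.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    relevance: float  # hypothetical 0-1 score: fit with your research question
    rigor: float      # hypothetical 0-1 score: methodological quality
    recency: float    # hypothetical 0-1 score: newer papers sit closer to 1

    @property
    def priority(self) -> float:
        # A product means a paper that is weak on any one axis drops sharply.
        return self.relevance * self.rigor * self.recency


def rank_papers(papers, min_rigor=0.3):
    """Drop papers below the rigor gate, then sort by priority, highest first."""
    passed = [p for p in papers if p.rigor >= min_rigor]
    return sorted(passed, key=lambda p: p.priority, reverse=True)


papers = [
    Paper("A", relevance=0.9, rigor=0.8, recency=0.5),
    Paper("B", relevance=0.6, rigor=0.9, recency=0.9),
    Paper("C", relevance=0.9, rigor=0.2, recency=1.0),  # fails the rigor gate
]
ranked = rank_papers(papers)
print([p.title for p in ranked])  # ['B', 'A']
```

A weighted sum would also work; the product is used here only because the text writes the criteria with a multiplication sign.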
Phase 2 — Plan

Gap Detection & Hypothesis Generation

Run structural gap analysis across 4 categories (theoretical, methodological, empirical, synthesis). Generate hypotheses at 3 risk levels per gap. Build a prioritized research tree with status tracking.

P2.1 gap-detect P2.2 priority-matrix P2.3 hypothesis-factory P2.4 research-tree
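The research tree with a priority queue and status tracking could be modelled along these lines. The sketch orders branches by the importance and feasibility ratings used in the Gap Detection prompt; the status names (`open`, `in_progress`, `resolved`) and the `ResearchTree` class are assumptions for illustration, not part of the package.

```python
import heapq

# Rating scales taken from the Gap Detection prompt; lower number = handled first.
IMPORTANCE = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
FEASIBILITY = {"EASY": 0, "MODERATE": 1, "HARD": 2}


class ResearchTree:
    """Priority queue of research branches with a status per branch."""

    def __init__(self):
        self._heap = []
        self._count = 0   # insertion counter, breaks ties in arrival order
        self.status = {}  # question -> "open" | "in_progress" | "resolved" (hypothetical names)

    def add(self, question, importance, feasibility):
        key = (IMPORTANCE[importance], FEASIBILITY[feasibility], self._count)
        heapq.heappush(self._heap, (key, question))
        self._count += 1
        self.status[question] = "open"

    def next_branch(self):
        """Pop the highest-priority open branch and mark it in progress."""
        while self._heap:
            _, question = heapq.heappop(self._heap)
            if self.status[question] == "open":
                self.status[question] = "in_progress"
                return question
        return None

    def resolve(self, question):
        self.status[question] = "resolved"

    def all_resolved(self):
        return all(s == "resolved" for s in self.status.values())


tree = ResearchTree()
tree.add("Gap A", "HIGH", "HARD")
tree.add("Gap B", "CRITICAL", "MODERATE")
tree.add("Gap C", "HIGH", "EASY")

order = []
while not tree.all_resolved():
    branch = tree.next_branch()
    order.append(branch)
    tree.resolve(branch)
print(order)  # ['Gap B', 'Gap C', 'Gap A']
```

Sorting on importance first and feasibility second is one plausible reading of "priority"; a different tie-break only requires changing the key tuple.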
Phase 3 — Act

Multi-Agent Investigation

For each hypothesis: targeted querying → cross-Spoke validation → adversarial debate via Fighting Arena → grounding check against sources. Update research tree status. Iterate until all branches resolved.

P3.2 deep-dive P3.3 cross-validate P3.5 arena-debate P3.6 ground-check P3.7 multi-layer synthesis
Phase 4 — Evaluate

Quality Control & Reflection

Bias audit across 5 bias types. Confidence scoring on 6 dimensions per claim. Progress dashboard update. Weekly meta-reflection: "Am I making progress or going in circles?" Loop back to Phase 3 for unresolved branches.

P4.1 bias-audit P4.2 confidence-score P4.3 meta-reflect Dashboard template
02

Fighting Arena — Council of Agents

Multi-perspective adversarial debate. Resolves contested questions in 2 hours.

Round 1 — Position Statements

Define the Arena Question

From your research tree, select the most contested question. Each Spoke notebook generates a position using ONLY its sources. Advocate A argues for, Advocate B argues against, Advocate C reviews the raw evidence neutrally.

P3.5 advocate-role Spoke A, B, C
Round 2 — Adversarial Exchange

Structured Debate (3 Clashes)

An external LLM (Claude or GPT) moderates. Each advocate must: acknowledge the strongest opposing claim → attack the weakest opposing claim → present counter-evidence → state what would change their mind. A critic then looks for logical flaws in all positions.

External LLM debate Cross-Spoke verification
Round 3 — Grounded Synthesis

Resolution & Integration

Moderator identifies: what was resolved, what remains contested, what new questions emerged. Workshop notebook grounds every claim in actual sources. Final integrated position with confidence levels and caveats.

P3.6 ground-check P3.7 multi-layer synthesis Workshop notebook
03

FUTURE Pipeline — For Working Professionals

45 minutes/week. Structured research momentum that survives a busy schedule.

Monday: F · Frame · 5 min
Tuesday: U · Upload · 10 min
Wednesday: T · Target & Think · 15 min
Thursday: U · Unpack & Use · 10 min
Friday: R · Reflect & Route · 5 min
📅
Monthly: E — Evolve & Extend
60 min deep synthesis · Research tree update · Strategic planning

25+ Prompts. Here Are 2.

Each prompt is battle-tested across the PERCEIVE → PLAN → ACT → EVALUATE cycle. Preview two of the most powerful — unlock the full library.

🔍
Gap Detection
PLAN Layer

Structural Gap Detection Engine

Scans your entire literature base across 4 gap categories. Finds what's missing that you didn't know to look for.

/* Scan all source summaries and perform structured gap analysis */

For each of the 4 categories below, identify gaps in my literature:

## Category 1: Theoretical Gaps
- Concepts used but never defined?
- Assumptions shared but never tested?
- Missing theoretical frameworks?

## Category 2: Methodological Gaps
- Overused methods?
- Underused methods?
- Missing validation?

## Category 3: Empirical Gaps
- Understudied populations/contexts?
- Correlation without causation tests?
- Time periods lacking data?

## Category 4: Synthesis Gaps
- Cross-domain contradictions?
- Unnamed repeating patterns?
- "Elephants in the room"?

For each gap, rate:
[CRITICAL/HIGH/MEDIUM/LOW] importance
[EASY/MODERATE/HARD] feasibility
[OVERLOOKED/PARTIALLY_NOTED] novelty
Grounded Synthesis
ACT Layer

Multi-Layer Synthesis with Confidence Scoring

Produces a synthesis that explicitly marks what's proven, probable, emerging, and unknown — with citations per claim.

/* Produce synthesis of [topic] with 4 confidence layers */

LAYER 1: Well-Established (multiple sources agree)
→ Confidence: HIGH
→ Requires: 3+ sources per claim

LAYER 2: Probable (2 sources agree, none disagree)
→ Confidence: MEDIUM
→ Requires: 2 sources per claim

LAYER 3: Emerging (1 strong source suggests)
→ Confidence: LOW
→ Flagged for validation

LAYER 4: Unknown (gaps identified, no data)
→ Action: Investigation needed

For each claim provide:
- [direct quote or summary]
- [confidence 0-100]
- [potential counter-evidence]
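If you track source counts per claim yourself, the four layers in the prompt above reduce to a small decision rule. The sketch below is one reading of that rule, not the package's implementation. The prompt does not say what happens when two sources agree and another disagrees, so the function treats that case as Emerging; that choice, and the `confidence_layer` name, are assumptions.

```python
def confidence_layer(supporting, contradicting=0):
    """Map source counts for one claim to a (layer, confidence) pair.

    supporting:    number of sources that agree with the claim
    contradicting: number of sources that disagree with it
    """
    if supporting >= 3:
        return ("Well-Established", "HIGH")
    if supporting == 2 and contradicting == 0:
        return ("Probable", "MEDIUM")
    if supporting >= 1:
        # Covers a single strong source and, by assumption,
        # two supporting sources with at least one dissenting source.
        return ("Emerging", "LOW")
    return ("Unknown", None)  # no data: investigation needed


for counts in [(4, 0), (2, 0), (2, 1), (1, 0), (0, 0)]:
    print(counts, confidence_layer(*counts))
```

Running the same rule over every claim in a synthesis gives a quick check that each HIGH label really does rest on three or more sources.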

OS v1 vs. OS v2

Improvements across every dimension of the research process, quantified where possible.

Dimension | OS v1 (Typical Use) | OS v2 (This System) | Δ
Notebook Setup | 1 notebook, 50 sources crammed in | Hub + 3 Spokes + Workshop | ↑ 3x capacity
Gap Detection | Manual, ask once, hope for the best | 4-category automated structural analysis | ↑ 3-5x gaps found
Perspectives | Single LLM answer (confirmation bias) | Multi-agent adversarial debate | ↑ Bias reduction
Synthesis | Trust single output, no verification | 4-layer confidence + source grounding | ↑ 40-60% accuracy
Bias Checking | None | 5-type systematic audit | ↑ 70% catch rate
Progress | No tracking, no memory | Dashboard + weekly meta-reflection | ↑ Continuous
Time to Insight | Hours of manual comparison | Structured pipeline, smart routing | ↑ ~50% faster
Scalability | ~20 papers before quality degrades | 50-100+ papers with tiered architecture | ↑ 5x scale

Get the Full OS v2 Package

25+ battle-tested prompts · 3 complete workflows · Templates · The complete multi-agent research system.

Free. No spam. One email with the complete package.