Installation

Install via the CLI:

```bash
npx @vector-labs/skills add deep-research
```

Or target a specific tool:

```bash
npx @vector-labs/skills add deep-research --tool cursor
```

Skill Files (8)

SKILL.md 6.7 KB
---
name: deep-research
description: >-
  Deep-research pipeline with multiple agents running in parallel. Generates
  markdown reports with citations and an anti-hallucination protocol. Use for
  comparative analyses, market investigations, and comprehensive research.
license: Apache-2.0
compatibility: Requires internet access
allowed-tools: Bash Read Edit Write WebFetch WebSearch Agent Glob Grep
metadata:
  author: vector-labs
  version: "1.0"
tags: [research, analysis]
complexity: advanced
featured: true
---

# Deep Research

Agent-orchestrated research pipeline: takes input text, decomposes it into parallel agent tasks, and synthesizes the results into a citation-backed report.

**Context budget**: The orchestrator stays lean. Agents write to files and return summaries. Synthesis is delegated to a sub-agent. NEVER read agent output files or reference files into orchestrator context — pass paths to sub-agents. NEVER use `run_in_background` — it causes TaskOutput to return full agent logs. If context approaches its limit before synthesis, the research is lost.

## Workflow

### 1. Analyze Input

Classify the input:
- **Question**: Direct research question → extract core topic + angles
- **Brief**: Context document with research directive → extract what to investigate
- **Seed context**: Background text that needs expansion → identify knowledge gaps

Determine complexity (drives agent count):
- **Focused** (3 agents): Single topic, clear boundaries
- **Broad** (4-5 agents): Multi-faceted topic, comparison, or trend analysis

### 2. Decompose into Agent Tasks

Break research into 3-5 **independent** investigation angles. Each angle becomes a Task agent.

**Decomposition heuristics:**
- One angle per distinct sub-question or perspective
- Separate factual retrieval from opinion/analysis sources
- Include at least one critical/contrarian angle
- If project context is relevant, dedicate one agent to local analysis

**Agent types** (see [agent-templates](./references/agent-templates.md) for full prompts):

| Type | Tools | Use when |
|------|-------|----------|
| `web-researcher` | WebSearch, WebFetch | External facts, data, current info |
| `local-analyst` | Grep, Read, Glob | Project files, meeting notes, internal docs |
| `deep-diver` | WebSearch, WebFetch | Single source/topic requiring multi-step investigation |
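When assembling agent prompts programmatically, the table can be mirrored as a plain lookup. A sketch: only the type names and tool lists come from the table above; the dictionary and helper are illustrative.

```python
# Agent types and their tools, mirroring the table in SKILL.md.
AGENT_TOOLS = {
    "web-researcher": ("WebSearch", "WebFetch"),
    "local-analyst": ("Grep", "Read", "Glob"),
    "deep-diver": ("WebSearch", "WebFetch"),
}

def tools_for(agent_type: str) -> tuple:
    """Look up the tool set for a given agent type (KeyError if unknown)."""
    return AGENT_TOOLS[agent_type]
```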

### 3. Deploy Agents (Parallel)

**CRITICAL: Launch ALL agents in a single message with multiple Task tool calls.**

**CRITICAL: Do NOT use `run_in_background: true`.** Launch all agents as parallel Task calls in a single message. They execute concurrently, and each returns ONLY the agent's final message (the 3-5 line summary). Background agents write full conversation logs to output files — reading those with TaskOutput will overflow orchestrator context.

**Context management**: Do NOT read reference files into orchestrator context. Instead, inline the relevant template from [agent-templates](./references/agent-templates.md) directly into each agent's prompt.

**Output directory**: Before launching agents, create an output directory:
`[report-directory]/research-data/`
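A minimal sketch of that setup in Python, covering both the `mkdir -p` step and the `agent-[angle-slug].md` naming used below. The helper name and the exact slug rule are assumptions.

```python
import re
from pathlib import Path

def prepare_output_dir(report_directory: str, angles: list) -> dict:
    """Create research-data/ and map each angle to its agent output file."""
    out_dir = Path(report_directory) / "research-data"
    out_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    paths = {}
    for angle in angles:
        # Illustrative slug rule: lowercase, non-alphanumerics collapsed to "-"
        slug = re.sub(r"[^a-z0-9]+", "-", angle.lower()).strip("-")
        paths[angle] = out_dir / f"agent-{slug}.md"
    return paths
```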

Each agent prompt MUST include these instructions:
1. Write full findings to a file: `[output-dir]/agent-[angle-slug].md` using the Write tool
2. Return ONLY a 3-5 line summary to the orchestrator containing:
   - File path where findings were written
   - Top 3 key findings (one line each)
   - Overall confidence level (high/medium/low)

```
[Single message — all parallel]
Task(subagent_type="general-purpose", description="Research angle A", prompt=<template with OUTPUT_DIR + file-write instructions>)
Task(subagent_type="general-purpose", description="Research angle B", prompt=<template with OUTPUT_DIR + file-write instructions>)
Task(subagent_type="general-purpose", description="Research angle C", prompt=<template with OUTPUT_DIR + file-write instructions>)
Task(subagent_type="Explore", description="Local context analysis", prompt=<template with OUTPUT_DIR + file-write instructions>)
...
```

Each agent returns ONLY a concise summary (NOT full findings) — see return format in [agent-templates](./references/agent-templates.md). Full findings are written to the agent's output file.

### 4. Synthesize & Write Report (Delegated)

**CRITICAL: Do NOT synthesize in main context.** Delegate to a synthesis sub-agent.

**Output location**: `[relevant-project-or-area-folder]/research-[topic-slug]-[YYYY-MM-DD].md`
If no clear project context, ask the user where to save.
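The naming scheme can be sketched as follows (the function name is illustrative):

```python
from datetime import date

def report_path(folder: str, topic_slug: str, on=None) -> str:
    """Build [folder]/research-[topic-slug]-[YYYY-MM-DD].md."""
    d = on or date.today()
    return f"{folder}/research-{topic_slug}-{d:%Y-%m-%d}.md"
```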

Spawn a single `general-purpose` synthesis agent using the prompt from [synthesis-templates](./references/synthesis-templates.md#synthesis-agent). Fill in:
- **RESEARCH BRIEF**: the original input context
- **AGENT SUMMARIES**: the 3-5 line summaries returned by each agent
- **OUTPUT DIRECTORY**: path to `research-data/`
- **REPORT PATH**: final report location
- **REPORT TEMPLATE PATH**: `~/.claude/skills/deep-research/templates/report_template.md`

The orchestrator receives only a summary — the full report is written to disk by the sub-agent.

### 4b. Handle Late Agents

If agents complete after synthesis, spawn a patch agent using the prompt from [synthesis-templates](./references/synthesis-templates.md#patch-agent-late-findings). Fill in the late agent's output file path and the existing report path.

### 5. Validate

Run validation after writing:
```bash
python ~/.claude/skills/deep-research/scripts/validate_report.py --report [path]
```

Optionally verify citations:
```bash
python ~/.claude/skills/deep-research/scripts/verify_citations.py --report [path]
```

If validation fails: fix and re-validate (max 2 attempts).
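The fix-and-re-validate loop amounts to a bounded retry around the validator. A sketch, where the command list and the in-loop fix step are illustrative:

```python
import subprocess

def validate_with_retry(cmd: list, max_attempts: int = 2) -> bool:
    """Run a validation command up to max_attempts times; True on first success."""
    for _ in range(max_attempts):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return True
        # A real orchestrator would fix the reported issues here before retrying.
    return False
```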

## Anti-Hallucination Protocol

- **Source grounding**: Every factual claim cites a specific source [N]
- **No fabricated citations**: If unsure a source says X, do NOT cite it
- **Label inference**: "This suggests..." not "Research shows..."
- **Admit uncertainty**: "No sources found" over invented references
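The source-grounding rule can be spot-checked mechanically by flagging sentences that carry no `[N]` marker. A naive sketch; real sentence splitting is messier:

```python
import re

def uncited_sentences(text: str) -> list:
    """Return sentences that contain no [N] citation marker (naive splitter)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s and not re.search(r"\[\d+\]", s)]
```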

## Error Handling

- Fewer than 5 sources after an exhaustive search → note the limitation and proceed with extra verification
- Agent returns empty or low-quality findings → spawn a replacement with a refined query
- 2 validation failures → stop, report the issues, and ask the user

## Scripts

- `scripts/validate_report.py` — Report quality validation
- `scripts/verify_citations.py` — Citation verification (DOI + URL checks)
- `scripts/source_evaluator.py` — Source credibility scoring (0-100)
- `scripts/citation_manager.py` — Citation tracking utilities

## References (for sub-agents, not orchestrator)

- [Agent Templates](./references/agent-templates.md) — Structured prompts for research agents. Pass to sub-agents or inline into their prompts.
- [Synthesis Templates](./references/synthesis-templates.md) — Prompts for synthesis and patch agents.
- [Report Template](./templates/report_template.md) — Report output structure. Synthesis agent reads this from disk.
references/
agent-templates.md 4.9 KB
# Agent Templates for Deep Research

Structured prompts for spawning research agents via the Task tool. All agents use `subagent_type="general-purpose"` unless noted otherwise.

**IMPORTANT**: These templates are reference documentation. Do NOT read this file into the main orchestrator context. Instead, inline the relevant template directly into each agent's prompt when spawning agents in Step 3.

## Web Researcher

For gathering external facts, data, and current information on a specific angle.

```
RESEARCH TASK: [Specific angle/question]

CONTEXT: [1-2 sentences on broader research topic and why this angle matters]

OUTPUT_DIR: [path to research-data directory]
OUTPUT_FILE: [OUTPUT_DIR]/agent-[angle-slug].md

INSTRUCTIONS:
1. Run 3-5 WebSearch queries exploring this angle from different keywords
2. For the 2-3 most promising results, use WebFetch to extract detailed content
3. Focus on: specific data points, statistics, dates, named entities, quotes
4. Write your FULL findings to OUTPUT_FILE using the Write tool (structure below)
5. Return ONLY a concise summary to the orchestrator (see RETURN FORMAT)

FINDINGS FILE STRUCTURE (write to OUTPUT_FILE):
## Findings
- **Claim**: [factual statement] | **Source**: [title](URL) | **Confidence**: high/medium/low
[repeat for each finding]

## Key Data Points
- [specific numbers, dates, statistics — always with source]

## Contradictions or Gaps
- [anything conflicting or missing]

## Sources Used
- [N] Author/Org (Year). "Title". URL
[list all sources consulted, even if not all yielded findings]

RETURN FORMAT (to orchestrator — keep this SHORT):
**File**: [OUTPUT_FILE path]
**Top findings**: 1) [finding] 2) [finding] 3) [finding]
**Confidence**: high/medium/low
```

## Local Analyst

For analyzing project files, meeting notes, or internal documentation. Use `subagent_type="Explore"`.

```
ANALYSIS TASK: [What to look for in the codebase/docs]

CONTEXT: [Broader research topic and what internal context would help]

OUTPUT_DIR: [path to research-data directory]
OUTPUT_FILE: [OUTPUT_DIR]/agent-[angle-slug].md

INSTRUCTIONS:
1. Search project files relevant to [topic] using Glob and Grep
2. Read the most relevant files
3. Extract facts, decisions, context, and data points
4. Write your FULL findings to OUTPUT_FILE using the Write tool (structure below)
5. Return ONLY a concise summary to the orchestrator (see RETURN FORMAT)

FINDINGS FILE STRUCTURE (write to OUTPUT_FILE):
## Findings from Project Context
- **Finding**: [what was found] | **Source**: [file path:line] | **Relevance**: high/medium/low
[repeat]

## Key Context
- [decisions, constraints, or background that informs the research]

## Gaps
- [what internal docs don't cover that external research should address]

RETURN FORMAT (to orchestrator — keep this SHORT):
**File**: [OUTPUT_FILE path]
**Top findings**: 1) [finding] 2) [finding] 3) [finding]
**Confidence**: high/medium/low
```

## Deep Diver

For multi-step investigation of a single source, topic, or complex question that requires following leads.

```
INVESTIGATION TASK: [Specific topic requiring deep exploration]

CONTEXT: [Why this needs depth beyond a simple search]

OUTPUT_DIR: [path to research-data directory]
OUTPUT_FILE: [OUTPUT_DIR]/agent-[angle-slug].md

INSTRUCTIONS:
1. Start with 2-3 broad WebSearch queries
2. Follow the most promising leads with WebFetch for full content
3. If a source references other important sources, search for those too
4. Build a chain of evidence on this topic
5. Aim for 5-10 high-quality sources on this specific angle
6. Write your FULL findings to OUTPUT_FILE using the Write tool (structure below)
7. Return ONLY a concise summary to the orchestrator (see RETURN FORMAT)

FINDINGS FILE STRUCTURE (write to OUTPUT_FILE):
## Investigation Summary
[2-3 paragraph narrative of what was found and how findings connect]

## Evidence Chain
- **Claim**: [statement] | **Source**: [title](URL) | **Confidence**: high/medium/low
[repeat — ordered by strength of evidence]

## Key Data Points
- [specific numbers, dates, statistics with sources]

## Open Questions
- [what remains unanswered after investigation]

## Sources Used
- [N] Author/Org (Year). "Title". URL

RETURN FORMAT (to orchestrator — keep this SHORT):
**File**: [OUTPUT_FILE path]
**Top findings**: 1) [finding] 2) [finding] 3) [finding]
**Confidence**: high/medium/low
```

## Usage Notes

- Always provide **CONTEXT** so agents understand how their angle fits the whole
- Keep agent prompts focused — one angle per agent, not the full research question
- For **Focused** research (3 agents): 2 web-researchers + 1 deep-diver
- For **Broad** research (4-5 agents): 2-3 web-researchers + 1-2 deep-divers + 1 local-analyst (if project context is relevant)
- Launch ALL agents in a single message for parallel execution
- **Every agent MUST write full findings to a file and return only a summary** — this is critical for context management
- The orchestrator should create `[output-dir]/` before launching agents (e.g., `mkdir -p` via Bash)
synthesis-templates.md 2.7 KB
# Synthesis Templates for Deep Research

Prompt templates for the synthesis and patch agents. These are read by sub-agents or inlined into sub-agent prompts — never loaded into the orchestrator's main context.

## Synthesis Agent

Spawn with `subagent_type="general-purpose"`.

```
SYNTHESIS TASK: Compile research findings into a final report.

RESEARCH BRIEF:
[Original research question/brief — paste the input context here]

AGENT SUMMARIES:
[Paste the 3-5 line summaries returned by each agent]

OUTPUT DIRECTORY: [output-dir]
  — Read all agent-*.md files in this directory for full findings

REPORT PATH: [final-report-path]
REPORT TEMPLATE PATH: [path-to-skill]/templates/report_template.md

INSTRUCTIONS:
1. Read the report template from REPORT TEMPLATE PATH
2. Read ALL agent-*.md files from the output directory
3. Merge findings — deduplicate, group by theme
4. Cross-reference — identify claims supported by multiple agents (triangulation)
5. Resolve contradictions — note them explicitly in the report
6. Generate insights — patterns, implications, second-order effects
7. If critical gaps found, note them in the Limitations section
8. Write the final report to REPORT PATH following the template structure

WRITING STANDARDS:
- Prose-first (bullets only for distinct lists)
- Every factual claim cited inline: "Market reached $2.4B [1]"
- Distinguish facts (from sources) from synthesis (your analysis)
- No vague attributions ("studies show...") — always specific: "According to [1]..."
- Admit gaps: "No sources found for X" rather than fabricating
- Citation format: [N] inline, full bibliography at end
- Bibliography format: [N] Author/Org (Year). "Title". URL (Retrieved: YYYY-MM-DD)

RETURN FORMAT (to orchestrator):
**Report written to**: [path]
**Key numbers**: [3-5 most important statistics/data points from the report]
**Overall confidence**: high/medium/low
**Gaps noted**: [any critical gaps that may need follow-up]
```

## Patch Agent (Late Findings)

Spawn with `subagent_type="general-purpose"`. Use when agents complete after synthesis or when follow-up agents return new data.

```
PATCH TASK: Update an existing research report with new findings.

LATE AGENT OUTPUT: [output-dir]/agent-[late-slug].md
EXISTING REPORT: [final-report-path]

INSTRUCTIONS:
1. Read the late agent's output file
2. Read the existing report
3. Identify high-value additions (new data points, stronger evidence, resolved gaps)
4. Edit the report to integrate additions — update relevant sections and bibliography
5. Do NOT rewrite existing content unless the new findings contradict it

RETURN FORMAT:
**Additions**: [1-2 line summary of what was added]
**Sections modified**: [list of section names touched]
```
scripts/
citation_manager.py 5.9 KB
#!/usr/bin/env python3
"""
Citation Management System
Tracks sources, generates citations, and maintains bibliography
"""

from dataclasses import dataclass, field
from typing import List, Dict, Optional
from datetime import datetime
import hashlib


@dataclass
class Citation:
    """Represents a single citation"""
    id: str
    title: str
    url: str
    authors: Optional[List[str]] = None
    publication_date: Optional[str] = None
    retrieved_date: str = field(default_factory=lambda: datetime.now().strftime('%Y-%m-%d'))
    source_type: str = "web"  # web, academic, documentation, book, paper
    doi: Optional[str] = None
    citation_count: int = 0

    def to_apa(self, index: int) -> str:
        """Generate APA format citation"""
        author_str = ""
        if self.authors:
            if len(self.authors) == 1:
                author_str = f"{self.authors[0]}."
            elif len(self.authors) == 2:
                author_str = f"{self.authors[0]} & {self.authors[1]}."
            else:
                author_str = f"{self.authors[0]} et al."

        date_str = f"({self.publication_date})" if self.publication_date else "(n.d.)"

        return f"[{index}] {author_str} {date_str}. {self.title}. Retrieved {self.retrieved_date}, from {self.url}"

    def to_inline(self, index: int) -> str:
        """Generate inline citation [index]"""
        return f"[{index}]"

    def to_markdown(self, index: int) -> str:
        """Generate markdown link format"""
        return f"[{index}] [{self.title}]({self.url}) (Retrieved: {self.retrieved_date})"


class CitationManager:
    """Manages citations and bibliography"""

    def __init__(self):
        self.citations: Dict[str, Citation] = {}
        self.citation_order: List[str] = []

    def add_source(
        self,
        url: str,
        title: str,
        authors: Optional[List[str]] = None,
        publication_date: Optional[str] = None,
        source_type: str = "web",
        doi: Optional[str] = None
    ) -> str:
        """Add a source and return its citation ID"""
        # Generate unique ID based on URL
        citation_id = hashlib.md5(url.encode()).hexdigest()[:8]

        if citation_id not in self.citations:
            citation = Citation(
                id=citation_id,
                title=title,
                url=url,
                authors=authors,
                publication_date=publication_date,
                source_type=source_type,
                doi=doi
            )
            self.citations[citation_id] = citation
            self.citation_order.append(citation_id)

        # Increment citation count
        self.citations[citation_id].citation_count += 1

        return citation_id

    def get_citation_number(self, citation_id: str) -> Optional[int]:
        """Get the citation number for a given ID"""
        try:
            return self.citation_order.index(citation_id) + 1
        except ValueError:
            return None

    def get_inline_citation(self, citation_id: str) -> str:
        """Get inline citation marker [n]"""
        num = self.get_citation_number(citation_id)
        return f"[{num}]" if num else "[?]"

    def generate_bibliography(self, style: str = "markdown") -> str:
        """Generate full bibliography"""
        if style == "markdown":
            lines = ["## Bibliography\n"]
            for i, citation_id in enumerate(self.citation_order, 1):
                citation = self.citations[citation_id]
                lines.append(citation.to_markdown(i))
            return "\n".join(lines)

        elif style == "apa":
            lines = ["## Bibliography\n"]
            for i, citation_id in enumerate(self.citation_order, 1):
                citation = self.citations[citation_id]
                lines.append(citation.to_apa(i))
            return "\n".join(lines)

        return "Unsupported citation style"

    def get_statistics(self) -> Dict[str, object]:
        """Get citation statistics"""
        return {
            'total_sources': len(self.citations),
            'total_citations': sum(c.citation_count for c in self.citations.values()),
            'source_types': self._count_by_type(),
            'most_cited': self._get_most_cited(5),
            'uncited': self._get_uncited()
        }

    def _count_by_type(self) -> Dict[str, int]:
        """Count sources by type"""
        counts = {}
        for citation in self.citations.values():
            counts[citation.source_type] = counts.get(citation.source_type, 0) + 1
        return counts

    def _get_most_cited(self, n: int = 5) -> List[tuple]:
        """Get most cited sources"""
        sorted_citations = sorted(
            self.citations.items(),
            key=lambda x: x[1].citation_count,
            reverse=True
        )
        return [(self.get_citation_number(cid), c.title, c.citation_count)
                for cid, c in sorted_citations[:n]]

    def _get_uncited(self) -> List[str]:
        """Get sources that were added but never cited"""
        return [c.title for c in self.citations.values() if c.citation_count == 0]

    def export_to_file(self, filepath: str, style: str = "markdown"):
        """Export bibliography to file"""
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(self.generate_bibliography(style))


# Example usage
if __name__ == '__main__':
    manager = CitationManager()

    # Add sources
    id1 = manager.add_source(
        url="https://example.com/article1",
        title="Understanding Deep Research",
        authors=["Smith, J.", "Johnson, K."],
        publication_date="2025"
    )

    id2 = manager.add_source(
        url="https://example.com/article2",
        title="AI Research Methods",
        source_type="academic"
    )

    # Use citations
    print(f"Inline citation: {manager.get_inline_citation(id1)}")
    print(f"\nBibliography:\n{manager.generate_bibliography()}")
    print(f"\nStatistics:\n{manager.get_statistics()}")
source_evaluator.py 9.3 KB
#!/usr/bin/env python3
"""
Source Credibility Evaluator
Assesses source quality, credibility, and potential biases
"""

from dataclasses import dataclass
from typing import List, Dict, Optional
from urllib.parse import urlparse
from datetime import datetime, timedelta
import re


@dataclass
class CredibilityScore:
    """Represents source credibility assessment"""
    overall_score: float  # 0-100
    domain_authority: float  # 0-100
    recency: float  # 0-100
    expertise: float  # 0-100
    bias_score: float  # 0-100 (higher = more neutral)
    factors: Dict[str, str]
    recommendation: str  # "high_trust", "moderate_trust", "low_trust", "verify"


class SourceEvaluator:
    """Evaluates source credibility and quality"""

    # Domain reputation tiers
    HIGH_AUTHORITY_DOMAINS = {
        # Academic & Research
        'arxiv.org', 'nature.com', 'science.org', 'cell.com', 'nejm.org',
        'thelancet.com', 'springer.com', 'sciencedirect.com', 'plos.org',
        'ieee.org', 'acm.org', 'pubmed.ncbi.nlm.nih.gov',

        # Government & International Organizations
        'nih.gov', 'cdc.gov', 'who.int', 'fda.gov', 'nasa.gov',
        'gov.uk', 'europa.eu', 'un.org',

        # Established Tech Documentation
        'docs.python.org', 'developer.mozilla.org', 'docs.microsoft.com',
        'cloud.google.com', 'aws.amazon.com', 'kubernetes.io',

        # Reputable News (Fact-check verified)
        'reuters.com', 'apnews.com', 'bbc.com', 'economist.com',
        'scientificamerican.com'
    }

    MODERATE_AUTHORITY_DOMAINS = {
        # Tech News & Analysis
        'techcrunch.com', 'theverge.com', 'arstechnica.com', 'wired.com',
        'zdnet.com', 'cnet.com',

        # Industry Publications
        'forbes.com', 'bloomberg.com', 'wsj.com', 'ft.com',

        # Educational
        'wikipedia.org', 'britannica.com', 'khanacademy.org',

        # Tech Blogs (established)
        'medium.com', 'dev.to', 'stackoverflow.com', 'github.com'
    }

    LOW_AUTHORITY_INDICATORS = [
        'blogspot.com', 'wordpress.com', 'wix.com', 'substack.com'
    ]

    def __init__(self):
        pass

    def evaluate_source(
        self,
        url: str,
        title: str,
        content: Optional[str] = None,
        publication_date: Optional[str] = None,
        author: Optional[str] = None
    ) -> CredibilityScore:
        """Evaluate source credibility"""

        domain = self._extract_domain(url)

        # Calculate component scores
        domain_score = self._evaluate_domain_authority(domain)
        recency_score = self._evaluate_recency(publication_date)
        expertise_score = self._evaluate_expertise(domain, title, author)
        bias_score = self._evaluate_bias(domain, title, content)

        # Calculate overall score (weighted average)
        overall = (
            domain_score * 0.35 +
            recency_score * 0.20 +
            expertise_score * 0.25 +
            bias_score * 0.20
        )

        # Determine factors
        factors = self._identify_factors(
            domain, domain_score, recency_score, expertise_score, bias_score
        )

        # Generate recommendation
        recommendation = self._generate_recommendation(overall)

        return CredibilityScore(
            overall_score=round(overall, 2),
            domain_authority=round(domain_score, 2),
            recency=round(recency_score, 2),
            expertise=round(expertise_score, 2),
            bias_score=round(bias_score, 2),
            factors=factors,
            recommendation=recommendation
        )

    def _extract_domain(self, url: str) -> str:
        """Extract domain from URL"""
        parsed = urlparse(url)
        domain = parsed.netloc.lower()
        # Strip only a leading "www." (str.replace would also hit mid-string matches)
        domain = domain.removeprefix('www.')
        return domain

    def _evaluate_domain_authority(self, domain: str) -> float:
        """Evaluate domain authority (0-100)"""
        if domain in self.HIGH_AUTHORITY_DOMAINS:
            return 90.0
        elif domain in self.MODERATE_AUTHORITY_DOMAINS:
            return 70.0
        elif any(indicator in domain for indicator in self.LOW_AUTHORITY_INDICATORS):
            return 40.0
        else:
            # Unknown domain - moderate skepticism
            return 55.0

    def _evaluate_recency(self, publication_date: Optional[str]) -> float:
        """Evaluate information recency (0-100)"""
        if not publication_date:
            return 50.0  # Unknown date

        try:
            pub_date = datetime.fromisoformat(publication_date.replace('Z', '+00:00'))
            # Drop tzinfo so subtracting from the naive datetime.now() cannot raise
            if pub_date.tzinfo is not None:
                pub_date = pub_date.replace(tzinfo=None)
            age = datetime.now() - pub_date

            # Recency scoring
            if age < timedelta(days=90):  # < 3 months
                return 100.0
            elif age < timedelta(days=365):  # < 1 year
                return 85.0
            elif age < timedelta(days=730):  # < 2 years
                return 70.0
            elif age < timedelta(days=1825):  # < 5 years
                return 50.0
            else:
                return 30.0

        except Exception:
            return 50.0

    def _evaluate_expertise(
        self,
        domain: str,
        title: str,
        author: Optional[str]
    ) -> float:
        """Evaluate source expertise (0-100)"""
        score = 50.0

        # Academic/research domains get high expertise
        if any(d in domain for d in ['arxiv', 'nature', 'science', 'ieee', 'acm']):
            score += 30

        # Government/official sources
        if '.gov' in domain or 'who.int' in domain:
            score += 25

        # Technical documentation
        if 'docs.' in domain or 'documentation' in title.lower():
            score += 20

        # Author credentials (if available); avoid shadowing the `title` parameter
        if author:
            author_lower = author.lower()
            if any(cred in author_lower for cred in ['dr.', 'phd', 'professor']):
                score += 15

        return min(score, 100.0)

    def _evaluate_bias(
        self,
        domain: str,
        title: str,
        content: Optional[str]
    ) -> float:
        """Evaluate potential bias (0-100, higher = more neutral)"""
        score = 70.0  # Start neutral

        # Check for sensationalism in title
        sensational_indicators = [
            '!', 'shocking', 'unbelievable', 'you won\'t believe',
            'secret', 'they don\'t want you to know'
        ]
        title_lower = title.lower()
        if any(indicator in title_lower for indicator in sensational_indicators):
            score -= 20

        # Academic sources are typically less biased
        if any(d in domain for d in ['arxiv', 'nature', 'science', 'ieee']):
            score += 20

        # Check for balance in content (if available)
        if content:
            # Look for balanced language
            balanced_indicators = ['however', 'although', 'on the other hand', 'critics argue']
            if any(indicator in content.lower() for indicator in balanced_indicators):
                score += 10

        return min(max(score, 0), 100.0)

    def _identify_factors(
        self,
        domain: str,
        domain_score: float,
        recency_score: float,
        expertise_score: float,
        bias_score: float
    ) -> Dict[str, str]:
        """Identify key credibility factors"""
        factors = {}

        if domain_score >= 85:
            factors['domain'] = "High authority domain"
        elif domain_score <= 45:
            factors['domain'] = "Low authority domain - verify claims"

        if recency_score >= 85:
            factors['recency'] = "Recent information"
        elif recency_score <= 40:
            factors['recency'] = "Outdated information - verify currency"

        if expertise_score >= 80:
            factors['expertise'] = "Expert source"
        elif expertise_score <= 45:
            factors['expertise'] = "Limited expertise indicators"

        if bias_score >= 80:
            factors['bias'] = "Balanced perspective"
        elif bias_score <= 50:
            factors['bias'] = "Potential bias detected"

        return factors

    def _generate_recommendation(self, overall_score: float) -> str:
        """Generate trust recommendation"""
        if overall_score >= 80:
            return "high_trust"
        elif overall_score >= 60:
            return "moderate_trust"
        elif overall_score >= 40:
            return "low_trust"
        else:
            return "verify"


# Example usage
if __name__ == '__main__':
    evaluator = SourceEvaluator()

    # Test sources
    test_sources = [
        {
            'url': 'https://www.nature.com/articles/s41586-2025-12345',
            'title': 'Breakthrough in Quantum Computing',
            'publication_date': '2025-10-15'
        },
        {
            'url': 'https://someblog.wordpress.com/shocking-discovery',
            'title': 'SHOCKING! You Won\'t Believe This Discovery!',
            'publication_date': '2020-01-01'
        },
        {
            'url': 'https://docs.python.org/3/library/asyncio.html',
            'title': 'asyncio — Asynchronous I/O',
            'publication_date': '2025-11-01'
        }
    ]

    for source in test_sources:
        score = evaluator.evaluate_source(**source)
        print(f"\nSource: {source['title']}")
        print(f"URL: {source['url']}")
        print(f"Overall Score: {score.overall_score}/100")
        print(f"Recommendation: {score.recommendation}")
        print(f"Factors: {score.factors}")
validate_report.py 6.2 KB
#!/usr/bin/env python3
"""
Report Validation Script
Validates research reports for quality standards.
"""

import argparse
import re
import sys
from pathlib import Path
from typing import List


class ReportValidator:
    """Validates research report quality"""

    def __init__(self, report_path: Path):
        self.report_path = report_path
        self.content = self._read_report()
        self.errors: List[str] = []
        self.warnings: List[str] = []

    def _read_report(self) -> str:
        try:
            with open(self.report_path, 'r', encoding='utf-8') as f:
                return f.read()
        except Exception as e:
            print(f"ERROR: Cannot read report: {e}")
            sys.exit(1)

    def validate(self) -> bool:
        print(f"\n{'='*60}")
        print(f"VALIDATING: {self.report_path.name}")
        print(f"{'='*60}\n")

        checks = [
            ("Has sections", self._check_has_sections),
            ("Citations present", self._check_citations),
            ("Bibliography", self._check_bibliography),
            ("No placeholders", self._check_placeholders),
            ("No truncation", self._check_content_truncation),
            ("Word count", self._check_word_count),
        ]

        for check_name, check_func in checks:
            print(f"  Checking: {check_name}...", end=" ")
            passed = check_func()
            print("PASS" if passed else "FAIL")

        self._print_summary()
        return len(self.errors) == 0

    def _check_has_sections(self) -> bool:
        """Check report has meaningful structure (at least 3 ## headings)"""
        sections = re.findall(r'^## .+$', self.content, re.MULTILINE)
        if len(sections) < 3:
            self.errors.append(f"Too few sections: {len(sections)} (need at least 3)")
            return False

        # Warn if missing common sections
        content_lower = self.content.lower()
        if 'bibliography' not in content_lower:
            self.warnings.append("No Bibliography section found")
        if 'executive summary' not in content_lower and 'summary' not in content_lower:
            self.warnings.append("No Executive Summary section found")

        return True

    def _check_citations(self) -> bool:
        """Check citations exist and are reasonably numbered"""
        citations = re.findall(r'\[(\d+)\]', self.content)
        if not citations:
            self.errors.append("No citations [N] found in report")
            return False

        unique = set(citations)
        if len(unique) < 5:
            self.warnings.append(f"Only {len(unique)} unique sources (consider expanding)")

        return True

    def _check_bibliography(self) -> bool:
        """Check bibliography exists and has entries matching citations"""
        pattern = r'## Bibliography(.*?)(?=##|\Z)'
        match = re.search(pattern, self.content, re.DOTALL | re.IGNORECASE)

        if not match:
            self.warnings.append("Missing Bibliography section")
            return True  # Warning, not error

        bib_section = match.group(1)

        # Check for truncation placeholders
        truncation_patterns = [
            (r'\[\d+-\d+\]', 'Citation range (e.g., [8-75])'),
            (r'Additional.*citations', '"Additional citations"'),
            (r'\[Continue with', '"[Continue with"'),
        ]
        for pattern_re, description in truncation_patterns:
            if re.search(pattern_re, bib_section, re.IGNORECASE):
                self.errors.append(f"Bibliography truncated: {description}")
                return False

        # Check entries exist
        bib_entries = re.findall(r'^\[(\d+)\]', bib_section, re.MULTILINE)
        if not bib_entries:
            self.errors.append("Bibliography has no entries")
            return False

        # Check citations in text have bib entries
        text_citations = set(re.findall(r'\[(\d+)\]', self.content))
        bib_citations = set(bib_entries)
        missing = text_citations - bib_citations
        if missing:
            self.warnings.append(f"Citations missing from bibliography: {sorted(missing)}")

        return True

    def _check_placeholders(self) -> bool:
        placeholders = ['TBD', 'TODO', 'FIXME', '[citation needed]', '[placeholder]']
        found = [p for p in placeholders if p in self.content]
        if found:
            self.errors.append(f"Placeholder text found: {', '.join(found)}")
            return False
        return True

    def _check_content_truncation(self) -> bool:
        patterns = [
            (r'Content continues', '"Content continues"'),
            (r'Due to length', '"Due to length"'),
            (r'\[Sections \d+-\d+', '"[Sections X-Y"'),
        ]
        for pattern_re, description in patterns:
            if re.search(pattern_re, self.content, re.IGNORECASE):
                self.errors.append(f"Content truncation: {description}")
                return False
        return True

    def _check_word_count(self) -> bool:
        word_count = len(self.content.split())
        if word_count < 500:
            self.warnings.append(f"Short report: {word_count} words")
        return True

    def _print_summary(self):
        print(f"\n{'='*60}")
        if self.errors:
            print(f"ERRORS ({len(self.errors)}):")
            for e in self.errors:
                print(f"  - {e}")
        if self.warnings:
            print(f"WARNINGS ({len(self.warnings)}):")
            for w in self.warnings:
                print(f"  - {w}")
        if not self.errors and not self.warnings:
            print("ALL CHECKS PASSED")
        elif not self.errors:
            print("PASSED (with warnings)")
        else:
            print("FAILED — fix errors before delivery")
        print(f"{'='*60}\n")


def main():
    parser = argparse.ArgumentParser(description="Validate research report")
    parser.add_argument('--report', '-r', type=str, required=True, help='Path to report')
    args = parser.parse_args()

    report_path = Path(args.report)
    if not report_path.exists():
        print(f"ERROR: Not found: {report_path}")
        sys.exit(1)

    validator = ReportValidator(report_path)
    passed = validator.validate()
    sys.exit(0 if passed else 1)


if __name__ == '__main__':
    main()
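The bibliography cross-check in `validate_report.py` reduces to two regex scans and a set difference. A minimal standalone sketch of that logic (the sample report string is hypothetical):

```python
import re

report = """## Findings
Async I/O improves throughput [1], though benchmarks vary [2][3].

## Bibliography
[1] Python Docs (2025). "asyncio". https://docs.python.org/3/library/asyncio.html
[2] Example Org (2024). "Benchmarks". https://example.org/bench
"""

# Citation markers anywhere in the document (includes the bibliography itself)
text_citations = set(re.findall(r'\[(\d+)\]', report))

# Entries actually defined in the Bibliography section (line-initial [N])
bib = re.search(r'## Bibliography(.*?)(?=##|\Z)', report, re.DOTALL | re.IGNORECASE)
bib_entries = set(re.findall(r'^\[(\d+)\]', bib.group(1), re.MULTILINE))

missing = sorted(text_citations - bib_entries)
print(missing)  # ['3'] -- cited in the text but absent from the bibliography
```

`validate_report.py` reports this condition as a warning rather than an error, so a report with dangling citations still exits 0 unless other checks fail.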
verify_citations.py 15.2 KB
#!/usr/bin/env python3
"""
Citation Verification Script (Enhanced with CiteGuard techniques)

Catches fabricated citations by checking:
1. DOI resolution (via doi.org)
2. Basic metadata matching (title similarity, year match)
3. URL accessibility verification
4. Hallucination pattern detection (generic titles, suspicious patterns)
5. Flags suspicious entries for manual review

Enhanced in 2025 with:
- Content alignment checking (when URL available)
- Multi-source verification (DOI + URL + metadata cross-check)
- Advanced hallucination detection patterns
- Better false positive reduction

Usage:
    python verify_citations.py --report [path]
    python verify_citations.py --report [path] --strict  # Fail on any unverified

Does NOT require API keys - uses free DOI resolver and heuristics.
"""

import sys
import argparse
import re
from pathlib import Path
from typing import List, Dict, Tuple
from urllib import request, error
from urllib.parse import quote
import json
import time

class CitationVerifier:
    """Verify citations in research report"""

    def __init__(self, report_path: Path, strict_mode: bool = False):
        self.report_path = report_path
        self.strict_mode = strict_mode
        self.content = self._read_report()
        self.suspicious = []
        self.verified = []
        self.errors = []

        # Hallucination detection patterns (2025 CiteGuard enhancement)
        self.suspicious_patterns = [
            # Generic academic-sounding but fake patterns
            (r'^(A |An |The )?(Study|Analysis|Review|Survey|Investigation) (of|on|into)',
             "Generic academic title pattern"),
            (r'^(Recent|Current|Modern|Contemporary) (Advances|Developments|Trends) in',
             "Generic 'advances' title pattern"),
            # Too perfect, templated titles
            (r'^[A-Z][a-z]+ [A-Z][a-z]+: A (Comprehensive|Complete|Systematic) (Review|Analysis|Guide)$',
             "Too perfect, templated structure"),
        ]

    def _read_report(self) -> str:
        """Read report file"""
        try:
            with open(self.report_path, 'r', encoding='utf-8') as f:
                return f.read()
        except Exception as e:
            print(f"ERROR: Cannot read report: {e}")
            sys.exit(1)

    def extract_bibliography(self) -> List[Dict]:
        """Extract bibliography entries from report"""
        pattern = r'## Bibliography(.*?)(?=##|\Z)'
        match = re.search(pattern, self.content, re.DOTALL | re.IGNORECASE)

        if not match:
            self.errors.append("No Bibliography section found")
            return []

        bib_section = match.group(1)

        # Parse entries: [N] Author (Year). "Title". Venue. URL
        entries = []
        lines = bib_section.strip().split('\n')

        current_entry = None
        for line in lines:
            line = line.strip()
            if not line:
                continue

            # Check if starts with citation number [N]
            match_num = re.match(r'^\[(\d+)\]\s+(.+)$', line)
            if match_num:
                if current_entry:
                    entries.append(current_entry)

                num = match_num.group(1)
                rest = match_num.group(2)

                # Try to parse: Author (Year). "Title". Venue. URL
                year_match = re.search(r'\((\d{4})\)', rest)
                title_match = re.search(r'"([^"]+)"', rest)
                doi_match = re.search(r'doi\.org/(10\.\S+)', rest)
                url_match = re.search(r'https?://[^\s\)]+', rest)

                current_entry = {
                    'num': num,
                    'raw': rest,
                    'year': year_match.group(1) if year_match else None,
                    'title': title_match.group(1) if title_match else None,
                    'doi': doi_match.group(1) if doi_match else None,
                    'url': url_match.group(0) if url_match else None
                }
            elif current_entry:
                # Multi-line entry, append to raw
                current_entry['raw'] += ' ' + line

        if current_entry:
            entries.append(current_entry)

        return entries

    def verify_doi(self, doi: str) -> Tuple[bool, Dict]:
        """
        Verify DOI exists and get metadata.
        Returns (success, metadata_dict)
        """
        if not doi:
            return False, {}

        try:
            # Use content negotiation to get JSON metadata
            url = f"https://doi.org/{quote(doi)}"
            req = request.Request(url)
            req.add_header('Accept', 'application/vnd.citationstyles.csl+json')

            with request.urlopen(req, timeout=10) as response:
                data = json.loads(response.read().decode('utf-8'))

                return True, {
                    'title': data.get('title', ''),
                    'year': data.get('issued', {}).get('date-parts', [[None]])[0][0],
                    'authors': [
                        f"{a.get('family', '')} {a.get('given', '')}"
                        for a in data.get('author', [])
                    ],
                    'venue': data.get('container-title', '')
                }
        except error.HTTPError as e:
            if e.code == 404:
                return False, {'error': 'DOI not found (404)'}
            return False, {'error': f'HTTP {e.code}'}
        except Exception as e:
            return False, {'error': str(e)}

    def verify_url(self, url: str) -> Tuple[bool, str]:
        """
        Verify URL is accessible (2025 CiteGuard enhancement).
        Returns (accessible, status_message)
        """
        if not url:
            return False, "No URL"

        try:
            # HEAD request to check accessibility without downloading
            req = request.Request(url, method='HEAD')
            req.add_header('User-Agent', 'Mozilla/5.0 (Research Citation Verifier)')

            with request.urlopen(req, timeout=10) as response:
                if response.status == 200:
                    return True, "URL accessible"
                else:
                    return False, f"HTTP {response.status}"
        except error.HTTPError as e:
            return False, f"HTTP {e.code}"
        except error.URLError as e:
            return False, f"URL error: {e.reason}"
        except Exception as e:
            return False, f"Connection error: {str(e)[:50]}"

    def detect_hallucination_patterns(self, entry: Dict) -> List[str]:
        """
        Detect common LLM hallucination patterns in citations (2025 CiteGuard).
        Returns list of detected issues.
        """
        issues = []
        title = entry.get('title', '')

        if not title:
            return issues

        # Check against suspicious patterns
        for pattern, description in self.suspicious_patterns:
            if re.match(pattern, title, re.IGNORECASE):
                issues.append(f"Suspicious title pattern: {description}")

        # Check for overly generic titles
        generic_words = ['overview', 'introduction', 'guide', 'handbook', 'manual']
        if any(word in title.lower() for word in generic_words) and len(title.split()) < 5:
            issues.append("Very generic short title")

        # Check for placeholder-like titles
        if any(x in title.lower() for x in ['tbd', 'todo', 'placeholder', 'example']):
            issues.append("Placeholder text in title")

        # Check for inconsistent metadata
        if entry.get('year'):
            year = int(entry['year'])
            current_year = time.localtime().tm_year
            # Very recent without DOI or URL is suspicious
            if year >= current_year - 1 and not entry.get('doi') and not entry.get('url'):
                issues.append(f"Recent year ({year}) with no verification method")
            # Future year is definitely wrong
            if year > current_year:
                issues.append(f"Future year: {year}")
            # Very old with modern phrasing is suspicious (word-boundary match
            # so 'ai' does not fire inside words like 'maintain')
            if year < 2000 and re.search(r'\b(ai|llm|gpt|transformer)\b', title, re.IGNORECASE):
                issues.append(f"Anachronistic: pre-2000 ({year}) citation mentioning modern AI terms")

        return issues

    def check_title_similarity(self, title1: str, title2: str) -> float:
        """
        Simple title similarity check (word overlap).
        Returns score 0.0-1.0
        """
        if not title1 or not title2:
            return 0.0

        # Normalize: lowercase, remove punctuation, split
        def normalize(s):
            s = s.lower()
            s = re.sub(r'[^\w\s]', ' ', s)
            return set(s.split())

        words1 = normalize(title1)
        words2 = normalize(title2)

        if not words1 or not words2:
            return 0.0

        overlap = len(words1 & words2)
        total = len(words1 | words2)

        return overlap / total if total > 0 else 0.0

    def verify_entry(self, entry: Dict) -> Dict:
        """Verify a single bibliography entry (Enhanced 2025 with CiteGuard)"""
        result = {
            'num': entry['num'],
            'status': 'unknown',
            'issues': [],
            'metadata': {},
            'verification_methods': []
        }

        # STEP 1: Run hallucination detection (CiteGuard 2025)
        hallucination_issues = self.detect_hallucination_patterns(entry)
        if hallucination_issues:
            result['issues'].extend(hallucination_issues)
            result['status'] = 'suspicious'

        # STEP 2: Has DOI?
        if entry['doi']:
            print(f"  [{entry['num']}] Checking DOI {entry['doi']}...", end=' ')
            success, metadata = self.verify_doi(entry['doi'])

            if success:
                result['metadata'] = metadata
                result['status'] = 'verified'
                print("✓")

                # Check title similarity if we have both
                if entry['title'] and metadata.get('title'):
                    similarity = self.check_title_similarity(
                        entry['title'],
                        metadata['title']
                    )

                    if similarity < 0.5:
                        result['issues'].append(
                            f"Title mismatch (similarity: {similarity:.1%})"
                        )
                        result['status'] = 'suspicious'

                # Check year match
                if entry['year'] and metadata.get('year'):
                    if int(entry['year']) != int(metadata['year']):
                        result['issues'].append(
                            f"Year mismatch: report says {entry['year']}, DOI says {metadata['year']}"
                        )
                        result['status'] = 'suspicious'

            else:
                print(f"✗ {metadata.get('error', 'Failed')}")
                result['status'] = 'unverified'
                result['issues'].append(f"DOI resolution failed: {metadata.get('error', 'unknown')}")

        # STEP 3: Check URL accessibility (if no DOI or DOI failed)
        if entry['url'] and result['status'] != 'verified':
            url_ok, url_status = self.verify_url(entry['url'])
            if url_ok:
                result['verification_methods'].append('URL')
                # Upgrade status if URL verifies
                if result['status'] in ['unknown', 'no_doi', 'unverified']:
                    result['status'] = 'url_verified'
                print(f"  [{entry['num']}] URL accessible ✓")
            else:
                result['issues'].append(f"URL check failed: {url_status}")

        # STEP 4: Final fallback - no verification method available
        if not entry['doi'] and not entry['url']:
            result['issues'].append("No DOI or URL - cannot verify")
            result['status'] = 'suspicious'

        return result

    def verify_all(self):
        """Verify all bibliography entries"""
        print(f"\n{'='*60}")
        print(f"CITATION VERIFICATION: {self.report_path.name}")
        print(f"{'='*60}\n")

        entries = self.extract_bibliography()

        if not entries:
            print("No bibliography entries found\n")
            return False

        print(f"Found {len(entries)} citations\n")

        results = []
        for entry in entries:
            result = self.verify_entry(entry)
            results.append(result)

            # Rate limiting
            time.sleep(0.5)

        # Summarize
        print(f"\n{'='*60}")
        print("VERIFICATION SUMMARY")
        print(f"{'='*60}\n")

        verified = [r for r in results if r['status'] == 'verified']
        url_verified = [r for r in results if r['status'] == 'url_verified']
        suspicious = [r for r in results if r['status'] == 'suspicious']
        unverified = [r for r in results if r['status'] in ['unverified', 'no_doi', 'unknown']]

        print(f'DOI Verified: {len(verified)}/{len(results)}')
        print(f'URL Verified: {len(url_verified)}/{len(results)}')
        print(f'Suspicious: {len(suspicious)}/{len(results)}')
        print(f'Unverified: {len(unverified)}/{len(results)}')
        print()

        if suspicious:
            print('SUSPICIOUS CITATIONS (Manual Review Needed):')
            for r in suspicious:
                print(f"\n  [{r['num']}]")
                for issue in r['issues']:
                    print(f"    - {issue}")
            print()

        if unverified:
            print('UNVERIFIED CITATIONS (Could not check):')
            for r in unverified:
                print(f"  [{r['num']}] {r['issues'][0] if r['issues'] else 'Unknown'}")
            print()

        # Decision (Enhanced 2025 - includes URL-verified as acceptable)
        total_verified = len(verified) + len(url_verified)

        if suspicious:
            print('WARNING: Suspicious citations detected')
            if self.strict_mode:
                print('  STRICT MODE: Failing due to suspicious citations')
                return False
            else:
                print('  (Continuing in non-strict mode)')

        if self.strict_mode and unverified:
            print('STRICT MODE: Unverified citations found')
            return False

        if total_verified / len(results) < 0.5:
            print('WARNING: Less than 50% citations verified')
            return True  # Pass with warning
        else:
            print('CITATION VERIFICATION PASSED')
            return True


def main():
    parser = argparse.ArgumentParser(
        description="Verify citations in research report",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  python verify_citations.py --report report.md

Note: Requires internet connection to check DOIs.
Uses free DOI resolver - no API key needed.
        """
    )

    parser.add_argument(
        '--report', '-r',
        type=str,
        required=True,
        help='Path to research report markdown file'
    )

    parser.add_argument(
        '--strict',
        action='store_true',
        help='Strict mode: fail on any unverified or suspicious citations'
    )

    args = parser.parse_args()
    report_path = Path(args.report)

    if not report_path.exists():
        print(f"ERROR: Report file not found: {report_path}")
        sys.exit(1)

    verifier = CitationVerifier(report_path, strict_mode=args.strict)
    passed = verifier.verify_all()

    sys.exit(0 if passed else 1)


if __name__ == '__main__':
    main()
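The title-similarity gate in `check_title_similarity` is plain Jaccard overlap on normalized word sets; anything below 0.5 against DOI metadata marks the entry suspicious. A standalone sketch of the same scoring (the example titles are illustrative):

```python
import re

def jaccard_title_similarity(title1: str, title2: str) -> float:
    """Word-overlap score in [0, 1], mirroring check_title_similarity."""
    def normalize(s: str) -> set:
        # Lowercase, strip punctuation, split into a word set
        return set(re.sub(r'[^\w\s]', ' ', s.lower()).split())
    w1, w2 = normalize(title1), normalize(title2)
    if not w1 or not w2:
        return 0.0
    return len(w1 & w2) / len(w1 | w2)

# Same title up to case and punctuation: full overlap
print(jaccard_title_similarity("Attention Is All You Need!",
                               "attention is all you need"))  # 1.0

# Unrelated titles fall well below the 0.5 'suspicious' threshold
score = jaccard_title_similarity("Deep Learning",
                                 "A Survey of Reinforcement Learning")
print(round(score, 2))  # 0.17
```

Because the score ignores word order and multiplicity, a reordered or lightly paraphrased title still passes, while a fabricated title sharing only generic words (study, analysis, review) is caught.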
templates/
report_template.md 1.1 KB
# Research Report: [Topic]

## Executive Summary

[3-5 key findings in prose. Primary recommendation. Confidence level: High/Medium/Low.]

---

## Introduction

### Research Question
[Original question or research directive]

### Scope & Methodology
[What was investigated, boundaries, agent strategy used, sources consulted]

---

## Main Analysis

### Finding 1: [Descriptive Title]
[Prose paragraphs with evidence. Specific data, statistics, dates. Citations inline [1].]

### Finding 2: [Descriptive Title]
[Continue for all findings — as many as evidence warrants]

---

## Synthesis & Insights

### Patterns
[Cross-cutting patterns across findings]

### Implications
[What this means for the research question. Second-order effects.]

---

## Limitations & Caveats
[Known gaps, contradictions, areas of uncertainty. Be honest.]

---

## Recommendations
[Actionable next steps based on findings]

---

## Bibliography

[1] Author/Org (Year). "Title". URL (Retrieved: YYYY-MM-DD)
[2] ...

---

## Methodology Appendix
[Research process: agents deployed, sources consulted, verification approach]

License (Apache-2.0)

Licensed under Apache-2.0