
Automated PR Review Service

Robert Matsuoka · Updated 2026-03-11

Tags: engineering-platform, code-intelligence, developer-experience, cost-optimization, automation, q1-2026

Initiative: Automated PR Review Service

The Bet

We believe that by building an automated PR review service on top of mcp-vector-search and duetto-code-intelligence, we can:

  • Replace CodeRabbit/Augment with an internal solution
  • Save ~$50/seat/month across the entire engineering organization
  • Match or exceed commercial tool quality with vector search + knowledge graph context
  • Maintain full control over code analysis infrastructure and data

Background

Current State

Commercial Tools in Use:

  • CodeRabbit or Augment for automated PR reviews
  • Cost: ~$50/seat/month per developer
  • Annual cost (50 engineers): ~$30,000/year
  • Limited customization and integration with internal tooling

New Capabilities Available:

  • mcp-vector-search now has comprehensive code review capabilities:
      • analyze review command with security, architecture, and performance analysis
      • Vector search + knowledge graph for deep codebase context
      • SARIF output format for standardized findings
      • Specialized LLM prompts per review type (OWASP, CWE, SOLID)
  • duetto-code-intelligence provides the foundation for GitHub integration

The Problem

Pain Points with Commercial Tools:

  1. Cost: $30K+/year for features we can build internally
  2. Limited Context: Generic code analysis without Duetto-specific knowledge
  3. No Customization: Can't adapt review criteria to Duetto standards
  4. Data Privacy: External services analyzing proprietary code
  5. Integration Gaps: Doesn't understand Duetto architecture patterns

The Opportunity

What We Can Build:

  • Automated service monitoring all Duetto GitHub org PRs
  • Leverages mcp-vector-search review capabilities:
      • Security review (OWASP/CWE vulnerabilities)
      • Architecture review (SOLID principles, patterns)
      • Performance review (complexity, optimizations)
  • Contextual analysis using vector search + knowledge graph:
      • Understands related code across the repository
      • Recognizes Duetto-specific patterns and conventions
      • Learns from historical PR reviews and decisions
  • PR comments as the first deliverable (GitHub integration)

Solution Design

Architecture

GitHub Webhooks
    ↓
Automated PR Review Service (duetto-code-intelligence based)
    ↓
mcp-vector-search Code Review Engine
    ↓
    ├─ Vector Search (find relevant code context)
    ├─ Knowledge Graph (understand relationships)
    ├─ LLM Analysis (Claude 3.5 Sonnet via AWS Bedrock)
    └─ SARIF Output (structured findings)
    ↓
GitHub API (post PR comments)

Components

1. PR Monitor Service

  • GitHub webhook listener for all Duetto org repositories
  • Filters: new PRs, PR updates, specific labels
  • Queue management for processing PRs
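The listener's entry-point logic can be sketched as follows. The function names and the enqueue step are illustrative assumptions, but the X-Hub-Signature-256 HMAC verification and the pull_request event actions are standard GitHub webhook behavior:

```python
import hmac
import hashlib
import json

def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check GitHub's HMAC-SHA256 webhook signature (constant-time compare)."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

def should_enqueue(event: str, payload: dict) -> bool:
    """Filter deliveries: only newly opened or updated pull requests."""
    if event != "pull_request":
        return False
    return payload.get("action") in {"opened", "synchronize", "reopened"}

# Example delivery (secret and payload are made up for illustration)
secret = b"webhook-secret"
body = json.dumps({"action": "opened", "number": 42}).encode()
sig = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

if verify_signature(secret, body, sig):
    payload = json.loads(body)
    if should_enqueue("pull_request", payload):
        print(f"enqueue PR #{payload['number']}")  # hand off to the review queue
```

Rejecting unsigned or mis-signed deliveries before parsing keeps the service from processing spoofed webhook calls.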

2. Review Orchestrator

  • Clones PR branch and target branch
  • Runs mcp-vector-search review analysis:
      • Security review (vulnerabilities, secrets)
      • Architecture review (patterns, SOLID principles)
      • Performance review (complexity, bottlenecks)
  • Aggregates findings across all review types
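The aggregation step might look like this sketch. The result shapes are simplified assumptions borrowing SARIF field names (ruleId, level), not the actual mcp-vector-search output; de-duplication and severity ordering are the point:

```python
# Merge the per-review-type result lists, drop duplicate findings, and
# order by severity so the most important issues surface first.
SEVERITY_ORDER = {"error": 0, "warning": 1, "note": 2}

def aggregate(runs: dict[str, list[dict]]) -> list[dict]:
    seen = set()
    merged = []
    for review_type, results in runs.items():
        for r in results:
            key = (r["ruleId"], r["file"], r["line"])
            if key in seen:
                continue  # same finding reported by two review types
            seen.add(key)
            merged.append({**r, "review_type": review_type})
    merged.sort(key=lambda r: (SEVERITY_ORDER.get(r["level"], 3), r["file"], r["line"]))
    return merged

findings = aggregate({
    "security": [{"ruleId": "CWE-798", "file": "auth.py", "line": 10, "level": "error"}],
    "architecture": [{"ruleId": "SOLID-SRP", "file": "svc.py", "line": 5, "level": "warning"},
                     {"ruleId": "CWE-798", "file": "auth.py", "line": 10, "level": "error"}],
})
```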

3. Context Engine (mcp-vector-search)

  • Vector search for similar code and historical fixes
  • Knowledge graph for understanding code relationships
  • Duetto-specific pattern recognition

4. Comment Formatter

  • Converts SARIF findings to GitHub PR comment format
  • Prioritizes critical/high findings
  • Provides actionable recommendations with context
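A minimal formatter sketch, assuming SARIF 2.1.0-shaped input (runs → results → ruleId/level/message/locations); the Markdown layout and severity ordering are illustrative choices, not a fixed format:

```python
def sarif_to_comment(sarif: dict) -> str:
    """Flatten a SARIF log into one Markdown PR comment, errors first."""
    order = {"error": 0, "warning": 1, "note": 2}
    results = []
    for run in sarif.get("runs", []):
        results.extend(run.get("results", []))
    results.sort(key=lambda r: order.get(r.get("level", "note"), 3))
    lines = ["## Automated PR Review"]
    for r in results:
        loc = r["locations"][0]["physicalLocation"]
        where = f'{loc["artifactLocation"]["uri"]}:{loc["region"]["startLine"]}'
        level = r.get("level", "note").upper()
        lines.append(f'- **[{level}]** {r["ruleId"]} at `{where}`: {r["message"]["text"]}')
    return "\n".join(lines)

# Example: one security finding (illustrative values)
sarif = {"runs": [{"results": [
    {"ruleId": "CWE-89", "level": "error",
     "message": {"text": "Possible SQL injection"},
     "locations": [{"physicalLocation": {
         "artifactLocation": {"uri": "db/query.py"},
         "region": {"startLine": 42}}}]},
]}]}
comment = sarif_to_comment(sarif)
```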

5. GitHub Integration

  • Posts review comments to PR
  • Inline comments for specific lines (future)
  • Review status updates (future)
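PR-level comments go through GitHub's Issues API (POST /repos/{owner}/{repo}/issues/{number}/comments); inline review comments use a separate Pulls endpoint and are deferred. A minimal sketch, with the helper name, repository, and token as placeholders:

```python
import json
import urllib.request

def build_comment_request(owner: str, repo: str, pr_number: int,
                          body: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that adds a PR-level comment."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{pr_number}/comments"
    data = json.dumps({"body": body}).encode()
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )

# Illustrative values; sending is a one-liner once credentials exist
req = build_comment_request("duetto", "pricing-service", 42,
                            "## Automated PR Review\n...", "ghp_placeholder")
# urllib.request.urlopen(req)  # actually send (omitted here)
```

Separating request construction from sending keeps the integration testable without network access.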

Tech Stack

  • Base: duetto-code-intelligence (existing GitHub integration patterns)
  • Analysis Engine: mcp-vector-search (new review capabilities)
  • LLM: Claude 3.5 Sonnet via AWS Bedrock
  • Deployment: TBD (AWS Lambda, Cloud Run, or Kubernetes)
  • Storage: Vector DB (Chroma/Lance), Knowledge Graph (Kùzu)

Success Metrics

| Metric | Current | Target | Measurement |
|---|---|---|---|
| Cost savings | $30K/year (external tools) | $0 licensing | Budget tracking |
| Review quality | CodeRabbit baseline | Match or exceed | Developer satisfaction survey |
| Review speed | TBD | < 5 min per PR | Service metrics |
| False positive rate | TBD | < 10% | Developer feedback |
| Adoption rate | 0% | 80% of PRs | PR coverage tracking |
| Developer satisfaction | Baseline (CodeRabbit) | ≥ 4/5 rating | Quarterly survey |

Risks & Mitigation

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Review quality below commercial tools | Medium | High | Start with pilot team, iterate based on feedback |
| Performance too slow for large PRs | Medium | Medium | Optimize vector search, implement incremental analysis |
| High false positive rate | High | Medium | Fine-tune prompts, build feedback loop for improvements |
| GitHub rate limiting | Low | Medium | Implement queue management, request rate limit increase |
| Infrastructure costs exceed savings | Low | High | Monitor costs closely, optimize LLM usage and caching |

Timeline

Phase 1: POC (2 weeks)

  • Basic PR webhook listener
  • Single review type integration (security)
  • Console output (no GitHub comments yet)

Phase 2: MVP (2 weeks)

  • All review types (security, architecture, performance)
  • GitHub PR comment integration
  • Deploy to pilot team (5-10 developers)

Phase 3: Production (4 weeks)

  • Production deployment to all repositories
  • Monitoring and alerting
  • Developer feedback loop
  • Cost tracking

Total: 8 weeks to full production deployment

Experiments

E-2026-ENG-001: POC - Single Repository PR Review

Goal: Validate mcp-vector-search code review quality on real PRs

Approach:

  • Select 1 active repository (e.g., pricing-service)
  • Run mcp-vector-search reviews on the last 10 merged PRs
  • Compare findings against CodeRabbit/Augment reviews
  • Measure: precision, recall, developer usefulness ratings
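The precision/recall measurement could be computed as in this sketch, where ground truth is the set of findings confirmed valid (by developers or by the commercial tools) and the finding keys are illustrative:

```python
def precision_recall(our_findings: set, ground_truth: set) -> tuple[float, float]:
    """Precision: fraction of our findings that are valid.
    Recall: fraction of valid issues that we caught."""
    true_positives = len(our_findings & ground_truth)
    precision = true_positives / len(our_findings) if our_findings else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Illustrative findings keyed by (rule, file, line)
ours = {("CWE-89", "db.py", 12), ("CWE-798", "auth.py", 3), ("SOLID-SRP", "svc.py", 40)}
truth = {("CWE-89", "db.py", 12), ("CWE-798", "auth.py", 3), ("CWE-22", "io.py", 7)}
p, r = precision_recall(ours, truth)  # 2 of our 3 findings valid; 2 of 3 issues caught
```

Both values here are 2/3, below the ≥80%/≥70% bars, which is the kind of per-PR signal the experiment aggregates.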

Success Criteria:

  • ≥80% precision (findings are valid issues)
  • ≥70% recall (catches issues found by commercial tools)
  • ≥4/5 developer usefulness rating

Time box: 1 week

E-2026-ENG-002: MVP - Automated Comment Service

Goal: Validate end-to-end automation and GitHub integration

Approach:

  • Deploy webhook service for pilot repository
  • Auto-post review comments on new PRs
  • Collect developer feedback for 2 weeks
  • Iterate on comment format and findings presentation

Success Criteria:

  • Service processes 100% of PRs within 5 minutes
  • ≥70% of comments marked as "helpful" by developers
  • Zero false positives flagged as critical/high severity

Time box: 3 weeks (1 week build, 2 weeks pilot)

Dependencies

  • mcp-vector-search review capabilities: ✅ Available (recent implementation)
  • duetto-code-intelligence codebase: Access to GitHub integration patterns
  • GitHub org admin access: For webhook configuration
  • OpenRouter API key: For Claude Opus access (cost management)
  • Deployment infrastructure: AWS/GCP account for service hosting

Cost Analysis

Current Cost (Commercial Tools)

  • CodeRabbit/Augment: $50/seat/month
  • 50 engineers: $2,500/month = $30,000/year

Projected Cost (Internal Solution)

Development:

  • 1 engineer × 8 weeks = ~$20K one-time

Operating Costs (annual):

  • LLM API (OpenRouter): ~$5K/year (estimated)
      • Assumes 100 PRs/week, ~100K tokens/review (all review types combined), $0.01/1K tokens
      • ≈ $100/week ≈ $5,200/year
  • Infrastructure: ~$2K/year
      • Serverless functions or small Kubernetes deployment
      • Vector DB storage
  • Maintenance: ~$10K/year (20% of engineer time)

Total Year 1: $37K (development + operating)
Total Year 2+: $17K/year (operating only)

ROI:

  • Year 1: $37K total cost, a $7K net investment over commercial tools; recouped in the first ~7 months of Year 2
  • Year 2+: $13K/year savings (43% cost reduction)
  • 3-year cumulative savings: ~$19K (-$7K + $13K + $13K, undiscounted)
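The roll-up can be sanity-checked from the line items above (all inputs are the document's own estimates):

```python
# Annual operating line items
llm_annual = 5_200           # LLM API, ~$100/week
infra_annual = 2_000         # serverless/K8s + vector DB storage
maintenance_annual = 10_000  # ~20% of one engineer's time
development_once = 20_000    # 1 engineer x 8 weeks

operating_annual = llm_annual + infra_annual + maintenance_annual  # ~$17K
year1_total = development_once + operating_annual                  # ~$37K

commercial_annual = 50 * 50 * 12  # $50/seat/month x 50 engineers = $30K

year1_delta = year1_total - commercial_annual                # extra spend in Year 1
steady_state_savings = commercial_annual - operating_annual  # ~$13K/year from Year 2
payback_months = year1_delta / (steady_state_savings / 12)   # months into Year 2
```

With these inputs, the Year 1 delta is about $7K and payback lands roughly 7 months into Year 2, matching the ROI bullets.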

Note: Cost savings increase with team growth (commercial tools scale linearly with headcount)

Open Questions

  1. Deployment Model: AWS Lambda, Google Cloud Run, or Kubernetes?
  2. Feedback Mechanism: How do developers mark comments as helpful/unhelpful?
  3. Customization: How do teams configure review criteria per repository?
  4. Integration: Should we integrate with Slack for review notifications?
  5. Inline Comments: When to implement line-specific comments vs PR-level?
  6. Rate Limiting: What's Duetto's GitHub API rate limit? Do we need to request increase?

Related Tools

  • mcp-vector-search: Code review engine foundation
      • /analyze review security|architecture|performance
      • Vector search + knowledge graph context
      • SARIF output format
  • duetto-code-intelligence: GitHub integration patterns
  • CodeRabbit/Augment: Commercial tools we're replacing

Acceptance Criteria

  • [ ] POC experiment completed and validated (E-2026-ENG-001)
  • [ ] MVP deployed to pilot team (E-2026-ENG-002)
  • [ ] Developer satisfaction ≥ 4/5 rating
  • [ ] Cost tracking shows ≥30% savings vs commercial tools
  • [ ] False positive rate < 10%
  • [ ] Production deployment to all repositories
  • [ ] Documentation: setup guide, architecture, troubleshooting