Root Cause Analysis
Deductive AI dramatically accelerates incident resolution by automatically investigating incidents, testing multiple hypotheses in parallel, and presenting evidence-backed root cause analysis.
The Challenge
Traditional incident investigation is slow and manual:
- Sequential Investigation - Teams check one system at a time
- Limited Context - Hard to correlate events across systems
- Knowledge Gaps - Requires deep expertise in multiple tools
- Time Pressure - Every minute counts during an incident
How Deductive Helps
Deductive AI joins your on-call rotation and starts investigating incidents before you even begin. It:
- Builds Multiple Theories - Tests several hypotheses in parallel instead of one at a time
- Collects Evidence - Queries code, metrics, logs, and infrastructure simultaneously
- Builds Causal Timelines - Links code changes, infrastructure events, and telemetry
- Presents Findings - Shows root cause with supporting evidence
Example: Production Incident
The Incident
At 3:47 PM, your error rate spiked from 0.1% to 15%. Users reported payment failures.
Traditional Investigation
- Check monitoring dashboard (2 minutes)
- Search logs for errors (5 minutes)
- Review recent deployments (3 minutes)
- Check infrastructure changes (4 minutes)
- Correlate findings manually (10 minutes)
- Total: ~24 minutes
With Deductive AI
You ask: “What caused the payment service errors at 3:47 PM?”
Deductive immediately:
- Queries metrics - Confirms error spike at 3:47 PM
- Searches logs - Finds “database connection pool exhausted” errors
- Reviews code - Identifies recent PR #1234 that changed connection pool settings
- Checks infrastructure - Finds auto-scaling event at 3:45 PM
- Builds timeline - Links all events together
Response time: ~2 minutes
The Root Cause
Deductive presents:
Root Cause: PR #1234 reduced the database connection pool size from 50 to 10, which became insufficient when traffic rose 3x around the auto-scaling event at 3:45 PM.
Evidence:
- Code change: src/db/pool.ts (PR #1234, merged 2 hours ago)
- Error pattern: “connection pool exhausted” starting at 3:47 PM
- Infrastructure: Auto-scaling event at 3:45 PM, coinciding with a 3x traffic increase
- Timeline: The auto-scaling event and the first pool errors fall within a 2-minute window, after the PR #1234 merge
Recommended Fix: Revert connection pool change or increase pool size to handle peak traffic.
Key Features
Parallel Hypothesis Testing
Instead of checking systems sequentially, Deductive tests multiple theories simultaneously.
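To make the idea concrete, here is a minimal sketch of running several hypothesis checks concurrently rather than one after another. It is not Deductive's implementation: the `HypothesisResult` type, the `run_hypotheses` helper, and the three stubbed checks are hypothetical names invented for illustration, and the stubs return canned values based on the example incident above instead of querying real systems.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class HypothesisResult:
    name: str
    supported: bool
    evidence: str


# Hypothetical stubs: a real check would query code, metrics, logs, or infrastructure.
def check_pool_change() -> HypothesisResult:
    return HypothesisResult(
        "connection pool reduced", True, "PR #1234 changed pool size from 50 to 10"
    )


def check_autoscaling() -> HypothesisResult:
    return HypothesisResult(
        "auto-scaling event", True, "auto-scaling event at 3:45 PM"
    )


def check_database_outage() -> HypothesisResult:
    # Illustrative negative result; this theory is not part of the example incident.
    return HypothesisResult("database outage", False, "placeholder: nothing found")


def run_hypotheses(
    checks: list[Callable[[], HypothesisResult]],
) -> list[HypothesisResult]:
    """Run every check in its own worker thread and return results in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as executor:
        futures = [executor.submit(check) for check in checks]
        return [future.result() for future in futures]


if __name__ == "__main__":
    for result in run_hypotheses(
        [check_pool_change, check_autoscaling, check_database_outage]
    ):
        status = "supported" if result.supported else "ruled out"
        print(f"{result.name}: {status} ({result.evidence})")
```

Threads suit this sketch because real checks would mostly wait on network calls to external systems; the total wall-clock time then approaches that of the slowest check rather than the sum of all checks.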
Causal Timeline Building
Deductive automatically builds timelines that show how events relate.
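As a rough illustration of what such a timeline involves, the sketch below merges events from different sources, orders them by time, and picks out those close to the error spike. The `Event` type and both helper functions are hypothetical, the calendar date is arbitrary, and the merge time of PR #1234 is approximated from "merged 2 hours ago"; none of this reflects Deductive's internal data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    source: str
    description: str


def build_timeline(events: list[Event]) -> list[Event]:
    """Order events from all sources chronologically."""
    return sorted(events, key=lambda event: event.timestamp)


def events_near(timeline: list[Event], anchor: datetime, window: timedelta) -> list[Event]:
    """Return the events whose timestamp lies within `window` of `anchor`."""
    return [event for event in timeline if abs(event.timestamp - anchor) <= window]


# Arbitrary date; times follow the example incident. The merge time is approximate.
spike = datetime(2024, 1, 1, 15, 47)
events = [
    Event(spike, "metrics", "error rate spiked from 0.1% to 15%"),
    Event(datetime(2024, 1, 1, 15, 45), "infrastructure", "auto-scaling event"),
    Event(datetime(2024, 1, 1, 13, 47), "code", "PR #1234 merged (approximate time)"),
]

timeline = build_timeline(events)
close_to_spike = events_near(timeline, spike, timedelta(minutes=2))

if __name__ == "__main__":
    for event in timeline:
        print(event.timestamp.strftime("%H:%M"), event.source, event.description)
```

Proximity in time only narrows the candidates. Linking the earlier code change to the spike still requires evidence such as the "connection pool exhausted" log pattern.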
Evidence-Backed Conclusions
Every root cause analysis includes:
- Primary Conclusion - The most likely cause with confidence score
- Supporting Evidence - Specific logs, metrics, code changes
- Alternative Theories - Other possibilities considered
- Recommended Actions - Next steps to resolve
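One way to picture this structure is as a small record type with one field per item in the list above. The `RootCauseReport` class, its field names, and the 0.0 to 1.0 confidence scale are assumptions made for illustration, and the 0.9 value is a placeholder; the source does not specify how Deductive represents or scores its findings.

```python
from dataclasses import dataclass, field


@dataclass
class RootCauseReport:
    primary_conclusion: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)
    supporting_evidence: list[str] = field(default_factory=list)
    alternative_theories: list[str] = field(default_factory=list)
    recommended_actions: list[str] = field(default_factory=list)

    def summary(self) -> str:
        """Render the report as plain text, one item per line."""
        lines = [f"Root cause ({self.confidence:.0%} confidence): {self.primary_conclusion}"]
        lines += [f"- Evidence: {item}" for item in self.supporting_evidence]
        lines += [f"- Alternative: {item}" for item in self.alternative_theories]
        lines += [f"- Action: {item}" for item in self.recommended_actions]
        return "\n".join(lines)


# Values taken from the example incident; the confidence score is a placeholder.
report = RootCauseReport(
    primary_conclusion="PR #1234 reduced the connection pool size from 50 to 10",
    confidence=0.9,
    supporting_evidence=[
        "src/db/pool.ts changed in PR #1234",
        "'connection pool exhausted' errors starting at 3:47 PM",
    ],
    recommended_actions=["Revert the connection pool change or increase the pool size"],
)

if __name__ == "__main__":
    print(report.summary())
```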
Best Practices
Connect Comprehensive Integrations
The more data sources Deductive has access to, the better its analysis:
- Code: GitHub, GitLab for code change tracking
- Metrics: Prometheus, Datadog for system health
- Logs: Elasticsearch, OpenSearch for error details
- Infrastructure: AWS, Kubernetes for resource changes
- Alerts: PagerDuty, Incident.io for incident context
Provide Context
Help Deductive understand your systems:
- Document service dependencies
- Share runbooks and troubleshooting guides
- Explain normal vs. abnormal patterns
- Provide team-specific context
Review and Refine
Deductive learns from feedback:
- Confirm correct root causes
- Correct incorrect analyses
- Provide additional context
- Share resolution steps
Results
Teams using Deductive AI for root cause analysis report:
- 5x faster MTTR - Average resolution time drops from hours to minutes
- Higher accuracy - Evidence-backed conclusions reduce false positives
- Better learning - Patterns captured for future incidents
- Reduced stress - Automated investigation reduces on-call burden