AI Troubleshooting Guide
This guide helps support teams diagnose and resolve AI-related issues reported by users.
Looking Up Request Traces
Every AI request in olllo has a unique correlation ID for tracing. When a user reports an issue, ask for:
- Approximate time of the issue
- Feature they were using (reflection, accomplishments, goals, etc.)
- Error message if any was displayed
Finding the Correlation ID
If the user has the correlation ID from an error message:
# Get full request trace
curl -H "Authorization: Bearer $ADMIN_API_KEY" \
  "https://app.olllo.app/api/admin/ai-dashboard/trace/{correlationId}"

If searching by time and user:
# Query dashboard for recent requests by feature
curl -H "Authorization: Bearer $ADMIN_API_KEY" \
  "https://app.olllo.app/api/admin/ai-dashboard?period=1h&feature=reflection"

Understanding Trace Output
{
  "trace": {
    "telemetry": {
      "correlationId": "550e8400-e29b-41d4-a716-446655440000",
      "userId": "user_xxx",
      "feature": "reflection",
      "status": "SUCCESS",
      "durationMs": 1500,
      "modelName": "claude-sonnet-4-5",
      "usedFallback": false,
      "totalTokens": 800
    },
    "safetyIncident": null,
    "retryAttempts": []
  }
}

Common Error Codes
Rate Limiting Errors
| Error Code | Meaning | User Action |
|---|---|---|
| RATE_LIMITED | User exceeded request limits | Wait and try again later |
| DAILY_LIMIT_EXCEEDED | Daily token budget exhausted | Resets at midnight UTC |
| FEATURE_RATE_LIMITED | Too many requests to specific feature | Wait 1-5 minutes |
Support Response: “You’ve reached your usage limit for now. This resets automatically. Please try again in a few minutes.”
Provider Errors
| Error Code | Meaning | Support Action |
|---|---|---|
| PROVIDER_ERROR | AI provider returned an error | Check provider status |
| PROVIDER_TIMEOUT | Request timed out | May be high load, retry |
| MODEL_UNAVAILABLE | Specific model is down | System should auto-fallback |
| ALL_PROVIDERS_FAILED | Both primary and fallback failed | Escalate to engineering |
Support Response: “We’re experiencing temporary issues with our AI service. Your data has been saved and we’ll process it shortly.”
Safety Errors
| Error Code | Meaning | Support Action |
|---|---|---|
| CONTENT_BLOCKED | Content flagged by safety system | Review safety incident |
| PII_DETECTED | Personal information detected | Advise user on @mentions |
| STREAMING_BLOCKED | Content blocked during streaming | Check what triggered block |
Support Response: “Your content was flagged by our safety system. Please review and try rephrasing. Use @mentions for names.”
System Errors
| Error Code | Meaning | Support Action |
|---|---|---|
| INTERNAL_ERROR | Unexpected system error | Escalate with correlation ID |
| INVALID_REQUEST | Malformed request | Check user input |
| CONTEXT_TOO_LONG | Input exceeded token limits | Advise user to shorten input |
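The four tables above can be folded into one lookup, so that a code copied from an error message or trace maps straight to its category and suggested next step. This is a minimal sketch only: `ErrorInfo`, `ERROR_CODES`, and `lookup_error` are hypothetical names, not part of olllo's tooling, and the fallback for unlisted codes is an assumption. The entries simply restate the tables.

```python
from typing import NamedTuple


class ErrorInfo(NamedTuple):
    category: str  # which table in this guide the code comes from
    meaning: str   # the "Meaning" column
    action: str    # the "User Action" or "Support Action" column


# Entries restate the tables in this guide; nothing is read from the live system.
ERROR_CODES = {
    "RATE_LIMITED": ErrorInfo("rate_limiting", "User exceeded request limits", "Wait and try again later"),
    "DAILY_LIMIT_EXCEEDED": ErrorInfo("rate_limiting", "Daily token budget exhausted", "Resets at midnight UTC"),
    "FEATURE_RATE_LIMITED": ErrorInfo("rate_limiting", "Too many requests to specific feature", "Wait 1-5 minutes"),
    "PROVIDER_ERROR": ErrorInfo("provider", "AI provider returned an error", "Check provider status"),
    "PROVIDER_TIMEOUT": ErrorInfo("provider", "Request timed out", "May be high load, retry"),
    "MODEL_UNAVAILABLE": ErrorInfo("provider", "Specific model is down", "System should auto-fallback"),
    "ALL_PROVIDERS_FAILED": ErrorInfo("provider", "Both primary and fallback failed", "Escalate to engineering"),
    "CONTENT_BLOCKED": ErrorInfo("safety", "Content flagged by safety system", "Review safety incident"),
    "PII_DETECTED": ErrorInfo("safety", "Personal information detected", "Advise user on @mentions"),
    "STREAMING_BLOCKED": ErrorInfo("safety", "Content blocked during streaming", "Check what triggered block"),
    "INTERNAL_ERROR": ErrorInfo("system", "Unexpected system error", "Escalate with correlation ID"),
    "INVALID_REQUEST": ErrorInfo("system", "Malformed request", "Check user input"),
    "CONTEXT_TOO_LONG": ErrorInfo("system", "Input exceeded token limits", "Advise user to shorten input"),
}


def lookup_error(code: str) -> ErrorInfo:
    """Return the table entry for an error code, tolerating case and stray whitespace."""
    key = code.strip().upper()
    if key in ERROR_CODES:
        return ERROR_CODES[key]
    # Assumption: codes missing from this guide are escalated rather than guessed at.
    return ErrorInfo("unknown", "Code not listed in this guide", "Escalate with correlation ID")


if __name__ == "__main__":
    info = lookup_error("provider_timeout")
    print(f"{info.category}: {info.meaning} -> {info.action}")
```

Keeping the data in one dictionary means a new error code only needs a single added entry.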
Troubleshooting by Symptom
“AI response was slow”
- Check durationMs in the trace
  - Normal: < 5 seconds, Slow: 5-15 seconds, Very slow: > 15 seconds
- Check if usedFallback: true (fallback can be slower)
- Check retryAttempts - multiple retries add latency
If consistently slow: Escalate to engineering with sample correlation IDs.
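The latency checks above could be scripted roughly as follows. This is a sketch, not an official tool: `classify_duration` and `latency_notes` are hypothetical helpers. The trace shape follows the sample under "Understanding Trace Output". Treating exactly 5 and exactly 15 seconds as "slow" is an assumption, since the guide does not say which band the boundaries fall in.

```python
from typing import List


def classify_duration(duration_ms: int) -> str:
    """Bucket a durationMs value using the thresholds from this guide."""
    seconds = duration_ms / 1000
    if seconds < 5:
        return "normal"
    if seconds <= 15:  # assumption: both boundary values count as "slow"
        return "slow"
    return "very slow"


def latency_notes(trace: dict) -> List[str]:
    """Collect the latency-related observations for one trace."""
    body = trace["trace"]
    telemetry = body["telemetry"]
    retries = body.get("retryAttempts") or []

    duration_ms = telemetry["durationMs"]
    notes = [f"duration: {duration_ms} ms ({classify_duration(duration_ms)})"]
    if telemetry.get("usedFallback"):
        notes.append("fallback was used (can be slower)")
    if retries:
        notes.append(f"{len(retries)} retry attempt(s) added latency")
    return notes


if __name__ == "__main__":
    # Values taken from the sample trace in this guide.
    sample = {
        "trace": {
            "telemetry": {"durationMs": 1500, "usedFallback": False},
            "retryAttempts": [],
        }
    }
    for note in latency_notes(sample):
        print(note)  # prints "duration: 1500 ms (normal)"
```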
“AI response was cut off”
- Check finishReason in trace
  - "length" = Response hit token limit (normal for long content)
  - "interrupted" = Stream was interrupted (check interruptReason)
  - "safety_block" = Content was blocked mid-stream
Resolution: For length limits, advise user to break into smaller requests.
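A small helper can turn the finishReason value into the explanation given above. This is a sketch under assumptions: `explain_finish_reason` is a hypothetical name. The guide does not show where finishReason and interruptReason sit in the trace JSON, so the function takes them as plain arguments rather than guessing a path.

```python
from typing import Optional

# Explanations restate the list in this guide.
FINISH_REASONS = {
    "length": "Response hit token limit (normal for long content); advise breaking into smaller requests",
    "interrupted": "Stream was interrupted; check interruptReason",
    "safety_block": "Content was blocked mid-stream",
}


def explain_finish_reason(finish_reason: str, interrupt_reason: Optional[str] = None) -> str:
    """Explain why a response was cut off, based on its finishReason."""
    explanation = FINISH_REASONS.get(finish_reason)
    if explanation is None:
        return f"'{finish_reason}' is not one of the cut-off reasons listed in this guide"
    if finish_reason == "interrupted" and interrupt_reason:
        # Surface the interrupt reason when the caller already has it.
        return f"Stream was interrupted (interruptReason: {interrupt_reason})"
    return explanation


if __name__ == "__main__":
    print(explain_finish_reason("length"))
    print(explain_finish_reason("safety_block"))
```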
“AI gave an unhelpful response”
- This is subjective but may indicate:
  - Prompt issue (engineering may need to tune)
  - Model degradation (compare recent quality)
  - User expectation mismatch
- Collect specific examples for product team
“Feature says AI is unavailable”
- Check if system is in degraded mode
- Verify the specific feature’s status
- Check for ongoing incidents
- If isolated to one user, check their rate limit status
Safety Incident Investigation
When a user’s content is blocked:
Viewing the Safety Incident
curl -H "Authorization: Bearer $ADMIN_API_KEY" \
  "https://app.olllo.app/api/admin/ai-dashboard/trace/{correlationId}"

The safetyIncident field shows:
- incidentType: What triggered the block (PII_DETECTED, HARMFUL_CONTENT, etc.)
- severity: low, medium, high
- safetyReason: Human-readable explanation
- actionTaken: What the system did (blocked, warned, etc.)
- userNotified: Whether user saw an error message
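For ticket notes, the safetyIncident fields can be condensed into a single line. This is a sketch under assumptions: `summarize_safety_incident` is a hypothetical helper. It assumes safetyIncident is either null or an object carrying the five fields listed above, and that userNotified is a boolean. The incident values in the usage example are invented for illustration.

```python
from typing import Optional


def summarize_safety_incident(trace: dict) -> Optional[str]:
    """Return a one-line summary of the safety incident, or None if there was none."""
    incident = trace.get("trace", {}).get("safetyIncident")
    if not incident:
        return None  # "safetyIncident": null means nothing was blocked

    notified = "yes" if incident.get("userNotified") else "no"
    return (
        f"{incident.get('incidentType', 'unknown')} "
        f"(severity: {incident.get('severity', 'unknown')}); "
        f"reason: {incident.get('safetyReason', 'n/a')}; "
        f"action: {incident.get('actionTaken', 'n/a')}; "
        f"user notified: {notified}"
    )


if __name__ == "__main__":
    # Illustrative values only; real incidents come from the trace endpoint.
    example = {
        "trace": {
            "safetyIncident": {
                "incidentType": "PII_DETECTED",
                "severity": "medium",
                "safetyReason": "Real name found in input",
                "actionTaken": "blocked",
                "userNotified": True,
            }
        }
    }
    print(summarize_safety_incident(example))
```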
Common Safety Scenarios
PII Detection
- User included real names, emails, or phone numbers
- Advise using @mentions instead of real names
- Example: “@manager” instead of “John Smith”
Content Flagged
- AI detected potentially problematic content
- Review the context - may be false positive
- If legitimate, explain our content guidelines
Escalation Paths
Level 1: Support Team
- Rate limit questions
- General “how to use” questions
- Explaining error messages
Level 2: Engineering Support
- Repeated errors for same user
- Errors with no clear cause
- Performance issues affecting multiple users
Level 3: Engineering On-Call
- Complete AI outage
- Security concerns (data exposure, etc.)
- Safety system failures
Information to Include in Escalations
- Correlation ID(s) - Most important!
- User ID (anonymized if needed)
- Timestamp of issue
- Feature affected
- Error code if any
- User’s description of the problem
- Steps already taken
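One way to make sure no item on this checklist is forgotten is to build the escalation message from a small record type. This is a minimal sketch: the `Escalation` class and its field names are hypothetical, and the values in the usage example are placeholders. Its only rule is that at least one correlation ID must be present, reflecting the "Most important!" note above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Escalation:
    correlation_ids: List[str]
    user_id: str                      # anonymize before sharing if needed
    timestamp: str
    feature: str
    user_description: str
    error_code: Optional[str] = None  # "Error code if any"
    steps_taken: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the escalation as plain text for Slack or email."""
        if not self.correlation_ids:
            raise ValueError("an escalation needs at least one correlation ID")
        steps = "; ".join(self.steps_taken) or "none recorded"
        lines = [
            f"Correlation ID(s): {', '.join(self.correlation_ids)}",
            f"User ID: {self.user_id}",
            f"Timestamp: {self.timestamp}",
            f"Feature: {self.feature}",
            f"Error code: {self.error_code or 'none'}",
            f"User description: {self.user_description}",
            f"Steps already taken: {steps}",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    report = Escalation(
        correlation_ids=["550e8400-e29b-41d4-a716-446655440000"],
        user_id="user_xxx",
        timestamp="(time reported by the user)",
        feature="reflection",
        user_description="Response never arrived",
        error_code="ALL_PROVIDERS_FAILED",
        steps_taken=["Checked trace", "Checked provider status"],
    )
    print(report.render())
```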
Quick Reference
API Endpoints
| Endpoint | Purpose |
|---|---|
| GET /api/admin/ai-dashboard | Dashboard metrics |
| GET /api/admin/ai-dashboard/trace/{id} | Single request trace |
| GET /api/admin/ai-dashboard?format=text | Human-readable output |
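For teams that prefer a script over hand-typed curl commands, the endpoints above can be assembled with Python's standard library. This is a sketch only: `trace_url`, `dashboard_url`, and `build_request` are hypothetical helpers. The query parameters are the ones that appear in this guide (period, feature, format); whether they can all be combined in one request is an assumption. The example builds the request but does not send it.

```python
import os
import urllib.request
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://app.olllo.app/api/admin/ai-dashboard"


def trace_url(correlation_id: str) -> str:
    """URL for a single request trace."""
    return f"{BASE_URL}/trace/{quote(correlation_id, safe='')}"


def dashboard_url(
    period: Optional[str] = None,
    feature: Optional[str] = None,
    fmt: Optional[str] = None,
) -> str:
    """URL for dashboard metrics, with only the filters that were given."""
    pairs = (("period", period), ("feature", feature), ("format", fmt))
    params = {name: value for name, value in pairs if value}
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Attach the same bearer header the curl examples use."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


if __name__ == "__main__":
    api_key = os.environ.get("ADMIN_API_KEY", "")
    req = build_request(dashboard_url(period="1h", feature="reflection"), api_key)
    print(req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```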
Status Codes
| Status | Meaning |
|---|---|
| SUCCESS | Request completed normally |
| FAILED | Request failed after retries |
| BLOCKED | Content was blocked by safety |
| DEGRADED | Served with reduced functionality |
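Combining the status meanings above with the telemetry fields gives a one-line summary that is easy to paste into a ticket. This is a sketch, not the dashboard's own formatter: `summarize_trace` is a hypothetical name, and the JSON shape is assumed to match the sample under "Understanding Trace Output".

```python
import json

# Meanings restate the Status Codes table in this guide.
STATUS_MEANINGS = {
    "SUCCESS": "Request completed normally",
    "FAILED": "Request failed after retries",
    "BLOCKED": "Content was blocked by safety",
    "DEGRADED": "Served with reduced functionality",
}


def summarize_trace(trace: dict) -> str:
    """Condense the telemetry block of a trace into one readable line."""
    t = trace["trace"]["telemetry"]
    meaning = STATUS_MEANINGS.get(t["status"], "Status not listed in this guide")
    return (
        f"{t['correlationId']}; feature={t['feature']}; "
        f"{t['status']} ({meaning}); {t['durationMs']} ms; "
        f"model={t['modelName']}; tokens={t['totalTokens']}"
    )


if __name__ == "__main__":
    # Same values as the sample trace in this guide.
    raw = """
    {"trace": {"telemetry": {
        "correlationId": "550e8400-e29b-41d4-a716-446655440000",
        "userId": "user_xxx", "feature": "reflection", "status": "SUCCESS",
        "durationMs": 1500, "modelName": "claude-sonnet-4-5",
        "usedFallback": false, "totalTokens": 800},
      "safetyIncident": null, "retryAttempts": []}}
    """
    print(summarize_trace(json.loads(raw)))
```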
Feature Names
| Feature | Description |
|---|---|
| pii-detection | Personal information detection |
| star-extraction | STAR format extraction |
| star-refinement | STAR content refinement |
| weekly-reflection | Weekly reflection prompts |
| goal-creation | Goal setting assistance |
| goal-matching | Linking content to goals |
Contact
- Slack: #support-escalations
- Email: support-engineering@olllo.app
- PagerDuty: For Sev 1 issues only