# Rocket Relay Incident Response Process

Last reviewed: April 24, 2026

## Severity triage

Incidents should be classified by customer impact, security sensitivity, data exposure risk, billing impact, and service availability.

Example severity categories:

- SEV-1: confirmed data exposure, credential compromise, widespread outage, or incorrect billing at scale.
- SEV-2: degraded routing, elevated provider errors, partial billing impact, or isolated tenant security issue.
- SEV-3: non-critical operational issue, delayed webhook delivery, dashboard issue, or minor support-impacting bug.
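
The example categories above can be read as a simple precedence rule: any SEV-1 condition wins, then any SEV-2 condition, and everything else falls to SEV-3. The sketch below is illustrative only. The `IncidentSignals` fields and the `classify` function are hypothetical names chosen to mirror the bullets, not part of any existing Rocket Relay tooling, and real triage will usually need human judgment on top.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3


@dataclass(frozen=True)
class IncidentSignals:
    # Hypothetical flags, one per condition named in the example categories.
    confirmed_data_exposure: bool = False
    credential_compromise: bool = False
    widespread_outage: bool = False
    billing_error_at_scale: bool = False
    degraded_routing: bool = False
    elevated_provider_errors: bool = False
    partial_billing_impact: bool = False
    isolated_tenant_security_issue: bool = False


def classify(signals: IncidentSignals) -> Severity:
    """Return the highest severity whose conditions are met."""
    if (
        signals.confirmed_data_exposure
        or signals.credential_compromise
        or signals.widespread_outage
        or signals.billing_error_at_scale
    ):
        return Severity.SEV1
    if (
        signals.degraded_routing
        or signals.elevated_provider_errors
        or signals.partial_billing_impact
        or signals.isolated_tenant_security_issue
    ):
        return Severity.SEV2
    # Anything that matches no SEV-1 or SEV-2 condition is treated as SEV-3.
    return Severity.SEV3
```

Checking SEV-1 conditions first means an incident with mixed signals (for example, degraded routing plus a credential compromise) is never under-classified.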

## Response workflow

1. Detect the incident through metrics, logs, customer reports, provider status, or admin dashboard signals.
2. Assign an incident owner and create a timeline.
3. Contain the issue by disabling affected upstream accounts, rotating secrets, pausing affected routes, or applying rate limits.
4. Preserve relevant audit logs, request metadata, billing records, and deployment metadata.
5. Notify affected customers when confidentiality, integrity, availability, or billing accuracy is materially impacted.
6. Complete a post-incident review that records root cause, customer impact, remediation, prevention, and an owner.
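
One way to keep the workflow auditable is to record each step in the incident timeline as it happens. The following is a minimal sketch under assumptions: the `Step`, `TimelineEntry`, and `Incident` names are hypothetical, and a real implementation would likely persist entries rather than hold them in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Step(Enum):
    # One member per numbered step in the response workflow, in order.
    DETECT = "detect"
    ASSIGN = "assign"
    CONTAIN = "contain"
    PRESERVE = "preserve"
    NOTIFY = "notify"
    REVIEW = "review"


@dataclass(frozen=True)
class TimelineEntry:
    at: datetime
    step: Step
    note: str


@dataclass
class Incident:
    owner: str
    timeline: List[TimelineEntry] = field(default_factory=list)

    def record(self, step: Step, note: str, at: Optional[datetime] = None) -> None:
        """Append a timeline entry, timestamped in UTC unless `at` is given."""
        when = at if at is not None else datetime.now(timezone.utc)
        self.timeline.append(TimelineEntry(when, step, note))

    def pending_steps(self) -> List[Step]:
        """Return workflow steps with no timeline entry yet, in workflow order."""
        done = {entry.step for entry in self.timeline}
        return [step for step in Step if step not in done]
```

Because customer notification is conditional, a team using something like this could record a `NOTIFY` entry with a note such as "no notification required" so that the decision itself appears in the timeline.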

## Customer communication

Incident notices should include:

- Incident window.
- Affected service or tenant scope.
- Data or billing impact assessment.
- Mitigation performed.
- Customer action required, if any.
- Follow-up timeline.
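
The required fields above can be enforced by building notices from a structured record instead of free text. This is a sketch, not an existing template: `IncidentNotice` and its field names are hypothetical, and the rendered wording is only an example of how the six items might be laid out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class IncidentNotice:
    # Fields without defaults are mandatory, so a notice cannot be
    # constructed with any of those items missing.
    window_start: datetime
    window_end: datetime
    scope: str
    impact_assessment: str
    mitigation: str
    follow_up: str
    customer_action: str = "None required."  # "if any" in the list above

    def render(self) -> str:
        """Return the notice as plain text, one line per required item."""
        if self.window_end < self.window_start:
            raise ValueError("incident window ends before it starts")
        lines = [
            f"Incident window: {self.window_start.isoformat()} "
            f"to {self.window_end.isoformat()}",
            f"Affected scope: {self.scope}",
            f"Data or billing impact: {self.impact_assessment}",
            f"Mitigation performed: {self.mitigation}",
            f"Customer action required: {self.customer_action}",
            f"Follow-up timeline: {self.follow_up}",
        ]
        return "\n".join(lines)
```

Making every item except customer action a required constructor argument means an incomplete notice fails at construction time rather than after it has been sent.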

## Evidence sources

- Audit logs for admin changes.
- Request metadata logs for affected traffic.
- Usage records and billing idempotency records.
- Model quality checks for route health.
- Upstream account status and error history.
