Escalation Policies
Configure multi-step escalation policies to ensure alerts are acknowledged and resolved.
Overview#
Escalation policies define what happens when an alert fires and nobody responds. Instead of relying on a single notification, escalation policies send progressively more urgent notifications to different people or channels until someone acknowledges the alert. This ensures critical issues are never missed.
How Escalation Works#
1. An alert rule fires (e.g., error rate > 5% for 5 minutes)
2. The escalation policy's Step 1 is triggered immediately
3. If nobody acknowledges the alert within the step's delay period, Step 2 is triggered
4. This continues through all configured steps
5. Acknowledging the alert at any step stops the escalation
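The flow above can be sketched as a simple loop. This is a minimal model, not the product's internal implementation; the step and channel values are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Step:
    delay_minutes: int  # wait after the previous step's notification before escalating
    channels: list      # channels notified at this step

def run_escalation(steps, acknowledged_at=None):
    """Simulate one escalation run. `acknowledged_at` is in minutes after the
    alert fired; None means nobody ever acknowledges. Returns the list of
    (minute, channels) notifications actually sent."""
    sent = []
    clock = 0
    for step in steps:
        clock += step.delay_minutes
        if acknowledged_at is not None and acknowledged_at <= clock:
            break  # acknowledgment stops the escalation before this step fires
        sent.append((clock, step.channels))
    return sent

policy = [
    Step(0,  ["slack:#oncall-alerts", "email:oncall@company.com"]),
    Step(10, ["slack:#engineering", "sms:on-call-engineer"]),
    Step(30, ["sms:eng-manager", "call:eng-manager"]),
]
```

With no acknowledgment, notifications go out at minutes 0, 10, and 40; acknowledging at minute 5 means only Step 1 ever fires.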
Creating an Escalation Policy#
Navigate to Monitoring > Alerts > Escalation Policies and click Create Policy.
Policy Configuration#
| Setting | Description |
|---------|-------------|
| Name | Descriptive policy name (e.g., "Production Critical") |
| Description | What this policy covers |
| Steps | Ordered list of notification steps (1-5) |
| Repeat | Whether to restart from Step 1 if all steps exhaust without acknowledgment |
| Repeat Limit | Maximum number of full cycles (default: 3) |
Configuring Steps#
Each step defines who gets notified and how long to wait before escalating:
Step 1: Immediate
| Setting | Value |
|---------|-------|
| Delay | 0 minutes (immediate) |
| Channels | Slack #oncall-alerts, Email: oncall@company.com |
Step 2: 10-Minute Escalation
| Setting | Value |
|---------|-------|
| Delay | 10 minutes |
| Channels | Slack #engineering, SMS: on-call engineer |
Step 3: 30-Minute Escalation
| Setting | Value |
|---------|-------|
| Delay | 30 minutes |
| Channels | SMS: engineering manager, Phone call: engineering manager |
Step 4: 60-Minute Escalation
| Setting | Value |
|---------|-------|
| Delay | 60 minutes |
| Channels | SMS: VP Engineering, Email: leadership@company.com |
Step 5: Final Escalation
| Setting | Value |
|---------|-------|
| Delay | 120 minutes |
| Channels | Phone call: CTO, SMS: entire engineering team |
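The five steps above can be written out as a single data structure. The schema below is a hypothetical representation for reference only; it is not a documented import/export format:

```python
# Hypothetical representation of the "Production Critical" policy above.
# Field names (delay_minutes, channels, repeat_limit) are illustrative.
production_critical = {
    "name": "Production Critical",
    "repeat": True,
    "repeat_limit": 3,
    "steps": [
        {"delay_minutes": 0,   "channels": ["slack:#oncall-alerts", "email:oncall@company.com"]},
        {"delay_minutes": 10,  "channels": ["slack:#engineering", "sms:on-call-engineer"]},
        {"delay_minutes": 30,  "channels": ["sms:eng-manager", "call:eng-manager"]},
        {"delay_minutes": 60,  "channels": ["sms:vp-engineering", "email:leadership@company.com"]},
        {"delay_minutes": 120, "channels": ["call:cto", "sms:eng-team"]},
    ],
}
```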
Delay Configuration#
Each step's delay is the time to wait after the previous step's notification before escalating. The delay is measured from the notification send time, not from the original alert firing time.
| Delay | Use Case |
|-------|----------|
| 0 minutes | First responder, immediate notification |
| 5 minutes | Quick follow-up for time-sensitive issues |
| 10 minutes | Standard escalation window |
| 15 minutes | Moderate urgency |
| 30 minutes | Senior escalation |
| 60 minutes | Management escalation |
| 120 minutes | Executive escalation |
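Because each delay is measured from the previous notification rather than from the original firing time, the absolute escalation timeline is the running sum of the delays. A quick sketch:

```python
def notification_times(delays):
    """Absolute minutes after the alert fires at which each step notifies,
    given per-step delays measured from the previous step's send time."""
    times, clock = [], 0
    for d in delays:
        clock += d
        times.append(clock)
    return times
```

For the five-step example policy (delays 0, 10, 30, 60, 120), notifications go out at minutes 0, 10, 40, 100, and 220 after firing, not at the delay values themselves.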
Channel Types Per Step#
Each step can use one or more notification channels:
| Channel | Description | Setup |
|---------|-------------|-------|
| Email | Email notification with alert details | Configured in Notification Channels |
| Slack | Message to a Slack channel or DM | Slack integration required |
| Webhook | POST to a URL with alert payload | Configure URL in channels |
| SMS | Text message to a phone number | Twilio integration required |
| Phone Call | Automated phone call with TTS alert | Twilio integration required |
| In-App | Dashboard notification bell | Automatic for all dashboard users |
| PagerDuty | PagerDuty incident creation | PagerDuty integration required |
| Microsoft Teams | Teams channel message | Teams webhook required |
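For the webhook channel, your endpoint receives a POST with the alert payload. The field names below are hypothetical -- inspect an actual delivery to learn the real schema before relying on it:

```python
import json

def handle_alert_webhook(body: bytes) -> str:
    """Parse an incoming alert webhook body into a one-line summary.
    The keys (severity, rule_name, value) are assumed, not documented."""
    alert = json.loads(body)
    return f"[{alert['severity'].upper()}] {alert['rule_name']}: {alert['value']}"

# A made-up sample payload for illustration:
sample = json.dumps({
    "severity": "critical",
    "rule_name": "Error rate > 5%",
    "value": "8.2%",
}).encode()
```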
Acknowledging Alerts#
Acknowledging an alert stops the escalation at the current step.
How to Acknowledge#
- Dashboard: Click "Acknowledge" on the alert in Monitoring > Alerts
- Email: Click the "Acknowledge" link in the alert email
- Slack: Click the "Acknowledge" button on the Slack message
- SMS: Reply "ACK" to the SMS notification
- API: POST to `/api/alerts/{alertId}/acknowledge`
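A minimal sketch of the API call, using only the Python standard library. The base URL and Bearer-token auth scheme are assumptions -- check your API settings for the actual host and authentication method:

```python
import urllib.request

def build_ack_request(alert_id: str, api_token: str,
                      base_url: str = "https://ja.app") -> urllib.request.Request:
    """Build the acknowledge POST request. Host and auth header are assumed."""
    return urllib.request.Request(
        f"{base_url}/api/alerts/{alert_id}/acknowledge",
        method="POST",
        headers={"Authorization": f"Bearer {api_token}"},
    )

def acknowledge_alert(alert_id: str, api_token: str) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_ack_request(alert_id, api_token)) as resp:
        return resp.status
```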
Acknowledgment Details#
When an alert is acknowledged:
- Escalation stops immediately
- All channels receive an "acknowledged by [user]" notification
- The alert status changes from "Firing" to "Acknowledged"
- The alert remains acknowledged until the condition resolves or the user manually re-escalates
What if Nobody Acknowledges?#
If all steps exhaust without acknowledgment:
- If Repeat is enabled, the policy restarts from Step 1
- The cycle repeats up to the configured Repeat Limit (default: 3 times)
- After all repeats, the alert remains in "Firing" state with an "Escalation Exhausted" flag
- A special "escalation exhausted" notification is sent to all channels in the final step
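To gauge how noisy an unacknowledged alert can get, you can count the total step notifications before exhaustion. This reads Repeat Limit as the maximum number of full cycles, per the policy configuration table; verify against your policy's actual behavior:

```python
def total_notifications(num_steps: int, repeat: bool = False,
                        repeat_limit: int = 3) -> int:
    """Step notifications sent if nobody ever acknowledges.
    Assumes Repeat Limit caps the total number of full cycles."""
    cycles = repeat_limit if repeat else 1
    return num_steps * cycles
```

A 5-step policy sends 5 notifications without repeat, and up to 15 with repeat enabled at the default limit of 3.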
Linking Policies to Alert Rules#
Connect escalation policies to your alert rules:
1. Go to Monitoring > Alerts and edit (or create) an alert rule
2. In the Escalation section, select a policy from the dropdown
3. Save the rule
Policy Assignment by Severity#
A common pattern is to assign different policies based on alert severity:
| Severity | Escalation Policy |
|----------|-------------------|
| Info | None (notification only, no escalation) |
| Warning | "Standard" (email, then Slack after 15 min) |
| Critical | "Production Critical" (full 5-step escalation with SMS/phone) |
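If you manage alert rules via automation, the severity-to-policy mapping above reduces to a simple lookup (policy names here mirror the table; they are examples, not built-ins):

```python
# Example severity → escalation-policy mapping, mirroring the table above.
POLICY_BY_SEVERITY = {
    "info": None,                      # notification only, no escalation
    "warning": "Standard",
    "critical": "Production Critical",
}

def policy_for(severity: str):
    """Return the escalation policy name for a severity, or None."""
    return POLICY_BY_SEVERITY.get(severity.lower())
```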
Multiple Rules, One Policy#
A single escalation policy can be shared across multiple alert rules. For example, the "Production Critical" policy might be used by:
- Error rate > 5% (critical)
- P95 latency > 5000ms (critical)
- CPU usage > 95% for 10 minutes (critical)
- Uptime check failing (critical)
SMS Alerting Setup (Twilio)#
SMS and phone call notifications require a Twilio account.
Twilio Configuration#
1. Navigate to Settings > Integrations > Twilio
2. Enter your Twilio credentials:

   | Setting | Description |
   |---------|-------------|
   | Account SID | Your Twilio account SID |
   | Auth Token | Your Twilio auth token |
   | Phone Number | Twilio phone number to send from (e.g., +1234567890) |

3. Click Test Connection to verify
4. Save the configuration
Adding Phone Numbers#
Add recipient phone numbers in Notification Channels:
1. Go to Monitoring > Alerts > Notification Channels
2. Click Create Channel and select "SMS" or "Phone Call"
3. Enter the recipient's phone number with country code (e.g., +1-555-123-4567)
4. A verification code is sent -- enter it to confirm the number
5. Use this channel in escalation policy steps
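Twilio expects numbers in E.164 form (`+15551234567`), while humans tend to type separators. A minimal normalization sketch -- it strips separators only and does no country-specific validation:

```python
import re

def to_e164(raw: str) -> str:
    """Normalize a human-entered number like '+1-555-123-4567' to
    E.164 ('+15551234567'). Strips separators; validates nothing else."""
    digits = re.sub(r"[^\d+]", "", raw)
    if not digits.startswith("+"):
        raise ValueError("number must include a country code, e.g. +1...")
    return digits
```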
SMS Message Format#
SMS alerts include:
```
[JA CRITICAL] Error rate 8.2% (threshold: 5%)
Service: api-server
Site: acme.com
Time: 2026-03-15 14:23 UTC
Acknowledge: https://ja.app/ack/abc123
```
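For reference, the SMS layout above can be reproduced with a small formatter (the parameter names are illustrative, not the product's template variables):

```python
def format_sms(severity, metric, value, threshold, service, site, time_utc, ack_url):
    """Render the five-line SMS alert layout shown above."""
    return (
        f"[JA {severity.upper()}] {metric} {value} (threshold: {threshold})\n"
        f"Service: {service}\n"
        f"Site: {site}\n"
        f"Time: {time_utc}\n"
        f"Acknowledge: {ack_url}"
    )
```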
Phone Call Format#
Automated phone calls use text-to-speech:
"JustAnalytics critical alert. Error rate is 8.2 percent on api-server for acme.com. Press 1 to acknowledge. Press 2 to repeat."
Pressing 1 acknowledges the alert. The call retries up to 3 times if unanswered.
Best Practices#
- Start simple -- begin with 2-3 steps and add more as needed
- Test your policies -- use the "Test Escalation" button to verify all channels work before relying on them in production
- Set reasonable delays -- too short causes alert fatigue, too long delays response
- Use SMS/phone for critical only -- reserve intrusive channels for truly critical alerts
- Rotate on-call -- update escalation policy channels when on-call rotations change (or integrate with PagerDuty for automatic rotation)
- Include context -- use descriptive alert rule names so escalation messages are immediately understandable
- Enable repeat -- for critical policies, enable repeat (limit 2-3) to avoid silent failures
- Acknowledge quickly -- acknowledging stops escalation and signals to the team that someone is on it
- Review monthly -- audit escalation policy effectiveness: how often do alerts reach Step 3+? If frequently, your Step 1 response may need improvement