Health Checks

Monitor your endpoints and get alerted when they go down.

Overview

Uptime monitoring continuously checks your endpoints from multiple regions and alerts you when they become unreachable or return unexpected responses.

Creating a Monitor

Navigate to Monitoring > Uptime and click Add Monitor.

Configuration

| Setting | Description | Example |
|---------|-------------|---------|
| Name | Descriptive monitor name | Production API |
| URL | Endpoint to check | https://api.example.com/health |
| Method | HTTP method | GET, POST, HEAD |
| Interval | Check frequency | 1, 5, 10, 15, 30 minutes |
| Timeout | Max wait time | 10, 15, 30 seconds |
| Expected Status | Success status code | 200 |
| Regions | Check from regions | US East, EU West, Asia Pacific |
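
To make the settings concrete, here is a minimal Python sketch of how a monitor definition could be represented and sanity-checked before saving. The `MonitorConfig` class and its field names are hypothetical, not part of the product, and the sketch assumes the values in the Example column are the only allowed options.

```python
from dataclasses import dataclass, field

# Assumption: the Example column of the settings table lists every allowed value.
ALLOWED_METHODS = {"GET", "POST", "HEAD"}
ALLOWED_INTERVALS_MIN = {1, 5, 10, 15, 30}
ALLOWED_TIMEOUTS_SEC = {10, 15, 30}
KNOWN_REGIONS = {"US East", "EU West", "Asia Pacific"}


@dataclass
class MonitorConfig:
    """Hypothetical in-memory form of one monitor's settings."""
    name: str
    url: str
    method: str = "GET"
    interval_minutes: int = 5
    timeout_seconds: int = 10
    expected_status: int = 200
    regions: list = field(default_factory=lambda: ["US East", "EU West"])

    def validate(self):
        """Return a list of problems; an empty list means the settings look valid."""
        problems = []
        if not self.url.startswith(("http://", "https://")):
            problems.append("url must start with http:// or https://")
        if self.method not in ALLOWED_METHODS:
            problems.append(f"unsupported method: {self.method}")
        if self.interval_minutes not in ALLOWED_INTERVALS_MIN:
            problems.append(f"unsupported interval: {self.interval_minutes} minutes")
        if self.timeout_seconds not in ALLOWED_TIMEOUTS_SEC:
            problems.append(f"unsupported timeout: {self.timeout_seconds} seconds")
        if not self.regions or set(self.regions) - KNOWN_REGIONS:
            problems.append("regions must be a non-empty subset of the known regions")
        return problems


config = MonitorConfig(name="Production API", url="https://api.example.com/health")
print(config.validate())
```

Returning a list of problems rather than raising on the first one lets a form show every invalid field at once.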

Request Configuration

Optionally configure request headers and body:

Headers:
  Authorization: Bearer your-token
  Content-Type: application/json

Body (POST only):
  {"check": "health"}
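
For readers who want to reproduce a configured check by hand, the following Python sketch builds the equivalent request with the standard library. The helper name `build_check_request` is hypothetical, and the sketch only illustrates the headers and body shown above, not how the service itself sends requests.

```python
import json
import urllib.request


def build_check_request(url, method="GET", headers=None, body=None):
    """Build (but do not send) the HTTP request for one check."""
    data = None
    if body is not None:
        # Mirrors the "Body (POST only)" rule above.
        if method != "POST":
            raise ValueError("a request body is only supported for POST checks")
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers or {}, method=method)


request = build_check_request(
    "https://api.example.com/health",
    method="POST",
    headers={
        "Authorization": "Bearer your-token",
        "Content-Type": "application/json",
    },
    body={"check": "health"},
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` would send it with a 10-second timeout, matching the smallest timeout in the settings table.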

Response Validation

Beyond status codes, validate the response body:

  • Contains text -- response must contain a specific string
  • JSON path -- a JSON field must match a value
  • Response time -- must respond within N milliseconds
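
The three validation rules above can be sketched in Python as simple predicates. The function names and the dotted path syntax (`checks.0.healthy`) are assumptions made for illustration; the product's actual JSON path syntax may differ.

```python
import json


def contains_text(body, needle):
    """Contains text: the response body must include the given string."""
    return needle in body


def json_path_equals(body, path, expected):
    """JSON path: follow a dotted path (dict keys and list indexes) and compare."""
    try:
        node = json.loads(body)
    except json.JSONDecodeError:
        return False  # a non-JSON body can never satisfy a JSON rule
    for part in path.split("."):
        if isinstance(node, dict) and part in node:
            node = node[part]
        elif isinstance(node, list) and part.isdigit() and int(part) < len(node):
            node = node[int(part)]
        else:
            return False  # path does not exist in this document
    return node == expected


def within_response_time(elapsed_ms, limit_ms):
    """Response time: the check must complete within limit_ms milliseconds."""
    return elapsed_ms <= limit_ms


sample = '{"status": "ok", "checks": [{"name": "db", "healthy": true}]}'
print(json_path_equals(sample, "checks.0.healthy", True))
```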

Monitor Dashboard

Each monitor shows:

  • Current status -- Up or Down with response time
  • Uptime percentage -- over 24h, 7d, 30d, 90d
  • Response time chart -- latency over time
  • Incident history -- list of all downtime events
  • Average response time -- across all regions
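
As a rough illustration of how the dashboard figures could be derived, the Python sketch below computes an uptime percentage over a time window and an average response time across regions. It assumes uptime is the share of successful checks in the window; the product may instead weight by downtime duration, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean


def uptime_percentage(checks, now, window):
    """checks: iterable of (timestamp, is_up). Returns None if the window is empty."""
    recent = [is_up for ts, is_up in checks if now - window <= ts <= now]
    if not recent:
        return None
    # True counts as 1 and False as 0, so sum() is the number of successful checks.
    return 100.0 * sum(recent) / len(recent)


def average_response_time(latest_ms_by_region):
    """Average the most recent response time (in ms) reported by each region."""
    return mean(latest_ms_by_region.values())


now = datetime(2024, 1, 8, 12, 0)
# One check per hour for the last 48 hours; only the check 3 hours ago failed.
hourly_checks = [(now - timedelta(hours=h), h != 3) for h in range(48)]

print(uptime_percentage(hourly_checks, now, timedelta(hours=24)))
print(average_response_time({"US East": 100, "EU West": 200, "Asia Pacific": 300}))
```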

Incidents

When a monitor fails, an incident is created:

  1. Detection -- check fails (confirmed after 2 consecutive failures to avoid flapping)
  2. Alert -- notifications sent to configured channels
  3. Duration -- incident remains open until endpoint recovers
  4. Recovery -- recovery notification sent when endpoint is back
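
The lifecycle above behaves like a small state machine. The Python sketch below is one way to model it, assuming the failure counter resets on any successful check; the `IncidentTracker` class is hypothetical and only illustrates the two-failure confirmation rule.

```python
from dataclasses import dataclass, field

FAILURES_TO_CONFIRM = 2  # from the Detection step above


@dataclass
class IncidentTracker:
    """Hypothetical tracker for one monitor's incident state."""
    consecutive_failures: int = 0
    incident_open: bool = False
    events: list = field(default_factory=list)

    def record(self, check_ok):
        """Feed in one check result; appends 'alert' or 'recovery' on transitions."""
        if check_ok:
            self.consecutive_failures = 0
            if self.incident_open:
                self.incident_open = False
                self.events.append("recovery")
            return
        self.consecutive_failures += 1
        # Open an incident only once the failure is confirmed, and only once.
        if not self.incident_open and self.consecutive_failures >= FAILURES_TO_CONFIRM:
            self.incident_open = True
            self.events.append("alert")


tracker = IncidentTracker()
for result in [True, False, True, False, False, False, True]:
    tracker.record(result)
print(tracker.events)
```

In this sequence the single isolated failure produces no event, while the later run of three failures produces exactly one alert followed by one recovery.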

Incident Detail

Each incident records:

  • Start and end time
  • Total downtime duration
  • Error details (timeout, DNS failure, HTTP error, etc.)
  • Response body (if any)
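
The recorded fields map naturally onto a small record type. The Python sketch below is a hypothetical shape for such a record, not the product's actual schema; it derives the downtime duration from the start and end times and falls back to the current time while the incident is still open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical incident record with the fields listed above."""
    started_at: datetime
    error: str                           # e.g. "timeout", "DNS failure", "HTTP 503"
    ended_at: Optional[datetime] = None  # None while the incident is still open
    response_body: Optional[str] = None  # None when no response was received

    def downtime(self, now):
        """Total downtime so far, or the final duration once the incident has ended."""
        return (self.ended_at or now) - self.started_at


incident = Incident(started_at=datetime(2024, 1, 8, 12, 0), error="timeout")
print(incident.downtime(now=datetime(2024, 1, 8, 12, 7)))
```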

Alerts

Uptime monitors automatically create alerts:

  • Endpoint down -- after 2 consecutive failures
  • Slow response -- response time exceeds threshold
  • SSL certificate expiring -- certificate expires within 14 days
  • Recovery -- when endpoint comes back up
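
To show how the certificate expiry rule could be evaluated, the Python sketch below converts a certificate's `notAfter` string (the format returned by `ssl.SSLSocket.getpeercert()`) into days remaining. The function names are hypothetical, and treating an already expired certificate as "expiring" is an assumption.

```python
import ssl
from datetime import datetime, timezone

EXPIRY_WARNING_DAYS = 14  # threshold from the alert list above


def days_until_expiry(not_after, now):
    """not_after looks like 'Jun 15 12:00:00 2024 GMT'; now must be timezone-aware."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def certificate_expiring(not_after, now):
    """True when the certificate expires within the warning window (or already has)."""
    return days_until_expiry(not_after, now) <= EXPIRY_WARNING_DAYS


not_after = "Jun 15 12:00:00 2024 GMT"
print(certificate_expiring(not_after, datetime(2024, 6, 5, 12, 0, tzinfo=timezone.utc)))
```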

Status Page

Share uptime status with your users by enabling the public status page. This shows:

  • Current status of all monitors
  • Uptime percentages
  • Recent incident history
  • Scheduled maintenance windows

Best Practices

  1. Monitor critical paths -- check your most important endpoints, not just the root path (/)
  2. Use health check endpoints -- create dedicated /health endpoints that verify database connectivity
  3. Check from multiple regions -- detect regional outages
  4. Set appropriate intervals -- critical services at 1 minute, others at 5-15 minutes
  5. Configure meaningful timeouts -- too short causes false alarms, too long delays detection
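
As an illustration of the second practice, here is a minimal Python sketch of the logic behind a /health endpoint that verifies database connectivity. It uses SQLite only so the example is self-contained; the function name, the response fields, and the choice of 503 for failures are assumptions, and a real service would wire this into its own web framework and database driver.

```python
import json
import sqlite3


def health_check(conn):
    """Return (status_code, json_body) for a /health endpoint."""
    try:
        # A trivial query proves the connection is usable without touching real data.
        conn.execute("SELECT 1").fetchone()
    except sqlite3.Error as exc:
        return 503, json.dumps({"status": "unhealthy", "database": str(exc)})
    return 200, json.dumps({"status": "ok", "database": "reachable"})


connection = sqlite3.connect(":memory:")
healthy_status, healthy_body = health_check(connection)
print(healthy_status, healthy_body)
```

Returning a non-200 status when the database is unreachable lets a monitor with Expected Status 200 detect the failure without any response body validation.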