Each check is a real file in apps/worker/src/checks.ts.
Nothing below is a roadmap item.
Verifies status codes, measures response time, and flags degradation before a full outage. Sends custom headers (encrypted at rest), compares against the expected status, and tracks p95 latency over the last 24h.
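The core of that check can be sketched in a few lines. This is a minimal illustration, not the actual code in `apps/worker/src/checks.ts`; the names `httpCheck`, `statusMatches`, and `p95` are hypothetical:

```typescript
type HttpCheckResult = { ok: boolean; latencyMs: number };

// Compare the observed status against the configured expectation.
function statusMatches(observed: number, expected: number): boolean {
  return observed === expected;
}

// p95 over a window of latency samples (e.g. the last 24h of checks).
function p95(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.ceil(sorted.length * 0.95) - 1;
  return sorted[Math.max(0, idx)];
}

// One HTTP probe: custom headers, a hard timeout, and a latency sample.
async function httpCheck(
  url: string,
  headers: Record<string, string>,
  expectedStatus = 200,
  timeoutMs = 10_000,
): Promise<HttpCheckResult> {
  const started = Date.now();
  const res = await fetch(url, {
    headers,
    signal: AbortSignal.timeout(timeoutMs), // degradation shows up as latency before it becomes an outage
  });
  return { ok: statusMatches(res.status, expectedStatus), latencyMs: Date.now() - started };
}
```

Feeding each `latencyMs` into the `p95` window is what lets a slow-but-200 endpoint trip an alert before it ever returns a 500.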
Reachable-or-not checks for databases, message queues, anything that speaks TCP. Connects to a pre-validated IP, so SSRF tricks like DNS rebinding can't redirect the connection.
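The pattern behind that protection: resolve once, validate the literal IP, then dial the IP itself so nothing re-resolves mid-flight. A sketch with hypothetical names (`isPrivateIPv4`, `tcpCheck` are not claimed to be the real exports):

```typescript
import * as net from "node:net";
import { promises as dns } from "node:dns";

// Reject loopback, RFC 1918, and link-local targets so a malicious DNS
// record can't aim the checker at internal infrastructure (SSRF).
function isPrivateIPv4(ip: string): boolean {
  const [a, b] = ip.split(".").map(Number);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Resolve once, validate, then connect to the numeric IP. The socket never
// re-resolves, so a DNS rebind between lookup and connect changes nothing.
async function tcpCheck(host: string, port: number, timeoutMs = 5_000): Promise<boolean> {
  const { address } = await dns.lookup(host, { family: 4 });
  if (isPrivateIPv4(address)) throw new Error(`refusing private target ${address}`);
  return new Promise((resolve) => {
    const socket = net.connect({ host: address, port, timeout: timeoutMs });
    socket.once("connect", () => { socket.destroy(); resolve(true); });
    socket.once("error", () => resolve(false));
    socket.once("timeout", () => { socket.destroy(); resolve(false); });
  });
}
```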
Warns you 7 days before expiry. Validates cert chain, not just "200 OK". SNI-correct even when connecting by IP.
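The SNI detail matters: if you dial a pre-validated IP without telling TLS which hostname you mean, multi-tenant servers hand back the wrong certificate. A sketch under assumed names (`certExpiryDays`, `daysUntil` are illustrative):

```typescript
import * as tls from "node:tls";

// Days until a certificate's notAfter date; negative means already expired.
// A result below 7 is what triggers the expiry warning.
function daysUntil(notAfter: Date, now = new Date()): number {
  return Math.floor((notAfter.getTime() - now.getTime()) / 86_400_000);
}

// Dial the pre-validated IP but pass `servername` explicitly, so SNI and
// hostname verification use the real hostname. rejectUnauthorized defaults
// to true, so the full chain is validated, not just "a cert exists".
function certExpiryDays(ip: string, hostname: string, port = 443): Promise<number> {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host: ip, port, servername: hostname }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve(daysUntil(new Date(cert.valid_to)));
    });
    socket.once("error", reject);
  });
}
```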
Classic reachability. Uses execFile (not shell) with a pre-validated IP — no shell-injection surface, no DNS re-lookup by the ping binary itself.
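The shape of that safety argument, sketched (hypothetical helper names; flag semantics assume Linux `ping`, where `-W` is seconds, while macOS interprets `-W` as milliseconds):

```typescript
import { execFile } from "node:child_process";

// Build argv for a single probe. The target is asserted to be an IPv4
// literal, so the ping binary never performs its own DNS lookup, and each
// value is a discrete argv entry: no shell, no injection surface.
function pingArgs(ip: string, timeoutSec = 2): string[] {
  if (!/^\d{1,3}(\.\d{1,3}){3}$/.test(ip)) throw new Error(`not an IPv4 literal: ${ip}`);
  return ["-c", "1", "-W", String(timeoutSec), ip];
}

// execFile (unlike exec) never spawns a shell to interpret the arguments.
function pingCheck(ip: string): Promise<boolean> {
  return new Promise((resolve) => {
    execFile("ping", pingArgs(ip), (err) => resolve(!err));
  });
}
```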
Does the page still say "Order placed" — or is the 200 just a generic landing page? Catches silent content regressions that status codes miss.
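In sketch form (illustrative names, not the real exports): a status check proves the server answered, a keyword check proves the right page answered.

```typescript
// True when the expected marker text appears in the response body.
function keywordPresent(body: string, keyword: string): boolean {
  return body.includes(keyword);
}

// Fetch the page, then require both a 2xx status AND the marker text,
// so a generic landing page that returns 200 still fails the check.
async function keywordCheck(url: string, keyword: string, timeoutMs = 10_000): Promise<boolean> {
  const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
  return res.ok && keywordPresent(await res.text(), keyword);
}
```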
Like the TCP check, but for targets whose URL doesn't include a port. Same SSRF protections, same timeouts.
Public, slugged, optionally domain-mapped (`status.yourco.com`). Group monitors into components. Read-only by default — embed the incident feed on your own site if you want.
Auto-opened when a monitor crosses its consecutive-failure threshold. Manually resolve, add updates, link to runbooks. Audit-logged end to end.
Schedule downtime by monitor or across the whole org. Checks still run; they just don't page. Uptime math respects the window.
Email + webhook today. Deterministic jobId per (incident, channel) — retries can never double-page. Failed jobs sit in a 7-day DLQ for forensics.
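The dedup trick is that the job id is a pure function of the (incident, channel) pair, so a queue that deduplicates on id (BullMQ-style `jobId`) makes re-enqueues no-ops. A sketch with illustrative names:

```typescript
type Channel = "email" | "webhook";

// Same incident + same channel always yields the same id, so a retry or a
// duplicate event can never enqueue a second page for the same pair.
function notificationJobId(incidentId: string, channel: Channel): string {
  return `notify:${incidentId}:${channel}`;
}

// Hypothetical enqueue, assuming a BullMQ-like queue API:
// await queue.add("notify", payload, {
//   jobId: notificationJobId(incident.id, "email"),
//   attempts: 5,
// });
// Failed jobs are retained for 7 days for forensics (the DLQ window).
```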
Every mutation — create/update/delete on any entity — writes an immutable row with actor, org, action, and metadata. Query it via SQL directly, no UI gating.
Every row filtered by `organizationId`. Cross-org access is a 404, covered by automated integration tests. Your data is structurally siloed.
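The "cross-org access is a 404" behavior falls out of scoping every lookup by the caller's org: a row owned by another org is indistinguishable from one that doesn't exist. A sketch over an in-memory array (hypothetical names; the real version is a scoped database query):

```typescript
type Monitor = { id: string; organizationId: string; url: string };

class NotFoundError extends Error {
  status = 404;
}

// Both the id AND the caller's organizationId must match. There is no code
// path that can see another org's row, so there is nothing to leak: not
// even the fact that the id exists.
function findMonitor(rows: Monitor[], orgId: string, monitorId: string): Monitor {
  const row = rows.find((r) => r.id === monitorId && r.organizationId === orgId);
  if (!row) throw new NotFoundError(`monitor ${monitorId} not found`);
  return row;
}
```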
No "pro tier" gating what should be default.