Health
How to probe Prelude TE for liveness and readiness with the /api/health endpoint, and where to look when something is not flowing.
Prelude TE exposes a dedicated health endpoint suitable for load balancers, container healthchecks, Kubernetes probes, and ops dashboards.
GET /api/health
A public JSON endpoint — no authentication required — that returns a snapshot of the engine and its subsystems.
curl -fsSL https://te.example.com/api/health
Response shape
{
  "status": "ok",
  "service": "prelude-te",
  "version": "1.2.3",
  "commit": "a1b2c3d",
  "uptime_seconds": 84213,
  "timestamp": "2026-05-26T08:31:14Z",
  "checks": {
    "database": { "status": "ok" },
    "bgp": {
      "status": "ok",
      "running": true,
      "peers_total": 4,
      "peers_established": 4
    },
    "topology": {
      "status": "ok",
      "running": true,
      "domains": 2,
      "nodes": 318,
      "links": 642
    },
    "outputs": {
      "status": "ok",
      "running": true,
      "total": 1,
      "connected": 1,
      "errors": 0
    },
    "licensing": {
      "status": "ok",
      "tier": "standard",
      "trial_state": "registered",
      "trial_days_remaining": 0
    }
  }
}
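A monitoring script may want the same payload without shelling out to curl. The following is a minimal sketch using only the Python standard library. The helper name `fetch_health` is hypothetical, and the code assumes that a 503 reply carries a JSON body of the same shape as a 200 reply, which this page does not state explicitly.

```python
import json
import urllib.error
import urllib.request


def fetch_health(url: str, timeout: float = 5.0) -> tuple[int, dict]:
    """Return (HTTP status code, parsed JSON body) for a health URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the error object still holds the body.
        # Assumption: the 503 body is JSON shaped like the 200 body.
        return err.code, json.loads(err.read())


if __name__ == "__main__":
    code, payload = fetch_health("https://te.example.com/api/health")
    print(code, payload.get("status"))
```

Connection failures (DNS, refused connection, timeout) are deliberately not caught here, so the caller can tell an unreachable host apart from an engine that answered with `down`.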
Global status
The top-level status aggregates the per-check statuses:
| Value | Meaning | HTTP |
|---|---|---|
| ok | Every subsystem is healthy. | 200 |
| degraded | A non-critical check is unhealthy (e.g. no peer established, output errors). | 200 |
| down | The database is unreachable. The engine cannot serve traffic. | 503 |
The database is the only critical dependency: if its check is
down, the global status is down and the endpoint returns
HTTP 503. Any other unhealthy check yields degraded with
HTTP 200.
This split is intentional: a Kubernetes liveness probe that
restarts on 503 should not restart the engine just because BGP
peers are flapping. Use down as the only "restart me" signal.
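The aggregation rule can be written down in a few lines. This is a reader-side sketch of the behavior described above, not the engine's own code; the function name `aggregate_status` is hypothetical, and it assumes any non-database status other than `ok` counts as unhealthy.

```python
def aggregate_status(checks: dict) -> tuple[str, int]:
    """Derive (global status, HTTP code) from the per-check statuses."""
    statuses = {name: check.get("status") for name, check in checks.items()}
    # The database is the only critical dependency.
    if statuses.get("database") == "down":
        return "down", 503
    # Any other unhealthy check degrades the service but keeps HTTP 200.
    if any(status != "ok" for status in statuses.values()):
        return "degraded", 200
    return "ok", 200


if __name__ == "__main__":
    sample = {"database": {"status": "ok"}, "bgp": {"status": "degraded"}}
    print(aggregate_status(sample))
```

Checking the database first is what keeps a flapping BGP session from ever producing a 503.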
Per-check details
| Check | status becomes degraded when… |
|---|---|
| database | down if the DB connection or ping fails. No degraded state. |
| bgp | At least one peer is enabled but none are in established state, or BGP isn't running. |
| topology | The topology manager is not running. |
| outputs | One or more outputs are in error state, or the output manager is not running. |
| licensing | The built-in trial has expired and no license has been registered yet. |
The bgp, topology, and outputs checks also expose live
counters (peers, domains, nodes, links, connected outputs, errors)
so you can read operational state without a second round-trip.
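Because the counters travel with each check, a dashboard can report which checks are unhealthy and why from a single response. A minimal sketch, assuming the payload shape shown under Response shape; the helper name `unhealthy_checks` is hypothetical.

```python
def unhealthy_checks(payload: dict) -> dict:
    """Map each non-ok check name to its fields other than 'status'."""
    result = {}
    for name, check in payload.get("checks", {}).items():
        if check.get("status") != "ok":
            # Keep the live counters so the caller can see why it is unhealthy.
            result[name] = {k: v for k, v in check.items() if k != "status"}
    return result


if __name__ == "__main__":
    sample = {
        "checks": {
            "database": {"status": "ok"},
            "bgp": {"status": "degraded", "running": True,
                    "peers_total": 4, "peers_established": 0},
        }
    }
    print(unhealthy_checks(sample))
```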
Trimmed payload for liveness probes
For Kubernetes liveness probes that only need an up/down verdict,
pass ?verbose=false to get a smaller payload:
curl -fsSL 'https://te.example.com/api/health?verbose=false'
{
  "status": "ok",
  "service": "prelude-te",
  "uptime_seconds": 84213,
  "timestamp": "2026-05-26T08:31:14Z"
}
Kubernetes probe example
livenessProbe:
  httpGet:
    path: /api/health?verbose=false
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /api/health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
Both probes succeed on HTTP 200 (ok or degraded) and fail on
HTTP 503 (down). Use the readiness probe's verbose payload in
logs and dashboards to see why a pod is degraded without leaving
the cluster.
Operational signals to watch
Beyond the endpoint itself, a working Prelude TE deployment shows three healthy signals at the same time:
- At least one peer is established: checks.bgp.peers_established > 0, or open BGP → Peers in the web UI. A fleet where every peer is idle or stuck in connect/active means nothing is feeding the graph.
- The topology is non-empty and recent: checks.topology.nodes > 0, or open Topology in the UI. Pair this with the per-domain last-update timestamp from GET /api/topology/stats for a liveness signal on the data pipeline.
- Enabled outputs are connected: checks.outputs.connected == checks.outputs.total and checks.outputs.errors == 0. Open Outputs to see each output's last-error when something is off.
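The three conditions above can be evaluated directly against a verbose /api/health payload. The sketch below covers only what that payload exposes; the "recent" half of the topology signal needs GET /api/topology/stats, whose response shape is not documented on this page, so it is left out. The function name `healthy_signals` and the result keys are hypothetical.

```python
def healthy_signals(payload: dict) -> dict[str, bool]:
    """Evaluate the three operational signals from a verbose health payload."""
    checks = payload["checks"]
    bgp, topology, outputs = checks["bgp"], checks["topology"], checks["outputs"]
    return {
        "peer_established": bgp["peers_established"] > 0,
        "topology_non_empty": topology["nodes"] > 0,
        "outputs_connected": (
            outputs["connected"] == outputs["total"] and outputs["errors"] == 0
        ),
    }


if __name__ == "__main__":
    sample = {
        "checks": {
            "bgp": {"peers_established": 4},
            "topology": {"nodes": 318},
            "outputs": {"total": 1, "connected": 1, "errors": 0},
        }
    }
    signals = healthy_signals(sample)
    print(signals, all(signals.values()))
```

A deployment is flowing end to end only when all three values are true at once.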
When something is wrong
Walk the pipeline from the source to the sink:
- Peer state: if checks.bgp.peers_established is 0 while peers are enabled, look at each peer's State history for the reason of the last failure. See Peers.
- Topology stats: if peers are up but checks.topology.nodes stays at 0, the router may not be exporting BGP-LS NLRIs. Confirm the BGP-LS AFI/SAFI is enabled on the peer side.
- Output state: if topology is populated but a downstream consumer is silent, check checks.outputs.errors and the output's last-error on the Outputs detail page. See Outputs / NATS.
- Logs: the engine writes per-module logs under storage/logs/ (e.g. prelude-te.log, access.log). Tail them when the UI and /api/health signals are not enough.
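The source-to-sink order above lends itself to a small triage helper that names the first stage worth inspecting. This is an illustrative sketch only: the function name `first_suspect` and its return labels are hypothetical, and the payload as documented does not say whether peers are enabled, so a zero `peers_established` still has to be confirmed in the UI.

```python
def first_suspect(payload: dict) -> str:
    """Return the first pipeline stage to inspect, walking source to sink."""
    checks = payload["checks"]
    if checks["bgp"]["peers_established"] == 0:
        return "peers"      # no session is feeding the graph
    if checks["topology"]["nodes"] == 0:
        return "topology"   # peers are up but no BGP-LS data has arrived
    if checks["outputs"]["errors"] > 0:
        return "outputs"    # graph is populated but an output is failing
    return "logs"           # health signals look fine; read the engine logs


if __name__ == "__main__":
    sample = {
        "checks": {
            "bgp": {"peers_established": 4},
            "topology": {"nodes": 0},
            "outputs": {"errors": 0},
        }
    }
    print(first_suspect(sample))
```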
For Prometheus-scrapable metrics — peer state, session counters, topology counts, change rates — see Metrics.
See Support for how to escalate.