Documentation Prelude Collector 1.0.0

Polling intervals

Reasonable polling and streaming intervals per protocol and use case — with concrete starting numbers and the trade-offs between freshness, device load, and storage cost.

Recommendation

Start at 10 seconds for gNMI streaming telemetry, 60 seconds for SNMP and NETCONF state polling, 1 hour for inventory, and 15 minutes for slow-changing config. Tighten only after you have data showing you need to.

Why this matters

The interval on a Subscription drives three costs at once: the device's CPU, the Collector's worker pool, and the storage and query cost on every Output downstream. Most teams overshoot, spend a quarter on infrastructure they did not need, and then complain that "telemetry is expensive." It does not have to be.

Pick the interval that matches the question you are trying to answer. A graph that shows traffic shape needs different freshness from an alert that pages on link down, which needs different freshness from a quarterly inventory report.

Starting numbers by protocol

These are defaults, not laws. They are tuned to be safe on production devices and to keep a 2 vCPU Collector comfortable.

| Protocol | Mode | Default interval | Notes |
|----------|------|------------------|-------|
| gNMI | STREAM | 10 s | Bounded below by enforce-min-time (default 300 s in some deployments; adjust per device class). |
| gNMI | ONCE / Get | 30 s | For data the device cannot stream cleanly. |
| NETCONF | poll | 60 s | Sessions are heavy; do not poll faster without persistent sessions. |
| SNMP | poll | 60 s | 30 s on small tables; 120 s on chassis with 1000+ rows. |
| CLI | poll | 5 min | SSH login per cycle is the cost; everything below 60 s is suspect. |

Starting numbers by use case

Your use case is usually a stronger signal than your protocol. Pick the row that matches the question.

| Use case | Suggested interval | Why |
|----------|--------------------|-----|
| Real-time telemetry (rates, queues, errors per second) | 5-10 s | Fast enough to see microbursts averaged over a few samples. |
| Operational dashboards | 30 s | Eye-friendly refresh, sub-minute issue detection. |
| Alerting on state changes | 30-60 s | Long enough to debounce flap, short enough to page within minutes. |
| Capacity planning | 5 min | Aggregates well; storage stays cheap. |
| Inventory and discovery | 1 hour | Hardware does not move every minute. |
| Configuration drift | 15-60 min | Config changes are events, not continuous data. |
| Compliance / audit dumps | 24 hours | Good enough; keep the device load near zero. |

Putting them together

When the protocol number and the use-case number disagree, take the larger of the two as your starting point. The exception is gNMI STREAM, where the device is pushing on its own schedule and your Subscription's interval is more about sample period than poll frequency.

Examples:

  • Interface error counters via gNMI STREAM for a NOC dashboard — 10 s.
  • CPU and memory via SNMP for capacity planning — 5 min.
  • Routing table size via NETCONF for weekly trend — 1 hour.
  • show inventory via CLI on legacy chassis for asset DB — 24 h.
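
The "take the larger of the two" rule can be sketched as a small lookup. This is an illustration only: the dictionary keys and the `starting_interval` function are hypothetical names, not part of the Collector. Where the use-case table gives a range, the sketch uses the upper end, and it reads the gNMI STREAM exception as "keep the streaming default", which matches the first example above but is one interpretation of the rule.

```python
# Starting intervals in seconds, copied from the two tables above.
# Key names are illustrative, not Collector identifiers.
PROTOCOL_DEFAULTS = {
    ("gnmi", "stream"): 10,
    ("gnmi", "once"): 30,
    ("netconf", "poll"): 60,
    ("snmp", "poll"): 60,
    ("cli", "poll"): 300,
}

# Upper end of each suggested range (an assumption of this sketch).
USE_CASE_DEFAULTS = {
    "real_time": 10,
    "dashboard": 30,
    "alerting": 60,
    "capacity_planning": 300,
    "inventory": 3600,
    "config_drift": 3600,
    "compliance": 86400,
}


def starting_interval(protocol: str, mode: str, use_case: str) -> int:
    """Return a starting interval in seconds for one Subscription."""
    protocol_s = PROTOCOL_DEFAULTS[(protocol, mode)]
    use_case_s = USE_CASE_DEFAULTS[use_case]
    # Exception: a streaming device pushes on its own schedule, so the
    # interval acts as a sample period and the streaming default is kept.
    if (protocol, mode) == ("gnmi", "stream"):
        return protocol_s
    # Otherwise the slower (larger) of the two numbers wins.
    return max(protocol_s, use_case_s)


if __name__ == "__main__":
    print(starting_interval("gnmi", "stream", "dashboard"))        # 10
    print(starting_interval("snmp", "poll", "capacity_planning"))  # 300
    print(starting_interval("netconf", "poll", "inventory"))       # 3600
```

Taking the maximum means a heavy protocol is never polled faster than its safe default, even when the use case would tolerate it.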

Trade-offs

What you give up by following the defaults above:

  • Sub-second visibility. You will not catch a 500 ms microburst with a 10 s sample. If you genuinely need that, you need streaming telemetry tuned to the device's native push cadence, not a tighter poll on top of a slow protocol.
  • Snapshots of every counter on every device. A 2 vCPU Collector comfortably handles a few hundred Subscriptions at the rates above and starts to wobble well before the max-subscriptions = 1000 cap. Do not size for the cap; size for headroom.
  • Cheap storage at high cardinality. A 5 s interval is 720 collection cycles an hour. On 1000 interfaces across 100 devices, that is 720,000 interface samples an hour, more than 17 million a day, before you have collected anything useful. Whatever Output you pick, it will charge you for that.
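
Sample volume grows linearly with the number of series and inversely with the interval. A rough sketch of that arithmetic follows, assuming one sample per series per cycle. How the Collector batches samples into Snapshots is not modeled, and `samples_per_hour` is an illustrative helper rather than a Collector function.

```python
def samples_per_hour(interval_s: int, series: int) -> int:
    """Samples produced in one hour if every series is sampled once per cycle."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    cycles_per_hour = 3600 // interval_s
    return cycles_per_hour * series


if __name__ == "__main__":
    series = 1000  # for example, 1000 interface counters
    for interval_s in (5, 10, 30, 60, 300):
        per_hour = samples_per_hour(interval_s, series)
        print(f"{interval_s:>4} s -> {per_hour:>9,} samples/hour")
```

Moving the same 1000 series from 60 s to 5 s multiplies the sample count by 12, and the Output pays for every one of those samples.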

Don't poll faster to "see more"

A tighter interval rarely improves what you can act on. It almost always increases device load, queue depth on the Output, and the storage bill. If a 30 s metric "isn't fast enough," the question is usually about resolution at query time, not at collection time.

When to deviate

  • You are debugging a live incident. Crank a single Subscription to 1-5 s for the duration of the incident. Put it back when you are done. Treat fast intervals like an emergency lever, not a default.
  • Your device pushes its own cadence. Some platforms stream every 1 s natively over gNMI. Take what they give you; do not try to retime them.
  • You are running synthetic probes, not collection. Probe intervals are a different problem and are usually driven by SLO budgets, not by Collector capacity.
  • The data only changes every hour anyway. A 10 s Subscription on a routing protocol summary is wasted work. Match the interval to how often the source can actually change.
  • The Output backend is the bottleneck. If you are remote-writing to a Prometheus that struggles, slow the Subscription before you scale the Output.
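
The "emergency lever" pattern for incidents can be sketched as a context manager that tightens one Subscription and always restores it. The `Subscription` dataclass and `incident_interval` below are hypothetical stand-ins for illustration; the Collector's real API for changing an interval is not shown in this document and may look quite different.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class Subscription:
    """Hypothetical stand-in for a Collector Subscription, not the real API."""
    name: str
    interval_s: int


@contextmanager
def incident_interval(sub: Subscription, fast_s: int = 5):
    """Temporarily tighten one Subscription; restore it even if an error occurs."""
    if not 1 <= fast_s <= 5:
        raise ValueError("incident intervals are expected to be 1-5 s")
    original = sub.interval_s
    sub.interval_s = fast_s
    try:
        yield sub
    finally:
        # Put the interval back when the incident work is done.
        sub.interval_s = original


if __name__ == "__main__":
    sub = Subscription("uplink-error-counters", interval_s=30)
    with incident_interval(sub, fast_s=2):
        print(sub.interval_s)  # 2 while debugging
    print(sub.interval_s)      # 30 again afterwards
```

Scoping the fast interval to a block makes "put it back when you are done" automatic rather than something to remember after the incident.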