Polling intervals
Reasonable polling and streaming intervals per protocol and use case — with concrete starting numbers and the trade-offs between freshness, device load, and storage cost.
Recommendation
Start at 10 seconds for gNMI streaming telemetry, 60 seconds for SNMP and NETCONF state polling, 1 hour for inventory, and 15 minutes for slow-changing config. Tighten only after you have data showing you need to.
Why this matters
The interval on a Subscription drives three costs at once: the device's CPU, the Collector's worker pool, and the storage and query cost on every Output downstream. Most teams overshoot, spend a quarter on infrastructure they did not need, and then complain that "telemetry is expensive." It does not have to be.
Pick the interval that matches the question you are trying to answer. A graph that shows traffic shape needs different freshness from an alert that pages on link down, which needs different freshness from a quarterly inventory report.
Starting numbers by protocol
These are defaults, not laws. They are tuned to be safe on production devices and to keep a 2 vCPU Collector comfortable.
| Protocol | Mode | Default interval | Notes |
|---|---|---|---|
| gNMI | STREAM | 10 s | Bounded below by enforce-min-time (default 300 s in some deployments — adjust per device class). |
| gNMI | ONCE / Get | 30 s | For data the device cannot stream cleanly. |
| NETCONF | poll | 60 s | Sessions are heavy; do not poll faster without persistent sessions. |
| SNMP | poll | 60 s | 30 s on small tables; 120 s on chassis with 1000+ rows. |
| CLI | poll | 5 min | SSH login per cycle is the cost; everything below 60 s is suspect. |
Starting numbers by use case
Your use case is usually a stronger signal than your protocol. Pick the row that matches the question.
| Use case | Suggested interval | Why |
|---|---|---|
| Real-time telemetry (rates, queues, errors per second) | 5-10 s | Fast enough to see microbursts averaged over a few samples. |
| Operational dashboards | 30 s | Eye-friendly refresh, sub-minute issue detection. |
| Alerting on state changes | 30-60 s | Long enough to debounce flap, short enough to page within minutes. |
| Capacity planning | 5 min | Aggregates well; storage stays cheap. |
| Inventory and discovery | 1 hour | Hardware does not move every minute. |
| Configuration drift | 15-60 min | Config changes are events, not continuous data. |
| Compliance / audit dumps | 24 hours | Good enough; keep the device load near zero. |
Putting them together
When the protocol number and the use-case number disagree, take the larger of the two as your starting point. The exception is gNMI STREAM, where the device is pushing on its own schedule and your Subscription's interval is more about sample period than poll frequency.
Examples:
- Interface error counters via gNMI STREAM for a NOC dashboard: 10 s.
- CPU and memory via SNMP for capacity planning: 5 min.
- Routing table size via NETCONF for weekly trend: 1 hour.
- `show inventory` via CLI on legacy chassis for asset DB: 24 h.
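The "take the larger of the two" rule can be sketched in a few lines of Python. This is an illustration only: the dictionary keys and the `starting_interval` function are hypothetical names, not part of any product API. Where a table gives a range, the sketch uses the lower bound, and it reads the gNMI STREAM exception as "keep the protocol default", which matches the first example above but is an assumption.

```python
# Starting intervals in seconds, copied from the two tables above.
# Key names are hypothetical and exist only for this sketch.
PROTOCOL_DEFAULTS = {
    ("gnmi", "stream"): 10,
    ("gnmi", "once"): 30,
    ("netconf", "poll"): 60,
    ("snmp", "poll"): 60,
    ("cli", "poll"): 300,
}

# Where the table gives a range, the lower bound is used (an assumption).
USE_CASE_INTERVALS = {
    "real-time": 5,
    "dashboard": 30,
    "alerting": 30,
    "capacity-planning": 300,
    "inventory": 3600,
    "config-drift": 900,
    "compliance": 86400,
}


def starting_interval(protocol: str, mode: str, use_case: str) -> int:
    """Return a starting interval in seconds for a protocol and use case."""
    protocol_s = PROTOCOL_DEFAULTS[(protocol, mode)]
    # Assumed reading of the gNMI STREAM exception: the device pushes on its
    # own schedule, so keep the protocol default as the sample period.
    if (protocol, mode) == ("gnmi", "stream"):
        return protocol_s
    # Otherwise take the larger (slower) of the two numbers.
    return max(protocol_s, USE_CASE_INTERVALS[use_case])


print(starting_interval("snmp", "poll", "capacity-planning"))  # 300
print(starting_interval("gnmi", "stream", "dashboard"))        # 10
```

Treat the result as a floor to start from. The NETCONF and CLI examples above land on longer intervals than the tables alone would give, because the question being answered (a weekly trend, an asset database) tolerates it.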
Trade-offs
What you give up by following the defaults above:
- Sub-second visibility. You will not catch a 500 ms microburst with a 10 s sample. If you genuinely need that, you need streaming telemetry tuned to the device's native push cadence, not a tighter poll on top of a slow protocol.
- Snapshots of every counter on every device. A 2 vCPU Collector comfortably handles a few hundred Subscriptions at the rates above and starts to wobble well before the `max-subscriptions = 1000` cap. Do not size for the cap; size for headroom.
- Cheap storage at high cardinality. A 5 s interval is 720 samples per interface per hour, so 1000 interfaces across 100 devices produce 720,000 interface samples an hour (about 17 million a day) before you have collected anything useful. Whatever Output you pick, it will charge you for that.
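Two pieces of arithmetic behind the list above are easy to check by hand or with a short script. The sketch below assumes one sample per interface per collection cycle and an otherwise idle link around the burst. Both function names are made up for illustration.

```python
def samples_per_hour(interval_s: float, series: int) -> int:
    """Samples collected per hour, assuming one sample per series per cycle."""
    return int(3600 * series / interval_s)


def averaged_utilization(burst_s: float, interval_s: float,
                         burst_utilization: float = 1.0) -> float:
    """Utilization one sample reports when a single burst falls inside an
    otherwise idle interval (a deliberately simple model)."""
    return burst_utilization * min(burst_s, interval_s) / interval_s


# 1000 interfaces in total, sampled every 5 s versus every 60 s.
print(samples_per_hour(5, 1000))    # 720000
print(samples_per_hour(60, 1000))   # 60000

# A 500 ms line-rate burst inside a 10 s sample window.
print(averaged_utilization(0.5, 10))  # 0.05
```

Under this model a 500 ms line-rate burst shows up as roughly 5% utilization in a 10 s sample, which is why it is effectively invisible, and moving from 60 s to 5 s multiplies the sample count by twelve.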
Don't poll faster to "see more"
A tighter interval rarely improves what you can act on. It almost always increases device load, queue depth on the Output, and the storage bill. If a 30 s metric "isn't fast enough," the question is usually about resolution at query time, not at collection time.
When to deviate
- You are debugging a live incident. Crank a single Subscription to 1-5 s for the duration of the incident. Put it back when you are done. Treat fast intervals like an emergency lever, not a default.
- Your device pushes its own cadence. Some platforms stream every 1 s natively over gNMI. Take what they give you; do not try to retime them.
- You are running synthetic probes, not collection. Probe intervals are a different problem and are usually driven by SLO budgets, not by Collector capacity.
- The data only changes every hour anyway. A 10 s Subscription on a routing protocol summary is wasted work. Match the interval to how often the source can actually change.
- The Output backend is the bottleneck. If you are remote-writing to a Prometheus that struggles, slow the Subscription before you scale the Output.
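For the incident case in the list above, the "put it back when you are done" part is the step that gets forgotten. One way to make the restore automatic is a context manager, sketched below. The `Subscription` class here is a hypothetical stand-in with just a name and an interval, not the product's real object, and a real deployment would change the interval through whatever API or config mechanism it actually exposes.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class Subscription:
    """Hypothetical stand-in; only the interval matters for this sketch."""
    name: str
    interval_s: float


@contextmanager
def temporary_interval(sub: Subscription, interval_s: float):
    """Tighten one Subscription's interval and always restore the old value."""
    previous = sub.interval_s
    sub.interval_s = interval_s
    try:
        yield sub
    finally:
        # Runs even if the debugging code raises, so the fast interval
        # cannot outlive the incident by accident.
        sub.interval_s = previous


sub = Subscription("edge-1-interfaces", interval_s=30)
with temporary_interval(sub, 2):
    print(sub.interval_s)  # 2 for the duration of the incident
print(sub.interval_s)      # 30 once the block exits
```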