Skip to content
Documentation Prelude Collector 1.0.0

Kafka output

Publish collected telemetry to a Kafka topic for high-throughput pipelines. SASL, TLS, partition key behavior, and example configuration.

The Kafka output publishes collected telemetry as JSON messages to a Kafka topic. It is Prelude Collector's pick for high-throughput pipelines where data needs to be consumed by multiple downstream systems or buffered for durable processing. After the first mention this page refers to Prelude Collector as "the collector".

When to use

Pick Kafka when you have multiple downstream consumers (stream processors, data warehouses, SIEM), need durable buffering, or are already running Kafka as the spine of your data platform. For single-consumer streaming inside a Prelude deployment, NATS is simpler. See Output selection.

How it works

Each record is JSON-serialized and published as a Kafka message to the configured topic. The message key is set to:

{model}-{deviceId}

For example: cpu-42. Kafka routes messages with the same key to the same partition, so all records for one model + device land in order on one consumer without needing a custom partitioner.

The writer uses a LeastBytes partition balancer for messages without a key match and groups records into batches with a 500 ms batch timeout.

Message format

Each Kafka message value is a JSON-serialised ParsedData envelope. All field names are kebab-case:

{
  "device": "router-01",
  "device-id": 42,
  "model-name": "cpu",
  "path": "Cisco-IOS-XR-wdsysmon-fd-oper:system-monitoring/cpu-utilization",
  "key": "cpu",
  "timestamp": "2024-01-01T12:00:00Z",
  "data-hash": "sha256:8f3c…",
  "data": {
    "usage": 45.2,
    "cores": 8
  }
}

Consumers should key on device-id (numeric uint, the device's DB id) and model-name. path echoes the source-protocol path that produced the record; data-hash is a content hash useful for deduplication.

Connection settings

Configure this backend through PUT /api/v1/outputs/kafka.

Field JSON key Type Required Default Description
Brokers brokers string Yes Comma-separated broker list, e.g. kafka1.example.com:9092,kafka2.example.com:9092
Topic topic string Yes Kafka topic to publish to
SASL mechanism sasl-mechanism string No empty One of PLAIN, SCRAM-SHA-256, SCRAM-SHA-512
SASL username sasl-username string No Username for SASL authentication
SASL password sasl-password string No Password for SASL authentication. Sensitive — masked as "********".
TLS enabled tls-enabled bool No false Enable TLS for broker connections

sasl-password is stored encrypted. Sending "" or "********" on update preserves the existing value.

Authentication

Three SASL options are supported, and TLS can be enabled independently:

Mechanism sasl-mechanism value
No authentication empty string
PLAIN PLAIN
SCRAM-SHA-256 SCRAM-SHA-256
SCRAM-SHA-512 SCRAM-SHA-512

For production over an untrusted network, combine SCRAM-SHA-512 with tls-enabled: true.

Configuration example

{
  "enabled": true,
  "config": {
    "brokers": "kafka1.example.com:9092,kafka2.example.com:9092",
    "topic": "prelude-collector",
    "sasl-mechanism": "SCRAM-SHA-512",
    "sasl-username": "prelude",
    "sasl-password": "<your-api-token>",
    "tls-enabled": true
  }
}

Apply it with:

export BASE="https://collector.example.com"
export TOKEN="<your-api-token>"

curl -s -X PUT "$BASE/api/v1/outputs/kafka" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d @kafka-output.json

Bruno: 08 Outputs / Update output config

Behavior on failure

Records are batched by the underlying Kafka writer and submitted with a 10-second write timeout. If the broker does not acknowledge a batch within that window, the entire batch is counted as failed and surfaced on /api/v1/outputs/metrics as an increment to the failures counter.

Flush() is a no-op for this Output — the Kafka writer manages its own batching internally. There is no on-disk buffer or automatic retry on the collector side; rely on Kafka's built-in retention to re-deliver to consumers, and on monitoring to catch sustained broker outages.

To validate connectivity proactively, use POST /api/v1/outputs/kafka/detect, which dials the brokers, probes ApiVersions, and (when a topic is configured) reads partitions to confirm the topic is visible.

Limitations

  • The topic must exist (or auto-creation must be permitted by the cluster). The collector does not create topics.
  • Partition key is fixed as {model}-{deviceId} and cannot be customized per Output.
  • Headers, idempotent producers, and transactional writes are not exposed here.
  • There is no on-disk buffer on the collector side — sustained broker unavailability beyond the in-flight batch results in dropped records counted as failures.
  • Compression settings are not exposed in the configuration.

See also

Filtering by: