Version: edge

gelf-chunking

Splits the data using the GELF chunking protocol.
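As a rough illustration of what GELF chunking does on the wire, the sketch below splits a payload into chunks that each carry the 12-byte GELF chunk header: two magic bytes (`0x1e 0x0f`), an 8-byte message id, a 1-byte sequence number, and a 1-byte sequence count. The function name `chunk_message`, the default `chunk_size` of 8192, the big-endian encoding of the message id, and the choice to count the header towards the chunk size are assumptions made for this example, not a description of the actual implementation.

```python
import struct

GELF_MAGIC = b"\x1e\x0f"  # magic bytes that mark a chunked GELF datagram
HEADER_LEN = 12           # 2 magic + 8 message id + 1 sequence number + 1 sequence count
MAX_CHUNKS = 128          # the GELF spec allows at most 128 chunks per message


def chunk_message(payload: bytes, message_id: int, chunk_size: int = 8192) -> list:
    """Split payload into GELF chunks of at most chunk_size bytes each.

    Assumption: chunk_size includes the 12-byte header.
    Simplification: even a payload that fits into one datagram gets a header.
    """
    body_size = chunk_size - HEADER_LEN
    if body_size <= 0:
        raise ValueError("chunk_size must be larger than the 12-byte header")

    bodies = [payload[i:i + body_size] for i in range(0, len(payload), body_size)]
    if len(bodies) > MAX_CHUNKS:
        raise ValueError("payload needs more than 128 chunks")

    chunks = []
    for seq, body in enumerate(bodies):
        # ">Q" packs the 8-byte message id (byte order is an assumption),
        # "BB" packs the sequence number and the total sequence count.
        header = GELF_MAGIC + struct.pack(">QBB", message_id, seq, len(bodies))
        chunks.append(header + body)
    return chunks
```

The receiver groups chunks by message id and reassembles them in sequence-number order, which is why the id must not collide between messages that are in flight at the same time.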

What's the logic for creating the message id?

TL;DR: We are using a combination of ingest_ns + increment_id + thread_id as the message id.

Long explanation:

The GELF documentation suggests to "Generate from millisecond timestamp + hostname, for example." (see "GELF via UDP").

However, relying on the current time in milliseconds on the same system results in a high collision probability if lots of messages are generated quickly. Things get even worse if multiple servers send to the same log server. Adding the hostname is not guaranteed to help, and if the hostname is the FQDN it is unlikely to be unique at all.

The GELF module used by Logstash (logstash-output-gelf) uses the first eight bytes of an MD5 hash of the current time as a floating point number, a hyphen, and an eight-byte random number.

It probably doesn't have to be that clever. Using the timestamp plus a random number means we only have to worry about collisions between random numbers. We can make it more deterministic by using the ingest_ns plus an incremental id as the message id.

To keep it simple, we're using this logic:

(epoch_timestamp & 0xFF_FF) | (auto_increment_id << 16)

[Reference conversation](https://github.com/tremor-rs/tremor-runtime/pull/2662)
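The formula above can be sketched in a few lines. This is a minimal illustration only: the function name `make_message_id` and the `itertools.count()` counter are hypothetical, the truncation to 64 bits is an assumption based on the 8-byte width of a GELF message id, and the thread_id mentioned in the TL;DR is left out because the formula does not include it.

```python
import itertools
import time

MASK_64 = 0xFFFF_FFFF_FFFF_FFFF  # assumption: the id must fit the 8-byte GELF field


def make_message_id(epoch_timestamp: int, auto_increment_id: int) -> int:
    """Compute (epoch_timestamp & 0xFF_FF) | (auto_increment_id << 16).

    The low 16 bits come from the timestamp, the remaining bits from the
    counter. The result is truncated to 64 bits.
    """
    return ((epoch_timestamp & 0xFF_FF) | (auto_increment_id << 16)) & MASK_64


# Hypothetical usage: two messages with the same timestamp still get
# distinct ids because the counter occupies the upper bits.
counter = itertools.count()
ts = time.time_ns()
first = make_message_id(ts, next(counter))
second = make_message_id(ts, next(counter))
assert first != second
```

Because the counter fills the upper 48 bits, ids produced by one counter only repeat after 2^48 increments, and then only if the low 16 timestamp bits happen to match as well.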