Syslog data loss observed.

Hello All,

We have two systems, each with Kiwi Syslog Server configured on it.

Both servers receive log data from the same source on the default UDP port.

But at the end of the day, the syslog files on the two servers do not match. Their contents differ significantly.

RAM, CPU and HDD usage is normal throughout the day. We do not have any other UDP traffic throughout the network.

Please suggest what could be the problem here.

Thanks in advance.

  • Have you done a packet capture on both servers to compare the incoming packets?
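A rough first comparison, short of a full packet capture, is to diff the two servers' log files and see which messages each side is missing. The sketch below is only an illustration: `compare_logs` and its `key` parameter are hypothetical names, and it assumes the same message is written identically on both servers. If the servers stamp their own receive time on each line, pass a `key` that strips that prefix first.

```python
from collections import Counter


def compare_logs(lines_a, lines_b, key=lambda line: line.strip()):
    """Return (only_a, only_b): messages seen more often on one server than the other.

    `key` normalises a line before comparison (assumption: by default the
    whole line, minus surrounding whitespace, identifies a message).
    """
    counts_a = Counter(key(line) for line in lines_a)
    counts_b = Counter(key(line) for line in lines_b)
    # Counter subtraction keeps only positive differences, so repeated
    # messages are compared by how many times each side logged them.
    return counts_a - counts_b, counts_b - counts_a
```

To use it, read each server's file for the same day with `open(path).readlines()` and pass the two lists in. A large `only_a` with an empty `only_b` points at loss on server B, and vice versa.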

  • Unfortunately, the ultimate answer is that syslog, especially over UDP, is an incredibly bad idea: UDP gives no delivery guarantee, so dropped messages are lost silently. Even more unfortunately, we don't have a good industry-standard alternative. Just as unfortunately, the whole thing is difficult to debug.

    The short answer to your question is:

    1) You need to find a way to send a large test set of syslog messages that you know about, control, and can reproduce. I did this with a new Linux machine, and configured the syslog server to write that machine's logs to a dedicated file. Then I could send 1000 messages as fast as possible and count the lines in the file to see how many were missing.

    2) Then you can start experimenting to find the bottleneck. Network connections anywhere between the device and the syslog server, or disk throughput, would be my first suspects. I hope you have people to help: networking, server, SAN, etc.

    Start with the test sender on the same network as the receiving syslog server. If you can produce a high loss rate there, then it's likely not the network but a bottleneck on the server. If you receive most of the messages in that test, then it's likely not the syslog server but something on the network closer to the original sender; try moving the test sender to the other network and running the test again.
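
The test in step 1 could be sketched roughly as follows. This is not the script used above, only a minimal Python illustration: the names `build_message`, `send_burst`, and `missing_sequences`, the `losstest` tag, and the `seq=` field are all made up for the example. It assumes the receiver accepts a minimal BSD-style line (priority and tag only) on UDP port 514 and writes the message text to a file unchanged.

```python
import re
import socket
import time

SEQ_RE = re.compile(r"seq=(\d+)")


def build_message(seq, tag="losstest", pri=14):
    """Build one syslog line; pri 14 = facility user (1) * 8 + severity info (6)."""
    return f"<{pri}>{tag}: seq={seq:06d}"


def send_burst(host, count=1000, port=514, delay=0.0):
    """Send `count` numbered messages over UDP, optionally pausing between them."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for seq in range(count):
            sock.sendto(build_message(seq).encode("ascii"), (host, port))
            if delay:
                time.sleep(delay)
    finally:
        sock.close()


def missing_sequences(lines, count):
    """Return the sequence numbers in range(count) that never appear in `lines`."""
    seen = set()
    for line in lines:
        match = SEQ_RE.search(line)
        if match:
            seen.add(int(match.group(1)))
    return sorted(set(range(count)) - seen)
```

Run `send_burst("<server address>")` on the test sender, then on the server pass the lines of the dedicated log file to `missing_sequences(lines, 1000)`. Numbering the messages shows not just how many were lost but which ones, so you can tell whether loss comes in bursts. Re-running with a small `delay` shows whether loss depends on the send rate.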