
Detect Linux "OOM Killer" action

I'm now working in an environment with a lot of Linux servers.  Full disclosure: my background is 90% Windows admin.

I have a request from the Linux admins to detect when the "OOM Killer" starts running. I gather OOM means "Out Of Memory" and that the OOM killer is a kernel mechanism that kicks in, and starts killing processes, when the system runs out of memory.

The distribution is Red Hat Enterprise Linux of fairly recent vintage (I don't have the exact version number).

Thanks in advance for any inputs from the community.

  • Not a Linuxy person here. Your Linux admins really should have given you more to work with here!

    If you've just got SNMP you're probably screwed. I'd start by simply alerting when spare memory runs low.

    The SolarWinds agent would be able to report on the processes and/or scrape syslog for the OOM killer entries (the exact log wording is an easy Google; there's a rough sketch of a log scraper at the end of this reply).
    You could also set syslog to forward to SolarWinds so the entries show up as events. Ask the Linux folk exactly what process/daemon name to watch for.

    I'd probably set up a memory alert, a process (PID) alert, and an event alert: the memory one will fire early as a warning, the PID one is unlikely to catch it because the OOM killer runs and finishes between polling cycles, and the event one will catch it most of the time, but only after the issue has already occurred.
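    Something like this could do the log-scraping part as a script monitor. It's a rough sketch only: the log path and the exact kernel wording are assumptions (they vary a bit between RHEL/kernel versions), so check a server that actually had an OOM kill first.

        #!/usr/bin/env python3
        """Rough sketch of an OOM-killer log scraper, not a finished monitor."""
        import re
        import sys

        LOG_FILE = "/var/log/messages"  # typical RHEL location; adjust as needed

        # Lines the kernel commonly writes when the OOM killer fires.
        OOM_PATTERN = re.compile(
            r"invoked oom-killer|Out of memory: Kill(ed)? process",
            re.IGNORECASE,
        )

        def find_oom_events(path):
            """Yield log lines that look like OOM-killer activity."""
            with open(path, errors="replace") as log:
                for line in log:
                    if OOM_PATTERN.search(line):
                        yield line.rstrip()

        if __name__ == "__main__":
            hits = list(find_oom_events(LOG_FILE))
            for hit in hits:
                print(hit)
            # Exit non-zero when something was found, so a script monitor
            # or cron job can treat that as the alert condition.
            sys.exit(1 if hits else 0)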

  • Thank you! The answer provided confirms my own reading on the subject.

    We have advised our Linux team that OOM killer activity gets logged to syslog, and asked whether they want to configure syslog forwarding into SolarWinds for the couple of servers where they see OOM misbehavior (a sample rsyslog forwarding rule is at the end of this post).

    Marking answer from Adam.Beedell as Verified.
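    For reference, the kind of forwarding rule being discussed is a one-liner in rsyslog. This is only an illustration: the file name, destination address, and port below are placeholders for whatever the SolarWinds syslog listener actually uses.

        # /etc/rsyslog.d/99-solarwinds.conf  (example file name)
        # Forward kernel-facility messages (where OOM killer lines land)
        # to the SolarWinds syslog listener; 192.0.2.10:514 is a placeholder.
        kern.*    @192.0.2.10:514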