
SW is reporting that servers' memory/CPU always hits 100% utilization

Is this normal? It's not a consistent 100% utilization. It's more that I get hammered with alerts at least once an hour for random servers (some run IIS, SQL, etc.). We have over 80 servers in our virtual environment.

  • Provide a little more information. How are these VMs configured? When you get these alerts, what is the VM showing? Are you seeing any spikes? How do you have the alert configured? This could also be down to the way you're monitoring the servers. Are they monitored via SNMP, WMI, or an agent?

  • I use WMI for all of my Windows servers. The attached screenshot shows my trigger conditions. When I get the alerts, sometimes the VM shows that CPU/memory is high, and sometimes it doesn't. I do have agents configured on them as well.

    TRIGGER CONDITION.JPG

  • I would change *I want to alert on* to "Node" instead of "Volume", since this is more of a node-related alert than a volume-related one. The condition itself, at least to me, appears to be sound. And you have it set to be active for at least 10 minutes before firing an alert, which normally should be enough to avoid spikes.

    In our environment we normally separate the alerts, because I've seen odd behavior like this as well when trying to consolidate many things into one alert. Not that it wouldn't work; it might work just fine for you, but in our case we had a lot of odd behavior and ended up separating them.

    Give this a try. See if this provides any better results.
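
To illustrate why a "must be active for at least 10 minutes" condition should filter out short spikes, here is a minimal Python sketch of that kind of check. It is not how the monitoring product implements its alert engine; the function name `sustained_breach`, the 90% threshold, and the one-minute sample spacing are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def sustained_breach(samples, threshold=90.0, hold=timedelta(minutes=10)):
    """Return True if the metric stayed above `threshold` for at least `hold`.

    `samples` is a list of (timestamp, percent) tuples sorted by time.
    The threshold and hold values are illustrative, not taken from the thread.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of a new breach
            if ts - breach_start >= hold:
                return True  # breach has lasted long enough to alert
        else:
            breach_start = None  # any dip below the threshold resets the timer
    return False


# Made-up data: one sample per minute for 15 minutes.
start = datetime(2024, 1, 1, 12, 0)
spike = [(start + timedelta(minutes=i), 100.0 if i == 3 else 40.0) for i in range(15)]
steady = [(start + timedelta(minutes=i), 97.0) for i in range(15)]

print(sustained_breach(spike))   # False: a single one-minute spike
print(sustained_breach(steady))  # True: above threshold for 10+ minutes
```

If alerts still fire hourly with a check like this in place, the load is probably staying high for the full 10 minutes, or the alert is evaluating a different object than intended.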

  • Thanks. I'll give that a try. I wasn't sure if it was normal behavior for a server to shoot up to 100% almost every 30 minutes to an hour. Is this a problem where I need to add resources, or is it just something servers do?

  • I agree with lcsw2013. It should be configured to alert on "Node".

    I don't believe a server's CPU just shoots up to 100% every 30 minutes. You should check on the server itself to see what it reports. If it also shows CPU going up to 100%, try to find out what's causing the spike. If the load turns out to be 'normal', then you'll have to throw extra resources at it.

  • I agree! This looks like it could be a resource issue. Check which service or process is consuming the CPU when it spikes; that should tell you which way to go.
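
As a rough sketch of how to narrow down which process is responsible for a spike, the Python below ranks processes by CPU use between two snapshots of cumulative CPU seconds. The function `top_cpu_consumers`, the process names, and the numbers are all hypothetical; on a real server the snapshots would come from whatever tool you already use to list per-process CPU time.

```python
def top_cpu_consumers(before, after, interval_seconds, cores=1, limit=3):
    """Rank processes by average CPU use between two snapshots.

    `before` and `after` map a process name to its cumulative CPU seconds.
    Keying by name is a simplification: several processes with the same
    name would need a per-process key such as the PID.
    """
    usage = {}
    for name, cpu_after in after.items():
        delta = cpu_after - before.get(name, 0.0)
        if delta > 0:
            # Share of the total CPU time available during the interval.
            usage[name] = 100.0 * delta / (interval_seconds * cores)
    return sorted(usage.items(), key=lambda item: item[1], reverse=True)[:limit]


# Made-up snapshots taken 60 seconds apart on a 2-core VM.
before = {"sqlservr.exe": 1200.0, "w3wp.exe": 300.0, "svchost.exe": 50.0}
after = {"sqlservr.exe": 1310.0, "w3wp.exe": 306.0, "svchost.exe": 50.0}

for name, percent in top_cpu_consumers(before, after, interval_seconds=60, cores=2):
    print(f"{name}: {percent:.1f}%")
```

Taking the snapshots while an alert is active, rather than afterwards, is what makes the ranking useful, since the spike may be gone by the time you log in.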