Configuring a single Low Disk Space Alert for many individual servers

Hi Guys

A bit of a crazy situation to resolve so apologies in advance.

I'm trying to eliminate the many false alerts currently being dropped into the team's mailboxes reporting low disk space.

The problem we have is that many of the servers in the estate are different: different numbers of disks/volumes, different disk/volume sizes, and different names.

I need to create two Low Disk Space alerts: one that reports on any disk/volume with less than 500 MB free, and another that reports on any disk/volume with 1 GB or less free, regardless of the total size of the disk.

This rules out specifying a disk volume percentage, which is no good for my different-sized disks.

All help and advice appreciated.
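
The requirement above comes down to comparing free space in absolute bytes rather than as a percentage. A minimal Python sketch of that classification, purely to illustrate the logic: it is not Orion alert configuration, the function name and tier labels are hypothetical, and binary units (1 GB = 1024 MB) are assumed.

```python
MB = 1024 ** 2
GB = 1024 ** 3

def classify_free_space(free_bytes):
    """Return an alert tier based only on absolute free space."""
    if free_bytes < 500 * MB:
        return "critical"   # less than 500 MB free
    if free_bytes <= 1 * GB:
        return "warning"    # 1 GB or less free
    return "ok"

# Total disk size never enters the decision, so a 2 TB disk and a
# 20 GB disk with the same free space land in the same tier.
print(classify_free_space(400 * MB))
print(classify_free_space(800 * MB))
print(classify_free_space(50 * GB))
```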

  • If you want a single notification for a group of servers, it sounds like you would be better off using a report instead of an alert. How quickly your nodes use up disk space will determine how often you schedule the report to run. If you use an alert, it will trigger every time a single server's free space changes.

  • This is how I do it, to combine percentage and available disk space. You could drop the percentage entirely, since it might be redundant depending on disk sizes.

    pastedImage_0.png

    I also do this for the automatic reset condition, to prevent the reset happening too late on larger disks.

    pastedImage_2.png

    Hope this helps you.
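
Since the screenshots did not survive, here is a rough Python sketch of what combining percentage and available space could look like for both the trigger and the reset. All four threshold values are assumptions chosen for illustration, and the "reset when either measure recovers" rule is one reading of the reply, not a confirmed copy of the poster's settings.

```python
GB = 1024 ** 3

# Assumed example thresholds; the real values were only in the screenshots.
TRIGGER_PERCENT_USED = 90
TRIGGER_FREE_BYTES = 5 * GB
RESET_PERCENT_USED = 85
RESET_FREE_BYTES = 10 * GB

def percent_used(size_bytes, free_bytes):
    return 100 * (size_bytes - free_bytes) / size_bytes

def should_trigger(size_bytes, free_bytes):
    """Fire only when the volume is nearly full AND short on absolute space."""
    if size_bytes <= 0:
        return False  # no size data, nothing to evaluate
    return (percent_used(size_bytes, free_bytes) >= TRIGGER_PERCENT_USED
            and free_bytes <= TRIGGER_FREE_BYTES)

def should_reset(size_bytes, free_bytes):
    """Clear once EITHER measure has recovered, so a large disk does not
    have to drop below the percentage threshold before the alert resets."""
    if size_bytes <= 0:
        return False
    return (percent_used(size_bytes, free_bytes) < RESET_PERCENT_USED
            or free_bytes > RESET_FREE_BYTES)
```

With these numbers, a 10 TB volume with 12 GB free is still over 99% used, yet the alert resets because the absolute free space has recovered.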

  • I am also in need of this. I have multiple SQL servers with very large volumes, and I need an alert when these volumes fall below 5 GB of free space.

  • Do this! This is what we use as well across a few hundred servers.

    We also have some that report a bit earlier, based on the size of the volume.
    It works well.

  • I have done this recently using bits of what people have said above: using custom properties and creating a dashboard with SWQL, rather than reports, as a starting point.

    We then moved on to alerts once we were happy we were not going to get thousands of them. We only alerted on critical and used a wallboard to keep an eye on warning.

    I liked the live icon changes, which have more of an impact (and I used links wherever I could; the node name is a link too).

    pastedImage_0.png

    Something I don't see often, but I'm going to do it here: this is the SWQL script I created with some help from our DBA team. The custom property is the only thing I used to filter, i.e. is it live or is it test. This means you can make multiple blocks.

    pastedImage_1.png

  • Can you bless us with a text version of the code?

  • Sorry, I think I went into block overdrive. See below:

    select n.displayname as nodename, substring(v.caption,1,3) as [caption],
      case
        when (v.volumesize/1024/1024/1024/1024) > 1 then (tostring(round(v.volumesize/1024/1024/1024/1024,2)) + ' TB')
        when (v.volumesize/1024/1024/1024) > 1 then (tostring(round(v.volumesize/1024/1024/1024,2)) + ' GB')
        when (v.volumesize/1024/1024) > 1 then (tostring(round(v.volumesize/1024/1024,2)) + ' MB')
      end as [Volume Size],
      case
        when (v.VolumeSpaceAvailable/1024/1024/1024/1024) > 1 then (tostring(round(v.VolumeSpaceAvailable/1024/1024/1024/1024,2)) + ' TB')
        when (v.VolumeSpaceAvailable/1024/1024/1024) > 1 then (tostring(round(v.VolumeSpaceAvailable/1024/1024/1024,2)) + ' GB')
        when (v.VolumeSpaceAvailable/1024/1024) > 1 then (tostring(round(v.VolumeSpaceAvailable/1024/1024,2)) + ' MB')
      end as [Space On Volume],
      round((v.volumespaceused*100)/v.volumesize,2) as [volume percent used],
      v.detailsurl as [_linkfor_caption],
      n.detailsurl as [_linkfor_nodename],
      case
        when v.forecastcapacity.currentvalue > v.forecastcapacity.criticalthreshold then '/Orion/images/StatusIcons/Small-Critical.gif'
        when v.forecastcapacity.currentvalue > v.forecastcapacity.warningthreshold then '/Orion/images/StatusIcons/Small-Warning.gif'
        else '/Orion/images/StatusIcons/Small-Up.gif'
      end as [_iconfor_caption],
      '/Orion/images/StatusIcons/small-' + n.statusicon as [_iconfor_nodename]
    from Orion.Nodes n
    join Orion.NodesCustomProperties cp on n.nodeid = cp.nodeid
    join Orion.Volumes v on n.nodeid = v.nodeid
    join Orion.VolumesCustomProperties vp on v.volumeid = vp.volumeid
    where
      (environment in ('live','dlive','dpre-prod','dr','dsit'))
      and v.forecastcapacity.currentvalue > v.forecastcapacity.warningthreshold
      and MonitorCapacity = 'A: True' and v.volumesize>0 and caption like '_:\%'
    order by [volume percent used] desc
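
For anyone who wants to sanity-check the size formatting outside Orion, the two size CASE expressions in the query can be mirrored in a few lines of Python. This is only an illustrative sketch: `format_size` is a hypothetical helper, it uses floating-point division, and like the query it returns nothing for values of 1 MB or less.

```python
def format_size(num_bytes):
    """Pick the largest unit in which the value is greater than 1,
    mirroring the TB / GB / MB order of the CASE expressions."""
    units = (("TB", 1024 ** 4), ("GB", 1024 ** 3), ("MB", 1024 ** 2))
    for unit, factor in units:
        value = num_bytes / factor
        if value > 1:
            return f"{round(value, 2)} {unit}"
    return None  # the query has no ELSE branch, so small values yield no label

print(format_size(1536 * 1024 ** 2))  # 1.5 GB
print(format_size(5 * 1024 ** 4))     # 5.0 TB
```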

  • I did have another custom property for whether I want to monitor the drive or not; for me this should be built into the system. As we don't have the option of a default custom property value (this is a feature request somewhere), I used "A: True" and "B: False" to move True to the top of the list. After I mentioned this to my DBA, he came back with: I should have used "affirmative".

  • For such situations I build the alert messages with variables for the information I need, e.g. machine name, disk space, remaining disk space, etc. Those messages can be triggered by any alert you want. So I would build an alert for the generic stuff, such as any drive with less than 10% space remaining, and point it to those messages. Then I would build individual alerts for the specific machines and still point them to the same alert message; the body of the message will reflect what you need to see.
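
The idea of one shared message body filled in per alert can be sketched with Python's standard `string.Template`. The placeholder names below are hypothetical and are not Orion's own alert variable syntax; the point is only that several alerts can reuse a single template.

```python
from string import Template

# One message body shared by every low-disk-space alert.
ALERT_MESSAGE = Template(
    "Low disk space on $node\n"
    "Volume: $volume\n"
    "Total size: $total\n"
    "Free space: $free\n"
    "Percent used: $percent_used %"
)

def render_alert(node, volume, total, free, percent_used):
    """Fill the shared template with the details of one volume."""
    return ALERT_MESSAGE.substitute(
        node=node, volume=volume, total=total,
        free=free, percent_used=percent_used,
    )

# Both a generic alert and a machine-specific alert can call the same function.
print(render_alert("SERVER01", "E:\\", "16.0 G", "4.93 M", 100))
```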

  • This is becoming a nightmare; something that should be so simple clearly is not.

    I have created 4 alerts today as per the settings below (one alert shown as an example). I would have expected this alert to report on low disk space across all node/server volumes/disks in the estate, but it does not.

    I'm receiving alerts on disks that are not there. We have hundreds of application servers with volumes that have 5 GB or less of free space, yet neither this alert nor any of the others is reporting on them. However, I have received some alerts, and they have reported against page files and caches, which I have specifically excluded in the alerts.

    Can someone tell me what is wrong with these alerts, please?

    pastedImage_0.png

    Should this alert be like this instead, eliminating the % and making the 'does not contain' mandatory?

    pastedImage_1.png

    This is the Reset Trigger:

    pastedImage_2.png

    These are the alerts I'm receiving. Why is the page file being reported on?

    Volume DC1**********9V-E:\ Label:PageFile 6******8:

          Total size 16.0 G

          Free space 4.93 M

          Percent used 100 %

    And why am I receiving alerts on volumes that aren't there?

    Volume AP**********02-E:\ Label:Data F*******0:

          Total size 0

          Free space 0

          Percent used 0 %

    I can confirm that there are hundreds of servers that should be alerting on 5GB or Less Free Space.
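
Without the screenshots it is not possible to say what is wrong with these alerts. As a general point, though, an exclusion only takes effect when it must hold together with the threshold test (all conditions true); if it sits in an "any" group, a page file with little free space still matches. A small Python sketch of the intended logic, using the two sample alerts above as inputs; the function name, excluded words, and 5 GB default are assumptions for illustration only.

```python
GB = 1024 ** 3
MB = 1024 ** 2

EXCLUDED_WORDS = ("pagefile", "cache")  # assumed label fragments to skip

def should_alert(caption, size_bytes, free_bytes, threshold_bytes=5 * GB):
    """All three checks must pass: the volume has a real size, its
    caption is not excluded, and its free space is at or below the threshold."""
    if size_bytes <= 0:
        return False  # volume reports no size, e.g. removed or not polled
    lowered = caption.lower()
    if any(word in lowered for word in EXCLUDED_WORDS):
        return False  # exclusion is ANDed with the threshold, not ORed
    return free_bytes <= threshold_bytes

print(should_alert("E:\\ Label:PageFile", 16 * GB, int(4.93 * MB)))  # False
print(should_alert("E:\\ Label:Data", 0, 0))                         # False
print(should_alert("D:\\ Label:Apps", 20 * GB, 3 * GB))              # True
```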