Universal Disk Free Space Monitoring (One Template Will Handle All Logical Disks + Exceptions And Overrides)

Version 6

    Dear All Thwack Members,


    It is my pleasure to present this template for monitoring free disk space across all your Windows servers. About 80% of all our incidents are disk space related, and I hope this helps you handle that area in a simple and very flexible way.


    What's in the tin?

    • Monitor all your disks on all servers (requires just 1 SAM license per server)
    • Set global threshold based on Free MBytes AND Free % levels. This works perfectly well for both large (TBytes) disks and small (MBytes) disks
    • Differentiate between warning and critical level
    • Set overrides on a per disk per server level (very granular approach)
    • Exclude disks you do not want monitored (either completely or by setting up overrides)
    • Track usage as a graph
    • Have it on your dashboard as a green/yellow/red blob (which is not possible out-of-the-box with SAM volumes)


    Screenshots:


    Component

    001.JPG

    Multi stat chart

    002.JPG

    Script Arguments:

    003.JPG

    Global Statistic:

    004.JPG

    Individual Disk Statistic for global overrides (example for "J" drive)

    005.JPG

    Dashboard:

    006.JPG

    007.JPG


    Benefits:

    • If you are short on licenses - this will help you to monitor all disks with just 1 license per server
    • The biggest benefit for me (my SAM license is unlimited, so the first point does not really apply) is that I can have it as a group item on my dashboard. Very handy. See the screenshots above
    • The ability to compare against both MBytes and percentage levels makes this a very self-sustaining, self-managing template. It just works for every single disk. I have only a few exceptions configured across 700+ volumes
    • When you add a new disk (quite common in the virtual world), monitoring is enabled for it automatically, without the need to discover the new disk in SAM, which I often forget


    Additional info:


    User description notes (copy-paste from template, for those craving more info):

    ---------------------------------

    This smart script will monitor free space on all fixed local disks on a Windows server.

     

    * The global threshold, applied by default to all disks on a server, is based on free percentage AND free bytes (for example, a disk must have less than 5% free AND less than 20GB free to fire a warning alert). This works well for both large disks measured in TBytes and small disks measured in MBytes

    * You can exclude a particular disk on a particular server from the global threshold and set a custom threshold for that disk instead

    * You can differentiate between critical/warning threshold levels.

    >>> For the global threshold - this is achieved by incrementing the "Statistic" counter by 100 for each disk that breaches the critical threshold and by 1 for each disk that breaches the warning threshold. Note that a disk at critical level is also in breach of the warning level, so it is counted in both. For example, a Statistic value of 101 means that 1 disk has fallen to critical level; a value of 102 means that 1 disk is critical and 1 other disk is at warning level

    >>> For the custom threshold - you simply set the warning and critical values in the SAM template for a particular disk
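
    The global threshold logic and the "Statistic" counting scheme described above can be sketched in Python. This is not the template's actual script, only a minimal illustration. The names Disk, breaches, global_statistic and decode are made up for this example, and reading the values 2 / 5000 as the critical pair and 5 / 20000 as the warning pair is an assumption based on the example arguments in this post.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    letter: str
    size_mb: float
    free_mb: float


def breaches(disk, pct_limit, mb_limit):
    """True only when free space is below BOTH the percentage and the MB limit."""
    free_pct = 100.0 * disk.free_mb / disk.size_mb
    return free_pct < pct_limit and disk.free_mb < mb_limit


def global_statistic(disks, crit_pct, crit_mb, warn_pct, warn_mb):
    """Add 100 per disk breaching critical and 1 per disk breaching warning."""
    stat = 0
    for disk in disks:
        if breaches(disk, crit_pct, crit_mb):
            stat += 100
        if breaches(disk, warn_pct, warn_mb):
            stat += 1
    return stat


def decode(stat):
    """Return (critical disks, warning-only disks).

    Assumes fewer than 100 disks breach warning and that every critical
    disk also breaches warning (critical limits tighter than warning limits).
    """
    critical, warning_total = divmod(stat, 100)
    return critical, warning_total - critical


# Hypothetical sample data, sizes in MB
DISKS = [
    Disk("C", 100_000, 1_500),      # 1.5% and 1.5 GB free: critical (and warning)
    Disk("D", 500_000, 15_000),     # 3% and 15 GB free: warning only
    Disk("E", 4_000_000, 150_000),  # 3.75% free but 150 GB free: no breach
    Disk("G", 10_000, 3_000),       # 3 GB free but 30% free: no breach
]

STAT = global_statistic(DISKS, crit_pct=2, crit_mb=5000, warn_pct=5, warn_mb=20000)
print(STAT)          # 102
print(decode(STAT))  # (1, 1)
```

    Disks E and G show why the AND matters: a large disk can be low on percentage but still have plenty of space, and a small disk can be low on MBytes but still have a healthy percentage, so neither fires.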

     

    How to exclude disk / configure custom threshold:

    When you need to override the global threshold for a particular disk, first exclude the disk from the global threshold, then configure a separate threshold, based on the free MBytes value, in the SAM template below.

    For example: disk F needs to trigger an alert when it drops below 150GB. In this case you would override the script arguments with "${IP} 2 5000 5 20000 A,B,F" and set the critical threshold to "less than 150000" as the SAM threshold value. You can also set the SAM warning threshold to 200000 to be notified in advance.
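
    As a rough illustration of the disk F example, here is a small Python sketch of how such an argument string could be split and how the custom MBytes thresholds would classify a reading. The field order (IP, critical %, critical MB, warning %, warning MB, excluded letters) is an assumption inferred from the example, not confirmed by the script. The function names and the sample IP are made up, and the sketch follows the unbracketed exclusion list shown above rather than the bracketed form mentioned in the later update.

```python
def parse_arguments(arg_string):
    """Split an argument string of the ASSUMED form:
    '<ip> <crit %> <crit MB> <warn %> <warn MB> <excluded letters>'."""
    ip, crit_pct, crit_mb, warn_pct, warn_mb, excluded = arg_string.split()
    return {
        "ip": ip,
        "crit_pct": float(crit_pct),
        "crit_mb": float(crit_mb),
        "warn_pct": float(warn_pct),
        "warn_mb": float(warn_mb),
        # Comma-separated drive letters, normalised to upper case
        "excluded": {x.strip().upper() for x in excluded.split(",") if x.strip()},
    }


def custom_status(free_mb, warning_mb, critical_mb):
    """Classify one excluded disk against its own 'less than' MB thresholds."""
    if free_mb < critical_mb:
        return "critical"
    if free_mb < warning_mb:
        return "warning"
    return "ok"


# SAM would substitute ${IP} at run time; 192.0.2.10 is a placeholder address
ARGS = parse_arguments("192.0.2.10 2 5000 5 20000 A,B,F")
print(sorted(ARGS["excluded"]))  # ['A', 'B', 'F']

# Disk F with warning below 200000 MB and critical below 150000 MB
print(custom_status(250_000, 200_000, 150_000))  # ok
print(custom_status(180_000, 200_000, 150_000))  # warning
print(custom_status(140_000, 200_000, 150_000))  # critical
```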

     

    Limitations (all with very limited impact on usability and flexibility):

    - You can only define custom thresholds for the first 8 disks (C,D,E,F,G,H,I,J). Note that the global threshold still applies to all logical fixed disks, regardless of how many there are (unless you exclude any of them as explained above). In practice this is a minor limitation: most servers will not have that many disks, and even when they do, you will rarely need custom thresholds, so the probability of this limiting your monitoring is very low

    - When you exclude a particular disk, you can only set an "MBytes" threshold value (no percentage threshold here). This is also a very minor issue, because by the time you exclude a disk you already know its exact size and can work out for yourself the MBytes level at which you want an alert to come through.
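
    Working out the MBytes level for an excluded disk is simple arithmetic. A tiny Python helper (hypothetical name, and a made-up 2,000,000 MB disk using the same decimal GB-to-MB convention as the 150GB = 150000 example above) shows the idea:

```python
def mb_threshold(disk_size_mb, free_percent):
    """Convert a desired free-space percentage into a free-MBytes threshold."""
    return round(disk_size_mb * free_percent / 100)


# Hypothetical 2,000,000 MB disk
print(mb_threshold(2_000_000, 7.5))  # 150000 -> use as the critical value
print(mb_threshold(2_000_000, 10))   # 200000 -> use as the warning value
```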

    ---------------------------------


    Enjoy, comment, like, rate


    To Your Monitoring Success,

    Alex


    Update: 08/01/2016 - Slightly updated the script as per the comments below to enable it to pick up mount points. Also improved the variable for disk exclusions - you now need to enclose a disk letter in square brackets to exclude it. Example usage is in the script/template itself. Thank you, Alex