Monitoring and Alerting Repairs it Entirely!

In previous blog posts, I talked about thin provisioning, approaches to moving from fat to thin, and the practice of over-committing. Those posts covered how each one works, along with its advantages, drawbacks, and methodology. I also argued that constant monitoring of your storage is the answer to many of those drawbacks. This article shows how to apply a storage monitoring tool to your infrastructure to monitor your storage devices. When you select a tool, make sure it has alerting options too. I will walk you through SolarWinds Storage Resource Monitor (SRM for short), one such storage monitoring tool, and along the way I will point out the features that any storage monitoring tool needs in order to overcome the weaknesses of thin provisioning.

Introduction to SRM:

SRM is SolarWinds’ storage monitoring product. SRM monitors, reports, and alerts on SAN and NAS devices from vendors such as Dell, EMC, and NetApp. For a detailed list check here. In addition, SRM helps you manage and troubleshoot storage performance and capacity problems.

You can download SRM from the link below:

Storage Resource Monitor

Once you have installed SRM, the next step is to add your storage device. The procedure differs by vendor. Visit the page below for instructions on how to add storage devices from different vendors.

How to add storage devices

Once you have installed SRM and added your storage devices, you have instant visibility into all storage layers, extending to virtualization and applications through the Application Stack Environment Dashboard. With SRM, troubleshooting storage problems across your application infrastructure is a cakewalk. Let’s start with SRM’s dashboard.

dashboard.png

The dashboard gives you a bird’s-eye view of any issues in your storage infrastructure. It displays all storage devices monitored by SRM, grouped by product, along with the status of each storage layer, such as storage arrays, storage pools, and LUNs.

SRM and Thin Provisioning:

Moving on to thin provisioning, SRM lets you manage thin-provisioned LUNs more effectively. When thin provisioning is managed and monitored accurately, over-provisioning (over-committing) can be done efficiently. SRM helps you view, analyze, and plan thin provisioning deployments by collecting and reporting detailed information on virtual disks, so you can manage the level of over-commitment on your datastores.

LUN.png

This resource presents a grid of all LUNs using thin provisioning in the environment.

The columns are:

  • LUN: Shows the name of the LUN and its status
  • Storage Pool: Shows which storage pool the LUN belongs to
  • Associated Endpoint: The server volume or the datastore using the LUN
  • Total Size: The total user size of the LUN
  • Provisioned Capacity: Amount of capacity currently provisioned

There are also columns that show the provisioned percentage, File System Used Capacity, and File System Used Capacity percentage for each LUN.

A tooltip appears when you hover over a LUN or storage pool, giving you a quick snapshot of performance and capacity. This helps you decide whether you need to take action. Hovering over a storage pool also shows the pool’s usable capacity summary: the total usable capacity (i.e., the amount of storage capacity that a user can actually use), the remaining capacity (the storage still available to be consumed), and the over-subscribed capacity (the amount by which this storage pool is over-committed).

hoverover storage pool _ 2.png
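To make the three figures in the pool summary concrete, here is a minimal Python sketch of the arithmetic behind them. It is not SRM's implementation: the `PoolCapacity` class, its field names, and the formulas (remaining = usable minus used, over-subscribed = subscribed minus usable) are assumptions chosen for illustration, and the sample numbers are made up.

```python
from dataclasses import dataclass

TIB = 1024 ** 4  # bytes in one tebibyte


@dataclass
class PoolCapacity:
    """Hypothetical capacity figures for one storage pool (all in bytes)."""
    usable_bytes: int      # capacity a user can actually use
    used_bytes: int        # capacity physically consumed so far
    subscribed_bytes: int  # sum of the sizes promised to all thin LUNs

    @property
    def remaining_bytes(self) -> int:
        # Assumed formula: usable capacity that has not been consumed yet.
        return max(self.usable_bytes - self.used_bytes, 0)

    @property
    def oversubscribed_bytes(self) -> int:
        # Assumed formula: how far the promises exceed what the pool can hold.
        return max(self.subscribed_bytes - self.usable_bytes, 0)

    @property
    def oversubscribed_pct(self) -> float:
        # Over-subscription expressed as a percentage of usable capacity.
        if self.usable_bytes == 0:
            return 0.0
        return 100.0 * self.oversubscribed_bytes / self.usable_bytes


# Made-up example: a 10 TiB pool with 6 TiB consumed and 12 TiB promised.
pool = PoolCapacity(usable_bytes=10 * TIB, used_bytes=6 * TIB, subscribed_bytes=12 * TIB)
print(pool.remaining_bytes // TIB, "TiB remaining")
print(pool.oversubscribed_bytes // TIB, "TiB over-subscribed")
print(pool.oversubscribed_pct, "% over-subscribed")
```

In this example the pool has promised 2 TiB more than it can physically hold, which is the kind of gap the over-subscribed figure is meant to surface.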

Drilling down into a specific storage pool presents key/value details for that pool, including:

  • Total Usable Capacity
  • Total Subscribed Capacity
  • Over-Subscribed Capacity
  • Provisioned Capacity
  • Projected Run-Out Time: the approximate time it will take to fully consume this storage pool

drill down on storage pool.png
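The article does not say how SRM computes the projected run-out time. One simple way to think about such a figure is a straight-line extrapolation of past usage. The sketch below is an illustration under that assumption only: the `projected_run_out` function, the sample data, and the choice of a least-squares fit are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple

Sample = Tuple[datetime, float]  # (timestamp, used capacity in GB)


def projected_run_out(samples: Sequence[Sample], usable_gb: float) -> Optional[datetime]:
    """Estimate when used capacity reaches usable_gb with a least-squares line.

    Returns None when there is too little data or usage is flat or shrinking.
    """
    if len(samples) < 2:
        return None
    t0 = samples[0][0]
    xs = [(t - t0).total_seconds() / 86400.0 for t, _ in samples]  # days since t0
    ys = [used for _, used in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return None  # all samples share one timestamp
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    if slope <= 0:
        return None  # no growth, so no run-out under this model
    intercept = mean_y - slope * mean_x
    days_until_full = (usable_gb - intercept) / slope
    return t0 + timedelta(days=days_until_full)


# Made-up history: usage grows 5 GB per day in a 1000 GB pool.
start = datetime(2024, 1, 1)
samples = [
    (start, 600.0),
    (start + timedelta(days=10), 650.0),
    (start + timedelta(days=20), 700.0),
]
eta = projected_run_out(samples, usable_gb=1000.0)
print("Projected run-out:", eta)
```

Real usage is rarely this linear, so an estimate like this is best read as an early warning rather than a deadline.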

In addition, Active Alerts displays the alerts related to this storage pool: the alert name, a short alert message, the name of the LUN for which the alert was triggered, and the time it was triggered.

Learn how to create an alert in SRM.

Alerting helps proactive monitoring:

Storage performance issues can happen at any time, and you cannot watch how your storage is performing every second. This is why you need alerts: they warn you before a problem occurs. By setting up alerts based on criteria, you gain complete visibility into your storage. Set up each alert to anticipate a particular situation that can cause storage performance issues.

all active alerts.png

Below is a list of example alerts that you can use for LUNs when doing thin provisioning:

  • Alert when usable space in the LUN goes below a particular % (e.g., 20%)
  • Alert when usable space in a storage pool goes below a particular %
  • Alert when the storage pool’s oversubscribed % goes higher than a particular % (e.g., 10%)

Only you can decide the % values, as they will differ based on your infrastructure. Some organizations can add more storage in days, whereas in many others it might take months to get approval for additional storage.
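The three example alerts can be expressed as simple threshold checks. The Python sketch below only illustrates that logic and is not how alerts are defined in SRM. The `Thresholds` class and the `evaluate_lun` and `evaluate_pool` functions are hypothetical names; the 20% and 10% defaults come from the examples in the list, while the pool free-space default is a placeholder assumption you would replace with your own value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Thresholds:
    """Hypothetical alert thresholds; tune these to your own environment."""
    lun_free_pct_min: float = 20.0      # example value from the list above
    pool_free_pct_min: float = 20.0     # placeholder assumption, not from the article
    pool_oversub_pct_max: float = 10.0  # example value from the list above


def free_pct(total_gb: float, used_gb: float) -> float:
    """Free space as a percentage of total capacity."""
    if total_gb <= 0:
        return 0.0
    return 100.0 * (total_gb - used_gb) / total_gb


def evaluate_lun(name: str, total_gb: float, used_gb: float, th: Thresholds) -> List[str]:
    """Return alert messages for one LUN (empty list means no alert)."""
    alerts = []
    pct = free_pct(total_gb, used_gb)
    if pct < th.lun_free_pct_min:
        alerts.append(f"LUN {name}: free space {pct:.1f}% is below {th.lun_free_pct_min:.1f}%")
    return alerts


def evaluate_pool(name: str, usable_gb: float, used_gb: float,
                  subscribed_gb: float, th: Thresholds) -> List[str]:
    """Return alert messages for one storage pool (empty list means no alert)."""
    alerts = []
    pct = free_pct(usable_gb, used_gb)
    if pct < th.pool_free_pct_min:
        alerts.append(f"Pool {name}: free space {pct:.1f}% is below {th.pool_free_pct_min:.1f}%")
    # Assumed definition: over-subscription as a percentage of usable capacity.
    oversub = 100.0 * max(subscribed_gb - usable_gb, 0.0) / usable_gb if usable_gb > 0 else 0.0
    if oversub > th.pool_oversub_pct_max:
        alerts.append(f"Pool {name}: oversubscribed {oversub:.1f}% exceeds {th.pool_oversub_pct_max:.1f}%")
    return alerts


th = Thresholds()
lun_alerts = evaluate_lun("lun01", total_gb=100.0, used_gb=85.0, th=th)
pool_alerts = evaluate_pool("pool01", usable_gb=1000.0, used_gb=500.0, subscribed_gb=1200.0, th=th)
for message in lun_alerts + pool_alerts:
    print(message)
```

An organization that needs months to get new storage approved would likely raise the free-space thresholds and lower the over-subscription limit, so that the warnings arrive with enough lead time.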

Once you have alerts in place, you can sit back and relax, and put the time you used to spend monitoring thin provisioning and over-committing toward other endeavors.
