As many of you know, Storage Manager (Profiler) collects a plethora of data, maybe even too much - but many of you ask how to set thresholds and alerts so you can be notified when something is amiss. In Profiler, getting alerts involves three steps:
- Building a rule, which includes a threshold on the metric of interest
- Assigning it to a policy (i.e., the set of resources you want to monitor) and pushing it out
- Setting the Notification to alert you via email when the trap is received.
For the threshold, let's focus on performance metrics for now - although you can set storage and asset change thresholds as well.
Go to Settings > All Rules > Add New Rule. From the list of choices, choose Threshold Rule. You should see the following screen:
Some quick definitions:
- Section - basically the scope of resources this rule would apply to (Ex: NetApp)
- Category - the types of metrics applicable to that section (Ex: LUN Performance)
- Instances (if applicable) - the instances of the metric we are monitoring (All instances)
- Condition - the threshold on the metric. (Average Latency (ms) > 20)
- Duration - how long the condition has to be met before the threshold is triggered (0 Min)
- Choose Action - choose one action (Send Trap)
So in this example, we are telling Profiler to send us a trap whenever any instance of a NetApp LUN has an average latency greater than 20 ms. Before moving to the next step, a couple of cool things:
- When you set Profiler to Any Instances, new objects are covered automatically. If you create a new LUN, Profiler will automatically apply the rule to that instance.
- You can pick one or more instances - so you can get very particular if you need to.
- The duration allows you to filter out noise, so you don't get alerted on every little spike.
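To make the Condition and Duration fields a bit more concrete, here is a minimal Python sketch of how a "metric above a limit for N minutes" check could behave. This is not Profiler's actual implementation; the `ThresholdRule` class, the `fire_times` function, and the sample data are all hypothetical and only illustrate the idea that a duration filters out short spikes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ThresholdRule:
    """Hypothetical stand-in for a threshold rule (not Profiler's data model)."""
    metric: str
    limit: float
    duration_min: int  # 0 means: fire on the first sample over the limit


def fire_times(rule: ThresholdRule, samples: List[Tuple[int, float]]) -> List[int]:
    """Return the minutes at which the rule would fire.

    `samples` is a time-ordered list of (minute, value) pairs. The rule fires
    once per breach, as soon as the value has stayed above the limit for at
    least `duration_min` minutes.
    """
    fired = []
    breach_start = None  # minute at which the current breach began
    already_fired = False
    for minute, value in samples:
        if value > rule.limit:
            if breach_start is None:
                breach_start = minute
            if not already_fired and minute - breach_start >= rule.duration_min:
                fired.append(minute)
                already_fired = True
        else:
            # Value dropped back under the limit: reset the breach window.
            breach_start = None
            already_fired = False
    return fired


latency = [(0, 25), (1, 10), (2, 25), (3, 26), (4, 27), (5, 5)]
no_wait = ThresholdRule("Average Latency (ms)", limit=20, duration_min=0)
two_min = ThresholdRule("Average Latency (ms)", limit=20, duration_min=2)
print(fire_times(no_wait, latency))
print(fire_times(two_min, latency))
```

With a duration of 0, the one-minute spike at the start triggers the rule right away; with a duration of 2, only the sustained breach does.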
So, you have your rule; now you have to apply it. In Profiler, you do that via policies - which are just collections of resources of the same type that you configure at the same time. Every resource type has a Default Policy, and that is the one we will use today.
Go to Settings > Policies and click the edit icon for Default NetApp Filer Policy (let's stick with NetApp for this example).
Click Rules and you will see a list of rules that are available to be assigned, or already assigned to the policy. Note there are default rules already assigned to identify problems for you. To assign a rule to the policy, click the rule and press the down arrow, and then press the Save button.
Now the rule is assigned to the policy - but make sure you press the Push button to update the configuration on the agents monitoring the NetApp Filers.
So now, if a condition were met, the agent would send a trap to the Profiler server and you would see the trap in the Event Monitor. You could then manage the event.
However, if you want to receive an email for that event, you need to turn on notifications. Notifications are turned on per user, so go to Settings > Users and click the edit icon for your login. If you have defined an email address, you will see a "Notifications" section. Click the Add button in this section.
Now you can add a notification for one resource or a group of resources. Choose "Groups" and then choose "All Devices". You can then pick the trap severities you want to be notified about, and one or more email addresses to send the email to.
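If it helps to think of a notification as a simple filter, here is a rough Python sketch of that idea: a trap is matched against each notification's group and severities, and matching notifications contribute their email addresses. The `Notification` class, the `recipients` function, the severity names, and the addresses are all made up for illustration; this is not how Profiler stores or evaluates notifications.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Notification:
    """Hypothetical notification entry: a group, severities, and recipients."""
    group: str
    severities: Set[str]
    emails: List[str]


def recipients(notifications: List[Notification],
               event_group: str, event_severity: str) -> List[str]:
    """Return the de-duplicated email addresses that should hear about an event."""
    result: List[str] = []
    for note in notifications:
        # "All Devices" is treated here as a catch-all group (an assumption).
        group_matches = note.group in ("All Devices", event_group)
        if group_matches and event_severity in note.severities:
            for address in note.emails:
                if address not in result:
                    result.append(address)
    return result


my_notifications = [
    Notification("All Devices", {"Critical"}, ["me@example.com"]),
    Notification("NetApp", {"Critical", "Warning"},
                 ["team@example.com", "me@example.com"]),
]
print(recipients(my_notifications, "NetApp", "Critical"))
print(recipients(my_notifications, "Other", "Warning"))
```

In this sketch, a critical NetApp trap reaches both addresses, while a warning on a device outside the NetApp group reaches no one.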
Whew! That was a few too many steps (hint, we will make this better in the future) - but now I can safely sleep knowing that I will be notified if I have a problem.
As a bonus, I'll throw in a few notes about managing the events on the Event Monitor:
- Events that occur over and over again for the same object only notify you on the first occurrence, but maintain a count thereafter (hence the count column in the Event Monitor)
- You can acknowledge and clear traps through the Event Monitor
- In Settings > Server Setup > Server, you can turn on the automatic clearing of events after a certain amount of time.
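The "notify once, then count" behavior is easy to picture with a small Python sketch. Again, this is only an illustration under my own assumptions: the `EventMonitor` class and its methods are hypothetical, and the real Event Monitor may key and track events differently.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class OpenEvent:
    """Hypothetical record for an event that has not been cleared yet."""
    resource: str
    rule: str
    count: int = 1


class EventMonitor:
    """Toy model: one open event per (resource, rule) pair, with a repeat count."""

    def __init__(self) -> None:
        self.open_events: Dict[Tuple[str, str], OpenEvent] = {}

    def receive(self, resource: str, rule: str) -> bool:
        """Record a trap; return True only if a notification should be sent."""
        key = (resource, rule)
        event = self.open_events.get(key)
        if event is None:
            # First occurrence for this object: open an event and notify.
            self.open_events[key] = OpenEvent(resource, rule)
            return True
        # Repeat occurrence: just bump the count, no new notification.
        event.count += 1
        return False

    def clear(self, resource: str, rule: str) -> None:
        """Clear an event so the next occurrence notifies again."""
        self.open_events.pop((resource, rule), None)


monitor = EventMonitor()
first = monitor.receive("lun_01", "Average Latency (ms) > 20")
second = monitor.receive("lun_01", "Average Latency (ms) > 20")
count = monitor.open_events[("lun_01", "Average Latency (ms) > 20")].count
print(first, second, count)
```

Clearing the event (manually or, in the real product, via the automatic clearing setting) is what allows the next occurrence to notify you again.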
Thanks for listening - and as always, if you have thoughts or feedback, we would love to hear it.