Storage Array data has not been collected for at least 6 hours

Is there any way I can get alerted by email when this happens? Also, can the alert include some indication of the root cause?

  • Yes, there are some alerts for issues collecting data. One of the most common problems is that the provider stops responding to the agent. To alert on this:

    • Check the Event Monitor for a generic trap that says "Both secure/unsecure WBEMClient connection failed: Unable to connect". 
    • Edit the trap and set the severity level of those traps to the desired level (I set mine to critical).  

    • Create a notification by going into your user account (top left link, click your name). 
    • Under the notifications section, click Add.
    • Choose the specific device or a group that contains that device.
    • Choose the severity level that matches what you set it to previously.
    • Pick one of your email addresses and press Save.

    The next time your provider goes down, you should receive an email notification for the trap.

    Let me know if that helps.

    Brian

  • One additional thing: you have to clear the traps once you have fixed the issue, or you will not be notified the next time it happens. You have two options:

    • Clear the trap from the event monitor manually.
    • Set up automatic clearing of traps by going to Settings > Server Setup > Server and setting these two values:
      • Automatic Clearing of Traps Frequency (how often to attempt to clear traps)
      • Automatic Clearing of Traps Age (how old traps must be before they are cleared automatically)
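
To make the relationship between the two settings concrete, here is a minimal Python sketch of one clearing pass. It is only an illustration of the idea, not Storage Manager's actual implementation: the `Trap` class, the `clear_old_traps` function, and the sample timestamps are all hypothetical. The frequency setting would decide how often a pass like this runs, and the age setting would correspond to `max_age`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Trap:
    # Hypothetical stand-in for a trap shown in the event monitor.
    message: str
    received_at: datetime


def clear_old_traps(traps, now, max_age):
    """Run one clearing pass: keep traps younger than max_age, drop the rest."""
    return [t for t in traps if now - t.received_at < max_age]


# Assumed sample data: one stale trap and one recent trap.
now = datetime(2024, 1, 1, 12, 0)
traps = [
    Trap("Unable to connect", now - timedelta(hours=5)),
    Trap("Unable to connect", now - timedelta(minutes=10)),
]

# With an age of 1 hour, only the 10-minute-old trap survives the pass.
remaining = clear_old_traps(traps, now, max_age=timedelta(hours=1))
print(len(remaining))
```

If the age is set too high, a stale trap may still be present when the provider fails again, which is the situation where no new notification is sent.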

    Brian

  • Excellent!

    If I may pick your brain a little further...

    What if the agent assigned to collect the data is not running, or has an issue such as a specific module being offline? How do I report on these two cases?

    I'd like to get an alert when the storage manager agent itself is not running and if possible, include in the alert what the problem is. For example:

    No heartbeat from Storage Manager Agent (10.1.1.201)

    AND/OR

    Storage Manager Agent (10.1.1.201) Error: XXXXXX Module offline.

    AND/OR

    Storage Manager Agent Service log in credentials failed.

    Maybe I'm addressing an issue that shouldn't occur, but I'd still like an alert if a Storage Manager Agent isn't running (that is, has not reported back to the Storage Manager Server). The agent could be used for FA or something else, usually critical, so one should get an alert whenever it isn't running.
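
To pin down what I mean by a heartbeat alert, here is a rough Python sketch of the check. It does not use any Storage Manager API: the `silent_agents` function, the `last_seen` table, and the second address (10.1.1.202) are made up for illustration, and the 6-hour threshold is borrowed from the "not collected for at least 6 hours" message.

```python
from datetime import datetime, timedelta


def silent_agents(last_seen, now, max_silence):
    """Return one alert line per agent whose last report is older than max_silence."""
    alerts = []
    for address, seen_at in sorted(last_seen.items()):
        if now - seen_at > max_silence:
            alerts.append(f"No heartbeat from Storage Manager Agent ({address})")
    return alerts


# Assumed sample data: when each agent last reported back to the server.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "10.1.1.201": now - timedelta(hours=7),    # silent for longer than the threshold
    "10.1.1.202": now - timedelta(minutes=5),  # reported recently
}

alerts = silent_agents(last_seen, now, max_silence=timedelta(hours=6))
for line in alerts:
    print(line)
```

Each returned line could then be sent through the same email notification used for the traps.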