4 Replies Latest reply on Jul 27, 2016 1:42 PM by jest4kicks

    UnDP for Cisco fault persists after fault has cleared

    jest4kicks

      Hey all,

       

      I'm designing a UnDP to alert us to faults from our Cisco IMCs. I've been using the following link as a general guide.

      https://communities.cisco.com/docs/DOC-37197

       

      The guide describes two methods for getting this info: SNMP traps, or standard SNMP polling of the faults table. I decided to try the table to avoid messing with SNMP traps.

       

      To test, I set up a RAID on a lab server and then yanked one of the disks. This produced the expected fault, and I was able to poll fault information from the table. So far, so good.

       

       

      The problem surfaced after the disk was reinserted. During the RAID rebuild, the fault updated to reflect the new status. OK, cool. But then the rebuild completed and the fault in the CIMC cleared... except that the UnDP is still reporting the fault data. The following screen cap is current, and the fault has definitely cleared on the CIMC side.

       

      I thought the CIMC might still be reporting the data, so I did an SNMP walk against the OIDs. Nothing. (I used this same method while the fault was active, and it returned the fault data as expected.)

       

      Does anyone know why the UnDPs are still reporting the previous status? Is there a way to automatically clear it when the SNMP agent stops reporting it?

       

      Thanks!

        • Re: UnDP for Cisco fault persists after fault has cleared
          rickrocks

          Hello,

          When you add the Universal Device Poller, make sure you set the polling interval. Expand Show Advanced Options and enter a value for the polling interval. Hope this helps.

            • Re: UnDP for Cisco fault persists after fault has cleared
              jest4kicks

              Hey Rick, I like your thinking, but I don't think it's an issue with the UnDP waiting to poll. As I understand it, leaving the polling interval blank should make the UnDP fall back to the default polling interval in NPM. Additionally, the UnDP did report the updated status when the disk was reinserted and the RAID was rebuilding.

               

              Digging into this a little further, I noticed that the OIDs I polled this information from are no longer responding. My theory is that the UnDP doesn't know what to update because it's no longer getting a response, and apparently the OIDs only respond when there's an active fault.
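
              To make the theory concrete, here's a rough Python sketch of the behavior I'd want from a poller. This is not how NPM or the UnDP works internally; `reconcile`, `last_known`, and `walk_result` are names I made up for illustration, and the fault text is placeholder data. The idea is to tell "the device answered and the table is empty" (fault cleared) apart from "the device didn't answer at all" (can't tell, so keep the last value).

```python
from typing import Dict, Optional


def reconcile(
    last_known: Dict[str, str],
    walk_result: Optional[Dict[str, str]],
) -> Dict[str, str]:
    """Decide which fault rows to display after one poll cycle.

    Hypothetical helper, not part of NPM or any SNMP library.

    last_known  -- rows from the previous successful poll (index -> text)
    walk_result -- rows from the latest walk of the fault table:
                   None means the device did not answer (timeout),
                   {}   means it answered but the table has no rows.
    """
    if walk_result is None:
        # No answer at all: we cannot tell whether the fault cleared,
        # so keep showing the last known state.
        return dict(last_known)
    # The device answered: rows missing from the walk are treated as cleared.
    return dict(walk_result)


# Placeholder data standing in for one row of the fault table.
active = {"1": "example fault: virtual drive degraded"}

print(reconcile(active, {}))    # table answered, but empty
print(reconcile(active, None))  # device unreachable
```

              The timeout case is the "own set of problems" part: if an empty result and an unreachable device were treated the same way, every network blip would look like a cleared fault.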

               

              Unless there's an option to clear a UnDP when it stops getting a response (which would introduce its own set of problems), I think I'm stuck here. May need to go the trap route after all.