Has anyone found the right perf counters to monitor an MS cluster resource? I found several scripts that get cluster resource info, but I'm not sure how to display their echoed output in the APM template. Any ideas?
The SNMP monitor doesn't offer "Get Next" as an option. That might be the problem. If it's not that, we'll need a Support ticket to delve into it further.
I haven't found a way to monitor it using APM, but I have created a Universal Device Poller and use that to monitor it through NPM.
The OID for which node owns the resources is 1.3.6.1.4.1.232.15.2.4.1.1.5
I also use this OID for the overall Cluster Condition: 1.3.6.1.4.1.232.15.1.3
APM 3.0 has a simple SNMP OID monitor.
Adapting a script that collects some information to APM is extremely simple. The basic idea is to condense the information into three data points:
1. Status: This is what determines the status of the monitor in APM, i.e. whether you see a green, yellow, or red dot next to the monitor in the monitoring UI.
2. Statistic: This is the data that you want to track over time. The value shows up as the stat associated with the monitor and is shown in dials for the latest stat and charts for historical reference.
3. Message: This is used to convey information about the current state of the monitor. It is displayed as the extra info button on the monitoring UI.
So, in the context of your question about MS Cluster Resource, let's take the following scenario:
You have an ACTIVE-ACTIVE-PASSIVE cluster. You want to make sure that both of your active nodes stay active, and to be alerted when a failover to the passive node takes place.
Let's assume you found a script that tells you which nodes are active and which are passive. You could modify this script to adapt it to APM in the following way:
1. Status: Return OK status if the planned active nodes are still active. If you see that a failover has occurred, set the status to Down/Critical.
2. Statistic: You can report how many of your planned active nodes are active. This way you can see the historical trend of when nodes have gone down. This may be even more beneficial when you have several nodes in your cluster and some nodes are brought up and down.
3. Message: The message can indicate which node of the cluster is currently down, to make troubleshooting easier.
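To make the three data points concrete, here is a minimal sketch of the condensing step in Python. It is only an illustration: the function name `summarize_cluster`, the node names, and the state strings are hypothetical, and it deliberately does not show how the values are handed to APM, since that depends on the script monitor documentation.

```python
def summarize_cluster(planned_active, node_states):
    """Condense cluster node states into (status, statistic, message).

    planned_active: names of the nodes that are supposed to be active.
    node_states:    mapping of node name -> state string such as
                    "active" or "passive" (hypothetical values that your
                    own cluster script would have to supply).
    """
    # Planned active nodes that really are active right now.
    active_now = [n for n in planned_active if node_states.get(n) == "active"]
    # Planned active nodes that are anything else (passive, down, unknown).
    not_active = [n for n in planned_active if n not in active_now]

    # 1. Status: OK only while every planned active node is still active.
    status = "OK" if not not_active else "Critical"
    # 2. Statistic: how many planned active nodes are active.
    statistic = len(active_now)
    # 3. Message: name the nodes that need attention.
    if not_active:
        message = "Not active: " + ", ".join(not_active)
    else:
        message = "All planned active nodes are active"
    return status, statistic, message


# Hypothetical ACTIVE-ACTIVE-PASSIVE cluster after NODE2 failed over to NODE3.
print(summarize_cluster(
    ["NODE1", "NODE2"],
    {"NODE1": "active", "NODE2": "passive", "NODE3": "active"},
))
```

Keeping the condensing logic in one function like this means the part of the script that queries the cluster can change without touching how status, statistic, and message are derived.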
You should take a look at the Windows Script documentation to see how you can adapt a script you find on the net to APM. Then, of course, the only right thing to do is to share it with others on thwack and get some bragging rights.
Happy scripting.
I have tried using the SNMP OID monitor in APM 3.0 SP1, and I get the following error when trying to monitor OID 1.3.6.1.4.1.232.15.1.3, or any other OID for that matter:
failed with 'Undefined' status
I'm not sure what is going wrong, but that is the error I am getting. I have this same OID set up in the Universal Device Poller, and I collect info from it there without a problem.
Is your target a table-based MIB?
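Why that matters: an SNMP GET only succeeds on an exact instance OID, while GETNEXT returns the next OID in lexicographic order, so a poller that issues GETNEXT from a table column can reach rows that a plain GET on the column OID itself cannot. The sketch below simulates this with an in-memory agent. The `FakeAgent` class, the row indexes `.1` and `.2`, and the node names are all hypothetical, and it has not been confirmed here that the OIDs in this thread are table columns.

```python
from bisect import bisect_right


def parse_oid(text):
    """Turn '1.3.6.1' into (1, 3, 6, 1) so OIDs sort lexicographically."""
    return tuple(int(part) for part in text.split("."))


class FakeAgent:
    """Tiny in-memory stand-in for an SNMP agent (illustration only)."""

    def __init__(self, entries):
        self._values = {parse_oid(oid): value for oid, value in entries.items()}
        self._oids = sorted(self._values)

    def get(self, oid):
        # GET: succeeds only when the exact instance OID exists.
        return self._values.get(parse_oid(oid))

    def get_next(self, oid):
        # GETNEXT: return the first instance that sorts after the given OID.
        idx = bisect_right(self._oids, parse_oid(oid))
        if idx == len(self._oids):
            return None  # walked past the end of the agent's data
        nxt = self._oids[idx]
        return ".".join(str(part) for part in nxt), self._values[nxt]


# Column OID from this thread; the row indexes and values are made up.
COLUMN = "1.3.6.1.4.1.232.15.2.4.1.1.5"
agent = FakeAgent({COLUMN + ".1": "NODE-A", COLUMN + ".2": "NODE-B"})

print(agent.get(COLUMN))         # no instance at the bare column OID
print(agent.get_next(COLUMN))    # first row of the column
print(agent.get(COLUMN + ".1"))  # GET works once the row index is appended
```

If the target does turn out to be a table column, appending the row index to the OID in the SNMP monitor may be worth trying before opening a ticket.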