In my last post, I spoke about different ways to alert in NPM, pairing multiple features together to build granular alerts and really cut down on alerting noise.
Well, that’s all well and good in a perfect world, where all of your devices are reporting the correct data – but what can you do if they aren’t? If your device is providing the wrong data for CPU and Memory for example, it’s no longer possible to alert accurately on that node. Or if we’re showing the vendor as ‘Unknown’ then it’s hard to use a qualifier like ‘Where Vendor = ABCDXYZ’ to define your alert scope.
We poll certain OIDs for different device types with our Native Pollers – OIDs that we’ve carefully chosen for certain vendors or models, and that work for the vast majority of those devices. But sometimes, those default OIDs aren’t a perfect fit. Sometimes, the device should support an OID, but it doesn’t. Other times, we might not have a Poller created for that particular device model yet. (When we say Poller, we mean gathering specific data from an OID or group of OIDs – for example, CPU & Memory, Hardware health sensors, Topology data and so on.)
Luckily, there’s a quick and easy way to swap in new Pollers, or create your own, and start polling these devices accurately: Device Studio!
Never heard of it? Check out this video.
So, let’s talk specifics about Device Studio, and show you the exact steps you can take to fix a device that’s providing inaccurate information. On your Orion Settings page, look under Node & Group Management for the ‘Manage Pollers’ option:
Let’s assume you have a problematic device, providing the wrong CPU & Memory details. Fixing this is a two-step process.
#1 – Find the right OIDs to poll to get accurate information
This one will need a little legwork! Check Thwack and the Content Exchange first (or click the Community tab to download directly from the Manage Pollers page) – after all, no point in re-inventing the wheel when the awesome folk on the forums have shared their successes! If that doesn’t work, you’ll often find all the information you need by plugging the device model, what you’re looking for and the word ‘OID’ into Google – chances are if you’re looking for this information, someone else was too. If that fails you, turn to the device documentation.
If you’re brave and curious, you can SNMP Walk the entire device to get a list of every single OID it supports. The best part? We ship an SNMP Walk tool with your Orion install – you can find it here:
[Install Drive]\Program Files (x86)\SolarWinds\Orion\SNMPWalk.exe
I’m an SNMP geek, so if I get into writing about how to read an SNMP Walk, you’ll never hear the end of it – so I’ll leave you with this handy guide on SNMP to get you started on reading the output and choosing the right OID from it to poll your device.
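If you save a walk to a text file, a few lines of script can narrow it down to the branch you care about. The sketch below is only an illustration: it assumes net-snmp style numeric output lines of the form `.OID = TYPE: value`, which may not match what your walk tool writes, and the helper names `parse_walk` and `under_subtree` are made up for this example. The sample values are invented too.

```python
import re

# Assumed line format (net-snmp style numeric output), for example:
#   .1.3.6.1.2.1.25.3.3.1.2.196608 = INTEGER: 12
# Adjust the pattern if your walk tool writes lines differently.
LINE_RE = re.compile(
    r"^\.?(?P<oid>\d+(?:\.\d+)*)\s*=\s*(?:(?P<type>[A-Za-z0-9-]+):\s*)?(?P<value>.*)$"
)


def parse_walk(text):
    """Return a list of (oid, type, value) tuples from walk output."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((match["oid"], match["type"], match["value"].strip()))
    return rows


def under_subtree(rows, prefix):
    """Keep only the rows whose OID is the prefix itself or sits below it."""
    prefix = prefix.strip(".")
    return [
        row for row in rows
        if row[0] == prefix or row[0].startswith(prefix + ".")
    ]


# Invented sample output, for illustration only.
SAMPLE = """\
.1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.8072.3.2.10
.1.3.6.1.2.1.25.3.3.1.2.196608 = INTEGER: 12
.1.3.6.1.2.1.25.3.3.1.2.196609 = INTEGER: 30
"""

# hrProcessorLoad from the standard HOST-RESOURCES-MIB (per-processor load).
HR_PROCESSOR_LOAD = "1.3.6.1.2.1.25.3.3.1.2"

cpu_rows = under_subtree(parse_walk(SAMPLE), HR_PROCESSOR_LOAD)
for oid, kind, value in cpu_rows:
    print(oid, kind, value)
```

Filtering by prefix like this is a quick way to check whether a device answers for a given branch at all before you go looking for it in Device Studio.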
#2 – Using the OIDs, set up a new Poller, and assign it to your device
1. In Device Studio, click ‘Create New Poller’
2. Fill in the details about this new poller
3. On the next page, you will see a list of all required information for this Poller. For example, to poll CPU & Memory, Orion needs to know where to get details of the current CPU load, Memory used and Free memory.
4. For each of these details, you’ll need to define the data source – this means, you’ll need to define what OIDs Orion needs to poll to get accurate information from your device.
5. You can browse the MIB tree itself, testing OIDs against your chosen device as you go.
6. Once you’ve chosen the data source, you’ll be asked to confirm that the data is reasonable and accurate. Here you also have the option to perform calculations on the polled result, for example to get an average across CPU cores, or to combine multiple pollers together. This is very useful for Memory, where the data is often stored as the number of blocks used or free, which must then be multiplied by the block size to get an accurate result.
If you’re happy with what you see, click ‘Yes, the data source is reasonable’.
7. Almost there! Once you complete the wizard, choose your shiny new Poller from the list and select the ‘Assign’ button.
Select the node or nodes you need to assign this poller to, and run a Scan against them – this confirms that they will definitely support those new OIDs. If they pass the test with a Match, you can Enable your new poller, replacing the Native poller.
8. If you need to swap back again for any reason, just run a List Resources against the Node, and you can toggle back and forth between your pollers.
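To make the calculations from step 6 concrete, here is the arithmetic as a small worked example. This is not Device Studio's own formula syntax, just a sketch of the maths in Python, and the block size, block counts and per-core loads are invented values.

```python
def blocks_to_bytes(blocks, block_size_bytes):
    """Convert a block count into bytes."""
    return blocks * block_size_bytes


def average_load(per_core_loads):
    """Average the CPU load (in percent) across all cores."""
    if not per_core_loads:
        raise ValueError("need at least one core reading")
    return sum(per_core_loads) / len(per_core_loads)


# Invented readings, standing in for values polled from a device.
block_size_bytes = 4096
blocks_used = 1_500_000
blocks_free = 500_000

memory_used = blocks_to_bytes(blocks_used, block_size_bytes)
memory_free = blocks_to_bytes(blocks_free, block_size_bytes)
percent_used = 100 * memory_used / (memory_used + memory_free)

cpu_load = average_load([10, 20, 30, 40])

print(memory_used)   # 6144000000 bytes used
print(memory_free)   # 2048000000 bytes free
print(percent_used)  # 75.0
print(cpu_load)      # 25.0
```

If the numbers you see in the wizard look off by a factor of a few thousand, a missing block-size multiplication like this one is a likely culprit.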
And there you have it!
But wait – I also mentioned that you can use Device Studio to fix those pesky devices that show as ‘Unknown’. If you do have devices that show up with the Vendor as Unknown, we’d still like to hear about them so that we can match them natively. If you’d like to fix this yourself without waiting for the next release, though, you can do it in Device Studio, and the same steps will correct any devices that respond as ‘NET-SNMP’ instead of the correct Vendor & MachineType.
Most of the steps will be the same as above – you’ll just be creating a ‘Node Details’ Poller instead.
When you define the Data Sources for Node Details pollers, you’ll notice a lot of these are optional – but one is absolutely required: the SysObjectID.
The SysObjectID returns an OID that references another part of the MIB database (usually the Vendor’s MIBs) and can be used to identify both the Vendor and the Model of the device. It’s quite rare that this one isn’t supported by a device, so try to let Orion poll the SysObjectID automatically if at all possible. If the device doesn’t support this OID, you can use a constant value instead, and manually define the OID that should have been returned by the device.
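To see why the SysObjectID is enough to identify a vendor: vendor-specific OIDs live under the enterprises branch, 1.3.6.1.4.1, and the number that follows is the vendor's IANA-assigned enterprise number. The sketch below pulls that number out of a SysObjectID. It illustrates the idea only and is not how Orion performs its matching. The lookup table is a tiny hand-picked subset, and the function names and the Cisco product suffix are made up for the example.

```python
# The "enterprises" branch: iso.org.dod.internet.private.enterprises
ENTERPRISES_PREFIX = (1, 3, 6, 1, 4, 1)

# A tiny illustrative subset of IANA private enterprise numbers.
KNOWN_ENTERPRISES = {
    9: "Cisco",
    2636: "Juniper Networks",
    8072: "net-snmp",
}


def enterprise_number(sys_object_id):
    """Return the enterprise number from a SysObjectID, or None if absent."""
    parts = tuple(int(p) for p in sys_object_id.strip(".").split("."))
    prefix_len = len(ENTERPRISES_PREFIX)
    if len(parts) <= prefix_len or parts[:prefix_len] != ENTERPRISES_PREFIX:
        return None
    return parts[prefix_len]


def vendor_hint(sys_object_id):
    """Map a SysObjectID to a vendor name, falling back to 'Unknown'."""
    return KNOWN_ENTERPRISES.get(enterprise_number(sys_object_id), "Unknown")


# The first value is the SysObjectID a stock net-snmp agent reports on Linux;
# the Cisco product suffix in the second is a made-up example.
print(vendor_hint(".1.3.6.1.4.1.8072.3.2.10"))  # net-snmp
print(vendor_hint("1.3.6.1.4.1.9.1.9999"))      # Cisco
print(vendor_hint("1.3.6.1.2.1.1.2.0"))         # Unknown
```

This is also why some devices respond as ‘NET-SNMP’: they run a stock net-snmp agent that reports its own enterprise number rather than the hardware vendor's.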
Now, with the required OIDs done and out of the way, you can move on to fixing that Vendor = Unknown problem, and that part is quick and simple. Set the Constant Value to the text string you want it to report for both the Vendor and the MachineType.
So, there you have it – a great way to clean up those ‘Unknown’ devices, and take care of the devices that respond with incorrect information, all in one place.