I wanted to post an idea here to see if anyone has either done this or can foresee issues with it.
Our environment has 6 polling engines in two data centers supporting a LAN environment. So, the polling engines are pretty much in the same location on the network: different physical locations but the same subnets. This being the case, there was no real benefit to sectioning our environment and dedicating a section to each polling engine. What we have strived to do is have each polling engine monitor all areas of our environment, so each polling engine is monitoring some of the devices in a given building. If I lose a polling engine or need to take one out of service, I'm still monitoring over 80% of a building, so if something happens in the building, odds are we'll still see it. As you can imagine, it is hard to keep track of this by hand. It is also hard to move nodes around if I lose a polling engine and to put them back when the polling engine returns to service.
My idea is to use the SQL MOD function to create 10 groups of nodes based on the last digit of each node's NodeID. If the NodeID is 123, the node will be in group 3. I will assign the groups to the polling engines. For example, one polling engine will be monitoring all nodes with a NodeID ending in 3 or 5. I will then use a SQL request to set the polling engine for the nodes. I could do this one group at a time or for all the groups in the environment. I imagine that, to avoid load issues, I should only change one group at a time.
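To make the grouping idea concrete, here is a rough Python sketch of it, not a tested procedure. The engine IDs 1 through 6, the `GROUP_TO_ENGINE` mapping, and the `Nodes` table with `NodeID` and `EngineID` columns are all assumptions for illustration; check them against your own database before running anything. Note that SQL Server writes modulo as `%` rather than `MOD()`.

```python
# Hypothetical mapping of last-digit groups (NodeID % 10) to polling engine IDs.
# Engine IDs 1..6 are placeholders; groups 3 and 5 share an engine, as in the example.
GROUP_TO_ENGINE = {0: 1, 6: 1, 1: 2, 7: 2, 2: 3, 8: 3, 3: 4, 5: 4, 4: 5, 9: 6}


def node_group(node_id: int) -> int:
    """Return the group number, i.e. the last digit of the NodeID."""
    return node_id % 10


def engine_for(node_id: int, mapping: dict = GROUP_TO_ENGINE) -> int:
    """Return the polling engine a node would be assigned to."""
    return mapping[node_group(node_id)]


def update_statement(group: int, engine_id: int) -> str:
    """Build the UPDATE for one group (assumed table and column names)."""
    return (
        f"UPDATE Nodes SET EngineID = {int(engine_id)} "
        f"WHERE NodeID % 10 = {int(group)};"
    )


if __name__ == "__main__":
    # Print one statement per group so they can be reviewed and run one at a time.
    for group in sorted(GROUP_TO_ENGINE):
        print(update_statement(group, GROUP_TO_ENGINE[group]))
```

Taking an engine out of service then becomes a matter of editing the mapping so its groups point at other engines and re-running only the statements for those groups.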
Please let me know if this post should have been made in a different forum. Thank you.
The problem I'm having isn't managing the load but managing the coverage. I need to ensure that a given building or region of our environment is not polled by just one or two polling engines. In fact, the more polling engines that have a piece of a given pie, the more resilient I am to issues with the polling engines.
So moving them en masse by, say, name doesn't work out well for our needs.
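The coverage requirement above can be checked before any nodes are moved. The sketch below assumes each node's building is available somehow (for example from a custom property); the sample NodeIDs, the building names, and the minimum of 3 engines are made up for illustration. It counts how many distinct polling engines would end up polling each building under a last-digit mapping.

```python
from collections import defaultdict

# Hypothetical last-digit group -> polling engine mapping (engine IDs are placeholders).
GROUP_TO_ENGINE = {0: 1, 6: 1, 1: 2, 7: 2, 2: 3, 8: 3, 3: 4, 5: 4, 4: 5, 9: 6}


def engines_per_building(nodes, group_to_engine):
    """Count the distinct engines polling each building.

    nodes is an iterable of (node_id, building) pairs.
    """
    coverage = defaultdict(set)
    for node_id, building in nodes:
        coverage[building].add(group_to_engine[node_id % 10])
    return {building: len(engines) for building, engines in coverage.items()}


def thin_buildings(nodes, group_to_engine, minimum=3):
    """Return buildings covered by fewer than `minimum` engines."""
    counts = engines_per_building(nodes, group_to_engine)
    return sorted(b for b, n in counts.items() if n < minimum)


if __name__ == "__main__":
    # Made-up sample data: (NodeID, building)
    sample = [(120, "A"), (121, "A"), (122, "A"), (133, "B"), (145, "B")]
    print(engines_per_building(sample, GROUP_TO_ENGINE))
    print(thin_buildings(sample, GROUP_TO_ENGINE))
```

A check like this matters because the last digit of a NodeID says nothing about location, so a small building could by chance land mostly on one engine.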
This is a very cool script. I worry a little bit about not knowing where the nodes are going and about ensuring the coverage distribution. There is also the problem of taking polling engines in and out of service and having to shuffle the nodes.
Under the Manage Nodes screen you can reassign them en masse by simply selecting them and assigning them to an engine. Usually if I have one that is sick, I move all the nodes off of it until it's fixed.
In the Manage Nodes screen set your Group By to Polling Engine then click on a polling engine, select all nodes, then More Actions, Change Polling Engines and move them all at once.
Moving them like that is what we do today. Keeping the coverage for any given area of my environment spread across several polling engines for resilience can be challenging to keep up with (~3500 nodes on 6 polling engines).