Open for Voting
over 1 year ago

Auto Load-balance of devices and other functions.


Quick Edit:

This is a rough draft below of an idea I thought to be valuable. I'm accepting input below from anyone who wishes to contribute. I would like to mature this idea in hopes of catching the eyes of the product managers and/or developers at SolarWinds. I'm hoping to get some traction on this if the community thinks it is useful.

Idea below:

This idea stems from having a large environment and not having an easy way to load balance jobs and devices across all pollers.

1. When you run a discovery, it always defaults to a single poller. Discovery should instead be able to add devices across pollers depending on poller load. That way, anyone who does not change the poller before starting the discovery won't overload a single server. While working through many discoveries, I have often forgotten to change the poller myself, which caused everything to be added to my primary server and overloaded it. I think discovery should have two modes: an auto mode that distributes the devices evenly across the pollers, and a manual mode where you choose the poller you want to add the devices to. This would make adding devices easier, because they would be evenly distributed among the pollers instead of crashing one server with too many devices to monitor.

2. It would be good if this could be implemented with other modules taken into consideration. For example, right now I have one poller with 250 devices, which is a low device count. But once you add templates, interfaces, volumes, etc., one can easily see that even with a small node count you can still overload a server without knowing it.
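To make points 1 and 2 more concrete, here is a rough Python sketch of what an auto mode could do. It is only an illustration under my own assumptions: the `Device` and `Poller` classes, the `auto_assign` function, and the idea of counting every node, interface, volume, and component as one element are all made up for this example and are not part of any SolarWinds API.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A discovered device (hypothetical model, not an Orion object)."""
    name: str
    interfaces: int = 0
    volumes: int = 0
    components: int = 0  # e.g. application template components

    @property
    def elements(self) -> int:
        # Assumption: the node itself counts as one element.
        return 1 + self.interfaces + self.volumes + self.components


@dataclass
class Poller:
    """A polling engine with its current element count (hypothetical)."""
    name: str
    load: int = 0
    devices: list = field(default_factory=list)


def auto_assign(devices, pollers):
    """Greedy placement: heaviest devices first, each on the least-loaded poller."""
    for device in sorted(devices, key=lambda d: d.elements, reverse=True):
        target = min(pollers, key=lambda p: p.load)
        target.devices.append(device.name)
        target.load += device.elements
    return pollers


# Example: the primary is already busy, so new devices go to the other engines.
pollers = [Poller("primary", load=900), Poller("ape-1", load=200), Poller("ape-2", load=250)]
discovered = [
    Device("core-sw", interfaces=400),
    Device("db-01", volumes=6, components=40),
    Device("edge-rtr", interfaces=20),
]
auto_assign(discovered, pollers)
for p in pollers:
    print(p.name, p.load, p.devices)
# primary 900 []
# ape-1 601 ['core-sw']
# ape-2 318 ['db-01', 'edge-rtr']
```

Balancing on element count rather than node count is the point of item 2: a single switch with 400 interfaces weighs far more than a small server, so counting nodes alone would hide the real load.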

We have an element count near 100k, and it is growing quickly. Because there is no inventory list telling me which node is on which poller and what is being monitored, my environment is all over the place. That means I have to take the time to go through it and even things out, which will not be an easy task at all.

I'm sure those with large infrastructures will appreciate this idea, and I'm hoping people will contribute more to it. It is something I think could potentially bring even more value to SolarWinds.

This is a tool and/or function already offered in some competitor software. I think having such a tool in place would put SolarWinds ahead of them.

  • This feature would be a vast improvement; it's got my upvote. I posted a similar, and in many ways duplicate, feature request here:

    Going to link to this feature request from mine as well; the more visibility, the better.

  • So I've learned that balancing can also be done with a distributed environment instead of a centralized one (which is what we have). What sucks is that discovery still defaults to one server. And conveying this to users is difficult because they don't know which poller to pick when adding devices. Unfortunately, not everyone sends me their requests to add devices, or this could have been avoided.

    Moving devices from poller to poller is the easy part of load balancing. When you also have to consider IPAM, SAM templates, etc., load balancing can be quite a challenge. For example, your node count will be balanced, but one poller will be over budget on component count. Then you move devices and end up with an uneven node count across the pollers.

    This is why I opened this idea for discussion. For the past 3 years we've attempted several different solutions for better organization, but they all failed, either due to a lack of native tools or due to Orion's inability to accept certain customizations.

  • We have some devices that our network folk say can't point to more than one poller. There is also one set of devices whose owners don't want to open firewalls to more than one poller. This is a small minority of our devices, though. For this reason, the feature should include the ability to lock some devices to specific pollers or sets of pollers.

  • Right. The point is not to strictly automate the process, but rather to offer an alternative to the current option that allows automation if the environment needs it.

    We had a problem where devices stopped polling when we moved them to a different poller, but it turned out to be an internal problem. The network guys took a look and figured out it was a routing issue. That fixed that.

    But that still left us with no way to auto-balance. Manually balancing close to 100k elements will take us a long time. Node-wise, we have around 4,500 nodes in monitoring, and that is growing on an almost weekly basis.

    With growing pains, we are finding the weak areas of the software. What made it tough for us was finding feature requests with hundreds of upvotes that were started years back and are still not implemented. One would think that if a feature request is getting hundreds of upvotes, it's a hot item.

    This is more of an enhancement than it is a feature request. One that would make a world of difference to us.

  • I've written scripts to handle balancing via the API, but it would definitely be nice to have Orion do this natively. The tricky part is when certain devices can only be reached from certain pollers, but being able to create poller groups and associate a discovery with a group would get around that easily. That then builds toward automatic failover to another APE in the event that a poller stops working as expected.

  • Bump... People aren't seeing how great this feature can be.  But I'm hoping to catch the eye of a product manager or developer. Would be nice to have this implemented.

  • Think about it for a second. You click discover. You have a list of, let's say, 300 devices. The discovery starts running, with SolarWinds automatically assigning the nodes to the engine with the lowest load. There would never be a need to worry about balancing load across servers again.

    Let's take the idea even further and say that SolarWinds runs a daily scan, finds that an engine is getting overloaded, and, kind of like database maintenance, automatically moves devices over. Now, if you have critical devices that need to stay on a specific poller, why not also include an option to flag a device so that SolarWinds will leave it alone?

    I've heard of people deploying F5 load balancers and other solutions to make this possible. But why not do it at the software level and keep it easy to use for all of us administrators? SolarWinds is a master at what it does, but at the same time I've noticed it lacks a lot of these simple features that you can find in other solutions that do not monitor devices in the same capacity that SolarWinds does.

    Features such as these would make the software more valuable than the competition and propel it to a higher position in the standings. Not only that, it would be one less thing we all need to worry about. I could change my SOPs so they no longer have to cover selecting a poller, for example.

    It could even include all the other elements in the equation and balance the workload perfectly.

    Any thoughts or opinions? I'm sure that with more contribution this could be an even better feature request.
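
To give the discussion something concrete to poke at, here is a rough Python sketch of the scheduled rebalance pass described in this thread, including the flag that tells the balancer to leave a device alone and the restriction of a device to a set of pollers. Everything in it is an assumption for illustration: the `Node` class, the `pinned` and `allowed` fields, and the `rebalance` function are hypothetical names, and nothing here calls a real SolarWinds API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A monitored node (hypothetical model, not an Orion object)."""
    name: str
    elements: int                     # node + interfaces + volumes + components
    poller: str                       # engine currently polling this node
    pinned: bool = False              # flagged: never move automatically
    allowed: frozenset = frozenset()  # empty set means "any poller"


def poller_loads(nodes, pollers):
    """Total element count per poller."""
    loads = {p: 0 for p in pollers}
    for n in nodes:
        loads[n.poller] += n.elements
    return loads


def rebalance(nodes, pollers, max_moves=100):
    """Move unpinned nodes off the busiest poller while that lowers the peak."""
    moves = []
    loads = poller_loads(nodes, pollers)
    for _ in range(max_moves):
        busiest = max(loads, key=loads.get)
        best = None  # (resulting peak of the two pollers, node, target)
        for n in nodes:
            if n.poller != busiest or n.pinned:
                continue
            targets = [p for p in pollers
                       if p != busiest and (not n.allowed or p in n.allowed)]
            if not targets:
                continue
            target = min(targets, key=loads.get)
            peak = max(loads[busiest] - n.elements, loads[target] + n.elements)
            if peak < loads[busiest] and (best is None or peak < best[0]):
                best = (peak, n, target)
        if best is None:
            break  # nothing movable would improve the balance
        _, node, target = best
        loads[busiest] -= node.elements
        loads[target] += node.elements
        node.poller = target
        moves.append((node.name, busiest, target))
    return moves, loads


pollers = ["primary", "ape-1", "ape-2"]
nodes = [
    Node("core-sw", 400, "primary", pinned=True),
    Node("db-01", 300, "primary"),
    Node("fw-dmz", 200, "primary", allowed=frozenset({"primary", "ape-1"})),
    Node("web-01", 100, "primary"),
    Node("edge-rtr", 50, "ape-1"),
]
moves, loads = rebalance(nodes, pollers)
print(moves)
# [('db-01', 'primary', 'ape-2'), ('fw-dmz', 'primary', 'ape-1'), ('web-01', 'primary', 'ape-1')]
print(loads)
# {'primary': 400, 'ape-1': 350, 'ape-2': 300}
```

In this example the pinned `core-sw` never leaves the primary, and `fw-dmz` only moves to `ape-1` because that is the only other poller it is allowed on. A real implementation would also have to deal with reachability checks and polling gaps during the move, which this sketch ignores.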