Open for Voting
over 1 year ago

Additional Pollers - Automatic Load Distribution

Hi there.

We recently added additional pollers to our SolarWinds system, and I was shocked to find that you have to manually allocate nodes to specific polling engines.

My feature request is for the OPTION of automatic load balancing across additional pollers in an upcoming release. Some thought should go into how the load distribution works, and we still need to retain the option to assign nodes manually (in my opinion there are corner cases where you want manual control over certain nodes).
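
To make the request concrete, here is a minimal sketch of one possible distribution policy: a greedy "heaviest node to the least-loaded engine" pass that leaves manually pinned nodes where they are. The function name, the `pinned` mapping, and the idea of a per-node load weight are all assumptions for illustration, not anything SolarWinds exposes.

```python
def distribute_nodes(node_weights, engines, pinned=None):
    """Assign nodes to the least-loaded engine while honoring manual pins.

    node_weights: dict of node name -> estimated polling load
                  (hypothetical unit, e.g. number of polled elements)
    engines:      list of polling engine names
    pinned:       dict of node name -> engine name for manually assigned nodes
    """
    pinned = pinned or {}
    load = {engine: 0 for engine in engines}
    assignment = {}

    # Manual assignments are applied first and count toward engine load.
    for node, engine in pinned.items():
        if engine not in load:
            raise ValueError(f"unknown engine for pinned node {node}: {engine}")
        assignment[node] = engine
        load[engine] += node_weights[node]

    # Greedy heuristic: place the heaviest remaining node on the
    # currently least-loaded engine (ties broken by name).
    remaining = [n for n in node_weights if n not in pinned]
    for node in sorted(remaining, key=lambda n: (-node_weights[n], n)):
        target = min(engines, key=lambda e: (load[e], e))
        assignment[node] = target
        load[target] += node_weights[node]

    return assignment, load
```

Pinned nodes cover the corner cases where manual control matters; everything else is balanced around them.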

Thank you.

  • The most commonly observed issue with our external pollers is a process like svchost.exe or w3wp.exe using excessive system resources, so that the poller stops processing work or starting new tasks. The poller itself rarely croaks.

    So the neat trick would be to monitor internal logs for specific errors and initiate a failover based on the error type or frequency; not just when the poller DB heartbeat starts failing or it dies outright.
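
A rough sketch of the "failover on error type or frequency" idea above: count log lines that match a set of error patterns inside a sliding time window and signal a failover once a threshold is reached. The class name, the patterns, and the thresholds are hypothetical; real poller log formats would need their own patterns.

```python
import re
from collections import deque


class ErrorRateTrigger:
    """Signal a failover when enough matching errors occur within a window."""

    def __init__(self, patterns, threshold, window_seconds):
        self.patterns = [re.compile(p) for p in patterns]
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.hits = deque()  # timestamps of matching log lines

    def observe(self, timestamp, line):
        """Feed one log line; return True if a failover should be initiated.

        Timestamps are assumed to be seconds and non-decreasing.
        """
        if any(p.search(line) for p in self.patterns):
            self.hits.append(timestamp)
        # Drop matches that have aged out of the sliding window.
        while self.hits and timestamp - self.hits[0] > self.window_seconds:
            self.hits.popleft()
        return len(self.hits) >= self.threshold
```

For example, `ErrorRateTrigger([r"job .* timed out"], threshold=3, window_seconds=60)` would only fire after three matching lines arrive within a minute, which avoids failing over on a single transient error.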

  • This really cannot be that hard to achieve in some form. I am working on a solution for automatic polling engine failover right now.

    Seems like an alert could be set up with an action:

    Alert:

    Node selection: all nodes assigned to engine 5

    Alert trigger: engine 5 services, interface, or whatever you want, not working

    Alert action: execute something (script action, SQL update, whichever is the best solution) to move the affected nodes to engine 11. Only one value in the DB has to change.

    Here is what I have now:

    Current alert suppression using dependencies, as it works right now:

    Custom property Area (CP_area): value = poller1, poller2, poller3, etc.

    Group with dynamic query: if CP_area = poller1, assign to group 1 (and so on).

    Dependency: all members of group 1 are dependent on the poller1 interface being operational.

    This solves the problem of sending off 1000 alerts when someone accidentally disables the interface, but I think being able to assign members to a group based on their engineID value would be more direct and more efficient.

    I just need a way to use engineID in a dynamic query, and to use engineID as the node selection criterion in an alert that can execute the action to change the engineID on the nodes.

    For example:

    Group1 = all nodes assigned to poller1.

    Group1 is dependent on poller1 being operational.

    Alert:

    All nodes on poller1.

    If poller1 is down, execute an action to move the nodes to poller3.

    This way, once the nodes are moved, they also move to group 3 and alerting resumes automatically, but nothing alerts while poller1 is down and the nodes are still assigned to poller1.

    So the questions are: is it possible to assign nodes to a group based on their actual assigned polling engine?

    Can an alert's object selection criteria be based on the assigned polling engine?

    Can a SQL update or something similar be created as an action?
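
The "only have to change one value in the DB" idea above can be sketched as follows. This runs against an in-memory SQLite table with made-up names (`nodes`, `node_id`, `engine_id`) purely for illustration; it is not the actual Orion schema, and whether an alert action can run such an update is exactly the open question.

```python
import sqlite3


def nodes_on_engine(conn, engine_id):
    """Return node ids assigned to an engine (the 'group by engine' query)."""
    rows = conn.execute(
        "SELECT node_id FROM nodes WHERE engine_id = ? ORDER BY node_id",
        (engine_id,),
    )
    return [row[0] for row in rows]


def move_nodes(conn, from_engine, to_engine):
    """Reassign every node on from_engine to to_engine; return the count moved."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE nodes SET engine_id = ? WHERE engine_id = ?",
            (to_engine, from_engine),
        )
    return cur.rowcount


def demo_connection():
    """Build a throwaway in-memory table standing in for the node list."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, engine_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)", [(1, 1), (2, 1), (3, 3), (4, 2)]
    )
    conn.commit()
    return conn
```

Calling `move_nodes(conn, 1, 3)` on the demo data moves nodes 1 and 2 to engine 3, after which a group defined by `nodes_on_engine(conn, 3)` picks them up automatically.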

  • Having the ability to designate a primary/backup poller would be great. I've had instances where my additional polling engine stopped and I did not know about it for far too long. The ability for a backup or secondary poller to take over automatically on failure would be a great addition.

  • ctmidnight oh trust me, I hear you there. I've been burned by MS as well (most of us have). We are still testing SCOM 2012 R2 in our environment, and I am still highly critical of its so-called dashboard and monitoring capabilities. Personally, I am not impressed (not yet, anyway). I don't get the warm fuzzies when I hear the term "SCOM". But one thing is true: there are pros and cons that must be weighed with every solution out there, even with SolarWinds.