Three questions about UDT polling intervals and pollers

For those of you who use UDT:

I just added UDT to my environment, and I like some of the new things we're learning through it.  But I regularly get two errors from it:

pastedImage_0.png

I've looked up the errors and it appears UDT needs either more resources, more pollers, or more time to complete its polling.

pastedImage_1.png

I've doubled the polling interval allowed, and still the error appears.

Question 1:  What polling intervals do you use to successfully complete polling?

Question 2:  How many nodes do your individual UDT pollers handle?

Question 3:  What physical resources / components (I'm assuming the manual is referencing memory and CPU, perhaps from a VM point of view) do your UDT pollers have installed?

It's OK if you only answer one or two of these questions.  Please guide me with your UDT experience!


17 Replies
Are you guys referring to running UDT standalone?  According to SolarWinds' published scalability KB, UDT can poll up to 100k ports per poller.

Scalability Engine Guidelines for SolarWinds Orion Products - SolarWinds Worldwide, LLC. Help and Su...

pastedImage_0.png

We're using the integrated solution, deployed across seven pollers--not the stand-alone solution.

Yes, we applied that fix - that fix addressed the joy of adding all devices we had into UDT management.

We have an open case in which we are unable to change our default polling interval for UDT.  It is "Stuck" at 30 min.

I have asked Tech Support to look at your case and see if your buddy drop addresses this known bug.

Thanks,

I trust you've already applied the UDT Patch:

pastedImage_0.png

We uninstalled UDT for a while.  Yesterday we reinstalled it, and we are slowly bringing it back online.

Nothing worse than a complete outage for a week with little to no response from development.....

Thanks for your information.

After opening a Support Case with SW, I learned there's yet another Hot Fix and associated procedure to remedy the UDT polling errors I'm seeing.  I've not yet applied it, but I hope to within the next few days.  Fingers will be crossed that I can go back to default polling settings, and that the Fix will be just that.

Is this a new hotfix? I only have UDT 3.3 hotfix 1 in my customer portal.

This "hot fix" turns out to be a "buddy dump" where Support sent me a link to download a new UDT dll file.  The installation process was simple:

1. Stop all Solarwinds services on the poller

2. Move the original .dll file out of the normal location and keep it elsewhere for possible future restoration

3. Copy the new .dll file to the indicated directory

4. Start all Solarwinds services

5. Repeat on all pollers
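For anyone who would rather script steps 2 and 3 than drag files around by hand, here is a minimal sketch of the back-up-then-replace part. It makes some assumptions: the function name `swap_dll` and all paths are hypothetical (Support supplies the real file name and directory), and it deliberately does not stop or start any services, so steps 1 and 4 still have to be done around it.

```python
import shutil
from pathlib import Path


def swap_dll(installed: Path, replacement: Path, backup_dir: Path) -> Path:
    """Move the installed file into backup_dir, then copy the replacement
    into the original location. Returns the path of the backup copy.

    Hypothetical helper: the caller must stop the services first and
    restart them afterwards.
    """
    if not installed.is_file():
        raise FileNotFoundError(installed)
    if not replacement.is_file():
        raise FileNotFoundError(replacement)

    backup_dir.mkdir(parents=True, exist_ok=True)
    backup_path = backup_dir / installed.name
    if backup_path.exists():
        # Never clobber an earlier backup; it may be the only original left.
        raise FileExistsError(f"backup already exists: {backup_path}")

    shutil.move(str(installed), str(backup_path))   # step 2: keep the original
    shutil.copy2(str(replacement), str(installed))  # step 3: drop in the new file
    return backup_path
```

Restoring later is the same call with the backup as the `replacement` argument and a fresh `backup_dir`.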

Now that I've done this, Support asked to be notified, and then they'll send me the next step(s).  I'm not sure why they don't include them all with the original e-mail . . .

What's the buddy drop number? I'd like to ask my support tech about it.

Nick, you can have them reference Case Ticket 1353740.

Did this buddy drop resolve your ability to change polling times?

The root cause appears to be number of devices being polled by UDT versus the number of pollers.

Using the default polling interval, a poller can monitor only 3000 UDT devices/IP addresses.  My confusion came from that number--I was thinking it was referencing "nodes", not UDT-polled-devices.  Since I have five pollers and only 800 nodes, I thought I was good to go.

Not so with UDT!

It turns out I have about 53,000 active devices.  At 3000 devices per poller, I must increase my poller count to at least 18.  Yes, it was a shock to me, too.
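If you want to check the arithmetic against your own numbers, here is a minimal sketch. The only inputs are the figures from this thread (about 53,000 active devices, 3000 devices per poller at the default interval); the function name `pollers_needed` is just mine for illustration.

```python
import math


def pollers_needed(active_devices: int, devices_per_poller: int) -> int:
    """Smallest whole number of pollers whose combined capacity covers
    every UDT-polled device."""
    if devices_per_poller <= 0:
        raise ValueError("devices_per_poller must be positive")
    return math.ceil(active_devices / devices_per_poller)


if __name__ == "__main__":
    # 53,000 / 3,000 = 17.67, which rounds up to 18 pollers.
    print(pollers_needed(53_000, 3_000))
    # Five pollers at 3,000 devices each cover only 15,000 devices.
    print(5 * 3_000)
```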

The alternative is to change the polling interval and risk having stale/obsolete data in User Device Tracker.  The question becomes "Which is worse--having 18 UDT pollers, or having unreliable data in UDT?"
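To put a rough number on that trade-off, here is a back-of-the-envelope sketch. It rests on a big assumption that nobody in this thread has confirmed: that the number of devices a poller can handle grows in proportion to the polling interval. The 30-minute default is the figure mentioned in another reply here, and `interval_needed` is a hypothetical helper name.

```python
def interval_needed(active_devices: int, pollers: int,
                    devices_per_poller: int, default_interval_min: float) -> float:
    """Estimate the polling interval (minutes) needed for the given pollers
    to cover all devices.

    ASSUMPTION: capacity scales linearly with the interval. Treat the
    result as a rough guide, not a vendor-backed figure.
    """
    if pollers <= 0 or devices_per_poller <= 0:
        raise ValueError("pollers and devices_per_poller must be positive")
    capacity = pollers * devices_per_poller
    # Never suggest an interval shorter than the default.
    scale = max(1.0, active_devices / capacity)
    return default_interval_min * scale


if __name__ == "__main__":
    minutes = interval_needed(53_000, 5, 3_000, 30)
    print(f"about {minutes:.0f} minutes between polls")
```

With five pollers that works out to well over an hour and a half between polls, which is where the stale-data worry comes from.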

Fortunately I purchased NAM this past fall, which enables me to install twenty pollers without any upcharge at all!

Unfortunately, it becomes a management headache, reassigning nodes to pollers, and changing those nodes' syslog settings and NetFlow settings and traps.

Because I'd want every node sending all syslogs and NetFlow data and traps to the same poller that manages them for UDT--wouldn't you?

I'm just trying to decide what's easier or harder.  NCM can make simple work of changing NetFlow and syslog settings in bulk on switches & routers--assuming you've used good conventions and identical routing/management/reporting interfaces on the nodes.

Do I really want to approach my System Admins and ask for that many more pollers?  The jury is still out on that.  I have Infoblox for my DHCP solution, and it offers a module (for another $50K, I suppose) that will do the same network discovery that UDT does, and will list the switches & ports every device is attached to, sorting by end device's IP address or MAC address or FQDN.

I'm sorry that UDT has this limitation, since I've already done the homework to be able to mine its database and pull the UDT information from Solarwinds Orion SQL and import it into our LANDesk Web Desk CMDB.  Will Infoblox's "discovery" module be as easily mined & exported?  I don't know.

Yes.

It was in the PreRelease 17 BD.

Now that it's deployed, Support just wrote me these instructions to proceed:

"

For the next step, you will need to contact your Database administrator.  Try running the SQL queries below.

Delete from udt_nodecapability where nodeid in (select nodeid from nodesdata where objectsubtype = 'WMI')

Delete from udt_nodecapability where nodeid in (select nodeid from nodesdata where objectsubtype = 'Agent')

Delete from udt_nodecapability where nodeid in (select nodeid from nodesdata where vendor = 'Windows')

Afterwards:

Delete from udt_job where nodeid not in (select nodeid from udt_nodecapability)

"
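To see what those four statements do before letting a DBA run them for real, here is a small dry run against an in-memory SQLite database. Everything about the setup is an assumption for illustration: SQLite stands in for the Orion SQL Server database, the three tables carry only the columns the queries mention, and the four sample nodes are made up. The point is the order: the last delete only removes jobs for nodes that the first three already stripped of their UDT capability.

```python
import sqlite3

# The four statements from Support, in the order given.
CLEANUP = [
    "DELETE FROM udt_nodecapability WHERE nodeid IN "
    "(SELECT nodeid FROM nodesdata WHERE objectsubtype = 'WMI')",
    "DELETE FROM udt_nodecapability WHERE nodeid IN "
    "(SELECT nodeid FROM nodesdata WHERE objectsubtype = 'Agent')",
    "DELETE FROM udt_nodecapability WHERE nodeid IN "
    "(SELECT nodeid FROM nodesdata WHERE vendor = 'Windows')",
    "DELETE FROM udt_job WHERE nodeid NOT IN "
    "(SELECT nodeid FROM udt_nodecapability)",
]


def run_cleanup(conn: sqlite3.Connection) -> list:
    """Run the statements in one transaction; return rows deleted by each."""
    counts = []
    with conn:  # commits if all succeed, rolls back if any statement raises
        for statement in CLEANUP:
            counts.append(conn.execute(statement).rowcount)
    return counts


def demo():
    """Build a toy schema (hypothetical, minimal columns) and run the cleanup."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nodesdata (nodeid INTEGER PRIMARY KEY,
                                objectsubtype TEXT, vendor TEXT);
        CREATE TABLE udt_nodecapability (nodeid INTEGER);
        CREATE TABLE udt_job (nodeid INTEGER);
        INSERT INTO nodesdata VALUES
            (1, 'SNMP', 'Cisco'), (2, 'WMI', 'Windows'),
            (3, 'Agent', 'Linux'), (4, 'SNMP', 'Windows');
        INSERT INTO udt_nodecapability VALUES (1), (2), (3), (4);
        INSERT INTO udt_job VALUES (1), (2), (3), (4);
    """)
    counts = run_cleanup(conn)
    remaining = [row[0] for row in
                 conn.execute("SELECT nodeid FROM udt_job ORDER BY nodeid")]
    conn.close()
    return counts, remaining


if __name__ == "__main__":
    print(demo())
```

In the toy data only the SNMP-polled Cisco node keeps its UDT job; the WMI, Agent, and Windows nodes lose theirs. On the real database, a `SELECT COUNT(*)` with the same `WHERE` clauses is a cheap way to see how many rows each delete would touch before running it.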

Question 1:  What polling intervals do you use to successfully complete polling?

    

Question 2:  How many nodes do your individual UDT pollers handle?

      Total nodes is 2,400. However, we only focus on about 1,400 of them and 3,200 ports.   

Question 3:  What physical resources / components (I'm assuming the manual is referencing memory and CPU, perhaps from a VM point of view) do your UDT pollers have installed?

     Physical app and DB servers. Windows/SQL Server 2016. 64GB RAM, bucketload of CPUs. But we also run SAM, NCM, NTA, Patch Mgr, LEM, IPAM, SRM, VNQM, VMAN.

Question 1:  What polling intervals do you use to successfully complete polling?

                         For our environment it's:  Layer 2 - 10 minutes, Layer 3 - 15 minutes, Domain Controllers - 20 minutes.

Question 2:  How many nodes do your individual UDT pollers handle?

                        

                         We have 120 nodes with 2300 ports being monitored.

Question 3:  What physical resources / components (I'm assuming the manual is referencing memory and CPU, perhaps from a VM point of view) do your UDT pollers have installed?

                           Our SolarWinds setup consists of 1 app server and 1 separate DB server. The app server is set to 4 vCPU with 16GB RAM; the DB server is set to 4 vCPU with 32GB RAM.

We have not had any errors so far, other than right after implementation, when it was populating all of the initial information. After that things have been fine. I actually reduced the polling intervals to those numbers, and it seems to be the sweet spot for our environment so far.

How many nodes are you monitoring?