I'm not convinced that the answers I got on case # 281834 are accurate so I'm going to ask the community for their thoughts.
I recently discovered a node that was up (green indicator on my map) but had no history on any of the graphs I looked at ... not even the ICMP graphs. I thought it was odd, so I put in a ticket to ask about it, and in the meantime I recreated the node. The new node started collecting historical information right away, so I knew the old node was broken; I just didn't know why.
It turns out that before I came to this company, it had been upgraded from Orion v9.x to v10.x, and a handful of nodes didn't get the new PollingEngineID. A great support tech gave me a query that set every PollingEngineID that wasn't 1 to 1, and suddenly the broken nodes started working.
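For anyone following along, the kind of fix described above can be sketched as follows. This is not the support tech's actual query: the table name `Nodes`, the `IP_Address` column, and the sample rows are hypothetical, and an in-memory SQLite database stands in for Orion's real database so the sketch can be run safely. Only the column name `PollingEngineID` and the "anything that isn't 1 becomes 1" rule come from the description above.

```python
import sqlite3

def reassign_polling_engine(conn, engine_id=1):
    """Point every node that is not on `engine_id` at `engine_id`.

    Returns the number of rows changed. Table and column names are
    assumptions for illustration, not Orion's confirmed schema.
    """
    cur = conn.execute(
        "UPDATE Nodes SET PollingEngineID = ? WHERE PollingEngineID <> ?",
        (engine_id, engine_id),
    )
    conn.commit()
    return cur.rowcount

# Throwaway in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Nodes (NodeID INTEGER PRIMARY KEY, "
    "IP_Address TEXT, PollingEngineID INTEGER)"
)
conn.executemany(
    "INSERT INTO Nodes VALUES (?, ?, ?)",
    [(1, "10.0.50.2", 1), (2, "10.0.50.3", 0), (3, "10.0.50.4", 7)],
)

changed = reassign_polling_engine(conn)
leftover = conn.execute(
    "SELECT NodeID FROM Nodes WHERE PollingEngineID <> 1"
).fetchall()
print(changed, leftover)
```

Note that a `<> 1` comparison does not match rows where the column is NULL, so a real cleanup would probably need to check for those separately. Anything like this should be tried against a backup first.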
But now I was faced with an unusual situation. I had two nodes, each with its own NodeID, but both pointing at the same IP address. I monitored them for a few days, and one node would go down briefly while the other would not, so I opened a new case.
While I waited for a reply, I decided to double poll another handful of nodes. I had nodes at 10.0.50.2 - 10.0.50.4 with nothing at 10.0.50.5, so I created an ICMP-only node for 10.0.50.5 and, once it was created, changed its IP to 10.0.50.2. I repeated the process until I had 10.0.50.2 - 10.0.50.4 double monitored. All eight nodes were collecting data, but only the original nodes would trigger alerts.
The first tech I spoke with told me that double monitoring this way was a bad idea. He told me, “Since you have the both nodes assigned to monitor the same IP address it is definitely creating a conflict and also it may be causing a mac flapping.” MAC flapping? Really? I've seen port flapping but never MAC flapping. I asked to escalate, and the next tech was at least a little better with her technobabble. She insisted that double polling wouldn't work from a single polling engine ... so I set up an eval copy of Orion to monitor my handful of nodes and deleted the duplicates I'd created.
Now when a node goes down, both copies of Orion alert me, but she still tells me that it's a bad idea to double poll. She said:
“You don’t want duplicates from same machine for a couple of reasons. You are sending data about the same IP address and going to the same database this can cause confusion for customer and database. The device can also block SNMP due to too many hits. If the node was to go down in a fast poll this would mean you would also have two fast polls on this device causing bottleneck of traffic and delay’s for other information to be wrote.”
I almost buy this, but if I run a set of pings to a device from two different DOS windows, they don't conflict with each other, so I think she's just blowing a bunch of smoke up my ... tail pipe.
My thought is that Orion sees each node as an individual NodeID. It doesn't care what that NodeID's IP Address is as long as it has a place to send pings and SNMPGet requests. So there's no reason that Orion can't monitor the same IP using two different NodeIDs. Testing seems to indicate otherwise, however, since my duplicate nodes wouldn't alert when they were on the same polling engine but they would when they were monitored by separate copies of Orion.
Can someone give me a believable reason why we can't double poll with a single copy of Orion? I don't want a single NodeID to ping more often than every 60 seconds, but if I had two nodes for the same IP, each polling every 60 seconds, and the two polls were evenly staggered, then in theory I should get data from one node or the other every 30 seconds. Triple or quadruple polling would give me an even finer picture of when a node goes down or comes back up.
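To make the arithmetic concrete, here is a rough sketch of the idea. It is a toy model, not how Orion actually schedules polls: the `PollNode` class, the helper functions, and the assumption that duplicate nodes can be offset evenly within the interval are all mine for illustration. It only shows that N nodes polling the same IP every 60 seconds, with evenly spread start offsets, would sample that IP every 60/N seconds.

```python
from dataclasses import dataclass

@dataclass
class PollNode:
    node_id: int      # hypothetical NodeID
    ip: str           # address being polled (shared by the duplicates)
    interval_s: int   # polling interval in seconds
    offset_s: float   # when this node's first poll fires

def staggered_nodes(ip, count, interval_s=60, first_id=1):
    """Build `count` duplicate nodes for one IP with evenly spread offsets."""
    step = interval_s / count
    return [PollNode(first_id + i, ip, interval_s, i * step) for i in range(count)]

def poll_times(nodes, horizon_s):
    """Merged, sorted times (seconds) at which any of the nodes polls."""
    times = []
    for node in nodes:
        t = node.offset_s
        while t < horizon_s:
            times.append(t)
            t += node.interval_s
    return sorted(times)

def gaps(times):
    """Spacing between consecutive polls."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

double = poll_times(staggered_nodes("10.0.50.2", 2), 180)
print(double)             # [0.0, 30.0, 60.0, 90.0, 120.0, 150.0]
print(set(gaps(double)))  # {30.0}

quad = poll_times(staggered_nodes("10.0.50.2", 4), 120)
print(set(gaps(quad)))    # {15.0}
```

The catch is the stagger. If both nodes happen to poll at the same moment (offset 0 for each), the merged schedule still has a 60-second gap between samples, so the benefit depends entirely on whether the poller spreads the duplicates apart.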
What are -your- thoughts?