Hope this does not come off too dumb but here is an issue that has been KILLING us.
Late last night I finally figured out the cause, and I believe I have found a solution but I wanted to get some input from the community before I start pushing out that change.
The effort will be quite significant to say the least.
So here is what I am trying to accomplish:
We have various web promotion pages that reside within our main URL.
Say our main is www.helloworld.com.
Currently we have the following promotion pages monitored.
The main page sits behind an F5 and is served by a four-server back-end web environment.
The main page was added as a node with a fixed IP address that always resolves to the main VIP of the site.
Due to our F5 configuration, I am unable to reach the web page on each individual server by IP. That would be the ideal way to monitor, but per our network team it is not possible at this time.
So the node monitor points to www.helloworld.com.
I then added each promo page as a single HTTP content monitor on this parent node. I did not combine the URLs into a single template with multiple HTTP content component monitors, so that I could easily unmanage each site when necessary.
This worked fine for several months. The number of sites kept growing, and today we have about 25 HTTP content monitors (one per site), all attached to the parent URL.
For the past few months we have been seeing bogus alarms that last a single poll cycle, about every 20 minutes, like clockwork. We have been digging through event logs and perf counters the whole time trying to find the cause, and found nothing conclusive.
A month ago I set up identical monitors from another data center across the country, and we saw very similar, if not identical, behavior with the false alarms.
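For anyone who wants to confirm the "every 20 minutes" pattern from exported alert times rather than by eye, here is a minimal Python sketch. The function names, the timestamp format, and the sample values are all my own assumptions for illustration; they are not taken from our real alert history.

```python
from datetime import datetime
from statistics import median


def alarm_intervals_minutes(timestamps):
    """Return the gaps, in minutes, between consecutive alarm times."""
    times = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


def looks_periodic(intervals, tolerance_minutes=2.0):
    """True if every gap is within the tolerance of the median gap."""
    if not intervals:
        return False
    mid = median(intervals)
    return all(abs(gap - mid) <= tolerance_minutes for gap in intervals)


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    sample = [
        "2024-01-01 00:00:00",
        "2024-01-01 00:20:00",
        "2024-01-01 00:41:00",
    ]
    gaps = alarm_intervals_minutes(sample)
    print(gaps, looks_periodic(gaps))
```

If the gaps from both data centers line up on the same period, that would point at something on the polling side rather than at the sites themselves.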
So last night I tried something new. I took an external node (google.com) and added a single HTTP content application template to that node.
I then assigned each of the current HTTP content monitors to that node, but all within that single application template.
Since adding the monitors this way, the old monitors are still alarming about every 20 minutes as always, but the new monitors are running rock solid with no alerts recorded.
I know for sure the sites are online and stable because I have another monitoring tool double-checking them. So it seems that the way these monitors were added is the cause of our false alarms.
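In case it helps anyone reproduce the cross-check, below is a rough Python sketch of an independent HTTP content check. It is not the other tool I am using, just an illustration of the idea: fetch each page and confirm that an expected string appears in the body. The function names, the result strings, and any URLs or match text you feed it are assumptions.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return (status_code, body_text); status 0 means the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 503.
        return exc.code, ""
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return 0, ""


def check_pages(pages, fetcher=fetch):
    """pages maps URL -> text expected in the body.

    Returns a dict of URL -> "OK" or a short failure reason.
    """
    results = {}
    for url, expected in pages.items():
        status, body = fetcher(url)
        if status != 200:
            results[url] = f"DOWN (status {status})"
        elif expected not in body:
            results[url] = "DOWN (content mismatch)"
        else:
            results[url] = "OK"
    return results


# Example call (hypothetical promo path and match text):
# check_pages({"http://www.helloworld.com/promo1": "Promo One"})
```

Running something like this on the same schedule as the poller, and comparing its results against the alert times, is one way to confirm whether an alarm was real. The `fetcher` argument is only there so the check logic can be exercised without touching the network.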
My question is this: what is the correct way to add a large number of HTTP content monitors to a VIP? Using the method that works makes unmanaging individual pages a challenge, and makes it impossible to schedule an unmanage in advance.
I checked the admin guide but was not able to find an example of this exact situation. I may be missing something basic here. Any suggestions would be most welcome.
The image on the left shows what I added, which is working great.
The image on the right shows the old monitors, which are having false alarms at random.