Take a look at this picture. Do you find anything funny? We reboot our servers every Tuesday at 2 am. Is there a known issue with response time and the length of time the server(s) have been up?
Found the problem.
It appears that dual-core AMD processors have an issue with the sync between the cores that gets worse as uptime increases. Only a simple driver update is needed to fix it. I am going to test the driver update and then apply it in production. I will have to wait and see if it fixes the issue.
AMD's driver page.
www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_871_13118,00.html
I do see something funny.. Your images appear to be broken.
By broken what do you mean?
Fixed.
When the response time is 1000+ ms, have you tried pinging the node from a command prompt on the Orion server? I'm wondering if this is an Orion thing or a network thing. The only time I've ever seen anything like this was from a routing loop.
The chart is company-wide for one day, I believe. I do not understand how the response time could get worse as the number of days the server has been up grows. And at the day and time the servers reboot, the response time looks to "reset".
I agree that it's puzzling. Can you look through individual nodes in Orion for the same time period and determine if they too are seeing the same high response times?
This is one node. It is not as pronounced as on the total network, but you can see the trend.
What does your Orion configuration look like? Is this a dedicated Orion server? How many nodes & elements are you monitoring? What are the hardware specs for the server? Response time can be heavily influenced by a machine under heavy strain, for instance high CPU utilization or heavy network interface usage. You might even have a memory leak that's causing the problem. This is why I think it's important to determine if the problem is with Orion or just a general network/server issue. A good way of isolating the problem to one or the other is to run a continuous ping on a few of your nodes to see if you are seeing the same high response times as Orion.
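For anyone who wants to log the continuous ping rather than watch it, here is a minimal Python sketch. It assumes the system `ping` command is on the PATH and prints English output containing `time=12ms` or `time<1ms`. The function names (`parse_rtt_ms`, `ping_once`, `summarize`) are made up for illustration and are not part of Orion.

```python
import platform
import re
import subprocess

# Matches "time=12ms", "time<1ms" (Windows) and "time=0.045 ms" (Linux).
_RTT = re.compile(r"time[=<]\s*([\d.]+)\s*ms", re.IGNORECASE)

def parse_rtt_ms(line):
    """Return the round-trip time in ms from one line of ping output, or None."""
    match = _RTT.search(line)
    # "time<1ms" is reported as 1.0, i.e. an upper bound.
    return float(match.group(1)) if match else None

def ping_once(host, timeout_s=5):
    """Send a single echo request; return RTT in ms, or None on loss/error."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(["ping", count_flag, "1", host],
                                capture_output=True, text=True,
                                timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return None
    for line in result.stdout.splitlines():
        rtt = parse_rtt_ms(line)
        if rtt is not None:
            return rtt
    return None

def summarize(samples):
    """Summarize a list of RTT samples, where None means a lost reply."""
    ok = [s for s in samples if s is not None]
    return {
        "sent": len(samples),
        "lost": len(samples) - len(ok),
        "min_ms": min(ok) if ok else None,
        "avg_ms": sum(ok) / len(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }
```

Something like `summarize([ping_once("10.0.0.1") for _ in range(60)])` (the address is a placeholder) gives numbers to compare against what Orion reports for the same node over the same period.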
We have one Orion SLX and two SLY pollers. Each has two dual-core 2.8 GHz Opterons with 4 GB RAM. Total Elements, Nodes, Interfaces, Volumes are:
We have the additional web server. The DB is on our SAN behind IBM's SVC.
Do you know of any similar issue with dual-core Xeons?
This is actually not an AMD-specific issue. It occurs with any dual-core CPU with power management enabled. What happens is that each core's speed (frequency) is throttled down when full computing power is not required, and the clocks in each core get out of sync, causing problems with applications which rely on the clock for timing (such as ping).
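To picture why out-of-sync core clocks would show up as rising response time, here is a toy Python model. It is only a sketch: it assumes the send timestamp is read on one core and the receive timestamp on the other, and that the second core's clock runs ahead by a fixed amount per day of uptime. The drift rate and the true response time are invented numbers, not measurements.

```python
# Toy model only: every number here is a made-up assumption.

def core_offset_ms(uptime_days, drift_ms_per_day):
    """How far the second core's clock is ahead of the first after some uptime."""
    return uptime_days * drift_ms_per_day

def measured_rtt_ms(true_rtt_ms, send_core_offset_ms, recv_core_offset_ms):
    """RTT as seen by an application that timestamps on two different cores."""
    send_stamp = 0.0 + send_core_offset_ms          # request leaves at real time 0
    recv_stamp = true_rtt_ms + recv_core_offset_ms  # reply arrives true_rtt_ms later
    return recv_stamp - send_stamp

def weekly_trend(true_rtt_ms=5.0, drift_ms_per_day=150.0, days=7):
    """Measured RTT for each day since the last reboot (offset resets to 0 on reboot)."""
    return [
        measured_rtt_ms(true_rtt_ms, 0.0, core_offset_ms(day, drift_ms_per_day))
        for day in range(days)
    ]
```

With these assumed values `weekly_trend()` climbs from 5.0 ms on the day of the reboot to 905.0 ms six days later, even though the network itself never changed. That is the same shape as a response time that grows with uptime and resets at each reboot.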
If you disable the CPU power management & reboot the server, then you should be OK.
Not being a server person, how do you go about disabling CPU power management? Or is it specific to the vendor? Does this have to be done in the BIOS?