Hello All,
I just wanted to share my experience with the SAM Windows agent, so that anyone planning an upgrade can be extra cautious!
Today I upgraded to NPM 12.01 and SAM 6.3 at one of my customers' sites, early in the morning during non-business hours. As usual, the upgrade itself went smoothly, without any issues for NPM or SAM. I was excited to see the HA settings and the new Linux agent.
Once the web console was up, the real problem started. Around 350 agents began updating at the same time, and it took almost 4 hours for all of them to move from "Update in progress" to "OK". I thought I could relax, but then a different problem appeared: I started getting complaints that the e-mail alerts did not contain the proper information. For example:
The CPU on SERVER12345 is currently running at 98 %. The top 10 processes running at the time of this poll are listed below:
Unable to get list of processes - Unable to schedule job on agent node 827 - required APM plugin is missing or not installed properly
So I understood that there was a problem with the agent plugin. When I checked the agents, the agent status was OK, but the plugin status was stuck in either the Pending or the Approved state.
Agent Plugin Status
A few were in Pending status and a few were in Approved status.
On around 50 agents the plugin was installed properly; the rest were stuck. Then I started noticing a latency issue. When I checked the interface utilization, I was shocked to see that it had gone up by more than 100 % since the agents started updating. I hoped the agent plugins would install without further issues, so I simply waited. After about an hour, the agents slowly started updating their plugins properly, one by one. Completing the agent and plugin update for all agents took nearly 5 hours at this site, because the servers were in a different domain and on a different network altogether.
So if you plan for agent updates, please consider these points:
- Perform the upgrade during non-business hours.
- Update the agents in batches if you have poor network connectivity.
- Temporarily turn off "Allow automatic agent updates" before starting the upgrade.
- Allow adequate time, depending on the number of agents.
- Turn off the alerts if you have auto-ticketing enabled for CPU and memory stats.
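
To make the "batches" and "adequate time" points concrete, here is a rough planning sketch in Python. It does not talk to Orion or use any SolarWinds API; it only splits a list of agent node IDs into groups and estimates how long the window might be. The node IDs, the batch size of 50 and the 30 minutes per batch are all made-up assumptions for illustration, so plug in numbers from your own environment.

```python
from typing import Iterator, List, Sequence


def make_batches(agent_ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive groups of agent IDs, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(agent_ids), batch_size):
        yield list(agent_ids[start:start + batch_size])


def estimate_window_minutes(agent_count: int, batch_size: int,
                            minutes_per_batch: float) -> float:
    """Rough window estimate, assuming batches run strictly one after another."""
    batch_count = -(-agent_count // batch_size)  # ceiling division
    return batch_count * minutes_per_batch


# Hypothetical node IDs standing in for the roughly 350 agents in this post.
agents = list(range(1, 351))

# Assumed batch size; pick something your network links can handle.
batches = list(make_batches(agents, batch_size=50))

# Assumed 30 minutes per batch; measure your first batch and adjust.
window = estimate_window_minutes(len(agents), batch_size=50, minutes_per_batch=30)

print(f"{len(batches)} batches, roughly {window:.0f} minutes in total")
```

The idea is to update one group, wait until both the agent status and the plugin status look healthy, and only then move to the next group, instead of letting every agent pull the update at once.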
Once the plugins were installed properly, my network returned to normal and SAM started rocking!! Share your thoughts and cases; they might be useful for someone who is planning to upgrade.