I had a similar issue with our HP switches when doing software upgrades. Some models would reload, but most would not. I tested the script manually on a switch and it worked just fine. My workaround was to set up a second job to reload the switches. By itself, the reload job worked fine, so I staggered the reload times about 15 minutes after the first job to allow it to complete, especially when upgrading a large number of switches.
As for getting it working in one script, I'm sorry to say I haven't managed that yet.
Yeah, I thought of creating a second job. It would probably do the trick, but with so few switches in this environment, it's faster to connect via the CLI and upgrade them that way. The tftp download across the network is relatively slow.
I'm using NCM version 7.5. I'm not using the F/W upgrade wizard, but my own script. I don't remember seeing the wizard, but even then it's probably only for Cisco IOS devices. We use ProCurve switches, so I don't think it would work. I should check anyway...
Well, I am currently in the process of upgrading SW to the current 12.1 NPM and 7.6 NCM. I had to stop mid-install to accommodate the switch upgrade. This was done on NCM 7.5.
NCM 7.5 and this is my own script.
Could it be an issue of timing? If NCM sends the script out as fast as it can, but the switch takes considerable time copying via tftp, I can imagine the NCM script timing out and failing to reload. Would the solution be to have two jobs, one to copy the new image over, and the second job being the reload of the switch--which would be timed 30 minutes after the first job ran?
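The two-job idea comes down to simple scheduling arithmetic. Here is a minimal Python sketch of it, assuming a hypothetical helper named `reload_start_time` and a copy duration you have measured yourself. It is not an NCM feature or API, just a way to pick the start time for the second job; the date in the example is arbitrary.

```python
from datetime import datetime, timedelta


def reload_start_time(copy_start, copy_minutes, margin_minutes=10):
    """Return when the reload job should start.

    copy_start     -- datetime the copy job is scheduled for
    copy_minutes   -- measured duration of the tftp copy, in minutes
    margin_minutes -- extra slack in case the transfer runs slow (assumed value)
    """
    if copy_minutes < 0 or margin_minutes < 0:
        raise ValueError("durations must be non-negative")
    return copy_start + timedelta(minutes=copy_minutes + margin_minutes)


# Example: copy job at 22:00, copy measured at 20 minutes, 10 minutes of slack.
copy_job = datetime(2017, 5, 1, 22, 0)
print(reload_start_time(copy_job, 20))  # prints 2017-05-01 22:30:00
```

With those numbers the result matches the 30-minute gap suggested above. If you upgrade many switches from one tftp server at once, measure the copy time under that load rather than for a single switch.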
I successfully tested the new NCM GUI/scripted IOS switch & router upgrade, and while (apparently) SW has removed my shared notes and papers, you can read about the 7.6 Cisco IOS Upgrade features in NCM here: https://thwack.solarwinds.com/community/solarwinds-community/product-blog/blog/2017/03/07/ncm-76-sneak-peek-firmware-upgrades#start=25
The NCM script should literally be configured to duplicate every step you'd use to upgrade a switch or router manually from the CLI. If the above ideas don't help, I'd go through the process manually once, carefully documenting every step of the upgrade. I'd use the same user credentials that NCM uses, too, when I SSH into the network device to prepare it and to upgrade it.
Then the documented steps should be entered into NCM as a scripted job to execute, and saved for using again in the future. Then I'd start off the job and troubleshoot the steps to see where it fails. Some ideas (which you may have already considered):
- Does NCM have the appropriate rights to do all the steps? If it can move the file successfully, but doesn't have admin privileges to reboot the switch, you've found the problem.
- Is the correct file name & location referenced every time?
- Is the tftp server enabled?
- Does the tftp server have the correct path to the IOS code to be uploaded?
- Is there sufficient space on the switch for the new code?
- Is there a different option to use instead of tftp (e.g.: sftp or scp)? Tftp may not be your most efficient/reliable/successful protocol to use--especially for larger files or across a WAN with higher latency.
- How long does it take to complete the tftp process and unpack / prep the file on the switch to be ready for the reload command? Is it so long that something (NCM or the switch) times out, and the next command is either aborted by NCM, or issued to the switch and ignored because the switch isn't ready for it?
- Does your "software install file flash" command include removing the boot parameter and installing the new one?
- Ensure the switch is using TACACS, then track the exact commands NCM uses to execute the reload. AAA should show you if your NCM user account has the right Authentication and Authorization to do the job, and the Accounting function of TACACS will reveal the timing of each command, and whether NCM has the rights to do the reload. Maybe you'll find out that the reload commands are issued prior to the file completely being transferred and the switch is not ready for new commands to be received.
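To put a rough number on the tftp timing question in the list above, here is a back-of-the-envelope Python sketch. It assumes classic lock-step TFTP (RFC 1350): 512-byte blocks, each acknowledged before the next one is sent, so every block costs about one round trip. It ignores block-size negotiation (RFC 2348), retransmissions, and server load, and the function name and the example figures (a 300 MB image, 1 ms and 40 ms round-trip times) are hypothetical.

```python
def tftp_transfer_seconds(size_bytes, rtt_ms, link_mbps, block_size=512):
    """Rough estimate of a lock-step TFTP transfer time in seconds.

    size_bytes -- size of the image file
    rtt_ms     -- round-trip time between switch and tftp server
    link_mbps  -- bandwidth of the slowest link on the path
    block_size -- TFTP data block size (512 is the RFC 1350 default)
    """
    # A transfer ends with a block shorter than block_size, hence the +1.
    blocks = size_bytes // block_size + 1
    waiting = blocks * (rtt_ms / 1000.0)              # one round trip per block
    on_wire = size_bytes * 8 / (link_mbps * 1_000_000)  # raw serialization time
    return waiting + on_wire


image = 300 * 1024 * 1024  # hypothetical 300 MB image
for rtt in (1, 40):        # LAN-like vs. WAN-like round-trip times (assumed)
    minutes = tftp_transfer_seconds(image, rtt, 100) / 60
    print(f"RTT {rtt} ms: about {minutes:.0f} minutes")
```

The point of the sketch is that round-trip time, not bandwidth, dominates a lock-step transfer, which is why tftp gets so much slower across a WAN. Treat the output as a lower bound for sizing timeouts, and measure a real transfer before relying on it.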
Is there a setting to add time for the script to complete? Either from the SW configuration or inside the script itself?
Thanks for the help! Ben
With my 3850 stacks, the couple of times I have used SolarWinds to automate the process, I did the software install using tftp: directly on the command line. That way I reduced the process to just three steps.
software install file tftp://_server_/name_of_ios_file on-reboot
Can you check and see the total time of (Download from tftp) to (Start of Install) and (Reboot)?
I don't have a switch to update via SolarWinds at the moment, but I do recall that when I manually updated the 3850, it took in the neighborhood of 16-20 minutes to tftp the file (100 Mbps connection) and then perform the install. Total time from start to finish, including the reboot, was 23 minutes for a single switch and 27 minutes for a stack of switches (2-9).
I'm not positive whether it's still happening, but for some time NCM has been either collapsing multiple blank lines or removing blank lines altogether. It's best not to have blank lines in your scripts. If a prompt only needs a simple <return> to accept the default, and let's say the default is a "y" answer, go ahead and put the "y" in.
You are correct that keeping it as short as possible is best too. Long scripts can have issues, especially if a command doesn't process quickly.
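If you keep your scripts in text files, a quick pre-check can catch blank lines before you paste the script into NCM. This is a minimal Python sketch with a hypothetical helper name (`strip_blank_lines`) and placeholder commands; it runs outside NCM and only flags and removes the blank lines.

```python
def strip_blank_lines(script_text):
    """Remove empty or whitespace-only lines from a script.

    Returns the cleaned script and the 1-based numbers of the removed lines,
    so each one can be reviewed: a blank line that stood for "press Enter"
    should be replaced by hand with the explicit answer (for example "y").
    """
    kept, removed = [], []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if line.strip():
            kept.append(line)
        else:
            removed.append(number)
    return "\n".join(kept), removed


# Placeholder commands, not real switch syntax.
script = "first command\n\nsecond command\n   \nthird command"
cleaned, removed = strip_blank_lines(script)
print(removed)  # prints [2, 4]
print(cleaned)
```

The removed line numbers are the ones to look at: if any of them was meant to accept a default at a prompt, type the explicit answer there instead of just deleting the line.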