1.) I inherited multiple NPM instances that I want to upgrade.
2.) The prior admin attempted the NPM 11.0.1 to 11.5.3 upgrade, and it broke many reporting and alerting queries: variables not passing correct info (e.g. 'NEVER'), false alerts sent in 'floods' to the NOC, discovery failing, and reporting not working.
3.) It was rolled back after an outcry from the NOC. With hundreds of alerts and reports built over the last 6-8 years, fixing the obviously broken items and then digging into and testing each remaining alert and report would have taken days.
I'm revisiting whether to upgrade to 11.5.3 or go straight to NPM 12, and whether the same alert and report breakage shows up going from 11.0.1 to NPM 12 (fixed or inherited), on top of any other unknown issues.
The two upgrade targets are a Win 2008 SP2 VM running NPM 11.0.1 w/NCM 7.3.2, and a separate Win 2008 SP2 VM running NPM 11.0.1 w/SAM 6.1.1. Both connect to a MSSQL 2012 SP3 server for their respective databases. Each instance has 3-4 additional pollers, with Unlimited SLX licenses. A primary polling engine for an NPM instance has about 8000-12000 elements.
My plan was to set up a lab mimicking both production instances (primary poller only) at current production versions, then do the upgrade, see what breaks in alerts and reports, fix them, and export the fixes for import into production.
I need feedback on how best to set up the lab. We have limited element licenses for lab use (i.e. NPM 100), so I'm wondering whether my testing would be skewed by using a copy of the actual production database, or whether I should just move the schema over and import a small representative sample of devices.
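To make the "small representative sample" idea concrete, here is a rough Python sketch of how I might pick the lab devices. It assumes I can export the production node list to something like a list of dicts; the `vendor` field, the `pick_lab_sample` name, and treating the lab license as a flat cap of 100 nodes are all my own assumptions for illustration, not anything NPM provides.

```python
import random
from collections import defaultdict

def pick_lab_sample(nodes, cap=100, key="vendor", seed=0):
    """Pick up to `cap` nodes, keeping each group roughly in proportion.

    `nodes` is a list of dicts (hypothetical export of the production node
    list); `key` names the field used for grouping, e.g. vendor or device type.
    """
    if len(nodes) <= cap:
        return list(nodes)

    groups = defaultdict(list)
    for node in nodes:
        groups[node[key]].append(node)

    rng = random.Random(seed)  # fixed seed so the lab build is repeatable
    total = len(nodes)
    # Largest groups first, so they win if there are more groups than slots.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), str(kv[0])))

    sample = []
    for _, members in ordered:
        remaining = cap - len(sample)
        if remaining <= 0:
            break
        # Proportional share, but at least one node per group.
        share = max(1, (len(members) * cap) // total)
        share = min(share, remaining, len(members))
        sample.extend(rng.sample(members, share))
    return sample
```

The point of grouping is that an alert written for one device family still has at least one matching device in the lab. The sample can come out slightly under the cap because shares are rounded down.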
Option 1: 11.0.1 to 11.5.3
1.) stand up a lab copy of our current production Win 2008 SP2 NPM 11.0.1 w/NCM 7.3.2, plus a separate lab database on MSSQL 2012 SP3
2.) copy the production database over to the lab database, or perhaps just the schema, and import a small sample of devices (our lab licenses are limited to 100)
3.) do an upgrade from NPM 11.0.1 to NPM 11.5.3 in the lab
4.) go through and manually test hundreds of alerts and reports to find what breaks, and verify EOC is still gathering server info
5.) manually fix alerts and reports, then itemize and export the fixed ones for import later
6.) upgrade production, then import the fixed alerts and reports
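For step 4, rather than eyeballing every alert, I was thinking of something like the sketch below to triage which alerts to look at first. It assumes I can collect the rendered alert text from the lab (test emails, log lines, whatever is easiest) as (alert name, message) pairs. The `flag_suspect_alerts` name and the patterns are my own guesses: the literal 'NEVER' we saw last time, plus any leftover `${...}` placeholder that never got substituted.

```python
import re
from collections import Counter

# Assumed signs of a variable that failed to resolve: the literal NEVER seen
# during the earlier upgrade attempt, or a leftover ${...} placeholder.
# Adjust these to whatever actually shows up in the lab.
SUSPECT = re.compile(r"\bNEVER\b|\$\{[^}]*\}")

def flag_suspect_alerts(messages):
    """Count suspicious messages per alert name.

    `messages` is an iterable of (alert_name, rendered_text) pairs, however
    they were gathered from the lab instance.
    """
    hits = Counter()
    for alert_name, text in messages:
        if SUSPECT.search(text):
            hits[alert_name] += 1
    return hits
```

Alerts that never show up in the result still need a spot check, since this only catches the variable-substitution type of breakage and not false triggers or failed discovery.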
Option 2: 11.0.1 to 12
1.) go right to NPM 12. Unlike Option 1, this would require standing up a new server with 2012 R2 as the OS
2.) see if the same things break, and verify any impact to views, EOC rollups, and anything else new. (I normally like to wait for at least a few dot versions, i.e. .2 or .3, before upgrading.)
Any input or guidance appreciated!