Thanks for the feedback thread.
I also had both of your issues (Errors During Upgrade to 2023.2). The maintenance issue is a bug (but please confirm with support, though).
How big is your environment? I am also around the 5 hour mark for most upgrades (plus any issues).
Have you run into any issues with your job engine crashing per this thread? https://thwack.solarwinds.com/product-forums/network-performance-monitor-npm/f/forum/98977/anyone-else-having-job-engine-issues-affecting-polling
We ended up upgrading to a service release instead of 2023.2.1 because of polling issues discussed in this thread.
Unfortunately, I saw the thread bharris1 mentioned after we did our upgrade to 2023.2.1. Knowing about the potential for job engine failure would have made me pump the brakes. Knock on wood, we have not run into that problem yet. I have enabled the "jobs lost" performance monitor in our poller monitors. I have also added a perfstack to quickly view jobs lost, jobs queued and jobs running for the polling engines.
We had no issues during the upgrade itself. We downloaded the files the day before, rebooted the servers the day of, and ran the upgrade. It took around an hour and a half or so. We were running on 2022.4.
Post-upgrade, I have received a notification saying our SAM DB maintenance has not run, but manually running DB Maintenance completes without issue. The SAM error has not cleared, so I have an open ticket with support on that.
Additionally, the web UI does seem to be running a bit slower, but it's been sporadic. At some points it runs normally, at others it is slow. I have had a couple of agents come back as "unable to connect to agent" [Server Initiated Communication], but a restart of the agent service tends to get them working again.
We have a large environment: about 120K elements, and our SQL DB is around 1 terabyte. The element count is not x-large, but the data we collect and the DB size probably qualify as x-large. Support is telling me the DB maintenance had already been getting slower before the upgrade. I only just noticed that the log was not indicating completion. The log also seems to have a different format for indicating completion. It used to have a line "nightly maintenance has completed", and that line is no longer what marks the maintenance as complete.
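In case it is useful to anyone, here is a rough sketch of how one could scan a maintenance log for a completion line instead of eyeballing it. It only assumes the old marker text quoted above; the function name, the marker list and the log path in the usage comment are all made up for illustration, and since the newer log format seems to use different wording, you would need to add whatever completion text your own logs actually show.

```python
from typing import Iterable, Optional, Sequence

# Assumption: this is the old completion text quoted in the post above.
# Newer versions apparently log something different, so extend this list
# with the wording you actually see in your own maintenance log.
DEFAULT_MARKERS = ("nightly maintenance has completed",)


def last_completion_line(
    lines: Iterable[str],
    markers: Sequence[str] = DEFAULT_MARKERS,
) -> Optional[str]:
    """Return the last line containing any marker (case-insensitive), or None."""
    lowered = [m.lower() for m in markers]
    found = None
    for line in lines:
        text = line.lower()
        if any(m in text for m in lowered):
            # Keep overwriting so we end up with the most recent match.
            found = line.rstrip("\n")
    return found


# Hypothetical usage; the path below is a placeholder, not a real default:
# with open("path/to/maintenance.log", encoding="utf-8", errors="replace") as f:
#     hit = last_completion_line(f)
#     print(hit or "No completion line found")
```

If this returns nothing for a night you expected to complete, that is a reason to look closer or ask support, not proof that maintenance failed, since the marker text itself may have changed.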
FWIW, we haven't seen any problems with job engine failures. I think the big thing is moving to the new platform that consolidates the installer into one package, which in turn will make future upgrades much faster. We are going to 2023.3 and will probably wait for the stable release in 2024.
To add, we aren't going to upgrade to any version until Q1 of 2024 unless there is some major bug found or security patch we need.