TLDR: Looking for a report that shows all backups that have a failure error message AND that occurred in the last 7 days.
I've been asking (and asking) here in THWACK**, with support, and with our TAM (sadly we have now lost direct access to him) for help with creating a report that will show failures in our backups. Anyway ...
The scenario is:
Currently we back up approx 2,200 nodes in rolling jobs on a weekly basis, and we need to add another 2,500 in the very near future, but we want to be sure everything in place is working as expected before adding to what we have.
What we want (would like):
Ideally we want to move to a state where we can alert on every node that fails to back up, but for now we'd settle for a report, sent once a week, that shows what didn't back up in the prior 7 days.
The problem:
I can't find anything, nor have I been pointed at anything, nor have I been able to utter the right incantation to force SWQL to throw out what we need. Most of the problem with the last bit is that I'm a complete n00b on SQL/SWQL. Most recently I was pointed at the out-of-the-box report "config transfer audit" ***, which, when first run, seems to fit the bill. On a cursory inspection, though, it shows only approx a quarter of our currently configured backup estate, and it appears to me to show only devices that have had a manually invoked backup, not those run from a job. Secondly, it shows error messages that happened months ago and seems to ignore the fact that there are more recent successful backups, yet it seemingly assigns those old errors to more recent dates!
.
So ... does anyone out there already have a similar report (we only use NPM, NCM, Netflow and IPAM) that would do what we need?
A partial answer (or at least a start) is that I did manage to cobble together this report, but I am being asked to take it a step further. It has two issues (for me anyway):
- I have no idea how to get it to show me only devices that failed in the last X days - my SQL Fu is not strong enough
- I can only filter on LoginStatus not like '%Login OK%' but I'm not convinced that covers off all failure types
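To make the two issues above concrete, here is a minimal sketch of the selection logic I think the report needs, written in Python rather than SWQL because I can't vouch for the NCM entity names. Everything in it is an assumption for illustration: the record layout (node, finished_at, succeeded, message), the node names and the function name are all made up and are not the NCM schema. The idea is to keep only attempts inside the last X days, take the most recent attempt per node, and report a node only if that latest attempt failed, judged by an explicit success flag rather than a match on one status string.

```python
from datetime import datetime, timedelta

# Hypothetical transfer records: (node, finished_at, succeeded, message).
# These field names are illustrative only and are NOT the NCM schema.
def failed_nodes(records, now, days=7):
    """Return {node: message} for nodes whose most recent backup
    attempt inside the reporting window failed."""
    cutoff = now - timedelta(days=days)
    latest = {}
    for node, finished_at, succeeded, message in records:
        if finished_at < cutoff or finished_at > now:
            continue  # outside the reporting window
        # Keep only the newest attempt seen so far for each node.
        if node not in latest or finished_at > latest[node][0]:
            latest[node] = (finished_at, succeeded, message)
    # Report a node only when its newest in-window attempt failed.
    return {n: msg for n, (_, ok, msg) in latest.items() if not ok}

# Made-up sample data to show the behaviour.
now = datetime(2024, 1, 15, 9, 0)
records = [
    ("sw-core-01", datetime(2024, 1, 10), False, "Connection refused"),
    ("sw-core-01", datetime(2024, 1, 12), True, ""),           # later success wins
    ("sw-edge-07", datetime(2024, 1, 13), False, "Login failed"),
    ("rtr-old-02", datetime(2023, 10, 1), False, "Timeout"),   # months old, ignored
]
print(failed_nodes(records, now))  # {'sw-edge-07': 'Login failed'}
```

Taking the latest attempt per node is what stops a months-old error from showing up when a more recent backup succeeded. Note that a node with no attempt at all in the window would not appear here; catching those would need a comparison against the full node list.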
.
** some threads on backups I've created, interacted on or even just read:
https://thwack.solarwinds.com/product-forums/the-orion-platform/f/forum/94060/alert-me-when-there-is-backup-failure-of-a-node-during-nightly-config-backup
https://thwack.solarwinds.com/product-forums/network-configuration-manager-ncm/f/forum/91109/how-can-i-alert-on-a-backup-failure-of-a-node
*** I was looking for reports with the word 'backup' in the name - doh!