The device maintenance function within Orion currently provides two options to control alerting and monitoring during maintenance windows.
Unmanage Object
This works by stopping all polling of the object. With no data reaching the database, alerts cannot fire. Because no down status is recorded for the object, SLA availability reports have no downtime to include, so (assuming the device was up the whole time outside of the Unmanage schedule) they will show 100% uptime.
This works well and covers both primary functions: alerts are suppressed and the maintenance window is kept out of the SLA report.
Mute Alerts
This option allows the alert to be evaluated as normal, but if any object that would trigger the alert appears in the Muted schedule, the alert is suppressed. All polling continues, so during a maintenance window created with this method performance metrics are still collected and object status is still recorded.
This is the more commonly used option, as our customers indicate they still wish to see utilisation data during these maintenance windows. They would also like to see exactly when devices/services go down during the window, as this confirms expected activity and gives them a record of the timings of service loss events. Muting alerts simply means that no alerts are generated.
The problem arises when a customer has to supply an SLA availability report that excludes scheduled maintenance windows, and those windows were created with the Mute method. Any downtime during the window is still included in the downtime percentage, because the device status is still recorded as down and the values in the database used to calculate availability are not affected by the Mute state.
Request:
Provide a mechanism that allows Mute periods to be excluded from SLA availability report output.
Either build a data structure that records historic mute periods, or provide a status and data structure change that allows these time periods to be queried easily so they can be excluded from the availability calculation. Currently Auditing needs to be enabled, and the data structure is not very efficient for retrieving mute periods over a long period of time.
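As a rough illustration of the calculation being requested, the Python sketch below computes availability for a reporting period with and without mute windows excluded. It is not Orion code and does not use any Orion API: the function names, the (start, end) interval representation, and the choice to remove muted time from both the downtime and the measured period are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def merge(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def clip(intervals, lo, hi):
    """Keep only the parts of each interval that fall inside [lo, hi]."""
    return [(max(s, lo), min(e, hi)) for s, e in intervals if s < hi and e > lo]


def total(intervals):
    """Total duration of a list of non-overlapping intervals."""
    return sum((e - s for s, e in intervals), timedelta())


def overlap(a, b):
    """Total time covered by both interval lists (each already merged)."""
    shared = timedelta()
    for s1, e1 in a:
        for s2, e2 in b:
            s, e = max(s1, s2), min(e1, e2)
            if s < e:
                shared += e - s
    return shared


def availability(period, down, muted=()):
    """Percent availability over `period`, ignoring time inside `muted`.

    Assumption: muted time is removed from the measured period as well as
    from the downtime. Returns None if the whole period is muted.
    """
    lo, hi = period
    down = merge(clip(down, lo, hi))
    muted = merge(clip(muted, lo, hi))
    measured = (hi - lo) - total(muted)
    if measured <= timedelta():
        return None
    counted_down = total(down) - overlap(down, muted)
    return 100.0 * (1 - counted_down / measured)


# Hypothetical data: a 30-day report with one 4-hour mute window.
period = (datetime(2024, 6, 1), datetime(2024, 7, 1))
muted = [(datetime(2024, 6, 10, 22, 0), datetime(2024, 6, 11, 2, 0))]
down = [
    (datetime(2024, 6, 10, 23, 0), datetime(2024, 6, 11, 0, 30)),  # inside the mute window
    (datetime(2024, 6, 20, 12, 0), datetime(2024, 6, 20, 12, 30)),  # unplanned outage
]

raw = availability(period, down)              # what the report shows today
adjusted = availability(period, down, muted)  # with the mute window excluded
print(f"raw: {raw:.2f}%  adjusted: {adjusted:.2f}%")
```

With this made-up data the raw figure is roughly 99.72%, while the adjusted figure is roughly 99.93%, because the 1.5 hours of planned downtime inside the mute window no longer count. The calculation only works if historic mute periods can be retrieved as simple start and end times, which is the data structure this request asks for.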