A while ago, I discovered two related issues concerning remote collectors:
- Lack of Alerts: Currently, no alert is generated when a remote collector (which is not managed as a node itself) becomes unreachable.
- Stale Node Status: Consequently, the status of nodes polled by the down remote collector remains unchanged (e.g., still showing as 'Up') indefinitely, even when the last database update was weeks ago.
Suggestions for Improvement:
- To ensure timely awareness of remote site issues, I suggest implementing default ("out-of-the-box") alerts that trigger when a remote collector becomes unreachable.
- Furthermore, the status of nodes should automatically change to 'Down' or 'Unknown' when the remote collector assigned to poll them is unreachable. This would accurately reflect that they are no longer being monitored.
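To make the two suggestions concrete, here is a rough Python sketch of the logic I have in mind. It is not SolarWinds code and does not use any SolarWinds API: the `Collector` and `Node` classes, the `evaluate` function, and the 10-minute staleness threshold are all hypothetical names and values I made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Assumed threshold: treat a collector as unreachable if it has not
# written to the database for this long. The right value is a guess.
STALE_AFTER = timedelta(minutes=10)


@dataclass
class Collector:
    name: str
    last_update: datetime  # time of the collector's last database update


@dataclass
class Node:
    name: str
    status: str     # last status stored in the database, e.g. "Up"
    collector: str  # name of the remote collector that polls this node


def evaluate(
    collectors: List[Collector],
    nodes: List[Node],
    now: datetime,
    stale_after: timedelta = STALE_AFTER,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (alerts, effective node statuses) for the given point in time."""
    # A collector counts as unreachable when its last update is too old.
    stale = {c.name for c in collectors if now - c.last_update > stale_after}

    # Suggestion 1: one alert per unreachable collector.
    alerts = [f"Remote collector '{name}' is unreachable" for name in sorted(stale)]

    # Suggestion 2: nodes behind an unreachable collector are reported as
    # 'Unknown' instead of keeping their last stored status.
    effective = {
        n.name: ("Unknown" if n.collector in stale else n.status) for n in nodes
    }
    return alerts, effective
```

For example, with one collector that last updated two minutes ago and another that last updated two weeks ago, only the second one raises an alert, and only the nodes assigned to it are shown as 'Unknown'.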
Frankly, the current behavior and the resulting stale node statuses feel inadequate for an enterprise-level monitoring solution like SolarWinds. Reliable alerting and accurate status representation must be a top priority.