SolarWinds provides an out-of-the-box (OOTB) SAM template for monitoring Windows failover clusters, but there is no SAM template for collecting metrics from Linux/Unix clusters. Below are some important metrics to start with:
| SUSE Linux Cluster (DRBD) |
| DRBD Cluster Node is not running |
| DRBD Disk is not running |
| DRBD Node role is unknown |
| DRBD storage Monitoring |
| Process monitoring |
| Cluster Resource Status (log) |
| Port Monitoring (SSH-22) |
|
| RHEL Linux Cluster (PCS Cluster) |
| Daemon Corosync Status is not active |
| Daemon Pacemaker Status is not active |
| Daemon PCSD Status is not active |
| RHEL Cluster Node is not running |
| Port Monitoring (SSH-22) |
| Process Monitoring (crond, chronyd, ds_agent, corosync, pcsd, pacemakerd, ubrokerd) |
| Cluster Resource Status |
| Log Mon (/var/log/rsync_fortek_replication.log) |
| Log Mon (/var/log/rsync_nfs.log, search: "ERROR*") |
| Log Mon (/var/log/mount_point_check.log, search: "ERROR*") |
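To illustrate how the three daemon checks in the RHEL list might be reported by a single Linux/Unix script monitor, here is a minimal sketch. It assumes the script-monitor convention of `Statistic.<name>` and `Message.<name>` output lines with exit code 0 for up and 1 for down (worth verifying against the SAM documentation), and that the state strings come from `systemctl is-active <unit>`. The helper name `daemon_report` and the unit list are hypothetical.

```python
# Hypothetical helper: convert `systemctl is-active` results into
# script-monitor style output lines plus an exit code.
DAEMONS = ("corosync", "pacemaker", "pcsd")  # assumed systemd unit names


def daemon_report(states):
    """Build output lines and an exit code from daemon states.

    `states` maps a unit name to the text printed by
    `systemctl is-active <unit>` (for example "active" or "failed").
    Units missing from the mapping are reported as "unknown".
    """
    lines = []
    any_down = False
    for unit in DAEMONS:
        state = states.get(unit, "unknown").strip()
        is_up = 1 if state == "active" else 0
        if not is_up:
            any_down = True
        lines.append(f"Statistic.{unit}: {is_up}")
        lines.append(f"Message.{unit}: {unit} is {state}")
    # Assumed convention: 0 = up, 1 = down.
    exit_code = 1 if any_down else 0
    return lines, exit_code
```

In a real monitor the states would be collected on the node (for instance with `subprocess.run(["systemctl", "is-active", unit], capture_output=True, text=True)`), the lines printed, and the exit code passed to `sys.exit`.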
The template should also be able to distinguish between the active and passive nodes in a cluster, so that node-down or resource-down alerts are triggered from the active node only, minimizing duplicate alerts in the system.
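The active/passive gating described above could look roughly like the sketch below. It is only an illustration under stated assumptions: for DRBD, the local role is taken from `drbdadm role <resource>`, which prints `Primary/Secondary` (local/peer) on DRBD 8.x and a single role on newer versions; for a PCS cluster, the node currently running the resource is assumed to have been extracted from the cluster status beforehand. All function names are hypothetical.

```python
def drbd_local_role(role_output):
    """Return the local role from `drbdadm role <resource>` output.

    Accepts "Primary/Secondary" (local/peer) or a single role such as
    "Primary". Empty output is treated as "Unknown".
    """
    text = role_output.strip()
    if not text:
        return "Unknown"
    return text.split("/")[0]


def is_active_node(local_host, resource_host):
    """Compare short hostnames, ignoring case and any domain suffix."""
    def short(name):
        return name.strip().split(".")[0].lower()
    return short(local_host) == short(resource_host)


def should_alert(node_is_active, resource_ok):
    """Raise a resource-down alert only when evaluated on the active node."""
    return node_is_active and not resource_ok
```

With this split, the passive node still runs the same template but stays silent for resource-down conditions, so each failure produces one alert instead of one per cluster member. Checks that apply to every node, such as the SSH port or process monitoring, would not go through this gate.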