Hello,
I am planning to monitor SQL Server failover clusters using AppInsight for SQL. What is the best way or process to do that?
You can, you probably want to add more stuff and not alert on much of what's in appinsight. What you add in depends a lot on what the DB does, best to talk to your DBA if you have one and see what they care about
Consider monitoring the virtual ip/dns pointer via Appinsight too. You get a current and consistent data feed. The cluster members just need "normal" monitoring. This does mean that you will need to monitor that vip as a node. AppInsight will not generally detect failovers, but you can get that information other ways. Depends on what you are doing - so you know where to look for the data. Like @adam.beedell suggested - talk to your DBA, even if that is you, and determine what is important. Also realize that you may want to change the warning and critical thresholds over time to reflect normal status of your databases. The canned thresholds can and sometimes ought to changed.
I was asked to set up something similar and used the Solarwinds SQL Query Template. We had two data centers with one side of the cluster in each one. There were many databases inside each cluster, and many clusters. During normal operations both datacenters were 'hot' so the primary on the SQL cluster could be in either one. The DBA's needed to know just a few high level items on a dashboard. Availability - Up/Down of each DB, Response time, and which Data Center was primary. We used the Solarwinds SQL Query Template and ran a simple query to get the database name from each database, both primary and secondary. This provided Up/Down and the response time for a simple query. Another query was set up to ask which server was primary for each cluster. We then created alerts and a dashboard for to show the stats for each metric.
@bobmarley - What he said! I am doing similar using custom sql queries to get status of each cluster. Who is primary/secondary, memberstate, replication sync state, etc. can be all scooped up. Those are added to a SAM template and applied as needed.
This is the way.I'm also a big fan of adding couple more queries around functionality. 'Does Important_Table_X have an update this hour?" etc
I love these kind of questions, it always opens up the discussion from management about how you show the status of these SQL clusters. This is where a well constructed Orion Map comes into its own.
Thanks @marcrobinson and @adam.beedell I was able to get few other templates from the community to help to get some idea