Monitoring Replication With SQL Sentry
Like many DBAs, I've had to deal with replication in a few different environments. While it usually ran pretty smoothly, when things went badly—it was better if I figured that out sooner rather than later. Especially in the case of using replication for HA/DR, you need to know that replication is working and is up to date.
Within SentryOne SQL Sentry, while there isn't a replication monitor like there is in SQL Server Management Studio, you can still be alerted when something in your replicated environment isn't working as planned.
The first step is to create general alerts on all of the replication based jobs that are currently enabled. Here are the jobs in my environment—and yes, I'm having issues at the moment.
You'll want to create alerts on failures and it might be beneficial to create alerts on retry for the replication agent jobs (the last three above).
You may already have alerts set up for all job failures, but you can also navigate to these jobs in the SQL Sentry navigation pane. If you select a specific job, the applicable alerts will appear.
Setting up alerts on these job failures gets you part of the way there, but Advisory Conditions can help alert you to additional potential issues in your environment.
Advisory Conditions can be used to create alerts around the values from a T-SQL query, WMI query, performance counter value, or any combination of these.
Here is a link to a zip file with some Advisory Conditions that have been created to monitor replication. Once the file has been unzipped, you can import any or all of these into your environment by opening the conditions list (expand Advisory Conditions in the image above) and with a right click, choose Import. The Advisory Conditions include the following:
Replication is Installed—This should be fairly obvious, but could be helpful in the global alerts to let you know if someone has set up replication on one of your servers.
Distributor offline—Also fairly obvious, but any of the databases in your replication scenario go offline, replication will as well.
Distribution Delivery Latency/Transactional Log Reader Latency—Both of these conditions first check to see if replication is installed (all of our replication conditions do) and, if so, query the Dist:Delivery Latency and Logreader:Delivery Latency performance counters. The min and max settings are set to 5 and 10,000, but those can be adjusted for your environment. The distribution delivery latency measures period of time from when commands are delivered to the distribution database to when they are applied at the subscriber. The Log Reader latency is between commands applied at the publisher to when they are delivered to the distribution database.
Distribution Delivery Cmd/sec/Transactional Log Reader Cmd/sec—Looking at commands per second can help identify increases in replication traffic or when fewer commands than expected are being delivered indicating some kind of conflict. As with the Advisory Condition above, you can set the min and max values based on your environment.
If there are other performance counters you would like to monitor, you can use the following code, just replace the counter name, to create additional conditions. As a note, this can be used for alerting on any performance counters.
DECLARE @MinThreshold INT, @MaxThreshold INT SELECT @MinThreshold = 5, @MaxThreshold = 10000 SELECT object_name, counter_name, instance_name, cntr_value FROM sys.dm_os_performance_counters WHERE object_name like '%:Replication Dist.%' AND counter_name = 'Logreader:Delivered Cmds/sec' AND cntr_value NOT BETWEEN @MinThreshold AND @MaxThreshold;
Within Advisory Conditions, you can also alert on the results of a query, so you might want to query MSSubscriptions for any subscriptions with a status of 0 (inactive).
For more information about creating alerts in SQL Sentry, you can watch this video from courses.sentryone.com.