The instance availability alert should alert you when a monitored instance goes down. I've tested in our environment, but would suggest you test in yours.
Might you have a dev instance you can use for this?
I am looking for an alert if the Monitor is offline for an instance - we had a reboot of the client, but the monitor did not come back up, while the client itself was up and running just fine. The alert above will only fire if the client itself is down.
If I understand correctly, the monitor was stopped (likely due to the instance being down and unresponsive). When the instance came back up, the monitor remained in the stopped state. Are you wanting a monitor for the monitor? One thing I've done is schedule a job that periodically runs a query against the DPA repository to set the command column to START. Something like:
update COND set command='START' where...
Or if you want ALL of them started, just don't include the where clause.
If you don't want things to be started automagically, you could have the job run a select of command and status and send you the results periodically so you know the current state.
Nothing exists to do this out of the box currently though.
Hi, thanks for your post. We are looking to build this same alert, as one of our database monitors stopped recently while the instance was up and happy, and we were unaware of it for a while before someone brought it to our attention.
would you be willing to share the command and tables where to find this information?
thanks so much in advance!
So not sure how you are going to schedule the job to run (SSMS or cronjob or ??), but here's what you will want to do:
Take a look at the COND table in the DPA repository. It has the info you need, and direct updates to it (be careful, of course) can cause action to be taken, like starting monitoring programmatically.
select NAME, COMMAND, STATUS from COND;
Should give you something like this:
name             command  status
DPASQL2K5        START    STARTED
DPASQL2K8        START    STARTED
DPASQL2K8R2-WRG  START
If the command is 'START' and the status is 'STOPPED', you know the monitor is supposed to be up, but it's down.
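As a rough sketch of what a scheduled check along these lines could look like in Python: the function below takes any DB-API connection and returns the names of instances whose command is 'START' but whose status is 'STOPPED'. Only the COND table and its NAME, COMMAND, and STATUS columns come from the query above. The function names, the in-memory SQLite stand-in for the repository, and the STOPPED and STOP demo rows are assumptions made up for illustration.

```python
import sqlite3


def find_stalled_monitors(conn):
    """Return names of instances that should be monitored but are stopped."""
    cur = conn.cursor()
    cur.execute(
        "select NAME from COND "
        "where COMMAND = 'START' and STATUS = 'STOPPED' "
        "order by NAME"
    )
    return [row[0] for row in cur.fetchall()]


def make_demo_repository():
    """Build an in-memory stand-in for the repository (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table COND (NAME text, COMMAND text, STATUS text)")
    conn.executemany(
        "insert into COND values (?, ?, ?)",
        [
            ("DPASQL2K5", "START", "STARTED"),
            ("DPASQL2K8", "START", "STARTED"),
            # Made-up state: supposed to be running, but stopped.
            ("DPASQL2K8R2-WRG", "START", "STOPPED"),
            # Made-up row: stopped on purpose, so it must not be flagged.
            ("DEMO-MANUAL-STOP", "STOP", "STOPPED"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_demo_repository()
    for name in find_stalled_monitors(demo):
        print(f"{name}: command is START but status is STOPPED")
```

Against the real repository you would pass in a connection from whatever driver you already use, and have the job email or log the returned names.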
Hope that gets you going in the right direction!
Yes, thank you very much. I finally figured it out right when I got your email,
but your answer raises another question: if the monitor has really stopped, shouldn't the status show a stopped state?
The query reported it as started while the monitor had really stopped. Would that indicate the reporting was off and something needs to be reset in the database to report accurately?
also, can this be set up in a custom alert or just setup in a sql job?
thanks much! skenow
Yes, this can be set up using any kind of logic or schedule (I won't get into those specifics as there are MANY ways one could implement this).
What I meant is that the command and status *should* be consistent. If the command is set to 'START' and status is 'STOPPED', likely something has gone wrong and I'd look in the error logs.
ok thank you for the clarification
OK, so I created a custom alert in DPA to monitor whether monitoring has stopped on a server.
It uses the alert type described as "Executes a user-defined SQL statement that will return one or more name/numeric value pairs", with this SQL statement:
select NAME, COMMAND, STATUS from [ignite].[COND] WHERE [STATUS] = 'STOPPED'
I also set the Notification Policy to "Notify when level is not normal" and added 2 servers for testing, one with the monitor on and one with it off.
The first few tests of the alert gave correct results: one status Normal, one status Broken. Now, no matter how I test it (servers on or off), both servers come back in a Broken state, even though the monitors are on.
running the statement directly in SSMS brings back the correct results.
There is no SQL alert type currently set up that accurately describes the SQL statement I'm running, and there isn't a way to create one that I am aware of.
can you provide some help on what the issue may be, I'm lost on this and we need this setup to monitor if monitoring has stopped on our servers?
thanks in advance for any help you may provide me.
I see a couple of issues. Try this SQL using a custom multi-numeric return.
select NAME+' Monitor is stopped', 1 from [ignite].[COND] WHERE [STATUS] = 'STOPPED' and [COMMAND] = 'START'
This will prevent you from getting alerted when monitoring was turned off explicitly (command set to STOP), which I'm assuming would be a false positive for you.
Also, the format of the multi-numeric alert can really only handle two outputs (one alphanumeric and one numeric value).
Thanks very much, that seems to work well, and it was working when only 1 server monitor was down. But after adding another one in a stopped state, it fails to send the alert message. I created an alert group and assigned the appropriate groups, etc. to receive alerts, but I get no emails. Any ideas on why not?
and again thank you so much in advance !
When you query the COND table, what does it show for command and status for each instance? The script I added will only flag instances that *should* be monitored but are not for some reason (like an error). Make sure you aren't expecting an alert if you stopped a monitor manually because the command in that case would be set to 'STOP' which gets excluded.
Hi - I took off the last part (the [COMMAND] = 'START' condition) because we want to know if it's stopped no matter the reason. It now reports each instance that is in a stopped state, but I don't always get an email, and I'm not sure why.
Also, as I add more instances to the alert, I get an email with every instance listed and the stopped ones in the body of the email (copied below). Can the alert be set so that it only gives me the stopped instances, with their instance names, and not all the others?
Alert: Susie Test Monitor Custom SQL Alert - Multiple
Database Instance: DPSHxxxxxxxx
Execution Time: Thursday - April 20, 2017 09:40:46
View Alert Status: http://mcl-swdpa1:8123/iwc/alertMain.iwc
TEST for monitor is down please investigate
Parameter: Fxx-Sxxxxx Monitor is stopped
This message was system-generated. Do not reply to this
If I set it to monitored databases, it reports BROKEN; if I set it to repository, it sends me this message, but only once rather than each time the alert runs. I would expect to keep receiving an alert until the stopped condition is fixed.
Also my alert level is set to 4 min and 10 max in HIGH. Policy is set to "Notify when level is not Normal"
I've read through the online documentation in the administrator's guide, but it does not really give good examples or 'how to' setup information, so I'm at a loss here. Are there other docs to look at as well?
thank you very much,
How many instances are you monitoring? You might have to break the alerts out to look at just a specific instance (add 'AND id=<id from COND>' to the alert SQL).
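If you end up with one alert per instance, a small helper can generate the SQL text for each one so the copies stay consistent. This is only a sketch: the helper name and the example ids are hypothetical, and it simply appends the id filter to the alert SQL posted earlier in this thread. Look up the real ids in COND first.

```python
# Alert SQL from earlier in the thread (T-SQL, run by DPA, not by this script).
BASE_ALERT_SQL = (
    "select NAME+' Monitor is stopped', 1 from [ignite].[COND] "
    "WHERE [STATUS] = 'STOPPED' and [COMMAND] = 'START'"
)


def build_instance_alert_sql(instance_id):
    """Return the alert SQL restricted to a single instance id."""
    # int() raises ValueError on non-numeric strings, so stray text
    # cannot end up inside the generated SQL.
    return f"{BASE_ALERT_SQL} AND id = {int(instance_id)}"


if __name__ == "__main__":
    # Hypothetical ids; replace with the ids found in COND.
    for instance_id in (1, 2):
        print(build_instance_alert_sql(instance_id))
```

Each printed statement can then be pasted into its own custom alert.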