Custom SQL Node Alert:
Where Not IP_Address in (
where [DateTime] > DATEADD(HOUR,-<_X_HOURS>,GETDATE())
group by IP
having COUNT(IP) > <_X_NUMBER>
Is that just for 1 specific node? I want to be alerted if any node hasn't sent any syslog messages in 48 hours. Not sure what the variables are in your SQL statement.
Thanks for the reply.
Open the Alert manager and change it to custom SQL sub type Node, it fills out the "Select Blah from blah", but then allows you to choose where and/or joins to get the data you want.
That querry would alert individually for each Node that doesn't have records in syslog for x period - you case might be to replace <_X_HOURS> with 48 and <_X_NUMBER> with 0, although sence you now are saying "any syslog messages" vs "been less than (x) number of syslog messages", it might be better to use "SELECT distinct IP FROM SysLog where [DateTime] > DATEADD(HOUR,-<_X_HOURS>,GETDATE())" so SQL doesn't need to count the records.
if you want specific nodes, you could limit it down by adding " and Caption in ('routera','serverb','firewallx') "
lots of math up there.
just add a field to the nodes table that stores the date/time of the last syslog you recieved from that address.
then write an alert to check that field vs current date/time
the one alert above looks like it would be reading the syslog table every minute (polling cycle) which could really kill NPM perfomance if it has to read through a few million syslogs for all the IPs when depending on the interval you set with X
this should keep the load from this one alert to a minimum
The table is indexed on those fields, but good point, it is extra overhead but if you are looking for 48 hours, you could set your polling cycle to be something like 6 hours, then it wouldn't be much of a hit
(oh, did you notice you can't post ill preceded by a k, it masks it out as though it is foul language. It makes it look worse, like you actually used a curse word... silly thwack)
Well, that works, let me throw another wrinkle in here for you. Sorry about adding additional criteria, I guess I originally didn't have a good focus on exactly what I wanted, and as things came into focus and I started seeing results, then there were some things I needed to refine.
The SQL script works. Here's what I plugged into SQL to get results:
Select distinct Hostname from SysLog
Where Not Hostname in (
where [DateTime] > DATEADD(HOUR,-48,GETDATE())
group by Hostname
having COUNT(IP) > 1
However, there's 2 caveats to this:
1- There's some things I don't want in the results. For example, customer firewalls that I just do icmp monitoring on the outside interface that will never send me syslogs. I have a custom node property that I can use as a filter field. The field is called Hardware_Type. So Nodes.Hardware_Type is the column. For example, exclude any result where Nodes.Hardware_Type = 'CUSTOMER_FIREWALL'. I imagine this requires an inner join on Node.nodeid=syslog.nodeid
2- This will only work for nodes that have previously sent in a syslog message within the table trim period. If I add a host and it never has sent a syslog message then it won't show. There has to be a way to cross reference results to the existing nodes in the Nodes table.
Urgh. Getting more complicated than I originally thought. Isn't that always the case though?
I appreciate the responses. Please help with this one last (I promise!) teensie weensie little hurdle.
Select Caption From Nodes
where not IP_Address in (Select distinct IP from syslog where [DateTime] > DATEADD(HOUR,-48,GETDATE()))
and (Hardware_type not in ('CUSTOMER_FIREWALL','SIMPLEDEVICE','WORKSTATION') or Harware_type is null)
I don't trust syslog to resolve the hostname, but it can't mess up the IP, but you can change it if you like.
Okay, that works in SQL, but not in Orion. WTF?
In SQL I get the correct results. When I setup the alert in Orion it triggers on all the hosts. WTF?
Still can't figure this out. It's driving me crazy. I plug this into SQL:
Select nodes.Caption From Nodes
where not nodes.IP_Address in (Select distinct syslog.IP from syslog where syslog.DateTime > DATEADD(HOUR,-48,GETDATE()))
and (nodes.Hardware_Type not in ('CUSTOMER_FIREWALL','VMWARE_HOST', 'CLUSTER', 'SAN') or Nodes.Hardware_Type is null)
And I get 15 results, all valid.
But when I setup my alert:
It's triggering on all the hosts. Help!
Nevermind. I'm dumb. My first go around with Custom SQL Alerts. I didn't notice the select statement at the top was already there. Working correctly now!
he he yeah, thats why I didn't include the "select blah from blah" in my first post...
The thing I like the most about the Orion products is that the only limit is how imaginative you can get, the product can do it as long as you find the right spot to tweak.
Plus this forum has so many good idea's in it that you can tweak and steal for oneself.