2 Replies Latest reply on Jun 8, 2018 8:51 AM by joe.tran

    How can I make SolarWinds interpret a timeout as "Application Down"?


      I have a custom Perl script that is run by the SW agent on one of our AIX servers. It uses the IO::Socket::INET library to open a connection to a homegrown app and send an XML string to that app. It then interprets the result and ascertains if any of the returned data indicates a problem that we want to alert on.


      However, there's an issue that we've been trying to track down whereby this homegrown app stops responding to my script. Eventually the polling of the custom app monitor hits the configured timeout, and this happens:


          Application "MyApp" on node "mynode" is in an unknown state.


      OK, that's nice, but for this application an unknown state is effectively the same as being down. If the app actually does go down, we already have an alert configured to notify the appropriate people.


      So my question is this: how can I tell the custom script monitor that a timeout actually means that the app is down, rather than in an unknown state?


      I realize I could make my Perl script smarter by setting a timeout of its own in the socket code, but since SAM already has timeout handling built into the monitor framework, I figured there must be a way to do this with a simple configuration change.
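

      For reference, the script-side workaround I have in mind would look roughly like the sketch below. It is not my actual script: the host, port, deadline, request string, and the one-line reply are all placeholders, and the exit codes (0 = Up, 1 = Down) and the "Message:" / "Statistic:" output lines reflect my understanding of the SAM script monitor convention, so treat them as assumptions to verify. The idea is to give the script its own deadline, shorter than the monitor's configured timeout, so that a hang is reported as Down before SAM gives up and reports Unknown.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Exit codes as I understand the SAM script monitor convention
# (assumption): 0 = Up, 1 = Down.
use constant { EXIT_UP => 0, EXIT_DOWN => 1 };

# Returns (exit_code, message). Any failure, including a hang, maps to Down.
sub check_app {
    my ($host, $port, $deadline, $request) = @_;

    my $reply = eval {
        # alarm() bounds the whole exchange; the constructor's Timeout
        # option mainly bounds the connect step.
        local $SIG{ALRM} = sub { die "timed out after ${deadline}s\n" };
        alarm $deadline;

        my $sock = IO::Socket::INET->new(
            PeerAddr => $host,
            PeerPort => $port,
            Proto    => 'tcp',
            Timeout  => $deadline,
        ) or die "connect failed: $@\n";

        print {$sock} $request;
        my $line = <$sock>;    # assumption: the app answers with one line
        alarm 0;
        close $sock;
        die "no reply\n" unless defined $line;
        $line;
    };
    my $err = $@;
    alarm 0;    # make sure no alarm is left pending after a failure

    unless (defined $reply) {
        chomp $err;
        return (EXIT_DOWN, "MyApp check failed: $err");
    }
    chomp $reply;
    # The real script would inspect $reply for problem indicators here.
    return (EXIT_UP, "MyApp replied: $reply");
}

unless (caller) {
    # Hypothetical host, port, deadline (seconds) and request string.
    my ($code, $msg) = check_app('127.0.0.1', 9000, 20, "<status/>\n");
    print "Message: $msg\n";
    # Hypothetical statistic: 0 when the check passed, 1 when it failed.
    print "Statistic: ", ($code == EXIT_UP ? 0 : 1), "\n";
    exit $code;
}
```

      The deadline of 20 seconds is arbitrary; it only needs to be comfortably below whatever timeout is configured on the monitor itself.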


      Any suggestions would be much appreciated.