How best to document "Why" an application is in a down state?

Question

We're about a year into our Orion deployment now, and certain blind spots have begun to show up for us. One popped up this morning, in fact.

In SAM, we have an application monitor for Oracle Database Availability. It reported "down" a couple hours ago, and the down state just came back.

The reason it's in a down state is an Oracle error:

"Oracle returned an error: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor."

This is expected because the database is being rebooted, but I wouldn't have seen the error if I hadn't been looking at the application summary at the exact moment it reported down.

My question: Is there a way in Orion to log or otherwise view the errors, or the cause, behind an application reporting "down"? I suspect I might be missing something obvious, but I wanted to throw it out to the community anyway.

frak · Accepted Answer

You need the history stored in another table.

When the data changes, the old text data is copied into a secondary table, keeping the timestamp from the original row. This is sometimes referred to as a Z table.

I would do this by a regular SQL job (every few minutes), or a SQL trigger. A trigger is how it is usually done.
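
To make the trigger idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not Orion's actual schema or SQL dialect: the table names (app_status, app_status_z), the columns, and the timestamps are all made up for illustration, and a real Orion database would need the equivalent trigger written for its own database engine and tables.

```python
import sqlite3

# In-memory database standing in for the monitoring database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical "current state" table: one row per monitored application.
CREATE TABLE app_status (
    app_id     INTEGER PRIMARY KEY,
    status     TEXT NOT NULL,
    message    TEXT,
    updated_at TEXT NOT NULL
);

-- Hypothetical Z table: receives the old row each time the state changes.
CREATE TABLE app_status_z (
    app_id     INTEGER NOT NULL,
    status     TEXT NOT NULL,
    message    TEXT,
    updated_at TEXT NOT NULL
);

-- Copy the OLD values, with their original timestamp, into the Z table
-- whenever the status or the message text actually changes.
CREATE TRIGGER app_status_keep_history
AFTER UPDATE ON app_status
WHEN OLD.status IS NOT NEW.status OR OLD.message IS NOT NEW.message
BEGIN
    INSERT INTO app_status_z (app_id, status, message, updated_at)
    VALUES (OLD.app_id, OLD.status, OLD.message, OLD.updated_at);
END;
""")

# Simulate the monitor: up, then down with the Oracle error, then up again.
conn.execute(
    "INSERT INTO app_status VALUES (1, 'Up', NULL, '2024-01-01 08:00:00')"
)
conn.execute(
    "UPDATE app_status SET status = 'Down', message = ?, updated_at = ? "
    "WHERE app_id = 1",
    ("ORA-12514: TNS:listener does not currently know of service "
     "requested in connect descriptor", "2024-01-01 10:00:00"),
)
conn.execute(
    "UPDATE app_status SET status = 'Up', message = NULL, updated_at = ? "
    "WHERE app_id = 1",
    ("2024-01-01 12:00:00",),
)

# The Z table now holds every earlier state, including the down reason.
history = conn.execute(
    "SELECT app_id, status, message, updated_at "
    "FROM app_status_z ORDER BY updated_at"
).fetchall()
for row in history:
    print(row)
```

After the application has recovered, the reason it was down is still available by querying the Z table for rows where status is 'Down'. The scheduled-job variant would do the same copy from a periodic query instead of a trigger, at the cost of missing any change that comes and goes between two runs.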
