How best to document "Why" an application is in a down state?

We're about a year into our Orion deployment now, and certain blind spots have begun to show up for us. One popped up this morning, in fact.

In SAM, we have an application monitor for Oracle Database Availability. It reported "down" a couple hours ago, and the down state just came back.

The reason it's in a down state is an Oracle error:

"Oracle returned an error: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor."

This is expected because the database is being rebooted, but I wouldn't have seen the error if I hadn't been looking at the application summary at the exact moment it reported down.

My question: Is there a way in Orion to log, or otherwise view after the fact, the error or cause behind an application reporting "down"? I suspect I might be missing something obvious, but I wanted to throw it out to the community anyway.

  • That should appear in the text part of the component. There is no history of the text retained, so you can only see the text message for the current state.

    I have not used the Oracle monitor, so I have not seen this in this specific case.

  • Exactly. The component text only being available for the current state is the gap I'm trying to close. How have other users gone about capturing that historical data? Is this something where logging might help?

  • You need the history stored in another table.

    When the data changes, the old text data is put into a secondary table with the timestamp from the original data. This is sometimes referred to as a Z table.

    I would do this with a regular SQL job (running every few minutes) or a SQL trigger. A trigger is how it is usually done.

  • That makes a lot of sense, although it is not ideal, as I was hoping to avoid doing any SQL configuration.

    Has anyone ever put in a feature request for this? The only argument against it I can think of is size. If you track every result of every component monitor, those tables would become huge. Logging only failures would be better, but could also blow up quickly.

    Still, this gives me a lot to think about. Thank you so much for your help.

  • Actually, if you store the table (or at least the varchar columns) compressed, it will be quite small, because of the large amount of duplicate text. Messages repeat a lot, and when there is no error message, NULL is easy to store.
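
For anyone who wants to experiment with the trigger approach suggested above, here is a minimal sketch. It uses Python's built-in sqlite3 module purely so it can be run anywhere. The table names, column names, and timestamps are hypothetical and are not the real Orion/SAM schema, and the Orion database runs on SQL Server, where the trigger syntax differs. The trigger copies the outgoing row into a history table only when it carried a message and that message is changing, which addresses the size concern by skipping repeats and NULLs.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT the real Orion/SAM tables.
SCHEMA = """
CREATE TABLE component_status (
    component_id INTEGER PRIMARY KEY,
    status       TEXT NOT NULL,
    message      TEXT,
    updated_at   TEXT NOT NULL
);

CREATE TABLE component_status_history (
    component_id INTEGER NOT NULL,
    status       TEXT NOT NULL,
    message      TEXT,
    recorded_at  TEXT NOT NULL
);

-- Before the current-state row is overwritten, keep the old row if it
-- held a message and the message is about to change (IS NOT is NULL-safe).
CREATE TRIGGER keep_old_message
BEFORE UPDATE ON component_status
FOR EACH ROW
WHEN OLD.message IS NOT NULL AND OLD.message IS NOT NEW.message
BEGIN
    INSERT INTO component_status_history
    VALUES (OLD.component_id, OLD.status, OLD.message, OLD.updated_at);
END;
"""


def open_db():
    """Create an in-memory database with the sketch schema and trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_poll(conn, component_id, status, message, polled_at):
    """Simulate a poll result overwriting the single current-state row."""
    cur = conn.execute(
        "UPDATE component_status SET status = ?, message = ?, updated_at = ? "
        "WHERE component_id = ?",
        (status, message, polled_at, component_id),
    )
    if cur.rowcount == 0:  # first poll for this component: no row yet
        conn.execute(
            "INSERT INTO component_status VALUES (?, ?, ?, ?)",
            (component_id, status, message, polled_at),
        )


def history(conn, component_id):
    """Return (status, message, recorded_at) rows kept by the trigger."""
    return conn.execute(
        "SELECT status, message, recorded_at FROM component_status_history "
        "WHERE component_id = ? ORDER BY recorded_at",
        (component_id,),
    ).fetchall()


if __name__ == "__main__":
    conn = open_db()
    error = "ORA-12514: TNS:listener does not currently know of service"
    # Made-up sample timestamps.
    record_poll(conn, 1, "Up", None, "2021-01-27 08:00")
    record_poll(conn, 1, "Down", error, "2021-01-27 08:05")
    record_poll(conn, 1, "Down", error, "2021-01-27 08:10")  # repeat, not logged
    record_poll(conn, 1, "Up", None, "2021-01-27 08:15")     # error row archived
    for row in history(conn, 1):
        print(row)
```

Note that the archived row carries the timestamp of the last poll that reported the message, since that is what the current-state row held when it was overwritten. A scheduled job that compares the current text against the latest history row would be an alternative if adding triggers to the monitoring database is not acceptable.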