Architecture overview and alert levels (P1, P2, etc.) and their respective thresholds?

Hi, could you please help me prepare this? I need to prepare a low-level design. I added a few points to the document, but the customer was not happy with them. Can you please give me a few ideas about the architecture overview and about alerts?

  • Hey,

    Are you asking for guidance on how to build the SolarWinds environment? There is documentation for that. Some things you need to think about are the number of modules you are using and the number of nodes you are monitoring.

    Links:

    - Build documentation:

    https://documentation.solarwinds.com/en/Success_Center/orionplatform/Content/Core-Multi-Module-System-Guidelines.htm

    - Previous discussion on this topic (I used some of its guidance to build our environment):

    https://thwack.solarwinds.com/t5/THWACK-Community-Discussions/I-will-build-a-new-SQL/td-p/129831

    Hope this helps some.

  • Does SolarWinds have any alert levels like P1, P2, etc.? I also need a low-level design template.

  • I doubt anyone has a design template to offer you; you'll have to do that legwork yourself.
  • Just to define a couple of things that I have learned over the years. I know that some of this is very basic, but a set of common definitions is needed. And of course I may be totally wrong, but this is how I manage my environment.

    • Monitoring is the state of the element (Node/Interface/Volume).
      I have found that the threshold levels set for an element are not used for alerting. They are used to represent the state of the element.
    • Alerting is the action to be taken when a defined level is breached.
      Setting up an alert can be tricky. You need a clear understanding of what is not normal for your environment, and that understanding is what lets you set sensible thresholds. Talk to the teams that want alerting. For a listing of the available actions, see "Alert actions available in the Orion Platform" in the SolarWinds documentation.

    Now to your question about P1, P2, etc.

    This sounds like you are using ServiceNow for your ticketing. We use that too, but not the Orion integration with ServiceNow. We go via Moogsoft (an event correlation manager), which integrates with both Orion and ServiceNow. Moogsoft pulls the Event table and checks for new events.

    The “secret” is to populate the event with the information needed to create that INC in ServiceNow, so each alert is assigned a “P” level. At first I hard-coded the level in each alert, which worked until “they” wanted to change the level.

    I created custom properties for my alerts. This makes it much easier to change the level.

    Alert_Action – the actions taken by this alert
    INC_Assign_Team – the team that is to action the alert
    INC_Priority – the “P” level of the alert (P1 – P4)
    ResponsibleTeam – the team that manages the environment

    The core of this is the formatting of the alert and Perfmon event. Below is a basic alert that I use for a node down. I use the idea of BREACH and NORMALIZED.

    The Trigger Action
    ORION ALERT: ${N=Alerting;M=AlertName} BREACH ${N=SwisEntity;M=Caption};

    EVENT TIME: ${N=Alerting;M=AlertTriggerTime;F=DateTime};
    EVENT ID: ${N=Alerting;M=AlertActiveID};
    ALERT NAME: ${N=Alerting;M=AlertName};
    ALERT ID: ${N=Alerting;M=AlertID};
    NODE: ${N=SwisEntity;M=Caption};
    NODE IP: ${N=SwisEntity;M=IP_Address};
    [
    SITE CODE: ${N=SwisEntity;M=CustomProperties.SiteCode};
    SEVERITY: ${SQL:SELECT INC_Priority FROM AlertConfigurationsCustomProperties WHERE AlertID = ${N=Alerting;M=AlertID}};
    ASSIGN TEAM: ${SQL:SELECT INC_Assign_Team FROM AlertConfigurationsCustomProperties WHERE AlertID = ${N=Alerting;M=AlertID}};
    OPTIER1: Malfunction;
    OPTIER2: Connectivity;
    PRODUCT NAME: Network Connection;
    PRODTIER1: Hardware;
    PRODTIER2: Network;
    PRODTIER3: Connection;
    ]

    The information within the [ ] is what is used to create the INC.
    As you can see, I use a lot of variables. The one that affects the INC “P” level is

    SEVERITY: ${SQL:SELECT INC_Priority FROM AlertConfigurationsCustomProperties WHERE AlertID = ${N=Alerting;M=AlertID}};
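To illustrate why keeping the level in a custom property is easier than hard-coding it, here is a minimal Python sketch of the same lookup. It uses an in-memory SQLite table as a stand-in for the Orion table named in the macro above; the real schema may differ, and the alert ID (101) and team name (Network-Ops) are hypothetical values chosen only for the example.

```python
import sqlite3

# Stand-in for the Orion table named in the post; the real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AlertConfigurationsCustomProperties ("
    "AlertID INTEGER PRIMARY KEY, INC_Priority TEXT, INC_Assign_Team TEXT)"
)
conn.execute(
    "INSERT INTO AlertConfigurationsCustomProperties VALUES (?, ?, ?)",
    (101, "P2", "Network-Ops"),  # hypothetical alert ID and team name
)


def severity_for(alert_id: int) -> str:
    """Mirror the SEVERITY lookup: one row per alert, keyed by AlertID."""
    row = conn.execute(
        "SELECT INC_Priority FROM AlertConfigurationsCustomProperties "
        "WHERE AlertID = ?",
        (alert_id,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no custom properties for alert {alert_id}")
    return row[0]


print(severity_for(101))  # P2

# Changing the level is a data change; the alert message text stays untouched.
conn.execute(
    "UPDATE AlertConfigurationsCustomProperties "
    "SET INC_Priority = ? WHERE AlertID = ?",
    ("P1", 101),
)
print(severity_for(101))  # P1
```

The point of the design is that the message template never changes when a team asks for a different priority; only the property value does.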

    We also use the P level of P0.  This will close the INC.

    The Reset Action

    ORION ALERT: ${N=Alerting;M=AlertName} NORMALIZED ${N=SwisEntity;M=Caption};
    EVENT START TIME: ${N=Alerting;M=AlertTriggerTime;F=DateTime};
    EVENT END TIME: ${DateTime};
    EVENT DURATION: ${N=Alerting;M=DownTime} Minutes;
    ALERT NAME: ${N=Alerting;M=AlertName};
    NODE: ${N=SwisEntity;M=Caption};
    NODE IP: ${N=SwisEntity;M=IP_Address};
    [
    SITE CODE: ${N=SwisEntity;M=CustomProperties.SiteCode};
    SEVERITY: P0;
    ]
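As a rough illustration of how a downstream tool could consume these messages, the Python sketch below pulls the KEY: value; pairs out of the [ ] block and treats P0 as "close the INC". This is not how Moogsoft or ServiceNow actually implement it; the function names and the sample node, site, and team values are assumptions made up for the example.

```python
import re

# Matches lines shaped like "SITE CODE: LON1;" (upper-case key, trailing semicolon).
FIELD = re.compile(r"^\s*([A-Z0-9 ]+):\s*(.*?);\s*$")


def parse_inc_fields(message: str) -> dict[str, str]:
    """Return the KEY: value; pairs found between the [ and ] lines."""
    fields: dict[str, str] = {}
    inside = False
    for line in message.splitlines():
        stripped = line.strip()
        if stripped == "[":
            inside = True
        elif stripped == "]":
            inside = False
        elif inside:
            match = FIELD.match(line)
            if match:
                fields[match.group(1).strip()] = match.group(2)
    return fields


def inc_action(fields: dict[str, str]) -> str:
    """P0 closes the incident; any other level opens or updates one."""
    return "close" if fields.get("SEVERITY") == "P0" else "open"


# Hypothetical rendered messages (variables already substituted by Orion).
trigger = """ORION ALERT: Node Down BREACH core-sw-01;
NODE: core-sw-01;
[
SITE CODE: LON1;
SEVERITY: P2;
ASSIGN TEAM: Network-Ops;
]"""

reset = """ORION ALERT: Node Down NORMALIZED core-sw-01;
[
SITE CODE: LON1;
SEVERITY: P0;
]"""

print(inc_action(parse_inc_fields(trigger)))  # open
print(inc_action(parse_inc_fields(reset)))    # close
```

Lines outside the brackets (such as NODE) are ignored on purpose, which matches the idea that only the bracketed block drives the INC.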

    So, a long answer to a short question: Orion does not directly support “P levels”, at least not that I have found, but you can create a process to use them. This is one of the strengths of the Orion Platform: the ability to create and use custom properties.

    I hope this is of some help.