Orion Server

Version 18

    This template assesses the status of Windows services related to SolarWinds Orion servers.

     

    Prerequisites: WMI access to the target server.

    Credentials: Windows Administrator on the target server.

     

    Monitored Components:

    SolarWinds Orion Job Engine

    This monitor returns the CPU and memory usage of the SolarWinds Orion Job Engine service. This service is used to perform recurring work. This service creates various Job Engine Worker processes for scalability and robustness. The job engine writes information about each job to its database.

     

    SolarWinds Orion Module Engine

    This monitor returns the CPU and memory usage of the SolarWinds Orion Module Engine service. This service is used to talk to the database.

     

    SolarWinds Orion Job Scheduler

    This monitor returns the CPU and memory usage of the SolarWinds Orion Job Scheduler service. The Job Scheduler service dispatches work to local and/or remote job engines.

     

    SolarWinds Syslog Service

    This monitor returns the CPU and memory usage of the SolarWinds Syslog service. This service is responsible for logging events in log files.

     

    SolarWinds Alerting Service V2

    This monitor returns the CPU and memory usage of the SolarWinds Alerting Service V2. This service is responsible for evaluating alert conditions, triggering alerts and running alert actions.

     

    SolarWinds Alerting Engine

    This monitor returns the CPU and memory usage of the SolarWinds Alerting Engine service. This service is responsible for Advanced Alerting.

     

    SolarWinds Website

    This component monitor tests a web server's ability to accept incoming sessions and transmit the requested page. The component monitor can optionally search the delivered page for specific text strings and pass or fail the test based on that search. By default, it monitors TCP port 80.

     

    SolarWinds Job Engine v2

    This monitor returns the CPU and memory usage of the SolarWinds Job Engine v2 service. This service is used to perform recurring work. This service creates various Job Engine Worker processes for scalability and robustness. The job engine writes information about each job to its database.

     

    SolarWinds Collector Service

    This monitor returns the CPU and memory usage of the SolarWinds Collector service. This service takes part in data synchronization between the poller and the Orion database.

     

    SolarWinds Collector Data Processor

    This monitor returns the CPU and memory usage of the SolarWinds Collector Data Processor service. This service is responsible for volume and node data synchronization between the Collector and the Standard Poller.

     

    SolarWinds Collector Management Agent

    This monitor returns the CPU and memory usage of the SolarWinds Collector Management Agent service. This service takes part in data synchronization between the Collector and the Standard Poller.

     

    SolarWinds Collector Polling Controller

    This monitor returns the CPU and memory usage of the SolarWinds Collector Polling Controller service. This service takes part in data synchronization between the Collector and the Standard Poller.

     

    SolarWinds Information Service

    This monitor returns the CPU and memory usage of the SolarWinds Information service. This service is used by websites to talk to the database. This service is also responsible for how the pollers talk to each other.

     

    SolarWinds Information Service V3

    This monitor returns the CPU and memory usage of the SolarWinds Information service V3. This service is used by websites to talk to the database. This service is also responsible for how the pollers talk to each other.

     

    SolarWinds JMX Bridge

    This monitor returns the CPU and memory usage of the SolarWinds JMX Bridge service. The JMX Bridge is only used if you are monitoring Java Application Servers such as WebSphere, WebLogic, or Apache Tomcat via JMX.

    Note: By default this monitor is disabled.

     

    SolarWinds Trap Service

    This monitor returns the CPU and memory usage of the SolarWinds Trap service. This service is responsible for catching and logging trap events.

     

    File Count Monitor - JET Files

    This monitor returns the number of JET files in C:\Windows\Temp. Too many of these files prevent new DB connections and cause polling to halt. The returned value should be less than 65,530. These files can be deleted. They usually remain on the system only because an application that used them to access a database crashed, so the files were never properly deleted. No more than 65K files should be in this folder.
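
    A quick manual count can confirm what the monitor reports. The sketch below is illustrative only: the function names are hypothetical, and the file name pattern JET*.tmp is an assumption about how these temporary files are named, so adjust it to match what is actually in the folder.

```python
from pathlib import Path

# Threshold taken from the monitor description.
JET_FILE_LIMIT = 65_530


def count_jet_files(folder: str, pattern: str = "JET*.tmp") -> int:
    """Count files directly inside folder whose names match pattern.

    The default pattern is an assumption, not a documented naming rule.
    """
    return sum(1 for entry in Path(folder).glob(pattern) if entry.is_file())


def is_below_limit(count: int, limit: int = JET_FILE_LIMIT) -> bool:
    """Return True while the file count is under the limit."""
    return count < limit
```

    On the Orion server this would be called as count_jet_files(r"C:\Windows\Temp").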

     

    MSMQ Messages in Queue

    This is the total number of Message Queuing messages that currently reside in the selected queue. When the Data Processor receives more results into MSMQ than it is able to process and pass to the Standard Poller, MSMQ continues growing. The size of MSMQ should be near 0 almost all of the time. Some spikes may appear, but the Data Processor needs to be able to clean up the MSMQ quickly, otherwise it will not be able to handle DB blackouts or maintenance. (Standard Poller performance is affected by DB performance significantly.)

    Note: Before using this counter, you should set the correct instance beginning with:
    <HOSTNAME>\private$\solarwinds\collector\processingqueue

    where <HOSTNAME> is the hostname (without < >) of the target server. For example: APMhost.

    By default, the instance is set to: <HOSTNAME>\private$\solarwinds\collector\processingqueue\solarwinds.node.hardwarehealth.wmi

    All available instances can be found by running the perfmon utility and searching for “Messages in Queue” counter in the “MSMQ Queue” category.

    Note: This monitor is disabled by default.
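
    The instance name follows a fixed pattern, so it can be built from the hostname. This is a small sketch with a hypothetical helper name (msmq_instance); the queue path and the default sub-queue are the ones quoted in this section.

```python
# Queue path quoted in the note above.
QUEUE_PATH = r"private$\solarwinds\collector\processingqueue"


def msmq_instance(hostname: str, subqueue: str = "") -> str:
    """Build the perfmon instance name for the collector processing queue."""
    if "<" in hostname or ">" in hostname:
        raise ValueError("use the bare hostname, without < >")
    parts = [hostname, QUEUE_PATH]
    if subqueue:
        parts.append(subqueue)
    # Components are joined with a single backslash.
    return "\\".join(parts)
```

    For the example host, msmq_instance("APMhost", "solarwinds.node.hardwarehealth.wmi") yields the default instance with <HOSTNAME> replaced by APMhost.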

     

    Perfmon DPPL Avg. Time to Process Item

    This monitor returns the time needed to process one item. If this number is 1, it means you are able to process one item per second. 0.01 means 100 items per second. The returned value should be as low as possible.
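
    The relationship between the returned value and throughput is a simple reciprocal. The sketch below only illustrates that arithmetic; the function name is hypothetical, and it assumes the counter is expressed in seconds per item, as the examples above imply.

```python
def items_per_second(avg_seconds_per_item: float) -> float:
    """Convert an average per-item processing time into items per second."""
    if avg_seconds_per_item <= 0:
        raise ValueError("average time must be positive")
    return 1.0 / avg_seconds_per_item
```

    With the values from the text, 1 gives 1 item per second and 0.01 gives 100 items per second.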

     

    Perfmon DPPL Waiting Items

    This monitor returns items in the queue pulled from the message queue but waiting for other results to be processed. This should be less than 40. If this number is holding at or above 40, this may indicate issues concerning DB response time, performance issues, or many down elements.

     

    MSMQ Folder Size

    This monitor returns the MSMQ folder size. This monitor should be less than 800 MB. MSMQ maximum size is 1GB. If the 1GB limit is reached, polling will stop working correctly.

    To increase the MSMQ size, open Computer Management > Features > Message Queuing. From here, right-click and change the MSMQ storage limit from 1 GB to 1.5 GB. For Windows Server 2003, this is found under the Storage section.

    See: http://knowledgebase.solarwinds.com/kb/questions/3510/Microsoft+Message+Queue+Fills+Directory+with+Orphaned+Files.
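
    The folder size can also be checked by hand against the thresholds above. This is a rough sketch, not how the monitor itself works: the function names and status labels are hypothetical, MB and GB are assumed to be binary units (1 MB = 1024 * 1024 bytes), and the path to the MSMQ storage folder must be supplied by the caller.

```python
import os

# Thresholds from the text, assuming binary units.
WARN_BYTES = 800 * 1024 * 1024   # 800 MB guideline
LIMIT_BYTES = 1024 ** 3          # 1 GB default MSMQ limit


def folder_size_bytes(folder: str) -> int:
    """Sum the sizes of all files under folder, including subfolders."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file vanished or is not readable; skip it.
                continue
    return total


def msmq_status(size_bytes: int) -> str:
    """Classify a folder size against the 800 MB and 1 GB thresholds."""
    if size_bytes >= LIMIT_BYTES:
        return "critical"
    if size_bytes >= WARN_BYTES:
        return "warning"
    return "ok"
```

    If the MSMQ limit has been raised to 1.5 GB as described above, LIMIT_BYTES should be adjusted to match.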

     

    Process Monitor - SWJobEngineWorker2.exe

    This monitor returns the number of Job Engine worker processes and their CPU and memory usage. A value of 10 or lower is acceptable. If the returned value is 100 or greater, there may be problems with jobs hanging.

     

    Job Engine v2: Jobs Queued

    This monitor returns the number of jobs waiting for execution due to insufficient resources. This value should be less than 10 at all times.

     

    Job Engine v2: Jobs Lost

    This monitor returns the number of lost jobs. This value should be zero at all times.

     

    Job Engine v2: Jobs Running

    This monitor returns the number of jobs currently running.

     

    Job Engine v2: Worker Processes

    This monitor returns the number of worker processes used. A value of 20 or lower is acceptable. If the returned value is 100 or greater, there may be problems with jobs hanging.

     

    Job Scheduler v2: Average Execution Delay

    This monitor returns the average delay, in milliseconds, between the time when the job is supposed to be executed and the time that it is actually executed. This value should be less than 100,000 (that is, 100 seconds).

     

    Job Scheduler v2: Results Notified Error

    This monitor returns the number of errors that occurred when sending the results back. This value should be zero at all times.

     

    RabbitMQ Service Monitor

    This monitor returns information about RabbitMQ services running on a node with the Windows operating system.

     

    SSL Listeners Port Monitor

    This monitor returns information about TCP port 5671, on which RabbitMQ listens for SSL connections. This setting is controlled by the rabbit ssl_listeners argument to RabbitMQ.
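
    Whether the listener is reachable can be confirmed with a plain TCP connection attempt. The sketch below uses a hypothetical helper name and only checks that something accepts connections on the port; it does not perform an SSL handshake or verify that the listener is RabbitMQ.

```python
import socket


def port_is_listening(host: str, port: int = 5671, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False
```

    For example, port_is_listening("orion-host") tests the default port 5671 on a placeholder host name.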

     

    RabbitMQ Folder Size

    This monitor returns the Orion RabbitMQ folder size. If the folder is growing, either RabbitMQ is writing undelivered messages to disk, or the machine is under memory pressure.

    Note: This monitor is disabled by default.

     

    SWIS PubSub Messages Queued

    This is the total number of messages that currently reside in the SWIS PubSub queue. When a publisher sends more messages than subscribers are able to process, or if there are any message delivery issues, the RabbitMQ queue continues growing. The size of the queue should be near 0 almost all of the time. Some spikes may appear, but SWIS needs to be able to clean up the queue quickly.

    Note: This monitor is disabled by default.

     

     

    Last updated: 1-12-2017