AIX

This template assesses the performance of the AIX operating system installed on the target server. Perl scripts are used for monitoring the performance of queries.

Download and install the NET-SNMP agent on the AIX server. Visit the SolarWinds Success Center and see the Configure Net-SNMP for Linux devices article for details.

Click here to learn about prerequisites and individual component monitors in this template, as included in the SAM Template Reference.

Note: This template was updated in SAM 2019.4 to support localization updates planned for future SAM releases. To learn about importing this template, see Import and export SAM templates.

Prerequisites: SSH and Perl installed on the target server.

Credentials: Root credentials on the target server.

Monitored Components:

Note: You need to set thresholds for counters according to your environment. It is recommended to monitor counters for some period of time to understand potential value ranges and then set the thresholds accordingly. For more information, see Manage thresholds in SAM.
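One possible way to turn an observation period into thresholds is sketched below. The rule used here (mean plus two standard deviations for warning, plus three for critical) is an assumption chosen for illustration, not a SAM recommendation, and the function name and sample values are hypothetical.

```python
import statistics


def suggest_thresholds(samples, higher_is_worse=True):
    """Suggest (warning, critical) thresholds from baseline samples.

    Assumption: warning = mean +/- 2 standard deviations and
    critical = mean +/- 3 standard deviations. For counters where low
    values are the problem (for example free memory), pass
    higher_is_worse=False so the offsets are subtracted instead.
    """
    mean = statistics.mean(samples)
    spread = statistics.pstdev(samples)
    sign = 1 if higher_is_worse else -1
    warning = mean + sign * 2 * spread
    critical = mean + sign * 3 * spread
    return warning, critical


# Hypothetical baseline: CPU wait (%) collected over an observation period.
baseline = [10, 12, 11, 13, 14]
warning, critical = suggest_thresholds(baseline)
print(f"warning={warning:.1f} critical={critical:.1f}")
```

Review the suggested values against what you know about the server before entering them in SAM; a short or unrepresentative baseline produces misleading thresholds.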

CPU statistic (%)

This monitor returns the percentage of CPU time used. The returned values are as follows:

User – This component returns the percentage of CPU time spent running non-kernel (user) code. This statistic depends on the programs that the user is running. It is recommended to use the lowest threshold possible.

System – This component returns the percentage of CPU time spent running the system kernel code (system time). It is recommended to use the lowest threshold possible.

Wait – This component returns the percentage of CPU time waiting for I/O. It is recommended to use the lowest threshold possible.

Idle – This component returns the percentage of CPU time spent idle. It is recommended to use the highest threshold possible at all times.

System faults statistic/sec

This monitor returns the rate of system faults, per second. The returned values are as follows:

Interrupts – This component returns the number of interrupts per second. The threshold for this component depends on the processor. For modern CPUs, a threshold of 1,500 interrupts/sec is acceptable. A dramatic increase in this value, without a corresponding increase in system activity, indicates a hardware problem.

System_Calls – This component returns the number of system calls per second. This is a measure of how busy the system is handling applications and services. High System Calls/sec indicates high utilization caused by software. With today's faster CPUs, 20,000 would represent a reasonable threshold.

Context_Switches – This component returns the number of context switches per second. High activity rates can result from inefficient hardware or poorly designed applications. The normal number of context switches per second depends on your servers and applications. The threshold for Context Switches/sec is cumulative for all processors, so allow at least 14,000 per processor (single=14,000, dual=28,000, quad=56,000, and so forth).
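The per-processor scaling described for Context_Switches is simple multiplication. A minimal sketch, with a hypothetical function name:

```python
# Baseline from the text above: 14,000 context switches/sec per processor.
BASE_CONTEXT_SWITCHES_PER_CPU = 14_000


def context_switch_threshold(cpu_count):
    """Return the cumulative Context Switches/sec threshold for cpu_count CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return BASE_CONTEXT_SWITCHES_PER_CPU * cpu_count


for cpus in (1, 2, 4):
    print(cpus, context_switch_threshold(cpus))
```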

Kernel threads statistic

This monitor returns the number of kernel threads in different states. The returned values are as follows:

In_Run_Queue – This component returns the average number of runnable kernel threads over the sampling interval. This should be as low as possible. If the run queue is constantly growing, it may indicate the need for a more powerful CPU or more CPUs. Set the thresholds appropriately for your environment.

Waiting_For_resources – This component returns the average number of kernel threads placed in the VMM wait queue (awaiting resource, awaiting input/output) over the sampling interval. This should be as low as possible. Set the thresholds appropriately for your environment.
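The CPU, system faults, and kernel threads components described above line up with columns that the AIX vmstat command reports. The sketch below is illustrative only and is not the template's Perl script. It assumes the classic 17-column vmstat layout (r b avm fre re pi po fr sr cy in sy cs us sy id wa), and the sample line is hypothetical. Some systems, such as shared-processor LPARs, append extra columns, which this sketch ignores.

```python
# Assumed classic AIX vmstat column order. "sy" appears twice in the real
# header (system calls and system CPU time), so the two are renamed here.
VMSTAT_COLUMNS = [
    "r", "b", "avm", "fre", "re", "pi", "po", "fr", "sr", "cy",
    "in", "sy_calls", "cs", "us", "sy_cpu", "id", "wa",
]

# Mapping from vmstat columns to the component names used in this template.
COMPONENT_FOR_COLUMN = {
    "r": "In_Run_Queue",
    "b": "Waiting_For_resources",
    "in": "Interrupts",
    "sy_calls": "System_Calls",
    "cs": "Context_Switches",
    "us": "User",
    "sy_cpu": "System",
    "id": "Idle",
    "wa": "Wait",
}


def parse_vmstat_line(line):
    """Map one vmstat data line to component names and numeric values."""
    values = line.split()
    if len(values) < len(VMSTAT_COLUMNS):
        raise ValueError("unexpected vmstat layout")
    row = dict(zip(VMSTAT_COLUMNS, values))  # extra trailing columns are dropped
    return {name: float(row[col]) for col, name in COMPONENT_FOR_COLUMN.items()}


# Hypothetical data line, as printed by a command such as `vmstat 1 1`.
sample = " 1  0 22478 1677   0   0   0   0    0   0 188 1380 157 57 32  0 10"
for name, value in parse_vmstat_line(sample).items():
    print(f"{name}: {value}")
```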

Memory and Swap statistic (MB)

This monitor returns the memory and swap statistic in MB. The returned values are as follows:

Free_Memory – This component returns the amount of available memory in MB. Use the highest threshold possible at all times. Set the thresholds appropriately for your environment.

Used_Memory – This component returns the amount of used memory in MB. Use the lowest threshold possible.

Free_Swap – This component returns the amount of available swap in MB. Use the highest threshold possible at all times. Set the thresholds appropriately for your environment.

Used_Swap – This component returns the amount of used swap in MB. Use the lowest threshold possible.

Paging statistic/sec

This monitor returns the different paging statistics. The returned values are as follows:

Page_Faults – This component shows the number of page faults per second. This is not a count of page faults that generate I/O. Some page faults can be resolved without I/O. Use the lowest threshold possible.

Paged_In – This component returns the rate of pages "paged in" from paging space in kB, per second. The operation of reading one inactive page or a cluster of inactive memory pages from the disk is called a "page in." Use the lowest threshold possible.

Paged_Out – This component returns the rate of pages "paged out" to paging space in kB, per second. The operation of writing one inactive page or a cluster of inactive memory pages to the disk is called a "page out." Use the lowest threshold possible. Values above roughly 20 pages (80 kB) per second indicate a significant performance problem. In this situation, more memory should be installed.
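The "20 pages (80 kB)" figure implies a page size of 4 kB. The sketch below makes that conversion explicit. The 4 kB page size is an assumption inferred from that figure, and the function names are hypothetical.

```python
PAGE_SIZE_KB = 4  # assumption: 4 kB pages, consistent with "20 pages (80 kB)"


def pages_to_kb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a page count to kB."""
    return pages * page_size_kb


def paging_is_significant(paged_out_kb_per_sec, limit_pages=20):
    """Return True when the page-out rate exceeds the limit given in pages."""
    return paged_out_kb_per_sec > pages_to_kb(limit_pages)


print(pages_to_kb(20))             # limit expressed in kB
print(paging_is_significant(120))  # hypothetical reading of 120 kB/sec
```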

Processes in different states

This monitor returns the number of processes in different states. The returned values are as follows:

Zombie – This component returns the number of processes that are terminated and where the parent is not waiting. This should always be zero. If it is not zero, you should manually kill zombie processes. Use the following command to see these zombie processes:

       ps -ef | grep defunct
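A minimal sketch of counting zombies from captured ps output follows. The sample text is hypothetical, and the sketch assumes zombie entries are marked with <defunct> in the command column. Matching on <defunct> rather than on the bare word also avoids counting the grep command itself.

```python
def count_zombies(ps_output):
    """Count lines of `ps -ef` output that mark a defunct (zombie) process."""
    return sum(1 for line in ps_output.splitlines() if "<defunct>" in line)


# Hypothetical `ps -ef | grep defunct` style output for illustration.
SAMPLE_PS_OUTPUT = """\
     UID     PID    PPID   C    STIME    TTY  TIME CMD
    root       1       0   0   Jan 01      -  0:05 /etc/init
    root  123456  654321   0                  0:00 <defunct>
    user  222222  111111   0 10:15:00  pts/0  0:00 grep defunct
"""

print(count_zombies(SAMPLE_PS_OUTPUT))
```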

Active – This component returns the number of processes that are on the run queue.

Swapped – This component returns the number of processes that are currently in swap.

Idle – This component returns the number of processes that are idle (waiting for startup).

Canceled – This component returns the number of processes that were canceled.

Stopped – This component returns the number of processes that are stopped, either by a job control signal or because they are being traced.

Space on root (/) partition (MB)

This monitor returns the available and used space of the root (/) partition in MB. The returned values are as follows:

Available_Space – This component returns the available space on the root (/) partition in MB. Use the highest threshold possible at all times.

Used_Space – This component returns the used space on the root (/) partition in MB.

Percentage of using system devices

This monitor returns the name of the system device and the percentage of time the device was busy servicing a transfer request.

Note: After applying this template on the target node, navigate to the Edit Application page and click Get Script Output in the Script section. This builds the list of system devices that should be monitored.
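To illustrate what a per-device list can look like, the sketch below parses hypothetical AIX iostat-style disk output and formats one Message/Statistic pair per device. It is not the template's Perl script. The column layout (with % tm_act as the busy percentage) and the Statistic.<name>/Message.<name> output convention are assumptions, so check them against your iostat output and the SAM script monitor documentation.

```python
def parse_busy_percent(iostat_output):
    """Return {device_name: busy_percent} from iostat-style disk output."""
    devices = {}
    in_table = False
    for line in iostat_output.splitlines():
        fields = line.split()
        if not fields:
            in_table = False  # a blank line ends the current table
            continue
        if fields[0] == "Disks:":
            in_table = True  # header row; data rows follow
            continue
        if in_table:
            devices[fields[0]] = float(fields[1])  # second column: % tm_act
    return devices


def format_for_sam(devices):
    """Format each device as a Message/Statistic pair (assumed convention)."""
    lines = []
    for name, busy in devices.items():
        lines.append(f"Message.{name}: {name} busy time (%)")
        lines.append(f"Statistic.{name}: {busy}")
    return lines


# Hypothetical output for illustration.
SAMPLE_IOSTAT_OUTPUT = """\
Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk0           2.5      12.4      3.1     123456    654321
hdisk1           0.0       0.0      0.0          0         0
"""

for line in format_for_sam(parse_busy_percent(SAMPLE_IOSTAT_OUTPUT)):
    print(line)
```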

Disk operations/sec of system devices

This monitor returns the name of the system device and the number of read/write transfers per second to or from the device.

Note: After applying this template on the target node, navigate to the Edit Application page and click Get Script Output in the Script section. This builds the list of system devices that should be monitored.

Top 10 active processes

This monitor returns the top 10 active processes and each process's share of CPU usage, in percent.
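Selecting the top 10 is a sort by CPU share followed by a cut-off. A minimal sketch, assuming the process list has already been collected as (name, cpu_percent) pairs; the function name and sample data are hypothetical.

```python
def top_processes(processes, limit=10):
    """Return the `limit` busiest (name, cpu_percent) pairs, highest first."""
    return sorted(processes, key=lambda proc: proc[1], reverse=True)[:limit]


# Hypothetical process list: twelve processes with increasing CPU share.
sample_processes = [(f"proc{i}", float(i)) for i in range(12)]

for name, cpu in top_processes(sample_processes):
    print(f"{name}: {cpu}%")
```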