Helix Universal Media Server (Linux/Unix)

Version 2

    This template assesses the performance of the Real Helix Universal Media Server on Linux/Unix machines. It uses Perl scripts to get statistics from RSS log files and an SNMP process monitor to monitor the Universal Media Server process.

    Prerequisites: SSH and Perl installed on the target server. SNMP installed on the target server and permission to monitor rmserver processes.

    Credentials: Root credentials on the target server.

    Accepted RSS Log File format (v. 14.2)

    Warning: These templates will work with RSS log files that have the following format (v. 14.2):

        Server Stats (13-Dec-2011 20:00:00)

        Uptime: 0 days, 00:00:55

        Players: 0 (New Players: 0, 0.00/sec, Players Leaving: 0, 0.00/sec)

        Players by Protocol: 10% RTSP, 15% RTMP, 20% MMS, 25% HTTP Total

        HTTP Connections: 1% Persistent, 2% Secure, 3% Cloaked

        Players by Transport: 1% TCP, 2% UDP, 3% MCast

        Net Devices: 99

        Memory Stats: 44.2 MB In Use (46378308 Bytes; Cache Hit Ratio 25%)

        Memory Alloc: 50.6 MB (53046588 Bytes; 12950 Pages); MMapIO: 0.0 MB

        Recent Memory Stats: 63702 Mallocs, 31416 Frees, 22015 CacheMalloc 14902 CacheMiss

        Recent Memory Stats: 38541 CachedNew, 26647 CachedDel

        Recent Memory Stats: 0 PageFrees (0.00%), 12885 PageAllocs (20.23%)

        60 Second Memory Stats: 0 FreeListEntriesSearched (0.00 per Malloc)

        Memory Allocated OverHead: 0 Free Pages Outstanding, 14.4% Overhead

        Memory per Player: 0k

        Bandwidth Stats: Output 0.02 Mbps, 0.0 Kbps Per Player

        Bandwidth Stats: 116238% Subscribed (0.00 Mbps), 100% Nominal (0.00 Mbps)

        Misc Recent Stats: Packets 1, Overload 2, NoBufs 3, OtherUDPErrs 4, Behind 5

        Misc Recent Stats: Resend 6, AggregatedPkts 7

        Misc Recent Stats: WouldBlocks 8, Accepts 12 (0.20/sec)

        Mutex Collisions: 0 / sec, ~0.000% CPU Spinning, ~75.000% Memory Ops

        Scheduler Items: 197 (With Mutex), 133 (Without Mutex)

        Network Items: Read 27 (0.00% With Mutex), Write 153 (0.00% With Mutex)

        Misc: 0 File Objs, 0 IdlePPMs, 0 Forced Selects, 0 AggSupport, 4:15 CAs

        Broadcast Reception: Feeds 0, 1.12 Kbps, Packets 0, Lost 0, Lost Upstream 3

        Broadcast Reception: Resends 0, Out of Order 2, Duplicate 0, Late 0

        Broadcast Distribution: Feeds 0 (Push 0, Pull 0), 0.12 Kbps, Packets 0

        Broadcast Distribution: Resends 444 (Requested 0), Lost Upstream 0, Dropped 0

        Broadcast Core: 2 Dropped Packets, 2 Client Overflows

        CPU Usage: 0% User, 0% Kernel, 3% System

        Main Loop Iterations: 7/sec (LoadState: 0)

        MainLoopIts:        56      0     39      1      1      1      3      1      1      1     57      1      1      1      1      1     64    126     23      2      1      1      0     62

        Registered FDs:   5      0     25      4      4      4      4      4      4      4      4      4      4      4      4      4      7      7      4      4      6      4      4      6

        Elapsed Time: 60.00

    Monitored Components:

    Note: You need to set thresholds for counters according to your environment. It is recommended that you monitor counters for some length of time to understand potential value ranges and then set the thresholds accordingly. For more information, see http://knowledgebase.solarwinds.com/kb/questions/2415.

    Note: All Linux/Unix monitors retrieve statistical information from RSS log files which can be found here:

    <Path_To_Helix_Folder>/Logs/rss/.

    Each RSS configuration varies as to how many log files are stored and how often the statistics are updated. By default, the RSS interval is set to one minute. The setting can be found in the following location:

    (rmserver.cfg file, "RSS Logging" section)

    Before using this template, set the correct arguments for all Linux/Unix monitors. The following is an example using the arguments field:

     

    perl ${SCRIPT} "/usr/Helix/Logs/rss/rsslogs.*"

    rsslogs.* should be put at the end of the RSS folder path. The entire argument should be enclosed in double quotes. The script in each monitor automatically determines which log file is the most recent and retrieves statistical information from it.
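
    A minimal sketch of how the most recent log file might be picked from such a pattern. It is written in Python for illustration only; the template's own scripts are Perl and their selection logic is not shown in this document. The function name newest_rss_log and the use of file modification time as the meaning of "most recent" are assumptions.

```python
import glob
import os


def newest_rss_log(pattern):
    """Return the path matching `pattern` with the latest modification time.

    Returns None when nothing matches. Treating mtime as "most recent"
    is an assumption made for this sketch.
    """
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

    Calling newest_rss_log("/usr/Helix/Logs/rss/rsslogs.*") would return the newest matching file under the example path above, or None if the folder holds no matching logs.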

    CPU usage

    This component monitor returns the overall use of system processor time. For machines with up to 4 CPUs, these statistics provide an accurate guide to the server processing load. For machines with more processors, the number may fall short in representing the actual server load. Returned values are as follows:

    User - This monitor returns the percentage of time used by the server itself, not counting operating system time and kernel time.

    Kernel - This monitor returns the percentage of time used by the operating system kernel on behalf of the server.

    System - This monitor returns the total system CPU load, including CPU used by other applications. This includes both user and kernel time.
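
    As an illustration, the three values can be read from the "CPU Usage" line of the accepted log format with a short parser. This is a Python sketch, not the template's Perl script; the names CPU_LINE and parse_cpu_usage are hypothetical.

```python
import re

# Matches e.g. "CPU Usage: 0% User, 0% Kernel, 3% System"
CPU_LINE = re.compile(
    r"CPU Usage:\s*(\d+(?:\.\d+)?)% User,"
    r"\s*(\d+(?:\.\d+)?)% Kernel,"
    r"\s*(\d+(?:\.\d+)?)% System"
)


def parse_cpu_usage(line):
    """Return the User, Kernel, and System percentages, or None."""
    match = CPU_LINE.search(line)
    if match is None:
        return None
    user, kernel, system = (float(v) for v in match.groups())
    return {"User": user, "Kernel": kernel, "System": system}


# Sample line copied from the accepted log format.
cpu = parse_cpu_usage("CPU Usage: 0% User, 0% Kernel, 3% System")
```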

    Memory Statistic and Allocations

    This component monitor returns statistics about actual server memory usage, memory allocation according to the operating system, and allocated memory, in Kilobytes, divided by the number of currently connected media players at the time the RSS report was written. Returned values are as follows:

    Used - This monitor returns the amount of memory, in units defined by a script, in use by the server when the RSS report was written. This may be less than the amount of application memory reported by the operating system. The difference represents memory that the server has reserved for use, but that is currently idle.

    Cache_Hit_Ratio - This monitor returns the percentage of memory operations during this RSS period that were carried out using the server's memory cache. (This cache is internal to the server and unrelated to L1/L2 system memory cache.) Because these operations occur faster with less resource contention than non-cached memory functions, a higher percentage indicates greater server efficiency.

    Used_High_Watermark - This monitor returns the high-water mark for the amount of memory used, in units defined by a script. This should be close to the amount of application memory in use as reported by the operating system. This number, which may have been reached in a preceding RSS period, never falls until the server is restarted. The maximum value is set by the -m command option.

    Pages - This monitor returns the high-water mark for the amount of memory used, in pages.

    Memory_Mapped_IO - This monitor returns the amount of memory, in units defined by a script, used for memory-mapped I/O when this report was written. This memory is not included in the application memory allocation. It is not limited by the -m option.

    Memory_Per_Player - This monitor returns the allocated memory, in kB, divided by the number of currently connected media players. Keep in mind that this is only an average and the amount of memory allocated to each player can vary widely depending on the type of player and the type of media streamed.
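
    The Memory_Per_Player arithmetic can be sketched as follows. This is an illustrative Python snippet, not the server's own calculation; the function name is hypothetical, 1 kB is assumed to be 1024 bytes, and returning 0 when no players are connected is an assumption chosen to match the "0k" shown in the sample log.

```python
def memory_per_player_kb(allocated_bytes, players):
    """Average allocated memory per connected player, in kB.

    Assumes 1 kB = 1024 bytes. With no connected players the average
    is undefined, so this sketch returns 0.0 (an assumption).
    """
    if players <= 0:
        return 0.0
    return allocated_bytes / 1024 / players


# Hypothetical figures: 2,048,000 bytes allocated, 10 players connected.
example = memory_per_player_kb(2048000, 10)
```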

    Players statistic

    This component monitor returns general statistics about how many players are connected, have newly connected, or have disconnected. Returned values are as follows:

    Total - This monitor returns the total number of players connected when the RSS report was written.

    New - This monitor returns the number of new player connections since the start of this RSS interval.

    Leaving - This monitor returns the number of players disconnected since the start of this RSS interval.
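
    A sketch of how these three values could be extracted from the "Players" line of the accepted log format. It is written in Python for illustration; PLAYERS_LINE and parse_players are hypothetical names and are not part of the template.

```python
import re

# Matches e.g.
# "Players: 0 (New Players: 0, 0.00/sec, Players Leaving: 0, 0.00/sec)"
PLAYERS_LINE = re.compile(
    r"^\s*Players:\s*(\d+)\s*\(New Players:\s*(\d+),\s*[\d.]+/sec,"
    r"\s*Players Leaving:\s*(\d+),\s*[\d.]+/sec\)"
)


def parse_players(line):
    """Return the Total, New, and Leaving counts, or None."""
    match = PLAYERS_LINE.match(line)
    if match is None:
        return None
    total, new, leaving = (int(v) for v in match.groups())
    return {"Total": total, "New": new, "Leaving": leaving}


# Sample line copied from the accepted log format.
players = parse_players(
    "Players: 0 (New Players: 0, 0.00/sec, Players Leaving: 0, 0.00/sec)"
)
```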

    Players by Protocol

    This component monitor returns the percentage of media players using each of the supported control protocols in their streaming sessions. This breakdown includes the total number of players connected when the RSS report was generated. Returned values are as follows:

    RTSP - This monitor returns the percentage of players connected through RTSP and RTSP cloaked as HTTP.

    RTMP - This monitor returns the percentage of players connected through RTMP and RTMP cloaked as HTTP.

    MMS - This monitor returns the percentage of players connected through MMS and MMS cloaked as HTTP.

    HTTP - This monitor returns the percentage of players using HTTP as the control protocol. This does not include players using RTSP or MMS cloaked as HTTP.
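
    For illustration, a percentage breakdown line such as "Players by Protocol" can be turned into a label-to-value mapping with one regular expression. This is a Python sketch under the assumption that every entry has the form "<number>% <label>"; the function name is hypothetical.

```python
import re


def parse_percent_breakdown(line):
    """Map each label after the colon to its percentage value."""
    _, _, values = line.partition(":")
    pairs = re.findall(r"(\d+(?:\.\d+)?)%\s+(\w+)", values)
    return {label: float(number) for number, label in pairs}


# Sample line copied from the accepted log format.
protocols = parse_percent_breakdown(
    "Players by Protocol: 10% RTSP, 15% RTMP, 20% MMS, 25% HTTP Total"
)
```

    Only the first word after each percentage is used as the label, so the trailing "Total" on the sample line is ignored.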

    Players by Transport

    This component monitor returns the percentage of media players using each of the supported transports in their streaming sessions. This breakdown includes the total number of players that have connected since the start of this RSS interval. The total number of players connected in a scalable multicast is unknown during the broadcast. Players may report connection statistics to a Web server after the broadcast ends. Returned values are as follows:

    TCP - This monitor returns the percentage of players receiving data by TCP.

    UDP - This monitor returns the percentage of players receiving data by UDP.

    Multicast - This monitor returns the percentage of players connected through a back-channel multicast.

    HTTP Connections

    This component monitor returns the percentage of persistent, secure, and cloaked connections to the server. Returned values are as follows:

    Persistent - This monitor returns the percentage of persistent HTTP connections.

    Secure - This monitor returns the percentage of secure HTTP connections.

    Cloaked - This monitor returns the percentage of cloaked HTTP connections.

    Bandwidth Statistic

    This component monitor provides a guide to how well the server fulfills the requirements for streaming rates of requested media. These statistics can help you determine if the server's outgoing bandwidth is sufficient to meet your streaming media needs. Returned values are as follows:

    Output - This monitor returns the average amount of bandwidth, in units defined by a script, that the server delivered to the network over the RSS period.

    Bandwidth_Per_Player - This monitor returns the average amount of bandwidth per media player during the RSS period, in units defined by a script. This is equal to the Output value divided by the number of players connected when the RSS report was written.

    Subscribed - This monitor returns the percentage value that indicates the total output bandwidth divided by the cumulative bandwidth for all media player Subscribe requests. An RTSP Subscribe request indicates the encoded rate of a clip or broadcast requested by the media player.

    For example: To receive a 512 Kbps encoding of a SureStream clip, a player sends a Subscribe request for the 512 Kbps stream. The returned value should be as high as possible. This monitor can return values greater than 100%. If the returned value is less than 100%, it may indicate that the server is not meeting the player's stream requests.

    Nominal - This monitor returns the percentage value that equals the total outgoing bandwidth divided by the delivery bandwidth that media players have requested using the RTSP SetDeliveryBandwidth directive. This directive indicates the rate at which the player wants to receive the stream, regardless of the stream's encoded rate.

    For example: Upon requesting a stream, a player may set an initial high delivery rate to fill its buffer quickly. The returned value should be as high as possible. This monitor can return values greater than 100%. If the returned value is less than 100%, it may indicate that the server is not meeting the player's stream requests.
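
    The Subscribed and Nominal percentages are both ratios of delivered bandwidth to requested bandwidth. The following Python sketch shows that arithmetic with hypothetical figures; the function names are illustrative, and returning None when nothing was requested is an assumption, since the document does not describe how the server handles that case.

```python
def percent_of_requested(output_mbps, requested_mbps):
    """Delivered bandwidth as a percentage of requested bandwidth.

    Returns None when no bandwidth was requested (an assumption made
    for this sketch).
    """
    if requested_mbps <= 0:
        return None
    return 100.0 * output_mbps / requested_mbps


def is_under_delivering(percent):
    """True when the server delivers less than the players requested."""
    return percent is not None and percent < 100.0


# Hypothetical figures: 4 Mbps delivered against 5 Mbps of requests.
subscribed = percent_of_requested(4.0, 5.0)
```

    With these figures the result is 80%, which falls below 100% and would suggest that the server is not meeting the players' stream requests.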

    Network Recent Statistic

    This component monitor returns miscellaneous network statistics. Returned values are as follows:

    Written_Packets - This monitor returns the number of packets written to the network in this RSS period.

    Overload - This monitor returns the number of packets that were late being scheduled.

    No_Buffer_Errors - This monitor returns the number of ENOBUFS errors returned from UDP socket network write calls. An ENOBUFS error indicates that the output queue for a network interface was full. This generally means that the interface has stopped sending, which may be caused by transient congestion. A consistent, high number of these errors across many RSS periods indicates a consistently congested network.

    Other_UDP_Errors - This monitor returns the number of general UDP write errors encountered. A consistent, high number of these errors may indicate technical problems with your network connection.

    Behind - This monitor returns the number of packets that were written to the network late. These packets may or may not arrive too late to be of use to a media player. A large number of late packets may cause increased media player resend requests.

    Resend - This monitor returns the number of packet resend requests made by media players that the server honored.

    Aggregated_Packets - This monitor returns the number of aggregated packets written. Aggregated packets are packets from 200 to 1,350 bytes in size that the server writes when it determines that the large size is an efficient delivery means given the current server load and network state. Aggregating packets reduces CPU load and helps the server run more efficiently. As server load increases, the server tries to write more, and larger, aggregated packets. Packets are aggregated for streams using the UDP transport and the RealNetworks proprietary RDT packet format. Packets for streams using the TCP transport or the standards-based RTP format are not aggregated.

    WouldBlocks - This monitor returns the number of packet write attempts during this RSS interval that were blocked by the network (i.e. EWOULDBLOCK errors). When a write attempt is blocked, the server queues the blocked packet, attempting to deliver it later. In many cases, a successful delivery may occur within a few milliseconds of the blocked attempt, allowing the packet to reach the media player on time. A positive value for WouldBlocks typically reflects temporary network congestion. However, a consistently high number of blocked writes across several RSS periods may indicate persistent network problems. If you notice a frequent, high number for WouldBlocks, check for increases in the Behind and Resend values on the preceding lines of the RSS report to determine if the WouldBlocks events affected packet delivery.

    Accepts - This monitor returns the number of incoming socket connections accepted since the last RSS interval. The number is typically close to the New players number, as most Accepts indicate a request from a new media player or another resource, such as a proxy or a transmitter. If the returned value is far greater than the New players number, this may indicate an external security issue, such as a denial-of-service attack.
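
    A sketch of how the three "Misc Recent Stats" lines could be merged into one set of counters, followed by a simple comparison of Accepts against new players. This is illustrative Python, not the template's script. The function names are hypothetical, and the factor of 10 used to decide what "far greater" means is an assumption that should be tuned to your environment.

```python
import re


def parse_misc_recent(lines):
    """Collect "Name value" counters from all Misc Recent Stats lines."""
    counters = {}
    for line in lines:
        label, sep, values = line.partition(":")
        if not sep or label.strip() != "Misc Recent Stats":
            continue
        for name, number in re.findall(r"([A-Za-z]+)\s+(\d+)", values):
            counters[name] = int(number)
    return counters


def accepts_far_exceed_new_players(accepts, new_players, factor=10):
    """Heuristic check; `factor` is an assumed, tunable threshold."""
    return accepts > factor * max(new_players, 1)


# Sample lines copied from the accepted log format.
counters = parse_misc_recent([
    "Misc Recent Stats: Packets 1, Overload 2, NoBufs 3, OtherUDPErrs 4, Behind 5",
    "Misc Recent Stats: Resend 6, AggregatedPkts 7",
    "Misc Recent Stats: WouldBlocks 8, Accepts 12 (0.20/sec)",
])
```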

    Mutex Collisions, Scheduler and Network Items

    This component monitor returns statistics concerning Mutex collisions, the activity of the server's internal Scheduler, and network items such as reads and writes. Returned values are as follows:

    Mutex_Collisions - This monitor returns the average number of collisions per second, as measured across this RSS period. A Mutex collision occurs when one server process must wait for another process to release a lock on a shared server resource. Mutex collisions are normal, and the number to expect can vary greatly depending on server tasks and load, as well as the machine architecture. A consistently high number, such as 100,000 or more collisions per second, may indicate a server problem.

    CPU_Spinning - This monitor returns a rough measure of Mutex collisions as related to average CPU usage. Ideally, this value should be near zero. A number greater than a fraction of a percent (such as 2.000%) indicates a great deal of Mutex contention.

    Memory_Ops - This monitor returns the approximate percentage of Mutex collisions caused by non-mainlock locks. These are primarily memory-related locks, but also include registry locks and other types of locks. This value should be as low as possible.

    Scheduled_Items_With_Mutex - This monitor returns the number of triggered non-threadsafe actions that had been scheduled to occur at a specific time. These items cause greater Mutex contention, as well as reduced scalability and performance relative to the Without Mutex items.

    Scheduled_Items_Without_Mutex - This monitor returns the number of triggered threadsafe actions that had been scheduled to occur at a specific time. A higher value relative to the With Mutex value indicates better server performance.

    Network_Read_Items - This monitor returns the number of reads to network sockets that were completed in this RSS interval.

    Network_Write_Items - This monitor returns the number of writes to network sockets that were completed in this RSS interval.
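
    As an illustration, the "Mutex Collisions" line can be parsed and checked against the 100,000 collisions per second figure mentioned above. This is a Python sketch; MUTEX_LINE, parse_mutex, and collisions_look_high are hypothetical names, and treating that figure as a hard limit is an assumption, since appropriate thresholds depend on your environment.

```python
import re

# Matches e.g.
# "Mutex Collisions: 0 / sec, ~0.000% CPU Spinning, ~75.000% Memory Ops"
MUTEX_LINE = re.compile(
    r"Mutex Collisions:\s*(\d+)\s*/\s*sec,"
    r"\s*~?([\d.]+)% CPU Spinning,"
    r"\s*~?([\d.]+)% Memory Ops"
)


def parse_mutex(line):
    """Return the three Mutex values from the line, or None."""
    match = MUTEX_LINE.search(line)
    if match is None:
        return None
    return {
        "Mutex_Collisions": int(match.group(1)),
        "CPU_Spinning": float(match.group(2)),
        "Memory_Ops": float(match.group(3)),
    }


def collisions_look_high(per_second, limit=100000):
    """True at or above the example figure of 100,000 per second."""
    return per_second >= limit


# Sample line copied from the accepted log format.
mutex = parse_mutex(
    "Mutex Collisions: 0 / sec, ~0.000% CPU Spinning, ~75.000% Memory Ops"
)
```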

    Miscellaneous Statistic

    This component monitor returns information about the internal server state. Returned values are as follows:

    File_Objects - This monitor returns the number of internal file objects currently in use. When a server plug-in generates streaming packets for a clip, it opens one or more file objects. The number of open file objects may therefore be twice or more the number of connected media players.

    Idle_Streams - This monitor returns the number of streams that are currently idle. (PPM refers to the server's standard packet delivery system.) Each stream using PPM periodically enters a state where it is ready to send more data to the player, but does not yet have any packets to send. That stream then goes idle, temporarily.

    Forced_Selects - This monitor returns the number of times the server had to service a timer-triggered event without having data to read or write.

    Aggregation_Support - This monitor returns the number of PPM streams that support packet aggregation.

    Total_Crash_Avoidances - This monitor returns the total number of crash avoidances (CAs) since the last server restart. A CA occurs when the server uses fault-tolerance features to compensate for a problem. For example, if the server encounters corrupt packets in a media file, it attempts to compensate by dropping the corrupt packets and continuing the stream past the corruption point. If it can compensate without terminating the stream, it logs the event as a CA. A small number of CAs is to be expected, and does not indicate a significant problem. A consistently high number of CAs across several RSS periods may indicate serious system problems.

    Current_Crash_Avoidances - This monitor returns the number of crash avoidances (CAs) that have occurred within the current four-hour window. If it reaches 1,000, the server automatically restarts in an attempt to reset into a more stable state. A CA occurs when the server uses fault-tolerance features to compensate for a problem. For example, if the server encounters corrupt packets in a media file, it attempts to compensate by dropping the corrupt packets and continuing the stream past the corruption point. If it can compensate without terminating the stream, it logs the event as a CA. A small number of CAs is to be expected, and does not indicate a significant problem. A consistently high number of CAs across several RSS periods may indicate serious system problems.

    Net_Devices - This monitor returns the number of network connections to the server, other than media player requests, at the time the RSS report was written. This number includes proxy accounting connections, as well as connections to other Helix Servers for content distribution, live stream splitting, and so on.

    Broadcast Reception Statistic

    This component monitor provides statistics about incoming broadcast streams. Returned values are as follows:

    Feeds - This monitor returns the number of live feeds coming into the server from encoders or other Helix Server transmitters when the RSS report was written.

    Total_Bandwidth - This monitor returns the total amount of bandwidth, in units defined by a script, coming into the server as live streams.

    Packets - This monitor returns the total number of live stream packets arriving at the server during this RSS period.

    Lost - This monitor returns the total number of packets lost in transit to this server during this RSS period.

    Lost_Upstream - This monitor returns the number of live packets reported as lost by upstream transmitters. These packets were lost before the live stream was sent through the network to this receiver.

    Resends - This monitor returns the total number of packet resends this receiver requested from encoders or upstream transmitters. The receiver requests a resend of a lost packet only if it determines the packet will arrive in time to be of use in its broadcast stream. In general, the lower the amount of receiver buffering, the fewer resends the receiver requests.

    Out_of_Order - This monitor returns the total number of packets for live streams received out of order during this RSS interval.

    Duplicate - This monitor returns the total number of duplicate packets received for all live streams during this RSS interval.

    Late - This monitor returns the total number of late packets for all live streams received during this RSS interval. The packets may or may not have been too late to be of use in the broadcast.

    Broadcast Distribution Statistic

    This component monitor provides statistics about streams being split to downstream receivers. This information applies only to downstream Helix Server receivers. Returned values are as follows:

    Feeds - This monitor returns the total number of split feeds (both push and pull) being transmitted by this server when the RSS report was published.

    Push - This monitor returns the total number of push-split feeds being transmitted by this server. With a push feed, the server sends the stream to a downstream receiver once the stream is available.

    Pull - This monitor returns the total number of pull-split feeds being transmitted by this server. With a pull feed, the server does not send the stream to a downstream receiver until the receiver requests it.

    Data_Transmitted - This monitor returns the amount of data transmitted by all feeds, in units defined by a script.

    Packets - This monitor returns the total number of packets transmitted for the live feeds during this RSS interval.

    Resends - This monitor returns the number of resend requests processed by this transmitter. The transmitter will not honor the resend request if the packet was lost upstream, or if it determines that the packet will arrive at the receiver too late to be of use.

    Requested - This monitor returns the number of resends requested by downstream receivers.

    Lost_Upstream - This monitor returns the number of live packets lost by upstream transmitters. A value greater than 0 is reported only if this server is not the origin transmitter for the stream.

    Dropped - This monitor returns the number of packets dropped by this transmitter. This typically occurs if the packet arrives late from the encoder or upstream transmitter, and the server determines that the packet will subsequently arrive at downstream receivers too late to be of use.

    Broadcast Core Statistic

    This component monitor provides statistics about live stream packets dropped by the broadcast core. Returned values are as follows:

    Dropped_Packets - This monitor returns the number of live stream packets dropped in this RSS period by the broadcast core. These packets were queued for another process or thread, but were dropped when the queue for that process or thread overflowed. A significant number of dropped packets indicates a general system overload.

    Client_Overflows - This monitor returns the number of live stream packets dropped in this RSS period because the outgoing connection to the receiving client was blocked. Although the server buffers packets to compensate for temporary blockages, it drops the packets if the network blockage does not clear quickly enough.

    Main Loop Iterations

    This component monitor provides statistics about the server's main loop iterations and its current load state. Returned values are as follows:

    Main_Loop_Iterations - This monitor returns the average number of loop iterations for all server processes per second during this RSS period.

    Load_State - This monitor returns the server's current load state: 0-Normal, 1-High, 2-Extreme. The server gauges its internal state using several measurements, including the number of packets written late to the network. During high load states, the server attempts to write large aggregate packets to conserve CPU usage.
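
    A sketch of how both values could be read from the "Main Loop Iterations" line, with the load state translated to the names listed above. This is illustrative Python rather than the template's Perl script; LOAD_STATES, MAIN_LOOP_LINE, and parse_main_loop are hypothetical names, and reporting any other state number as "Unknown" is an assumption.

```python
import re

# State names as listed in the Load_State description.
LOAD_STATES = {0: "Normal", 1: "High", 2: "Extreme"}

# Matches e.g. "Main Loop Iterations: 7/sec (LoadState: 0)"
MAIN_LOOP_LINE = re.compile(
    r"Main Loop Iterations:\s*(\d+)/sec\s*\(LoadState:\s*(\d+)\)"
)


def parse_main_loop(line):
    """Return iterations per second and the load state, or None."""
    match = MAIN_LOOP_LINE.search(line)
    if match is None:
        return None
    iterations, state = int(match.group(1)), int(match.group(2))
    return {
        "Main_Loop_Iterations": iterations,
        "Load_State": state,
        "Load_State_Name": LOAD_STATES.get(state, "Unknown"),
    }


# Sample line copied from the accepted log format.
main_loop = parse_main_loop("Main Loop Iterations: 7/sec (LoadState: 0)")
```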

    Process: rmserver

    This component monitor returns the CPU and memory usage of the Real Helix Universal Media Server process (rmserver).


    Portions of this document were originally created by and are excerpted from the following sources:

    RealNetworks, Inc., “Helix Server and Helix Proxy Troubleshooting Guide,” Copyright © 2006-2008 RealNetworks, Inc. All rights reserved. Available at http://service.jp.real.com/help/library/guides/HelixServerWireline12/pdf/ServerProxyTroubleshoot.pdf.