Exchange 2013 Mailbox Role Counters (Advanced)

Version 2

    This template contains advanced performance and statistics counters for monitoring the Exchange 2013 Mailbox Role. Some of the counters may require manual configuration, such as adjusting thresholds to suit the client’s environment. Use this template in addition to the Exchange 2013 Mailbox Role Services and Counters (Basic) template.


    Prerequisites:
    RPC and WMI access to the Exchange server.

    Credentials: Windows Administrator on the target server.


    Monitored Components

    Database: I/O Database Reads (Attached) Average Latency

    This monitor shows the average time, in milliseconds (ms), per database read operation, that is, the time taken to read from the database file. The average returned value should be below 20ms. Spikes (maximum values) should not be higher than 100ms.
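
    The two thresholds above (average and spike) can be checked against a window of collected samples. The following is a minimal Python sketch, assuming the samples have already been gathered by some collector. The function name check_read_latency and the constant names are illustrative and not part of the template.

```python
from statistics import mean

# Thresholds taken from the description above, in milliseconds.
AVG_LIMIT_MS = 20.0
SPIKE_LIMIT_MS = 100.0


def check_read_latency(samples_ms):
    """Return a list of threshold violations for a window of latency samples.

    check_read_latency is a hypothetical helper, not part of the template.
    """
    if not samples_ms:
        raise ValueError("at least one sample is required")
    problems = []
    avg = mean(samples_ms)
    peak = max(samples_ms)
    # The average should stay below 20 ms.
    if avg >= AVG_LIMIT_MS:
        problems.append(f"average {avg:.1f} ms is not below {AVG_LIMIT_MS:.0f} ms")
    # Individual spikes should not be higher than 100 ms.
    if peak > SPIKE_LIMIT_MS:
        problems.append(f"spike {peak:.1f} ms exceeds {SPIKE_LIMIT_MS:.0f} ms")
    return problems
```

    An empty list means both conditions hold for the window. Checking the average and the maximum separately keeps a single spike from being hidden by an otherwise healthy average.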

     

    Database: I/O Database Writes (Attached) Average Latency

    This monitor indicates the average time, in ms, to write to the database file. This counter is not a good indicator for client latency because database writes are asynchronous. In general, this latency should be less than the MSExchange Database\I/O Database Reads (Attached) Average Latency when battery-backed write caching is utilized.

     

    Database: I/O Log Writes Average Latency

    This monitor indicates the average time, in ms, to write a log buffer to the active log file. The returned values should be below 10ms on production servers. If this counter is greater than 10ms, it may indicate that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high.

     

    Database: I/O Database Reads (Recovery) Average Latency

    This monitor indicates the average time, in ms, to read from the database file. The average returned values should be below 200ms. Spikes (maximum values) should not be higher than 1,000ms.

     

    Database: I/O Database Writes (Recovery) Average Latency

    This monitor indicates the average time, in ms, to write to the database file. In general, this latency should be less than the MSExchange Database\I/O Database Reads (Recovery) Average Latency when battery-backed write caching is utilized.

     

    Database: I/O Log Reads Average Latency

    This monitor indicates the average time, in ms, to read data from a log file. This is specific to log replay and database recovery operations. The average returned value should be below 200ms. Spikes (maximum values) should not be higher than 1,000ms.

     

    Database: Version buckets allocated

    This monitor shows the total number of version buckets allocated. The returned values should be less than 12,000 at all times. The default maximum number of version buckets is 16,384. If the number of allocated version buckets reaches 70% of the maximum, the server is at risk of running out of version store.
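
    To make the arithmetic concrete: 70% of the default maximum of 16,384 is roughly 11,469 buckets, slightly below the 12,000 threshold. The following Python sketch, offered as an illustration only, computes the percentage in use and both flags. The function name version_bucket_status and the constant names are hypothetical.

```python
DEFAULT_MAX_VERSION_BUCKETS = 16_384  # default maximum stated above
RISK_FRACTION = 0.70                  # 70% of the maximum, per the text
ALERT_THRESHOLD = 12_000              # values should stay below this


def version_bucket_status(allocated, maximum=DEFAULT_MAX_VERSION_BUCKETS):
    """Summarize version bucket usage (hypothetical helper)."""
    return {
        "percent_used": 100.0 * allocated / maximum,
        # At or above 70% of the maximum, the version store is at risk.
        "at_risk": allocated >= RISK_FRACTION * maximum,
        # The monitor expects fewer than 12,000 buckets at all times.
        "over_threshold": allocated >= ALERT_THRESHOLD,
    }
```

    If the maximum has been changed from its default on a given server, pass the actual value as the second argument.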

     

    Database: Database Cache Size (MB)

    This monitor shows the amount of system memory, in megabytes (MB), used by the database cache manager to hold commonly used information from the database files to prevent file operations.

    The maximum value is 2GB of RAM (3GB for servers with sync replication enabled). This counter and Database Cache Hit % are useful for gauging whether a server's performance problems might be resolved by adding more physical memory. Use this counter along with Store Private Bytes to determine if there are store memory leaks. If the database cache size seems too small for optimal performance and there is little available memory on the system (check the value of Memory\Available Bytes), adding more memory to the system may increase performance. If there is ample memory on the system and the database cache size is not growing beyond a certain point, the database cache size may be capped at an artificially low limit. Increasing this limit may increase performance.
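
    The decision logic in the paragraph above can be written down as a small Python function. This is a sketch under stated assumptions: the function name cache_advice is hypothetical, and low_memory_bytes is an arbitrary illustrative cutoff for "little available memory", since the text does not give a number.

```python
def cache_advice(cache_seems_small, available_bytes, cache_still_growing,
                 low_memory_bytes=512 * 1024 * 1024):
    """Map the reasoning in the text to a short recommendation.

    low_memory_bytes is a hypothetical cutoff; tune it per environment.
    """
    if not cache_seems_small:
        return "no action"
    # Small cache and little available memory: more RAM may help.
    if available_bytes < low_memory_bytes:
        return "consider adding physical memory"
    # Ample memory but the cache has stopped growing: it may be capped.
    if not cache_still_growing:
        return "cache may be capped; consider raising the limit"
    return "keep monitoring"
```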

     

    Database: Log Bytes Write/sec

    This monitor shows the rate of bytes written to the log. The returned values should be less than 10,000,000 at all times. With each log file being 1,000,000 bytes in size, 10,000,000 bytes/sec would yield 10 logs per second. This may indicate a large message being sent or a looping message.
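
    The conversion from bytes per second to log files per second is a simple division. The following Python sketch reproduces the arithmetic using the 1,000,000-byte log file size quoted above. The function names are illustrative only.

```python
LOG_FILE_BYTES = 1_000_000          # log file size used in the text
MAX_LOG_BYTES_PER_SEC = 10_000_000  # values should stay below this


def logs_per_second(log_bytes_per_sec, log_file_bytes=LOG_FILE_BYTES):
    """Convert a Log Bytes Write/sec reading into log files per second."""
    return log_bytes_per_sec / log_file_bytes


def exceeds_log_write_limit(log_bytes_per_sec):
    """True when the reading is at or above the 10,000,000 bytes/sec limit."""
    return log_bytes_per_sec >= MAX_LOG_BYTES_PER_SEC
```

    For example, a reading of 2,500,000 bytes/sec corresponds to 2.5 log files per second and is within the limit.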

     

    Assistants: Events in queue

    This monitor shows the number of events in the in-memory queue waiting to be processed by the assistants. The returned values should remain low at all times. High values may indicate a performance bottleneck.

     

    Assistants: Average Event Processing Time in Seconds

    This monitor shows the average processing time, in seconds, of the events chosen. The returned values should be less than 2 seconds at all times.

     

    Assistants: Mailboxes processed/sec

    This monitor shows the number of mailboxes processed per second by time-based assistants. Use this counter to determine the current load.

     

    Assistants: Events Polled/sec

    This monitor shows the number of events polled per second. Use this counter to determine the current load.

     

    Resource Booking: Average Resource Booking Processing Time

    This monitor shows the average time to process an event in the Resource Booking Attendant. Returned values should remain low at all times. High values may indicate a performance bottleneck.

     

    Resource Booking: Requests Failed

    This monitor shows the total number of failures that occurred while the Resource Booking Attendant was processing events. Returned values should be 0 at all times.

     

    Calendar Attendant: Average Calendar Attendant Processing time

    This monitor shows the average time to process an event in the Calendar Attendant. Returned values should remain low at all times. High values may indicate a performance bottleneck.

     

    Calendar Attendant: Requests Failed

    This monitor shows the total number of failures that occurred while the Calendar Attendant was processing events. Returned values should be 0 at all times.

     

    Store Interface: ROP Requests outstanding

    This monitor shows the total number of outstanding remote operations requests. This is used for determining the current load.

     

    Store Interface: RPC Requests outstanding

    This monitor shows the current number of outstanding RPC requests. Returned values should be 0 at all times.

     

    Store Interface: RPC Requests sent/sec

    This monitor shows the current rate of initiated RPC requests per second. This is used for determining the current load.

     

    Store Interface: RPC Slow requests latency average (ms)

    This monitor shows the average latency, in ms, of slow RPC requests. Use this counter to see how long slow requests are taking on average.

     

    Store Interface: RPC Requests failed (%)

    This monitor shows the percentage of failed requests in the total number of RPC requests. Failed requests are the sum of requests that failed with an error code and requests that failed with an exception. Returned values should be less than 1% at all times.
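
    The definition above amounts to a simple ratio. The following Python sketch shows the calculation, assuming the three raw counts are available from some collector. The function and parameter names are hypothetical and do not correspond to counter names.

```python
def rpc_failed_percent(total_requests, failed_with_error_code, failed_with_exception):
    """Percentage of RPC requests that failed, per the definition in the text."""
    if total_requests <= 0:
        return 0.0  # no traffic, so nothing has failed
    failed = failed_with_error_code + failed_with_exception
    return 100.0 * failed / total_requests


def failed_rate_ok(percent, limit=1.0):
    """True when the failure percentage is below the 1% limit."""
    return percent < limit
```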

     

    Store Interface: RPC Slow requests (%)

    This monitor shows the percentage of slow RPC requests among all RPC requests. A slow RPC request is one that has taken more than 500ms. Returned values should be less than 1% at all times.
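
    As an illustration of the 500ms definition, the following Python sketch computes the slow-request percentage from a list of per-request latencies. It assumes such a list is available, which the counter itself does not expose, and the function name slow_rpc_percent is hypothetical.

```python
SLOW_RPC_MS = 500.0  # a request is slow when it takes more than 500 ms


def slow_rpc_percent(latencies_ms):
    """Percentage of requests in the sample that took more than 500 ms."""
    if not latencies_ms:
        return 0.0
    slow = sum(1 for ms in latencies_ms if ms > SLOW_RPC_MS)
    return 100.0 * slow / len(latencies_ms)
```

    Note that a request taking exactly 500ms is not counted as slow under this definition.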

     

    Submission: Failed Submissions Per Second

    This monitor shows the number of failed submissions per second. Returned values should be 0 at all times.

     

    Submission: Temporary Submission Failures/sec

    This monitor shows the number of temporary submission failures per second. Returned values should be 0 at all times.

     

    Replication: CopyQueueLength

    This monitor shows the number of transaction log files waiting to be copied to the passive copy log file folder. A copy is not considered complete until it has been checked for corruption. Returned values should be less than 1 at all times for continuous replication.

     

    Replication: ReplayQueueLength

    This monitor shows the number of transaction log files waiting to be replayed into the passive copy. This indicates the current replay queue length. Higher values cause longer store mount times when a handoff, failover, or activation is performed.

     

    Database Instances (edgetransport): I/O Log Writes/sec

    This monitor shows the rate of log file write operations completed. This determines the current load. Compare these values to historical baselines.
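
    Several monitors in this template have no fixed threshold and are instead compared to historical baselines. One possible way to automate that comparison is sketched below in Python: flag a reading that lies more than a chosen number of standard deviations from the historical mean. The function name deviates_from_baseline and the default tolerance of 3 are assumptions for illustration, not values from the template.

```python
from statistics import mean, stdev


def deviates_from_baseline(current, history, tolerance=3.0):
    """True when current is more than `tolerance` standard deviations
    away from the mean of the historical samples.

    tolerance=3.0 is an arbitrary illustrative default.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    center = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any change counts as a deviation.
        return current != center
    return abs(current - center) > tolerance * spread
```

    The same helper can be applied to the other load counters in this template that are compared to baselines, such as I/O Database Reads/sec or Messages Received/sec.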

     

    Database Instances (edgetransport): I/O Log Reads/sec

    This monitor shows the rate of log file read operations completed. This determines the current load. Compare these values to historical baselines.

     

    Database Instances (edgetransport): Log Generation Checkpoint Depth

    This monitor shows the amount of work, counted in log files, that needs to be redone or undone to the database files if the process fails.

     

    Database Instances (edgetransport): Version buckets allocated

    This monitor shows the total number of version buckets allocated. This shows the default back pressure values as listed in the edgetransport.exe.config file. Returned values should be less than 200 at all times.

     

    Database Instances (edgetransport): I/O Database Reads/sec

    This monitor shows the rate of database read operations completed. This determines the current load. Compare these values to historical baselines.

     

    Database Instances (edgetransport): I/O Database Writes/sec

    This monitor shows the rate of database write operations completed. This determines the current load. Compare these values to historical baselines.

     

    Database Instances (edgetransport): Log Record Stalls/sec

    This monitor shows the number of log records per second that cannot be added to the log buffers because the buffers are full. If this counter is nonzero most of the time, the log buffer size may be a bottleneck. Returned values should be less than 10 per second on average. Spikes (maximum values) should not be greater than 100 per second.

     

    Database Instances (edgetransport): Log Threads Waiting

    This monitor shows the number of threads waiting for their data to be written to the log to complete an update of the database. If this number is too high, the log may be a bottleneck. Returned values should be less than 10 threads waiting on average.

     

    Transport SMTP Receive: Average bytes/message

    This monitor shows the average number of message bytes per inbound message received. Use this counter to determine the size of messages being received by an SMTP receive connector.

     

    Transport SMTP Receive: Messages Received/sec

    This monitor shows the number of messages received by the SMTP server each second. This determines current load. Compare these values to historical baselines.

     

    Transport SMTP Send: Messages Sent/sec

    This monitor shows the number of messages sent by the SMTP send connector each second. This determines current load. Compare these values to historical baselines.

     

    Transport Queues: Messages Queued for Delivery Per Second

    This monitor shows the number of messages queued for delivery per second. This determines current load. Compare these values to historical baselines.

     

    Transport Queues: Messages Completed Delivery Per Second

    This monitor shows the number of messages delivered per second. This determines current load. Compare these values to historical baselines.

     

    Transport Queues: Messages Submitted Per Second

    This monitor shows the number of messages queued in the Submission queue per second. This determines current load. Compare these values to historical baselines.

     

    Transport Queues: Retry Non-Smtp Delivery Queue Length

    This monitor shows the number of messages in a retry state in the non-SMTP gateway delivery queues. Returned values should not exceed 100.

     

    Portions of this document are provided courtesy of the following sources:

    Mailbox Server Counters, Exchange 2010 Help, Microsoft TechNet:
    http://technet.microsoft.com/en-us/library/ff367871.aspx

    Client Access Server Counters, Exchange 2010 Help, Microsoft TechNet:
    http://technet.microsoft.com/en-us/library/ff367877.aspx