Monitoring Microsoft Exchange from the Real World (Part 2)

Level 10

In Part 1 we looked at what it means to monitor the stability of Microsoft Exchange, and at choosing a product that won’t keep your staff busy for months configuring it. We will now look at what it means to monitor Exchange Online in the Office 365 platform. Yes, you did read that correctly: Exchange Online. Isn’t Microsoft monitoring Exchange Online for me? Well yes, there is some level of monitoring, but we as customers typically do not get frontline insight into which aspects of the product are not working until something breaks. So, let’s dive into this a little further.


Exchange Online Monitoring


If your organization has chosen Exchange Online, your product SLAs will generally be around 99.9x%. The uptime varies from month to month, but historically Microsoft has been right on track with its marketed SLA or has slightly exceeded it. As a customer of this product, your organization is still relying on core Exchange components such as a Database Availability Group (DAG) for your databases, Outlook Web App, Azure Active Directory, Hub Transport servers, CAS servers, etc.; the difference is that Microsoft maintains all of this for your company. Assuming that Office 365/Exchange Online meets the needs of your organization, this is great, but what happens when something is down? 99.9x% is good, but it’s not 99.999%, so some downtime is to be expected.
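
To put those percentages in perspective, here is a minimal Python sketch that converts an availability target into the downtime it permits. The 30-day month and the function name `downtime_budget_minutes` are assumptions made for illustration, and 99.9% is used as the floor of "99.9x%"; the way Microsoft actually measures uptime is defined in its own SLA terms and may differ.

```python
# Illustrative only: converts an availability percentage into allowed downtime.
# The 30-day period is an assumption; real SLA terms define their own measurement window.
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float, days: int = 30) -> float:
    """Return the downtime (in minutes) that an availability target permits over `days`."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = days * MINUTES_PER_DAY
    return total_minutes * (1 - availability_pct / 100)


# Compare a 99.9% target with four and five nines.
for target in (99.9, 99.99, 99.999):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% over 30 days allows {minutes:.1f} minutes of downtime")
```

Under these assumptions, 99.9% leaves room for roughly 43 minutes of downtime in a 30-day month, while 99.999% leaves less than half a minute.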


Do I really need monitoring?


Not convinced monitoring is required? If your organization has chosen to move to the Exchange Online platform, being able to understand what is and isn’t working in the environment can be very valuable. As an Exchange Administrator on the Exchange Online platform, if something isn’t working I can guarantee that leadership is still going to look to me to understand why, even if the product is not maintained onsite. Having a quick and simple way to see that everything is functioning properly (or not) through a monitoring tool lets you quickly give your leaders the feedback they need to communicate to the business what is happening, even if the only thing I can do next is contact Microsoft to report the issue.


Corrective Measures


Choose a monitoring tool for Exchange Online that will provide valuable insights into your email functionality. My guidance here is similar to the suggestions I would make for Exchange On-Premises.


  • Evaluate several tools that offer Exchange Online monitoring, and then decide which one will best suit your organization’s requirements.
  • Implementation of monitoring should be a project with a dedicated resource.
  • The tool should be simple and not time-consuming to configure.
  • Choose a tool that monitors Azure connectivity too. Exchange Online depends heavily on Azure Active Directory and DNS, so being aware of the health of your organization’s connectivity to the cloud is important.
  • Make sure you can easily monitor your primary email functionality. This can include email flow testing, Outlook Web App, directory synchronization, ActiveSync, and more.
  • Ensure that the tool selected has robust reporting. This saves the time you would otherwise spend scripting your own reports, and allows for better historical trending of email information. These reports should include things such as mail flow SLAs, large mailboxes, abandoned mailboxes, top mail senders, public folder data, distribution lists and more.


These considerations will help you determine which product is the best fit for your organization’s needs.
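
As a rough illustration of what a mail flow SLA report from the list above might boil down to, the Python sketch below computes the share of synthetic test messages that completed a round trip within a threshold. Everything here is hypothetical: the `ProbeResult` shape, the 60-second threshold, and the sample data are assumptions for illustration, not the output or API of any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProbeResult:
    """One synthetic mail flow check (hypothetical shape, not a real tool's schema)."""
    succeeded: bool              # did the test message arrive at all?
    round_trip_seconds: float    # time from send to receipt


def mail_flow_sla_pct(results: Sequence[ProbeResult], max_seconds: float) -> float:
    """Percentage of probes that succeeded within `max_seconds`."""
    if not results:
        raise ValueError("no probe results to evaluate")
    within = sum(
        1 for r in results if r.succeeded and r.round_trip_seconds <= max_seconds
    )
    return 100 * within / len(results)


# Made-up sample data: two probes meet a 60-second target, one is slow, one fails.
sample = [
    ProbeResult(True, 12.0),
    ProbeResult(True, 95.0),
    ProbeResult(False, 0.0),
    ProbeResult(True, 30.0),
]
print(f"Mail flow SLA compliance: {mail_flow_sla_pct(sample, max_seconds=60):.1f}%")
```

A commercial tool would collect these probes and trend them for you; the point of the sketch is only that a failed or slow test message counts against the figure you report to leadership.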


Concluding Thoughts


Monitoring the health of your organization’s connectivity to the cloud provides valuable insight into your email system. There are tools that will give you and your organizational leadership instant insight into that health, so that everyone understands overall system health, performance and uptime.

14 Comments

When the SLA is reported at 99.9x%, does it exempt down time for planned maintenance, or is this an "absolute" up time and availability stat?

I keep shooting for five 9's, and can achieve it only if I'm allowed to exempt planned outages for preventive maintenance, equipment refresh, and code or component upgrades.

For the article, absolutely monitoring is required!  Especially if your contract defines refunds or damages or rate changes if the SLA is violated or exceeded.  I can't think of a better use for Orion's modules, and if your Exchange service provider is not doing a great job, and if those refunds or penalties are defined in your contract, that's putting money in your bank and putting a watch dog on the provider.  Talk about an incentive for them to keep to the SLA!

Level 8

Thanks for the write-up.

Level 12

Yes, maintenance should be done properly.

Level 10

Office 365 uptime SLAs do not typically include maintenance, but it varies per area within the product. Here is a good link to get you started with the area you are inquiring about: Microsoft Volume Licensing - Product Licensing Search

Level 9

O365 SLAs are not based on x 9's but rather on recovery time. Listen to one of their deep dive discussions on how they make O365 run: they don't concentrate on five 9's. They say that systems will go down, and what they really look at is how quickly they can recover when that happens.

MVP

I agree that one must still do monitoring even if the server lives in the cloud. There are plenty of times when the provider doesn't even know there's an issue until someone informs them (they should run SolarWinds).

MVP

SLA's are a 4 letter word around here...

MVP

First time looping in cloud services with O365. Very obvious that monitoring is vital; just starting to figure out how to do that now...

Level 10

"What's an SLA...?" and "These are not the SLAs you are looking for" were somewhat common phrases at some of my previous employers...

Level 12

Everyone agrees that proper maintenance is needed for any technology.

Level 12

happy new year 2016.........

Level 17

I can monitor something I do not control, but I cannot trust something that I cannot monitor.

SLAs/SLDs are very critical in this new world of managed or 'out of house' solutions.

Level 8

Good article - brings up things to consider.

Interesting.... Lots to think and ponder.