
Here at SolarWinds we offer a product called EOC - or Enterprise Operations Console. EOC is used when you want to deploy Orion in a distributed management architecture. Today I received a question about when you should think about deploying such an architecture so I thought I'd take a few minutes to write on the subject.

 

Before we go into the reasons to deploy such an architecture we should agree on what exactly a distributed network management architecture is. For the purposes of this conversation, a distributed network management architecture is any NMS that has more than one core, the core being the main server and/or database. In many cases these systems are distributed geographically, but not always. In almost all cases, because the network management data is being stored in more than one system, you'll make some tradeoffs with regard to global reporting capabilities. These tradeoffs can be mitigated, but not without additional investment of time and/or resources.

 

There are three primary reasons for deploying a distributed network management architecture. Let's explore each individually...

 

Scale - the reason that most organizations deploy a distributed management system is that their network is so big that a single core system simply can't keep up. With Orion (for instance) this typically happens when the network grows beyond around 50,000 managed elements. At that size, the database backend for the NMS (SQL Server in Orion's case) becomes the bottleneck, because some of the data that Orion collects (like NetFlow and Syslog) can be quite database intensive from an I/O standpoint.

 

Geography - if your network has two logical hubs, for instance if you have networks in Europe and Asia with data centers in each, it might make a lot of sense to deploy a geographically distributed NMS. When you're polling a large network across the WAN, especially an expensive and possibly high-latency WAN, you can experience a significant degradation in data quality and performance visibility. Additionally, if the WAN connection is down you aren't able to collect data for those remote networks. Add to that the cost of polling for data across inter-continental links and a distributed NMS suddenly becomes a very good idea for these types of networks. In these cases the question is usually not whether or not to distribute but whether or not to roll up the results into a product like EOC.

 

Disaster Recovery (DR) - another common reason for deploying your NMS in a distributed model is for disaster recovery. We see this commonly within companies that have DR sites or redundant data centers. These are special cases and so we'll not go into the details here.

 

So, in a nutshell, if you've got Orion and you're wondering whether or not you should deploy EOC, ask yourself these questions:

a) Would I benefit from distributing my NMS geographically?

b) Has my NMS grown beyond what I can build a SQL Server to support?

 

Ping me back if you have any questions/comments and be careful out there...


Flame on...
Josh

 

 

One of the most common things I'm asked to explain is the difference between Cisco IP SLA and NetFlow. At a glance, they have a lot in common:

  • They're both supported on many Cisco devices (routers, switches, firewalls, etc.)
  • They both can help you understand network performance, especially on the WAN
  • Network management applications like Orion and the SolarWinds free tools support both

All that said, the similarities mostly end there...

NetFlow
Let's for a moment group all flow technologies including NetFlow, JFlow, sFlow, and IPFix together and just call them "NetFlow" to keep things simple. NetFlow is a feature on your routers and switches that analyzes the traffic that is going in one interface and out another to tell you things like the source and destination IP addresses, the protocol, the application (really source and destination ports), the amount of data being sent/received, and so on. NetFlow's primary purpose is to help you understand the bandwidth consumption on your network. It answers the questions of "who is using my bandwidth and what are they doing with it?"
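To make the "who is using my bandwidth" question concrete, here is a minimal Python sketch that totals bytes per source address across a handful of flow records. The FlowRecord type, its field names, and the sample values are all made up for illustration; real NetFlow exports carry more fields, and this is not how Orion or any particular collector does it.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """Hypothetical, simplified flow record (not a real export format)."""
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int
    byte_count: int


def top_talkers(flows, limit=3):
    """Return the (source IP, total bytes) pairs that sent the most data."""
    totals = defaultdict(int)
    for flow in flows:
        totals[flow.src_ip] += flow.byte_count
    # Sort by total bytes, largest first, and keep the top `limit` entries.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Illustrative sample data only.
SAMPLE_FLOWS = [
    FlowRecord("10.1.1.20", "137.3.11.8", "TCP", 51512, 80, 48_000),
    FlowRecord("10.1.1.31", "10.199.1.15", "TCP", 49822, 80, 12_500),
    FlowRecord("10.1.1.20", "10.199.1.15", "TCP", 51513, 443, 30_000),
    FlowRecord("10.1.1.44", "10.1.1.1", "UDP", 53000, 53, 300),
]

for src_ip, total_bytes in top_talkers(SAMPLE_FLOWS, limit=2):
    print(f"{src_ip}: {total_bytes} bytes")
```

Grouping on dst_port instead of src_ip would answer the "what are they doing with it" half of the question in the same way.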

NetFlow can't tell you about things like application performance, response time, errors, jitter, and packet loss. Remember - its job is to help you analyze bandwidth consumption, not to help you analyze network performance.

Cisco IP SLA
IP SLA is very different from NetFlow. Cisco IP SLA is a feature on your routers and switches that allows you to configure these devices to run tests from their location on the network. The tests, called operations, are used to take measurements of network performance and reachability. For instance, you may want to know how HTTP traffic going to google.com varies from your different WAN locations. You can use IP SLA to measure this from the edge routers in those sites. You might also use IP SLA to measure latency, jitter, MOS, DNS performance, and so on.

The devices run these operations on a scheduled basis and store the results in memory. Then, you use a network management system like Orion to pull the data back to a central location for analysis, alerting, and reporting.

The sweet spot...
Ultimately, you'll want to be able to use both NetFlow and IP SLA together so that you have a complete picture of network performance. In a perfect world, IP SLA tells you where you have issues and what they are - NetFlow tells you why.


Flame on...
Josh
Follow me on Twitter

One of the most difficult aspects of analyzing network traffic is that so much of today's network traffic is web traffic riding on either port 80 (HTTP) or port 443 (HTTPS). When you analyze network traffic using a technology like NetFlow, sFlow, JFlow, or IPFix, the flow data tells you (among other things) the source and destination addresses, the protocol (TCP, UDP, etc.), and the source and destination port numbers - but not the application.

There are a few ways of getting around this. The latest version of the Orion NetFlow Traffic Analyzer (NTA) leverages one of these methods by allowing you to define applications in terms of port and address groups, ranges, and combinations. For instance, you may say that any HTTP traffic to the address range 137.3.11.0/24 is Exchange OWA traffic while HTTP traffic to 10.199.1.0/24 is Intranet traffic.
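The idea behind that kind of mapping can be sketched in a few lines of Python. This is only an illustration of the concept using the two example ranges above; the APP_RULES table and the classify function are hypothetical names, and nothing here reflects how NTA actually stores or evaluates its application definitions.

```python
import ipaddress

# Hypothetical rule table: (destination network, destination port, application name).
APP_RULES = [
    (ipaddress.ip_network("137.3.11.0/24"), 80, "Exchange OWA"),
    (ipaddress.ip_network("10.199.1.0/24"), 80, "Intranet"),
]


def classify(dst_ip, dst_port):
    """Return the application name for a flow, or a port-only label if no rule matches."""
    address = ipaddress.ip_address(dst_ip)
    for network, port, app_name in APP_RULES:
        # A rule matches only when both the port and the address range agree.
        if dst_port == port and address in network:
            return app_name
    return f"port {dst_port} (unclassified)"


print(classify("137.3.11.8", 80))
print(classify("10.199.1.15", 80))
print(classify("192.0.2.10", 80))
```

The first rule that matches wins, so in a sketch like this more specific ranges would need to be listed ahead of broader ones.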

This latest version of Orion NTA that we've produced here at SolarWinds also includes several performance/scalability enhancements and some great new features that make it much easier to understand the data that NTA is telling you about your network traffic. As always, you can download all of the SolarWinds applications from http://www.solarwinds.com and try them out for free...


Flame on...
Josh

Not too many years ago I ran a large network for the United States Air Force. For the first several years I was there we had an old copper-wire backbone and it was amazing to see the havoc that weather could wreak upon that network. Sometimes you practically needed scuba gear to go into those manholes and troubleshoot wiring issues. You could practically track where the storm was moving based upon the color changes on my maps within my NMS.

Even today, with best in class technologies, weather can play a big role in network performance and reliability. One of the coolest things that you can do to keep an eye on both your network and the current weather is to use an active weather map as the background for some of your maps in Orion. The PM team covers the specifics of how you can do this in "Using a weather map as your background for your maps."


Flame on...
Josh
