Geek Speak

22 Posts authored by: Andy McBride

Occasionally I'll run across a customer who needs to quantify the traffic overhead associated with network management. Ten years ago we used to accomplish this using traffic stats from NMS interfaces, some assumptions, and some arithmetic. What we would do is take average daily traffic from the NMS and assume that all nodes impacted the traffic equally. From there we would calculate the impact on WAN connections using the number of nodes on the far side of the WAN. While this method probably does yield a good estimate of the traffic, these days the words "probably" and "estimate" are usually not sufficient.
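For reference, the old back-of-the-envelope arithmetic looks roughly like the sketch below. The function name and all of the figures are hypothetical; the only thing taken from the method above is the assumption that every node contributes equally to the NMS traffic.

```python
def estimate_wan_mgmt_overhead(avg_nms_bps, total_nodes, remote_nodes, wan_capacity_bps):
    """Old-style estimate: spread the NMS traffic evenly over all managed nodes."""
    per_node_bps = avg_nms_bps / total_nodes        # assumed equal share per node
    wan_mgmt_bps = per_node_bps * remote_nodes      # share that must cross the WAN
    percent_of_link = 100.0 * wan_mgmt_bps / wan_capacity_bps
    return wan_mgmt_bps, percent_of_link


# Hypothetical figures: 2 Mbps average NMS traffic, 500 managed nodes,
# 40 of them on the far side of a 1.544 Mbps link.
bps, pct = estimate_wan_mgmt_overhead(2_000_000, 500, 40, 1_544_000)
print(f"{bps:.0f} bps of management traffic, {pct:.1f}% of the link")
```

The weak spot is the equal-share assumption, which is exactly why a measured number is better than this estimate.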

 

Today, chances are that you are already measuring this traffic; you just need to know where to look. If you are using NetFlow Traffic Analyzer (NTA), finding the data is easy. NTA has features that allow you to define traffic types by an IP address (endpoint) or as an application (IP and ports). A simple way to separate network management traffic from production traffic is to use the IP Address Group feature. This option is found on the NTA settings page.

 

NTA_IP_ADD_group.png

 

Add a group with the IP addresses of your NPM and SAM servers, and then disable the default groups.

 

ADD IP  ADD grp.png

 

Now, on to the NetFlow -> IP Groups page.

 

netflow total ip mgmt traffic.png

This graph is showing the total amount of traffic from management servers network-wide. The question is how much of this traffic is going over any particular WAN link and how is that affecting the WAN? This part is easy. Just navigate on the IP address group page to the NetFlow Sources and drill down to the interface you are interested in.

 

NTA management traffic per IF.png

Here we can see that the total bandwidth for management traffic on this interface peaks out at about 0.4%. This interface is Ethernet, so your WAN connections will probably peak a bit higher than this. If you want to add your LEM, SAM, FSM, or other management traffic, just add the IP addresses of those servers to the IP Address Group, and you are done.

 

The nice thing about this is that NTA keeps the IP address groups, so you never have to rebuild these views.

 

If you want to try this on your network but don't have NTA, try the 30-day evaluation and see what your overhead is.

 

I have been in IT since it was called Management Information Systems (MIS), and I still have no idea what that meant. I think IBM had a contest to see who could come up with the most obscure names for various technologies. At one point I was working for a law firm keeping their NetFrame 450s up and running NetWare 3.2. If you don't know what a NetFrame was, it looked a lot like the monolith that the primates were beating with bones at the beginning of 2001: A Space Odyssey. I'm pretty sure that it came with the same toolset from the movie as well. Between wrestling those beasts and helping users who had never seen a PC before, I learned a lot of things that were not covered in any of the training books or courses I attended. Here are a few things I learned the hard way:

 

  1. Entering the name of a NetWare loadable module directly rather than entering "load" first abends the server, resulting in a call to "the office".
  2. Novell went away for a reason.
  3. Spare parts are of no use when the key for the storage room is lost.
  4. The NetFrame we recommended to a customer was last seen with a jack-in-the-box head and arms taped to it.
  5. Despite many predictions, Ethernet does scale.
  6. Users do not appreciate diagnosing the issue as a "loose nut on KB".

The good part of all of this is that the problem technologies (and weird names) have mostly gone the way of the Dodo. I am constantly amazed at the reliability of networks today and at the very short Mean Time to Restore (MTTR) we are seeing.

 

We have done a lot of growing this year at SolarWinds. Chances are that there is no IT management issue you have that we don't have a great solution for. If it has been a while, check out the depth of our product lines and what our customers have to say.

 

One of the most visited posts ever on Geek Speak talks about when to make the move to an Enterprise Operations Center/multi-Orion core architecture. Now that companies are investing in IT infrastructure again, I think it deserves a second look!

 

http://thwack.solarwinds.com/community/solarwinds-community/geek-speak_tht/blog/2010/02/25/understanding-when-to-deploy-a-distributed-network-management-architecture

 

Enjoy!

We have been seeing a pronounced trend of customers making the transition to SNMPv3 over the past year. Most of the feedback we get is that the transition was a lot easier than the customer had predicted. Consider the benefits SNMPv3 gives you over SNMP v2c.

  • Encrypted Communications - no plain text community strings.
  • User Level Access Control - specific message types according to requesting device.
  • View Based Access Control - allow access to a part of a MIB rather than all or none.

 

Rolling out SNMP v3 should be accomplished in a two-step plan:

  1. Place the new SNMP v3 configuration on one device and test. This will typically be done in a lab environment.
  2. Roll out the configuration to production devices using a configuration management tool.
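
As a rough illustration of the two steps above, here is a minimal sketch that renders one SNMPv3 configuration and splits the rollout into a lab test and a production deployment. The function names, group, user, view, and device names are all hypothetical, and the command lines assume Cisco IOS-style syntax (authPriv with SHA and AES-128); check the vendor guide for your platform before using anything like this.

```python
def render_snmpv3_config(group, user, auth_pass, priv_pass, view="MGMT-VIEW"):
    """Return IOS-style SNMPv3 lines for one device (assumed syntax, authPriv)."""
    return [
        f"snmp-server view {view} iso included",
        f"snmp-server group {group} v3 priv read {view}",
        f"snmp-server user {user} {group} v3 auth sha {auth_pass} priv aes 128 {priv_pass}",
    ]


def rollout_plan(lab_device, production_devices):
    """Step 1 tests on the lab device; step 2 covers production once step 1 passes."""
    return [("test", [lab_device]), ("deploy", list(production_devices))]


# Hypothetical names and placeholder passwords, for illustration only.
config = render_snmpv3_config("NMS-RO", "orion", "AuthPassHere", "PrivPassHere")
plan = rollout_plan("lab-rtr-01", ["wan-rtr-01", "wan-rtr-02"])
for line in config:
    print(line)
```

In practice the "deploy" stage is where the configuration management tool takes over; the sketch only shows that the same rendered configuration is used in both stages.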

 

Here are a couple of reference papers to get you started.

 

http://www.cisco.com/en/US/docs/ios/12_0t/12_0t3/feature/guide/Snmp3.html

http://www.solarwinds.com/documentation/Orion/docs/Implementing_SNMPv3r1.pdf

 

--

 

Andy McBride

VMware and SNMP

Posted by Andy McBride Dec 24, 2012

Virtual servers are perhaps the best technology invention of the past 15 years. Just considering the cost savings and positive impact of doing more with less, nothing comes close in my opinion. As the virtual server market grew, some server management tools fell behind, SNMP most notably. With SNMP implementations ranging from thin to non-existent, good solutions were hard to find. We created a free VMware tool, but in ESX 3.5 you had to manually configure the servers by editing the snmpd.conf file. In ESXi 3.5 the SNMP agent only had trap capabilities.

 

Along comes Cloud Computing and vSphere!

The concept of cloud computing, along with the very large scale deployments that make the cloud possible, forced VMware to create a centralized and intelligent management center. Wikipedia has a very good graphic depicting the inherent complexity of a vSphere environment. As you can see, there really is no room for manual configurations to enable SNMP. vSphere is the cookie cutter that enables the infrastructure; all you have to do is point it back to your SolarWinds Virtualization Manager and Storage Manager.

 

Occasionally we get a customer who needs to add a network to Network Performance Monitor that is not reachable from their main network. This type of connectivity is shown in the figure below.

dual nic orion.png

Making this work is not very difficult as long as you follow a few rules.

 

  1. As always, the networks cannot share any IP address space.
  2. Don't count on the default gateway alone to define any networks except the ones directly attached to your server.
  3. Use the route add command in a Windows prompt to add all networks you need to manage.
  4. Do not connect the two networks. If you add a router later the ability to route between the networks will break this solution.
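
Rules 1 and 3 above can be sketched in a few lines. The helper names, networks, and gateway below are hypothetical; the output uses the standard Windows `route -p add <network> mask <netmask> <gateway>` form, where `-p` makes the route persistent across reboots.

```python
import ipaddress


def check_no_overlap(networks):
    """Rule 1: raise ValueError if any two networks share IP address space."""
    nets = [ipaddress.ip_network(n) for n in networks]
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"{a} overlaps {b}")
    return nets


def route_add_commands(networks, gateway):
    """Rule 3: build one persistent 'route add' command per remote network."""
    return [
        f"route -p add {net.network_address} mask {net.netmask} {gateway}"
        for net in check_no_overlap(networks)
    ]


# Hypothetical isolated networks reached through the second NIC's gateway.
for command in route_add_commands(["10.20.0.0/16", "10.30.4.0/24"], "192.168.50.1"):
    print(command)
```

Run the printed commands from an elevated Windows prompt on the server; the script itself only builds the strings.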

 

To see more unique Network Management ideas, visit our NPM thwack forum.

 

 

 

Network Management and the Management Information Base (MIB)

If you are working in IT, chances are good that you have been working with SNMP and MIBs. Almost every type of Network Management System uses MIBs. In my New to Networking series, I discuss MIBs at length in Volume 4, Introduction to SNMP. I think that paper gives a good overview of MIBs and how they function, so I won't discuss that here. What we will look at is the process of creating a MIB. The Internet Engineering Task Force (IETF) is the party responsible for overseeing the MIB development process and approving MIBs as standards. The IETF welcomes anyone with technical competence to submit work for development and eventual approval as a MIB. So, where does this process start?

 

Step One - Request For Comments (RFC) Submission

To get the MIB "ball" rolling, the submitter works with IETF members to create an RFC and have an RFC number assigned. The RFC is a working document for the submitter to communicate their need and intention for a new MIB. The IETF is so invested in the RFC process that they have an RFC that defines the IETF. IETF members review the RFC submission and comment back to the submitter and to any IETF working committee assigned the RFC. If the IETF decides to move forward with the MIB, they assign the RFC a MIB number on the experimental MIB branch and reserve a MIB number on either the standard or enterprise branch. Standard branch MIBs are vendor independent, whereas enterprise branch MIBs are vendor specific.

 

This diagram shows the structure of MIBs from the root to these branches.

toplevelMIB structure.jpg
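
To make the branches concrete, here is a small sketch that classifies an OID by its prefix. It assumes the usual internet subtree numbering (1.3.6.1.2 for mgmt, 1.3.6.1.3 for experimental, 1.3.6.1.4.1 for private enterprises); the function name and the sample OIDs are just for illustration.

```python
# Prefixes under iso(1).org(3).dod(6).internet(1)
BRANCHES = {
    (1, 3, 6, 1, 2): "mgmt (standard)",
    (1, 3, 6, 1, 3): "experimental",
    (1, 3, 6, 1, 4, 1): "private.enterprises (vendor specific)",
}


def classify_oid(oid):
    """Return the name of the top-level branch a dotted OID string falls under."""
    parts = tuple(int(p) for p in oid.strip(".").split("."))
    for prefix, name in BRANCHES.items():
        if parts[:len(prefix)] == prefix:
            return name
    return "other"


for sample in ("1.3.6.1.2.1.1.1.0", "1.3.6.1.3.1", ".1.3.6.1.4.1.9.9.109"):
    print(sample, "->", classify_oid(sample))
```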

 

Step Two - RFC and MIB Approval

Now don't let this fool you into thinking that it is an easy process to get to this point. This RFC describes the process in full. IETF members tend to be very academic and process-oriented people, so this is not for those who are easily frustrated. The good news is that this process works and is the globally accepted method of creating and publishing a new MIB. If you want to take a deeper dive into MIBs, I recommend SNMP MIB Handbook by Larry Walsh.

It seems to me that any reference to honey pots requires a reference to Winnie the Pooh, so I'll get that out of the way, and we'll all have that song going through our heads all day. A network honey pot works just like the kitchen honey pot. You open one up and the honey pot attracts flies. In networking the flies are hackers or malicious bots. Honey pots operate by appearing to a hacker to be an attractive hacking target. The honey pot emulates a full production network or network device, so a hacker snoops around while the honey pot captures information about the hacker's attack and the hacker. Network Security Engineers respond to the attack and change firewall rules to prevent further attack.

 

Honey Pot Implementations

 

I have seen two implementations of honey pots, a DMZ implementation and a production server implementation. Here is an illustration of each type:

 

DMZHoneypot.jpg

_____________________________________________________________________________________________

 

serverHoneypot.jpg

________________________________________________________________________________________________

 

The DMZ implementation has an obvious advantage: the hacker can be tracked before they get into the production network. The server implementation also has an advantage: it detects and traps hackers that have made their way into the production network. So, which one should you use? Both. Implementing one type does not eliminate the ability to implement the other type. Most companies have multiple honey pots in multiple locations. Once an intrusion has been detected, it has to be eliminated as fast as possible, and the firewalls should all be updated to exclude that attack.

 

Firewall Security Management

 

Centralized firewall security management and device configuration management are crucial at this point. As your network grows, the number of firewalls grows. If you have to manually push new rules out to a few firewalls, that can take a lot of time, and during this time the attack is probably still active. Bringing all of your firewalls into a multi-vendor management platform shortens this process, lessening the threat.

 

 

Andy McBride

Working with DNS

Posted by Andy McBride Nov 8, 2012

DNS is probably the most broadly used network service today, yet to most DNS users the service is completely transparent. The most common use for DNS is navigating to a web site. Whether you go to a site by typing the site name in a browser or by clicking on a link, the first stop is a DNS server. Computers use numerical IP addresses, not canonical names, to find other computers. People are much better at remembering names than numbers. DNS servers translate the name used for a network device, such as a web server, to an IP address for that server. From there IP routing and a few other technologies connect you to the server. It all sounds simple, but here is an analysis of the DNS resolution for Microsoft.com from my desk using SolarWinds Engineer's Toolset DNS Analyzer tool.

msdns.png

As you can see, there is a lot going on. Microsoft wants to make sure that people can always access their site, so they have implemented multiple name servers with multiple paths to two authoritative addresses for Microsoft.com. Compare the DNS map above with the results of an nslookup below done from the same PC.

msnslookup.png

...and the results of a trace route from my PC.

mstrcrt.png

What we see is nslookup locating the two addresses for Microsoft and the traceroute finding a path through the Internet to Microsoft. Microsoft chooses to configure their web server to not respond to ping, so that is where it times out. This is not an unusual practice.
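
If you want the same kind of answer nslookup gives from a script, a minimal sketch follows. It uses the standard library resolver call `socket.getaddrinfo`; the function name `resolve_all` and the injectable `lookup` parameter are my own additions so the logic can be exercised without network access.

```python
import socket


def resolve_all(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses a name resolves to."""
    results = lookup(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({sockaddr[0] for _family, _type, _proto, _canon, sockaddr in results})


# Requires network access and a working resolver:
# print(resolve_all("microsoft.com"))
```

The addresses you get back depend on your resolver and location, so do not expect them to match the screenshots above.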

 

DNS has many other functions, including reverse DNS, where an IP address is translated to a name. As a former network engineer, I prefer to use IP addresses rather than DNS names where possible. The problem with continuing that practice is the rapid adoption of IPv6 addresses. Google's public DNS IPv6 address is 2001:4860:4860::8888, and that is an abbreviated address!
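
To see what that abbreviation hides, and what a reverse DNS query name looks like, here is a short sketch using Python's standard `ipaddress` module. The helper names are hypothetical, and 192.0.2.10 is a documentation address used purely as an IPv4 example.

```python
import ipaddress


def expand(ip):
    """Return the full, unabbreviated form of an IP address."""
    return ipaddress.ip_address(ip).exploded


def ptr_name(ip):
    """Return the name a reverse DNS (PTR) lookup would query for this address."""
    return ipaddress.ip_address(ip).reverse_pointer


print(expand("2001:4860:4860::8888"))
print(ptr_name("2001:4860:4860::8888"))
print(ptr_name("192.0.2.10"))
```

The IPv6 reverse name is built one hex digit at a time under ip6.arpa, which is a good argument for letting DNS do the remembering.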

 

The bottom line is that the more you know about DNS, the more you understand how critical it is. Check out our other tools for DNS, including many free ones, at DNS Stuff.

 

 

I have been working with our Web Help Desk team the past couple of months, so I have learned a lot about Service Management. The first thing that strikes me is how much easier it is to implement and use compared to the NOC builds I used to do in the 90s. The typical ticketing system back then was a client/server application running on a proprietary database. Not only were these systems a nightmare to install, we would also have to train a DBA and the help desk team before anyone could use it. The installation, configuration, and testing was about a two-week process, performed by me, the NOC Consultant.

 

Get Service Management Running Quickly!

What was my experience this time around? A lot of things have changed for the better. Here is the list:

  • Installation under 20 minutes.
  • Choice of industry standard databases.
  • Configuration and testing in one day.
  • A great deal of flexibility without feeling like I was lost in a spider web.

That's when it hit me! It really felt like a spider web trying to get those old systems humming. It wasn't the nice, concentric type of web, more the spider on crack CIA experiment web.

 

Join Us and See for Yourself

On November 15 at 11:00 AM CST, Manish Chacko and I will be hosting a live Webinar featuring Web Help Desk Solutions to Difficult Service Management Issues. This will be a one-hour deep dive into the flexibility and ease of use built into Web Help Desk. Look for an invitation for this event soon!

The heat in the tablet battles got turned up to 11 in the past couple of weeks. With Apple announcing the iPad mini, the new Amazon Kindle Fire and all the other competitors adding features and slashing prices, the consumer market is a good place to be. I find it interesting how quickly IT departments have adapted to the explosion of Bring Your Own Device (BYOD) connectivity demands. My tablet has VPN software, so I can access my development VMs from anywhere I get 3G or Wi-Fi. This begs the IT question, "What are all these BYOD users doing and how do we manage them?".

Managing a BYOD World

One thing that makes managing BYODs fairly easy is the fact that they almost all use Wi-Fi to connect to the company network. IT departments can use the tools they probably already have to manage BYODs just like they manage most endpoints. Let's start from the connectivity with wireless network management. Wireless access points grant access to BYODs just like any other wireless device. This means that the BYODs are subject to the same access and security restrictions. From there, the BYOD traffic usually accesses the Internet using the company IT infrastructure, so you manage whatever the BYODs are doing just as if they were any other device. The BYOD devices whose traffic crosses NetFlow export interfaces are all NetFlow endpoints, traceable by NetFlow Traffic Analyzer.

 

A Word for the BYOD Users

BYOD does not equal bring your own rules. I have overheard a couple of BYOD users who were startled to find out that their device was denied access to a blacklisted site. They seemed to believe that because it was their own device, it was none of the company's business what they accessed. I'm not judging here, just reminding you that the rules apply to BYODs just like they do for your company laptop. That is not my opinion; that is just how the network handles endpoint traffic.

 

 

Andy McBride

Making SQL Scream

Posted by Andy McBride Oct 29, 2012

If you are one of the thousands of users of our SolarWinds products, or one of the users of thousands of other products that use SQL for the back-end database, you already know that the importance of SQL performance cannot be overstated. Any application that interfaces with a database is only as fast as the database it uses. The good news for Microsoft SQL users is that there are a lot of things you can do to make sure your SQL Server is a screaming machine. Here are some of my recommendations:

 

  • Keep your primary files (yourdatabase.mdf and yourdatabase.ldf) and your temporary files (tempdb.mdf and templog.ldf) on separate arrays. Moving the I/O intensive temp traffic off of your primary file drives will result in a nice performance boost.
  • Use RAID 10 for primary and temp files. RAID 5 or 6 will kill your array speed. RAID 01 is a poor choice as well, as many controllers do not implement it well.
  • Use 15,000 RPM drives or high-end SSDs if possible. SSD prices continue to fall, and longevity now rivals spindle storage.
  • Use 64 bit SQL and lots of RAM.
  • If SSDs are not an option, try RAMDisk for your primary .MDF files.

Just implementing 3 or 4 of these will have a huge impact on your SQL performance and the application using SQL for data storage.
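
To put a rough number on the RAID recommendation, here is a back-of-the-envelope sketch using the commonly quoted write penalties (2 back-end I/Os per write for RAID 10, 4 for RAID 5, 6 for RAID 6). The drive count, per-drive IOPS figure, and 50% write mix are assumptions for illustration, not measurements.

```python
# Commonly quoted back-end I/Os generated per front-end write.
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def effective_iops(drives, iops_per_drive, write_fraction, level):
    """Estimate front-end IOPS an array can sustain for a given read/write mix."""
    raw = drives * iops_per_drive
    penalty = WRITE_PENALTY[level]
    return raw / ((1 - write_fraction) + write_fraction * penalty)


# Assumed workload: 8 drives at about 180 IOPS each, half of the I/O is writes.
for level in WRITE_PENALTY:
    print(level, round(effective_iops(8, 180, 0.5, level)))
```

With a write-heavy workload such as tempdb, the gap between RAID 10 and the parity levels only gets wider.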

 

For more tips, take a look at this Technical Reference on Managing Orion Performance.

I believe that an IT department's best tool for keeping customers happy is communication. When I turn the monitors on, log in, and begin the day's first task, I expect to find what I need on the network quickly. I want to get my first task done and move on to the next. It's like flying: I want to get from point A to point B and then move on to the next trip. There is nothing worse than showing up for a flight to find it is delayed for some unknown amount of time and some unknown reason. I traveled two to three times a week during most of the nineties. Unfortunately this was probably the worst period for airline communications. This was the time of the big IT build-out, and IT was hearing the same thing the airlines were, "Tell me what the heck is going on!".

 

IT Service and Bragging Rights

 

IT services are much more reliable overall than when we were just figuring out how to deliver networks on a large scale. TCP/IP winning the protocol wars was a big step forward. Not that TCP/IP was the best layer 3 and 4 solution, but it was the most widely accepted, and the deal was done when Microsoft and Novell announced native support for TCP/IP. IT is by no means an easy job; you have to know a lot and be able to connect the dots in pressure situations. Coordinating hundreds of flights every day can't be easy either. I don't know if the airlines are actually doing better than they were 15 years ago, but I feel more confident traveling now because they have improved so much in communicating with passengers.

 

Here are three steps I see IT following today to make their users more confident when an issue arises:

  1. Detecting the issue.
  2. Assessing the impact.
  3. Communicating the issue to the users.

Then IT does something the airlines probably will never do; they ask you how your experience was. Not only that but they communicate the issue status all the way through to resolution.

 

IT Service Management Automation

If you have read many of my blogs, you know I'm a huge fan of automation in IT. Without automation the above steps would take a long time. So, what are the gears that make this automation possible?

  1. Network Management System (NMS) fault and performance alerting.
  2. NMS alert to Help Desk ticketing.
  3. Help Desk task management, messaging, and surveying.

When the above are all tied together, IT manual task and process overhead are greatly reduced and the user knows that IT sees the issue and is on it. With SolarWinds Network Performance Monitor, Server and Application Monitor,  and Web Help Desk you will have this covered.
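
Here is a toy sketch of how those three gears mesh: an NMS alert becomes a help desk ticket, and the ticket drives the message to users. Every name in it (the `Ticket` class, the alert fields, the wording of the notice) is hypothetical and is not the API of any SolarWinds product.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    subject: str
    priority: str
    status: str = "open"


def alert_to_ticket(alert):
    """Gear 2: turn an NMS alert (a plain dict here) into a help desk ticket."""
    priority = "high" if alert["severity"] == "critical" else "normal"
    return Ticket(subject=f"{alert['node']}: {alert['message']}", priority=priority)


def user_notice(ticket, affected_service):
    """Gear 3: tell users that IT sees the issue and where it stands."""
    return (f"IT is aware of an issue affecting {affected_service} "
            f"and is working on it (ticket status: {ticket.status}).")


# Gear 1 would be the NMS raising this alert; the values are made up.
alert = {"node": "core-sw-01", "severity": "critical", "message": "interface down"}
ticket = alert_to_ticket(alert)
print(user_notice(ticket, "the file servers"))
```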

 

Now if I could just stop the airlines from charging me extra for some leg room...

 

 

Knowing the Vulnerabilities is Key.

From what I have seen in my years of network management, there is a good deal of misunderstanding surrounding implementing SNMP. The procedures for basic implementation are well understood, but the problems with using default settings and broadcasting SNMP are not well known. Considering that SNMP is used to manage almost every network, making sure that you secure access to SNMP is critical. Here are the areas I believe you should check in your network.

 

Proper Use of Community Strings.

Community strings are a type of password. They control access to Management Information Bases (MIBs) and define the level of access. Here is where one of the problems occurs. Now I don't have a scientific poll, but I am willing to bet if you asked a group of network engineers what the SNMP v2c community strings are, a good number of them would answer, "public and private". The correct answers are read only and read/write.


This misunderstanding happens because the default settings for read only and read/write are "public" and "private". When SNMP v2c is enabled, most devices will populate the read only and read write community string fields with these defaults. I have seen more than once where the Network Management System (NMS) SNMP strings were then set to public and private to allow the NMS to communicate with the devices. Here are a few things you can do to increase the level of security on SNMP v2c.


Best Practices to Avoid SNMP Security Issues.

  • Never use default community strings on devices or your NMS.
  • Use unique community strings by geography or by device function. For example, create unique community strings for WAN access devices, EMEA area devices, data center devices, etc. SNMP v2c community strings are passed in plain text, so this way if one area or device type becomes compromised, the rest of the network is not compromised.
  • Run a scheduled discovery for devices using the default community strings as well as the discoveries using valid strings. Once you have a discovery for devices answering to default strings, add an alert for that condition. This automates locating and taking action to correct these devices. You will want to give your network security team a heads-up, as a scan for default community strings may trigger alerts in security devices.
  • Use automated network configuration management. While default community strings are a security issue, the root cause lies in configuration management weaknesses.
  • Use an automated policy compliance reporting package to demonstrate compliance with internal and external policy requirements.
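
As a small illustration of the first and fourth practices, the sketch below scans a saved device configuration for default community strings. It assumes Cisco IOS-style `snmp-server community <string> RO|RW` lines; the function name and the sample configuration are hypothetical.

```python
import re

DEFAULT_STRINGS = {"public", "private"}
COMMUNITY_LINE = re.compile(r"^\s*snmp-server community (\S+)", re.IGNORECASE)


def find_default_communities(config_text):
    """Return (line number, line) pairs that still use a default community string."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = COMMUNITY_LINE.match(line)
        if match and match.group(1).lower() in DEFAULT_STRINGS:
            hits.append((number, line.strip()))
    return hits


# Made-up configuration fragment for illustration.
sample = """hostname edge-rtr-01
snmp-server community public RO
snmp-server community n0tDefault RO
snmp-server community private RW
"""
for number, line in find_default_communities(sample):
    print(f"line {number}: {line}")
```

A check like this only catches what is in the saved configuration, so it complements, rather than replaces, the scheduled discovery described above.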

 

A couple of great pieces of software to accomplish the above best practices are SolarWinds Network Performance Monitor (NPM) and Network Configuration Manager (NCM).

 

NPM network monitoring software discovers devices with default community strings and alerts on the issue. NCM offers extensive configuration management and compliance reporting.

If you are ready for the jump to SNMP v3, check out this technical reference.

 

SNMP - It's not just your grandfather's protocol anymore!

 

Yes, SNMP has been around long enough for a few grandfathers to have used it. In version 1 there was not really a lot you could do. You could find the value of a system OID and set an OID if you really knew what you were doing. Version 2 added the ability to perform get bulk requests, rather than the old way of issuing strings of get next requests. Using get bulk was like being able to look at a whole page of a book, where get next was like having to ask for each word one at a time.
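
To put the page-versus-word comparison in numbers, here is a simplified count of the requests needed to read a table. It ignores the extra request a real walk needs to detect the end of the table, and the 480-object table and max-repetitions value of 25 are assumptions chosen for illustration.

```python
import math


def getnext_requests(num_objects):
    """Get next: one request per object retrieved."""
    return num_objects


def getbulk_requests(num_objects, max_repetitions):
    """Get bulk: each request can return up to max_repetitions objects."""
    return math.ceil(num_objects / max_repetitions)


objects = 480  # e.g. 48 interfaces x 10 columns (hypothetical)
print("get next requests:", getnext_requests(objects))
print("get bulk requests:", getbulk_requests(objects, 25))
```

Fewer round trips matter most over WAN links, where each request pays the full latency.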

 

Version 3 added a very strong and flexible security mechanism along with some other minor features. For a long time version 3 seemed to scare people off because of the many options of the security models it uses.  If you have been interested in SNMP v3 and want to know how it is implemented, see this Technical Reference.

 

If you want to know more about SNMP and how it works, see this educational paper.
