The network is responsible. Always.

Level 11

Last time I told you guys I really love the Ford story and how I view storage in the database realm. In this chapter, I would like to talk about another very important piece of this realm, The Network.

When I speak with system engineers working in a client's environment, there always seems to be a rivalry between storage and network regarding who's to blame for database issues. However, blaming one another doesn’t solve anything. To ensure that we are working together to solve customer issues, we need to first have solid information about their environment.

The storage part we discussed last time is responsible for storing the data, but there needs to be a medium to transport the data from the client to the server and between the server and storage. That is where the network comes in. And to stay with my Ford story from last time, this is where other aspects come into play. Speed is one of these, but speed can be measured in many ways, which seems to cause problems with the network. Let’s look at a comparison to other kinds of transportation.

Imagine that a city is a database where people need to be organized. There are several ways to get the people there. Some are local, so the time it takes them to be IN the city is very short. Some live in the suburbs, and their trip takes a bit longer because the distance is greater and more people are traveling the same road. If we go a bit further and concentrate on the cars again, there are a lot of cars driving to and from the city. How fast one gets to or from the city depends on others who are similarly motivated to reach their destination as quickly as possible. Speed is therefore affected by how the drivers behave and by what happens on the road ahead.

The network is the transportation medium for the database, so it is critical that this medium is used in the correct way. Some of the data might need something like a Hyperloop to travel back and forth over medium-to-long distances, while for other data something more modest may be enough for those shorter trips.

Having excellent visibility into the data paths to see where congestion might become an issue is a very important measurement in the database world. As with traffic, it gives one insight into where troubles could arise, as well as offering the necessary information about how to solve the problem that is causing the jam.
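One rough way to start getting that visibility is to time the path between a client and the database server and keep an eye on the spread, not just the average. The sketch below is only an illustration under stated assumptions: the host name `db.example.internal` and port `5432` are placeholders, and timing a TCP handshake is just one simple proxy for how congested a path is. It is not a substitute for proper flow or path monitoring.

```python
import socket
import statistics
import time


def connect_time_ms(host, port, timeout=2.0):
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed again as soon as it is established
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Reduce a list of latency samples to a few numbers worth tracking."""
    ordered = sorted(samples_ms)
    # Nearest-rank 95th percentile: ceil(0.95 * n) as a 1-based rank,
    # computed with integer arithmetic to avoid float rounding.
    rank = -(-95 * len(ordered) // 100)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Hypothetical database host and port; replace with your own.
    samples = [connect_time_ms("db.example.internal", 5432) for _ in range(20)]
    print(summarize(samples))
```

A wide gap between the median and the 95th percentile is the traffic-jam signal: most trips are quick, but some get stuck behind everyone else on the same road.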

I don't believe the network or storage is responsible. The issue is really about how you build and maintain your infrastructure. If you need speed, make sure you buy the fastest thing possible. However, be aware that what you buy today is old tomorrow. Monitoring and maintenance are crucial when it comes to a high-performing database. Make sure you know what your requirements are and what you end up getting to satisfy them. Be sure to talk to the other resource providers to make sure everything works perfectly together.

I'd love to hear your thoughts in the comments below.

15 Comments
michael.kent
Level 13

vinay.by
Level 16

Nice write up

scottmathews
Level 9

How do you measure throughput on that "network?" I'm guessing it's about 2.1 BPS (Bleats Per Second).

Jfrazier
Level 18

Brings a whole new meaning to the term "dropped packets"...

tallyrich
Level 15

Agreed "blaming one another doesn’t solve anything" but there's such a rich tradition in the blame game. I work with a very well functioning infrastructure team, but still have difficulties with the DBAs and the Applications groups.

What I've found is that the infrastructure group is very good about accepting responsibility. If we break something we put our paw in the air and admit it - then as a team we work together to fix it and move on. We are not about blame, but about providing good service.

I think the difference is that our team leadership appreciates honesty and doesn't chase us around with a stick when something goes wrong. We are not "rewarded" for mistakes, but simply held responsible with words like "what went wrong," "what did you learn," and "how can we help prevent that in the future."

ecklerwr1
Level 19

QoS, NetFlow, NBAR, and such have really helped us with these kinds of problems. The QoS is the fun part!

rschroeder
Level 21

Improve efficiency by moving users/people into the city/data center as best you can.  For network users, great big pipes and fast app & database equipment can be part of the solution.  So can Citrix, especially if it keeps users off the regular highways, but is dependent on the Info Highway being big enough to handle the I.P. traffic.  And of course the home pipe (or driveway?) has to be big enough to carry the individual user's load on the highway (network).

The Info highway analogy means roads are network connections, cities are data centers, big-box stores are application servers, warehouses are databases, etc.

If you move your people closer to their jobs, they'll spend less time on the road driving to & from work.  If you have people telecommute to work, they'll spend a LOT less time driving to/from work.

It depends on what you and they can afford.  If everyone in a company wanted to work and live in the same sky scraper, that's a pretty good way to minimize road traffic.  But will they all be happy there?

d09h
Level 16

In a few different jobs, we (network guys) would be asked for a packet capture when an application did not work, as if the capture, with no knowledge of how the system was supposed to work, was all one would need for any issue. Typically other avenues of troubleshooting would pause while the networking team SPAN'd a port and created a capture (or, in one job, used the Infinistream 'time machine' to retrieve conversation X from date Y at time Z). After obtaining a capture, there was the inevitable 'OK network team, why isn't my application working' question.

Sometimes the packet capture issue could be shown to be unnecessary via 'netstat -b -p TCP', which I like to think of as poor man's Wireshark.  If the network is supposedly breaking application 'whatever' on port 49852, why do both sides show an established connection on that port?  "What will I be looking for in the packet capture that I know you will be asking for next?" Obviously, tact is necessary, as is the acceptance that initial clues may be wrong, and maybe it is the network, so tone and choice of words could make or break future collaboration success.

d09h
Level 16

I love the tcpick command on Linux systems as a Wireshark workaround. Obviously tcpdump with a grep helps too, but to just get an understanding of traffic flow, tcpick has saved me so much time. My brother's phone could connect to the IPFire wifi I set up, but Android messages and the lack of working applications indicated he had connectivity issues. For whatever reason, his phone couldn't deal with the fact that my network allowed DNS requests to the handful of DNS servers I designated, and no others. I saw the DNS requests going to servers I had not allowed. Not sure I'd blame the network for this connectivity issue, but it was instrumental in determining a cause and the solution.

Similar story at a previous job.  Hard coded IP addresses in an application are always fun to discover!  It's the network!!!  You made changes in the network, and the application stopped working.  The network is the cause!!

deverts
Level 14

BLEAT, BLEAT, Get out of the way, Ew!

tinmann0715
Level 16

In my experience, I have come to realize that many times it is the network. But more specifically, it is at the physical layer. Next in line: configuration issues... aka incorrect setting(s).

tallyrich
Level 15

Most of my "Network" issues have been at Layer 8 - the Carbon Layer

byrona
Level 21

The lines between networking and storage can also get pretty blurry because networking for storage is a bit of its own animal and has its own set of special requirements, for example setting up jumbo frames or making sure you are using the storage vendor's networking software. Not following best practices on these things can cause significant issues when it comes to storage networking.

gfsutherland
Level 14

I couldn't agree more!!!

UHS - User Head Space!

bturner1
Level 9

Well stated and easy to remember with the automobile analogy.   Thanks!

About the Author
In IT since 1998 and enjoying every last bit of it. The last few years have been mainly focused on virtualization and storage. VMware VCAP-DCA, VCP 4/5, VSP 4/5, VTSP 4/5, MCSA, MCTS, MCP, CCA and CCNA.