
For a year that started out so slowly in terms of “The Big Exit,” 2016 has become surprisingly compelling.

 

There have been very few infrastructure IPOs this year, but there have been some interesting acquisitions, and some truly significant changes in corporate structures. I will highlight a few, and maybe even posit an opinion or two.

 

The hyper-converged market is growing steadily, with new players appearing on a practically daily basis. Nutanix, which stated early on that its exit would be an initial public offering, has pulled back on that timeframe a couple of times recently. They are consistently viewed as the big dog in the Hyper-Converged Infrastructure space. With strong numbers, a loyal fanbase, and a team of salespeople, engineers, and SEs who have at times rabidly promoted their offerings, they come by their stature in this space quite rightly. None of this is meant as a comparison with, or reflection on, any of the competitors in this growing space. It seems that the only thing really holding the company back from its desired exit is a marketplace shying away from IPOs; had they wanted to become an acquisition target, their brand could quite possibly have become part of some other organization’s product line.

 

The massive companies in the space are not immune to these kinds of changes either. For example, after Dell decided to go private, and made that a reality, it set its sights on acquiring the largest storage vendor in the world. After what I’m sure were long and arduous conversations, much negotiation, and quite a bit of oversight from the financial world, the acquisition of EMC is now a reality. We likely won’t know for some time which companies stay in the newly created DellEMC and which get sold off to recoup some of the costs of the acquisition. The new company will be the largest technology company in the world, comprising a huge range of services offerings, storage offerings, converged architectures, and more. It’s a truly stunning acquisition, which will theoretically alter the landscape of the industry in profound ways. The future remains uncertain for Compellent, EqualLogic, Isilon, XtremIO, VNX, and VMAX, to name a few storage-only brands; Dell Professional Services, Virtustream, and even VMware could potentially be spun off. Although, I do suspect that VMware will remain a company doing business as it always has: a free spirit, part of the architectural ecosystem and beholden to none. VCE itself could change drastically, as we really don’t know how this will affect the Cisco UCS/Nexus line.

 

I recently wrote a post on the huge and earth-shattering changes taking place at HPE as well. Under the guidance of Meg Whitman, HP has now split itself into two distinct companies: the consumer-facing business (HP Inc.), comprising desktops, laptops, and printing as well as other well-known HP brands, on one side, while servers, enterprise storage, Aruba networking, and the like become part of the other (Hewlett Packard Enterprise). The transition, first launched at HP Discover 2015, has gone quite smoothly. Channel relationships have, if anything, grown stronger. From this man’s perspective, I am impressed. Then there’s the recent announcement that the enterprise software business will be sold off to Micro Focus: the brand that used to market a version of COBOL, and still a global presence, will now own these major software products. For my money, operations should be just fine; I don’t really see how it’ll change things. Certainly some of our contacts will change, but how smoothly this newest transition goes remains to be seen.

 

Pure Storage, the last big IPO to transpire in the enterprise infrastructure space, has gone, for the most part, very well. These folks seem to have a great vision of where they want to head. They’ve successfully built a second storage platform essentially on the sly (FlashBlade), and meanwhile their numbers have been, on the whole, quite good. I’m very interested to see where things go with Pure. I do feel that they’ll handle their future with aplomb, continue to grow their market share, and create new products with an eye toward gaps and customer requirements. Their professional services group, as well as their marketing, have been standout performers within the industry. I do find it interesting, though, that while they’ve been turning the world orange and converting customers away from legacy storage brands to a new platform approach, they’ve also handled the needs of those customers gracefully, aggressively, and competently, even in use cases that hadn’t necessarily been core to their approach.

 

Finally, I’ll mention another key acquisition. NetApp, one of those stalwart legacy storage companies and at one time a great alternative to other monolithic storage vendors, had gotten stale. By their own admission, their reliance on an older architecture truly needed a shot in the arm. They achieved this by purchasing SolidFire. SolidFire, a tightly focused player in the all-flash array market, was posting some truly high-end numbers for a startup, going into datacenters and service providers and replacing far larger players, solving problems that in many cases had existed for years, and creating solutions for brand-new cloud-related issues. A truly impressive feat for such a lean startup. They’ve proven to be just the key that fits the lock. I’m very interested to see how smoothly this acquisition goes moving forward. I wonder how well that startup mentality will fuse with the attitudes of a much slower-moving culture, which NetApp had become. Will attrition force the rockstar development team to slow down and focus on integration, or will they be allowed to continue along the path they’d cut over the previous few years, run as a separate entity within the rest of NetApp? I’m curious whether some of the great SolidFire people will leave once the dust settles, or whether it’s to their benefit to grow within this larger entity. The truth will prove itself.

 

The current crop of candidates looking to go public all seem to revolve around software-driven, cloud-centric business models; companies like Twilio, Blue Apron, and Casper Mattresses appear to be the kind of contenders poised to make that leap, with software at the core of their models. From a more traditional IT perspective, I’ve heard Basho (a platform for distributed databases), Code42 (the creators of CrashPlan), Dropbox (the ubiquitous cloud file share/storage service), and xMatters (a leader in the IoT landscape) mentioned as potential candidates for a public offering.

 

As to mergers and acquisitions, there seems to be a growing appetite for companies like Acacia (optical networking), Pandora (the streaming media company), Zynga (maker of video games like FarmVille), and a couple of semiconductor firms: MACOM and SunEdison.

 

Updates after writing and before posting: On Tuesday, September 20, Nutanix filed (again) for an IPO, setting the initial public offering at $209 million, based on a corporate valuation of approximately $1.8 billion. Filing paperwork here. And the VMware vRealize Management Suite, as yet another piece of fallout from the DellEMC deal, has been sold off to Skyview Capital. I’m fairly confident that we’ll be seeing more and more changes in this shifting landscape in the very near future.

 

We are living in a time of change in the tech business. What or who is next for companies like those I’ve highlighted? Who will come up with the next groundbreaking tech? And who’s next to set the world on fire with their huge business news?

There's been a long-standing "discussion" in the world of storage regarding snapshots and backups. Some people say that snapshots can replace backups, while others say that just can't be true. I side with the latter, but the latest industry developments are making me reconsider that stance.

 

What's a Backup?

 

A backup isn't just a copy of data. A backup has to be recoverable and reliable, and most snapshots just don't meet those criteria.

 

What does "recoverable" mean? Backups have to be indexed and searchable by common criteria like date, file name, location, file type, and so on. Ideally, you could also search by less-common criteria like owner, content, or department. But at the very least there should be a file-level index, and most snapshot tools don't even have this. It's hard to expect a block snapshot to include a file index, but most NAS systems don't have one either! That's just not a backup.
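As a rough illustration of what even a minimal file-level index involves, here's a hedged Python sketch that walks a mounted snapshot and records the attributes a restore operator would want to search on: path, size, modification time, owner, and file type. The mount point and index location are made-up examples, and a real backup catalog would of course be far more sophisticated.

```python
import csv
import os
import pwd  # Unix-only; used to resolve numeric UIDs to user names
from datetime import datetime, timezone

SNAPSHOT_MOUNT = "/mnt/snapshots/2016-09-26"      # hypothetical read-only snapshot mount
INDEX_FILE = "/var/backups/index-2016-09-26.csv"  # hypothetical index destination

def build_index(snapshot_root: str, index_path: str) -> None:
    """Walk a mounted snapshot and write a searchable file-level index as CSV."""
    with open(index_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "mtime_utc", "owner", "extension"])
        for dirpath, _dirnames, filenames in os.walk(snapshot_root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.lstat(full)
                except OSError:
                    continue  # unreadable entry; skip rather than abort the index
                try:
                    owner = pwd.getpwuid(st.st_uid).pw_name
                except KeyError:
                    owner = str(st.st_uid)
                mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat()
                writer.writerow([full, st.st_size, mtime, owner,
                                 os.path.splitext(name)[1].lower()])

if __name__ == "__main__":
    build_index(SNAPSHOT_MOUNT, INDEX_FILE)
```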

 

Then we have to think about reliability. The whole point of a backup is to protect your data. Snapshots can protect against deletion and corruption, but they don't do much if the datacenter catches on fire or a bug corrupts your storage array. And many snapshot systems don't "snap" frequently enough or keep enough copies long enough to protect against corruption very long. This is why storage nerds like me say "your backup should be on a different codebase and your archive in a different zip code."

 

Then there's the question of management. Most backup systems have "friendly" interfaces to schedule regular backup passes, set retention options, and execute restores. Many years ago, NetApp showed just how friendly a snapshot restore can be, but options for what to back up and when remain pretty scarce. Although backup software isn't known for having the friendliest interface, you usually have lots more options.

 

Snap-Based Backups

 

But array snapshots can be an important part of a backup environment, and many companies are headed in that direction.

 

Most of today's best backup products use snapshots as a data source, giving a consistent data set from which to read. And most of these products sport wide-reaching snapshot support, from storage array vendors to logical volume managers. This is one source of irritation when people claim that snapshots have nothing to do with backups - of course they do!

 

Some snapshot systems also work in concert with data replication solutions, moving data off-site automatically. I've enjoyed the speed boost of ZFS Send/Receive, for example, and have come to rely on it as part of my data protection strategy. This alleviates my "different zip code" concern, but I would prefer a "different codebase" as well. That's one thing I liked at this week's NetApp Insight show: A glimpse of Amazon S3 as a replication target.
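For readers who haven't scripted it before, here's a hedged sketch of the kind of incremental ZFS send/receive replication I'm describing, wrapped in Python for scheduling convenience. The pool, dataset, and remote host names are placeholders, and the snapshot-naming scheme is my own assumption, not part of ZFS itself.

```python
import subprocess
from datetime import datetime, timezone

POOL_DATASET = "tank/projects"       # hypothetical local dataset
REMOTE_HOST = "backup.example.net"   # hypothetical off-site receiver (different zip code!)
REMOTE_DATASET = "backup/projects"   # hypothetical dataset on the receiving pool

def snapshot_name() -> str:
    """Timestamp-based snapshot name (my own convention, not a ZFS requirement)."""
    return datetime.now(timezone.utc).strftime("auto-%Y%m%d-%H%M%S")

def replicate(previous_snap: str) -> str:
    """Take a new snapshot and send the delta since previous_snap to the remote host."""
    new_snap = snapshot_name()
    subprocess.run(["zfs", "snapshot", f"{POOL_DATASET}@{new_snap}"], check=True)

    # Incremental send, piped over SSH into 'zfs receive' on the remote side.
    send = subprocess.Popen(
        ["zfs", "send", "-i",
         f"{POOL_DATASET}@{previous_snap}", f"{POOL_DATASET}@{new_snap}"],
        stdout=subprocess.PIPE)
    subprocess.run(["ssh", REMOTE_HOST, "zfs", "receive", "-F", REMOTE_DATASET],
                   stdin=send.stdout, check=True)
    send.stdout.close()
    if send.wait() != 0:
        raise RuntimeError("zfs send failed")
    return new_snap

if __name__ == "__main__":
    # Assumes a snapshot named 'auto-20160925-020000' already exists on both sides.
    print("replicated:", replicate("auto-20160925-020000"))
```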

 

Then there are the snapshot-integrated "copy data management" products from Catalogic, Actifio, and (soon) NetApp. These index and manage the data, not just the snapshot. And they can do some very cool things besides backup, including test and development support.

 

Stephen's Stance

 

Snapshots aren't backups, but they can be a critical part of the backup environment. And, increasingly, companies are leveraging snapshot technology to make better backups.

 

I am Stephen Foskett and I love storage. You can find more writing like this at blog.fosketts.net, connect with me as @SFoskett on Twitter, and check out my Tech Field Day events.

Thanks to the Internet of Things (IoT), we're on the lookout for invisible devices that are now capable of becoming vectors for all kinds of nasty attacks. The webcam attack on Brian Krebs is only the beginning. Can you imagine the combined power of a wide variety of IoT devices being brought to bear on Amazon? Or on Google? The potential for destruction is frightening. But it doesn't have to be. It just takes a little effort up front.

If It Talks Like A Duck

One of the best things about IoT devices is that they are predictable. They have static traffic patterns. Thermostats should only ever talk to their control servers, whether they be in the cloud or at a utility service provider. Lightbulbs should only ever talk to update servers. In the enterprise, devices like glucose meters and Point-of-Sale credit card readers also have traffic profiles. Anything that doesn't fit the profile is a huge clue that something is going on that shouldn't be.

Think back to the Target POS data breach. The register payment scanners were talking to systems they had never talked to before. No matter how small or isolated that conversation, it should have been a warning that something fishy was happening. Investigation at that point would have uncovered a breach before it became a publicity nightmare.

IoT devices should all have a baseline traffic profile shortly after they are installed. Just like a firewall port list, you should know which devices they are talking to and what is being transmitted. It should be incumbent on the device manufacturers to provide this info in their documentation, especially for enterprise devices. But until we can convince them to do it, we're going to need some help.

Tools like SolarWinds Network Traffic Monitor can help you figure out which devices are talking to each other. Remember that while NTM is designed to help you ferret out the worst offenders of traffic usage in your network, IoT devices may not always be trying to send huge traffic loads. In the case of the IoT DDoS, NTM should have seen a huge increase in traffic from these devices that was out of character. But in security cases like Target, NTM should be configured to find out-of-profile conversations with things like accounting servers or PCI devices.
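To make the "out-of-profile conversation" idea concrete, here's a hedged Python sketch that compares observed flows against a per-device baseline of expected destinations. The baseline format and the sample flow records are assumptions for illustration; in practice, this data would come from your flow collector or monitoring platform.

```python
from collections import defaultdict

# Hypothetical baseline: each IoT device and the destinations it is expected to talk to.
BASELINE = {
    "10.10.20.11": {"52.4.10.8"},       # thermostat -> vendor cloud service
    "10.10.20.12": {"52.4.10.8"},       # thermostat -> vendor cloud service
    "10.10.30.5": {"192.168.5.20"},     # POS terminal -> payment gateway
}

def find_out_of_profile(flows):
    """flows: iterable of (src_ip, dst_ip, bytes) tuples from your flow collector."""
    alerts = defaultdict(set)
    for src, dst, _nbytes in flows:
        allowed = BASELINE.get(src)
        if allowed is None:
            continue  # not a tracked IoT device
        if dst not in allowed:
            alerts[src].add(dst)  # conversation outside the device's known profile
    return alerts

if __name__ == "__main__":
    sample = [
        ("10.10.20.11", "52.4.10.8", 1200),
        ("10.10.30.5", "10.10.40.9", 800),   # POS terminal talking to an unknown host
    ]
    for device, dests in find_out_of_profile(sample).items():
        print(f"ALERT: {device} talked to unexpected hosts: {sorted(dests)}")
```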

You Need To Walk Like A Duck

I know I said it before, but your company absolutely has to have some kind of IoT policy in place. Today. Not after a breach or an incident, but ahead of time. This helps you control the devices flowing into your network from uncontrolled sources. It allows you to remind your executives that the policy doesn't require them to have a Hue color-changing lightbulb. Or that they need to remove the unauthorized security camera watching the company fridge for the Lunch Bandit.

Sure, IoT is going to make our lives easier and happier. Until it all falls down around our ears. If I told your security department that I was about to drop 300 unsecured devices onto the network that can't be modified or moved, they would either have a heart attack or push back against me. But if your monitoring system is ready to handle them, there won't be any issues. You have to walk the security walk before you end up giving the security talk to a reporter from the New York Times.

The IT help desk is the lifeblood of an organization. It assists co-workers and end-users in many critical ways, including troubleshooting, answering questions, solving known problems, and helping the organization maintain productivity. However, if you’re manually managing the IT help desk function for your business, the speed at which you’re able to resolve issues could suffer, leading to delayed ticket resolution and unhappy end-users.

 

In this blog, I’ll detail five challenges businesses face by manually managing the IT help desk (as illustrated in the infographic), and offer up an alternative solution that’s sure to help desk admins and technicians avoid a lot of unnecessary headaches.

 

The Trouble With a Manual Help Desk

 

  1. Routing Tickets – Manually managing the IT help desk can make it difficult to track the availability of technicians to respond to issues. In this case, accidentally doubling down on requests is far from unheard of. And without in-depth knowledge of each individual on your team, assigning tasks based on the criticality and technicality of issues is more of a shot in the dark. Regardless, both scenarios can delay the process of even responding to a ticket, much less meeting a resolution.

  2. Tracking Down the End-user - When relying on a manual system for running the IT help desk, tickets are addressed in-person, which wastes time tracking down the end-user to provide hands-on assistance. Clearly, time spent running around leaves less time for resolving issues.

  3. Tedious, Manual Support – Troubleshooting at the “scene of the crime” (i.e., the end-user’s workstation) can have its benefits, but it leaves end-users twiddling their thumbs while a support technician diagnoses and corrects an issue. Of course, that end-user has much better things to do with their time, including hitting the next deadline or getting to a meeting, which simply compounds the issue.

  4. Manually Closing Tickets – Another aspect of manually managing the IT help desk involves closing support cases. Though seemingly painless, the process of updating a static spreadsheet somewhere and contacting the supported party to confirm satisfaction can be even more of a time suck for IT support admins than you realize.

  5. Results and Performance – We all know time is money. Therefore, the time it takes to resolve an issue, which can certainly be compounded by all the factors I’ve listed above, means time (and money) wasted because someone is left stranded by IT issues. Plus, when evaluating the performance of the IT help desk organization as a whole, let’s just say downtime is frowned upon.

 

Two Ways to Run an IT Help Desk

 

IT Help Desk technicians face challenges on a daily basis. Manually managing the entire help desk function does not have to be one of those challenges. Through the combined use of SolarWinds® Web Help Desk and DameWare® Remote Support, life for IT pros can be far simpler. Take a look at the infographic below to see a comparison of what it’s like to manage a web help desk operation manually versus the use of these two solutions.

 

A lot can be said about the benefits of enabling remote support capabilities within your help desk solution, especially when IT issues arise. Consider the combined use of these two SolarWinds solutions, if not for the end-users and co-workers you support, then for the productivity of your business as a whole.

 

 

[Infographic: 2 Ways to Run a Help Desk, from SolarWinds]


I posted a brief introspective note about THWACKcamp right after the event, but now that some time has passed, I wanted to share some of the things that have risen to the surface with the benefit of additional time and distance (and, let's be honest, SLEEP!).

 

First and foremost, credit needs to be given where it's due - to YOU. The THWACK community really brought it this year, in numbers and enthusiasm the likes of which we'd never seen before. Being on chat with everyone was like a year's worth of SolarWinds labs all smashed together (which, considering we had 15 separate sessions, it pretty much was!); the attendance was through the roof; and the range of questions and conversations has spawned discussions which continue now, two weeks later. I will happily don my "Captain Obvious" cape to point out that all of this would have been impossible without this amazing community which you've built and allowed us to be caring stewards of.

 

Second, kudos has to be given to the SolarWinds team, who worked their hearts out to make this event a reality. The video crew, events team, graphic design artists, THWACK administrators, writers, and coordinators of all stripes, shapes, and sizes spent months poring over every detail, thinking through each session, and crafting the best possible online experience - resulting in the largest conference of monitoring experts in the world, online or off.

 

With those very important thanks taken care of, here's what lingers in my memory:

 

1) You're not alone

The theme this year was "Challenge Accepted", but there's a risk of accepting challenges with an "I'll just do it myself" mentality. That's not the spirit of the theme, that's certainly not the spirit of THWACK, and if you were listening, you realized that's not the take-away from any of the sessions. Rarely was there a moment on screen when someone stood alone, and that was by design. As you accept your challenges - from being an accidental DBA to plumbing the depths of log management and even to figuring out how to make monitoring matter to management - you have a team behind you. That team may sit one cube over. Or it may be an IM, forum post, or phone call away. But irrespective of distance or time zone, there are people who share your passion and are ready and willing to help out.

 

2) The challenge is not unique

I'm not pulling a "Fight Club" reference, trying to tell you that you are not a special snowflake. I personally believe in the unparalleled blessing of recognizing the unique gifts each person brings to the table. But this challenge you've accepted - this problem you have to solve right now? THAT is not special. You may be monitoring one of those "hard to reach places" that seems to be completely off the radar of most tools. Or you are once again playing detective, searching for clues to the elusive root cause. No matter the situation, THWACKcamp taught us that this has happened before, and someone has seen it and solved it. That's where item #1 comes into play. You gather your posse, you do the research, and you solve the problem. Across two days and 15 sessions, we heard from geeks from all walks of life sharing examples of the challenges they faced, and what was amazing was how often we shared similar experiences. That should be a huge relief to those of us who operate as "an army of one" at our respective companies. It serves as a reminder that this challenge we're working on has been seen before, and solved. Which leads me to the last point:

 

3) The challenge is never insurmountable

To paraphrase Anthony Hopkins' character Charles Morse in "The Edge" - What one geek can do, another can do.  Someone else has stared this beast in the face and forced it to back down into the primordial slime from which all our worst IT issues arise.

 

The message after watching THWACKcamp this year was that this challenge HAS been accepted and resolved in the past, it CAN be solved by you today, and there's a TEAM standing beside you to help you out.

 

So let's all resolve to accept that next challenge, gather our team, and start building some epic stories and incredible solutions to talk about.

 

Because THWACKcamp 2017 can't be that far away.

 

[EDIT: Tip of the hat to designerfx for reminding me to add a link to the videos:]

To relive the excitement of THWACKcamp 2016, head over to this link: http://thwackcamp.com. Pop some popcorn, watch each video as many times as you like, and bask in the #MonitoringGlory!

I'm at Microsoft Ignite this week, so if you are at the show please stop by the SolarWinds booth and say hello! I'd love to talk data, databases, and bacon with you, not necessarily in that order.

 

Here's a bunch of stuff I thought you might find interesting, enjoy!

 

Oracle's Cloudy Future

Last week was Oracle Open World, and we saw Larry Ellison talk about how much better Oracle Cloud is when compared to AWS. Read this and understand why Larry is wrong, and why Oracle is falling behind. In short, they are the new IBM.

 

The War On Cash

Long, but worth the read. My first thought was about the Star Trek future where money wasn't needed, and perhaps eliminating cash is the first step towards that. If so, then things will get worse before they get better.

 

Oh, shit, git!

Using Git? That's great! Make a mistake? You might want to read this.

 

Bad Security Habits Persist Despite Rising Awareness

Because you can't fix stupid, no matter how hard you try. Apparently.

 

Yahoo Says at Least 500 Million Accounts Breached in Attack

See preceding comment.

 

Top 10 ways to secure your mobile phone

Good list of tips here for anyone with a mobile phone. I like the idea of the remote wipe; I'll be adding that to my toolbox.

 

A Digital Rumor Should Never Lead to a Police Raid

I had never thought about the civil liberties aspect of a home raid brought about by IP address details. That's probably due to the fact I've never been worried about my home being raided, either. Anyway, interesting debate here.

 

This is how I imagined gminks rolled into work this past Monday:

 

[animated GIF]

tl;dr:

SolarWinds and Vyopta now integrate, so you can monitor live data from your video infrastructure and the connected access-switch interface for any problem call, in any conference room, on Polycom or Cisco endpoints.

 

Key Features:

  • Simple API-level integration
  • Single click from any Cisco or Polycom endpoint to NPM interface details page
  • Live call stats, video device status, camera/display connection data, and registration info.

 

Eliminate the Video Conference Blind-Spot

Do you ever enter a never-ending blame game with your A/V team about why video conferences fail? Are you responsible for the video infrastructure in your environment, perhaps even if you don’t want to be? Tired of those codecs and video infrastructure being a black box in terms of actual call statistics and quality metrics? Want to bridge the visibility gap between your voice/video and the rest of your network infrastructure? Well, perfect - because Vyopta’s real-time video monitoring now integrates with SolarWinds Network Performance Monitor.

With this integration you are now able to monitor, alert, and diagnose end-to-end video call issues through Vyopta AND identify whether it is a network problem or a video device problem. Furthermore, with one-click access to NPM from every video endpoint, you can diagnose and fix the issue if it is a network problem. On the Vyopta side, call quality and hardware statistics are pulled directly from your endpoints and bridges via API. Whether you are using Cisco, Polycom, Acano, Pexip, or Vidyo in whatever flavor, your data is combined and normalized in real time. Based on this broad dataset, you can assess end-to-end call quality per call or determine whether an issue may be systemic within your video environment. Perhaps it’s as simple as the screen or camera being disconnected on the endpoint. Maybe the user dialed the wrong number. In Vyopta, you can get alerted for and diagnose the following issues at a glance (a rough sketch of this kind of check follows the list below):

  • Camera/Display disconnect
  • Endpoint becomes unregistered (unable to make calls)
  • Endpoint down
  • Bad call quality from gateway, bridge, or endpoint (packet loss or jitter)
  • High Packet Loss
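As a rough illustration of the kind of threshold evaluation a monitoring layer performs on normalized endpoint statistics, here's a hedged Python sketch. This is not Vyopta's actual API or data model; the field names and thresholds are made up for the example.

```python
from dataclasses import dataclass

# Assumed thresholds for illustration; a real deployment would tune these.
MAX_PACKET_LOSS_PCT = 2.0
MAX_JITTER_MS = 30.0

@dataclass
class EndpointStats:
    name: str
    registered: bool
    camera_connected: bool
    display_connected: bool
    packet_loss_pct: float
    jitter_ms: float

def evaluate(stats: EndpointStats) -> list[str]:
    """Return a list of alert strings for one endpoint's current statistics."""
    alerts = []
    if not stats.registered:
        alerts.append("endpoint unregistered (unable to make calls)")
    if not stats.camera_connected:
        alerts.append("camera disconnected")
    if not stats.display_connected:
        alerts.append("display disconnected")
    if stats.packet_loss_pct > MAX_PACKET_LOSS_PCT:
        alerts.append(f"packet loss {stats.packet_loss_pct:.1f}% exceeds threshold")
    if stats.jitter_ms > MAX_JITTER_MS:
        alerts.append(f"jitter {stats.jitter_ms:.0f} ms exceeds threshold")
    return alerts

if __name__ == "__main__":
    boardroom = EndpointStats("Boardroom-SX80", True, True, False, 3.4, 12.0)
    for alert in evaluate(boardroom):
        print(f"ALERT [{boardroom.name}]: {alert}")
```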

 

[Screenshot: Vyopta dashboard]

 

With Vyopta’s built-in dashboards, you can also quickly evaluate the health of your bridging infrastructure. Perhaps one of your MCUs is at capacity, or you have a spike in traversal calls:

 

[Screenshot: real-time bridge capacity dashboard]

 

If the issue isn’t with an endpoint or bridge, you can click on the helpful SolarWinds link next to the endpoint to take you right to the connected access-layer switch interface in NPM:

 

[Screenshot: SolarWinds link from a Vyopta endpoint to NPM]



Once in NPM, you can determine if there is a common interface-level issue (VLAN, duplex, etc.) or start to work upstream into the infrastructure. Enhance your situational awareness with NetFlow data, or perhaps proactive UDP IP SLA transactions in VNQM. Did a recent config change bork DSCP tagging? NCM has you covered.

 

[Screenshot: NPM interface details page]

 

So next time users start grumbling that those darn vidcons “don’t work,” or the CEO’s call drops in the middle of a board meeting, know that your video infrastructure doesn’t have to be a black box. With the Vyopta and SolarWinds integration, it’s easy to troubleshoot. No more chasing phantom issues - isolate the root cause of video conference issues in just a few clicks.

There is hardly a government IT pro who has not seen sluggish applications create unhappy users.

 

Because the database is at the heart of every application, when there’s a performance issue, there’s a good chance the database is somehow involved. With database optimization methods -- such as identifying database performance issues that impact end-user response times, isolating root cause, showing historical performance trends and correlating metrics with response time and performance -- IT managers can speed application performance for their users.

 

Start with these four database optimization tips:

 

Tip #1: Get visibility into the entire application stack.

The days of discrete monitoring tools are over. Today’s government IT pros must have visibility across the entire application stack, or the application delivery chain comprising the application and all the backend IT that supports it -- software, middleware, extended infrastructure and especially the database. Visibility across the application stack will help identify performance bottlenecks and improve the end-user experience.

 

Tip #2: See beyond traditional infrastructure dashboards.

Many traditional monitoring tools provide a dashboard focused on health and status, typically featuring many charts and data, which can be hard to interpret. In addition, many don’t provide enough information to easily diagnose a problem -- particularly a performance problem.

 

Tools with wait-time analysis capabilities can help IT pros eliminate guesswork. They help identify how an application request is executed step-by-step and will show which processes and resources the application is waiting on. This type of tool provides a far more actionable view into performance than traditional infrastructure dashboards.

 

Tip #3: Reference historical baselines.

Database performance is dynamic. It is critical to be able to compare abnormal performance with expected performance. By establishing historical baselines of application and database performance that look at how applications performed at the same time on the same day last week, and the week before that, and so on, it is easier to identify a slight variation before it becomes a larger problem. And, if a variation is identified, it’s much easier to track the code, resource, or configuration change that could be the root cause and solve the problem quickly.
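To make the baseline idea concrete, here is a hedged Python sketch that compares a current response-time measurement against a same-hour, same-weekday baseline built from the previous few weeks. The sample data and the three-sigma deviation threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev

def baseline_for(history, now, weeks=4):
    """Collect samples from the same hour and weekday over the previous N weeks.

    history: dict mapping datetime (truncated to the hour) -> avg response time (ms)
    """
    samples = []
    for w in range(1, weeks + 1):
        key = (now - timedelta(weeks=w)).replace(minute=0, second=0, microsecond=0)
        if key in history:
            samples.append(history[key])
    return samples

def is_abnormal(current_ms, samples, sigmas=3.0):
    """Flag the current value if it sits well outside the historical spread."""
    if len(samples) < 2:
        return False  # not enough history to judge
    mu, sd = mean(samples), stdev(samples)
    return current_ms > mu + sigmas * max(sd, 1.0)  # floor avoids zero-variance noise

if __name__ == "__main__":
    now = datetime(2016, 9, 26, 14)  # a Monday at 2 PM
    history = {now - timedelta(weeks=w): ms
               for w, ms in zip(range(1, 5), [120.0, 115.0, 130.0, 125.0])}
    print(is_abnormal(480.0, baseline_for(history, now)))   # True: big deviation
    print(is_abnormal(128.0, baseline_for(history, now)))   # False: within normal range
```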

 

Tip #4: Align the team.

Today’s complex applications are supported by an entire stack of technologies. And yet, most IT operations teams are organized in silos, with each person or group supporting a different part of the stack. Unfortunately, technology-centric silos encourage finger-pointing.

 

A far more effective approach shares a unified view of application performance with the entire team. In fact, a unified view based on wait-time analysis will ensure that everyone can focus on solving application problems quickly.

 

Remember, every department, group or function within an agency relies on a database in some way or another. Optimizing database performance will help make users happier across the board.

 

Find the full article on Government Computer News.

Following my review of SolarWinds Virtualization Manager 6.3, the fine folks at SolarWinds gave me the opportunity to put my hands on their next planned release, namely VMAN 6.4. While there is no official release date yet, I would bet on an announcement within Q4 2016. The version I tested is 6.4 Beta 2. So what’s new with this release?

 

From a UI perspective, VMAN 6.4 is very similar to its predecessor. As with VMAN 6.3, you install the appliance and either install VIM (the Virtual Infrastructure Monitor component) on a standalone Windows Server, or integrate with an existing Orion deployment if you already use other SolarWinds products. You’d almost think that no changes have happened until you head over to the “Virtualization Summary” page. The new killer feature of VMAN 6.4 is called “Recommendations,” and while it seems like a minor UI improvement, there’s much more to it than meets the eye.

 

While in VMAN 6.3 you are presented with a list of items requiring your attention (over/under-provisioned VMs, idle VMs, orphaned VMDK files, snapshots, etc. - see my previous review), in VMAN 6.4 all of these items are aggregated in the “Recommendations” view.

 

Two types of recommendations exist: active and predicted. Active recommendations are immediate recommendations correlated with issues that are currently showing up in your environment. If you are experiencing memory pressure on a given host, an active recommendation might propose moving one or more VMs to another host to balance the load. Predicted recommendations, on the other hand, focus on proactively identifying potential issues before they become a concern, based on usage history in your environment.

 

The “Recommendations” feature is very pleasant to use and introduces a few elements that are quite important from a virtualization administrator’s perspective:

 

  • First of all, administrators have the option to apply a recommendation immediately or schedule it for a later time (outside business hours, change windows, etc.)
  • Secondly, an option is offered to either power down a VM to apply the recommendation or to attempt to apply the recommendation without any power operations. This feature comes in handy if you need to migrate VMs, as you may run into cases where a power off/power on is required, while in other cases a vMotion / live migration will suffice
  • Last but not least, the “Recommendations” module will check whether the problem still exists before actually applying a recommendation. This makes particular sense for active recommendations that may no longer be relevant by the time you decide to apply them (for example, if you schedule a recommendation but the issue is no longer reported by the scheduled time)

 

A nice and welcome touch in the UI is a visual aid that shows up when hovering your mouse over the proposed recommendations. You will see a simple & readable graphical view / simulation of the before & after status of any given object (cluster, datastore, etc.) in case you decide to apply the recommendation.

 

Max’s take

 

The “Recommendations” function, while apparently modest from a UI perspective, is in fact an important improvement that goes beyond the capacity reclamation and VM sprawl controls included in VMAN 6.3. Administrators are now presented with actionable recommendations that are relevant not only in the context of immediate operational issues, but also as countermeasures to prevent future bottlenecks and capacity issues.

 

A few side notes: if you plan to test the beta version, reach out to the SolarWinds engineers. The new “Recommendations” function is still being fine-tuned, and you may not be able to see it if you integrate with your current VIM or Orion environment. Once you install VMAN 6.4, you should let it run for approximately a week in order to get accurate recommendations.

Flash storage can be really, really fast. Crazy fast. So fast that some have openly asked if they really need to worry about storage performance anymore. After all, once you can throw a million IOPS at the problem, your bottleneck has moved somewhere else!

 

So do you really need to worry about storage performance once you go all-flash?

 

Oh yes, you definitely do!

 

All-Flash Storage Can Be Surprisingly Slow

 

First, most all-flash storage solutions aren't delivering that kind of killer performance. In fact, most all-flash storage arrays can push "only" tens of thousands of IOPS, not the millions you might expect! For starters, those million-IOPS storage devices are internal PCIe cards, not SSDs or storage arrays. So we need to revise our IOPS expectations downward to the "hundred thousand or so" that an SSD can deliver. Then it gets worse.

 

Part of this is a common architectural problem found in all-flash storage arrays which I like to call the "pretend SSDs are hard disks" syndrome. If you're a vendor of storage systems, it's pretty tempting to do exactly what so many of us techies have done with our personal computers: yank out the hard disk drives and replace them with SSDs. And this works, to a point. But "storage systems" are complex machines, and most have been carefully balanced for the (mediocre) performance characteristics of hard disk drives. Sticking some SSDs in just over-taxes the rest of the system, from the controller CPUs to the I/O channels.

 

But even storage arrays designed for SSDs aren't as fast as internal drives. The definition of an array includes external attachment, typically over a shared network, as well as redundancy and data management features. All of this gets in the way of absolute performance. Let's consider the network: although a 10 Gb Ethernet or 8 Gb Fibre Channel link sounds like it would be faster than a 6 Gb SAS connection, this isn't always the case. Storage networks include switches (and sometimes even routers), and these add latency that slows absolute performance relative to internal devices. The same is true of the copy-on-write filesystems protecting the data inside most modern storage arrays.

 

And maximum performance can really tax the CPU found in a storage array controller. Would you rather pay for a many-core CPU so you'll get maximum performance or for a bit more capacity? Most storage arrays, even specialized all-flash devices, under-provision processing power to keep cost reasonable, so they can't keep up with the storage media.

 

Noisy Neighbors

 

Now that we've reset our expectations for absolute performance, let's consider what else is slurping up our IOPS. In most environments, storage systems are shared between multiple servers and applications. That's kind of the point of shared networked storage after all. Traditionally, storage administrators have carefully managed this sharing because maximum performance was naturally quite limited. With all-flash arrays, there is a temptation to "punt" and let the array figure out how to allocate performance. But this is a very risky choice!

 

Just because an array can sustain tens or even hundreds of thousands of I/O operations per second doesn't mean your applications won't "notice" if some "noisy neighbor" application is gobbling up all that performance. Indeed, performance can get pretty bad since each application can have as much performance as it can handle! You can find applications starved of performance and trudging along at disk speeds...

 

This is why performance profiling and quality of service (QoS) controls are so important in shared storage systems, even all-flash. As an administrator, you must profile the applications and determine a reasonable amount of performance to allocate to each. Then you must configure the storage system to enforce these limits, assuming you bought one with that capability!

 

Note that some storage QoS implementations are absolute, while others are relative. In other words, some arrays require a hard IOPS limit to be set per LUN or share, while others simply throttle performance once things start "looking hot". If you can't tolerate uneven performance, you'll have to look at setting hard limits.
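To illustrate what a hard, absolute limit looks like in practice, here's a hedged Python sketch of a simple token-bucket IOPS limiter of the sort a QoS layer might apply per LUN or share. The numbers and structure are illustrative only, not any particular array's implementation.

```python
import time
from typing import Optional

class IopsLimiter:
    """Token-bucket limiter: admit at most `iops_limit` operations per second."""

    def __init__(self, iops_limit: int, burst: Optional[int] = None):
        self.rate = float(iops_limit)
        self.capacity = float(burst if burst is not None else iops_limit)
        self.tokens = self.capacity          # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Return True if one I/O may proceed now, False if it should be throttled."""
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

if __name__ == "__main__":
    # Hypothetical policy: cap a noisy LUN at 5,000 IOPS with a modest burst allowance.
    limiter = IopsLimiter(iops_limit=5000, burst=6000)
    admitted = sum(limiter.allow() for _ in range(20000))
    print(f"{admitted} of 20000 back-to-back I/O requests admitted immediately")
```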

 

Tiered Flash

 

If you really need maximum performance, tiered storage is the only way to go. If you can profile your applications and segment their data, you can tier storage, reserving maximum-performance flash for just a few hotspots.

 

Today's hybrid storage arrays allow data to be "pinned" into flash or cache. This delivers maximum performance but can "waste" precious flash capacity if you're not careful. You can also create higher-performance LUNs or shares in all-flash storage arrays using RAID-10 rather than parity or turning off other features.

 

But if you want maximum performance, you'll have to move the data off the network. It's pretty straightforward to install an NVMe SSD in a server directly, especially the modern servers with disk-like NVMe slots or M.2 connectors. These deliver remarkable performance but offer virtually no data protection. So doing this with production applications puts data at risk and requires a long, hard look at the application.

 

You can also get data locality by employing a storage caching software product. There are a few available out there (SanDisk FlashSoft, Infinio, VMware vFRC, etc) and these can help mitigate the risks of local data by ensuring that writes are preserved outside the server. But each has its own performance quirks, so none is a "silver bullet" for performance problems.

 

Stephen's Stance

 

Hopefully I've given you some things to think about when it comes to storage performance. Just going "all-flash" isn't going to solve all storage performance problems!

 

I am Stephen Foskett and I love storage. You can find more writing like this at blog.fosketts.net, connect with me as @SFoskett on Twitter, and check out my Tech Field Day events.


 

This week I will be in Atlanta for Microsoft Ignite, splitting time between the Microsoft and SolarWinds booths in the exhibit hall. I have the privilege of delivering a session in the Community Theater on Tuesday, the 27th of September, from 2:50-3:10PM EDT. The title of the talk is "Performance Tuning Essentials for the Cloud DBA," and it's a story that's been on my mind for the past year or so.

 

First, "cloud DBA" is a phrase I borrowed from Rimma Nehme who mentioned the term during her PASS Summit keynote in 2014. Dr. Nehme was reinforcing an idea that I have been advocating for years, and that is for database administrators to stop thinking of themselves as DBAs and start thinking of themselves as data professionals. To a DBA, it shouldn't matter where the data resides, either down the hall or in the cloud. And for those of us that are accidental DBAs, or accidental whatevers, we know that there will soon be accidental cloud DBAs. And those accidental cloud DBAs will need help.

 

And that help begins with this 20-minute session at Ignite tomorrow.

 

During that session, you are going to hear me talk about the rise of hybrid IT, the changing face of IT, and how we won't recognize things in five years. An accidental cloud DBA will be overwhelmed at first, but we will help provide the structure they need for a solid foundation to perform well in their new role. And I will share some tips and tricks with you to help all cloud DBAs to be efficient and effective.

 

So if you are at Microsoft Ignite this week, stop by to chat with me in the booth, or after my session Tuesday. I'd be happy to talk cloud, data, databases, and technology in general.

Leon Adato

Called to Account

Posted by Leon Adato, Sep 26, 2016


As IT professionals who have a special interest in monitoring (because why ELSE would you be here on THWACK, except for maybe the snuggies and socks), we understand why monitoring - and more importantly, automation in response to monitoring events - is intrinsically useful and valuable to an organization. We understand that automation creates efficiencies, saves effort, creates consistency, and most of all, saves money in all its forms - labor, downtime, lost opportunity, and actual cost.

 

To us, the benefits of automation are intuitively obvious.

 

And that's a problem. When I was in university and a professor used the phrase "it's intuitively obvious" that meant one thing: run like hell out of the class because:

A) the professor didn't understand it themselves, and

B) it was going to be on the exam

 

The idea that things that we see as intuitive are the same things we find either difficult or unnecessary to explain was driven home to me the other day when I solicited examples of the measurable business value of monitoring automation.

 

I said,

"I’m putting together some material that digs into the benefit of automated responses in monitoring and alerting. I’m looking for quotable anecdotes of environments where this has been done. What’s most important are the numbers, including the reduction in tickets, the improvement in response, etc.

 

Example:

A certain company “enjoyed” an average of 327 disk-full alerts each and every month, which required roughly 20 minutes of staff time to investigate. After implementing a simple VBScript to clear the TEMP directory before opening a ticket, the number of alerts dropped to 57 per month. Now each of those tickets required one hour to resolve, but that’s because they were REAL events that required action, versus the spurious (and largely ignored) quantity before automation was introduced.

 

If you have any stories like this, I’d love to hear it. Feel free to pass this along to other colleagues as you see fit."

I received back some incredible examples of automation - actual scripts and recipes, but that's not what I wanted. I wanted anecdotes about the results. So I tried again.

 

What I am looking for is more like examples that speak to how monitoring can justify its own existence to the bean counters who don’t care if technology actually works as long as it’s cheap.

I got back more examples of monitoring automation. Some even more jaw-droppingly awesome than before. But it's still not what I wanted. I started cornering people one-on-one to see if maybe email was the wrong medium.

 

What I discovered upon intense interroga... discussion was that IT pros are very familiar with how automation is accomplished. We remember those details, down to individual lines of code in some cases. But pressed for information that describes how that automation affected the business (saved money, reduced actionable tickets, averted specific downtimes which would have cost X dollars per minute), all I got was, "Well, I know it helped. The users said it was a huge benefit to them. But I don't have those numbers."

 

That was the answer in almost every case.

 

I find this interesting for a couple of reasons. First, because you'd think we would not only be curious, we would positively bask in the glory that our automation was saving the company money. We are, after all, IT professionals. The job practically comes with a cape and tights.

 

Second, we're usually the first ones to shout "Prove it!" when someone makes a claim about a particular fact, event, effect, or even opinion. Did someone say Star Trek IV was the highest grossing movie of the franchise? Show me the BoxOfficeMojo.com numbers for that!* At a dinner party and someone says sharks are a major cause of death in the summer? You're right there to list out 25 things that are actually more likely to kill you than sharks.**

 

But despite our fascination with facts and figures when it comes to ephemera and trivia, we seem to have a blind spot with business. Which is a shame.

 

It's a shame because being able to prove the impact that monitoring and alerting has on the business is the best way to get more of it. More staff, more servers, more software, more time, and most importantly, more buy-in.

 

Imagine providing your CEO with data on how one little alert saved $250 each and every time it triggered, and then opening up the ticket logs to show that the alert triggered 327 times last month. That's a savings of $81,750 in one month alone!!

 

Put those kind of numbers against a handful of automated responses, and you could feel like Scrooge McDuck diving into his pool of money every time you opened the ticket system!

 

So prove me wrong. In the comments below, give me some examples of the VALUE and business impact that monitoring has had.

 

More than just giving me grist for the mill (which, I'll be honest, I'm totally going to use in an upcoming eBook, totally giving credit where credit is due, and THWACK points!), what we'll all gain is insight into the formula that works for you. Hopefully we can adapt it to our own environments.

 

* In actuality, Star Trek IV ranked 4th, unless you adjust for inflation, in which case it was 3rd. First place is held in both categories by the original motion picture. The more you know.

**Sharks kill about 5 people annually***. Falling out of bed kills 450. Heck, bee stings claim 53 lives each year. So go ahead and dive in. The water's fine and Bruce the shark probably will leave you alone.

*** Unless you are watching "Sharknado". Then the number is closer to 16 people.

It's been an interesting few days. Brian Krebs had his website taken down by the largest Distributed Denial of Service (DDoS) attack ever seen. It massed some 665 Gbps of traffic that assaulted Akamai like a storm that couldn't be stopped. Researchers have been working to find out how this attack was pulled off, especially considering that it was more than twice the size of the largest DDoS attack Akamai had ever seen before. A news article late Friday said that the attack likely started from IoT devices repurposed for packet flooding.

Most of the recent DDoS attacks have come from stressing tools or other exploits in UDP-based services like DNS or NTP. For the attack vectors to shift to IoT means that nefarious groups have realized the potential of these devices. They sit in the network, communicating with cloud servers to relay data to apps on smartphones or tablets. These thermostats, clocks, cameras, and other various technology devices don't consume much bandwidth in normal operations. But just like any other device, they are capable of flooding the network under the right conditions. Multiply that by the number of smart devices being deployed today and you can see the potential for destruction.

What can IT professionals do? How do these devices, often consumer focused, fit into your plans? How can you keep them from destroying your network, or worse yet destroying someone else's in an unwitting attack?

Thankfully, tools already exist to help you out. Rather than hoping that device manufacturers are going to wake up and give you extra controls in the enterprise, you can proactively start monitoring those devices today. These IoT things still need IP addresses to communicate with the world. By setting your monitoring systems to sweep periodically for them, you can find them as they are brought onto the network. With tools like those from SolarWinds, you can also trend those devices to find out what their normal traffic load is and what happens when it starts bursting well beyond what it should be sending. By knowing what things should be doing, you can immediately be alerted when something isn't normal.
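As a rough sketch of the "sweep periodically and flag newcomers" idea, here's a hedged Python example that pings a subnet and reports devices that aren't in a known-device inventory yet. The subnet, inventory file, and ping flags are assumptions; a real deployment would lean on your monitoring platform's discovery features rather than raw pings.

```python
import ipaddress
import json
import subprocess
from pathlib import Path

SUBNET = "192.168.10.0/24"                 # hypothetical IoT VLAN
INVENTORY = Path("known_devices.json")     # hypothetical inventory file

def ping(host: str) -> bool:
    """Return True if the host answers one ping (Linux-style 'ping -c 1 -W 1' flags)."""
    result = subprocess.run(["ping", "-c", "1", "-W", "1", host],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0

def sweep(subnet: str) -> set:
    """Slow but simple sequential sweep of every host address in the subnet."""
    return {str(ip) for ip in ipaddress.ip_network(subnet).hosts() if ping(str(ip))}

if __name__ == "__main__":
    known = set(json.loads(INVENTORY.read_text())) if INVENTORY.exists() else set()
    live = sweep(SUBNET)
    for newcomer in sorted(live - known):
        print(f"New device on the network: {newcomer} -- add to monitoring and baseline it")
    INVENTORY.write_text(json.dumps(sorted(live | known)))
```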

These tools can also help you plan your network so that you can take devices offline or rate limit them to prevent huge traffic spikes from ever becoming an issue. You can then wait for the manufacturer to patch them or even create policies for their use that prevent them from causing harm. The evidence from a series of traces of bad acting devices in your network can be a great way to convince management that you need to change the way things work with regard to IoT devices. Or even to ensure that you have an IoT policy in place to begin with.

All that's required for the bad guys to use your network for their evil schemes is for networking professionals to do nothing. Make sure you know what's going on in your system so you're not surprised by complacency.


Have you ever had the experience where you start looking into something, and every time you turn a corner you realize you have uncovered yet another thing you need to dig into in order to fully understand the problem? In my workplace we refer to this as chasing squirrels: being diverted from the straight path and taking a large number of detours along the way.

Managing infrastructure seems that way sometimes; it seems that no matter how much time you put in to the monitoring and alerting systems, there's always something else to do. I've looked at some of these issues in my last two posts (Zen And The Art Of Infrastructure Monitoring and Zen And The Ongoing Art Of Infrastructure Monitoring), and in this post I'm chasing yet another squirrel: the mythical baseline.

 

BASELINING ALL THE THINGS

 

If we can have an Internet of Things, I think we can also have a Baseline of Things, can't we? What is it we look for when we monitor our devices and services? Well, for example:

 

  • Thresholds vs Capacities: e.g. a link exceeds 90% utilization, or Average RAM utilization exceeds 80%. Monitoring tools can detect this and raise an alert.
  • Events: something specific occurs that the end system deems worthy of an alert. You may or may not agree with the end system.

 

These things are almost RESTful in as much as they are kind of stateless: absolute values are at play, and it's possible to trigger a capacity threshold alert, for example, without any significant understanding of the previous history of the element's utilization. There are two other kinds of things I might look at:

 

  • Forecasting: Detecting that the utilization trend over time will lead to a threshold event unless something is done before then. This requires historical data and faith in the curve-fitting abilities of your capacity management tool.
  • Something Changed: By way of an example. If I use IP SLA to monitor ping times between two data centers, what happens if the latency suddenly doubles? The absolute value may not be high enough to trigger timeouts, or to pass a maximum allowable value threshold, but the fact that the latency doubled is a problem. To identify this problem requires historical data, again, and the statistical smarts to determine when a change is abnormal compared to the usual jitter and busy hour curves.

 

This last item - Something Changed - is of interest to me because it offers up valuable information to take into account when a troubleshooting scenario occurs. For example, if I monitor the path traffic takes from my HQ site to, say, an Office 365 site over the Internet, and a major Internet path change takes place, then when I get a call saying that performance has gone down the pan, I have something to compare against. How many of us have been on a troubleshooting call where you trace the path between points A and B, but it's hard to know if that's the problem, because nobody knows what path it normally takes when things are seemingly going well? Without having some kind of baseline, some idea of what NASA would call 'nominal', it's very hard to know whether what you see when a problem occurs is actually a problem or not, and it's possible to spend hours chasing squirrels when the evidence was right there from the get-go.
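Here's a hedged Python sketch of the kind of "something changed" check I mean, applied to latency samples: it flags a value that jumps well outside the recent spread even though it stays under any absolute alarm threshold. The window size and sensitivity are illustrative assumptions, and a production version would need to account for busy-hour curves.

```python
from collections import deque
from statistics import mean, stdev

class ChangeDetector:
    """Flag samples that deviate sharply from the recent baseline, even below the hard threshold."""

    def __init__(self, window: int = 60, sigmas: float = 4.0):
        self.history = deque(maxlen=window)
        self.sigmas = sigmas

    def observe(self, value_ms: float) -> bool:
        """Return True if this sample looks like a behavior change."""
        changed = False
        if len(self.history) >= 10:  # need some history before judging
            mu = mean(self.history)
            sd = max(stdev(self.history), 0.5)  # floor to ignore tiny jitter
            changed = abs(value_ms - mu) > self.sigmas * sd
        self.history.append(value_ms)
        return changed

if __name__ == "__main__":
    detector = ChangeDetector()
    for sample in [22 + (i % 3) for i in range(30)]:  # steady ~22-24 ms between data centers
        detector.observe(sample)
    print(detector.observe(23))   # False: within normal variation
    print(detector.observe(46))   # True: latency roughly doubled, though still "low" in absolute terms
```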

 

Many monitoring systems I see are not configured to alert based on a change in behavior that's still within thresholds, but it's something I like to have when possible. As with so much of infrastructure monitoring, triggering alerts like this can be plagued with statistical nightmares to figure out the difference between a system that's idle overnight seeing its utilization increase when users connect in the morning, and a system that usually handles 300 connections per second at peak suddenly seeing 600 cps instead. Nonetheless, it's a goal to strive for, and even if you are only able to look at the historical data in order to confirm that the network path has not changed, having that data to hand is valuable.

 

KILLING THE DEAD THINGS

 

Moving in a different but related direction, knowing if what you're monitoring is actually active would be nice, don't you think? My experience is that virtualization, while fantastic, is also an automated way to ensure that the company has a graveyard of abandoned VMs that nobody remembered to decommission once they were no longer needed. This happens with non-virtualized servers too of course, but they are easier to spot because the entire server does one thing and if it stops doing so, the server goes quiet. Virtual machines are trickier because one abandoned VM in ten deployed on a server can't be detected simply by checking network throughput or similar.

 

Knowing what's active helps minimize the number of squirrels that distract us in a troubleshooting scenario, so it's important to be able to tidy up and only monitor things that matter. In the case of VMs, SolarWinds' Virtualization Manager has a Sprawl Management dashboard which helps identify VMs that have been powered down for over 30 days, as well as those which appear to be idle (and presumably no longer needed). In addition, if there are VMs running at 100% CPU, for example (and most likely triggering alerts), those are also identified as being under-provisioned, so there's a chance to clean up those alerts in one place (referred to as VM right-sizing). Similarly for network ports, SolarWinds' User Device Tracker can identify unused ports so they can be shut down to ensure that they do not become the source of a problem. This also allows better capacity planning, because unused ports are identified as such and can then be ignored when looking at port utilization on a switch.
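As a rough sketch of the sort of classification logic involved (not SolarWinds' actual implementation, just an illustration with made-up inventory records), consider the following Python example, which flags stale, idle, and under-provisioned VMs from basic power state and utilization data.

```python
from datetime import datetime, timedelta

NOW = datetime(2016, 9, 26)

# Hypothetical inventory records; a real tool would pull these from the hypervisor API.
VMS = [
    {"name": "build-old-01", "powered_on": False,
     "last_power_off": datetime(2016, 7, 2), "avg_cpu_pct": 0.0, "avg_net_kbps": 0.0},
    {"name": "web-prod-03", "powered_on": True,
     "last_power_off": None, "avg_cpu_pct": 97.0, "avg_net_kbps": 4200.0},
    {"name": "reporting-test", "powered_on": True,
     "last_power_off": None, "avg_cpu_pct": 0.4, "avg_net_kbps": 1.0},
]

def classify(vm: dict) -> str:
    """Rough sprawl classification: stale, idle, under-provisioned, or active."""
    if not vm["powered_on"]:
        if NOW - vm["last_power_off"] > timedelta(days=30):
            return "stale (powered off > 30 days, candidate for decommission)"
        return "powered off"
    if vm["avg_cpu_pct"] < 1.0 and vm["avg_net_kbps"] < 5.0:
        return "idle (possibly abandoned)"
    if vm["avg_cpu_pct"] > 90.0:
        return "under-provisioned (pegged CPU)"
    return "active"

for vm in VMS:
    print(f"{vm['name']}: {classify(vm)}")
```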

 

PULLING THE THINGS TOGETHER

 

Looking at the list of things I want my monitoring and alerting systems to do, it seems that maybe no one system will ever provide everything I need in order to get that holistic view of the network that I'd like. Still, one thing SolarWinds has going for it is that Orion provides a common platform for a number of specialized tools, and the more SolarWinds uses data from multiple modules to generate intelligent alerting and diagnosis, the more powerful it can be as a tool for managing a broad infrastructure. Having a list of specific element managers is great for the engineering and operations teams responsible for those products, but having a more unified view is crucial to provide better service management and alerting.

 

Do you feel that Orion helps you to look across the silos and get better visibility? Has that ever saved your bacon, so to speak?

We just finished THWACKcamp last week, and now we turn around and head to Atlanta for Microsoft Ignite next week. If you are at Ignite, please stop by our booth to say hello! I'd love to chat with you about data, databases, or tech in general. While at VMworld I found myself getting involved in a lot of storage discussions; I suspect that next week I will be in a lot of Cloud discussions. Can't wait!

 

Here's a bunch of stuff I thought you might find interesting, enjoy!

 

What Carrie Underwood’s success teaches us about IBM’s Watson failure

Until I read this I didn't even know IBM was offering Watson as a service. No idea. None. In fact, I'm not sure I know what IBM even offers as a company anymore.

 

Microsoft surpasses IBM's Watson in speech recognition

Years ago I worked in the QA department for speech recognition, and Microsoft invested heavily in our software. So I'm not surprised that they continue to make advances in this area; they've been investing in it for decades.

 

The Internet knows what you did last summer

If you want to retain your privacy, don't use computers. Everything is tracked in some way, no matter what. Data is power, and every company wants access to yours.

 

The Rivers of the Mississippi Watershed

Because I love data visualizations and so should you.

 

Ford charts cautious path toward self-driving, shared vehicles

Another week, another article about self-driving cars. You're welcome.

 

Someone Is Learning How to Take Down the Internet

This could have been titled "Someone is learning how to make clickbait titles and you won't believe what happens next". Do not read this one unless you have your tinfoil hat on.

 

DevOps and the Infrastructure Dumpster Fire

Yeah, DevOps is a lot like a dumpster fire.

 

This one time, at THWACKcamp, I crashed the set to take a selfie with my favorite dead celebrity Rob Boss:

rob-boss - 1.jpg


To the cloud and beyond

Posted by arjantim Sep 20, 2016

Over the last couple of weeks we talked about the cloud and the software-defined data center, and how they fit into your IT infrastructure. To be honest, I understand that a lot of you have doubts when discussing cloud and SDDC. I know the buzzword lingo is strong, and it seems the marketing teams come up with new lingo every day. But all in all, I still believe the cloud (and with it the SDDC) is a tool you can't dismiss simply because it sounds like a marketing term.

 

One of the things mentioned was that the cloud is just someone else's computer, and that is very true, but saying that ignores some basic realities. We have had a lot of trouble in our data centers, and departments sometimes needed to wait months before their application could be used, or before the change they asked for was made.

 

Saying your own datacenter can do the same things as the AWS/Azure/Google/IBM/etc. datacenters is wishful thinking at best. Do you get your own CPUs straight out of the Intel factory? Do you own the Microsoft kernel? I could go on with plenty more you will probably never see in your DC. And don't get me wrong, I know some of you work in some pretty amazing DCs.

 

Let's see if we can put it all together and come to a conclusion most of us can share. First, I think it is of the utmost importance to have your environment running at a high maturity level. Often I see the business running to the public cloud and complaining about their internal IT, which lacks the resources and money to deliver the same service levels as a public cloud. But throwing all your problems over the fence into the public cloud won't fix them. No, it will probably make things even worse.

 

You'll have to make sure you're in charge of your environment before thinking of going public, if you want to have the winning strategy. For me, the hybrid cloud or the SDDC is the only true cloud for many of my customers, at least for the next couple of years. But most of them need to get their environments to the next level, and there is only one way to do that.

 

Know thy environment….

 

We've seen it with outsourcing, and in some cases we are already seeing it in public cloud: we want to go in, but we also need the opportunity to get out. Let's start with going in:

 

Before we can move certain workloads to the cloud, we need to know our environment from top to bottom. There is no environment where nothing goes wrong, but environments where monitoring, alerting, remediation, and troubleshooting are done at every level of the infrastructure, and where money is invested in keeping the environment healthy, normally have a much smoother path towards next-generation IT.

dart.png

The DART framework can be used to reach the maturity level needed for the step towards SDDC/hybrid cloud.

 

We also talked about SOAP - Security, Optimization, Automation, and Reporting - to make sure we get to the next level of infrastructure, and it is as important as the DART framework. If you want to create that level of IT environment, you need to be in charge of all of these points. If you are able to create a stable environment on all these points, you're able to move the right workloads to environments outside your own.

Screen Shot 2016-09-20 at 20.24.28.png

 

I've been asked to take a look at SolarWinds Server and Application Monitor (SAM) 6.3 and tell you something about it. For me it is just one of the tools you need in place to secure, optimize, and automate your environment, and to show and tell your leadership what you're doing and what is needed.

 

I'll dive into SAM 6.3 a bit deeper when I've had the time to evaluate the product a little further. Thanks for hanging in there, and for giving all those awesome comments. There are many great things about SolarWinds:

 

  1. They have a tool for all the things needed to get to the next generation datacenter
  2. They know having a great community helps them to become even better

 

So SolarWinds, congrats on that, and keep the good stuff coming. For the community, thanks for being there and helping us all get better at what we do.

If you’re not prepared for the future of networking, you’re already behind.

 

That may sound harsh, but it’s true. Given the speed at which technology evolves compared to the rate most of us typically evolve in terms of our skillsets, there’s no time to waste in preparing ourselves to manage and monitor the networks of tomorrow. Yes, this is a bit of a daunting proposition considering the fact that some of us are still trying to catch up with today’s essentials of network monitoring and management, but the reality is that they’re not really mutually exclusive, are they?

 

In part one of this series, I outlined how the networks of today have evolved, and what today’s new essentials of network monitoring and management are as a consequence.

 

Before delving into what the next generation of network monitoring and management will look like, it’s important to first explore what the next generation of networking will look like.

 

On the Horizon

 

Above all else, one thing is for certain: We networking professionals should expect tomorrow’s technology to create more complex networks resulting in even more complex problems to solve.

 

Networks growing in all directions

 

Regardless of your agency’s position, the explosion of IoT, BYOD, BYOA and BYO-everything is upon us. With this trend still in its infancy, the future of connected devices and applications will be not only about the quantity of connected devices, but also about the quality of their connections and the network bandwidth they consume.

 

Agencies are using, or at least planning to use, IoT devices, and this explosion of devices that consume or produce data will, not might, create a potentially disruptive explosion in bandwidth consumption, security concerns and monitoring and management requirements.

 

IPv6 eventually takes the stage…or sooner (as in now!)

 

Recently, ARIN was unable to fulfill a request for IPv4 addresses because the request was greater than the contiguous blocks available. IPv6 is a reality today. There is an inevitable and quickly approaching moment when switching over will no longer be an option, but a requirement.

 

SDN and NFV will become the mainstream

 

Software defined networking (SDN) and network function virtualization (NFV) are expected to become mainstream in the next five to seven years; okay, maybe a bit longer for our public sector friends. With SDN and virtualization creating new opportunities for hybrid infrastructure, a serious look at adoption of these technologies is becoming more and more important.

 

So long WAN Optimization, Hello ISPs

 

Bandwidth increases are outpacing CPU and custom hardware’s ability to perform deep inspection and optimization, and ISPs are helping to circumvent the cost and complexities associated with WAN accelerators. WAN optimization will only see the light of tomorrow in unique use cases where the rewards outweigh the risks.

 

Farewell L4 Firewalling

 

Firewalls incapable of performing deep packet analysis and understanding the nature of the traffic at the Layer 7 (L7), or the application layer, will not satisfy the level of granularity and flexibility that most network administrators should offer their users. On this front, change is clearly inevitable for us network professionals, whether it means added network complexity and adapting to new infrastructures or simply letting withering technologies go.

 

Preparing to Manage the Networks of Tomorrow 

 

So, what can we do to prepare to monitor and manage the networks of tomorrow? Consider the following:

 

Understand the “who, what, why and where” of IoT, BYOD and BYOA

 

Connected devices cannot be ignored. According to 451 Research, mobile Internet of Things (IoT) and Machine-to-Machine (M2M) connections will increase to 908 million in just five years. This staggering statistic should prompt you to start creating a plan of action on how you will manage these devices.

 

Your strategy can either aim to manage these devices within the network or set an organizational policy to regulate traffic altogether. Curbing all of tomorrow’s BYOD/BYOA is nearly impossible. As such, you will have to understand your network device traffic in incremental metrics in order to optimize and secure them. Even more so, you will need to understand network segments that aren’t even in your direct control, like the tablets, phablets and Fitbits, to properly isolate issues.

 

Know the ins and outs of the new mainstream

 

As stated earlier, SDN, NFV and IPv6 will become the new mainstream. We can start preparing for these technologies’ future takeovers by taking a hybrid approach to our infrastructures today. This will put us ahead of the game.

 

Start comparison shopping now

 

Evaluating virtualized network options and other on-the-horizon technologies will help you nail down your agency’s particular requirements. Sometimes, knowing a vendor has or works with technology you don’t need right now but might later can and should influence your decisions.

 

Brick in, brick out

 

Taking on new technologies can feel overwhelming. Look for ways that potential new additions will not just enhance, but replace the old guard. If you don’t do this, then the new technology will indeed simply seem to increase workload and do little else. This is also a great measuring stick to identify new technologies whose time may not yet have truly come for your organization.

 

To conclude this series, my opening statement from part one merits repeating: learn from the past, live in the present and prepare for the future. The evolution of networking waits for no one. Don’t be left behind.

 

Find the full article on Federal Technology Insider.

September has now become a series of IT industry events. From VMware VMworld to SolarWinds THWACKcamp to Oracle OpenWorld to Microsoft Ignite, it seems like an endless procession of speaking sessions, in-booth demo presentations, and conversations with IT professionals in those communities. That last part is my favorite aspect of industry events. The work we do needs to have meaning, and the people interaction is my fuel for that meaningful fire. Technologies, people, and processes will always change. Similarly, the desire to learn, evolve, and move forward remains the constant for successful integration into any new paradigm. Be constant in your evolution.

 

Here's a brief recap of occurrences at recent events that I had the great privilege of attending and participating in:

 

VMworld 2016

SolarWinds Booth Staff at VMworld 2016
sqlrockstar and kong.yang at the VMware vExpert Party at the Mob Museum
chrispaap before he rocked the booth with his Scaling Out Your Virtual Infrastructure

 

 

THWACKcamp 2016

Head Geeks with their Executive Leader jennebarbour
Radioteacher, kong.yang, and DanielleH photobombed by KMSigma
sqlrockstar - it's make-up time #ChallengeAccepted w/ hcavender. Peace out brother :-)


Coming soon to an IT event near you: IT Pro Day, Microsoft Ignite in HAWT-lanta, Chicago SWUG, and AWS re:Invent in Las Vegas. Stay thirsty for IT knowledge and truths my friends! Let me know if you'll be at any of these events, always happy to connect with THWACK community members and converse the IT day away.

nannas_ring2.jpg

We just spent two days wrestling with this year's THWACKcamp theme, and I think we've all come away much richer for the discussions held, the information shared, and the knowledge imparted.

 

And as I sit here in the airport lounge, tired but exhilarated and energized, an old post has appeared in my FB feed. It was written and posted by a friend of mine a few years ago, but came back up through serendipity, today of all days.

 

It tells the story of why this amazing woman who's been my friend since 7th grade chose the sciences as her life's path. And it starts with a ring - one which was given 50 years late, but given nevertheless. And why the ring was actually secondary after all.

 

We lived near each other, played flute one chair apart in band, shared an interest in all things geeky including comic books and D&D, and when she got her license in high school we carpooled to HS together because we both suffered from needing to be “in place” FAR too early in the morning. Many a sleepy morning was spent sitting outside the band room, where I would test her on whatever chemistry quiz was upcoming. Even then her aspiration to be a biologist was clear and firm, and she was driven to get every answer not only correct, but DOWN. Down pat. To this day whenever I call, I ask her the atomic weight of germanium (72.64, if you are curious).

 

She graduated a year before me, was accepted to the college of her choice, and from there easily attained all of her goals.

 

No small credit for this goes to the two women in her life: “Nanna” – her grandmother, who you can read about below; and her mother, a gifted chemical engineer with a long and illustrious career at BP, who was herself also inspired by Nanna.

 

When we talk about the energy behind the “challenge accepted” theme, this story really drives home a powerful set of lessons:

 

  1. The lesson that the challenges accepted by others have paved the way for us. They are a very real and tangible gift.
  2. The reminder that our willingness to face challenges today has the potential to impact far more than we realize: more than our day; more than our yearly bonus; more even than our career.

 

In simply getting up, facing the day, and proclaiming (whether in a bold roar to the heavens or a determined whisper to ourselves) “Challenge Accepted” we have the opportunity to light the way for generations to come.

 

************************

Nanna's Ring

It is a simple steel band. No engravings, nothing remarkable. It has always been on her right hand pinkie finger since she got it.

 

It was May in the summer of 1988. I had graduated with my bachelor degree in biology and was getting ready to start the Master's program at the University of Dayton. Nanna needed to get to Ada, Ohio and I needed to drop some stuff down at Dayton. We made a girl's weekend trip around Ohio.

 

We talked about traveling to college and how in her day it was all back roads; the interstate system that I was driving had not come about. I was speeding (young and in a hurry to go do things) and Nanna said, "Go faster." I made it from Eastside Cleveland to Dayton in record time. Dropped off the stuff with my friends and back on the road!

 

Made it to Ohio Northern in time to grab dinner in the dorm hall cafeteria, wander around a little bit until Nanna's knees had enough of that, then we settled into the dorm room for the night. I got the top bunk, she took the bottom. We talked and giggled like freshmen girls spending their first night at college.

 

The next morning we got dressed up, grabbed breakfast, then made our way to the lecture hall. The room was packed with kids my age and professors. The ceremony began. It was the honor society for engineers and the soon-to-graduate engineers were being honored. One by one, the new engineers were called to the front. Last of all the head of the society called out, "Jane Cedarquist!" Nanna smiled and, with a little more spring in her achy knees, went to the front of the hall.

 

"About 50 years ago, this young lady graduated with a degree in engineering. She was the first lady to do so from our college so we honor her today - an honor overdue." She got a standing ovation and a number of the young engineers that stood with her gave her hugs and shook her hands. After the quiet returned, the engineering students gave their pledge and received their rings of steel and placed them on their right hand pinkie fingers.

 

Jane Cedarquist went out into the world as an engineer and managed to survive the trials and tribulations of being a woman in a man's world. Eventually she met Dick Harris, and they married and had two kids. She stayed home because that's how things worked. Eventually her kids grew up and had kids of their own. She never got back to engineering, though some of the landscape projects and quilts she made had the obvious stamp of an engineer's handiwork. She traveled around the world and marvelled at the wonders, both man-made and God-given. She loved jewelry and always had her earrings, necklace, bracelets, and rings on her person. All of them had meaning and value. Some pieces would come and go, but after that day in 1988 she was never seen without that ring of steel.

 

It is a simple steel band. No engravings, nothing remarkable. It has always been on her right hand pinkie finger since she got it - until now.

 

(Leon's footnote: Jane Cedarquist Harris passed away, and passed her ring to my friend. She wears it - on a chain, since she understands the gravity of the pledge her Nanna made - carrying the legacy forward both professionally and symbolically).

stencil.linkedin-post (1).jpg

In my previous posts, I shared my tips on being an Accidental DBA - what things you should focus on first and how to prioritize your tasks.  Today at 1PM CDT, Thomas LaRock, HeadGeek and Kevin Sparenberg, Product Manager, will be talking about what Accidental DBAs should know about all the stuff that goes on inside the Black Box of a database.  I'm going to share with you some of the other things that Accidental DBAs need to think about inside the tables and columns of a database.

 

I'm sure you're thinking "But Karen, why should I care about database design if my job is keeping databases up and running?"  Accidental DBAs need to worry about database design because bad design has significant impacts on database performance, data quality, and availability. Even though an operational DBA didn't build it, they get the 3 AM alert for it.

 

Tricks

People use tricks for all kinds of reasons: they don't fully understand the relational model or databases, they haven't been properly trained, they don't know a feature already exists, or they think they are smarter than the people who build database engines. All but the last one are easily fixed. Tricky things are support nightmares, especially at 3 AM, because all your normal troubleshooting techniques are going to fail. They impact the ability to integrate with other databases, and they are often so fragile no one wants to touch the design or the code that made all these tricks work. In my experience, my 3 AM brain doesn't want to see any tricks.

 

Tricky.png

 

Tricky Things

Over my career I've been amazed by the variety and volume of tricky things I've seen done in database designs.  Here I'm going to list just 3 examples, but if you've seen others, I'd love to hear about them in the comments. Some days I think we need to create a Ted Codd Award for the worst database design tricks.  But that's another post...

 

Building a Database Engine Inside Your Database

 

You've seen these wonders…a graph database built in a single table.  A key-value pair (or entity attribute value) database in a couple of tables. Or my favourite, a relational database engine within a relational database engine.  Now doing these sorts of things for specific reasons might be a good idea.  But embracing these designs as your whole database design is a real problem.  More about that below.
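If you haven't run into one of these yet, here's a minimal, hypothetical sketch of the entity-attribute-value variety using SQLite: everything lands in one generic table, every value lives in one big text column, and even a trivial question needs a self-join per attribute.

```python
# Minimal sketch of the "database engine inside your database" trick: one
# generic entity/attribute/value table instead of real columns.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")
db.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    (1, "name", "Ada"),   (1, "city", "Cleveland"),
    (2, "name", "Grace"), (2, "city", "Dayton"),
])

# "Which customers are in Dayton, and what are their names?"
rows = db.execute("""
    SELECT n.value AS name
    FROM eav c JOIN eav n ON n.entity_id = c.entity_id AND n.attribute = 'name'
    WHERE c.attribute = 'city' AND c.value = 'Dayton'
""").fetchall()
print(rows)   # [('Grace',)] -- one self-join for *each* attribute you need
```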

 

Wrong Data Types

 

One of the goals of physical database design is to allocate just the right amount of space for data. Too little and you lose data (or customers), too much and performance suffers.  But some designers take this too far and reach for the smallest type possible, like INTEGER for a ZIPCode.  Ignoring that some postal codes have letters, this is a bad idea because ZIPCodes have leading zeros.  When you store 01234 as an INTEGER, you are storing 1234.  That means you need to do text manipulation to find data via postal code and you need to "fix" the data to display it.
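A tiny illustration of the leading-zero problem (plain Python here, but the same thing happens inside an INTEGER column):

```python
# Leading zeros disappear when a ZIP code becomes an integer, so the "fix"
# ends up being string padding on every read.
zip_as_text = "01234"
zip_as_int = int(zip_as_text)        # what an INTEGER column actually stores
print(zip_as_int)                    # 1234 -- the leading zero is gone

# To display or compare it you now have to re-pad it everywhere:
print(str(zip_as_int).zfill(5))      # "01234"

# And codes like "K1A 0B1" (Canada) or "SW1A 1AA" (UK) can't be stored at all:
try:
    int("K1A 0B1")
except ValueError as err:
    print("ValueError:", err)
```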

 

Making Your Application Do the Hard Parts

It's common to see solutions architected to do all the data integrity and consistency checks in the application code instead of in the database.  Referential integrity (foreign key constraints), check constraints, and other database features are ignored, and instead hundreds of thousands of lines of application code are written to enforce data quality.  This inevitably leads to data quality problems.  However, the worst part is that these designs often lead to performance issues, too, and most developers have no idea why.
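For contrast, here's a minimal sketch using SQLite of letting the engine enforce referential integrity; the table names are made up, but the point is that the constraint catches what a forgotten application-side check would miss.

```python
# Minimal sketch: let the engine enforce referential integrity instead of
# trusting every code path to remember the check. SQLite needs the pragma.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("""CREATE TABLE orders (
                 id INTEGER PRIMARY KEY,
                 customer_id INTEGER NOT NULL REFERENCES customer(id))""")

db.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
db.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")   # fine

try:
    # An application-side check might miss this; the constraint never does.
    db.execute("INSERT INTO orders (id, customer_id) VALUES (11, 999)")
except sqlite3.IntegrityError as err:
    print("Rejected by the engine:", err)   # FOREIGN KEY constraint failed
```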

Why Do We Care?

 

While most of the sample tricks above are the responsibility of the database designer, the Accidental DBA should care because:

 

  • DBAs are on-call, not the designers
  • If there are Accidental DBAs, it's likely there are Accidental Database Designers
  • While recovery is job number one, all the other jobs involve actually getting the right data to business users
  • Making bad data move around faster isn't actually helping the business
  • Making bad data move around slower never helps the business
  • Keeping your bosses out of jail is still in your job description, even if they didn't write it down

 

But the most important reason why production DBAs should care about this is that relational database engines are optimized to work a specific way - with relational database structures.  When you build that fancy key-value structure for all your data, the database optimizer is clueless about how to handle all the different types of data. All your query tuning tricks won't help, because all the queries will be the same.  All your data values will have to be indexed in the same index, for the most part.  Your table sizes will be enormous and full table scans will be very common.  This means you, as the DBA, will be getting a lot of 3 AM calls. I hope you are ready.

 

With applications trying to do data integrity checks, they are going to miss some. A database engine is optimized to do integrity checks quickly and completely. Your developers may not.  This means the data is going to be mangled, with end users losing confidence in the systems. The system may even harm customers or lead to conflicting financial results.  Downstream systems won't be able to accept bad data.  You will be getting a lot of 3 AM phone calls as integration fails.

 

Incorrect data types will lead to running out of space for bigger values, slower performance as text manipulation must happen to process the data, and less confidence in data quality.  You will be getting a lot of 3 AM and 3 PM phone calls from self-serve end users.

 

In other words, doing tricky things with your database is tricky. And often makes things much worse than you anticipate.

 

At THWACKcamp today, sqlrockstar Thomas and Kevin will be covering the mechanics of databases and how to think about troubleshooting all those 3 AM alerts.  While you are attending, I'd like you to also think about how design issues might have contributed to that phone call.  Database design and database configurations are both important.  A great DBA, accidental or not, understands how all these choices impact performance and data integrity.

 

Some tricks are a proper response to unique design needs. But when I see many of them, or overuse of tricks, I know that there will be lots and lots of alerts happening in some poor DBA's future.  You should take steps to ensure a good design lets you get more sleep.  Let the database engine do what it is meant to do.

TODAY IS THWACKCAMP! Have you registered yet? I will be in Austin this week for the event and doing some live cut-ins as well. Come join over 5,000 IT professionals for two days of quality content and prizes!

 

As exciting as THWACKcamp might be, I didn't let it distract me from putting together this week's Actuator. Enjoy!

 

Ransomware: The race you don’t want to lose

Another post about ransomware which means it's time for me to take more backups of all my data. You should do the same.

 

How to Stream Every NFL Game Live, Without Cable

I had a friend once who wanted to watch a football match while he was in Germany, and he used an Azure VM in a US East datacenter in order to access the stream. Funny how those of us in tech know how to get around silly rules blocking content in other countries, like Netflix in Canada, but it still seems to be a big secret for most.

 

Why lawyers will love the iPhone 7 and new Apple Watch

Everything you need to know about the recent Apple event last week. Even my kids are tuned in to how Apple likes to find ways to get people to spend $159 on something like an AirPod that will easily get lost or damaged, forcing you to spend more money.

 

Delta: Data Center Outage Cost Us $150M

Glad we have someone trying to put a price tag on this but the question that remains is: How much would it have cost Delta to architect these systems in a way that the power failure didn't need to trigger a reboot? If the answer is "not as much", then Delta needs to get to work, because these upgrades will be incremental and take time.

 

Samsung Galaxy Note 7: FAA warns plane passengers not to use the phone

Since we are talking about airlines, let's talk about how the next time you fly your phone (or the phone of the passenger next to you) may explode. Can't the FAA and TSA find a way to prevent these phones from being allowed on board?

 

Discipline: The Key to Going From Scripter to Developer

Wonderful write up describing the transition we all have as sys admins. We go from scripting to application development as our careers progress. In my case, I was spending more time managing my scripts and homegrown monitoring/tracking system than I was being a DBA. That's when I started buying tools instead of building them.

 

What if Star Trek’s crew members worked in an IT department?

Because Star Trek turned 50 years old last week, I felt the need to share at least one post celebrating the series that has influenced so many people for so many years.

 

Presented without comment:

DBA.jpg

Many of my blog posts and live talks focus on the changing nature of storage. Traditional storage architectures are giving way to dispersed arrays and even software-defined storage. And traditional storage arrays are giving way to things that don't really look like storage arrays at all. But what does this mean for the storage administrator? Is this job going the way of the dodo?

 

Storage isn't easy. Sure, it seems like it's just a matter of keeping the disks spinning while applications do the real work, but storage is much more than that. It's all about performance, availability, and advanced data movement features. And the job of the storage administrator is much more than just keeping the lights on until the next array needs to be installed! In fact, it is mastery of the application integration points, from VAAI to TimeFinder, that truly defines what it means to be a storage admin.

 

As storage arrays devolve, evolve, and merge with servers, many of the traditional management tasks do disappear. But so much more is left to do! The demise of the traditional storage array isn't the end of a career in storage; it's a moment of liberation!

 

Storage administrators have always wanted to move their focus from hardware operations to higher-level data management tasks. Now is their big chance. VSAN may have no SAN at all, but the data is still there. Cloud storage moves data off-premises but the data is just as important. In many ways, these new technologies make it even more important for a company to have someone looking after their data.

 

Now is the time for that storage administrator to stake out a seat at the application planning meetings and begin talking about issues of data mobility, locality, and availability. These are the traditional topics of the storage industry, but they were too often submerged in the daily grind of storage performance and basic functionality. Integrated storage systems finally promise to eliminate the tedium of mapping and connecting storage systems, and anyone who's fought with iSCSI or Fibre Channel welcomes that burden being lifted!

 

The career of a storage administrator is not going away. In fact, it's becoming much more important in this software-defined world. Rise up and become data managers!

 

I am Stephen Foskett and I love storage. You can find more writing like this at blog.fosketts.net, connect with me as @SFoskett on Twitter, and check out my Tech Field Day events.

Network management doesn’t have to be overly complex, but a clear understanding of what needs to be accomplished is important. In a previous blog series I talked about the need for a tools team to help in this process; a cross-functional team may be critical in defining these criteria.

 

  1. Determine What is Important—What is most important to your organization is likely different from what matters to your peers at other organizations, albeit somewhat similar in certain regards. Monitoring everything isn’t realistic and may not even be valuable if nothing is done with the data that is being collected. Zero in on the key metrics that define success and determine how to best monitor those.
  2. Break it Down into Manageable Pieces—Once you’ve determined what is important to the business, break that down into more manageable portions. For example, if blazing-fast website performance is needed for an eCommerce site, consider dividing this into network, server, services, and application monitoring components.
  3. Maintain an Open System—There is nothing worse than being locked into a solution that is inflexible. Leveraging APIs that can tie disparate systems together is critical in today’s IT environments. Strive for a single source of truth for each of your components, and exchange that information via vendor integrations or APIs to make the system better as a whole (see the sketch after this list).
  4. Invest in Understanding the Reporting—Make the tools work for you; a dashboard is simply not enough. Most of the enterprise tools out there today offer robust reporting capabilities; however, these often go unimplemented.
  5. Review, Revise, Repeat—Monitoring is rarely a “set and forget” item; it should be in a constant state of improvement, integration, and evaluation to enable better visibility into the environment and the ability to deliver on key business values.
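As a rough sketch of point 3, the snippet below pulls "down" nodes from one system's REST API and opens an incident in another so nobody has to re-key data between tools; the endpoints and field names are hypothetical placeholders, not any vendor's actual API.

```python
# Minimal sketch of tying two systems together over their APIs.
# Both URLs and payload fields below are hypothetical placeholders.
import json
import urllib.request

MONITOR_API = "https://monitor.example.com/api/nodes?status=down"   # hypothetical
TICKET_API = "https://cmdb.example.com/api/incidents"               # hypothetical

def get_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def post_json(url, payload):
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Open an incident for every node the monitoring system reports as down.
for node in get_json(MONITOR_API):
    post_json(TICKET_API, {"summary": f"{node['name']} is down", "source": "monitoring"})
```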

Learn from the past, live in the present and prepare for the future.

 

While this may sound like it belongs hanging on a high school guidance counselor’s wall, they are, nonetheless, words to live by, especially in federal IT. And they apply perhaps to no other infrastructure element better than the network. After all, the network has long been a foundational building block of IT, and its importance will only continue to grow in the future.

 

It’s valuable to take a step back and examine the evolution of the network. Doing so helps us take an inventory of lessons learned—or the lessons we should have learned; determine what today’s essentials of monitoring and managing networks are; and finally, turn an eye to the future to begin preparing now for what’s on the horizon.

 

Learn from the Past

 

Before the luxuries of Wi-Fi and the proliferation of virtualization, the network used to be defined by a mostly wired, physical entity controlled by routers and switches. Business connections were established and backhauled through the data center. Each network device was a piece of agency-owned hardware, and applications operated on well-defined ports and protocols.

 

With this yesteryear in mind, consider the following lessons we all (should) have learned that still apply today:

 

It Has to Work

 

If your network doesn’t actually work, then all the fancy hardware is for naught. Anything that impacts the ability of your network to work should be suspect.

 

The Shortest Distance Between Two Points is Still a Straight Line

 

Your job as a network engineer is still fundamentally to create the conditions where the distance between the provider of information, usually a server, and the consumer of that information, usually a PC, is as near to a straight line as possible. If you get caught up in quality of service maps, and disaster recovery and continuity of operations plans, you’ve lost your way.

 

Understand the Wizard

 

Wizards are a fantastic convenience and come in all forms, but if you don’t know what the wizard is making convenient, you are heading for trouble.

 

What is Not Explicitly Permitted is Forbidden

 

This policy will actually create work for you on an ongoing basis. But there is honestly no other way to run your network. If you are espousing that this policy will get you in trouble, then the truth is you’re going to get into trouble anyway. Do your part to make your agency network more secure, knowing that the bad guys are out there, or the next huge security breach might be on you.

 

Live in the Present

 

Now let’s fast forward and consider the network of present day.

 

Wireless is becoming ubiquitous, and the number of devices wirelessly connecting to the network is exploding. It doesn’t end there, though—networks are growing, some devices are virtualized, agency connections are T1 or similar services, and there is an increased use of cloud services. Additionally, tablets and smartphones are becoming prevalent and creating bandwidth capacity and security issues; application visibility based on port and protocol is largely impossible due to tunneling, and VoIP is common.

 

The complexity of today’s networking environment underscores that while lessons of the past are still important, a new set of network monitoring and management essentials is necessary to meet the challenges of today’s network administration head on. These new essentials include:

 

Network Mapping

 

When you consider the complexity of today’s networks and network traffic, network mapping and the subsequent understanding of management and monitoring needs has never been more essential than it is today.

 

Wireless Management

 

The growth of wireless networks presents new problems, such as ensuring adequate signal strength and keeping the proliferation of devices and their physical mobility from getting out of hand. What’s needed are tools such as wireless heat maps, user device tracking, and device IP address tracking and management.

 

Application Firewalls

 

Application firewalls can untangle device conversations, get IP address management under control and help prepare for IPv6. They can also classify and segment device traffic; implement effective quality of service to ensure that critical business traffic has headroom; and of course, monitor flow.

 

Capacity Planning

 

You need to integrate capacity forecasting tools, configuration management, and web-based reporting to be able to predict scale and demand requirements.

 

Application Performance Insight

 

The whole point of having a network is to run the applications stakeholders need to do their jobs. Face it, applications are king. Technologies such as deep packet inspection, or packet-level analysis, can help you ensure the network is not the source of application performance problems.

 

Prepare for the Future

 

Now that we’ve covered the evolution of the network from past to present—and identified lessons we can learn from the network of yesterday and what the new essentials of monitoring and managing today’s network are—we can prepare for the future. So, stay tuned for part two in this series to explore what the future holds for the evolution of the network.

 

Find the full article on Federal Technology Insider.

Last week VMWorld took place in Las Vegas, and I was fortunate enough to attend for the second straight year. I love the energy at VMWorld, it is unlike any other conference I attend. The technology, as well as the attendees, appear to be on the cutting edge of enterprise technology. The discussions we have in and around the exhibit hall are worth the price of admission alone. On top of that I am lucky enough to be able to rub elbows with top industry experts and have discussions about the future tech landscape.

 

During VMWorld there was a hashtag on Twitter, #VMworld3word where attendees would use three words to describe VMWorld. That got me thinking about how I might try to describe VMWorld in three takeaways instead of just words. These three items were common themes in the discussions where I took part, or witnessed, last week.

 

You can find lots of articles on  the internet that summarize all the major announcements at VMWorld. That's not what this blog post is for. No, this blog post is my effort to help you understand what I witnessed as common threads even in regards to the major announcements.

 

Storage is King, Maybe

Make no mistake about this, everywhere you looked you found someone talking about storage, storage issues, and storage solutions. Flash is the answer for everything, apparently, even if storage isn't your issue. At one point I swear a storage vendor promised me that their all-flash hyper-converged array would cure my polio. The amount of money being invested in storage vendors may be trending downward, but judging by the exhibit hall floor last week at VMWorld the amount of money being spent on storage product development and marketing remains high.

 

One aspect that these storage vendors seem to either be forgetting, or just not talking about, is the Hybrid IT story. It would seem that the Cloud is a bit of a threat to these vendors, as they are finding it harder to sell their wares to enterprise customers and instead must start focusing on building partnerships with Microsoft and Amazon if their products are to remain relevant. Unfortunately, those Cloud giants rely on commodity hardware, not specialty hardware, which means to me that the storage gravy train is just about over. Let's face it, Microsoft isn't about to order a million hyper-converged arrays anytime soon.

 

The last point I want to make about storage is that there seems to be a mindset that storage is the main bottleneck. Many vendors seem to forget that the network plays an important role in getting data to and from their storage devices. The network seems to be an area where storage vendors just put their hands up and say "that's not us". This is especially true when we talk Hybrid IT as well.

 

Correlated Monitoring is Lacking

Whenever I had the chance to talk monitoring with vendors and attendees, it was clear to me that correlation of metrics and events is something that is lacking in the industry. I believe this to be true for two reasons. First, everyone makes dashboards that show metrics, usually related by resource (disk, CPU, memory, and sometimes basic network stats), and everyone admits that such dashboards are not very good at telling you a root cause. Second, the look on the faces of data professionals when I show them the main virtualization screen for DPA. Once they see the stacked view that allows them to see issues at the storage, host, guest, or database engine layer, their initial reaction is "take my money". It is as if no one on the market is presenting such data in a correlated way.

 

That's because vendors have spent years building tools that report metrics but do not report meaning. The latest trend is machine learning and predictive analytics to give insight, but the reality is you don't need a lot of fancy algorithms behind the scenes in order to do 80% of your work. What you need is for someone to provide you a group of metrics, across your infrastructure, that show the relationships between entities. In other words, if there is an issue with a LUN, can you quickly see what datastores, hosts, VMs, databases, and applications might be affected? For many vendors the answer is "no".

 

Accidental Cloud DBAs

Given my background and role, I naturally gravitated towards conversations with DBAs last week. Storage and correlated monitoring were two of the topics we talked about. The third topic centered around how traditional DBAs today have little to no insight into (or knowledge of) how networks work. But with Hybrid IT being the reality for most, the data professionals I spoke with last week acknowledged that they needed to know more about networks, network topology, and how to troubleshoot network performance quickly.

 

Think about this for a moment. When your company starts using cloud resources, and someone calls your desk saying "the app is slow", are you able to quickly look to determine if the issue is related to the network? I tend to do my performance troubleshooting in buckets. It goes like this: either something is in this bucket, or in that bucket. If the app is slow then I want to quickly determine if it is a network issue or not. If it is network, then I work with the team(s) that can fix the issue. If it is not network, then I know it is likely something I need to fix as the DBA, and I get to work.
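A crude sketch of that first bucket check, with a placeholder host and thresholds you'd tune for your own environment: time a bare TCP connect to the database server separately from the query itself, and let that decide which team gets the call first.

```python
# Rough "which bucket?" triage: is the network slow, or is it my query?
# The host, port, and query timing below are placeholders, not a real setup.
import socket
import time

DB_HOST, DB_PORT = "db01.example.com", 1433   # hypothetical SQL Server endpoint

def tcp_connect_ms(host, port, timeout=3):
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000

net_ms = tcp_connect_ms(DB_HOST, DB_PORT)
query_ms = 850   # in practice: measure the query with your client library

if net_ms > 100:   # crude threshold; tune for your own topology
    print(f"Connect took {net_ms:.0f} ms - start with the network team")
else:
    print(f"Network looks fine ({net_ms:.0f} ms); the {query_ms} ms query is likely mine to fix")
```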

 

But I certainly don't want to lose hours of my life trying to fix an issue with the application that doesn't exist. The Cloud DBA will also serve as an accidental network administrator as more companies adopt a Hybrid IT strategy.

 

There you have it, my three takeaways from a fabulous week in Las Vegas. Oh, and here's my #VMWorld3word for you: Fall out boy.

 

FullSizeRender.jpg

Over the last couple of years the business has been constantly asking IT departments how the public cloud can provide services that are faster, cheaper, and more flexible than in-house solutions. I'm not going to argue with you about whether this is right or not; it is what I hear at most of my customers, and in a couple of cases the answer seems to be automation. The next-gen data centers that leverage a software-defined foundation use high levels of automation to control and coordinate the environment, enabling service delivery that will meet business requirements today and tomorrow.

 

For me the software-defined data center (SDDC) provides an infrastructure foundation that is highly automated for delivering IT resources at the moment they are needed. The strength of SDDC is the idea of abstracting the hardware and enabling its functionality in software. Due to the power of hardware these days, it's possible to use a generic platform with specialized software that enables the core functionality, whether for a network switch or a storage controller. Network functions, for example, were once specialized hardware appliances; today, they are more and more virtualized with specialized software. Virtualization has revolutionized computing and allowed flexibility and speed of deployment. In today's IT infrastructures, virtualization enables both portability of entire virtual servers to off-premises data centers for disaster recovery and local virtual server replication for high availability. What used to require specialized hardware and complex cluster configuration can now be handled through a simple check box.

 

By applying the principles behind virtualization of compute to other areas such as storage, networking, firewalls, and security, we can use its benefits throughout the data center. And it's not just virtual servers: entire network configurations can be transported to distant public or private data centers to provide full architecture replication. Storage can be provisioned automatically in seconds and perfectly matched to the application that needs it. Firewalls are now part of the individual workload architecture, not of the entire data center, and this granularity protects against threats inside and out, yielding unprecedented security. But what does it all mean? Is the SDDC some high-tech fad or invention? If you ask me: absolutely not. The SDDC is the inevitable result of the evolution of the data center over the last decade.

 

I know there is a lot of marketing fluff around the data center, and Software Defined is one of those terms, but for me the SDDC is, for a lot of companies, the right fit for this moment. As for what the future will bring, who knows where we'll stand in 10 years! The only thing we know is that a lot of companies are struggling with their IT infrastructure and need help bringing their environment to the next level. SDDC is a big step forward for most (if not all) of us, and call it what you like, but I'll stick with SDDC.

It's time to go to camp! You're probably thinking to yourself, "But Tom, summer's over. Camp is through. I have to go back to my boring day job."

That's not true at all! You've still got one more chance to go to camp with a group of geeks that you'll fit in with just fine. Solarwinds THWACKCamp is next week, and it's virtual! You can grab some S'mores, sit around the warm glow of your monitor, and commiserate with great speakers like Amy Lewis, Patrick Hubbard, and Stephen Foskett!

This is a chance for you to learn more about hot topics in IT. Not just the typical discussions about how SDN is going to take your job or how the cloud is the most awesome thing ever. No, these are really in-depth discussions about topics that matter to you. Like about how you shouldn't hate your monitoring system. Or how you can get started with the fundamentals of network security. There's even a flash storage panel! These are the topics that matter to the people in the trenches fighting to keep the network alive and running each day. And even if you find yourself wearing the DBA hat now and then, THWACKCamp has you covered there too.

The state of IT in 2016 is in flux more than any other time in the history of computing. Software is eating the world and proving that expensive custom hardware doesn't rule the kingdom any longer. The cloud is forcing infrastructure teams to look at their budgets and make hard choices. The cloud is also forcing application developers to change the way they write apps and never assume that something isn't going to be running all over the world at all hours of the day and night. The push toward automation and orchestration of the data center means that IT pros need to be making smarter decisions about the way things are planned so they aren't spending hours and hours fixing failures in production.

What does this state of flux mean for you? It means you need to get a leg up on everything you need to know to meet these challenges head on. That could mean sleepless nights combing through documentation for little gain. Or it could mean flying across the country to hear some "expert" drone on in an uncomfortable convention center about what their company vision is to get you to buy more things.

Instead, why don't you spend $0 and sit in the comfort of your office chair or couch and participate in THWACKCamp? You can learn about what you need to know without the need to travel or expend a fortune for no gain. Not enough? How about getting more than you bargained for? Because every THWACKCamp session has a drawing for a free fully licensed Solarwinds tool! You can't beat a deal like that!

Stop worrying about the future of IT and do something to be a part of it! Sign up for THWACKCamp today. It costs nothing and provides more than you could have bargained for. Good discussions, important topics, and a chance to get some free tools. You have nothing to lose, so sign up now!

I’ve been in a customer facing role for the last seven years. My first role as a Pre-Sales Architect came after years of running an architecture internal to a large insurance company.

 

There were many adjustments I needed to make in order to be successful in this new role. Some came easily to me, most notably the empathy required to support an internal IT role. I'd been there for years, so my abilities in this respect simply came naturally to me. There were others that didn't come quite so smoothly. Of these, I'd have to say the most difficult for me had to be the drive to convince the buyer that my solution was the ideal one for their needs. In some cases, surely, I did have the ideal solution, but in others, a bit of shoehorning needed to take place for this to be accomplished. I did have some philosophical problems pushing a somewhat less than ideal solution to my customers.

 

Taking care of the needs of my customers had always been the first priority toward which I focused my efforts. Arriving at the best solution, regardless of vendor, is and was paramount.

 

At what point, though, do the specific needs a customer brings outweigh the benefits of serving them?

 

In some ways, a simple cost-benefit analysis is all it takes. But that may be putting too simple a formula to the complexity of a decision like this. A customer who isn't a good fit for your customer base, and who demands too much time, attention, or effort with not enough payoff for the company you work for, can make that formula easy to apply. But that's easily too obvious an answer.

 

We could go on and on about a customer who’s unwilling or unable to pay their bills. This is again, a clear decision. In these cases, to state, “We’ll help, but we must change the pay dynamic” is appropriate. How about making things Net 10, or payable upon service? We must be considerate of their unique set of circumstances. However, this is a business, and to cut off a customer, as they’ve stopped being mutually beneficial, is quite possibly a little too self-serving. More must be entered into that equation. For example, do they want to work with you exclusively? Is this extended effort a one-time research project, or is it every single time?

 

My personal biggest frustration is when a new customer has placed us in the position of building an architecture for their needs, refining and refining, then takes the configuration and uses it for guidance on retrieving a competitive bid elsewhere. While we may have some slight advantage due to having registered the bid initially, we are a fully functional service provider. Our capacities extend far beyond the mere quoting of hardware. Our expertise simply in designing a quality architecture should, by all rights, give us some reasonable leg up. If a customer is buying just based on pricing, there are plenty of reputable options. Let’s all try to interact with integrity, please?

 

I think that what’s clear here is that these decisions must be made cautiously, on a case by case basis. What you don’t want to do, specifically, is throw away a good relationship. What you must do is be aware that a beneficial relationship can be forever lost.

I'm back from Las Vegas and VMworld. I loved the energy at the event last week. It was great talking with vendors and attendees about technology and events. There were a handful of announcements, of course, but for me the real excitement is when you can sit down and talk to others that are using technology in interesting ways to solve interesting problems.

 

Anyway, here are the items I found most amusing from around the Internet. Enjoy!

 

VMware Reassembles Cloud Stack as New VMware Cloud Foundation

The key point in this article has to do with how the Cloud Foundation is built upon existing cloud providers such as Azure, AWS, and Google. Going forward, that's the model I see everyone adopting, the leveraging of existing cloud and not trying to build your own.

 

Nashville Hotel Suffered POS Breach For Three Years

Three. Years. That's not just a breach, that's gross negligence. Then again, considering how data breaches are so prevalent these days, maybe it's just average negligence.

 

AI, Machine Learning, and The Hitchhiker’s Guide

Nice summary of machine learning versus artificial intelligence, with a touch of Douglas Adams for good measure.

 

What Facebook Data Center Team Learned from Shutting Down Entire Facilities

Back in the day, when I was involved in DR testing, we just did failovers. We didn't power down an entire data center! Hat tip to Facebook for going through this pain, and for helping to make the world a better place, one cat video at a time.

 

The Self-Driving Car Race

I was thinking about self-driving cars while in Las Vegas last week, and a report that Singapore is deploying them as taxis in a restricted zone of the city. I believe that self-driving cars are going to be here sooner than we might think, and the tech behind them will be used elsewhere.

 

Autonomous Tractor Concept Takes The Farmers Out Of Farming

For example, self-driving tractors. There's no reason why we can't automate these as well as cars.

 

NASA to develop rules for flying small drones

And then we have drones. There is no reason they can't be autonomous as well. And once they get big enough to carry hundreds of pounds of cargo? Flying cars, dear reader.

 

When I arrived in Las Vegas, this was my assigned taxi stand number, which made me think "maybe I shouldn't gamble this week":

 

IMG_4343.JPG

The advantage is firmly in the hands of the attackers right now. The number of easy-to-use tools available and the speed at which new vulnerabilities are incorporated into these tools greatly outpace the speed at which most organizations can stay on top of the threats. No matter how many precautions you have taken, a breach or incident will occur. You should operate under the assumed breach mentality. What are you going to do now?

 

Data centers are particularly juicy targets for attackers because there are so many different systems consolidated in a single place. Fortunately, the physical security of data centers is usually strong. Unfortunately, when you evaluate the digital security of data centers, we are far behind.

 

One lesson we can take from physical security principles is response: if someone were to physically attempt a breach, the plan would be clear. What’s your response after detecting a cyber-incident?

 

The Technical Response

 

For the technical response, one of the biggest questions is: do you shut down the attacker or monitor their activity? There are pros and cons for both approaches, but your organization needs to have a clear plan before the incident.

 

Let’s say you notice a large amount of traffic exiting your data center from a server that is running an unauthorized FTP service. If you disable the service immediately, will you be able to determine the full extent to which you are compromised? The attacker may still have access, and this will also cause them to go underground. If your policy is to monitor the attacker, how long do you do that, and how can you wall off the attacker from gaining access to other systems?

 

Federal incident notification guidelines have been established by DHS/US-CERT, and their use is mandated by FISMA. US-CERT will work with agency IT personnel to analyze threats, exchange critical information with trusted partners, and engage cyber defense resources, as appropriate. Agencies also need to follow their departmental policies.

 

Breaches bring IT front and center to agency executives and have an immediate and often long lasting impact to agency operations. When a security breach occurs, how you respond can make all the difference. If you have a well-structured incident response plan, you can mitigate much of the damage of an attack.

 

A comprehensive incident response plan needs to address the different types of incidents an agency could encounter. Roles and responsibilities of the response team need to be assigned and communicated, and back-ups need to be identified. Other important parts of your plan include establishing a communication decision tree, as well as incident response procedures. And don’t forget regular testing and updating. Quarterly exercises can make sure staff know how to respond, find flaws in the plan, and lead to updating it accordingly.

 

Investment in prevention is necessary, but insufficient. If you don’t have a well-defined incident response plan in addition to those prevention solutions, then you aren’t doing enough to secure your data centers and critical facilities.

 

Find the full article on our partner DLT’s blog, TechnicallySpeaking.
