
Geek Speak


By Joe Kim, SolarWinds Chief Technology Officer


With container adoption on the rise, I wanted to share a blog written in 2016 by my SolarWinds colleague, Kong Yang.


While the initial inroads are primarily still in the education phase, container technology has started making its way into federal IT networks, and the appeal is clear. Container-based technology provides value specifically in the areas of efficiency, optimization, and security, particularly as networks grow. This combination is uniquely suited to meet government IT needs.


Before an agency dips its toes into container technology, it’s vital for federal IT pros to gain an understanding of exactly what containers are, and what benefits they can bring.


What are Containers?


Container technology is far less complex than it sounds. Containers wrap a piece of software in a complete file system that contains everything the software needs to run, including code, runtime, system tools, and libraries. Containers guarantee that the software will always run the same, regardless of the compute environment.


Let’s say you’re building an application that handles online transactions. The user experience consists of logging in, clicking on an item to add it to the cart, walking through the checkout process, and finally submitting to complete the transaction. With containers, you can split these functions into loosely coupled services, aka microservices, across multiple containers. The advantage of doing it this way is that if one microservice fails, it will not take down the entire application.


In fact, a failure of a container or a system running containers will result in those services spinning up on other systems to get the work done. With non-container technology, there’s a good chance a tiered application is running on one or multiple systems to take care of that entire transaction. A failure that occurs on a tier or in a system will result in a degraded application or potential downtime as that tier restarts or fails over.


With container technology, however, each piece is separated out into its own tiny package. The login, for example, may be one container. Adding something to your cart may be another container, and so on. It’s like a distributed assembly line. Each container is responsible for its own small, unique task, which it does expertly, as opposed to one large monolithic application tier that’s responsible for many, often vastly different tasks, and carries much overhead.


How Can I Get Started Using Containers?


As with any new technology, the first thing to do is become familiar with that technology by learning about it. Because containers are typically open source, there is a wealth of publicly available information and source materials that can be used for education and replication. and are great places to start.


The next step is to ramp up on skill sets. IT teams should dedicate some resources and time and start building experience around containers and microservices. Spend time testing to understand where these services might be implemented throughout the agency to increase efficiency. Again, Docker® provides installable platforms, such as Docker for Mac®, Linux®, and Windows® that you can leverage to level up your container experience.


Once there is a baseline understanding of containers, how they work, and how they can be used, apply that to your own environment and start mapping out a strategy for implementation.


Find the full article on Federal Technology Insider.

There’s no question that trends in IT change on a dime and have done so for as long as technology has been around. The hallmark of a truly talented IT professional is the ability to adapt to those ever-present changes and remain relevant, regardless of the direction that the winds of hype are pushing us this week. It’s challenging and daunting at times, but adaptation is just part of the gig in IT engineering.


Where are we headed?


Cloud (Public) - Organizations are adopting public cloud services in greater numbers than ever. Whether it be Platform, Software, or Infrastructure as a Service, the operational requirements within enterprises are being reduced by relying on third parties to run critical components of the infrastructure. To realize cost savings in this model, operational (aka employee) and capital (aka equipment) costs must be reduced for on-premises services.


Cloud (Private) - Due to the popularity of public cloud options, and the normalization of the dynamic/flexible infrastructure that they provide, organizations are demanding that their on-premises infrastructure operate in a similar fashion. Or in the case of hybrid cloud, operate in a coordinated fashion with public cloud resources. This means automation and orchestration are playing much larger roles in enterprise architectures. This also means that the traditional organizational structures of highly segmented skill specialties (systems, database, networking, etc.) are being consumed by engineers who have experience in multiple disciplines.


Commoditization - When I reference commoditization here, it isn’t about the ubiquity and standardization of hardware platforms. Instead, I’m talking about the way that enterprise C-level leadership is looking at technology within the organization. Fewer organizations are investing in true engineering/architecture resources, and instead are bringing those services in either via utilization of cloud infrastructure, or bringing this skill set on through consultation. The days of working your way from a help desk position up to a network architecture position within one organization are slowly fading away.


So what does all of this mean for you?

It’s time to skill up. Focusing on one specialty and mastering only that isn’t going to be as viable a career path as it once was. Breadth of knowledge across disciplines is going to help you stand out because organizations are starting to look for people who can help them manage their cloud initiatives. Take some time to learn how the large public cloud providers like AWS, Azure, and Google Compute operate and how to integrate organizations into them. Spend some time learning how hyperconverged platforms work and integrate into legacy infrastructures. Finally, learn how to script in an interpreted (non-compiled) programming language. Don’t take that as advice to change career paths and become a programmer.  That line of thinking is a bit overhyped in my opinion. However, you should be able to do simple automation tasks on your own, and modify other people’s code to do what you need. All of these skills are going to be highly sought after as enterprises move into more cloud-centric infrastructures.


Don’t forget a specialty. While a broad level of knowledge is going to be a prerequisite as we go forward, I still believe having a specialty in one or two specific areas will help from a career standpoint. We still need experts; we just need those experts to know more than just their one little area of the infrastructure. Pick something you are good at and enjoy, and then learn it as deeply as you possibly can, all while keeping up with the infrastructure that touches/uses your specialty. Sounds easy, right?


Consider what your role will look like in 5-10 years. This speaks to the commoditization component of the trends listed above. If your aspiration is to work your way into an engineering or architecture-style role, the enterprise may not be the best place to do that as we move forward. My prediction is that we are going to see many of those types of roles move to cloud infrastructure companies, web scale organizations, resellers/consultants, and the technology vendors themselves. It’s going to get harder to find organizations that want to custom-design their infrastructure to match and enhance their business objectives, instead opting to keep administrative-level technicians on staff and leave the really fun work to outside entities. Keep this in mind when plotting your career trajectory.


Do nothing. This is bad advice, and not at all what I would recommend, but it is an equally viable path. Organizations don’t turn on a dime (even though our tech likes to), so you probably have 5 to 10 years of coasting ahead. You might be able to eke out 15 if you can find an organization that is really change averse and stubbornly attached to their own hardware. It won’t last forever, though, and if you aren’t retiring before the end of that coasting period, you’re likely going to find yourself in a very bad spot.


Final thoughts


I believe the general trend of enterprises viewing technology as a commodity, rather than a potential competitive advantage, is foolish and shortsighted. Technology has the ability to streamline, augment, and enhance the business processes that directly face a business’ customers. That being said, ignoring business trends is a good way to find yourself behind the curve, and recognizing reality doesn’t necessarily indicate that you agree with the direction. Be cognizant of the way that businesses are employing technology and craft a personal growth strategy that allows you to remain relevant, regardless of what those future decisions may be. Cloud skills are king in the new technology economy, so don’t be left without them. Focusing on automation and orchestration will help you stay relevant in the future, as well. Whatever it is that you choose to do, continue learning and challenging yourself and you should do just fine.

Patrick Hubbard

THWACKcamp 2017

Posted by Patrick Hubbard Administrator Jun 9, 2017

When SolarWinds hosted its first virtual event, THWACKcamp in 2012, about 250 very active THWACK® community members attended, along with technology managers from a few large customers. There were a handful of sessions, with topics concentrated largely around network monitoring best practices, with a nod to IT systems management. THWACKcamp returns this October 18-19, and will mark the sixth year in what has grown to become a live, multi-track event for thousands of skilled IT professionals. It now spans expert advice in everything from networking, to automation, to hybrid IT, to cloud-native APM, DevOps, security, and even MSP operations. And, again this year, IT professionals will be at THWACKcamp’s core, collaborating (and occasionally commiserating), but learning and sharing ideas that make IT more reliable, innovative, and perhaps even fun.


Moon Landing


Being voluntold that you’re supporting a physical tradeshow booth can be nerve-wracking. First, the whole endeavor is, at its heart, a marketing thing. You must specify and configure demo gear that must somehow be squeezed into impossibly designed sets without overheating. You also become a Cord Master, asked to improvise never-before-seen cabling and connectivity, like HDMI to ½” pipe-thread. On top of this, add Layer 8 configurations, live code that attendees can actually see and touch that’s also interesting. Finally, throw the whole mess into crates months before the event, aware that forgetting even something small might mean five days of blank screens. Event tech is not IT’s comfort zone. I know I certainly prefer to have the safety of a hardware lab and dev team nearby.


While THWACKcamp has the advantage of being a virtual event (more than a few admins have said that attending in shorts and a T-shirt while working from home is the way to go), it is nonetheless a live event. And this year, as more technologies and topics are included than ever before, the Q&A and open chat conversations will be wider-ranging and more technical than ever. It’s not limited to what we can fit into a few crates. It’s an opportunity to interact with IT of all types, from very small businesses that rely on Managed Service Providers, to midsized businesses managing the complexity of hybrid IT on a budget, to the largest enterprises with hundreds of IT professionals. It’s an open congress of some of the sharpest admins in IT, just as eager to attend and engage as the presenters are to share and learn something new.


Over-provisioned Geek Prize Closet


IT professionals attend technical conferences to learn, talk, and network, but they also certainly enjoy swag. Awesome geek giveaways return in 2017, along with THWACK community status points and bragging rights for those attending live. And for 2017, THWACKcamp attendees may earn up to 20,000 THWACK points for participating in activities, mini-missions, and, of course, attending sessions.

So, whether you’ve never missed a session of THWACKcamp, or you’ve never even been to a technical learning event, be sure to check out the registration page when it goes live in August. Maybe even set a reminder to register, because you can’t attend, chat with others, win prizes, or earn THWACK points if you don’t register. We look forward to seeing you live at THWACKcamp 2017, October 18 and 19!


Automating the Cloud

Posted by scuff Jun 8, 2017

Let’s stick our heads in the cloud for a moment. With your very first test account to play with a SaaS product or an Infrastructure as a Service environment, it’s natural to set up users and servers manually. That’s how we learn. That’s not sustainable on an ongoing basis for a production environment unless you want to screenshot every box you ticked and you know that the next tech will follow that documentation to the letter.


Decisions, decisions
Server builds and user account creation are two SysAdmin processes that are perfect for automating, even when they’re in the cloud. Your biggest challenge will be deciding what tool to use. Do you have a single vendor approach, so a native tool from that vendor will suffice? Are you splitting your risk between AWS and Azure, and looking for one tool that supports both environments? Are you running a hybrid model where there’s still a requirement for internal user accounts that you want to integrate with cloud SaaS products?


The single vendor approach
I’m going to pick on Azure and AWS because they are the two I’m most familiar with and I also have a word count to (roughly) stick to. If you’re a Rackspace or Google Cloud fan, or prefer some other IaaS flavor, add your thoughts in the comments.


Azure: It will be no surprise that Azure’s own automation service is based on PowerShell. PowerShell scripts and workflows (known as runbooks) to be exact. Learn more about Azure Automation here:


AWS: AWS CloudFormation uses JSON or YAML text files. You can choose from a library of templates or use the designer to create your own.
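To give a feel for what such a text file contains, here is a minimal sketch that builds a one-resource template in Python and prints it as JSON. The logical name "WebServer", the instance type, and the image ID are placeholders invented for illustration; a real template would need values that are valid in your own account and region.

```python
import json


def build_template(ami_id: str, instance_type: str = "t2.micro") -> dict:
    """Return a minimal CloudFormation-style template with one EC2 instance."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative single-instance stack",
        "Resources": {
            # "WebServer" is a hypothetical logical name chosen for this sketch.
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                },
            }
        },
    }


if __name__ == "__main__":
    # "ami-EXAMPLE" is a placeholder, not a real image ID.
    print(json.dumps(build_template("ami-EXAMPLE"), indent=2))
```

Generating the file from code rather than editing it by hand is one way to keep several similar stacks consistent, though a hand-written template works just as well for a single environment.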


The multi-vendor approach
I’ve briefly mentioned before the powerhouses of Chef and Ansible. Both have tools that integrate with both Azure and AWS.


Chef and Azure:


Chef and AWS:


Ansible and Azure:


Ansible and AWS:


DevOps also caught my eye, but it integrates with AWS, Digital Ocean, and Linode:


Usage and Billing
The "pay as you use" subscription model for SaaS products can lead to some large, unexpected bills. If the business loads a ton of new content (data) or places a significant amount of new traffic on one particular cloud server, you won’t see it until you get the monthly invoice. There are a few vendors jumping on board to help solve this problem.


Cloud Ctrl shows usage trends, compares spending between business units, and allows you to set usage thresholds and alerts. It is compatible with Azure, AWS, Google Cloud, SoftLayer, and Office 365.


Startup Meta SaaS has just come out of stealth mode after a seed investment of around $1.5 million. Their product helps you analyze your spend and usage of SaaS products, including alerting on renewal dates. It will also tell you when accounts are being left dormant, which is handy if people have left your organization and their SaaS accounts haven’t been canceled. Meta SaaS currently supports 224 SaaS vendors and is adding new integrations at a rate of 20 per week.


Over to you!

I've offered just a taste of what you can automate in the cloud. We haven’t covered the automation of account provisioning when you run a hybrid environment (with tools like Azure AD Connect in the Microsoft world), but see my previous comment regarding word count.

Would a move to the cloud make you more open to investigating automation tools? Are they a necessity in the cloud world, or just another thing that will sit on your to-do list? Do you find it easy or hard to wrap your head around things like JSON scripts, to move to a world of cloud infrastructure as code?  Let me know what you think.



I remember the largest outage of my career. Late in the evening on a Friday night, I received a call from my incident center saying that the entire development side of my VMware environment was down and that there seemed to be a potential for a rolling outage including, quite possibly, my production environment.


What followed was a weekend of finger pointing and root cause analysis between my team, the virtual data center group, and the storage group. Our org had hired IBM as the first line of defense on these Sev-1 calls. IBM included EMC and VMware in the problem resolution process as issues went higher up the call chain, and still the finger pointing continued. By 7 am on Monday, we’d gotten the environment back up and running for our user community, and we’d been able to isolate the root cause and ensure that this issue would never happen again. Others, certainly, but this one was not to recur.


Have you experienced circumstances like this at work? I imagine that most of you have.


So, what do you do? What may seem obvious to one may not be obvious to others. Of course, you can troubleshoot the way I do. Occam’s Razor or Parsimony are my courses of action. Try to apply logic, and force yourself to choose the easiest and least painful solutions first. Once you’ve exhausted those, you move on to the more illogical, and less obvious.


Early in my career, I was asked what I’d do as my first troubleshooting maneuver for a Windows workstation having difficulty connecting to the network. My response was to save the work that was open on the machine locally, then reboot. If that didn’t solve the connectivity issue, I’d check the cabling on the desktop, then the cross-connect before even looking at driver issues.


Simple parsimony, aka economy in the use of means to an end, is often the ideal approach.


Today’s data centers have complex architectures. Often, they’ve grown up over long periods of time, with many hands in the architectural mix. As a result, the logic as to why things have been done the way that they have has been lost, and troubleshooting application or infrastructure issues can be just as complex.


Understanding recent changes, patching, etc., can be an excellent way to focus your efforts. For example, patching Windows servers has been known to break applications. A firewall rule implementation can certainly break the ways in which application stacks can interact. Again, these are important things to know when you approach troubleshooting issues.


But what do you do if there is no guidance on these changes? There are a great number of monitoring applications out there that can track key changes in the environment and point the troubleshooter toward potential issues. I am an advocate for integrating change management software into help desk software, and I would like to add a feed from a SIEM collection element to that operational picture. The issue is the number of these components already in place at an organization. With that in mind, would the company want to replace those tools with an all-in-one solution, or try to cobble the pieces together? Of course, given the nature of enterprise architectural choices, it is hard to find a single component that incorporates all of the choices made throughout the history of an organization.


Again, this is a caveat emptor situation. Do the research and find a solution that best addresses your issues, helps determine an appropriate course of action, and comes closest to an overall solution to the problem at hand.


The Actuator - June 7th

Posted by sqlrockstar Employee Jun 7, 2017

Data security and privacy links take center stage this week. I didn't intend for that to happen, it just did. I'm guessing we are going to see an uptick in incidents being reported, which is different than saying there is an uptick in incidents as a whole. I believe people are more cognizant of data security and privacy matters and as a result we are seeing increased reporting.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


Ransomware: Best Practices for Prevention and Response

A nice summary for you to convert into a checklist in an effort to minimize your risk from being a victim of ransomware.


Fireball Malware Infects 20% of Corporate Networks Worldwide
Interesting note here: Adware can spread just as malware would, but it isn’t considered illegal. And the result of not treating adware as a virus are things like Fireball.


The seven deadly sins of statistical misinterpretation, and how to avoid them
Because the future for data professionals is data analytics, and I want you to know about these simple mistakes that are all too common.


Building a Slack bot for channel topic detection using word embeddings
And I thought I was impressed when Outlook told me that I forgot an attachment to an email. This looks like a real value-add.


OneLogin: Breach Exposed Ability to Decrypt Data
This is why we can’t have nice things. It’s time to move away from the use of passwords.


International data privacy laws create inconsistent rules
It’s almost as if the lawyers are passing contradictory laws to make certain they have billable hours for the next ten years.


The next time you are frustrated with some piece of code, I want you to stop and think about how lucky you are that you never needed to look up the 10th character of a VIN many times a day:


SAP® recently held their annual Americas’ SAP Users’ Group (ASUG) Sapphire Now celebration in Orlando, which attracted more than 35,000 executives, subject matter experts, sales and public relations personnel, as well as a whole bunch of SAP customers. They all converged on the Orlando Convention Center for four days to celebrate, collaborate, network, and innovate. Yours truly was a speaker for the “Using the Right SAP Support Services and Tools at the Right Time” session.


The Monday afternoon event, “A Call to Lead,” kicked off the conference with special guests former First Lady Michelle Obama and former President George W. Bush leading a discussion about diversity and equality in the workplace. (George Bush is hilarious, and, like the former first lady, a wonderful and charismatic speaker.) Tuesday morning’s keynote was delivered by SAP CEO Bill McDermott, who was joined on stage by Dell® Technologies founder and CEO Michael Dell. Bert and John Jacobs, brothers and co-founders of the Life is Good® clothing line, spoke in the afternoon, ending their presentation by throwing frisbees into the crowd. Wednesday morning, Hasso Plattner, co-founder and chair of SAP’s supervisory committee, presented, followed by appearances by Derek Jeter and Kobe Bryant on Thursday morning. That night, the British band Muse wrapped up the conference with a special performance.


When Hasso spoke, nearly everyone in the vast conference center stopped to listen. Hasso shared his thoughts on the future of IT, technology, and business, where it is all going, and how the driving forces behind these progressions are being shaped. While SAP ERP software runs on well-recognized technology, the conference did not focus solely on technology. SAPPHIRE targeted opportunities provided by the latest technological trends that drive businesses forward. SAP, vendors, and customers heard from people in human resources, finance, operations, supply chain, IT, and more. The vendor space was immense and crackling with energy. A wide range of vendors was on hand, representing SAP HANA, cloud, integration services, managed services, and pretty much everything else you are pitched at every conference, including many that you THWACKsters will recognize: Microsoft®, VMware®, Dell, Cisco®, AWS®, Google® Cloud, and many more.


So why am I blogging about this? Like most THWACK® followers and contributors, I work in IT. I care about bits and bytes and blinking lights. Apparently, there are 22-25 fellow THWACKsters who have SAP running in their environment. And while the technology is critical to the IT professional, the experienced pros have learned that it is equally important to understand the company’s goals and how their work is aligned with them. Oddly enough, I sat in on several customer presentations, and SolarWinds® was featured on more than one slide deck (spelled incorrectly, as “Solarwinds,” every time! Grr!). SAP and SolarWinds share a common trait. They thrive on inspiring their customers to innovate and lead. SAP’s user group, ASUG, like THWACK, is a force to be reckoned with.


So will we see SolarWinds at Sapphire next year? Or maybe SAP’s more tech-y conference, TechEd? Here’s hoping!






Hey, guys! This week I’d like to share a very recent experience. I was troubleshooting, and the information I was receiving was great, but it was the context that saved the day! What I want to share is similar to the content in my previous post, Root Cause, When You're Neither the Root nor the Cause, but different enough that I thought I'd pass it along.


This tale of woe begins as they all do, with a relatively obscure description of the problem and little foundational evidence. In this particular case it was, “The internet wasn't working on the wireless, but once we rebooted, it worked fine.” How many of us have had to deal with that kind of problem before? Obviously, all answers lead to, “Just reboot and it’ll be fine.” While that’s all fine and dandy, it is not acceptable, especially at the enterprise level, because it offers no real solution. Therefore, the digging began.


The first step was to figure out if I could reproduce the problem.


I had heard that it happened with some arbitrary mobile device, so I set up shop with my MacBook, an iPad, my iPhone and my Surface tablet. Once I was all connected, I started streaming content, particularly the live YouTube stream of The Earth From Space. It had mild audio and continuous video streaming that could not buffer much or for long.


The strangest thing happened in this initial wave of troubleshooting. I was able to REPRODUCE THE PROBLEM! That frankly was pretty awesome. I mean, who could ask for more than the ability to reproduce a problem! Though the symptoms were some of the stranger parts, if you want to play along at home, maybe you can try to solve this as I go. Feel free to chime in with something like, “Ha ha! You didn’t know that?” It's okay. I’m all for a resolution.


The weirdest part of this problem was that for devices connecting on older wireless standards (802.11a, 802.11n), things were working like a champ, or seemingly working like a champ. They didn’t skip a beat and were working perfectly fine. I was able to reproduce it best with the MacBook connected at 802.11ac with the highest speeds available. But seemingly, when it would transfer from one AP’s channel to another AP on another channel, poof, I would lose internet access for five minutes. Later, it was proven to be EXACTLY five minutes (hint).


At the time though, like any problem in need of troubleshooting, there were other issues I needed to resolve because they could have been symptoms of this problem. Support even noted that these symptoms related to a particular problem, which was all fine and dandy once adjusted in the direction I preferred. Alas, those adjustments didn’t solve my overwhelming problem of, “Sometimes, I lose the internet for EXACTLY five minutes.” Strange, right?


So, I tuned up channel overlap, modified how frequently devices roam to a new access point and find their new neighbor, cleaned up how much interference there was in the area, and got it working like a dream. I could walk through zones transferring from AP to AP over and over again, and life seemed like it was going great. But then, poof, it happened again. The problem would resurface, with its signature registering an EXACT five-minute timeout.


This is one of those situations where others might say, “Hey, did you check the logs?” That's the strange part. This problem was not in the logs. This problem transcended mere logs.


It wasn’t until I was having a conversation one day and said, “It’s the weirdest thing. The connection with a full wireless signal, with minimal to no interference and nothing erroneous showing in the logs would just die, for exactly five minutes.” My friend chimed in, “I experienced something similar once at an industrial yard. The problem would surface when transferring from one closet-stack to another closet-stack, and the tables for MAC Refresh were set to five minutes. You could shorten the MAC Refresh timeout, or simply tunnel these particular connections back to the controller.”


That prompted an A-ha moment (not the band) and I realized, "OMG! That is exactly it." And it made sense. In the earlier phases of troubleshooting, I had noted that this was a condition of the problem occurring, but I had not put all of my stock in that because I had other things to resolve that seemed out of place. It’s not like I didn’t lean on first instincts, but it’s like when there’s a leak in a flooded basement. You see the flooding and tackle that because it’s a huge issue. THEN you start cleaning up the leak because the leak is easily a hidden signal within the noise.


In the end, not only did I take care of the major flooding damage, but I also took care of the leaks. It felt like a good day!
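For anyone who wants to see why a five-minute table timer produces an outage of exactly five minutes, here is a deliberately simplified sketch of an address table with aging. It is not how any particular switch is implemented; the class, port names, and the 300-second value are assumptions chosen to mirror the story. It also assumes the upstream switch never sees a frame from the client on its new path, which is what keeps the stale entry alive until the timer expires.

```python
from typing import Dict, Optional, Tuple

AGING_SECONDS = 300  # five minutes, matching the timeout in the story


class MacTable:
    """Toy model of a switch address table with a fixed aging timer."""

    def __init__(self, aging: int = AGING_SECONDS) -> None:
        self.aging = aging
        self._entries: Dict[str, Tuple[str, float]] = {}

    def learn(self, mac: str, port: str, now: float) -> None:
        # Record (or refresh) which port the address was last seen on.
        self._entries[mac] = (port, now)

    def lookup(self, mac: str, now: float) -> Optional[str]:
        # Return the known port, or None once the entry has aged out.
        entry = self._entries.get(mac)
        if entry is None:
            return None
        port, learned_at = entry
        if now - learned_at >= self.aging:
            del self._entries[mac]
            return None
        return port


if __name__ == "__main__":
    table = MacTable()
    table.learn("aa:bb:cc:dd:ee:ff", "closet-1", now=0)
    # The client roams to closet-2 shortly after, but this table is not updated.
    print(table.lookup("aa:bb:cc:dd:ee:ff", now=299))  # still the stale port
    print(table.lookup("aa:bb:cc:dd:ee:ff", now=300))  # aged out, no entry
```

While the stale entry survives, traffic for the client keeps heading to the old closet. Shortening the timer shrinks that window, which is the first of the two fixes my friend suggested.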


What makes this story particularly helpful is that not all answers are to be found within an organization and their tribal knowledge. Sometimes you need to run ideas past others, engineers within the same industry, and even people outside the industry. I can’t tell you the number of times I've talked through some arbitrary PBX problem with family members. Just talking about it out loud and explaining why I did certain things caused the resolution to suddenly jump to the surface.


What about you guys? Do you have any stories of woe, sacrifice, or success that made you reach deep within yourself to find an answer? Have you had the experience of answers bubbling to the surface while talking with others? Maybe you have other issues to share, or cat photos to share. That would be cool, too.

I look forward to reading your stories!

In this post, part of a miniseries on coding for non-coders, I thought it might be interesting to look at a real-world example of breaking a task down for automation. I won't be digging hard into the actual code but instead looking at how the task could be approached and turned into a sequence of events that will take a sad task and transform it into a happy one.


The Task - Deploying a New VLAN


Deploying a new VLAN is simple enough, but in my environment it means connecting to around 20 fabric switches to build the VLAN. I suppose one solution would be to use an Ethernet fabric that had its own unified control plane, but ripping out my Cisco FabricPath™ switches would take a while, so let's just put that aside for the moment.


When a new VLAN is deployed, it almost always also requires that a layer 3 (IP) gateway with HSRP be created on the routers and that the VLAN be trunked from the fabric edge to the routers. If I can automate this process for every VLAN I deploy, I can avoid logging in to 22 devices by hand, and hopefully complete the task significantly faster.


Putting this together, I now have a list of three main steps I need to accomplish:


  1. Create the VLAN on every FabricPath switch
  2. Trunk the VLAN from the edge switches to the router
  3. Create the L3 interface on the routers, and configure HSRP


Don't Reinvent the Wheel


Much in the same way that one uses modules when coding to avoid rewriting something that has been created already, I believe that the same logic applies to automation. For example, I run Cisco Data Center Network Manager (DCNM) to manage my Ethernet fabric. DCNM has the capability to deploy changes (it calls them Templates) to the fabric on demand. The implementation of this feature involves DCNM creating an SSH session to the device and configuring it just like a real user would. I could, of course, implement the same functionality for myself in my language of choice, but why would I? Cisco has spent time making the deployment process as bulletproof as possible; DCNM recognizes error messages and can deal with them. DCNM also has the logic built in to configure all the switches in parallel, and in the event of an error on one switch, to either roll back that switch alone or all switches in the change. I don't want to have to figure all that out for myself when DCNM already does it.


For the moment, therefore, I will use DCNM to deploy the VLAN configurations to my 20 switches. Ultimately it might be better if I had full control and no dependency on a third-party product, but in terms of achieving the goal rapidly, this works for me. To assist with trunking VLANs toward the routers, in my environment the edge switches facing the routers have a unique name structure, so I was also able to tweak the DCNM template so that if it detects that it is configuring one of those switches, it also adds the VLANs to the trunked list on the relevant router uplinks. Again, that's one less task I'll have to do in my code.


Similarly, to configure the routers (IOS XR-based), I could write a Python script based on the Paramiko SSH library, or use the Pexpect library to launch ssh and control the program's actions based on what it sees in the session. Alternatively, I could use NetMiko which already understands how to connect to an IOS XR router and interact with it. The latter choice seems like it's preferable, if for no other reason than to speed up development.


Creating the VLAN


DCNM has a REST API through which I can trigger a template deployment. All I need is a VLAN number and an optional description, and I can feed that information to DCNM and let it run. First, though, I need the list of devices on which to apply the configuration template. This information can be retrieved using another REST API call. I can then process the list, apply the VLAN/Description to each item and submit the configuration "job." After submitting the request, assuming success, DCNM will return the JobID that was created. That's handy because it will be necessary to keep checking the status of that JobID afterward to see if it succeeded. So here are the steps so far:


  • Get VLAN ID and VLAN Description from user
  • Retrieve list of devices to which the template should be applied
  • Request a configuration job
  • Request job status until it has some kind of resolution (Success, Failed, etc)


Sound good? Wait; the script needs to log in as well. In the DCNM REST API that means authenticating to a particular URL, receiving a token (a string of characters), then using that token as a cookie in all future requests within that session. Also, as a good citizen, the script should log out after completing its requests too, so the list now reads:

  • Get VLAN ID and VLAN Description from user
  • Authenticate to DCNM and extract session token
  • Retrieve list of devices to which the template should be applied
  • Request a configuration job
  • Request job status until it has some kind of resolution (Success, Failed, etc)
  • Log out of DCNM


That should work for the VLAN creation but I'm also missing a crucial step which is to sanitize and validate the inputs provided to the script. I need to ensure, for example, that:


  • VLAN ID is in the range 1-4094, but for legacy Cisco purposes perhaps, does not include 1002-1005
  • VLAN Description must be 63 characters or fewer, and the rules I want to apply will only allow [a-z], [A-Z], [0-9], dash [-] and underscore [_]; no spaces or odd characters


Maybe the final list looks like this then:


  • Get VLAN ID and VLAN Description from user
  • Confirm that VLAN ID and VLAN Description are valid
  • Authenticate to DCNM and extract session token
  • Retrieve list of devices to which the template should be applied
  • Request a configuration job
  • Request job status until it has some kind of resolution (Success, Failed, etc)
  • Log out of DCNM
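
The "request job status until it has some kind of resolution" step in the list above is a polling loop. A minimal sketch in Python, assuming the status check is wrapped in a function I pass in; the names `wait_for_job` and `get_status`, the exact status strings, and the interval and timeout values are all illustrative assumptions rather than anything DCNM defines:

```python
import time

# Assumed terminal states; the real status strings returned by DCNM may differ.
TERMINAL_STATES = {"Success", "Failed"}


def wait_for_job(get_status, job_id, interval=5.0, timeout=300.0, sleep=time.sleep):
    """Call get_status(job_id) repeatedly until the job reaches a terminal state.

    get_status is a hypothetical callable that asks the REST API for the
    current state of the job and returns it as a string.
    """
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout:
            # Give up rather than loop forever on a stuck job.
            raise TimeoutError(f"job {job_id} still '{status}' after {timeout} seconds")
        sleep(interval)
        waited += interval
```

Passing the status check in as a function keeps the loop independent of how the REST call is made, which also makes it easy to try out with a fake status source before pointing it at a real system.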


Configuring IOS XR


In this example, I'll use Python+NetMiko to do the hard work for me. My inputs are going to be:


  • IPv4 Subnet and prefix length
  • IPv6 Subnet and prefix length
  • L3 Interface Description


As before, I will sanity check the data provided to ensure that the IPs are valid. I have found that IOS XR's configuration for HSRP, while totally logical and elegantly hierarchical, is a bit of a mouthful to type out, so to speak. As such, it is great to have a script take basic information like a subnet and apply some standard rules to it: the 2nd IP is the HSRP gateway (e.g. .1 on a /24 subnet), the next address up (e.g. .2) is on the A router, and .3 is on the B router. For my HSRP group number, I use the VLAN ID. The subinterface number where I'll be configuring layer 3 will match the VLAN ID also, and with that information I can also configure the HSRP BFD peer between the routers. By applying some simple standardized templating of the configuration, I can take a bare minimum of information from the user and create configurations which would take much longer to create manually and quite often (based on my own experience) would have mistakes in them.
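
To make the addressing rules concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function name `plan_hsrp_addresses`, the dictionary keys, and the minimum subnet size are my own illustrative choices, and applying the same offsets to an IPv6 subnet is an assumption:

```python
import ipaddress


def plan_hsrp_addresses(subnet, vlan_id):
    """Derive the gateway and router addresses for a subnet from the standard rules."""
    # strict=True rejects input such as 10.20.30.5/24, where host bits are set.
    network = ipaddress.ip_network(subnet, strict=True)
    if network.num_addresses < 8:
        # Assumed guard: keep .1, .2 and .3 clear of the broadcast address.
        raise ValueError(f"{subnet} is too small for a gateway plus two routers")
    return {
        "gateway": network[1],       # 2nd IP in the subnet, the HSRP virtual address
        "router_a": network[2],      # next address up, on the A router
        "router_b": network[3],      # and the one after that, on the B router
        "hsrp_group": vlan_id,       # HSRP group number matches the VLAN ID
        "subinterface": vlan_id,     # subinterface number matches the VLAN ID
        "prefix_length": network.prefixlen,
    }


plan = plan_hsrp_addresses("10.20.30.0/24", 300)
print(plan["gateway"], plan["router_a"], plan["router_b"])  # 10.20.30.1 10.20.30.2 10.20.30.3
```

Because `ip_network` raises `ValueError` for anything that is not a valid subnet, the same call doubles as the sanity check on the user's input.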


The process then might look like this:


  • Get IPv4 subnet, IPv6 subnet, VLAN ID and L3 interface description from user
  • Confirm that IPv4 subnet, IPv6 subnet, VLAN ID and interface description are valid
  • Generate templated configuration for the A and B routers
  • Create session to A router and authenticate
  • Take a snapshot of the configuration
  • Apply changes (check for errors)
  • Assuming success, log out
  • Rinse and repeat for B router


Breaking Up is Easy


Note that the sequences of actions above have been created without requiring any coding. Implementation can come next, in the preferred language, but if we don't have an idea of where we're going, especially as a new coder, it's likely that the project will go wrong very quickly.


For implementation, I now have a list of tasks which I can attack, to some degree, separately from one another; each one is a kind of milestone. Looking at the DCNM process again:


  • Get VLAN ID and VLAN Description from user


Perhaps this data comes from a web page but for the purposes of my script, I will assume that these values are provided as arguments to the script. For reference, an argument is anything that comes after the name of the script when you type it on the command line; for example, if you type the script's name followed by John, the program would see one argument, with a value of John.
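
As a minimal sketch of reading those two values as arguments, Python's standard `argparse` module can do the work. The argument names `vlan_id` and `description` and the function name `parse_args` are illustrative assumptions:

```python
import argparse


def parse_args(argv=None):
    """Read the VLAN ID and an optional description from the command line."""
    parser = argparse.ArgumentParser(description="Collect the inputs for a new VLAN.")
    parser.add_argument("vlan_id", type=int, help="VLAN number, e.g. 300")
    parser.add_argument("description", nargs="?", default="",
                        help="optional VLAN description")
    # With argv=None, argparse reads whatever followed the script name.
    return parser.parse_args(argv)


# Passing a list here stands in for typing "300 web_tier" after the script name.
args = parse_args(["300", "web_tier"])
print(args.vlan_id, args.description)  # 300 web_tier
```

Declaring the VLAN ID with `type=int` means anything that is not a whole number is rejected before the rest of the script runs.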


  • Confirm that VLAN ID and VLAN Description are valid


This sounds like a perfect opportunity to write a function/subroutine which can take a VLAN ID as its own argument, and will return a boolean (true/false) value indicating whether or not the VLAN ID is valid. Similarly, a function could be written for the description, either to enforce the allowed characters by removing anything that doesn't match, or by simply validating whether what's provided meets the criteria or not. These may be useful in other scripts later too, so writing a simple function now may save time later on.
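
A minimal sketch of such functions in Python, using the ranges and character rules listed earlier. The function names are my own, and treating an empty description as valid (because the description is optional) is an assumption:

```python
import re

RESERVED_VLANS = range(1002, 1006)  # 1002-1005, excluded for legacy reasons
DESCRIPTION_PATTERN = re.compile(r"[A-Za-z0-9_-]{0,63}")


def is_valid_vlan_id(vlan_id):
    """Return True if vlan_id is an integer in 1-4094 and not in 1002-1005."""
    return (isinstance(vlan_id, int)
            and 1 <= vlan_id <= 4094
            and vlan_id not in RESERVED_VLANS)


def is_valid_description(description):
    """Return True if the description has at most 63 allowed characters."""
    return DESCRIPTION_PATTERN.fullmatch(description) is not None


def clean_description(description):
    """Alternative approach: strip disallowed characters, then trim to 63."""
    return re.sub(r"[^A-Za-z0-9_-]", "", description)[:63]


print(is_valid_vlan_id(300), is_valid_vlan_id(1003))  # True False
print(clean_description("web tier #1"))               # webtier1
```

Whether to reject a bad description or quietly clean it up is a design choice; rejecting is more predictable for the user, while cleaning is more forgiving.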


  • Authenticate to DCNM and extract session token
  • Retrieve list of devices to which the template should be applied
  • Request a configuration job
  • Request job status until it has some kind of resolution (Success, Failed, etc)
  • Log out of DCNM


These five actions are all really the same kind of thing. For each one, some data will be sent to a REST API, and something will be returned to the script by the REST API. The process of submitting to the REST API only requires a few pieces of information:


  • What kind of HTTP request is it? GET / POST / etc?
  • What is the URL?
  • What data needs to be sent, if any, to the URL?
  • How to process the data returned. (What format is it in?)


It should be possible to write some functions to handle GET and POST requests so that it's not necessary to repeat the HTTP request code every time it's needed. The idea is not to repeat code multiple times if it can be more simply put in a single function and called from many places. This also means that fixing a bug in that code only requires it to be fixed in one place.
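
As a rough sketch of that idea using only Python's standard library, the request-building code can live in one place with thin GET and POST wrappers on top. The function names, the JSON request and response format, and the cookie name `token` are all assumptions for illustration, not the documented DCNM API:

```python
import json
import urllib.request


def build_request(method, url, payload=None, token=None):
    """Assemble an HTTP request; shared by every call to the REST API."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        # Assumption: the API accepts a JSON body.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        # Assumption: the session token travels as a cookie named "token".
        headers["Cookie"] = f"token={token}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call_api(method, url, payload=None, token=None, timeout=30):
    """Send the request and decode a JSON reply (None if the body is empty)."""
    request = build_request(method, url, payload, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else None


def api_get(url, token=None):
    return call_api("GET", url, token=token)


def api_post(url, payload, token=None):
    return call_api("POST", url, payload=payload, token=token)
```

Each of the five DCNM steps then becomes a one-line call to `api_get` or `api_post` with a different URL, and a bug in the request handling only needs to be fixed in `call_api`.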


For the IOS XR configuration, each step can be processed in a similar fashion, creating what are hopefully more manageable chunks of code to create and test.


Achieving Coding Goals


I really do believe that sometimes coders want to jump right into the coding itself before taking the time to think through how the code might actually work, and what the needs will be. In the example above, I've run through taking a single large task (Create a VLAN on 20 devices and configure two attached routers with an L3 interface and HSRP) which might seem rather daunting at first, and breaking it down into smaller functional pieces so that a) it's clearer how the code will work, and in what order; and b) each small piece of code is now a more achievable task. I'd be interested to know if you as a reader feel that the task lists, while daunting in terms of length, perhaps, seemed more accomplishable from a coding perspective than just the project headline. To me, at least, they absolutely are.


I said I wouldn't dig into the actual code, and I'll keep that promise. Before I end, though, here's a thought to consider: when is it right to code a solution, and when is it not? I'll be taking a look at that in the next, and final, article in this miniseries.

By Joe Kim, SolarWinds Chief Technology Officer


Because of the Internet of Things (IoT) we're seeing an explosion of devices, from smartphones and tablets to connected planes and Humvee® vehicles. So many, in fact, that IT administrators are left wondering how to manage the deluge, particularly when it comes to ensuring that their networks and data remain secure.


The challenge is significantly more formidable than the one posed by bring-your-own-device issues when administrators only had to worry about a few mobile operating systems. This pales in comparison to the potentially thousands of IoT-related operating systems that are part of an increasingly complex ecosystem that includes devices, cloud providers, data, and more.


How does one manage such a monumental task? Here are five recommendations that should help.


1. Turn to automation


Getting a grasp on the IoT and its impact on defense networks is not a job that can be done manually, which is what makes automation so important. The goal is to create self-healing networks that can automatically and immediately remediate themselves if a problem arises. A self-healing, automated network can detect threats, keep data from being compromised, and reduce response time and downtime.


2. Get a handle on information and events


DoD administrators should complement their automation solutions with security information and event management processes. These are monitoring solutions designed to alert administrators to suspicious activity and to security and operational events that may compromise the networks. Administrators can use these tools to monitor real-time data and gain insight into forensic data that can be critical to identifying the cause of network issues.


3. Monitor devices and access points


Device monitoring is also extremely important. Network administrators will want to make sure that the only devices that are hitting their networks are those deemed secure. Administrators will want to be able to track and monitor all connected devices by MAC and IP address, as well as access points. They should set up user and device watch lists to help them detect rogue users and devices in order to maintain control over who and what is using their networks.


4. Get everyone on board


Everyone in the agency must commit to complying with privacy policies and security regulations. All devices must be in compliance with high-grade security standards, particularly personal devices that are used outside of the agency. The bottom line is that it’s everyone’s responsibility to ensure that DoD information stays within its network.


5. Buckle up


Understand that while IoT is getting a lot of hype, we’re only at the beginning of that cycle. Analyst firm Gartner® once predicted that there would be 13 billion connected devices by 2020, but some are beginning to wonder if that’s actually a conservative number. Certainly, the military will continue to do its part to drive IoT adoption and push that number even higher.


In other words, when it comes to connected devices, this is only the beginning of the long road ahead. DoD administrators must prepare today for whatever tomorrow might bring.


Find the full article on Defense Systems.


Firewall Logs - Part Two

Posted by Dez Employee Jun 1, 2017

In Part One of this series, I dove into the issue of security and compliance. In case you don't remember, I'm reviewing this wonderful webcast series to stress the importance of the information presented in each. This week, I'm focusing on the firewall logs webcast.


I chose the Firewall Logs webcast for this week because it is a known and very useful way to prevent attacks. Now, my takeaway from this session is that SIEMs are fantastic ways to normalize your logs from a firewall and also your infrastructure. You guys don't need me to preach on that, I know. However, I feel like when you use health performance and network configuration management tools, you really have a better solution all the way around.


Everyone (I think) knows that I'm not one to tell you to buy or purchase just SolarWinds products! So please do NOT take this that way. I will preach about having some type of SIEM, network performance monitor (NPM), patch manager (PaM), and a solid network configuration change management (NCM) within your environment. Let me give you some information to go along with this webcast on how I would personally tie these together. 


  1. Knowing the health of your infrastructure allows you to see anomalies. When this session was discussing the mean time to detection I couldn't help but think about a performance monitor. You have to know what normal is and have a clear baseline before an attack.
  2. Think about the ACLs along with your VLANs and allowed traffic on your network devices. NCM allows you to use a real-time change notification to help you track if any outside changes are being made and shows you what was changed.  Also, using this with the approval system allows you to verify outside access and stop it in its tracks as they are not approved network config changes. This is a huge win for security.  When you also add in the compliance reports and scheduled email send-outs you are able to verify your ACLs and access based on patterns you customize to your company's needs. This is vital for documentation and also if you have any type of a change request ticketing to validate.
  3. We all know we need to be more compliant and patch our stuff! Not only to be aware of vulnerabilities but also to protect our vested interests in our environment.


Okay, so the stage is laid out and I hope you see why you need more than just a great SIEM like LEM to back, plan, and implement any type of security policies you may need. This webcast brings up great points to think about on how to secure and think about those firewalls. IMHO, if you have LEM, Jamie's demo should help you guys strengthen your installation.  Also, the way he presents this helps you to strengthen or validate any SIEM you may have in place currently.


I hope you guys are enjoying this series as much as I am. I think we should all at least listen to security ideas to help us strengthen our knowledge and skill sets. Trust me, I'm no expert or I would abolish these attacks, lol! What I am is a passionate security IT person who wants to engage different IT silos to have a simple conversation about security.


Thanks for your valuable time! Let me know what you think by posting a comment below, and remember to follow me @Dez_Sayz!


Data is a commodity

Posted by sqlrockstar Employee Jun 1, 2017



Data is a commodity.


Don’t believe me? Let’s see how the Oxford dictionary defines “commodity.”


“A thing that is useful or has a useful quality.”


No good researcher would stop at just one source. Just for fun, let’s check out this definition from Merriam-Webster:


“Something useful or valued.”


Or, this one from


“An article of trade or commerce, especially a product as distinguished from a service.”


There’s a lot of data on the definition of the word “commodity.” And that’s the point, really. Data itself is a commodity, something to be bought and sold.


And data, like commodities, comes in various forms.


For example, data can be structured or unstructured. Structured data is data that we associate with being stored in a database, either relational or non-relational. Unstructured data is data that has no pre-defined data model, or is not organized in any pre-defined way. Examples of unstructured data include things like images, audio files, instant messages, and even this word document I am writing now.


Data can be relational or non-relational. Relational data is structured in such a way that data entities have relationships, often in the form of primary and foreign keys. This is the nature of traditional relational database management systems such as Microsoft SQL Server. Non-relational data is more akin to distinct entities that have no relationships to any other entity. The key-value pairs found in many NoSQL database platforms are examples of non-relational data.


And while data can come in a variety of forms, not all data is equal. If there is one thing I want you to remember from this article it is this: data lasts longer than code. Treat it right.


To do that, we now have Azure CosmosDB.


Introduced at Microsoft Build™, CosmosDB is an attempt to make data the primary focus for everything you do, no matter where you are. (Microsoft has even tagged CosmosDB as “planet-scale,” which makes me think they need to go back and think about what “cosmos” means to most people. But I digress.)


I want you to understand the effort Microsoft is making in the NewSQL space here. CosmosDB is a database platform as a service that can store any data that you want: key-value pair, graph, document, relational, non-relational, structured, unstructured…you get the idea.


CosmosDB is a platform as a service, meaning the admin tasks that most DBAs would be doing (backups, tuning, etc.) are done for you. Microsoft will guarantee performance, transactional consistency, high availability, and recovery.


In short, CosmosDB makes storing your data easier than ever before. Data is a commodity and Microsoft wants as big a market share as possible.


I can’t predict the future and tell you CosmosDB is going to be the killer app for cloud database platforms. But I can understand why it was built.


It was built for the data. It was built for all the data.


The Actuator - May 31st

Posted by sqlrockstar Employee May 31, 2017

Home from Techorama in Belgium and back in the saddle for a short week before I head to Austin on Monday morning. I do enjoy visiting Europe, and Belgium in particular. Life just seems to move at a slower pace there.


As always, here are some links from the Intertubz that I hope will hold your interest. Enjoy!


The big asks of British Airways

Last year I wrote about a similar outage with Delta, so here's some equal time for a similar failure with BA. Who knew that managing IT infrastructure could be so hard?


Facebook Building Own Fiber Network to Link Data Centers

I'm kinda shocked they don't already have this in place. But more shocking is the chart that shows internal traffic growth, mostly a result of Facebook having to replicate more pictures and videos of cats.


Who Are the Shadow Brokers?

Interesting thought exercise from Bruce Schneier about this group and what might be coming next.


Web Developer Security Checklist

Every systems admin needs a similar checklist to this one.


All the things we have to do that we don't really need to do: The social cost of junk science

A nice and quick reminder about the hidden costs of junk science. Or, the hidden costs of good science.


The Calculus of Service Availability

So the next time someone tells you they need 99.9% uptime for a system, you can explain to them what that really means.


How Your Data is Stored, or, The Laws of the Imaginary Greeks

This is a bit long, set aside some time. But you'll learn all about the problems (and solutions) for distributed computing.


One thing I love about Belgium is how they make shopping for the essentials easy:


By Joe Kim, SolarWinds Chief Technology Officer


Federal IT professionals must consider the sheer volume and variety of devices connected to their networks, from fitness wearables to laptops, tablets, and smartphones. The Internet of Things (IoT) and the cloud also significantly impact bandwidth and present security concerns, spurred by incidents such as the Office of Personnel Management breach of 2014.


Despite this chaotic and ever-changing IT environment, for the Defense Department, network and data center consolidation is well underway, layering additional concerns on top of an already complex backdrop. Since 2011, the DoD has closed more than 500 data centers. That’s well below the goal the agency initially set, so it issued a directive last year to step up the pace; subsequently, the Data Center Optimization Initiative was introduced to further speed efforts.


To be successful, federal IT professionals need a system that accounts for all of the data that soon will stream through their networks. They also need to get a handle on all the devices employees use and will use to access and share that data, all while ensuring network security.


Meeting the Challenges of Tomorrow Today


Network monitoring has become absolutely essential, but some solutions are simply not capable of dealing with the reality of today’s networks.


Increasingly, federal IT managers house some applications on-premises while others use hosted solutions, creating a hybrid IT environment that can be difficult to manage. Administrators will continue to go this route as they attempt to fulfill the DoD's ultimate goal: greater efficiency. Hybrid IT creates monitoring challenges, as it makes it difficult for administrators to “see” everything that is going on with the applications.


Going Beyond the Basics


This complexity will require network administrators to go beyond initial monitoring strategies and begin implementing processes that provide visibility into the entire network infrastructure, whether it’s on-premises or hosted. Hop-by-hop analysis lets administrators effectively map critical pathways and gain invaluable insight into the devices and applications using the network. It provides a complete view of all network activity, which will become increasingly important as consolidation accelerates.


At the very least, every IT organization should employ monitoring best practices to proactively plan for consolidation and ensuing growth, including:


  1. Adding dedicated monitoring experts who can provide holistic views of agencies’ current infrastructure and calculate future needs.
  2. Helping to ensure that teams understand the nuances of monitoring hardware, networks, applications, virtualization, and configurations and that they have access to a comprehensive suite of monitoring tools.
  3. Equipping teams with tools that address scalability needs. This will be exceptionally important as consolidation begins to truly take flight and data needs rapidly expand.


Looking Reality in the Eye


DoD network consolidation is a slow, yet major undertaking, and a necessity to help ensure efficiency. It comes with extreme challenges, particularly a much greater degree of network complexity. Effectively wrangling this complexity requires network administrators to go beyond simple monitoring and embrace a more comprehensive monitoring strategy that will better prepare them for their future.


Find the full article on Signal.


Two weeks ago, I had the privilege of attending and speaking at ByNet Expo in Tel Aviv, Israel. As I mentioned in my preview article, I had hoped to use this event to talk about cloud, hybrid IT, and SolarWinds' approach to these trends, to meet with customers in the region, and to enjoy the food, culture, and weather.


I'm happy to report that the trip was a resounding success on all three fronts.


First, a bit of background:


Founded in 1975, ByNet is the region's largest systems integrator, offering professional services and solutions for networking, software, cloud, and more.


I was invited by SolarWinds' leading partner in Israel, ProLogic, who, honestly, are a great bunch of folks who not only know their stuff when it comes to SolarWinds, but also are amazing hosts and fantastic people to just hang out with.


Now you might be wondering what kind of show ByNet (sometimes pronounced "bee-naht" by the locals) Expo is. Is it a local user-group style gathering? A tech meet-up? A local business owners luncheon?


To answer that, let me first run some of the numbers:

  • Overall attendees: 4,500
  • Visitors to the SolarWinds/ProLogic booth: ~1,000
  • Visitors to my talk: ~150 (which was SRO for the space I was in)


The booth was staffed by Gilad, Lior, and Yosef, who make up part of the ProLogic team. On the SolarWinds side, I was joined by Adrian Burke out of our Cork office. That was enough to attract some very interesting visitors, including the Israeli Ministry of Foreign Affairs, Orbotec, Soreq, the Israeli Prime Minister's Office, Hebrew University, McAfee, and three different branches of the IDF.


We also got to chat with some of our existing customers in the region, like Motorola, 3M, the Bank of Israel, and Bank Hapoalim.


Sadly missing from our visitor list, despite my repeated invitations on Twitter, was Gal Gadot.


But words will only take you so far. Here are some pictures to help give you a sense of how this show measures up:













But those are just some raw facts and figures, along with a few flashy photos. What was the show really like? What did I learn and see and do?


First, I continue to be struck by the way language and culture inform and enrich my interactions with customers and those curious about technology. Whether I'm in the booth at a non-U.S. show such as CiscoLive Europe or ByNet Expo, or when I'm meeting with IT pros from other parts of the globe, the use of language, the expectations of where one should pause when describing a concept or asking for clarification, the graciousness with which we excuse a particular word use or phrasing - these are all the hallmarks of both an amazing and ultimately informative exchange. And also of individuals who value the power of language.


And every time I have the privilege to experience this, I am simply blown away by its power. I wonder how much we lose, here in the States, by our generally monolingual mindset.


Second, whatever language they speak, SolarWinds users are the same across the globe. Which is to say they are inquisitive, informed, and inspiring in the way they push the boundaries of the solution. So many conversations I had were peppered with questions like, "Why can't you...?" and "When will you be able to...?"


I love the fact that our community pushes us to do more, be better, and reach higher.


With that said, I landed on Friday morning after a 14-hour flight, dropped my bags at the hotel and - what else - set off to do a quick bit of pre-Shabbat shopping. After that, with just an hour or two before I - and most of the country - went offline, I unpacked and got settled in.


Twenty-four hours later, after a Shabbat spent walking a sizable chunk of the city, I headed out for a late night snack. Shawarma, of course.


Sunday morning I was joined by my co-worker from Cork, Adrian Burke. ProLogic's Gilad Baron spent the day showing us Jerusalem's Old City, introducing us to some of the best food the city has to offer, and generally keeping us out of trouble.


And just like that, the weekend was over and it was time to get to work. On Monday we visited a few key customers to hear their tales of #MonitoringGlory and answer questions. Tuesday was the ByNet Expo show, where the crowd and the venue rivaled anything Adrian and I have seen in our travels.


On my last day, Wednesday, I got to sit down in the ProLogic offices with a dozen implementation specialists to talk some SolarWinds nitty-gritty: topics like the product roadmaps, use cases, and trends they are seeing out in the field.


After a bit of last-minute shopping and eating that night, I packed and readied myself to return home Thursday morning.


Random Musings

  • On Friday afternoon, about an hour before sundown, there is a siren that sounds across the country, telling everyone that Shabbat is approaching. Of course nobody is OBLIGATED to stop working, but it is striking to me how powerful a country-wide signal to rest can be. This is a cultural value that we do not see in America.
  • It is difficult to take a 67-year-old Israeli taxi driver seriously when he screams into his radio at people who obviously do not understand him. Though challenging, I managed to hide my giggles.
  • Traveling east is hard. Going west, on the other hand, is easy.
  • You never "catch up" on sleep.
  • Learning another language makes you much more sensitive to the importance of pauses in helping other people understand you.
  • Everything in Jerusalem is uphill. Both ways.
  • On a related note: there are very few fat people in Jerusalem.
  • Except for tourists.
  • Orthodox men clearly have their sweat glands removed. Either that or they install personal air conditioners inside their coats. That's right. I said coats. In May. When it's 95 degrees in the sun.






