Monitoring Central Blogs

Hello fellow data geeks! My name is Joshua Biggley and I am an Enterprise Monitoring Engineer for a Fortune 15 company. I’m also fortunate enough to be a remote worker and part of an amazing team. One of my favourite career achievements was being named Canada’s only SolarWinds THWACK Community MVP in 2014.

I joined the THWACK Community in 2008, shortly after moving to beautiful Prince Edward Island on the East Coast of Canada. I’ve attended THWACKcamp for at least one session since its inception 7 years ago, but I have been a regular attendee for the past 4 years. Humble brag moment -- I had the opportunity to join Leon Adato (@adatole) and Kate Asaff (@kasaff) at THWACKcamp 2016 in presenting the session Troubleshooting with SolarWinds - The Case of the Elusive Root Cause. Leon has been a friend and (short-lived) colleague since 2014, and Kate has quite literally saved my bacon in one of my biggest challenges as a Monitoring Engineer. Sharing the THWACKcamp stage with these two superheroes was beyond awesome! Last year, I was humbled when my team and I won the Carmen Sandiego Award at THWACKcamp 2017. Our team is made up entirely of remote engineers, and having our work recognized for both the high-performance technical work and the inter-team collaboration we embrace was a highlight of my year. Will 2018 be able to top it?

I think these two sessions will give 2017 a run for its money, even if I don’t win another THWACK award!

Day 2

Oct 18 @ 10AM CT

What Does It Take to Become a Practice Leader?

Too many organizations view monitoring, alerting, and event management as a necessary evil. It is often relegated to the “All other duties as assigned by your supervisor” category. As organizations mature, finding monitoring engineers becomes a challenge. It’s not just about finding someone who knows how to use the SolarWinds products you own (you are using SolarWinds products, aren’t you?), but about finding someone who can explain why monitoring, alerting, and event management are so important. They need to explain to their peers, their management, and the business why monitoring needs to be a practice, not an afterthought. They need to be a data geek. They need to be a storyteller.

Patrick Hubbard, Phoummala Schmitt, and Theresa Miller bring decades of experience and, more importantly, are recognized leaders in the industry. Discovering how they went from junior analysts to practice leaders will help me understand, and explain to others, how to make that journey. As a practice leader in my full-time job as well as in freelance work, being able to help others understand that they can be leaders is crucial to the health of monitoring as a practice. My colleagues and I have worked very hard to elevate monitoring to the respect it deserves. In 2019, we will be starting an internal Community of Excellence that focuses on monitoring, alerting, and event management, plus my very favourite new focus -- observability!

Day 1

Oct 17 @ 12PM CT

Observability: Just A Fancy Word for Monitoring? A Journey From What to Why

Observability and high-cardinality data are sultry words to any data geek. Observability was introduced in 1960 in a paper written by Rudolf E. Kálmán entitled “On the General Theory of Control Systems”. If the status of a system can be known simply by examining the outputs of that system, the system is considered observable. In recent years, the idea of observability has been embraced by systems engineers as applications have moved from bare-metal to virtualized to containerized to serverless. Instead of monitoring the things that allow your system to do what it does, we’re now measuring how the system does what it does without much concern for why.
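
For the control-theory geeks, Kálmán's definition has a precise textbook form (my aside here, not something from the session itself): a linear system $\dot{x} = Ax,\; y = Cx$ is observable if and only if the observability matrix

$$\mathcal{O} = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix}$$

has full rank $n$ -- in other words, the internal state $x$ can be reconstructed from the outputs $y$ alone, which is exactly the "know the system from its outputs" idea above.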

Of all of the sessions at THWACKcamp 2018, this is the one I would want every engineer, every application developer, every CTO -- OK, pretty much everyone who is involved in building, supporting, and managing any critical application anywhere -- to watch. Application Performance Management is coming to every organization. If you deliver any services through an application, APM provides the insights, and observability is the methodology for measuring them.

Do I sound a little passionate about observability? What?!? Only a little?!? Observability is my new passion. I recently wrote a white paper defining an APM strategy, and its foundation was observability. This idea of observability is probably the most important shift in our industry in 20 years. Unnecessary hyperbole? Maybe, but I think there are seminal moments in every industry and this focus on observability is going to be one of them. I’m Canadian, would I steer you wrong?


Dashboards are important. Your NOC is an essential avenue for collecting and relaying information about your network, and combined with a finely crafted set of alerts, there’s nothing that can get past you. Not only are dashboards effective, but they just look so stinkin’ awesome when done properly. In this post I’m going to focus on my ‘Dashboard Philosophy,’ which is all about efficiency, information, and design. A dashboard should display the most data possible in the space that you have, it should include pertinent information that summarizes your environment, and it should look good doing it. Let’s talk about what the SolarWinds® Orion® Platform brings to the table to help make our dashboards the best they can be.

1. NOC Views

Using the NOC view feature is a must. These space-saving views allow you to combine multiple sub-views that can be set on a rotation. Creating one is easy: simply add a new summary view, edit it, then enable left navigation and the NOC view feature. Here you can enter an interval for how often the NOC view rotates between individual sub-views. If you aren’t using NOC views, you’re wasting valuable space on your dashboards! Enter NOC mode, full-screen your browser window, and bask in the glory of a massive canvas to display all your fancy metrics and charts. Bob Ross would be proud.

2. Network Atlas

Admit it, you both love and hate Network Atlas. It’s an incredibly useful tool that requires a bit of extra patience, but the results can be amazing once you get the hang of it. As Henry David Thoreau probably once said, “SolarWinds Network Atlas is but a canvas for your imagination…” or something like that. Check out this amazing example from THWACK® user spzander:

[Screenshot: spzander's Network Atlas map]

Hungry for more? Here is some of my favorite THWACK content for tuning your Network Atlas skills and getting the creative juices flowing:

10 Hidden Gems in Orion Network Atlas

Using Custom Properties to send messages to your NOC using Network Atlas

The “Show us your Network Atlas Maps” thread

3. PerfStack

With the release of NPM 12.1 came a game-changing new feature… PerfStack. This new charting tool allows you to quickly and easily create attractive charts that contain the data you need while optimizing page space. PerfStack is what makes you, the monitoring professional, shine when an application owner is looking for a way to view monitoring data for their systems. Check out the original release notes for PerfStack here. Since its first iteration, the SolarWinds team has been putting a lot of work into this tool. With PerfStack 2.0, they have added support for many major Orion modules including VMAN, SAM, VNQM, NCM, and DPA, along with a pile of new features such as fast polling, syslog/trap support, quick links, and full screen mode (which makes a great dashboard). As of this post, the next iteration of PerfStack is available in the latest NPM 12.3 Release Candidate and includes… drumroll please… A PERFSTACK WIDGET FOR YOUR DASHBOARDS!

[Screenshot: a node details view with the new PerfStack widget]

Here we have a node detail view… WITH PERFSTACK! You can do the same thing with any view type in Orion, including Summary Views (which means dashboards). For dashboard nerds such as myself, this is truly a good day. Sign up for the NPM RC program for more details and awesome sneak peeks at what SolarWinds is doing to improve tools like PerfStack.

4. AppStack

This is really one of the most efficient ways to display a massive amount of information in a small space. AppStack is a one-size-fits-all tool that will satisfy your devs, their managers, and your director. An efficient dashboard should have MAXIMUM information in MINIMUM space, and AppStack is the answer. Whether you only have SAM or you’re running multiple products on the Orion Platform, the AppStack widget gives you a flexible, filterable, and fun-tastic (I couldn’t think of another word that started with ‘f’) resource to add to your dashboards and NOC views. There’s not much more to say. It’s the perfect widget for my Dashboard Philosophy.

[Screenshot: the AppStack widget]

5. SWQL and Other Advanced Methods

Are you a dev nerd? Do you like to yell at code until it bends to your will? Are you ready to bring your SolarWinds deployment to an unreasonably awesome level? With a little bit of fidgeting and some help from THWACK, you can create your own charts, tables, dashboards, maps, and much more. Check out this post from THWACK MVP CourtesyIT, which has a master list of all the amazing ideas and customizations that have been posted in the community. Be sure to check out the section from THWACK MVP wluther:  he’s got some great content specifically tailored to dashboards. One thing to always keep in mind when using more advanced methods… SolarWinds support may not be able to assist you with the bending of spacetime. Fidget at your own risk!

In my opinion, one of the most powerful tools for creating custom resources is SWQL, the SolarWinds Query Language. With it, your data bends to your will. THWACK MVP mesverrum makes it easy in this post, where he provides an awesome example of how to create your own custom SWQL tables.
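
If you want a feel for what SWQL looks like before diving into those posts, here is a minimal sketch using the SwisPowerShell module (the hostname and credentials below are placeholders -- swap in your own Orion server details):

# Assumes the SwisPowerShell module is installed: Install-Module SwisPowerShell
Import-Module SwisPowerShell

# Connect to your Orion server (replace the hostname and credentials with your own)
$swis = Connect-Swis -Hostname "orion.example.com" -UserName "admin" -Password "pass"

# Ask the SolarWinds Information Service for every node that isn't Up
# (in Orion.Nodes, Status 1 = Up, 2 = Down, 3 = Warning)
$notUp = Get-SwisData $swis @"
SELECT Caption, IP_Address, StatusDescription
FROM Orion.Nodes
WHERE Status <> 1
ORDER BY Caption
"@

$notUp | Format-Table

The same sort of query, dropped into a custom SWQL table resource, becomes exactly the kind of outage table mesverrum describes.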

Results

Let’s put all this together and create a shiny new dashboard that follows the idea of efficiency, information, and design. We need something that doesn’t waste space, contains useful data, and looks awesome. Something like this:

[Screenshot: a finished NOC dashboard combining the tools above]

First things first… we’re using the NOC view, indicated by the black bar at the top with the circles in the upper-right corner that represent the various sub-views in rotation. We have a map from Network Atlas (upper left), a PerfStack project added as a widget (lower left), AppStack (lower right), and a custom SWQL table that displays outage information (check out mesverrum's post about it here).

And there we have it! Five useful tools that you can use to make your dashboards amazing. Be sure to post your creations in the community. Here are some threads for NOC views and Network Atlas maps. Now go forth and dashboard!


Keeping a network up and running is a full-time job, sometimes a full-time job for several team members! But it doesn’t have to feel like a fire drill every day. Managing a network shouldn’t be entirely reactive. There are steps you can take and processes you can put in place to help reduce some of the top causes of network outages and minimize any downtime.

1. The Problem: Human Element

The dreaded “fat finger.” You’ve heard the stories. You may have done it yourself, or been the one working frantically late into the night or over a weekend to try to recover from someone else’s mistake. If you’re really unlucky (like some poor employee at Amazon® last spring), the repercussions can be massive. No one needs that kind of stress.

The Protection:

First, make sure only the appropriate people have access to make changes. Have an approval system built in. And, since even the best of us can make mistakes, ensure you have a system that allows you to roll back changes just in case.

2. The Problem: Security Breaches

Network security is becoming more critical every day. Attackers keep getting better, and users’ privacy needs keep growing. There are many critical elements to keeping your network secure, and it’s important not to miss any. It doesn’t do any good to deadbolt your door when your window is wide open.

The Protection:

Protect your devices from unauthorized changes. Monitor configurations so you can be alerted to any changes, see exactly what was changed, and know what login ID was used to make the change. You should also be regularly auditing your device configurations for vulnerabilities. Whether you have custom policies defined for your organization or need to comply with HIPAA, DISA STIG, SOX, or other industry standards, continuously monitoring your devices is one way to help ensure your network stays compliant.

3. The Problem: Lack of Routine Maintenance

Over time, networks can become messy and disorganized if there aren’t standards in place, increasing both the risk of errors and the time needed to resolve them.

The Protection:

Network standardization simplifies and focuses your infrastructure, allowing you to become more disciplined with routines and expectations. Naming conventions, standard MOTD banners, and interface names are just a few things you can do to help troubleshoot and keep a balance within your team and devices, allowing for better management and less human error.

4. The Problem: Hardware Failures

It’s not if hardware will fail, but when. Are you ready to make a speedy recovery? When a device unexpectedly goes down, it can have a big impact, depending on which device it is and what redundancies you have in place.

The Protection:

Ensure that you can quickly recover a failed device or bring a replacement online by having device configurations automatically backed up.

5. The Problem: Firmware Issues / Faults in the Devices

When you support hundreds of devices, required firmware updates can be tedious, and executing commands over and over increases the risk of error.

The Protection:

With network automation, you can easily manage rapid change across complex networks. Bulk deploy configurations to ensure accuracy and speed up deployment times.

Increase your uptime and reduce the challenges of keeping your network running smoothly so you can focus on other projects. With SolarWinds® Network Configuration Manager, you can bulk deploy configuration changes or firmware updates, manage approvals, revert to previous configurations, audit for compliance, and run remediation scripts. Take action today to reduce these five causes of network outages.


We just can't have anything nice, now can we?  Oh, well. We knew there would be new vulnerabilities and ransomware attacks in 2018. However, this time hardware is the culprit, and patching is not going to be a cure-all for the situation. Consider yourself warned: expect more slowdowns in 2018.

Stop and think about this for a second: as the days progress, we are still learning how much this new vulnerability impacts us. Anyone who says they have the full solution is not being honest with you or themselves. What I would like to do is help you see how you can use the tools you likely already have to become more aware of past, present, and future vulnerabilities and threats. That said, let's move on to the importance of using SolarWinds tools to do just that.

SolarWinds® Patch Manager will allow you to apply Microsoft® patches to your Windows® machines. If you are currently using this product, you should already be scheduling and looking for these. I discovered that there can be some issues with third-party Windows antivirus software that might lead to the BSOD. Read more here; the awesome chart helps clarify these issues and how to prevent them from happening to you.

Further, Patch Manager will allow you to schedule updates for, and report on, your Windows devices. The reporting is key to showcase your compliance and, in this case, start your baseline. Plus, just because you update your devices does not mean you are 100% in the clear. Updating your third-party packages is an added bonus with Patch Manager, a fact that is often overlooked though desperately needed.

SolarWinds® Server & Application Monitor (SAM) will help you demonstrate to the business, yourself, and your vendor support any degradation that patching may have on your applications. This is something you will want to have in place as soon as possible. It allows you to see any anomalies that may present themselves in your applications after the patching is applied. And because SAM is multi-vendor, you’ll be able to address even broad-scale hardware issues. The avid SAM users among you will likely know even more tricks for using the software, and I encourage you to share your knowledge in the comments to help us all be more aware in terms of application-centric monitoring.

SolarWinds® Network Configuration Manager (NCM) helps when there are firmware upgrades/updates that need to be applied to impacted network devices, and it helps you roll these out. There is a compliance reporting function built into NCM that will assist with audits automatically. Remember, this incident is ongoing, which makes NCM’s ability to import firmware vulnerability warnings provided by the National Institute of Standards and Technology (NIST) very helpful. This puts you even further ahead of future vulnerabilities.

SolarWinds® Network Performance Monitor (NPM) is all about the baseline. If you have ever been to one of our SWUGs, you have heard me preach endlessly about baselines and their extreme importance. However, I understand that sometimes you need it in black and white in front of you to truly grasp this. The mindset I’m currently following regarding this vulnerability looks something like this:

  1. Patched and we have our checkbox
  2. Monitoring our application performances
  3. Ready for updates to needed network devices
  4. Monitoring the common vulnerabilities database
  5. Waiting for any anomaly that may present its ugly face (my favorite)

We can now show that we have implemented the patching to put a Band-Aid® on the issues that could present themselves. However, as I’ve already mentioned, this is not a full fix. A hardware fix would be the best solution, but one is obviously not available for the billions of affected devices at this time. YOU ARE THE FIRST RESPONDER!

Using NPM in combination with the other tools that I have outlined allows you to verify the patching and the results. Also, if there are ticks or drops or spikes that do NOT match your current baseline, you can share that solid reporting and documentation with your vendor to work out the possible issue, which makes you part of the solution. Is there anything better than working at the edge of technological advancements to create countermeasures to vulnerabilities? NO. The answer is a solid NO.

If you don’t already have it in place, set up threshold alerting and monitoring on critical devices that are housing your applications. That helps ensure that you are alerted to anything out of the ordinary, allowing you to get things back on track. It also shows your team and other departments that you are fully invested in the integrity of application uptime and performance. Also, if you have DevOps, you really need the documentation and baselines to prove that perhaps the performance issue is not the in-house application, but an actual patching issue. That, right there, can save a lot of unneeded cycles through rabbit holes.

Please let me know if you have additional ways to protect and help through these beginning stages of 2018 vulnerabilities. The ideas we share could help the many of you who act as a one-person army fighting your way to the top!

Thank you all for your eyes,

~Dez~

In case you’d like more information on any of the products mentioned above, check these out:

SolarWinds® Patch Manager

SolarWinds® Server & Application Monitor

SolarWinds® Network Performance Monitor

SolarWinds® Network Configuration Manager

Other resources:

https://www.pcworld.com/article/3245606/security/intel-x86-cpu-kernel-bug-faq-how-it-affects-pc-mac....

https://www.nytimes.com/2018/01/03/business/computer-flaws.html

Check out our Security and Compliance LinkedIn® Showcase Page for ideas on how to socialize this content: https://www.linkedin.com/showcase/solarwinds-security-and-compliance/

Follow our Federal LinkedIn page to stay current on federal events and announcements: https://www.linkedin.com/showcase/4799311/


Jogging is my exercise. I use it to tune out noise, focus on a problem at hand, avoid interruptions, and stay healthy. Recently, I was cruising at a comfortable nine-minute pace when four elite runners passed me, and it felt like I was standing still. It got me thinking about the relationship between health and performance. I came to the conclusion that they are related, but more like distant cousins than siblings.

I can provide you data that indicates health status: blood pressure, resting heart rate, BMI, body fat percentage, current illnesses, etc. Given all that, tell me: can I run a four-minute mile? That question can’t be answered solely with the data I provided. That’s because I’m now talking about performance versus health.

We can also look at health metrics with databases: CPU utilization, I/O stats, memory pressure, etc. However, those also can’t answer the question of how your databases and queries are performing. I’d argue that both health AND performance monitoring and analysis are important. They can impact each other but answer different questions.

“What gets measured gets done.” I love this saying and believe that to be true. The tricky part is making sure we’re measuring the right thing to ensure we’re driving the behavior we want.

Health is a very mature topic and pretty much all database monitoring solutions offer visibility into it. Performance is another story. I love this definition of performance from Craig Mullins as it relates to databases: “the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.”

Interestingly, I believe this definition would be widely accepted, yet approaches to achieving it with monitoring tools vary widely. While I agree with this definition, I’d add “in the shortest possible time” to the end of it. If you agree that you need to consider a time component with regard to database performance, then we’re talking about wait-time analysis. Here’s a white paper that goes into much more detail on this approach and why it is the correct way to think about database performance.
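
To make "wait-time analysis" a little more concrete, here is a rough, SQL Server-specific sketch (assuming the SqlServer PowerShell module and a placeholder instance name) of the raw material such analysis usually starts from -- the server's cumulative wait statistics:

# Assumes the SqlServer module is installed: Install-Module SqlServer
Import-Module SqlServer

# Top waits by accumulated wait time since the last service restart
Invoke-Sqlcmd -ServerInstance "sql01.example.com" -Query @"
SELECT TOP 10 wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
"@ | Format-Table

The point of the wait-time approach is that it asks where sessions actually spend their time, rather than how busy the hardware looks.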

We can only get to the right answer regarding root cause if we’re collecting (measuring) the right data in the first place. Below is a chart with some thoughts on data collection requirements. Adapt as needed, but I hope it provides a workable framework.

[Chart: data collection requirements for health vs. performance monitoring]

Remember: don’t stop with asking “What can we do?” Take it to the next level and instead ask, “What should we do?”


Imagine this scenario: You are running a Kiwi® server either on-premises or in the cloud, and need to push at least a portion of that log data to Papertrail. This would be especially helpful in situations where Kiwi is already in place, and you need to allow a developer, support contact, etc. external access to limited log data without providing access to the Kiwi server itself. Once these logs are pushed to your Papertrail account, you can grant users access to specific Papertrail log data. These Papertrail logs can be viewed from anywhere, while Kiwi servers are often locked down within a secured network. The best part is that you can maintain a complete local copy of your logs while pushing interesting log data to Papertrail for use with advanced search and alerting features.

From your Kiwi Syslog® Service Manager, select File -> Setup.

In the setup page, you have a rule named Default that displays all log entries sent to Kiwi and logs them to a file.

Send everything to Papertrail! If you wish to forward ALL logs seen by Kiwi to Papertrail, add the Send to Papertrail action to your Default rule, or any rule with no filters configured.

However, if you want to send only certain messages to Papertrail, you’ll need to add a new rule with a filter to capture just the specific messages you want.

We'll be adding 1 New Rule with 2 Filters and 2 Actions.


FILTERS

Filters allow several methods of matching log data. Positive matches result in the actions for that rule being performed on those log lines. Hostname, IP, Message Text, and Priority are the most commonly used filters.

Add the new rule by right-clicking Rules and selecting Add rule.


Under the new rule, right-click Filters and select Add Filter.


In the Field section, choose Priority.


Click on the Priority headings to highlight all the columns.


Click the green check mark at the bottom to select the highlighted fields.


Next, create a new filter to match the text in log lines using the Message Text field and the Simple filter type. Here I used "test" because it will match all of the Kiwi default test log lines. You can use any text string in this filter to match log entries you wish to send to Papertrail.


ACTIONS!

Now configure the actions to take place on log lines matching our filters. Start by adding them to a Kiwi display so we can see what's matching the rule right here in Kiwi.

Under the new rule, right-click Actions and Add action.


Select the Display action at the top of the menu. Set a Display number that corresponds to the display dropdown in the main Kiwi window. You should use a unique display that isn't used by other Kiwi rules. Display 00 shows ALL logs seen by Kiwi by default, so I’ve used Display 01 instead. This display will show only what is being sent to Papertrail.


Now add an action to send the matching logs to Papertrail.

Under the new rule, right-click Actions and Add action to add another action.


Select the Log to Papertrail.com (cloud) action to send logs to a Papertrail account. Replace the hostname and port with your own log destination found here: https://papertrailapp.com/account/destinations


After hitting Apply to save the configuration, use the File -> Send test message to localhost menu item to generate a log line that will be pushed to your Papertrail account and shown on the Kiwi display you set. In your Papertrail account, you’ll see your Kiwi server show up by IP or hostname, but you can rename it as I’ve done here. (Remember: The test log line shown has to match your filters.)

[Screenshots: the test log line on the Kiwi display and arriving in the Papertrail event viewer]

Troubleshooting

Not seeing log lines in Papertrail? Does the Kiwi server have outbound network connectivity that allows a connection to Papertrail? In ~90% of cases, this is caused by host-based firewalls or other network devices blocking connectivity to Papertrail.

The PowerShell® below will test basic UDP connectivity to Papertrail from a Windows® host. Replace the Papertrail Hostname/Port with your actual log destination settings found here. Copy and paste all lines at once into PowerShell. (Run PowerShell as Administrator if you have trouble.)

WINDOWS - PowerShell

# Open a UDP client aimed at your Papertrail log destination
# (replace the hostname and port with your own)
$udp = New-Object Net.Sockets.UdpClient logs6.papertrailapp.com, 12345

# Encode a test message as UTF-8 bytes
$payload = [Text.Encoding]::UTF8.GetBytes("PowerShell to Papertrail - UDP Syslog Test")

# Send the datagram
$udp.Send($payload, $payload.Length)

You can use this similar script to replicate a log transfer to Kiwi. Run this from the same host the Kiwi server is on.

# Same test, pointed at the local Kiwi syslog listener (default UDP port 514)
$udp = New-Object Net.Sockets.UdpClient 127.0.0.1, 514

$payload = [Text.Encoding]::UTF8.GetBytes("udp papertrail test")

$udp.Send($payload, $payload.Length)


There are many decisions in life where we come across the question of build or buy. A new house. A business. An application. Monitoring software. Regardless of the object of discussion, answering certain questions can act as gates for helping us along the path to the right solution—for us as individuals or as companies.

Take database monitoring software for example.

  • What are the requirements? What are the “must haves” vs. “nice to haves”?
  • Does a monitoring solution exist that at least meets minimum requirements?
  • Is there a competitive advantage to be realized by building?
  • What is the Total Cost of Ownership (TCO) of the build vs. buy options?

Requirements

Regarding requirements, be sure to get input from all of the stakeholders who will be involved or will benefit from the monitoring solution. This will include DBAs, developers, management, and application owners at a minimum, but may also extend to other IT disciplines (system admins, storage admins, VM admins, etc.). Be as thorough and detailed as possible when defining requirements. Be realistic on the “must haves” and try not to throw in the kitchen sink. Take into account best practices when defining requirements. There is a lot of good content out there on how to define requirements, so no need to be exhaustive here.

Does It Already Exist?

After requirements have been defined, the journey can begin. A quick Google® search will likely yield a number of alternatives for consideration. Some things to look for are:

  • Features that match requirements
  • Testimonials
  • Trusted reviews
  • Customer reviews
  • Satisfaction with the product and support
  • Specific strengths and weaknesses of the monitoring solution

In addition to the above research, evaluations are likely in order. All of the key boxes might get checked, but taking the candidate solutions for a test drive can go a long way in determining a good fit. This phase will likely involve talking with sales representatives from the vendors. This is a great opportunity to get additional information you may have missed during your research, or answers to questions that come up during evaluations. Restricting candidates for evaluation to three can save time and effort.

Competitive Advantage

Determining if there will be a competitive advantage from building monitoring software is a bit nebulous. Database monitoring tools in general are fairly mature for the major RDBMS vendors. Only databases without a significant install base are likely to lack commercial off-the-shelf (COTS) monitoring solutions. Still, if you don’t find a solution that meets your minimum criteria, you may be looking at a build scenario. In fact, many great startups resulted from needs that weren’t met by COTS products.

TCO

Purchasing perpetual COTS licenses is usually straightforward, in my experience. Costs to consider are the list price, the initial purchase/negotiated price (usually some percentage of list), and then maintenance (normally somewhere around 20% of list price due annually).

The TCO for building a solution is a bit more ambiguous. Costs to consider include development of the product, potential integration with other technologies, security, administration, opportunity cost (associated with not monitoring) while developing, and maintenance of the product when new versions/drivers are released on the target RDBMSs. Software development being what it is, a contingency factor of at least 20% should be built into the cost for rework, bug fixes, course adjustments, etc. And hey, if you do succeed in traveling the happy path during development, that’s cost reduction that can flow directly to the bottom line.
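
As a back-of-the-envelope sketch of the comparison (every number below is an invented placeholder, not a benchmark):

# Toy buy-vs.-build TCO over three years -- substitute your own quotes and estimates
$listPrice   = 100000                    # vendor list price
$purchase    = $listPrice * 0.80         # negotiated purchase price (80% of list)
$maintenance = $listPrice * 0.20 * 3     # ~20% of list per year, for 3 years
$buyTco      = $purchase + $maintenance  # 140000

$devCost     = 180000                    # estimated development effort
$contingency = $devCost * 0.20           # rework/bug-fix contingency (at least 20%)
$buildTco    = $devCost + $contingency   # 216000, before ongoing upkeep

"Buy: $buyTco   Build: $buildTco"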

Conclusion

The decision to build vs. buy is a key one. It’s tempting to go down a path because it’s something that can be done. However, answering the questions listed here should help those making the decision to identify what should be done.


It seems DevOps is the new cool thing in IT. Sometimes it feels like DevOps is an amorphous thing that only cloud people can play with. For many of us who come from the client-server era, it can be intimidating.

We know DevOps can be defined in many ways. It can be thought of as a mindset, a methodology, or a set of tools. In this post, I offer a definition of DevOps by breaking the concept down into seven fundamental principles.

Implementing DevOps fully is very complex; it requires new tools, new skills, and new processes. It’s often only possible for development and operations teams who are working together on cloud-architecture software. I am excited about these seven principles because they can be applied in any IT organization.

Embracing these seven principles might enable your team to grow more agile, more responsive to business needs, and better able to meet expectations. The combination of these principles represents the mindset that companies are trying to hire for, and the mindset that is required to make the best use of cloud technologies, too.

Here are the seven principles that define DevOps, which you can integrate into your IT operations team:

  1. Application and End-User Focus – Everyone on the team is focused on how their end-users and applications are impacted. The infrastructure is only there to make the application work.
  2. Collaboration – Because the focus is on the end-user, silos do not work. If the app is down, everyone has failed. There are no virtualization problems or isolated storage issues. There is only one team: the one responsible for making the app work. This requires transparency, visibility, a consistent set of tools, and teamwork that supports applications across the entire technology stack.
  3. Performance Orientation – Performance is a requirement and a core skill across the team. Performance is measured, all the time, everywhere. Bottlenecks and contentions are well understood. Performance is an SLA, and it’s critical to the end-user experience. Everyone understands the direct relationship between performance, resource utilization, and cost.
  4. Speed – Taking agile one step further, shorter, iterative processes allow teams to move faster, innovate, and serve the business more effectively.
  5. Service Orientation – No monolithic apps. Everything is a service, from application components to infrastructure. Everything is flexible and ready to scale or change.
  6. Automation – To move faster, everything is automated: code, deployments, tests, monitoring, alerts. That includes IT services. Embrace self-service for your users and focus on what matters.
  7. Monitor Everything – Visibility is critical for speed and collaboration. Monitoring is a requirement and a discipline. Everything is tested, and the impact of every change is known.

For more details, I invite you to read the full presentation:

The 7 Principles of DevOps and Cloud Applications



The term “cloud” has stopped being useful as a description of a specific technology and business model. Everything, it seems, has some element of cloudiness. The definition of cloud versus on-premises has blurred.

It’s only been eight years since Gartner® defined the five attributes of cloud computing: scalable and elastic, service-based, shared, metered by use, and built on internet technologies. Shortly after that, Forrester® defined the cloud as standard IT delivered via internet technologies on a pay-per-use, self-service model.

What we call on-premises is most often virtualized, dynamically provisioned infrastructure in a co-location hosting facility, programmable by software. Clouds now offer long-term, bare-metal, prepaid-for-the-year infrastructure, and private/dedicated infrastructure.

Organizations have learned that there is a place for cloud-hosted resources and a place for on-premises resources. The best analogy I have is this: there is a time when you want to buy a car and there is a time when you want to rent a car (or get a taxi). As Lew Moorman, president of Rackspace® told me many years ago, “the cloud is for everyone, but not for everything.”

It’s undeniable that every IT department is adopting the cloud, but it is also becoming increasingly clear that on-premises IT is not going away. Most companies will end up with a combination of the two. But how?

For the near future, there are three broad ways for IT departments to consume cloud:

  • SaaS – From Salesforce® to NetSuite® and Office 365®
  • “Lift and Shift” – Where the architecture stays the same, and you only migrate the workloads to be hosted on a cloud
  • “Cloud first,” which takes full advantage of cloud architecture and services. This model is only viable for net new projects and for those where it makes sense to invest in writing or re-writing apps from the ground up.

I bring this up because, when it comes to monitoring, application architecture is more important than where things are hosted.

A standard three-tier architecture application like SharePoint® on AWS® needs to be monitored essentially the same way it is monitored on-premises, or in a co-location environment. Conversely, a cloud-architected application (service-oriented, dynamically provisioned, horizontally scaled, etc.) will require a different monitoring approach, whether it is hosted on a public or a private cloud.

The key point is that cloud is quickly becoming irrelevant as a term. No one says they have an electronic calculator or a digital computer anymore.

We need to start using more specific terms that are more meaningful and useful, such as cloud services, cloud architecture, or cloud hosting – not just cloud.


If your organization is based in the EU, or provides goods or services to the EU, you’ve probably heard a lot about General Data Protection Regulation (GDPR) compliance lately. In this post, I’d like to educate the THWACK® community on some of the GDPR requirements and how SolarWinds products such as Log & Event Manager (LEM) can assist with GDPR compliance.

Why the need for GDPR?

In December 2015, the EU announced that the GDPR was being implemented in place of the Data Protection Directive (DPD), the current EU data laws. The DPD was first established over 20 years ago, but it has not kept up with the seismic changes in information technology and is no longer sufficient for today’s technologies and threats. The shortcomings of the DPD have become apparent and the EU saw the need to replace it.

The shift from directive to regulation

A defining change that comes with the launch of GDPR is the shift from a directive to a regulation. The DPD was a directive, meaning a set of rules issued to member states, which each country could interpret and implement differently. GDPR is a regulation, which member states must implement without any scope for varying interpretations. It removes any ambiguities about organizations’ data protection responsibilities. GDPR paves the way for data privacy as a fundamental right for EU citizens. The implementation deadline for the regulation is May 25, 2018, so organizations are certainly against the clock to implement the necessary policies, procedures, and systems to ensure they are compliant.

What exactly is personal data?

GDPR defines a very broad spectrum of personal data. Personal data is no longer limited to information such as name, email, address, and phone number. GDPR also classifies online identifiers such as IP addresses, web cookies, and unique device identifiers as personal data. Even pseudonymous data is included. This is personal data that has been technically modified in some way, such as hashed or encrypted. It's worth noting that the rules are slightly relaxed for data that is pseudonymized, which provides an incentive for organizations to encrypt or hash their data. GDPR also singles out special categories of personal data: “data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade-union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health, or data concerning a natural person’s sex life or sexual orientation.” (GDPR Article 9, page 124)
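
As a toy illustration of the kind of "technical modification" pseudonymization involves (and only an illustration -- a hashed identifier is still personal data under GDPR, just pseudonymized):

# Pseudonymize an email address with SHA-256 (placeholder data, illustrative only)
$email = "jane.doe@example.com"
$sha   = [System.Security.Cryptography.SHA256]::Create()
$bytes = [System.Text.Encoding]::UTF8.GetBytes($email)
$hash  = ($sha.ComputeHash($bytes) | ForEach-Object { $_.ToString("x2") }) -join ""
$hash   # a stable identifier that no longer reveals the address directly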

My organization is not based in the EU—why should I care about GDPR?

Although it is an EU regulation, it is not limited to the EU. GDPR will affect organizations on a global scale. The regulation will apply to any organization that offers goods or services to EU citizens. If a company based outside the EU is storing, managing, or processing personal data belonging to EU citizens, it will need to ensure GDPR compliance (GDPR Article 3, page 110). According to a recent PwC study, a staggering 92% of US multinational companies have listed GDPR compliance as a data-privacy priority. A significant percentage of those organizations plan to spend $1 million or more on GDPR.

Data controllers vs. data processors

Controller – “The natural or legal person, public authority, agency, or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data.”

Processor – “A natural or legal person, public authority, agency, or other body which processes personal data on behalf of the controller.”  (Article 4, GDPR page 112)

Under the DPD, data processors had very little direct responsibility to comply, whereas GDPR places joint responsibility on both data controllers and data processors to comply with the regulation. As an example, if an organization (controller) outsources its payroll to an external payroll company (processor), even though the payroll company is managing and storing data on behalf of the controller, it is now required to comply with GDPR. This will impact controllers and processors alike. Controllers will have to conduct reviews to ensure their processors have a framework in place to comply with GDPR. Processors will have to ensure they are compliant.

Data breach notification – GDPR Article 33 (page 53)

The Data Protection Directive didn’t require organizations to notify authorities of data breaches. GDPR defines a personal data breach as the “accidental or unlawful destruction, loss, alteration, unauthorized disclosure of, or access to personal data transmitted, stored, or otherwise processed.” It’s worth remembering that personal data now includes IP addresses, web cookies, unique device identifiers, and more. The GDPR also now requires organizations (or controllers, as they are known in GDPR) to report data breaches within 72 hours. If this deadline is not met, you will have to explain the reasons for the delay. If you are a data processor, you must report the breach to the controller. The controller then notifies the “supervisory authority.” Data subjects must also be informed when a breach poses a high risk to their rights and freedoms. However, if the controller had implemented protection measures such as encryption on the data, then the data subject’s rights and freedoms are unlikely to be at risk.

Individual rights

GDPR provides EU citizens with increased personal data rights. Just some of these individual rights include Consent (Article 7), Right to Erasure (Article 17), and Data Portability (Article 20).

Organizations will require consent when collecting personal data of EU citizens. The type of data and retention period will need to be stated in plain language that citizens can clearly understand. Data controllers will be required to prove that consent has been provided by the subject.

Individuals also have the right to erasure, meaning subjects can request that controllers delete all information about them, provided the controller has no reason to further process the data. There are exceptions if the data is used for legal obligations—for example, financial institutions are legally obliged to retain data for a certain period of time. If a data controller has shared personal data with third parties, the onus is on the controller to inform those third parties of the data subject’s request to erase the data.

Data Portability allows data subjects to receive the personal data they provided to a data controller in a structured, “machine-readable” format. This portability facilitates data subjects’ ability to move, copy, or transmit data easily from one service provider to another.

What happens if we don’t comply?

If your organization is not compliant with GDPR, it can receive fines of up to €20 million or 4% of global annual turnover for the preceding financial year (whichever is greater). These penalties apply to both data controllers and processors. (Article 83, section 5)

How can SolarWinds help?

GDPR will likely require organizations to implement new policies, procedures, controls, and technologies—it may even require you to hire a Data Protection Officer, in certain cases. While no single technology can meet all the requirements of GDPR, SolarWinds can certainly assist with some of the requirements.

Article 32: Security of processing

This section of GDPR requires organizations to “implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk.” SolarWinds® Patch Manager can be used to identify and update missing patches and outdated third-party software on your Windows® servers and workstations. Patch Manager also enables you to inventory your Windows® machines and report on unauthorized software installations on your network.

Article 32 also requires “regularly testing … the effectiveness of technical and organisational measures for ensuring the security of the processing.” SolarWinds LEM can be used to validate the controls you have put in place.

Please see here for more information: Article 32

Article 33 and 34: Notification of a personal data breach to the supervisory authority and communication of a personal data breach to the data subject

SolarWinds Risk Intelligence (RI) is a product that performs a scan to discover personally identifiable information across your systems and points out potential vulnerabilities that could lead to a data breach. RI can audit PII data to help ensure it is being stored in accordance with the requirements of GDPR. The reports from RI can be helpful in providing evidence of due diligence when it comes to the storage and security of PII data.

As mentioned previously, if a personal data breach occurs, the controller must notify the supervisory authority within 72 hours. It is vital that breaches and threats are identified as quickly as possible.

LEM can assist with the detection of potential breaches thanks to features such as correlation rules and Threat Feed Intelligence. LEM’s File Integrity Monitoring and USB Defender® can monitor for any suspicious file activity and also detect the use of unauthorized USB removable media. If an incident occurs, LEM’s nDepth feature can be leveraged to perform historical analysis. LEM also includes best-practice reporting templates to assist with compliance reporting.

Please see here for more information: Article 33 and Article 34

“The implementation deadline for the regulation is May 25, 2018.”

The GDPR deadline is fast approaching. GDPR compliance will require significant effort from both data controllers and processors. There are several steps required to get started with GDPR, which include (but are not limited to) performing an analysis of what personal data your organization stores and where it’s stored, reviewing existing IT security policies and procedures, and ensuring you have the necessary technological and organizational procedures in place to detect, report, and investigate personal data breaches.

I am very interested in hearing opinions and how members of the THWACK community are preparing for GDPR. Please feel free to provide comments below.

To learn about the SolarWinds portfolio of IT security software, please see here.


The SolarWinds THWACK® community has grown to become one of the largest and most active communities for IT professionals, expecting about two million unique visitors this year alone.

We see it as a great opportunity to have a conversation and to connect.

IT is changing all the time. That’s what makes it such an interesting industry. SolarWinds® solutions have been changing, too. In addition to our traditional product line, powered by the Orion® Platform, SolarWinds now offers a remote monitoring product line for MSPs, and a portfolio of cloud monitoring products for DevOps teams building cloud-first applications.

This makes it more important than ever that we have a space to connect with customers and with the IT industry. This is that space.

Monitoring Central complements our two other blog communities on THWACK: Geek Speak, where you can read opinions from industry thought leaders, and the Product Blog, where you find out about product updates and new releases.

Monitoring Central is a new space to talk about all things monitoring.

We invite you to ask questions, voice your opinions, and actively participate in this blog. For example, write a comment below suggesting any topics you would like to hear about.

We look forward to the conversation.
