Monitoring Central Blogs - Page 2



Look back at almost any online technology business 10, or even five, years ago and you’d see a clear distinction between what the CTO and CMO did in their daily roles. The former would oversee the building of technology and products whilst the latter would drive the marketing that brought in customers to use said technology. In short, the two took care of very different sides of the same coin.

Marketing departments traditionally measure their success against KPIs such as the number of conversions a campaign brought in versus the cost of running it. Developers measure their performance on how quickly and effectively they develop new technologies.

Today, companies are shifting focus towards a customer-centric approach, where customer experience and satisfaction are paramount. After all, how your customers feel about your products can make or break a business.

Performance diagnostic tools can help you optimize a slow web page but won’t show you whether your visitors are satisfied.

So where do the classic stereotypes that engineers only care about performance and marketers only care about profit fit into the customer-centric business model? The answer is they don’t: in a business where each department works against the same metric — improving the customer experience — having separate KPIs is as redundant as a trap door in a canoe.

The only KPI that matters is “are my customers happy?”

Developers + Marketing KPIs = True

With technology being integral to any online business, marketers are now in a position where we can gather so much data and in such detail that we are on the front line when it comes to gauging the satisfaction and experience of our customers. We can see what path a visitor took on our website, how long they took to complete their journey and whether they achieved what they set out to do.

Armed with this, we stand in a position to influence the technologies developers build and use.

Support teams, no longer confined to troubleshooting customer problems, have become Customer Success teams that directly impact how developers build products, armed with first-hand data from their customers.

So as the lines blur between departments, it shouldn’t come as a surprise that engineering teams care about marketing metrics. After all, if a product is only as effective as the people who use it, engineers build better products and websites when they know how customers intend to use them.

Collaboration is King

“How could engineers possibly make good use of marketing KPIs?” you might ask. After all, the two are responsible for separate ends of your business but can benefit from the same data.

Take a vital page on your business’s website: it’s not the fastest page on the net but its load time is consistent and it achieves its purpose: to convert your visitors to customers. Suddenly your bounce rate has shot up from 5% to 70%.

Ask an engineer to troubleshoot the issue and they might tell you that the page isn’t efficient. It takes 2.7 seconds to load, which is 0.7 seconds over the universal benchmark, and, what’s more, some of the file sizes on your site are huge.

Ask a marketer the same question and they might tell you that the content is sloppy, making the purpose of the page unclear. The colors are off-brand and, what’s more, an important CTA is missing.

Even though both have been looking at the same page, they’ve come to two very different results, but the bottom line is that your customer doesn’t care about what went wrong. What matters is that the issue is identified and solved, quickly.

Unified Metrics Mean Unified Monitoring

Having unified KPIs across the various teams in your organization means that they should all draw their data from the same source: a single, unified monitoring tool.

For businesses where the customer comes first, a new breed of monitoring is evolving that offers organizations this unified view, centered on how your customers experience your site: Digital Experience Monitoring. Or, seeing as everything we do is digital, how about we just call it Experience Monitoring?

With Digital Experience Monitoring, your marketers and your engineering teams can follow a customer’s journey through your site, see how they navigated through it, and see where and why interest became a sale or a lost opportunity.

Let’s go back to our previous example: both your marketer and your engineer will see that although your bounce rate skyrocketed, the page load time and size stayed consistent. What they might also see is that the onboarding flow you implemented, which coincides with the bounce rate spike, is confusing to your customers, meaning they leave frustrated and unwilling to convert.

Digital Experience Monitoring gives a holistic view of your website’s health and helps you answer questions like:

  • Where your visitors come from
  • When they visit your site
  • What they visit and the journey they take to get there
  • How your site’s performance impacts your visitors

By giving your internal teams access to the same metrics, you foster greater transparency across your organization which leads to faster resolution of issues, a deeper knowledge of your visitors and better insights into what your customers love about your products.

Pingdom’s Digital Experience Monitoring, Visitor Insights, bridges the gap between site performance and customer satisfaction, meaning you can guess less and know more about how your visitors experience your site.


What are some common problems that can be detected with the handy router logs on Heroku? We’ll explore them and show you how to address them easily and quickly with monitoring of Heroku from SolarWinds Papertrail.

One of the first cloud platforms, Heroku is a popular platform as a service (PaaS) that has been in development since June 2007. It allows developers and DevOps specialists to easily deploy, run, manage, and scale applications written in Ruby, Node.js, Java, Python, Clojure, Scala, Go, and PHP.

To learn more about Heroku, head to the Heroku Architecture documentation.

Intro to Heroku Logs

Logging in Heroku is modular, similar to gathering system performance metrics. Logs are time-stamped events that can come from any of the processes running in all application containers (Dynos), system components, or backing services. Log streams are aggregated and fed into the Logplex—a high-performance, real-time system for log delivery into a single channel.

Run-time activity, as well as dyno restarts and relocations, can be seen in the application logs. These include logs generated from within application code deployed on Heroku, services like the web server or the database, and the app’s libraries. Scaling, load, and memory usage metrics, among other structural events, can be monitored with system logs. Syslogs collect messages about actions taken by the Heroku platform infrastructure on behalf of your app. These are two of the most common types of logs available on Heroku.

To fetch logs from the command line, we can use the heroku logs command. More details on this command, such as output format, filtering, or ordering logs, can be found in the Logging article of Heroku Devcenter.

$ heroku logs
2019-09-16T15:13:46.677020+00:00 app[web.1]: Processing PostController#list (for 208.39.138.12 at 2010-09-16 15:13:46) [GET]
2018-09-16T15:13:46.677902+00:00 app[web.1]: Rendering post/list
2018-09-16T15:13:46.698234+00:00 app[web.1]: Completed in 74ms (View: 31, DB: 40) | 200 OK [http://myapp.heroku.com/]
2018-09-16T15:13:46.723498+00:00 heroku[router]: at=info method=GET path="/posts" host=myapp.herokuapp.com fwd="204.204.204.204" dyno=web.1 connect=1ms service=18ms status=200 bytes=975
© 2018 Salesforce.com. All rights reserved.

Heroku Router Logs

Router logs are a special case of logs that exist somewhere between the app logs and the system logs—and are not fully documented on the Heroku website at the time of writing. They carry information about HTTP routing within Heroku Common Runtime, which manages dynos isolated in a single multi-tenant network. Dynos in this network can only receive connections from the routing layer. These routes are the entry and exit points of all web apps or services running on Heroku dynos.

Tail only the router logs with the heroku logs -tp router CLI command.

$ heroku logs -tp router
2018-08-09T06:24:04.621068+00:00 heroku[router]: at=info method=GET path="/db" host=quiet-caverns-75347.herokuapp.com request_id=661528e0-621c-4b3e-8eef-74ca7b6c1713 fwd="104.163.156.140" dyno=web.1 connect=0ms service=17ms status=301 bytes=462 protocol=https
2018-08-09T06:24:04.902528+00:00 heroku[router]: at=info method=GET path="/db/" host=quiet-caverns-75347.herokuapp.com request_id=298914ca-d274-499b-98ed-e5db229899a8 fwd="104.163.156.140" dyno=web.1 connect=1ms service=211ms status=200 bytes=3196 protocol=https
2018-08-09T06:24:05.002308+00:00 heroku[router]: at=info method=GET path="/stylesheets/main.css" host=quiet-caverns-75347.herokuapp.com request_id=43fac3bb-12ea-4dee-b0b0-2344b58f00cf fwd="104.163.156.140" dyno=web.1 connect=0ms service=3ms status=304 bytes=128 protocol=https
2018-08-09T08:37:32.444929+00:00 heroku[router]: at=info method=GET path="/" host=quiet-caverns-75347.herokuapp.com request_id=2bd88856-8448-46eb-a5a8-cb42d73f53e4 fwd="104.163.156.140" dyno=web.1 connect=0ms service=127ms status=200 bytes=7010 protocol=https
Fig 1. Heroku router logs in the terminal.

Heroku routing logs always start with a timestamp and the “heroku[router]” source/component string, and then a specially formatted message. This message begins with either “at=info”, “at=warning”, or “at=error” (log levels), and can contain up to 14 other detailed fields such as:

  • Heroku error "code" (optional) – for all errors and warnings, and some info messages; Heroku-specific error codes that complement the HTTP status codes
  • Error "desc" (optional) – description of the error, paired with the codes above
  • HTTP request "method", e.g., GET or POST – may be related to some issues
  • HTTP request "path" – URL location for the request; useful for knowing where to check in the application code
  • HTTP request "host" – Host header value
  • The Heroku HTTP Request ID – can be used to correlate router logs with application logs
  • HTTP request "fwd" – X-Forwarded-For header value
  • Which "dyno" serviced the request – useful for troubleshooting specific containers
  • "connect" – time (ms) spent establishing a connection to the web server(s)
  • "service" – time (ms) spent proxying data between the client and the web server(s)
  • HTTP response code or "status" – quite informative in case of issues
  • Number of "bytes" transferred in total for this web request
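As a rough illustration of this format, the key-value portion of a router log line can be pulled apart with a few lines of Python. This is a sketch under the assumption that fields are space-separated key=value pairs with optionally quoted values; it is not an official Heroku parser:

```python
import re

# Matches space-separated key=value pairs; values may be double-quoted or bare.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S*)')

def parse_router_line(message):
    """Parse the key=value fields of a Heroku router log message into a dict."""
    return {key: value.strip('"') for key, value in PAIR_RE.findall(message)}

line = ('at=info method=GET path="/db" host=myapp.herokuapp.com '
        'dyno=web.1 connect=1ms service=18ms status=200 bytes=975')
parsed = parse_router_line(line)
print(parsed["status"], parsed["path"], parsed["dyno"])  # -> 200 /db web.1
```

From here, aggregations such as error counts per path or average service time per dyno are straightforward.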

Common Problems Observed with Router Logs

Examples in this article are manually color-coded. Typical ways to address each issue are also provided for context.

Common HTTP Status Codes

404 Not Found Error

Problem: Error accessing nonexistent paths (regardless of HTTP method):

2018-07-30T17:10:18.998146+00:00 heroku[router]: at=info method=POST path="/saycow" host=heroku-app-log.herokuapp.com request_id=e5634f81-ec54-4a30-9767-bc22365a2610 fwd="187.220.208.152" dyno=web.1 connect=0ms service=15ms status=404 bytes=32757 protocol=https
2018-07-27T22:09:14.229118+00:00 heroku[router]: at=info method=GET path="/irobots.txt" host=heroku-app-log.herokuapp.com request_id=7a32a28b-a304-4ae3-9b1b-60ff28ac5547 fwd="187.220.208.152" dyno=web.1 connect=0ms service=31ms status=404 bytes=32769 protocol=https

Solution: Implement or change those URL paths in the application or add the missing files.

500 Server Error

Problem: There’s a bug in the application:

2018-07-31T16:56:25.885628+00:00 heroku[router]: at=info method=GET path="/" host=heroku-app-log.herokuapp.com request_id=9fb92021-6c91-4b14-9175-873bead194d9 fwd="187.220.247.218" dyno=web.1 connect=0ms service=3ms status=500 bytes=169 protocol=https

Solution: The application logs have to be examined to determine the cause of the internal error in the application’s code. Note that HTTP Request IDs can be used to correlate router logs against the web dyno logs for that same request.
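As a sketch of that correlation, the fragment below filters a mixed stream of router and app log lines down to those carrying one request ID (the sample lines and ID are made up for illustration):

```python
def lines_for_request(log_lines, request_id):
    """Return every log line (router or app) that mentions the given request ID."""
    return [line for line in log_lines if request_id in line]

logs = [
    'heroku[router]: at=info method=GET path="/" request_id=9fb92021 status=500',
    'app[web.1]: request_id=9fb92021 NoMethodError in PostsController#index',
    'heroku[router]: at=info method=GET path="/about" request_id=abc123 status=200',
]
for line in lines_for_request(logs, "9fb92021"):
    print(line)  # prints the failing router line plus the matching app error
```

The same idea applies whether you grep a downloaded log file or search in a log management tool.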

Common Heroku Error Codes

Other problems commonly detected by router logs can be explored in the Heroku Error Codes. Unlike HTTP codes, these error codes are not standard and only exist in the Heroku platform. They give more specific information on what may be producing HTTP errors.

H14 – No web dynos running

Problem: The app has no web dynos set up:

2018-07-30T18:34:46.027673+00:00 heroku[router]: at=error code=H14 desc="No web processes running" method=GET path="/" host=heroku-app-log.herokuapp.com request_id=b8aae23b-ff8b-40db-b2be-03464a59cf6a fwd="187.220.208.152" dyno= connect= service= status=503 bytes= protocol=https

Notice that the above case is an actual error message, which includes both Heroku error code H14 and a description. HTTP 503 means “service currently unavailable.”

Note that Heroku router error pages can be customized. These apply only to errors where the app doesn’t respond to a request e.g. 503.

Solution: Use the heroku ps:scale command to start the app’s web server(s).

H12 – Request timeout

Problem: There’s a request timeout (app takes more than 30 seconds to respond):

2018-08-18T07:11:15.487676+00:00 heroku[router]: at=error code=H12 desc="Request timeout" method=GET path="/sleep-30" host=quiet-caverns-75347.herokuapp.com request_id=1a301132-a876-42d4-b6c4-a71f4fe02d05 fwd="189.203.188.236" dyno=web.1 connect=1ms service=30001ms status=503 bytes=0 protocol=https

Error code H12 indicates the app took over 30 seconds to respond to the Heroku router.

Solution: Code that requires more than 30 seconds must run asynchronously (e.g., as a background job) in Heroku. For more info read Request Timeout in the Heroku DevCenter.
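The general pattern (respond immediately, finish the slow work in the background) can be sketched with Python's standard library. This is a toy illustration of the idea, not a Heroku worker setup; in practice you would use a job queue running on a separate worker dyno:

```python
import queue
import threading

jobs = queue.Queue()
results = []

def worker():
    # Background worker: runs slow tasks outside the request/response cycle.
    while True:
        task = jobs.get()
        if task is None:  # shutdown sentinel
            break
        task()
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request():
    # Instead of doing the slow work inline (and risking an H12 timeout),
    # enqueue it and respond right away; the client can poll for the result.
    jobs.put(lambda: results.append("report generated"))
    return "202 Accepted"

status = handle_request()
jobs.join()  # wait for the background job (for demonstration only)
print(status, results)  # -> 202 Accepted ['report generated']
```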

H18 – Server Request Interrupted

Problem: The application encountered too many requests (server overload):

2018-07-31T18:52:54.071892+00:00 heroku[router]: sock=backend at=error code=H18 desc="Server Request Interrupted" method=GET path="/" host=heroku-app-log.herokuapp.com request_id=3a38b360-b9e6-4df4-a764-ef7a2ea59420 fwd="187.220.247.218" dyno=web.1 connect=0ms service=3090ms status=503 bytes= protocol=https

Solution: This problem may indicate that the application needs to be scaled up, or the app performance improved.

H80 – Maintenance mode

Problem: Maintenance mode generates an info router log with code H80:

2018-07-30T19:07:09.539996+00:00 heroku[router]: at=info code=H80 desc="Maintenance mode" method=GET path="/" host=heroku-app-log.herokuapp.com request_id=1b126dca-1192-4e98-a70f-78317f0d6ad0 fwd="187.220.208.152" dyno= connect= service= status=503 bytes= protocol=https

Solution: Disable maintenance mode with heroku maintenance:off

Papertrail

Papertrail™ is a cloud log management service designed to aggregate Heroku app logs, text log files, and syslogs, among many others, in one place. It helps you to monitor, tail, and search logs via a web browser, command-line, or an API. The Papertrail software analyzes log messages to detect trends, and allows you to react instantly with automated alerts.

The Event Viewer is a live aggregated log tail with auto-scroll, pause, search, and other unique features. Everything in log messages is searchable, and new logs still stream in real time in the event viewer when searched (or otherwise filtered). Note that Papertrail reformats the timestamp and source in its Event Viewer to make it easier to read.

Fig 2. The Papertrail Event Viewer.

Provisioning Papertrail on your Heroku apps is extremely easy: run heroku addons:create papertrail from the terminal. (See the Papertrail article in Heroku’s DevCenter for more info.) Once set up, the add-on can be opened from the Heroku app’s dashboard (Resources section) or with heroku addons:open papertrail in the terminal.

Troubleshooting Routing Problems Using Papertrail

A great way to examine Heroku router logs is by using the Papertrail solution. It’s easy to isolate them in order to filter out all the noise from multiple log sources: simply click on the “heroku/router” program name in any log message, which will automatically search for “program:heroku/router” in the Event Viewer:

Fig 3. Tail of Heroku router logs in Papertrail, 500 app error selected. © 2018 SolarWinds. All rights reserved.

Monitor HTTP 404s

How do you know that your users are finding your content, and that it’s up to date? 404 Not Found errors are what a client receives when the URL’s path is not found. Examples would be a misspelled file name or a missing app route. We want to make sure these types of errors remain uncommon, because otherwise, users are either walking into dead ends or seeing irrelevant content in the app!

With Papertrail, setting up an alert to monitor the amount of 404s returned by your app is easy and convenient. One way to do it is to search for “status=404” in the Event Viewer, and then click on the Save Search button. This will bring up the Save Search popup, along with the Save & Setup Alert option:

Fig 4. Save a log search and set up an alert with a single action. © 2018 SolarWinds. All rights reserved.

The following screen gives you the alert delivery options, such as email, Slack messages, push notifications, or even publishing all matching events as a custom metric for application performance management tools such as AppOptics™.

Troubleshoot 500 errors quickly

Fig 5. HTTP 500 Internal Server Error from herokuapp.com. © 2018 Google LLC. All rights reserved.

Let’s say an HTTP 500 error is happening on your app after it’s deployed. A great feature of Papertrail is to make the request_id in log messages clickable. Simply click on it or copy it and search it in the Event Viewer to find all the app logs that are causing the internal problem, along with the detailed error message from your application’s code.

Conclusion

Heroku router logs are the glue between web traffic and (sometimes intangible) errors in your application code. It makes sense to give them special focus when monitoring a wide range of issues because they often indicate customer-facing problems that we want to avoid or address ASAP. Add the Papertrail add-on to Heroku to get more powerful ways to monitor router logs.

Sign up for a 30-day free trial of Papertrail and start aggregating logs from all your Heroku apps and other sources. You may learn more about the Papertrail advanced features in its Heroku Dev Center article.


Page load time is inversely related to page views and conversion rates. This is probably not a controversial statement, as the causality is intuitive, but there is also empirical data from industry leaders such as Amazon, Google, and Bing to back it up, in High Scalability and O’Reilly’s Radar, for example.

As web technology has become much more complex over the last decade, the issue of performance has remained a challenge as it relates to user experience. Fast forward to 2018, and UX is identified as a key requirement for business success by CIOs and CDOs.

In today’s growing ecosystem of competing web services, the undeniable reality remains that performance impacts business and it can represent a major competitive (dis)advantage. Whether your application relies on AWS, Azure, Heroku, Salesforce, Cloud Foundry, or any other SaaS platform, consider these five tips for monitoring SaaS services.

1. Realize the Importance of Monitoring

In case we haven’t established that app performance is critical for business success, let’s look at research done in the online retail sector.

“E-commerce sites must adopt a zero-tolerance policy for any performance issues that will impact customer experience [in order to remain competitive]” according to Retail Systems Research. Their conclusion is that performance management must shift from being considered an IT issue to being a business matter.

We can take this concept into more specific terms, as stated in our article series on Building a SaaS Service for an Unknown Scale. “Treat scalability and reliability as product features; this is the only way we can build a world-class SaaS application for unknown scale.”

Data from Measuring the Business Impact of IT Through Application Performance (2015).

End users have come to expect very fast, real-time-like interaction with most software, regardless of the system complexities behind the scenes. This means that commercial applications and SaaS services need to be built and integrated with performance in mind at all times. And so, knowing how to measure their performance from day one is paramount. Logs extend application performance monitoring (APM) by giving you deeper insights into the causes of performance problems as well as application errors that can cause user experience problems.

2. Incorporate a Monitoring Strategy Early On

In today’s world, planning for your SaaS service’s successful adoption to take time (and thus worrying about its performance and UX later) is like selling 100 tickets to a party but only beginning preparations on the day of the event. Needless to say, such a plan is prone to produce disappointed customers, and it can even destroy a brand. Fortunately, with SaaS monitoring solutions like SolarWinds® Loggly®, it’s not time-consuming or expensive to implement monitoring.

In fact, letting scalability become a bottleneck is the first of the Six Critical SaaS Engineering Mistakes to Avoid we published some time ago. We recommend defining realistic adoption goals and scenarios in early project stages, and mapping them into performance, stress, and capacity testing. To realize these tests, you’ll need to be able to monitor specific app traffic, errors, user engagement, and other metrics that tech and business teams need to define together.

A good place to start is with the Four Golden Signals described by Google’s Monitoring Distributed Systems book chapter: Latency, Traffic, Errors, and Saturation. Finally, and most importantly from the business perspective, your key metrics can be used as service level indicators (SLI), which are measures of the service level provided to customers.

Based on your SLIs and adoption goals, you’ll be able to establish service level objectives (SLOs) so your ops team can target specific availability levels (uptime and performance). And, as a SaaS provider, you should plan to offer service level agreements (SLAs). SLAs are contracts with your clients that specify what happens if you fail to meet non-functional requirements; their terms are based on your SLOs but can, of course, be negotiated with each client. SLIs, SLOs, and SLAs are the basis for successful site reliability engineering (SRE).

Apache Preconfigured Dashboards in Loggly can help you watch SLOs in a single click.

For a seamless understanding among tech and business leadership, key performance indicators (KPIs) should be identified for various business stakeholders. KPIs should then be mapped to the performance metrics that compose each SLA (so they can be monitored). Defining a matrix of KPI vs. metrics vs. area of business impact as part of the business documentation is a good option. For example, a web conversion rate could map to page load time and number of outages, and impact sales.
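Such a matrix can start as nothing more than a small mapping kept with the business documentation. Here is a minimal sketch in Python; the KPIs, metrics, and impact areas shown are hypothetical examples, not a prescribed set:

```python
# Hypothetical KPI -> (technical metrics, business impact area) matrix.
kpi_matrix = {
    "web conversion rate": {
        "metrics": ["page load time", "number of outages"],
        "business_impact": "sales",
    },
    "support ticket volume": {
        "metrics": ["application error rate", "API latency"],
        "business_impact": "customer success",
    },
}

for kpi, info in kpi_matrix.items():
    print(f"{kpi}: watch {', '.join(info['metrics'])} "
          f"(impacts {info['business_impact']})")
```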

Finally, don’t forget to consider and plan for governance: roles and responsibilities around information (e.g., ownership, prioritization, and escalation rules). The RACI model can help you establish a clear matrix of which team is responsible, accountable, consulted, and informed when there are unplanned events emanating from or affecting business technology.

3. Have Application Logging as a Code Standard

Tech leadership should realize that the main function of logging begins after the initial development is complete. Good logging serves multiple purposes:

  1. Improving debugging during development iterations
  2. Providing visibility for tuning and optimizing complex processes
  3. Understanding and addressing failures of production systems
  4. Supporting business intelligence

“The best SaaS companies are engineered to be data-driven, and there’s no better place to start than leveraging data in your logs.” (From the last of our SaaS Engineering Mistakes)

Best practices for logging are a widely covered topic. For example, see our article on best practices for creating logs. Here are a few guidelines from that and other sources:

  • Define logging goals and criteria to decide what to log. (Logging absolutely everything produces noise and is needlessly expensive.)
  • Log messages should contain data, context, and description. They need to be digestible (structured in a way that both humans and machines can read them).
  • Ensure that log messages are appropriate in severity using standard levels such as FATAL, ERROR, WARN, INFO, DEBUG, TRACE (See also Syslog facilities and levels).
  • Avoid side effects on the code execution. Particularly, don’t let logging halt your app by using non-blocking calls.
  • For external systems, try to log all data that leaves your application and all data that comes in.
  • Use a standard log message format with clear key-value pairs and/or consider a known text standard format like JSON. (See the figure below.)
  • Support distributed logging: Centralize logs to a shareable, searchable platform such as Loggly.
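To make the structured, machine-readable point concrete, here is a minimal sketch of emitting JSON-formatted log messages with Python's standard logging module; the field names are illustrative, not a required schema:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying data, context, and description."""
    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Context attached via the `extra` argument at the call site:
            "user_id": getattr(record, "user_id", None),
            "duration_ms": getattr(record, "duration_ms", None),
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("checkout")
log.addHandler(handler)
log.setLevel(logging.INFO)

# Emits a single machine-parsable line such as:
# {"level": "INFO", "logger": "checkout", "message": "payment processed", ...}
log.info("payment processed", extra={"user_id": 42, "duration_ms": 187})
```

Because each line is valid JSON, a centralized platform can index every field without custom parsing rules.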


Loggly automatically parses several log formats you can navigate with the Fields Explorer.

Every stage in the software development life cycle can be enriched by logs and other metrics. Implementation, integration, staging, and production deployment (especially rolling deploys) will particularly benefit from monitoring such metrics appropriately.

Logs constitute valuable data for your tech team, and invaluable data for your business. Now that you have rich information about the app generated in real time, think about ways to put it to good use.

4. Automate Your Monitoring Configuration

Modern applications are deployed using infrastructure as code (IaC) techniques because they replace fragile server configuration with systems that can be easily torn down and restarted. If your team has made undocumented changes to servers and is too scared to shut them down, those are essentially “pet” servers.

If you manually deploy monitoring configuration on a per-server basis, then you have the potential to lose visibility when servers stop or when you add new ones. If you treat monitoring as something to be automatically deployed and configured, then you’ll get better coverage for less effort in the long run. This becomes even more important when testing new versions of your infrastructure or code, and when recovering from outages. Tools like Terraform, Ansible, Puppet, and CloudFormation can automate not just the deployment of your application but the monitoring of it as well.

Monitoring tools typically have system agents that can be installed on your infrastructure to begin streaming metrics into their service. In the case of applications built on SaaS platforms, there are convenient integrations that plug into well-known ecosystems. For example, Loggly streams and centralizes logs as metrics, and supports dozens of out-of-box systems, including the Amazon Cloudwatch and Heroku PaaS platforms.

5. Use Alerts on Your Key Metrics

Monitoring solutions like Loggly can alert you to changes in your SLIs over time, such as your error rate. They can help you visually identify the types of errors that occur and when they start. This will help you identify root causes and fix problems faster, minimizing the impact on user experience.

Loggly chart of application errors split by errorCode.

Custom alerts can be created from saved log searches, which act as key metrics of your application’s performance. Loggly even lets you integrate alerts with incident management systems like PagerDuty and OpsGenie.
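The idea behind such an alert can be sketched in a few lines: measure the error rate over a window of events and fire when it crosses a threshold. This is a toy illustration of the concept, not how Loggly implements alerting:

```python
def error_rate(log_lines):
    """Fraction of log lines in the window that report a server error."""
    if not log_lines:
        return 0.0
    errors = sum(1 for line in log_lines if "status=500" in line)
    return errors / len(log_lines)

def should_alert(log_lines, threshold=0.05):
    # Fire when more than 5% of requests in the window failed.
    return error_rate(log_lines) > threshold

window = ["status=200"] * 90 + ["status=500"] * 10
print(error_rate(window), should_alert(window))  # -> 0.1 True
```

A real alerting pipeline would evaluate the saved search on a schedule and route the notification to email, Slack, or an incident management system.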

Adding an alert from a Syslog error log search in Loggly.

In conclusion, monitoring your SaaS service’s performance is very important because it significantly impacts your business’s bottom line. This monitoring has to be planned for, applied early on, and instrumented for all stages of the SDLC.

Additionally, we explained how and why correct logging is one of the best sources of key metrics for measuring your monitoring goals during development and production of your SaaS service. Proper logging on an easy-to-use platform such as Loggly will also help your business harness invaluable intel in real time. You can leverage these streams of information to tune your app, improve your service, and discover new revenue models.

Sign up for a free 14-day trial of SolarWinds Loggly to start doing logging right today, and take your SaaS business to the next level of performance control and business intelligence.


Let’s dream for a while—imagine your databases. All of them are running fast and smooth. There are no critical issues, no warnings. All requests are handled immediately and the response time is practically immeasurable. Sounds like database nirvana, doesn’t it? Now, let’s face reality. You’ve resolved all the critical issues in the database, but people still report slowdowns. Everything looks good at first glance, but your sixth sense tells you something bad is happening under the surface. You could start shooting in the dark and hope to hit the target, or you could gather more information about what’s going on inside the database and make a single, surgically precise cut to solve the problem.

We’ve got good news for you. SolarWinds has a new tool called SQL Plan Warnings. For the first time, you can inspect the list of queries that have warnings without spending hours on manual and labor-intensive work. Oh, and we almost forgot to mention—this tool is available for you right now for free.

Why do we believe that the free SQL Plan Warnings tool can help you improve your databases? Well, SQL Server Optimizer often comes up with bad plans with warnings. That can cause increased resource consumption, increased wait time, and unnecessary end-user or customer angst. For these reasons, a database professional should look at it. But we don’t always have time or resources to do so.

SQL Plan Warnings free tool at a glance:

  • Gives you unique visibility into plan warnings that can be easily overlooked and can affect query performance
  • Sorts all warnings by consumed CPU time, elapsed time, or executions
  • Filters results by warning type or by specific keywords
  • Lets you investigate plan warnings, query text, or the complete query plan in a single click
  • Requires no installation; just download the tool and run it
  • Runs on Microsoft Windows and macOS X

And what can SQL Plan Warnings check for you?

  • Spill to TempDB – Sorts that are larger than estimated can spill to disk via TempDB. This can dramatically slow down queries. There are two similar warnings that fall into this category.
  • No join predicates – Query does not properly join tables/objects, which can cause Cartesian products and slow queries.
  • Implicit conversion – A column of data is being converted to another data type, which can cause a query to not use an index.
  • Missing indexes – SQL Server is telling us there is an index that may help performance.
  • Missing column statistics – If statistics are missing, it can lead to bad decisions by the optimizer.
  • Lookup warning – An index is being used, but it isn’t a covering index, so a visit back to the table is required to complete the query.

The free SQL Plan Warnings tool brings a fresh new feature to your database management capabilities and gives you another tool to improve query performance. Download it here today and be another step closer to our dream—everything in a database running fast and smooth with no critical issues and no warnings.

Product Manager

When development started on NGINX in 2002, the goal was to develop a web server which would be more performant than Apache had been up to that point. While NGINX may not offer all of the features available in Apache, its default configuration can handle approximately four times the number of requests per second while using significantly less memory.

While switching to a web server with better performance seems like a no-brainer, it’s important that you have a monitoring solution in place to ensure that your web server is performing optimally, and that users who are visiting the NGINX-hosted site receive the best possible experience. But how do we ensure that the experience is as performant as expected for all users?

Monitoring!

This article is meant to assist you in putting together a monitoring plan for your NGINX deployments. We’ll look at which metrics you should monitor, why they’re important, and how to put a monitoring plan in place using SolarWinds® AppOptics™.

Monitoring is a Priority

As engineers, we all understand and appreciate the value that monitoring provides. In the age of DevOps, however, when engineers are responsible for both the engineering and deployment of solutions into a production environment, monitoring is often relegated to the list of things we plan to do in the future. In order to be the best engineers we can be, monitoring should be the priority from day one.

Accurate and effective monitoring allows us to test the efficiency of our solutions, and help identify and troubleshoot inefficiencies and other potential problems. Once the solution has moved to requiring operational support, monitoring allows us to ensure that the application is running efficiently and alerting us when things go wrong. An effective monitoring plan should help to identify problems before they start, allowing engineers to resolve issues proactively, instead of being purely reactive.

Specific Metrics to Consider with NGINX

Before we can develop a monitoring plan, we need to know what metrics are available for monitoring, understand what they mean, and how we can use them. There are two distinct groups of metrics we should be concerned with—metrics related to the web server itself, and those related to the underlying infrastructure.

While a highly performant web server like NGINX may be able to handle more requests and traffic, it is vital that the machine hosting the web server has the necessary resources as well. Each metric represents a potential limit to the performance of your application. Ultimately, you want to ensure your web server and underlying infrastructure are able to operate efficiently without approaching those limits.

NGINX Web Server-specific Metrics

  • Current Connections
    Indicates the number of active and waiting client connections to the server. This may include actual users as well as automated tasks or bots.
  • Current Requests
    Each connection may be making one or more requests to the server. This number indicates the total count of requests coming in.
  • Connections Processed
    This shows the number of connections that have been accepted and handled by the server. Dropped connections can also be monitored.

Infrastructure-specific Metrics

  • CPU Usage
    An indication of the processing usage of the underlying machine. This should be measured as utilization across all cores, if using a multi-core machine.
  • Memory Usage
    Measurement of the memory currently in use on the machine.
  • Swap Usage
    Swap is what the host machine uses when it runs out of memory or if the memory region has been unused for a period of time. It is significantly slower, and is generally only used in an emergency. When an application begins using swap space, it’s usually an indicator that something is amiss.
  • Network Bandwidth
    Similar to traffic, this is a measurement of information flowing in and out of the machine. As with traffic, keep an eye on the units in which it is reported.
  • Disk Usage
    Even if the web server is not physically storing files on the host machine, space is required for logging, temporary files, and other supporting files.
  • Load
    Load is a performance metric which combines many of the other metrics into a simple number. A common rule of thumb is the load on the machine should be less than the number of processing cores.

Let’s look at how to configure monitoring on your instances with AppOptics, along with building a dashboard which will show each of those metrics.

Installing the AppOptics Agent on the Server

Before you start, you’ll need an account with AppOptics. If you don’t already have one, you can create a demo account, which will give you 14 days to try the service, free of charge.

The first thing to do to allow AppOptics to aggregate the metrics from the server is install the agent on all instances. To do this, you’ll need to reference your AppOptics API token when setting up the agent. Log in to your AppOptics account and navigate to the Infrastructure page.

Locate the Add Host button, and click on it. It should look similar to the image below.

Fig. 2. AppOptics Host Agent Installation

I used the Easy Install option when setting up the instances for this article. Ensure that Easy Install is selected, and select your Linux distribution. I used an Ubuntu image in the AWS Cloud, but this will work on almost any Linux server.

Note: Prior to installation of the agent, the bottom of the dialog below will not contain the success message.

Copy the command from the first box, and then SSH into the server and run the Easy Install script.

Fig. 3. Easy Install Script to Add AppOptics Agent to a Server

When the agent installs successfully, you should be presented with the following message on your terminal. The “Confirm successful installation” box on the AppOptics agent screen should look similar to the above, with a white on blue checkbox. You should also see “Agent connected.”

Fig. 4. Installing the AppOptics Agent on your NGINX Instance

Configuring the AppOptics Agent

With the agent installed, the next step is to configure NGINX to report metrics to the agent. Navigate back to the Infrastructure page, Integrations tab, and locate the NGINX plugin.

Note: Prior to enabling the integration, the “enabled” checkbox won’t be marked.

Fig. 5. NGINX Host Agent Plugin

Click on the plugin, and the following panel will appear. Follow the instructions in the panel, click Enable Plugin, and your metrics will start flowing from the server into AppOptics.

Fig. 6. NGINX Plugin Setup

When everything is configured, either click on the NGINX link in the panel’s Dashboard tab, or navigate to the Dashboards page directly, then select the NGINX link to view the default dashboard provided by AppOptics.

Working With the NGINX Dashboard

The default NGINX dashboard provided by AppOptics offers many metrics related to the performance of the web server that we discussed earlier and should look similar to the image below.

Fig. 8. Default AppOptics Dashboard

Now we need to add some additional metrics to get a full picture of the performance of our server. Unfortunately, you can’t make changes to the default dashboard, but it’s easy to create a copy and add metrics of your own. Start by clicking the Copy Dashboard button at the top of the screen to create a copy.

Create a name for your custom dashboard. For this example, I’m monitoring an application called Retwis, so I’m calling mine “NGINX-Retwis.” It’s also helpful to select the “Open dashboard on completion” option, so you don’t have to go looking for the dashboard after it’s created.

Let’s do some customization. First, we want to ensure that we’re only monitoring the instances we need to. We do this by filtering the chart or dashboard. You can find out more about how to set and filter these in the documentation for Dynamic Tags.

With our sources filtered, we can add some additional metrics. Let’s look at CPU Usage, Memory Usage, and Load. Click on the Plus button located at the bottom right of the dashboard. For CPU and Memory Usage, let’s add a Stacked chart. We’ll add one for each. Click on the Stacked icon.

Fig. 10. Create New Chart

In the Metrics search box, type “CPU” and hit enter. A selection of available metrics will appear below. I’m going to select system.cpu.utilization, but your selection may be different depending on the infrastructure you’re using. Select the checkbox next to the appropriate metric, then click Add Metrics to Chart. You can add multiple metrics to the chart by repeating the same process, but we’ll stick with one for now.

If you click on Chart Attributes, you can change the scale of the chart, adjust the Y-axis label, and even link it to another dashboard to show more detail for a specific metric. When you’re done, click on the green Save button, and you’ll be returned to your dashboard, with the new chart added. Repeat this for Memory Usage. I chose the “system.mem.used” metric.

For load, I’m going to use a Big Number Chart Type, and select the system.load.1_rel metric. When you’re done, your chart should look similar to what is shown below.

Fig. 11. Custom Dashboard to View NGINX Metrics

Pro tip: You can move charts around by hovering over a chart, clicking on the three dots that appear at the top of the chart, and dragging it around. Clicking on the menu icon on the top right of the chart will allow you to edit, delete, and choose other options related to the chart.

Beyond Monitoring

Once you have a monitoring plan in place and functioning, the next step is to determine baseline metrics for your application and set up alerts which will be triggered when significant deviations occur. Traffic is a useful baseline to determine and monitor. A significant reduction in traffic may indicate a problem that is preventing clients from accessing the service. A significant increase in traffic would indicate an increase in clients, and may require either an increase in the capacity of your environment (in the case of increased popularity), or, potentially, the deployment of defensive measures in response to a cyberattack.

Monitoring your NGINX server is critical as a customer-facing part of your infrastructure. You need to know immediately when there is a sudden change in traffic or connections that could impact the rest of your application or website. AppOptics provides an easy way to monitor your NGINX servers and it typically only takes a few minutes to get started. Learn more about AppOptics infrastructure monitoring and try it today with a free 14-day trial.

Product Manager

Kubernetes is a container orchestrator that provides a robust, dynamic environment for reliable applications. Maintaining a Kubernetes cluster requires proactive maintenance and monitoring to help prevent and diagnose issues that occur in clusters. While you can expect a typical Kubernetes cluster to be stable most of the time, like all software, issues can occur in production. Fortunately, Kubernetes insulates us against most of these issues with its ability to reschedule workloads and simply replace nodes when issues occur. But when cloud providers have availability zone outages, or in constrained environments such as bare metal, being able to debug and successfully resolve problems in our nodes is still an important skill to have.

In this article, we will use SolarWinds® AppOptics tracing to diagnose some latency issues with applications running on Kubernetes. AppOptics is a next-generation application performance monitoring (APM) and infrastructure monitoring solution. We’ll use its traced latency on requests to our Kubernetes pods to identify problems in the network stack.

The Kubernetes Networking Stack

Networking in Kubernetes has several components and can be complex for beginners. To be successful in debugging Kubernetes clusters, we need to understand all of the parts.

Pods are the scheduling primitives in Kubernetes. Each pod is composed of one or more containers that can optionally expose ports. However, because multiple pods may share the same host, their exposed ports could conflict on a single machine. To solve this problem, Kubernetes uses a network overlay. In this model, each pod gets its own virtual IP address, allowing different pods to listen on the same port on the same machine.

This diagram shows the relationship between pods and network overlays. Here we have two nodes, each running two pods, all connected to each other via a network overlay. The overlay assigns each of these pods an IP, and the pods can listen on the same port despite the conflicts they would have listening at the host level. Network traffic, shown by the arrow connecting pods B and C, is facilitated by the network overlay, and pods have no knowledge of the host’s networking stack.

Having pods on a virtualized network solves significant issues with providing dynamically scheduled networked workloads. However, these virtual IPs are randomly assigned, which presents a problem for any service or DNS record relying on pod IPs. Services fix this by providing a stable virtual IP frontend to these pods. Each service maintains a list of backend pods and load balances across them. The kube-proxy component routes requests for these service IPs from anywhere in the cluster.

This diagram differs slightly from the last one. Although pods may still be running on node 1, we omitted them from this diagram for clarity. We defined a service A that is exposed on port 80 on our hosts. When a request is made, it is accepted by the kube-proxy component and forwarded onto pod A1 or A2, which then handles the request. Although the service is exposed to the host, it is also given its own service IP on a separate CIDR from the pod network and can be accessed from within the cluster as well on that IP.

The network overlay in Kubernetes is a pluggable component. Any provider that implements the Container Networking Interface APIs can be used as a network overlay, and these overlay providers can be chosen based on the features and performance required. In most environments, you will see overlay networks ranging from the cloud provider’s own (such as on Google Kubernetes Engine or Amazon Elastic Kubernetes Service) to operator-managed solutions such as flannel or Calico. Calico is a network policy engine that happens to include a network overlay; alternatively, you can disable Calico’s built-in overlay and use it to enforce network policy on top of another overlay, such as a cloud provider’s or flannel. This policy enforcement provides pod and service isolation, a requirement of most secure environments.
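As an illustration of that isolation, policy engines such as Calico enforce the standard Kubernetes NetworkPolicy resource. A minimal example (the names and labels here are hypothetical) that admits traffic to an API tier only from a web tier might look like:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: apitier-allow-webtier
spec:
  podSelector:
    matchLabels:
      app: apitier
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: webtier
```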

Troubleshooting Application Latency Issues

Now that we have a basic understanding of how networking works in Kubernetes, let’s look at an example scenario. We’ll focus on an example where a networking latency issue led to a network blockage. We’ll show you how to identify the cause of the problem and fix it.

To demonstrate this example, we’ll start by setting up a simple two-tier application representing a typical microservice stack. This gives us network traffic inside a Kubernetes cluster, so we can introduce issues with it that we can later debug and fix. It is made up of a web component and an API component that do not have any known bugs and correctly serve traffic.

These applications are written in the Go Programming Language and are using the AppOptics agent for Go. If you’re not familiar with Go, the “main” function is the entry point of our application and is at the bottom of our web tier’s file. It listens on the base path (“/”) and calls out to our API tier using the URL defined on line 13. The response from our API tier is written to an HTML template and displayed to the user. For brevity’s sake, error handling, middleware, and other good Go development practices are omitted from this snippet.

package main

import (
        "context"
        "html/template"
        "io/ioutil"
        "log"
        "net/http"

        "github.com/appoptics/appoptics-apm-go/v1/ao"
)

const url = "http://apitier.default.svc.cluster.local"

func handler(w http.ResponseWriter, r *http.Request) {
        const tpl = `
<html>
  <head>
    <meta charset="UTF-8">
    <title>My Application</title>
  </head>
  <body>
    <h1>{{.Body}}</h1>
  </body>
</html>`

        // Start a trace for this request and attach it to a context.
        t, w, r := ao.TraceFromHTTPRequestResponse("webtier", w, r)
        defer t.End()
        ctx := ao.NewContext(context.Background(), t)

        httpClient := &http.Client{}
        httpReq, _ := http.NewRequest("GET", url, nil)

        // Trace the outbound call to the API tier as a client span.
        l := ao.BeginHTTPClientSpan(ctx, httpReq)
        resp, err := httpClient.Do(httpReq)
        l.AddHTTPResponse(resp, err)
        l.End()
        if err != nil {
                log.Println(err)
                http.Error(w, "API tier unavailable", http.StatusBadGateway)
                return
        }
        defer resp.Body.Close()

        body, _ := ioutil.ReadAll(resp.Body)
        template, _ := template.New("homepage").Parse(tpl)

        data := struct {
                Body string
        }{
                Body: string(body),
        }

        template.Execute(w, data)
}

func main() {
        http.HandleFunc("/", ao.HTTPHandler(handler))
        http.ListenAndServe(":8800", nil)
}

Our API tier code is simple. Much like the web tier, it serves requests from the base path (“/”), but only returns a string of text. As part of this code, we propagate the context of any incoming traces to this application under the name “apitier”. This sets our application up for end-to-end distributed tracing.

package main

import (
        "context"
        "fmt"
        "net/http"
        "time"

        "github.com/appoptics/appoptics-apm-go/v1/ao"
)

// query stands in for a fast backend call, such as a database query.
func query() {
        time.Sleep(2 * time.Millisecond)
}

func handler(w http.ResponseWriter, r *http.Request) {
        t, w, r := ao.TraceFromHTTPRequestResponse("apitier", w, r)
        defer t.End()

        ctx := ao.NewContext(context.Background(), t)
        parentSpan, _ := ao.BeginSpan(ctx, "api-handler")
        defer parentSpan.End()

        span := parentSpan.BeginSpan("fast-query")
        query()
        span.End()

        fmt.Fprintf(w, "Hello, from the API tier!")
}

func main() {
        http.HandleFunc("/", ao.HTTPHandler(handler))
        http.ListenAndServe(":8801", nil)
}

When deployed on Kubernetes and accessed from the command line, these services look like this:


This application is being served a steady stream of traffic. Because the AppOptics APM agent is turned on and tracing is being used, we can see a breakdown of these requests and the time spent in each component, including distributed services. From the web tier component’s APM page, we can see the following graph:

This view is telling us the majority of our time is spent in the API tier, with a brief amount of time spent in the web tier serving this traffic. However, we have an extra “remote calls” section. This section represents untraced time between the API tier and web tier. For a Kubernetes cluster, this includes our kube-proxy, network overlay, or any proxies that have not had tracing added to them. It accounts for 1.65ms of a normal request, which for this environment is an insignificant overhead, so we can use it as our “healthy” benchmark for this cluster.

Now we will simulate a failure in the network overlay layer. Using a tool satirically named Comcast, we can simulate adverse network conditions. Under the hood, this tool uses iptables and the traffic control (tc) utility, standard Linux utilities for managing network environments. Our test cluster is using Calico as the network overlay and exposes a tunl0 interface. This is a custom, local tunnel Calico uses to bridge all network traffic, both to implement the network overlay between machines and to enforce policy. We only want to simulate a failure at the network overlay, so we use tunl0 as the device, and inject 500ms of latency with a maximum bandwidth of 50kbps and minor packet loss.

Our continuous traffic testing is still running. After a few minutes of new requests, our AppOptics APM graph looks very different:

While our application time and traced API tier time remained consistent, our remote calls time jumped significantly. We’re now spending 6-20 seconds of each request just traversing the network stack. Thanks to tracing, it’s clear that this application is operating as expected and the problem is in another part of our stack. We also have the AppOptics agent for Kubernetes and the CloudWatch integration running on this cluster, so we can look at the host metrics to find more symptoms of the problem:

Our network graph suddenly starts reporting much more traffic, and then stops reporting entirely. This could be a symptom of our network stack handling a great deal of requests into our host on the standard interface (eth0), queueing at the Calico tunnel, and then overflowing and preventing any more network traffic from accessing the machine until existing requests time out. This aggregate view of all traffic moving inside of our host is deceptive since it’s counting every byte passing through internal as well as external interfaces, which explains our extra traffic.

We still have the problem where the agent stops reporting. Because the default pods use the network overlay, the agent reporting back to AppOptics suffers from the same problem our API tier is having. As part of recovering this application and helping prevent this issue from happening again, we would move the AppOptics agent off of the network overlay and use the host network.

Even with our host agent either delayed or not reporting at all, we still have the AppOptics CloudWatch metrics for this host turned on, and can get the AWS view of the networking stack on this machine:

In this graph we see that at the start of the event traffic becomes choppy, ranging from the roughly 50Kb/s out of normal operation all the way up to 250Kb/s. This could be our bandwidth limit and packet loss settings causing bursts of outbound traffic. In any case, there’s a massive discrepancy between the networking inside our Kubernetes cluster and outside of it, which points us to problems with our overlay stack. From here, we would take the node out of service, let Kubernetes automatically reschedule our workloads onto other hosts, and proceed with host-level network debugging: examining our iptables settings, checking flow logs, and verifying the health of our overlay components.

Once we remove these rules and clear the network issue, our traffic quickly returns to normal.

The latency drops to such a small value that it’s no longer visible on the graph after 8:05:

Next Steps

Hopefully now you are much more familiar with how the networking stack works on Kubernetes and how to identify problems. A monitoring solution like AppOptics APM can help you monitor the availability of service and troubleshoot problems faster. A small amount of tracing in your application goes a long way in identifying components of your systems that are having latency issues.

Level 10

Version 1.1 of the venerable HTTP protocol powered the web for 18 years.

Since then, websites have evolved from static, text-driven documents into interactive, media-rich applications. The fact that the underlying protocol remained unchanged throughout this time just goes to show how versatile and capable it is. But as the web grew bigger, its limitations became more obvious.

We needed a replacement, and we needed it soon.

Enter HTTP/2. Published in early 2015, HTTP/2 optimizes website connections without changing the existing application semantics. This means you can take advantage of HTTP/2’s features such as improved performance, updated error handling, reduced latency, and lower overhead without changing your web applications.

Today nearly 84% of modern browsers and 27% of all websites support HTTP/2, and those numbers are gradually increasing.

How is HTTP/2 Different from HTTP/1.1?

HTTP/2’s biggest changes impact the way data is formatted and transported between clients and servers.

Binary Data Format

HTTP/2 encapsulates data using a binary protocol. With HTTP/1.1, messages are transmitted in plaintext. This makes requests and responses easy to format and even read using a packet analysis tool, but results in increased size due to unnecessary whitespace and inefficient compression.

The benefit of a binary protocol is it allows for more compact, more easily compressible, and less error-prone transmissions.

Persistent TCP Connections

In early versions of HTTP, a new TCP connection had to be created for each request and response. HTTP/1.1 introduced persistent connections, allowing multiple requests and responses over a single connection. The problem was that messages were exchanged sequentially, with web servers refusing to accept new requests until previous requests were fulfilled.

HTTP/2 simplifies this by allowing for multiple simultaneous downloads over a single TCP connection. After a connection is established, clients can send new requests while receiving responses to earlier requests. Not only does this reduce the latency in establishing new connections, but servers no longer need to maintain multiple connections to the same clients.

Multiplexing

Persistent TCP connections paved the way for multiplexed transfers. With HTTP/2, multiple resources can be transferred simultaneously; clients no longer need to wait for earlier resources to finish downloading before the next one begins. Under HTTP/1.1, website developers used workarounds such as domain sharding to “trick” browsers into opening multiple connections to a single host, at the cost of many extra TCP connections. HTTP/2 makes this entire practice obsolete.

Header Compression and Reuse

In HTTP/1.1, headers are sent uncompressed and repeated for each request. As the number of requests grows, so does the volume of duplicate header information. HTTP/2 eliminates redundant headers and compresses the remaining headers to drastically decrease the amount of data repeated during a session.

Server Push

Instead of waiting for clients to request resources, servers can now push resources. This allows websites to preemptively send content to users, minimizing wait times.

Does My Site Already Support HTTP/2?

Several major web servers and content delivery networks (CDNs) support HTTP/2. The fastest way to check if your website supports HTTP/2 is to navigate to the website in your browser and open Developer Tools. In Firefox and Chrome, press Ctrl-Shift-I or the F12 key and click the Network tab. Reload the page to populate the table with a list of responses. Right-click the column names in the table and enable the “Protocol” header. This column will show HTTP/2.0 in Firefox or h2 in Chrome if HTTP/2 is supported, or HTTP/1.1 if it’s not.

What is HTTP/2, and Will It Really Make Your Site Faster?
The network tab after loading 8bitbuddhism.com©. The website fully supports HTTP/2 as shown in the Protocol column.

Alternatively, KeyCDN provides a web-based HTTP/2 test tool. Enter the URL of the website you want to test, and the tool will report back on whether it supports HTTP/2.

How Do I Enable HTTP/2 on Nginx?

As of version 1.9.5, Nginx fully supports HTTP/2 via the ngx_http_v2 module. This module comes included in the pre-built packages for Linux and Windows. When building Nginx from source, you will need to enable this module by adding --with-http_v2_module as a configuration parameter.

You can enable HTTP/2 for individual server blocks. To do so, add http2 to the listen directive. For example, a simple Nginx configuration would look like this:

# nginx.conf
server {
    listen 443 ssl http2;
    server_name mywebsite.com;

    root /var/www/html/mywebsite;
}

Although HTTP/2 was originally intended to require encryption, the final specification doesn’t mandate it; note, however, that major browsers only support HTTP/2 over TLS connections. To apply the changes, reload the Nginx service using:

$ sudo service nginx reload

or by invoking the Nginx CLI using:

$ sudo /usr/sbin/nginx -s reload

Benchmarking HTTP/2

To measure the speed difference between HTTP/2 and HTTP/1.1, we ran a performance test on a WordPress site with and without HTTP/2 enabled. The site was hosted on a Google Compute Engine instance with 1 virtual CPU and 1.7 GB of memory. We installed WordPress 4.9.6 in Ubuntu 16.04.4 using PHP 7.0.30, MySQL 5.7.22, and Nginx 1.10.3.

To perform the test, we created a recurring page speed check in SolarWinds® Pingdom® to contact the site every 30 minutes. After four measurements, we restarted the Nginx server with HTTP/2 enabled and repeated the process. We then dropped the first measurement for each test (to allow Nginx to warm up), averaged the results, and took a screenshot of the final test’s Timeline.

The metrics we measured were:
  • Page size: the total combined size of all downloaded resources.
  • Load time: the time until the page finished loading completely.

Results Using HTTP/1.1

Timeline using HTTP/1.1

Results Using HTTP/2

Timeline using HTTP/2

And the Winner Is…

With just a simple change to the server configuration, the website performs noticeably better over HTTP/2 than HTTP/1.1. The page load time dropped by over 13% thanks to fewer TCP connections, resulting in a lower time to first byte. As a result of only using two TCP connections instead of four, we also reduced the time spent performing TLS handshakes. There was also a minor drop in overall file size due to HTTP/2’s more efficient binary data format.

Conclusion

HTTP/2 is already proving to be a worthy successor to HTTP/1.1. A large number of projects have implemented it and, with the exception of Opera Mini and UC Browser for Android, mainstream browsers already support it. Whether it can handle the next 18 years of web evolution remains to be seen, but for now, it’s given the web a much-needed performance boost.

You can try this same test on your own website using the Pingdom page speed check. Running the page speed check will show you the size and load time of every element. With this data you can tune and optimize your website, and track changes over time.

Level 9

DevOps engineers wishing to troubleshoot Kubernetes applications can turn to log messages to pinpoint the cause of errors and their impact on the rest of the cluster. When troubleshooting a running application, engineers need real-time access to logs generated across multiple components.

Collecting live streaming log data lets engineers:

  • Review container and pod activity
  • Monitor the result of actions, such as creating or modifying a deployment
  • Understand the interactions between containers, pods, and Kubernetes
  • Monitor ingress resources and requests
  • Troubleshoot errors and watch for new or recurring problems

The challenge that engineers face is accessing comprehensive, live streams of Kubernetes log data. While some solutions exist today, these are limited in their ability to live tail logs or tail multiple logs. In this article, we’ll present an all-in-one solution for live tailing your Kubernetes logs, no matter the size or complexity of your cluster.

The Limitations of Current Logging Solutions

When interacting with Kubernetes logs, engineers frequently use two solutions: the Kubernetes command line interface (CLI), or the Elastic Stack.

The Kubernetes CLI (kubectl) is an interactive tool for managing Kubernetes clusters. The default logging tool is the command (kubectl logs) for retrieving logs from a specific pod or container. Running this command with the --follow flag streams logs from the specified resource, allowing you to live tail its logs from your terminal.

For example, let’s deploy an Nginx pod under the deployment name papertrail-demo. Using kubectl logs --follow [Pod name], we can view logs from the pod in real time:

$ kubectl logs --follow papertrail-demo-76bf4969df-9gs5w
10.1.1.1 - - [04/Jan/2019:22:42:11 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0" "-"

The main limitation of kubectl logs is that it only supports individual Pods. If we deployed two Nginx pod replicas instead of one, we would need to tail each pod separately. For large deployments, this could involve dozens or hundreds of separate kubectl logs instances.
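A partial workaround, assuming your pods share a common label such as app=nginx (a hypothetical label here), is kubectl’s label selector; it still streams to a single terminal, though, and caps the number of concurrent streams:

```shell
# Tail every pod matching a label instead of naming each pod individually.
# --max-log-requests raises kubectl's default cap of 5 concurrent streams.
kubectl logs --follow -l app=nginx --max-log-requests 10
```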

The Elastic Stack (previously the ELK Stack) is a popular open-source log management solution. Although it can ingest and display log data through a web-based user interface, it unfortunately doesn’t support live tailing logs.

What is Papertrail, and How Does It Help?

SolarWinds® Papertrail is a cloud-hosted log management solution that lets you live tail your logs from a central location. Using Papertrail, you can view real-time log events from your entire Kubernetes cluster in a single browser window.

When a log event is sent from Kubernetes to Papertrail, Papertrail records the log’s contents along with its timestamp and origin pod. You can view these logs in a continuous stream in your browser using the Papertrail Event Viewer, as well as the Papertrail CLI client or Papertrail HTTP API. Papertrail shows all logs by default, but you can limit these to a specific pod, node, or deployment using a flexible search syntax.

For example, let’s increase the number of replicas in our Nginx deployment to three. If we used kubectl logs -f, we would need to run it three times: once for each pod. With Papertrail, we can open the Papertrail Event Viewer and create a search that filters the stream to logs originating from the papertrail-demo deployment. Not only does this show us output from each pod in the deployment, but also Kubernetes cluster activity related to each pod:


Filtering a live stream of Kubernetes logs using Papertrail.

Sending Logs from Kubernetes to Papertrail

The most effective way to send logs from Kubernetes to Papertrail is via a DaemonSet. DaemonSets run a single instance of a pod on each node in the cluster. The pod used in the DaemonSet automatically collects and forwards log events from other pods, Kubernetes, and the node itself to Papertrail.

Papertrail provides two DaemonSets:

  • The Fluentd DaemonSet uses Fluentd to collect logs from containers, pods, Kubernetes, and nodes. This is the preferred method for logging a cluster.
  • The Logspout DaemonSet uses logspout to monitor the Docker log stream. This option is limited to log output from containers, not Kubernetes or nodes.

We’ll demonstrate using the Fluentd DaemonSet. From a computer with kubectl installed, download fluentd-daemonset-papertrail.yaml and open it in a text editor. Change the values of FLUENT_PAPERTRAIL_HOST and FLUENT_PAPERTRAIL_PORT to match your Papertrail log destination. Optionally, you can name your instance by changing FLUENT_HOSTNAME. You can also change the Kubernetes namespace that the DaemonSet runs in by changing the namespace parameter. When you are done, deploy the DaemonSet by running:

$ kubectl create -f fluentd-daemonset-papertrail.yaml
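Before running the command above, the environment section you edited might look roughly like this excerpt (the host, port, and hostname values are placeholders, not real destinations):

```yaml
# Excerpt from fluentd-daemonset-papertrail.yaml (values are placeholders)
env:
  - name: FLUENT_PAPERTRAIL_HOST
    value: "logsN.papertrailapp.com"  # your Papertrail log destination host
  - name: FLUENT_PAPERTRAIL_PORT
    value: "12345"                    # your log destination port
  - name: FLUENT_HOSTNAME
    value: "k8s-demo-cluster"         # optional: a name for this instance
```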

In a few moments, logs will start to appear in Papertrail:


Live feed of Kubernetes logs in Papertrail.

Best Practices for Live Tailing Kubernetes Logs

To get the most out of your logs, make sure you’re following these best practices.

Log All Applications to STDOUT and STDERR

Kubernetes collects logs from Pods by monitoring their STDOUT and STDERR streams. If your application logs to another location, such as a file or remote service, Kubernetes won’t be able to detect it, and neither will your Papertrail DaemonSet. When deploying an application, make sure to route its logs to the standard output stream.
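If a third-party application insists on logging to a file, one common workaround (the official nginx image does this for its access and error logs) is to symlink the file to the container’s standard streams. Here’s a minimal sketch using a temporary directory in place of a real log path:

```shell
# Route file-based logs to the container's stdout/stderr via symlinks.
logdir=$(mktemp -d)                  # stand-in for e.g. /var/log/myapp
ln -sf /dev/stdout "$logdir/app.log"
ln -sf /dev/stderr "$logdir/error.log"
readlink "$logdir/app.log"           # → /dev/stdout
```

Anything the application writes to app.log now flows to STDOUT, where Kubernetes (and your Papertrail DaemonSet) can pick it up.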

Use the Fluentd DaemonSet

The Logspout DaemonSet is limited to logging containers. The Fluentd DaemonSet, however, will log your containers, pods, and nodes. In addition to logging more resources, Fluentd also logs valuable information such as Pod names, Pod controller activity, and Pod scheduling activity.

Open Papertrail Next to Your Terminal

When you’re working on Kubernetes apps and want to debug problems with Pods, have a browser window with Papertrail open either beside or behind your terminal window. This way you can see the results of your actions after you execute them. This also saves you from having to tail manually in your terminal.

Group Logs to Make Them Easier to Find

Kubernetes pods (and containers in general) are ephemeral and often have randomly generated names. Unless you specify fixed names, it can be hard to keep track of which pods or containers to filter on. A solution is to use log groups, which let you group logs from a specific application or development team together. This helps you find the logs you need and hide everything else.

Save Searches in Papertrail

Papertrail lets you save your searches for creating custom Event Viewer sessions and alerts. You can reopen previously created live tail sessions, share your sessions with team members, or receive an instant notification when new log events arrive in the stream.

Conclusion

Kubernetes logs help DevOps teams identify deployment problems and improve the reliability of their applications. Live tailing enables faster troubleshooting by helping developers collect, view, and analyze these logs in real time. To get started with SolarWinds Papertrail, sign up and start logging your Kubernetes cluster in a matter of minutes.


Jenkins X (JX) is an exciting new Continuous Integration and Continuous Deployment (CI/CD) tool for Kubernetes users. It hides the complexities of operating Kubernetes by giving developers a simpler experience to build and deploy their code. You can think of it as creating a serverless-like environment in Kubernetes. As a developer, you don’t need to worry about all the details of setting up environments, creating a CI/CD pipeline, or connecting GitHub to your CI pipeline. All of this and much more is handled by JX. In this article, we’ll introduce you to JX, show you how to use it, and how to monitor your builds and production deployments.

What is Jenkins X?

JX was created by James Strachan (creator of Groovy, Apache Camel, and now JX) and was first announced in March 2018. It’s designed from the ground up to be a cloud-native, Kubernetes-only application that not only supports CI/CD, but also makes working with Kubernetes as simple as possible. With one command you can create a Kubernetes cluster, install all the tools you’ll need to manage your application, create build and deployment pipelines, and deploy your application to various environments.

Jenkins is described as an “extensible automation server” that can be configured, via plugins, to be a Continuous Integration Server, a Continuous Deployment hub, or a tool to automate just about any software task. JX provides a specific configuration of Jenkins, meaning you don’t need to know which plugins are required to stand up a CI/CD pipeline. It also deploys numerous applications to Kubernetes to support building your Docker container, storing the container in a Docker registry, and deploying it to Kubernetes.

Jenkins pipeline builds are driven by adding a Jenkinsfile to your project. JX automates this for you. JX can create new projects (and the required Jenkinsfile) for you or import your existing project and create a Jenkinsfile if you don’t already have one. In short, you don’t need to know anything about Jenkins or Kubernetes to get started with JX. JX will do it all for you.

Overview of How JX Works

JX is designed to take the guesswork and trial and error out of creating a fully functional CI/CD pipeline in Kubernetes. To deliver a tailored developer experience, JX had to decide, from the plethora of tools available, which Kubernetes technologies to use. In many ways, JX is like a Linux distribution, but for Kubernetes: a curated set of tools that together create a smooth and seamless developer experience.

To make the transition to Kubernetes simpler, the command line tool jx can drive most of your interactions with Kubernetes. This means you don’t need to know how to use kubectl right away; instead you can slowly adopt kubectl as you become more comfortable in Kubernetes. If you are an experienced Kubernetes user, you’ll use jx for interacting with JX (CI/CD, build logs, and so on) and continue to use kubectl for other tasks.

When you create or import a project using the jx command line tool, JX will detect your project type and create the appropriate Jenkinsfile for you (if it doesn’t already exist), define the required Kubernetes resources for your project (like Helm charts), add your project to GitHub and create the necessary webhooks for your application, build your application in Jenkins, and if all tests pass, deploy your application to a staging environment. You now have a fully integrated Kubernetes application with a CI/CD pipeline ready to go.

Your interaction with JX is driven by a few jx commands to set up an environment, create or import an application, and monitor the state of your build pipelines. The developer workflow is covered in the next section. Generally speaking, once set up, you don’t need to interact with JX that much; it works quietly in the background, providing you CI and CD functionality.

Install Jenkins X

To get started using JX, install the jx binary. For Mac OS, you can use brew:

brew tap jenkins-x/jx
brew install jx

Note: When I first tried to create a cluster using JX, it installed kops for me. However, the first time jx tried to use kops, it failed because kops wasn’t on my path. To address this, install kops as well:

brew install kops

Create a Kubernetes Cluster

JX supports most major cloud environments: Google GKE, Azure AKS, Amazon EKS, minikube, and many others. JX has a great video on installing JX on GKE. Here, I’m going to show you how to install JX in Amazon without EKS. Creating a Kubernetes cluster from scratch is very easy:

jx create cluster aws

Since I wasn’t using JX for a production application, I ran into a few gotchas during my install:

  1. When prompted with, “No existing ingress controller found in the kube-system namespace, shall we install one?” say yes.
  2. Assuming you are only trying out JX, when prompted with, “Would you like to register a wildcard DNS ALIAS to point at this ELB address?” say no.
  3. When prompted with, “Would you like wait and resolve this address to an IP address and use it for the domain?” say yes.
  4. When prompted with, “If you don’t have a wildcard DNS setup then set up a new CNAME and point it at: XX.XX.XX.XX.nip.io. Then, use the DNS domain in the next input” accept the default.

The image below shows the EC2 instances JX created for your Kubernetes cluster (the master is an m3.medium instance and the nodes are t2.medium instances):

LG IntroJenkinsX 1
AWS EC2 Instances. © 2018 Amazon Web Services, Inc. or its affiliates. All rights reserved.

When you are ready to remove the cluster you just created, you can use this command (JX currently does not provide a delete cluster command):

kops delete cluster

Here’s the full kops command to remove the cluster you just created (you’ll want to use the cluster name and S3 bucket for all kops commands):

kops delete cluster --name aws1.cluster.k8s.local \
  --state=s3://kops-state-xxxxxx-ff41cdfa-ede6-11e8-xx6-acde480xxxx

To add Loggly integration to your Kubernetes cluster, you can follow the steps outlined here.

Create an Application

Now that JX is up and running, you are ready to create an application. The quickest way to do this is with the JX quickstart. In addition to the quickstart applications that come with JX, you can also create your own.

To get started, run jx create quickstart and pick the spring-boot-http-gradle quickstart (see the screenshot below for more details):

jx create quickstart

LG IntroJenkinsX 2
Creating a Kubernetes cluster using jx create cluster © 2018 Jenkins Project

Note: During the install process, I did run into one issue. When prompted with, “Which organization do you want to use?” make sure you choose a GitHub Org and not your personal account. The first time I ran this, I tried my personal account (which has an org associated with it) and jx create quickstart failed. When I reran it, I chose my org ripcitysoftware and everything worked as expected.

Once your application has been created, it will automatically be deployed to the staging environment for you. One thing I really like about JX is how explicit everything is. There isn’t any confusion between temporary and permanent environments because the environment name is embedded into the application URL (http://spring-boot-http-gradle.jx-staging.xx.xx.xx.xx.nip.io/).

The Spring Boot quickstart application provides you with one rest endpoint:

LG IntroJenkinsX 3
Example Spring Boot HTTP © 2018 Google, Inc

Developer Workflow

JX has been designed to support a trunk-based development model promoted by DevOps leaders like Jez Humble and Gene Kim. JX is heavily influenced by the book Accelerate (you can find more here), and as such it provides an opinionated developer workflow approach. Trunk-based development means releases are built off of trunk (master in git). Research has shown that teams using trunk-based development are more productive than those using long-lived feature branches. Instead of long-lived feature branches, teams create branches that live only a few hours, making a few small changes.

Here’s a short overview of trunk-based development as supported by JX. To implement a code change or fix a bug, you create a branch in your project, write tests, and make code changes as needed. (These changes should only take a couple of hours to implement, which means your code change is small.) Push your branch to GitHub and open a Pull Request. Now JX will take over. The webhook installed by JX when it imported your project will trigger a CI build in Jenkins. If the CI build succeeds, Jenkins will notify GitHub the build was successful, and you can now merge your PR into master. Once the PR is merged, Jenkins will create a released version of your application (released from the trunk branch) and deploy it (CD) to your staging environment. When you are ready to promote your application from stage to production, you’ll use the jx promote command.

The development workflow is expected to be:

  1. In git, create a branch to work in. After you’ve made your code changes, commit them and then push your branch to your remote git repository.
  2. Open a Pull Request in your remote git repo. This will trigger a build in Jenkins. If the build is successful, JX will create a preview environment for your PR so you can review and test your changes. To trigger the promotion of your code from Development to Staging, merge your PR.
  3. By default, JX will automatically promote your code to Stage. To promote your code to Production, you’ll need to run this command manually: jx promote app-name --version x.y.z --env production
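Assuming the quickstart application from earlier, one pass through this workflow might look like the following sketch (the branch name and version number are illustrative):

```shell
# 1. Create a short-lived branch, commit, and push
git checkout -b fix-greeting
# ...edit code and run tests locally...
git commit -am "Fix greeting endpoint"
git push origin fix-greeting

# 2. Open a Pull Request on GitHub. JX builds the PR and, if the build
#    passes, deploys it to a preview environment. Merging the PR
#    triggers a release and automatic promotion to Staging.

# 3. Manually promote the released version to Production
jx promote spring-boot-http-gradle --version 0.0.2 --env production
```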

Monitoring Jenkins X

Monitoring the status of your builds gives you insight into how development is progressing. It will also help you keep track of how often you are deploying apps to various environments.

JX provides you multiple ways to track the status of a build. JX configures Jenkins to trigger a build when a PR is opened or updated. The first place to look for the status of your build is in GitHub itself. Here is a build in GitHub that resulted in a failure. You can clearly see the CI step has failed:

LG IntroJenkinsX 4
GitHub PR Review Web Page. © 2018 GitHub Inc. All rights reserved.

The next way to check on the status of your build is in Jenkins itself. You can navigate to Jenkins in your browser or, from GitHub, you can click the “Details” link to the right of “This commit cannot be built.” Here is the Jenkins UI. You will notice Jenkins isn’t very subtle when a build fails:

LG IntroJenkinsX 5
Jenkins Blue Ocean failed build web page. © 2018 Jenkins Project

A third way to track the status of your build is from the command line, using the jx get activity command:

LG IntroJenkinsX 6
iTerm – output from jx get activity command © 2018 Jenkins Project

If you want to see the low-level details of what Jenkins is logging, you’ll need to look at the container Jenkins is running in. Jenkins is running in Kubernetes like any other application. It’s deployed as a pod and can be found using the kubectl command:

$ kubectl get pods
NAME                      READY     STATUS    RESTARTS   AGE
jenkins-fc467c5f9-dlg2p   1/1       Running   0          2d

Now that you have the name of the Pod, you can access the log directly using this command:

$ kubectl logs -f jenkins-fc467c5f9-dlg2p

LG IntroJenkinsX 7
iTerm – output from kubectl logs command © 2018 Jenkins Project

Finally, if you’d like to get the build output log, the log that’s shown in the Jenkins UI, you can use the command below. This is the raw build log that Jenkins creates when it’s building your application. When you have a failed build, you can use this output to determine why the build failed. You’ll find your test failures here along with other errors like failures in pushing your artifacts to a registry. The output below is not logged to the container (and therefore not accessible by Loggly):

$ jx get build log ripcitysoftware/spring-boot-http-gradle/master
view the log at: http://jenkins.jx.xx.xx.xxx.xxx.nip.io/job/ripcitysoftware/job/spring-boot-http-gradle/job/master/2/console
tailing the log of ripcitysoftware/spring-boot-http-gradle/master #2
Push event to branch master
Connecting to https://api.github.com using macInfinity/****** (API Token for accessing https://github.com Git service inside pipelines)

Monitoring in Loggly

One of the principles of a microservice architecture, as described by Sam Newman in Building Microservices, is being Highly Observable. Specifically, Sam suggests that you aggregate all your logs. A great tool for this is SolarWinds® Loggly. Loggly is designed to aggregate all of your logs into one central location. By centralizing your logs, you get a holistic view of your systems. Deployments can trigger a change in the application that can generate errors or lead to instability. When you’re troubleshooting a production issue, one of the first things you want to know is whether something changed. Being able to track the deployments in your logs will let you backtrack deployments that may have caused bugs.

To monitor deployments, we need to know what’s logged when a deployment succeeds or fails. This is the message Jenkins logs when a build has completed:

INFO: ripcitysoftware/spring-boot-http-gradle/master #6 completed: SUCCESS

From the above message, we get a few pieces of information: the full build name, which contains the project name ripcitysoftware/spring-boot-http-gradle and the branch master; the build number #6; and finally the build status SUCCESS.

The metrics you should monitor are:

  • Build status – Whether a build was a success or failure
  • The project name – Which project is being built
  • The build number – Tracks PRs and releases

By tracking the build status, you can see how often builds are succeeding or failing. The project name and build number tell you how many PRs have been opened (look for “PR” in the project name) and how often a release is created (look for “master” in the name).

To track all of the above fields, create one Derived Field in Loggly called jxRelease. Each capture group (the text inside of the parentheses) defines a unique Derived Field in Loggly. Here is the regex you’ll need:

^INFO: (.*)\/.*(master|PR.*) #(.*\d) completed: ([A-Z]+)$
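You can sanity-check a pattern like this locally before saving it in Loggly. Here’s a rough shell equivalent using sed (sed’s extended-regex syntax differs slightly from Loggly’s rule engine, and the field labels are just for illustration):

```shell
# Sample Jenkins build-completion message from above
msg='INFO: ripcitysoftware/spring-boot-http-gradle/master #6 completed: SUCCESS'

# Extract the same four capture groups the Derived Field relies on
echo "$msg" | sed -E \
  's,^INFO: (.*)/(master|PR.*) #([0-9]+) completed: ([A-Z]+)$,project=\1 branch=\2 build=\3 status=\4,'
# → project=ripcitysoftware/spring-boot-http-gradle branch=master build=6 status=SUCCESS
```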

Here’s the Jenkins build success log-message above as it appears in Loggly after we’ve created the Derived Field. You can see all the fields we are defining highlighted in yellow below the Rule editor:

LG IntroJenkinsX
Loggly – Derived Field editor web page.  © 2018 SolarWinds Worldwide, LLC. All rights reserved.

Please note that Derived Fields use past logs only in the designer tool; Loggly only adds Derived Fields to new log messages. This means if you’ve already sent an hour of Jenkins output to Loggly when you create the jxBuildXXX fields (as shown above), only new log messages will include these fields.

In the image below, you can see all the Derived Fields that have been parsed in the last 30 minutes. For jxBuildBranchName, there has been one build to stage, and it was successful, as indicated by the value SUCCESS. We also see that nine (9) builds have been pushed to stage, as indicated by the jxBuildNumber field.

LG IntroJenkinsX 9
Loggly Search Results web page.  © 2018 SolarWinds Worldwide, LLC. All rights reserved.

Now that these fields are parsed out of the logs, we can filter on them using the Field Explorer. Above, you can see that we have filtered on the master branch. This shows us each time the master branch has changed. When we are troubleshooting a production bug, we can now see the exact time the code changed. If the bug started after a deployment, then the root cause could be the code change. This helps us narrow down the root cause of the problem faster.

We can also track when master branch builds fail and fire an alert to notify our team on Slack or email. Theoretically, this should never happen, assuming we are properly testing the code. However, there could have been an integration problem that we missed, or a failure in the infrastructure. Setting an alert will notify us of these problems so we can fix them quickly.

Conclusion

JX is an exciting addition to Jenkins and Kubernetes alike. JX fills a gap that has existed since the rise of Kubernetes: how to assemble the correct tools within Kubernetes to get a smooth and automated CI/CD experience. In addition, JX helps break down the barrier of entry into Kubernetes and Jenkins for CI/CD. JX itself gives you multiple tools and commands to navigate system logs and track build pipelines. Adding Loggly integration to your JX environment is very straightforward. You can easily track the status of your builds and monitor your app’s progression from development to a preview environment, to a staging environment, and finally to production. When there is a critical production issue that you are troubleshooting, you can look at the deployment time to see if changes in the code caused the issue.


Are you an administrator who’s supporting a small environment, and haven’t yet had the time or budget to invest in a centralized IT monitoring tool? No doubt you are tired of coworkers showing up at your desk or calling about an outage you weren’t yet aware of. If an enterprise-class solution would be overkill, but you don’t have the budget to purchase a licensed solution, ipMonitor Free Edition might be able to bridge that gap.

ipMonitor Free Edition is a fully functional version of our ipMonitor solution for smaller environments.  It’s a standalone, free tool that helps you stay on top of what is going on with your critical network devices, servers, and applications—so you know what’s up, what’s down, and what’s not performing as expected. 

ipMonitor Free Edition at a Glance

  • Clear visibility of IT network device, server, and application status
  • Customizable alerting with optional automatic remediation
  • Simple deployment with our startup wizard and alerting recommendations
  • Lightweight installation and maintenance

ipMonitor Free Edition is an excellent starting point to more robust, centralized monitoring. It is designed for network and systems administrators with small environments or critical components they need to focus on, and can support up to 50 monitors. Monitors watch a specific aspect of a device, service, or process. Example monitors include: Ping, CPU, memory or disk usage, bandwidth, and response time.

Interested in giving it a try?  Download ipMonitor Free Edition today.  If you have any questions, head over to the ipMonitor product forum and start a discussion. 



Calling network engineers, network architects, and network defenders alike. We are happy to announce the arrival of the all-new SolarWinds® Flow Tool Bundle.

With this free tool, you can quickly distribute, test, and configure your flow traffic. Showcasing some of SolarWinds signature flow traffic analysis capabilities, the Flow Tool Bundle offers three handy, easy-to-install network traffic analysis tools: SolarWinds NetFlow Replicator, SolarWinds NetFlow Generator, and SolarWinds NetFlow Configurator.

So, what exactly can you do with this new addition to the vast family of SolarWinds free tools?

Here’s the breakdown:

SolarWinds NetFlow Replicator

  • Configure devices to send flow data to a single destination, then replicate the flows to a general-purpose flow analysis platform or even to a security analysis platform
  • Split off production flow streams to test new versions of the flow collector
  • Run sampled flow streams to multiple destinations or only to the destinations you designate
  • Reduce traffic through costly or low-bandwidth WAN links to decrease the volume of network management traffic
  • Enable segmentation of the managed domain to separate destination analysis platforms

SolarWinds NetFlow Generator

  • Troubleshoot flow tools to confirm that locally generated simulated traffic is visible in the tool
  • Validate the behavior of load balancing architectures
  • Test firewall rules that span across a network or those that are implemented on a host to confirm that flow traffic can be received
  • Perform performance and capacity lab testing
  • Perform functional testing to confirm that flow volumes are accurately represented
  • Test trigger conditions for newly created alerts and reset the alert behavior
  • Test new NetFlow application definitions
  • Populate traffic for demo environments

SolarWinds NetFlow Configurator

  • Analyze network performance
  • Activate NetFlow and find bandwidth hogs
  • Bypass the CLI with an intuitive GUI
  • Set up collectors for NetFlow data
  • Specify collector listening ports
  • Monitor traffic data per interface

How do you plan on using your Flow Tool Bundle? Install it today and let us know how you have been leveraging these awesome new free tools!

For more information about the SolarWinds Flow Tool Bundle, have a look at this page. You can also access the Quick Reference Guide on THWACK.


This time of year is always exciting. The seasons change (depending on where you live), commercial buying season ramps up, and shopping lines resemble those of an amusement park in summer. The year is coming to an end, and we are busy shopping, making holiday preparations, traveling, and coming together with family to eat, exchange gifts, and be merry.

I’d wager access rights management doesn’t have a top spot on your holiday list. That’s ok. The topic doesn’t exactly exude that cozy holiday feeling. On the contrary, it might make you slightly uncomfortable. 

Most IT environments consist of tens, hundreds, or even thousands of servers. Those servers have thousands to tens of thousands of folders, groups, and paths. How can you really know who has access to what? Is your data safe? You have, no doubt, installed security monitoring and protection solutions to help protect the data in those folders and files. You’ve done everything you can, right? Despite all those protections, you still have users with access—but you don’t know who. You don’t know what. In fact, if someone asked you who has access to what, you probably couldn’t answer. It’s a hard question to field unless you have a solution in place giving you the visibility you need. Of course, if an auditor does ask you to answer these questions, your holidays could be spent digging through folders and directories to compile information and provide answers.

SolarWinds® Access Rights Manager (ARM) helps solve these challenges and more:

  • ARM provides a detailed overview of your users’ access rights, allowing you to easily visualize and show where access to resources has been granted erroneously
  • ARM enables standardization and automation of access rights, so you can easily apply the appropriate rights to users through templates
  • ARM helps demonstrate compliance and prevents insider data leakage by helping you achieve the principle of least privilege and giving you full auditability of user access over time

Let’s dig into this further.

ARM gives a detailed overview of your users’ access rights

The Active Directory group concept is essential for every administrator. These groups grow organically, and after years of existence and use, they often build up to complex group nesting structures. ARM gives you back control over these group structures.

The ARM AD Graph visualizes group structure and depth. Structural problems with these groups become transparent through this visualization.

pastedImage_0.png

In addition to the visualization provided by the AD Graph, the ARM dashboard allows a detailed analysis of the group nesting structures and circular nested groups. This enables administrators to work on the weak spots in the AD group structure, establish a flat group structure, and meet Microsoft best practices for group management.

With ARM, the issues related to lack of identifiable structures—or giving permissions to too many or the wrong people/groups—belong to the past. Once the group structure has been optimized, ARM allows you to compare any recorded access rights period with your current structure, and shows changes along with documented reasoning.

ARM enables standardization and automation of access rights

Compliance regulations, such as FISMA, GDPR, SOX, PCI DSS, BSI, and others, require administrators to adopt a high level of responsibility to ensure data is protected. Insider data leakage can cost companies large monetary sums in addition to lost customer, vendor, and reseller trust if data gets into the wrong hands. But it’s not always the headline-making data leak issues that harm companies. Employees leaving a company and taking valuable data with them is almost guaranteed without a cohesive access rights strategy to manage, control, and audit user rights—for users throughout the whole company.

ARM standardizes access rights across users and gives administrators a comprehensive tool to define, manage, monitor, and audit user access to resources across Active Directory, Exchange, SharePoint, and all your file servers.

pastedImage_2.png

ARM empowers administrators to predefine certain roles within the company, efficiently grant or deny rights with one click, and display all higher-level permissions in an easy-to-monitor overview. These different roles can be assigned a data owner (e.g., for department heads) to distribute control for managing access to resources the data owner is responsible for. In addition, this establishes a mindset of distributed access rights control to help ensure users with accurate access rights knowledge are granting and/or denying access appropriately.

Data owners, team leads, and IT professionals can be granted access to change personal information about a user, create or delete user accounts, reset passwords, unlock user accounts, or change group memberships centrally from within ARM. This allows the duties and tasks around access rights management to be shared while following standards to ensure full auditability.
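The ideas above — predefined roles, data owners who gate access to their resources, and a fully audited trail of changes — can be sketched in a few lines of code. This is a toy illustration only, with made-up role and resource names, not ARM's actual data model:

```python
# Toy sketch of role-based, owner-gated, audited access management.
# Role and resource names are invented; this is not ARM's data model.

ROLES = {
    "sales-rep":  {"read:crm"},
    "sales-lead": {"read:crm", "write:crm"},
}

class AccessModel:
    def __init__(self, owners):
        self.owners = owners        # resource -> responsible data owner
        self.grants = {}            # user -> set of granted permissions
        self.logbook = []           # audit trail of every change

    def assign_role(self, actor, user, role):
        perms = ROLES[role]
        # only the data owner of every touched resource may grant access
        for perm in perms:
            resource = perm.split(":", 1)[1]
            if self.owners.get(resource) != actor:
                raise PermissionError(f"{actor} does not own {resource}")
        self.grants.setdefault(user, set()).update(perms)
        self.logbook.append((actor, "assign", user, role))

    def can(self, user, perm):
        return perm in self.grants.get(user, set())
```

Note how every successful change lands in the logbook, so an auditor can always answer "who granted what, and when."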

ARM helps demonstrate compliance and prevents insider data leakage

Threats can emerge from the outside as well as the inside. Insider abuse can be a leading cause of data leakage. Of course, it’s not always a malicious insider; in many cases, data leakage is caused by negligent users who have access to resources, and are either compromised or take actions that inadvertently lead to data leakage. ARM takes special care to audit all changes within the ARM Logbook. The Logbook report enables admins and auditors to report on events and persons as needed to support investigations or auditor questions.

ARM also includes automated reports designed to meet regulatory compliance initiatives, such as NIST, PCI DSS, HIPAA, and GDPR. The flexible reporting views allow you to ask questions to quickly generate a report, which can be exported in an audit-ready format.

As mentioned earlier, ARM allows access rights management to be delegated to assigned staff members—placing control of the access rights assignment with the data owners that know their data. Changes made by these data owners are also audited so nothing goes unmonitored. ARM is designed to make your job easier—it helps you answer the questions you need to answer.

ARM is our gift to you this holiday season. It aligns with the SolarWinds mission to make your job as an IT professional easier. With Access Rights Manager, we make security easier too; we call it security simplified. If you're thinking of what you can do for yourself this holiday season, consider SolarWinds Access Rights Manager. It could turn out to be the gift that keeps on giving.

Read more
1 0 712
Level 9

Have you adopted Azure cloud services into your IT infrastructure? And do you know how much you paid last month, and for what? And what about forecasting? Are you able to forecast your Azure spending in the current month? If the answer is no, don't worry, you are not the only one. Unfortunately, Azure billing is genuinely complicated, with more than 15,000 SKUs available, each with its own rate. But SolarWinds is here to help! We're proud to introduce a brand-new free tool in our portfolio!

Cost Calculator for Azure is a standalone free tool that helps you discover how much you're paying for your Azure cloud services. It's as easy as can be: enter the credentials for all your Azure accounts, and the tool does the work for you, telling you how much you really pay and for what, specifically. It's designed to help budget holders and sysadmins at businesses of any size who are responsible for their companies' cloud resources.

Cost Calculator for Azure at a glance:

  • No installation
  • Support
  • Shows the cost of all assigned Azure accounts and their subscription plans, so there's no need to run multiple instances or juggle Excel spreadsheets to get an overall number
  • Shows spending for the current month, last month, last quarter, or year. Still not enough? Set up a custom timeframe that fits you
  • Finds orphaned objects
  • Consolidates all spending and shows the final expense in the user's preferred currency
  • Filters spending

As you can see, Cost Calculator for Azure is a lightweight, easy-to-use tool that can make your IT professional life a little easier thanks to better forecasting of your Azure cloud spending. And the best part comes at the end: Cost Calculator for Azure is completely FREE!
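For a sense of what a month-end forecast involves, here's a minimal run-rate projection in Python. It illustrates the general idea only; the Cost Calculator's own forecasting logic may differ:

```python
import calendar
from datetime import date

def forecast_month_spend(daily_costs, today):
    """Project end-of-month spend from month-to-date daily costs.

    A plain linear run-rate: average daily spend so far, multiplied by
    the number of days in the month. Illustrative only -- the Cost
    Calculator's own forecasting may be more sophisticated.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    run_rate = sum(daily_costs) / today.day
    return run_rate * days_in_month
```

So $10 of spend per day through mid-June projects to $300 for the month under this simple model.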

So, why not give it a try? Click the link below to download the free Cost Calculator for Azure tool by SolarWinds. No installation needed.

Cost Calculator for Azure – Download Free Tool

Read more
0 1 301
Level 17


Did you ever dream you had a Ferrari® parked in your garage? How about a Porsche®? Or perhaps a finely engineered Mercedes-Benz®?

When I was eight years old, my father briefly flirted with the idea of buying a Ferrari. He was 38. I don't believe additional explanation is needed. However, as the oldest child, it was my privilege to accompany Dad to the showroom. And there, right next to the 308 GTB was a Ferrari bike. No, not a motorcycle. A regular pedal-with-your-feet bicycle. And I knew at that moment that this car was my destin... I mean my Dad's destiny. And that bike leaning beside it was mine, Mine, MINE!

You may be asking yourself: why would Ferrari bother making a bicycle?

The obvious answer is "marketing." With a cheeky smile, Ferrari can say "anyone can own a Ferrari." But there's more to it.

Before I dive into the OTHER reason why, I just want to point out that car-manufacturer bicycles are not just a Ferrari thing. The trend started in the late 1800s with European carmaker Opel® and includes Peugeot, Ford®, Mercedes-Benz, BMW®, and Porsche.

So what's the deal?

Some companies, like Opel, started with bicycles (they ACTUALLY started with sewing machines) and built up their mechanical expertise in sync with the rise of automobile technology. But most decided to build bikes as a side project. I imagine that the underlying message went something like this:

"Our engineers are the best in the world. They understand the complex interplay of materials, aerodynamics, maneuverability, and pure power. They are experts at squeezing every possible erg of forward thrust out of the smallest turn of the wheel. While we are used to operating on a much larger scale, we want to showcase how that knowledge and expertise translates to much more modest modes of conveyance. Whether you need to travel across the state or around the corner, we can help you get there."

I was thinking about that Ferrari bicycle, and the reasons it was built, as I played with ipMonitor® the other day.

For some of you reading this, ipMonitor will be an old and trusted friend. It may even have been your first experience with SolarWinds® solutions.

Some quick background: ipMonitor became part of the SolarWinds family in 2007 and has remained a beloved part of our lineup. ipMonitor is nimble, lightweight, and robust. A standalone product that installs on any laptop, server, or VM, ipMonitor can help you collect thousands of data points from network devices, servers, or applications. It's simple to learn, installs in minutes, and even comes with its own API and JSON-based query engine. Users tell us it blows the doors off the competition, and that it even reminds them of our more well-known monitoring software like Network Performance Monitor (NPM) and Server & Application Monitor (SAM).

Which is exactly why I remembered that Ferrari bicycle. It also was nimble, lightweight, and robust—a standalone product that could be implemented on any sidewalk, playground, or dirt path. It installed in minutes with nothing more than a wrench and a screwdriver, and epitomized the phrase "intuitive user interface."

And, like comparisons of ipMonitor to NPM, my beloved Ferrari bike was amazing until it came time to add new features or scale.

Much like the Ferrari bicycle, ipMonitor was designed by engineers who understood the complex interplay of code, polling cycles, data queries, and visualizations. Developers who were used to squeezing every ounce of compute out of the smallest cycle of a CPU. While we were used to creating solutions on a much larger scale, ipMonitor let us showcase how that knowledge and expertise translated to much more modest system requirements.

ipMonitor is designed to perform best in its correct context. For smaller environments with modest needs, when more feature-rich monitoring tools aren’t viable, it can be a game-changer. That Ferrari bicycle was an amazing piece of engineering—until I needed to bring home four bags of groceries or get to the other side of town. Likewise, ipMonitor is an amazing piece of engineering, but, as I said, in its correct context.

When you need "bigger" capabilities, like network path monitoring; insight into complex devices like load balancers, Cisco Nexus®, or stacked switches; application monitors that run scripted actions in the language of your choice; monitoring for containers and cloud; and so on, that's where the line is drawn between ipMonitor and solutions like NPM and SAM. It's not that we've deliberately limited ipMonitor, any more than Ferrari "limited" their bicycle so that it didn't have cruise control or ABS braking. Of course, this isn't an either-or proposition. No matter your monitoring needs, we've got a solution that fits your situation.

So, consider this your invitation to take ipMonitor for a spin. Even if you own our larger, luxury models, sometimes it's nice to get out and monitor with nothing but the feel of the SolarWinds in your hair.

Read more
2 13 2,618
Level 14

Hello fellow data geeks! My name is Joshua Biggley and I am an Enterprise Monitoring Engineer for a Fortune 15 company. I'm also fortunate enough to be a remote worker and part of an amazing team. One of my favourite career achievements was being named Canada's only SolarWinds THWACK Community MVP in 2014.

I joined the THWACK Community in 2008, shortly after moving to beautiful Prince Edward Island on the East Coast of Canada. I've attended at least one THWACKcamp session every year since its inception 7 years ago, but have been a regular attendee for the past 4 years. Humble brag moment: I had the opportunity to join Leon Adato (@adatole) and Kate Asaff (@kasaff) at THWACKcamp 2016 to present the session Troubleshooting with SolarWinds - The Case of the Elusive Root Cause. Leon has been a friend and (short-lived) colleague since 2014, and Kate has quite literally saved my bacon in one of my biggest challenges as a Monitoring Engineer. Sharing the THWACKcamp stage with these two superheroes was beyond awesome! Last year, I was humbled when my team and I won the Carmen Sandiego Award at THWACKcamp 2017. Our team is made up entirely of remote engineers, and having our work recognized for both the technical excellence and the inter-team collaboration we embrace was a highlight of my year. Will 2018 be able to top it?

I think these two sessions will give 2017 a run for its money, even if I don’t win another THWACK award!

Day 2

Oct 18 @ 10AM CT

What Does It Take to Become a Practice Leader?

Too many organizations view monitoring, alerting, and event management as a necessary evil. It is often relegated to the "all other duties as assigned by your supervisor" category. As organizations mature, finding monitoring engineers becomes a challenge. It's not just about finding someone who knows how to use the SolarWinds products you own (you are using SolarWinds products, aren't you?), but someone who can explain why monitoring, alerting, and event management are so important. They need to explain to their peers, their management, and the business why monitoring needs to be a practice, not an afterthought. They need to be a data geek. They need to be a storyteller.

Patrick Hubbard, Phoummala Schmitt, and Theresa Miller bring decades of experience and, more importantly, are recognized leaders in the industry. Discovering how they went from junior analysts to practice leaders will help me explain to others how to make that journey. As a practice leader in my full-time job as well as in freelance work, being able to help others understand that they, too, can be leaders is crucial to the health of monitoring as a practice. My colleagues and I have worked very hard to elevate monitoring to the respect it deserves. In 2019, we will be starting an internal Community of Excellence focused on monitoring, alerting, and event management, plus my very favourite new focus -- observability!

Day 1

Oct 17 @ 12PM CT

Observability: Just A Fancy Word for Monitoring? A Journey From What to Why

Observability and high-cardinality data are sultry words to any data geek. Observability was introduced in 1960 in a paper by Rudolf E. Kálmán entitled "On the General Theory of Control Systems." If the state of a system can be determined simply by examining the outputs of that system, the system is considered observable. In recent years, the idea of observability has been embraced by systems engineers as applications have moved from bare metal to virtualized to containerized to serverless. Instead of monitoring the things that allow your system to do what it does, we're now measuring how the system does what it does without much concern for why.
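For the control-theory version, Kálmán's criterion is concrete: a linear system x' = Ax, y = Cx is observable exactly when the matrix stacking C, CA, ..., CA^(n-1) has full rank n, meaning the outputs pin down the internal state. A small self-contained sketch in plain Python (no libraries):

```python
def mat_mul(X, Y):
    # naive matrix multiply on lists of lists
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_rank(M):
    """Rank via Gauss-Jordan elimination with a small tolerance."""
    M = [row[:] for row in M]
    rank = 0
    for col in range(len(M[0])):
        pivot = next((r for r in range(rank, len(M))
                      if abs(M[r][col]) > 1e-9), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(len(M)):
            if r != rank and abs(M[r][col]) > 1e-9:
                f = M[r][col] / M[rank][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def is_observable(A, C):
    """Kalman rank test: stack C, CA, ..., CA^(n-1) and check rank n."""
    n = len(A)
    stacked, block = [], C
    for _ in range(n):
        stacked.extend(block)
        block = mat_mul(block, A)
    return mat_rank(stacked) == n
```

A double integrator with a position sensor (C = [1, 0]) is observable; with only a velocity sensor (C = [0, 1]) it isn't, because position can never be recovered from the outputs.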

Of all the sessions at THWACKcamp 2018, this is the one I would want every engineer, every application developer, every CTO -- OK, pretty much everyone involved in building, supporting, and managing any critical application anywhere -- to watch. Application Performance Management is coming to every organization. If you deliver any services through an application, APM provides the insights, and observability is the methodology for measuring them.

Do I sound a little passionate about observability? What?!? Only a little?!? Observability is my new passion. I recently wrote a white paper defining an APM strategy, and its foundation was observability. This idea is probably the most important shift in our industry in 20 years. Unnecessary hyperbole? Maybe, but I think there are seminal moments in every industry, and this focus on observability is going to be one of them. I'm Canadian; would I steer you wrong?

Read more
4 3 479

Dashboards are important. Your NOC is an essential avenue for collecting and relaying information about your network, and combined with a finely crafted set of alerts, there's nothing that can get past you. Not only are dashboards effective, but they just look so stinkin' awesome when done properly. In this post I'm going to focus on my 'Dashboard Philosophy,' which is all about efficiency, information, and design. A dashboard should display the most data possible in the space you have, it should include pertinent information that summarizes your environment, and it should look good doing it. Let's talk about what the SolarWinds® Orion® Platform brings to the table to help make our dashboards the best they can be.

  1. NOC Views

Using the NOC view feature is a must. These space-saving views allow you to combine multiple sub-views that can be set on a rotation. Creating one is easy: simply add a new summary view, edit it, then enable left navigation and the NOC view feature. Here you can enter an interval for how often the NOC view rotates between individual sub-views. If you aren't using NOC views, you're wasting valuable space on your dashboards! Enter NOC mode, full-screen your browser window, and bask in the glory of a massive canvas to display all your fancy metrics and charts. Bob Ross would be proud.

  2. Network Atlas

Admit it, you both love and hate Network Atlas. It’s an incredibly useful tool that requires a bit of extra patience, but the results can be amazing once you get the hang of it. As Henry David Thoreau probably once said, “SolarWinds Network Atlas is but a canvas for your imagination…” or something like that. Check out this amazing example from THWACK® user spzander​:

pastedImage_17.png

Hungry for more? Here is some of my favorite THWACK content for tuning your Network Atlas skills and getting the creative juices flowing:

10 Hidden Gems in Orion Network Atlas

Using Custom Properties to send messages to your NOC using Network Atlas

The “Show us your Network Atlas Maps” thread

  3. PerfStack

With the release of NPM 12.1 came a game-changing new feature… PerfStack. This new charting tool allows you to quickly and easily create attractive charts that contain the data you need while optimizing page space. PerfStack is what makes you, the monitoring professional, shine when an application owner is looking for a way to view monitoring data for their systems. Check out the original release notes for PerfStack here. Since its first iteration, the SolarWinds team has been putting a lot of work into this tool. With PerfStack 2.0, they have added support for many major Orion modules including VMAN, SAM, VNQM, NCM, and DPA, along with a pile of new features such as fast polling, syslog/trap support, quick links, and full screen mode (which makes a great dashboard). As of this post, the next iteration of PerfStack is available in the latest NPM 12.3 Release Candidate and includes… drumroll please… A PERFSTACK WIDGET FOR YOUR DASHBOARDS!

pastedImage_18.png

Here we have a node detail view… WITH PERFSTACK! You can do the same thing with any view type in Orion, including Summary Views (which means dashboards). For dashboard nerds such as myself, this is truly a good day. Sign up for the NPM RC program for more details and awesome sneak peeks at what SolarWinds is doing to improve tools like PerfStack.

  4. AppStack

This is really one of the most efficient ways to display a massive amount of information in a small space. AppStack is a one-size-fits-all tool that will satisfy your devs, their managers, and your director. An efficient dashboard should have MAXIMUM information in MINIMUM space, and AppStack is the answer. Whether you only have SAM or you're running multiple products on the Orion Platform, the AppStack widget gives you a flexible, filterable, and fun-tastic (I couldn't think of another word that started with 'f') resource to add to your dashboards and NOC views. There's not much more to say. It's the perfect widget for my Dashboard Philosophy.

pastedImage_19.png

  5. SWQL and Other Advanced Methods

Are you a dev nerd? Do you like to yell at code until it bends to your will? Are you ready to bring your SolarWinds deployment to an unreasonably awesome level? With a little bit of fidgeting and some help from THWACK, you can create your own charts, tables, dashboards, maps, and much more. Check out this post from THWACK MVP CourtesyIT, which has a master list of all the amazing ideas and customizations that have been posted in the community. Be sure to check out the section from THWACK MVP wluther:  he’s got some great content specifically tailored to dashboards. One thing to always keep in mind when using more advanced methods… SolarWinds support may not be able to assist you with the bending of spacetime. Fidget at your own risk!

In my opinion, one of the most powerful tools for creating custom resources is SWQL, the SolarWinds Query Language. With it, your data bends to your will. THWACK MVP mesverrum makes it easy in this post, where he provides an awesome example of how to create your own custom SWQL tables.
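If you've never written any SWQL, a starter query for a custom dashboard table might look something like this -- the ten busiest nodes by CPU load. The fields shown come from the Orion.Nodes entity; verify them in SWQL Studio against your own deployment before building on them:

```sql
SELECT TOP 10
    n.Caption,
    n.IPAddress,
    n.CPULoad,
    n.Status
FROM Orion.Nodes AS n
WHERE n.Status = 1
ORDER BY n.CPULoad DESC
```

(In a default Orion deployment, a node status of 1 corresponds to "Up.")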

  Results

Let’s put all this together and create a shiny new dashboard that follows the idea of efficiency, information, and design. We need something that doesn’t waste space, contains useful data, and looks awesome. Something like this:

pastedImage_20.png

First things first… we're using the NOC view, indicated by the black bar at the top with the circles in the upper-right corner that represent the various sub-views in rotation. We have a map from Network Atlas (upper left), a PerfStack project added as a widget (lower left), AppStack (lower right), and a custom SWQL table that displays outage information (check out mesverrum​'s post about it here).

And there we have it! Five useful tools that you can use to make your dashboards amazing. Be sure to post your creations in the community. Here are some threads for NOC views and Network Atlas maps. Now go forth and dashboard!

Read more
36 25 7,353
Level 9

Let’s be honest, all of us need an SSH client from time to time. And when that SSH client is needed, most of us just use the standard PuTTY tool without question. Does this mean that PuTTY is a really good tool? Sure, it is… but could it be even better? We believe so. And we decided to prove it.

After months of development, we're happy to introduce you to SolarWinds® Solar-PuTTY, an enhanced version of the most popular SSH client on the internet. We like PuTTY for its reliability and speed. And when you have to change anything on a server remotely, it's still a decent choice… until you need to manage saved sessions, connect to several servers at once, or run the same script 100 times.

In all these scenarios, PuTTY has its limits. And at SolarWinds, we don’t like limits. So, we went beyond them and pushed PuTTY to the next level.

So, what are the key benefits of using Solar-PuTTY as your SSH client?

  • A new, fresh, browser-like interface—it's easy to navigate, and everything is available in just a few clicks
  • Manage multiple sessions from a single console with a tabbed interface—you don't have to run countless instances of the tool when you need to work on several machines at the same time. All sessions are available in a single console, with info about their name and status
  • Save your sessions, credentials, or private keys for easy login—you can access any saved session from the homepage with a single click. Usernames, passwords, and private keys can be stored and linked to one or multiple sessions as needed
  • Filter saved sessions based on IP address, hostname, login, or tags—start typing into the search bar, and Solar-PuTTY applies the filter in real time
  • Automate the scripts you use when a connection is established—is there a set of commands you need to run right after initializing a telnet connection? When it takes just a minute once, it's no big deal… but what about when you need to do it 100 times per day? Save the script once and let Solar-PuTTY run it automatically
  • Auto-reconnect to a timed-out session—what if something goes wrong? Solar-PuTTY gives you details about what happened and the option to reconnect to the server with a single click. You don't have to set everything up from scratch
  • Last but not least, Solar-PuTTY is available for free

As you can see, Solar-PuTTY keeps all the strengths of the original open-source tool and adds the most-requested features to bring you the best SSH client experience on the market. It's time to say goodbye to Excel® spreadsheets and start managing your remote sessions in a more professional way.

What are you waiting for? Click the link below to download your Solar-PuTTY free tool by SolarWinds. No installation needed.

Solar-PuTTY Software – Download Free SSH Client

Read more
2 1 402
Level 12

You've been asking, and we've been listening. We are excited to announce that the newest member of the SolarWinds product family, Log Manager for Orion, is now available for trial. Built on the Orion Platform, Log Manager provides unified infrastructure performance and log data in a single console, so there's no need to hop back and forth between your infrastructure and log monitoring tools.

Through platform integration with Network Performance Monitor, Server & Application Monitor, and other Orion-based products, Log Manager closes the gap between performance and log data. With Log Manager you get:

  • Log aggregation
  • Filtering by Log Type, Level, Node name, IP Address, and more
  • Keyword, IP address, and Event ID search
  • Interactive log charting
  • Color-coded event tagging

To learn more about Log Manager, visit the Log Manager THWACK Forum, or download a free trial to try it for yourself in your environment.

Read more
0 3 648
Level 10

Let’s face it. Traceroute is not what it used to be.

Van Jacobson and Steve Deering created the original "Traceroute" in 1987. It works by manipulating the TTL field in the IPv4 packet header: each router along the way decrements the TTL, and when it reaches zero, that router sends back an ICMP Time Exceeded message, revealing itself as one hop on the path. Network professionals quickly realized how valuable this tool was for solving daily network issues. In recent years, however, Traceroute has not scaled to adapt to modern technologies, and has lost much of its useful functionality.
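The TTL trick is easy to picture with a toy simulation. This is purely illustrative — a real traceroute needs raw sockets and elevated privileges — but it captures the mechanism: probes go out with TTL = 1, 2, 3, …, and the router that sees the TTL hit zero answers "time exceeded," identifying that hop.

```python
# Toy simulation of the TTL trick behind traceroute -- illustrative
# only; a real implementation sends actual probe packets.

def simulate_traceroute(path, destination):
    """`path` is the ordered list of router addresses to `destination`."""
    hops = []
    for ttl in range(1, len(path) + 2):
        if ttl <= len(path):
            # router at position ttl decrements TTL to zero and replies
            hops.append((ttl, path[ttl - 1], "time exceeded"))
        else:
            # TTL large enough to reach the destination itself
            hops.append((ttl, destination, "reached"))
            break
    return hops
```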

The issues are well known: ICMP and UDP probe packets are often blocked when probing the network. The paths the tool reports often don't exist. And, ridiculously enough, there is no history function available. Even Ping has that! The list of issues is so vast that scholarly journal articles have been written on the subject.

What’s the good news? The good news is that SolarWinds fixed Traceroute, and is offering it for free!

SolarWinds® Traceroute NG is a standalone free tool that offers effective path analysis visibility via a CLI. By any standard, it's a new, improved, and fully functional version of the older generation of Traceroute tools. Yielding results in mere seconds, it provides an accurate single path from source to destination and notifies you when the path changes.

This new and improved version of Traceroute delivers the following information:

  • Number of hops
  • IP addresses
  • Fully qualified domain names (FQDNs)
  • Packet loss measured as a percentage
  • Current latency and average latency (ms)
  • Continuous probing that yields an iteration number for the user
  • Probe type used (if TCP, it also shows the port probed)
  • Issues (change in path, inability to reach destination)

SolarWinds Traceroute NG gets through firewalls, supports IPv6 networks, and can create a txt logfile containing the path number, probing time from source to destination, number of hops, IP addresses, FQDNs, packet loss percentage, and average latency. It can also copy data from the screen to the clipboard (copy/paste functionality), switch the probe type between ICMP and TCP using the switch command, and enable logging using the logging command, all while probing continues.

To sum it all up, Traceroute NG by SolarWinds brings back the power of the old Traceroute with new functionalities and capabilities that are adapted to modern technologies, so that you may once again reign supreme over the paths of your network, and never be lost when probing your long journey across the vast world wide web.

We hope you will enjoy this powerful new free tool. Click on the link below to download your Traceroute NG free tool by SolarWinds.

Traceroute NG Software - Download Free Traceroute Tool | SolarWinds

To find out more about what you can do with SolarWinds Traceroute NG, be sure to have a look at this article: Troubleshoot your network with a new free tool – Traceroute NG


Read more
2 17 1,912
Level 9

Are IP requests for virtual machines overwhelming your current IP address management practices?  You are not alone. In a June 2016 survey of IP Address Manager customers[1], 46% of respondents stated that virtual machines were creating challenges for managing IP addresses for their company.

Independent author Brien Posey explores this topic in the whitepaper “Overcoming IP Address Management Challenges in VMware Environments.” A challenge with virtual environments is that their dynamic nature can quickly lead to depleted address pools if IP addresses are not quickly de-provisioned. Utilizing DHCP services is a less than ideal solution, as IPs can be tied up by lease expiration dates. Using manual processes for provisioning IP addresses is another option, but this can be slow, error-prone, and limit the dynamic scaling of virtual environments. DNS records obviously must also be updated in tandem.

A solution to overcoming these IP address management challenges is fully automating the process of provisioning IP addresses and updating DNS records. VMware developed vRealize® Automation (vRA) to automate tasks in virtual environments. However, as Brien discusses, vRA was not designed to be a comprehensive IP address management solution, thus the need for third-party solutions to fill this gap. SolarWinds® IP Address Manager (IPAM) helps overcome this limitation by providing a plug-in for VMware® vRealize Orchestrator (vRO). The plug-in provides actions and workflows critical for managing IP addresses and DNS records. These actions and workflows integrate with vRA and enable the creation of blueprints to automate the provisioning and de-provisioning of VMs.

To learn more about this topic, please read Brien Posey’s whitepaper, and attend the live webcast coming up February 21, where our very own IPAM Product Manager Connie Dowdle will take you through a demonstration of the plug-in and the latest and greatest that SolarWinds IPAM 4.6 has to offer.


[1] IP Address Manager customer survey, June 2016.


Read more
1 1 409
Level 9


Reliable, recoverable backups have always been fundamental to a well-run data center. But the technology we use to accomplish that goal keeps reinventing itself. The old systems never quite go completely away, even as newer options come onto the scene. Too often, this results in a complicated mix of tools and media that can be a real headache to manage.

At one point, tape was the only storage medium, and the ubiquitous Iron Mountain® trucks hauled loads of tapes from place to place on a regular schedule. While those trucks haven’t gone away, today, they’re supplemented with disk and cloud storage.

Do you remember the simple days, when physical servers were the only thing needing protection? Traditional backup products were designed for this world, but increasing adoption of server virtualization led to new market leaders, like Veeam, with a virtual-first approach. Then laptops and an array of mobile devices needed protection.

Then came the cloud and SaaS applications. Every vendor sought to update their offerings to cover new use cases, new devices, and new storage options. Complexity multiplied, and prices went up and up.

Where does that leave you today?

In November, we surveyed the THWACK® community on server backup, and learned a lot. We heard from more than 500 of you that backup is too complicated, too time-consuming, and too expensive.

Here are the top backup-related pain points our survey respondents listed:

Survey-issues_new.png

We also learned, not surprisingly, that you’re using a diverse mix of products that represent every era of backup history. The largest section of the pie was “other”.

pie_new2.png

We believe there’s a better way. We decided to approach the problem with a few guiding principles:

  • Simplicity – One backup product for physical and virtual servers, for one price that includes software and storage. No add-ins or options, no hidden costs.
  • Ease of use – One web-based console to see all backup status at a glance, and drill down as needed.
  • Reliability – Easy to deploy, clean, efficient dashboard. Our customers tell us it “just works.”
  • Powerful technology under the hood – Innovative features working in the background to make backups and restores fast and efficient.

The result of this approach is SolarWinds® Backup, a cloud-first backup service designed for IT pros who are tired of spending hours every week managing their backups. While it’s a new offering from SolarWinds, the product has been in use for years among the MSP community, and is already trusted by thousands of organizations. Here’s what a few of them have to say:

pastedImage_5.png

- Justin Cremer, IT Professional, Libra IT

pastedImage_10.png

- John Treanor, IT Professional, Satellyte Technology

More customer comments and insights can be found on TechValidate®.

To learn more about SolarWinds Backup and begin your free trial, check out the Product Blog post. Find out how simple backups can be.

Read more
1 2 1,058
Level 9

Update – February 7, 2018:

Cisco® updated their vulnerability advisory on Monday, February 5, 2018 after identifying “additional attack vectors and features that are affected.” What does this mean? If you patched last week, you may need to patch again. Be sure to read the advisory notice carefully to find out if your environment is at risk.

-------------------------------------

(Originally posted Wednesday, January 31, 2018):

What is it?

Earlier this week, Cisco revealed a security vulnerability in Cisco® ASA software that exposes these firewalls to remote attackers. Of course, now we all know about it, as does anyone who may want to exploit this opening. The good news: Cisco has released a critical update to address the issue. The bad news? There is no workaround, so affected devices must be updated to be secured, and now you’re in a race against anyone who may be trying to take advantage. It’s worth noting that some Firepower devices are affected as well, so read the Cisco post in detail to help ensure that you know where your vulnerabilities may lie.

What can you do?

Fortunately, if you have SolarWinds® Network Performance Monitor (NPM), our own KMSigma has created a report so you can quickly see if you have vulnerable devices. (For a refresher on implementing user-created reports, see How to export and import reports in the Orion® web console.)
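Conceptually, a vulnerability check like this boils down to comparing each device’s software version against the first fixed release for its version train. Here’s a hedged Python sketch of that idea; the `FIRST_FIXED` table below is invented for illustration, so always take the real fixed versions from Cisco’s current advisory:

```python
# Illustrative only: compare a device's software version against a
# HYPOTHETICAL table of first fixed releases. Always take the real
# fixed versions from Cisco's current advisory, not from this sketch.

def parse_version(v):
    """Turn a dotted version string like '9.8.2' into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

# Invented example data for illustration.
FIRST_FIXED = {"9.6": "9.6.4", "9.8": "9.8.2"}

def needs_patch(version):
    train = ".".join(version.split(".")[:2])  # e.g. '9.8'
    fixed = FIRST_FIXED.get(train)
    if fixed is None:
        return True  # unknown train: treat as vulnerable until confirmed
    return parse_version(version) < parse_version(fixed)

print(needs_patch("9.8.1"))  # running an older build than the fix
print(needs_patch("9.8.2"))  # already on the fixed release
```

The tuple comparison is what makes `9.8.10` sort correctly after `9.8.2`, which a plain string comparison would get wrong.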

Once you’ve identified affected devices, you can use Network Configuration Manager (NCM) to easily schedule firmware upgrades on your ASA devices and monitor their progress. Are you running multi-context ASAs? No problem. The firmware upgrade path supports both single- and multi-context upgrades.

In this industry, it doesn’t take long to realize that discovering vulnerabilities of this nature—and subsequently addressing them—is a standard part of the job description. Having the right tools available can make a notable difference in how long your network is exposed and how much effort is required to remediate issues.

Tell us:

Were your devices affected? Have you already updated, and if so, did you use NPM and NCM to do so? Use the comments to tell us how it went. Were you affected but don’t have NPM or NCM? Download free 30-day trials of Network Performance Monitor and Network Configuration Manager today and see how they can help.

Learn more about Network Insight for Cisco ASA:

Did you know that SolarWinds added a new Network Insight feature for Cisco ASA in the NPM 12.2 and NCM 7.7 releases? Learn about all the functionality included in Network Insight for Cisco ASA.

Product Manager

Keeping a network up and running is a full-time job, sometimes a full-time job for several team members! But it doesn’t have to feel like a fire drill every day. Managing a network shouldn’t be entirely reactive. There are steps you can take and processes you can put in place to help reduce some of the top causes of network outages and minimize any downtime.

1. The Problem: Human Element

The dreaded “fat finger.” You’ve heard the stories. You may have done it yourself, or been the one working frantically late into the night or over a weekend to try to recover from someone else’s mistake. If you’re really unlucky (like some poor employee at Amazon® last spring), the repercussions can be massive. No one needs that kind of stress.


The Protection:
First, make sure only the appropriate people have access to make changes. Have an approval system built in. And, since even the best of us can make mistakes, ensure you have a system that allows you to roll back changes just in case.
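The roll-back idea can be sketched in a few lines: keep every configuration version that was applied, so a bad change can be reverted instantly. This is illustrative Python, not NCM’s implementation, and all the names and configs are invented:

```python
# Minimal sketch of config rollback: retain each applied version so
# the previous known-good config is always one step away.
class ConfigHistory:
    def __init__(self):
        self._versions = []

    def apply(self, config_text):
        """Record a new config as the running version."""
        self._versions.append(config_text)

    def rollback(self):
        """Discard the latest change and return the previous config."""
        if len(self._versions) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._versions.pop()
        return self._versions[-1]

    @property
    def running(self):
        return self._versions[-1]

history = ConfigHistory()
history.apply("hostname core-sw-01\nsnmp-server community public")
history.apply("hostname core-sw-01\nsnmp-server community s3cret")
history.apply("hostname typo-sw-01\nsnmp-server community s3cret")  # fat finger
history.rollback()
print(history.running.splitlines()[0])  # back to the good hostname
```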

2. The Problem: Security Breaches

Network security is becoming more critical every day. Attackers keep getting better, and users’ privacy requirements keep rising. There are many critical elements to keeping your network secure, and it’s important not to miss any. It doesn’t do any good to deadbolt your door when your window is wide open.

The Protection:

Protect your devices from unauthorized changes. Monitor configurations so you can be alerted to any changes, see exactly what was changed, and know what login ID was used to make the change. Also, you should be regularly auditing your device configurations for vulnerabilities. Whether you have custom policies defined for your organization or need to comply with HIPAA, DISA STIG, SOX, or other industry standards, continuously monitoring your devices to help ensure your network stays compliant is one way to help.
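Configuration change detection is essentially a diff between the last known-good snapshot and the current running config. A minimal sketch using Python’s standard `difflib` (the sample configs are invented):

```python
# Sketch: report exactly which config lines changed between two
# snapshots, the way a config-monitoring alert would summarize a diff.
import difflib

baseline = """hostname edge-fw-01
ntp server 10.0.0.1
logging host 10.0.0.50
""".splitlines()

current = """hostname edge-fw-01
ntp server 10.0.0.99
logging host 10.0.0.50
""".splitlines()

# Keep only the added/removed lines, dropping the diff file headers.
changes = [line for line in difflib.unified_diff(baseline, current, lineterm="")
           if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]
for line in changes:
    print(line)
```

Pairing a diff like this with an alert tells you what changed; your device’s AAA logs tell you which login ID changed it.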

3. The Problem: Lack of Routine Maintenance

Over time, networks can become messy and disorganized if there aren’t standards in place, increasing both the risk of errors and the time needed to resolve them.

The Protection:

Network standardization simplifies and focuses your infrastructure, allowing you to become more disciplined with routines and expectations. Naming conventions, standard MOTD banners, and interface names are just a few things you can do to help troubleshoot and keep a balance within your team and devices, allowing for better management and less human error.
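A naming convention only helps if it’s enforced. One lightweight way is a regex audit; the site-role-number pattern below (e.g., `aus-core-01`) is just an example standard, not a recommendation:

```python
# Sketch: audit device hostnames against an example naming convention
# of site-role-number, e.g. 'aus-core-01'. Substitute your own standard.
import re

NAME_PATTERN = re.compile(r"^[a-z]{3}-(core|dist|acc)-\d{2}$")

devices = ["aus-core-01", "aus-dist-02", "SwitchB", "nyc-acc-7"]
violations = [d for d in devices if not NAME_PATTERN.match(d)]
print(violations)  # names that break the convention
```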

4. The Problem: Hardware Failures

It’s not if hardware will fail, but when. Are you ready to make a speedy recovery? When a device unexpectedly goes down, it can have a big impact, depending on which device it is and what redundancies you have in place.

The Protection:

Ensure that you can quickly recover a failed device or bring a replacement online by having device configurations automatically backed up.

5. The Problem: Firmware Issues / Faults in the Devices

When you support hundreds of devices, required firmware updates can be tedious, and executing commands over and over increases the risk of error.

The Protection:

With network automation, you can easily manage rapid change across complex networks. Bulk deploy configurations to ensure accuracy and speed up deployment times.
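Bulk deployment usually comes down to rendering one vetted template per device, so every config is identical except for device-specific values. A minimal sketch with invented values:

```python
# Sketch: render one config template per device so a bulk change is
# applied identically everywhere. All values here are illustrative.
TEMPLATE = (
    "hostname {name}\n"
    "ntp server {ntp}\n"
    "logging host {syslog}\n"
)

devices = [{"name": "aus-core-01"}, {"name": "aus-core-02"}]
shared = {"ntp": "10.0.0.1", "syslog": "10.0.0.50"}

configs = {d["name"]: TEMPLATE.format(**d, **shared) for d in devices}
print(configs["aus-core-02"].splitlines()[0])
```

Because every config comes from the same template, a typo gets fixed once in the template instead of once per device.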

Increase your uptime and reduce the challenges of keeping your network running smoothly so you can focus on other projects. With SolarWinds® Network Configuration Manager, you can bulk deploy configuration changes or firmware updates, manage approvals, revert to previous configurations, audit for compliance, and run remediation scripts. Take action today to reduce these five causes of network outages.

Product Manager

We just can't have anything nice, now can we?  Oh, well. We knew there would be new vulnerabilities and ransomware attacks in 2018. However, this time hardware is the culprit, and patching is not going to be a cure-all for the situation. Consider yourself warned: expect more slowdowns in 2018.

Stop and think about this for a second: as the days progress, we are still learning how much this new vulnerability impacts us. Anyone who says they have the full solution is not being honest with you or themselves. What I would like to do is help you see how you can use the tools you likely already have to make yourself more aware of past, present, and future vulnerabilities and threats. That said, let's move on to the importance of using SolarWinds tools to do just that.

SolarWinds® Patch Manager will allow you to apply Microsoft® patches to your Windows® machines. If you are currently using this product, you should already be scheduling and looking for these. I discovered that there can be conflicts with third-party Windows antivirus software that may result in a BSOD. Read more here, because the awesome chart helps clarify these issues and how to prevent them from happening to you.

Further, Patch Manager will allow you to schedule and report on updates for your Windows devices. The reporting is key to showcasing your compliance and, in this case, starting your baseline. Plus, just because you update your devices does not mean you are 100% in the clear. Updating your third-party packages is an added bonus with Patch Manager, a fact that is often overlooked though desperately needed.

SolarWinds® Server & Application Monitor (SAM) will help you, your business, and your vendors validate any degradation that patching may cause in your applications. This is something you will want to have in place as soon as possible. It allows you to see any anomalies that may present themselves in your applications after patching is applied. And because SAM is multi-vendor, you’ll be able to address even broad-scale hardware issues. The avid SAM users among you will likely know even more tricks for using the software, and I encourage you to share your knowledge in the comments to help us all be more aware in terms of application-centric monitoring.

SolarWinds® Network Configuration Manager (NCM) helps when firmware upgrades or updates need to be applied to impacted network devices, and helps you roll them out. There is a compliance reporting function built into NCM that will assist with audits automatically. Remember, this incident is ongoing, which makes NCM’s ability to import vulnerability data very helpful. In fact, you can plug into firmware vulnerability warnings provided by the National Institute of Standards and Technology (NIST). This puts you even further ahead of future vulnerabilities.

SolarWinds® Network Performance Monitor (NPM) is all about the baseline. If you have ever been to one of our SWUGs, you have heard me preach endlessly about baselines and their extreme importance. However, I understand that sometimes you need black and white in front of you to truly understand this. The mindset I’m currently following regarding this vulnerability looks something like this:

  1. Patched and we have our checkbox
  2. Monitoring our application performances
  3. Ready for updates to needed network devices
  4. Monitoring the common vulnerabilities database
  5. Waiting for any anomaly that may present its ugly face (my favorite)
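The baseline mindset above can be reduced to a simple statistical test: flag any measurement that falls well outside the history you’ve collected. A toy sketch (the sample data and the three-sigma threshold are illustrative; real baselining in NPM is more sophisticated):

```python
# Toy baseline check: flag a measurement outside mean +/- 3 standard
# deviations of recent history. Sample data and threshold are invented.
from statistics import mean, stdev

recent = [42.0, 44.5, 41.2, 43.8, 42.9, 44.1, 43.3, 42.6]  # e.g. CPU %

def is_anomaly(value, samples, sigmas=3.0):
    mu, sd = mean(samples), stdev(samples)
    return abs(value - mu) > sigmas * sd

print(is_anomaly(43.5, recent))  # inside the baseline
print(is_anomaly(91.0, recent))  # spike worth investigating
```

The point isn’t the math; it’s that without a recorded baseline, there is nothing for a post-patch spike to stand out against.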

We can now show that we have implemented the patching to put a Band-Aid® on the issues that could present themselves. However, as I’ve already mentioned, this is not a full fix. A hardware fix would be the best solution, but is obviously not available for the billions of affected devices at this time. YOU ARE THE FIRST RESPONDER!

Using NPM in combination with the other tools that I have outlined allows you to verify the patching and the results. Also, if there are ticks or drops or spikes that do NOT match your current baseline, you can share that solid reporting and documentation with your vendor to work out the possible issue, which makes you part of the solution. Is there anything better than working at the edge of technological advancements to create countermeasures to vulnerabilities? NO. The answer is a solid NO.

If you don’t already have it in place, set up threshold alerting and monitoring on critical devices that are housing your applications. That helps ensure that you are alerted to anything out of the ordinary, allowing you to get things back on track. It also shows your team and other departments that you are fully invested in the integrity of application uptime and performance. Also, if you have DevOps, you really need the documentation and baselines to prove that perhaps the performance issue is not the in-house application, but an actual patching issue. That, right there, can save a lot of unneeded cycles through rabbit holes.

Please let me know if you have additional ways to protect and help through these beginning stages of 2018 vulnerabilities. The ideas we share could help the many of you who act as a one-person army fighting your way to the top!

Thank you all for your eyes,

~Dez~

In case you’d like more information on any of the products mentioned above, check these out:

SolarWinds® Patch Manager

SolarWinds® Server & Application Monitor

SolarWinds® Network Performance Monitor

SolarWinds® Network Configuration Manager

Other resources:

https://www.pcworld.com/article/3245606/security/intel-x86-cpu-kernel-bug-faq-how-it-affects-pc-mac....

https://www.nytimes.com/2018/01/03/business/computer-flaws.html

Check out our Security and Compliance LinkedIn® Showcase Page for ideas on how to socialize this content: https://www.linkedin.com/showcase/solarwinds-security-and-compliance/

Follow our Federal LinkedIn page to stay current on federal events and announcements: https://www.linkedin.com/showcase/4799311/

Level 14

Looking back through previous content, I came across this post by Jerry Eshbaugh.

SQL Server Two Ways - SAM AppInsight for SQL and Database Performance Analyzer

I read through it again and realized it still resonates in a big way. I’d like to add this foreword and bring it up to speed given some recent changes. SolarWinds® Database Performance Analyzer (DPA) wait-time statistics and resource metrics were recently added to the Performance Analysis view (lovingly known as PerfStack) in the Orion® Platform. I believe this addition gives IT professionals the end-to-end visibility they want. I know we all tend to exist in silos, but that doesn’t mean we don’t want greater upstream and downstream performance metrics.

Now you can easily see whether your database performance is impacting application response time, and whether storage latency is slowing down I/O-related database activities. You can also view existing dependencies and see what relates to what. These customizable dashboards are way cool!

If you haven’t had a chance to check it out, you have a couple of ways to do so:

  • If you own just DPA (without any Orion products), you can now download a standalone DPA Integration Module (DPAIM) from your customer portal as part of your existing license. That’s right! It’s free. You will be limited to DPA data only, as there are no other modules running to collect application, server, storage, and network data, etc.
  • If you already have another Orion product and are on the latest release, DPAIM may be installed (it comes with Server and Application Monitor for example), or you can install the DPAIM module from your customer portal on your Orion Platform.
  • If you aren’t ready to commit to a download, you can check out oriondemo.solarwinds.com and try out the Performance Analysis view. This might be a good start to play around with, but remember, it is demo data. Things may not line up exactly. Some of the data might be invented. The best way to get the most out of the PerfStack dashboard would be to look at your own data with it, which is infinitely more interesting!

Let us know what you think about it!

Level 14

Jogging is my exercise. I use it to tune out noise, focus on a problem at hand, avoid interruptions, and stay healthy. Recently, I was cruising at a comfortable nine-minute pace when four elite runners passed me, and it felt like I was standing still. It got me thinking about the relationship between health and performance. I came to the conclusion that they are related, but more like distant cousins than siblings.

I can provide you data that indicates health status: blood pressure, resting heart rate, BMI, body fat percentage, current illnesses, etc. Given all that, tell me: can I run a four-minute mile? That question can’t be answered solely with the data I provided. That’s because I’m now talking about performance versus health.

We can also look at health metrics with databases: CPU utilization, I/O stats, memory pressure, etc. However, those also can’t answer the question of how your databases and queries are performing. I’d argue that both health AND performance monitoring and analysis are important. They can impact each other but answer different questions.

“What gets measured gets done.” I love this saying and believe that to be true. The tricky part is making sure we’re measuring the right thing to ensure we’re driving the behavior we want.

Health is a very mature topic and pretty much all database monitoring solutions offer visibility into it. Performance is another story. I love this definition of performance from Craig Mullins as it relates to databases: “the optimization of resource use to increase throughput and minimize contention, enabling the largest possible workload to be processed.”

Interestingly, I believe this definition would be widely accepted, yet approaches to achieving it with monitoring tools vary widely. While I agree with this definition, I’d add “in the shortest possible time” to the end of it. If you agree that you need to consider a time component with regard to database performance, now we’re talking about wait-time analysis. Here’s a white paper that goes into much more detail on this approach and why it is the correct way to think about database performance.
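To make the wait-time idea concrete, here is a toy aggregation: total the milliseconds each query spent on each wait type, then surface the biggest contributor. The events below are invented sample data, not DPA output:

```python
# Sketch of wait-time analysis: sum time spent waiting, per query and
# wait type, to find what actually dominates response time.
from collections import defaultdict

wait_events = [
    ("query_17", "PAGEIOLATCH_SH", 420),   # (query id, wait type, ms)
    ("query_17", "LCK_M_S",        1250),
    ("query_42", "PAGEIOLATCH_SH", 180),
    ("query_17", "LCK_M_S",        900),
]

totals = defaultdict(int)
for query, wait_type, ms in wait_events:
    totals[(query, wait_type)] += ms

top = max(totals, key=totals.get)
print(top, totals[top])  # the biggest contributor to wait time
```

Notice that a health metric like CPU utilization could look perfectly normal here; it’s the lock waits on `query_17` that are costing users time.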

We can only get to the right answer regarding root cause if we’re collecting (measuring) the right data in the first place. Below is a chart with some thoughts on data collection requirements. Adapt as needed, but I hope it provides a workable framework.

[Chart: data collection requirements for performance analysis]

Remember: don’t stop with asking “What can we do?” Take it to the next level and instead ask, “What should we do?”

Level 11

Do you know how to protect your organization's sensitive data from today’s cyberthreats? One way is to arm the enterprise with a security information and event management (SIEM) tool. SIEM solutions provide a meaningful contribution to defense-in-depth strategies with their ability to detect, defend against, and conduct post-mortem analysis on cyberattacks and general IT security anomalies. Over the years, they have become a contributing force in meeting, maintaining, and proving a business’ alignment with regulatory compliance frameworks such as HIPAA, PCI DSS, SOX, and more. Let's take a look at how SIEM software works and why it's a must have for your business.

What is SIEM?

SIEM’s predecessors, security information management (SIM) and security event management (SEM), began merging into one security system over a decade ago. When you run a SIEM tool, all your relevant security data can come from multiple locations, but you can look at all that data from one dashboard. Being able to access data across numerous locations and evaluate it in one place makes it easier to spot unusual patterns and trends, and to react and respond quickly to any possible threats.

The SIEM software collects information from event logs spanning all your devices, including anti-virus, spam filters, servers, firewalls, and more. It then uses key attributes (IPs, users, event types, memory, processes, ports) that can indicate security incidents or issues to alert and respond quickly—and in many cases, automatically.
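A classic example of that kind of correlation is brute-force detection: count failed logins per source IP and alert once a threshold is crossed. This sketch illustrates the idea only; it is not LEM’s actual rule engine, and the events are invented:

```python
# Sketch of SIEM-style correlation: flag any source IP with a
# suspicious number of failed logins. Sample events are invented.
from collections import Counter

events = [
    {"type": "login_failed", "src": "203.0.113.9"},
    {"type": "login_failed", "src": "203.0.113.9"},
    {"type": "login_ok",     "src": "198.51.100.4"},
    {"type": "login_failed", "src": "203.0.113.9"},
]

THRESHOLD = 3
failures = Counter(e["src"] for e in events if e["type"] == "login_failed")
alerts = [ip for ip, n in failures.items() if n >= THRESHOLD]
print(alerts)  # IPs that crossed the failed-login threshold
```

In a real SIEM, an alert like this would typically trigger an automated response, such as blocking the IP or disabling the targeted account.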

How Does SIEM Help With Security?

The event management portion of a SIEM solution stores and interprets logs in a central location and allows analysis in near real-time, which means IT security personnel can take defensive actions much more rapidly. The information management component provides trend analysis, as well as automated and centralized reporting for compliance by collecting data into a central repository. As a whole, a SIEM tool provides quicker identification and better analysis and recovery of security events by combining these two functions. Another advantage is that compliance managers can confirm they are fulfilling their enterprise's legal compliance requirements with a SIEM tool.

Advantages of a SIEM Tool

There are many advantages to using a SIEM tool, other than only needing one tool to monitor cybersecurity. SIEM systems can be used for different purposes, so the benefits will vary from one organization to another, but every organization that uses a SIEM tool will experience these main benefits:

  1. Streamlined compliance reporting. SIEM solutions gather log data from devices across the organization into a central repository, making it much easier to produce the reports that compliance frameworks require.

  2. Better incident detection. SIEM products enable centralized analysis and reporting of an organization's security events. This analysis may detect attacks that were not found through other means, and some SIEM products can even attempt to stop attacks they detect, assuming the attacks are still in progress.

  3. Improved efficiency in incident handling. You can save time and resources with a SIEM tool because you can respond to security incidents more quickly and efficiently. IT professionals can quickly identify an attacker’s route, learn who has been affected, and implement automated mechanisms to stop the attack in its tracks.

What to Look for in a SIEM Tool

What features should you be looking for when shopping for a SIEM tool? Here are just a few of the important questions to consider when evaluating SIEM solutions:

  1. Does the SIEM provide enough native support for all relevant log sources?

  2. How well can the SIEM tool enhance current logging abilities?

  3. Can the SIEM software effectively use threat intelligence to your advantage?

  4. What features does the SIEM product offer to help carry out data analysis?

  5. Are the SIEM's automated response capabilities timely, secure, and effective?

Stay Protected with SolarWinds Log & Event Manager

There are numerous SIEM tools to choose from, but SolarWinds® Log & Event Manager (LEM) offers valuable features that can help you improve both your security and compliance, with relative ease and with limited impact on IT budgets.

These are just a few of the features LEM provides:

  1. Detect suspicious activity. Eliminate threats faster by instantaneously detecting suspicious activity and sending automated responses.

  2. Mitigate security threats. Conduct investigations of any security events and apply forensics for mitigation and compliance.

  3. Achieve auditable compliance. Demonstrate compliance with audit-proven reporting for HIPAA, PCI DSS, SOX, and more.

  4. Maintain continuous security. Your efforts to protect your business against cyberthreats should extend to the choices of software you employ to do so. LEM is deployed as a hardened virtual appliance with data encryption in transit and at rest, SSO/smart card integration, and more.

Purchase SolarWinds Log & Event Manager Software

Visit us online today to learn more about Log & Event Manager and get a free 30-day trial of the software. Learn more about the key features we offer in LEM, and watch our informative video explaining how it works. Get answers to frequently asked questions and hear from some of our very satisfied customers. This SIEM tool is clearly an industry favorite. Click here to see how it can help your enterprise or organization stay safe and secure from cyberthreats with the SolarWinds Log & Event Manager software.

Level 11

In today's landscape of security breaches and cyberattacks, it seems like no company or network is completely immune to cybercrime. In fact, you don’t have to search very hard in the news to read about another cyberattack on a big corporation. Thankfully, developers are constantly looking out for these threats and building important security patches and updates to protect the data. Let's look at some of the major vulnerabilities and attacks that happened in 2017.

Microsoft Security Bulletin MS17-010 (March 14, 2017)

Although this wasn't exactly a hack, it serves as a great reminder of how scary security vulnerabilities in Microsoft® Windows® software can be. The bulletin detailed several cybersecurity threats, but the most severe vulnerability was the potential for an attacker to execute code on the target server. This vulnerability was so huge that Microsoft called the security patches “critical for all supported releases of Microsoft Windows.”

Imagine the impact this could have had if the cyber threat was not discovered and a security patch was not created.

The biggest impact of this bulletin was that it showed how many zero-day level flaws were present in Microsoft products that made users vulnerable to cyberattacks. Essentially, the combination of the delayed rollout of crucial security patches and enterprises’ often slow adoption of patches made all Microsoft users vulnerable to the WannaCry and NotPetya ransomware attacks.

WannaCry Ransomware Attack (May 12, 2017)

The WannaCry Ransomware attack was one of the most significant cyberattacks in 2017. Seventy-five thousand organizations from 99 countries reported being attacked. How did it happen?

A vulnerability called EternalBlue was responsible for spreading the WannaCry attack. This vulnerability was actually addressed in Microsoft’s security patches released in March. Unfortunately, many users had not yet installed these critical patches.

Impact of WannaCry

As the name implies, many Microsoft users probably did want to cry after being hit by this cyberattack. It created a moment where global internet security reached a state of emergency. WannaCry affected the U.K., Spain, Russia, Ukraine, Taiwan, and even some Chinese and U.S. entities. In many cases, companies were forced to pay $300+ to regain access to their files/system. However, there was another even more severe impact, as sixteen National Health Service organizations were locked out of their systems. Many doctors were unable to pull up patient files and emergency rooms were forced to divert people seeking urgent care.

Petrwrap/Petwrap/NotPetya Ransomware Attack (June 27, 2017)

This attack was even worse than the WannaCry attack. NotPetya did not act like other ransomware. Instead, it rebooted victims’ computers and encrypted the hard drive’s master file table, rendering the master boot record inoperable. Those who were infected lost all access to their systems. Additionally, the cyberattack seized information about file names, sizes, and locations on the physical disk. NotPetya spread using the same EternalBlue vulnerability as WannaCry.

Impact of NotPetya

NotPetya reportedly infected 300,000 systems and servers throughout the world, including some in Russia, Denmark, France, the U.K., the U.S., and Ukraine. Ukraine was hit the hardest. Within just a few hours of the infection starting, the country’s government, top energy companies, private and state banks, the main airport, and metro system all reported hits on their systems.

How to Protect Your Business From Cyberattacks

The evidence is clear. Hackers are always on the prowl and cyberattacks will happen. The key is to be ready for them so you can prevent an attack from being successful. You must take every step possible to protect your company and your private information. There are several important things you can do, including making sure you always install security patches and updates. For example, if infected organizations had installed the update patches in March, they would have been protected from the WannaCry attack. Therefore, this simple step could be the difference in whether or not a cybercriminal is able to successfully hack into your data.

Think Prevention, Not Cure

While installing every patch developers make might seem like a hassle, the fact is these patches play a significant role in your cybersecurity efforts. There is great wisdom in the saying of “an ounce of prevention is worth a pound of cure” when you’re dealing with cybersecurity. It’s so much easier to take the necessary steps to prevent a cyberhack than it is to overcome all the problems after a breach occurs. Regularly installing security patches is a must, especially since you might not be aware of the possible threats that could be coming.

Let SolarWinds Patch Manager Do the Work for You

Although constantly installing these updates and patches can be a pain, and it can feel like you get a new patch almost every other day, patches are a necessary evil. Thanks to the SolarWinds® Patch Manager software, you can now leave this tedious chore to someone else. This intuitive patch management software allows you to quickly address software vulnerabilities in your system. SolarWinds Patch Manager offers several key features, including:

  1. Simplified patch management. Automate the patching and reporting process and save time by simplifying patch management on servers and workstations.
  2. Extend the capabilities of WSUS patch management. Decrease service interruptions and lower your security risks by helping ensure patches are applied and controlling what gets patched and when.
  3. Extend the use of Microsoft System Center Configuration Manager. Protect your servers, desktops, laptops, and Virtual Machines (VMs) with the most current patches for third-party apps.
  4. Demonstrate Patch Compliance. Stay up to date on all vulnerabilities and create summary reports to show patching status.

Additionally, SolarWinds Patch Manager offers a Patch Status Dashboard. The dashboard tracks what has been patched and what still needs to be patched. You will be able to see the most recent available patches, the top patches you are still missing, and the overall general health of your environment. Patch Manager also allows you to build your own packages for many other file types, including .EXE, .MSI, or .MSP.

Download SolarWinds Patch Manager now to identify the vulnerabilities in your system and help protect your business.

Level 9

Were you affected by an internet connectivity outage earlier this week? This outage affected users across the U.S., and originated from Level 3, an ISP recently acquired by CenturyLink®. Because Level 3 also provides infrastructure to other internet providers, some Comcast®, Spectrum®, Verizon®, and AT&T® users experienced outages as well.

          [Screenshot: Level 3 outage report]
                (Source: Twitter)

A configuration error? That’s what I thought when I first read this. There are many crazy ways connectivity issues can occur, from rats chewing through cables to your standard PEBKAC error causing a user to holler, “the internet is down!” But configuration errors? This is an easy one to address.

Perhaps even more concerning than a massive telecommunications company losing connectivity due to a config error is the amount of time to recover. After the issue was corrected, Level 3 issued a statement to several publications (including TechCrunch, Slate, Mashable, and The Verge), saying:

"On Monday, November 6th, our network experienced a service disruption affecting some customers with IP-based services. The disruption was caused by a configuration error. We know how important these services are to our customers. Our technicians were able to restore service within approximately 90 minutes."

90 minutes to recover from an issue that is affecting potentially millions* of people in the middle of the workday is about 89 minutes too long. (*Total number of customers affected hasn’t been released, but it included customers of Comcast, Spectrum, Verizon, and AT&T across the U.S., among others.)

          [Chart: Comcast outage reports]

               (Source: DownDetector.com via CNN)

Are YOU ready to ensure that something like this doesn’t happen to you? With SolarWinds® Network Configuration Manager (NCM), you can rest easy knowing that you are prepared. Even if a config error does occur, you can quickly roll back to a known-good config that you have saved, thanks to NCM’s automatic backups. If you need to make updates across devices, you can easily push bulk changes. And there’s no need to worry about someone else messing with your configs: you can control who can make changes, and what kind, directly from the NCM console.

While we can’t help you with rats chewing your cables, we CAN help with your config management. Download a free trial of Network Configuration Manager today.

What are some of the craziest causes of connectivity issues that you’ve encountered?

Level 8

Imagine this scenario: You are running a Kiwi® server either on-premises or in the cloud, and need to push at least a portion of that log data to Papertrail. This would be especially helpful in situations where Kiwi is already in place, and you need to allow a developer, support contact, etc. external access to limited log data without providing access to the Kiwi server itself. Once these logs are pushed to your Papertrail account, you can grant users access to specific Papertrail log data. These Papertrail logs can be viewed from anywhere, while Kiwi servers are often locked down within a secured network. The best part is that you can maintain a complete local copy of your logs while pushing interesting log data to Papertrail for use with advanced search and alerting features.
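What actually travels to Papertrail is a plain syslog line. As a rough illustration (this is not Kiwi’s internal code, and the hostname, tag, and message are invented), an RFC 3164-style message looks like this on the wire:

```python
# Sketch: build an RFC 3164-style syslog line, the format a forwarder
# like Kiwi sends on to a remote destination such as Papertrail.
def format_syslog(pri, timestamp, hostname, tag, message):
    return f"<{pri}>{timestamp} {hostname} {tag}: {message}"

line = format_syslog(13, "Feb  7 10:15:32", "web-01", "myapp",
                     "user login failed")
print(line)
```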

From your Kiwi Syslog® Service Manager, select File -> Setup.

In the setup page, you have a rule named Default that displays all log entries sent to Kiwi and logs them to a file.

Send everything to Papertrail! If you wish to forward ALL logs seen by Kiwi to Papertrail, add the Send to Papertrail action to your Default rule, or any rule with no filters configured.

However, if you want to send only certain messages to Papertrail, you’ll need to add a new rule with a filter to capture just the specific messages you want.

We'll be adding one new rule with two filters and two actions.

FILTERS

Filters allow several methods of matching log data. Positive matches result in the actions for that rule being performed on those log lines. Hostname, IP, Message Text, and Priority are the most commonly used filters.
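A rule's evaluation logic, where every filter must produce a positive match before any of the rule's actions run, can be modeled like this (the field names and lambda filters are illustrative, not Kiwi's internals):

```python
def matches(line: dict, filters: list) -> bool:
    """A rule fires only when ALL of its filters match the log line."""
    return all(f(line) for f in filters)

# Two filters mirroring the rule built in this walkthrough:
# a priority filter (syslog severities 0-7 all selected) and a
# simple message-text filter matching the string "test".
priority_filter = lambda line: line["priority"] in range(0, 8)
text_filter     = lambda line: "test" in line["message"]

line = {"priority": 6, "message": "Kiwi default test message"}
if matches(line, [priority_filter, text_filter]):
    print("actions fire: display + send to Papertrail")
```

Lines that fail any filter simply fall through to the next rule, which is why the Default rule (no filters) still sees everything.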

Add the new rule by right-clicking Rules and selecting Add rule.

Under the new rule, right-click Filters and select Add Filter.

In the Field section, choose Priority.

Click on the Priority headings to highlight all the columns.

Click the green check mark at the bottom to select the highlighted fields.

Next, create a new filter to match the text in log lines using the Message Text field, and Simple filter type. Here I used "test" because it will match on all of the Kiwi default test log lines. You can use any text strings in this filter to match log entries you wish to send to Papertrail.

ACTIONS!

Now configure the actions to take place on log lines matching our filters. Start by sending matches to a Kiwi display so we can see what the rule is matching right here in Kiwi.

Under the new rule, right-click Actions and Add action.

Select the Display action at the top of the menu. Set a Display number that corresponds to the display dropdown in the main Kiwi window. Use a unique display number that isn't used by other Kiwi rules; Display 00 shows ALL logs seen by Kiwi by default, so I’ve used Display 01 instead. This display will then show exactly what's being sent to Papertrail.

Now add an action to send the matching logs to Papertrail.

Under the new rule, right-click Actions and Add action to add another action.

Select the Log to Papertrail.com (cloud) action to send logs to a Papertrail account. Replace the hostname and port with your own log destination found here: https://papertrailapp.com/account/destinations

After hitting Apply to save the configuration, use the File -> Send test message to localhost menu item to generate a log line that will be pushed to your Papertrail account and shown on the Kiwi display you set. In your Papertrail account, you’ll see your Kiwi server show up by IP or hostname, but you can rename it as I’ve done here. (Remember: The test log line shown has to match your filters.)

Troubleshooting

Not seeing log lines in Papertrail? Does the Kiwi server have outbound network connectivity that allows a connection to Papertrail? In ~90% of cases, this is caused by host-based firewalls or other network devices blocking connectivity to Papertrail.

The PowerShell® below will test basic UDP connectivity to Papertrail from a Windows® host. Replace the Papertrail Hostname/Port with your actual log destination settings found here. Copy and paste all lines at once into PowerShell. (Run PowerShell as Administrator if you have trouble.)

WINDOWS - PowerShell

# Open a UDP client aimed at your Papertrail log destination
$udp = New-Object Net.Sockets.UdpClient("logs6.papertrailapp.com", 12345)
# Encode a test message and send it
$payload = [Text.Encoding]::UTF8.GetBytes("PowerShell to Papertrail - UDP Syslog Test")
$udp.Send($payload, $payload.Length)
$udp.Close()

You can use a similar script to send a test log line to the Kiwi server itself. Run this from the same host the Kiwi server is on.

# Open a UDP client aimed at the local Kiwi syslog listener (default port 514)
$udp = New-Object Net.Sockets.UdpClient("127.0.0.1", 514)
# Encode and send a test message
$payload = [Text.Encoding]::UTF8.GetBytes("udp papertrail test")
$udp.Send($payload, $payload.Length)
$udp.Close()
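If PowerShell isn't convenient on the host, a minimal Python sketch of the same UDP test works cross-platform; swap in your actual Papertrail log destination for the host and port:

```python
import socket

def send_udp_test(host: str, port: int, message: str) -> int:
    """Send one UDP datagram; returns the number of bytes handed to the network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return sock.sendto(message.encode("utf-8"), (host, port))
    finally:
        sock.close()

# Against Papertrail: send_udp_test("logs6.papertrailapp.com", 12345, "python udp test")
# Against a local Kiwi listener on the default syslog port:
sent = send_udp_test("127.0.0.1", 514, "udp papertrail test")
print(sent)  # number of bytes sent
```

Keep in mind UDP is fire-and-forget: a successful send only proves the packet left this machine, not that it arrived, which is why checking for the test line in Papertrail (or the Kiwi display) is the real confirmation.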
