Look back into almost any online technology businesses 10, or even 5 years ago and you’d see a clear distinction between what the CTO and CMO did in their daily roles. The former would oversee the building of technology and products whilst the latter would drive the marketing that brought in the customers to use said technology. In short, the two together took care of two very different sides of the same coin.

 

Marketing departments traditionally measure their success against KPIs such as the number of conversions a campaign brought in versus the cost of running it. Developers measure their performance on how quickly and effectively they develop new technologies.

 

Today, companies are shifting focus towards a customer-centric approach, where customer experience and satisfaction is paramount. After all, how your customers feel about your products can make, or break a business.

Performance diagnostic tools can help you optimize a slow web page but won’t show you whether your visitors are satisfied.

So where do the classic stereotypes that engineers only care about performance and marketers only care about profit fit into the customer-centric business model? The answer is they don’t: in a business where each department works against the same metrics — increasing their customers’ experience — having separate KPIs is as redundant as a trap door in a canoe.

 

The only KPI that matters is “are my customers happy?”

 

Developers + Marketing KPIs = True

With technology being integral to any online business, marketers are now in a position where we can gather so much data and in such detail that we are on the front line when it comes to gauging the satisfaction and experience of our customers. We can see what path a visitor took on our website, how long they took to complete their journey and whether they achieved what they set out to do.

 

Armed with this, we stand in a position to influence the technologies developers build and use.

 

Support teams, no longer confined to troubleshooting customer problems have become Customer Success teams, and directly impact on how developers build products, armed with first-hand data from their customers.

 

So as the lines blur between departments, it shouldn’t come as a surprise that engineering teams should care about marketing metrics. After all, if a product is only as effective as the people who use it, engineers build better products and websites when they know how customers intend to use them.

 

Collaboration is King

“How could engineers possibly make good use of marketing KPIs?” you might ask. After all, the two are responsible for separate ends of your business but can benefit from the same data.

 

Take a vital page on your business’s website: it’s not the fastest page on the net but its load time is consistent and it achieves its purpose: to convert your visitors to customers. Suddenly your bounce rate has shot up from 5% to 70%.

Ask an engineer to troubleshoot the issue and they might tell you that the page isn’t efficient. It takes 2.7 seconds to load, which is 0.7 seconds over the universal benchmark and what’s more is that some of the file sizes on your site are huge.

 

Ask a marketer the same question and they might tell you that the content is sloppy, making the purpose of the page unclear. The colors are off-brand and what’s more is that an important CTA is missing.

 

Even though both have been looking at the same page, they’ve come to two very different results, but the bottom line is that your customer doesn’t care about what went wrong. What matters is that the issue is identified and solved, quickly.

 

Unified Metrics Mean Unified Monitoring

Having unified KPIs across the various teams internal to your organisation means that they should all draw their data from the same source: a single, unified monitoring tool.

 

For businesses where the customer comes first, a new breed of monitoring is evolving that offers organizations this unified view, centred on how your customer experiences your site: Digital Experience Monitoring, or seeing as everything we do is digital, how about we just call it Experience Monitoring?

With Digital Experience Monitoring, your marketers and your engineering teams can follow a customer’s journey through your site, see how the navigated through it and where and why interest became a sale or a lost opportunity.

 

Let’s go back to our previous example: both your marketer and your engineer will see that although your bounce rate skyrocketed, the page load time and size stayed consistent. What they might also see is that onboarding you implemented that coincides with your bounce rate spike is confusing to your customers meaning that they leave, frustrated and unwilling to convert.

 

Digital Experience Monitoring gives a holistic view of your website’s health and helps you answer questions like:

  • Where your visitors come from
  • When are they visiting your site
  • What they visit and the journey they take to get there
  • How your site’s performance impacts on your visitors

By giving your internal teams access to the same metrics, you foster greater transparency across your organization which leads to faster resolution of issues, a deeper knowledge of your visitors and better insights into what your customers love about your products.

 

Pingdom’s Digital Experience Monitoring, Visitor Insights, bridges the gap between site performance and customer satisfaction, meaning you can guess less and know more about how your visitors experience your site.

What are some common problems that can be detected with the handy router logs on Heroku? We’ll explore them and show you how to address them easily and quickly with monitoring of Heroku from SolarWinds Papertrail.

 

One of the first cloud platforms, Heroku is a popular platform as a service (PaaS) that has been in development since June 2007. It allows developers and DevOps specialists to easily deploy, run, manage, and scale applications written in Ruby, Node.js, Java, Python, Clojure, Scala, Go, and PHP.

 

To learn more about Heroku, head to the Heroku Architecture documentation.

 

Intro to Heroku Logs

Logging in Heroku is modular, similar to gathering system performance metrics. Logs are time-stamped events that can come from any of the processes running in all application containers (Dynos), system components, or backing services. Log streams are aggregated and fed into the Logplex—a high-performance, real-time system for log delivery into a single channel.

 

Run-time activity, as well as dyno restarts and relocations, can be seen in the application logs. This will include logs generated from within application code deployed on Heroku, services like the web server or the database, and the app’s libraries. Scaling, load, and memory usage metrics, among other structural events, can be monitored with system logs. Syslogs collect messages about actions taken by the Heroku platform infrastructure on behalf of your app. These are two of the most recurrent types of logs available on Heroku.

 

To fetch logs from the command line, we can use the heroku logs command. More details on this command, such as output format, filtering, or ordering logs, can be found in the Logging article of Heroku Devcenter.

$ heroku logs 2019-09-16T15:13:46.677020+00:00 app[web.1]: Processing PostController#list (for 208.39.138.12 at 2010-09-16 15:13:46) [GET] 2018-09-16T15:13:46.677902+00:00 app[web.1]: Rendering post/list 2018-09-16T15:13:46.698234+00:00 app[web.1]: Completed in 74ms (View: 31, DB: 40) | 200 OK [http://myapp.heroku.com/] 2018-09-16T15:13:46.723498+00:00 heroku[router]: at=info method=GET path='/posts' host=myapp.herokuapp.com' fwd='204.204.204.204' dyno=web.1 connect=1ms service=18ms status=200 bytes=975   # © 2018 Salesforce.com. All rights reserved.

Heroku Router Logs

Router logs are a special case of logs that exist somewhere between the app logs and the system logs—and are not fully documented on the Heroku website at the time of writing. They carry information about HTTP routing within Heroku Common Runtime, which manages dynos isolated in a single multi-tenant network. Dynos in this network can only receive connections from the routing layer. These routes are the entry and exit points of all web apps or services running on Heroku dynos.

 

Tail router only logs with the heroku logs -tp router CLI command.

$ heroku logs -tp router 2018-08-09T06:24:04.621068+00:00 heroku[router]: at=info method=GET path='/db' host=quiet-caverns-75347.herokuapp.com request_id=661528e0-621c-4b3e-8eef-74ca7b6c1713 fwd='104.163.156.140' dyno=web.1 connect=0ms service=17ms status=301 bytes=462 protocol=https 2018-08-09T06:24:04.902528+00:00 heroku[router]: at=info method=GET path='/db/' host=quiet-caverns-75347.herokuapp.com request_id=298914ca-d274-499b-98ed-e5db229899a8 fwd='104.163.156.140' dyno=web.1 connect=1ms service=211ms status=200 bytes=3196 protocol=https 2018-08-09T06:24:05.002308+00:00 heroku[router]: at=info method=GET path='/stylesheets/main.css' host=quiet-caverns-75347.herokuapp.com request_id=43fac3bb-12ea-4dee-b0b0-2344b58f00cf fwd='104.163.156.140' dyno=web.1 connect=0ms service=3ms status=304 bytes=128 protocol=https 2018-08-09T08:37:32.444929+00:00 heroku[router]: at=info method=GET path='/' host=quiet-caverns-75347.herokuapp.com request_id=2bd88856-8448-46eb-a5a8-cb42d73f53e4 fwd='104.163.156.140' dyno=web.1 connect=0ms service=127ms status=200 bytes=7010 protocol=https   # Fig 1. Heroku router logs in the terminal

Heroku routing logs always start with a timestamp and the “heroku[router]” source/component string, and then a specially formatted message. This message begins with either “at=info”, “at=warning”, or “at=error” (log levels), and can contain up to 14 other detailed fields such as:

  • Heroku error “code” (Optional) – For all errors and warning, and some info messages; Heroku-specific error codes that complement the HTTP status codes.
  • Error “desc” (Optional) – Description of the error, paired to the codes above
  • HTTP request “method” e.g. GET or POST – May be related to some issues
  • HTTP request “path” – URL location for the request; useful for knowing where to check on the application code
  • HTTP request “host” – Host header value
  • The Heroku HTTP Request ID – Can be used to correlate router logs to application logs;
  • HTTP request “fwd” – X-Forwarded-For header value;
  • Which “dyno” serviced the request – Useful for troubleshooting specific containers
  • “Connect” time (ms) spent establishing a connection to the web server(s)
  • “Service” time (ms) spent proxying data between the client and the web server(s)
  • HTTP response code or “status” – Quite informative in case of issues;
  • Number of “bytes” transferred in total for this web request;

 

Common Problems Observed with Router Logs

Examples are manually color-coded in this article. Typical ways to address the issues shown above are also provided for context.

 

Common HTTP Status Codes

404 Not Found Error

Problem: Error accessing nonexistent paths (regardless of HTTP method):

2018-07-30T17:10:18.998146+00:00 heroku[router]: at=info method=POST path='/saycow' host=heroku-app-log.herokuapp.com request_id=e5634f81-ec54-4a30-9767-bc22365a2610 fwd='187.220.208.152' dyno=web.1 connect=0ms service=15ms status=404 bytes=32757 protocol=https 2018-07-27T22:09:14.229118+00:00 heroku[router]: at=info method=GET path='/irobots.txt' host=heroku-app-log.herokuapp.com request_id=7a32a28b-a304-4ae3-9b1b-60ff28ac5547 fwd='187.220.208.152' dyno=web.1 connect=0ms service=31ms status=404 bytes=32769 protocol=https

Solution: Implement or change those URL paths in the application or add the missing files.

500 Server Error

Problem: There’s a bug in the application:

2018-07-31T16:56:25.885628+00:00 heroku[router]: at=info method=GET path='/' host=heroku-app-log.herokuapp.com request_id=9fb92021-6c91-4b14-9175-873bead194d9 fwd='187.220.247.218' dyno=web.1 connect=0ms service=3ms status=500 bytes=169 protocol=https

Solution: The application logs have to be examined to determine the cause of the internal error in the application’s code. Note that HTTP Request IDs can be used to correlate router logs against the web dyno logs for that same request.

Common Heroku Error Codes

 

Other problems commonly detected by router logs can be explored in the Heroku Error Codes. Unlike HTTP codes, these error codes are not standard and only exist in the Heroku platform. They give more specific information on what may be producing HTTP errors.

H14 – No web dynos running

Problem: App has no web dynos setup:

2018-07-30T18:34:46.027673+00:00 heroku[router]: at=error code=H14 desc='No web processes running' method=GET path='/' host=heroku-app-log.herokuapp.com request_id=b8aae23b-ff8b-40db-b2be-03464a59cf6a fwd='187.220.208.152' dyno= connect= service= status=503 bytes= protocol=https

Notice that the above case is an actual error message, which includes both Heroku error code H14 and a description. HTTP 503 means “service currently unavailable.”

Note that Heroku router error pages can be customized. These apply only to errors where the app doesn’t respond to a request e.g. 503.

Solution: Use the heroku ps:scale command to start the app’s web server(s).

 

H12 – Request timeout

Problem: There’s a request timeout (app takes more than 30 seconds to respond):

2018-08-18T07:11:15.487676+00:00 heroku[router]: at=error code=H12 desc='Request timeout' method=GET path='/sleep-30' host=quiet-caverns-75347.herokuapp.com request_id=1a301132-a876-42d4-b6c4-a71f4fe02d05 fwd='189.203.188.236' dyno=web.1 connect=1ms service=30001ms status=503 bytes=0 protocol=https

Error code H12 indicates the app took over 30 seconds to respond to the Heroku router.

Solution: Code that requires more than 30 seconds must run asynchronously (e.g., as a background job) in Heroku. For more info read Request Timeout in the Heroku DevCenter.

H18 – Server Request Interrupted

Problem: The Application encountered too many requests (server overload):

2018-07-31T18:52:54.071892+00:00 heroku[router]: sock=backend at=error code=H18 desc='Server Request Interrupted' method=GET path='/' host=heroku-app-log.herokuapp.com request_id=3a38b360-b9e6-4df4-a764-ef7a2ea59420 fwd='187.220.247.218' dyno=web.1 connect=0ms service=3090ms status=503 bytes= protocol=https

Solution: This problem may indicate that the application needs to be scaled up, or the app performance improved.

H80 – Maintenance mode

Problem: Maintenance mode generates an info router log with error code H18:

2018-07-30T19:07:09.539996+00:00 heroku[router]: at=info code=H80 desc='Maintenance mode' method=GET path='/' host=heroku-app-log.herokuapp.com request_id=1b126dca-1192-4e98-a70f-78317f0d6ad0 fwd='187.220.208.152' dyno= connect= service= status=503 bytes= protocol=https

Solution: Disable maintenance mode with heroku maintenance:off

 

Papertrail

Papertrail™ is a cloud log management service designed to aggregate Heroku app logs, text log files, and syslogs, among many others, in one place. It helps you to monitor, tail, and search logs via a web browser, command-line, or an API. The Papertrail software analyzes log messages to detect trends, and allows you to react instantly with automated alerts.

 

The Event Viewer is a live aggregated log tail with auto-scroll, pause, search, and other unique features. Everything in log messages is searchable, and new logs still stream in real time in the event viewer when searched (or otherwise filtered). Note that Papertrail reformats the timestamp and source in its Event Viewer to make it easier to read.

Viewer Live Pause
Fig 2. The Papertrail Event Viewer. © 2018 Solarwinds. All rights reserved.

Provisioning Papertrail on your Heroku apps is extremely easy: heroku addons:create papertrail from terminal. (See the Papertrail article in Heroku’s DevCenter for more info.) Once setup, the add-on can be open from the Heroku app’s dashboard (Resources section) or with heroku addons:open papertrail in terminal.

 

Troubleshooting Routing Problems Using Papertrail

A great way to examine Heroku router logs is by using the Papertrail solution. It’s easy to isolate them in order to filter out all the noise from multiple log sources: simply click on the “heroku/router” program name in any log message, which will automatically search for “program:heroku/router” in the Event Viewer:

Heroku router viewer
Fig 3. Tail of Heroku router logs in Papertrail, 500 app error selected. © 2018 SolarWinds. All rights reserved.

 

Monitor HTTP 404s

How do you know that your users are finding your content, and that it’s up to date? 404 Not Found errors are what a client receives when the URL’s path is not found. Examples would be a misspelled file name or a missing app route. We want to make sure these types of errors remain uncommon, because otherwise, users are either walking to dead ends or seeing irrelevant content in the app!

 

With Papertrail, setting up an alert to monitor the amount of 404s returned by your app is easy and convenient. One way to do it is to search for “status=404” in the Event Viewer, and then click on the Save Search button. This will bring up the Save Search popup, along with the Save & Setup Alert option:

Save a search
Fig 4. Save a log search and set up an alert with a single action © 2018 SolarWinds. All rights reserved.

 

The following screen will give us the alert delivery options, such as email, Slack message, push notifications, or even publish all matching events as a custom metric for application performance management tools such as AppOptics™.

Troubleshoot 500 errors quickly

500 error on Heroku
Fig 5. HTTP 500 Internal Server Error from herokuapp.com. © 2018 Google LLC. All rights reserved.

 

Let’s say an HTTP 500 error is happening on your app after it’s deployed. A great feature of Papertrail is to make the request_id in log messages clickable. Simply click on it or copy it and search it in the Event Viewer to find all the app logs that are causing the internal problem, along with the detailed error message from your application’s code.

 

Conclusion

Heroku router logs are the glue between web traffic and (sometimes intangible) errors in your application code. It makes sense to give them special focus when monitoring a wide range of issues because they often indicate customer-facing problems that we want to avoid or address ASAP. Add the Papertrail addon to Heroku to get more powerful ways to monitor router logs.

 

Sign up for a 30-day free trial of Papertrail and start aggregating logs from all your Heroku apps and other sources. You may learn more about the Papertrail advanced features in its Heroku Dev Center article.

Page load time is inversely related to page views and conversion rates. While probably not a controversial statement, as the causality is intuitive, there is empirical data from industry leaders such as Amazon, Google, and Bing to back it in High Scalability and O’Reilly’s Radar, for example.

 

As web technology has become much more complex over the last decade, the issue of performance has remained a challenge as it relates to user experience. Fast forward to 2018, and UX is identified as a key requirement for business success by CIOs and CDOs.

 

In today’s growing ecosystem of competing web services, the undeniable reality remains that performance impacts business and it can represent a major competitive (dis)advantage. Whether your application relies on AWS, Azure, Heroku, Salesforce, Cloud Foundry, or any other SaaS platform, consider these five tips for monitoring SaaS services.

 

1. Realize the Importance of Monitoring

In case we haven’t established that app performance is critical for business success, let’s look at research done in the online retail sector.

 

“E-commerce sites must adopt a zero-tolerance policy for any performance issues that will impact customer experience [in order to remain competitive]” according to Retail Systems Research. Their conclusion is that performance management must shift from being considered an IT issue to being a business matter.

 

We can take this concept into more specific terms, as stated in our article series on Building a SaaS Service for an Unknown Scale. “Treat scalability and reliability as product features; this is the only way we can build a world-class SaaS application for unknown scale.”

LG ProactiveMonitoringSaaS BlogImage A
Data from Measuring the Business Impact of IT Through Application Performance (2015).

 

End users have come to expect very fast, real-time-like interaction with most software, regardless of the system complexities behind the scenes. This means that commercial applications and SaaS services need to be built and integrated with performance in mind at all times. And so, knowing how to measure their performance from day one is paramount. Logs extend application performance monitoring (APM) by giving you deeper insights into the causes of performance problems as well as application errors that can cause user experience problems.

 

2. Incorporate a Monitoring Strategy Early On

In today’s world, planning for your SaaS service’s successful adoption to take time (and thus worrying about its performance and UX later) is like selling 100 tickets to a party but only beginning preparations on the day of the event. Needless to say, such a plan is prone to produce disappointed customers, and it can even destroy a brand. Fortunately, with SaaS monitoring solutions like SolarWinds® Loggly®, it’s not time-consuming or expensive to implement monitoring.

 

In fact, letting scalability become a bottleneck is the first of Six Critical SaaS Engineering Mistakes to Avoid we published some time ago. We recommend defining realistic adoption goals and scenarios in early project stages, and to map them into performance, stress, and capacity testing. To realize these tests, you’ll need to be able to monitor specific app traffic, errors, user engagement, and other metrics that tech and business teams need to define together.

 

A good place to start is with the Four Golden Signals described by Google’s Monitoring Distributed Systems book chapter: Latency, Traffic, Errors, and Saturation. Finally, and most importantly from the business perspective, your key metrics can be used as service level indicators (SLI), which are measures of the service level provided to customers.

 

Based on your SLIs and adoption goals, you’ll be able to establish service level objectives (SLOs) so your ops team can target specific availability levels (uptime and performance). And, as a SaaS service provider, you should plan to offer service level agreement (SLA). SLAs are contracts with your clients that specify what happens if you fail to meet non-functional requirements, and the terms are based on your SLOs, but can be negotiated with each client, of course. SLIs, SLOs, and SLAs are the basis for successful site reliability engineering (SRE).

LG ProactiveMonitoringSaaS BlogImage B
Apache Preconfigured Dashboards in Loggly can help you watch SLOs in a single click.

 

For a seamless understanding among tech and business leadership, key performance indicators (KPI) should be identified for various business stakeholders. KPIs should then be mapped to the performance metrics that compose each SLA (so they can be monitored). Defining a matrix of KPI vs. metrics vs. area of business impact as part of the business documentation is a good option. For example, a web conversion rate could map to page load time and number of outages, and impacts sales.

 

Finally, don’t forget to consider and plan for governance: roles and responsibilities around information (e.g., ownership, prioritization, and escalation rules). The RACI model can help you establish a clear matrix of which team is responsible, accountable, consulted, and informed when there are unplanned events emanating from or affecting business technology.

 

3. Have Application Logging as a Code Standard

Tech leadership should realize that the main function of logging begins after the initial development is complete. Good logging serves multiple purposes:

  1. Improving debugging during development iterations
  2. Providing visibility for tuning and optimizing complex processes
  3. Understanding and addressing failures of production systems
  4. Business intelligence

“The best SaaS companies are engineered to be data-driven, and there’s no better place to start than leveraging data in your logs.” (From the last of our SaaS Engineering Mistakes)

 

Best practices for logging is a topic that’s been widely written about. For example, see our article on best practices for creating logs. Here are a few guidelines from that and other sources:

  • Define logging goals and criteria to decide what to log. (Logging absolutely everything produces noise and is needlessly expensive.)
  • Log messages should contain data, context, and description. They need to be digestible (structured in a way that both humans and machines can read them).
  • Ensure that log messages are appropriate in severity using standard levels such as FATAL, ERROR, WARN, INFO, DEBUG, TRACE (See also Syslog facilities and levels).
  • Avoid side effects on the code execution. Particularly, don’t let logging halt your app by using non-blocking calls.
  • External systems: try logging all data that comes out from your application and gets in.
  • Use a standard log message format with clear key-value pairs and/or consider a known text standard format like JSON. (See figure 4 below.)
  • Support distributed logging: Centralize logs to a shareable, searchable platform such as Loggly.

Some of our sources include:

LG ProactiveMonitoringSaaS BlogImage C
Loggly automatically parses several log formats you can navigate with the Fields Explorer.

 

Every stage in the software development life cycle can be enriched by logs and other metrics. Implementation, integration, staging, and production deployment (especially rolling deploys) will particularly benefit from monitoring such metrics appropriately.

 

Logs constitute valuable data for your tech team, and invaluable data for your business. Now that you have rich information about the app that is generated in real-time, think about ways to put it in good use.

 

4. Automate Your Monitoring Configuration

Modern applications are deployed using infrastructure as code (IaC) techniques because they replace fragile server configuration with systems that can be easily torn down and restarted. If your team has made undocumented changes to servers and are too scared to shut them down, they are essentially “pet” servers.

 

If you manually deploy monitoring configuration on a per-server basis, then you have the potential to lose visibility when servers stop or when you add new ones. If you treat monitoring as something to be automatically deployed and configured, then you’ll get better coverage for less effort in the long run. This becomes even more important when testing new versions of your infrastructure or code, and when recovering from outages. Tools like Terraform, Ansible, Puppet, and CloudFormation can automate not just the deployment of your application but the monitoring of it as well.

Monitoring tools typically have system agents that can be installed on your infrastructure to begin streaming metrics into their service. In the case of applications built on SaaS platforms, there are convenient integrations that plug into well-known ecosystems. For example, Loggly streams and centralizes logs as metrics, and supports dozens of out-of-box systems, including the Amazon Cloudwatch and Heroku PaaS platforms.

 

5. Use Alerts on Your Key Metrics

Monitoring solutions like Loggly can alert you in changes in your SLIs over time, such as your error rate. It can help you visually identify the types of errors that occur and when they start. This will help identify root causes and fix problems faster, minimizing impact to user experience.

LG ProactiveMonitoringSaaS BlogImage D
Loggly Chart of application errors split by errorCode.

 

Custom alerts can be created from saved log searches, which act as key metrics of your application’s performance. Loggly even lets you integrate alerts to incident management systems like PagerDuty and OpsGenie.

LG ProactiveMonitoringSaaS BlogImage E
Adding an alert from a Syslog error log search in Loggly.

 

In conclusion, monitoring your SaaS service performance is very important because it significantly impacts your business’ bottom line. This monitoring has to be planned for, applied early on, and instrumented for all the stages in the SDLC.

 

Additionally, we explained how and why correct logging is one of the best sources for key metrics to measure your monitoring goals during development and production of your SaaS service. Proper logging on an easy-to-use platform such as Loggly will also help your business harness invaluable intel in real time. You can leverage these streams of information for tuning your app, improving your service, and to discover new revenue models.

 

Sign up for a free 14-day trial of SolarWinds Loggly to start doing logging right today, and move your SaaS business into the next level of performance control and business intelligence.

Let’s dream for a while—imagine your databases. All of them are running fast and smooth. There’s no critical issue, no warnings. All requests are handled immediately and the response time is literally immeasurable. Sounds like a database nirvana, doesn’t it? Now, let’s face reality. You resolved all critical issues of the database, but people still report slowdowns. Everything looks good at a first glance, but your sixth sense tells you something bad is happening under the surface. You could start shooting in the dark and hope that you will hit the target, or you need more information about what’s going inside the database to make a single, surgically precise cut to solve the problem.

 

We’ve got good news for you. SolarWinds has a new tool called SQL Plan Warnings. For the first time, you can inspect the list of queries that have warnings without spending hours on manual and labor-intensive work. Oh, and we almost forgot to mention—this tool is available for you right now for free.

Why do we believe that the free SQL Plan Warnings tool can help you improve your databases? Well, SQL Server Optimizer often comes up with bad plans with warnings. That can cause increased resource consumption, increased wait time, and unnecessary end-user or customer angst. For these reasons, a database professional should look at it. But we don’t always have time or resources to do so.

 

SQL Plan Warnings free tool at a glance:

  • Gives you unique visibility into plan warnings that can be easily overlooked and can affect query performance
  • Sort all warnings by consumed CPU time, elapsed time, or execution
  • Filter results by warning type or by specific keywords
  • Investigate plan warnings, query text, or complete query plan in a single click
  • No installation is needed—just download the tool and run
  • Runson Microsoft Windows and MacOS X

 

And what can SQL Plan Warnings check for you?

  • Spill to TempDB – Sorts that are larger than estimated can spill to disk via TempDB. This can dramatically slow down queries. There are two similar warnings that fall into this category.
  • No join predicates – Query does not properly join tables/objects, which can cause Cartesian products and slow queries.
  • Implicit conversion – A column of data is being converted to another data type, which can cause a query to not use an index.
  • Missing indexes – SQL Server is telling us there is an index that may help performance.
  • Missing column statistics – If statistics are missing, it can lead to bad decisions by the optimizer.
  • Lookup warning – An index is being used, but it's not covering an index, and a visit back to the table is required to complete the query.

 

The free SQL Plan Warnings tool brings a fresh new feature to your database management capabilities and gives you another tool to improve query performance. Download it here today and be another step closer to our dream—everything in a database running fast and smooth with no critical issues and no warnings.

When development started on NGINX in 2002, the goal was to develop a web server which would be more performant than Apache had been up to that point. While NGINX may not offer all of the features available in Apache, its default configuration can handle approximately four times the number of requests per second while using significantly less memory.

 

While switching to a web server with better performance seems like a no-brainer, it’s important that you have a monitoring solution in place to ensure that your web server is performing optimally, and that users who are visiting the NGINX-hosted site receive the best possible experience. But how do we ensure that the experience is as performant as expected for all users?

 

Monitoring!

 

This article is meant to assist you in putting together a monitoring plan for your NGINX deployments. We’ll look at what metrics you should be monitoring, why they are important, and putting a monitoring plan in place using SolarWinds® AppOptics™.

 

Monitoring is a Priority

 

As engineers, we all understand and appreciate the value that monitoring provides. In the age of DevOps, however, when engineers are responsible for both the engineering and deployment of solutions into a production environment, monitoring is often relegated to the list of things we plan to do in the future. In order to be the best engineers we can be, monitoring should be the priority from day one.

 

Accurate and effective monitoring allows us to test the efficiency of our solutions, and help identify and troubleshoot inefficiencies and other potential problems. Once the solution has moved to requiring operational support, monitoring allows us to ensure that the application is running efficiently and alerting us when things go wrong. An effective monitoring plan should help to identify problems before they start, allowing engineers to resolve issues proactively, instead of being purely reactive.

 

Specific Metrics to Consider with NGINX

 

Before we can develop a monitoring plan, we need to know what metrics are available for monitoring, understand what they mean, and how we can use them. There are two distinct groups of metrics we should be concerned with—metrics related to the web server itself, and those related to the underlying infrastructure.

 

While a highly performant web server like NGINX may be able to handle more requests and traffic, it is vital that the machine hosting the web server has the necessary resources as well. Each metric represents a potential limit to the performance of your application. Ultimately, you want to ensure your web server and underlying infrastructure are able to operate efficiently without approaching those limits.

 

NGINX Web Server-specific Metrics

 

  • Current Connections
    Indicates the number of active and waiting client connections with the server. This may include actual users and automated tasks or bots.
  • Current Requests
    Each connection may be making one or more requests to the server. This number indicates the total count of requests coming in.
  • Connections Processed
    This shows the number of connections that have been accepted and handled by the server. Dropped connections can also be monitored.

 

Infrastructure-specific Metrics

  • CPU Usage
    An indication of the processing usage of the underlying machine. This should be measured as utilization across all cores, if using a multi-core machine.
  • Memory Usage
    Measurement of the memory currently in use on the machine.
  • Swap Usage
    Swap is what the host machine uses when it runs out of memory or if the memory region has been unused for a period of time. It is significantly slower, and is generally only used in an emergency. When an application begins using swap space, it’s usually an indicator that something is amiss.
  • Network Bandwidth
    Similar to traffic, this is a measurement of information flowing in and out of the machine. Again, load units are important to monitor here as well.
  • Disk Usage
    Even if the web server is not physically storing files on the host machine, space is required for logging, temporary files, and other supporting files.
  • Load
    Load is a performance metric which combines many of the other metrics into a simple number. A common rule of thumb is the load on the machine should be less than the number of processing cores.

 

Let’s look at how to configure monitoring on your instances with AppOptics, along with building a dashboard which will show each of those metrics.

 

Installing the AppOptics Agent on the Server

 

Before you start, you’ll need an account with AppOptics. If you don’t already have one, you can create a demo account, which will give you 14 days to try the service, free of charge.

 

The first thing to do to allow AppOptics to aggregate the metrics from the server is install the agent on all instances. To do this, you’ll need to reference your AppOptics API token when setting up the agent. Log in to your AppOptics account and navigate to the Infrastructure page.

 

Locate the Add Host button, and click on it. It should look similar to the image below.

 

Fig. 2. AppOptics Host Agent Installation

 

I used the Easy Install option when setting up the instances for this article. Ensure that Easy Install is selected, and select your Linux distribution. I used an Ubuntu image in the AWS Cloud, but this will work on almost any Linux server.

 

Note: Prior to installation of the agent, the bottom of the dialog below will not contain the success message.

 

Copy the command from the first box, and then SSH into the server and run the Easy Install script.

 

Fig. 3. Easy Install Script to Add AppOptics Agent to a Server

 

When the agent installs successfully, you should be presented with the following message on your terminal. The “Confirm successful installation” box on the AppOptics agent screen should look similar to the above, with a white on blue checkbox. You should also see “Agent connected.”

 

Fig. 4. Installing the AppOptics Agent on your NGINX Instance

 

Configuring the AppOptics Agent

 

With the agent installed, the next step is to configure NGINX to report metrics to the agent. Navigate back to the Infrastructure page, Integrations tab, and locate the NGINX plugin.

 

Note: Prior to enabling the integration, the “enabled” checkbox won’t be marked.

 

Fig. 5. NGINX Host Agent Plugin

 

Click on the plugin, and the following panel will appear. Follow the instructions in the panel, click Enable Plugin, and your metrics will start flowing from the server into AppOptics.

 

Fig. 6. NGINX Plugin Setup

 

When everything is configured, either click on the NGINX link in the panel’s Dashboard tab, or navigate to the Dashboards page directly, then select the NGINX link to view the default dashboard provided by AppOptics.

 

Working With the NGINX Dashboard

 

The default NGINX dashboard provided by AppOptics offers many metrics related to the performance of the web server that we discussed earlier and should look similar to the image below.

 

Fig. 8. Default AppOptics Dashboard

 

Now we need to add some additional metrics to get a full picture of the performance of our server. Unfortunately, you can’t make changes to the default dashboard, but it’s easy to create a copy and add metrics of your own. Start by clicking the Copy Dashboard button at the top of the screen to create a copy.

 

Create a name for your custom dashboard. For this example, I’m monitoring an application called Retwis, so I’m calling mine “NGNIX-Retwis.” It’s also helpful to select the “Open dashboard on completion” option, so you don’t have to go looking for the dashboard after it’s created.

 

Let’s do some customization. First, we want to ensure that we’re only monitoring the instances we need to. We do this by filtering the chart or dashboard. You can find out more about how to set and filter these in the documentation for Dynamic Tags.

 

With our sources filtered, we can add some additional metrics. Let’s look at CPU Usage, Memory Usage, and Load. Click on the Plus button located  at the bottom right of the dashboard. For CPU and Memory Usage, let’s add a Stacked chart. We’ll add one for each. Click on the Stacked icon.

 

Fig. 10. Create New Chart

 

In the Metrics search box, type “CPU” and hit enter. A selection of available metrics will appear below. I’m going to select system.cpu.utilization, but your selection may be different depending on the infrastructure you’re using. Select the checkbox next to the appropriate metric, then click Add Metrics to Chart. You can add multiple metrics to the chart by repeating the same process, but we’ll stick with one for now.

 

If you click on Chart Attributes, you can change the scale of the chart, adjust the Y-axis label, and even link it to another dashboard to show more detail for a specific metric. When you’re done, click on the green Save button, and you’ll be returned to your dashboard, with the new chart added. Repeat this for Memory Usage. I chose the “system.mem.used” metric.

 

For load, I’m going to use a Big Number Chart Type, and select the system.load.1_rel metric. When you’re done, your chart should look similar to what is shown below.

 

Fig. 11. Custom Dashboard to View NGINX Metrics

 

Pro tip: You can move charts around by hovering over a chart, clicking on the three dots that appear at the top of the chart, and dragging it around. Clicking on the menu icon on the top right of the chart will allow you to edit, delete, and choose other options related to the chart.

 

Beyond Monitoring

 

Once you have a monitoring plan in place and functioning, the next step is to determine baseline metrics for your application and set up alerts which will be triggered when significant deviations occur. Traffic is a useful baseline to determine and monitor. A significant reduction in traffic may indicate a problem that is preventing clients from accessing the service. A significant increase in traffic would indicate an increase in clients, and may require either an increase in the capacity of your environment (in the case of increased popularity), or, potentially, the deployment of defensive measures in response to a cyberattack.

 

Monitoring your NGINX server is critical as a customer-facing part of your infrastructure. You need to know immediately when there is a sudden change in traffic or connections that could impact the rest of your application or website. AppOptics provides an easy way to monitor your NGINX servers and it typically only takes a few minutes to get started. Learn more about AppOptics infrastructure monitoring and try it today with a free 14-day trial.

SolarWinds uses cookies on its websites to make your online experience easier and better. By using our website, you consent to our use of cookies. For more information on cookies, see our cookie policy.