
Papertrail for Python Logs

Posted by ikongf, Apr 27, 2019

When you’re troubleshooting a problem or tracking down a bug in Python, the first place to look for clues related to server issues is in the application log files.

 

Python includes a robust logging module in the standard library, which provides a flexible framework for emitting log messages. This module is widely used by various Python libraries and is an important reference point for most programmers when it comes to logging.

 

The Python logging module provides a way for applications to configure different log handlers and provides a standard way to route log messages to these handlers. As the Python.org documentation notes, there are four basic classes defined by the Python logging module: Loggers, Handlers, Filters, and Formatters. We’ll provide more details on these below.

 

Getting Started with Python Logs

There are a number of important steps to take when setting up your logs. First, you need to ensure logging is enabled in the applications you use. You also need to categorize your logs by name so they are easy to maintain and search. Naming the logs makes it easier to search through large log files, and to use filters to find the information you need.

 

To send log messages in Python, request a logger object. It should have a unique name to help filter and prioritize how your Python application handles various messages. We are also adding a StreamHandler to print the log on our console output. Here’s a simple example:

import logging

logging.basicConfig(handlers=[logging.StreamHandler()])
log = logging.getLogger('test')
log.error('Hello, world')

This outputs:

ERROR:test:Hello, world

This message consists of three fields. The first, ERROR, is the log level. The second, test, is the logger name. The third field, “Hello, world”, is the free-form log message.

 

Most problems in production are caused by unexpected or unhandled issues. In Python, such problems generate tracebacks where the interpreter tries to include all important information it could gather. This can sometimes make the traceback a bit hard to read, though. Let’s look at an example traceback. We’ll call a function that isn’t defined and examine the error message.

def test():
    nofunction()

test()

Which outputs:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in test
NameError: name 'nofunction' is not defined

This shows the common parts of a Python traceback. The error message is usually at the end of the traceback. It says "nofunction is not defined," which is what we expected. The traceback also includes the lines of all stack frames that were touched when this error occurred. Here we can see that it occurred in the test function on line two. "<stdin>" means standard input and refers to the console where we typed this function. If we were using a Python source file, we'd see the file name here instead.

 

Configuring Logging

You should configure the logging module to direct messages to go where you want them. For most applications, you will want to add a Formatter and a Handler to the root logger. Formatters let you specify the fields and timestamps in your logs. Handlers let you define where they are sent. To set these up, Python provides a nifty factory function called basicConfig.

import logging

logging.basicConfig(format='%(asctime)s %(message)s',
                    level=logging.DEBUG,
                    handlers=[logging.StreamHandler()])
logging.debug('Hello World!')

By default, Python writes uncaught exceptions to your system's standard error stream. Alternatively, you can assign a handler to sys.excepthook to route uncaught exceptions through the logging module. This gives you the flexibility to provide custom formatters and handlers. For example, here we log our exceptions to a log file using the FileHandler:

import logging
import sys

logger = logging.getLogger('test')
fileHandler = logging.FileHandler('errors.log')
logger.addHandler(fileHandler)

def my_handler(type, value, tb):
    logger.exception('Uncaught exception: {0}'.format(str(value)))

# Install exception handler
sys.excepthook = my_handler

# Throw an error
nofunction()

Which results in the following log output:

$ cat errors.log
Uncaught exception: name 'nofunction' is not defined
None

In addition, you can filter logs by configuring the log level. One way to set the log level is through an environment variable, which gives you the ability to customize the log level in the development or production environment. Here’s how you can use the LOGLEVEL environment variable:

$ export LOGLEVEL='ERROR'
$ python
>>> import logging
>>> import os
>>> logging.basicConfig(level=os.environ.get('LOGLEVEL', 'INFO'),
...                     handlers=[logging.StreamHandler()])
>>> logging.debug('Hello World!')  # prints nothing
>>> logging.error('Hello World!')
ERROR:root:Hello World!

Logging from Modules

Modules intended for use by other programs should only emit log messages. These modules should not configure how log messages are handled. A standard logging best practice is to let the Python application importing and using the modules handle the configuration.

Another standard best practice to follow is that each module should use a logger named like the module itself. This naming convention makes it easy for the application to distinctly route various modules and helps keep the log code in the module simple.

You need just two lines of code to set up logging using the named logger. Once you do this, the __name__ variable holds the full dotted name of the current module, so this pattern works in any module. Here's an example:

import logging

log = logging.getLogger(__name__)

def do_something():
    log.debug('Doing something!')
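The application that imports the module then owns the configuration. Here is a minimal sketch of that split, assuming the module above is saved as mymodule.py (the file name is illustrative):

# app.py -- the application configures logging; the module only emits
import logging
import mymodule

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(name)s %(levelname)s %(message)s')

mymodule.do_something()  # appears under the 'mymodule' logger name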

Analyzing Your Logs with Papertrail

The log files of a Python application on a production server can contain millions of lines of entries. Command line tools like tail and grep are often useful during the development process. However, they may not scale well when analyzing millions of log events spread across multiple servers.

 

Centralized logging can make it easier and faster for developers to manage a large volume of logs. By consolidating log files onto one integrated platform, you can eliminate the need to search for related data that is split across multiple apps, directories, and servers. Also, a log management tool can alert you to critical issues, helping you more quickly identify the root cause of unexpected errors, as well as bugs that may have been missed earlier in the development cycle.

 

For production-scale logging, a log management tool such as SolarWinds® Papertrail™ can help you better manage your data. Papertrail is a cloud-based platform designed to handle logs from any Python application, including Django servers.

 

The Papertrail solution provides a central repository for event logs. It helps you consolidate all of your Python logs using syslog, along with other application and database logs, giving you easy access all in one location. It offers a convenient search interface to find relevant logs. It can also stream logs to your browser in real time, offering a “live tail” experience. Check out the tour of the Papertrail solution’s features.

[Screenshot: the Papertrail event viewer]

Papertrail is designed to help minimize downtime. You can receive alerts via email, or send them to Slack, Librato, PagerDuty, or any custom HTTP webhooks of your choice. Alerts are also accessible from a web page that enables customized filtering. For example, you can filter by name or tag.

[Screenshot: editing an alert in Papertrail]

 

Configuring Papertrail in Your Application

There are many ways to send logs to Papertrail depending on the needs of your application. You can send logs through journald, log files, Django, Heroku, and more. We will review the syslog handler below.

 

Python can send log messages directly to Papertrail with the Python SysLogHandler. Just set the endpoint to the log destination shown in your Papertrail settings. You can optionally format the timestamp or set the log level as shown below.

import logging
import socket
import sys
from logging.handlers import SysLogHandler

syslog = SysLogHandler(address=('logsN.papertrailapp.com', XXXXX))
format = '%(asctime)s YOUR_APP: %(message)s'
formatter = logging.Formatter(format, datefmt='%b %d %H:%M:%S')
syslog.setFormatter(formatter)

logger = logging.getLogger()
logger.addHandler(syslog)
logger.setLevel(logging.INFO)

def my_handler(type, value, tb):
    logger.exception('Uncaught exception: {0}'.format(str(value)))

# Install exception handler
sys.excepthook = my_handler

logger.info('This is a message')

nofunction()  # log an uncaught exception

Conclusion

Python offers a well-thought-out framework for logging that makes it simple to enable and manage your log files. Getting started is easy, and a number of tools baked into Python automate the logging process and help ensure ease of use.

Papertrail adds even more functionality and tools for diagnostics and analysis, enabling you to manage your Python logs on a centralized cloud server. Quick to set up and easy to use, Papertrail consolidates your logs on a safe and accessible platform. It simplifies your ability to search log files, analyze them, and then act on them in real time—so that you can focus on debugging and optimizing your applications.

Learn more about how Papertrail can help optimize your development ecosystem.

Have you ever wondered what happens when you type an address into your browser? The first step is the translation of a domain name (such as pingdom.com) to an IP address. Resolving domain names is done through a series of systems and protocols that make up the Domain Name System (DNS). Here we’ll break down what DNS is, and how it powers the underlying infrastructure of the internet.

 

What is DNS?

Traffic across the internet is routed by an identifier called an IP address. You may have seen IP addresses before. IPv4 addresses are a series of four numbers between 0 and 255, separated by periods (for example: 123.45.67.89).

 

IP addresses are at the core of communicating between devices on the internet, but they are hard to memorize and can change often, even for the same service. To get around these problems, we give names to IP addresses. For example, when you type https://www.pingdom.com into your web browser, it translates that name into an IP address, which your computer then uses to access a server that ultimately responds with the contents of the page that your browser displays. If a new server is put into place with a new IP address, that name can simply be updated to point to the new address.
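Applications typically hand this translation off to the operating system's resolver. As a minimal illustration, here is what the lookup looks like from Python (the address returned will depend on the actual DNS records):

import socket

# Ask the system resolver to translate a hostname into an IPv4 address
ip = socket.gethostbyname('www.pingdom.com')
print(ip)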

 

These records are stored in the name server for a given name, or “zone,” in DNS parlance. These zones can include many different records and record types for the base name and subdomains in that zone.

 

The internet is decentralized, designed to withstand failure, and built to avoid relying on a single source of truth. DNS is built for this environment using recursion, which enables DNS servers to talk to each other to find the answer for a request. Each server consulted is closer to the authoritative answer than the last, and the chain is anchored by 13 "root" server clusters that are globally maintained as the definitive starting point for other DNS servers.

 

Anatomy of a DNS Request

When you type "pingdom.com" into your browser and hit enter, your browser doesn't directly ask the web servers for that page. First, a multi-step interaction with DNS servers must happen to translate pingdom.com into an IP address that is usable for establishing a connection and routing traffic. Here's what that interaction looks like:

  1. The recursive DNS server requests pingdom.com from a DNS root server. The root server replies with the IP address of the .com TLD name server.
  2. The recursive DNS server requests pingdom.com from the .com TLD name server. The TLD name server replies with the authoritative name server for pingdom.com.
  3. The recursive DNS server requests pingdom.com from the pingdom.com name server. The name server replies with the IP address in the A record for pingdom.com. This IP address is returned to the client.
  4. The client requests pingdom.com using the web server's IP address that was just resolved. (You can reproduce this walk yourself with the dig command shown below.)
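If you have the dig utility (part of BIND) installed, it can perform this full iterative walk from the root servers down and print each referral along the way:

# Follow the delegation chain from the root servers to the
# authoritative name server, showing each referral
dig +trace pingdom.com

# Or just ask your configured recursive resolver for the final answer
dig +short pingdom.com A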

 

In subsequent requests, the recursive name server will have the IP address for pingdom.com.

This IP address is cached for a period of time determined by the pingdom.com nameserver. This value is called the time-to-live (TTL) for that domain record. A high TTL for a domain record means that local DNS resolvers will cache responses for longer and give quicker responses. However, making changes to DNS records can take longer due to the need to wait for all cached records to expire. Conversely, domain records with low TTLs can change much more quickly, but DNS resolvers will need to refresh their records more often.
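You can watch the TTL count down by querying your resolver repeatedly; in dig output, the second field of each answer line is the remaining TTL in seconds:

# Print only the answer section; the second column is the TTL in seconds
dig +noall +answer pingdom.com A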

 

Not Just for the Web

The DNS protocol is for anything that requires a decentralized name, not just the web. To differentiate between various types of servers registered with a nameserver, we use record types. For example, email servers are part of DNS. If a domain name has an MX record, it is signaling that the address associated with that record is an email server.

 

Some of the more common record types you will see are listed below, with example queries after the list:

  • A Record – used to point names directly at IPv4 addresses. This is used by web browsers.
  • AAAA Record – used to point names directly at IPv6 addresses. This is used by web browsers when a device has IPv6 connectivity.
  • CNAME Record – the Canonical Name record, used to point one DNS name at another DNS name. This is common when using platform-as-a-service offerings such as Heroku, or cloud load balancers that provide an external domain name rather than an IP address.
  • MX Record – as mentioned before, MX records are used to point a domain to mail servers.
  • TXT Record – arbitrary text attached to a domain name. This can be used to attach validation or other information about a domain name as part of the DNS system. A name can also hold multiple records of the same type; multiple MX records with different priorities, for example, are common.
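With dig, you can query each record type explicitly:

dig pingdom.com A      # IPv4 address records
dig pingdom.com AAAA   # IPv6 address records
dig pingdom.com MX     # mail server records
dig pingdom.com TXT    # text records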

 

DNS Security and Privacy

There are many parts to resolving a DNS request, and these parts are subject to security and privacy issues. First, how do we verify that the IP address we received is actually the one on file with the domain's authoritative name server? Attacks exist that can disrupt the DNS chain, providing false information back to the client or triggering denial-of-service attacks against sites. Untrusted network environments are vulnerable to man-in-the-middle attacks that can hijack DNS requests and return false results.

 

There is ongoing work to enhance the security of DNS with the Domain Name System Security Extensions (DNSSEC). This is a combination of new records, public-key cryptography, and establishing a chain of trust with DNS providers to ensure domain records have not been tampered with. Some DNS providers today offer the ability to enable DNSSEC, and its adoption is growing as DNS-based attacks become more prevalent.

 

DNS requests are also typically unencrypted, which allows attackers and observers to pry into their contents. This information is valuable, and your ISP or recursive resolver provider may be providing it to third parties or using it to track your activity. Furthermore, it may contain personally identifiable information, such as your IP address, which can be correlated with other tracking information that third parties may be holding.

 

There are a few ways to help protect your privacy with DNS and prevent this sort of tracking:

 

1. Use a Trusted Recursive Resolver

Using a trusted recursive resolver is the first step to ensuring the privacy of your DNS requests. For example, the Cloudflare DNS service at https://1.1.1.1 is a fast, privacy-centric DNS resolver. Cloudflare states that it doesn't log client IP addresses or track the requests you make against it.

 

2. Use DNS over HTTPS (DoH)

DoH is another way of enhancing your privacy and security when interacting with DNS resolvers. Even when using a trusted recursive resolver, man-in-the-middle attacks can alter the returned contents back to the requesting client. DNSSEC offers a way to fix this, but adoption is still early, and relies on DNS providers to enable this feature.

 

DoH secures the client-to-resolver hop, encrypting communication between the client and the resolver. The Cloudflare DNS service offers DNS over HTTPS, further enhancing the security model its recursive resolver provides. Keep in mind that the domain you're browsing is still visible to ISPs thanks to Server Name Indication, but the actual contents, path, and other parts of the request are encrypted.
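Cloudflare's resolver also exposes a JSON flavor of DoH that is easy to try from the command line with curl; the response lists the answer records and their TTLs:

curl -s -H 'accept: application/dns-json' \
  'https://cloudflare-dns.com/dns-query?name=pingdom.com&type=A'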

 

Even without DNSSEC, you can still have a more private internet experience. Firefox recently switched over to using the Cloudflare DNS resolver for all requests by default. At this time, DoH isn’t enabled by default unless you are using the nightly build.

 

Monitoring DNS Problems

DNS is an important part of your site's availability because a problem with it can cause a complete outage. DNS has been known to cause outages due to BGP attacks, TLD outages, and other unexpected issues. It's important that your uptime or health check includes DNS lookups.

Using SolarWinds® Pingdom®, we can monitor for DNS problems using the uptime monitoring tool. Here we will change the DNS record for a domain and show you how the Pingdom tool responds. Once you have an uptime check added in Pingdom, click the “Reports” section, and “Uptime” under that section, then go to your domain of interest. Under the “Test Result Log” tab for an individual domain’s uptime report, hover over the failing entry to see why a check failed.

This tells us that for our domain, we have a "Non-recoverable failure in name resolution," which lets us know to check our DNS records. After we fix the problem, our next check succeeds.

Pingdom gives us a second set of eyes to make sure our site is still up as expected.

 

Curious to learn more about DNS? Check out our post on how to test your DNS configuration. You can also learn more about Pingdom uptime monitoring.

The world may run on coffee, but it's the alarm clock that gets us out of bed. It operates on a simple threshold: you set the time that matters to you and receive an alert when that condition becomes true.

 

Like your alarm clock, today’s tooling for web service alerting often operates on simple thresholds, but unlike with your clock, there is a wide variety of metrics and it’s not as clear which should trigger an alert. Until we have something better than thresholds, engineers have to carefully weigh which metrics are actionable, how they are being measured, and what thresholds correspond to real-world problems.

 

Measure the Thing You Care About

 

In practice, this arguably simple process of reasoning about what you are monitoring, and how you are monitoring it, is rarely undertaken. More often, our metric choices and threshold values are guided by our preexisting tools. Hence, if our tools cannot measure latency, we do not alert on latency. This practice of letting our tools guide our telemetry content is an anti-pattern which results in unreliable problem detection and alerting.

 

Effective alerting requires metrics that are reliably actionable. You have to start by reasoning about the application and/or infrastructure you want to monitor. Only then can you choose and implement collection tools that get you the metrics you’re actually interested in, like queue size, DB roundtrip times, and inter-service latency.

 

One Reliable Signal

 

Effective alerting requires a singular, reliable telemetry signal, to which every collector can contribute. Developing and ensuring a reliable signal can be difficult, but it is orders of magnitude simpler than building out multiple disparate monitoring systems and trying to make them agree with each other – the way many shops, for example, alert from one system like Nagios and troubleshoot from another like Ganglia.

 

It’s arguably impossible to make multiple, fallible systems agree with each other in every case. They may usually agree, but every false positive or false negative undermines the credibility of both systems. Further, multiple systems rarely improve because it’s usually impossible to know which system was at fault when they disagree. Did the alerting system send a bogus alert or is there a problem with the data in the visualization system? If false positives arise from a single telemetry system, you simply iterate and improve that system.

 

Alert Recipient == Alert Creator

 

Crafting effective alerts involves knowing how your systems work. Each alert should trigger in the mind of its recipient an actionable cognitive model that describes how the production environment is being threatened. How does the individual piece of infrastructure that fired this alert affect the application? Why is this alert a problem?

 

Only engineers who understand the systems and applications we care about have the requisite knowledge to craft alerts that describe actionable threats to those systems and applications. Therefore effective alerting requires that the recipients of alerts be able to craft those alerts.

 

Push Notifications as Last Resort

 

Emergencies force context switches. They interrupt workflow and destroy productivity. Many alerts are necessary, but very few of them should be considered emergencies. At AppOptics, the preponderance of our alerts are delivered to group chat. We find that this is a timely notification medium which doesn’t interrupt productivity. Further, group chat allows everyone to react to the alert together, in a group context, rather than individually from an email-box or pager. This helps us avoid redundant troubleshooting effort, and keeps everyone synchronized on problem resolution.

 

Effective alerting requires an escalation system that can communicate problems in a way that is not interrupt-driven. There are myriad examples in other industries like healthcare and security systems, where, when every alert is interrupt-driven, human beings quickly begin to ignore the alerts. Push notifications should be a last resort.

 

Alerting is Hard

 

Effective alerting is a deceptively hard problem, which represents one of the biggest challenges facing modern operations engineers. A careful balance needs to be struck between the needs of the systems and the needs of the humans tending those systems.

For an infrastructure to be healthy, there must be good monitoring. The team should have a monitoring infrastructure that speeds up and simplifies the verification of problems, supporting prevention, maintenance, and correction. SolarWinds® AppOptics™ was created to help teams keep their infrastructure under control, including Linux monitoring.

 

Monitoring overview

It is critical that a technology team prepare for any situation that occurs in their environment. The purpose of monitoring is to be aware of changes in the environment so problems can be solved with immediate action. A good monitoring history and proper analysis can also suggest environmental improvements. For example, if a server shows steadily rising memory usage over time, you can purchase more memory or investigate the cause of the abnormal behavior before the environment becomes unavailable.

 

Monitoring data can be used for various purposes, such as tracking application availability for a given number of users, tool deployments, operating system update behavior, purchase requests, and hardware exchanges or upgrades. Each use depends on your deployment purpose.

 

Linux servers have historically been difficult to monitor because most of the tools on the market serve other platforms. In addition, a portion of IT professionals cannot get monitoring working properly on these servers, so when a disaster occurs, it is difficult to identify what happened.

 

Constant monitoring of servers and services used in production is critical for company environments. Server failures in virtualization, backup, firewalls, and proxies can directly affect availability and quality of service.

 

The Linux operating system offers basic monitoring facilities for more experienced administrators, but effective monitoring calls for real-time reports that enable immediate action. You cannot count on an experienced system administrator being available to access the servers, or on one person performing every monitoring task by hand.

 

In the current job market, it is important to remember that Linux specialists are rare, and their availability is limited. In some cases, an expert administrator can only act on a server after the problem has become long-standing. Training teams to become Linux experts can be expensive and time-consuming, with potentially low returns.

 

Metrics used for monitoring

  1. CPU – It is crucial to monitor the CPU, as it can reach high utilization rates and temperatures. A CPU can have multiple cores, but an application may be directed to only one of those cores, driving that single core to dangerous levels.

  2. Load – Load average indicates how much work the system is being asked to do: how many processes are running or waiting to run, averaged over recent intervals.

  3. Disk Capacity and IO – Disk capacity is especially important for image servers, file servers, and VMs; a full disk can force a system shutdown, corrupt the operating system, or cause extreme IO slowness. With disk monitoring, you can plan for an eventual change or addition of a disk, and verify the behavior of a disk showing signs of hardware failure.

  4. Network – For DNS, DHCP, firewall, file server, and proxy roles, it is extremely important to monitor network performance, such as the input and output of data packets. With network performance logs, you can measure utilization of the interface and plan capacity according to how the network is used.

  5. Memory – Memory monitoring catches conditions that can bring a system to an immediate stop, such as memory exhaustion or a single application consuming a disproportionate share.

  6. Swap – This is virtual memory created by the system and allocated on disk, used when physical memory runs short. High swap utilization can indicate that the amount of memory on the server is insufficient. (Quick command-line checks for each of these metrics are shown below.)
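For manual spot checks, each of these metrics maps to a standard Linux command; a few examples (iostat ships with the sysstat package on most distributions):

uptime         # load averages over the last 1, 5, and 15 minutes
top            # live per-process CPU and memory usage
df -h          # disk capacity per file system
iostat -x 5    # extended disk I/O statistics every 5 seconds
free -m        # physical memory and swap usage in megabytes
ip -s link     # per-interface packet and byte counters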

With this information from your Linux systems, you can maintain good monitoring and a team that can act immediately on downtime that could paralyze critical systems.

 

 

Monitoring with AppOptics

AppOptics is a real-time monitoring tool that enables you to set up a monitoring environment, create alerts by email, and work with thresholds and monitoring history. You can also create monitoring levels with profiles for the equipment being monitored, along with simple monitoring views that let an operator engage a specialist or open a ticket for immediate action when needed.

 

This tool can also be an ally of an ITIL/COBIT team, which can use the reports to justify scheduled and unscheduled stops, and clarify systems that historically have problems. It can also be used to justify the purchase of new equipment, software upgrades, or the migration of a system that no longer meets the needs of a company.

 

AppOptics can be installed in major Linux distributions such as Red Hat, CentOS, Ubuntu, Debian, Fedora, and Amazon Linux. Its deployment is easy, fast, and practical.

 

 

Installing the AppOptics Agent on the Server

Before you start, you’ll need an account with AppOptics. If you don’t already have one, you can create a demo account which will give you 14 days to try the service, free of charge. Sign up here.

 

First, to allow AppOptics to aggregate the metrics from the server, you will need to install the agent on all instances. To do this, you’ll need to reference your AppOptics API token when setting up the agent. Log in to your AppOptics account and navigate to the Infrastructure page.

 

Locate the Add Host button, and click on it. It should look similar to the image below.

Fig. 2. AppOptics Host Agent Installation

 

You can follow a step-by-step guide on the Integration page, where there are Easy Install and Advanced options for users. I used an Ubuntu image in the AWS Cloud, but this will work on almost any Linux server.

 

Note: Prior to installation of the agent, the bottom of the dialog below will not contain the success message.

 

Copy the command from the first box, and then SSH into the server and run the Easy Install script.

 

Fig. 3. Easy Install Script to Add AppOptics Agent to a Server

 

When the agent installs successfully, you should see a confirmation message in your terminal. The "Confirm successful installation" box on the AppOptics agent screen should look similar to the image above, with a white-on-blue check mark. You should also see "Agent connected."

 

Fig. 4. Installing the AppOptics Agent on your Linux Instance

 

After installation, you can start configuring the dashboard for monitoring on the server. Click on the hostname link in the Infrastructure page, or navigate to the Dashboards page directly, and then select the Host Agent link to view the default dashboard provided by AppOptics.

 

Working with the Host Agent Dashboard

The default Host Agent Dashboard provided by AppOptics offers many of the metrics discussed earlier, related to the performance of the instance itself, and should look similar to the image below.

 

Fig. 6. Default Host Agent Dashboard

 

One common pattern is to create dashboards for each location you want to monitor. Let’s use “Datacenter01” for our example. Head to Dashboards and click the Create a New Dashboard button.

 

You can choose the type of display (Line, Stacked, or Big Number), and then choose what to monitor, such as CPU Percent, Swap, or Load. In addition, within the dashboard, you can select how long to monitor a group of equipment or set it to be monitored indefinitely.

 

Fig. 8. Custom Dashboard to View Linux System Metrics

 

Metrics – You can select existing metrics to create new composite metrics according to what you want to be monitored in the operating system.

 

Alerts – Create alerts for the operating system, including the conditions that trigger an alert and the time window before a new alert is issued.

 

Integrations – You can add host agent plug-ins to support application monitoring.

 

 

Conclusion

Monitoring your Linux servers is critical as they represent the basis of your infrastructure. You need to know immediately when there is a sudden change in CPU or memory usage that could affect the performance of your applications. AppOptics has a range of ready-made tools, customizable monitoring panels, and reports that are critical for investigating chronic infrastructure problems. Learn more about AppOptics infrastructure monitoring and try it today with a free 14-day trial.

How do PHP logging frameworks fare when pushed to their limits? This analysis can help us decide which option is best for our PHP applications. Performance, speed, and reliability are important for logging frameworks because we want the best performance out of our application and to minimize loss of data.

Our goal for these benchmark tests is to measure the time different frameworks require to process a large number of log messages with various logging handlers, and to see which frameworks are more reliable at their limits (dropping few or no messages).

The frameworks we tried are:

  • native PHP logging (error_log and syslog built-in functions)
  • KLogger
  • Apache Log4php
  • Monolog

All of these frameworks use synchronous or “blocking” calls, as PHP functions typically do. The web server execution waits until the function/method call is finished in order to continue. As for the handlers: error_log, KLogger, Log4php, and Monolog can write log messages to text file, while error_log/syslog, Log4php, and Monolog are able to send the messages to the local system logger. Finally, only Log4php and Monolog allow remote Syslog connections.

NOTE: The term syslog can refer to various things. In this article, this includes the PHP function of the same name, the system logger daemon (e.g. syslogd), or a remote syslog server (i.e. rsyslog).

Application and Handlers

For this framework benchmark, we built a PHP CodeIgniter 3 web app with a controller for each logging mechanism. Controller methods echo the difference in microtime() before and after logging, which is useful for manual tests. Each controller method call has a loop that writes 10,000 INFO log messages in the case of file handlers (except error_log, which can only produce E_ERROR), or 100,000 INFO messages to syslog. This helps us stress the logging system while not over-burdening the web server request handler.
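The shape of each controller method is simple. Here is a sketch of the file-handler case using Monolog; the class name and log path are illustrative rather than the exact benchmark code, and Monolog is assumed to be installed via Composer:

<?php
use Monolog\Logger;
use Monolog\Handler\StreamHandler;

class Monolog_bench extends CI_Controller
{
    public function file()
    {
        $log = new Logger('bench');
        $log->pushHandler(new StreamHandler('/tmp/monolog.log', Logger::INFO));

        $start = microtime(true);          // high-resolution start time
        for ($i = 0; $i < 10000; $i++) {
            $log->info('benchmark message');
        }
        echo microtime(true) - $start;     // seconds spent logging
    }
}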

NOTE: You may see the full app source code at https://github.com/jorgeorpinel/php-logging-benchmark

 

For the local handlers, first we tested writing to local files and kept track of the number of written logs in each test. We then tested the local system logger handler (which uses the /dev/log UNIX socket by default) and counted the number of logs syslogd wrote to /var/log/syslog.

As for the “remote” syslog server, we set up rsyslog on the system and configured it to accept both TCP and UDP logs, writing them to /var/log/messages. We recorded the number of logs there to determine whether any of them were dropped.
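Accepting TCP and UDP logs takes only a few lines in /etc/rsyslog.conf. A minimal sketch using rsyslog's listener syntax:

# /etc/rsyslog.conf -- accept syslog over UDP and TCP on port 514
module(load="imudp")
input(type="imudp" port="514")

module(load="imtcp")
input(type="imtcp" port="514")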

Fig. 1 System Architecture – Each arrow represents a benchmark test.

Methodology

We ran the application locally on Ubuntu with Apache (and mod-php). First, each Controller/method was “warmed up” by requesting that URL with curl, which ensures the PHP source is already precompiled when we run the actual framework benchmark tests. Then we used ApacheBench to stress test the local web app with 100 or 10 serial requests (file or syslog, respectively). For example:

ab -ln 100 localhost:8080/Native/error_log

ab -ln 10 localhost:8080/Monolog/syslog_udp

The total number of log calls in each test was 1,000,000 (each method). We gathered performance statistics from the tool’s report for each Controller/method (refer to figure 1).

Please note that in normal operation, actual drop rates should be much smaller, if there are any at all.

Hardware and OS

We ran both the sample app and the tests on an AWS EC2 micro instance, set up as a 64-bit Ubuntu 16.04 Linux box with an Intel(R) Xeon(R) CPU @ 2.40GHz, 1GiB of memory, and an 8 GB SSD.

Native tests

The “native” controller uses a couple of PHP built-in error handling functions. It has two methods: one that calls error_log, which is configured in php.ini to write to a file, and one that calls syslog to reach the system logger. Both functions are used with their default parameters.

error_log to file

By definition, no log messages can be lost by this method as long as the web server doesn’t fail. Its performance when writing to file will depend on the underlying file system and storage speed. Our test results:

error_log (native PHP file logger)
Requests per second: 23.55 [#/sec] (mean)
Time per request: 42.459 [ms] (mean) ← divide by 10,000 logs written per request
NOTE: error_log can also be used to send messages to the system log, among other destinations.

syslog

Using error_log with error_log = syslog set in php.ini, or simply calling the syslog function, we can reach the system logger. This is similar to using the logger command in Linux.
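In PHP this takes two calls: openlog to set the identity and facility, then syslog for each message:

<?php
// Open a connection to the system logger, tagging messages with our app name
openlog('myapp', LOG_PID, LOG_USER);

// Send one INFO-level message to the system logger
syslog(LOG_INFO, 'Hello from PHP');

closelog();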

syslog (native PHP system logger)
Requests per second: 0.25 [#/sec] (mean)
Time per request: 4032.164 [ms] (mean) ← divide by 100,000 logs sent per request

This is typically the fastest logger, and syslogd is as robust as the web server or more, so no messages should be dropped (none were in our tests). Another advantage of the system logger is that it can be configured to write to a file and to forward logs via network.

KLogger test

KLogger is a "simple logging class for PHP" with its first stable release in 2014. It is only able to write logs to file, but its simplicity helps its performance. KLogger is PSR-3 compliant: it implements the LoggerInterface.

KLogger (simple PHP logging class)
Requests per second: 14.11 [#/sec] (mean)
Time per request: 70.848 [ms] (mean) ← divide by 10,000 = 0.0070848 ms per message
NOTE: This GitHub fork of KLogger allows local syslog usage as well. We did not try it.

Log4php tests

Log4php, first released in 2010, is one in the suite of loggers that Apache provides for several popular programming languages. Logging to file, it turns out to be a speedy contender, at least on Apache; running the application there probably helps its performance. In local tests using PHP's built-in server (the php -S command), it was actually the slowest contender!

Log4php (Apache PHP file logger)
Requests per second: 18.70 [#/sec] (mean) × 10k = 187,000 messages per second
Time per request: 53.470 [ms] (mean) ÷ 10k = 0.0053 ms per message

As for sending to syslog, it was actually our least performant option, but not by far:

Log4php to syslog
                Local syslog socket    Syslog over TCP/IP    Syslog over UDP/IP
Time per log    0.08 ms                Around 24 ms          0.07 ms
Drop rate       0% dropped             0% dropped            0.15% dropped

Some of the advantages Log4php has, which may offset its lack of performance, are Java-like XML configuration files (same as other Apache loggers, such as the popular log4j), six logging destinations, and three message formats.

NOTE: Remote syslog over TCP, however, doesn't seem to be well supported at this time. We had to use the general-purpose LoggerAppenderSocket, which was really slow, so we only ran 100,000 messages.

Monolog tests

Monolog, like KLogger, is PSR-3 compliant; and like Log4php, it is a full logging framework that can send logs to files, sockets, email, databases, and various web services. It was first released in 2011.

Monolog features many integrations with popular PHP frameworks, making it a popular alternative. Monolog beat its competitor Log4php in our tests, but it is still neither the fastest nor the most reliable of the options, although it is probably one of the easiest for web developers.

Monolog (full PHP logging framework)
Requests per second: 4.93 [#/sec] (mean) × 10k
Time per request: 202.742 [ms] (mean) ÷ 10k

Monolog over Syslog:

Monolog over syslog
                UNIX socket               TCP              UDP
Time per log    0.062 ms                  0.06 ms          0.079 ms
Drop rate       Less than 0.01% dropped   0.29% dropped    0% dropped

Now let's take a look at graphs that summarize and compare all the results above. These charts show the tradeoff between faster but more limited, lower-level native logging methods and less performant but full-featured frameworks:

Local File Performance Comparison

Fig 2. Time per message written to file [ms/msg]

Local Syslog Performance and Drop Rates

Log handler or "appender" names vary from framework to framework. For native PHP, we just use the syslog function (KLogger doesn't support this); in Log4php, it's a class called LoggerAppenderSyslog; and it's called SyslogHandler in Monolog.

Fig 3. Time per message sent to syslogd via socket [ms/msg]

Fig 4. Drop rates to syslogd via socket [%]

 

Remote Syslog Performance and Drop Rates

The appenders are LoggerAppenderSocket in Log4php, SocketHandler and SyslogUdpHandler for Monolog.

To measure the drop rates, we leveraged the $RepeatedMsgReduction config param of rsyslog, which collapses identical messages into a single one and a second message with the count of further repetitions. In the case of Log4php, since the default message includes a timestamp that varies in every single log, we forwarded the logs to SolarWinds® Loggly® (syslog setup in seconds) and used a filtered, interactive log monitoring dashboard to count the total logs received.
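The rsyslog directive is a one-liner (legacy configuration syntax):

# Collapse runs of identical messages into one line plus a repetition count
$RepeatedMsgReduction on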

TCP

Fig 5. Time per message sent via TCP to rsyslog

Fig 6. Drop rates to rsyslog (TCP) [%]

UDP

Fig 7. Time per message sent on UDP to rsyslog
Fig 8. Drop rates to rsyslog (UDP)

Conclusion

Each logging framework is different, and while each could be the best fit for specific projects, our recommendations are as follows. Nothing beats the performance of native syslog for system admins who know their way around syslogd or syslog-ng daemons, or who want to forward logs to a cloud service such as Loggly. If what's needed is a simple yet powerful way to log locally to files, KLogger offers PSR-3 compliance and is almost as fast as native error_log, although Log4php does seem to edge it out when the app is running on Apache. For a more complete framework, Monolog seems to be the most well-rounded option, particularly when considering remote logging via TCP/IP.

 

After deciding on a logging framework, your next big decision is choosing a log management solution. Loggly provides unified log analysis and monitoring for all your servers in a single place. You can configure your PHP servers to forward syslog to Loggly or simply use Monolog’s LogglyHandler, which is easy to set up in your app’s code. Try Loggly for free and take control over your PHP application logs.
