We’re no strangers to logging from Docker containers here at SolarWinds Loggly. In the past, we’ve demonstrated different techniques for logging individual Docker containers. But while logging a handful of containers is easy, what happens when you start deploying dozens, hundreds, or thousands of containers across different machines?
In this post, we’ll explore the best practices for logging applications deployed using Docker Swarm.
Intro to Docker Swarm
Docker Swarm is a container orchestration and clustering tool from the creators of Docker. It allows you to deploy container-based applications across a number of computers running Docker. Swarm uses the same command-line interface (CLI) as Docker, making it more accessible to users already familiar with Docker. And as the second most popular orchestration tool behind Kubernetes, Swarm has a rich ecosystem of third-party tools and integrations.
A swarm consists of manager nodes and worker nodes. Managers control how containers are deployed, and workers run the containers. In Swarm, you don’t interact directly with containers. Instead, you define services that describe what the final deployment should look like. Swarm handles deploying, connecting, and maintaining these containers until they meet the service definition.
For example, imagine you want to deploy an Nginx web server. Normally, you would start an Nginx container on port 80 like so:
$ docker run --name nginx --detach --publish 80:80 nginx
With Swarm, you instead create a service that defines what image to use, how many replica containers to create, and how those containers should interact with both the host and each other. For example, let’s deploy an Nginx image with three containers (for load balancing) and expose it over port 80.
$ docker service create --name nginx --detach --publish 80:80 --replicas 3 nginx
When the deployment is done, you can access Nginx using the IP address of any node in the swarm.
To learn more about Docker services, see the services documentation.
The Challenges of Monitoring and Debugging Docker Swarm
Besides the existing challenges in container logging, Swarm adds another layer of complexity: an orchestration layer. Orchestration simplifies deployments by taking care of implementation details such as where and how containers are created. But if you need to troubleshoot an issue with your application, how do you know where to look? Without comprehensive logs, pinpointing the exact container or service where an error occurred can become an operational nightmare.
On the container side, nothing much changes from a standard Docker environment. Your containers still send logs to stdout and stderr, which the host Docker daemon accesses using its logging driver. But now your container logs include additional information, such as the service that the container belongs to, a unique container ID, and other attributes auto-generated by Swarm.
Consider the Nginx example. Imagine one of the containers stops due to a configuration issue. Without a monitoring or logging solution in place, the only way to know this happened is by connecting to a manager node using the Docker CLI and querying the status of the service. And while Swarm automatically groups log messages by service using the docker service logs command, searching for a specific container’s messages can be time-consuming, because inspecting that container directly only works when you’re logged in to the host it runs on.
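Querying the status of a service from a manager node can be sketched as a small shell helper. This is only an illustration: the function name list_unhealthy_tasks is hypothetical, and it assumes that tasks whose current state does not start with "Running" are the ones worth looking at.

```shell
#!/bin/sh
# Hypothetical helper: list the tasks of a service that are not currently running.
# Must be run on a manager node. Relies on `docker service ps` with --format.
list_unhealthy_tasks() {
    docker service ps "$1" --no-trunc \
        --format '{{.Name}} {{.Node}} {{.CurrentState}}' |
        awk '$3 != "Running"'   # third column is the first word of the state
}

# Usage (on a manager node):
#   list_unhealthy_tasks nginx
```

Each line of output names the task, the node it was scheduled on, and its current state, which tells you which host to inspect next.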
How Docker Swarm Handles Logs
Like a normal Docker deployment, Swarm has two primary log destinations: the daemon log (events generated by the Docker service) and container logs (events generated by containers). Swarm doesn’t maintain separate logs. Instead, it appends its own data (such as service names and replica numbers) to the existing logs.
The difference is in how you access logs. Instead of showing logs on a per-container basis using docker logs <container name>, Swarm shows logs on a per-service basis using docker service logs <service name>. This aggregates and presents log data from all of the containers running in a single service. Swarm differentiates containers by adding an auto-generated container ID and instance ID to each entry.
For example, the following message was generated by the second container of the nginx_nginx service, running on swarm-client1:
# docker service logs nginx_nginx
nginx_nginx.2.subwnbm15l3f@swarm-client1 | 10.255.0.2 - - [01/Jun/2018:22:21:11 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0" "-"
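The prefix of the sample line above can be split into its parts with plain shell parameter expansion. The sketch below assumes the prefix always has the shape <service>.<instance>.<id>@<node>, as in the sample. The function name parse_swarm_prefix and the field labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split the prefix of a `docker service logs` line.
# Assumed prefix shape (taken from the sample): <service>.<instance>.<id>@<node>
parse_swarm_prefix() (
    line=$1
    prefix=${line%% *}      # everything before the first space
    node=${prefix##*@}      # text after the last "@"
    task=${prefix%@*}       # text before the last "@"
    id=${task##*.}          # auto-generated ID after the last "."
    rest=${task%.*}         # <service>.<instance>
    instance=${rest##*.}    # instance (replica) number
    service=${rest%.*}      # service name, which may itself contain dots
    printf 'service=%s instance=%s id=%s node=%s\n' \
        "$service" "$instance" "$id" "$node"
)

parse_swarm_prefix 'nginx_nginx.2.subwnbm15l3f@swarm-client1 | 10.255.0.2 - - "GET / HTTP/1.1" 200 612'
# prints: service=nginx_nginx instance=2 id=subwnbm15l3f node=swarm-client1
```

Parsing from the right keeps the helper working for service names that contain dots.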
To learn more about the logs command, see the Docker documentation.
Options for Logging in Swarm
Since Swarm uses Docker’s existing logging infrastructure, most of the standard Docker logging techniques still apply. However, to centralize your logs, each node in the swarm will need to be configured to forward both daemon and container logs to the destination. You can use a variety of methods such as Logspout, the daemon logging driver, or a dedicated logger attached to each container.
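As an illustration of the daemon logging driver option, the sketch below prints a daemon configuration that selects the syslog driver. The function name render_daemon_json and the endpoint logs.example.internal are assumptions made for this example. The log-driver, log-opts, and syslog-address keys are standard Docker daemon settings.

```shell
#!/bin/sh
# Hypothetical helper: print a Docker daemon config that forwards container
# logs to a syslog endpoint. $1 is the endpoint, e.g. tcp://host:514.
render_daemon_json() {
    cat <<EOF
{
  "log-driver": "syslog",
  "log-opts": {
    "syslog-address": "$1"
  }
}
EOF
}

# Placeholder endpoint; replace it with your own syslog server.
render_daemon_json 'tcp://logs.example.internal:514'
```

On each node, the output would typically be saved as /etc/docker/daemon.json and the Docker daemon restarted. The setting applies to containers created after the restart, and it has to be repeated on every node in the swarm.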
Best Practices to Improve Logging
To log your swarm services effectively, there are a few steps you should take.
1. Log to STDOUT and STDERR in Your Apps
Docker automatically forwards all standard output from containers to the built-in logging driver. To take advantage of this, applications running in your Docker containers should write all log events to STDOUT and STDERR. If you instead handle logging inside your application (for example, by writing to files in the container), you risk losing crucial data about your deployment.
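For a container whose entrypoint is a shell script, writing to the standard streams can be as small as the sketch below. The function names log_info and log_error and the line layout are assumptions made for this example, not a required format.

```shell
#!/bin/sh
# Hypothetical helpers: informational events go to stdout, errors to stderr.
# Docker's logging driver picks up both streams without further setup.
timestamp() { date -u +%Y-%m-%dT%H:%M:%SZ; }

log_info()  { printf '%s INFO %s\n'  "$(timestamp)" "$*"; }
log_error() { printf '%s ERROR %s\n' "$(timestamp)" "$*" >&2; }

log_info "server started"
log_error "configuration file missing"
```

Nothing in the script opens a log file, so the same code works unchanged whether the container runs on one host or as a replica in a swarm.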
2. Log to Syslog or JSON
Syslog and JSON are two of the most commonly supported logging formats, and Docker supports both. Docker stores container logs as JSON files by default, but it includes a built-in driver for logging to Syslog endpoints. Both JSON and Syslog messages are easy to parse, contain critical information about each container, and are supported by most logging services. Many container-based loggers such as Logspout support both JSON and Syslog, and Loggly has complete support for parsing and indexing both formats.
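An application can also emit each event as a single JSON object per line, which keeps messages easy to parse downstream. The sketch below is a minimal shell version. The function names and the time, level, and message field names are assumptions, and the escaping only covers backslashes and double quotes, not newlines or control characters.

```shell
#!/bin/sh
# Hypothetical helpers: write one JSON object per log event to stdout.
json_escape() {
    # Escape backslashes first, then double quotes.
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

log_json() {
    # $1 = level, $2 = message
    printf '{"time":"%s","level":"%s","message":"%s"}\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$(json_escape "$2")"
}

log_json info 'listening on port 80'
log_json error 'upstream "api" timed out'
```

A real application would normally use its logging library's JSON formatter instead of hand-built strings; the point here is only the one-object-per-line shape.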
3. Log to a Centralized Location
A major challenge in cluster logging is tracking down log files. Services could be running on any one of several different nodes, and having to manually access log files on each node can become unsustainable over time. Centralizing logs lets you access and manage your logs from a single location, reducing the amount of time and effort needed to troubleshoot problems.
One common solution for container logs is dedicated logging containers. As the name implies, dedicated logging containers are created specifically to gather and forward log messages to a destination such as a syslog server. Dedicated containers automatically collect messages from other containers running on the node, making setup as simple as running the container.
Why Loggly Works for Docker Swarm
Normally you would access your logs by connecting to a manager node, running docker service logs <service name>, and scrolling down to find the logs you’re looking for. Not only is this labor-intensive, but it’s slow because you can’t easily search, and it’s difficult to automate with alerts or create graphs. The more time you spend searching for logs, the longer problems go unresolved. Centralizing logs yourself also means creating and maintaining your own log centralization infrastructure, which can become a significant project on its own.
Loggly is a log aggregation, centralization, and parsing service. It provides a central location for you to send and store logs from the nodes and containers in your swarm. Loggly automatically parses and indexes messages so you can search, filter, and chart logs in real-time. Regardless of how big your swarm is, your logs will be handled by Loggly.
Sending Swarm Logs to Loggly
The easiest way to send your container logs to Loggly is with Logspout. Logspout is a container that automatically routes all log output from other containers running on the same node. When deploying the container in global mode, Swarm automatically creates a Logspout container on each node in the swarm.
To route your logs to Loggly, provide your Loggly Customer Token and a custom tag, then specify a Loggly endpoint as the logging destination.
# docker service create --name logspout --mode global --detach --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock --mount type=bind,source=/etc/hostname,target=/etc/host_hostname,readonly -e SYSLOG_STRUCTURED_DATA="<Loggly Customer Token>@41058 tag=\"<custom tag>\"" gliderlabs/logspout syslog+tcp://logs-01.loggly.com:514
You can also define a Logspout service using Compose.
docker-compose-logspout.yml
version: "3"
networks:
  logging:
services:
  logspout:
    image: gliderlabs/logspout
    networks:
      - logging
    volumes:
      - /etc/hostname:/etc/host_hostname:ro
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      SYSLOG_STRUCTURED_DATA: "<Loggly Customer Token>@41058 tag=\"<custom tag>\""
    command: syslog+tcp://logs-01.loggly.com:514
    deploy:
      mode: global
Use docker stack deploy to deploy the Compose file to your swarm, where <stack name> is the name that you want to give to the deployment.
# docker stack deploy --compose-file docker-compose-logspout.yml <stack name>
Configuring Dashboards and Alerts
Since Swarm automatically appends information about the host, service, and replica to each log message, we can create Dashboards and Alerts similar to those for a single-node Docker deployment. For example, Loggly automatically breaks down logs from the Nginx service into individual fields.
We can create Dashboards that show, for example, the number of errors generated on each node, as well as the container activity level on each node.
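To make the per-node error count concrete, the sketch below computes a rough local equivalent from docker service logs output. It is not how Loggly builds its dashboards. It assumes the Nginx access log layout shown in the earlier sample line, where the HTTP status is the eleventh whitespace-separated field, and it treats only 5xx responses as errors. The function name count_errors_by_node is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count 5xx responses per node from service log lines
# read on stdin. Field positions are assumed from the sample Nginx line.
count_errors_by_node() {
    awk '
        {
            split($1, parts, "@")    # prefix is <task>@<node>
            if ($11 ~ /^5[0-9][0-9]$/) errors[parts[2]]++
        }
        END { for (node in errors) print node, errors[node] }
    ' | sort
}

# Usage (on a manager node):
#   docker service logs nginx_nginx 2>&1 | count_errors_by_node
```

The output is one line per node with at least one error, in the form "<node> <count>".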
Alerts are useful for detecting changes in the status of a service. If you want to detect a sudden increase in errors, you can easily create a search that scans messages from a specific service for error-level logs.
You can select this search from the Alerts screen and specify a threshold. For example, this alert triggers if the Nginx service logs more than 10 errors over a 5-minute period.
While Swarm can add a layer of complexity over a typical Docker installation, logging it doesn’t have to be difficult. Tools like Logspout and Docker logging drivers have made it easier to collect and manage container logs no matter where those containers are running. And with Loggly, you can easily deploy a complete, cluster-wide logging solution across your entire environment.