Kubernetes is a container orchestrator that provides a robust, dynamic environment for running reliable applications. Keeping a Kubernetes cluster healthy requires proactive maintenance and monitoring to help prevent and diagnose issues. While you can expect a typical Kubernetes cluster to be stable most of the time, like all software, issues can occur in production. Fortunately, Kubernetes insulates us against most of these issues with its ability to reschedule workloads and simply replace nodes when problems occur. Still, when a cloud provider has an availability zone outage, or when you're running in a constrained environment such as bare metal, being able to debug and resolve problems in your nodes remains an important skill to have.

In this article, we will use SolarWinds® AppOptics tracing to diagnose latency issues with applications running on Kubernetes. AppOptics is a next-generation application performance monitoring (APM) and infrastructure monitoring solution. We'll use it to trace the latency of requests to our Kubernetes pods and identify problems in the network stack.

The Kubernetes Networking Stack

Networking in Kubernetes has several components and can be complex for beginners. To debug Kubernetes clusters successfully, we need to understand all of the parts.

 

Pods are the scheduling primitive in Kubernetes. Each pod is composed of one or more containers that can optionally expose ports. However, because multiple pods may be scheduled on the same host, their containers could end up competing for the same ports, and workloads would have to be scheduled so that ports never conflict on a single machine. To solve this problem, Kubernetes uses a network overlay. In this model, each pod gets its own virtual IP address, allowing different pods to listen on the same port on the same machine.
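You can see these overlay-assigned addresses directly. For example, listing pods with wide output shows each pod's virtual IP alongside the node it was scheduled on:

$ kubectl get pods -o wide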

 

This diagram shows the relationship between pods and network overlays. Here we have two nodes, each running two pods, all connected via a network overlay. The overlay assigns each of these pods an IP, so the pods can listen on the same port despite the conflicts they would otherwise have at the host level. Network traffic, shown by the arrow connecting pods B and C, is carried by the network overlay, and the pods have no knowledge of the host's networking stack.

 

Having pods on a virtualized network solves significant issues with running dynamically scheduled, networked workloads. However, these virtual IPs are randomly assigned, which is a problem for any service or DNS record relying on them. Services fix this by providing a stable virtual IP frontend to these pods. Each service maintains a list of backend pods and load balances across them. The kube-proxy component routes requests for these service IPs from anywhere in the cluster.

 

 

This diagram differs slightly from the last one. Although pods may still be running on node 1, we omitted them from this diagram for clarity. We defined a service A that is exposed on port 80 on our hosts. When a request is made, it is accepted by the kube-proxy component and forwarded on to pod A1 or A2, which then handles the request. Although the service is exposed on the host, it is also given its own service IP on a separate CIDR from the pod network, and it can be reached on that IP from within the cluster as well.
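For reference, a service like the one in this diagram could be declared with a manifest roughly like the following sketch; the name, label selector, and ports here are hypothetical and would need to match your own pods:

apiVersion: v1
kind: Service
metadata:
  name: service-a
spec:
  selector:
    app: service-a      # selects the backend pods (A1 and A2) by label
  ports:
    - port: 80          # stable port on the service IP
      targetPort: 8800  # port the pods actually listen on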

 

The network overlay in Kubernetes is a pluggable component. Any provider that implements the Container Network Interface (CNI) APIs can be used as a network overlay, and these overlay providers can be chosen based on the features and performance required. In most environments, you will see overlay networks ranging from the cloud provider's own (such as on Google Kubernetes Engine or Amazon Elastic Kubernetes Service) to operator-managed solutions such as flannel or Calico. Calico is a network policy engine that happens to include a network overlay. Alternatively, you can disable its built-in network overlay and use Calico to enforce network policy on top of other overlays, such as a cloud provider's or flannel. Network policy is used to enforce pod and service isolation, a requirement of most secure environments.
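As a small illustration of that isolation, the following hedged sketch of a NetworkPolicy allows only pods labeled app: webtier to reach pods labeled app: apitier; the labels and namespace are hypothetical:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: apitier          # the policy applies to the API tier pods
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: webtier  # only the web tier may connect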

Troubleshooting Application Latency Issues

Now that we have a basic understanding of how networking works in Kubernetes, let's look at an example scenario: a network latency issue that led to a network blockage. We'll show you how to identify the cause of the problem and fix it.

 

To demonstrate this example, we’ll start by setting up a simple two-tier application representing a typical microservice stack. This gives us network traffic inside a Kubernetes cluster, so we can introduce issues with it that we can later debug and fix. It is made up of a web component and an API component that do not have any known bugs and correctly serve traffic.

 

These applications are written in the Go programming language and use the AppOptics agent for Go. If you're not familiar with Go, the "main" function is the entry point of our application and sits at the bottom of our web tier's file. The handler listens on the base path ("/") and calls out to our API tier using the URL defined in the url constant near the top of the file. The response from our API tier is written to an HTML template and displayed to the user. For brevity's sake, error handling, middleware, and other good Go development practices are omitted from this snippet.

 

package main

import (
      "context"
      "html/template"
      "io/ioutil"
      "log"
      "net/http"

      "github.com/appoptics/appoptics-apm-go/v1/ao"
)

const url = "http://apitier.default.svc.cluster.local"

func handler(w http.ResponseWriter, r *http.Request) {
      const tpl = `
<html>
  <head>
    <meta charset="UTF-8">
    <title>My Application</title>
  </head>
  <body>
    <h1>{{.Body}}</h1>
  </body>
</html>`

      // Start a trace for this inbound request and make it available to child spans.
      t, w, r := ao.TraceFromHTTPRequestResponse("webtier", w, r)
      defer t.End()
      ctx := ao.NewContext(context.Background(), t)

      // Call the API tier, wrapping the outbound request in a client span so the
      // time spent in the remote call is traced.
      httpClient := &http.Client{}
      httpReq, _ := http.NewRequest("GET", url, nil)

      l := ao.BeginHTTPClientSpan(ctx, httpReq)
      resp, err := httpClient.Do(httpReq)
      defer resp.Body.Close()
      l.AddHTTPResponse(resp, err)
      l.End()

      // Render the API tier's response into the HTML template.
      body, _ := ioutil.ReadAll(resp.Body)
      tmpl, _ := template.New("homepage").Parse(tpl)

      data := struct {
              Body string
      }{
              Body: string(body),
      }

      tmpl.Execute(w, data)
}

func main() {
      http.HandleFunc("/", ao.HTTPHandler(handler))
      log.Fatal(http.ListenAndServe(":8800", nil))
}

Our API tier code is simple. Much like the web tier, it serves requests on the base path ("/"), but it only returns a string of text. As part of this code, we propagate the context of any traces passed to this application under the name "apitier". This sets our application up for end-to-end distributed tracing.

package main

import (
      "context"
      "fmt"
      "net/http"
      "time"

      "github.com/appoptics/appoptics-apm-go/v1/ao"
)

func query() {
      time.Sleep(2 * time.Millisecond)
}

func handler(w http.ResponseWriter, r *http.Request) {
      t, w, r := ao.TraceFromHTTPRequestResponse("apitier", w, r)
      defer t.End()

      ctx := ao.NewContext(context.Background(), t)
      parentSpan, _ := ao.BeginSpan(ctx, "api-handler")
      defer parentSpan.End()

      span := parentSpan.BeginSpan("fast-query")
      query()
      span.End()

      fmt.Fprintf(w, "Hello, from the API tier!")
}

func main() {
      http.HandleFunc("/", ao.HTTPHandler(handler))
      http.ListenAndServe(":8801", nil)
}

When deployed on Kubernetes and accessed from the command line, these services look like this:

Copyright: Kubernetes®

This application is being served a steady stream of traffic. Because the AppOptics APM agent is turned on and tracing is being used, we can see a breakdown of these requests and the time spent in each component, including distributed services. From the web tier component’s APM page, we can see the following graph:

This view tells us that the majority of our time is spent in the API tier, with a brief amount of time spent in the web tier serving this traffic. However, we have an extra "remote calls" section. This section represents untraced time between the API tier and web tier. For a Kubernetes cluster, this includes our kube-proxy, the network overlay, and any proxies that have not had tracing added to them. It accounts for 1.65ms of a normal request, which is insignificant overhead for this environment, so we can use this as our "healthy" benchmark for this cluster.

Now we will simulate a failure in the network overlay layer. Using a tool satirically named Comcast, we can simulate adverse network conditions. Under the hood, this tool uses iptables and the traffic control (tc) utility, standard Linux tools for managing network environments. Our test cluster uses Calico as the network overlay, which exposes a tunl0 interface. This is the local tunnel Calico uses to bridge network traffic between machines, both to implement the network overlay and to enforce policy. Because we only want to simulate a failure at the network overlay, we target that device and inject 500ms of latency with a maximum bandwidth of 50kbps and minor packet loss.
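On the affected node, the invocation looks roughly like this (a sketch based on Comcast's documented flags; the exact packet-loss figure is an assumption):

$ comcast --device=tunl0 --latency=500 --target-bw=50 --packet-loss=1%

Running comcast --stop later removes the injected rules.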

Our continuous traffic testing is still running. After a few minutes of new requests, our AppOptics APM graph looks very different:

While our application time and the tracing-api-tier segment remained consistent, our remote calls time jumped significantly. We're now spending 6-20 seconds of each request just traversing the network stack. Thanks to tracing, it's clear that this application is operating as expected and the problem is in another part of our stack. We also have the AppOptics agent for Kubernetes and the AppOptics CloudWatch integration running on this cluster, so we can look at the host metrics to find more symptoms of the problem:

Our network graph suddenly starts reporting much more traffic, then stops reporting entirely. This could be a symptom of our network stack accepting a large number of requests on the standard interface (eth0), queueing them at the Calico tunnel, and then overflowing and preventing any more network traffic from reaching the machine until existing requests time out. This aggregate view of all traffic moving inside our host is deceptive, since it counts every byte passing through internal as well as external interfaces, which explains the extra traffic.

 

We still have the problem of the agent no longer reporting. Because default pods use the network overlay, the agent reporting back to AppOptics suffers from the same problem our API tier is having. As part of recovering this application and helping prevent this issue from happening again, we would move the AppOptics agent off the network overlay and onto the host network.
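In Kubernetes, that is a small change to the agent's pod spec; a minimal sketch, assuming the agent is deployed as a DaemonSet:

spec:
  hostNetwork: true                     # bypass the overlay; use the node's network stack
  dnsPolicy: ClusterFirstWithHostNet    # keep cluster DNS resolution working on the host network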

 

Even with our host agent either delayed or not reporting at all, we still have the AppOptics CloudWatch metrics for this host turned on, and can get the AWS view of the networking stack on this machine:

 

In this graph, we see that traffic becomes choppy at the start of the event, ranging from roughly 50Kb/s out during normal operation all the way up to 250Kb/s. This could be our bandwidth limit and packet loss settings causing bursts of outbound traffic. In any case, there's a massive discrepancy between the networking inside our Kubernetes cluster and outside of it, which points us to problems with our overlay stack. From here, we would move the node out of service, let Kubernetes automatically schedule our workloads onto other hosts, and proceed with host-level network debugging: looking at our iptables settings, checking flow logs, and verifying the health of our overlay components.
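The first two steps can be done with kubectl, and the rest happens on the node itself. A rough sequence (the node name is a placeholder, and tunl0 is the Calico interface from this example):

$ kubectl cordon <node-name>                      # stop scheduling new pods onto the node
$ kubectl drain <node-name> --ignore-daemonsets   # evict existing workloads

# then, on the node itself:
$ ip -s link show tunl0          # interface counters and errors on the Calico tunnel
$ sudo iptables-save | less      # review the rules kube-proxy and Calico have installed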

 

Once we remove these rules to clear the network issue, our traffic quickly returns to normal.

 

The latency drops to such a small value that it's no longer visible on the graph after 8:05:

Next Steps

Hopefully you are now much more familiar with how the networking stack works on Kubernetes and how to identify problems in it. A monitoring solution like AppOptics APM can help you monitor the availability of your services and troubleshoot problems faster. A small amount of tracing in your application goes a long way toward identifying the components of your systems that are having latency issues.

Version 1.1 of the venerable HTTP protocol powered the web for 18 years.

 

Since then, websites have evolved from static, text-driven documents into interactive, media-rich applications. The fact that the underlying protocol remained unchanged throughout this time goes to show how versatile and capable it is. But as the web grew bigger, its limitations became more obvious.

 

We needed a replacement, and we needed it soon.

 

Enter HTTP/2. Published in early 2015, HTTP/2 optimizes website connections without changing the existing application semantics. This means you can take advantage of HTTP/2’s features such as improved performance, updated error handling, reduced latency, and lower overhead without changing your web applications.

 

Today nearly 84% of modern browsers and 27% of all websites support HTTP/2, and those numbers are gradually increasing.

 

How is HTTP/2 Different from HTTP/1.1?

HTTP/2’s biggest changes impact the way data is formatted and transported between clients and servers.

 

Binary Data Format

HTTP/2 encapsulates data using a binary protocol. With HTTP/1.1, messages are transmitted in plaintext. This makes requests and responses easy to format and even read using a packet analysis tool, but results in increased size due to unnecessary whitespace and inefficient compression.

 

The benefit of a binary protocol is that it allows for more compact, more easily compressible, and less error-prone transmissions.

 

Persistent TCP Connections

In early versions of HTTP, a new TCP connection had to be created for each request and response. HTTP/1.1 introduced persistent connections, allowing multiple requests and responses over a single connection. The problem was that messages were exchanged sequentially, with web servers refusing to accept new requests until previous requests were fulfilled.

 

HTTP/2 simplifies this by allowing for multiple simultaneous downloads over a single TCP connection. After a connection is established, clients can send new requests while receiving responses to earlier requests. Not only does this reduce the latency in establishing new connections, but servers no longer need to maintain multiple connections to the same clients.

 

Multiplexing

Persistent TCP connections paved the way for multiplexed transfers. With HTTP/2, multiple resources can be transferred simultaneously, so clients no longer need to wait for earlier resources to finish downloading before the next one begins. Under HTTP/1.1, website developers used workarounds such as domain sharding to "trick" browsers into downloading resources in parallel, at the cost of opening multiple TCP connections to what is effectively a single host. HTTP/2 makes this entire practice obsolete.

 

Header Compression and Reuse

In HTTP/1.1, headers are sent uncompressed and repeated for each request. As the number of requests grows, so does the volume of duplicate header information. HTTP/2 eliminates redundant headers and compresses the remaining headers, drastically decreasing the amount of data repeated during a session.

 

Server Push

Instead of waiting for clients to request resources, servers can now push resources. This allows websites to preemptively send content to users, minimizing wait times.

 

Does My Site Already Support HTTP/2?

Several major web servers and content delivery networks (CDNs) support HTTP/2. The fastest way to check if your website supports HTTP/2 is to navigate to the website in your browser and open Developer Tools. In Firefox and Chrome, press Ctrl-Shift-I or the F12 key and click the Network tab. Reload the page to populate the table with a list of responses. Right-click the column names in the table and enable the “Protocol” header. This column will show HTTP/2.0 in Firefox or h2 in Chrome if HTTP/2 is supported, or HTTP/1.1 if it’s not.
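If you prefer the command line, curl can report the negotiated protocol as well, assuming your curl build includes HTTP/2 support; the URL below is a placeholder:

$ curl -sI --http2 -o /dev/null -w '%{http_version}\n' https://www.example.com

The command prints 2 when the server negotiated HTTP/2 and 1.1 otherwise.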

 

The network tab after loading 8bitbuddhism.com©. The website fully supports HTTP/2 as shown in the Protocol column.

 

Alternatively, KeyCDN provides a web-based HTTP/2 test tool. Enter the URL of the website you want to test, and the tool will report back on whether it supports HTTP/2.

 

How Do I Enable HTTP/2 on Nginx?

As of version 1.9.5, Nginx fully supports HTTP/2 via the ngx_http_v2_module. This module is included in the pre-built packages for Linux and Windows. When building Nginx from source, you will need to enable it by adding --with-http_v2_module as a configuration parameter.

You can enable HTTP/2 for individual server blocks. To do so, add http2 to the listen directive. For example, a simple Nginx configuration would look like this:

# nginx.conf
server {
    listen 443 ssl http2;
    server_name mywebsite.com;

    root /var/www/html/mywebsite;
}

Although HTTP/2 was originally intended to require SSL, the final specification lets you use it without SSL enabled; in practice, however, major browsers only support HTTP/2 over encrypted connections. To apply the changes, reload the Nginx service using:

$ sudo service nginx reload

or by invoking the Nginx CLI using:

$ sudo /usr/sbin/nginx -s reload
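If you also want to experiment with server push (described earlier), Nginx 1.13.9 and later expose it through the http2_push directive. A minimal, hypothetical sketch:

# push the stylesheet along with the page that references it
location = /index.html {
    http2_push /css/style.css;
}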

Benchmarking HTTP/2

To measure the speed difference between HTTP/2 and HTTP/1.1, we ran a performance test on a WordPress site with and without HTTP/2 enabled. The site was hosted on a Google Compute Engine instance with 1 virtual CPU and 1.7 GB of memory. We installed WordPress 4.9.6 in Ubuntu 16.04.4 using PHP 7.0.30, MySQL 5.7.22, and Nginx 1.10.3.

 

To perform the test, we created a recurring page speed check in SolarWinds® Pingdom® to contact the site every 30 minutes. After four measurements, we restarted the Nginx server with HTTP/2 enabled and repeated the process. We then dropped the first measurement for each test (to allow Nginx to warm up), averaged the results, and took a screenshot of the final test's Timeline.

 

 

The metrics we measured were:
  • Page size: the total combined size of all downloaded resources.
  • Load time: the time until the page finished loading completely.

 

Results Using HTTP/1.1

 

Timeline using HTTP/1.1

 

Results Using HTTP/2

 

Timeline using HTTP/2

 

And the Winner Is…

With just a simple change to the server configuration, the website performs noticeably better over HTTP/2 than HTTP/1.1. The page load time dropped by over 13% thanks to fewer TCP connections, resulting in a lower time to first byte. As a result of only using two TCP connections instead of four, we also reduced the time spent performing TLS handshakes. There was also a minor drop in overall file size due to HTTP/2’s more efficient binary data format.

 

Conclusion

HTTP/2 is already proving to be a worthy successor to HTTP/1.1. A large number of projects have implemented it and, with the exception of Opera Mini and UC for Android, mainstream browsers already support it. Whether it can handle the next 18 years of web evolution has yet to be seen, but for now, it’s given the web a much-needed performance boost.

 

You can try this same test on your own website using the Pingdom page speed check. Running the page speed check will show you the size and load time of every element. With this data you can tune and optimize your website, and track changes over time.

DevOps engineers wishing to troubleshoot Kubernetes applications can turn to log messages to pinpoint the cause of errors and their impact on the rest of the cluster. When troubleshooting a running application, engineers need real-time access to logs generated across multiple components.

 

Collecting live streaming log data lets engineers:

  • Review container and pod activity
  • Monitor the result of actions, such as creating or modifying a deployment
  • Understand the interactions between containers, pods, and Kubernetes
  • Monitor ingress resources and requests
  • Troubleshoot errors and watch for new or recurring problems

 

The challenge that engineers face is accessing comprehensive, live streams of Kubernetes log data. While some solutions exist today, these are limited in their ability to live tail logs or tail multiple logs. In this article, we’ll present an all-in-one solution for live tailing your Kubernetes logs, no matter the size or complexity of your cluster.

 

The Limitations of Current Logging Solutions

When interacting with Kubernetes logs, engineers frequently use two solutions: the Kubernetes command line interface (CLI), or the Elastic Stack.

 

The Kubernetes CLI (kubectl) is an interactive tool for managing Kubernetes clusters. The default logging tool is the command (kubectl logs) for retrieving logs from a specific pod or container. Running this command with the --follow flag streams logs from the specified resource, allowing you to live tail its logs from your terminal.

 

For example, let’s deploy an Nginx pod under the deployment name papertrail-demo. Using kubectl logs --follow [Pod name], we can view logs from the pod in real time:

$ kubectl logs --follow papertrail-demo-76bf4969df-9gs5w
10.1.1.1 - - [04/Jan/2019:22:42:11 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0" "-"

The main limitation of kubectl logs is that it only supports individual Pods. If we deployed two Nginx pod replicas instead of one, we would need to tail each pod separately. For large deployments, this could involve dozens or hundreds of separate kubectl logs instances.

 

The Elastic Stack (previously the ELK Stack) is a popular open-source log management solution. Although it can ingest and display log data through a web-based user interface, it unfortunately doesn’t support live tailing logs.

 

What is Papertrail, and How Does It Help?

SolarWinds® Papertrail is a cloud-hosted log management solution that lets you live tail your logs from a central location. Using Papertrail, you can view real-time log events from your entire Kubernetes cluster in a single browser window.

 

When a log event is sent from Kubernetes to Papertrail, Papertrail records the log’s contents along with its timestamp and origin pod. You can view these logs in a continuous stream in your browser using the Papertrail Event Viewer, as well as the Papertrail CLI client or Papertrail HTTP API. Papertrail shows all logs by default, but you can limit these to a specific pod, node, or deployment using a flexible search syntax.

 

For example, let’s increase the number of replicas in our Nginx deployment to three. If we used kubectl logs -f, we would need to run it three times: once for each pod. With Papertrail, we can open the Papertrail Event Viewer and create a search that filters the stream to logs originating from the papertrail-demo deployment. Not only does this show us output from each pod in the deployment, but also Kubernetes cluster activity related to each pod:


Filtering a live stream of Kubernetes logs using Papertrail.

 

Sending Logs from Kubernetes to Papertrail

The most effective way to send logs from Kubernetes to Papertrail is via a DaemonSet. DaemonSets run a single instance of a pod on each node in the cluster. The pod used in the DaemonSet automatically collects and forwards log events from other pods, Kubernetes, and the node itself to Papertrail.

 

Papertrail provides two DaemonSets:

  • The Fluentd DaemonSet uses Fluentd to collect logs from containers, pods, Kubernetes, and nodes. This is the preferred method for logging a cluster.
  • The Logspout DaemonSet uses logspout to monitor the Docker log stream. This option is limited to log output from containers, not Kubernetes or nodes.

We’ll demonstrate using the Fluentd DaemonSet. From a computer with kubectl installed, download fluentd-daemonset-papertrail.yaml and open it in a text editor. Change the values of FLUENT_PAPERTRAIL_HOST and FLUENT_PAPERTRAIL_PORT to match your Papertrail log destination. Optionally, you can name your instance by changing FLUENT_HOSTNAME. You can also change the Kubernetes namespace that the DaemonSet runs in by changing the namespace parameter. When you are done, deploy the DaemonSet by running:

$ kubectl create -f fluentd-daemonset-papertrail.yaml
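For reference, the portion of fluentd-daemonset-papertrail.yaml you just edited looks roughly like this; the host, port, and hostname values below are placeholders, not real destination settings:

        env:
          - name: FLUENT_PAPERTRAIL_HOST
            value: "logsN.papertrailapp.com"   # your Papertrail log destination host
          - name: FLUENT_PAPERTRAIL_PORT
            value: "12345"                     # your log destination port
          - name: FLUENT_HOSTNAME
            value: "my-cluster"                # optional friendly name for this cluster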

In a few moments, logs will start to appear in Papertrail:


Live feed of Kubernetes logs in Papertrail.

 

Best Practices for Live Tailing Kubernetes Logs

To get the most out of your logs, make sure you’re following these best practices.

 

Log All Applications to STDOUT and STDERR

Kubernetes collects logs from Pods by monitoring their STDOUT and STDERR streams. If your application logs to another location, such as a file or remote service, Kubernetes won’t be able to detect it, and neither will your Papertrail DaemonSet. When deploying an application, make sure to route its logs to the standard output stream.
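In Go, for example, routing the standard library logger to STDOUT is a one-line change; a minimal sketch with an illustrative message:

package main

import (
      "log"
      "os"
)

func main() {
      // Send application logs to STDOUT so the container runtime (and the
      // Papertrail DaemonSet) can pick them up; avoid writing to local files.
      log.SetOutput(os.Stdout)
      log.Println("application started")
}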

 

Use the Fluentd DaemonSet

The Logspout DaemonSet is limited to logging containers. The Fluentd DaemonSet, however, will log your containers, pods, and nodes. In addition to logging more resources, Fluentd also logs valuable information such as Pod names, Pod controller activity, and Pod scheduling activity.

 

Open Papertrail Next to Your Terminal

When you’re working on Kubernetes apps and want to debug problems with Pods, have a browser window with Papertrail open either beside or behind your terminal window. This way you can see the results of your actions after you execute them. This also saves you from having to tail manually in your terminal.

 

Group Logs to Make Them Easier to Find

Kubernetes pods (and containers in general) are ephemeral and often have randomly generated names. Unless you specify fixed names, it can be hard to keep track of which pods or containers to filter on. A solution is to use log groups, which let you group logs from a specific application or development team together. This helps you find the logs you need and hide everything else.

 

Save Searches in Papertrail

Papertrail lets you save your searches for creating custom Event Viewer sessions and alerts. You can reopen previously created live tail sessions, share your sessions with team members, or receive an instant notification when new log events arrive in the stream.

 

Conclusion

Kubernetes logs help DevOps teams identify deployment problems and improve the reliability of their applications. Live tailing enables faster troubleshooting by helping developers collect, view, and analyze these logs in real time. To get started with SolarWinds Papertrail, sign up and start logging your Kubernetes cluster in a matter of minutes.

Jenkins X (JX) is an exciting new Continuous Integration and Continuous Deployment (CI/CD) tool for Kubernetes users. It hides the complexities of operating Kubernetes by giving developers a simpler experience to build and deploy their code. You can think of it as creating a serverless-like environment in Kubernetes. As a developer, you don’t need to worry about all the details of setting up environments, creating a CI/CD pipeline, or connecting GitHub to your CI pipeline. All of this and much more is handled by JX. In this article, we’ll introduce you to JX, show you how to use it, and how to monitor your builds and production deployments.

 

What is Jenkins X?

JX was created by James Strachan (creator of Groovy, Apache Camel, and now JX) and was first announced in March 2018. It’s designed from the ground up to be a cloud-native, Kubernetes-only application that not only supports CI/CD, but also makes working with Kubernetes as simple as possible. With one command you can create a Kubernetes cluster, install all the tools you’ll need to manage your application, create build and deployment pipelines, and deploy your application to various environments.

Jenkins is described as an “extensible automation server” that can be configured, via plugins, to be a Continuous Integration server, a Continuous Deployment hub, or a tool to automate just about any software task. JX provides a specific configuration of Jenkins, meaning you don’t need to know which plugins are required to stand up a CI/CD pipeline. It also deploys numerous applications to Kubernetes to support building your Docker container, storing the container in a Docker registry, and deploying it to Kubernetes.

Jenkins pipeline builds are driven by adding a Jenkinsfile to your project. JX automates this for you. JX can create new projects (and the required Jenkinsfile) for you or import your existing project and create a Jenkinsfile if you don’t already have one. In short, you don’t need to know anything about Jenkins or Kubernetes to get started with JX. JX will do it all for you.

 

Overview of How JX Works

JX is designed to take out all of the guesswork and trial and error many teams have gone through to create a fully functional CI/CD pipeline in Kubernetes. To deliver a tailored developer experience, JX had to choose which Kubernetes technologies to use. In many ways, JX is like a Linux distribution, but for Kubernetes: from the plethora of tools available, it had to decide which ones to use to create a smooth and seamless developer experience in Kubernetes.

To make the transition to Kubernetes simpler, the command line tool jx can drive most of your interactions with Kubernetes. This means you don’t need to know how to use kubectl right away; instead you can slowly adopt kubectl as you become more comfortable in Kubernetes. If you are an experienced Kubernetes user, you’ll use jx for interacting with JX (CI/CD, build logs, and so on) and continue to use kubectl for other tasks.

When you create or import a project using the jx command line tool, JX will detect your project type and create the appropriate Jenkinsfile for you (if it doesn’t already exist), define the required Kubernetes resources for your project (like Helm charts), add your project to GitHub and create the necessary webhooks for your application, build your application in Jenkins, and if all tests pass, deploy your application to a staging environment. You now have a fully integrated Kubernetes application with a CI/CD pipeline ready to go.

Your interaction with JX is driven by a few jx commands to set up an environment, create or import an application, and monitor the state of your build pipelines. The developer workflow is covered in the next section. Generally speaking, once set up, you don’t need to interact with JX much; it works quietly in the background, providing CI and CD functionality.

 

Install Jenkins X

To get started using JX, install the jx binary. For macOS, you can use brew:

brew tap jenkins-x/jx
brew install jx

Note: When I first tried to create a cluster using JX, it installed kops for me. However, the first time jx tried to use kops, it failed because kops wasn’t on my path. To address this, install kops as well:

brew install kops

Create a Kubernetes Cluster

JX supports most major cloud environments: Google GKE, Azure AKS, Amazon EKS, minikube, and many others. JX has a great video on installing JX on GKE. Here, I’m going to show you how to install JX on AWS without EKS. Creating a Kubernetes cluster from scratch is very easy:

jx create cluster aws

Since I wasn’t using JX for a production application, I ran into a few gotchas during my install:

  1. When prompted with, “No existing ingress controller found in the kube-system namespace, shall we install one?” say yes.
  2. Assuming you are only trying out JX, when prompted with, “Would you like to register a wildcard DNS ALIAS to point at this ELB address?” say no.
  3. When prompted with, “Would you like wait and resolve this address to an IP address and use it for the domain?” say yes.
  4. When prompted with, “If you don’t have a wildcard DNS setup then set up a new CNAME and point it at: XX.XX.XX.XX.nip.io. Then, use the DNS domain in the next input” accept the default.

The image below shows you the EC2 instances that JX created for your Kubernetes Cluster (master is an m3.medium instance and the nodes are t2.medium instances):

AWS EC2 Instances. © 2018 Amazon Web Services, Inc. or its affiliates. All rights reserved.

When you are ready to remove the cluster you just created, you can use this command (JX currently does not provide a delete cluster command):

kops delete cluster

Here’s the full kops command to remove the cluster you just created (you’ll want to use the cluster name and S3 bucket for all kops commands):

kops delete cluster --name aws1.cluster.k8s.local \
  --state=s3://kops-state-xxxxxx-ff41cdfa-ede6-11e8-xx6-acde480xxxx

To add Loggly integration to your Kubernetes cluster, you can follow the steps outlined here.

 

Create an Application

Now that JX is up and running, you are ready to create an application. The quickest way to do this is with a JX quickstart. In addition to the quickstart applications that come with JX, you can also create your own.

To get started, run jx create quickstart and pick the spring-boot-http-gradle quickstart (see the screenshot below for more details):

jx create quickstart

 

Creating an application using jx create quickstart © 2018 Jenkins Project

Note: During the install process, I did run into one issue. When prompted with, “Which organization do you want to use?” make sure you choose a GitHub Org and not your personal account. The first time I ran this, I tried my personal account (which has an org associated with it) and jx create quickstart failed. When I reran it, I chose my org ripcitysoftware and everything worked as expected.

Once your application has been created, it will automatically be deployed to the staging environment for you. One thing I really like about JX is how explicit everything is. There isn’t any confusion between temporary and permanent environments because the environment name is embedded into the application URL (http://spring-boot-http-gradle.jx-staging.xx.xx.xx.xx.nip.io/).

The Spring Boot quickstart application provides you with one REST endpoint:

Example Spring Boot HTTP © 2018 Google, Inc

 

Developer Workflow

JX has been designed to support the trunk-based development model promoted by DevOps leaders like Jez Humble and Gene Kim. JX is heavily influenced by the book Accelerate (you can find more here), and as such it provides an opinionated developer workflow. Trunk-based development means releases are built off of trunk (master in Git). Research has shown that teams using trunk-based development are more productive than those using long-lived feature branches. Instead of long-lived feature branches, teams create short-lived branches that exist for only a few hours and contain a few small changes.

Here’s a short overview of trunk-based development as supported by JX. To implement a code change or fix a bug, you create a branch in your project, write tests, and make code changes as needed. (These changes should only take a couple of hours to implement, which means your code change is small.) Push your branch to GitHub and open a Pull Request. Now JX will take over. The webhook installed by JX when it imported your project will trigger a CI build in Jenkins. If the CI build succeeds, Jenkins will notify GitHub the build was successful, and you can now merge your PR into master. Once the PR is merged, Jenkins will create a released version of your application (released from the trunk branch) and deploy it (CD) to your staging environment. When you are ready to promote your application from stage to production, you’ll use the jx promote command.

The development workflow is expected to be:

  1. In git, create a branch to work in. After you’ve made your code changes, commit them and then push your branch to your remote git repository.
  2. Open a Pull Request in your remote git repo. This will trigger a build in Jenkins. If the build is successful, JX will create a preview environment for your PR so you can review and test your changes. To trigger the promotion of your code from Development to Staging, merge your PR.
  3. By default, JX will automatically promote your code to Stage. To promote your code to Production, you’ll need to run this command manually: jx promote app-name --version x.y.z --env production
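Put together as commands, one pass through this loop might look roughly like the following; the branch name and version are hypothetical, and the app name comes from the quickstart above:

$ git checkout -b fix-greeting
# edit code and tests, then commit and push the branch
$ git commit -am "Fix greeting text"
$ git push origin fix-greeting
# open a Pull Request on GitHub; JX builds it and creates a preview environment
# merge the PR; JX releases from master and promotes to staging automatically
$ jx promote spring-boot-http-gradle --version 0.0.2 --env production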

Monitoring Jenkins X

Monitoring the status of your builds gives you insight into how development is progressing. It will also help you keep track of how often you are deploying apps to various environments.

JX provides you multiple ways to track the status of a build. JX configures Jenkins to trigger a build when a PR is opened or updated. The first place to look for the status of your build is in GitHub itself. Here is a build in GitHub that resulted in a failure. You can clearly see the CI step has failed:

GitHub PR Review Web Page. © 2018 GitHub Inc. All rights reserved.

The next way to check on the status of your build is in Jenkins itself. You can navigate to Jenkins in your browser or, from GitHub, you can click the “Details” link to the right of “This commit cannot be built.” Here is the Jenkins UI. You will notice Jenkins isn’t very subtle when a build fails:

Jenkins Blue Ocean failed build web page. © 2018 Jenkins Project

A third way to track the status of your build is from the command line, using the jx get activity command:

iTerm – output from jx get activity command © 2018 Jenkins Project

If you want to see the low-level details of what Jenkins is logging, you’ll need to look at the container Jenkins is running in. Jenkins is running in Kubernetes like any other application. It’s deployed as a pod and can be found using the kubectl command:

$ kubectl get pods
NAME                      READY     STATUS    RESTARTS   AGE
jenkins-fc467c5f9-dlg2p   1/1       Running   0          2d

Now that you have the name of the Pod, you can access the log directly using this command:

$ kubectl logs -f jenkins-fc467c5f9-dlg2p

 

iTerm – output from kubectl logs command © 2018 Jenkins Project

Finally, if you’d like to get the build output log, the log that’s shown in the Jenkins UI, you can use the command below. This is the raw build log that Jenkins creates when it’s building your application. When you have a failed build, you can use this output to determine why the build failed. You’ll find your test failures here along with other errors like failures in pushing your artifacts to a registry. The output below is not logged to the container (and therefore not accessible by Loggly):

$ jx get build log ripcitysoftware/spring-boot-http-gradle/master
view the log at: http://jenkins.jx.xx.xx.xxx.xxx.nip.io/job/ripcitysoftware/job/spring-boot-http-gradle/job/master/2/console
tailing the log of ripcitysoftware/spring-boot-http-gradle/master #2
Push event to branch master
Connecting to https://api.github.com using macInfinity/****** (API Token for accessing https://github.com Git service inside pipelines)

Monitoring in Loggly

One of the principles of a microservice architecture, as described by Sam Newman in Building Microservices, is being Highly Observable. Specifically, Sam suggests that you aggregate all your logs. A great tool for this is SolarWinds® Loggly. Loggly is designed to aggregate all of your logs into one central location. By centralizing your logs, you get a holistic view of your systems. Deployments can trigger a change in the application that can generate errors or lead to instability. When you’re troubleshooting a production issue, one of the first things you want to know is whether something changed. Being able to track the deployments in your logs will let you backtrack deployments that may have caused bugs.

To monitor deployments, we need to know what’s logged when a deployment succeeds or fails. This is the message Jenkins logs when a build has completed:

INFO: ripcitysoftware/spring-boot-http-gradle/master #6 completed: SUCCESS

From this message, we get a few pieces of information: the branch name, which contains the project name ripcitysoftware/spring-boot-http-gradle and the branch master; the build number, #6; and finally the build status, SUCCESS.

The metrics you should monitor are:

  • Build status – Whether a build was a success or failure
  • The project name – Which project is being built
  • The build number – Tracks PRs and releases

By tracking the build status, you can see how often builds are succeeding or failing. The project name and build number tell you how many PRs have been opened (look for “PR” in the project name) and how often a release is created (look for “master” in the name).

To track all of the above fields, create one Derived Field in Loggly called jxRelease. Each capture group (the text inside of the parentheses) defines a unique Derived Field in Loggly. Here is the regex you’ll need:

^INFO:(.*)\/.*(master|PR.*) #(.*\d) completed: ([A-Z]+$)$
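If you want to sanity-check what each capture group extracts before saving the rule, you can run the same pattern against the sample message locally; a quick, hypothetical check in Go:

package main

import (
      "fmt"
      "regexp"
)

func main() {
      // The same pattern used for the Loggly Derived Field rule.
      re := regexp.MustCompile(`^INFO:(.*)\/.*(master|PR.*) #(.*\d) completed: ([A-Z]+$)$`)
      msg := "INFO: ripcitysoftware/spring-boot-http-gradle/master #6 completed: SUCCESS"
      // Prints the full match followed by the project, branch, build number, and status groups.
      fmt.Println(re.FindStringSubmatch(msg))
}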

Here’s the Jenkins build success log-message above as it appears in Loggly after we’ve created the Derived Field. You can see all the fields we are defining highlighted in yellow below the Rule editor:

Loggly – Derived Field editor web page.  © 2018 SolarWinds Worldwide, LLC. All rights reserved.

Please note that Derived Fields are applied to past logs only in the designer tool; Loggly only adds derived fields to new log messages. This means if you’ve already sent an hour of Jenkins output to Loggly and then create the jxBuildXXX fields (as shown above), only new log messages will include these fields.

In the image below, you can see all the Derived Fields that have been parsed in the last 30 minutes. For jxBuildBranchName, there has been one build to stage, and it was successful, as indicated by the value SUCCESS. We also see that nine (9) builds have been pushed to stage, as indicated by the jxBuildNumber field.

 

Loggly Search Results web page.  © 2018 SolarWinds Worldwide, LLC. All rights reserved.

Now that these fields are parsed out of the logs, we can filter on them using the Field Explorer. Above, you can see that we have filtered on the master branch. This shows us each time the master branch has changed. When we are troubleshooting a production bug, we can now see the exact time the code changed. If the bug started after a deployment, then the root cause could be the code change. This helps us narrow down the root cause of the problem faster.

We can also track when master branch builds fail and fire an alert to notify our team on Slack or email. Theoretically, this should never happen, assuming we are properly testing the code. However, there could have been an integration problem that we missed, or a failure in the infrastructure. Setting an alert will notify us of these problems so we can fix them quickly.

 

Conclusion

JX is an exciting addition to Jenkins and Kubernetes alike. JX fills a gap that has existed since the rise of Kubernetes: how to assemble the correct tools within Kubernetes to get a smooth and automated CI/CD experience. In addition, JX helps break down the barrier to entry into Kubernetes and Jenkins for CI/CD. JX itself gives you multiple tools and commands to navigate system logs and track build pipelines. Adding Loggly integration to your JX environment is very straightforward. You can easily track the status of your builds and monitor your app’s progression from development to a preview environment, to a staging environment, and finally to production. When you are troubleshooting a critical production issue, you can look at the deployment time to see whether a code change caused the issue.

Are you an administrator who’s supporting a small environment and hasn’t yet had the time or budget to invest in a centralized IT monitoring tool? No doubt you’re tired of coworkers showing up at your desk or calling about an outage you weren’t yet aware of. If an enterprise-class solution would be overkill, but you don’t have the budget to purchase a licensed solution, ipMonitor® Free Edition might be able to bridge that gap.

 

ipMonitor Free Edition is a fully functional version of our ipMonitor solution for smaller environments.  It’s a standalone, free tool that helps you stay on top of what is going on with your critical network devices, servers, and applications—so you know what’s up, what’s down, and what’s not performing as expected. 

 

ipMonitor Free Edition at a Glance

  • Clear visibility of IT network device, server, and application status
  • Customizable alerting with optional automatic remediation
  • Simple deployment with our startup wizard and alerting recommendations
  • Lightweight installation and maintenance

 

ipMonitor Free Edition is an excellent starting point to more robust, centralized monitoring. It is designed for network and systems administrators with small environments or critical components they need to focus on, and can support up to 50 monitors. Monitors watch a specific aspect of a device, service, or process. Example monitors include: Ping, CPU, memory or disk usage, bandwidth, and response time.

 

Interested in giving it a try?  Download ipMonitor Free Edition today.  If you have any questions, head over to the ipMonitor product forum and start a discussion. 

 

 

 
