DevOps Terms for Non-Developers
DevOps is on the rise these days. Whether you attend banking and fintech conferences or browse automobile industry trend reports, it’s hard not to run into DevOps. According to a 2022 GigaOm report, 79% of organizations are expanding their DevOps practices.
While there may be many reasons that prompt varied organizations in different industries to deepen their DevOps practices, GigaOm found the top three drivers were improving the quality and reliability of each release, increasing team productivity, and increasing release velocity. These fundamental principles of increasing quality, efficiency, and speed have broad appeal across industries, geographies, and functional areas.
With the rise of DevOps adoption, there are many of us non-developers (or former developers) who can feel a little lost with some of the terms. For example, the first time I heard the term ‘chaos experiment’ in the morning standup, I thought it was a colorful euphemism for the testing process.
Below are a few of the DevOps terms I ran into over the last few years. If you run into other DevOps phrases that surprised you, please share them in the comments below, and I will add them to the list.
Agent
A program that performs a small, specific task in the background. It is used to automate repetitive tasks.
Agile Software Development
A lightweight framework that promotes iterative development and incremental delivery using self-organizing cross-functional teams.
Application Release Automation (ARA)
ARA involves using tools, scripts, or products to achieve the consistent and repeatable process of packaging and deploying an update, feature, or application from development to production.
Artifact
An artifact is a document or any deliverable associated with a project that helps to describe the function, architecture, and design of the software being developed.
Behavior-Driven Development (BDD)
BDD is an evolution of test-driven development (TDD) that focuses on how the software should behave rather than how the behavior is implemented. BDD tests are written in natural language, allowing both technical and non-technical members of the team to describe the requirements of a software system collaboratively.
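To make this concrete, here is a minimal BDD-flavored test written in plain Python. The ShoppingCart class and the scenario are invented for illustration; real BDD tools such as Cucumber or behave bind natural-language Gherkin steps (mirrored in the comments below) to step functions.

```python
# A BDD-style scenario sketched in plain Python. The domain object
# (ShoppingCart) is hypothetical and exists only for this example.

class ShoppingCart:
    """Toy domain object used only for this illustration."""
    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    @property
    def total(self):
        return sum(price for _, price in self.items)


def test_adding_an_item_updates_the_total():
    # Given an empty shopping cart
    cart = ShoppingCart()
    # When the user adds a book costing 10
    cart.add("book", 10)
    # Then the cart total is 10
    assert cart.total == 10
```

Note how the Given/When/Then comments describe behavior in language a non-developer can read, while the assertions make the expectation executable.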
Black Box Testing
A type of functional testing that examines the functionality of software without knowledge of its internal structure, design, or coding. It is distinguished from white box testing in that the internal design of the software is not known to the tester and is, therefore, a “black box”.
Branching
Branching occurs when an object under review in source control is duplicated so other developers can work on it concurrently.
Build Agent
A type of agent that sends and receives messages about handling software builds.
Build Artifact Repository
Build artifact repositories are used to store, organize, and distribute artifacts (meaning binary files plus their metadata) in a single centralized location. This reduces the amount of time spent downloading dependencies from a public place and prevents inconsistencies by allowing development teams to find the right version of an artifact easily.
Build Automation
Build automation involves scripting and automating the process of compiling computer source code into binary code.
Canary Release
A deployment strategy aiming to reduce the impact of failures when bringing an application update to production. An updated version of the application is brought up alongside the existing one, and user traffic is gradually shifted over to the updated version. If any anomalies arise, traffic shifting is halted, and can easily be diverted back to the previous version, which is still running. This strategy ensures the update is stable and functional before being rolled out to all users.
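The traffic-shifting idea can be sketched in a few lines of Python. The router and stage weights below are hypothetical; in practice a load balancer or service mesh performs this role.

```python
# Toy sketch of gradual traffic shifting between two versions.
import random

def route_request(canary_weight, rng=random.random):
    """Route one request: a fraction `canary_weight` (0.0-1.0) of traffic
    goes to the updated version, the rest to the existing one."""
    return "v2-canary" if rng() < canary_weight else "v1-stable"

def rollout_weights(steps=(0.05, 0.25, 0.50, 1.00)):
    """Stage weights for shifting traffic gradually; if anomalies appear
    at any stage, shifting halts and traffic is diverted back to v1."""
    return list(steps)
```

The injectable `rng` parameter makes the routing decision testable; the staged weights capture the "gradually shifted over" part of the definition.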
Capacity Test
This test determines how many users a computer, server, or application can support before failing.
Chaos Engineering or Chaos Experiments
A disciplined approach for identifying potential failures before they evolve to create outages. Chaos engineering uses an approach of actively introducing failures in the system and ensuring the system can automatically alleviate or correct the failures without service disruption.
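A chaos experiment in miniature might look like the following Python sketch, where a failure is deliberately injected into a dependency to verify that the caller degrades gracefully. All names are illustrative.

```python
# Minimal fault-injection sketch: wrap a dependency so it fails with a
# chosen probability, then check the system tolerates the failure.
import random

def flaky(func, failure_rate, rng=random.random):
    """Wrap `func` so it raises RuntimeError with probability failure_rate."""
    def wrapper(*args, **kwargs):
        if rng() < failure_rate:
            raise RuntimeError("injected failure")
        return func(*args, **kwargs)
    return wrapper

def resilient_call(dependency, fallback):
    """The behavior under test: fall back to a default when the dependency fails."""
    try:
        return dependency()
    except RuntimeError:
        return fallback
```

Real chaos tooling injects failures (killed processes, network latency, full disks) into production-like systems; the principle of "introduce the failure, verify the system alleviates it" is the same.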
Cluster
A set of connected computers that work together to enable load balancing, auto-scaling, and high availability.
Commit
The process of pushing code to a source code repository and logging the changes made.
Complex-Adaptive System (CAS)
A complex adaptive system refers to a system composed of a collection of similar, smaller pieces that are dynamically connected and can adapt to change for the benefit of the macrostructure. The result of this distributed, interrelated, but independent collection of entities is a system where understanding the individual components does not translate into a perfect understanding of the system’s behavior. Examples of CAS include the brain, the climate, ant colonies, and DevOps teams.
Configuration as code
Configuration as code is an approach and practice where configuration items are treated as source code and with equal importance. Configuration items are defined as something used to change the behavior of an existing application, tool, or infrastructure component. Treating configuration “as code” ensures configuration items are quality tested and improved over time. The approach introduces guardrails, processes, tests, and automation and drives increased collaboration and responsibility for configuration items.
Containerization
The natural evolution of virtualization, containerization treats each application as its own logically distinct server by virtualizing the operating system.
Containers
Resource isolation at the OS (rather than machine) level. Isolated elements vary by containerization strategy and often include file system, disk quota, CPU and memory, I/O rate, root privileges, and network access. Containers are less resource-intensive than machine-level virtualization and can be used to address most isolation requirement sets.
Containers are immutable, meaning that no matter where a container is created, or on what hardware or underlying operating system, it will work the same.
Containers as a Service (CaaS)
A cloud service model that offers container-based virtualization with container engines, orchestration, and compute resources.
Containers as a Service is a managed service to run containerized applications where the orchestration is taken care of, and all you as a user need to be concerned about is your application as a container. CaaS is a specialized type of Platform as a Service (PaaS).
Continuous Delivery (CD)
An evolutionary outgrowth of continuous integration, continuous delivery is a set of processes and practices that automates the SDLC from build to testing, thereby enabling a rapid feedback loop between a business and its users. Together with continuous integration, it forms the modern CI/CD delivery pipeline.
Continuous Deployment (CDE)
Continuous deployment enables a development team to release code changes to the production environment several times per day. It is a fully automated version of continuous delivery.
Continuous Feedback
Continuous feedback is an essential component of the DevOps way of working. The development team works closely with customers and end-users to get their feedback on the product, and that feedback becomes a source of further tasks and prioritization. User feedback is augmented with feedback obtained by monitoring system behavior in production, and with more immediate feedback during development, such as results from system testing. It is also normal to get feedback on processes themselves by measuring different aspects of them.
Continuous Improvement
Continuous improvement is an approach with the objective of creating a culture that allows anyone on the team to make or suggest improvements to a product or process at any given time.
Continuous Integration (CI)
Continuous integration is a software development practice where developers are required to integrate code into a shared repository multiple times per day to get rapid feedback. Together with continuous delivery, it forms the modern CI/CD delivery pipeline.
Continuous Quality Improvement (CQI)
CQI is a quality management philosophy organizations use to reduce waste, increase efficiency, and increase internal and external satisfaction.
Continuous Testing
Continuous testing aims to reduce waiting time for developers by testing early and often and automating as much as possible.
Cross-Functional Team
A cross-functional team is the industry term for a team with all the competencies required to design, implement, test, deliver, operate, and monitor a service or product. Application teams, feature teams, and DevOps teams are all examples of cross-functional teams.
Dark Launching
A development strategy in which a new version of the code, one that implements new features, is released to a subset of users; the team observes how those users respond and updates the features accordingly. It is similar in concept to the Canary Release.
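One common way to expose a feature to only a subset of users is a deterministic percentage rollout, sketched below in Python. The function and its parameters are hypothetical; hashing the user ID guarantees the same user always receives the same answer.

```python
# Hypothetical sketch of a deterministic percentage rollout: each user
# is mapped to a stable bucket in [0, 1) by hashing, and the feature is
# enabled for users whose bucket falls under the rollout percentage.
import hashlib

def in_rollout(user_id: str, feature: str, percent: float) -> bool:
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform-ish value in [0, 1)
    return bucket < percent / 100.0
```

Because the bucket depends only on the user and feature names, raising the percentage from 5 to 25 keeps the original 5% of users in the rollout and adds new ones, rather than reshuffling everyone.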
Deployment
Deployment refers to all the processes involved in getting new software up and running correctly in the target environment. Deployment includes installation, configuration, running, and testing.
Deployment Pipeline
An automated multi-step process that takes software from version control to making it available to an organization’s users.
DevOps
DevOps is a methodology that strives to improve collaboration and automate as much as possible, with the end goal of releasing software faster and more efficiently.
DevSecOps
DevSecOps involves incorporating security into all stages of the software development workflow instead of saving it for the last stage. DevSecOps resolves the tension between DevOps teams that want to release software quickly and security teams that prioritize security over all else.
DevXOps
DevXOps refers to incorporating DevOps approaches in specific focus areas. Examples include DevSecOps with a focus on security, DevTestOps with a focus on testing, and DevBizOps with a focus on business value creation. All these focus areas are applications of the DevOps approach. DevXOps is used to describe any such focus area.
Distributed Tracing
Distributed tracing is a technique aimed at ensuring application observability, traceability, and monitoring. It enables tracing the long chain of requests between different services over the network, and it is relied upon for root cause analysis of issues in microservice-based applications.
Event-Driven Architecture
A loosely coupled software architecture framework for application design where the capture, communication, processing, and persistence of events are at the very core of the solution design. With this architectural design, each component of the system is designed to produce events and to react, consume, and detect other events.
Everything as Code
Everything as code treats the entire system as code, meaning everything from bare metal servers to deployment configurations is stored in a repository as code and can be recreated or rolled back to a past state with the click of a button.
Exploratory Testing
A type of testing that emphasizes testers freely discovering the capabilities of the software rather than following fixed methodologies.
A testing process where human testers are given free rein to test areas that may potentially have issues automated testing couldn’t detect.
Fail Fast
A design strategy characterized by a rapid turnaround, where an attempt fails, is reported on time, feedback is quickly returned, the changes are made, and a new attempt is made. This is a key tenet of agile.
Functional Testing
A type of black-box testing where functions are tested by feeding them input and examining the output.
Functions as a Service
Functions as a Service (FaaS) provides services for operating and triggering independent functions. The services aim to empower developers to run self-contained functions triggered by external events, with the benefit of scaling from zero to thousands of parallel executions. Applications built on FaaS consist of a set of functions that run and are triggered independently of each other but together form (or are part of) a complete system.
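As an illustration, a FaaS-style function is typically just a handler that receives an event payload and returns a response. The sketch below is generic Python that only loosely mimics the handler shape of real providers; the event fields and response format are assumptions for this example.

```python
# A hypothetical FaaS-style handler: a self-contained function triggered
# by an event payload, returning an HTTP-like response dict.
import json

def handler(event, context=None):
    """Respond to an event such as {"name": "DevOps"}."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

The platform, not the developer, decides when and where this function runs and how many copies of it exist at any moment.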
Immutable Infrastructure
Immutable infrastructure is the practice of creating new resources with the latest version or configuration. Instead of modifying an existing server or application, you create a new one and shift traffic over to the new instance. The key benefit is avoiding potential unplanned service interruptions. This practice is very well suited for cloud-based environments using Infrastructure as Code or Configuration as Code approaches. One common configuration is to mount file systems read-only, both to avoid configuration drift and to reduce the impact of security breaches.
Infrastructure as a Service (IaaS)
IaaS provides access to computing resources through a virtual server instance, which replicates the capabilities of an on-premises data center. It is elastic and scalable, which makes it practical for workloads that are temporary or unpredictable.
Infrastructure as Code
Infrastructure as code is an approach and practice where infrastructure items are treated as source code and with equal importance. Infrastructure items are defined as scripts, definitions, and templates used to create and set up infrastructure components like tools, virtual machines, networks, storage, etc. In DevOps, this is an important technique since it enables teams to take responsibility and ownership of infrastructure, remove the handover to an operations team, and evolve infrastructure according to the innovation pace of the application itself.
Integration Testing
The testing of a component or module of code to ensure it integrates correctly with other components or modules of code.
Lead Time
The average amount of time needed for one feature request to make it through the entire development cycle from concept to delivery.
Lean
Lean aims to create more value for customers with fewer resources by identifying and eliminating waste.
Low-Code
An approach to software development where developers create software through modeling instead of traditional computer programming, using a platform that provides graphical interfaces, configuration management, and software development automation features. Thanks to its visual layout, set of ready-made components, and minimal hand-coding, low-code speeds up the process of application creation.
Machine Learning (ML)
Machine learning is a branch of artificial intelligence that gives computers the ability to learn without being explicitly programmed.
Mean Time Between Failures (MTBF)
MTBF is the average time between failures or breakdowns of a device. It is used to measure the reliability of a system or component.
Mean Time to Recovery (MTTR)
The average time it takes a system or component to recover from a failure and return to production status.
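Both MTBF and MTTR can be computed from a simple incident log, as in this illustrative Python sketch (the incident data is made up for the example):

```python
# Computing MTBF and MTTR from an incident log. Each incident is a
# (failure_time, recovery_time) pair in hours since some arbitrary epoch.

def mttr(incidents):
    """Mean time to recovery: average outage duration."""
    return sum(end - start for start, end in incidents) / len(incidents)

def mtbf(incidents, total_hours):
    """Mean time between failures: total uptime divided by failure count."""
    downtime = sum(end - start for start, end in incidents)
    return (total_hours - downtime) / len(incidents)

# Three outages (2 h, 1 h, and 3 h long) observed over a 100-hour window:
incidents = [(10, 12), (40, 41), (90, 93)]
# mttr(incidents) -> 2.0 hours; mtbf(incidents, 100) -> (100 - 6) / 3 hours
```

A reliable system wants MTBF as high as possible and MTTR as low as possible; DevOps practices such as automated rollback chiefly attack MTTR.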
Microservices
A software design architecture that breaks apart monolithic systems into loosely coupled services that can be developed, deployed, and maintained independently. Each microservice is a discrete process that provides a unique business capability.
Model-Based Testing
A testing technique where test cases are automatically generated from models. A model is an abstraction of the real-world function and represents the expected behavior of the system.
Non-Functional Testing
Non-functional testing is performed to confirm whether the system’s behavior meets non-functional requirements like usability and performance. It handles everything not covered by functional testing.
NoOps
NoOps (no operations) is the concept that automation can eliminate the need for a dedicated operations team.
Open Source
Software that makes its source code freely available for people to use, share, and modify.
One-Stop Shop / Out-of-the-Box Tools
Tools that provide a set of functionalities that work immediately after installation, with little configuration or modification needed. When applied to software delivery, a one-stop shop solution allows quick deployment pipeline setup.
Pair Programming
An agile software development method involving two programmers working together at a single workstation. One programmer (the driver) writes code while the other (the observer) reviews it, with frequent role changes between the two.
Pipeline Management
Pipeline management refers to the activities needed to keep the DevOps pipeline running optimally: following the load, identifying bottlenecks, and correcting any shortcomings. It can also include further development of the pipeline as needs evolve. A separate team may provide these services, in which case the arrangement is sometimes referred to as “Pipeline as a Service”.
Platform as a Service (PaaS)
PaaS expands on the IaaS model by providing not only infrastructure through the cloud but also middleware, development tools, business intelligence, database management systems, and more.
Policy as code
Policy as code is the idea of writing code in a high-level language to manage and automate policies. The policies could range from high-level access policies in databases to low-level resource management in cloud environments. By representing policies as code in text files, proven software development best practices can be adopted, such as version control, automated testing, and automated deployment.
Policy as code makes it possible to automatically apply and enforce policies. A single policy can be defined once and applied to many different systems, increasing consistency across a wide set of services and making the policy easier to maintain and change. To achieve this, you need a configuration language to describe the policy and a way of applying the policy to the target services. An example of policy as code is the Open Policy Agent framework.
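Real policy-as-code tools such as Open Policy Agent use a dedicated policy language (Rego); purely to illustrate the idea of policies as versioned, testable code, here is a toy evaluation in Python with an invented policy and resource format:

```python
# A toy policy-as-code sketch: the policy lives in version control as
# data, and evaluation is an ordinary, testable function. The policy
# fields and resource shape are invented for this illustration.

POLICY = {
    "allowed_regions": {"eu-west-1", "eu-north-1"},
    "require_encryption": True,
}

def evaluate(resource, policy=POLICY):
    """Return a list of policy violations for a resource description."""
    violations = []
    if resource.get("region") not in policy["allowed_regions"]:
        violations.append("region not allowed")
    if policy["require_encryption"] and not resource.get("encrypted", False):
        violations.append("encryption required")
    return violations
```

Because the policy is plain text in a repository, it can be reviewed, version-controlled, and regression-tested like any other code.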
Private Cloud
A private cloud serves the needs of a single organization. It’s often hosted on-prem and is optimized to fit a particular infrastructure use case.
Product Owner
The product owner has a leadership role in agile development and is responsible for managing the product backlog.
Production
Production, also known as live, is an environment where the application or feature is accessible to users.
The final stage in a deployment pipeline is where the software will be used by the intended audience.
Public Cloud
A public cloud is hosted by a cloud provider such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform. It provides on-demand cloud services that are pay-as-you-go.
Quality Gate
A quality gate is a checkpoint between stages in a software development process; a predefined set of criteria determines whether the process can proceed to the next stage. Quality gates are designed to give fast feedback to stakeholders and safeguard quality. They also reduce waste in the organization, since they stop the process if the criteria are not met.
Radiator
A radiator is a visualization of real-time status information from the delivery pipeline and development lifecycle. Radiators aggregate information from multiple sources and present it to stakeholders to reduce the need for users to go into multiple sources to understand the real-time status.
Regression Testing
A type of software testing that verifies existing software still performs the same way after being changed in some way (such as with software enhancements, patches, or configuration changes).
Release Management
Release management is the orchestration of software delivery activities and resources across multiple, interdependent releases and change initiatives.
Rollback
An automatic or manual operation that restores a database or program to a previously defined state. Usually, this is performed in response to issues in the current version.
Scrum
Scrum is an agile project management framework that breaks the available work into discrete units and then works on them during periods called sprints. At the end of each sprint, the deliverable is a potentially releasable increment to the product.
Self-Service Deployment
Self-service deployment refers to situations where deployment is not fully automated, but a single manual command can push code from staging to production.
Serverless
Serverless is a paradigm where an application developer does not have to care about the underlying (virtual) servers – they are completely managed. Typically, via a consumption-based model, you only pay for the actual usage. No up-front commitment is needed. If the application is not used, you don’t pay for anything. Serverless is often used in conjunction with FaaS and SaaS services.
Service Level Indicators (SLI)
A service level indicator (SLI) is a specific metric of an application that directly corresponds to the satisfaction of a typical user of the application. SLIs are used to build targets (SLOs) for the level of reliability one aims to achieve, and they are a core part of Site Reliability Engineering (SRE). Good examples of SLIs are the percentage of all HTTP requests that result in an error page, or the 99th percentile of the time it takes to respond to HTTP requests.
Service Level Objectives (SLO)
Service Level Objectives (SLOs) are the goals for the level of reliability one wants to achieve over a certain time (often measured over the last 28 days). SLOs are used to adjust priorities for development teams, indicating whether they need to focus more on reliability or are achieving the required reliability and can focus on developing new features.
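As a small worked example, an availability SLI and its SLO check can be computed like this (the request counts and the 99.9% target are illustrative):

```python
# Computing a success-rate SLI from request counts and checking it
# against an SLO. Numbers below are made up for the example.

def availability_sli(total_requests, error_requests):
    """Fraction of requests served successfully."""
    return (total_requests - error_requests) / total_requests

def meets_slo(sli, slo=0.999):
    """Does the measured SLI satisfy a 99.9% availability objective?"""
    return sli >= slo

# 100,000 requests with 50 errors gives an SLI of 0.9995,
# which satisfies a 99.9% SLO.
```

When `meets_slo` starts returning False, SRE practice says the team shifts effort from new features toward reliability work.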
Service Virtualization
Service virtualization is an approach where impediments caused by dependencies between software components are alleviated through well-defined interfaces and virtual implementations of those interfaces (stubs, virtual services). The virtual implementation typically returns some simple valid answer for the interface call, which thus allows for the development of other dependent components before the actual implementation is available. Also commonly referred to as mocking a service.
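A minimal sketch of the idea in Python: development of a checkout flow proceeds against a stub that honors the same interface as a not-yet-available payment service. All names here are hypothetical.

```python
# Service virtualization in miniature: the real payment service does not
# exist yet, so a stub implements the same interface and returns a
# simple valid answer, unblocking development of dependent code.
from typing import Protocol

class PaymentService(Protocol):
    def charge(self, amount_cents: int) -> str: ...

class StubPaymentService:
    """Virtual implementation returning a simple valid answer."""
    def charge(self, amount_cents: int) -> str:
        return "ok"

def checkout(cart_total_cents: int, payments: PaymentService) -> bool:
    """The dependent component under development."""
    return payments.charge(cart_total_cents) == "ok"
```

When the real service ships, it replaces the stub without any change to the checkout code, because both satisfy the same interface.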
Shift-left
The process of pushing testing (code quality, performance, and security) toward the early stages of the SDLC. By testing early and testing often, you can find and remediate bugs earlier and improve the quality of the software.
Shift-left is the practice of testing earlier in the software delivery process. Rather than deferring thorny issues to an unknown later date, testers can improve quality by catching errors before they snowball or become critical.
Shift-right / Progressive Delivery
The process of monitoring, observing, and testing (for resiliency, reliability, and security) new releases “in production” to ensure correct behavior, performance, and availability.
Site Reliability Engineering (SRE)
Site Reliability Engineering (SRE) is the practice of running modern and complex applications in production by balancing new features and stability with agreed-upon metrics (SLO). SRE differs from traditional operations by being a part of the application team and sharing the responsibility. Usually, SRE team members have a background or knowledge of system internals and networking (shared infrastructure) components and focus on building reliability into the application.
Software as a Service (SaaS)
SaaS hosts applications and makes them available to users over the internet.
Source Control
Source control (or version control) is a system that records changes to a file or set of files over time so that previous versions can be accessed and recalled later. It is useful for rollbacks and disaster recovery, among other things.
Also called revision control or version control, this is a process for storing, tracking, and managing changes to code, documents, websites, and other pieces of information. This is usually achieved by generating branches off of the source.
Sprint
A sprint is a defined work period, usually a month or less, in which a scrum team completes a discrete unit of work.
Staging Environment
A near replica of a production environment used to test code, builds, and updates to make sure everything works properly before deployment.
Used to test the newer version of your software before it’s moved to live production. Staging is meant to replicate as much of your live production environment as possible, giving you the best chance to catch any bugs before you release your software.
Stream-Aligned Team
A stream-aligned team is aligned to a flow of work from (usually) a segment of the business domain.
Technical Debt
Refers to the rework that must be done when speedy delivery and easy implementation are prioritized over a better, but usually slower or more expensive, approach.
Test Automation
Allows testers to reuse tests in a repeatable process, thereby eliminating time-consuming and repetitive tasks. Test automation is crucial for agile and DevOps environments.
Test Environment
An environment that testing teams use to execute test cases and fix bugs before a release.
Unit Testing
Unit testing involves breaking code down into small, bite-size units of code or logic that can be quickly and easily tested in isolation.
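For example, a unit test with Python’s built-in unittest module exercises one small helper in isolation (slugify is an invented function for this illustration):

```python
# A small unit test using Python's standard unittest module: the unit
# under test (slugify, a hypothetical helper) is exercised in isolation.
import unittest

def slugify(title: str) -> str:
    """Turn a title into a URL-friendly slug."""
    return "-".join(title.lower().split())

class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("DevOps Terms"), "devops-terms")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  a   b "), "a-b")
```

Saved as a test file, this runs with python -m unittest; in a CI pipeline such tests execute automatically on every commit.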
User Acceptance Testing (UAT)
A type of software testing that verifies a given application works for the user. During this process, users test the software to ensure it behaves as expected in real-world scenarios.
Value Stream Management
A new category of tools that maps, optimizes, visualizes, and governs the flow of business value through heterogeneous enterprise software delivery pipelines.
Value Stream Mapping
A technique that relies on structured visualizations to get a holistic view of how work flows through the system to identify and reduce wasteful activities. Value stream mapping creates a visual representation of the activities in a value stream, organized into sequences of steps. Each step is analyzed for its characteristics like the value it generates, the time delay it introduces, and the effort it requires.
White Box Testing
A software testing method that involves testing the internal structure, design, and coding of a piece of software. It is distinguished from black box testing in that code is visible or transparent to the tester, and is, therefore, a “white box”.