Building Distributed Architectures: Effective CI/CD Pipelines
“Continuous integration and continuous delivery allow us to build, test, and deliver software with greater speed, reliability, and automation than ever before.” ― Jez Humble, Co-author of “Continuous Delivery”
Today’s article explores the second goal in our series, “Building Distributed Architectures”. In this post, we’ll delve into the importance of CI/CD pipelines and explore the best practices for creating robust and efficient pipelines that ensure smooth and reliable software delivery, while minimizing human error and manual effort.
Before going further, this article is part of a series covering how to build an effective distributed architecture. If you haven’t already, I recommend reading the previous articles to get up to speed:
Building distributed systems involves a lot of moving parts that must work effectively with each other. Furthermore, more than one instance of the system may exist at any given moment: based on the previous article in this series, our goal is to have at least three environments: development, staging and production. Each of these environments has its own purpose, and the development environment is the one that is updated most frequently. In order to ensure that changes are propagated to the other environments, we need a reliable and efficient CI/CD pipeline. We also need to keep good track of the versions of the components deployed in each environment, so that we can easily identify what is running where.
This article focuses on setting up an effective pipeline that meets these demands. Since my day-to-day work involves using Azure DevOps, I will be focusing on that platform, but the concepts are applicable to any CI/CD platform with minimal changes.
The CI/CD Pipeline
A CI/CD pipeline must be the single source of truth for any deployed system. This means that any change that must end up in a running environment must follow the same set of standard steps, and manual changes must be avoided at all costs. In the short run this might seem like overhead, but in the long run it pays off, as it ensures that the environments are always in sync and that changes are propagated in a consistent manner. It also reduces the risk of human error, as the steps are always the same and the pipeline can be tested and validated. Using the CI/CD pipeline as the single source of truth also allows us to easily identify the changes that are deployed in each environment, as we can always check the pipeline history and see what was deployed and when.
I will outline a few key points that I consider important when designing a CI/CD pipeline:
- Reusability and modularity - basically applying the DRY (Don’t Repeat Yourself) principle.
- Versioning - identifying the version of each component that is deployed in each environment and the overall version of the product.
- Handling multiple releases, bug fixes and features.
- Reentrant pipelines.
Reusability and modularity
First of all, to make a CI/CD pipeline efficient, we must treat it like any other piece of software. This means using templates, variables, and other features that allow us to reuse the same pipeline for multiple projects. This is especially important when we have multiple projects that are deployed in the same environment, as we can use the same pipeline for all of them with minimal changes, and when projects are similar, their pipelines are also similar and can be reused.
In order to achieve this goal, a good practice is to use templates. Templates allow us to define a set of steps that can be reused in multiple pipelines. For example, we can have a template that installs the Azure CLI, or a template that installs the Terraform CLI. This allows us to reuse the same steps in multiple pipelines, without having to copy-paste the same steps in each pipeline. Also, if we need to update the steps, we can do it in one place and the changes will be propagated to all pipelines that use the template.
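For example, a reusable step template that installs the Terraform CLI could look like the following minimal sketch (the file path, default version and Linux agent are assumptions, not part of the original setup):
# ci-templates/steps/tools/install-terraform.yaml (illustrative path)
parameters:
  terraformVersion: '1.5.7'   # illustrative default

steps:
  - script: |
      curl -sSLo terraform.zip "https://releases.hashicorp.com/terraform/${{ parameters.terraformVersion }}/terraform_${{ parameters.terraformVersion }}_linux_amd64.zip"
      unzip -o terraform.zip -d "$(Agent.ToolsDirectory)/terraform"
      echo "##vso[task.prependpath]$(Agent.ToolsDirectory)/terraform"
    displayName: Install Terraform ${{ parameters.terraformVersion }}
Any pipeline that needs Terraform then references this template instead of repeating the commands, and a version bump happens in a single place.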
I would generally recommend setting up a single project that contains all these templates. Following the structure of Azure DevOps pipelines, the templates could be organized using the following folder structure:
/ci-templates
│
├── /deployments
│   ├── update-product-version.yaml
│   ├── tag-repositories.yaml
│   ├── publish-release-notes.yaml
│   └── ...
│
├── /jobs
│   ├── /node
│   │   ├── build.yaml
│   │   ├── publish-artifacts.yaml
│   │   └── ...
│   ├── /dotnet
│   │   ├── build.yaml
│   │   ├── publish-artifacts.yaml
│   │   └── ...
│   ├── /golang
│   │   ├── build.yaml
│   │   ├── publish-artifacts.yaml
│   │   └── ...
│   ├── ...
│   ├── build.yaml
│   ├── publish-artifacts.yaml
│   ├── semantic-release.yaml
│   └── ...
│
├── /steps
│   ├── /build
│   │   ├── /node
│   │   │   ├── build.yaml
│   │   │   ├── test.yaml
│   │   │   ├── doc.yaml
│   │   │   ├── prod.yaml
│   │   │   ├── publish-build-artifact.yaml
│   │   │   └── ...
│   │   ├── /dotnet
│   │   │   └── ...
│   │   ├── /golang
│   │   │   └── ...
│   │   ├── ...
│   │   └── create-fs-snapshot.yaml
│   ├── ...
│   └── ...
Deployments
The deployment templates address cross-cutting concerns such as:
- Updating the product version
- Tagging repositories with the actual versions for easy identification (a sketch of such a template follows this list)
- Creating release notes or publishing internal documentation
- Deploying brand-specific assets (logos, images, etc)
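As a minimal sketch of the tagging template mentioned above (the file name and parameter are illustrative, and it assumes the build identity is allowed to push tags to the repository):
# ci-templates/deployments/tag-repositories.yaml (illustrative)
parameters:
  tagName: ''   # e.g. the product version computed earlier in the pipeline

steps:
  - checkout: self
    persistCredentials: true   # keep the credentials so that git push works
  - script: |
      git tag "${{ parameters.tagName }}"
      git push origin "${{ parameters.tagName }}"
    displayName: Tag repository with ${{ parameters.tagName }}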
Jobs
The job templates address specific tasks that are common to multiple pipelines, such as:
- Building a project (ex. basic NodeJS build job, .NET build job, etc)
- Publishing artifacts (ex. publishing a Docker image, publishing a Helm chart, etc)
- Running the semantic-release tool to automatically version the project build and to generate release notes
If the jobs require specialization (such as build jobs), then a general job can be defined with parameters, and in turn this job will reference the more specialized job based on the parameters. For example, a general build job can be defined with the following Azure DevOps YAML syntax:
parameters:
  build: true
  test: true
  doc: false
  platform: node

jobs:
  - template: ./${{ parameters.platform }}/build.yaml
    parameters:
      build: ${{ parameters.build }}
      test: ${{ parameters.test }}
      doc: ${{ parameters.doc }}
      platform: ${{ parameters.platform }}
This build job will load the appropriate specialized job based on the platform parameter. This allows us to reuse the same job template for multiple projects, without having to copy-paste the same steps in each pipeline.
A specialized build job for NodeJS projects can be defined as follows:
parameters:
  build: true
  test: true
  doc: false
  platform: node   # declared so that the generic build job can pass it through

jobs:
  # Build, test and optionally document the node project
  - job: BuildAndTest
    displayName: Build (${{ parameters.build }}) and test (${{ parameters.test }}) and document (${{ parameters.doc }}) node project
    steps:
      - checkout: self
        persistCredentials: true
      - ${{ if eq(parameters.build, true) }}:
          - template: ../../steps/build/create-fs-snapshot.yaml
          - template: ../../steps/build/node/build.yaml
      - ${{ if eq(parameters.test, true) }}:
          - template: ../../steps/build/node/test.yaml
      - ${{ if eq(parameters.doc, true) }}:
          - template: ../../steps/build/node/doc.yaml
      - ${{ if eq(parameters.build, true) }}:
          - template: ../../steps/build/node/prod.yaml
      - ${{ if eq(parameters.build, true) }}:
          - template: ../../steps/build/node/publish-build-artifact.yaml
Note: The specifics of each step will be detailed below in the Steps section
Steps
The step templates address specific tasks that are common to multiple jobs, such as:
- Installing NPM packages
- Running the build/test/doc command for a specific platform
- Collecting the build artifacts
- Publishing the build artifacts
- etc.
In our example above - for a NodeJS project we have the following steps:
- create-fs-snapshot.yaml - this creates an internal snapshot of the build folder before installing packages. This is useful in the later steps to identify exactly which files need to be published as artifacts.
- build.yaml - this installs the NPM packages and runs the build command (a sketch follows after this list)
- test.yaml - this runs the test command
- doc.yaml - this runs the doc command, generating the documentation files from the codebase
- prod.yaml - This will install the production dependencies and remove the development dependencies from the build folder
- publish-build-artifact.yaml - This will create an archive containing all necessary files from the initial snapshot and the production dependencies, and will publish it as a build artifact. This artifact can be used later in the pipeline to publish the build to a Docker registry, or to publish it as a Helm chart.
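As a minimal sketch of the build.yaml step template referenced above (the Node version is illustrative, and the project is assumed to expose a build script in its package.json):
# ci-templates/steps/build/node/build.yaml (illustrative)
steps:
  - task: NodeTool@0
    displayName: Use Node.js 18.x
    inputs:
      versionSpec: '18.x'
  - script: npm ci
    displayName: Install NPM packages
  - script: npm run build
    displayName: Run the build command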
Of course, this is just an example, the build steps will be different for each project and technology, but the general idea is to have the granular details in the steps, while the jobs and deployments act as recipes that combine the steps in a specific order.
Using the templates in each pipeline
Once the templates are defined in a separate project, they can be used in each pipeline. For example, a pipeline for a NodeJS project can be defined as follows:
# Set continuous build triggers for the standard branches
trigger:
  batch: true
  branches:
    include:
      - master
      - main
      - release-*
      - feature-*
      - staging
      - prod
    exclude:
      - refs/tags/*

resources:
  repositories:
    # Reference the ci-pipelines repository for the CI templates
    - repository: ci-pipelines
      type: git
      name: my-devops-project/ci-pipelines

stages:
  - stage: Build
    jobs:
      - template: ci-templates/jobs/build.yaml@ci-pipelines
        parameters:
          build: true # change this to false if the project doesn't need building
          test: true # change this to false if the project doesn't need testing
          doc: true # change this to false if the project doesn't produce doc artifacts
          platform: node
      - template: ci-templates/jobs/semantic-release.yaml@ci-pipelines
        parameters:
          build: true # must match the build parameter above, as it affects the semantic-release step
          platform: node
  - stage: Publish
    dependsOn: Build
    jobs:
      - template: ci-templates/jobs/publish-container.yaml@ci-pipelines # runs the steps to produce a Docker image and publish it to a container registry
      - template: ci-templates/jobs/publish-helm-chart.yaml@ci-pipelines # runs the steps to produce a Helm chart and publish it to a chart registry
Some notes on the above pipeline
First of all, the project pipeline uses one or more stages to separate the jobs. This is not always necessary, but it produces a cleaner view in the Azure DevOps UI, as each stage is displayed as a separate section. Stages can also be used to define dependencies between jobs, so that a job only starts after the previous one has finished, so I would recommend this as a good practice. There is, though, a downside to this approach: there is a small overhead in the pipeline execution time, as each stage is executed on a separate agent. This is not a big issue, as the stages are usually independent, but it is something to keep in mind.
The pipeline does not actually deploy the Helm chart to the Kubernetes environment, it only publishes it to a chart registry. The actual deployment of the chart is done in another pipeline, which will be outlined below. This decouples the build of a project from the lifecycle of that project inside the running environments.
Generally, each project’s pipeline should produce artifacts that are published to a location from which they can be used later on. The pipeline above describes the templates and steps for a typical NodeJS microservice that will end up creating a Docker image and a Helm chart. The Docker image will be published to a Docker registry, while the Helm chart will be published to a Helm chart registry. These artifacts can be used later on in the deployment pipeline to deploy the project in the Kubernetes environment.
For NodeJS projects that produce only NPM packages, the above pipeline will not contain the Publish stage.
For other technologies such as dotnet, the pipeline will probably be nearly identical, with the only difference being the value of the platform parameter.
Approaching the pipelines in this way ensures that the details of the steps are hidden from the project, yet centralized. If a change needs to be done in the steps, it can be done in one place and the changes will be propagated to all pipelines that use the templates. This is particularly useful when the solution grows and the number of projects becomes larger, as it doesn’t require changing each individual pipeline.
Versioning
A typical requirement is to be able to identify the version of the software. This is important for various reasons, such as:
- Better communication with the customers
- Easier identification of the changes that are deployed in each environment
- Marketing activities centered around new releases and updates
This may sound simple, but in a distributed environment we must handle both the individual versions of each component and the overall version of the product. Therefore, our CI/CD pipeline must be able to handle this requirement.
In order to manage the versioning aspect, as well as having a coherent and consistent behavior across the entire solution, there are a few decisions and steps that must be taken into account.
Versioning of each component
Having a clear and standardized versioning schema is essential for consistency. There are multiple approaches to this, but the most common one is the Semantic Versioning approach. This approach defines a version as a set of three numbers, separated by dots, as follows: MAJOR.MINOR.PATCH. The version is incremented based on the following rules:
- MAJOR version when you make incompatible API changes,
- MINOR version when you add functionality in a backwards compatible manner, and
- PATCH version when you make backwards compatible bug fixes.
We employ semantic versioning in all of the projects that make up the solution. For this, we use the semantic-release tool, which automatically determines the next version number based on the commit messages. This tool is integrated in the CI/CD pipeline and runs after the build step. It analyzes the commit messages and determines the next version number based on the changes that were made. For example, if a commit message contains the BREAKING keyword, the MAJOR version is incremented. If a commit message contains the feat keyword, the MINOR version is incremented. If it contains the fix keyword, the PATCH version is incremented. If it contains the chore keyword, the version is not incremented. This gives us a clear and consistent versioning schema across all projects.
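For illustration, here are some hypothetical commit messages and the release they would produce under these rules, starting from version 1.4.2:
fix: handle empty shopping cart                 -> 1.4.3 (PATCH)
feat: add CSV export for reports                -> 1.5.0 (MINOR)
feat: drop support for the v1 API
      (body contains a BREAKING CHANGE note)    -> 2.0.0 (MAJOR)
chore: update internal tooling                  -> no new release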
The semantic-release tool will also generate release notes based on the commit messages, which can be used to communicate the changes to the customers. Also, the tool will tag the repository with the version number, which allows us to easily identify the version of each component.
To set up the semantic release, a configuration must be defined in the root of the project, in a file named .releaserc. This file contains the configuration for the tool, such as the plugins that are used, the commit message format, the release rules, etc. An example configuration file can be defined as follows:
success: []
fail: []
branches:
  - 'master'
  - 'main'
  - 'staging'
  - 'prod'
  - 'release-+([0-9])?(.{+([0-9]),x}).x'
plugins:
  - '@semantic-release/commit-analyzer'
  - '@semantic-release/release-notes-generator'
  - '@semantic-release/changelog'
  - '@semantic-release/npm'
This configuration file triggers the semantic-release algorithm on the branches named main, master, staging and prod, or on any branch whose name matches the format release-[MAJOR].x or release-[MAJOR].[MINOR].x.
The steps taken during the semantic release are:
- Analyze the commit messages and determine the next version number (this also takes into account the repo tags and branch name)
- Generate the release notes
- Generate the changelog
- Publish the NPM package (if the project is an NPM package).
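A minimal sketch of a job template that runs these steps via the semantic-release tool could look like this (the job and variable names are illustrative, and it assumes semantic-release and its plugins are declared as dev dependencies of the project):
# ci-templates/jobs/semantic-release.yaml (illustrative)
jobs:
  - job: SemanticRelease
    displayName: Run semantic-release
    steps:
      - checkout: self
        persistCredentials: true   # semantic-release needs to push tags to the repository
      - script: npm ci
        displayName: Install dependencies
      - script: npx semantic-release
        displayName: Determine version, tag and publish
        env:
          NPM_TOKEN: $(NPM_TOKEN)   # only required when an NPM package is actually published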
Notes
- We use the semantic-release project even in solutions which are not based on NodeJS. This means that a package.json file must exist even in these projects.
- For projects that either are not based on NodeJS or that do not actually publish an NPM package, it is enough to place the private: true flag in the package.json file, so that the NPM publish step is skipped.
- For monorepos that contain sub-projects that are NPM packages, the above configuration is slightly different:
success: []
fail: []
branches:
  - 'master'
  - 'main'
  - 'staging'
  - 'prod'
  - 'release-+([0-9])?(.{+([0-9]),x}).x'
plugins:
  - '@semantic-release/commit-analyzer'
  - '@semantic-release/release-notes-generator'
  - '@semantic-release/changelog'
  - - '@semantic-release/npm'
    - pkgRoot: ./dist/packages/my-lib-1
  - - '@semantic-release/npm'
    - pkgRoot: ./dist/packages/my-lib-2
The configuration above is for an NX project that has the subprojects in the packages folder and that publishes two NPM packages: my-lib-1 and my-lib-2. The pkgRoot parameter instructs the semantic-release tool to publish the NPM package from the specified folder.
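Relatedly, for the non-NodeJS projects mentioned in the notes above, a minimal package.json could look like this (the name and versions are illustrative; the version field is only a placeholder, since semantic-release derives the real version from the git tags):
{
  "name": "my-dotnet-service",
  "version": "0.0.0-development",
  "private": true,
  "devDependencies": {
    "semantic-release": "^21.0.0"
  }
}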
Branching and merging strategies
Although this is a whole topic in itself, I will briefly outline the branching and merging strategy that we use in our projects, as it also affects the CI/CD pipelines and the versioning.
Looking back at our standard environments - dev, staging and prod, our versioning strategy is linked to the environments. In a nutshell, we use the following branching strategy:
- The main or master branch contains the code that will be deployed in the dev environment. We use both names for historical reasons, but to be consistent it is better to settle on a fixed name, e.g. main, and use it across all projects.
- The staging and prod branches contain stable releases that are deployed in the respective environments. Whenever a project is deemed stable enough, it is merged into the staging branch. After the QA process gives the green light, it is merged into the prod branch. The prod branch is the one that is deployed in the production environment.
- Bug fixes are done by branching the corresponding prod or staging branch into a release-[MAJOR].[MINOR] branch. The MINOR version used as a basis for the branch should be the highest one for that specific [MAJOR]. The semantic release configuration ensures that any new versions created on this branch can only affect the [PATCH] version and that they will be higher than the initial [MAJOR].[MINOR].[PATCH] version. This also ensures that changes on this branch cannot produce versions that already exist (semantic-release will report an error). Once the bug is fixed, the branch is merged into main so that newer versions contain the fix.
- New features are developed on feature branches called feature-[name]. Changes pushed here will not create new versions.
The branching strategy can be extended with other rules, such as: commits to the main branch can be done directly, while merges into the staging and prod branches must be done through pull requests. This ensures that the code is reviewed before being merged into the stable branches.
From a CI/CD perspective:
- the project pipeline will trigger an automatic CI build for a commit on any of these standard branches (main, master, staging, prod, release-..., feature-...)
- the master CI pipeline, about which I will talk below, is triggered continuously on the main branch, while the staging and prod pipelines are triggered manually, when a new version is ready to be deployed. The release-... and feature-... branches are not deployed in any environment, they are only used for development purposes, therefore the master CI pipeline is not triggered on these branches. Optionally, depending on the bug fix strategy, these rules can be extended to handle the release-... branches as well (treating them identically to the prod branch).
There are multiple branching strategies that can be employed and the semantic-release tool can be configured to work with any of them. The above strategy is just an example, but it is important to have a clear strategy that is followed across all projects.
Versioning of the product
As mentioned before, even though each project in the whole solution has its own version, we also need to have a version for the entire solution. Also, we need control over when the version is incremented, as we don’t want to have a new version for each commit that is pushed to the main branch. This is where the master CI pipeline comes into play.
A master CI pipeline describes the steps that are taken for the entire solution. It uses deployments, jobs and steps from the same common template project that is used by the project pipelines. It is triggered continuously on the main branch and manually for the staging and prod branches, and it is responsible for the following:
- Set product version stage. This generates a new version (it is not used until the end of the pipeline, but we need it from the beginning for the next steps). For the version of the whole product we don’t use semantic versioning; instead, the versions are controlled in a semi-automatic way. The product MAJOR and MINOR versions are kept in build variables, which must be manually adjusted, while the PATCH version is a timestamp (or, optionally, an incremental) value. Typically, releasing a new official version (whether minor or major) is an activity that must be coordinated with other business departments (such as marketing, sales, etc.), therefore the decision of switching to a new version involves a manual step.
- Deployment stage. This stage has individual jobs for each microservice or component from the solution that must be deployed to an environment. The environment, as well as its settings, is parametrized so that by changing the environment name the pipeline will target dev, staging or prod, respectively.
- Update product version stage. The final step in the master CI pipeline is to update the product version. This is reflected in either setting some global value or updating a ConfigMap or Secret in the Kubernetes environment. This is done at the end of the pipeline, so that if any of the previous steps fail, the version will not be updated. A simplified skeleton of such a pipeline follows after this list.
The master CI pipeline is set up with continuous triggering for the main/master branches of each project and manual triggering for the staging and prod branches. I consider this to be a good practice because even if the staging or prod merges are guarded by a PR, in the end the whole deployment is managed as a whole, generating a single new version. On the opposite side, any push to the main branch of a project will trigger the deployment to the dev environment, which ensures that dev is always up-to-date with the latest changes. As an optional alternative, staging can be set up for continuous deployment, but only when the staging branches are guarded by a PR.
Reentrant pipelines
An important aspect of a healthy CI/CD process is that the pipelines must be reentrant. This means that running the same pipeline multiple times should not cause issues and, ideally, if there are no changes between runs, there should be no net effect. Having the pipelines designed in such a way makes it easy to re-run them in case of failures, or in case of changes in the environment. There are many situations in which a pipeline job can fail, such as:
- The pipeline agent is not available
- Timeout during the pipeline execution
- Temporary outage of the external services (such as Docker registry, Helm registry, etc)
- etc.

In such cases, the pipeline should support the option of retrying the failed jobs or the whole pipeline altogether.
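As a small illustration, a deployment step can be made safe to re-run by combining an idempotent command, such as helm upgrade --install, with the built-in step retry option in Azure DevOps (the chart path, release name and version variable are illustrative):
steps:
  - script: |
      helm upgrade --install my-service ./charts/my-service \
        --namespace my-namespace \
        --set image.tag=$(productVersion)
    displayName: Deploy my-service
    retryCountOnTaskFailure: 2   # retries the step on transient agent or registry failures
Re-running this step with the same chart and version leaves the environment unchanged, which is exactly the reentrant behavior described above.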
Other improvements
There are many improvements that can be done to the outlined pipelines, but they are out of the scope of this article. Just to name a few:
- Setting up pipelines that create a complete environment from scratch, run e2e tests and then drop the environment
- Integrating static and dynamic vulnerability scanning tools into the build/deployment process
- Integrating code quality tools into the build/deployment process
- Integrating performance testing tools into the build/deployment process
- Integrating observability tools into the build/deployment process
- etc.
Conclusions
This article delves into the crucial considerations when designing a robust CI/CD (Continuous Integration and Continuous Delivery) pipeline for your software projects. While it’s an extensive read, it’s essential to address these key aspects:
- Single Source of Truth: The CI/CD pipeline serves as the authoritative source for all deployments. It ensures consistency and reliability across environments.
- Software Development Principles: Treat the CI/CD pipeline like any other software project. Emphasize reusability and modularity, following the “Don’t Repeat Yourself” (DRY) principle.
- Handling Diverse Changes: Design the pipeline to accommodate multiple types of changes, including releases, bug fixes, and new features.
- Reentrancy: Create a pipeline that can be rerun without adverse effects, crucial for addressing failures or adapting to evolving requirements.
- Versioning Control: Manage versioning at both the component and product levels. Consider adopting semantic versioning for components and ensure clear control over the product’s version.
- Branching and Merging Strategies: Implement a well-defined branching and merging strategy across the project, aligning it with CI/CD practices.
- Support for Multiple Environments: Craft the pipeline to seamlessly adapt to various deployment environments, such as development, staging, and production.
- Scalability for Multiple Projects: Ensure that the pipeline can accommodate multiple projects within your ecosystem with minimal adjustments.
- Technology Agnosticism: Design a pipeline flexible enough to cater to various technologies used in your organization’s projects.
- Deployment Flexibility: Enable the pipeline to target different deployment destinations, supporting diverse deployment targets.
By incorporating these considerations into your CI/CD pipeline design, you can establish a robust, adaptable, and efficient pipeline that aligns with best practices in software delivery.
Having a well-designed CI/CD pipeline is essential for any distributed system. As discussed at the start of this article, it keeps the environments in sync, propagates changes consistently, reduces the risk of human error and provides a clear history of what was deployed where and when. It is one of the most crucial aspects of a distributed system and it must be designed with care early in the stages of the project.
As with any other part of the project, the CI/CD pipeline must be designed with the future in mind, but it must also be flexible enough to allow changes and improvements as the project evolves. It is important to have a clear strategy and to follow it across all projects, but it is also important to be able to adapt to the changing requirements and to be able to improve the pipeline as the project evolves. Using templates and modularizing the pipeline is a good way to achieve this without having to rewrite the pipeline from scratch for each project, especially when the number of such projects grows over time.