This tutorial focuses on canary testing, its approaches, the process of running canary tests, challenges, and their solutions.
OVERVIEW
Canary testing is an approach to validate the newly added feature or version of the software application in a production (live) environment. Its main aim is to lower the risk and its impact on the users.
In the Software Development Life Cycle, various types of software testing approaches are executed by the DevOps team to ensure the reliability and quality of software applications. All organizations recognize the utmost importance of running tests on their software products before deploying them to end users.
Such testing is done because it demonstrates that the software meets the requirements specified by developers and end users, and it helps catch coding defects before release. The canary test complements this testing process and allows organizations to further validate the stability and quality of their software before a full-scale release.
DevOps teams use this approach to identify performance bottlenecks in software applications. This test observes how the software application behaves for a small chunk of end-users, similar to statistical sampling. Further, by analyzing this sample, DevOps teams can gain insights and estimates regarding the overall response.
Canary testing serves as a risk reduction and validation method for a new software application by introducing it to a limited number of real users. The team implements canary testing to test minor changes to the software application for a specific set of end users. This group is typically a small percentage of the larger user base. By deploying the code to this sample group, the DevOps team can identify issues in the code before they reach everyone.

It is a valuable and practical method because developers deploy the code gradually. It allows them to test new features and functionality in production while minimizing the impact on the application's users. By limiting the exposure of the new feature, teams can validate the changes without significantly affecting the overall user experience.
Canary testing is often used interchangeably with canary release and canary deployment. However, canary testing specifically refers to releasing code in order to evaluate and test new features or versions with real users in the live production environment.
The word "canary" is used in software development to describe a way of testing new software applications before releasing them to the public. The term comes from coal mining, where they used canary birds to keep miners safe. These birds were more sensitive to dangerous gases than people, so if the air became dangerous, the canaries would show signs of distress or even die, warning the miners to get out.
In canary testing of a software application, a small group of users tries out the new version first. These users don't know they are acting as the canaries, helping to find issues early on. If there are issues with the new code, developers fix them before letting more users try it. This way, they can make sure the software product works well for everyone and doesn't cause any major issues.
Assuming you already have a process in place to test your software upgrades, you most likely utilize various techniques from the DevOps realm, such as A/B testing and blue-green deployments.
Developers create automated tests for their software's new features and modifications. The changes are deployed to a testing environment where others can explore and interact with the new features. If everything goes smoothly, the new software update is rolled out to the production environment, allowing end users to benefit from the newly added feature.
However, given the nature of software, bugs tend to slip into production. It is impossible for developers to anticipate every potential edge case, and deadlines and budget constraints add to the pressure.
Canary testing is a technique that aims to limit the impact of these production bugs to a small subset of users. Traditionally, it involves having two identical production environments, although they need not be separate servers. For instance, two web applications could run on the same server.
Once you have a new release ready, you can deploy it to one of the environments. Then, you can direct a small portion of your users (around 5% is recommended) to this canary release. These users will experience the new features, while the other group will not encounter any changes.
You can then closely monitor this canary release and address any bugs. The objective is not to eliminate production bugs but to minimize their impact. If a bug exists in the new version, only 5% of your users are affected. While the bug still requires fixing, the pressure on you may be less than if all users were affected.
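The traffic split described above can be sketched in a few lines of Python. This is a minimal illustration, not a production router: the function names `bucket_for` and `environment_for` and the `user-N` IDs are hypothetical, and the sketch assumes each request carries a stable user ID. Hashing the ID means the same user always lands in the same environment, so nobody flips between versions from one request to the next.

```python
import hashlib

CANARY_PERCENT = 5  # share of users sent to the canary, as suggested in the text


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def environment_for(user_id: str, canary_percent: int = CANARY_PERCENT) -> str:
    """Return "canary" for users in the lowest buckets, "stable" otherwise."""
    return "canary" if bucket_for(user_id) < canary_percent else "stable"


if __name__ == "__main__":
    # Hypothetical user IDs, only to show the resulting split.
    users = [f"user-{n}" for n in range(10_000)]
    canary_users = sum(environment_for(u) == "canary" for u in users)
    print(f"{canary_users / len(users):.1%} of users routed to the canary")
```

Raising `canary_percent` later keeps the users who were already on the canary there, because their buckets do not change.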
Testing helps to identify and address software application issues that affect user experience. Canary tests go a step further by introducing changes to the production environment to minimize or eliminate any negative impact on usability. Here are the advantages that make canary testing a valuable process:
The canary test addresses the following aspects to keep software applications as free of bugs as possible:
If any issue arises, rolling back the changes and returning users to the original infrastructure is always possible.
Now that you know what exactly a canary test does, it is important to understand when to run one. As we learned in the previous section, the canary test is performed by the development team to check the functionality of a new version of the software application.
Nonetheless, it remains crucial to test the code thoroughly before deployment to prevent future issues. To this end, canary testing is performed to understand the code's behavior comprehensively before updating the entire environment. Here are some other crucial scenarios where you have to perform canary tests:
Implementing canary tests is easy and helpful if you carefully follow the abovementioned steps. By doing this, you can successfully conduct canary tests and deployments, getting valuable insights to improve your software application and foster innovation.
Limiting the number of users affected by software changes makes identifying and addressing any software-related errors simpler. However, subtle distinctions exist between canary deployment and canary release that may cause confusion. Let's explore these differences in the following section.
Using a canary release is an effective method for gradually introducing incremental code changes associated with adding new features or developing a new software version. This approach involves releasing the code to real users in the production environment, allowing the development team to quickly assess whether the changes yield the intended or expected results.
Furthermore, canary deployment permits developers to migrate a small portion of users to the new functionality offered in a new release. Exposing only a subset of the overall user base to the new code minimizes potential issues related to the new software. Additionally, this approach facilitates an easier rollback of a faulty release, preventing it from impacting the entire user base.
To execute the canary test, two approaches are mainly implemented to achieve reliable outcomes. Here are those two approaches:
Blue-green deployment is one of the most common approaches to implementing a canary test. Two identical environments are set up, "blue" and "green": the existing version of the software application runs in the blue environment, while the new version is deployed to the green environment. A canary test, however, uses this setup slightly differently from a standard blue-green deployment.
Instead of keeping a separate environment idle and switching all traffic to it once the deployment is done, a canary test using blue-green deployment initially switches over only a small subset of servers or nodes before proceeding with the rest.
This is how it is done:
Various configurations can be implemented for canary deployments. The easiest method involves setting up your environment behind a load balancer as usual but with a few spare nodes or servers (depending on your application's size) that are not in use. These spare nodes or servers are designated as the deployment targets for your CI/CD pipeline.
Once you build, deploy, and test these nodes, you reintroduce them to the load balancer for a limited duration and a restricted group of users. This enables you to ensure the success of the changes before repeating the process with the remaining nodes in your cluster.
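As a rough model of this setup, the sketch below treats the load balancer as a weighted pool of nodes. Everything in it is an assumption made for illustration: the `Node` class, the node names, the `v1`/`v2` version labels, and the weights. A real load balancer has its own configuration format. The point is only that a spare node starts with weight 0 and is given a small weight once it has been deployed and tested.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    weight: int  # relative share of traffic; 0 means out of rotation


def pick_node(pool: list[Node], rng: random.Random) -> Node:
    """Pick the node for one request, proportionally to its weight."""
    active = [node for node in pool if node.weight > 0]
    return rng.choices(active, weights=[node.weight for node in active], k=1)[0]


# Hypothetical pool: two serving nodes and one spare that receives the new build.
pool = [
    Node("node-1", "v1", weight=50),
    Node("node-2", "v1", weight=45),
    Node("spare-1", "v2", weight=0),  # deployed and tested, not yet serving
]

# Once the spare passes its checks, reintroduce it with a small traffic share.
pool[2].weight = 5
```

With these weights, roughly 5 in every 100 requests reach the `v2` node. Setting its weight back to 0 takes it out of rotation again.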
Feature flags are a popular method of conducting canary tests that focuses on specific features. Instead of relying on releases, feature flags utilize code to allow development teams to activate or deactivate particular features for specific users. With the feature flag, you can limit the release to 1% of the users and monitor the key metrics, such as error rate and business metrics.

This helps to ensure that new features added to the software application do not have any negative impact. This approach is handy for business stakeholders who need to test new features before implementing them for everyone. If any issue is detected during the canary test, you can easily disable the new feature by turning its flag off.
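A percentage-based flag with an off switch might look like the following sketch. The `FeatureFlag` class, the flag name `new-checkout`, and the user IDs are hypothetical and do not come from any particular feature flag product; dedicated platforms offer far more (targeting rules, audit logs, remote configuration).

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FeatureFlag:
    name: str
    rollout_percent: int = 0  # 0-100: share of users who get the feature
    enabled: bool = True      # kill switch: False turns the feature off for everyone

    def is_on_for(self, user_id: str) -> bool:
        """Return True if this user should see the feature."""
        if not self.enabled:
            return False
        # Hash flag name and user ID together so each flag gets its own split.
        key = f"{self.name}:{user_id}".encode("utf-8")
        bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
        return bucket < self.rollout_percent


# Hypothetical flag limited to 1% of users, as in the example above.
flag = FeatureFlag("new-checkout", rollout_percent=1)

if flag.is_on_for("user-1234"):
    pass  # serve the new feature
else:
    pass  # serve the existing behavior
```

Setting `flag.enabled = False` turns the feature off for every user at once, without a new deployment.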
Here are some common scenarios where feature flags are applied:
Prior to conducting canary tests, it is important to execute your automated tests. This step ensures that the code intended for release to your selected users is free from bugs and meets the initial quality standards. Typically, organizations already have established processes for testing software updates. Many utilize techniques such as A/B testing and leverage DevOps practices to automate the development, testing, and deployment of code modifications in the form of builds.
Once you have completed the automated testing phase for your new code and it has passed the necessary checks, you can push it to the production environment for user access. Following this, the canary test process can begin.
It is advisable to perform automated testing using a suitable test automation framework or tool. This will provide a significant level of code control while accurately documenting and presenting the test results. Selecting an automation tool that allows you to execute test cases on web and mobile applications is important, as canary tests encompass these devices and environments. For this, you can opt for cloud-based testing platforms to test on vast combinations of browsers, devices, and platforms.
LambdaTest is one of the most used AI-powered test orchestration and execution platforms that allow testing across large farms of 3000+ browser and OS combinations. With its real device cloud infrastructure, you can perform both manual and automated tests. Deploy and scale faster with its cross browser testing capabilities.
With LambdaTest's cloud-based grid, you can efficiently execute tests using frameworks such as Selenium, Cypress, Playwright, and more. Check documentation to get started with automation testing on LambdaTest.
When the canary test has to be performed, you will need libraries and frameworks to streamline the test process and provide useful features. Here are some testing frameworks that you can use:
Specialized tools are indispensable for overseeing canary releases effectively and maintaining observability throughout testing. Below are a few commonly used tools for monitoring and observability:
Bear in mind that the selection of canary testing frameworks and monitoring tools depends on the specific requirements of your project, the technology stack being utilized, and the desired level of complexity in your canary deployment and monitoring processes.
The process of canary testing operates in a structured manner, similar to other software testing methods. The steps involved are as follows:
Step 1: The development team carefully selects a group of users who will serve as testers. This group represents a small subset of the overall user base, yet it is large enough to yield meaningful statistical analysis. Importantly, these users are unaware that they are participating in the testing process.
Step 2: A dedicated testing environment is established, running alongside the existing live environment. The system load balancer is configured to direct user requests from the designated canary testers to the new environment.
Step 3: The canary test begins as developers route test user requests to the new environment. Throughout this period, the developers closely monitor the testers to ensure that the new version operates as expected.
Step 4: If the new version meets the predetermined deployment criteria, the new software feature or version can be released to all users. However, if the new version has numerous bugs, diminishes application performance, or introduces any other issues for users, the testers are redirected back to the original software version.
Step 5: The development team addresses the identified bugs and subsequently releases the software to a broader audience.
By following these steps, canary tests foster thorough evaluation and validation of software changes before deployment to a large user base.
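The decision in Step 4 can be written down as an explicit check against the predetermined criteria. The sketch below is illustrative only: the class names, the metrics chosen, and every threshold value are assumptions, and each team would substitute the criteria it agreed on during planning.

```python
from dataclasses import dataclass


@dataclass
class CanaryMetrics:
    requests: int          # requests served by the canary so far
    errors: int            # how many of them failed
    p95_latency_ms: float  # 95th percentile response time


@dataclass
class DeploymentCriteria:
    # All values below are assumed for illustration.
    min_requests: int = 1_000
    max_error_rate: float = 0.01
    max_p95_latency_ms: float = 500.0


def decide(metrics: CanaryMetrics, criteria: DeploymentCriteria) -> str:
    """Return "keep-testing", "rollback", or "release" for the canary."""
    if metrics.requests == 0 or metrics.requests < criteria.min_requests:
        return "keep-testing"  # not enough traffic yet to judge
    error_rate = metrics.errors / metrics.requests
    if error_rate > criteria.max_error_rate:
        return "rollback"
    if metrics.p95_latency_ms > criteria.max_p95_latency_ms:
        return "rollback"
    return "release"
```

A "rollback" result corresponds to redirecting the testers to the original version (Step 4), after which the team fixes the bugs and tries again (Step 5).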
The canary test process involves three main phases, each of which is straightforward to carry out. Below are the main phases of the canary test:
This phase can be the longest and most challenging of all. Proper planning is crucial in this first step of canary testing, in which you prepare a canary deployment: a small group of users will receive the updated code before a full release. Several factors need to be considered when planning a canary test, including:
Once you have finalized these decisions, you and your team can start working on establishing the canary infrastructure. This involves the following steps:
Following this, you need to create a canary node using load balancing. You will replicate your production environment, creating infrastructure similar to the currently active software environment. One of the clones will serve as the original, or baseline, which you can rely on if the new code fails. If necessary, you can roll back to this baseline clone. The number of clones you create depends on the number of features you intend to test, with a minimum requirement of two.
After finishing the planning phase, the development team proceeds with the actual deployment of the canary test by directing the updated code to the chosen test group. The team will prepare deployment manifests and configuration files, build artifacts, and create testing scripts.
The team will then establish a canary node by balancing the load and duplicating the existing production environment. At least two production environments are needed for canary testing, with one serving as the original application without any code modifications (baseline). Also, the team will evaluate the new version by collecting data for the designated metrics determined in the previous stage.
The aim is to assess the latest version's performance consistency and system health. It is crucial to examine metrics such as latency, memory usage, error count, and request volume. Detailed logs help identify any bottlenecks.
In this phase of canary testing, the canary code is routed to the selected number of users, resulting in traffic in both baseline and test nodes. With this, it becomes easy to comparatively evaluate the application's performance and check whether the test version aligns with the evaluation criteria.
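One simple way to make this comparison concrete is to collect the same measurements from the baseline and canary nodes and flag any metric where the canary is clearly worse. The sketch below assumes you already have per-request latencies and HTTP status codes for each node. The names `NodeSample` and `canary_regressions` and both tolerance values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeSample:
    latencies_ms: list[float]  # one entry per request
    status_codes: list[int]    # HTTP status code per request

    @property
    def error_rate(self) -> float:
        """Share of requests that ended in a 5xx response."""
        return sum(code >= 500 for code in self.status_codes) / len(self.status_codes)

    @property
    def mean_latency_ms(self) -> float:
        return mean(self.latencies_ms)


def canary_regressions(
    baseline: NodeSample,
    canary: NodeSample,
    max_latency_ratio: float = 1.2,         # assumed: canary may be up to 20% slower
    max_error_rate_increase: float = 0.01,  # assumed: up to 1 point more errors
) -> list[str]:
    """List the metrics on which the canary is worse than the baseline allows."""
    problems = []
    if canary.mean_latency_ms > baseline.mean_latency_ms * max_latency_ratio:
        problems.append("latency")
    if canary.error_rate > baseline.error_rate + max_error_rate_increase:
        problems.append("error rate")
    return problems
```

An empty list means the canary stayed within the tolerances on both metrics. Production setups usually compare more metrics and use statistical tests rather than fixed tolerances.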
If any issue is identified, the information is shared with the team so it can be fixed early. If no issues are found, you can deploy the new version across the entire baseline environment or conduct another test with a different subset of users.
Here you get three options that you can choose from:
Every approach has its own challenges, and canary releases are no exception. However, rather than treating these challenges as "disadvantages," you can address them effectively with feature management solutions.
Fortunately, feature flags can come to the rescue once again. By incorporating feature flags into the new version of the app, you can enable the feature for a small group of users while keeping it disabled for others. Thus, feature flags allow you to conduct canary deployments even within a single production instance of your application.
Again, the utilization of feature flags can alleviate these difficulties. By leveraging a robust feature flag management platform, you can easily enable one or more features for specific groups of users. As things progress according to plan, you can gradually increase the percentage of users who experience the new version of your software until it reaches all users.
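The gradual increase can be expressed as a small ramp schedule. The steps below (1%, 5%, 25%, 50%, 100%) are an assumed example, not a recommendation from any specific platform, and the `healthy` input stands for whatever health signal your monitoring provides.

```python
RAMP_STEPS = [1, 5, 25, 50, 100]  # assumed rollout percentages


def next_rollout_percent(current: int, healthy: bool, steps: list[int] = RAMP_STEPS) -> int:
    """Return the next rollout percentage, or 0 if the release is unhealthy."""
    if not healthy:
        return 0  # disable the feature for everyone
    for step in steps:
        if step > current:
            return step
    return steps[-1]  # already at full rollout
```

For example, a healthy release at 5% moves to 25% on the next step, while an unhealthy one at any stage drops back to 0%.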
Throughout this comprehensive guide, we have delved into the fundamental concepts and optimal methodologies associated with canary testing. By gradually introducing alterations to a small subset of users or systems, canary testing empowers teams to carefully monitor the impact of these changes in a controlled environment before deploying features to the broader audience.
One of the key advantages of canary tests is their capacity to mitigate the risks linked to software updates or feature releases. Beyond the technical aspects, successful implementation of canary tests necessitates meticulous planning, transparent communication, and collaboration among development, operations, and other pertinent teams.
As organizations strive for continuous delivery and rapid innovation, canary testing remains an indispensable approach in their arsenal, ensuring that software updates are rolled out seamlessly and reliably.
Author's Profile

Nazneen Ahmad
Nazneen Ahmad is an experienced technical writer with over five years of experience in the software development and testing field. As a freelancer, she has worked on various projects to create technical documentation, user manuals, training materials, and other SEO-optimized content in various domains, including IT, healthcare, finance, and education. You can also follow her on Twitter.
Reviewer's Profile

Salman Khan
Salman works as a Digital Marketing Manager at LambdaTest. With over four years in the software testing domain, he brings a wealth of experience to his role of reviewing blogs, learning hubs, product updates, and documentation write-ups. Holding a Master's degree (M.Tech) in Computer Science, Salman's expertise extends to various areas including web development, software testing (including automation testing and mobile app testing), CSS, and more.
