Flaky Selenium tests are among the most complex and challenging issues QA teams face. These tests pass or fail unpredictably, without any consistent underlying behavior. Such inconsistent results cause delays and confusion, and they undermine test reliability.
Therefore, finding and addressing flaky Selenium tests is essential for maintaining reliable test suites and producing accurate results that validate software functionality.
This article discusses flaky Selenium tests in detail and how to handle them.
What are Flaky Tests?
A flaky test is one that might pass or fail for the same configuration. Their existence confuses developers because in this case, test failure does not always signal the existence of bugs in the code. Since the primary purpose of software tests is the detection of bugs, this non-determinism defeats the purpose.
Flaky tests are particularly disruptive to tests with a broad scope, such as functional tests and acceptance tests.
It’s easy to think of flaky tests as frustrating roadblocks and nothing else. In reality, they can help reveal gaps in the testing pipeline, on both the technical and human sides.
This article will explore what those gaps are, and why it pays to pay attention to flaky Selenium tests when they crop up.
The Technical Aspect of Flaky Selenium Tests
In Selenium WebDriver tests, flakiness typically comes from a couple of sources:
1. Lack of synchronization: The architecture of web apps contains multiple layers, and the way these layers interact influences web app performance, including network speed, HTTP handling, resource rendering, and available processing power. Because of this, some operations may have varied timings when the website is put through different end-to-end tests. In one instance, a button may not show up on the page quickly enough, or a dialog box might not appear fast enough for the automated test to progress accurately.
This can be solved by including Selenium wait commands that synchronize test steps with the software logic. If some actions need a bit more time to execute, Selenium wait commands are perfect for halting test execution until the action is complete or a certain web element is found.
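For example, here is a minimal sketch of an explicit wait in Java; the URL, locator, and timeout are illustrative assumptions, not values from a real suite:

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

import java.time.Duration;

public class ExplicitWaitExample {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://example.com"); // placeholder URL

            // Wait up to 10 seconds for the button to become clickable,
            // instead of assuming it renders instantly.
            WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
            WebElement button = wait.until(
                ExpectedConditions.elementToBeClickable(By.id("submit-button")));
            button.click();
        } finally {
            driver.quit();
        }
    }
}
```

The wait returns as soon as the condition is satisfied, so the test never sleeps longer than it has to.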
However, remember that if certain areas of the web app consistently need waits, especially longer waits, it could suggest poor performance. For example, suppose one set of automated end-to-end tests related to the same feature fails every time. Dig deeper: it is possible that some bad coding practices are involved. Flaky tests indirectly surface this problem.
2. Accidental load testing: As automated test suites grow, the amount of test code increases, and so does the number of tests a piece of software is put through. Consequently, test suites are reorganized to be executed at the same time, usually through parallel testing with Selenium to reduce test runtime. However, a side effect of this can be that it imposes large loads on the software, resulting in an unintentional load test.
Certain tests may run perfectly fine when executed in series but display flakiness when running simultaneously. If the failures seem random, do some debugging. In one example, it was found that during parallel testing, all tests would try to log in with the same admin user when they started.
This means that multiple simultaneous logins were occurring through the same user. Chances are, a web app will not be prepared for this kind of load. But in this case, the issue was made visible by flakiness.
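One way to avoid this kind of collision is to give each test its own account rather than sharing a single admin user. A minimal sketch follows; the createUser helper and the naming scheme are hypothetical stand-ins for whatever provisioning mechanism your app exposes:

```java
import java.util.UUID;

public class IsolatedLoginTest {
    // Hypothetical helper: in a real suite this would call the app's
    // user-creation API and return the new account's username.
    static String createUser() {
        String username = "test-user-" + UUID.randomUUID();
        // ... provision the account via the backend API ...
        return username;
    }

    public static void main(String[] args) {
        // Each parallel worker provisions its own account, so simultaneous
        // logins no longer collide on one shared admin user.
        String username = createUser();
        System.out.println("Running test as " + username);
        // ... log in as `username`, run the scenario, then delete the user ...
    }
}
```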
Finding Flaky Selenium Tests by Detecting Common Errors
A flaky test is difficult to detect because its failures do not map cleanly to the state of the application: it produces inconsistent results even when neither the code nor the configuration has changed.
Here’s how you can find flaky Selenium Tests by detecting common errors:
- If a script relies on external APIs, there is a high chance of errors due to unavailability or latency issues. Carefully check the test logs for API errors.
- Look for static waits (Thread.sleep() or similar); they don’t synchronize well with the application’s state.
- Identify tests failing with NoSuchElementException or StaleElementReferenceException; this usually points to locators that depend on the UI layout (e.g., absolute XPath).
- If a test fails only during parallel execution, that also points to flakiness: tests interfering with each other when run simultaneously. A simple rerun probe, sketched after this list, can confirm the inconsistency.
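A straightforward way to surface flakiness is to rerun a suspect test several times against the same build and compare outcomes. Here is a minimal sketch; the runTest method is a stand-in for your actual Selenium scenario, with a simulated failure rate for demonstration:

```java
public class FlakinessProbe {
    // Stand-in for the actual test logic; returns true on pass.
    static boolean runTest() {
        // ... execute the Selenium scenario under suspicion ...
        return Math.random() > 0.2; // simulated 20% failure rate for the demo
    }

    public static void main(String[] args) {
        int runs = 20, passes = 0;
        for (int i = 0; i < runs; i++) {
            if (runTest()) passes++;
        }
        // A deterministic test passes 0% or 100% of the time; anything in
        // between against the same build is a strong flakiness signal.
        System.out.printf("Passed %d of %d runs%n", passes, runs);
        if (passes > 0 && passes < runs) {
            System.out.println("Inconsistent results: likely a flaky test.");
        }
    }
}
```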
Read More: Parallel Testing with Selenium
How to Avoid Writing Flaky Tests in Selenium
It is essential to know how to reduce flakiness in your Selenium tests, because a flaky test is unreliable and erodes trust in the suite. Here is a step-by-step process to avoid writing flaky tests in Selenium:
Steps to Avoid Writing Flaky Selenium Tests:
- Stabilize Locators
- Avoid fixed delays
- Add Explicit Waits
- Conduct tests in isolation
- Avoid hardcoding data
- Mock external services inside your test script
- Synchronize with Application State
- Handle Browser Variations
- Refactor Tests
- Monitor and Debug
- Documentation
- Stabilize Locators: When writing tests, avoid absolute XPaths, as they are likely to break with even a slight UI change. Use IDs or unique attributes (e.g., data-test or aria-label); see the sketch after this list.
- Avoid fixed delays: Replace fixed delays with dynamic waits that pause until elements become visible or interactable. This reduces the chance of test failures caused by an element not yet being available at the moment of interaction.
- Add Explicit Waits: Stop using hardcoded sleeps, as they can slow down tests. Use Selenium’s explicit waits to manage dynamic content or delays in loading, and make it a point to wait for specific conditions, such as elements becoming visible, clickable, or present in the DOM.
- Conduct tests in isolation: If several tests share common data in your project’s database, ensure that tests are conducted in isolation so that test cases do not unintentionally modify the shared data.
- Avoid hardcoding data: Improve flexibility and scalability by using data-driven testing techniques to load input data from external sources like JSON or CSV files. For tests that need unique values, generate inputs dynamically, but ensure they remain predictable.
- Mock external services inside your test script: Mock external services to simulate API responses. Relying on real external systems and APIs doesn’t guarantee consistency, as unforeseen outages and latency issues can cause failures.
- Synchronize with Application State: Use dynamic waits and validate that APIs or dynamic content are ready before proceeding with your tests.
- Handle Browser Variations: Run cross-browser tests with tools like BrowserStack to make sure that the app runs seamlessly across multiple browsers, devices and screen resolutions.
- Refactor Tests: Remove or update tests for functionality that no longer exists, simplify overly complex scripts, and analyze failures to find their root causes.
- Monitor and Debug: Leverage screenshots, logs, and CI/CD integration to examine your tests closely, track test performance, and fix inconsistencies and recurring failures.
- Documentation: Analyze and document the patterns that cause flaky tests, along with their resolution strategies.
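To make a couple of these practices concrete, here is a minimal sketch combining stable data-test locators with input data loaded from an external CSV file. The file path, column layout, data-test attribute names, and URL are all illustrative assumptions:

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class DataDrivenLoginTest {
    public static void main(String[] args) throws IOException {
        // Load credentials from an external file instead of hardcoding them.
        // Assumed format: one "username,password" pair per line.
        List<String> rows = Files.readAllLines(Paths.get("testdata/users.csv"));

        WebDriver driver = new ChromeDriver();
        try {
            for (String row : rows) {
                String[] fields = row.split(",");
                driver.get("https://example.com/login"); // placeholder URL

                // Stable locators based on dedicated test attributes,
                // rather than brittle absolute XPaths.
                driver.findElement(By.cssSelector("[data-test='username']"))
                      .sendKeys(fields[0]);
                driver.findElement(By.cssSelector("[data-test='password']"))
                      .sendKeys(fields[1]);
                driver.findElement(By.cssSelector("[data-test='login-button']"))
                      .click();
                // ... assert on the result for this data row ...
            }
        } finally {
            driver.quit();
        }
    }
}
```

Because the locators target test-only attributes, cosmetic UI changes do not break the test, and new data rows can be added without touching the script.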
Read More: What Is API Automation Testing?
How to Handle Flaky Tests in Selenium
Before handling flaky tests in Selenium, first confirm that they really are flaky: that they display inconsistent results, sometimes passing and sometimes failing, without any change to the codebase. Here are a few ways to handle flaky tests in Selenium.
Handle Flaky Tests in Selenium:
- Identify the root cause
- Stabilize Environment
- Use the right ‘waits’
- Solve Parallel Execution Challenges
- Prioritize Stable Tests
- Automate cleanup tasks
- Avoid hardcoded timeouts
- Identify the root cause: Execute your tests multiple times and use a debugger to analyze the reason behind inconsistencies. Some of the reasons could be bad locators, incorrect data, and bad assertions.
- Stabilize Environment: Don’t underestimate aspects like network strength, latency, browser crashes, cache, etc., as they are important parts of stabilizing the test environment. You can also leverage cloud platforms to stabilize the tests, as they offer better customization and faster test execution.
- Use the right ‘waits’: Using the correct ‘waits’ is important because it helps with synchronization. Since it can’t be predicted exactly when a web element will be present on the page after loading, static wait times can lead to test failures. Therefore, rely on state-based synchronization.
- Solve Parallel Execution Challenges: Additional challenges arise during parallel execution; these can be solved by continuous monitoring, isolating test data, avoiding shared databases, and using appropriate locators.
- Prioritize Stable Tests: Avoid running known flaky tests in critical pipelines; run stable tests more often and across different scenarios.
- Automate cleanup tasks: Automate cleanup tasks and ensure a consistent environment by executing setup and teardown routines, as shown in the sketch after this list. During the setup phase, reset test data, initialize browser configurations, and mock dependencies. In the teardown phase, clear test data, end browser sessions with driver.quit(), and reset the application state by clearing cookies or logs. These procedures help isolate tests and reduce flakiness in Selenium.
- Avoid hardcoded timeouts: Hardcoded timeouts can mask issues like synchronization problems and slow down tests. Instead, it’s best to use dynamic waits like WebDriverWait to wait for specific conditions. This ensures the test continues as soon as the condition is met, optimizing execution time and avoiding long static delays, which further reduces flakiness in Selenium tests.
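As referenced above, here is a minimal JUnit 5 sketch of setup and teardown routines that keep each test isolated; the browser choice, URL, and cleanup steps are illustrative:

```java
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class IsolatedSeleniumTest {
    private WebDriver driver;

    @BeforeEach
    void setUp() {
        // Fresh browser per test: no leaked cookies, cache, or sessions
        // from a previous run.
        driver = new ChromeDriver();
    }

    @AfterEach
    void tearDown() {
        // Clear state and end the session so the next test starts clean.
        driver.manage().deleteAllCookies();
        driver.quit();
    }

    @Test
    void exampleScenario() {
        driver.get("https://example.com"); // placeholder URL
        // ... test steps and assertions ...
    }
}
```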
Why Manage Flaky Tests Using BrowserStack Test Observability
A faster yet efficient way to manage flaky tests in Selenium is to rely on a cloud service such as BrowserStack Test Observability, which solves multiple problems during test execution. It helps debug the entire process within minutes, without hours of manual effort.
It differentiates between a genuine defect and a flaky failure so that no issues are overlooked. It tracks the stability and performance metrics of your test suite over time, alerts you to regressions in behavior, and much more.
Moreover, it encourages collaboration among team members, making it an efficient system for test execution in which more minds are effectively involved in problem-solving. Overall, it provides reporting, debugging, and analytics all under one roof.
To manage Flaky Tests seamlessly, BrowserStack Observability provides you with:
- Smart tags that automatically categorize failures into flaky, always-failing, and new failures
- Automatically map each failure into customizable categories like Product Bug, Environment Issue, Automation Bug, etc. with AI-based Auto Failure Analysis.
- Debug with every single log in one pane or even travel back to any previous test execution to gain more context with the help of Timeline Debugging.
- Utilize graphs and reports to track the stability and performance of your test suite
- Re-run tests directly on CI from the Test Observability dashboard
- Customize your dashboards to slice and dice your test run data across different projects and runs
- Get alerted in real-time when specific quality rules in your test suite are compromised via customizable alerts.
The Human Aspect of Flaky Selenium Tests
Here are some things flaky Selenium tests can reveal about people in the organization:
- Teamwork and Communication: Flaky tests are a barometer of how well teams function. It can be challenging to get every team member to take an interest in end-to-end tests. Since flaky tests appear to pass sometimes and fail at other times, it’s safe to assume that anyone asking about them has been looking at test results consistently. They are the people more likely to be receptive to collaborative agile practices. If several flaky tests appear and have not been flagged, this indicates that the team is either not receiving information about the tests, or that the team is not interested.
- Test results fatigue: This refers to a condition in which teams have been so saturated with unreliable test results that they start ignoring end-to-end results altogether. Obviously, this negates the benefits of automation. A prime reason for test results fatigue is flaky tests. If team members start ignoring flaky tests and all results related to flaky tests, they end up disregarding large portions of the automated testing pipeline. By paying attention to how a team reacts to flaky tests, one gains insight into how the team is invested in automation. It also reveals how engaged the team is with the project at hand. In any team or organization that implements continuous deployment, automated end-to-end tests are necessary for product builds and releases. Flaky tests can halt tests and releases, and thus need to be resolved in order to complete test suites.
As discussed above, flaky tests, despite being troublesome, can be an indicator of the quality of the Selenium automation test setup. It might be tempting to dismiss them as issues with the automated tools and get rid of the tools (and, in turn, automation).
Instead, dive deeper, examine flaky tests, and look for where they emerge from. The cause could be an anomaly in performance, speed, or the nature of the software itself. The point here is: use flaky tests as a friend.
Conclusion
Flaky tests are difficult to detect. However, they are crucial indicators that something is wrong with the test or its environment.
Therefore, one must be aware of the possibility of a flaky test. There are several ways to tackle flakiness in Selenium tests, including proper synchronization, usage of appropriate locators, and much more.
You can also use tools like BrowserStack Test Observability to solve this problem effectively by leveraging clear and concise reports of your test execution that cover every aspect, thus helping you improve test quality.
Try BrowserStack Test Observability
Frequently Asked Questions
1. What is an example of a flaky test?
A flaky test shows inconsistency; that is, it sometimes passes and sometimes fails in the same configuration.
For instance, a test that calls an API fails when the network is slow and passes when the network is fast. This creates inconsistency.
These flaky tests may be caused by timing issues, external dependencies, dynamic content (e.g., changing DOM elements), or parallel execution issues (e.g., resource-sharing conflicts).
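For instance, a test like the following sketch is flaky because it assumes an element appears within a fixed delay; the URL, locator, and timing are illustrative assumptions:

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class FlakyTimingExample {
    public static void main(String[] args) throws InterruptedException {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://example.com"); // placeholder URL

            // Flaky: assumes the result renders within 1 second. On a slow
            // network or loaded machine the element is not there yet and
            // findElement throws NoSuchElementException; on a fast run it
            // passes. Same code, inconsistent outcome.
            Thread.sleep(1000);
            driver.findElement(By.id("result")).click();
        } finally {
            driver.quit();
        }
    }
}
```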
2. How to stabilize Selenium tests?
To ensure long-term stability in Selenium tests, certain practices must be followed, including regular maintenance, keeping track of inconsistent test scenarios, periodically optimizing test scripts, and more.