Selenium WebDriver Tutorial: Getting Started with Test Automation
By Jash Unadkat, Community Contributor - June 30, 2024
Selenium WebDriver is one of the most important parts of the Selenium test suite family. But before exploring Selenium WebDriver, let’s begin with a few basics about Selenium.
What is Selenium?
Selenium refers to a suite of tools that are widely used in the testing community when it comes to cross-browser testing. Selenium cannot automate desktop applications; it can only be used in browser automation. It supports a number of browsers such as Chrome, Firefox, Internet Explorer, Safari, Opera and operating systems such as Windows, Mac, Linux/Unix.
Selenium also provides compatibility with different programming languages – C#, Java, JavaScript, Ruby, Python, PHP. Testers can choose which language to design test cases in, thus making Selenium highly favorable for its flexibility.
Selenium Components
The Selenium test suite comprises four main components:
- Selenium Integrated Development Environment (IDE) is a browser extension for Firefox and Chrome that generates tests quickly through its record and playback functionality. The IDE records the user actions on the browser and exports them as a reusable script in different languages such as Java, C# and JavaScript.
- Selenium WebDriver is the core component of Selenium, which provides a programming interface for driving web browsers. It allows you to write tests in different programming languages that interact with web elements, simulate user interactions, and perform assertions.
- Selenium Grid is a tool that is used for concurrent execution of test cases on different browsers, machines, and operating systems simultaneously. It allows parallel test execution making it faster to run large test suites.
- Selenium RC (Remote Control) is obsolete. It has been replaced by Selenium WebDriver because of RC's complex architecture and limitations. It allowed you to develop tests in a programming language of your choice. The server and the client libraries were the two main components of Selenium RC.
Read More: Selenium IDE Alternatives
What is Selenium WebDriver?
Selenium WebDriver is a web framework that permits you to execute cross-browser tests. This tool is used for automating the testing of web-based applications to verify that they perform as expected.
Selenium WebDriver allows you to choose a programming language to create test scripts. As discussed earlier, it was developed as an improvement over Selenium RC to overcome some of RC's limitations. Selenium WebDriver cannot handle native operating-system windows (such as file-upload dialogs), but this drawback can be overcome by using tools like Sikuli and AutoIt.
Now let’s try to understand the WebDriver Architecture.
Selenium WebDriver Framework Architecture
WebDriver Architecture is made up of four major components:
- Selenium Client library
- JSON wire protocol over HTTP
- Browser Drivers
- Browsers
Read More: Architecture of Selenium WebDriver
Selenium Client Libraries/Language Bindings
Selenium supports multiple programming languages such as Ruby, Python, and Java. The Selenium developers have built language bindings (client libraries) for each of them. For instance, if you want to use the browser driver from Python, use the Python bindings. You can download the language bindings of your choice from the official Selenium site.
JSON Wire Protocol
JSON is an acronym for JavaScript Object Notation. It is an open-standard, text-based data format used to transfer data between a client and a server on the web. It supports data structures such as arrays and objects, which makes the data easy to read and write.
The JSON Wire Protocol is a REST (Representational State Transfer) API built on this format: the client library sends each command to the browser driver's HTTP server as an HTTP request, and the command parameters and results travel as JSON. Learn more about the REST API for accessing Selenium.
Browser Drivers
Each browser has its own driver. Without revealing the internal logic of the browser's functionality, the browser driver interacts with its browser by establishing a secure connection. On the client side, each language binding (C#, Python, Java, etc.) provides a matching driver class, such as ChromeDriver in Java.
Browser | BrowserDriver | OS Supported |
---|---|---|
Google Chrome / Chromium | ChromeDriver | Windows/macOS/Linux |
Mozilla Firefox | GeckoDriver | Windows/macOS/Linux |
Microsoft Edge | Microsoft Edge WebDriver | Windows/macOS/Linux |
Safari | SafariDriver (in-built) | macOS High Sierra and newer |
Internet Explorer | InternetExplorerDriver | Windows |
You can download the browser driver that matches your browser and operating system. For example, you can configure Selenium WebDriver for Python on BrowserStack.
When a test script is executed with the help of WebDriver, the following tasks are performed in the background:
- For every Selenium command, an HTTP request is generated and delivered to the browser driver.
- The driver receives the HTTP request through its HTTP server.
- The HTTP server determines the steps needed to carry out the command, and they are executed on the browser.
- The HTTP server then receives the execution status and sends it back to the automation script.
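The round trip above can be sketched at the HTTP level. The following is a minimal illustration, not Selenium's actual implementation: it uses the JDK's `java.net.http` classes to build (without sending) the request that the W3C WebDriver specification defines for the "Navigate To" command, which is what a call such as `driver.get(url)` results in. The driver address `http://localhost:9515` (ChromeDriver's default port), the session id `abc123`, and the class and method names are placeholder assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: shows the shape of the HTTP request behind driver.get(url).
// The base URL, session id, and all names here are hypothetical placeholders.
final class NavigateRequestSketch {

    // JSON body for the "Navigate To" command: {"url": "<target>"}.
    // No JSON escaping is done; that is acceptable only for simple URLs like these.
    static String navigateBody(String targetUrl) {
        return "{\"url\": \"" + targetUrl + "\"}";
    }

    // Builds (but does not send) POST {driverBase}/session/{sessionId}/url.
    static HttpRequest navigateRequest(String driverBase, String sessionId, String targetUrl) {
        return HttpRequest.newBuilder(URI.create(driverBase + "/session/" + sessionId + "/url"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(navigateBody(targetUrl)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = navigateRequest(
                "http://localhost:9515", "abc123", "https://www.browserstack.com/");
        // Print the request line and the body the driver's HTTP server would receive.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(navigateBody("https://www.browserstack.com/"));
    }
}
```

In a real test the client library creates the session first and fills in the session id itself; the sketch only makes the request that travels to the driver visible.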
Browsers
As discussed earlier, Selenium provides support for multiple browsers like Chrome, Firefox, Safari, Internet Explorer etc.
Curious about Selenium 4? Watch this in-depth webinar on Selenium 4 by Simon Stewart, the Selenium Project’s lead, the creator of WebDriver, and the co-editor of the W3C WebDriver specification.
Benefits and Limitations of Selenium WebDriver
Benefits of Selenium WebDriver
- It is one of the most popular Open-Source tools and is easy to get started with for testing web-based applications. It also allows you to perform cross browser compatibility testing.
- Supports multiple operating systems like Windows, Mac, Linux, Unix, etc.
- It provides compatibility with a range of languages, including Python, Java, C#, Ruby, and JavaScript (Perl and PHP are available through community-maintained bindings).
- Provides support for browsers like Chrome, Firefox, Edge, and Safari, as well as the legacy Internet Explorer.
- Selenium WebDriver executes test scripts faster than Selenium RC because it communicates with the browser directly rather than through an intermediate server.
- Offers a more concise API (Application Programming Interface) than Selenium RC
- It also provides compatibility with iPhoneDriver, HtmlUnitDriver, and AndroidDriver
Note: Opera is no longer supported by Selenium 4
Limitations of Selenium WebDriver
- Support for new browsers is not readily available when compared to Selenium RC
- It has no built-in command for automatically generating test reports; reporting depends on a test framework such as TestNG or JUnit
How Selenium WebDriver Works
The way Selenium WebDriver works differs between Selenium 3 and Selenium 4. Here's how each version works:
Selenium 3
In Selenium 3, there is no direct communication between the client libraries (Java, Python, JavaScript, etc) and the browser drivers. Here is how Selenium 3 works:
- The client library communicates with the browser drivers, and through them with the browsers, using the JSON (JavaScript Object Notation) Wire Protocol.
- JSON is a data format for transferring data between client and server on the web; it represents data with structures such as arrays and objects.
- The JSON Wire Protocol acts as a REST (Representational State Transfer) API that exchanges this information over HTTP.
Selenium 4
The architecture of Selenium 4 is similar to Selenium 3; however, it uses W3C protocol instead of JSON wire protocol for communication between Client Libraries and Browser Drivers. And therefore, WebDriver in Selenium 4 is fully W3C compliant!
Test scripts can communicate directly with the browser drivers (binaries) using the WebDriver protocol. With the removal of JSON Wire protocol, the communication got a lot easier between the client libraries and the browser drivers.
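One visible difference between the two protocols is the body of the "new session" request (`POST /session`). The sketch below contrasts the two shapes under simplifying assumptions: the legacy JSON Wire Protocol carried a top-level `desiredCapabilities` object, while the W3C WebDriver protocol uses `capabilities` with an `alwaysMatch` object. The class and method names are hypothetical, and real client libraries send more fields than shown here.

```java
// Sketch only: contrasts the "new session" request bodies of the two protocols.
// Class and method names are hypothetical; real payloads contain more fields.
final class NewSessionPayloadSketch {

    // Legacy JSON Wire Protocol: top-level "desiredCapabilities".
    static String jsonWirePayload(String browserName) {
        return "{\"desiredCapabilities\": {\"browserName\": \"" + browserName + "\"}}";
    }

    // W3C WebDriver protocol: "capabilities" containing an "alwaysMatch" object.
    static String w3cPayload(String browserName) {
        return "{\"capabilities\": {\"alwaysMatch\": {\"browserName\": \"" + browserName + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("JSON Wire: " + jsonWirePayload("chrome"));
        System.out.println("W3C:       " + w3cPayload("chrome"));
    }
}
```

Because browser drivers implement the W3C format natively, a W3C-compliant client can send its commands without translating them into a second dialect first.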
The WebDriver script below illustrates the execution flow:
1. Initialize a WebDriver instance (e.g., ChromeDriver or FirefoxDriver)
WebDriver driver = new ChromeDriver();
2. Navigate to a URL to start testing the web application.
driver.get("URL");
3. Find web elements using different locator strategies.
WebElement element = driver.findElement(By.xpath("locator value"));
4. Perform actions on web elements, such as click(), sendKeys(), or getText().
element.click();
5. Run the test
6. Assert the execution against the expected result
7. Clean up
driver.close(); // closes only the current browser window
OR
driver.quit(); // closes every window and ends the WebDriver session
Types of Requests
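There are two types of requests you might be familiar with: GET and POST.
A GET request asks the browser driver for information, such as the page title. The response is generated at the browser end and sent over HTTP to the browser driver, and the browser driver then returns it, using the wire protocol, to the test script running in your IDE (for example, Eclipse). A POST request instructs the driver to perform an action in the browser, such as navigating to a URL or clicking an element.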
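To make the distinction concrete, the sketch below lists a few commands together with the HTTP method and path that the W3C WebDriver specification assigns to them. The class and method names are hypothetical; the endpoints themselves come from the specification, which also uses DELETE for ending a session (what `driver.quit()` triggers).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a small lookup of WebDriver commands and their HTTP endpoints.
// Class and method names are hypothetical; endpoints follow the W3C WebDriver spec.
final class RequestTypeSketch {

    // Command name -> "METHOD path-template", in insertion order.
    static final Map<String, String> ENDPOINTS = new LinkedHashMap<>();
    static {
        ENDPOINTS.put("Navigate To", "POST /session/{session id}/url");        // driver.get(url)
        ENDPOINTS.put("Get Title", "GET /session/{session id}/title");         // driver.getTitle()
        ENDPOINTS.put("Get Current URL", "GET /session/{session id}/url");     // driver.getCurrentUrl()
        ENDPOINTS.put("Find Element", "POST /session/{session id}/element");   // driver.findElement(...)
        ENDPOINTS.put("Element Click",
                "POST /session/{session id}/element/{element id}/click");      // element.click()
        ENDPOINTS.put("Delete Session", "DELETE /session/{session id}");       // driver.quit()
    }

    // Returns the HTTP method ("GET", "POST", or "DELETE") for a known command.
    static String methodFor(String command) {
        String endpoint = ENDPOINTS.get(command);
        if (endpoint == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return endpoint.substring(0, endpoint.indexOf(' '));
    }

    public static void main(String[] args) {
        ENDPOINTS.forEach((name, endpoint) -> System.out.println(name + " -> " + endpoint));
    }
}
```

Commands that only read state (title, current URL) map to GET, while commands that change something in the browser or need a request body map to POST.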
Read More: How to configure Selenium in Eclipse
How to use Selenium WebDriver in Java: Example
The code below launches the BrowserStack website in the Chrome browser and verifies the page title.

    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.chrome.ChromeDriver;
    import org.testng.Assert;
    import org.testng.annotations.Test;

    public class BrowserStackDemo {

        WebDriver driver;

        @Test
        public void verifyTitle() {
            // Start a new Chrome session
            driver = new ChromeDriver();
            try {
                // Open the BrowserStack home page
                driver.get("https://www.browserstack.com/");
                // Compare the actual page title with the expected one
                Assert.assertEquals(driver.getTitle(),
                        "Most Reliable App & Cross Browser Testing Platform | BrowserStack");
            } finally {
                // End the session even if the assertion fails
                driver.quit();
            }
        }
    }
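A typical WebDriver test follows the steps below. The code above creates the driver, navigates to the page, and verifies the result; because it only checks the page title, it skips the element interaction:
- Create a Selenium WebDriver instance
- Configure the browser if required
- Navigate to the required web page and locate the relevant web element
- Perform an action on the web element
- Verify and validate the action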
Selenium is not limited to basic browser automation. It also supports different types of testing such as Regression Testing, Cross Browser Testing, UI Testing, Visual Testing, and Headless Browser Testing, and it can be combined with other libraries for tasks such as Database Testing.
On executing the above Java code, Selenium will launch the Chrome browser and open the BrowserStack home page. Then, it will use Assert to check that the page title matches the expected title, which determines whether the test case passes or fails.
All Selenium tests must be run on real devices for accurate results. Start running tests on 3500+ real browsers and devices on BrowserStack’s real device cloud. Run parallel tests on a Cloud Selenium Grid to get faster results without compromising on accuracy. Detect bugs before users do by testing software in real user conditions with BrowserStack Automate.
Why run Selenium WebDriver Tests on Real Devices & Browsers?
Each device and browser renders web pages differently. Hence, for a consistent user experience across different devices and browsers, it is advisable to test in a real environment setup.
Procuring the latest devices and browsers is challenging. It is best to invest in a real device cloud platform which helps test web apps on a wide range of devices and browser combinations.
BrowserStack Automate is a real device cloud that provides a comprehensive infrastructure for cross-device and cross-browser testing. There’s zero hassle in setting up the device and environment.
Testing on real devices and browsers ensures a high-quality user experience.
Here are the features offered by BrowserStack Automate that make it a good choice for running Selenium tests:
- Parallel Testing: BrowserStack Automate lets you test on multiple device and browser combinations at once, speeding up test execution and providing quick feedback.
- Real Devices and Browsers: Testing on real devices and browsers ensures accurate performance insights, unlike emulators. You can test on the latest devices without buying them.
- Dedicated Dashboard: Automate offers a dashboard to track and manage your tests, showing test statuses (Pass/Fail/Pending), device details, test duration, screenshots, and more.
- Custom Reports with Artifacts: Generate detailed, customized reports including test status, device/browser configurations, video recordings, and screenshots.
- Easy Integration with CI/CD: Seamlessly integrate with CI/CD tools like Jenkins, TeamCity, and TravisCI for faster, reliable, and consistent application delivery.
Conclusion
This Selenium WebDriver tutorial is meant to introduce new users to the many functions and uses of Selenium. After studying the basics, explore the other Selenium tutorials on the BrowserStack Guide to understand why Selenium WebDriver is one of the most powerful resources in a tester's toolkit.