Get Current URL in Selenium using Python: Tutorial


Current URL in Selenium using Python: Example
- Pre-requisites
- Performing URL checks with Current URL in Selenium using Python


Selenium is a well-established automation testing framework that provides tools tailor-made for browser automation. It is well suited to navigating websites, scraping web content, and carrying out repetitive tasks.

This tutorial illustrates how to get the current URL in Selenium using Python. The WebDriver exposes it as the current_url property (an attribute read without parentheses, not a method call). It is generally used when you need an intermediate URL from a redirect chain or need to perform a series of navigations among different URLs.

Because reading the URL is the most direct way to confirm where the browser currently is, current_url appears in most browser automation scripts.

For example, suppose you need to scrape tables containing population data for a certain species from different biological databases. This task requires taking user input for the URL of each database and for the species to search for. As the script navigates to these databases, it must confirm that the correct URL has been reached. The most reliable way to do that is to use wait commands and check the result with the current_url property in Selenium.
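The wait-then-check idea can be sketched without a browser. The snippet below is a minimal illustration, not Selenium's own implementation: `wait_for_url` and `FakeDriver` are hypothetical names introduced here, and the example.org URLs are placeholders. It polls `current_url` until the expected address appears or a timeout expires.

```python
import time


def wait_for_url(driver, expected_url, timeout=10.0, poll_interval=0.5):
    """Poll driver.current_url until it equals expected_url or time runs out.

    Returns True on a match and False on timeout. `driver` is any object
    exposing a `current_url` attribute, such as a Selenium WebDriver.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.current_url == expected_url:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


class FakeDriver:
    """Hypothetical stand-in for a browser so the sketch runs without Selenium."""

    def __init__(self, urls):
        self._urls = list(urls)

    @property
    def current_url(self):
        # Report each intermediate URL once, then stay on the last one.
        if len(self._urls) > 1:
            return self._urls.pop(0)
        return self._urls[0]


# Simulate a redirect chain that ends on the page we want.
driver = FakeDriver([
    "data:,",
    "https://example.org/login",
    "https://example.org/species",
])
print(wait_for_url(driver, "https://example.org/species",
                   timeout=2, poll_interval=0.01))  # True
```

With a real WebDriver the same function could be called right after `driver.get(...)`. In practice Selenium's own explicit waits, used later in this tutorial, cover this case.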

Current URL in Selenium using Python: Example

To demonstrate the current_url property, this section walks through a basic example that navigates to a website using Google Chrome. A second example then navigates between multiple websites while scraping information, showing how current_url helps verify that each navigation reached the intended page.

Before executing the code to navigate to the URL https://www.google.com, the following prerequisites are needed.

Pre-requisites

Set up a Python environment.
Install Selenium. If you have conda or Anaconda set up, the pip package installer is a simple way to install Selenium. Run this command (in the Anaconda Prompt, or directly in a Linux terminal):
```
pip install selenium
```
Download the latest WebDriver for the browser you wish to use, or install webdriver_manager by running the command:
```
pip install webdriver_manager
```

Using the Current URL method in Selenium to perform a URL check on google.com

Step 1: Import the required packages using the following command.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

Step 2: Use WebDriver manager to download the required WebDriver for your browser (currently ChromeDriver, GeckoDriver, IEDriver, OperaDriver, and EdgeChromiumDriver can be downloaded via this package).

First, the WebDriver manager package detects the version of the browser being used. It then checks its cache to see whether a matching WebDriver is already present. If no WebDriver is present, or only an old version is present, the package downloads and saves the latest version of the WebDriver.
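The cache-then-download behaviour described above can be pictured with a small sketch. This is an illustration of the decision logic only, not the code of webdriver_manager: `resolve_driver`, `fake_download`, the version strings, and the paths are all hypothetical.

```python
def resolve_driver(browser_version, cache, download):
    """Return a driver path for browser_version, downloading only when needed.

    cache maps a browser version string to a saved driver path.
    download is a callable that fetches the driver and returns its path.
    """
    cached_path = cache.get(browser_version)
    if cached_path is not None:
        return cached_path            # matching driver already saved, reuse it
    path = download(browser_version)  # missing or outdated: fetch a fresh one
    cache[browser_version] = path
    return path


downloads = []  # records every simulated download


def fake_download(version):
    """Hypothetical downloader that only pretends to fetch a file."""
    downloads.append(version)
    return f"/tmp/drivers/chromedriver-{version}"


cache = {"113": "/tmp/drivers/chromedriver-113"}      # an older driver is cached
first = resolve_driver("114", cache, fake_download)   # triggers a download
second = resolve_driver("114", cache, fake_download)  # served from the cache
print(first)
print(second)
print(downloads)  # ['114']
```

The second call returns the same path without downloading again, which is why repeated test runs do not fetch the driver every time.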


This example uses Google Chrome, so the WebDriver manager installs the latest version of ChromeDriver.

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

In the above command, install() returns the location where the WebDriver was saved, and the Service object uses that location as the executable path.

Step 3: You can also manually download the WebDriver for your browser. If you do, you will need to either place the driver on your system path (for example, by putting the chromedriver.exe file in the same location as your Python code file) or set the executable path to the location of the WebDriver.

If your driver is on your system path:

from selenium import webdriver
driver = webdriver.Chrome()

Alternatively, set the executable path to the location of the WebDriver:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service 

ser = Service(r"C:/Users/Asus/Downloads/chromedriver_win32/chromedriver.exe")
driver = webdriver.Chrome(service=ser)

Following this, the Chrome browser should open with data:, in the URL bar. If you read current_url before loading any other page, this is the URL you will get.
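A script can use that placeholder address to tell whether any page has been loaded yet. The helper below is a small sketch: `has_navigated` and `BLANK_START_URLS` are hypothetical names, and treating `about:blank` as a start page for other browsers is an assumption added here, not something this tutorial tests.

```python
# "data:," is what this tutorial reports for a freshly opened Chrome window.
# "about:blank" is an assumption added here for other browsers.
BLANK_START_URLS = {"data:,", "about:blank"}


def has_navigated(current_url):
    """Return True once the browser has left its blank start page."""
    return current_url not in BLANK_START_URLS


print(has_navigated("data:,"))                   # False
print(has_navigated("https://www.google.com/"))  # True
```

With a live session the call would be `has_navigated(driver.current_url)`.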

Step 4: Load the required URL using get(). In this example, google.com is loaded as shown below.

driver.get("https://www.google.com")

Step 5: Read the current_url property to obtain the current URL from the driver and print it. Note that current_url is accessed without parentheses.

get_url = driver.current_url
print("The current url is:"+str(get_url))
driver.quit()
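Steps 4 and 5 can also be wrapped in a function so that the browser is closed even if navigation raises an error. This is a sketch under assumptions: `fetch_current_url` and its `make_driver` parameter are hypothetical names, and `FakeDriver` only stands in for a real browser so the snippet runs anywhere.

```python
def fetch_current_url(make_driver, url):
    """Open url in a fresh browser and return the URL the browser ends up on."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.current_url  # property access, no parentheses
    finally:
        driver.quit()              # runs even if get() raises


class FakeDriver:
    """Hypothetical stand-in for webdriver.Chrome, for demonstration only."""

    def __init__(self):
        self.current_url = "data:,"
        self.closed = False

    def get(self, url):
        self.current_url = url

    def quit(self):
        self.closed = True


fake = FakeDriver()
print(fetch_current_url(lambda: fake, "https://www.google.com/"))
print(fake.closed)  # True: quit() was called

# With Selenium installed and a driver available, the call would look like:
#   from selenium import webdriver
#   fetch_current_url(webdriver.Chrome, "https://www.google.com")
```

Passing a factory instead of a driver keeps the function in charge of the whole browser lifetime, so no window is left open after a failure.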


Performing URL checks with Current URL in Selenium using Python

The code below shows how to confirm each navigation when moving across multiple websites.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

# Path to a manually downloaded ChromeDriver; adjust it for your machine.
ser = Service(r"C:/Users/Asus/Downloads/chromedriver_win32/chromedriver.exe")
driver = webdriver.Chrome(service=ser)
wait = WebDriverWait(driver, 10)  # explicit wait of up to 10 seconds

driver.get("https://www.google.com")
print("The current url is:" + driver.current_url)

# Twice over: navigate to a user-supplied URL, confirm it, scrape the title.
for _ in range(2):
    val = input("Enter a url: ")
    driver.get(val)
    # Block until current_url equals the requested URL exactly;
    # a TimeoutException is raised if that does not happen within 10 seconds.
    wait.until(EC.url_to_be(val))

    soup = BeautifulSoup(driver.page_source, features="html.parser")
    title = soup.title.text
    # Append the title; the with-block closes the file automatically.
    with open("article_titles.txt", "a", encoding="utf-8") as file:
        file.write(title + "\n")

    print("The current url is:" + driver.current_url)

driver.quit()

Output

The current url is:https://www.google.com/

Enter a url: https://www.browserstack.com/guide/python-selenium-to-run-web-automation-test
The current url is:https://www.browserstack.com/guide/python-selenium-to-run-web-automation-test

Enter a url: https://www.browserstack.com/guide/download-file-using-selenium-python
The current url is:https://www.browserstack.com/guide/download-file-using-selenium-python

The program first opens https://www.google.com/. It then asks for a URL, and the user enters the URL of the article "Selenium with Python: Getting Started with Automation". The title of the article on this page is then scraped and stored in article_titles.txt as shown below.

Fetching the input URL and scraping the title of the article.

Writing the article title into article_titles.txt.

Saving the article_titles.txt file.

Next, the URL of another article, titled "How to download a file using Selenium and Python", is entered as input. The program scrapes the title of this article as well and appends it to the article_titles.txt document, as shown in the screenshots below.

Fetching the input URL and scraping the title of the article

Writing the article title into article_titles.txt.


The example above demonstrates a series of navigations and obtains the current URL after each one. To be certain that each navigation has completed and the correct URL is being read, a wait command confirms that the browser's URL equals the desired URL before the title is scraped.

An explicit wait command is often required when working with the current_url property, since a slow internet connection or intermediate URLs can cause you to read an incorrect URL. If you are carrying out web scraping or testing with Selenium and Python, this could lead you to obtain incorrect information. The wait command makes the code wait for a certain condition to occur before carrying on. Depending on the use case, conditions other than the one used above, such as staleness_of or text_to_be_present_in_element, can be used.
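One detail worth knowing is that url_to_be compares the URL exactly, while the sample output above shows the browser reporting https://www.google.com/ for a request to https://www.google.com. Since WebDriverWait accepts any callable that takes the driver, a more forgiving condition can be written by hand. The sketch below is one possible approach, not part of Selenium: `url_is` is a hypothetical helper that ignores a trailing slash and the letter case of the host name.

```python
from types import SimpleNamespace
from urllib.parse import urlsplit


def url_is(expected_url):
    """Build a wait condition that tolerates a trailing slash and host case."""

    def normalise(url):
        parts = urlsplit(url)
        return (parts.scheme, parts.netloc.lower(),
                parts.path.rstrip("/"), parts.query)

    target = normalise(expected_url)

    def condition(driver):
        # WebDriverWait calls this repeatedly until it returns a truthy value.
        return normalise(driver.current_url) == target

    return condition


# Stand-in object with a current_url attribute, so the sketch runs anywhere.
browser = SimpleNamespace(current_url="https://www.google.com/")
print(url_is("https://www.google.com")(browser))         # True
print(url_is("https://www.google.com/search")(browser))  # False

# With a real driver the condition is passed to an explicit wait:
#   WebDriverWait(driver, 10).until(url_is(val))
```

Selenium's expected_conditions module also ships url_contains, url_matches, and url_changes, which may be enough when only part of the address is known in advance.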

Following the wait command, the title of the article on the web page was scraped using Beautiful Soup and then stored in a text file called article_titles.txt. Without the wait command or the check against current_url, the script could read the page source too early and end up with the same title twice or no title at all.
