
Pyppeteer Tutorial: Guide to Puppeteer in Python (with Examples)

By GH, Community Contributor

Puppeteer is a popular browser automation tool from Google. However, it supports only the JavaScript programming language. Python enthusiasts and community contributors came together to bring Puppeteer to Python. Pyppeteer is a port of Puppeteer that mimics the Puppeteer API in Python.

What is Pyppeteer?

Pyppeteer is a Python version of the Puppeteer automation tool. It lets you automate web applications in Python while keeping the core functionality close to Puppeteer’s. Pyppeteer supports the Chromium and Chrome browsers for test automation. It is open-source software distributed under the MIT license.

Why Choose Pyppeteer?

Compared with other test automation tools such as Selenium and Robot Framework, Pyppeteer is simpler and often faster. It is highly customizable and extensible, and it gives the tester direct access to browser internals such as network logs and the console, which are harder to reach in many other frameworks.

Additionally, the framework takes little effort to set up. Unlike Selenium, Pyppeteer does not go through a middleware layer such as WebDriver. It talks to the browser directly over the DevTools Protocol, so test execution tends to be faster and more stable.

Differences between Puppeteer and Pyppeteer

Fundamentally, both are built on the same architecture. However, because Pyppeteer is a port of the JavaScript Puppeteer library to Python, language-level restrictions mean that a few things differ.

A few of the differences are listed below:

Puppeteer uses an object for passing parameters to functions. Pyppeteer functions accept both a dictionary and keyword arguments for options.

Example:

browser = await launch({'headless': True})

browser = await launch(headless=True)

Puppeteer uses $ in the names of its element selector methods. In Python, however, $ is not valid in an identifier, so Pyppeteer uses J as the shorthand instead, as listed below:

Puppeteer   | Pyppeteer               | Pyppeteer shorthand
Page.$()    | Page.querySelector()    | Page.J()
Page.$$()   | Page.querySelectorAll() | Page.JJ()
Page.$x()   | Page.xpath()            | Page.Jx()
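
As a small illustration of the table above, the sketch below mixes the short names with a comment naming their long equivalents. The helper name count_matches and the default 'a' selector are illustrative assumptions, and page is expected to be a Pyppeteer page that has already loaded a URL.

```python
async def count_matches(page, selector="a"):
    """Return how many elements match `selector`, or 0 if none do."""
    # page.J(...) is shorthand for page.querySelector(...): first match or None
    first = await page.J(selector)
    if first is None:
        return 0
    # page.JJ(...) is shorthand for page.querySelectorAll(...): list of matches
    matches = await page.JJ(selector)
    return len(matches)
```

Inside a coroutine this would be called as count = await count_matches(page, 'a').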

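In Puppeteer, evaluate() takes a JavaScript function or a string containing a JavaScript expression. In Pyppeteer, evaluate() always takes a string, which may hold either a JavaScript function or an expression. Pyppeteer tries to detect which one it is. If an expression is mistaken for a function, pass force_expr=True.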

Example:

content = await page.evaluate('document.body.textContent', force_expr=True)
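
To make the two styles concrete, here is a minimal sketch that uses evaluate() once with a bare expression and once with a function string that receives a Python argument. The helper name read_page_text and the 'body' selector are assumptions made for illustration.

```python
async def read_page_text(page):
    """Return the page title and body text using two styles of evaluate()."""
    # A bare expression: force_expr=True stops Pyppeteer from guessing
    # whether the string is a function.
    title = await page.evaluate("document.title", force_expr=True)
    # A function string: extra Python arguments are passed to the JS function.
    text = await page.evaluate(
        "(selector) => document.querySelector(selector).textContent", "body"
    )
    return {"title": title, "text": text.strip()}
```

The return value is a plain Python dictionary, because evaluate() converts the JavaScript result into Python values.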

We have discussed the differences between Pyppeteer and Puppeteer. Let’s now look at how Pyppeteer is used in practice.

Note: The Pyppeteer project is no longer actively maintained, and the latest version of Pyppeteer doesn’t work smoothly. Using Pyppeteer for new projects is therefore not recommended. Its documentation suggests using Playwright Python instead, which is well maintained and similar to Pyppeteer.

How to set up Pyppeteer?

Let’s walk through installing and setting up Pyppeteer with Python.

Pre-requisites to install Pyppeteer

  • Basic understanding of Python
  • Download and install Python 3.8 or higher

Install Pyppeteer

  1. Navigate to the desired folder (for example, PyppeteerDemo)
  2. Open a terminal and run the command below

pip install pyppeteer

Once the installation succeeds, you are ready to start automating.

How to perform various Actions with Pyppeteer

It is important to know which actions can be performed with Pyppeteer. If you are already familiar with Puppeteer, the learning curve will be minimal. Let’s go through the different actions in Pyppeteer.

Launching the browser with Pyppeteer

To launch the browser you need to create the browser instance first.

browser = await launch()  

The returned browser object represents the running browser instance. From it you can open as many pages as you need.

Pyppeteer Example:

import asyncio
from pyppeteer import launch

async def scraper():
    browser = await launch({"headless": False})
    page = await browser.newPage()
    await page.goto('https://www.google.com/')
    await browser.close()

asyncio.run(scraper())

In the above code, a browser instance is created first, and then newPage() is called to open a new page. Once the page exists, you can perform actions on it, such as navigating to a URL (for example, https://www.google.com).
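
If an action fails halfway, browser.close() in the example above is never reached and a browser process can be left running. The sketch below shows one way to guard against that with try/finally. The helper name fetch_title is an assumption, and the launch function is passed in as launch_fn (hypothetical parameter) so the helper does not depend on how the browser is started.

```python
async def fetch_title(launch_fn, url):
    """Open `url` in a fresh browser and return the page title."""
    browser = await launch_fn(headless=True)
    try:
        page = await browser.newPage()
        await page.goto(url)
        return await page.title()
    finally:
        # Runs on success and on error, so no browser process is left behind
        await browser.close()
```

With Pyppeteer installed, this would be run as asyncio.run(fetch_title(launch, 'https://www.google.com/')), where launch is imported from pyppeteer.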

Opening specific versions of Chrome/Chromium browser with Pyppeteer

Pyppeteer is flexible and customizable: you can specify which browser executable to launch. For example, if you have already installed a specific version of Chrome, you can pass its path to the launch() function as shown below.

browser = await launch(headless=False, executablePath='C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe')
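
A mistyped path only surfaces when the browser fails to start, so it can help to check the path first. The following is a minimal sketch under that assumption; build_launch_options is a hypothetical helper, not part of Pyppeteer, and it only assembles the keyword arguments that launch() already accepts.

```python
import os

def build_launch_options(chrome_path=None, headless=True):
    """Return keyword arguments for pyppeteer.launch()."""
    options = {"headless": headless}
    if chrome_path:
        if not os.path.isfile(chrome_path):
            raise FileNotFoundError(f"No browser executable at {chrome_path!r}")
        # executablePath makes Pyppeteer start this binary instead of
        # the Chromium build it downloads for itself
        options["executablePath"] = chrome_path
    return options
```

Inside a coroutine the result would be unpacked into the call, for example browser = await launch(**build_launch_options(path, headless=False)). Writing the Windows path as a raw string (r'C:\...') avoids having to double every backslash.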

Typing Text on the Web Page

Suppose you want to search for something on the page. You can use the page.type() method to enter the text.

Example:

await page.type('#mySearch', 'BrowserStack')

In the above call, #mySearch is the locator and BrowserStack is the search text.
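
A search usually needs the field to exist before typing and a key press afterwards to submit. The sketch below combines those steps. The helper name search_for and the 50 ms typing delay are illustrative assumptions.

```python
async def search_for(page, selector, text):
    """Type `text` into the field matching `selector` and press Enter."""
    # Wait until the field exists so page.type() does not fail on a slow page
    await page.waitForSelector(selector)
    # delay is the pause in milliseconds between key presses
    await page.type(selector, text, {"delay": 50})
    await page.keyboard.press("Enter")
```

Inside a coroutine this would be called as await search_for(page, '#mySearch', 'BrowserStack').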

Clicking the Button on the Webpage

Pyppeteer makes clicking buttons very simple: the page.click() method performs the click action. It takes the locator as its first argument, and the locator can be any valid CSS selector.

Example:

await page.click('#mybtn')
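
page.click() raises an error when nothing matches the selector. Where a missing button is an acceptable outcome (an optional cookie banner, for instance), a lookup followed by a click on the element handle is one alternative. This is a sketch; click_if_present is a hypothetical helper name.

```python
async def click_if_present(page, selector):
    """Click the first element matching `selector`; return False if none exists."""
    element = await page.querySelector(selector)
    if element is None:
        return False
    # ElementHandle.click() scrolls the element into view and clicks its centre
    await element.click()
    return True
```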

Printing PDF files with Pyppeteer

Pyppeteer allows you to print or save a webpage as a PDF. Instead of taking a screenshot, you can save the whole page in PDF format with the page.pdf() function.

Example:

await page.pdf({'path': 'python_print.pdf'})
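
page.pdf() accepts further options besides path. The sketch below sets the paper size and includes backgrounds; save_page_as_pdf is a hypothetical helper name. Note that PDF generation is only supported when the browser runs headless.

```python
async def save_page_as_pdf(page, path):
    """Write the current page to `path` as an A4 PDF and return its size in bytes."""
    pdf_bytes = await page.pdf({
        "path": path,             # where the file is written
        "format": "A4",           # paper size
        "printBackground": True,  # include CSS background colours and images
    })
    return len(pdf_bytes)
```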

Switching Tabs with Pyppeteer

Pyppeteer follows a different approach to switching tabs. As mentioned earlier, once you have the browser instance you can create as many pages as you want. Let’s understand this with an example:

browser = await launch({"headless": False})
page1 = await browser.newPage()
page2 = await browser.newPage()
await page1.goto('https://www.google.com/')
await page2.goto('https://www.browserstack.com/')

In the above example, we have created two pages, page1 and page2. page1 represents the first tab and page2 represents the second. To act on a tab, simply call the method on page1 or page2.

Example: Clicking on the search button on the second page

await page2.click('#browserstackSearchButton')
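
When a tab was opened by the site itself rather than by newPage(), there is no variable holding it. One way to find it, sketched below, is to scan browser.pages() for a matching URL. The helper name switch_to_tab is an assumption.

```python
async def switch_to_tab(browser, url_part):
    """Return the first open page whose URL contains `url_part`, or None."""
    for page in await browser.pages():
        if url_part in page.url:
            # Bring the tab to the foreground (only visible when not headless)
            await page.bringToFront()
            return page
    return None
```

Inside a coroutine, page2 = await switch_to_tab(browser, 'browserstack.com') would return the BrowserStack tab from the example above.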

Managing Cookies with Pyppeteer

Pyppeteer lets you manage cookies: you can set, read, and delete them. The available methods are listed below.

Example:

await page.cookies() : Get all the available cookies

await page.setCookie(*cookies) : Set one or more cookies

await page.deleteCookie(*cookies) : Delete one or more cookies
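
The three methods can be chained into a round trip. In this sketch each cookie is a dictionary with at least a name (and a value when setting), and the page is assumed to have already navigated to a real URL so that the cookie has a site to belong to. The helper name cookie_round_trip is hypothetical.

```python
async def cookie_round_trip(page, name, value):
    """Set a cookie, read it back, delete it, and return the value that was read."""
    await page.setCookie({"name": name, "value": value})
    cookies = await page.cookies()
    # cookies() returns a list of dictionaries, one per cookie
    found = next((c["value"] for c in cookies if c["name"] == name), None)
    await page.deleteCookie({"name": name})
    return found
```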

Handling iFrames with Pyppeteer

iFrames are a legacy technique for embedding one page inside another. Pyppeteer nevertheless supports actions on iFrames.

For example, if you want to click a specific element inside an iFrame, you can follow the approach below.

Example:

# iframe_1 and iframe_button hold the CSS selectors of the iFrame and the button
iframe_element = await page.querySelector(iframe_1)
iframe = await iframe_element.contentFrame()
button = await iframe.querySelector(iframe_button)
await button.click()

Handling Alerts and Pop-ups with Pyppeteer

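Alerts, confirms, and prompts are native browser dialogs rather than part of the page, so they cannot be located with selectors. Pyppeteer raises a dialog event for them, and you handle it by registering a callback with page.on(). Because the callback is a regular function while dialog.accept() and dialog.dismiss() are coroutines, schedule them with asyncio.ensure_future(). Below are examples of handling browser pop-ups and alert messages.

Example: Handle Confirm Dialog Box

def handle_confirm_dialog_box(dialog):
    # test_message is the text entered when the dialog is a prompt
    asyncio.ensure_future(dialog.accept(test_message))

page.on('dialog', handle_confirm_dialog_box)

Example: Handle Dismiss Dialog Box

def handle_dismiss_dialog_box(dialog):
    asyncio.ensure_future(dialog.dismiss())

page.on('dialog', handle_dismiss_dialog_box)

Similarly, you can handle other types of dialogs.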
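
The two handlers differ only in the action they schedule, so they can be produced by one factory. This is a sketch of that idea; make_dialog_handler and its parameters are hypothetical names, not part of Pyppeteer.

```python
import asyncio

def make_dialog_handler(accept=True, prompt_text=""):
    """Build a callback for page.on('dialog', ...) that accepts or dismisses."""
    def handler(dialog):
        # accept() takes the text to enter when the dialog is a prompt
        action = dialog.accept(prompt_text) if accept else dialog.dismiss()
        # The callback is synchronous, so schedule the coroutine on the event loop
        asyncio.ensure_future(action)
    return handler
```

It would be registered as page.on('dialog', make_dialog_handler(accept=False)) before triggering the action that opens the dialog.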

Handling Dynamic Content with Pyppeteer

Dynamically loaded elements are standard in modern web applications. For example, when you navigate to a page it may load only the content for the current viewport, and add more content as you scroll down.

In such scenarios, if you navigate to the page and immediately act on an element that has not loaded yet, Pyppeteer throws an exception saying the element was not found. One solution is to scroll down until the element is visible.

In Pyppeteer, this can be done by calling the browser’s scrollIntoView() function on the element through page.evaluateHandle().

Example:

Scroll until the element is visible

async def scroll_to_element(page, selector):
    # Ask the browser to scroll the first element matching the selector into view
    await page.evaluateHandle(
        '''async (selector) => {
            const element = document.querySelector(selector);
            if (element) {
                element.scrollIntoView();
            }
        }''',
        selector
    )
    return selector

Once the element is in view, perform the action

elem_button_footer = await scroll_to_element(page, button_footer)
await page.click(elem_button_footer)
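
scrollIntoView() only helps when the element already exists in the DOM. For pages that append content as you scroll, a different technique is to keep scrolling to the bottom until the page height stops growing. The sketch below assumes that approach; scroll_until_loaded, pause, and max_rounds are hypothetical names and values.

```python
import asyncio

async def scroll_until_loaded(page, pause=0.5, max_rounds=20):
    """Scroll to the bottom until the page height stops growing.

    Returns the number of scrolls that loaded more content.
    """
    loaded = 0
    height = await page.evaluate("document.body.scrollHeight", force_expr=True)
    for _ in range(max_rounds):
        await page.evaluate(
            "window.scrollTo(0, document.body.scrollHeight)", force_expr=True
        )
        await asyncio.sleep(pause)  # give lazy-loaded content time to arrive
        new_height = await page.evaluate(
            "document.body.scrollHeight", force_expr=True
        )
        if new_height == height:
            break  # nothing new was added, so stop scrolling
        height = new_height
        loaded += 1
    return loaded
```

The max_rounds limit keeps the loop from running forever on pages with endless feeds.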

Parallel Execution with Pyppeteer

Pyppeteer doesn’t support parallel test execution by itself. That means you cannot run multiple tests at the same time out of the box. However, this can be achieved with a third-party pytest plugin called pytest-parallel.

Once you install pytest-parallel, you can use the following commands to run the tests in parallel:

  • pytest --workers 2: Runs the tests across 2 workers
  • pytest --workers auto: Chooses the number of workers from the number of CPU cores
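
Separately from pytest-parallel, Pyppeteer is built on asyncio, so several pages can be driven concurrently inside a single script with asyncio.gather(). This is a different technique from test-level parallelism: everything runs in one process and one browser. The sketch below assumes that approach, and fetch_titles is a hypothetical helper name.

```python
import asyncio

async def fetch_titles(browser, urls):
    """Open one tab per URL concurrently and return the page titles in order."""
    async def title_of(url):
        page = await browser.newPage()
        try:
            await page.goto(url)
            return await page.title()
        finally:
            await page.close()
    # gather() interleaves the coroutines on one event loop and keeps input order
    return await asyncio.gather(*(title_of(url) for url in urls))
```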

Web Scraping with Pyppeteer

Pyppeteer and Python are a good combination for scraping web pages. Web scraping supports industry research, especially in the retail segment, where competitors analyze particular products, their SKUs, pricing, discounts, and so on.

Below is a simple example of web scraping with Pyppeteer. The steps to perform are:

  • Navigate to the webpage
  • Get all the product cards
  • For each product, get the name and price
Example:

async def scrape_it(page, url) -> list:
    # url is the page to scrape, for example 'https://someretailwebsite.com'
    await page.goto(url)
    rows = await page.querySelectorAll('#product_cards')
    scraping_data_arr = []
    for row in rows:
        name = await row.querySelector('p.name')
        price = await row.querySelector('div.price-value')
        nameText = await page.evaluate('(element) => element.textContent', name)
        priceValue = await page.evaluate('(element) => element.textContent', price)
        scraping_data_dict = {
            'product name': nameText,
            'product price': priceValue
        }
        # Append inside the loop so every product is collected
        scraping_data_arr.append(scraping_data_dict)
    return scraping_data_arr
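
textContent returns raw text, so the scraped names and prices usually carry stray whitespace and currency symbols. A small post-processing step, sketched below, can normalize each row before analysis. clean_product is a hypothetical helper, and it assumes prices use a comma as the thousands separator and a dot as the decimal point.

```python
import re

def clean_product(row):
    """Normalise one scraped row: trim the name and parse the price as a float."""
    # Collapse runs of spaces and newlines into single spaces
    name = " ".join(row["product name"].split())
    # Drop thousands separators, then take the first number in the text
    match = re.search(r"\d+(?:\.\d+)?", row["product price"].replace(",", ""))
    price = float(match.group()) if match else None
    return {"product name": name, "product price": price}
```

It would be applied to the scraper output as cleaned = [clean_product(row) for row in scraping_data_arr].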

Cross Browser Testing with Pyppeteer

As discussed above, Pyppeteer is a port of the JavaScript Puppeteer library, so its browser support is the same as Puppeteer’s. Puppeteer is intended for testing in the Chromium and Chrome browsers only and does not fully support any other browser.

Although it has experimental support for Firefox, using it is not recommended. Because of these limitations, cross-browser testing cannot be achieved with Pyppeteer. The best alternative is Playwright Python, which supports several browsers without any hassle.

How to run Pyppeteer Tests on Real Devices with BrowserStack Automate

Pyppeteer is no longer actively maintained. As mentioned earlier, the latest version of Pyppeteer doesn’t work smoothly, and it cannot download the latest browser builds. Since active development has stopped, there is little support for cloud execution.

Even though cloud execution can be achieved through some hacks, stability and reliability cannot be guaranteed.

Note: If you have a cloud execution requirement, you can migrate your project to Puppeteer and JavaScript. If the programming language is a barrier, consider switching to Playwright Python.

BrowserStack supports both Puppeteer JavaScript and Playwright Python seamlessly. You can learn more about integrating Playwright with BrowserStack and test your application on 20,000+ real devices in the cloud.


Conclusion

Puppeteer is a very popular tool for test automation, but it lacks support for programming languages other than JavaScript. To overcome this, Pyppeteer was created as a port of Puppeteer that exposes the same API in Python.

The Pyppeteer project is no longer actively maintained. In automation, cross-browser testing, cloud test execution, and parallel execution are the most crucial capabilities for achieving ROI, and Pyppeteer lacks all three.

Unless you test your application on multiple device and browser combinations, you cannot predict how it will behave in production, which makes every release riskier. As recommended by the Pyppeteer team, consider switching to Playwright Python, which provides all of these features without any hassle.

Additionally, BrowserStack supports Playwright seamlessly, which helps you achieve a good ROI from automation and release your code with confidence.
