• Login
Sunday, June 1, 2025
Blogue
  • Homepage
  • Web
    The WordPress Drama- Open Source at War

    The WordPress Drama: Open Source at War

    The Plan to Break Apart Google- RIP Chrome_

    The Plan to Break Apart Google: RIP Chrome?

    Andrew Tate’s Hustler’s University Hacked- What We Know

    Andrew Tate’s Hustler’s University Hacked: What We Know

    Scraping Websites with Various Scraper Frameworks (With Examples)

    Scraping Websites with Various Scraper Frameworks (With Examples)

    Git : Developer’s Favourite

    Git : Developer’s Favourite

    HTTP Status/Error Codes and How to Fix them

    HTTP Status/Error Codes and How to Fix them

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    Guide to Googling

    Guide to Googling

    How To Submit a Form With Puppeteer and JavaScript

    How To Submit a Form With Puppeteer and JavaScript

  • Web Developer
    How to Detect Ad Blockers in a React.js Application

    How to Detect Ad Blockers in a React.js Application

    How to Create Responsive Image in CSS

    How to Create Responsive Image in CSS

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    How To Take Screenshot With Puppeteer

    How To Take Screenshot With Puppeteer

    Web Scraping With JavaScript and Puppeteer

    Web Scraping With JavaScript and Puppeteer

    How to create a Responsive Website

    How to create a Responsive Website

    Top Frontend Frameworks To Learn

    Top Frontend Frameworks To Learn

    Full Stack Web Developer: A Guide to Learn

    Full Stack Web Developer: A Guide to Learn

  • Web Scraping
    Scraping Websites with Various Scraper Frameworks (With Examples)

    Scraping Websites with Various Scraper Frameworks (With Examples)

    Scraping Table Data into JSON File with Puppeteer

    Scraping Table Data into JSON File with Puppeteer

    How To Scrape Multiple Pages With Puppeteer and JavaScript

    How To Scrape Multiple Pages With Puppeteer and JavaScript

    How To Submit a Form With Puppeteer and JavaScript

    How To Submit a Form With Puppeteer and JavaScript

    How to Automate Form Submission with Puppeteer and Javascript

    How to Automate Form Submission with Puppeteer and Javascript

    How To Take Screenshot With Puppeteer

    How To Take Screenshot With Puppeteer

    Web Scraping With JavaScript and Puppeteer

    Web Scraping With JavaScript and Puppeteer

  • How To?
    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    Facebook: The Engine Behind the World’s Largest Social Network

    Facebook: The Engine Behind the World’s Largest Social Network

    Snapchat: The System Powering Snaps, Stories, and Lenses

    Snapchat: The System Powering Snaps, Stories, and Lenses

    How WhatsApp Powers Instant Communication for Billions

    How WhatsApp Powers Instant Communication for Billions?

    How Instagram Scales to Billions of Photos and Videos Daily

    How Instagram Scales to Billions of Photos and Videos Daily?

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    Effective Structure for Describe Image, Retell Lecture, and Summarize Spoken Text in PTE Academic

    Effective Structure for Describe Image, Retell Lecture, and Summarize Spoken Text in PTE Academic

    Understanding Marks Allocation in PTE Academic Tasks

    Understanding Marks Allocation in PTE Academic Tasks

    From Power-On to Productivity: How Your Operating System Comes to Life

    From Power-On to Productivity: How Your Operating System Comes to Life

  • Technology
    A Deep Dive into Physical Storage Devices: From Floppy Disks to Modern SSDs

    A Deep Dive into Physical Storage Devices: From Floppy Disks to Modern SSDs

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    The WordPress Drama- Open Source at War

    The WordPress Drama: Open Source at War

    25 Software Bugs That Changed the World

    25 Software Bugs That Changed the World

    The Plan to Break Apart Google- RIP Chrome_

    The Plan to Break Apart Google: RIP Chrome?

    Guide to Googling

    Guide to Googling

    How Video Streaming on the Internet Works

    How Video Streaming on the Internet Works

    Git : A short history

    Git : A short history

    A Walkthrough to Blockchain

    A Walkthrough to Blockchain

  • System Design
    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    Facebook: The Engine Behind the World’s Largest Social Network

    Facebook: The Engine Behind the World’s Largest Social Network

    Snapchat: The System Powering Snaps, Stories, and Lenses

    Snapchat: The System Powering Snaps, Stories, and Lenses

    How WhatsApp Powers Instant Communication for Billions

    How WhatsApp Powers Instant Communication for Billions?

    How Instagram Scales to Billions of Photos and Videos Daily

    How Instagram Scales to Billions of Photos and Videos Daily?

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How YouTube Handles Billions of Videos Daily

    How YouTube Handles Billions of Videos Daily?

    How Spotify Streams Personalized Music to Millions in Real-Time?

    How Spotify Streams Personalized Music to Millions in Real-Time?

Submit Post
No Result
View All Result
  • Homepage
  • Web
    The WordPress Drama- Open Source at War

    The WordPress Drama: Open Source at War

    The Plan to Break Apart Google- RIP Chrome_

    The Plan to Break Apart Google: RIP Chrome?

    Andrew Tate’s Hustler’s University Hacked- What We Know

    Andrew Tate’s Hustler’s University Hacked: What We Know

    Scraping Websites with Various Scraper Frameworks (With Examples)

    Scraping Websites with Various Scraper Frameworks (With Examples)

    Git : Developer’s Favourite

    Git : Developer’s Favourite

    HTTP Status/Error Codes and How to Fix them

    HTTP Status/Error Codes and How to Fix them

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    Guide to Googling

    Guide to Googling

    How To Submit a Form With Puppeteer and JavaScript

    How To Submit a Form With Puppeteer and JavaScript

  • Web Developer
    How to Detect Ad Blockers in a React.js Application

    How to Detect Ad Blockers in a React.js Application

    How to Create Responsive Image in CSS

    How to Create Responsive Image in CSS

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

    How To Take Screenshot With Puppeteer

    How To Take Screenshot With Puppeteer

    Web Scraping With JavaScript and Puppeteer

    Web Scraping With JavaScript and Puppeteer

    How to create a Responsive Website

    How to create a Responsive Website

    Top Frontend Frameworks To Learn

    Top Frontend Frameworks To Learn

    Full Stack Web Developer: A Guide to Learn

    Full Stack Web Developer: A Guide to Learn

  • Web Scraping
    Scraping Websites with Various Scraper Frameworks (With Examples)

    Scraping Websites with Various Scraper Frameworks (With Examples)

    Scraping Table Data into JSON File with Puppeteer

    Scraping Table Data into JSON File with Puppeteer

    How To Scrape Multiple Pages With Puppeteer and JavaScript

    How To Scrape Multiple Pages With Puppeteer and JavaScript

    How To Submit a Form With Puppeteer and JavaScript

    How To Submit a Form With Puppeteer and JavaScript

    How to Automate Form Submission with Puppeteer and Javascript

    How to Automate Form Submission with Puppeteer and Javascript

    How To Take Screenshot With Puppeteer

    How To Take Screenshot With Puppeteer

    Web Scraping With JavaScript and Puppeteer

    Web Scraping With JavaScript and Puppeteer

  • How To?
    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    Facebook: The Engine Behind the World’s Largest Social Network

    Facebook: The Engine Behind the World’s Largest Social Network

    Snapchat: The System Powering Snaps, Stories, and Lenses

    Snapchat: The System Powering Snaps, Stories, and Lenses

    How WhatsApp Powers Instant Communication for Billions

    How WhatsApp Powers Instant Communication for Billions?

    How Instagram Scales to Billions of Photos and Videos Daily

    How Instagram Scales to Billions of Photos and Videos Daily?

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    Effective Structure for Describe Image, Retell Lecture, and Summarize Spoken Text in PTE Academic

    Effective Structure for Describe Image, Retell Lecture, and Summarize Spoken Text in PTE Academic

    Understanding Marks Allocation in PTE Academic Tasks

    Understanding Marks Allocation in PTE Academic Tasks

    From Power-On to Productivity: How Your Operating System Comes to Life

    From Power-On to Productivity: How Your Operating System Comes to Life

  • Technology
    A Deep Dive into Physical Storage Devices: From Floppy Disks to Modern SSDs

    A Deep Dive into Physical Storage Devices: From Floppy Disks to Modern SSDs

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    The WordPress Drama- Open Source at War

    The WordPress Drama: Open Source at War

    25 Software Bugs That Changed the World

    25 Software Bugs That Changed the World

    The Plan to Break Apart Google- RIP Chrome_

    The Plan to Break Apart Google: RIP Chrome?

    Guide to Googling

    Guide to Googling

    How Video Streaming on the Internet Works

    How Video Streaming on the Internet Works

    Git : A short history

    Git : A short history

    A Walkthrough to Blockchain

    A Walkthrough to Blockchain

  • System Design
    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    How LinkedIn Powers Professional Networking: A Look Inside Its System Design

    Facebook: The Engine Behind the World’s Largest Social Network

    Facebook: The Engine Behind the World’s Largest Social Network

    Snapchat: The System Powering Snaps, Stories, and Lenses

    Snapchat: The System Powering Snaps, Stories, and Lenses

    How WhatsApp Powers Instant Communication for Billions

    How WhatsApp Powers Instant Communication for Billions?

    How Instagram Scales to Billions of Photos and Videos Daily

    How Instagram Scales to Billions of Photos and Videos Daily?

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How Netflix Delivers High-Quality Streaming to Millions Worldwide

    How YouTube Handles Billions of Videos Daily

    How YouTube Handles Billions of Videos Daily?

    How Spotify Streams Personalized Music to Millions in Real-Time?

    How Spotify Streams Personalized Music to Millions in Real-Time?

Submit Post
No Result
View All Result
Blogue
  • Homepage
  • Web
  • Web Developer
  • Web Scraping
  • How To?
  • Technology
  • System Design

How To Scrape Multiple Pages With Puppeteer and JavaScript

The Blogue Team by The Blogue Team
October 27, 2023
in Coding, How To?, JavaScript, Programming, Puppeteer, Web Scraping
Reading Time: 4 mins read
365
A A
0
369
SHARES
1.2k
VIEWS
Share on FacebookShare on XShare on RedditShare on Whatsapp

Google designed Puppeteer to provide a simple yet powerful interface in Node.js for automating tests and various tasks using the Chromium browser engine. It runs headless by default, but it can be configured to run full Chrome or Chromium.

The API build by the Puppeteer team uses the DevTools Protocol to take control of a web browser, like Chrome, and perform different tasks, like:

  • Snap screenshots and generate PDFs of pages
  • Automate form submission
  • UI testing (clicking buttons, keyboard input, etc.)
  • Scrape a SPA and generate pre-rendered content (Server-Side Rendering)

Most actions that you can do manually in the browser can also be done using Puppeteer. Furthermore, they can be automated so you can save more time and focus on other matters.

Puppeteer was also built to be developer-friendly. People familiar with other popular testing frameworks, such as Mocha, will feel right at home with Puppeteer and find an active community offering support for Puppeteer. This led to massive growth in popularity amongst the developers.

Of course, Puppeteer isn’t suitable only for testing. After all, if it can do anything a standard browser can do, then it can be extremely useful for web scrapers. Namely, it can help with executing javascript code so that the scraper can reach the page’s HTML and imitating normal user behavior by scrolling through the page or clicking on random sections.

These much-needed functionalities make headless browsers a core component for any commercial data extraction tool and all but the most simple homemade web scrapers.

First and foremost, make sure you have up-to-date versions of Node.js and Puppeteer installed on your machine. If that isn’t the case, you can follow the steps below to install all prerequisites.

You can download and install Node.js from here. Node’s default package manager npm comes preinstalled with Node.js.

To install the Puppeteer library, you can run the following command in your project root directory:

npm install puppeteer



# or "yarn add puppeteer"

Note that when you install Puppeteer, it also downloads the latest version of Chromium which is guaranteed to work with the API.

w will use the /r/learnprogramming subreddit for this article. So we want to navigate to the website, and grab the title and URL for every post. We’ll use the evaluate() method for that.

The code should look like this:

const puppeteer = require('puppeteer')

async function tutorial() {
   try {
       const URL = 'https://old.reddit.com/r/learnprogramming/'const browser = await puppeteer.launch()
       const page = await browser.newPage()

       await page.goto(URL)
       let data = await page.evaluate(() => {
           let results = []
           let items = document.querySelectorAll('.thing')
           items.forEach((item) => {
               results.push({
                   url: item.getAttribute('data-url'),
                   title: item.querySelector('.title').innerText,
               })
           })
           return results
       })

       console.log(data)
       await browser.close()

   } catch (error) {
       console.error(error)
   }
}

tutorial()

Using the Inspect method presented earlier, we can grab all the posts by targeting the .thing selector. We iterate through them, and for each one, we get the URL and the title and push them into an array.

After the entire process is completed, you can see the result in your console.

Screenshot 2021 07 23 at 18.02.50

Great, we scraped the first page. But how do we scrape multiple pages of this subreddit?

It’s simpler than you think. Here’s the code:

const puppeteer = require('puppeteer')

async function tutorial() {
   try {
       const URL = 'https://old.reddit.com/r/learnprogramming/'const browser = await puppeteer.launch({headless: false})
       const page = await browser.newPage()

       await page.goto(URL)
       let pagesToScrape = 5;
       let currentPage = 1;
       let data = []
       while (currentPage <= pagesToScrape) {
           let newResults = await page.evaluate(() => {
               let results = []
               let items = document.querySelectorAll('.thing')
               items.forEach((item) => {
                   results.push({
                       url: item.getAttribute('data-url'),
                       text: item.querySelector('.title').innerText,
                   })
               })
               return results
           })
           data = data.concat(newResults)
           if (currentPage < pagesToScrape) {
               await page.click('.next-button a')
               await page.waitForSelector('.thing')
               await page.waitForSelector('.next-button a')
           }
           currentPage++;
       }
       console.log(data)
       await browser.close()
   } catch (error) {
       console.error(error)
   }
}

tutorial()

We need a variable to know how many pages we want to scrape and another variable for the current page. While the current page is less than or equal to the number of pages that we want to scrape, we grab the URL and title for each post on the page. After each page is harvested, we concatenate the new results with the ones already scraped.

Then we click the next page button and repeat the scraping process until we reach the desired number of extracted pages. We also need to increment the current page after each page.

Happy Coding …

Tags: JavaScriptNodejsscraping
Share148Tweet92ShareSend
Previous Post

Guide to Googling

Next Post

A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

The Blogue Team

The Blogue Team

We are a team of passionate bloggers

Related Posts

How LinkedIn Powers Professional Networking: A Look Inside Its System Design
How To?

How LinkedIn Powers Professional Networking: A Look Inside Its System Design

by The Blogue Team
December 18, 2024
1.1k
Facebook: The Engine Behind the World’s Largest Social Network
Facebook

Facebook: The Engine Behind the World’s Largest Social Network

by The Blogue Team
December 17, 2024
1.1k
Snapchat: The System Powering Snaps, Stories, and Lenses
How To?

Snapchat: The System Powering Snaps, Stories, and Lenses

by The Blogue Team
December 16, 2024
1k
Load More
Next Post
A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

A Walkthrough to Hashing, Salting , and Verifying Passwords in NodeJS, Python, Golang, and Java

Where Writing Happens

  • Homepage
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms & Conditions
  • Sitemap

© 2025 Blogue - Designed By Najus Digital.

Welcome Back!

Sign In with Google
OR

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
Ads Blocker Image Powered by Code Help Pro

Ads Blocker Detected!!!

We have detected that you are using extensions to block ads. Please support us by disabling these ads blocker.

Refresh
No Result
View All Result
  • Login

© 2025 Blogue - Designed By Najus Digital.

*By registering into our website, you agree to the Terms & Conditions. and Privacy Policy.
Enable Notifications OK No thanks