Monitor Visual Changes in Websites with Puppeteer and Resemble JS



Detect changes in websites by taking screenshots and comparing them via NodeJS

May 5 · 7 min read


I recently wrote an article about detecting changes in websites via Visualping. It is a free tool, but its free tier has a few limitations. I was also eager to build a tool from scratch that compares two screenshots of a website taken at different times, since doing so would teach me a lot. This article guides you through creating an application that takes a screenshot of a specific HTML element in a webpage, compares it with the previous screenshot of that element, and reports whether anything has changed. If you want a screenshot of the whole web page, you can simply pass html as the selector.

Although this is not a fully implemented solution, it covers the most difficult part of the process. It is also simple and interesting enough to grow into a portfolio project.

At the end of the tutorial, I will guide you on how to take this application to the next level. The final code of this tutorial is also given at the end of the tutorial.

Monitoring Changes in UI Components

If you are interested in monitoring visual changes in a project you are working on as part of a team, this post is not for you. For that, use tools like Bit.dev.

Bit.dev is a cloud component hub. It is a place to publish, document and organize your team's UI components. It is used for building design systems, and also for sharing and reusing components across repositories.

Whenever a component is created or updated (and published), you get notified with a link to the component page on Bit.dev, where you can see it rendered "live". This way you can keep track of visual changes in multiple projects.

We need to follow the below steps to achieve our goal.

Take a screenshot of an HTML element in a website
Compare that screenshot with a previous screenshot of that element if it exists
Get the difference and the difference percentage

Step 1 — Install the necessary packages

npm i puppeteer resemblejs mz

fs is built into Node.js, so it does not need to be installed from npm.

Step 2 — Import the necessary packages

const puppeteer = require('puppeteer');
const fs = require('fs');
const compareImages = require("resemblejs/compareImages");
const fsz = require("mz/fs");

const siteName = 'nameOfSite'; // I used a popular movie download site

Step 3 — Create directories

Create a folder named screenshots in the project root. This is where our screenshots will be saved. Although this is not the ideal approach, we will use it here because it is simple and easy. I will suggest a better approach at the end of this tutorial.
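If you would rather not create the folder by hand, a small sketch like the one below could do it when the program starts. The helper name ensureScreenshotDir is my own and is not part of the original code. With recursive: true, fs.mkdirSync does nothing when the folder already exists.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: create the screenshots folder if it is missing
// and return its full path.
function ensureScreenshotDir(root = process.cwd()) {
  const dir = path.join(root, 'screenshots');
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}
```

Calling ensureScreenshotDir() once before taking the first screenshot should be enough.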

Step 4 — Create the function to take a screenshot of a specific element in a web page

async function screenshotDOMElement(page, opts = {}) {
  const padding = 'padding' in opts ? opts.padding : 0;
  const path = 'path' in opts ? opts.path : null;
  const selector = opts.selector;

  if (!selector)
    throw Error('Please provide a selector.');

  const rect = await page.evaluate(selector => {
    const element = document.querySelector(selector);
    if (!element)
      return null;
    const { x, y, width, height } = element.getBoundingClientRect();
    return { left: x, top: y, width, height, id: element.id };
  }, selector);

  if (!rect)
    throw Error(`Could not find element that matches selector: ${selector}.`);

  return await page.screenshot({
    path,
    clip: {
      x: rect.left - padding,
      y: rect.top - padding,
      width: rect.width + padding * 2,
      height: rect.height + padding * 2
    }
  });
}

The function above takes a screenshot of any DOM element you specify. It clips the page to a rectangle that covers the element, plus optional padding on every side, and captures only that region.
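To make the clip arithmetic concrete, here is the same calculation pulled out into a standalone function. The name clipFromRect is hypothetical and only serves this illustration; the numbers are made up.

```javascript
// Hypothetical helper mirroring the clip arithmetic in screenshotDOMElement:
// padding moves the top-left corner outward and grows both dimensions.
function clipFromRect(rect, padding = 0) {
  return {
    x: rect.left - padding,
    y: rect.top - padding,
    width: rect.width + padding * 2,
    height: rect.height + padding * 2
  };
}

// An element at (100, 200) measuring 300x50, with 10px of padding:
console.log(clipFromRect({ left: 100, top: 200, width: 300, height: 50 }, 10));
// → { x: 90, y: 190, width: 320, height: 70 }
```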

Step 5 — Create a function to compare the differences between two images

async function getDiff() {
  const options = {
    output: {
      errorColor: {
        red: 255,
        green: 0,
        blue: 0
      },
      errorType: "diffOnly",
      largeImageThreshold: 1200,
      useCrossOrigin: false,
      outputDiff: true
    },
    scaleToSameSize: true,
    ignore: "antialiasing",
  };

  const data = await compareImages(
    await fsz.readFile(`./screenshots/${siteName}-new.png`),
    await fsz.readFile(`./screenshots/${siteName}-prev.png`),
    options
  );

  console.log(data.misMatchPercentage);
  await fsz.writeFile(`./screenshots/${siteName}-diff.png`, data.getBuffer());
}

This function compares the two screenshots of the site stored in the screenshots folder: the new version and the previous version. It saves the difference image in the same folder and logs data.misMatchPercentage, the percentage by which the two images differ.
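The percentage on its own does not say whether a change matters, so you will probably want a threshold. The sketch below is one way to do that. The name hasChanged and the default of 1 percent are my own assumptions. As far as I know, Resemble.js reports misMatchPercentage as a string such as "0.15", so the helper converts it to a number first.

```javascript
// Hypothetical helper: decide whether a comparison counts as a change.
// `misMatchPercentage` may arrive as a string such as "0.15", so coerce it.
function hasChanged(misMatchPercentage, thresholdPercent = 1) {
  const value = Number(misMatchPercentage);
  if (Number.isNaN(value)) {
    throw new TypeError(`Not a number: ${misMatchPercentage}`);
  }
  return value > thresholdPercent;
}
```

Inside getDiff you could then call hasChanged(data.misMatchPercentage) and act only when it returns true.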

Step 6 — Write the entry point function

(async () => {
  const prevPath = `./screenshots/${siteName}-prev.png`;
  const newPath = `./screenshots/${siteName}-new.png`;

  // Check whether a previous screenshot already exists.
  const prevExists = fs.existsSync(prevPath);

  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Adjustments particular to this page to ensure we hit desktop breakpoint.
    await page.setViewport({ width: 1000, height: 900, deviceScaleFactor: 1 });
    await page.goto(`https://${siteName}.mx`, { waitUntil: 'networkidle0' });

    // The first run saves "prev"; every later run saves "new".
    const fileName = prevExists ? `${siteName}-new.png` : `${siteName}-prev.png`;

    // Rotate versions: the last "new" screenshot becomes "prev".
    // renameSync finishes before the next screenshot is written, avoiding a race.
    if (prevExists && fs.existsSync(newPath)) {
      fs.renameSync(newPath, prevPath);
    }

    await screenshotDOMElement(page, {
      path: `screenshots/${fileName}`,
      selector: 'div#popular-downloads',
      padding: 0
    });
  } finally {
    // Close the browser even if navigation or the screenshot fails.
    await browser.close();
  }

  if (fs.existsSync(newPath) && fs.existsSync(prevPath)) {
    console.log('file exists');
    await getDiff();
  }
})().catch(console.error);

Although this is a big function, I’ll explain it in pieces.

Initially, we check whether a previous screenshot of the website already exists in our folder. Then we create a new browser instance with Puppeteer and open a new page. We set the page dimensions, which you can adjust to your requirements. Next, we navigate to the website we want to screenshot.

Then we check whether a previous version of the file exists in order to name our screenshot accordingly. If we already have two versions of the screenshot, the next screenshot we take should be compared with the most recent previous one, because we want to see the differences between the latest versions of the website. The next piece of code does that work: it renames the existing screenshot so that the versions stay in order.
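The naming and renaming rule described above can be sketched as a standalone function. The name nextScreenshotPath is hypothetical. It uses the synchronous fs.renameSync so that the rename is guaranteed to finish before the next screenshot is written.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the path the next screenshot should be saved to,
// rotating the existing files first so the versions stay in order.
function nextScreenshotPath(dir, siteName) {
  const prev = path.join(dir, `${siteName}-prev.png`);
  const next = path.join(dir, `${siteName}-new.png`);

  // First run: there is nothing to compare against yet, so save as "prev".
  if (!fs.existsSync(prev)) return prev;

  // Later runs: the last "new" screenshot becomes "prev".
  if (fs.existsSync(next)) fs.renameSync(next, prev);
  return next;
}
```

On the first run this returns the -prev.png path, and on every run after that it returns the -new.png path.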

Once that is completed, we call the screenshotDOMElement function we wrote in step 4. I am monitoring for changes in the element with the id popular-downloads on a popular movie download website, so I pass div#popular-downloads as the selector. I also need to pass the location where the screenshot should be saved. This is built from the fileName variable discussed in the previous paragraph.

At the completion of this step, you will have the screenshot saved in your screenshots folder. We close the browser instance after this.

As the final step of this program, we check whether both previous and new versions of the screenshots exist, and then we call the getDiff function we wrote in step 5.

Congratulations!! You have now created a program to take screenshots of a webpage and compare the differences.

Here is the full code for you.

Conclusion

Although we built an application that can take screenshots and compare them with one another, this application cannot perform automatically on its own as it needs to be scheduled. Also, the website has been hardcoded and the application stores the screenshots in its directory, thereby increasing the size of the folder.

Here are my suggestions for you to take this application to a fully working condition.

Store the images in online storage such as Cloudinary or Firebase Storage. Make sure you keep track of the file location of these images.
Use a scheduler to invoke the screenshot function at a regular interval. You can use something like Heroku Scheduler.
Use a database to store the list of websites you need to keep track of, along with their relevant selectors. Take screenshots of these websites at the point of invocation. If you do not want to use a database, you can simply store the list of websites in a JSON file in your project directory.
Use a Telegram bot to send and receive data from the user. You can refer to this library here. You can receive messages that contain the new website and the selector to monitor, and you can send messages to the user when a page has changed by more than the threshold value.
You can add more functionality through the Telegram bot by adding more commands such as delete, get all websites, etc.
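As a starting point for the JSON file suggestion, here is a minimal sketch. The file layout, the field names (name, url, selector) and both function names are my own assumptions, not an established format. checkTarget stands for whatever screenshot-and-compare routine you build from the steps above.

```javascript
// Hypothetical shape for a targets file, for example targets.json:
// [{ "name": "example", "url": "https://example.com", "selector": "html" }]
function parseTargets(jsonText) {
  const targets = JSON.parse(jsonText);
  if (!Array.isArray(targets)) {
    throw new TypeError('Expected an array of targets.');
  }
  return targets.map(({ name, url, selector = 'html' }, i) => {
    if (!name || !url) {
      throw new TypeError(`Target ${i} needs a name and a url.`);
    }
    // Default to the whole page when no selector is given.
    return { name, url, selector };
  });
}

// Run `checkTarget` for every target, one at a time, and collect the results.
async function checkAll(targets, checkTarget) {
  const results = [];
  for (const target of targets) {
    results.push({ name: target.name, changed: await checkTarget(target) });
  }
  return results;
}
```

A scheduler could then read the file, pass its contents through parseTargets, and call checkAll on each run, sending a notification for every result where changed is true.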
