webscraper

Description:
  • Scrapes a URL and extracts details using Puppeteer

Methods

(async, static) run(options)

Description:
  • Runs the scraper for the given URL and extracts details

    Properties set in the options:

    • {string} title - page title
    • {string} logo - first detected logo
    • {object} meta - meta tags
    • {object} event - from LD+JSON
    • {object} webpage - from LD+JSON
    • {object} company - company from LD+JSON
    • {object} ical - first event found via iCal links
    • {string[]} logos - first 5 detected logos

    Files stored under options.root

    • {file} page.png - first page screenshot
    • {file} scroll.png - full page scrolled screenshot
    • {file} full.png - screenshots of the first and fully scrolled down pages
    • {file} page.html - HTML content
    • {file} page.txt - body innerText
    • {file} logo.png - downloaded logo

    Different sources of information are supported:

    • DOM
    • meta tags
    • LD+JSON scripts
    • iCal links
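
    To make the LD+JSON source concrete, here is a minimal sketch of pulling
    structured data out of raw HTML. It is not the module's implementation:
    extractLdJson is a hypothetical helper, it works on an HTML string rather
    than a live Puppeteer page, and the schema.org type names used in the demo
    (Event, WebPage, Organization) are assumptions, since this page does not
    say which types feed event, webpage and company.

```javascript
// Sketch only: extractLdJson is a hypothetical helper, not part of webscraper.
// It collects <script type="application/ld+json"> blocks from an HTML string
// and keeps the first object seen for each @type.
function extractLdJson(html) {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const found = Object.create(null); // no prototype, so any @type is a safe key

  for (const match of html.matchAll(pattern)) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch {
      continue; // skip malformed blocks instead of failing the whole page
    }
    // A block may hold a single object or an array of objects.
    const items = Array.isArray(data) ? data : [data];
    for (const item of items) {
      const type = item && item["@type"];
      if (typeof type === "string" && !(type in found)) {
        found[type] = item;
      }
    }
  }
  return found;
}

// Demo input (made up for illustration).
const sampleHtml = `
  <script type="application/ld+json">
    {"@type": "Event", "name": "Launch Party"}
  </script>
  <script type="application/ld+json">
    [{"@type": "WebPage", "name": "Home"}, {"@type": "Organization", "name": "Acme"}]
  </script>`;

const ld = extractLdJson(sampleHtml);
console.log(Object.keys(ld));
```

    Keeping only the first object per type mirrors the "first event" and
    "first logo" wording above, but that choice is an assumption of this sketch.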
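Parameters:
  Name     Type    Description
  options  object  Scraper options; the extracted properties are also set on this object

  Properties of options:
  Name  Type    Description
  url   string  URL of the page to scrape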
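
A call to run(options) might look like the sketch below. It assumes only what this page states: run is async, takes an options object with url and root, and writes its results back onto that object. How the webscraper object is imported is not shown here, so the sketch takes it as a parameter; scrapeSummary and standInScraper are hypothetical names, and the stand-in exists only so the example runs without Puppeteer.

```javascript
const path = require("path");

// Sketch only: `scraper` stands for whatever object exposes the static
// run(options) documented above.
async function scrapeSummary(scraper, url, root) {
  const options = { url, root };
  await scraper.run(options); // results are written back onto `options`
  return {
    title: options.title,
    logo: options.logo,
    logos: options.logos || [],
    // page.png is listed above as a file stored under options.root
    screenshot: path.join(root, "page.png"),
  };
}

// Hypothetical stand-in that mimics the documented behaviour with fixed data.
const standInScraper = {
  async run(options) {
    options.title = "Example Domain";
    options.logos = [];
  },
};

scrapeSummary(standInScraper, "https://example.com", "out").then((summary) => {
  console.log(summary.title, summary.screenshot);
});
```

Whether run also returns a value is not stated in this page, so the sketch reads everything from the options object.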