Warning: anyone new to this module should switch to the AI module instead, where the AI Interpolator lives on as AI Automators, with improved functionality and more providers to come. The Simple Crawler submodule there gives you the same feature set as this module.

The AI Interpolator Simple Crawler module is a plugin for the AI Interpolator module that makes it possible to scrape the title, main image, raw HTML, or main content of a web page into a text field.

Features

  • Scrape a whole webpage, the main content, or a specific element.
  • Get the title.
  • Get the main image.
  • It scrapes using Guzzle, so it can only scrape server-side rendered pages. For more complex scraping, see the ScrapingBot module.
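
To illustrate why only server-side rendered pages work, here is a minimal Python sketch of title and main-image extraction from static HTML. It is not the module's code (the module is PHP and fetches pages with Guzzle). The class and function names, the sample markup, and the use of the `og:image` meta tag as the "main image" are all assumptions made for this example.

```python
from html.parser import HTMLParser


class TitleAndImageParser(HTMLParser):
    """Collects the <title> text and the og:image meta tag from static HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.main_image = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("property") == "og:image":
            # Assumption: the Open Graph image stands in for the "main image".
            self.main_image = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title_and_image(html):
    """Return (title, main_image_url) found in the raw HTML string."""
    parser = TitleAndImageParser()
    parser.feed(html)
    return parser.title.strip(), parser.main_image


# Hypothetical page: the title and image are in the delivered HTML,
# but the body text would only exist after a browser runs the script.
SAMPLE_HTML = """<html><head>
<title>Example article</title>
<meta property="og:image" content="https://example.com/cover.jpg">
</head><body><div id="app"></div>
<script>document.getElementById("app").textContent = "Rendered later";</script>
</body></html>"""

title, image = extract_title_and_image(SAMPLE_HTML)
print(title)
print(image)
```

In this sample, the title and image are recoverable because they are present in the HTML the server returns. The `#app` container stays empty, since a plain HTTP client never executes the script that would fill it.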

Important on upgrading to beta

If you ran an alpha release of this module, first run the following command:

    composer remove "fivefilters/readability.php"

Then remove the following from your composer.json:

    "repositories": [
        {
            "type": "vcs",
            "url": "https://github.com/compuccino/readability.php"
        }
    ],

Post-Installation

  1. When you create a long text field, the field config settings give you the option to enable the AI Interpolator on a link field and choose how you want to scrape it.