The "Page to PDF" feature reads a webpage and generates a new template based on the page's content. The result is a high-quality PDF document that preserves the website's content, with a design and layout based on the main website but adjusted to suit the PDF format, optimising readability and printability.

This module handles the infrastructure: it connects to a PDF generator service API, which generates a PDF using Puppeteer, and saves the generated PDF to the specified field. It is up to you as a developer to configure the PDF view of the node appropriately, e.g. replacing videos, accordions, and other content not suited to PDF with fallback content. You can also add extras such as front and back covers to the PDF view of the node as desired.
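
As an illustration of the kind of fallback substitution meant here, the TypeScript sketch below swaps `<video>` and `<iframe>` elements for a static paragraph and expands `<details>` accordions. It is only a sketch under assumptions: the function name `prepareHtmlForPdf`, the `pdf-fallback` class, and the use of `<details>` for accordions are hypothetical, and on a real site this would more likely be handled in the node's PDF view mode or its templates than by string replacement.

```typescript
// Hypothetical helper: swaps interactive markup for print-friendly equivalents.
// Regex-based on purpose to stay dependency-free; not a general HTML parser.
function prepareHtmlForPdf(html: string, videoFallback: string): string {
  return html
    // Replace each <video> or <iframe> element with a static fallback paragraph.
    .replace(
      /<(video|iframe)\b[^>]*>[\s\S]*?<\/\1>/gi,
      () => `<p class="pdf-fallback">${videoFallback}</p>`,
    )
    // Expand <details> accordions that are not already open.
    .replace(/<details\b(?![^>]*\bopen\b)([^>]*)>/gi, "<details open$1>");
}

const sampleHtml =
  '<p>Intro</p><video src="a.mp4"><source src="a.webm"></video>' +
  "<details><summary>More</summary><p>Text</p></details>";
const printableHtml = prepareHtmlForPdf(sampleHtml, "Watch the video online.");
```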

Tools used

Puppeteer

This is a tool supported by Google that uses the Chrome browser to render the page content and generate the PDF from it. With this approach, most features of the webpage carry over to the PDF.
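
The rendering step described above can be sketched in TypeScript as follows. This is a minimal sketch, not this module's code: `PdfCapablePage` and `renderPageToPdf` are hypothetical names, and the interface covers only the subset of Puppeteer's `Page` methods (`goto` and `pdf`) that the sketch uses, so the example type-checks without Puppeteer installed.

```typescript
// Minimal structural type covering only the Puppeteer Page methods used below.
// A real Puppeteer Page object satisfies this shape.
interface PdfCapablePage {
  goto(url: string, options?: { waitUntil?: "networkidle0" }): Promise<unknown>;
  pdf(options?: {
    format?: "A4";
    printBackground?: boolean;
    preferCSSPageSize?: boolean;
  }): Promise<Uint8Array>;
}

// Hypothetical helper: load a URL in the browser page and render it to PDF bytes.
async function renderPageToPdf(
  page: PdfCapablePage,
  url: string,
): Promise<Uint8Array> {
  // Wait until network activity settles so late-loading assets are included.
  await page.goto(url, { waitUntil: "networkidle0" });
  // page.pdf() renders with print CSS media, so print styles and page breaks apply.
  return page.pdf({
    format: "A4",
    printBackground: true, // keep background colours and images
    preferCSSPageSize: true, // let a CSS @page size override the format
  });
}
```

With Puppeteer installed, the `page` argument would come from `puppeteer.launch()` followed by `browser.newPage()`. A hosted service runs the equivalent steps on its own infrastructure.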

PDF Generator services

These are managed services that allow us to automate web browsing tasks without a visible browser window. One of them is needed for the Page to PDF feature, so that Puppeteer can run programmatically without anyone manually interacting with a web browser. In effect, they provide Puppeteer as a managed service. Currently supported options: Doppio.sh and Browserless.io (see Additional Requirements).

Paged.js

An open-source library that automatically breaks the page apart into a grid, with the header, sides, footer, and middle as separate ‘cells’ within the grid. This allows the HTML to be styled appropriately for the PDF format.

Post-Installation

  1. Add your PDF Generator Service API key to settings.php
  2. Create a field in your node type to store the generated PDF
  3. Edit the node type and enable Page to PDF, selecting the new field as the target field

Additional Requirements

We recommend using Doppio.sh as the provider to run Puppeteer. It is very low cost and requires no additional server configuration, so it works well with hosts such as Pantheon, where you are not in control of your server. Browserless.io is also available; however, its prices have risen sharply, and under the pricing model introduced in late 2023 it tends to be unaffordable compared to Doppio.sh.
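
Whichever provider is chosen, calling a hosted rendering service generally comes down to an authenticated HTTP request that names the page to render. The TypeScript sketch below only builds such a request. Everything specific in it is an assumption for illustration: the `pdf-service.example` endpoint, the bearer-token header, and the JSON body shape are placeholders, not the documented API of Doppio.sh or Browserless.io, so consult the provider's documentation for the real values.

```typescript
// Shape of an HTTP request to a hosted rendering service.
interface RenderRequest {
  endpoint: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: the endpoint and JSON body shape are placeholders,
// not any provider's documented API.
function buildRenderRequest(
  endpoint: string,
  apiKey: string,
  pageUrl: string,
): RenderRequest {
  return {
    endpoint,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      "Content-Type": "application/json",
    },
    // Assumed payload: the page to load plus PDF rendering options.
    body: JSON.stringify({ url: pageUrl, pdf: { printBackground: true } }),
  };
}

const renderRequest = buildRenderRequest(
  "https://pdf-service.example/render", // placeholder endpoint
  "YOUR_API_KEY",
  "https://example.com/node/1",
);
```

Keeping the API key as a parameter rather than a constant mirrors the advice to store it in settings.php instead of in code.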

Similar projects

Other PDF generators rely on curated (X)HTML written specifically for the particular tool. wkhtmltopdf is read-only and no longer developed. Puppeteer, on the other hand, works by spinning up a headless Chrome browser and creating the PDF from that (while still respecting print-specific features such as page breaks), so it is generally not limited in terms of HTML and CSS support.

Supporting organizations: 
Sponsored initial development and handles continued maintenance
Helped refine the functionality together with Soapbox developers
