Import HTML logo

Import an existing static HTML site structure into the Drupal CMS as structured nodes!

Allows an admin to define a source directory of an existing traditional static HTML website and import (as much as possible) its content and structure into a Drupal site.
Source files are stripped of existing chrome and navigation elements before being inserted as nodes.

See import_html_help.htm for a fairly detailed overview of import_html features.

  • Maintain old URLs
  • Re-create menu structure
  • Validate & improve markup automatically
  • Import Metadata - old dates, keywords, descriptions
  • Additional custom fields - Import old semantics to multiple CCK fields!
  • Operate over thousands of documents

Read a case study or a walkthrough

This module has no public face at all - it's purely admin. In fact, once the import is done you should turn it off again.

Important: Requirements

Because of the number of settings, this is not just a point-and-go module. You also need:

  • XML/XSLT support on the server. Check the output of phpinfo(); if it mentions either XSL or XSLT anywhere, you're fine.
    PHP4 support is being dropped in the Drupal 6 version.
  • HTMLTidy - either the PHP module or the command-line version.
    Update: there is now an automatic installer for HTMLTidy bundled in for Linux hosts. There are at least three flavours of tidy extension for PHP, not including the command-line alternative. The version distributed with the PHP5 binaries has been targeted; the PECL one can be made to work with some tweaks.
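
As a rough pre-flight check for the two requirements above, the sketch below looks for an XSL or tidy extension in the output of `php -m` and for a command-line `tidy` on the PATH. It is not part of the module: the function names are hypothetical, and it assumes the command-line PHP is configured like the one your web server uses, which is not always true.

```python
import shutil
import subprocess


def parse_php_modules(text):
    """Return the lowercase module names listed in `php -m` output."""
    modules = set()
    for line in text.splitlines():
        name = line.strip()
        # Skip blank lines and section headers such as "[PHP Modules]".
        if name and not name.startswith("["):
            modules.add(name.lower())
    return modules


def check_requirements(modules, tidy_on_path):
    """Report whether XSL support and some form of tidy appear to be present."""
    has_xsl = bool(modules & {"xsl", "xslt"})
    has_tidy = "tidy" in modules or tidy_on_path
    return {"xsl": has_xsl, "tidy": has_tidy}


def installed_php_modules():
    """Ask the command-line PHP for its module list (empty set if PHP is absent)."""
    try:
        result = subprocess.run(
            ["php", "-m"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return set()
    return parse_php_modules(result.stdout)


if __name__ == "__main__":
    tidy_cli = shutil.which("tidy") is not None
    report = check_requirements(installed_php_modules(), tidy_cli)
    for name, ok in report.items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

If either line reports "missing", check phpinfo() on the web server itself before concluding that the host cannot run the import.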

See the help document for details. Reading the walkthrough will illustrate what's possible with this.

Recent changes include better control of subdirectories for giant sites. You can now manage the import of thousands of documents without timing out: just do a subsection at a time.
... and even MORE recent changes run large imports as a batch process!
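
To illustrate the idea of working through a big site one subsection at a time, here is a minimal sketch that groups source paths by their top-level directory and splits each group into fixed-size batches. It only shows the general technique and is not how the module itself is implemented; the function names, the sample paths and the batch size are all made up for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_subsection(paths):
    """Group relative file paths by their top-level directory.

    Files sitting directly in the source root are grouped under ".".
    """
    groups = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        section = parts[0] if len(parts) > 1 else "."
        groups[section].append(path)
    return dict(groups)


def batches(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Hypothetical file list standing in for a real source directory.
    files = [
        "index.html",
        "about/team.html",
        "about/history.html",
        "news/2004/a.html",
    ]
    for section, members in group_by_subsection(files).items():
        for batch in batches(members, 50):
            print(section, len(batch))
```

Keeping each batch small means no single request has to process the whole site, which is the point of importing a subsection at a time.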
