Import HTML
Import an existing static HTML site structure into the Drupal CMS as structured nodes!
Allows an admin to define a source directory of an existing traditional static HTML website, and import (as much as possible of) the content and structure into a Drupal site.
Source files will be stripped of existing chrome and navigation elements before being inserted as nodes.
See import_html_help.htm for a largish overview of import_html features.
- Maintain old URLs
- Re-create menu structure
- Validate & improve markup automatically
- Import Metadata - old dates, keywords, descriptions
- Additional custom fields - Import old semantics to multiple CCK fields!
- Operate over thousands of documents.
Read a case study or a walkthrough
D6 port almost here, but needs encouragement
Update May 2009: ALMOST ready to call it a numbered release.
Drupal 6 version in the pipeline - if wanted enough
If this is useful enough to you, please consider hurrying things along with an encouragement via ChipIn. It's easily $1000 worth of work to get it to a good state. But if enough people together think it's worth $500, then progress will actually happen!
If this can save 10 people $50 worth of time (it really really will) then we can all break even. In reality, this thing will save anyone 3 days to 2 weeks of copy-paste when used properly. What's that time worth to you? (If you object to a hard-working developer suggesting his time is worth more than $0.00 per hour, please just ignore this message.)
Follow progress in the D6 port thread here.
This module has no public face at all - it's purely admin. In fact, once it's done you should turn it off again.
Important: Requirements
Because of the number of settings, this is not just a point-and-go module. You also need:
- XML/XSLT support on the server. Check your phpinfo() output: if it says either XSL or XSLT anywhere, it's fine. PHP4 support is being dropped in the Drupal 6 version.
- HTMLTidy - Either with the PHP module or the commandline version.
Update: there is now an automatic installer for HTMLTidy bundled in for Linux hosts. There are at least three flavours of tidy extensions for PHP, not including the commandline alternative. The PHP5 binary distributed version has been targeted; the PECL one can be made to work with some tweaks.
See the help document for details. Reading the walkthrough will illustrate what's possible with this.
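As a rough way to check the two requirements above from a shell account, the sketch below looks for an XSL or XSLT entry in the output of `php -m` and for a commandline `tidy` binary on the PATH. It is only an illustration and not part of import_html: the helper names are made up for this example, and it assumes a `php` commandline binary is installed. The commandline PHP can be built differently from the one your web server runs, so the phpinfo() page remains the authoritative check.

```python
import shutil
import subprocess
from typing import Iterable


def lists_any(module_listing: str, names: Iterable[str]) -> bool:
    """Return True if any of `names` appears as a line in a PHP module listing."""
    found = {line.strip().lower() for line in module_listing.splitlines()}
    return any(name.lower() in found for name in names)


def php_module_listing() -> str:
    """Return the output of `php -m`, or "" when no php binary is on PATH."""
    php = shutil.which("php")
    if php is None:
        return ""
    result = subprocess.run([php, "-m"], capture_output=True, text=True, check=False)
    return result.stdout


def tidy_on_path() -> bool:
    """Return True if a commandline `tidy` binary is on PATH."""
    return shutil.which("tidy") is not None


if __name__ == "__main__":
    modules = php_module_listing()
    # Either name is acceptable, as the requirement above states.
    print("XSL/XSLT extension listed:", lists_any(modules, ("xsl", "xslt")))
    print("tidy PHP extension listed:", lists_any(modules, ("tidy",)))
    print("tidy commandline binary:", tidy_on_path())
```

If the XSL check fails here but phpinfo() shows XSL or XSLT, trust phpinfo(). For HTMLTidy, either the PHP extension or the commandline binary should be enough.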
Recent changes include better control of subdirectories for giant sites. Now you can manage the import of thousands of documents without timing out: just do a subsection at a time.
... even MORE recent changes do large imports as a batch process!
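To illustrate the "subsection at a time" idea, here is a minimal sketch that groups the HTML files of a source directory by top-level subdirectory and then splits each group into fixed-size batches. It is not the module's own code (import_html is a PHP module with its own batch handling). The function names, the `.htm`/`.html` filter and the batch size are assumptions made for this example.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterator, List


def group_by_subsection(source_root: Path) -> Dict[str, List[Path]]:
    """Group .htm/.html files under source_root by top-level subdirectory."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(source_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in {".htm", ".html"}:
            relative = path.relative_to(source_root)
            # Files sitting directly in the root form their own "." section.
            section = relative.parts[0] if len(relative.parts) > 1 else "."
            groups[section].append(relative)
    return dict(groups)


def in_batches(files: List[Path], size: int) -> Iterator[List[Path]]:
    """Yield consecutive slices of `files`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(files), size):
        yield files[start:start + size]


if __name__ == "__main__":
    # Hypothetical usage: report how the current directory would be split up.
    for section, files in group_by_subsection(Path(".")).items():
        batch_count = sum(1 for _ in in_batches(files, 50))
        print(f"{section}: {len(files)} files in {batch_count} batches")
```

Each batch stays small enough to finish within one request, which is the point of importing a large site in pieces.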
Releases
| Official releases | Date | Size | Links | Status |
|---|---|---|---|---|
| 5.x-1.2 | 2007-May-01 | 93.31 KB | Download · Release notes | Recommended for 5.x |

| Development snapshots | Date | Size | Links | Status |
|---|---|---|---|---|
| 6.x-1.x-dev | 2009-Jul-01 | 170.93 KB | Download · Release notes | Development snapshot |
| 5.x-1.x-dev | 2008-Feb-01 | 98.83 KB | Download · Release notes | Development snapshot |
| 4.7.x-1.x-dev | 2006-Nov-13 | 68.35 KB | Download · Release notes | Development snapshot |
