How do I convert all my HTML files into Drupal, and how do I have the content (node) URLs reflect the old structure?

My site www.pinkpt.com has over 1,000 pages of content, and I want to convert them automatically into Drupal. Is there a way to do this? I've spent the last few days trying, to no avail, and the only option I could see was to do it manually. This brings me to my next problem.

When doing it manually, the page URLs are listed as ?q=node/3, ?q=node/4, ?q=node/5, etc. Short of mod_rewriting each number to its required page (e.g. 4 --> gameaguide), is there a way to set these page names? Can it be done automatically with reference to the HTML files from my previous site?

I've also set up the menu system, but I've run into the same problem. While I can set what address it goes to, I can't create content pages with the names I want.

Any help much appreciated.

Comments

sepeck’s picture

There is a path module that allows you to rename your URLs to whatever you need. Some of the recent larger migrations have used a variety of import scripts together with the path module to retain their existing paths.
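
To illustrate the idea of keeping the old paths, here is a rough Python sketch (not part of Drupal or the path module) that derives a clean alias from each old HTML filename and pairs it with a node path. The function names, the sample filenames, and the assumption that node ids follow import order are all hypothetical; an import script would still have to load the resulting pairs into Drupal's aliases.

```python
from pathlib import PurePosixPath


def alias_for(old_path):
    """Turn an old static path such as '/guides/page.html' into a clean alias."""
    p = PurePosixPath(old_path.lstrip("/"))
    if p.suffix.lower() in (".html", ".htm"):
        p = p.with_suffix("")
    # An index page maps to its folder ('guides/index.html' -> 'guides').
    if p.name == "index":
        p = p.parent
    return "" if str(p) == "." else str(p)


def build_alias_table(old_paths, first_nid=1):
    """Pair each old path with a node path.

    Assumption: pages are imported in list order and receive consecutive
    node ids starting at first_nid. Real ids must come from the import itself.
    """
    return [
        (f"node/{nid}", alias_for(path))
        for nid, path in enumerate(old_paths, start=first_nid)
    ]


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    for node_path, alias in build_alias_table(
        ["news.html", "gameaguide.html", "guides/index.html"], first_nid=3
    ):
        print(node_path, "->", alias)
```

Each pair is one alias to create, so the old address keeps working without a separate rewrite rule per page.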

As for how to import such a large existing db, you'll have to read through some of the other site migrations that have been posted recently for hints and clues... or hopefully a wandering dev who has done this will post.

-sp
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

-Steven Peck

dman’s picture

I had the same challenge, and started work on:

Import_HTML

Facility to import an existing, static HTML site structure into Drupal Nodes.

This is done by allowing an admin to define a source directory of a traditional HTML website, and importing (as much as possible) the content and structure into a Drupal site.

Files will be absorbed completely, and their existing cross-links should be maintained, whilst the standard headers, chrome and navigation blocks should be stripped and replaced with Drupal equivalents. Old structure will be inferred and imported from the old folder hierarchy.


I haven't got CVS permissions, and that version of the code has a whole heap of debug info spewed out at you, but might be worth a try.
  • You need XSL support in PHP on the server
  • If you want to strip the chrome and stuff from the old pages automatically, rather than just 'wrapping' them, you may need to tweak the XSL stylesheet
  • It's only been tested on some limited samples, so there's probably lots that can go wrong
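
For readers who want to see what the extraction step involves, here is a minimal sketch. It is not the Import_HTML code and does not use XSL; it swaps in Python's standard html.parser to pull the title and the raw body markup out of one old page. The class and function names are made up for this example, and stripping site-specific chrome (headers, nav blocks) would still need rules tailored to the old templates.

```python
from html.parser import HTMLParser


class TitleBodyExtractor(HTMLParser):
    """Collect the <title> text and the markup found inside <body>."""

    def __init__(self):
        # Keep entities such as &amp; untouched so body markup stays valid.
        super().__init__(convert_charrefs=False)
        self.title = ""
        self.body_parts = []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif self._in_body:
            self.body_parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/> are copied once, as written.
        if self._in_body:
            self.body_parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif self._in_body:
            self.body_parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body:
            self.body_parts.append(data)

    def handle_entityref(self, name):
        if self._in_body:
            self.body_parts.append(f"&{name};")

    def handle_charref(self, name):
        if self._in_body:
            self.body_parts.append(f"&#{name};")


def extract_title_and_body(html_text):
    """Return (title, body_markup) for one static page."""
    parser = TitleBodyExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), "".join(parser.body_parts).strip()
```

The title would become the node title and the body markup the node content; comments and anything outside the body are dropped by this sketch.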

... another option is an on-the-fly wrapper that wraps drupal nav around static pages that stay static. I have a code example for that too...
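
As a rough illustration of the wrapper idea (this is not dman's code, and it is written in Python rather than PHP purely as a sketch), the function below takes the contents of a static page and places its body inside a new page next to a navigation block. The function name, the div ids, and the nav_html argument are all hypothetical; nav_html stands in for whatever navigation markup the Drupal site would render.

```python
import re

# Grabs everything between <body ...> and </body>, across lines.
_BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def wrap_static_page(static_html, nav_html):
    """Return a new page with nav_html placed around the old page's body."""
    match = _BODY_RE.search(static_html)
    # Fall back to the whole file if the old page has no <body> tag.
    content = match.group(1) if match else static_html
    return (
        "<html><body>\n"
        f'<div id="nav">{nav_html}</div>\n'
        f'<div id="content">{content.strip()}</div>\n'
        "</body></html>"
    )
```

The static files stay untouched on disk; only the response sent to the browser is wrapped.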

.dan.

tesla.nicoli’s picture

Never tried it but looks like it might work for you.

http://www.hivemindz.com/project/staticHTML

gilcot’s picture