Demo works, import doesn't
TheoRichel - November 1, 2007 - 10:30
| Project: | Import HTML |
| Version: | 5.x-1.2 |
| Component: | Miscellaneous |
| Category: | support request |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | active |
Description
Hi,
First, my compliments on what must have been a lot of work, which you present as if you were only playing. I am not there yet, though: at www.groenerekenkamer.com I would like to import what is on another server, www.groenerekenkamer.nl.
When I put this URL, http://www.richel.org/grk/bookshop/kroonenberg/index.html, into the Demo, I can see how it gets imported (though 'Submitting' has no effect there).
But when I then try to import that page for real, the regular way, it cannot find anything to import.
Do you have a suggestion about what is going wrong?
Thanks in advance,
Theo Richel
www.richel.org/resume

#1
The demo doesn't save/submit. That's a known problem: it broke with the D5 forms, but it hasn't been worth fixing, because the real point of the module is to import the structure, and you just can't do that well page by page. The demo is a throw-away preview used to check that the text is semantic enough to work with. The fact that it runs on remote URLs may have got your hopes up, but that was mainly there so I could quickly try the template patterns on a heap of selected sites.
The module doesn't include a full remote site-spider either. It's imaginable (it's on the wishlist), but a really big job.
What you want to do is follow the walkthrough.htm instructions - that's got exactly what you want step-by-step. :-)
To reconstruct a full site, you must have local access to a complete file mirror of that site, and run the import from there. Just zip/unzip or wget the old site into a temp directory, then proceed.
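As a rough illustration of the wget step, here is a minimal Python sketch that builds and runs a mirroring command. It is not the command from the walkthrough; the flag selection, the function names, the example URL and the /tmp/oldsite directory are all my own assumptions, so adjust them to your site.

```python
import shlex
import subprocess


def build_mirror_command(start_url, target_dir):
    """Return a wget argument list that mirrors start_url into target_dir.

    The flag selection is an assumption, not the walkthrough's command.
    """
    return [
        "wget",
        "--mirror",            # recursive download with timestamping, unlimited depth
        "--convert-links",     # rewrite links so they work in the local copy
        "--page-requisites",   # also fetch images and stylesheets the pages use
        "--adjust-extension",  # add an .html suffix to HTML pages (older wget: --html-extension)
        "--no-parent",         # never climb above the starting directory
        f"--directory-prefix={target_dir}",  # where the local copy is written
        start_url,
    ]


def mirror_site(start_url, target_dir):
    """Run the mirror; needs wget on the PATH and network access."""
    subprocess.run(build_mirror_command(start_url, target_dir), check=True)


if __name__ == "__main__":
    # Hypothetical URL and directory, printed rather than executed.
    print(shlex.join(build_mirror_command("http://www.example.com/", "/tmp/oldsite")))
```

Printing the command first lets you check it before calling mirror_site(), which may run for a while on a large site.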
#2
Thanks for the quick reply. I did the walkthrough, but to me it wasn't clear that the files had to be local. I did read what you said about spiders and so on, but then got confused by the way the demo works. Anyway, I have now tried it the way it should be done, and it does indeed work, though I do not think that the database-generated pages on my former site can be imported this way, or can they?
#3
They can indeed, if you follow the wget method to copy the site into a static snapshot first.
The command given in the middle of the walkthrough will do all the spidering for you (it may take 20 minutes). wget (or whichever site-spider you prefer) handles the tricky URL-parsing and link-following that is needed.
import_html takes it from there.
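Before starting the import, it may be worth checking that the snapshot really contains the pages you expect, database-generated ones included. The sketch below only lists the HTML files in a local directory. The function name and the assumption that the pages of interest end in .htm or .html are mine, not something the module requires.

```python
from pathlib import Path


def list_html_pages(snapshot_dir):
    """Return sorted relative paths of the .htm/.html files under snapshot_dir."""
    root = Path(snapshot_dir)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")  # walk the whole snapshot tree
        if p.is_file() and p.suffix.lower() in {".htm", ".html"}
    )


if __name__ == "__main__":
    # Hypothetical snapshot location; use wherever you mirrored the old site.
    for page in list_html_pages("/tmp/oldsite"):
        print(page)
```

If pages you know existed on the old site are missing from the list, the spider probably never found a link to them.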