By zeitenflug on
Following up on my "memory exhausted" problem (see other post), I'd like to know if there's a module that lets me convert my dynamic site to a quasi-static one. I have 4800 nodes and 15000 taxonomy terms that only change once or twice per day. So it would be enough to run a cron script that converts the node and taxonomy pages to static ones once a day, which would also circumvent Drupal's memory limit problem. I'd like to keep one part of the site dynamic, however: the part that allows users to log in and create new nodes.
What tools are there that could help me achieve this?
Comments
Static Drupal Site
Check out the handbook page on creating a static archive of a Drupal site. It has step-by-step instructions for what you want to do.
Httrack
I suggest using HTTrack: http://www.httrack.com
HTTrack
HTTrack is mentioned on the book page. However, the process involves more than just running HTTrack.
What Else
What else is involved, in your experience with HTTrack? This looks like a valuable tool.
I haven't had time to explore it yet but would like to hear your input.
Thanks,
t4him
Read The Book Page
Read the book page I linked above; it explains the process step by step.
It would be really nice to do this on an ongoing basis...
The problem with the very interesting document referenced in the answers is that it does not really answer the question posed here, IMHO.
How wonderful it would be to have some method, not of replacing a Drupal site with a static one, but of combining the two: Drupal would be used for site maintenance and updates, and some mechanism would then publish the Drupal site as flat HTML files. The trouble is that this would be a pretty major undertaking, and once the site reached any size I doubt a PHP cron job could do it. You would need some custom programming not constrained by PHP timeouts...
I believe this is done in Typo3, and I know it is a feature of the commercial CMS Tridion, but I imagine it would be a pretty major undertaking to introduce it to Drupal: just looking at the document about creating a static site gives you an idea of the problems involved. Plus, in reality you would probably only want a partially static site...
You can try using a file based cache approach
You can try an approach similar to the one described in "A file based cache utility for Drupal": http://drupal.org/node/29970
I used a similar approach, only for testing, on one of my sites (audiocast.it), where I am using the pathauto module.
The basic concept is that I have a directory called "static" under the drupal/apache root directory where I store cached versions of pages. The rewrite directives rewrite the URL to point to the cached version ONLY if the cached file exists.
So, for example, http://www.audiocast.it/podlist will go to the file /var/drupal-4.7.0-beta3/static/podlist/index.html (if it exists; otherwise the request is handled by Drupal).
http://www.audiocast.it/mymenu/argomenti/Faq.html will go to the file /var/drupal-4.7.0-beta3/static/argomenti/Faq.html (if it exists; otherwise the request is handled by Drupal).
I can generate (or regenerate) the cached file version using a cron job.
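To make the "only if the cached file exists" rule concrete, here is a minimal Python sketch of the lookup. It is an illustration, not the rewrite directives themselves: the function name `cached_file_for` and the rule "paths without a file extension map to `index.html` inside a directory of the same name" are assumptions drawn from the podlist example, and the sketch does not reproduce the `mymenu` prefix handling seen in the second example.

```python
from pathlib import Path, PurePosixPath
from typing import Optional
from urllib.parse import urlsplit


def cached_file_for(url: str, static_root: Path) -> Optional[Path]:
    """Return the cached file for `url`, or None if Drupal should handle it.

    Hypothetical helper: mirrors the idea of rewriting to the static
    directory only when a cached copy is actually present.
    """
    path = PurePosixPath(urlsplit(url).path)
    # Drop the root marker and any ".." segments so the lookup cannot
    # escape the static directory.
    parts = [p for p in path.parts if p not in ("/", "..", ".")]
    if not parts:
        candidate = static_root / "index.html"  # front page
    elif path.suffix:
        candidate = static_root.joinpath(*parts)  # e.g. argomenti/Faq.html
    else:
        candidate = static_root.joinpath(*parts, "index.html")  # e.g. podlist
    # Fall through to Drupal (None) unless the cached file exists.
    return candidate if candidate.is_file() else None
```

With a cached copy at `static/podlist/index.html`, a request for `/podlist` resolves to that file, while a request for an uncached page such as `/node/42` returns `None` and would be passed on to Drupal.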
The following is a snippet of the .htaccess file
Valerio Di Giampietro
Audiocast.it
Re: Valerio's approach
Yep, thanks Valerio. That's what I meant. Your solution seems to be smarter than my quick hack. I didn't know about the available Drupal modules; I still have to get into Drupal. It seems well worth considering, though, given that I hope my site will grow.
Re: convert site - SOLVED
Thanks for your input. I wasn't looking for a way to ARCHIVE my site, but for a way to reduce the load it was causing. There are essentially two ways:
1 - Switch to Java. A decent Java framework lets you buffer MySQL requests. Basically it means that if node 42 is requested, there will be one database connection even if there are a hundred users, whereas PHP will open a new database connection for every page request. This may be an option in the future but not for now, because I am not really into Java yet.
2 - Run a Perl script that retrieves the node and taxonomy pages and saves the HTML, and modify Drupal. Modify it how? When Drupal gets a request for /node/20, instead of making database requests it reads the file /node/20 and dumps it to the screen, with various security measures in place.
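For illustration only, the retrieval half of option 2 could look roughly like the Python sketch below. It is not the Perl script mentioned in this post. The names `regenerate` and `fetch_html`, the `index.html` layout and the 30-second timeout are all assumptions. Each page is written to a temporary file first and then renamed, so a visitor never sees a half-written page while the cron job runs.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable
from urllib.request import urlopen


def fetch_html(url: str) -> bytes:
    """Download one rendered page from the live site."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def regenerate(
    base_url: str,
    paths: Iterable[str],
    static_root: Path,
    fetch: Callable[[str], bytes] = fetch_html,
) -> int:
    """Save each page under static_root/<path>/index.html; return the count."""
    written = 0
    for rel in paths:
        rel = rel.strip("/")
        html = fetch(f"{base_url.rstrip('/')}/{rel}")
        target = static_root / rel / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file in the same directory, then rename it
        # into place so readers only ever see a complete file.
        fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(html)
        os.replace(tmp_name, target)
        written += 1
    return written
```

A daily cron entry could then call something like `regenerate("http://example.org", ["node/20", "taxonomy/term/5"], Path("static"))`, where the URL and the paths are placeholders.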
When I asked whether there is a way of modifying Drupal as I described here, I wanted to know whether it is the node module that causes the load or other parts of Drupal. Let me clarify this: if the node module doesn't cause much load but the Drupal core does, then it would be a major undertaking. If it is the node module that tries to parse all taxonomy terms, replacing it would have been the way to go.
Anyway, I wrote a quick Perl script and a node replacement to test solution (2). In the end there was no measurable performance gain. Of course, a static page performs a little better, but my site is not yet big enough for that to matter.
Thus, the issue is resolved. There is no quick way of converting my site to a half-static one. I could, though, archive the whole site and run a minimal Drupal for the dynamic functions. I tried this too. I only needed to run wget on my site and I had my archive site. Then I changed my index page to index.html and modified index.html to point at a fresh Drupal install. All this Drupal install did was link back to the archived pages and provide a way to submit new pages. The Perl script then checks the new database, moves new nodes to the old database and outputs new HTML, which is saved into the static folders.
The only problem with solution 2 was that, although I had a static menu, the minimal Drupal produced links I didn't want, because in the end they led to empty pages (the new node database is always empty). So if I really wanted to keep that solution, I would have to go through the modules and rewrite the links, and I was too lazy for that. And, above all, the performance gain was only marginal.
Anyway, since I set memory_limit to 32M, Drupal has not crashed again. I did some reading and discovered nodes on drupal.org discussing bad taxonomy handling. So I have the impression that the issue will soon be resolved and there will be no need to convert to a static site in the near future.
Static page caching module
I'm presently working on a module that will allow you to convert your site into a static, or semi-static, version. Please see http://bendiken.net/2006/05/28/static-page-caching-for-drupal for more information.