Hi all.
As of yesterday I've been pulled into converting an existing static site into Drupal. Lots of it looks lovely, especially the ease of install. Then, as many seem to have done, I got stalled when trying to create a one-to-one mapping between old content and Drupal 'nodes'. After much time in the handbook and archives, I'd like to share my conclusions. Yes, I think I've got the (or at least an) answer.
Can someone tell me if I've got the wrong end of the stick here?
Background in Information Theory
Drupal's much-lamented, non-traditional lack of hierarchical topic nodes can be stated thus (I think).
A taxonomy term doesn't actually have content, it is merely a label for the container. Hence, a taxonomy term is not a node, and clicking on a term takes you to a list, not a content block. This is Drupal's behaviour, and this is what many folk regard as a 'problem' ... as did I for the first few hours of experimentation and research.
When you think about it, this IS actually analogous to how a URL or filesystem operates.
In /parent/dir/subdir/ , subdir IS NOT CONTENT, it CONTAINS content. :-}
Meditate on this 'til you get it.
By rights, calling that URI should return a LIST. FTP servers, Gophers, File Explorers and olden-days webservers did just that. It's only convention that serves up the actual content contained in the file subdir/index.html (or whatever). Remember that the original purpose of the (aptly named) index.html file basically IS to provide a list (or index) of the CONTENT in that directory or section. [see note 2 below]
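To make the list-versus-index.html point concrete, here is a minimal Python sketch. It is only an illustration of the convention, not how any particular webserver is implemented, and the function name serve is made up for this example.

```python
from pathlib import Path


def serve(directory: Path) -> str:
    """Return what a request for a directory URI would hand back."""
    index = directory / "index.html"
    if index.is_file():
        # The convention: serve the content of the index file.
        return index.read_text()
    # "By rights": serve a plain list of what the directory contains.
    return "\n".join(sorted(entry.name for entry in directory.iterdir()))
```

Without an index.html the directory answers with a list of its children, which is roughly what clicking a taxonomy term does in Drupal.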
Bearing this in mind, we can still bend Drupal to our will.
Emulating Static Site Structure
First, enable module:taxonomy_context. This will allow us to display overview content for the terms. This is half the battle!
If you are thinking of a traditional website structure, create a base taxonomy reflecting your directory names. (My intention is to create a module that will automate this.)
Next, create content nodes representing your content files (excluding all index.html) and set them as children of the appropriate taxonomy terms.
[Note 1] index.html is a special case. The 'content' of your index.html files should be pasted into the 'description' of the appropriate taxonomy term.
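The mapping described in the steps above can be sketched in Python as a dry run over the old site's files. This is only a rough plan-builder under my own assumptions: plan_import is a hypothetical name, it touches no Drupal API at all, and it simply reports which directories would become terms, which index.html files would become term descriptions, and which other pages would become nodes.

```python
from pathlib import Path


def plan_import(root: Path):
    """Plan the import: directories -> terms, index.html -> term description,
    every other .html file -> a content node filed under its directory's term."""
    terms = {}   # term path -> description text ("" when there is no index.html)
    nodes = []   # (term path, node title) pairs
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_dir():
            terms.setdefault(rel.as_posix(), "")
        elif path.suffix == ".html":
            parent = rel.parent.as_posix()  # "." means the site root
            if path.name == "index.html":
                # Special case: the index page becomes the term's description.
                terms[parent] = path.read_text()
            else:
                nodes.append((parent, path.stem))
    return terms, nodes
```

Running it against a copy of the static site gives a list you can eyeball before creating anything by hand.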
I think the actual rendering of the taxonomy_context page needs tweaking, but the concept of listing links to all its immediate children SHOULD be retained, although more options would be good.
(Listing an item's parent in its summary, when the summary is being displayed as part of the parent page and is hence a no-op, is overkill IMO.)
About now, your breadcrumbs should fit right, your sitemap will be the right shape and all is in place for intelligent URL-munging (which I won't write up yet).
Drawbacks
The biggest problem is that the index page is not editable as a node, hence it behaves a little differently for users/editors. But the technical difference between "editing" a node and "administering" a term is not great.
Some tweaking may be able to make the interfaces more similar. I guess user comments and syndication stuff are also disabled for these pseudo-pages. I'm just trying to emulate a single-editor static site for now, so I don't miss that much.
There may be deeper underlying issues I'm not aware of. (do 'terms' get searched?)
[Note 2] A web-similar way of achieving this result would be to modify module:taxonomy_context or something to 'link' a topic with a default node.
References:
- One posting deep in drupal.org (by a German?) who explained set theory in a way I could finally grok - by mentioning "the class of all flowers" I think, with undertones of "Plato's Cave". Damned if I can find the post now :( - but I wish it was glued to the front page!
- link:GreenAsh's solution to useful pathnames
- link:An approach by Geoff Hankerson - requires coding
author:Dan Morrison coders.co.nz
... I also have plans to write a new import_html module, to automate this approach, but I'll post about that later. Why is there no HTML-->Drupal topic in "Converting to Drupal" ??!
So guys, hello, and what have I missed?
Comments
Nice writeup
Two other ways of doing basically the same thing would have been to use the book module and book hierarchy, or the menu module. In both of those cases you would have avoided the term-as-container problem. You might also look at the pathauto module as a way to build the URLs to look just like they used to on your old static site.
- Robert Douglass
-----
Rate the value of this post: http://rate.affero.net/robertDouglass/
I recommend CivicSpace: www.civicspacelabs.org
My sites: www.hornroller.com, www.robshouse.net
Still getting familiar with the rest of the modules
I saw a few drawbacks with what I've read about book modules - something about them not being extensible back into nodes at a lower level, and how the next-sibling sequence was actually implemented as a big long parent-child list. I may have misunderstood.
I'll see if that will suit our needs. Either way I want to automate the content import somehow.
I haven't even looked at the menu module. I wasn't aware it would help with my actual structure.
Pathauto is definitely to be slapped on the top after I've got this structure sorted out. If all goes well, I will be able to transparently replace a static site with a Drupal one without breaking any bookmarks!
.dan.
.dan. is the New Zealand Drupal Developer working on Government Web Standards
I found this an easy way to work as well
I usually use books for this same purpose, it gets me out of a lot of tight corners and brings information to the front in a structured format that I have complete control over and can change without too much energy, grief and time being spent.
Actually I use this so much that I have renamed "books" to "sections" (not to be confused with section.mod), since I tend to find sections an easier way to think in terms of structure. If you try to run both mods using this methodology you will get two sets of the features (add child page & printer friendly versions), since in fact this method of using books and sections is one and the same.
There are not too many things that can send me into an endless mental loop faster than working with structure.
There are so many on the fly choices with self-questioning results. I have found there is no perfect structure, just a better structure. I always seem to discover a better structure after I have already built one. No matter how much I try to pre-plan, I feel data structure seems to define and redefine itself as content increases and the website matures.
Try out pathauto
I recall mikeryan implementing lists for path aliases. There was also a directory module many many many moons ago.
--
The future is so Bryght, I have to wear shades.
I want my nodes to have intrinsic structure. I think.
Yeah, Pathauto is part of the solution, later, but isn't it basically a cosmetic layer on the top?
First I have to internally define the parent-child relationships between my content, which is then used by sitemap, breadcrumb and pathauto. I think.
I looked at taxonomy, and saw that it could be used as an advanced equivalent for old-school structured content. Even though (as I found out) that's not really its intent.
Yet it could be done, and done well (I've always secretly wanted multiple inheritance in my directory structures) ... barring the directory->index.html mapping feature.
Is this the directory module you speak of?
.dan.
I saw this project recently
I saw this project recently for a static HTML site at HiveMindz. Might be worth a look:
http://www.hivemindz.com/project/staticHTML
Thanks, but I don't think so :(
Well, it seems that the download on that site is playing up. I get corrupt archives because their server is inserting whitespace into the zip :-8
And it would seem from a glance at the config screen
http://www.hivemindz.com/hm/images/staticHTML.gif
... that it's only a Drupal WRAPPER to static HTML stored elsewhere. This is interesting for legacy support, but nothing like a true import.
Thanks anyway.
How many non-Drupal drupal module sites are out there? Or is that an FAQ?
.dan.
not many
Hivemindz is an interesting exception. Search for Carl McDade to read the historical artifacts.
- Robert Douglass
It is ages since he was on
It is ages since he was on this forum. Sometimes, I think it is a bit sad, considering he keeps working on Drupal.
Oh, don't want to start anything here...;-)
Edit: Now that I read your post again, perhaps I was reading too much into your post.
Almost ready to believe
From this write up, I'm almost ready to believe in using taxonomy for site structure :P
I've always used the book module for static structure. The "index.html" actually is a node, and pages are organized in a straight hierarchy. Navigation is auto-generated via the Book Nav block.
The only downside is that a node cannot appear in more than one hierarchy, but this is usually not the case in converting from static websites in any case.
Re: converting from static HTML: puregin is working on import of various formats into the book module. He's done the export part (OPML, DocBook); import would be the next logical step.
Idea: a "site sucker" -- point at a URL and follow all on site links. Using local filters, decide what to strip out -- e.g. only copy stuff in the "body" tags -- and turn it into local nodes/books. Could even use pathauto to keep the same (relative) URLs.
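The "only copy stuff in the body tags" filter could look something like the following Python sketch, using the standard library's html.parser. It is a guess at one possible filter, not an existing Drupal tool: the BodyExtractor class and extract_body function are names invented here, and following links or creating nodes is left out entirely.

```python
from html.parser import HTMLParser


class BodyExtractor(HTMLParser):
    """Collect the markup found between <body> and </body>, dropping the rest."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through once.
        if self.in_body:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.in_body:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.in_body:
            self.parts.append(f"&#{name};")


def extract_body(html: str) -> str:
    parser = BodyExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Everything in the head, including the old site's title and stylesheet links, is thrown away, so the title would need to be captured separately if it is wanted as the node title.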
flat out, you rock
Welcome aboard, I look forward to more of your stuff. I have been trying to explain this for a while now and I think you nailed it nicely.
-sp
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain
-Steven Peck
Great mini-tutorial
This is a great, concise guide to setting up a static site structure in the best way that you currently can with Drupal - well done, mate! The analysis of the drawbacks is particularly useful, as it will help many users to decide if using your (and others') approach is worth the pain.
Hopefully there will soon be a better way to set up a site structure (it's coming... soon!). In the meantime, users need all the help they can get with using Drupal's existing static structure resources.
Jeremy Epstein - GreenAsh
Glad it makes sense.
Well, it's nice that no-one has told me I'm doing it all wrong, but I still can't believe I got it totally right first go!
Thanks all for the comments.
I've got my auto-import module 80% going so far, I'll write it up in a little while. I've got
search-files->select->import-html->validate->extract-content->transform-to-xml-import working.
I just need repair-hrefs, insert-node and create-aliases and it's done.
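For anyone wondering what a repair-hrefs step might involve, here is one guess sketched in Python. It assumes the step means turning the relative links of an imported page into root-relative ones so they survive the page moving; the repair_hrefs name, the page_path parameter and the regex are all assumptions made for illustration, and a regex this crude will miss single-quoted or unquoted attributes.

```python
import re
from urllib.parse import urljoin

# Crude matcher for double-quoted href attributes only.
HREF = re.compile(r'href="([^"]*)"')


def repair_hrefs(html: str, page_path: str) -> str:
    """Resolve each href against the old location of the page it came from."""
    def fix(match):
        # urljoin leaves absolute URLs (http:, mailto:) untouched.
        return 'href="%s"' % urljoin(page_path, match.group(1))
    return HREF.sub(fix, html)
```

For example, a link to ../news/item.html found in /about/team.html comes out as /news/item.html, while links to other sites are passed through unchanged.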
I'm going slower than I expected because I want to leverage as much existing core and modules as I can. Which means reading lots of source code. Cool tools like drupal_get_filename() are hard to find if you don't know what to look for!
.dan.
More thoughts
I pretty much agree with you, although I think this part below could still be misleading for newbies:
I think that explanation hinders the user breaking free of a hierarchical way of thinking, and the filesystem examples reinforce that (being very limiting in their organisational capabilities).
I prefer to think of terms not as containers of content, but as descriptions of content - i.e. more about metadata than structure. They are there so the computer can filter big lists into smaller lists for you based on criteria you specify.
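A tiny Python sketch of that "filter big lists into smaller lists" view, with made-up sample data. Nothing here is Drupal's actual data model; the nodes list and filter_by_terms are hypothetical names, and the only point is that the terms hang off the nodes as descriptions and the lists are computed on demand.

```python
# Each node carries its own descriptive terms (sample data, purely illustrative).
nodes = [
    {"title": "Growing roses", "terms": {"flowers", "gardening"}},
    {"title": "Rose wine", "terms": {"drinks"}},
    {"title": "Tulip festival", "terms": {"flowers", "events"}},
]


def filter_by_terms(nodes, wanted):
    """Keep the titles of nodes described by every term in `wanted`."""
    return [n["title"] for n in nodes if wanted <= n["terms"]]
```

Asking for {"flowers"} gives two titles, asking for {"flowers", "events"} narrows that to one, and a node is free to show up in as many of these lists as it has terms.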
Your points about lists etc are very good though, and I agree about taxonomy_context being a very useful module :)
Have you looked at the taxonomy_assoc module? I haven't tried it yet but it sounds like it should help out with the searching and editing of those list index pages.
Heavy theory and headbanging
Could be, but the entire point of my exercise was to work from the starting point of a familiar hierarchy, without requiring the entire switch in world-view ... at first.
At the same time, I was hoping to discuss this paradigm shift as I did in the first section. ... and more today... :-)
Set theory
Ah, but they are! It's just that we are using set theory where there is nothing to prevent something being a member of more than one container/set!
Ah but it isn't. I would say a description is a property of an object, attached to it.
A description requires a descriptee.
... Whilst a container is a class, whose definition can exist even without any nodes in it.
Metadata is strange. Look at metadata as implemented on the web. A page may say it is one thing - it describes itself with keywords. This is your "description". A property of the item. Data.
Destroy the item and the description is gone.
The only time that claim becomes relevant is when the said page is collected as a member of a set - either a list of links to that subject, or even the set of search results on the subject.
Only then is the data "meta"
Without the grouping mechanism, the keyword means nothing.
The existence of the parent set is required for a member to be able to participate in it.
OTOH, the existence of a member makes no difference to the container.
Is that theory too, um, theoretical?
Uncommon sense
*sigh* ... But somehow I don't think any of this is going to help me communicate with my clients.
I recall one afternoon jabbing my finger at their sitemap/content map that was perfect, except every node was a leaf. There was no content at the junctions. It was structured great, but there was no 'index.html' sort of page anywhere.
.
... strangely enough, this client would have been right at home with the Drupal schema!
.dan.
Free tagging
While this is little more than a nit-picking technical aside, I still feel that it should be mentioned here. Drupal's in-built support for 'free tagging' (i.e. terms that are created on-the-fly in the node add/edit process) goes a bit against the idea of 'a tag can exist without any nodes'. With free tagging, a taxonomy term really does become a description rather than a container. The term exists because it's needed to describe a node.
In Drupal, a free-tagging term is stored the same way as other terms - that is, in an independent (1 to 0...*) list that doesn't rely on being linked to any nodes. But in other free tagging systems, such as Flickr, a tag only exists if there are nodes associated with it (i.e. 1 to 1...*).
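The difference between the two cardinalities can be sketched in a few lines of Python. This is a toy model of the behaviour described above, not the real schema of Drupal or Flickr, and the class names PersistentTerms and TransientTags are invented for the illustration.

```python
class PersistentTerms:
    """1 to 0..*: a term keeps existing even when no node uses it."""

    def __init__(self):
        self.terms = {}  # term name -> set of node ids

    def tag(self, node_id, term):
        self.terms.setdefault(term, set()).add(node_id)

    def untag(self, node_id, term):
        self.terms[term].discard(node_id)

    def exists(self, term):
        return term in self.terms


class TransientTags(PersistentTerms):
    """1 to 1..*: a tag disappears as soon as its last node is untagged."""

    def untag(self, node_id, term):
        super().untag(node_id, term)
        if not self.terms[term]:
            del self.terms[term]
```

Tag one node with "sunset" and untag it again: in the first store the term is still there as an empty container, in the second it is gone.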
Also, putting your own views aside for a moment, you might be interested to know that the taxonomy module's original purpose was to 'describe' content, rather than to 'contain' it. This underlying purpose has become increasingly clear to many users recently, as we try and bend the module to cater for our 'container' needs, which it was never really designed for. The taxonomy module is based on the formal academic study of taxonomic classification, which also has the 'description' rather than the 'container' philosophy.
Jeremy Epstein - GreenAsh
SYNTAX ERROR
True Enough. I don't really know what my 'own views' actually are :-B so I won't labour any more theories.
... Although when you say
I find myself agreeing strongly. It's funny what a shift from noun to verb will do!
... or maybe it's the shift in subject-object. or active-passive or something.
:-} Apologies to all not-fluent-English speakers out there. Yes, all those sentences are basically equivalent. Sorry for frying anyone's brain.
.dan.
That's cool
That's cool (and worthwhile), I just slightly misunderstood what your aim was :)
That's true and fine and all, and more experienced users can relate to that explanation, but I think approaching it that way will confuse newbies more than thinking of descriptions (which is imperfect as well, I admit). The filesystem analogy is far too limited too - although once some of the organisational ideas around dynamic queries from OS X Tiger and Longhorn sink in with mainstream users it will probably be quite a good one hehe.
I think the ideas would probably be easier to explain to users familiar with GMail and its label system as opposed to a traditional mail client that has a tree of folders.
Hehe - more filesystem refugees :)
I'm not actually that sure how I'd deal with that situation. That's what we use taxonomy_context for, but how to explain that need to the client - ummm... pass. It's one of those things that previously seemed 'obvious' that are so hard to explain when you actually have to.
I vote dman Head Librarian
dude, you know your Information Sciences. You should have the keys to drupal.org's content "vault" (and I do mean vault -- stuff goes in, doesn't come out) and sort things out.
Off the top of my head, here's how it looks:
Support > Bugs > comments
Support > Feature Requests > comments
Support > Support Requests > comments
CVS > CVS log > CVS log entry
Forums > *category* > posts
Books > *book* > *book page* > comments
Mailing lists > *mailing list* > messages
Subscribe > *subscribed item* > email notifications
IRC > *channels* > messages
Search
RSS Feeds > *feeds*
The sad thing is, the delivery mechanism defines drupal.org's content, not the structure. i.e. a forum post shares almost the same properties as a mailing list message, which is pretty damn close to a comment, that's almost identical to a book page.
Yet, you can't find anything you know exists, let alone things you've never seen before.
For example, right now, sepeck is giving some great introductory advice to vanguard9, yet once that IRC session is lost, so is that conversation.
And mailing lists -- I get so confused as to what's just a notification of something I subscribed to or whether it's strictly a mailing list discussion. Help me Jesus if I happen to not hit 'reply-all' and send it back to the list server.
Don't get me started on all the duplicate threads in the forum -- thankfully there are users with a keen eye who post linkbacks to similar threads.
Personally, I'm waiting on a patch review for a feature request called, "fetch user info from user table". You know how I keep tabs on that? I go to the search and query "fetch user". It's three down from the top, give or take.
And no, it's not part of "My Issues" and no, it doesn't appear on "Recent Posts".
Does this sound familiar?
The sad part is that it's all just chunks of formatted text, stored in the same database (for the most part).
These aren't real "content management issues" -- we're not scanning dusty manuscripts from the bowels of the New York Public Library here. We're not fussing with some opaque and proprietary binary content formats.
This is why people give up, or do their own thing.
Technically, Drupal has everything we need -- the irony is that while we have all these great features, we're hardly using them to our advantage on drupal.org.
What we need is a librarian -- someone who can lay the ground work for what goes where, document it, and enforce it:
a) moderate posts and move discussions to the appropriate area. Prune away. Close 'em down. Move 'em out.
b) Make the search kick ass. Let's face it: Google has spoiled everyone -- add a radio box defaulting to "search this site with Google" until 4.7 is out and additional work can be done on the search.module.
c) Kill the mailing list: I mean, come on -- it's an extra click going from the notification back to the thread -- do we really need to keep another content "silo"?
d) create a forum for each module. Mambo has a great module called "mamboforge" that mirrors the functionality of the sourceforge project. This would be a great addition to Drupal but in the meantime, can't we have a dedicated forum for each module?
We have great developers on here. There are great admins and power users. We have talented themers. And there are plenty of content contributors here -- with a little organization and structure, we can make drupal.org a showcase for what can be accomplished with Drupal.
Not really, I post commonly
Not really, I post commonly answered questions into the handbook from #drupal-support all the time. The discussion in question was along the lines of another discussion currently being hashed out in the Drupal-devel list. As that particular issue is a matter of opinion and approach (admin theme separation vs built in) and not yet settled, there is no point in putting it in the handbook.
Additionally I have gotten three new users to contribute handbook pages themselves based off discussions, so I hope to continue the trend.
A final note :), you were there and you can create a handbook page in the handbook. Someone who no longer appears to be active set up the channel a while ago. Many were against it. I am certainly not there all the time and often no one is. It's the joy of free support.
Moderate posts? We move them periodically when we see them.
Search... well, the unpaid guy is just not going to beat Google, but there's really not much more I can say except that code and patches from people might really help out.
Kill the mailing list? The only forum I use is Drupal. I can access my mail anywhere and frankly it's more convenient. -1
More than 100 forums... we don't have enough people for the forums we have now. I am not opposed to the idea, just think it won't solve the issue. That said, when mail list and cvs get moved over some folks will get their time freed up, maybe some other stuff that has been in a holding pattern will get done.
In the meantime, perhaps some code and a demo site for this 'forge thing' so people can discuss the concept would be a good idea.
-sp
Ook!
Yeah, it sure looks like that.
But no, I don't want to start messing with that lot.
My suggestion, relevant to the other discussion on documentation consolidation, is to create a regimented FAQ.
F.A.Q.
In general, I'm against the concept of FAQs, as they imply that your site wasn't structured well enough for someone to find what they wanted directly. Having a FAQ in your sitemap implies your sitemap is inadequate.
Seeing as this IS the case with the Drupal docs, discussions and tips, here goes:
I haven't looked at the FAQ module at all yet, but my requirements are mostly of the users.
How to change the order of menu items (weighting)
Taxonomy is a great way of grouping issues on different axes, so (for example) this OP of mine could be categorised under both
First Site > Creating Sitemap/Structures > How to create hierarchical tree navigation using taxonomy
and
Concepts > The Taxonomy > How to create hierarchical tree navigation using taxonomy
"How to hide buttons on the navigation bar" may also be known as "How to control access to some modules".
... there is more, but that's a start from me.
The great news is that Drupal looks like the perfect platform for this dream machine. ... And if there is no move from Drupal.org itself, it doesn't even need to be hosted here. Anyone can make a start on this.
Just not me, I've only met Drupal for the first time this week!
.dan.
Additional Reference - Good discussion
In the interests of helping people find the hidden knowledge lurking in this discussion group, I'd like to add
First experiences with Drupal to my list of valuable references on this topic. I only just found it, but a few posts in there are very rich.
.dan.
Just cross-referencing a bit. Trying to gather some bookmarks...
And again, following up myself, I see that my suggestion in note 2 has already been done before
Taxonomy assoc - attach a node to a taxonomy term
( thanks to someone who is as flabbergasted as I am that there is a List of 350 existing modules I didn't see when I went looking the first time.) ... Ah. I see it's Jaza again. :-B
.dan.
Reserved taxonomies for project.module navigation
Those are the same modules that you see by visiting the downloads page. The project module filters display by release version. By default, non-released versions (i.e. cvs) will not display unless you click over to the CVS tab, e.g. http://drupal.org/project/Modules/cvs
The list of 350 will also list modules tagged only for older releases of Drupal, AFAIK.