This patch adds folksonomy support to Drupal (named internally as "Free tagging"). In a nutshell, the core difference is the input method: unlike normal taxonomies, which are administratively controlled, a "free tagging" vocabulary allows tag creation when the node is submitted. It does this through a text input box, as opposed to a dropdown or selectbox. This patch:

  • Removes the useless "Preview form" of a vocabulary.
  • Alters the vocabulary table to include a new "tags" column.
  • Adds a new "Free tagging" preference on vocabulary creation/editing.
  • Modifies the vocabulary overview to support pagers for free tagging vocabs.

The new code integrates tightly with the existing taxonomy code. The only additional processing occurs on node save and edit, where we parse through the tags associated with a node. All other display (and thus, code) remains the same. The following screenshots illustrate the changes, integration, and workflow:

These patches were made during the exploration and customization of Drupal by http://www.NHPR.org. In loving support of open source software, http://www.NHPR.org will continue to contribute patches they feel the community will benefit from. Questions about this patch should be directed to morbus@disobey.com.

Comments

Morbus Iff’s picture

FileSize
19.52 KB

Updated patch to fix some errors in the update.inc change.

Morbus Iff’s picture

FileSize
19.54 KB

New patch for check_plain and HEAD. Also removed the term indent under vocabularies - there was an extra-space issue with regard to _taxonomy_depth, and I felt it was better to just remove the (non-standard, non-semantic) indent I originally added during the move to tabular display.

Anonymous’s picture

A big +1 from me! This patch is going to be extremely useful for modules like "image" that involve frequent creation of new taxonomy terms.

I've been testing this patch extensively for a couple of days. I love the fact that it is totally non-intrusive onto existing sites if the admin doesn't want to use it when creating vocabularies, and that the free-tagged terms become "ordinary terms" in the database structure, with no special-case table.

Morbus Iff has done a great job of adding a powerful feature without breaking anything, as far as I can tell.

I can see lots of ways in which this can evolve in the future, such as more fine-grained security so that some users can add a given node type without being able to add new free-tagged terms (i.e., that class of users would have to pick from existing terms only, even if the vocabulary allows higher-privileged users to add free terms). I can also see a place for user-owned free-tag vocabularies that are dedicated to their personal image albums. But these features could be added in a future release and still be backward-compatible with what Morbus has done now. That being the case, I suggest that this patch be accepted into core.

syscrusher’s picture

Title: Add Folksonomy, or "Free Tagging", to Taxonomy » Doh!!!! I wasn't logged in!

Comment #3 was from me (syscrusher). Sorry I forgot to login first.

syscrusher’s picture

Title: Doh!!!! I wasn't logged in! » Add Folksonomy, or "Free Tagging", to Taxonomy

Dries’s picture

I'll commit this patch to core as soon as CVS HEAD is opened up for development. For now, I'm awaiting feedback from the usability folks. I'm also left wondering how this would affect taxonomy-based permissions -- I don't think that should be a problem but it is somewhat mind-boggling.

I haven't tested the patch yet, but I glanced at the code quickly:

1. Don't use the word 'node' in user output. Use 'post'.

2. The words 'term' and 'tags' are both used in user output. This might be confusing, but I don't see an easy way around it. The way it is used makes sense, so it might be a non-issue.

3. Some extra documentation might be in order. The explanation of 'free tagging' is quite technical. For example, I don't understand the following bit: "Allows the creation of a vocabulary during content creation, as well as through the normal administrative means.". I'd rather see it explain the difference/advantage/drawbacks to help me decide whether to enable 'free tagging' or not. Make the documentation more task-minded.

4. I don't like the way you manipulate the pager's global variables. The paging code is a bit of a hack, it seems.

5. Spacing: we write 'foreach (' not 'foreach('.

6. We usually write code comments above the code, not after the code on the same line. This is really minor as I'm sure we don't do this consistently. Some code comments are rather cryptic and didn't help me much. Maybe give your code comments some love.

That's all for now.

chrismessina’s picture

FileSize
21.7 KB

In setting up the folksonomy, you present far too many options to the user. I tried to cut these extraneous options out but then decided to redo the whole Vocabulary creation workflow. :)

Go figure.

Morbus Iff’s picture

I'll address what I can. A new patch will be forthcoming.

#2: I agree - I previously wanted to keep everything as "term", and spent a good bit of time on #drupal getting jbond (who has since disappeared from all discussion) to agree that "term == tag" and the only difference between the two was their method of input. Eventually, I felt that "tag" was not only a "term" (and term), but also an action. I'm not "terming a node", but I'm "tagging it", which is a common sorta phrase in other folksonomy implementations. Thus, the mixing of the two.

#4: Yeah, I know it's a hack. I had a comment in there (since removed due to moshe's suggestion) that I knew it was a hack, and that integrating with the existing pager code would be uber-difficult based on hierarchies and the recursive nature of the taxonomy_get_tree.

#6: Heh, heh. Boy oh boy. Wrong thing to say. I often find myself overdoing it on comments (for example), and the patch as is was a concentrated effort to reduce the amount of comments I had originally put in there (see also). I'll take another look at 'em.

Morbus Iff’s picture

Regarding #7, it appears factoryjoe wants a far grander rewrite of the taxonomy UI than this patch purports to do.

Dries’s picture

If you can take on such a rewrite (or parts thereof) based on Chris' suggestions, by all means.

Depending on the required UI changes, such an overhaul might automagically deal with the pager implementation issues (if the pagers get nuked that is).

Morbus Iff’s picture

Not in this patch. His plans are for 4.7, and include wizards, removing most all of the checkboxes on that page, and so on and so forth. Likewise, he's only about 30% of the way there (per #drupal) in his UI mockups, so not to be considered with this folksonomy patch at all.

Morbus Iff’s picture

(Er.. which isn't to say that I think this folksonomy patch is for 4.6 - I know 4.7 will be its earliest release. But, he's just not ready with the workflow in his head yet for me to address any of his issues. And, based on discussion in #drupal, I'm not sure I want to be the one implementing the changes he has planned, much less agreeing with them [g]. No offense to him, of course - he's (admittedly) too early in his thinking to take on all affronts.)

Bèr Kessels’s picture

Chris's ideas are great, but should IMO not be confused with this folksonomy issue.

Can we not focus on getting folksonomy in, be it with a less-than-perfect UI, and then open a new issue to improve the UI of taxonomy?

Bèr

Uwe Hermann’s picture

+1 from me. While I haven't tested the patch yet, it looks very good and I'd really love to see this in 4.6. I hope it's not too late to get it in...

grohk’s picture

I will add my +1 to the pile. This patch is working well for me. I like Chris' mockup, but I also agree with Morbus that his taxonomy UI ideas are probably beyond the scope of this patch. I actually like that Morbus has added this functionality without drastically altering the taxonomy admin interface. Bravo.

chrismessina’s picture

As far as my workflow changes, yeah, they won't be ready for 4.6. I do think that getting this into the next release is important though, so that we have the general functionality.

I have concerns about "multi-select" and related terms though. I mean, those shouldn't be user options -- those should be allowed by default.

Also, I talked to Drumm at lunch about flat-lists vs tree-hierarchy and we seem to agree that it's a needless distinction. It's better to have a "tree-like hierarchy" (or controlled vocabulary/outline list) and "free tagging" as the main distinctions. Because it might make sense to make your flat-list a hierarchy later on, but not so much your tags. (even though Morbus tends to see a use for making a hierarchy out of free tags.)

But this later discussion probably belongs somewhere else...

So I'm pretty much okay with this moving forward with the understanding that the categories UI needs a wizard-like overhaul for 4.7.

Jaza’s picture

+1 from me for this patch. The interface is so simple - just one text box - and yet so powerful. Being able to add terms at node creation is a critical feature for the next release, IMO, but I never envisioned it being implemented so cleanly and intuitively. Great work, Morbus!

However, three problems/shortcomings that I found with the patch:

1. You cannot create new sub-terms using free tagging (yet). Say I have an existing term called 'sporting news', and I am writing an article about soccer and baseball. The term 'soccer' already exists as a subterm of 'sporting news'. I have no existing term for baseball. I would like to be able to enter into the text box:
"sporting news->soccer, sporting news->baseball"
And it would create 'baseball' as a new subterm (and assign the existing term 'soccer'). With the current patch, the only way to do this is to create the terms with free tagging, and then to go into the admin interface and make them subterms.

2. I don't like the new "view terms" link. I prefer having my vocabs and terms all listed together. In fact, when I made my first term using free tagging, I went into the admin interface, and couldn't work out where my term was, until I saw the "view terms" link. So this is a usability issue - Drupal admins are accustomed to the current layout of the categories page, and may find the change cumbersome. Perhaps make this a setting?

3. The free tagging text box should IMO be displayed AS WELL AS, not instead of, the regular 'select term(s)' list box. Surely users will want to select an existing term, rather than typing it? And if they do type it, they should be able to check (as they type) that they're spelling it correctly. Also, users should be able to see what existing terms there are, so that they don't create new terms that are virtually duplicates.

I also agree (with the person that already said it) that new permissions are needed as part of this patch.

Morbus Iff’s picture

Addressing Jaza:

#1: There was some discussion about this, and it was eventually put on the backburner.

#2: The goal of the new "view terms" was to address the immense size that a tag vocabulary can grow to. It is quite easy for these vocabularies to reach 100+ terms and keep growing. Having the vocabulary admin screen contain 1000+ terms all at once is a problem. This is less of an issue with a controlled vocabulary, which is why those terms are still displayed inline (while it IS possible to create a controlled vocabulary of 1000+ terms, it is a bit more unlikely than with a folksonomy). There was some talk of "if vocabulary term count is less than 50, always display inline", but I couldn't readily solve the UI issue of what happened between term #49 and term #50 (49 terms are displayed inline, you add a 50th one, go back to the overview screen and "wtf?! all my tags are deleted! arrrRRgh!"). The nearest fix to that was a "There are more than 50 terms in this category. View all terms." message, similar to the "No terms" one. If people feel this is a proper way of handling it, then I'll roll it into this patch.

#3: For the same reason as #2: a dropdown is prohibitive for a vocab with 1000+ terms. As for similar keywords, I plan on making a third-party module that provides this sort of "similar keywords" GUI for tag-based vocabularies, using the "Related terms" feature of a taxonomy term.

Dries’s picture

Postponing the UI changes is OK, though I'd still like to see if we can make the pager stuff less of a hack.

If you have 1000+ terms you probably don't want to manage them as regular terms. You'll want to do things like searching terms (e.g. "search for term '*John*'"), merging terms (e.g. "merge term 'Governer John Lynch' into term 'John Lynch'"), sorting terms by popularity (e.g. "what terms are used only once?") and acting upon them in batch mode. Eventually, that might also impact the UI.
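The three operations listed above (search, merge, sort by popularity) can be sketched roughly as follows. This is only an illustration in Python over an in-memory dictionary, not taxonomy.module code; the `node_terms` structure and all function names are hypothetical.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical stand-in for node/term associations: node id -> set of term names.
node_terms = {
    1: {"John Lynch", "New Hampshire"},
    2: {"Governer John Lynch"},
    3: {"John Lynch", "budget"},
}

def search_terms(node_terms, pattern):
    """Return the sorted term names matching a shell-style wildcard pattern."""
    all_terms = set().union(*node_terms.values())
    return sorted(t for t in all_terms if fnmatchcase(t, pattern))

def merge_term(node_terms, source, target):
    """Retag every node that carries `source` with `target` instead."""
    for terms in node_terms.values():
        if source in terms:
            terms.discard(source)
            terms.add(target)

def usage_counts(node_terms):
    """Count how many nodes use each term."""
    return Counter(t for terms in node_terms.values() for t in terms)

def used_once(node_terms):
    """Return the sorted names of terms attached to exactly one node."""
    return sorted(t for t, n in usage_counts(node_terms).items() if n == 1)
```

Calling `merge_term(node_terms, "Governer John Lynch", "John Lynch")` and then `used_once(node_terms)` would answer the "what terms are used only once?" question after the merge. A real implementation would run these as queries against the term tables rather than in memory.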

I also wonder how the folksonomy module affects the various taxonomy_* modules as well as the content filter on the 'admin/content' page.

This is going to be interesting. ;-)

Morbus Iff’s picture

Dries: the big problem with the pager() stuff is vocabulary hierarchies, which is what _get_tree gives us (along with additional depth and parent attributes). I could probably reduce the hack's size by using pager_query() with a throwaway SQL statement, but I'd also have to throw away the returned database $result, since it wouldn't be useful (no hierarchy information). On the order of hacktitude, though, this is probably as equal a hack, only smaller (and possibly more expensive, since it'd be another db pull). Thoughts?
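To make the trade-off concrete, here is a rough Python sketch of paging a tree that is already in memory, which is the situation described above: the rows keep their depth, and the page is cut from the flattened list rather than from an SQL result. The nested-dict tree shape and both function names are assumptions for illustration; this is not Drupal's pager API.

```python
def flatten_tree(tree, depth=0):
    """Yield (depth, name) rows in display order from a nested {name: children} dict."""
    for name, children in tree.items():
        yield depth, name
        yield from flatten_tree(children, depth + 1)

def page_of(rows, page, per_page):
    """Return one zero-indexed page of rows plus the total page count."""
    rows = list(rows)
    total_pages = max(1, -(-len(rows) // per_page))  # ceiling division
    start = page * per_page
    return rows[start:start + per_page], total_pages

tree = {
    "one": {"one A": {}, "one B": {}},
    "two": {"two A": {"two A one": {}}},
}
second_page, total_pages = page_of(flatten_tree(tree), page=1, per_page=4)
```

The cost is that the whole tree is built before any page is cut, which is the expense being weighed against a second, throwaway query.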

moshe weitzman’s picture

personally, i think a pager rewrite is out of scope for this patch. it is also a very minor part of the patch. during the next release cycle someone can go in and improve pager's API for handling collections that are not SQL query result sets.

syscrusher’s picture

Some comments on the tree hierarchy issue and also release schedules...

Comprehensive tree hierarchy support is nice if feasible, but I don't think it's essential for the patch to begin being very useful. As a simple, interim workaround, I would suggest using a slash or backslash (accept either of them, for maximum user-friendliness) as a delimiter between levels, and handle the hierarchy internally. For example, if the tree now looks like this:

one
-one A
-one B
-one C
two
-two A
--two A one
--two A two
-two B

then I could put "one/one A/newterm ABC, two/two A/two A one/newterm IJK,newterm XYZ" into the tags field to make it look like this afterwards:

one
-one A
--newterm ABC
-one B
-one C
two
-two A
--two A one
---newterm IJK
--two A two
-two B
newterm XYZ

(In my examples, I didn't use any backslashes because I wasn't sure how they would render in the issue, but the idea is that the two punctuation marks are treated as equivalent. Also, as a Linux maven I found myself instinctively putting a trailing backslash on things that "felt" like a directory path, so it would be wise to trivially trim trailing punctuation from each term when validating, because I'll bet I'm not the only one who would do that.)
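A minimal sketch of the idea, in Python rather than the patch's PHP, assuming the vocabulary is held as a nested dictionary: split on commas, split each tag on either slash, drop empty components (which also trims a trailing delimiter), and create whatever levels are missing. The function names are hypothetical, quoted terms containing commas are not handled, and silently creating missing ancestors is only one possible policy.

```python
import re

def parse_paths(field):
    """Split a tags field on commas, then each tag on / or \\ into path components."""
    paths = []
    for raw in field.split(","):
        parts = [p.strip() for p in re.split(r"[/\\]", raw)]
        parts = [p for p in parts if p]  # drops empties from trailing or doubled delimiters
        if parts:
            paths.append(parts)
    return paths

def add_paths(tree, paths):
    """Insert each path into a nested {name: children} dict, creating missing levels."""
    for parts in paths:
        node = tree
        for name in parts:
            node = node.setdefault(name, {})
    return tree
```

Feeding the example string from above through `parse_paths()` and then `add_paths()` on the "before" tree should give the "after" tree, with "newterm XYZ" landing at the top level.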

Advantages:

  • Novice users can just ignore this admittedly-advanced feature, and still use Morbus' most elegantly simple UI with no changes.
  • Syntactically, looks like disk directories, which makes it a little more intuitive to users than other delimiters like "->" or "::" that I have seen similarly used in other software (and which are "intuitive" only to programmers).
  • Relatively modest code addition to the existing patch.
  • Adds, but doesn't change anything in the existing UI, nor require database schema changes, so it can be added after initial release of the module without requiring user-retraining or upgrade scripts.

Disadvantages:

What to do if the user mistypes an ancestor path component?

  • Add it as-is, as if doing "mkdir -p pathname" in Linux et al?
  • Issue an error message and ignore that term? (Node gets created with one free-tag term missing.)
  • Issue an error message and force correction? (Annoying to novices, but probably only advanced users would use this hierarchy feature anyway...and the message could helpfully list all of the existing terms at the failing tree level. The user would still manually type what they want when correcting the entry, but now they'd have a guide in the error message text to help them know what they mistyped.)
  • Try to intelligently match to a near-miss? (Complex code! SOUNDEX might help here.)
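For the "force correction" and "near-miss" options, a rough Python sketch might look like the following. Note that it uses the standard library's `difflib` similarity matching rather than SOUNDEX, so it is a stand-in for the idea, not the technique named above; `resolve_component` and the 0.8 cutoff are assumptions.

```python
import difflib

def resolve_component(name, siblings, cutoff=0.8):
    """Return (match, suggestions) for one path component against existing sibling terms."""
    if name in siblings:
        return name, []
    # No exact match: offer up to three close spellings for the error message.
    suggestions = difflib.get_close_matches(name, siblings, n=3, cutoff=cutoff)
    return None, suggestions
```

A caller could reject the tag and list `suggestions` in the error message, or accept the single suggestion automatically if it trusts the cutoff. Which of those is right is exactly the open question in the list above.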

I still feel that the patch deserves consideration as-is, with only bug fixes before it is available as a contrib. Morbus has done a great job of laying a foundation to which we can add all this nifty stuff later without ripping out his existing UI. Let's leverage that elegant thinking and get this tool into the hands of site owners. My suggestion would be a published contrib patch available for 4.6, then let Dries do what he sees fit with regard to core for 4.7.

—Scott

chrismessina’s picture

At the risk of sounding obtuse, I really think that this module should be as simple as possible, and then let other modules add functionality to the new free tagging vocabulary type. And by that I mean that tree-hierarchies seem to me to be beyond the scope of this patch, even though you get it "for free".

I know that a good number of developers are going to come out against this idea, but in all the popular folksonomic systems that I've seen gain a popular following (delicious, flickr, gmail and so on), they only offer flat tagging and that seems to be sufficient. If you've used a tree-enabled folksonomy and can point me to a demo, please do -- I simply have never seen such a thing attempted before in a popular application.

As a matter of fact, Jaza makes the perfect case for me when he comments that "You cannot create new sub-terms using free tagging (yet). Say I have an existing term called 'sporting news', and I am writing an article about soccer and baseball.... I would like to be able to enter into the text box: "sporting news->soccer, sporting news->baseball" And it would create 'baseball' as a new subterm (and assign the existing term 'soccer')."

This is what a tree-like hierarchy is for and the limitation here is not going to, nor should it, be fixed with free tagging. Rather, this is a UI issue with the current implementation of tree-like hierarchies in Drupal. What Jaza wants is a tree -- so instead of free-tagging, he should be able to use his existing vocabulary "sporting news" and then be able to add a term to the tree inline instead of having to go all the way through the round-trip of adding the term in the admin UI. This is not free tagging; this is adding a new leaf to a tree -- free tagging, as Jaza suggests, would simply make it more convenient without actually solving the real problem.

Honestly, I think putting too much specific implementation stuff into this patch before we've had time to see how it's going to be used in the wild is a bad idea. I do, however, think that it's important to focus on the synonymic, spelling and merging issues that are inherent problems in folksonomic systems. Related terms are intrinsic in folksonomies, so I would rather see more work put into that UI issue than trying to recreate tree-hierarchies.

And I know that, just as I say, "we don't know how this is going to be used in the wild" I'll get the response that "well then we shouldn't limit its functionality until we know how it will be used" but I fervently disagree. That's one of Drupal's out-of-the-box problems. You get too much stuff! Give people things that are manageable -- things that I can fit in my pea-brain. And if you need more functionality -- by all means build it! And stick it in a module that I can download later! But folksonomy is important enough -- and Morbus has done a really great job keeping it fairly minimal so far -- that I think we can possibly go just a little further, pull it back some, and make free tagging work the way it should work instead of being just a more convenient version of the other types of vocabulary.

moshe weitzman’s picture

factoryjoe - all that text and i can't figure out what you are proposing. are you proposing we don't use the vocabulary/taxonomy system for tags? if so, what is the better option? or maybe you are just emphasizing the importance of getting the 'relations' stuff right. we will get there, and taxonomy provides excellent tools for doing so (see 'related terms' and 'synonyms')

folks, let's use *action* statements when we comment on patches. if you don't like something, make a counter proposal. patch reviews are not places for chatter.

i apologize for my grumpiness these days. i am seeing way more chatter on the devel list than focused collaboration on core functionality.

Morbus Iff’s picture

I think his points were:

* hierarchy in folksonomy = bad; don't work on it.
* minimal patch to start = good; let's see what people do.
* related terms = good; (my screenshot).

chrismessina’s picture

Yes, Morbus summarized it pretty well (sorry for sounding chatterish). I just don't want to see this excellent functionality muddled in its initial release, so the question of hierarchies, I feel, cannot be answered with the first release and so should not be included.

Tags should represent flat lists -- labels -- and nothing more (for now -- let other modules add to that later).

Anonymous’s picture

Q: "If you've used a tree-enabled folksonomy and can point me to a demo, please do".
A: Category in Wikipedia.
Which probably doesn't help us at all here.
I'm back. I'll take a look at this over the weekend, but my guess is I'm going to do +1 and then niggle about specific code detail.

moshe weitzman’s picture

> so the question of hierarchies, I feel, cannot be answered with the first release and so should not be included

OK, but take this a step further. What are the implications of this course of action? Should Drupal actively force all free tag vocabs to be non-hierarchical? An admin can choose that configuration already. And that configuration is the default. Why write more code to disallow something that is optional and useful for some people? Or perhaps you are proposing to divorce free tags from taxonomy entirely.

As you can see, I don't think this course of action stands up to examination.

Jaza’s picture

My response to FactoryJoe's strong rejection of hierarchies in folksonomy:

1. *grumble grumble*... *mutters under breath*... *curses audibly* :(

2. Yeah, well, I guess I can see where you're coming from, in terms of usability. You're right, hierarchical tagging is something that most users won't be interested in. And for that vast majority, introducing this advanced feature will confuse rather than conduce (productivity, that is). You have a point, Drupal's current out-of-the-box features are already daunting for many, and we want to reduce this problem, not aggravate it. So although I would love this feature, I do admit that I'm probably in the minority. As the leader of CivicSpace - which is targeting non-tech-savvy users more aggressively than Drupal ATM - you are one of the best placed people around here to comment on usability.

So maybe I'll just have to hack the extra functionality into a separate module, as you suggested ;-).

jbond’s picture

The only hierarchical free tagging system I know of is Categories in Wikipedia. del.icio.us does actually make it possible and some people did try using tags like this "branch1/branch2/leaf" but there's no specific UI to support it. My objection to trying to implement this now in Drupal is that nobody knows what the UI should look like. If we use a simple text field with comma delimited terms, how would the user indicate that a term they enter is actually part of a tree? Wikipedia gets away with this by having *no* UI, Categories are just another wikiTag.

jbond’s picture

Thoughts on the patch.

1) The regex to support quotes around terms that include a comma is really good. Explaining this to the user in the description of the term field may be hard but needs to be added.

2) In del.icio.us and flickr, related tags is an important part of the navigation. This is simpler and different from Morbus' example screenshot. The obvious place to cache this information is in {term_relation}. The question is when to cache it. For the sake of keeping the taxonomy patch simple, it could be done by a contrib module using cron. But the obvious place to do it is just after taxonomy_node_save() has finished and the obvious input is an array of tids or ideally an array of terms. That way the {term_relation} entries are immediately up to date rather than some cron time later. Since taxonomy.module doesn't do much of anything with {term_relation} I think this should be a hook for contribs rather than forced in. So I'd suggest a hook right at the end of taxonomy_node_save() that passes an array of tids. In this case, I think it would make sense to split taxonomy_node_save() logically into 2 parts. Process free terms first creating any that are not found. Add the found and created tids to $terms. Then use the existing code to iterate through $terms and save the {term_node} entries.
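Both points can be sketched together. The Python below is an illustration under stated assumptions, not the patch: it uses the standard `csv` module in place of the patch's regex to honor quoted terms that contain commas, and then follows the two-part split suggested above (resolve or create the free terms, then save the associations and call a hook with the final tids). `parse_tags`, `save_node_terms`, the `term_ids` and `term_node` dictionaries, and the hook signature are all hypothetical.

```python
import csv

def parse_tags(field):
    """Split a comma-delimited tag field; double-quoted tags may contain commas."""
    row = next(csv.reader([field], skipinitialspace=True), [])
    return [tag.strip() for tag in row if tag.strip()]

def save_node_terms(nid, selected_tids, tag_field, term_ids, term_node, hooks=()):
    """Two-phase save: resolve or create free tags, then store all associations."""
    tids = list(selected_tids)
    # Phase 1: look up each typed tag by exact name, creating a term if none exists.
    for name in parse_tags(tag_field):
        if name not in term_ids:
            term_ids[name] = max(term_ids.values(), default=0) + 1
        if term_ids[name] not in tids:
            tids.append(term_ids[name])
    # Phase 2: replace the node's associations in one place.
    term_node[nid] = tids
    # Hypothetical hook point: hand the final tid list to any contrib callbacks.
    for hook in hooks:
        hook(nid, tids)
    return tids
```

A contrib module that maintains related-term data would register a callback in `hooks` and update its cache from the tids it receives, so the relations are current as soon as the node is saved rather than at the next cron run.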

Futures.

1) I'd like to see a hook so that contrib modules can add functionality to the text input field for free terms. Specifically, adding one click, suggested terms for this node.

2) In developing UI for navigating free terms, I frequently want to get usage counts for a term, or to order queries on usage counts. This can get slow where there are lots of joins. So where should term.count be cached? A new field on {term_data} ?
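One way to picture the cached count in point 2: adjust a per-term counter whenever a node's terms are saved, so popularity queries read the counter instead of joining against the node associations. The Python class below is a sketch of that idea only; `TermCounts` and its attributes are hypothetical and say nothing about where the value should live in the schema.

```python
from collections import Counter

class TermCounts:
    """Keeps a per-term usage count in step with node/term associations."""

    def __init__(self):
        self.term_node = {}     # nid -> set of tids
        self.count = Counter()  # tid -> number of nodes using it (the cached value)

    def save(self, nid, tids):
        """Replace a node's terms and adjust only the counts that changed."""
        old = self.term_node.get(nid, set())
        new = set(tids)
        for tid in old - new:
            self.count[tid] -= 1
        for tid in new - old:
            self.count[tid] += 1
        self.term_node[nid] = new

    def most_used(self, n):
        """Return up to n (tid, count) pairs, most used first, from the cache alone."""
        return [(tid, c) for tid, c in self.count.most_common(n) if c > 0]
```

The trade-off is the usual one for denormalized counters: reads get cheap, but every code path that changes associations has to go through `save()` or the cache drifts.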

Oh, and that's a big +1 Morbus, This is great stuff. I really want to get it in so I can take it further.

syscrusher’s picture

jbond wrote:

The only hierarchical free tagging system I know of is Categories in Wikipedia. del.icio.us does actually make it possible and some people did try using tags like this "branch1/branch2/leaf" but there's no specific UI to support it. My objection to trying to implement this now in Drupal is that nobody knows what the UI should look like. If we use a simple text field with comma delimited terms, how would the user indicate that a term they enter is actually part of a tree?

How would the user indicate the tree? Easy -- just as you did. {grin}

Use slashes/backslashes, as I proposed in my earlier post on this issue. There is no UI change required for the user at all, just additional instructions that they're allowed to use the slashes to indicate tree levels. Novice users can just ignore this feature; in fact, it could easily be a role-based "security" feature along the lines of input formats, where the "advanced user" role (for example) gets the permission "create nested free tags" under the permissions for taxonomy.module.

It's a trivial documentation change from the user's perspective, and within the code, just a little more regexp and array-splitting magic. No need for complex UI form elements.

Scott

factoryjoe’s picture

I will have some other comments about the hierarchy thing later, but I wanted to pose a quick off-hand question (unrelated to the tree issue!): if folksonomy gets added to core, would I be able to add tags with an XML-RPC app like MarsEdit? Eventually I would really like to have a desktop app that I can use to post to Drupal -- does the design or implementation of this patch support this activity for the future?

Morbus Iff’s picture

Regarding #33, the determination for when to "create term, then associate" (as opposed to a normal, non-tagging vocabulary of "associate") is whether a [tags] array is included in the data sent to taxonomy_node_save. I know nothing about MarsEdit, only a tiny bit of the Blogger API, and absolutely nothing about Drupal's XML-RPC interface, but it would seem that the assumption is that the category already exists in the backend. As such, I'm going to assume that this patch does not support addition of new categories from the Blogger API. Nor would I know how to do so.

Bèr Kessels’s picture

my $0.02 on the hierarchies:
I want them! *I* will be able to use them.

We should not forget about two things, when handling that usability club:

1) Drupal is not only used by pea-brained people. WE use it too, and IMO we, as developers, *always* come in at #1. Scratch your own itch above all.
2) If no-one does something, that is no reason not to do it. If (virtually) no-one uses Linux or a Mac, that does not mean we should all be working on a Windows machine!

Remember, factoryjoe, and all the others: free tagging goes far beyond the scope of that thing de.li.cio.us (or wherever they put these dots) does.

I, for example, am testing it as a very simple keywords/taxonomy-on-the-fly/quick-filer system on a weblog and on a big photosite. Neither uses folksonomy in a community way (yet)!

syscrusher’s picture

Bèr Kessels writes:
> Drupal is not only used by pea-brained people. WE use it too, and
> IMO we, as developers, *always* come in at #1. Scratch your own itch above
> all.

This is a point of vital importance, in my opinion. As long as a novice user is not actively impeded, having advanced features valuable to sophisticated users is a good thing, not a problem.

The analog is the use of a GUI for Linux, UNIX, or BSD operating systems, which makes them more approachable to a novice -- a very good thing. But few people would argue that we should remove the command shell from Linux just because most novice users don't understand it.

We need to make Drupal accessible for beginners, true, but as Bèr wisely points out, that need not mean "dumbing it down" for the rest of us. If Drupal loses its appeal to advanced users and hard-core techies, they will not only leave the user community but also the developer community.

Scott

factoryjoe’s picture

It’s really very important to understand that my goal of improving Drupal’s usability isn’t a process of “dumbing down” anything. The goal is to help people get things done in clear, logical ways. If that means simplifying a complex interface so that more people can get more done faster, I will remove extraneous, forward-facing UI elements to achieve that. I seriously have no interest in holding back developers from getting what they want so long as it doesn’t come at a significant mental effort cost to the rest of the potential user-base. Drupal’s wider adoption depends on it.

With that said, and applying that approach to the issue at hand, I have serious concerns about hierarchic folksonomies, especially with the numerous syntaxes that have been suggested (slashes, colons, arrows, etc). I do not believe, in other than a handful of cases, that you will be able to design a syntax for folksonomic hierarchies that really makes building said hierarchies easier, faster or more enjoyable. I do think that you can kludge a solution on after the fact through a separate module, like Morbus’ related terms module. Or perhaps you could infer a hierarchy from the structure of tags, but unless you can provide a Google-suggest-like feature that helps you build your hierarchy at the time of tagging, you’re going to end up with a mess.

Consider the example of tagging a sports news story (as suggested by Jaza). Start with this:

  • sporting news->baseball

Now let’s add teams to this:

  • baseball->teams->“red sox”
  • baseball->teams->“yankees”

And add an individual player:

  • baseball->teams->“red sox”->roster->“trott nixon”
  • baseball->players->“trott nixon”
  • “trott nixon”

Now here’s where this becomes unwieldy… How many layers of tags do we need? Shouldn’t these “tags” come from a fixed vocabulary? Isn’t there a reason why companies spend tons of money developing such taxonomies? The point is, if you start using folksonomies for hierarchic organization, all you’re really achieving is convenience but not accuracy. What if, for example, I ended up tagging the story with these tags?

  • basebal->players->“mike piaza”
  • baseball->player->“trott nixon”

Though my intention was to have a unified hierarchy like so:

+--+  baseball
   |
   +--+ players
      |
      +-- trott nixon
      |
      +-- mike piaza

I actually ended up with two completely different hierarchies because of two simple typos (basebal and player). A Google-suggest feature might have helped me avoid that mistake, but we’re not going to be shipping with anything like that so we’re instead putting a huge burden on taggers to remember the correct tags, their spelling and in what order they should be applied.
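To make that failure concrete, here is a minimal sketch in Python (not Drupal code). It assumes the arrow syntax from the examples above, and `build_hierarchy` is a hypothetical helper invented purely for illustration. It merges tag paths into a nested structure and shows that the two typos produce two separate roots instead of one.

```python
def build_hierarchy(tag_paths, separator="->"):
    """Merge tag paths like 'a->b->c' into one nested dict of dicts."""
    root = {}
    for path in tag_paths:
        node = root
        for name in path.split(separator):
            # Strip whitespace and the quotes used around multi-word tags.
            name = name.strip().strip('"“”')
            # Reuse an existing branch only when the spelling matches exactly.
            node = node.setdefault(name, {})
    return root


# What I meant to type: both players under baseball -> players.
intended = build_hierarchy([
    'baseball->players->"mike piaza"',
    'baseball->players->"trott nixon"',
])

# What I actually typed: "basebal" and "player" are typos.
actual = build_hierarchy([
    'basebal->players->"mike piaza"',
    'baseball->player->"trott nixon"',
])

print(sorted(intended))  # a single root
print(sorted(actual))    # two unrelated roots
```

Nothing in the merge step can tell that "basebal" was meant to be "baseball", so the only defence is a suggestion or correction step at input time.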

Again, I don’t necessarily mind leaving in this feature for power-users… I don’t think I’m going to “win” this discussion anyway. But I really think that this problem is much more complicated than it’s been made out to be so far. And even though we don’t have to imitate everyone else out there, there may be a reason why others have not attempted folksonomic hierarchies yet. I personally use hierarchies in del.icio.us (at least semantic pairings, like person:boris_mann) but I actually prefer that del.icio.us hasn’t tried to dictate a syntax for this yet, preferring to offer an interface for building ad-hoc lists of related tags.

I still say that this module should be as loose as possible out of the box and that creating folksonomic vocabularies should be stupidly simple. I might recommend Steve Krug’s book on this topic… Don’t Make Me Think: A Common Sense Approach to Web Usability

jibbajabba’s picture

I was asked to comment on this (sorry if it adds chatter). Problem with the term "folksonomy" is that it implies hierarchy, when in actuality, free tagging implies the application of discrete descriptions (e.g. of one concept), not for hierarchical ones. That said, if people can add hierarchy into your term, they will. Likelihood of this happening is probably less than 20% in any system I would argue. Probably less than 5% even.

You probably have several issues here: At the very least, you have administrator's configuration of hierarchy and end-user's ability to tag hierarchically. Free tagging is, in my opinion, a non-hierarchical task. Things get put into hierarchies after they're first described. Then they become folksonomies. The real problem is with finding ways to deal with synonyms after the fact. I like Morbus' ideas about helping the system find similar keywords. By the way, I haven't yet used/seen a module that actually takes advantage of relationships in a taxonomy. I'd love to see that aspect of controlled vocabularies be utilized in a meaningful way here.

For an example of a free-tagging system that allows for post-tagging hierarchy, see what James Spahr designed for the Pratt Talent site. The process is 1) let users enter free tags, 2) let administrator drop tags into taxonomy, 3) use different methods of display, i.e. flat lists and hierarchies.
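A rough sketch of that three-step process, in Python rather than Drupal code. I don't know how the Pratt Talent site actually implements it; the function names and the single `parents` mapping are assumptions made up for illustration.

```python
# Maps each tag to its parent tag, or None while it is still unclassified.
parents = {}


def add_free_tag(name):
    """Step 1: a user enters a free tag; it is stored flat, with no parent."""
    parents.setdefault(name.strip().lower(), None)


def place_under(tag, parent):
    """Step 2: an administrator later drops an existing tag under a parent."""
    if tag not in parents or parent not in parents:
        raise KeyError("both tags must already exist")
    parents[tag] = parent


def flat_list():
    """Step 3a: display every tag as a flat, alphabetical list."""
    return sorted(parents)


def as_tree(parent=None, depth=0):
    """Step 3b: display the same tags as (depth, tag) rows of a hierarchy."""
    rows = []
    for tag in sorted(t for t, p in parents.items() if p == parent):
        rows.append((depth, tag))
        rows.extend(as_tree(tag, depth + 1))
    return rows


for tag in ["Baseball", "red sox", "yankees"]:
    add_free_tag(tag)
place_under("red sox", "baseball")
place_under("yankees", "baseball")

print(flat_list())
print(as_tree())
```

The point of the design is that tagging never requires knowing the hierarchy: the parent link is added later, and the flat view keeps working whether or not anyone ever classifies a tag.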

How this gets used will largely depend on the user environment. In multi-user environments, I agree that you want to keep the design as simple as possible. I would target that group for the design.

1) Simplify this from the end-user perspective by making free tagging a flat activity as much as possible (don't encourage hierarchy by default)
2) Consider ways to organize hierarchies as a function of administration (after-the-fact classification).

-Michael

Morbus Iff’s picture

FileSize
20.16 KB

I've attached a new patch:

  • per #6,1: removed "node" from all (new) public help text.
  • per #6,3: expanded the (new) documentation in help/taxonomy.
  • per #6,5: fixed the foreach spacing error.
  • per #6,6: added/revised code commenting.
  • per #31,1: added "Company, Inc." to Example on the input box.

This patch does not address any hierarchy (none of my patches for this Issue will) or the pager() discussion. I will follow this comment shortly with the exact same patch EXCEPT for an alternative means of doing the pager() (as per comments #6,4 and #20). This alternative means has its own pros/cons.

Morbus Iff’s picture

FileSize
20.07 KB

I've attached an alternate patch that implements a slightly different way of handling pager() pages for the "view terms" administration screen. This alternative makes the display look exactly like the previous patch, so no new screenshot is needed. Ultimately, this patch reduces the lines of code needed, but at the expense of an extra SQL query which we do nothing with (which is the prime CON of this patch, complemented by a PRO that suggests the throwaway SQL is used on a rarely-visited page anyways). Per my comment in #20:

Dries: the big problem with the pager() stuff is vocabulary hierarchies, which is what _get_tree gives us (along with additional depth and parent attributes). I could probably reduce the hack's size by using pager_query() with a throwaway SQL statement, but I'd also have to throw away the returned database $result, since it wouldn't be useful (no hierarchy information).

Regarding features, workflow, and interface, this patch IS NO DIFFERENT than the one in #39.

Morbus Iff’s picture

FileSize
20.95 KB

Mmkay. Another patch based on more comments. No new features, just fixes, usability, etc.

Dries: don't commit #40, the altpager version. It's broken. It does, however, still have merit as "another way to handle pagers", but if you want me to pursue it, I'll need to make ya a new patch. It won't, however, be much "better" than the original pager hack, mainly because I'll still have to handle the increment and "from" manually (which is why #40 is broken - I forgot to handle "from").

This patch's improvements:

  • per UnConeD: admin/taxonomy/# displayed same help as admin/taxonomy. Removed.
  • per UnConeD: admin/taxonomy/#/add/term showed "view terms" not "add terms". Fixed.
  • per UnConeD/#18,2: admin/taxonomy shows "This is a free tagging vocabulary: view terms."

Two other changes that are more substantial:

  • A recursion "bug" was found in taxonomy_get_tree that would cause the function to be called once for every term in the desired $vid. This is unnoticeable for small vocabularies but, on a vocabulary with 8000 terms, caused any code that used that function to stall indefinitely. Since this would/could happen with an 8000 term controlled vocabulary, it was decided that this was a 4.6 bug, and UnConeD has already committed the fix.
  • admin/node shows a dropdown of terms in its filter dialogs. With 8000 terms, this caused the dropdown to slow the page down immensely (and only worked after increasing PHP's memory allocation to 64M, instead of the default 8M) and to make the dropdown pretty well unusable. node.module uses taxonomy_form_all to create this dropdown, and is the only (core) module that makes use of that function. _form_all, descended from _form, had a bunch of erroneous cut-and-paste parameters from _form that were never used in the actual code. I've removed these parameters and added a single new one, called $free_tags, that defaults to 0. When 0, it won't return any free tag vocabulary data, which removes the need to patch node.module (or any other contrib code that uses taxonomy_form_all). Developers who want $free_tags displayed will be able to pass 1 on their module_invoke.
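
The two changes above can be sketched roughly as follows. This is Python for illustration only, not the committed PHP: `get_tree` and `form_all` are hypothetical stand-ins for taxonomy_get_tree and taxonomy_form_all, the data shapes are assumptions, and the indexing shown is just one way to avoid repeating work per term, not necessarily what UnConeD's fix does.

```python
from collections import defaultdict


def get_tree(terms, parent=0):
    """terms: list of (tid, name, parent_tid). Returns [(depth, tid, name)]."""
    # Index children by parent once, up front, instead of once per term.
    children = defaultdict(list)
    for tid, name, parent_tid in terms:
        children[parent_tid].append((tid, name))

    tree = []

    def walk(parent_tid, depth):
        # .get() yields an empty list for leaves and for empty vocabularies,
        # so there is never an attempt to loop over a missing entry.
        for tid, name in children.get(parent_tid, []):
            tree.append((depth, tid, name))
            walk(tid, depth + 1)

    walk(parent, 0)
    return tree


def form_all(vocabularies, free_tags=False):
    """Build {vocabulary name: [indented term labels]} for a dropdown.

    Free tagging vocabularies are skipped unless free_tags is true, so
    existing callers that pass nothing never see them.
    """
    options = {}
    for vocab in vocabularies:
        if vocab["tags"] and not free_tags:
            continue
        options[vocab["name"]] = [
            "-" * depth + name
            for depth, _tid, name in get_tree(vocab["terms"])
        ]
    return options


vocabularies = [
    {"name": "Sections", "tags": False,
     "terms": [(1, "Sports", 0), (2, "Baseball", 1), (3, "News", 0)]},
    {"name": "Tags", "tags": True, "terms": [(4, "red sox", 0)]},
]

print(form_all(vocabularies))                  # controlled vocabularies only
print(form_all(vocabularies, free_tags=True))  # free tags included on request
```

Defaulting the flag to "off" is what keeps callers like node.module unpatched: they keep their old call and simply stop receiving thousands of free tags.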

Morbus Iff’s picture

FileSize
21.78 KB

I really really hope this is my last one for a while. This patch adds the indenting back in (removed per comment #2, and sorta shown in this screenshot). Whereas the screenshot used non-breaking spaces for indenting, this new code uses CSS, piggybacking off the indent used on "Permissions" at admin/access. I've reduced the indent slightly after lamenting to UnConeD about the 2em being too much - he suggested 1.5em as the absolute minimum.

jjeff’s picture

Very cool. Think I found a bug though...

Until the first 'free tags' were created I was getting the following error:
Warning: Invalid argument supplied for foreach() in /Users/jeff/Sites/drupal/modules/taxonomy.module on line 758

that line is this:
foreach ($children[$vid][$parent] as $child) {

Once I created some tags, the error went away... and I am in free tagging heaven!

-Jeff

Morbus Iff’s picture

FileSize
21.66 KB

#43 has been previously addressed.

Final attached patch per Dries' comments in #drupal.

Dries’s picture

Committed to HEAD. Great job!

Morbus Iff’s picture

Priority: Normal » Critical
FileSize
583 bytes

Somehow, taxonomy_node_delete got removed in my last patch (I think I was in the midst of testing another patch directly related to taxonomy_node_delete). This puts it back in. Critical patch - without it, people can't edit nodes they've posted (as the DB will complain about duplicate indexes).

Dries’s picture

Committed.

bilgehan’s picture

Version: » 4.6.0

amanda’s picture

Title: Add Folksonomy, or "Free Tagging", to Taxonomy » Free Tagging with 4.6 Patch Broken for Forums
Category: feature » support

I implemented Free Tagging in 4.6.6 using freetag-4_6_3.patch from http://cvs.drupal.org/viewcvs/*checkout*/drupal/contributions/sandbox/mo...

And started encountering the issue described here:
http://drupal.org/node/28607

When I disable the free-tagging vocabulary for forum topics, the problem disappears.

Gerhard Killesreiter’s picture

Title: Free Tagging with 4.6 Patch Broken for Forums » Add Folksonomy, or "Free Tagging", to Taxonomy
Category: support » feature

Don't hitchhike old issues, open new ones. Since free tagging with 4.6 is a non-standard feature, it is unsupported.