Task description:

Drupal currently provides an interface for term synonyms, but those synonyms are not available or used in the system anywhere.

When using the 'freetagging' feature of taxonomy, entering a new word or phrase creates a new term entry, even if that word or phrase is a valid synonym for an existing term.
This creates bogus entries and duplication.

The behavior should be to (optionally) 'collapse' and re-phrase the user-submitted synonym into the canonical term.

  1. Create an interface setting to enable/disable this behavior on a per-vocabulary basis
  2. When a word that is detected as a viable synonym within that vocabulary is submitted in 'freetagging' input, convert that word into its base term.
  3. Enhance the current AJAX autocomplete behavior to offer the base term as a suggestion when a synonym is detected.

Task 3 may be a little harder than it looks, as the autocomplete term lookup currently fires on all partial matches within words. In a large vocabulary containing many synonyms this could result in unintuitive false positives. But investigate.

Example

In a taxonomy of US states, a term "Virginia" may have a synonym "VA".
A user should be able to type "VA" in a taxonomy autocomplete field, submit the node, and have it matched up with "Virginia" in the database.
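
A minimal sketch of the collapsing step in this example, written as standalone Python rather than Drupal code. The dictionary layout, the name `collapse_tags`, and the case-insensitive comparison are assumptions made for illustration; it also splits on plain commas only and ignores quoted tags.

```python
# Hypothetical in-memory vocabulary: canonical term -> list of synonyms.
US_STATES = {
    "Virginia": ["VA"],
    "New York": ["NY"],
}


def collapse_tags(typed, vocabulary):
    """Split a comma-separated tag string and swap synonyms for base terms."""
    # Build a synonym -> base term lookup once (assumed case-insensitive).
    synonym_to_term = {
        synonym.lower(): term
        for term, synonyms in vocabulary.items()
        for synonym in synonyms
    }
    collapsed = []
    for raw in typed.split(","):
        tag = raw.strip()
        if not tag:
            continue
        # Unknown words pass through unchanged and would become new terms.
        base = synonym_to_term.get(tag.lower(), tag)
        if base not in collapsed:
            collapsed.append(base)
    return collapsed


tags = collapse_tags("VA, Maryland", US_STATES)
```

Here "VA" is replaced by "Virginia", while "Maryland", which is not a known synonym, is kept as typed.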

Deliverables

Deliverables would include a patch against taxonomy.module [DRUPAL-6-HEAD] taxonomy_autocomplete() and the vocabulary preferences form (maybe other places, like autocomplete.js, but that should be avoided).
Consideration must be given to database efficiency and performance implications when modifying the code.

This will involve skills in meaningful metadata (taxonomy) management, and, for fun, AJAX!

Resources:

Taxonomy import tools :

Estimated time: 3 Days

Primary contact: dman

Comments

dman’s picture

Title: GHOP task: Integrate the existing taxonomy 'synonyms' support with Drupal freetagging. » GHOP #113 : Integrate the existing taxonomy 'synonyms' support with Drupal freetagging.

Opening this issue for the task. Also found over on Google here

dman’s picture

FYI - ClaimedBy-cwgord Jan 6 2008
... looking forward to it.

cwgordon7’s picture

Assigned: Unassigned » cwgordon7
Status: Active » Needs review
Attached file: new, 4.79 KB

Needs review

catch’s picture

amaaaazing. Tried with multiple-word terms etc, all seems very smooth.

A couple of issues:
1. patch has windows line breaks (easy to fix in Notepad++ or similar)
2. I think there should be a drupal_set_message() to inform the user that foo has been saved as bar because it has been defined as a synonym by the administrator.
3. It doesn't deal with task item 3 - is that by design?

I reckon 2 without 3 would be fine though to avoid unnecessary complexity in the code. I'm not sure how usable it is to second guess what people are typing for autocomplete (think MS Word).

There's an issue against D7 which attempted to do this (but iirc, wasn't as clean and was CNW). I've wanted this for a long time but fear it won't make it into D6 since it's a 'feature'. Any chance of modularising it once the task is complete?

catch’s picture

Status: Needs review » Needs work

forgot to change status.

And one more thing:

Synonym collapsing
Enable collapsing of synonyms for this vocabulary.

My synonyms are going to collapse??? I think we need to find another way to describe what's happening. I'd probably say something like "Alias synonyms to terms when saving freetagging forms" - although that's not much good either.

dman’s picture

Good-o.
It applies just fine.

I'm not sure about the best way to flag schema updates as this does to the 'vocabulary' table. This taxonomy.install patch currently works only on a brand-new empty D6 site. To test it on a D6 already with some content (and vocabs) I had to do the schema update by hand.
However, creating a numbered schema update func for a module still in HEAD/dev is possibly overkill, so I won't stress it, but it's tricky for folk to test.

I imported a 'Countries and States' taxonomy, as described in the example, and used devel_generate to fill in a handful of nodes.
(devel_generate seems to be a bit broken with freetagging at the moment)
I then enabled freetagging, and enabled the synonym_collapsing using the new checkbox. All good.

Editing a page, I was able to add 'ny' and 'ab' in the textfield.
Then saving the node, I found that they had indeed resolved to 'New York' and 'Alberta', so things are working.

But.
I found this resolution only happened on 'save'. Not 'preview'.
I would have expected the collapsing to happen on preview ... as that's what preview is for - to ensure everything is resolving properly.
Can you try that? I'd call that critical.

A small issue that will need to be addressed is duplicate handling. Just to mess things up I had 'CA' as a synonym of both Canada and California.
Collapsing that returned only one (Canada). This is possibly a deeper issue with the core taxonomy_get_synonym_root(), but only because this is possibly the only time in history that function has been called!
Probably not critical, but currently the behaviour is 'undefined' - which is undesirable.
Better to do no collapsing (old behaviour) than risk getting the wrong one IMO.
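
A minimal sketch of the "no collapsing when in doubt" rule proposed here, in standalone Python rather than Drupal code. The data layout and the name `resolve_synonym` are hypothetical, and the case-insensitive comparison is an assumption.

```python
# Hypothetical vocabulary in which "CA" is a synonym of two different terms.
PLACES = {
    "Canada": ["CA"],
    "California": ["CA", "Calif"],
    "New York": ["NY"],
}


def resolve_synonym(word, vocabulary):
    """Return the base term for `word` only when exactly one term claims it."""
    wanted = word.lower()
    candidates = [
        term
        for term, synonyms in vocabulary.items()
        if wanted in (synonym.lower() for synonym in synonyms)
    ]
    # Zero matches: not a synonym. Several matches: ambiguous, so do not guess.
    return candidates[0] if len(candidates) == 1 else None
```

With this rule, "ny" resolves to "New York", while "CA" yields `None` and the caller can fall back to the old behaviour or ask the user which term was meant.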

And the fun part - item3, the AJAX suggest, isn't there yet. Have you thought about what's needed?

... and in the process of getting testing up, I went away and upgraded a branch of taxonomy_xml.module (my first D6 conversion) so I'd better go tidy that up. I'll see if I can package up my test vocab for testing purposes. :-)

Good start though!

catch’s picture

I also just came up on the duplicate issue:

core's synonyms allow you to:

1. add synonyms which are the same as synonyms in different terms
2. add synonyms which are the same as real terms in the same vocabulary.

1. Should possibly be disabled?? Or some kind of validation warning when you do it? (not necessarily for this task of course)
2. hmm

I can think of use-cases for the second.
Say I have a load of news items tagged with New Zealand, then New Zealand changes its official name to Aotearoa. I want my old items to stay in New Zealand, but any new ones to move to Aotearoa, so I enable this setting and it happens by magic. There are probably use cases in the other direction as well though (like keeping both tags and using the synonyms to make them show up in searches).

dman’s picture

#2 is not too bad.
In the first case, if you made a vocab like that, it's your own damn fault, and you should have used related terms or something.
In the second case, if your taxonomy is growing organically and arrived at this situation, it's getting more and more specific terms added to it, so splitting the synonym off into its own term later would, as you say, have pretty much the desired effect.

#1 is the more annoying, and less visible one. There is no interface to list synonyms, and there is no current prevention of doing this. Nor need there be, really, as in a large semantic vocabulary such a situation may actually be legal.
BUT we do have to capture it at the automatic stage of possibly unexpected 'collapses'.

catch’s picture

#1, yeah I agree with this. Maybe it should do it by the weight of the term? ack but then most terms will have identical weights anyway :(

Or maybe if you try to save the form with the 'collapse synonyms' option checked on a vocabulary which has duplicate synonyms, it should stop and tell you off. That wouldn't prevent people adding dups afterwards, but it'd help, and would just be an extra bit of validation.
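
The validation suggested here could look roughly like the following standalone Python sketch (not Drupal code). The dictionary layout and the name `find_duplicate_synonyms` are assumptions for illustration, as is treating synonyms case-insensitively.

```python
from collections import defaultdict


def find_duplicate_synonyms(vocabulary):
    """Map each synonym shared by several terms to the sorted list of its owners."""
    owners = defaultdict(set)
    for term, synonyms in vocabulary.items():
        for synonym in synonyms:
            owners[synonym.lower()].add(term)
    return {
        synonym: sorted(terms)
        for synonym, terms in owners.items()
        if len(terms) > 1
    }


# Hypothetical vocabulary with one clash.
duplicates = find_duplicate_synonyms(
    {"Canada": ["CA"], "California": ["ca"], "New York": ["NY"]}
)
```

A settings form could refuse to enable collapsing while the returned dictionary is non-empty and list the clashing synonyms in its error message.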

cwgordon7’s picture

Ok, not sure what you want me to do. A major rewrite of the node module & taxonomy module to make terms change during preview? Setting a message "Term x has been changed to term y"? Rewrite the definition of synonyms so that synonyms can't be assigned to more than one term? Or just re-saving the patch with unix line endings? ;)

And I did add the AJAX stuff: all the patch's changes to taxonomy_autocomplete were addressing that. Was that not what you had in mind?

Maybe it would be better for you people to work out the details so that the time I commit into this is well spent. :)

-cwgordon7

dman’s picture

Ok, not sure what you want me to do.

This is all just responses to the code we tried, throwing some issues and suggestions for improvement out there. It's a small discussion.

A major rewrite of the node module & taxonomy module to make terms change during preview?

A major rewrite? How about just catching it in form-validate?

I just found on testing that it was disconcerting (actually wrong) to enter a synonym, preview the submission (no change) then save and find that a change magically happened then, when it was too late to fix.
Although the thing works just peachy if you never preview, a misleading preview is wrong behaviour from the user POV.

Although the wording in the task is, to be pedantic, "submitted"
2: When a word that is detected as a viable synonym within that vocabulary is submitted in 'freetagging' input, convert that word into its base term.
The behaviour for this to work as desired is for the user to be able to see (and hopefully understand) the collapse take place.

Needs to be fixed

Setting a message "Term x has been changed to term y"?

That's probably a good thing. It'll only happen very rarely, and it's better to be clear as it could be obscure and confusing to a user who doesn't understand WHY their term won't save right.
So I say yes.

Rewrite the definition of synonyms so that synonyms can't be assigned to more than one term?

This is certainly out-of-scope :-)
It's just a related issue that was brought up for discussion as a result of this task. Not your problem.

...although you may contribute an opinion on how you think duplicates should be handled. This question is still open.
Possible behaviour could be to flag a message something like "CA is listed as a synonym for both Canada and California. Please specify which term you really meant. Neither has been added yet"

Or just re-saving the patch with unix line endings? ;)

Well that's good too.

And I did add the AJAX stuff: all the patch's changes to taxonomy_autocomplete were addressing that. Was that not what you had in mind?

I apologise, I did not see that in the code. It certainly wasn't working at all for me in the UI. ... and still isn't. I'll see if I can find out why.

dman’s picture

Looking at the taxonomy.pages.inc taxonomy_autocomplete()
... I'm not sure the logic is the right way around. It's just not triggering for me.

Desired behaviour is to enter "ny" and have "New York" appear as a suggestion.
Or, if "Chartreuse" is a synonym for "Green", entering "char" will return "Green" as a suggestion.
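
To make the desired behaviour concrete, here is a minimal standalone Python sketch (not the patch's code). The vocabulary layout and the name `suggest` are hypothetical; it assumes case-insensitive partial matching on both term names and synonyms.

```python
# Hypothetical vocabulary: canonical term -> list of synonyms.
VOCAB = {
    "New York": ["NY"],
    "Green": ["Chartreuse"],
    "Blue": [],
}


def suggest(fragment, vocabulary):
    """Return base terms whose own name or any synonym contains `fragment`."""
    needle = fragment.lower()
    suggestions = []
    for term, synonyms in vocabulary.items():
        names = [term] + list(synonyms)
        # A hit on a synonym still suggests the base term, never the synonym.
        if any(needle in name.lower() for name in names):
            suggestions.append(term)
    return suggestions
```

The key point is that the synonym lookup happens independently of the term-name lookup, so "ny" can suggest "New York" even though no term name contains "ny".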

The code as it stands appears to loop looking for synonyms inside a lookup first made for major terms. If there are no major term hits, there is no attempt to fall back to a synonym lookup.
And if there IS a hit, the lookup is looking for the wrong thing.

Can you describe the sequence/situation that you've solved here? I can't quite see when it would fire.
- a string was entered
- that string was found as part of a term name
- the synonym root of that found term is retrieved??

Is that what's happening? Am I missing something? Coz that won't work.
Does it work for you?

cwgordon7’s picture

Attached file: new, 6.8 KB

Attached is a new patch that is pretty thoroughly tested. Hopefully this is what you envisioned.

As to how I think the taxonomy module should handle synonyms: the admin should be warned when attempting to add a synonym to a term if another term already has that synonym. Perhaps open up a separate issue for that.

cwgordon7’s picture

Status: Needs work » Needs review

Whoops, setting to cnr.

dman’s picture

Groovy!
Looks to be working exactly as hoped.

(note that the patch file isn't correctly built from root, although the first one was created just fine. I'm not sure if that's down to the 'unified' format or what the working directory was. I could fix it by moving into the modules/taxonomy dir, so no biggie)

So...
Ajax suggestions work great so far.
Collapses show up on Preview.
Collapsing makes the right guesses, and tells the user what just happened.

I <3 it.

Functionality seems to be there!
I'll just do some more edge-case tests and a pedantic code review. (sorry ;-) and I'd be ready to call it good.
Anyone else is welcome to kick its tyres too.

dman’s picture

I dunno, but possibly a clearer description could help everyone's understanding and uptake.

'Enable collapsing of synonyms for this vocabulary (entering a synonym in freetagging will attach the primary term as a tag).'

Non-critical, just a random thought.

...
I like the addition of the taxonomy_collapse_synonyms() function. Very tidy.
...

When deliberately trying to break it (sorry again, but this is testing) to try out the duplicate resolution, I got two unexpected results on preview.
They did actually work out fine on submission, (it's preview only as far as I can see) so they are not killer, but still not quite perfect.
Both involved ignoring/not using the AJAX suggestions, and making logical errors, so it's pretty much edge cases, but:

Problem 1 - collapsing didn't happen right if there was a duplicate and two synonyms.

Problem 2 - collapsing didn't happen right if there was a mixed-case entry alongside a synonym for the same.

Relax. Do not worry about this unless there is an easy fix you can see. The mixed-case problem may be simple. I'm not really sure what went wrong with the first one tho.
Because, as I say, I'm trying to break it, I won't call these serious problems. I'm just being a mean, stupid user. ;-)

I'm really happy with the AJAX suggestions. I haven't hit any bad conflicts as I thought may have happened.

Overall, code looks good. I'll see if anyone else wants to review, but I'd call this a success if nothing else comes up by the end of the day.

dman’s picture

Status: Needs review » Reviewed & tested by the community

OK. A day later and no other feedback, so I'll just call that done and done!

The specified task is certainly achieved.
The result is much better and more usable than before.
The code is good and tidy.
The additional DB call doesn't look too onerous on the system.
Synonyms finally MEAN something to editors now! That's been a long time coming.

Thank you!

I'll go do the paperwork at GHOP.

I'll mark this as RTBC, as that's the result we wanted to see for this task ... Although I can't make the call on my own as to whether this actually gets rolled into D6 core. I think it should, so +1 for that.

gábor hojtsy’s picture

Version: 6.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Needs work

This is not just a new feature, but a database change without an upgrade path as well. Not to mention that the feedback in #16 has not been taken care of. New features and database changes go to Drupal 7.

dman’s picture

True 'nuff. I forgot about the database change, which is a bit sticky.
Upgrade path is transparent IMO, but it certainly rates as 'new feature'.

catch’s picture

fwiw, I've discussed with cwgordon about turning this into a contrib module for D6, so hopefully we can refine how it works there - and either keep it as a very useful contrib, or revisit it for core in D7 depending on how that works out.

gábor hojtsy’s picture

catch: sounds very good!

catch’s picture

I should note that we discussed cwgordon turning it into a contrib, not me ;) Although if he decides not to that might have to be my first module.

dman’s picture

I personally think it is such an integrated part of taxonomy management, and such a small volume of code, that having it as a contrib would be much more overhead than it's worth. It touches parts of code where hooks don't and don't need to exist. It's a hard one to do as contrib (IMO)

However, if (in 7) core was streamlined to abstract synonym storage in general out into its own contrib, that would make lots of sense. Currently, and since forever, Drupal's core 'synonyms' have been pure cruft.
If this feature - the only practical use of synonyms ever - doesn't belong in core, then synonyms don't belong in core.

cwgordon7’s picture

dman: I agree, synonyms—and this patch—do belong in core. However, since no new features go into drupal six, this is only applicable for Drupal seven core (and it should be there). Providing it as a contributed module for drupal 6, however, would maybe not be ideal, but would still be useful for people who use synonyms on their sites. It's not so hard to do as a contributed module: just some fancy hook_form_alter work on #autocomplete_path, plus some cool stuff with hook_link_alter, and that's it. The module's almost finished, and should be up sometime today or tomorrow. Fixes for the problems dman pointed out should appear in the contrib module; I'll post a revised patch in maybe a few days.

-cwgordon7

cwgordon7’s picture

Update: Has been added as a contributed module at http://drupal.org/project/synonym_collapsing.

cwgordon7’s picture

Title: GHOP #113 : Integrate the existing taxonomy 'synonyms' support with Drupal freetagging. » Integrate the existing taxonomy 'synonyms' support with freetagging.

Removing "GHOP" from issue title so people feel free to come in and help / test :).

dman’s picture

I was actually quite serious about pulling synonyms out of core.

Obviously they are only 'used' by a tiny minority of special-use coders, if at all. For every other Drupal site out there, they are overhead, and possibly UI clutter. Why not actually remove unused features from core :-) ?
Making them an optional core or stand-alone contrib (but with this actual functionality, maybe more, attached) does make sense.

cwgordon7’s picture

Category: task » feature

I think synonyms definitely do belong in core: perhaps a separate module from the taxonomy module? Taxonomy, after all, does not require synonyms, but synonyms are a nice-to-have core functionality for nearly all freetagging vocabularies with this patch. The answer is not to just remove synonyms from core because they're not used, but to ask why they're not used and improve them. My answer: (1) They didn't do anything, and (2) It's a little hard to understand at first: are taxonomy terms synonyms of each other? Are synonym relations two-way? (Etc.) Better documentation / descriptions would improve #2, and this patch would make a good start at #1

Setting this to feature request because, well, it's a feature request.

fractile81’s picture

Just wanted to chime in and say that my shop uses synonyms. We had a legacy taxonomy (non-Drupal) with synonyms, and it was great to have that there in core. That said, it was a little disappointing to not have more uses for it (as mentioned above). This new feature begins to open up the possibilities of synonyms, making for some very interesting free-tagging! I can't remember if this is stock or something I've coded for my site, but synonyms on taxonomy/term pages is one place they should definitely be showing up.

+1 to the ideas here (except for pulling it out of core), and I've already downloaded the contrib module (just haven't been able to play with it yet). Now that synonyms are in the spotlight though, maybe new restrictions need to be imposed? Like-named terms shouldn't be allowed as synonyms, as an example (case insensitive). And certainly better documentation is always helpful!

Again, great work here!

cwgordon7’s picture

Status: Needs work » Needs review
Attached file: new, 7.02 KB

I believe this should do (?).

oriol_e9g’s picture

+1 to put this patch in core in D6 or D7, but I don't like the last change:

$added_terms[strtolower($term)] = strtolower($term);

I have a taxonomy that the term "a" and "A" are different terms.

cwgordon7’s picture

Ok, so let's decide something: are taxonomy terms going to be case-sensitive? That's really what #31 is about. Is it really useful for people to have different terms for different cases? Or does it just add more confusion?

fractile81’s picture

So long as the auto-suggest matches in a case insensitive way, I'd say make it case sensitive. I agree that sometimes it might sound silly to be case sensitive, but you never know how people are going to use their Taxonomy. "Us"/"us" vs. "US" and any other acronym collisions would be a good example.

cwgordon7’s picture

So then the patch in #13 is really the desired behavior.

sun’s picture

Aren't we able to suggest all terms case-insensitive, but replace only 1:1 matches? This patch should not prevent users from adding new terms, but only replace synonym matches.
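
One possible reading of this proposal, sketched in standalone Python rather than Drupal code: suggestions match case-insensitively, but a typed tag is only replaced when it equals a synonym exactly. The function names and data are hypothetical, and interpreting "1:1 matches" as exact, case-sensitive equality is an assumption.

```python
# Hypothetical vocabulary: canonical term -> list of synonyms.
COUNTRIES = {
    "United States": ["US"],
    "Canada": [],
}


def suggest(fragment, vocabulary):
    """Case-insensitive partial match on names and synonyms, for autocomplete."""
    needle = fragment.lower()
    return [
        term
        for term, synonyms in vocabulary.items()
        if any(needle in name.lower() for name in [term] + list(synonyms))
    ]


def collapse_exact(tag, vocabulary):
    """Replace `tag` only when it equals a synonym exactly, case included."""
    for term, synonyms in vocabulary.items():
        if tag in synonyms:
            return term
    # Anything else is left alone, so users can still add new terms.
    return tag
```

Under this reading, typing "us" still offers "United States" as a suggestion, but submitting "us" creates or reuses the term "us", whereas submitting "US" collapses to "United States".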

+1 aside from that, badly needed for #187480: Allow free tagging vocabulary in issue followups

catch’s picture

Ideally the synonym collapsing should work just as if the term was a 'full' one - so it'd be good to follow whatever the autocomplete/new tag creation behaviour is now in regards to case-sensitivity. If it's case insensitive - then this patch should use drupal_strtolower though. Case sensitive tags are really an issue for non-free tagging vocabularies, which this doesn't impact on at all.

SeanBannister’s picture

+1 this is a great feature and I know many websites that would benefit from it. I think synonyms have a very important role in many Drupal sites and this takes it to the next level. I'm installing the D6 module of this feature now.

Could case-sensitivity be an option on a per-taxonomy basis? The websites I run wouldn't want case-sensitivity.

catch’s picture

Status: Needs review » Needs work
Issue tags: +GHOP

Needs re-rolling for dbtng. Since the EOL taxonomy sprint, where our synonym support showed itself to be useless for nearly all use cases (apart from this one of course), I'm thinking we should move synonyms into a contrib module for D7 - this would be a cool feature for such a module, which could also handle some other stuff with synonyms too.

catch’s picture

Title: Integrate the existing taxonomy 'synonyms' support with freetagging. » Synonym collapsing in core
Assigned: cwgordon7 » Unassigned

xano’s picture

Assigned: Unassigned » xano
Status: Needs work » Needs review
Issue tags: -GHOP +d7uxsprint
Attached file: new, 4.92 KB

dries’s picture

Status: Needs review » Fixed

I've tested this and it works. Committed.

dman’s picture

Hooray!
This is goodness!

xano’s picture

Note that this patch only updates the autocomplete functionality and not the backend. Users can still enter synonyms and they will be added as new terms. We are still thinking about a proper way to handle this.

damien tournoud’s picture

Status: Fixed » Active

Erm. Why is the second LOWER() missing from this?

-      ->where("LOWER(t.name) LIKE LOWER(:last_string)", array(':last_string' => '%' . $last_string . '%'))
+      // Select rows that either match by term or synonym name.
+      ->condition(db_or()
+	    ->where("LOWER(t.name) LIKE :last_string", array(':last_string' => '%' . $tag_last . '%'))
+	    ->where("LOWER(ts.name) LIKE :last_string", array(':last_string' => '%' . $tag_last . '%'))
+      )

Also, this is making an already slow query orders of magnitude slower. We really need a better solution for this.

damien tournoud’s picture

Note: the second lower is not needed on MySQL, but would break on PostgreSQL.

xano’s picture

+  $tag_last = drupal_strtolower(array_pop($tags_typed));

catch’s picture

Category: feature » bug

I'd rather see the synonym query only run if we don't have a current match for a tag, this at least means only one slow query at a time.
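
A minimal standalone Python sketch of this two-step idea (not Drupal code): look at synonyms only when no term name matches. The name `suggest_with_fallback` and the sample data are hypothetical.

```python
# Hypothetical vocabulary: canonical term -> list of synonyms.
COLOURS = {
    "Green": ["Chartreuse"],
    "Charcoal": [],
}


def suggest_with_fallback(fragment, vocabulary):
    """Match term names first; consult synonyms only if nothing matched."""
    needle = fragment.lower()
    by_name = [term for term in vocabulary if needle in term.lower()]
    if by_name:
        # In a database-backed version, the synonym query would be skipped here.
        return by_name
    return [
        term
        for term, synonyms in vocabulary.items()
        if any(needle in synonym.lower() for synonym in synonyms)
    ]
```

Note the trade-off in this sample: "char" matches the term "Charcoal", so "Green" (via its synonym "Chartreuse") is not offered until the user has typed "chart".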

damien tournoud’s picture

@Xano: because collation rules between PHP and the database might be (and are) different, you have to run both LOWER()s in the database engine.
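
As an illustration of lowering both sides inside the database, here is a minimal sketch using Python's built-in sqlite3 module rather than Drupal's database layer. The table and column names are simplified, hypothetical stand-ins for the real schema. SQLite's LIKE already ignores ASCII case, so this only shows the shape of the query, not the PostgreSQL failure being discussed; the sketch also does not escape `%` or `_` in the typed fragment.

```python
import sqlite3

# Hypothetical, simplified tables: one for terms, one for their synonyms.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE term (tid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE synonym (tid INTEGER, name TEXT);
    INSERT INTO term VALUES (1, 'New York'), (2, 'Green');
    INSERT INTO synonym VALUES (1, 'NY'), (2, 'Chartreuse');
    """
)


def autocomplete(fragment):
    """Return term names matching `fragment` by term name or by synonym."""
    rows = conn.execute(
        """
        SELECT DISTINCT t.name
        FROM term t
        LEFT JOIN synonym ts ON ts.tid = t.tid
        WHERE LOWER(t.name) LIKE LOWER(:pattern)
           OR LOWER(ts.name) LIKE LOWER(:pattern)
        ORDER BY t.name
        """,
        # The raw fragment is passed through; the database lowers both sides.
        {"pattern": "%" + fragment + "%"},
    ).fetchall()
    return [name for (name,) in rows]
```

Because the typed string is lowered by the same engine that lowers the column, both sides follow the same case-folding rules, which is the point being made about PHP versus database collation.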

xano’s picture

Status: Active » Needs work

That means the current logic for finding matches needs a total rewrite.

@catch: Currently we only use one query, which is just a tad slower than the original query. I'm not sure if we should only search for synonyms if the typed tag doesn't match any terms. Especially with larger vocabularies (and the increased possibility of terms with nearly similar names) this can cause unwanted behaviour.

asb’s picture

Can someone please update the status of this issue?

If I remember correctly, support for synonyms has been dropped from D7, so how do we continue here?

SeanBannister’s picture

@asb Drupal 7 allows you to add fields to taxonomy terms, so you can still add a synonym field just like Drupal 6. But because D7 core doesn't come with a Synonym field anymore this issue would have to be dealt with in contrib.

catch’s picture

Version: 7.x-dev » 8.x-dev

asb’s picture

Thanks for the update.

If even synonym handling itself moves to contrib, this issue isn't relevant for the castrated taxonomy.module in D7 and up. So the 'component' tag must not be 'taxonomy.module', right?

Since not even D6 plus existing contrib modules support synonym collapsing (see 'Synonym Collapsing' issue queue), this also can not be a bug report against core anymore (#47).

As I understand it, this is now

  • either a wishlist for a not yet existing contrib module that provides a synonym field for D7,
  • or a feature request for D8 taxonomy.module to re-implement synonyms plus synonym collapsing in D8 core.

What will it be?

catch’s picture

There's no reason for it not to be both.

catch’s picture

Category: bug » feature

dman’s picture

Due to extremely low uptake of this feature even when it was built in, it sure seems like a contrib job now. But all the hooks are deliberately left in place so it is possible to do well in contrib, next time someone tries.

tanoshimi’s picture

The 7.x-1.x-dev version of synonyms module already provides a synonym field for taxonomy terms and also adds synonyms to the search index. It seems like synonym collapsing could easily be added into this module too.
http://drupal.org/project/synonyms

xjm’s picture

Status: Needs work » Closed (won't fix)

Since synonyms no longer exist as such in core, I think this is for contrib now.

asb’s picture

xano’s picture

Assigned: xano » Unassigned

xano’s picture

Issue summary: View changes

Subheadings and list formatting added, primary contact linked