Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Posted by xjm
Problem/Motivation
(From #1050466: The taxonomy index should be maintained in a node hook, not a field hook.)
- If a node has multiple taxonomy autocomplete fields that use the same vocabulary and the user enters the same value in two fields, the term will be duplicated.
Example scenario
This is not just a weird edge case. Consider:
- You have a "University" vocabulary.
- You allow users to enter their undergraduate school in one field, and their graduate school in another.
- The user enters "University of Texas" in both fields.
- When the node is saved, there will be two different terms named "University of Texas."
Proposed resolution
- Check for existing records with the same name in the same vocabulary before creating new terms.
Remaining tasks
- It's been suggested that the query used in
taxonomy_field_presave()
to check for existing terms could lead to race conditions:
+++ b/core/modules/taxonomy/taxonomy.moduleundefined @@ -1710,6 +1710,15 @@ function taxonomy_rdf_mapping() { + // Avoid duplicating tags within the same vocabulary. + $tid = db_query_range("SELECT tid FROM {taxonomy_term_data} WHERE name = :name AND vid = :vid", 0, 1, array( + ':name' => trim($item['name']), + ':vid' => $item['vid'], + ))->fetchField(); + if (!empty($tid)) { + $items[$delta]['tid'] = $tid; + continue; + }
See #1050466-34: The taxonomy index should be maintained in a node hook, not a field hook for some proposed solutions.
User interface changes
- None.
API changes
- None.
Comment | File | Size | Author |
---|---|---|---|
#27 | add_duplicate_term-1343822-27.patch | 10.43 KB | chr.fritsch |
#25 | interdiff-1343822-22-25.txt | 3.64 KB | chr.fritsch |
#25 | add_duplicate_term-1343822-25.patch | 10.44 KB | chr.fritsch |
#22 | add_duplicate_term-1343822-22.patch | 7.17 KB | chr.fritsch |
#20 | add_duplicate_term-1343822-20.patch | 5.64 KB | chr.fritsch |
Comments
Comment #1
xjmComment #2
xjmComment #2.0
xjmUpdated issue summary.
Comment #2.1
xjmUpdated issue summary.
Comment #3
xjmMarked #1146744: The tagging widget doesn't check if the same term already exists as duplicate. From that issue:
Comment #4
catchI think there's another issue for this somewhere as well, but can't find it at the moment.
Comment #5
fietserwin#4; #610076: Add a primary key to the {taxonomy_index} table and prevent duplicates being inserted?
I think that the race condition can be avoided by using a merge query (aka upsert), That is, if that would be a real merge query at database level OR if the 2 separate queries (that are issued directly after each other) are placed in a transaction.
As further work on #610076: Add a primary key to the {taxonomy_index} table and prevent duplicates being inserted is going on, I think this one can be closed as a duplicate. Perhaps the test can be added to the mentioned issue.
Comment #6
kscheirer#1: autocreate-1343822-combined.patch queued for re-testing.
Comment #8
xjmI'd suggest postponing work on taxonomy autocomplete bugs at least a week or so until we see what direction #1847596: Remove Taxonomy term reference field in favor of Entity reference is going to take.
Comment #9
james.williams CreditAttribution: james.williams commentedHere's a port of the patch for Drupal 7, without the accompanying tests. Applies cleanly & fixes the bug for me.
Comment #9.0
james.williams CreditAttribution: james.williams commentedUpdated summary.
Comment #9.1
xjmRemoving myself from the author field so I can unfollow. --xjm
Comment #10
jibranAll the taxonomy field related matters are handled by entityreference field after #1847596: Remove Taxonomy term reference field in favor of Entity reference
Comment #11
amateescu CreditAttribution: amateescu commentedDrupal\taxonomy\Plugin\EntityReferenceSelection\TermSelection
is provided by the Taxonomy module, moving back to the right queue.Comment #12
anrikun CreditAttribution: anrikun commented@james.williams
Thanks for patch at #9!
Comment #13
michaelmallett CreditAttribution: michaelmallett commentedPatch worked great for me, thanks. Is there any way to get this in core? Just wondering why it's been around for 3 years.
Comment #14
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSince there is no more dedicated taxonomy reference field, just the generic entity reference one, this can no longer be fixed at the field level, we need to move the fix higher up the chain to taxonomy term entity presave.
But, as far as I see, there is no mechanism that prevents adding terms with the same name in a certain vocabulary, so maybe this should be added as a vocabulary configuration option?
With that in mind, I'm moving the issue to 8.1.x and reclassifying as a feature request.
Comment #17
ABaier CreditAttribution: ABaier commentedI am wondering if there is any movement within this issue or maybe even a patch for 8.2.2?
My editors can add multiple event dates to a node using a paragraphs type which also includes a taxonomy reference field for the type of each individual event (for example "concert"). Field settings are autocomplete tagging and "Create referenced entities if they don't already exist".
If they add multiple instances of this paragraph on node creation, all containing a similar, non existent tag of "event type", this term will be created multiple times with the same name, but varying TIDs, after saving the node.
Unfortunately I cannot help with the integration, but I would think the existence of the entered terms has to be validated, maybe for each individual instance of the taxonomy field, one after another, if it appears multiple times inside the node form.
Comment #18
ABaier CreditAttribution: ABaier commentedAnother case where a taxonomy field could exist multiple times in a node edit form, besides paragraphs, would be the use of media entity. Maybe many more, if inline entity form is used to create other entities within a node.
I ran in both issues, paragraphs and media entity. The paragraph is used to tag for example the location of an event and the media entity image/video can also be tagged with this location vocabulary. This is all for categorization purposes and granulary filtering of content. To mention is, that the node itself also contains some taxonomy fields including the location field.
What I want to say is, that the chance of having more instances of one taxonomy field is quite big. Especially with the upcoming node creation workflows using the mentioned modules.
Now that media entity will be ported to core this issue really should be considered.
Any ideas?
Comment #20
chr.fritschHere is a first patch. I implemented the check as a Validator.
Comment #22
chr.fritschFix some tests
Comment #23
chr.fritschI know, tests are still missing.
I want first to get a sign-off, if this approach is feasible and we want to continue into this direction.
Comment #24
webflo CreditAttribution: webflo at UEBERBIT GmbH commentedThis method should be on the interface. I think its not possible to add methods to an existing interface. Not sure about the BC implications. I think we have to add a new one.
Should check for the new interface first.
Access checks are applied to entity queries by default. Maybe we should opt out to make sure its really unique?
I agree on the overall approach.
Comment #25
chr.fritsch1+2. I discussed #24-1 with @dawehner and we think because of https://www.drupal.org/node/2562903 it should be possible to add a new method to the class and the interface.
3. Opted out for accessCheck
Additionally i added a test to the TermTest.php
Comment #26
mtodor CreditAttribution: mtodor at Thunder commentedThis looks good, I have tested it (works fine) and checked code. Only few nitpicks.
This expression can go after
if (!Vocabulary::load($entity->bundle())->preventDuplicates()) {
statement. And also it would be better to get$items->getFieldDefinition()
in some variable and then use it in 2 places, because $field_name is used only once, so there is no need for variable. There is also repeating for$items->getEntity()
and$entity->getEntityType()
- they could be assigned to variable.If we already have check
$this->assertText('Created new term ' . $term->label());
at end of test, maybe it can be added here too.Instead of making new random name, it would be easier to just make something like this:
$edit['name[0][value]'] .= '_2';
- to avoid 1 in 64509974703297150976 cases when we will get two same random generated names. ;)Comment is a bit misleading. I think it should be: "Indicates if multiple terms with same name are forbidden in the vocabulary." - because that's behaviour when you get TRUE.
Overall -> VERY NICE!
Now we will not get errors like: Multiple entities match this reference; "My Tag (3)", "My Tag (7)". - when I just want to use "My Tag". :)
Comment #27
chr.fritschI adjusted all the comments from #26. Unfortunately it was not possible to create an interdiff.
Comment #28
gg24 CreditAttribution: gg24 as a volunteer and at QED42 commentedSteps followed to test the Patch #27 on version 8.4.x:-
1- Applied the patch.
2- Checked "Prevent duplicates" checkbox on edit vocabulary page.
3- Added two taxonomy referenced fields referencing same vocabulary in a content type.
4- Checked "Create referenced entities if they don't already exist" for both the fields in the respective content type.
5- Created a content for the above mentioned content type and selected same term for both the taxonomy fields.
6- Saved the node.
This still creates two terms with the same name under same taxonomy vocabulary.
Hence, this patch doesn't work as intended.
Please mention if i need to follow any other step too, if i missed something.
Comment #31
cbeier CreditAttribution: cbeier commentedComment #32
cbeier CreditAttribution: cbeier commentedThe validation works currently only in term edit and not in the autocomplete field. In the autocomplete field is in the current state no further validation running (as far as I know).
For the problem with validations inside the term autocomplete field is another issue: #2438017: Entity reference does not validate auto-created entities
If the general possibility to implement validations for the autocomplete field, we can also implement the duplication validation.
Comment #34
acbramley CreditAttribution: acbramley commentedIn my opinion this shouldn't be throwing validation errors, but should be a seamless operation on a tagging widget (potentially via a setting on the field definition or widget) that just uses the existing term that was previously created. This has to be done somewhere in the entity reference field type or the EntityReferenceSelection plugin. This is due to the code flow:
1) The EntityAutocomplete Element plugin validation calls SelectionWithAutocreateInterface::createNewEntity (this doesn't save the entity straight away)
2) When the second field's element goes through validation and looks for the existing entity, it doesn't find the one that was just created since it hasn't been saved. Therefore a new entity is created.
3) EntityReferenceItem::presave saves the unsaved entities.
We could instead pass along an option to the EntityReferenceSelection plugin's config to auto save those entities? This would obviously lead to some orphaned terms for incomplete forms but for me that'd be better than the current state.
It also seems like this issue's title has diverged from what the issue summary is describing so I'm going to set it back to the original title.
Comment #36
guilhermejz CreditAttribution: guilhermejz commentedI have done that, just $entity->save(); before the return on createNewEntity function. It works.
Comment #43
smustgrave CreditAttribution: smustgrave at Mobomo commentedClosing as cannot reproduce in 9.5
Added 2 fields to a content type.
Both pointing to the same taxonomy
Both autocomplete tags style
Added Red to both
Tried to save the form but it actually throws a validation error and won't let me save ( which is good I think).
No terms added to the taxonomy
If you feel this is still an issue please reopen with an updated issue summary, steps to reproduce, proposed resolution, etc