Posted by xjm

Problem/Motivation

(From #1050466: The taxonomy index should be maintained in a node hook, not a field hook.)

  • If a node has multiple taxonomy autocomplete fields that use the same vocabulary and the user enters the same value in two fields, the term will be duplicated.

Example scenario

This is not just a weird edge case. Consider:

  1. You have a "University" vocabulary.
  2. You allow users to enter their undergraduate school in one field, and their graduate school in another.
  3. The user enters "University of Texas" in both fields.
  4. When the node is saved, there will be two different terms named "University of Texas."

Proposed resolution

  • Check for existing records with the same name in the same vocabulary before creating new terms.

Remaining tasks

  • It's been suggested that the query used in taxonomy_field_presave()to check for existing terms could lead to race conditions:
    +++ b/core/modules/taxonomy/taxonomy.moduleundefined
    @@ -1710,6 +1710,15 @@ function taxonomy_rdf_mapping() {
    +      // Avoid duplicating tags within the same vocabulary.
    +      $tid = db_query_range("SELECT tid FROM {taxonomy_term_data} WHERE name = :name AND vid = :vid", 0, 1, array(
    +        ':name' => trim($item['name']),
    +        ':vid' => $item['vid'],
    +      ))->fetchField();
    +      if (!empty($tid)) {
    +        $items[$delta]['tid'] = $tid;
    +        continue;
    +      }
    

    See #1050466-34: The taxonomy index should be maintained in a node hook, not a field hook for some proposed solutions.

User interface changes

  • None.

API changes

  • None.
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

xjm’s picture

xjm’s picture

Status: Active » Needs review
xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Issue summary: View changes

Updated issue summary.

xjm’s picture

Marked #1146744: The tagging widget doesn't check if the same term already exists as duplicate. From that issue:

mr.baileys: Isn't the problem more that the exist-check is done by taxonomy_autocomplete_validate, but the term-creation is done by taxonomy_field_presave, not that the results from taxonomy_term_load_multiple is cached?

So first both autocomplete-fields are element-validated, which sets them as "need to be created", and when the field is then saved, no checking is done to see if the term exists, so both terms are created.

catch’s picture

I think there's another issue for this somewhere as well, but can't find it at the moment.

fietserwin’s picture

#4; #610076: Add a primary key to the {taxonomy_index} table and prevent duplicates being inserted?

I think that the race condition can be avoided by using a merge query (aka upsert), That is, if that would be a real merge query at database level OR if the 2 separate queries (that are issued directly after each other) are placed in a transaction.

As further work on #610076: Add a primary key to the {taxonomy_index} table and prevent duplicates being inserted is going on, I think this one can be closed as a duplicate. Perhaps the test can be added to the mentioned issue.

kscheirer’s picture

#1: autocreate-1343822-combined.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, autocreate-1343822-combined.patch, failed testing.

xjm’s picture

I'd suggest postponing work on taxonomy autocomplete bugs at least a week or so until we see what direction #1847596: Remove Taxonomy term reference field in favor of Entity reference is going to take.

james.williams’s picture

Here's a port of the patch for Drupal 7, without the accompanying tests. Applies cleanly & fixes the bug for me.

james.williams’s picture

Issue summary: View changes

Updated summary.

xjm’s picture

Issue summary: View changes

Removing myself from the author field so I can unfollow. --xjm

jibran’s picture

Component: taxonomy.module » entity_reference.module

All the taxonomy field related matters are handled by entityreference field after #1847596: Remove Taxonomy term reference field in favor of Entity reference

amateescu’s picture

Component: entity_reference.module » taxonomy.module

Drupal\taxonomy\Plugin\EntityReferenceSelection\TermSelection is provided by the Taxonomy module, moving back to the right queue.

anrikun’s picture

@james.williams
Thanks for patch at #9!

michaelmallett’s picture

Patch worked great for me, thanks. Is there any way to get this in core? Just wondering why it's been around for 3 years.

amateescu’s picture

Title: Autocreated taxonomy terms duplicated if they are added to multiple fields. » Add duplicate term names prevention as a vocabulary option
Version: 8.0.x-dev » 8.1.x-dev
Category: Bug report » Feature request

Since there is no more dedicated taxonomy reference field, just the generic entity reference one, this can no longer be fixed at the field level, we need to move the fix higher up the chain to taxonomy term entity presave.

But, as far as I see, there is no mechanism that prevents adding terms with the same name in a certain vocabulary, so maybe this should be added as a vocabulary configuration option?

With that in mind, I'm moving the issue to 8.1.x and reclassifying as a feature request.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

ABaier’s picture

I am wondering if there is any movement within this issue or maybe even a patch for 8.2.2?

My editors can add multiple event dates to a node using a paragraphs type which also includes a taxonomy reference field for the type of each individual event (for example "concert"). Field settings are autocomplete tagging and "Create referenced entities if they don't already exist".

If they add multiple instances of this paragraph on node creation, all containing a similar, non existent tag of "event type", this term will be created multiple times with the same name, but varying TIDs, after saving the node.

Unfortunately I cannot help with the integration, but I would think the existence of the entered terms has to be validated, maybe for each individual instance of the taxonomy field, one after another, if it appears multiple times inside the node form.

ABaier’s picture

Another case where a taxonomy field could exist multiple times in a node edit form, besides paragraphs, would be the use of media entity. Maybe many more, if inline entity form is used to create other entities within a node.

I ran in both issues, paragraphs and media entity. The paragraph is used to tag for example the location of an event and the media entity image/video can also be tagged with this location vocabulary. This is all for categorization purposes and granulary filtering of content. To mention is, that the node itself also contains some taxonomy fields including the location field.

What I want to say is, that the chance of having more instances of one taxonomy field is quite big. Especially with the upcoming node creation workflows using the mentioned modules.

Now that media entity will be ported to core this issue really should be considered.

Any ideas?

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

chr.fritsch’s picture

Status: Needs work » Needs review
FileSize
5.64 KB

Here is a first patch. I implemented the check as a Validator.

Status: Needs review » Needs work

The last submitted patch, 20: add_duplicate_term-1343822-20.patch, failed testing.

chr.fritsch’s picture

Status: Needs work » Needs review
FileSize
7.17 KB

Fix some tests

chr.fritsch’s picture

Issue tags: +Needs tests

I know, tests are still missing.

I want first to get a sign-off, if this approach is feasible and we want to continue into this direction.

webflo’s picture

  1. +++ b/core/modules/taxonomy/src/Entity/Vocabulary.php
    @@ -120,6 +128,13 @@ public function getDescription() {
    +  public function preventDuplicates() {
    

    This method should be on the interface. I think its not possible to add methods to an existing interface. Not sure about the BC implications. I think we have to add a new one.

  2. +++ b/core/modules/taxonomy/src/Plugin/Validation/Constraint/TermNameConstraintValidator.php
    @@ -0,0 +1,57 @@
    +    if (!Vocabulary::load($entity->bundle())->preventDuplicates()) {
    

    Should check for the new interface first.

  3. +++ b/core/modules/taxonomy/src/Plugin/Validation/Constraint/TermNameConstraintValidator.php
    @@ -0,0 +1,57 @@
    +      ->condition($id_key, (int) $items->getEntity()->id(), '<>')
    

    Access checks are applied to entity queries by default. Maybe we should opt out to make sure its really unique?

I agree on the overall approach.

chr.fritsch’s picture

1+2. I discussed #24-1 with @dawehner and we think because of https://www.drupal.org/node/2562903 it should be possible to add a new method to the class and the interface.
3. Opted out for accessCheck

Additionally i added a test to the TermTest.php

mtodor’s picture

This looks good, I have tested it (works fine) and checked code. Only few nitpicks.

  1. +++ b/core/modules/taxonomy/src/Plugin/Validation/Constraint/TermNameConstraintValidator.php
    @@ -0,0 +1,58 @@
    +    $field_name = $items->getFieldDefinition()->getName();
    

    This expression can go after if (!Vocabulary::load($entity->bundle())->preventDuplicates()) { statement. And also it would be better to get $items->getFieldDefinition() in some variable and then use it in 2 places, because $field_name is used only once, so there is no need for variable. There is also repeating for $items->getEntity() and $entity->getEntityType() - they could be assigned to variable.

  2. +++ b/core/modules/taxonomy/src/Tests/TermTest.php
    @@ -585,4 +586,60 @@ public function testTermBreadcrumbs() {
    +    $this->assertNotNull($term, 'Term found in database.');
    

    If we already have check $this->assertText('Created new term ' . $term->label()); at end of test, maybe it can be added here too.

  3. +++ b/core/modules/taxonomy/src/Tests/TermTest.php
    @@ -585,4 +586,60 @@ public function testTermBreadcrumbs() {
    +    $termName = $this->randomMachineName(14);
    +    $edit = [
    +      'name[0][value]' => $termName,
    +      'description[0][value]' => $this->randomMachineName(100),
    +      'parent[]' => [0],
    +    ];
    

    Instead of making new random name, it would be easier to just make something like this: $edit['name[0][value]'] .= '_2'; - to avoid 1 in 64509974703297150976 cases when we will get two same random generated names. ;)

  4. +++ b/core/modules/taxonomy/src/VocabularyInterface.php
    @@ -54,4 +54,12 @@ public function setHierarchy($hierarchy);
    +   * Indicates if multiple terms with same name are allowed in the vocabulary.
    

    Comment is a bit misleading. I think it should be: "Indicates if multiple terms with same name are forbidden in the vocabulary." - because that's behaviour when you get TRUE.

Overall -> VERY NICE!
Now we will not get errors like: Multiple entities match this reference; "My Tag (3)", "My Tag (7)". - when I just want to use "My Tag". :)

chr.fritsch’s picture

I adjusted all the comments from #26. Unfortunately it was not possible to create an interdiff.

gg24’s picture

Status: Needs review » Needs work

Steps followed to test the Patch #27 on version 8.4.x:-

1- Applied the patch.
2- Checked "Prevent duplicates" checkbox on edit vocabulary page.
3- Added two taxonomy referenced fields referencing same vocabulary in a content type.
4- Checked "Create referenced entities if they don't already exist" for both the fields in the respective content type.
5- Created a content for the above mentioned content type and selected same term for both the taxonomy fields.
6- Saved the node.

This still creates two terms with the same name under same taxonomy vocabulary.
Hence, this patch doesn't work as intended.

Please mention if i need to follow any other step too, if i missed something.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

cbeier’s picture

Assigned: Unassigned » cbeier
Status: Needs work » Active
cbeier’s picture

Assigned: cbeier » Unassigned
Status: Active » Needs work

The validation works currently only in term edit and not in the autocomplete field. In the autocomplete field is in the current state no further validation running (as far as I know).

For the problem with validations inside the term autocomplete field is another issue: #2438017: Entity reference does not validate auto-created entities

If the general possibility to implement validations for the autocomplete field, we can also implement the duplication validation.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

acbramley’s picture

Title: Add duplicate term names prevention as a vocabulary option » Autocreated taxonomy terms duplicated if they are added to multiple fields

In my opinion this shouldn't be throwing validation errors, but should be a seamless operation on a tagging widget (potentially via a setting on the field definition or widget) that just uses the existing term that was previously created. This has to be done somewhere in the entity reference field type or the EntityReferenceSelection plugin. This is due to the code flow:

1) The EntityAutocomplete Element plugin validation calls SelectionWithAutocreateInterface::createNewEntity (this doesn't save the entity straight away)
2) When the second field's element goes through validation and looks for the existing entity, it doesn't find the one that was just created since it hasn't been saved. Therefore a new entity is created.
3) EntityReferenceItem::presave saves the unsaved entities.

We could instead pass along an option to the EntityReferenceSelection plugin's config to auto save those entities? This would obviously lead to some orphaned terms for incomplete forms but for me that'd be better than the current state.

It also seems like this issue's title has diverged from what the issue summary is describing so I'm going to set it back to the original title.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

guilhermejz’s picture

I have done that, just $entity->save(); before the return on createNewEntity function. It works.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

smustgrave’s picture

Status: Needs work » Closed (cannot reproduce)

Closing as cannot reproduce in 9.5

Added 2 fields to a content type.
Both pointing to the same taxonomy
Both autocomplete tags style
Added Red to both
Tried to save the form but it actually throws a validation error and won't let me save ( which is good I think).
No terms added to the taxonomy

If you feel this is still an issue please reopen with an updated issue summary, steps to reproduce, proposed resolution, etc