Somehow I got duplicate term names in the same vocabulary.

When Feeds maps by name, it picks the first matching term created (lowest TID), but node forms and the like use the last matching term created (highest TID). So there's a mismatch between what's being imported and what users see when editing. The imported data is essentially worthless, because it is mapped to TIDs that are no longer in use.

I think Feeds should pick the latest matching term. I've attached a patch (my first, hope it works) to make that happen.
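
To illustrate the mismatch, here is a minimal sketch in plain Python. It is not the Feeds or Drupal code; the term tuples and the function names (matching_tids, first_match, last_match) are made up for the example and only model the "lowest TID vs. highest TID" behaviour described above.

```python
# Illustration only: plain Python, not the Feeds or Drupal API.
# Each term is a hypothetical (tid, vid, name) tuple.
terms = [
    (12, 3, "Cruises"),  # original term, created first
    (57, 3, "Cruises"),  # duplicate with the same name, created later
    (58, 3, "Flights"),
]


def matching_tids(terms, vid, name):
    """Return the TIDs of all terms in vocabulary `vid` named `name`, ascending."""
    return sorted(
        tid for tid, term_vid, term_name in terms
        if term_vid == vid and term_name == name
    )


def first_match(terms, vid, name):
    """Lowest matching TID: the behaviour described for the importer."""
    tids = matching_tids(terms, vid, name)
    return tids[0] if tids else None


def last_match(terms, vid, name):
    """Highest matching TID: the behaviour described for node forms."""
    tids = matching_tids(terms, vid, name)
    return tids[-1] if tids else None


importer_tid = first_match(terms, 3, "Cruises")
form_tid = last_match(terms, 3, "Cruises")
print(importer_tid, form_tid)
```

With duplicates present the two lookups disagree, so imported nodes point at a term that the edit form never shows. With unique names both lookups return the same TID.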

Attachment: feeds-mapper-taxonomy.patch (467 bytes, uploaded by derekw)

Comments

derekw’s picture

Status: Active » Needs review

nurulshakina’s picture

Sorry, and thanks for the patch. Which file should the patch be applied to, please?

derekw’s picture

I guess I didn't generate the patch in the right format, since it doesn't include the file name. The file to be patched is mappers/taxonomy.inc.

Michsk’s picture

This also happens when your nodes already have terms and you run a new import without terms: the existing terms get duplicated as well.

ArtActivator.com’s picture

Priority: Normal » Major

Confirming this. I am using the 2.0 release and I have an existing term. When I import new content (or update existing content) that has this term, it creates a duplicate of the term.
The patch did not help. I am using feeds_tamper for term exploding.

Please help!

Michsk’s picture

You will have to wait until the framework issue for Feeds is committed. Search this issue queue and you will find it.

summit’s picture

Status: Needs review » Active

Hi,
Setting this to active again based on #6. What's the status of the framework issue, please?
EDIT: The issue is https://drupal.org/node/1107522 and its status is review.

Greetings, Martijn

summit’s picture

Hi,
Using the patch in this thread: https://drupal.org/files/feeds-mapper-taxonomy.patch
and the patch from https://drupal.org/node/1107522: https://drupal.org/node/1107522#comment-7883105
I still get multiple copies of the same term name in my vocabulary after importing an XML file.

I use Feeds Taxonomy Mapper with the "Auto Create term" option enabled.

Thanks a lot for your reply in advance!
Greetings, Martijn

summit’s picture

If I disable the "Auto Create Term" option, no terms are created. If I enable it, I get lots of duplicate terms.
Greetings, Martijn

summit’s picture

Hi, do you all have the same situation as mine, where the term field is used in multiple content types?
Maybe that causes the problem? Compare: https://drupal.org/node/1039134#comment-7346894

Greetings, Martijn

geek-merlin’s picture

Assigned: derekw » Unassigned
Issue summary: View changes
Issue tags: +Needs issue summary update

I think this issue about "duplicate terms imported" is broader than described in the summary (TID not unique between import and edit), so this might need a summary update.

I also removed the assignment. Feel free to re-assign to yourself if you mean "I'm currently working on this, so no one else should be working on it".

geek-merlin’s picture

(Erroneous comment)

summit’s picture

Hi, see https://drupal.org/comment/6861682#comment-6861682
For VBO they built a helper module; maybe this is also needed for Feeds?
Of course, NOT having duplicate terms in the first place is better!

Greetings,
Martijn

input’s picture

Having this issue too. The funny thing is, if I import a CSV with 3 lines, everything is fine.

If I import my .csv with 450 lines, it creates duplicates for every new/updated node with exactly 1 entry.

Edit:
If I import my short list after the long one, everything is fine. But if I import my huge set again, I get even more duplicates.

twistor’s picture

So, we do check for existing terms. Can you try a recent dev version? There have been some changes to how terms are handled.

input’s picture

Yeah, I think the check for existing terms is the main problem.

I also tried to recheck with Rules before the imported node is saved/updated, but I can't access the imported value. It returns existing terms but not new terms.

By the way, I have checked my setup. I'm on Feeds 7.x-2.0-alpha8, so I will definitely check out dev.

I also noticed that I import other taxonomy terms into their own vocabularies. Those have a very limited number of terms and don't show this behaviour.

Another point: I have Feeds Tamper set up at 7.x-1.0-beta5. All taxonomy fields do the same in Tamper: explode and remove empty. I think it somehow depends on the number of imported nodes vs. new taxonomy terms.

But first I'll give the dev version a try and report what happens. If the problem still occurs, maybe my .csv and an export of my Feeds importer will help?

P.S.
If you have a tip on getting my ID counter back to normal levels, I'd welcome it.
I'm at nearly 20k+ now. I know this doesn't matter much, but it looks a little bit silly ;)

Edit:
Today I also turned off "auto create" while I still had duplicates in my vocabulary. Result: none of the imported/updated nodes were linked to any of the existing terms.

input’s picture

Okay. Deleted all terms in the vocabulary, flushed caches, and ran update.php, which applied the DB update "7209 - Reschedules feeds jobs.". Flushed and ran cron again, checked everything and imported.

Result:
Seems to work as expected.

Reimport to recheck:
Everything okay.

Great job!

I should have gone to dev earlier, but I have had bad experiences with dev versions :\

P.S. If you want some info about whisky (unfortunately in German), feel free to visit http://bi-club.fem-net.de or soon http://bi-club.de
P.P.S. I felt the need to mention your project in our imprint, because you made the migration of this site possible without too much headache and much more comfortably. I'm also using Feeds in some other projects and I like it more and more ;)

alexmoreno’s picture

I can confirm that the -dev version does not have this issue.

To delete the terms of a vocabulary, this can be useful:

drush sqlq "SELECT name, vid FROM taxonomy_vocabulary WHERE name = 'Cruises'"

(to get the vid; in that case, 3)

drush sqlq "DELETE FROM taxonomy_term_data WHERE vid = 3"
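
Before deleting anything, it may help to see which names are actually duplicated. The following is a minimal sketch, not a tested Drupal recipe: it runs a GROUP BY / HAVING query against an in-memory SQLite stand-in for taxonomy_term_data that has only the tid, vid and name columns. The sample rows are made up, and the same SELECT would still need to be checked against the real database.

```python
import sqlite3

# In-memory stand-in for the Drupal table; only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxonomy_term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT)"
)
# Made-up sample rows: 'Caribbean' appears twice in vocabulary 3.
conn.executemany(
    "INSERT INTO taxonomy_term_data (tid, vid, name) VALUES (?, ?, ?)",
    [(1, 3, "Caribbean"), (2, 3, "Alaska"), (3, 3, "Caribbean"), (4, 5, "Caribbean")],
)

# One row per (vocabulary, name) pair that occurs more than once,
# with the lowest and highest TID carrying that name.
DUPLICATES_SQL = """
    SELECT vid, name, COUNT(*) AS copies, MIN(tid) AS first_tid, MAX(tid) AS last_tid
    FROM taxonomy_term_data
    GROUP BY vid, name
    HAVING COUNT(*) > 1
    ORDER BY vid, name
"""

duplicates = conn.execute(DUPLICATES_SQL).fetchall()
for row in duplicates:
    print(row)
```

The same name in a different vocabulary (vid 5 in the sample data) is not reported, since the grouping is per vocabulary.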

patkai’s picture

Had the same issue with duplicate terms when importing nodes from XML. The interesting part was that I tried to get around it by importing the taxonomy first, but that went even worse and showed SQL errors. The dev version works fine; nodes with terms imported like a charm.

megachriz’s picture

Status: Active » Fixed

Marking as fixed based on the comments in #17, #18 and #19. Feel free to reopen if the problem still exists.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Perseids’s picture

If the issue is duplicate term names on auto create, what actually solved it for me is described here (in #3): just eliminate the spaces between your term names in the CSV file.
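
A minimal sketch of why the spaces can matter, assuming the term lookup is an exact name match (that is an assumption, not something confirmed in this thread). It is plain Python, not Feeds or Feeds Tamper code, and the function names and sample values are made up for the example.

```python
def explode(value, separator=","):
    """Split a multi-value cell on the separator, keeping whitespace as-is."""
    return value.split(separator)


def explode_and_trim(value, separator=","):
    """Split, strip surrounding whitespace, and drop empty pieces."""
    return [piece.strip() for piece in value.split(separator) if piece.strip()]


# Hypothetical CSV cell with a space after the first comma.
cell = "red, green,blue"
existing_terms = {"red", "green", "blue"}

untrimmed = explode(cell)
trimmed = explode_and_trim(cell)

# " green" (with a leading space) is not an exact match for "green",
# so an exact-match lookup would treat it as a new term.
unmatched = [name for name in untrimmed if name not in existing_terms]
print(unmatched)
print(trimmed)
```

Removing the spaces in the CSV file, or trimming each piece after exploding, leaves only names that match the existing terms exactly.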

When I searched for the issue I first landed on this page (unfortunately, as I wasted a lot of time). I can confirm that the dev version doesn't have this problem, but for me dev created another, more serious issue: schedule dates weren't getting picked up no matter what I changed (tested on two different servers, Linux and Windows). So better to stick with conventional wisdom and never use dev on production sites.