I started to work on a content_taxonomy mapper since I need one for one of my projects. I should have a first version in the coming days.

I started from the patch in comment #25 of #623424: Mapper for Taxonomy


Comments

mcaudy’s picture

subscribing

pbuyle’s picture

Version: 6.x-1.0-alpha8 » 6.x-1.x-dev
Status: Active » Needs review
FileSize
10.06 KB

Here is a first version as a .patch file. It also contains tests using my test class from comment #4 in #623452: Port basic test infrastructure for mapper

alex_b’s picture

Status: Needs review » Needs work

Contains tabs as indents...

pbuyle’s picture

Status: Needs work » Needs review
FileSize
9.99 KB

Sorry for the tabs. Here is a fixed patch.

rjbrown99’s picture

I have tested this with a CSV Feed. I am using it with 6 different content taxonomy fields, all of which are also tied to core taxonomy. I have successfully imported feed items into the appropriate taxonomy. I'll report back if I find problems, but so far so good.

BenK’s picture

Subscribing and will test latest patch....

rjbrown99’s picture

I have it working with XML feeds now as well. No modification necessary. Would anyone else like to confirm/deny it works or move it to RTBC?

rjbrown99’s picture

I'm confused as to what my problem is, so I removed this post. I'll re-post later if I figure it out.

rjbrown99’s picture

Hi.

I have tracked down a memory issue with larger taxonomies and this mapper. Well, at least *I* have an issue, which could be because of my setup. Specifically, it happens within content_taxonomy_feeds_set_target, inside the foreach, in the else if branch:

    }
    elseif ($tags) {
      // If the field is configured for free tagging, create a new term.
      $edit = array('vid' => $field['vid'], 'name' => $term_name);
      if ($field['widget']['extra_parent']) {
        $edit['parent'] = $field['widget']['extra_parent'];
      }
      // taxonomy_save_term() takes $edit by reference and fills in $edit['tid'].
      taxonomy_save_term($edit);
      if (!is_array($node->$field_name)) {
        $node->$field_name = array();
      }
      array_push($node->$field_name, array('value' => $edit['tid']));
    }

I'm dealing with a large import, on the order of 17,000 items, and one of my taxonomies is pretty large. If I measure my memory usage with memory_get_usage() right before that taxonomy_save_term($edit) call, I'll see 111,599,036 bytes in use. Immediately after that, I'll see it jump to 159,637,972 bytes, an increase of roughly 48 MB from a single call.
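For anyone who wants to reproduce this kind of measurement, here is a minimal sketch in plain PHP. It only illustrates wrapping a call between two memory_get_usage() readings; measure_memory_delta() is a hypothetical helper, not part of the patch, and the closure (which needs PHP 5.3 or later) is a stand-in for the taxonomy_save_term($edit) call.

```php
<?php
/**
 * Hypothetical helper: report how much memory one call retains.
 *
 * Takes a reading before and after invoking $fn and returns both readings
 * plus their difference. The return value of $fn is kept so that whatever
 * it allocated is still referenced when the second reading is taken.
 */
function measure_memory_delta($fn) {
  $before = memory_get_usage();
  $result = call_user_func($fn);
  $after = memory_get_usage();
  return array(
    'before' => $before,
    'after' => $after,
    'delta' => $after - $before,
    'result' => $result,
  );
}

// Stand-in for the expensive call; in the mapper this would be the
// taxonomy_save_term($edit) line.
$report = measure_memory_delta(function () {
  return range(1, 100000);
});

printf(
  "before=%s after=%s delta=%s bytes\n",
  number_format($report['before']),
  number_format($report['after']),
  number_format($report['delta'])
);
```

Logging the delta per feed item, rather than a single pair of readings, would show whether the jump happens once or grows with every new term.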

At the moment, I'm using a taxonomy for keywords and that's what is doing it. All of my feeds are bringing in their own set of keywords, so I'm up to about 46,000 terms in that vocabulary. I have been researching the practical limits of vocabulary terms and I can't seem to find one.

Not quite sure what to do with this one, other than perhaps rethink my use of taxonomy and change the keywords to be a CCK text field? Thoughts would be welcome.

AntiNSA’s picture

Can you please explain what exactly the purpose of this patch is? With the default Feeds module, you can map to taxonomy fields. I haven't tried mapping to content taxonomy fields, but if you have the option set on content taxonomy fields to save data to the taxonomy database, then what's the difference?

rjbrown99’s picture

The patch allows you to map to content taxonomy fields. Content taxonomy is a CCK module found here:
http://drupal.org/project/content_taxonomy

It is different from core taxonomy in that it is a part of CCK, although it can also save terms to core taxonomy if you elect to do it. The best thing to do is to head over to the content taxonomy module and learn a bit more about it and why you might want to use it instead of the core taxonomy.

brst t’s picture

I installed content_taxonomy in an effort to make a more usable form. It's a widget that allows autocomplete, tree'd checkboxes/radios as well as the traditional select list for taxonomies. Handy in the case of large taxonomies, particularly those collapsible checkbox trees.

Since content_taxonomy takes over the taxonomy field in the form, Feeds is picking up the taxonomy, but isn't picking up the content_taxonomy part.

This is confusing because the taxonomy info is imported, but since Feeds doesn't populate the content_taxonomy field, editing the node or performing bulk operations causes that info to be lost: the save picks up the empty content_taxonomy value.

According to the module page, some of the basics of content_taxonomy are now in Drupal 7 with taxonomy term fields in core. I don't know what that means for Feeds, but I'm opting to rip out the module rather than patching Feeds. Maybe if content_taxonomy added to taxonomy on the form rather than completely obscuring and replacing it? A restrictive imposition.

I ripped out content_taxonomy and all works as expected. Bulk operations on imported nodes aren't blanking out the taxonomy.

alex_b’s picture

Status: Needs review » Needs work

One assertion fails, there are still some minor code style issues (spaces in comments, capitalize comments, don't use tabs for indenting, add whitespace to end of patch).

Assertion that fails:

"Found form field field_categories[value][] for categories with the expected value."

#12: I am confused. I think I'm missing some understanding of content_taxonomy that I need to follow your comment. Can you explain what exactly breaks in Feeds?

nicksanta’s picture

I've never tried re-rolling a patch before, but hopefully this works.

I've gone through and updated all the comments, indentation, etc. I'm not sure what the assertion problem is, but if the next person to work on this patch starts from this version, that's one less thing to worry about.

AntiNSA’s picture

I am looking forward to seeing the result of this patch. Does anyone have any feedback?

ChaosD’s picture

subscribed

nicksanta’s picture

Status: Needs work » Reviewed & tested by the community

So, a new beta was just released - why wasn't this patch included? I've been using it for weeks flawlessly. Time to get committed IMO.

alex_b’s picture

Status: Reviewed & tested by the community » Needs work

Two test assertions are failing: "Created 1 content_type_760923759 nodes." and "Found form field field_categories[value][] for categories with the expected value."

Tested with Simpletest 2.9 and Content taxonomy 1.0-rc2

Hanno’s picture

subscribe

TrevorBradley’s picture

subscribe

jeff.k’s picture

+1 subscribe

bennos’s picture

subscribe

elliotttf’s picture

FileSize
10.42 KB

I've been able to take care of all of the failing assertions except for "Found form field field_categories[value][] for categories with the expected value." For the life of me I can't figure this one out. If I grab the node/add/$nodetype page I don't even see the categories form elements on the form (even though simpletest says that the categories field was in fact added correctly). Another possible way to validate that the values came through correctly might be to have the terms added to the term_node table and check against that, but I haven't played with that much. Maybe someone else wants to take a look at this and can make more progress.

Frankly I don't believe this assertion should kill this patch's progress since all other factors of the test work correctly (including setting up the mapper with the categories field).

elliotttf’s picture

Status: Needs work » Needs review
FileSize
10.63 KB

Alright, after the weekend I took another crack at this and have the category field validated now with this patch. Note that this value is validated directly against the database, not the form. I believe this should be sufficient since whatever issue was going on was either a silly error I was overlooking in the test code (possible, but not sure how likely since I've been looking at this for a while) or just a problem with the way content_taxonomy works (not related to feeds).

brycesenz’s picture

Subscribe.

AntiNSA’s picture

Can't wait to get a final committed version. What's the ETA?

bennos’s picture

Help with testing, and the ETA gets closer.

arski’s picture

just imported 240 data sets into a hierarchical taxonomy that was using the content taxonomy field - flawless! great stuff, thanks! :)

nicksanta’s picture

I've been using various incarnations of this patch for the last 5 months with ZERO problems.

XiaN Vizjereij’s picture

Status: Needs review » Reviewed & tested by the community

I can also confirm that it's working fine.

nclavaud’s picture

This patch works great for me (XPath parser, multiple values).

I also had a look at the latest nodereference mapper patch and found it had some great ideas that this Content Type Taxonomy mapper could reuse:

  • the target name could be more explicit - maybe something like:
    'name' => $name . ' (Content Type Taxonomy by term name)', (content_taxonomy.inc #23)
  • get taxonomy by ID? That would be a great feature.
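To make the two suggestions above concrete, here is a rough sketch in plain PHP of what the target definitions could look like. It is not the code in the patch. The helper example_content_taxonomy_targets(), the ':tid' target key, and the callback example_content_taxonomy_set_target_by_tid are all hypothetical names; only content_taxonomy_feeds_set_target and the suggested label come from this thread. It assumes mapper targets are arrays with 'name' and 'callback' keys.

```php
<?php
/**
 * Hypothetical helper: build two mapping targets per content taxonomy field.
 *
 * $fields maps a CCK field name to its human-readable label.
 */
function example_content_taxonomy_targets(array $fields) {
  $targets = array();
  foreach ($fields as $field_name => $label) {
    // Existing behaviour, with the more explicit label suggested above.
    $targets[$field_name] = array(
      'name' => $label . ' (Content Type Taxonomy by term name)',
      'callback' => 'content_taxonomy_feeds_set_target',
    );
    // Assumed extra target for sources that already carry term IDs.
    // Both the key suffix and the callback name are made up for this sketch.
    $targets[$field_name . ':tid'] = array(
      'name' => $label . ' (Content Type Taxonomy by term ID)',
      'callback' => 'example_content_taxonomy_set_target_by_tid',
    );
  }
  return $targets;
}

$targets = example_content_taxonomy_targets(array('field_categories' => 'Categories'));
foreach ($targets as $key => $target) {
  print $key . ' => ' . $target['name'] . "\n";
}
```

Keeping the two lookups as separate targets would let the person configuring the importer choose per source column, instead of the mapper guessing whether a value is a name or an ID.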

And btw I don't understand why taxonomy_get_term_by_name_vid($name, $vid) would return more than one value, since it is called in the for loop for each term already. Doesn't look obvious to me.

alex_b’s picture

Status: Reviewed & tested by the community » Fixed

Committed, thank you.

http://drupal.org/cvs?commit=443258

Also committed content taxonomy to Feeds Test profile:

http://drupal.org/cvs?commit=443260

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.