Is bulk processing available for the D7 version? I don't see it anywhere.

Comments

sylvanos’s picture

Still looking for it. So far, I've only been able to get the tags by opening an article and saving it.

michaelgiaimo’s picture

Yeah, same here. With over 30K nodes, it's not an option. I was trying to figure out if I could use node_load_multiple() in a function somewhere. Dunno.
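A minimal sketch of the node_load_multiple() idea above, not a tested solution. The helper name my_module_process_in_chunks() and the chunk size are hypothetical; the helper itself is plain PHP, and the Drupal 7 wiring is shown only in comments because it assumes a bootstrapped site and a list of node IDs in $nids.

```php
/**
 * Hypothetical helper: passes $nids to $process in chunks of $chunk_size.
 *
 * Returns the number of chunks that were processed.
 */
function my_module_process_in_chunks(array $nids, $chunk_size, $process) {
  $chunks = 0;
  // Guard against a zero or negative chunk size.
  $size = max(1, (int) $chunk_size);
  foreach (array_chunk($nids, $size) as $chunk) {
    call_user_func($process, $chunk);
    $chunks++;
  }
  return $chunks;
}

// Inside Drupal 7, the callback could load and re-save each chunk so that
// hook_node_presave() implementations run for every node:
//
// my_module_process_in_chunks($nids, 25, function ($chunk) {
//   foreach (node_load_multiple($chunk) as $node) {
//     node_save($node);
//   }
//   // Drop the loaded nodes from the static cache between chunks.
//   entity_get_controller('node')->resetCache($chunk);
// });
```

With 30K nodes this would still need to run from drush or the Batch API rather than a single page request, since one request is likely to hit the time limit.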

sylvanos’s picture

Have you given Auto Tagging a try? If not, I will, and I'll let you know.

michaelgiaimo’s picture

Any luck with this?

sylvanos’s picture

I've had to take care of a few bugs in Feeds. I will try this later this month.

sylvanos’s picture

Hi, I've tried the Auto Tagging module. The D7 version is still very buggy and I haven't been able to make it work. I might try again in a couple of weeks. Meanwhile, I've found a more promising solution for us that is integrated with Feeds. It's called Extractor, but it's only for D6 :-(

RasputinJones’s picture

This snippet does bulk processing. Probably could do with some optimizations.

/**
 * Implements hook_node_presave().
 *
 * Adds Calais tags before feed_item nodes are saved, so that each node is
 * tagged on retrieval.
 */
function my_module_node_presave($node) {
  if ($node->type != 'feed_item') {
    return;
  }

  $fields = opencalais_get_opencalais_tag_fields(NULL, 'node', $node->type);

  foreach ($fields as $opencalais_type => $field_name) {
    $suggestions = opencalais_get_suggestions($node, $opencalais_type);

    // Collect every suggested term for this field. Resetting the field
    // inside the inner loop would keep only the last suggestion.
    $items = array();

    foreach (array_keys($suggestions) as $tax_term_string) {
      if ($tax_term_string == NULL) {
        continue;
      }

      $taxonomy_terms = taxonomy_get_term_by_name($tax_term_string);

      if ($taxonomy_terms) {
        // Reuse the first existing term with this name.
        $taxonomy_term = reset($taxonomy_terms);
      }
      else {
        $vocabulary = my_module_get_vocabulary_by_name($opencalais_type);
        if (!$vocabulary) {
          // No matching vocabulary, so there is nowhere to create the term.
          continue;
        }

        $taxonomy_term = new stdClass();
        $taxonomy_term->name = $tax_term_string;
        $taxonomy_term->vid = $vocabulary->vid;
        taxonomy_term_save($taxonomy_term);
      }

      $items[] = array('tid' => $taxonomy_term->tid);
    }

    // Values are keyed by the node language, as in the original snippet.
    // Untranslatable fields store their values under LANGUAGE_NONE instead.
    $node->$field_name = $items ? array($node->language => $items) : array();
  }
}

/**
 * Returns the vocabulary object whose machine name matches the given name.
 *
 * @param string $vocabulary_name
 *   The machine name of the required vocabulary.
 *
 * @return object|null
 *   The matching vocabulary object, or NULL if no such vocabulary exists.
 */
function my_module_get_vocabulary_by_name($vocabulary_name) {
  $vocabs = taxonomy_get_vocabularies(NULL);
  foreach ($vocabs as $vocab_object) {
    if ($vocab_object->machine_name == $vocabulary_name) {
      return $vocab_object;
    }
  }
  return NULL;
}

You'll need to comment out this snippet in opencalais.module:

  if(!property_exists($node, 'nid')){
     return; //short circuit out - the node is brand new and being rendered for the first time
  }

Not really sure why the check above was necessary, as element grabbing and Drupal rendering work pretty well without the node being saved.

sylvanos’s picture

Thanks a lot Rasputin,

Right now we're experimenting with AlchemyAPI, which does a better job with non-English languages. We've also decided to write the tags to a JSON file and retrieve that file with Feeds, but we may try your patch along the way.

Thanks

jec006’s picture

Hi - related to #7 - this is now basically in OpenCalais core. It will now work by default with autotagging and feeds.

We would also like to add some level of batch processing - to provide for legacy content.

jec006’s picture

Status: Active » Fixed

Added here: http://drupal.org/commitlog/commit/5830/320e864ba42dc7e303db4f6cb2ef209d...

Please test and let me know what bugs you find. With something like this it's hard to predict where bugs can or will occur. It seems to work in my testing.

michaelgiaimo’s picture

Working great for me - in fact, I have over 30,000 nodes to do, so I changed the '25' to '250' and it's humming!

sillygwailo’s picture

I got "Could not process node with id 2. You may try to manually resave the node to resolve this issue." on both a remote attempt and a local attempt. Maybe we could get more information about the error in the database log in order to troubleshoot.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

gazzer82’s picture

Is there any way to retrigger a bulk process on a node type once they have been processed once?

We have had to tweak our OpenCalais settings and want to reprocess some nodes; however, I can't see a way to resubmit them.

Thanks

Gareth