Hi guys

I'm using the Alchemy_autotagging module to extract terms from a feed with Managing News. Basically, Managing News works with the Data API for each feed item (not storing them as nodes), and with a small custom module I managed to get the autotagging module to extract the keywords from each feed item and tag that entry with the extracted terms. It works great!

But I have a problem, and maybe you could guide me. The feeds I provide to Managing News can have articles in multiple languages. When the feed item is in English, the extraction is pretty accurate (thank you Alchemy). When the article is written in another language (problem seen with French), the extracted terms are just messy and useless: it extracts small fragments of sentences that have no grammatical meaning by themselves (so they cannot be considered tags) and have no relevance to the content.
An example translated into English: for this very issue it might extract "terms from", "for each feed", "guide" and "written in". I hope you can see how useless these terms are.

I've searched through the Alchemy module, AlchemyAPI (not the Drupal module) and the website. Although they state that Alchemy works with 95+ languages, I don't know how to pass the language to Alchemy or how to make it work properly according to the language of the item. To confirm, I've just tested a feed with only French articles, and my code works well there (relevant extraction), so the problem seems specific to feeds that mix languages. Maybe you have already encountered similar problems. Any hint is really appreciated.

Comments

DjebbZ’s picture

Title: Languages issue » Multiple languages issue

Just renamed the title.

TomDude48’s picture

Status: Active » Fixed

I haven't worked with Alchemy in languages other than English. I do know it is supposed to auto-detect the language. If you do find something in their documentation that I need to implement, I will be happy to do it.

skizzo’s picture

subscribing.

I am interested in language detection for submitted posts. Ideally, I would like to accept only enabled Drupal languages (unpublish action for other languages) and autotag a form-hidden taxonomy vocabulary (so that one could then filter views by language). I have no programming skills, but the following doc seems relevant: http://www.alchemyapi.com/api/lang/

skizzo’s picture

Status: Fixed » Active

changing status (in case maintainer follows only active issues)

TomDude48’s picture

This is an interesting use case. Alchemy can do language detection, and it would be interesting to fire a trigger with the language as an argument. It might make for a useful sub-module; however, for the near future I do not have the time to write it.

Maybe at some point in the future.

narnua’s picture

I had the same issue and found my way here after checking out the Alchemy module and discovering it didn't provide language detection for automatically setting the node language. Since in my case I need to set the language for nodes coming in from Feeds, I added it to hook_action of Feeds Hacks (http://drupal.org/project/feeds_hacks, a Feeds submodule providing a couple of batch actions). But it also detects the language when nodes are created manually (when triggered by the Rules node save event).

It only needs to know your Alchemy API key to be able to use the Alchemy API language detection service (http://www.alchemyapi.com/api/lang/). Right now my code lives in my local Feeds Hacks copy, but I wouldn't mind trying to patch hook_action (or trigger?) for Alchemy, if there's demand.

I haven't provided this back to Feeds Hacks either just yet, but the writeup provides the code: http://blogs.fabfolk.com/anu/2011/07/automatic-language-detection-for-dr...
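
For anyone skimming, the general idea (detect the language of each item, set it if it is one of the enabled languages, otherwise unpublish) could be sketched roughly like this. This is a language-neutral illustration in Python, not the code from the writeup and not Drupal or AlchemyAPI code: `Item`, `apply_language` and `fake_detect` are hypothetical names, and the detector is a placeholder where a real call to a language detection service (with your API key) would go.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class Item:
    """Hypothetical stand-in for a node or feed item."""
    text: str
    language: Optional[str] = None
    published: bool = True


def apply_language(item: Item,
                   detect: Callable[[str], Optional[str]],
                   enabled: Set[str]) -> Item:
    """Set item.language from detect(); unpublish if the language is not enabled."""
    code = detect(item.text)
    if code in enabled:
        item.language = code
    else:
        # Unknown or non-enabled language: leave the language unset and unpublish.
        item.published = False
    return item


def fake_detect(text: str) -> Optional[str]:
    # Placeholder detector for this example only (assumption): a real one
    # would query a language detection service instead of checking for " le ".
    return "fr" if " le " in f" {text.lower()} " else "en"


if __name__ == "__main__":
    enabled_languages = {"en"}
    for body in ("The cat sleeps", "Le chat dort"):
        print(apply_language(Item(body), fake_detect, enabled_languages))
```

Once the language is known per item, it could also be stored as a tag or passed along as an argument to whatever does the term extraction.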

technologywon’s picture

Status: Active » Closed (outdated)

Drupal 6 is no longer supported.