Would it be possible to have options that make:

a) all the Calais terms go to a single free-tagging taxonomy instead of the ones that the module predefines
b) each of the Calais terms map to a particular free-tagging taxonomy

--
Ade Atobatele

Comments

febbraro’s picture

I like option B), being able to choose which vocabulary each set of entities maps to.

However, I think A), pushing all tags to a single free-tagging vocabulary, actually removes the real benefit of the Calais service, which is knowing the context of the tags you are getting. There are lots of services that will extract keywords from your content, but not many will tell you that "Michael R. Bloomberg" is actually a Person. I feel that if you dump all tags into one vocabulary you lose the contextual gold.

Nigeria’s picture

While I agree with your sentiments, the idea was to put the power in the hands of the users to actually make the decision rather than force them to agree with us!

Besides if you implement B, then all the user has to do is map all the entities to the same vocabulary anyway!

--
Ade Atobatele

febbraro’s picture

Assigned: Unassigned » febbraro

I knew I liked B for a reason :-)

This one is a bit more medium term than short term, there are probably a few other pieces of low hanging fruit that I might be looking to get into first, but something like this is certainly on the short list. Someone here also brought up that they might like tags associated with CCK fields instead of Taxonomy vocabularies, so that is yet another alternative.

Rob T’s picture

Rethinking the "Johnny Cash" term in my "Artists" vocabulary... (from Blacklist issue).

Optimally, I'd like to map the Calais "Person" vocabulary to my "Artists" vocabulary. However, instead of blacklisting terms, I'd like to Whitelist only those terms that I want (from my predefined list of Artists).

If "Tony Blair" and "Bono" are Calais-extracted terms, I don't want "Tony Blair" to be tagged on the node. He's not an artist. I only want those "Person"/"Artists" term tags that I have predefined. "Bono" would be on my list, so that term would be tagged.

Calais with the blacklist will do me good, no doubt. But the above-mentioned use case really would hit a sweet spot for me while providing an innovative use of the service.

irakli’s picture

Whitelist is an interesting idea, but as a side-note for more specific cases: even if some concrete way of handling tags is not in the user-interface, it does not mean that it can not be achieved.

Calais implements hook_calais_preprocess( &$node, &$keywords) and hook_calais_postprocess( $node, $keywords), which give module developers some interesting flexibility. The "preprocess" hook is invoked after the list of keywords is retrieved from the Calais web service, but before it is saved into Drupal!
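As a rough illustration of what a whitelist applied at the "preprocess" stage could look like, here is a minimal language-neutral sketch in Python (not Drupal PHP). The shape of the keyword data (a dict of entity type to list of terms), the function name filter_keywords and the "City"/"Dublin" sample entry are assumptions made for the example; the thread does not describe the real structure of $keywords.

```python
# Hypothetical sketch: filter extracted keywords against a whitelist after
# they are fetched and before they are saved. The data layout is assumed,
# not taken from the Calais module.

def filter_keywords(keywords, whitelists):
    """Keep only whitelisted terms for entity types that have a whitelist.

    keywords:   dict mapping entity type -> list of extracted terms
    whitelists: dict mapping entity type -> set of allowed terms
    Entity types without a whitelist pass through unchanged.
    """
    filtered = {}
    for entity_type, terms in keywords.items():
        allowed = whitelists.get(entity_type)
        if allowed is None:
            # No whitelist configured for this entity type: keep everything.
            filtered[entity_type] = list(terms)
        else:
            filtered[entity_type] = [t for t in terms if t in allowed]
    return filtered


# Sample data modelled on the "Artists" use case discussed earlier.
keywords = {"Person": ["Tony Blair", "Bono"], "City": ["Dublin"]}
whitelists = {"Person": {"Bono", "Johnny Cash"}}
result = filter_keywords(keywords, whitelists)
```

With this sample data, "Tony Blair" is removed from the Person terms while "Bono" is kept, and the City terms are untouched because no whitelist is defined for them.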

febbraro’s picture

Status: Active » Closed (won't fix)

I think much/all of this could be done using the existing hooks.

a_c_m’s picture

Version: 5.x-1.3 » 6.x-3.0
Assigned: febbraro » Unassigned
Status: Closed (won't fix) » Active

I'm revisiting this, as yes, it probably /can/ be done with the existing hooks, but what's needed is a nice interface that lets less technical users leverage this module without creating a lot of unwanted vocabularies.

a_c_m’s picture

Title: Alternative Taxonomies » Mapping Calais taxonomies to internal Drupal taxonomies

After looking at some code, it looks to me like it's going to need modification of the calais.module at least, if not the .install as well.

The ideal is to have on the configuration page a "Mapping field set" with the following radio button options:

O - 1 to 1 - Each Calais entity is mapped directly to a Drupal vocabulary

O - All to 1 - All Calais entities are mapped to a single Drupal vocabulary

          Map all Calais entities to [Select Box of vocabularies]

O - Custom Mapping

         Calais Entity    |     Drupal Vocabularies

         Anniversary      |     [Select Box of vocabularies + option for none]
         City             |     [Select Box of vocabularies + option for none]
         Company          |     [Select Box of vocabularies + option for none]
         ....
         Anniversary      |     [Select Box of vocabularies + option for none]

        Unknown Entities  [Select Box of vocabularies + option for none + option for Create new Vocabulary]

As the mapping will be chosen post module activation the creation of all the vocabularies (if 1:1) needs to be done just in time, instead of as part of the install.
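To make the three modes and the just-in-time creation concrete, here is a minimal sketch in Python (a language-neutral illustration, not Drupal code). The config keys ("mode", "target", "mapping", "unknown"), the function names and the way vocabulary ids are allocated are all hypothetical; the real module would use Drupal's taxonomy functions instead.

```python
# Hypothetical sketch of resolving a Calais entity type to a vocabulary id
# under the three proposed modes, creating vocabularies only when needed.

def resolve_vocabulary(entity_type, config, existing, create):
    """Return the vocabulary id for entity_type, or None if it is unmapped.

    config:   dict with "mode" plus mode-specific keys (assumed layout)
    existing: dict mapping vocabulary name -> vocabulary id
    create:   callable that creates a vocabulary by name and returns its id
    """
    mode = config["mode"]
    if mode == "one_to_one":
        name = entity_type                  # one vocabulary per entity type
    elif mode == "all_to_one":
        name = config["target"]             # everything goes to one vocabulary
    elif mode == "custom":
        # Entity types not listed fall back to the "unknown" setting;
        # a value of None means "do not store these terms".
        name = config["mapping"].get(entity_type, config.get("unknown"))
    else:
        raise ValueError("unknown mapping mode: %r" % mode)

    if name is None:
        return None
    if name not in existing:
        # Just-in-time creation instead of creating everything at install.
        existing[name] = create(name)
    return existing[name]


vocabularies = {"Tags": 1}

def create_vocabulary(name):
    # Stand-in for the real creation call: hand out the next free id.
    return max(vocabularies.values(), default=0) + 1

one_to_one = resolve_vocabulary("Person", {"mode": "one_to_one"},
                                vocabularies, create_vocabulary)
all_to_one = resolve_vocabulary("City", {"mode": "all_to_one", "target": "Tags"},
                                vocabularies, create_vocabulary)
custom_cfg = {"mode": "custom", "mapping": {"Person": "Artists", "City": None},
              "unknown": None}
custom_none = resolve_vocabulary("City", custom_cfg, vocabularies, create_vocabulary)
```

In this sketch a "Person" vocabulary only comes into existence the first time a Person entity is resolved in 1:1 mode, which is the just-in-time behaviour described above.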

I would be happy to help with this, but I don't know Calais well enough to get rolling. I have been looking at calais_get_entity_vocabularies(), calais_get_vocabularies() and calais_process_node(), but I am still working out exactly how/where to start hacking (and what things I might break as a result).

a_c_m’s picture

Title: Mapping Calais taxonomies to internal Drupal taxonomies » Mapping Calais entities to internal Drupal taxonomies

Nigeria’s picture

Is there a likely possibility that these features may be implemented?

webchick’s picture

I'm interested in coding something like this, but not quite. More like Rob's #4. We want to map people to "players" and regions to "leagues." If the incoming term isn't in our list of 'accepted' terms, then it'll either get dropped, or we'll dual-tag it: once with Calais's stuff and once in our own vocabulary. I'm still trying to decide which is the best way to go on that... if anyone has thoughts, let me know.

I'll try and code our custom module in a somewhat generic way and attach it here, but I can't promise anything remotely approaching support for it.

febbraro’s picture

Version: 6.x-3.0 » 6.x-3.1

I think there is some good stuff in what people are talking about here. I think ideally the array stored in the variables table under the key calais_vocabulary_names just needs a UI to allow editing of the vocabulary id (vid) for each entity type. I think it will need to be taken one step further, though. It seems that for each entity type we would need to know whether there should be a "parent" term, which is the entity type name, so that if all terms go to one vocabulary then the Person types would be stored as "General Vocab" => "Person" => "Johnny Cash"
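A small sketch of that idea, written in Python purely as an illustration (not Drupal code): each entity type carries a vocabulary id and a flag saying whether its terms should be nested under a parent term named after the entity type. The "use_parent" key, the function name term_path and the vid values are hypothetical.

```python
# Hypothetical per-entity-type mapping: target vocabulary id plus a flag
# controlling whether terms are nested under a parent named after the type.
mapping = {
    "Person": {"vid": 7, "use_parent": True},
    "City": {"vid": 3, "use_parent": False},
}

def term_path(entity_type, term, mapping):
    """Return (vid, hierarchy) describing where a term would be stored."""
    entry = mapping[entity_type]
    if entry["use_parent"]:
        hierarchy = [entity_type, term]   # e.g. Person => Johnny Cash
    else:
        hierarchy = [term]                # flat: term at the top level
    return entry["vid"], hierarchy

person_path = term_path("Person", "Johnny Cash", mapping)
city_path = term_path("City", "Nashville", mapping)
```

Here the Person term ends up under a "Person" parent inside vocabulary 7, while the City term is stored flat in vocabulary 3.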

As far as whitelisting goes, I think that needs to be a separate module that makes use of the hooks.

If someone writes any of these, I would be more than happy to integrate it into the modules, as I don't see myself having much time in the near future.

SocialNicheGuru’s picture

subscribing.

pyxio’s picture

I think a possible way to accomplish this would be to integrate the mapping via the Content Taxonomy module (http://drupal.org/project/content_taxonomy). That would be a killer feature. If this were achieved, you'd be able to have only one tab for editing your node taxonomy fields rather than having both EDIT and CALAIS, as is the case now. Actually, it already works with Content Taxonomy, so "Mapping Calais entities to internal Drupal taxonomies" can already be done. However, it is not elegant because, as I mentioned, you have two separate tabs, which means a lot of clicking back and forth. But where I have a Content Taxonomy CCK field mapped to a vocabulary on the EDIT tab and the vocabulary tagging field on the CALAIS tab, if I enter a keyword in the CCK field it does appear in the Calais field also. So we are actually closer to a solution than you might imagine. Tighter integration between the two modules would, it seems to me, solve this request. Let me know what you think. Kevin

AntiNSA’s picture

I have tried for the last 28 hours straight to get this to work with Content Taxonomy along with FeedAPI. I tried everything I could, but could not get Content Taxonomy to automatically populate the custom fields that were created for each Calais/taxonomy vocabulary. I tried all night and thought that since you "mapped" the Content Taxonomy field to the Calais vocabulary after the information was gathered from Calais, it would automatically populate the custom Content Taxonomy fields... but absolutely no luck. Actually, it is more like 48 hours with no sleep for nothing now. If someone could help me with this I would be very grateful.

a_c_m’s picture

Thread seems to have gone cold - webchick did you get anywhere with this?

matthiassamwald’s picture

I would really enjoy seeing "a) all the calais terms go to a single free tagging taxonomy instead of the ones that the module predefines" implemented. In fact, I won't consider using this module before this is implemented.

deltab’s picture

(a) is a very good option. Among other things, we currently have duplication between very generic Calais categories: I find almost everything from document categories replicated in social tags.

Equinger’s picture

subscribing

febbraro’s picture

Status: Active » Closed (won't fix)

In the D7 version of this module I intend each Entity Type to have a Vocabulary that it maps to via the configuration. This should solve any problem that one might encounter of where/why/how to create the appropriate terms.

a_c_m’s picture

febbraro, that's disappointing. I would have very much liked to see this for the 6.x version.

Raf’s picture

Another request to have this feature (especially a) ) in Drupal 6.x. The sheer number of vocabularies right now makes it nearly unusable for any of my projects. Right now, administrators don't have control over the taxonomy anymore, but have to make do with what Open Calais throws into the whole site structure.

a_c_m’s picture

Is there really no chance this would be added?

ahansen1’s picture

I would be willing to contribute to a bounty to have this feature added to D6