Calais powered by Thomson Reuters

What is it?

The Calais Collection integrates the Thomson Reuters Calais web service into the Drupal platform. The Calais web service automatically creates rich semantic metadata for the content you submit, in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. Calais also goes well beyond classic entity identification and returns the facts and events hidden within your text. The web service is free for commercial and non-commercial use, but it requires registration to obtain an API key.

Read webchick's fantastic Introduction to Calais for Drupal.

What's New?

  • Upgraded to work with Calais release 4.3
  • SocialTags integration
  • SemanticProxy integration
  • Full Calais data integration with Views
  • Full support for Calais disambiguated URIs and data for Geo (City/State/Country), Company, Products

Requires the RDF module alpha7 release or later

API

This module provides a flexible API for other modules to use when integrating with the Calais web service. Both a function-based and an object-oriented API are available.

Tagging Integration with Nodes and Taxonomy

This provides the capability to integrate Calais Entity, Event, and Fact metadata with Drupal nodes. The Calais module lets you configure which content types should be analyzed by Calais for metadata extraction on update. The metadata returned can then be assigned to vocabularies automatically, or offered only as suggested terms, leaving the user in full control of the tagging (think of del.icio.us recommending tags). A flexible set of hooks allows third-party modules to make modifications before or after Calais metadata has been processed and applied.
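
The difference between automatic assignment and suggestion can be illustrated with a small sketch. It is written in Python for brevity and is not the module's code: the function name `route_terms`, the mode names "apply" and "suggest", and the idea of one vocabulary per Calais entity type are all assumptions made for illustration.

```python
from collections import defaultdict


def route_terms(entities, mode):
    """Group extracted (entity_type, name) pairs by vocabulary.

    Hypothetical helper: `mode` is "apply" (terms are attached to the
    node automatically) or "suggest" (terms are only offered to the
    editor). One vocabulary per entity type is an assumption.
    """
    if mode not in ("apply", "suggest"):
        raise ValueError("mode must be 'apply' or 'suggest'")

    by_vocabulary = defaultdict(list)
    for entity_type, name in entities:
        # Skip duplicates so each term appears once per vocabulary.
        if name not in by_vocabulary[entity_type]:
            by_vocabulary[entity_type].append(name)

    grouped = dict(by_vocabulary)
    if mode == "apply":
        return {"applied": grouped, "suggested": {}}
    return {"applied": {}, "suggested": grouped}


# Made-up sample data standing in for a Calais response.
sample = [("City", "Paris"), ("Company", "Acme"), ("City", "Paris")]
print(route_terms(sample, "suggest"))
```

In "suggest" mode nothing is attached to the node until the editor accepts a term, which is the del.icio.us-style workflow described above.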

SemanticProxy integration

When importing feeds, sometimes the short blurb they give you is not enough for Calais to build real context and give you rich metadata. Sometimes the content you are pointing to has a fuller version of its text elsewhere. This is what SemanticProxy was made to handle. Just tell SemanticProxy which URL to process (it integrates with FeedAPI Node and with CCK Link and Text fields) and it will process the content at that source URL and return all of the semantic metadata, as well as the full text of the source document, to you. Learn more about SemanticProxy.

Geomapping

The Calais Geo module plots various Calais vocabulary terms on a map provided by the GMap module. It makes use of the Calais web service facilities that provide latitude and longitude for relevant geo terms. Map data is provided as a block and as part of the node properties.

Blacklisting and Renaming

The Calais Tag Modifier module allows for basic blacklisting of tags, so that terms you don't care about are never suggested (note, however, that existing applications of the term remain intact). The term rename mechanism additionally lets you change the names of returned metadata before they are assigned or suggested, by merging the undesired term into the new one.
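
As a rough illustration of how blacklisting and renaming interact, here is a minimal Python sketch. It is not the Tag Modifier module's implementation: the function name `modify_terms`, the case-insensitive matching, and the choice to check the blacklist before renaming are assumptions.

```python
def modify_terms(terms, blacklist=(), renames=None):
    """Filter and rename returned term names before they are used.

    Hypothetical helper. Matching is case-insensitive (an assumption).
    Blacklisted terms are dropped; renamed terms are merged into their
    replacement, so the result contains no duplicates.
    """
    blocked = {name.casefold() for name in blacklist}
    rename_map = {old.casefold(): new for old, new in (renames or {}).items()}

    result = []
    seen = set()
    for term in terms:
        if term.casefold() in blocked:
            continue  # never assign or suggest a blacklisted term
        # Replace an undesired name with its preferred form, if any.
        term = rename_map.get(term.casefold(), term)
        key = term.casefold()
        if key not in seen:  # merge: keep only one copy of each term
            seen.add(key)
            result.append(term)
    return result


returned = ["IBM", "I.B.M.", "Reuters", "Spam"]
print(modify_terms(returned, blacklist=["spam"], renames={"I.B.M.": "IBM"}))
```

Because the filter only touches incoming metadata, terms already applied to existing content are unaffected, matching the note above.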

Installation Notes

The ARC2 library is required for this module to function.

  • The D5 version requires the ARC2 library to be installed in opencalais/arc_rdf/arc2
  • The D6 version also requires ARC2; however, it should be installed as part of the RDF module, which is a dependency
  • The D6 Version 2.2+ requires the RDF module alpha5 release or later

The Taxonomy Manager module is not required, but it can make your life much easier when large numbers of unique terms are applied to your content.

Upgrading

If you are upgrading to the 2.x release of Calais for Drupal, please be sure to unpack it into a NEW directory. Do not unpack over an older version: files have been moved, and you may end up enabling the wrong version of certain contrib modules (like tagmods).

Calais Collection

Also part of the collection is

OpenPublish

The Calais Collection is also part of OpenPublish, the semantic Drupal platform for publishers.

Credits

This project is sponsored by Thomson Reuters' Calais and Phase2 Technology.
