Calais
What is it?
The Calais Collection integrates Thomson Reuters' Calais web service into the Drupal platform. The Calais Web Service automatically creates rich semantic metadata for the content you submit, in well under a second. Using natural language processing, machine learning and other methods, Calais analyzes your document and finds the entities within it. Calais also goes well beyond classic entity identification and returns the facts and events hidden within your text. The web service is free for commercial and non-commercial use, but you must register to obtain an API key.
Read webchick's fantastic Introduction to Calais for Drupal.
What's New?
- Upgraded to work with Calais release 4.0
- SemanticProxy integration
- Full Calais data integration with Views
- Full support for Calais disambiguated URIs and data for Geo (City/State/Country), Company, Products
Requires the RDF module alpha7 release or later
See CHANGELOG.txt for details of the releases.
API
This module provides a flexible API for modules to use when integrating with the Calais Web Service. Both a function-based and an object-oriented API are available.
Tagging Integration with Nodes and Taxonomy
This provides the capability to integrate Calais Entity, Event, and Fact metadata with Drupal Nodes. The Calais module lets you configure which Content Types should be analyzed by Calais for metadata extraction on update. The returned metadata can either be assigned to vocabularies automatically or merely offered as suggested terms, leaving the user in full control of the tagging (think of del.icio.us recommending tags). A flexible set of hooks allows third-party modules to make modifications before or after Calais metadata has been processed and applied.
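The flow described above (optional pre-processing hooks, then either automatic assignment or suggestion, then post-processing hooks) can be sketched roughly as follows. This is an illustrative Python model only, not the module's actual PHP API: the names `apply_metadata`, `TaggingResult`, `drop_empty_vocabularies`, and the vocabulary-to-terms dictionary shape are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

# Hypothetical shape for returned metadata: vocabulary name -> list of terms.
Metadata = Dict[str, List[str]]


@dataclass
class TaggingResult:
    assigned: Metadata = field(default_factory=dict)   # terms applied to the node
    suggested: Metadata = field(default_factory=dict)  # terms offered to the editor


def apply_metadata(
    metadata: Metadata,
    auto_assign: bool,
    pre_hooks: Sequence[Callable[[Metadata], Metadata]] = (),
    post_hooks: Sequence[Callable[[TaggingResult], None]] = (),
) -> TaggingResult:
    """Route metadata to 'assigned' or 'suggested', running hooks around it."""
    # Pre-processing hooks may rewrite the metadata before it is used.
    for hook in pre_hooks:
        metadata = hook(metadata)

    # Copy the term lists so later changes do not touch the caller's input.
    terms = {vocab: list(names) for vocab, names in metadata.items()}
    if auto_assign:
        result = TaggingResult(assigned=terms)
    else:
        result = TaggingResult(suggested=terms)

    # Post-processing hooks see the final result and may adjust it in place.
    for hook in post_hooks:
        hook(result)
    return result


def drop_empty_vocabularies(metadata: Metadata) -> Metadata:
    """Example pre-hook (hypothetical): discard vocabularies with no terms."""
    return {vocab: names for vocab, names in metadata.items() if names}


if __name__ == "__main__":
    returned = {"Company": ["Thomson Reuters"], "City": []}
    result = apply_metadata(
        returned, auto_assign=False, pre_hooks=[drop_empty_vocabularies]
    )
    print("suggested:", result.suggested)
    print("assigned:", result.assigned)
```

In suggestion mode nothing is applied to the node until a user accepts a term, which is the "full user control" case described above.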
SemanticProxy integration
When importing feeds, sometimes the tiny blurb they give you is not enough for Calais to build real context and give you rich metadata. Often the content you are pointing to has a fuller version of its text elsewhere. This is what SemanticProxy was made to handle. Just tell SemanticProxy which URL to process (it integrates with FeedAPI Node and with CCK Link and Text fields) and it will process the content at that source URL and return all of the semantic metadata, as well as the full text of the source document, to you. Learn more about SemanticProxy.
Geomapping
The Calais Geo module allows for plotting various Calais Vocabulary terms on a map, as provided by the GMap module. It makes use of the Calais Web Service facilities that provide latitude & longitude for relevant geo terms. Map data is provided as a block and as part of the node properties.
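As a rough illustration of the idea, the sketch below turns geo terms that carry a resolved latitude and longitude into map markers and skips terms without usable coordinates. It is a Python model under stated assumptions, not the Calais Geo or GMap API: `GeoTerm`, `Marker`, and `markers_for_terms` are hypothetical names, and the sample coordinates are approximate.

```python
from typing import List, NamedTuple, Optional


class GeoTerm(NamedTuple):
    """Hypothetical geo term; coordinates are None when unresolved."""
    name: str
    latitude: Optional[float]
    longitude: Optional[float]


class Marker(NamedTuple):
    """Hypothetical map marker handed to a mapping layer."""
    label: str
    latitude: float
    longitude: float


def markers_for_terms(terms: List[GeoTerm]) -> List[Marker]:
    """Keep only terms with coordinates that fall in the valid range."""
    markers: List[Marker] = []
    for term in terms:
        if term.latitude is None or term.longitude is None:
            continue  # no coordinates were resolved for this term
        if not (-90 <= term.latitude <= 90 and -180 <= term.longitude <= 180):
            continue  # out-of-range values cannot be plotted
        markers.append(Marker(term.name, term.latitude, term.longitude))
    return markers


if __name__ == "__main__":
    terms = [
        GeoTerm("Calais, France", 50.95, 1.86),  # approximate coordinates
        GeoTerm("Europe", None, None),           # unresolved, so skipped
    ]
    for marker in markers_for_terms(terms):
        print(marker)
```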
Blacklisting and Renaming
The Calais Tag Modifier module allows for basic blacklisting of tags, so that terms you don't care about are never suggested (note, however, that existing applications of a blacklisted term remain intact). Additionally, the term rename mechanism lets you modify returned metadata names before they are assigned or suggested, by merging the undesired term with the new one.
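The blacklist-then-rename behavior can be sketched as a small filter over the returned terms. The Python below is an illustrative model, not the Tag Modifier module's code: `modify_tags` is a hypothetical name, and both the case-insensitive matching and the order of operations (blacklist first, then rename, then merge duplicates) are assumptions.

```python
from typing import Dict, Iterable, List, Set


def modify_tags(
    terms: Iterable[str],
    blacklist: Set[str],
    renames: Dict[str, str],
) -> List[str]:
    """Drop blacklisted terms, rename the rest, and merge duplicates."""
    # Assumption: matching is case-insensitive.
    blocked = {term.lower() for term in blacklist}
    rename_map = {old.lower(): new for old, new in renames.items()}

    result: List[str] = []
    seen: Set[str] = set()
    for term in terms:
        key = term.lower()
        if key in blocked:
            continue  # blacklisted terms are never assigned or suggested
        final = rename_map.get(key, term)
        if final.lower() in seen:
            continue  # the renamed term merges into one already present
        seen.add(final.lower())
        result.append(final)
    return result


if __name__ == "__main__":
    returned = ["IBM", "International Business Machines", "Yesterday", "Drupal"]
    print(
        modify_tags(
            returned,
            blacklist={"yesterday"},
            renames={"International Business Machines": "IBM"},
        )
    )
```

Here the renamed term collapses into the existing "IBM" entry rather than producing a second copy, which mirrors the merge described above.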
What's Next?
We are working on a suite of modules called the Calais Collection that ties all of the Calais functionality together with tools for publishers. See the Calais Collection section below. The next version(s) of the Calais modules will include things like: better handling of resolved entity extended data and GUID, exposing Calais suggestions and relevance to Views, support for Calais 4.0, and many more Linked Data goodies.
Installation Notes
The ARC2 library is required for this module to function.
- The D5 version requires the ARC2 library to be installed in opencalais/arc_rdf/arc2
- The D6 version also requires ARC2, however it should be installed as part of the RDF module, which is a dependency
- The D6 Version 2.2+ requires the RDF module alpha5 release or later
The Taxonomy Manager module is not required, but can make your life much easier in the event that large amounts of unique terms are applied to your content.
Upgrading
If you are upgrading to the 2.x release of Calais for Drupal, please be sure to unpack it into a NEW directory. Do not unpack over an older version: some files have been moved, and you may otherwise enable the wrong version of certain contrib modules (like tagmods).
Calais Collection
Also part of the collection is
OpenPublish
The Calais Collection is also part of OpenPublish, the semantic Drupal platform for publishers.
Credits
This project is sponsored by Thomson Reuters' Calais and Phase2 Technology.
Releases
| Official releases | Date | Size | Links | Status |
|---|---|---|---|---|
| 6.x-3.2 | 2009-Jul-27 | 53.21 KB | Download · Release notes | Recommended for 6.x |
| 5.x-1.6 | 2008-Oct-20 | 21.86 KB | Download · Release notes | Recommended for 5.x |

| Development snapshots | Date | Size | Links | Status |
|---|---|---|---|---|
| 6.x-3.x-dev | 2009-Jul-27 | 53.24 KB | Download · Release notes | Development snapshot |

