Hello,

Currently, the term hierarchy generated in Drupal is flat, whereas OpenCalais semantic metadata (entities, facts, events) could be considered hierarchical.

Sometimes the metadata (Person, Product...) is characterized by more than one attribute. The first attribute could be considered the parent (person name, product name...), whereas the additional attributes could be considered the children (person type, person nationality; product type...).

Continuing with the Person and Product entities, their hierarchical view would be:

  • Person name
    • person type
    • person nationality
  • Product name
    • product type

If we submit the product Tylenol to OpenCalais, the generated RDF is:

...
<rdf:Description rdf:about="http://d.opencalais.com/genericHasher-1/b79d043a-acc9-3aaa-a9fe-2f45a0808291">
  <rdf:type rdf:resource="http://s.opencalais.com/1/type/em/e/Product"/>
  <c:name>Tylenol</c:name>
  <c:producttype>Drug</c:producttype>
</rdf:Description>
...

Supporting the RDF structure as a Drupal term hierarchy would require (at least) the following features:

  • extract all the metadata attributes as a two-level hierarchy
  • map this hierarchy to the corresponding Drupal taxonomy
  • let the user configure Calais entity usage accordingly (i.e. disabling a parent should automatically disable its children)

Do you plan to implement support for the RDF structure?
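
To make the first feature concrete, here is a minimal sketch of the extraction step. It is written in Python rather than as Drupal code, purely for illustration. The function name `extract_two_level` is hypothetical, and the binding of the `c:` prefix to `http://s.opencalais.com/1/pred/` is an assumption, since the RDF excerpt above elides its namespace declarations.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the c: prefix is bound to this namespace (not shown in the excerpt).
C_NS = "http://s.opencalais.com/1/pred/"

# The Tylenol description from above, wrapped in a root element so it parses.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:c="{C_NS}">
<rdf:Description rdf:about="http://d.opencalais.com/genericHasher-1/b79d043a-acc9-3aaa-a9fe-2f45a0808291">
  <rdf:type rdf:resource="http://s.opencalais.com/1/type/em/e/Product"/>
  <c:name>Tylenol</c:name>
  <c:producttype>Drug</c:producttype>
</rdf:Description>
</rdf:RDF>"""


def extract_two_level(rdf_xml):
    """Return {entity type: {name: {attribute: value}}} for each description."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        type_el = desc.find(f"{{{RDF_NS}}}type")
        name_el = desc.find(f"{{{C_NS}}}name")
        if type_el is None or name_el is None:
            continue  # skip descriptions that are not named entities
        # The entity type is the last path segment of the rdf:type resource URI.
        entity = type_el.get(f"{{{RDF_NS}}}resource", "").rsplit("/", 1)[-1]
        # Every other c: element is treated as a child attribute of the name.
        children = {
            child.tag.split("}", 1)[1]: (child.text or "").strip()
            for child in desc
            if child.tag.startswith(f"{{{C_NS}}}") and child is not name_el
        }
        result.setdefault(entity, {})[(name_el.text or "").strip()] = children
    return result


print(extract_two_level(SAMPLE))
# prints {'Product': {'Tylenol': {'producttype': 'Drug'}}}
```

Mapping the result to a Drupal vocabulary and cascading the enable/disable setting from parent to children would still have to be built on top of this.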

Comments

dman’s picture

Looking at the OpenCalais spec, it seems that what you are describing is not a hierarchy at all.

The difference between a Person entity and a Product entity is one of taxonomy facet (or, in Drupal, vocabulary), not one of hierarchy.

Calling this a 'hierarchy':

  • Person
    • person type
    • nationality

Would, I guess, produce a taxonomy:

  • Einstein
    • Physicist
    • German
  • Bono
    • Singer
    • Irish
  • Paris Hilton
    • Skank
    • American

... which is not what you want, and not helpful either.

The 'attributes' provided are not terms above or below the term. They could possibly be loaded in as part of the term's description, but not much more.
You might get better results by inverting that structure, and allowing multiple parents! That's not accurately faceted, but at least it's grouped a bit better.

I've done a lot of RDF+Taxonomy work. If you are interested, here are a few RDF syntaxes and here is a detailed paper on the notation of taxonomy data.

febbraro’s picture

Status: Active » Closed (works as designed)

I agree with dman here. The attributes are not in the tree itself because they are properties of the term, not children. For this (and a few other reasons) the newer versions of this module actually store the Calais RDF in the local RDF store. If you are really interested in all of that data, you could query the calais_term table for the tdid (term data id) of the term in question, grab its guid (linked data URI), then query the local RDF store via SPARQL or a more direct lookup of those attributes. Then you can do whatever you like with them, including adding them to the Term Data object.

benoit.borrel’s picture

@dman #1 and @febbraro #2

Thanks for your input.

I know attributes or properties are not children, but I am trying to find a way to leverage these attributes through the taxonomy. As dman suggested, I was in fact thinking of something like

inverting that structure and allowing multiple parents!

Example with one attribute, and therefore a unique parent scheme:

  1. Tylenol
    Entity: Product
    ProductType (attribute 1): Drug
  2. possible taxonomy hierarchy mapping:
  • Product > ProductType > Drug > Tylenol

Example with two attributes, and therefore a multi-parent scheme:

  1. Einstein
    Entity: Person
    PersonType (attribute 1): Physicist
    Nationality (attribute 2): German
  2. possible taxonomy hierarchy mapping:
  • Person > PersonType > Physicist > Einstein
  • Person > Nationality > German > Einstein
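
The two mappings above can be sketched in a few lines. This is an illustration in Python, not Drupal taxonomy API code, and the helper names `taxonomy_paths` and `build_parents` are hypothetical. It builds one path per attribute and then derives, for every term, the set of its direct parents.

```python
def taxonomy_paths(entity, name, attributes):
    """Build one path per attribute: entity > attribute name > value > term."""
    return [[entity, attr, value, name] for attr, value in attributes.items()]


def build_parents(paths):
    """Map each term to the set of its direct parents (multiple parents allowed)."""
    parents = {}
    for path in paths:
        # Each consecutive pair in a path is a (parent, child) relation.
        for parent, child in zip(path, path[1:]):
            parents.setdefault(child, set()).add(parent)
    return parents


paths = taxonomy_paths(
    "Person", "Einstein", {"PersonType": "Physicist", "Nationality": "German"}
)
for path in paths:
    print(" > ".join(path))
# prints:
# Person > PersonType > Physicist > Einstein
# Person > Nationality > German > Einstein

parents = build_parents(paths)
print(sorted(parents["Einstein"]))
# prints ['German', 'Physicist']
```

The sketch keys terms by name only, so two different terms that share a name (for example "German" as a nationality and as a language) would be merged; a real implementation would need to key on term ids instead.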

Also, febbraro informed us that

newer versions of this module actually store the Calais RDF in the local RDF store

I agree this will store the attributes, but logically linking these attributes to a term will require the procedure you described. The solution I described above would not need such a procedure, only the taxonomy API functions that handle hierarchy (i.e. the *parents* and *children* functions).

What do you think?

benoit.borrel’s picture

Status: Closed (works as designed) » Active

@febbraro:
I switched this issue back to active so that it is displayed on the default issues list and the discussion can continue. Feel free to reset its status to "works as designed" if you see fit.

febbraro’s picture

Status: Active » Closed (works as designed)

I think this is something that could be done, but it is out of scope for the plan and long-term direction of the Calais module. However, if this approach is something you are after, I will do my best to support your efforts by adding hooks in the proper places of the module(s) to give you integration points for implementing your own functionality.

Currently there are hook_calais_preprocess and hook_calais_postprocess, which give you access to the RDF and the node itself to do whatever you may need. If those are not sufficient, give me some idea of what would help and we can likely come up with another hook or two.

If you need specific help with hooks, open a new issue. For more pointers on how to implement this yourself, feel free to add more comments here.