The "generalized relationship module" discussion - I call it "RDF Metadata"

By dman on 14 Nov 2005 at 14:20 UTC

Apologies in advance for what will be a big post, I'm just thrilled to see great minds thinking alike.

This post was originally written in the "Issues" section, but got so big I felt I should drag it out here. So many of my comments refer to the body of this big thread. I hope the right folk will still be able to find it...

First - I too identified this need (for a way to relate nodes to each other meaningfully), and wrote up a really big proposal on how to address it, using only existing web standards.

In cases that some folk don't stop to think about too much, metadata is actually a reference to another resource.
The familiar <a href="http://www.w3.org/TR/rdf-primer/"> is of course one of these cases.
But there is an implicit relationship in the definition of an href - namely that it's a link.
Interestingly, this can be explicitly stated with the attribute syntax rel="link" but nobody ever bothers. This makes it what the information theorists refer to as a 'triple': {this-text}{links-to}{that-page} ... but 'links-to' isn't all we can do...

And the next day I set about implementing it, and so far have achieved what I set out to do:

RDF Metadata

Manages relationships between nodes, and between nodes and other resources.

A node can now have any number of 'statements' attached to it.
A 'statement' is an RDF 'triple' relating one resource with another.
EG, the "next" resource from page one is page two.
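
As a minimal sketch (the helper name and the array keys are illustrative assumptions, not the module's actual code), a statement can be held as a three-key PHP array:

```php
<?php
// A statement is a (subject, predicate, object) triple.
// make_statement() is a hypothetical helper used only for this example.
function make_statement($subject, $predicate, $object) {
  return array(
    'subject'   => $subject,
    'predicate' => $predicate,
    'object'    => $object,
  );
}

$statements = array(
  // "The 'next' resource from page one is page two."
  make_statement('node/1', 'next', 'node/2'),
  // The object may equally be an external resource.
  make_statement('node/1', 'references', 'http://www.w3.org/TR/rdf-primer/'),
);

foreach ($statements as $s) {
  print $s['subject'] . ' --' . $s['predicate'] . '--> ' . $s['object'] . "\n";
}
```

Because both ends are plain strings, the same structure covers node-to-node links and links to resources outside the site.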

Right. That was lots of fun, and I started looking for refinements and ways this could be made useful to others (actually I was looking to see how to get Doxygen to believe that .module files are PHP) when I found this talk!

Generalized relationship

And randomly searching (for something totally unrelated) I see my task seems to have been stated before (although without much reference to existing metadata working groups) in a discussion of a "generalized relationship module". This is indeed what I've created, although possibly with less reference to code that's gone before than I might have had.

Amongst the many ideas thrown in there that I like is ownership of relationships, which I'd considered useful, but not in the first version.

Much mention is made of permissions - which I've not looked at at all.
I do have a plan for cascading relationships however - using the predicate 'characteristics'.

I am feeling sorry about branching away from taxonomy, as I can imagine how it could have been shoehorned into doing the job, yet conceptually it just didn't seem to help.
Discussion of the schemas in every comment assumes only reciprocal relationships, and doesn't consider external entities, just as I didn't spend much thought on INTERNAL non-node entities such as taxonomy terms (although I was looking for a way to drag in user ids). I am looking at solving this by using paths as alternatives to nids, but then database queries are less cool.
I still think referring to Dublin Core/RDF/OWL ontologies is a much better way of defining relationships than the proposed 'left-term' - 'right-term' syntax.

Restrictions on which types of nodes are allowed to have certain relations were also out of scope for me.

At one level, a proposal for a system to relate 'items' to items sounds grand, but it seems to want to take over the entire interface, rather than append directional tagging.
There is mention of tagnode module - which I sure wish I'd heard of earlier! Yet to investigate it.

Obviously this subject has been thought about a lot by vlado in his ramblings at dikini.net. I respect someone who can come up with the SQL magic to do the deeper lookups, and I'm pleasantly surprised he wrote a dissertation on RDF covering topics almost identical to mine!

Real-world use-case

And as a contribution to that discussion (64 comments deep and counting), I disagree that the data structure should be nailed down before anyone talks about the APIs - I do agree instead that it would help to say what we want to achieve in practical terms rather than talk info-theory.

So: ... well my first list in initially stating the problem is a start.
To expound upon that, a few purposes I want to see this solution fulfil are:

Not add too many more form fields to the node edit form, yet do everything from the one interface.
When creating a new page, be able to state explicitly that an older one should link to it.
Refer to external resources and entities as easily as internal ones. (URIs, not Drupal IDs)
Set a 'Next' and 'Previous' and 'Up' link on any page at all.
Optionally have the targeted 'Next' and 'Previous' pages automatically know about their new reciprocal.
Support explicit change history and versioning, by flagging documents as 'replacedBy' etc, in fact all of the Dublin Core 'relations'. For more reading on the right way to do it, I can't recommend highly enough A No-Nonsense Guide to Semantic Web Specs, by the guy that did Cocoon.
Publish bits of this metadata in the HTML header, and actually make use of the recommended LINK REL= syntax
Use inferences and tree-walking to return deduced information, like the 'next-previous' relation, but also the 'part-of-a-part is also a part' inheritance.

Like above, cascade inheritance of properties to the most useful match (with regard to weighting).
Logic like 'the glossary for a page is the glossary of the book that the page is part of'. This can actually be accomplished in just two OWL statements!

Find a balance between weighing down the DB with too many trivial (reciprocal) links and the possible performance hit of complicated dependency lookups.

Have a real cute API that is as simple as my example snippet

if ($newer = rdf_metadata_get("isReplacedBy")) {
  // l() builds a link from the link text and the target path.
  print(l("Newer version available", $newer['path']));
}

Allows site editors to define what metadata links show up and how, using blocks or similar. ... without having to think about '

...This means full (optional) editorial control of the text of links that show up.
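
The 'glossary of the book that the page is part of' goal in the list above can also be read as a plain walk up the part-of chain. The following is only a sketch of that idea, not the OWL route and not the module's code: the in-memory store, the 'hasGlossary' predicate and both function names are assumptions made up for illustration.

```php
<?php
// Hypothetical in-memory store: each statement is array(subject, predicate, object).
$store = array(
  array('page-3', 'isPartOf', 'chapter-1'),
  array('chapter-1', 'isPartOf', 'book-A'),
  array('book-A', 'hasGlossary', 'glossary-A'),
);

// Return the objects of all statements matching subject and predicate.
function objects_of($store, $subject, $predicate) {
  $found = array();
  foreach ($store as $s) {
    if ($s[0] === $subject && $s[1] === $predicate) {
      $found[] = $s[2];
    }
  }
  return $found;
}

// Look for $property on $subject; if it is missing, climb the $via
// relation and try again on each parent, up to $max_depth levels.
// Returns array('value' => ..., 'depth' => ...) or NULL if nothing is found.
function find_inherited($store, $subject, $property, $via = 'isPartOf', $max_depth = 10) {
  $current = array($subject);
  for ($depth = 0; $depth <= $max_depth && $current; $depth++) {
    $parents = array();
    foreach ($current as $item) {
      $values = objects_of($store, $item, $property);
      if ($values) {
        return array('value' => $values[0], 'depth' => $depth);
      }
      $parents = array_merge($parents, objects_of($store, $item, $via));
    }
    $current = $parents;
  }
  return NULL;
}

$glossary = find_inherited($store, 'page-3', 'hasGlossary');
print $glossary['value'] . ' found at depth ' . $glossary['depth'] . "\n";
```

The depth limit doubles as protection against circular part-of chains, and the returned depth gives a natural weighting for picking 'the most useful match'.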

phew!

OK, what I might do is just go and throw open my sandbox site so folk can have a play with the interface I've got going and illustrated as an introduction to my module.
A demo is worth another thousand words today, I think. Please have a play with The Edit Metadata addition to the node edit form on that page to see where I've got so far.
I'll just see about securing a few bits in the sandpit now ...

I requested CVS, but I guess my stated purpose is certainly a large amount of duplication with the topic of this discussion.

big post. sorry. big topic. ...

Comments

Sorry that the first

venkat-rk commented 14 November 2005 at 14:59

Sorry that the first response should be a minor nitpick, but the blockquote sections have turned into huge links. Hurts the eyes :-)

:)

dman commented 14 November 2005 at 16:17

Hover over it... it was intentional :-B

_{.dan. is the New Zealand Drupal Developer working on Government Web Standards}

Three thumbs up

eaton commented 14 November 2005 at 16:00

I agree wholeheartedly that this sort of system is necessary for Drupal to take the 'next step'.

This sort of system could certainly be used to replace the existing book.module as a proof of concept. Most modules, I'd imagine, would automatically generate the relationships, or only provide UI for one or two relationship types (is-followed-by or is-child-of for example).

--
Jeff Eaton | Click Here To Find Out Why Drupal "Sucks"

--
Eaton — Partner at Autogram

Sandbox Open

dman commented 14 November 2005 at 18:35

You can now experiment with the interface on a live (dummy) site
http://coders.co.nz/drupal_development/?q=node/8

Anonymous users can create and edit content, and make links between pages.
Some 'inverse' relationships are supported. It's mostly a test of the editing interface - does it seem intuitive enough?

There are side blocks which come and go depending on whether they have anything to say. Once the relationship terms and roles are chosen, these would probably be themed or something.

Anyway, have a play with the interface - no logon, just 'create' a story or edit an existing one.

.dan.


Interesting stuff...

eaton commented 14 November 2005 at 19:03

The widgets you've got up there for editing the relationships seem really cool, though I can imagine most users being overwhelmed by those options. Do you imagine that UI being more of an 'administration relationship editor'?


The Options

dman commented 15 November 2005 at 03:34

The available options are of course editable in a page almost identical to the taxonomy editor.
The terms were mostly supplied by a client. Turns out they are lifted directly from the Dublin Core, so I went back to the sources.
Yes, there are maybe too many. I expect to add more (easy admin options) but the call was for any page to optionally have any of these properties/relationships. I'd rather hide them in a drop-down than have 20 flexinode fields. Plus we support plurality.

It'll get 'worse' if/when I list literals in there too, but it's a unified interface to all that ... meta data.

.dan.


Not quite intuitive

Thox commented 14 November 2005 at 19:55

The ellipsis button [...] isn't quite intuitive. That button is often used to indicate a popup selection box, not replacing a dropdown with a textbox (which incidentally could represent anything).

I do however prefer your concept of putting the "New" relationships into the table with the existing ones. That would certainly be an improvement on my last relationships interface attempt.

Do you intend to publish any source code?

Yes and yes

dman commented 15 November 2005 at 03:12

I was going to make a little widget button to indicate ... something else... but then thought what a hassle it would be with theming and all. Any suggestions on what it should look like? I used [...] because it indicates "more" (options) to me.

The combo box/textbox CAN have anything in it! That was a happy side-effect. When I started thinking RDF, it became apparent that the object of a statement can be either a reference or a literal!
I don't mind if you put your name in the 'creator' box instead of your homepage. ... although this should probably be restricted a bit as I start layering on ontologies.

A relationship is only one type of property. It's the one I'm focussing on right now, but it looked like fun to arbitrarily add a 'keyword' field or an 'embargo date' if you feel like it, and get free, optional meta tag management without adding a dozen flexinode fields.
The acceptable input types are defined in the term definition.

One schema to rule them all.

Source code, I hope to, depending on what my client thinks. I've done this entirely in my own time so far, because it seems like a good idea, but if they wish to use it, they'll get to make that call.

.dan.


frames API in core, relations API on frames

sin commented 14 November 2005 at 20:13

I can't pass through the word “generalized”... :)
There are a lot of useful properties of relations, which in the different proposed database schemas are just additional columns or tables. Such a structure cannot pretend to be “generalized”: different tasks – different columns. So, core developers, please keep the original triples idea by dman:

CREATE TABLE rdf (
  subject   text NOT NULL ,  // a Node ID or URI 
  predicate text NOT NULL ,  // shorthand relationship string 
                             // (from lookup) , or a URI
  subject   text NOT NULL ,  // a Node ID, URI, or string 
  KEY subject (subject)
);

Only an autoincrement id must be added. If someone needs a timestamp, creator id or any other of a thousand good properties, its value must be a second triple connected to the first by its autoincrement id, instead of another column. And the API must provide transparent access.
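
One possible reading of that, as a sketch only: the 'stmt:' prefix used to point at another row, the function names and the 'created'/'creator' predicates are all assumptions for illustration, and a real version would sit on the database table rather than a PHP array.

```php
<?php
// Illustrative only: rows keyed by an auto-increment style id.
$rows = array();
$next_id = 1;

// Store a triple and return its id. The subject may be a node path, a URI,
// or (for properties of a relation) a reference such as "stmt:1".
function add_triple(&$rows, &$next_id, $subject, $predicate, $object) {
  $id = $next_id++;
  $rows[$id] = array('subject' => $subject, 'predicate' => $predicate, 'object' => $object);
  return $id;
}

// Collect the extra properties attached to the statement with the given id.
function statement_properties($rows, $id) {
  $props = array();
  foreach ($rows as $row) {
    if ($row['subject'] === 'stmt:' . $id) {
      $props[$row['predicate']] = $row['object'];
    }
  }
  return $props;
}

$id = add_triple($rows, $next_id, 'node/198', 'isPartOf', 'node/197');
// Timestamp and creator become second triples about the first, not new columns.
add_triple($rows, $next_id, 'stmt:' . $id, 'created', '2005-11-14');
add_triple($rows, $next_id, 'stmt:' . $id, 'creator', 'user/1');

print_r(statement_properties($rows, $id));
```

The table keeps its three columns plus the id however many kinds of property are attached later; the cost is an extra lookup per statement.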
RDF is triples, general hierarchy is triples, OWL with all of its overpowered relations and properties is triples. I think in the not so far future Drupal will be extended to support web ontology with all of the OWL power. It is just a matter of time. Maybe it is time to implement a frame-based approach?
Yes, it is hard for a developer to store a single object instance in more than one table row with meaningless abstract column names. The API must check consistency. There will be a performance cost, but it is the cost of freedom. Look at Protege (protege.stanford.edu) – it is a very flexible Java framework for developing ontology-based systems. Relations, like any other objects in it, can have any properties, but some fundamental property types (cardinality, transitivity, symmetry etc.) are built in. It has an advanced API to access all those properties. It is almost as beautiful from inside as Drupal is. It is frame based. Its database backend consists of only one table:

mysql> desc protege;
+-------------+--------------+------+-----+---------+-------+
| Field       | Type         | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| frame       | int(11)      |      | MUL | 0       |       |
| frame_type  | tinyint(4)   |      |     | 0       |       |
| slot        | int(11)      |      | MUL | 0       |       |
| facet       | int(11)      |      |     | 0       |       |
| is_template | tinyint(1)   |      |     | 0       |       |
| value_index | int(11)      |      |     | 0       |       |
| value_type  | tinyint(4)   |      |     | 0       |       |
| short_value | varchar(255) | YES  | MUL | NULL    |       |
| long_value  | mediumtext   | YES  |     | NULL    |       |
+-------------+--------------+------+-----+---------+-------+

The frames idea in Protege is much like flexinode, except that nodes and their additional properties are kept in the same table. Additional modules may just cache all those books, forums, users, terms and others in separate tables according to the site's needs using the Drupal API (it would be Drupal 5.0 :)
Everything is files in Linux. Everything is frames in Protege. I started to use Drupal because ALMOST everything is nodes :) Of course there is no need to rewrite Drupal (it works -- don't touch it! :) But if a relations API is already required by a lot of different modules, maybe implement them as frames to be more “generalized”?

heh.

dman commented 15 November 2005 at 03:28

NOW on quoting I see a typo in my original schema. The third column is of course "object", not a second 'subject'.

... I haven't heard that terminology before, do you have a reference for 'frames'?

While it's true that anything and everything CAN be a triple, I don't want to evangelize it as the only way forward, when we can make a few short-cuts.
I think special-casing the uid into a column wouldn't hurt too much ... although I currently can't think of using it too much ..

Hm, using the syntax to make relationships to relationships, although truly possible and all, is just extra brain-strain. It's not ruled out, you can do it in the future, you can even basically do it now, (easy to support internally) but it would slow down adoption at this point.

.dan.


frames

sin commented 15 November 2005 at 09:08

Not too much on frames in Wikipedia...
"a data structure for representing a stereotyped situation"
http://en.wikipedia.org/wiki/Frames
and it was "invented by Marvin Minsky in the 1970s" :)
http://en.wikipedia.org/wiki/Frame_(data_structure)
Google for "frame-based knowledge representation" for details.

The trick is only to STORE properties of relationships as further relationships in the same database table. It is not the goal, it is the way to solve the problem of how to add different properties for different relations as needed. The API will hide the frames under Drupal code as the Protege API does, and provide transparent access to these properties. So the syntax may be the same:

<?php
if ($newer = rdf_metadata_get("isReplacedBy")) {
  // l() builds a link from the link text and the target path.
  print(l("Newer version available", $newer['path']));
}
?>

But instead of 'path' you can get any other extra property of the relation, defined in the appropriate module beforehand and gathered by Drupal core for you.
Just to name one application: using frames, one-to-many relations are possible.
Regarding the database schema: please don't forget the autoincrement id -- it is a must to connect frames. And look at the Protege database table again -- it is a product of many years of research and development :) The last mediumtext field may be useful: it is null when not needed and saves the world when we want to store something big without creating a node (a description property of a relation, for example).

relations. relations,....

dikini commented 15 November 2005 at 12:00

In my own mind I view relations in the simplest and cleanest way - what they used to mean by that in algebra. A kind of a generalised function.

The relations are one way to express the structure of an underlying set/domain. A way to order things.

These are the goals I was putting in front of myself to achieve.

You are talking about meaning. The meaning of a relation is declared by some kind of labelling (ids) and their interpretation, either programmatically or by humans. The labels are either explicit, like taxonomy terms/tags, or induced using some AI method.

RDF represents structures of things. Nothing more. You have RDFS - their schemas, to provide some shorthands for the labels. It doesn't solve the meaning, and reasoning bit more than what we are having now, or the long relations discussions we had in various places around drupal. It is just another, possibly very useful, tool to exchange data between systems.

OK. What I probably mean is we need a way to first represent structures, then label them, and reason about them in different contexts.

I don't really want to go on too much about it, but I've written a briefish rant on my website about it, showing what I am doing atm and roughly where I am now.

Once again relations, ....

Wow, that's deep math-think

dman commented 15 November 2005 at 15:32

I tried, I really did, but it's been a long time since I was writing pure theory functions like that.
But I did have the best laugh of the day when I found that one of the important concepts was expressed:

(_,_)

... :-B

I can't believe you think I'm trying to be too advanced, when all I'm working with is triples! You are talking about n-tuples and things to represent a set, while I'm just defining a set as things that share a common reference - everything that has property 'hasParent'=X is the set of 'childrenOf' X.
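
To make that concrete, here is a small sketch with made-up node paths and a hypothetical helper name: the set 'childrenOf X' is just a filter over the statements.

```php
<?php
// Made-up sample statements for illustration.
$statements = array(
  array('subject' => 'node/11', 'predicate' => 'hasParent', 'object' => 'node/10'),
  array('subject' => 'node/12', 'predicate' => 'hasParent', 'object' => 'node/10'),
  array('subject' => 'node/13', 'predicate' => 'hasParent', 'object' => 'node/99'),
);

// The set "childrenOf X" is every subject whose hasParent object is X.
function subjects_with($statements, $predicate, $object) {
  $set = array();
  foreach ($statements as $s) {
    if ($s['predicate'] === $predicate && $s['object'] === $object) {
      $set[] = $s['subject'];
    }
  }
  return $set;
}

$children = subjects_with($statements, 'hasParent', 'node/10');
print implode(', ', $children) . "\n";
```

No separate table of sets is needed: membership falls out of the shared reference.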

I guess I've been saying the wrong things about RDF. It's not the XML syntax I think is a big deal, in fact that's the worst thing about it. What I'm using is just the concept of triples, and the use of URIs to define types of relationships.
Because this technology gives me a useful bunch of terms and vocabulary to begin with, I'm using that instead of making up my own.

You are right I'm talking about meaning, but I'm not trying to make it up - by using URIs as descriptors of the relationship, by linking the verb, the statement:
Node100 IsPartOf Node56
has now got defined semantics, leaving no doubt to humans or computers just what you mean. THAT is a big deal.

I wouldn't say RDF represents structures of things at all. It's a way of representing properties of things in an unequivocal way. Relationships to other entities are just a subset of those properties.
The triples (or arcs as the node-folk call them) are actually each uni-directional. The fact that my doc refersTo the w3c does not define that the w3c is referredToBy me, although that can be implied with the addition of ontology... or semantics.
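
A sketch of how such an implied reverse arc might be derived. The inverse table stands in for whatever an ontology would declare, and the 'hasPart' entry, the 'implied' flag and the function name are assumptions for illustration.

```php
<?php
// Assumed ontology fragment: predicate => its inverse.
$inverse_of = array(
  'refersTo' => 'referredToBy',
  'isPartOf' => 'hasPart',
);

$explicit = array(
  array('subject' => 'node/5', 'predicate' => 'refersTo', 'object' => 'http://www.w3.org/'),
  array('subject' => 'node/5', 'predicate' => 'creator', 'object' => 'dman'),
);

// Derive the reverse arc only for predicates the ontology says are invertible.
function implied_inverses($statements, $inverse_of) {
  $implied = array();
  foreach ($statements as $s) {
    if (isset($inverse_of[$s['predicate']])) {
      $implied[] = array(
        'subject'   => $s['object'],
        'predicate' => $inverse_of[$s['predicate']],
        'object'    => $s['subject'],
        'implied'   => TRUE,
      );
    }
  }
  return $implied;
}

print_r(implied_inverses($explicit, $inverse_of));
```

The 'creator' statement yields nothing, because no inverse is declared for it; the stored data stays uni-directional and the reverse arcs are computed on demand.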

That's what I'm progressing onto now
.dan.


(_,_) :)))

dikini commented 15 November 2005 at 16:56

Most of us like that. You can always change the _ with .

Actually we ARE miscommunicating things, all I'm trying to say is that the triples (subject, predicate, object) and P(x,y) are equivalent.

P(x,y) == (x,P,y) are the same thing.
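
The equivalence can be shown in a few lines. This is a sketch with a hypothetical predicate() helper: a predicate function P(x, y) is built directly from stored (x, P, y) triples.

```php
<?php
// Stored as triples: (x, P, y).
$triples = array(
  array('node/100', 'isPartOf', 'node/56'),
);

// Build the predicate P(x, y) for a given name P from the stored triples.
function predicate($triples, $name) {
  return function ($x, $y) use ($triples, $name) {
    return in_array(array($x, $name, $y), $triples, TRUE);
  };
}

$isPartOf = predicate($triples, 'isPartOf');
var_dump($isPartOf('node/100', 'node/56'));
var_dump($isPartOf('node/56', 'node/100'));
```

The first call is true and the second false, which also shows that the argument order carries the direction of the relation.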

Ok, there is a difference, not only notational.

We have storage, labelling, and possibly some inference. The last bit is too advanced ATM to even consider it. Storage is tightly linked with SQL, and most developers are interested and know that. From a storage point of view, if you don't allow for some degree of freedom based on modelling what you want to allow then you limit yourself from the start. Thus my pathetic attempts at a mini-SQL query builder.

Labelling, in drupal terms, is either taxonomy or node, IMO nodes will better serve the purpose - they are more expressive and we can get self-documented predicates|relations.

You probably noticed, that I'm talking about 4 different basic predicates for each relation, they comprise all possible combinations of the arguments for that relation to actually be able to act as a predicate proper.

So the differences are not that big. The language - quite a few miles. But that shouldn't be a problem.

The external designation of relations - URI - is important. That's one of the reasons to stick to nodes - this allows us to continue in the best Drupal traditions of turning the URI into a mini-language of its own.

I'm anxious not to lose focus on what we are trying to achieve, which is being able to express "node x isPartOf y" or "which nodes arePartOf something drivenBy a madHatter" :)

On the outside, the different expressions like the rel tags, Dublin Core attributes, RDF, OWL, microformats, whatever, can be mapped to based on meaning.

Hope you are not cross with me :) Language is definitely not a virtue of mine.

Feedback on the UI - relationship management console - please

dman commented 15 November 2005 at 15:37

I've added a bunch more functionality to the relationship management interface.

You are allowed to enter freetext literals as the 'object' of a statement, they don't have to be URIs
Non - URIs will be listed as plaintext
... but if you do want to use URIs - The title will be automatically retrieved from the web for you!
And, if that's not good enough for you, you can rename the link, so it shows up as something nice and brief on the lists. Just click near the item in the relationship table to enter edit mode.
I've made a nicer general-purpose all-metadata block
There's lots of details hidden in the tooltips

... please tell me what else you would want to see as a user when you try to mess around in my sandpit

cheers! .dan.


me again

dikini commented 15 November 2005 at 17:01

I like the interface. Are you interested in doing this relations/directional tagging together? To be honest, my UI skills suck a lot. And thinking from a couple of very different angles is a very good thing in my book.

At the moment I want to finish the initial API, so we can actually play and see what is possible and not. That's why I was keeping myself quiet about it for a while. That, and the work which pays the bills.

OK, so lets talk API

dman commented 16 November 2005 at 16:14

Great amounts of these discussions seem to have been revolving around "what a good idea it would be to have an API" without one suggestion to show for it ... other than it would be a good idea.

Unless somebody (possibly me) is much confused, API means "Application Programming Interface" (although I used to prefer "Abstract Programmers Interface" ... whatever)

So the important thing is neither how this is implemented, nor what purposes it's put to.

It's the names and arguments of the public functions it exposes

So here's a sample from my code.


/**
 * Find all statements associated to the given node or resource.
 *
 * When the statements are loaded, the title of the target is also 
 * deduced, (if possible)
 * 
 * Optionally, find (filter) all statements of a given type (predicate term)
 *  that apply to this node.
 *
 * If recursion_level is greater than 0, the inverse relationships are scanned to pick up
 * implied data. Implied statements are returned with the rest, but tagged with a 'depth'
 * to indicate how far away from the original document they were found.
 * Inverse relationships should always have depth 1.
 *
 * Aliases are also looked up for this item
 *
 * TODO : inheritance and transitive relationships
 *
 * @param $subject - usually a node id, can also be a path (eg taxonomy/term/5) or even a URI
 * @param $predicate If set, return only this type of statement
 * @param $recursion_level int When looking up relationships, follow reciprocal and recursion rules this far, the returned list of statements will include implied statements also
 */
function rdf_metadata_node_get_statements($subject, $predicate='', $recursion_level=1) {

/**
 * To retrieve the metadata for a node.
 *
 * $statements = rdf_metadata_node_get_statements($nid);
 *
 * Returns an array of statements.
 * Each statement may look something like:
 *
 *     $statement = Array 
 *         (
 *             [subject] => 198            // The current node
 *             [predicate] => isPartOf     // The relationship identifier
 *             [object] => 197             // The target node OR ALIAS
 *             [path] => node/197          // The link to display to this target
 *             [title] => Chapter 1        // The display title, extracted from the target node itself, or otherwise defined
 *             [object_nid] => 197         // The target node ID, useful if the defined 'object' was an alias path
 *         )
 *
 * Optionally, it may also have a [depth] value set, which indicates
 * how much logical recursion it took to come up with this statement.
 * A depth > 0 indicates this statement is IMPLIED, not explicitly stated.
 * You may choose to display this sort of data in a different way.
 * TODO maybe instead of int depth, I should make a complete recursion trail
 * optionally available on the statement - to track how I came to this conclusion.
 */

If we can agree on what the accessor functions should be named and what job they should do - we have our API and progress!
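
From the caller's side, a result in the shape documented above might be consumed like this. The sample data is hand-written rather than real module output, and split_by_depth() is a hypothetical helper, not part of the module.

```php
<?php
// Hand-written sample in the documented return shape; not real module output.
$statements = array(
  array('subject' => 198, 'predicate' => 'isPartOf', 'object' => 197,
        'path' => 'node/197', 'title' => 'Chapter 1', 'object_nid' => 197),
  array('subject' => 198, 'predicate' => 'hasPart', 'object' => 199,
        'path' => 'node/199', 'title' => 'Section 1.1', 'object_nid' => 199, 'depth' => 1),
);

// Split statements into explicit and implied, treating a missing depth as 0.
function split_by_depth($statements) {
  $groups = array('explicit' => array(), 'implied' => array());
  foreach ($statements as $s) {
    $depth = isset($s['depth']) ? (int) $s['depth'] : 0;
    $groups[$depth > 0 ? 'implied' : 'explicit'][] = $s;
  }
  return $groups;
}

$groups = split_by_depth($statements);
foreach ($groups['explicit'] as $s) {
  print $s['title'] . " (stated)\n";
}
foreach ($groups['implied'] as $s) {
  print $s['title'] . ' (implied, depth ' . $s['depth'] . ")\n";
}
```

Whatever the accessor ends up being called, keeping the depth on each statement lets a theme decide for itself how to present implied links.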

(I'm still fighting with doxygen BTW - it keeps folding my first few functions up into a variable name - most frustrating, otherwise I'd be able to publish the rest of my API examples)


Questions

eaton commented 16 November 2005 at 16:47

This is excellent stuff, and I'm thrilled to see it moving forward towards a workable API.

A couple of questions about the schema you've set down. I might just be unclear on the details, but what scenarios would make storing both the object ID, the path, AND the object_nid necessary? Also, doesn't tying it to nid rule out linking to users, or comments, or what not?

I'm working on a similar system for a (non-drupal) work project and we ran into similar questions. It required building a separate data scheme to capture information about 'what kinds of things we're linking to.'

Perhaps I'm just missing something? Don't take the questions as discouragement, though. This is great stuff and I'd like to help however I can.


It's not stored, it's deduced

dman commented 16 November 2005 at 17:27

The response to that function call is what happens AFTER the node-walk has done its logic (whatever that may be)

Half those fields are convenience fields, filled in as we went.
I found it handy to be able to store to either side of the equation as either paths or nids.
Hence, if I found it WAS a path or an alias in the storage, I backtracked to find the nid, and kept it handy for later, if needed.

so, if
object='aliaspath',
object_nid=123,
path='aliaspath' (redundant)

but if
object=123,
object_nid=123, (redundant)
path='aliaspath'
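
A sketch of that deduction step. The alias table here is a plain array standing in for Drupal's alias lookup, and resolve_object() is a hypothetical name, not the module's function.

```php
<?php
// Assumed alias table: alias => internal path.
$aliases = array('about-us' => 'node/123');

// Fill in the convenience fields from whatever was stored as the object.
function resolve_object($object, $aliases) {
  // A bare number is a nid; a known alias maps to its internal path;
  // anything else (another path, a URI, a literal) is left alone.
  $internal = is_numeric($object) ? 'node/' . $object
    : (isset($aliases[$object]) ? $aliases[$object] : $object);
  $nid = NULL;
  if (preg_match('/^node\/(\d+)$/', $internal, $m)) {
    $nid = (int) $m[1];
  }
  // Prefer an alias for display when one exists.
  $alias = array_search($internal, $aliases, TRUE);
  return array(
    'object'     => $object,
    'object_nid' => $nid,
    'path'       => $alias !== FALSE ? $alias : $internal,
  );
}

print_r(resolve_object('about-us', $aliases));
print_r(resolve_object(123, $aliases));
print_r(resolve_object('taxonomy/term/5', $aliases));
```

Both of the first two calls end up with the same nid and display path, which is the redundancy described above; the third shows a non-node path passing through with no nid at all.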

You've got a point about other 'things' not having nids - which is why I optionally use paths.

'subject' and 'object' are the key fields. They can both be nids or paths.

I tacked on 'path' because I didn't want to see url('node/'.$nid) in my code, and I can now pass the array directly to a theme_link_list() or theme('menu_item_link') where it will just work.

BUT, my paste here was not to say this example is the best way forwards, it was to say I want to see a spec that looks like this and encompasses everything anyone could want from an interface.

What arguments beyond recursion level should we give to a node walker? (assume walking rules are defined by the predicates themselves)
What filtering would you want beyond simple predicate? Should it optionally be an array?
What ordering or indexing should the result be in (currently it's ordered by weights assigned to the terms)
Is this (plus a set($subject,$predicate,$object,$title='') function) enough to start with?
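
As a starting point for that discussion, here is a sketch of a setter plus a getter whose predicate filter may be a single term or an array. The sketch_ names are placeholders, the store is a PHP array rather than the database, and recursion and weighting are left out entirely.

```php
<?php
// Illustrative signatures only; a real version would write to the database.
function sketch_set(&$store, $subject, $predicate, $object, $title = '') {
  $store[] = array('subject' => $subject, 'predicate' => $predicate,
                   'object' => $object, 'title' => $title);
}

// $predicate may be '' (no filter), a single term, or an array of terms.
function sketch_get($store, $subject, $predicate = '') {
  $wanted = ($predicate === '' || $predicate === array()) ? NULL : (array) $predicate;
  $found = array();
  foreach ($store as $s) {
    $same_subject = (string) $s['subject'] === (string) $subject;
    if ($same_subject && ($wanted === NULL || in_array($s['predicate'], $wanted, TRUE))) {
      $found[] = $s;
    }
  }
  return $found;
}

$store = array();
sketch_set($store, 198, 'isPartOf', 197, 'Chapter 1');
sketch_set($store, 198, 'next', 199);
sketch_set($store, 198, 'creator', 'dman');

print count(sketch_get($store, 198)) . "\n";
print count(sketch_get($store, 198, 'next')) . "\n";
print count(sketch_get($store, 198, array('next', 'isPartOf'))) . "\n";
```

Accepting either a string or an array keeps the simple call simple while still allowing a block to ask for several relation types at once.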

.dan.


Thrilled and concerned...

eaton commented 15 November 2005 at 18:33

Really loving the way this is going.

There are, again, a couple of problems that I can see with the UI and the underlying philosophy of it.

1) On a large site, managing nodes in those dropdown lists will be frightening. AJAXy autocomplete from node titles, direct entry of a nid, anything along those lines as an alternative interface would be great for 'production work.'

2) Tagging == relationships? I'm not sure, honestly, about the association of random keywords, etc with content using this mechanism. For external URLs, I'd rather see a node-to-node relationship established with something from links.module, and for textual keywords I'd rather see a freeform taxonomy used. This code REALLY shines when it's managing relationships between existing pieces of data in Drupal.

It'd be a shame for it to get stalled because it's colliding with taxonomy and so on. Not saying that the capability couldn't be exploited in the future, just... well... I'm not sure, really. Perhaps this system IS the groundwork for something that could replace taxonomy.

3) The use of the (...) to change the field from a select to a text entry field is pretty confusing -- I keep expecting it to pop up a node picker. But that's already been mentioned.

4) A hypothetical replacement for book.module would use 'follows', 'is_followed_by,' and 'is_part_of' to manage its hierarchy of relationships, yes? I'd imagine that modules wanting to utilize this framework would probably manage those connections behind the scenes rather than making users wire the pages together manually?

Seriously though -- this is good stuff, even with the nitpicking. Getting an API along these lines into contrib sooner rather than later would be an awesome step...


yes and no

drofnar commented 15 November 2005 at 19:11

1) yes - definitely would need that

2) Tagging is just one special case ... {tag, describes, object} or in dikini's scheme... describes (document, tag). And I think it's great that we could use this to represent tags amongst all the other things. Please don't let concerns about taxonomy constrain this thinking. Taxonomy may be very well regarded by Drupallers, but what we're talking about here is something ultimately far more important. And considering what I've seen in the demo, it could very quickly be bringing practical benefits.

3) Yes I imagine that there would be many modules that could benefit from this clean approach and at the same time remove the complexity/configuration side for end user administrators.

BTW, if you two got together as dikini suggested I'm sure the result would be terrific and best for everyone. Why not start with a simple core based on standard RDF as outlined by dman, and then elaborate on top of that with all of dikini's ideas? BTW, it's not just about better representation internally to Drupal, it's about the best representation and standards in terms of the whole web community. RDF is a widely appreciated standard, and it provides tremendous benefits going forward.

It just occurred to me...

eaton commented 15 November 2005 at 19:17

..That in the starry-eyed future of Drupal, taxonomies themselves could easily be linked to nodes with the 'described-by' relationship.

Performance issues are my only real concern -- if this much metadata is being shoved into a table it could be daunting. But after thought, yes... I can see where all the features of taxonomy (synonyms, hierarchies, etc) would fit into this framework nicely.

Hmmmmmmm!


Suggestions please!

dman commented 16 November 2005 at 01:49

1. The drop-down is truly horrid, yes. As it currently is with maintaining menus on big sites.
But we have no 'browse' functionality to access existing nodes. Please suggest the better way to do this - I was going to build my own filemanager-style browse nodes widget, but didn't want to get distracted. Besides, with no intrinsic node structure yet, even my 'browse' would just return a big list.

2. Freetagging is a side-effect. I noted in my first essay that restricted vocabs are handled perfectly well already with taxonomy, and although I could easily replace taxonomy like this, that's not my intent.

3. Please suggest something else. Nice, small, theme-independent, internationally intuitive. Would a [+] be any better - that implies 'expand' to me :(

4. The terms available so far are not fleshed out yet - they shouldn't have to be, as you can add them as needed, and their exact meanings, and what that means to the code, can be refined later. How and if this supplements the 'book' structure comes later. But yeah, I'd expect another module to come in and help do specific tasks like that. It would only be a refinement on the UI.
'Wiring it together manually' at the moment is not that far away from how I currently have to build a working menu system anyway :-/

I'm thinking of an extension method for pluggable verbs. You can contribute a little library that defines 'isCousinOf', which adds the predicate to the list, and roots around for a response when someone asks for rdf_metadata_get("isCousinOf",$nid);
... or FOAF :-B
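
To make the pluggable-verb idea a bit more concrete, here is a rough sketch in Python rather than the module's PHP. Everything in it (register_predicate, metadata_get, the PARENT_OF data) is a made-up name for illustration and is not the module's API; it only shows a plugin adding a predicate to a list and answering lookups for it.

```python
# Hypothetical sketch: none of these names come from the actual module.
from typing import Callable, Dict, List

# Registry mapping a predicate name to the function that answers queries for it.
PREDICATE_HANDLERS: Dict[str, Callable[[int], List[int]]] = {}


def register_predicate(name: str, handler: Callable[[int], List[int]]) -> None:
    """Called by a plugin to contribute a new verb to the list."""
    PREDICATE_HANDLERS[name] = handler


def metadata_get(predicate: str, nid: int) -> List[int]:
    """Ask for every node related to `nid` by `predicate`."""
    handler = PREDICATE_HANDLERS.get(predicate)
    return handler(nid) if handler else []


# Example plugin data (invented): child nid -> parent nid.
PARENT_OF = {1: 10, 2: 10, 3: 11, 10: 100, 11: 100}


def cousins_of(nid: int) -> List[int]:
    """Nodes that share a grandparent with `nid` but not a parent."""
    parent = PARENT_OF.get(nid)
    grandparent = PARENT_OF.get(parent)
    if grandparent is None:
        return []
    return sorted(
        child for child, p in PARENT_OF.items()
        if p != parent and PARENT_OF.get(p) == grandparent
    )


register_predicate("isCousinOf", cousins_of)
```

The core only knows the registry; what 'isCousinOf' means lives entirely in the contributed handler.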

_{.dan. is the New Zealand Drupal Developer working on Government Web Standards}

Thoughts...

eaton commented 16 November 2005 at 02:22

1) Thox and I had a discussion about this on #drupal the other day. In a perfect world* I'd lean towards a system that offers a 'plain HTML' field for entering a nid. With AJAX, it would hide the nid entry field and offer an auto-complete 'type the title of the node' field. Clicking the (...) button would launch a popup window with a more advanced node selection screen. Underneath it all, though, it would just be storing nids. The popup node selection screen, naturally, is the funkiest part. But that sort of system would, I think, be easy to implement in stages. It could easily be themed as well, as long as the resulting widgets offer back a nid or list of nids.

3) To be honest, I'm not sure. The best I can think of is to offer yet another select menu full of 'types of things that one can relate to' -- other nodes, other users, comments, files, urls, raw keywords, etc. Choosing something from THAT would change the select box (or node-selection widget, or whatever) into a textfield, or back. Mind you, I'm full of IDEAS, but you're the one who's put together actual working code here. :-)

4) The idea of defining specific terms is, I think, an important one. It brings us back to the idea of a generalized relationship module, one that several others here have mentioned. A while back, there was a discussion about implementing a very simplified 'relations' API in which modules could simply define their own relationship types and do what they would. Is there anything preventing this from being built *on top of* that as an auxiliary layer of intelligence? I know clipper.module is already using that sort of stuff.

From the sounds of it, the basic schema that others were proposing for that core API is very similar to yours. You're putting a lot of inherent 'intelligence' about what those relationships mean and how they should be displayed into your code... thoughts?


I guess I missed your post

dman commented 16 November 2005 at 15:47

I guess I missed your post when I was replying to Jaza this afternoon...

... I still can't see an interface that's nice for the user yet. I'm thinking about clients being told to "Enter the node ID" when they are used to a picker. Maybe just a popup sitemap with tiny text...
No good solution, and certainly should not be part of this project :-/

As for the context switcher, you may find that I haven't got around to switching it BACK yet - I can discard the select and replace it with freetext OK, but you're right, I'd need Ajax (or one of my old arsenal of javascript tricks) to do vice-versa.

As I posted to Jaza below, I'm trying to put NONE of the intelligence about the relationships at all in the code.

The intelligence, such as it is, is stored in a table where any arbitrary terms are defined, applying characteristics such as 'transitive' and 'commutative'.

The term 'Next' doesn't inherently mean anything to my code until someone themes a page with
<a href='<?php print rdf_metadata_get('next')[0]['path'] ?>'>Next</a>
or builds a breadcrumb or menu to look for those values.

In fact, I want the semantics of each term to be supplied by user-defined extensions - the FOAF can own the logic of IsFriendOf, the Upload module can define the verb 'HasAttachment' ... I can build my client's versioning system as a little semantics plugin of my own, and the list of predicates will reflect what's needed for each site.
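
As a small illustration of keeping the intelligence in a term table rather than in code, here is a Python sketch (not the module's PHP). The TERMS table, the STATEMENTS list and the metadata_get name are assumptions made up for this example; the only thing the lookup code knows about is the 'commutative' flag, never any particular term.

```python
# Hypothetical sketch: the term table and statements below are invented examples.
TERMS = {
    "next":        {"transitive": False, "commutative": False},
    "isSiblingOf": {"transitive": False, "commutative": True},
}

# Stored statements as (subject, predicate, object) triples.
STATEMENTS = [
    ("node/1", "next", "node/2"),
    ("node/2", "next", "node/3"),
    ("user/7", "isSiblingOf", "user/9"),
]


def metadata_get(subject, predicate):
    """Return the objects for (subject, predicate), consulting only the term's flags."""
    found = [o for s, p, o in STATEMENTS if s == subject and p == predicate]
    if TERMS.get(predicate, {}).get("commutative"):
        # A commutative term reads the same in both directions,
        # so statements that name `subject` as the object also count.
        found += [s for s, p, o in STATEMENTS if o == subject and p == predicate]
    return found
```

A theme that wants a 'Next' link would ask for metadata_get(current, "next") and use the first result; the term means nothing until something asks for it.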

.dan.


Recently visited nodes

tomamic commented 28 January 2006 at 10:29

First of all: GREAT module!!

My choice would be to have a history of nodes visited by the user. In my opinion, they have a greater probability of being 'interesting' for the user. Present the most recently visited nodes first. If they're not the right ones, allow the user to select others.

I had a similar problem on my site, to select the parent of a node in a large database of hierarchical articles (thousands). I'm going to move to Drupal soon, and your module is really precious to annotate articles!

In my case, I was thinking of allowing users to mark interesting nodes beforehand, explicitly (think about the way you go through "copy&paste"...). Or to show only unpublished nodes (it made sense in my particular case). But a sort of history could be simpler to use, almost transparent. It could be saved into the user's session, or in the db. I have to admit I don't know Drupal well enough to suggest the best way.
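
A minimal sketch of the history idea, in Python rather than Drupal's PHP. The NodeHistory class and its method names are invented for illustration, and where the list is kept (session or database) is left open, as above.

```python
from collections import OrderedDict


class NodeHistory:
    """Keeps the most recently visited node ids, without duplicates."""

    def __init__(self, limit=10):
        self.limit = limit
        self._seen = OrderedDict()  # oldest first, newest last

    def visit(self, nid):
        self._seen.pop(nid, None)           # revisiting moves the node to the newest slot
        self._seen[nid] = True
        while len(self._seen) > self.limit:
            self._seen.popitem(last=False)  # drop the oldest entry

    def suggestions(self, exclude=None):
        """Candidate targets, newest first, skipping the node being edited."""
        return [nid for nid in reversed(self._seen) if nid != exclude]
```

The edit form would offer suggestions(exclude=current_nid) first and fall back to a full picker if none of them is the right one.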

Hope this helps! Ciao. Mic

BTW, is this discussion continuing somewhere else?

Not silly.

dman commented 28 January 2006 at 11:50

You have a point. The edit process is probably something like

view page,
find target page,
go back and edit page,
which should be able to detect that where you just were is relevant.

Tricky in many respects, as the concept of 'interesting' is elastic. ... But I love heuristics :-B

I've seen active tagging (sorta like transitory favorites) a few times. Usually for collating a slideshow or something. I'm not too comfortable with it, but it may be worth a crack. 'sticky'.

I've developed the interface up to be able to refine on node types now (although that may not be visible in the demo, still considering whether to go to AJAX for that)
and what terms may apply to what type is configured with the predicate definition screen.
Still no help when we are talking about thousands of 'forum topics' or blog posts.

Good suggestion. History is probably a good try.
This node-selector is needed elsewhere, so it should be done as a stand-alone.

This discussion (such as it is) faded away until I got CVS.
I got that sorted last week, and

The Relationship Module is in CVS!

But I hadn't made the announcement very loudly (at all)

OK, I'm ready to release it on the world. Have a look, folks.
And do start a new thread, someone!

.dan.

http://www.coders.co.nz/


New thread

jasonwhat commented 28 January 2006 at 17:52

I started a new thread recently hoping to get some info from the different developers working on relationships. I think there are at least 10 different modules handling relationships in some way -- from tagging to books and categories to the upcoming CCK. I'm trying to plan some pretty complex relationships and don't want to go back and convert everything once some new module comes out that makes my system obsolete. Unfortunately, I had no response on the thread. http://drupal.org/node/46582

I am also stuck figuring out

venkat-rk commented 29 January 2006 at 03:49

I am also stuck figuring out which relationship module(s) to use for a large community site. Would you mind listing all the modules as I haven't looked at them all? Or better (since this request is OT), could you send me the names using my contact form?

Thanks in advance.

This is quite excellent. I

drofnar commented 15 November 2005 at 18:19

This is quite excellent. I think it will represent a very very important advance for Drupal. Great work dman!

I would suggest getting a module out as soon as possible so that it aids in the rest of the deliberations on this topic.

What I like is the simple fact that it's adhering to the RDF standards that will enable the sem web. For instance, if we have RDF descriptions we can then take the next step by using things like SPARQL

..SPARQL, a protocol for accessing RDF data and for conveying RDF queries from query clients to query processors. The SPARQL Protocol has been designed for compatibility with the SPARQL Query Language for RDF but is designed to convey queries from other RDF query languages as well..

http://www.w3.org/TR/2005/WD-rdf-sparql-protocol-20050114/

which could make drupal more powerful for us, but also a leader in bringing the sem web ( and web 2.0 ) closer to reality.

Suggest you read this article on SPARQL http://www.oreillynet.com/pub/wlg/7823

Gives a nice intro about why it's important. There are even open source PHP SPARQL tools already available. Anyway, here's a snippet from the article

RDF is pretty foundational to the Semantic Web, and it's got a data model, a formal semantics, and a concrete serialization (in XML). What it didn't have till lately was a standard query language. Imagine relational algebra and RDBMSes without SQL. Pretty hard to imagine. So the SemWeb needed a SQL. ... SPARQL — an RDF query language and protocol.

Most Web 2.0 applications and services involve a REST protocol or interface. ... However, there is a bit of a problem. While using REST offers a standard set of operations (GET, PUT, POST, DELETE), it doesn't offer anything like a standard data manipulation language. In others words, there is no standard way to execute an arbitrary query against a Web 2.0 app or service's dataset and get back a representation of that resource or those resources. And, more to the point, the service or app provider has to explicitly support just those data manipulation primitives or operations which it thinks are most useful.

That's great, but it's limiting.

Since RDF is such a useful data representation formalism, and it now has an equally useful query language, more and more Web 2.0 sites can push more and more smarts and functionality into the place it belongs, namely, the data. REST conceptualizes (and HTTP standardizes) public interfaces; but neither does anything to standardize how one interacts, ad hoc'edly and without central control, with arbitrary slices of someone else's data.

But SPARQL gives you precisely that, even when the data on the other end isn't really RDF, since all it has to do is support SPARQL query and map that into SQL or relational algebra or AtomStore or whatever.

..Imagine the horror of trying to get all of these totally uncoordinated Web 2.0 services and apps to support the same SQL queries?

... the [Semantic Web] and Web 2.0 need a data access protocol, which is the other thing SPARQL gives the world. Using WSDL 2.0, SPARQL Protocol for RDF describes a very simple web service with one operation, query. Available with both HTTP and SOAP bindings, this operation is the way you send SPARQL queries to other sites and the way you get back the results. The HTTP bindings are REST-friendly ... and a simple SPARQL protocol client takes about 10 or 15 lines of Python code.

So what, really, can SPARQL do for Web 2.0? .... Well, imagine being able to ask Flickr whether there is a picture that matches some arbitrary set of constraints (say: size, title, date, and tag); if so, then asking delicious whether it has any URLs with the same tag and some other tag yr interested in; finally, turning the results of those two distributed queries (against totally uncoordinated datasets) into an RSS 1.0 feed. And let's say you could do that with two if-statements in Python and three SPARQL queries.

What a shame if this work of yours was lost to the wider drupal community.. I can imagine drupal becoming really important if it were the first widely used web platform that used standard RDF and was SPARQLing around the web with it :-)

Interesting

dman commented 16 November 2005 at 02:05

Wow. Definitely related to what I want to do. I'm starting to think about how the internal queries will be defined, so thanks for that link. I hadn't seen it before.

However I found that article to be a little bit of hand-waving - it's a magic technology that is the solution to the 'problem' .. but I still can't imagine one real-world example!

The W3C spec was actually more readable!!!

It gives me a format for how I should be thinking about retrieving graphs, anyway. I'll see if I can leave the right hooks around (and call them the right names) to be accessed in that way.

.dan.


triples for tagging - a "Triplesonomy"

drofnar commented 22 November 2005 at 06:38

You know, if we look at del.icio.us and all the other tagging sites, none as far as I know use the simple power of triples; it's all just object, tag, and maybe a free text description as well. But if we used triples then tagging could be so much more powerful. Instead of a doc having X number of undefined tags attached to it, it could have well defined predicates, and maybe those predicates could be collectively grown along with the tags.

So what I'm saying is imagine a delicious-like system, only we use triples and allow the user to define/select the predicate for the tags they apply.

This expands the power of tags by introducing triples, though it's still a constrained use of the full system here, because we constrain the subject to a URL and the object to a word.

This will produce a much more useful tagsonomy because instead of just having loose tags such as

url xyz , good, todo, jim_reeves, research

we'd have

url xyz, isby, jim_reeves
url xyz, istasked, todo
url xyz, isexampleof, research
url xyz, isqualitatively, good
etc

where everything bar the subject is collectively formed
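
To show what predicate-qualified tags buy you, here is a small Python sketch. The data extends the example above with a second, invented URL ('url_abc' and the 'isabout' predicate are assumptions), and match and predicates_for_tag are hypothetical helper names, not part of any existing tagging service.

```python
from collections import defaultdict

# (url, predicate, tag) statements; url_abc and 'isabout' are invented for the example.
TRIPLES = [
    ("url_xyz", "isby", "jim_reeves"),
    ("url_xyz", "istasked", "todo"),
    ("url_xyz", "isexampleof", "research"),
    ("url_abc", "isby", "jim_reeves"),
    ("url_abc", "isabout", "research"),
]


def match(subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o) for s, p, o in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def predicates_for_tag(tag):
    """How has a tag been used? Groups URLs by the predicate that attached the tag."""
    usage = defaultdict(list)
    for s, p, _ in match(obj=tag):
        usage[p].append(s)
    return dict(usage)
```

With loose tags, both URLs would simply carry 'research'; with triples, one is an example of research and the other is about it, and the two can be queried separately.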

I don't have time to go into more detail now, but I feel a souped-up tagging using triples would add huge power to tagging, and would even open possibilities of running inferencing across the huge collective "triplesonomy" that emerges.

With your engine here, such a service could be created very quickly!

One last point on the UI side: if you want to use some Ajax, why not go for the Dojo toolkit? It's a single lightweight package, and seems to be increasingly recognised as one of the best.

Triples seem so natural

dman commented 22 November 2005 at 11:45

I too am amazed at just how damn powerful and simple making these statements is. Just being able to have multiple values is a nice big step forwards, and, as the taxonomy folk have found, working with a defined vocabulary can help speed things up too.

Using the statements is a bit trickier, but only because I keep trying to deduce them rather than state them as much as possible.
Your example is indeed a quantum step in metadata, from (what used to be called) keywords to meaningful statements.
If only a paragraph or two explaining things this way had been at the beginning of the RDF RFC the first half-dozen times I found myself reading it over the last three years. It took me until this month to see its significance.

I'd never used the term 'AJAX' until now, I always just used to call it javascript functionality. Back with Netscape 2 I was loading hidden frames containing pages with textareas (complete with arrival notification) to get information back and forth between server & client ;)
But thanks for the recommendation. I went looking for examples, and all I found was a bunch of stupid calculators that could get the server to add 1+1 for you, and use a network request to do so! Best I find a toolkit and use it, despite my temptations to do everything from scratch.

.dan.


bunch of stuff

drofnar commented 22 November 2005 at 17:03

OK,

check out this:

Adam Bosworth - Google guru - kicked off some conversations about new directions / needed improvements in the db industry. Interestingly, some of the conversations turned towards triplestores as a better web-based approach for storing data than traditional RDBMSs. I found it very surprising, as till now I always imagined using a triple store would imply a hit in performance, and here's a bunch of brainiacs discussing triplestores as a future way to better db performance!!!

http://dannyayers.com/archives/2004/12/30/are-triplestores-good-databases

What is the basis for your triple store? :-)

I also did a Google search on SPARQL and RDF looking for some PHP engines, and was surprised to see that our own discussion was listed as one of the results!!!

http://www.google.com/search?q=sparql+opensource+php&hl=en&hs=Mxn&lr=&cl...

Anyway, RAP (RDF API for PHP) appears to be worth a look. You can download it from: http://sourceforge.net/projects/rdfapi-php/

Burningbird (Shelley Powers' site) is now using RAP, and in fact these folks, Shelley, Danny Ayers etc., seem to be mulling the definition of tags:

http://practicalrdf.info/2005/06/what-is-a-tag/#comments

without seeing that the tag can be the object of a predicate that better defines the tag itself, which immediately addresses the problems they discuss with tags being too loose and with meanings that drift. I couldn't comment as the comments were closed; otherwise these folks are definitely convergent in mindset with me. I'm sure Clay Shirky (the tagging messiah :-) would get (if he hasn't already) the inherent value of using RDF triples for tags, with the predicate being employed more usefully. That issue is exactly what I was alluding to in the other comment to you: utilising the predicate to constrain the kind of meaning a tag has (in my mind the tag as object should still refer to an actual URI, which might be similar to the page you find for a delicious tag, and which can also have an associated feed).

Back to my search on PHP and SPARQL for RDF. After looking around, the ARC approach seems more appropriate than RAP as something worth leveraging. See here: http://www.appmosphere.com/pages/en-arc It's preferable in that it's more readily integrated and extendable, from what they are saying.

Finally, for Ajax, I didn't pass you the Dojo link..

www.dojotoolkit.org

Also, the other well-regarded Ajax kit that is complementary to Dojo is MochiKit.

The good thing about using them isn't just all the neat widgets already available, but the fact that they have resolved many hard problems such as the back-button issue and so on, things that just wouldn't make sense to solve from scratch. It would be a shame if Drupal doesn't settle on a standard Ajax package quickly, before more and more non-standard bits and pieces of home-grown Ajax infiltrate the modules.

Sorry for the long miscellany, all the best, Mark

BTW, dman, I'm itching to be able to play with a module for this stuff!! I hope something can emerge to play with. I'll happily test it for you

my opinion

adrian commented 15 November 2005 at 19:38

is that your system is not as useful to contrib modules as a generalized relationship system would be.

However, there's nothing stopping RDF from being implemented on top of a generalized relationship system.

Also, why can't you just use a relationship between a node and a link node (from the link module) for external things ?

Your point about revisions is a good one, but once everything is a node, revisioning will be handled transparently (should you decide to turn it on for your system), and your rdf module could even generate its own replacedBy tags or whatever. I don't believe everyone needs revisions.

Also, some of us want it to 'take over the entire system', and our need isn't just for directional tagging. However, we want to make your needs possible with the system too.

A generalized relationship system gives us a very powerful tool for creating new sites, modules and frameworks.

--
The future is so Bryght, I have to wear shades.

So what are you saying?

dman commented 16 November 2005 at 02:22

Can you explain what you see as the fundamental difference between the RDF model of triples and a 'generalized relationship system'?
I thought the ability to say "A Is Related To B In This Way" for all things was pretty much it. Please let me know what else we need, what can't be encompassed in that model?

I'd love for anyone to come up with a few more use-case scenarios, from a website administrator's tasklist, that they'd like this to fulfil.

Thanks for the mention of the links module. I honestly wasn't aware of it, and I was seeing the need to manage multiple, repeated links nicely.

My 'revisions' example may not be the best, especially as it conflicts with the internal revisions method, but this whole project started for me on the basis of a client request.
They needed an 'under review, please make submissions on this planned replacement' sort of flag for a whole bunch of legislations.
I agree that revisions are not common requirements, but it was thinking about what they would require that got me this far...

I'd be interested to hear any other applications or problems people think they need to solve.

.dan.


How exactly is

adrian commented 19 November 2005 at 14:39

CREATE TABLE rdf (
  subject   text NOT NULL,  -- a Node ID or URI
  predicate text NOT NULL,  -- shorthand relationship string
                            -- (from lookup), or a URI
  object    text NOT NULL,  -- a Node ID, URI, or string
  KEY subject (subject)
);

functionally different from :

CREATE TABLE relationship (
  rid       serial,
  left_nid  integer,  -- or subject node
  type      text,     -- a shorthand relationship string
  right_nid integer   -- or object node
);

Taking into account we want to have links as nodes.
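
One way to poke at that question is to load both shapes into a scratch database. The sketch below uses Python's sqlite3 with SQLite-flavoured versions of the two tables; the 'node/<nid>' naming, the 'references' predicate and the sample rows are assumptions for illustration only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE rdf (
  subject   TEXT NOT NULL,   -- a node ID or URI
  predicate TEXT NOT NULL,   -- shorthand relationship string or URI
  object    TEXT NOT NULL    -- a node ID, URI, or string
);
CREATE TABLE relationship (
  rid       INTEGER PRIMARY KEY,
  left_nid  INTEGER,         -- subject node
  type      TEXT,            -- shorthand relationship string
  right_nid INTEGER          -- object node
);
""")

db.execute("INSERT INTO relationship (left_nid, type, right_nid) VALUES (1, 'next', 2)")

# Every node-to-node relationship row can be restated as a triple.
db.execute("""
INSERT INTO rdf (subject, predicate, object)
SELECT 'node/' || left_nid, type, 'node/' || right_nid FROM relationship
""")

# A triple whose object is a URL or free text has no integer nid to store,
# unless the link is first turned into a node of its own.
db.execute(
    "INSERT INTO rdf VALUES ('node/1', 'references', 'http://www.w3.org/TR/rdf-primer/')"
)

triples = db.execute(
    "SELECT subject, predicate, object FROM rdf ORDER BY rowid"
).fetchall()
```

For node-to-node statements the two tables carry the same information; the remaining gap is non-node objects, which is exactly what the links-as-nodes assumption is meant to close.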

Except for the fact that you have a whole bunch of complicated words, and add the need to understand RDF to the mix? Calling it an 'rdf metadata' module is a bit narrow in vision. RDF is a standard document format for relationships between objects; I don't think it should be considered the template at this high level of the design. Otherwise we'd have the node system be based on the terminology of Atom or RSS. The relationship system we are designing is meant to hide the background terminology of adding new relationships, and the moment we show the term 'RDF' to the user, we have lost.

The same way we output RSS based on our node system, we can output RDF based on the relationship system, and functionally .. there is no difference in the information being stored, however the approach we are taking now will allow us to more easily manipulate and use these relationships inside the system, such as directly integrating the relationships into the meaning of the objects, instead of having it be metadata (ie: when creating node of this type, you are required to create a node of another type related to it, and the node form for the first type will include the node form for the second type)

Don't get me wrong, I love that you are working on this, and your module is almost exactly what I originally had in mind when I suggested it (although my approach was more directly integrated with the theme system among other things), but our requirements have evolved since then.

I still believe your approach has a lot of elegance, but people who know better than I do (allie etc.) have stated that this approach will not scale, especially when you get to building large trees and the like.

I'd love for you to contact me about this so we can chat about it on skype, possibly with some of the other relationship people. I think relationships are a great place to write a proper proposal before we get too invested in an approach.

Stupendous!

Jaza commented 16 November 2005 at 02:31

dman, the sample node edit form on your site has simply blown me away! I haven't really been following this (apparently epic) discussion on node relationships, but I had no idea that you guys had gotten this far.

The ability to define so many relationships from one node to another, all in that one simple interface on the node add/edit form, could potentially supersede every other structural and relationship-based system that Drupal currently offers. Use of the hierarchical descriptors (i.e. is part of, is followed by, comes after) wipes away the need for the book module in one fell swoop.

Adding an 'is described by' option to that list would definitely go a long way towards integrating taxonomy-like functionality into this relationships system. Also, IMHO, this would be the perfect reason for Drupal to take up my suggestion of having taxonomy terms (and vocabularies) as nodes.

Obviously, this discussion very much overlaps with my proposal to merge the book and taxonomy modules. Your sample interface has really made me start to think that maybe an all-encompassing relationships module is a better solution to Drupal's woes than a category module. But I really need to think about this more - it opens up so many possibilities, but also presents so many challenges.

The main thing that my category module proposal aims to do is to make hierarchically structuring a site simpler. Structuring a site and categorising content should be one and the same thing (hence my core idea of merging the book and taxonomy modules). Your relationships system could potentially also make structuring a site easier - but which of the descriptors will actually determine this structure?

For example, you already have the 'is part of' and 'has table of contents' relationships - and the 'is described by' relationship has been suggested as an addition to this. If a node has all three of these descriptors defined, and they all point to different references (all of which are other nodes in the site, for argument's sake), which one actually becomes that node's parent, and is reflected as such in the menu hierarchy and the breadcrumb trail? Because essentially, all of them are variations on the 'is parent' relationship.

My suggestion would be as follows:

'is part of' => can only be defined once for each node, has to point to another internal URL on the site (preferably another node), and this determines the node's position in the menu hierarchy.
'has table of contents' => if defined, must be the same reference as 'is part of'.
'is described by' => can be defined zero or more times for each node, and values for this can (but don't have to) include the 'is part of' reference.
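
The three rules above could be written down as data and checked generically. This is only a sketch in Python; the RULES table, its keys ('max', 'internal_only', 'must_equal') and the validate function are all invented names, and treating anything containing '://' as external is a simplifying assumption.

```python
# Hypothetical rule table: 'max' is the most values allowed per node (None = unlimited).
RULES = {
    "is_part_of":            {"max": 1, "internal_only": True},
    "has_table_of_contents": {"max": 1, "must_equal": "is_part_of"},
    "is_described_by":       {"max": None},
}


def validate(statements):
    """Check one node's {predicate: [objects]} mapping; return a list of problems."""
    problems = []
    for predicate, objects in statements.items():
        rule = RULES.get(predicate, {})
        limit = rule.get("max")
        if limit is not None and len(objects) > limit:
            problems.append(f"{predicate}: at most {limit} value(s) allowed")
        # Assumption: an object containing '://' is an external URL.
        if rule.get("internal_only") and any("://" in o for o in objects):
            problems.append(f"{predicate}: must point to an internal path")
        other = rule.get("must_equal")
        if other and objects and objects != statements.get(other, []):
            problems.append(f"{predicate}: must match {other}")
    return problems
```

Because the rules live in a table, changing what 'is part of' allows means editing data, not the checking code.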

I'm also concerned about the complexity of programmatically 'decoding' all this metadata in general, and actually transforming it into 'next/prev/up' links, menu structures, 'see also' links, etc. Surely this would be no walk in the park? Are there any real-life examples of a system that has successfully implemented this? It would be great if there was something to show us that all of this 'can be done and has been done'.

Coming back to the whole taxonomy issue... now that I think about it, I don't know that this relationships system would be a suitable replacement for the taxonomy system, either in backend storage or in frontend UI terms. I can't see how this relationships system could cater to managed vocabularies of terms in the way that taxonomy currently does. It seems perfect for all the other metadata that a node has... but not for that.

Also, when you define a relationship on one node's form, does that relationship then appear (in editable form?) on the other node? I.e. if I edit 'story 18' to have the relationship 'comes after story 12', if I then go into story 12, does it have the relationship 'is followed by story 18'? Related to this is the issue: do you plan to handle conflicting relationships in your module? E.g. will two nodes be able to both have the 'is part of' relationship for each other?

An amazing idea, but so far discussion has barely scraped the surface of its practicalities.

Jeremy Epstein - GreenAsh

Thanks for the Kudos...

dman commented 16 November 2005 at 04:25

... I had no idea that you guys had gotten this far.

Actually I've come in very late in the piece, and only discovered previous discussion on this after building this over the weekend.

wipes away the need for the book module in one fell swoop.

Potentially, but there's still a place for a top-down structure. All these tags are bottom-up, which is great for editors, but less so for administrators.

Adding an 'is described by' option to that list would definitely go a long way towards integrating taxonomy-like functionality into this relationships system.

Feel free to add it! I've exposed the term management interface in my sandpit.
You could then have the object of the statement refer to the taxonomy page, eg, go freetext and enter taxonomy/term/5
... or, if as you suggest, the taxonomy was nodes - just that nid.

... OK, doesn't quite work - I don't yet recognise "taxonomy/term/5" as being a system path. But I added an alias to the taxonomy page and the statement
"this References 'metadata'" adds a nice link to my metadata topic pages... Great!

Obviously, this discussion very much overlaps with my proposal to merge the book and taxonomy modules.

I read that with interest, but as I didn't have the historical baggage you were dealing with, I couldn't quite see what you were trying to achieve. I saw the cross-over potential, but wasn't aware of the roadblocks you seemed to be fighting against.

Structuring a site and categorising content should be one and the same thing

Hear hear! I think you've identified the biggest conceptual trip-up users face.

- but which of the descriptors will actually determine this structure?

... any damn one you like! :-B
The meanings of the terms are layered on afterwards. How and if those terms are used to show up on the page is implemented by other plugins.
IsParentOf (or HasChild, I dunno) is just another conceptual relationship. It's up to the breadcrumb builder to decide which terms it wants to look up first.

IsPartOf, BTW, I see as a transitive relationship (a part of a part is a part of the whole), and it may not be the right phrase to use for building books.

The good news is that there is a whole structure - the ontology - that defines what those terms mean in practice. This I'm starting on (adding inheritance) as you may see from the management interface. Only inverse relationships are there so far.

I want to find a way to make pluggable verbs - so that the addition of an additional 'book.semantics' file can be installed to supply the Next and Previous sort of terms, and take care of any extra logic that is needed.

'is part of' => can only be defined once for each node, has to point to another internal URL on the site (preferably another node), and this determines the node's position in the menu hierarchy.

Now you are thinking ontology! :-) which is good. But I'm not building those rules into the code!

Those rules can/may become 'characteristics' of the term.
Ordinality (required, only one) and Restricted content (vocabulary,sets) are to be built into a separate ruleset (on its way, maybe).
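
To make the idea of rules living in a separate ruleset concrete, here is a minimal sketch. It is written in Python purely to stay self-contained, and it is not the module's code: the `CHARACTERISTICS` table, its keys `max_per_subject` and `internal_only`, and the `violations()` helper are all hypothetical names invented for illustration.

```python
# Hypothetical ruleset: characteristics are data attached to a predicate,
# not logic hard-coded into the relationship engine.
CHARACTERISTICS = {
    "IsPartOf": {"max_per_subject": 1, "internal_only": True},
    "References": {},  # no restrictions at all
}

def violations(statements, subject, predicate, obj):
    """Return the rules that adding (subject, predicate, obj) would break."""
    rules = CHARACTERISTICS.get(predicate, {})
    problems = []
    limit = rules.get("max_per_subject")
    if limit is not None:
        existing = [s for s in statements
                    if s[0] == subject and s[1] == predicate]
        if len(existing) >= limit:
            problems.append("ordinality: only %d allowed" % limit)
    if rules.get("internal_only") and obj.startswith(("http://", "https://")):
        problems.append("restricted: object must be an internal path")
    return problems

statements = [("node/2", "IsPartOf", "node/1")]
print(violations(statements, "node/2", "IsPartOf", "node/9"))
print(violations(statements, "node/3", "IsPartOf", "http://example.com/"))
```

Changing what 'is part of' is allowed to do then means editing the table, not the engine.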

I'm also concerned about the complexity of programmatically 'decoding' all this metadata in general, and actually transforming it into 'next/prev/up' links, menu structures, 'see also' links, etc. Surely this would be no walk in the park?

Compared to what the menu system already does? Look at the job menu_get_menu() has to do to build a structure from the snippets of structural info available to it! It's a walk alright (pun intended). Add a bit of caching (again, the job of the book.semantics extension) and it's done.

Also, when you define a relationship on one node's form, does that relationship then appear (in editable form?) on the other node?

Yes, it's supported, but I chose not to have a one-way relationship editable from its destination. All links are one-way! Some of them are implicitly two-way, but that's a job for the ontology. Think Friend-of-a-friend - I may claim you as a friend, but IsAFriendOf may not be a relationship you want to have!
The implicit links currently show up (marked as such - see the tooltips). If a link is reciprocal (it has an inverse) I could have automatically defined that too, but I don't need to fill up the database with that stuff yet, if I can avoid it.

I need to find a balance between doing extra lookups and keeping track of extra data. I want to keep statements about items attached as near as possible to their items. Each chapter IsPartOf a book, but the book itself doesn't need to maintain a list of all HasParts when the information can be implicitly retrieved with a query. I feel that info is best stored as a property of the smallest item.
It doesn't make a difference in the code, but I want to filter it for the user. The ability to break off remote links could be patched in, but I just don't feel right about that yet.
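
The trade-off described above (store each statement once, on the smaller item, and derive the reverse direction at lookup time) can be sketched as follows. This is an illustration in Python under stated assumptions, not the module's implementation: `STATEMENTS`, `INVERSE` and `statements_about()` are hypothetical names, and the data is made up.

```python
# Statements are stored one-way, on the "smallest" item only.
STATEMENTS = [
    ("chapter-1", "IsPartOf", "book"),
    ("chapter-2", "IsPartOf", "book"),
]

# Hypothetical ontology fragment: each predicate mapped to its inverse.
INVERSE = {"IsPartOf": "HasPart", "HasPart": "IsPartOf"}

def statements_about(resource):
    """Yield (subject, predicate, object, implicit) tuples for one resource."""
    for s, p, o in STATEMENTS:
        if s == resource:
            yield (s, p, o, False)          # explicitly stored
        elif o == resource and p in INVERSE:
            yield (o, INVERSE[p], s, True)  # derived at query time, never stored

book_view = list(statements_about("book"))
chapter_view = list(statements_about("chapter-1"))
```

The `implicit` flag is what would let a UI mark derived links as such and keep them read-only from the destination side.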

Thanks for the feedback!

.dan.

_{.dan. is the New Zealand Drupal Developer working on Government Web Standards}

Yes, I'm thinking Ontology now

dman commented 17 November 2005 at 17:52

I know everyone is banging on about starting simple and getting the schema sorted before thinking about the meanings of things, but I've been playing with a working, simple schema for a few days now, and I'm ready to start applying logic.

If you have a look at My current predicate editing screen you'll see a big list of 'characteristics' which describe the sorts of lookups I expect to be able to do. They are mostly modelled on OWL, but I've added two of my own.

roll your own logic package

In other news, I've got pluggable semantics working. A separate module - ontology_navigation - when enabled, makes a few extra predicates available, then uses lookups on them to emulate the book module!
Book (at least the buttons part) was totally easy to port. I don't need any admin interface or form at all if my relationship manager does everything. There may be tricky bits I'm not aware of, but not so far.

So although the page navigation on
my imitation book may look familiar, it's really administered via relationship arcs.

This is however just a proof of concept, as I do believe a true book structure needs top-down organisation. This just demonstrates how easy it is for other modules to migrate to using an API. I deleted all the SQL in book.module! Gotta be a good thing!
I threw out a handful of functionality too, but nothing I've missed yet ...
The full source of this demo is posted up here if you're interested. If I ever get a CVS account, I may be able to share more.

.dan.


Need new thread - relationship API

dman commented 17 November 2005 at 23:10

I want to re-start this discussion focussing only on the "I" - the Interface. I was just wondering if it's really 'core' or a module, or a project.

I get the impression from the docs that it's best to expect a gestation as a user module than assume anything's destined for the core right away. Despite the future that many folk can imagine for this being the panacea to bring everything together...

So please,

What should this module be called? We already have a 'node relativity' (which appears to be limited to parent-child relationships) and Clipper - which I admit I've not evaluated yet. Damn names again.
Related Links and Whatsrelated are not treading the same path at all, and also don't do predicates. I named mine oddly mainly to avoid stepping on anyone's toes, but I think that the name is all-important for folk to look at the list and realize "that's what I need".
I don't want to (and can't :-( ) start a project until we've got a name that encompasses the purpose. "relationships", "relativity" could be fine, although I ended up with "metadata" when I saw how general this could be. It's certainly much more than "structure".
What should the accessors be called? I'm thinking ...get_statement() ...set_statement() because a 'statement' triple is my building block. Other methodologies may not think that way. I don't want to use get_relationship() because I like how literal statements need not be relationships. I don't mind it being statement_get(), but I would rather read plain English (as you can tell by the predicates I've been using).
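
For discussion purposes, here is one possible shape for such accessors, sketched in Python rather than Drupal's PHP. The accessor names come from the post; the argument lists, the wildcard behaviour and the in-memory `StatementStore` class are assumptions made for illustration, standing in for whatever schema ends up underneath.

```python
class StatementStore:
    """In-memory stand-in for the storage behind the proposed accessors."""

    def __init__(self):
        self._triples = set()

    def set_statement(self, subject, predicate, obj):
        """Record one (subject, predicate, object) statement."""
        self._triples.add((subject, predicate, obj))

    def get_statements(self, subject=None, predicate=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )

store = StatementStore()
store.set_statement("node/1", "next", "node/2")
store.set_statement("node/1", "title", "Page one")  # a literal, not a relationship
next_links = store.get_statements(subject="node/1", predicate="next")
```

Because callers only see the two methods, a different storage backend could be swapped in without touching them.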

IF we can get an API really defined - like by starting with the documentation, then building the code to fit - then different implementations of the underlying schema can be swapped in and out. I'm very happy with what I've got, but appreciate that some folk that really grok SQL stored procedures and inner joins may have done things differently. I don't know how my constructs will scale.

My (proposed) requirements are listed in English in my OP here, and in more technical language in my admin screen and I guess it could be restated as math functions also.

I have a working model that I can share as soon as my CVS comes through, but I'd be willing to bench-test it or unit-test it against any of the other theories put forward here so far. My UI is pluggable and API-agnostic, so it'll work on anyone else's version too, I guess. Sounds like most folk like that bit enough already.
I'm after a solution that everyone's happy with, but I'm not going to make many ideological changes to what I've got under the hood (RDF Triples) until someone can demonstrate a better alternative.

So, if we can agree on a name, we can make this a real project (and a new thread - I'm losing focus after scrolling 20 screens).

Until then I'm just battling along on my own.

Feedback?

.dan.


How about just "metadata"

drofnar commented 21 November 2005 at 14:06

I agree with you, "RDF metadata" for me is the most appropriate of all that's been suggested. All the other terms don't cover the generality of this module.

Why not simply call it the "Metadata" module? That way you keep it general and those who aren't keen on the "RDF" being so central may be happy. I fully agree with you, so don't see this as trying to change your "ideological base". Just trying to think of a name that may make everyone happy.

I think jaza raised a query about how you'd use this to structure a site. You may have answered; I think it raises a good point, in that many may not expect, and hence not realize, quite how simple and powerful this approach is. Maybe a small demo menu module based on it might help? For me a quick glance at your demo site, and seeing the blocks based off the data, was sufficient to realise how powerful and easy it can make things for administrators.

Sometimes the best and most sophisticated concepts are the simplest in form.

Still not sure

dman commented 21 November 2005 at 17:48

While it certainly turned out to be metadata once I approached it from my angle, I don't know if this is what many people would think of when they decided they needed a "see also" link block.

I've had "meta relations" suggested, and that's closer - it indicates the linking a bit more.
I am surprised metadata isn't taken already :-} Maybe so... Especially once I start putting the rel tags into the head...

As I said in my demo code for emulating book.module I'm not trying to replace it yet, although rebuilding a link structure based on an existing book-tree structure would be a fun migration exercise.

It is simple, and it is powerful too, when enough of the grunt is hidden away behind the UI.
I've come up with some interesting gotchas too..

I insist that all relationship arcs are one-way. Either or both of parent and child can define their relationship. In the first instance, that doesn't make a difference to the viewer (the reciprocal is implied) but it does to the editors.
What if two nodes both claimed they were 'next' from one document? Whilst I have the ability to define 'ordinality' and limit the display to one, ... maybe there are circumstances where that situation would be true - for varying definitions of "next".

But it's cool, we can define our own interpretation of what is 'next' in the ontology.

I notice that the book module defines the 'next' thing after chapter 2 as chapter 2, page 1, whilst others may say it's obvious that after chapter 2 comes chapter 3 ... :)

I have a feeling that simple in form or not, pretty soon trying to manage a website structure like this will spaghetti up a bit. I'll need an overview table to manage all links or something, and then it'll be ugly.
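
The two readings of 'next' can be put side by side in a short sketch (Python, illustrative only). The `TREE` outline and both helper functions are hypothetical; the first mirrors the depth-first behaviour attributed to the book module above, the second the "next chapter" reading.

```python
# Hypothetical book outline: parent -> ordered children.
TREE = {
    "book": ["chapter 1", "chapter 2", "chapter 3"],
    "chapter 2": ["chapter 2, page 1", "chapter 2, page 2"],
}

def flatten(node):
    """Depth-first (pre-order) reading order of everything under node."""
    order = [node]
    for child in TREE.get(node, []):
        order.extend(flatten(child))
    return order

def next_in_reading_order(node):
    """'Next' as the following page when reading straight through."""
    order = flatten("book")
    i = order.index(node)
    return order[i + 1] if i + 1 < len(order) else None

def next_sibling(node):
    """'Next' as the following item at the same level."""
    for children in TREE.values():
        if node in children:
            i = children.index(node)
            return children[i + 1] if i + 1 < len(children) else None
    return None

print(next_in_reading_order("chapter 2"))  # chapter 2, page 1
print(next_sibling("chapter 2"))           # chapter 3
```

Both answers are defensible, which is the argument for letting the ontology, rather than the engine, say which one a given 'next' predicate means.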

Sometimes you need top-down rules. I'll concentrate on tuning this version for cross-references and non-structured relationships for now. To represent stuff we CAN'T represent adequately yet.

.dan.


Some code on what I was talking about

dikini commented 21 November 2005 at 09:51

I've posted some initial code. Mind you, it is a bit ugly, but it can and will be cleaned up. I focused on the query generation and the generalisation side of this overall problem. I don't want to commit this to CVS yet, but please comment. I'm really interested in all opinions.

Haven't given thought on the UI at all. This code is all about the API at the moment.

Relations API - query generation and TODO

The code makes it possible to define different predicates, as well as different forms, remember the (_,_) ? :)

I Like that

drofnar commented 21 November 2005 at 14:17

We might end up with lisp ... hey that was only a joke!

But seriously, it's a nice move to allow an admin to choose the form to represent the predicates, say between prolog style, rdf style, etc., and it nicely sidesteps that being an issue between your two approaches.

it was and it is, actually

dikini commented 21 November 2005 at 14:55

to be honest, there is no practical difference, just different vocabulary, different point of view and thus different implementations.

There is no real difference between rdf and prolog style - you declare somehow what your relation is, then you are able to show that. The reason I mention prolog at all is that it helps to illustrate what I want to be able to do - specialise a relation, i.e. make a relation that is more 'narrow' in meaning, reuse relation definitions, for example different kinds of books, etc...

Have a look at the code - it attacks only the problem of how to generate a query, based on predefined relations.

The next thing I'm going to do is provide path parsing, so you can do things like node/100/related-to - displaying all nodes related to a 'subject node', etc...
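
As a rough idea of what such path parsing involves, here is a small sketch in Python. It is not the posted code: `parse_relation_path()` is a hypothetical helper, and it assumes the path always has exactly the three segments shown in the example.

```python
def parse_relation_path(path):
    """Split a path like 'node/100/related-to' into (type, id, relation)."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or not parts[1].isdigit():
        raise ValueError("expected <type>/<id>/<relation>: %r" % path)
    kind, ident, relation = parts
    return kind, int(ident), relation

print(parse_relation_path("node/100/related-to"))  # ('node', 100, 'related-to')
```

The parsed tuple is then all a handler needs to pick a relation definition and a subject to run it against.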

prolog infers as well as declares

drofnar commented 22 November 2005 at 17:25

well not quite the same. While RDF simply declares the data, prolog brings with it an implicit backward chaining inferencing mechanism that will run across those declarations. Which may or may not be what you want.

BTW, what you're doing is great, dikini

That's, Um, wow.

dman commented 21 November 2005 at 19:14

I may have been frying my brain all night working out recursion rules for my predicate logic engine, but that example still just confuses me :-} I'll have to study it a bit more to even begin to see what you're demonstrating, all I see is obfuscated SQL!
What sort of calls would you expect module developers to make on that engine? I guess I was thinking an API would be more about an intuitive 'Interface' than the powerful yet 'Abstract' everything-to-everything machine you've got happening there.

I've also been trying to figure out why the PHP documentor doesn't like the code I've been so careful to add great big prologs to, so my documentation (http://coders.co.nz/drupal_development/?q=api/drupal_development/file) isn't rendering as well as it's supposed to. But there are some sources to view.
I'm using my own CVS repository this week. What does it take to actually get project rights around here?

I've been making pluggable logic functions, to support arbitrary relationship terms, and what they mean. User modules can define their own logic lookups (if they really really want) to influence how the graph is walked to retrieve the information they need.
EG, now a request for 'HasAncestor' will initiate a cumulative, transitive walk and return a set containing a node's parent, the parent's parent and so on. But it's all user-configurable in the 'characteristics' of the predicate terms.
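
A cumulative, transitive walk of this kind can be sketched in a few lines. This is an illustrative Python version, not the module's logic engine; the `PARENT` data and the `has_ancestor()` name are assumptions, and the result is returned as an ordered list (nearest ancestor first) rather than a set.

```python
# Hypothetical stored statements, reduced to child -> parent.
PARENT = {"page": "chapter", "chapter": "book"}

def has_ancestor(node):
    """Collect the parent, the parent's parent, and so on."""
    found = []
    seen = {node}            # guards against accidental cycles in the data
    current = PARENT.get(node)
    while current is not None and current not in seen:
        found.append(current)
        seen.add(current)
        current = PARENT.get(current)
    return found

print(has_ancestor("page"))  # ['chapter', 'book']
```

The cycle guard matters once editors can declare arcs freely, since nothing stops two nodes from claiming each other as parent.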

...fun.


fun indeed :)

dikini commented 22 November 2005 at 07:52

yep, it is abstract. ended up there. I can see why it may look like obfuscated SQL. The whole purpose was to marry high level function/predicate invocation with the underlying data storage. One of the aims is to do most if not all of the heavy duty processing in the database, leaving the finishing touches only to php.

Although I wouldn't call it obfuscated, I want to keep pre-prepared AST-like structures for the underlying SQL. This enables us to create a large array of queries. If time, energy and interest permit, we can do optimisations of the code as well, but that is a bit far-fetched.

From what I can see, the only thing which can't be done in the db is graph-walking. That is due to the version restrictions for mysql in drupal. With stored procedures it can be achieved. But alas, we shall see. That is something for the future, I suppose.

Well, you are right that we need to provide a more developer-friendly interface. I just don't have a clue at the moment what that will look like. That is the main reason why it is not there. It will come from trying to implement some use cases.

Initially I'll keep away from touching anything related to UI. I think it is too early. I'm writing some path parsing at the moment. So you can implement things like relation/node/123/authors/...

It will allow playing with different scenarios before entering the UI domain, which is definitely going to be strange.

The examples are what I used to quickly test the code.
Let's have a look at the last one - it is all about a node_authors relation. Not who created a node, but who are the real authors of a story. A very feasible scenario, dependent on the editorial process of a site.

RFS('user',
   RFS('node', array('param'=>array('nu.nid=%d',' AND ','nu.uid=%d')),
   RFS('node_user',array('conditions'=>array('nu.nid=n.nid','AND','u.uid=nu.uid','AND','nu.rid=1000')))));

In the ideal world, that is after relation management is implemented, you will have something like:

$node_authors_signature=RFS('user',
   RFS('node', array('param'=>array('nu.nid=%d',' AND ','nu.uid=%d')),
   RFS('node_user',array('param'=>array('nu.rid=1000')))));

which can be cached in db, my current plans are to use nodeapi, to store these cases as nodes, so we have verbose descriptions, help, etc...

Then the call:

$node_authors=RFS('node_authors',array('param'=>array('nu.nid=%d',' AND ','nu.uid=%d')));

Will give us the signature for any inquiries about node authors.

From a node to ask which are my authors:

$node_authors_query=RSQL(RFS('node_authors',array('param'=>array('nu.nid=%d' )));
$result=db_query($node_authors_query);

Hope you see the pattern. Remember the P(_,_)? With the fields, and param filters we can manipulate the return result, and the parameters/arguments of the function, i.e the final shape of our SQL statement.
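
The general pattern (a stored relation definition plus caller-supplied parameter filters yielding one SQL statement) can be illustrated independently of RFS() and RSQL(), whose internals are not shown in this thread. The sketch below is Python and only a loose analogy: `build_query()`, the `NODE_AUTHORS` template, the selected fields and the `?` placeholder style are all assumptions; only the table aliases and the role id 1000 are taken from the examples above.

```python
def build_query(fields, tables, conditions, params):
    """Join a relation's fixed conditions and the caller's filters with AND."""
    where = " AND ".join(list(conditions) + list(params))
    sql = "SELECT %s FROM %s" % (", ".join(fields), ", ".join(tables))
    return sql + (" WHERE " + where if where else "")

# Fixed part of a hypothetical 'node_authors' relation template.
NODE_AUTHORS = {
    "fields": ["u.uid", "u.name"],
    "tables": ["users u", "node n", "node_user nu"],
    "conditions": ["nu.nid = n.nid", "u.uid = nu.uid", "nu.rid = 1000"],
}

def node_authors_query(params):
    """Specialise the stored template with per-call parameter filters."""
    return build_query(NODE_AUTHORS["fields"], NODE_AUTHORS["tables"],
                       NODE_AUTHORS["conditions"], params)

sql = node_authors_query(["nu.nid = ?"])
print(sql)
```

Passing a different filter list (for example one on `nu.uid`) changes which "arguments" the relation takes without touching the stored definition.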

The main advantage is that we can allow site admins to create template relations, which afterwards, can be applied on a per 'thing' basis. A thing is a node, a node type, a user, a taxo term, a relation, whatever else you can dream of.

Hope this helps a bit. This code is not a finished item. It is just the start, so it will probably change. The query parameters, and conditions (they are treated similarly, and might become one and the same later on), as well as the filter definitions are intended to be filled in via the user interface.

Together with the cached relation definitions, equivalent to functions in a programming language, we get the expressiveness of a nice little language. This is what it is about. If we did not need to manipulate the results, to do different aggregations, etc... we wouldn't need this at all. The other option is to hard-code everything - which is not flexible enough.

a little progress report

dikini commented 30 November 2005 at 11:36

I've been silently working towards what I was describing earlier in this thread and in other places. I've written an overview of the current status and the showstoppers in this relations report.

In short - the query generation is working and will be extended later on. The URL parser works (there are a few bugs I know of) - this allows for experimenting with the relations API, and on-the-fly list generation, similar to CCK's listings API, but with a very different aim and approach.

The show stoppers for wider play are the UI for adding relations and list theming. The first is due to my struggling with UI in general, the second needs a little thought. Theming in this case is very context dependent, so I need to find a way of describing this meta-data, so it can be reused later, in themes for example.

Nice to see

dman commented 30 November 2005 at 12:24

I too have been working reasonably silently - mostly on the ontology logic side of things over the last week.

I found some incredible reading today at the Haystack Project which has a working and dynamic (albeit over-powerful) metadata browser engine ... thing ... that seems to do everything and more.
Worth a look for UI ideas - on a user level anyway. I think...
GIANT java client engine however.

I've been extending my 'pluggable ontology' ideas, so each user module can define their own predicates and the graph-walks that are triggered when a relationship is requested.

I've so far re-built both the forum/comment structure using 'isReplyTo' predicates, and the book module using 'next/previous/container' predicates... it's all fun.

I still can't see the common ground we are aiming for, although I'm comfortable with the alternate notations you use ... after reading the one-page manual for a whole new language that 'haystack' uses.

You are still talking SQL, when I'm thinking predicate logic ... OK, it's not THAT different, but it looks it.

I want an API - an object or set of function definitions that anyone can call to use this functionality.
I've renamed my code just 'relationship' for now. I didn't want it to be too verbose.

Recently I've found 'aliases' are giving me pains. I want to carry on supporting nids or paths or aliases or URLs as the prime IDs, but if I'm to move forward I may have to drop nids altogether, and use an ontology isEquivalent construct instead in the logic phase. The SQL lookups are getting awkward (for me) as the alias table maps paths, not nids, and using CONCAT('node/',nid) in a query seems really unhygienic to me.
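
One way to keep that string-building out of the SQL is to normalise every identifier to a single canonical form before any lookup happens. The following is a sketch in Python under stated assumptions, not the module's approach: `ALIASES` stands in for the alias table, and `canonical_id()` is a hypothetical helper that treats the system path as the one prime ID.

```python
# Hypothetical stand-in for the alias table: alias -> system path.
ALIASES = {"metadata": "node/42"}

def canonical_id(ref):
    """Map a nid, an alias, or a system path onto one canonical system path."""
    if isinstance(ref, int):
        return "node/%d" % ref
    ref = ref.strip("/")
    if ref.isdigit():
        return "node/" + ref
    # Known aliases resolve to their path; anything else (system paths,
    # external URLs) is passed through unchanged.
    return ALIASES.get(ref, ref)

print(canonical_id(42), canonical_id("metadata"), canonical_id("node/42"))
```

With every statement keyed on the canonical form, equivalence between a nid and its alias never has to be worked out inside a query.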

I really really want to release the code but it's been something like a month since I requested CVS, and I don't even think the (follow-up) emails are going anywhere. Can someone tell who is best to contact ... without being too pushy? I know I've got a lot of interest here, but don't want to whine about it.

I have code, I have even more documentation, I have several working demonstrations, I just have nowhere to put it, and don't want to branch another off-site repository.

.dan.


hmm, probably need to clarify things a bit

dikini commented 30 November 2005 at 15:16

Yes, I'm talking SQL, because this is the underlying storage engine. I'm not thinking SQL though.

Let's start with SQL - there is a reason why SQL databases are called relational databases - each table defines a relation between its columns, and so on. Don't want to go into theory or undermine anyone's knowledge or be sarcastic here, I'll save that for myself ;) The real point I want to make is that there is a direct mapping between statements A is B, C is_a_reply_to D and the way that is stored in an SQL database in drupal's case. We would be fools not to explore that mapping. It would be beneficial for speed and code simplicity if the heavy work is done in the db, meaning that the target should be one high level question - one SQL query. This is what drives me to write that database layer. We need the flexibility to define what is_a_reply_to means in an application.

Now that is the end of the SQL bit. What I aim for is to avoid the SQL, when we work with predicates, or relations. To be able to build them and their meaning out of some primitive building blocks.

This will enable us to define what we mean by is_related_to. By meaning I presume what conditions apply for something to be considered related to another thing, not just the simple expression of A is_related_to B. If it was the latter, we have that with taxonomy or it could be done with a very simple triple interface. The rest is just the case of constructing a custom taxonomy browser or that custom triple display. But in reality this is a daunting task for the user to construct. We must allow it, of course, but why not infer relations? Why not minimise declarations?

Now let's stand back a bit. If you can express A is_related_to B, what does that mean from a programmer's point of view? That you have a function is_related(), which can give you the answer to your potential questions. is_related(A) - all things related to A. is_related(A,B) - true or false, etc... If you skip the technical details, and retouch things a bit, we are talking about similar things. Approaches differ, a lot. Maybe the end aim is different. I'm not sure about that bit. What I'm trying to do is to build myself a small language to express what is_related_to means within drupal. From there on, it can be polished, made to look meaningful to end-users, etc... That's why the url parser - it is a test tool, exploration tool, a demo of how to use the engine.
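
The two call shapes mentioned above can be shown with a toy version. This Python sketch only covers plain declared pairs; the `RELATED` data is made up, and the conditions and inference that would give is_related_to its real meaning are deliberately left out.

```python
# Hypothetical declared pairs: (subject, object).
RELATED = {("A", "B"), ("A", "C")}

def is_related(a, b=None):
    """With one argument, list what a is related to; with two, test the pair."""
    if b is None:
        return sorted(y for x, y in RELATED if x == a)
    return (a, b) in RELATED

print(is_related("A"))       # ['B', 'C']
print(is_related("A", "B"))  # True
```

Leaving an argument open turns the same relation from a yes/no test into a listing, which is the P(_,_) idea in miniature.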

Can't this be done via forms? Can it be used in nodes, users, ...? Yes it can. I can't do everything in one magic split second. I do this mainly in evenings and weekends, cause I'm quite busy the rest of the times.

As for cvs, ask in #drupal, you'll get a time estimate, or an immediate result, though unlikely.

SPARQL code for PHP

bengee commented 22 November 2005 at 19:38

Heyup, just saw ranford's comment on the rdf4food wiki. I'm maintaining ARC, which will have a lightweight (php/mysql-based) SPARQL store (hopefully) soon. If there are RDFy/SPARQLy bits you could need in drupal, I could perhaps provide some hints or code.

Hi! Welcome...

dman commented 23 November 2005 at 01:29

I've just been reading ranford's links, and really like what I see with ARC.
Specifically, the defined interface for retrieving data

Your get_resource_props() is analogous to my get_statements() ...

This sort of API is what I think the developers here can start to build on. "These are the calls you can make, and this is what they will return."

I'll see if I can have a play with what you've got. My code is awaiting my CVS account activation, but the real-world interface looks like the metadata block at the bottom of this page.
I've given thought to the predicate-as-uri implementation when defining them and their OWL characteristics but so far I've avoided layering on namespaces, as they are a bit scary and extraneous for many humans. I'd like to have background support for them in there, but for now it'll be shortnames for everything.

What I have not done at any point is any actual XML/RDF either in or out! All my triples are just database rows. I'm using RDF semantics and theory, but not the markup ... yet. It looks like it would be easy to use your package to actually take that step. If my internal arrays are conformant with your internal arrays, your tool can do the XML-scrunching!
Given a working database, it seemed wrong to spend time parsing XML in and out to maintain data that was only to be used internally (in this case).

... I'm trying to find an example of SPARQL that actually goes from real-world example to real-world answer. Just to see what bits (if any) I need to leverage. Studying the RFC now ... :(

.dan.


Re: Hi! Welcome

bengee commented 23 November 2005 at 18:00

Thanks :)
I like your RDF metadata admin tool!

Re internal use of rdf: There is actually no need to completely map rdf to internal data structures, esp. if you have already a working database. Many rdf-enhanced apps keep their customized tables, but use rdf/xml (or other rdf serializations) only for data exchange. An rdf-optimized db schema (triple hashes etc.) can speed up query performance and can facilitate sparql2sql mappings, but for many use cases it's not necessarily required.

Sorry for the shameless plug, but you may have a look at CONFOTO, where I'm generating (non-scary) labeled form fields from properties defined in OWL. I'm also using simple SPARQL queries there to generate "suggest as you type" lists instead of html drop-downs. (guest account is alec:tronnick. Pick any photo, select the "my annotations" tab and try entering "web" or "sem" in the "Has subject" field.)

RDF can have an initial implementation burden as it's so open that you sometimes don't know where to start, but once you have such a system running, its flexibility really increases productivity, e.g. a list of pages in an rdf- and sparql-enhanced drupal could be retrieved by something along

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
							
SELECT ?node ?title
WHERE {
  ?node a foaf:Document .
  ?node dc:title ?title .
  FILTER (REGEX(?title, "metadata module"))
}
ORDER BY ASC(?title)
LIMIT 10

A query to find the 5 latest node referrers could look like

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
							
SELECT ?ref_node ?title
WHERE {
  ?ref_node dct:references <http://drupal.org/node/37556> .
  ?ref_node dc:title ?title .
  ?ref_node dc:date ?date .
}
ORDER BY DESC(?date)
LIMIT 5

SPARQL nicely maps RDF's graph structure to a convenient query language. Even more sophisticated queries (e.g. in order to automatically publish node comments by "people who created nodes that were rated as 'outstanding' by me or someone from my blogroll not longer than 2 months ago") stay simple due to the basic triple pattern mechanism.
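
To show what the basic triple pattern mechanism amounts to, here is a deliberately tiny matcher in Python. It is not SPARQL and not how ARC works internally; `match()` and the sample `TRIPLES` are assumptions for illustration, and it handles a single pattern only (no joins, filters or ordering).

```python
def match(pattern, triples):
    """Yield variable bindings for one (s, p, o) pattern; '?x' marks a variable."""
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A variable binds on first use and must agree on reuse.
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            yield binding

TRIPLES = [
    ("node/1", "dct:references", "node/37556"),
    ("node/2", "dct:references", "node/37556"),
    ("node/1", "dc:title", "RDF notes"),
]

referrers = [b["?ref"] for b in
             match(("?ref", "dct:references", "node/37556"), TRIPLES)]
print(referrers)  # ['node/1', 'node/2']
```

A real engine combines the bindings from several such patterns, which is what lets multi-step questions stay short.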

However, you don't need to switch to RDF and SPARQL if you use and can implement the functionality internally with an existing DB and SQL. SemWeb technology becomes useful when you start to export your metadata so that other installations can import and re-purpose it as well. Imagine a blog where every node/reference/rating/etc is made available as rdf/xml or via a sparql interface. Then the people who are referenced could add the remote metadata to their systems and vice versa. And then it would become interesting to run queries on the aggregated data of a "blog neighbourhood", e.g. who commented most, who received the best ratings, which post caused many comments, etc. To create the metadata/annotations, your approach is probably just fine. An RDF extension could still easily be added later.

</bla>

RDF Module: SIOC (Semantically Interlinked Online Communities)

Cloud commented 23 November 2005 at 09:59

Hello - I've started work on the SIOC module to link online communities... It uses RDF. This might be what you are thinking of (or at least some of it).

(Here follows a copy of drupal-dev/development mailing list post.)

Hiya -

I guess I can repeat a post from the civicspace-dev list here to describe the idea of SIOC [...]:

I met Kieran L. in July, and talked about our SIOC (Semantically-Interconnected Online Communities) project. Basically, what it is is an open specification for describing communities using online discussion forums or blogs, leading to what some may term "distributed conversations". At the moment, online communities are islands that are not interlinked, and the SIOC ontology has been proposed to not only link these communities but to leverage data in ways that were previously unknown.

The initial version of our SIOC specification has been drafted. It can be used on its own (having a complete set of terms) or in conjunction with other RDF formats such as RSS 1.0 (and 1.1). In terms of producing metadata, we've started with SIOC exporters for open-source discussion systems such as WordPress and Drupal / CivicSpace, and more are on the way. There are obvious connections between SIOC and the subscribe / publish module, so I've asked John van Dyk about this.

More info is available from:

http://rdfs.org/sioc/

and the Drupal module is at:

http://rdfs.org/sioc/drupal/

or in the Drupal CVS contributions area under SIOC.

While there are many (useful) classes and properties in SIOC, it can essentially be boiled down to: Users create Posts that are contained in Forums that are hosted on Sites, e.g.

Site -> host_of -> Forum -> container_of -> Post -> has_creator -> User

Posts have reply Posts, and Forums can be parents of other Forums.
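
The chain above can be read as a handful of statements that software can walk. The sketch below is Python and purely illustrative: the resource names are placeholders, the property names are the ones in the diagram, and `follow()` is a hypothetical helper, not part of the SIOC exporter.

```python
# The chain from the diagram, as (subject, property, object) statements.
SIOC = [
    ("site", "host_of", "forum"),
    ("forum", "container_of", "post"),
    ("post", "has_creator", "user"),
]

def follow(start, predicates):
    """Follow a chain of properties from start; None if a link is missing."""
    current = start
    for predicate in predicates:
        targets = [o for s, p, o in SIOC if s == current and p == predicate]
        if not targets:
            return None
        current = targets[0]
    return current

print(follow("site", ["host_of", "container_of", "has_creator"]))  # user
```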

Looking forward to your feedback, and of course contributions to the code :-)

Thanks,

John.
--
Kieran Lal wrote:

> You will want to look at Semantically Interconnected Online Communities.
>
> http://drupal.org/node/24925
>
> Cheers,
> Kieran

Is anybody capable of

Gunnar Langemark commented 23 November 2005 at 12:05

Is anybody capable of explaining to me (in perhaps 5-10 sentences?) what the relationship between taxonomy module and this "RDF metadata module" is?

What I mean is:

Taxonomy is capable of going at least as far as Faceted metadata. It has related terms, parent(s) and synonym rings. It is quite good at this.
As I understand it, Taxonomy could be extended to do Topic Maps and Ontologies. Which would in effect be the same as having a way to do RDF metadata. (am I wrong here?)
Taxonomy "lives on its own" - and nodes are attached to this system of categories - as should be.

This "RDF metadata module" works to organize nodes and has no inherent structure in itself - only through the organization of nodes? (please either tell me I'm wrong, or tell me why it should be so.)

Isn't there a fundamental difference between a system which builds a set of categories, and a system which builds a set of relationships between instances?

Sorry if I make too little sense here.

Gunnar Langemark
http://www.langemark.com

Doubles vs Triples

dman commented 23 November 2005 at 14:00

The type of classification can now be defined.

Taxonomy gives you keywords, and you have to guess how the keyword is applied or deduce it from the vocab. And you still don't have node-to-node links, unless "Both have the same keyword" is good enough for you.
The structure you mention in taxonomy is all just between other taxonomy terms, not node-to-node.

Even given simple 'related link' implementations or grouping, the data looks like
Page1 : Diagram7
Page1 : Chapter2
Page1 : Page77
Page1 : Linguistics

Triples turn your data from that into the much more powerful
Page1 : HasAppendix : Diagram7
Page1 : IsPartOf : Chapter2
Page1 : RefersTo : Page77
Page1 : HasSubject : Linguistics

- Only the last example crosses over with the current capabilities of taxonomy.

That sort of info is much more interesting to maintain, theme, and display.
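
To make the difference concrete, here is a minimal Python sketch of the two shapes of data listed above. It is not code from the module (which is PHP); the function names related and related_by are invented for the illustration.

```python
# Untyped "related link" data: Page1 is connected to four things, but not how.
pairs = [
    ("Page1", "Diagram7"),
    ("Page1", "Chapter2"),
    ("Page1", "Page77"),
    ("Page1", "Linguistics"),
]

# The same links as triples: the middle element names the relationship.
triples = [
    ("Page1", "HasAppendix", "Diagram7"),
    ("Page1", "IsPartOf", "Chapter2"),
    ("Page1", "RefersTo", "Page77"),
    ("Page1", "HasSubject", "Linguistics"),
]


def related(node):
    """All a pair store can answer: what is linked to this node?"""
    return [target for source, target in pairs if source == node]


def related_by(node, predicate):
    """A triple store can also answer: what is linked in this particular way?"""
    return [o for s, p, o in triples if s == node and p == predicate]
```

With pairs, related("Page1") returns all four targets in one undifferentiated list. With triples, related_by("Page1", "IsPartOf") picks out just the chapter, which is what a theme would need in order to render a "part of" breadcrumb separately from the appendices.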

To answer your question, there is truly not much relationship between the two. Current behaviour of taxonomy can be expressed using this syntax (but not vice versa) although that was/is not the intent.

When every verb relationship (predicate) is "hasKeyword", it's redundant to keep stating it. So if you chopped all the predicates out of my tool, you'd be left with basic taxonomy.
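
A small sketch of that point, again in Python and with invented function names and sample data: when only one predicate is in use, dropping it loses nothing, and what remains is a plain node-to-keyword list.

```python
def predicates_in_use(triples):
    """Collect the distinct predicates found in a list of triples."""
    return {p for _s, p, _o in triples}


def to_taxonomy_pairs(triples):
    """Chop the predicate out of every statement, leaving (node, term) pairs."""
    return [(s, o) for s, _p, o in triples]


# Hypothetical sample data.
tagged = [
    ("Page1", "hasKeyword", "Linguistics"),
    ("Page2", "hasKeyword", "Linguistics"),
]
mixed = tagged + [("Page1", "IsPartOf", "Chapter2")]
```

For tagged, predicates_in_use returns a single predicate, so to_taxonomy_pairs(tagged) carries exactly the same information as the triples. For mixed, the same projection would silently turn "IsPartOf Chapter2" into something that looks like a keyword, which is the information the predicate was there to keep.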

OTOH, if you enhanced taxonomy to be able to specify types of linking, and allowed references to other nodes and URIs as well as to the vocabularies, you'd be approaching what I'm doing.

I don't think of this data as primarily structural, or even organising. Top-down organizing, sorting, classifying, is still a taxonomy job.
I say 'metadata' because at one level it's just souped-up tagging. The info is applied at the node level, and if you choose to walk the graph to infer a structure from that, Cool, but it really is just pointer soup. ... so far.

Yes, my first post was lamenting the lack of structure in Drupal nodes, and this was supposed to address that, but ... things got a bit more generalized after that.

.dan.

_{.dan. is the New Zealand Drupal Developer working on Government Web Standards}

Is this module available

jasonwhat commented 9 December 2005 at 17:16

Sorry if this is mentioned in the posts, but I couldn't read them all, just skimmed. Is this module available anywhere? I looked on your site and found lots of references, but no links to the full module. It looks very promising. My hope with relationships in Drupal is that display will be highly flexible. For projects I'm working on I'd like both inline display links, perhaps using ajax to show content related to a certain word, and control over RDF output. For the second, I'm trying to place related content in tabs for easy viewing from the main content.

:(

dman commented 9 December 2005 at 23:19

I've been waiting on CVS for a while.
I got a note saying it was 'approved' last week, and I've been able to make a placeholder project page just now. But CVS login is still saying user dman not known. :-/

I've been dramatically re-jigging a few internals, but I think the main interface is ready to play with. I was making up a few demo walkthroughs and docs and stuff so it could bootstrap itself on install with something useful to start with.

Spent most time recently building a predicate logic inference engine! Lotsa fun, but a bit off topic, so I extracted it to an optional plugin library.

Anyway, I guess I'll go to IRC or something and troubleshoot my login probs.

(I hate IRC)

Thanks for the interest. Hope to get something out this weekend.

(Thanks for the enquiry nethobbes too - your profile contact was turned off, so I couldn't get back to you directly)

.dan.


Lotsa code :-) No CVS :-(

dman commented 11 December 2005 at 21:45

OK, I still don't have drupal CVS happening.
So here's the code I've got together so far.

Start with the README (as if anyone ever does) just to see what to expect.

Lotsa docs over here in the API viewer - but still broken in places. Not sure why.

Lotsa files too. Half of them are optional, the other half are 3/4 comments and docs, really!

The 'ontology_navigation' module can build a mini-site with some demo links, so that's a fun place to start.

ontology_rdf.module plugs in a lot of power, but you need to like RDF.

If you go and get the ARC RDF Parser you can automatically absorb information from other sites (like SchemaWeb) - this is VERY cool.
I didn't distribute the parser library myself, 'coz I haven't thought about the license yet. But drop it in alongside my stuff and I'll use it.

TODO is to hide / filter which properties show up where - like to remove some of the mysterious terms out of the select box so they don't confuser da user.

Also I can look at caching the intermediate DB lookups - the accesses are quite repetitive at the moment, and everything, even the config for the term definitions themselves, is stored in triples. This means a dozen requests just to retrieve one 'block' of data, even before the inference engine goes to work. Easily optimized soon, I think.

There are a few places SQL magic could be applied - but I'll leave that to others to make suggestions on :) . I've been building this top down, making up lookups as I needed them.

I have NO IDEA what folk on different systems may encounter when trying to turn this on.
(Oh, it's Drupal 4.6 only) but I've dev'd it on Win and deployed on Linux so I'd be interested in troubleshooting.

There are a few swathes of redundant scaffolding still in the code. Just ignore that.

Any help with telling me the best way to work with the API documenter would be appreciated.

Have a look!

.dan.


workable with 4.7?

drofnar commented 30 January 2006 at 10:16

Hi Dan,

I tried posting the other day but my connection crashed out on me. I'm in a remote region of Indonesia (Flores) and comms are really difficult.

I've been out of the loop a while, but tried your recent version of the mod and it's great.

I'm hoping to use 4.7 soon. Do you think it will work OK with that, or will it need an upgrade? Actually, I'll try it and answer my own question.

Either way, I'll test out the mod and provide some feedback.

More general thoughts - I'd suggest keeping the inference engine aspects as distinct as possible from the declarative triples. That way we may end up with a variety of contributed inference engines, e.g. backward as well as forward chainers, and may even end up with some basic expert systems or reasoners, all of which are independent of the base module you have.

I'm not sure what Doug Lenat's latest efforts are up to, but his whole Cyc effort must have huge value if it can be leveraged. I know he released an open-source version, but I don't know if it would be usable.

Anyway, congrats on the great work you've done. Mark

modularization.

dman commented 30 January 2006 at 11:59

I've just released a statement about the path to 4.7 over in the issues.
Short answer: eventually, but not till the current version is silky smooth. It will not work with 4.7 today.

Your suggestions for inferencers are right on the money. Having discovered so many other implementations out there, I decided a while back to abstract that process out to another pluggable library.

I was halfway through converting my original inferencer (which did most of the logic on request) to a more verbose, static version when I read http://dannyayers.com/archives/2006/01/18/where-does-the-reasoner-go/ (link not working today?), which described my dilemma quite well.

On-request inferencing seemed more natural and tidy: terse statements in the database, and easy to reflect new facts and their implications in real time. But it's a resource hog. Now I'm seeing what happens if I fill the database with lots of trivial, but easy-to-find, facts instead. I intend to retain both methods for the admin to choose from.
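
The trade-off can be sketched with a single, deliberately trivial rule: an inverse predicate. This is a Python illustration under my own assumptions, not the module's inferencer; the INVERSE table and all function names are hypothetical.

```python
# Hypothetical rule table: each predicate maps to its inverse.
INVERSE = {"IsPartOf": "HasPart", "HasPart": "IsPartOf"}


def query_on_request(facts, subject, predicate):
    """Terse storage: work out the implied inverse facts at query time."""
    found = {o for s, p, o in facts if s == subject and p == predicate}
    inverse = INVERSE.get(predicate)
    if inverse:
        found |= {s for s, p, o in facts if o == subject and p == inverse}
    return found


def add_materialized(facts, subject, predicate, obj):
    """Verbose storage: write the implied inverse fact at insert time."""
    facts.add((subject, predicate, obj))
    inverse = INVERSE.get(predicate)
    if inverse:
        facts.add((obj, inverse, subject))


def query_stored(facts, subject, predicate):
    """With materialized facts, a query is a plain lookup with no reasoning."""
    return {o for s, p, o in facts if s == subject and p == predicate}
```

Both routes give the same answer to "what does Chapter2 have as parts?". The on-request version stores one statement and does extra work on every query; the materialized version stores two statements and has to keep the derived one in step when the original is edited or deleted.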

So I think I'm working towards having some sort of inferencing_hook system where extra plugins can do extra work each time a statement is input, AND/OR each time a query is made (currently implemented on a per-ontology-term basis).
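
A hook system of that kind might look roughly like the following Python sketch. Everything here is hypothetical (the decorator names, the Store class, the two example plugins), and for brevity the hooks are registered globally rather than per ontology term.

```python
insert_hooks = []  # plugins run each time a statement is input
query_hooks = []   # plugins run each time a query is made


def on_insert(fn):
    """Register a plugin that may return extra triples to store."""
    insert_hooks.append(fn)
    return fn


def on_query(fn):
    """Register a plugin that may return extra results for a query."""
    query_hooks.append(fn)
    return fn


class Store:
    def __init__(self):
        self.facts = set()

    def add(self, s, p, o):
        self.facts.add((s, p, o))
        for hook in insert_hooks:
            # Materialize the hook's output before mutating the fact set.
            for extra in list(hook(self.facts, (s, p, o))):
                self.facts.add(extra)

    def query(self, s, p):
        results = {o for fs, fp, o in self.facts if fs == s and fp == p}
        for hook in query_hooks:
            results |= set(hook(self.facts, s, p))
        return results


@on_insert
def symmetric_related(facts, triple):
    """Insert-time plugin: RelatedTo holds in both directions."""
    s, p, o = triple
    return [(o, "RelatedTo", s)] if p == "RelatedTo" else []


@on_query
def transitive_part_of(facts, subject, predicate):
    """Query-time plugin: follow IsPartOf links all the way up."""
    if predicate != "IsPartOf":
        return set()
    seen, frontier = set(), [subject]
    while frontier:
        node = frontier.pop()
        for s, p, o in facts:
            if s == node and p == "IsPartOf" and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen
```

The base Store knows nothing about symmetry or transitivity; each plugin contributes one kind of reasoning, at insert time or at query time, which is the separation suggested above for contributed forward and backward chainers.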

I'll continue to work through the code, separating the interface from the logic from the storage as much as possible. I have a set of separate .inc files for exactly this reason.
The current storage routine (for example) is designed as a stub that MAY one day be replaced with an interface to a totally different server talking RDDL or something.

Cheers, .dan.
