[D7 port] Entities and fields for the D7 port [#1196510]

This issue describes the entity types and fields for Neologism in D7. It also describes how we plan to handle the duplicated database storage that arises from using both the Entity/Fields APIs and evoc.

Entity types

For Neologism D7, we'll need four entity types:

Project
Vocabulary
Class
Property

This is a bit different from D6, where we had only Vocabulary, Class and Property.

A Project is the top-level entity. A project has one or more users associated with it (the authors).

A Vocabulary is a collection of classes and properties that all share a single namespace. A vocabulary always belongs to one Project. A Project has one "main vocabulary". That main vocabulary contains the classes and properties that are actually being created and edited in the project. The other (non-main) vocabularies in the project are the external vocabularies that are being used in this particular project.

So, the handling of external vocabularies will be a bit different from how we did it in D6. In D6, external vocabularies were independent of the projects: once an external vocabulary was loaded, it was visible and usable in all projects. This is a problem because on a large site with many users, many projects may all want to use different external vocabularies. When each project can see all the external vocabularies loaded by all other users, the list becomes unmanageable. Therefore, in D7 we want an external vocabulary to belong to a single project only. This means that if two different projects both need FOAF as an external vocabulary, they both have to load FOAF, and there will be two copies of FOAF in the database. The schema supports this -- the two copies will have the same namespace URI (http://xmlns.com/foaf/0.1/) and the same prefix (foaf), but different associated projects.

The remaining entity types are class and property. They are associated with a vocabulary. They basically work the same in D7 as in D6.

Nodes vs. custom entities

For now, we want all four of these entity types to be Drupal node types. Later on, we probably want to keep only Project as a Drupal node type and turn the other three into their own custom entity types. This is because they don't really behave like nodes. For example, they can't exist on their own, but always have to be part of a project. But as I said, let's worry about that later; for now we'll just make them all nodes, and use node_references when we need to refer from one entity to another.

Database storage and evoc

In D6, we had the problem that we stored everything twice in the database: once in CCK fields, and a second time in the evoc database schema. This was a major source of complexity and bugs, and we definitely don't want to repeat this. We thought long and hard about the best way of avoiding this duplication in D7, and after some back and forth we think that the following is the best way forward:

We will not use evoc at all. We will just use the Entity API and Fields API and the database schema that is automatically created for the entities and fields.

The only thing that we want to use from evoc is the code for importing vocabularies. It's best to just copy-paste that code into the Neologism project and remove the evoc dependency. This means that new code has to be written to *store* the loaded vocabulary into the entity/fields tables.

Fields

Some details about the fields we think we need for each entity type (cardinality in brackets):

project

main_vocab: a node reference to vocabulary (1)
authors: a user reference to the users associated with the project. Later, we want to add access control, so only these users can make any changes in the project. When a project is first created, then the user doing the creation is the initial value of this field. (1 or more)
layout: the XML layout information stored by the diagram widget. A text field. (1)

vocabulary

project: a node reference to project entity (1)
prefix: the vocabulary prefix, such as foaf (1)
uri: the vocabulary URI, such as http://xmlns.com/foaf/0.1/ (1)
label: the title of the vocabulary ("The FOAF Vocabulary" or something like that) (1)
description: the short description of the vocabulary (0-1)
body: the full, long description (0-1)
additional_rdf: additional triples in Turtle syntax (0-1)
date_created: date when vocabulary was created/imported/loaded
date_updated: last time when anything on that vocabulary was changed

class

vocabulary: a node reference to the vocabulary (1)
local_name: term ID, such as "Person" (1)
label: the rdfs:label (1)
comment: the rdfs:comment (0-1)
body: the full, long description (0-1)
superclasses: node references to superclasses (0 or more)
disjoints: node references to disjoint classes (0 or more)

property

vocabulary: same as for class
local_name: same as for class
label: same as for class
comment: same as for class
body: same as for class
functional: boolean property: is this an owl:FunctionalProperty? (1)
inverse_functional: boolean property: is this an owl:InverseFunctionalProperty? (1)
superproperties: node references to superprops (0 or more)
domains: node references to domain classes (0 or more)
ranges: node references to range classes (0 or more)
inverses: node references to inverse properties (0 or more)
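
As a rough illustration of how the cardinalities above could map onto Fields API definitions, here is a minimal sketch for the class node type. The function name and the field type choices (text, text_long, node_reference) are assumptions, not part of the plan above; node_reference is assumed to come from the References module. In Drupal 7 the field-level cardinality is the maximum number of values (-1, i.e. FIELD_CARDINALITY_UNLIMITED, for "0 or more"), while "at least one value" is expressed by the instance-level required flag.

```php
<?php
/**
 * Returns hypothetical field specs for the "class" node type.
 *
 * Each entry pairs a field-level 'cardinality' (maximum number of values)
 * with an instance-level 'required' flag (is at least one value needed?).
 */
function neologism_example_class_field_specs() {
  // -1 is the value of FIELD_CARDINALITY_UNLIMITED in Drupal 7.
  $unlimited = -1;
  return array(
    // Cardinality (1): exactly one value.
    'vocabulary' => array('type' => 'node_reference', 'cardinality' => 1, 'required' => TRUE),
    'local_name' => array('type' => 'text', 'cardinality' => 1, 'required' => TRUE),
    'label' => array('type' => 'text', 'cardinality' => 1, 'required' => TRUE),
    // Cardinality (0-1): at most one value, may be empty.
    'comment' => array('type' => 'text_long', 'cardinality' => 1, 'required' => FALSE),
    'body' => array('type' => 'text_long', 'cardinality' => 1, 'required' => FALSE),
    // Cardinality (0 or more): unlimited values, may be empty.
    'superclasses' => array('type' => 'node_reference', 'cardinality' => $unlimited, 'required' => FALSE),
    'disjoints' => array('type' => 'node_reference', 'cardinality' => $unlimited, 'required' => FALSE),
  );
}
```

Each spec could then be split into a field_create_field() call (type and cardinality) and a field_create_instance() call (required flag), with properly prefixed machine names.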

Comment	File	Size	Author
#12	cleaned-1196510-12.patch	43.97 KB	mayankkandpal
#2	project-1196510-2.patch	66.27 KB	mayankkandpal
#1	neologism_entity_types.png	76.38 KB	cygri

Comments

Comment #1

cygri commented 22 June 2011 at 13:41

Status: Needs work » Active

Status	File	Size
new	neologism_entity_types.png	76.38 KB

... and the whole thing in a picture:

Comment #2

mayankkandpal commented 5 July 2011 at 21:52

Status	File	Size
new	project-1196510-2.patch	66.27 KB

Patch which does all of the above except that data is still stored in evoc as well as field/entity schema.

Sorry for delay :)

Comment #3

cygri commented 6 July 2011 at 13:38

Status: Active » Needs work

I'll start with some comments on code style, before looking at functionality.

I understand that a lot of these issues are inherited from the old Neologism codebase and not your invention; it's probably either me or Guido who is actually to blame for them, but I'd like to clean them up before committing the code here.

There are some issues with whitespace, like inconsistent indentation (I consider that a large problem because it makes stuff hard to read), and mixed spaces and tabs, and (a minor issue) unnecessary whitespace at the end. You can use dreditor to see most of the errors, and the Coder module to fix such issues up automatically.

You're using Javadoc-style comments for your functions. Drupal uses Doxygen-style comments. In particular, the "General header documentation syntax" section is important. It doesn't ask for so many ***** asterisks.

When using links in Doxygen comments, use "@see http://whatever". Also, I saw links to the API docs in inline code comments. This shouldn't really be there.

There are quite a few chunks of commented-out code. In general, committing such code is a bad idea, and these sections should be cleaned up before rolling a patch. (In our case it doesn't matter so much right now because it's work in progress.)

Comment #4

mayankkandpal commented 6 July 2011 at 13:59

Totally agree Richard, will start clean-up asap. Thanks for the tips.

Comment #5

cygri commented 6 July 2011 at 15:13

I just did a detailed review of the patch for functionality, but dreditor ate it all :-( Will write it up once more, but may not have time to finish it today. Sorry about that!

Comment #6

Anonymous (not verified) commented 6 July 2011 at 15:17

I'm going to cover what I remember from the review we did together in the meantime. I will then tell Richard what I think he needs to go back to cover. I will be covering this file by file.

First, in neologism.info

+++ b/neologism.info
@@ -1,4 +1,11 @@
+dependencies[] = references

The dependency here should be user_reference and node_reference.

Comment #7

Anonymous (not verified) commented 6 July 2011 at 15:29

In neologism.install

+++ b/neologism.install
@@ -0,0 +1,1255 @@
+function neologism_schema() {
.......
}

For the most part, you shouldn't need a custom schema. The idea behind the change that Richard posted to use nodes and fields instead of evoc was to allow Field API to handle the creation, read, update, and deletion of all of the class, property, vocabulary, and project data (or as much as possible). Therefore, most data should be automatically processed by Field API on form submission and stored in the field_data tables automatically. Less code for us, yay!

Comment #8

Anonymous (not verified) commented 6 July 2011 at 15:48

I didn't check the JavaScript, but focused on the Drupal API stuff for now. Once that is set, we can look over the JS.

Fortunately, most of the comments below mean less code (yay again!)

+++ b/neologism.module
@@ -0,0 +1,561 @@
+	'title' => 'Neologism Vocabularies',

Change this to "Vocabularies". Non-administrator users won't know what Neologism is.

+++ b/neologism.module
@@ -0,0 +1,561 @@
+  $build = array();
+  $sql = 'SELECT nid FROM {node} n WHERE n.type = :type AND n.status = :status ORDER BY n.sticky DESC, n.created DESC';
+  $result = db_query($sql,
+    array(
+      ':type' => 'vocabulary',
+      ':status' => 1,
+    )
+  );
+
+  // Loop through each of our node_example nodes and instruct node_view
+  // to use our custom "example_node_list" view.
+  // http://api.drupal.org/api/function/node_load/7
+  // http://api.drupal.org/api/function/node_view/7
+  foreach ($result as $row) {
+    $node = node_load($row->nid);
+    $build['node_list'][]= node_view($node, 'teaser'); // default is the view mode ... we can create a specific view mode (though another function) if needed and use it here.
+  }

You should be able to use entity_load with EntityFieldQuery here. There is an example at http://api.drupal.org/api/drupal/includes--common.inc/function/entity_lo...
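
A minimal sketch of what that could look like, assuming the same selection as the patch (published vocabulary nodes, sticky first, newest first). The function name is hypothetical, and the code only runs inside a bootstrapped Drupal 7 site:

```php
<?php
/**
 * Builds a teaser list of published vocabulary nodes (hypothetical helper).
 */
function neologism_example_vocabulary_list() {
  $build = array();

  // Same selection as the raw SQL, expressed through EntityFieldQuery.
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', 'vocabulary')
    ->propertyCondition('status', 1)
    ->propertyOrderBy('sticky', 'DESC')
    ->propertyOrderBy('created', 'DESC');
  $result = $query->execute();

  // execute() returns an array keyed by entity type, then by entity ID.
  if (!empty($result['node'])) {
    $nodes = entity_load('node', array_keys($result['node']));
    foreach ($nodes as $node) {
      $build['node_list'][] = node_view($node, 'teaser');
    }
  }
  return $build;
}
```

This loads all matching nodes in one call instead of one node_load() per row.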

+++ b/neologism.module
@@ -0,0 +1,561 @@
+    $form['#submit'] = array();
+    $form['#submit'][] = '_neologism_form_project_node_form_submit_alter';
+    $form['#validate'] = array();
+    $form['#validate'][] = '_neologism_form_project_node_form_validate_alter';

Currently, you clear out the submit functions that are set and use your own custom submit function. As mentioned in the comment above, we want Field API's submit functions to handle the data CRUD (as much as possible). This applies to every hook_form_alter implementation you have.
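
A minimal sketch of the difference, assuming a hypothetical module name and handler names: append to the existing arrays instead of resetting them, so the handlers already registered on the form keep running.

```php
<?php
/**
 * Implements hook_form_FORM_ID_alter() for project_node_form.
 *
 * Sketch only: the module name and both handler names are hypothetical.
 */
function neologism_example_form_project_node_form_alter(&$form, &$form_state, $form_id) {
  // No "$form['#submit'] = array();" here. Resetting the array would drop
  // the handlers that are already registered on the form.
  $form['#validate'][] = 'neologism_example_project_validate';
  $form['#submit'][] = 'neologism_example_project_submit';
}
```

Whether a form-level #submit handler actually fires for the node form's Save button depends on how that form registers its own handlers, so treat the placement as an assumption to verify.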

+++ b/neologism.module
@@ -0,0 +1,561 @@
+    $form_state['values']['path']['alias'] =  "project/" . $title ;

This causes a problem... if the user enters their own alias into the form, it will be overwritten. There was path aliasing code in the D6 version which Richard can comment about.
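
One way to avoid the overwrite, sketched with a hypothetical helper: only fill in the generated alias when the user left the alias field empty.

```php
<?php
/**
 * Sets a default path alias without clobbering a user-supplied one.
 *
 * Hypothetical helper; $default_alias is whatever the module would generate.
 */
function neologism_example_default_alias(array &$form_state, $default_alias) {
  $current = isset($form_state['values']['path']['alias'])
    ? trim($form_state['values']['path']['alias'])
    : '';
  // Respect an alias the user typed in; only fill in a blank one.
  if ($current === '') {
    $form_state['values']['path']['alias'] = $default_alias;
  }
}
```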

+++ b/neologism.module
@@ -0,0 +1,561 @@
+    $form_state['values']['path']['alias'] = $related_vocabulary . '/' . $form_state['values']['title'];

Aliasing only needs to happen for 1 of the 4 node types (should be the project, IIRC). Richard should comment on this.

EDIT: added whitespace for easier reading.

Comment #9

cygri commented 8 July 2011 at 13:01

Ok, here's the last set of comments from me on this patch. Thanks Lin for recalling and writing up the comment above. I'll comment on the way aliases are managed.

Only projects should have aliases. This is because vocabularies don't need a special alias (the node/123 standard URLs will do). And classes and properties in Neologism have "hash URIs", that is, their URI is always: http://site/project#ClassID or #PropertyID. Assigning hash URIs as aliases won't work, so we simply shouldn't assign aliases to the classes and properties, and rather handle their URIs in custom code.

The URI of the project should be http://site/projectID and not http://site/project/projectID. This is because we want short and sweet class and property URIs. For example, there's http://vocab.deri.ie/void#Dataset, and we don't want that to be http://vocab.deri.ie/project/void#Dataset.
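
To make the URI scheme concrete, here is a small sketch of the "custom code" side. The helper name is hypothetical, and it assumes project IDs and local names are already URL-safe:

```php
<?php
/**
 * Builds the hash URI of a class or property (hypothetical helper).
 *
 * Assumes $project_id and $local_name need no further escaping.
 */
function neologism_example_term_uri($base_url, $project_id, $local_name) {
  // The project lives directly under the site root, with no "project/"
  // segment, and the term is addressed by a fragment identifier.
  return rtrim($base_url, '/') . '/' . $project_id . '#' . $local_name;
}
```

For example, neologism_example_term_uri('http://vocab.deri.ie', 'void', 'Dataset') yields http://vocab.deri.ie/void#Dataset.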

Also, the way you manage the aliases (by using the alias form field in the edit form) works now, but will cause issues later on. That's because we want not just
http://vocab.deri.ie/void
but we also want
http://vocab.deri.ie/void.ttl
http://vocab.deri.ie/void.rdf
http://vocab.deri.ie/void.xml
and that won't work so easily. I recommend looking at the way we did this for D6; this was actually (unlike some other parts of the codebase ;-) ) well thought out and worked nicely. Check around here:

http://code.google.com/p/neologism/source/browse/branches/drupal-6/neolo...

We'd invoke the appropriate function (set_aliases, unset_aliases, update_aliases) whenever a vocabulary is created/updated/saved. I recommend using the same approach.
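
As an illustration only (this is not the D6 code), the set of aliases such a function would have to maintain could be computed like this. The helper name and the system paths it maps to (node/123/ttl and so on) are assumptions:

```php
<?php
/**
 * Lists the aliases to maintain for one project (hypothetical helper).
 *
 * Returns an array of alias => system path. The "/ttl"-style system paths
 * are placeholders for whatever menu callbacks serve each format.
 */
function neologism_example_alias_variants($project_id, $node_path) {
  $aliases = array($project_id => $node_path);
  foreach (array('ttl', 'rdf', 'xml') as $extension) {
    $aliases[$project_id . '.' . $extension] = $node_path . '/' . $extension;
  }
  return $aliases;
}
```

Keeping the whole set in one place makes it straightforward to create, update, or remove all four aliases together when the project ID changes.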

As a general note, it might be useful for you to look a bit more at how a feature was done in the D6 version before tackling that feature in D7, and re-use code and logic that already works. This will save time in the long run. (It is not always possible of course, because many things just work differently in D6/D7, and we *want* some things to be different in D7 such as the evoc stuff.)

Comment #10

Anonymous (not verified) commented 8 July 2011 at 13:54

One thing to also take into account would be that you could possibly use RestWS to do the content negotiation between different versions (.ttl, .rdf, etc). I know that it works for paths like node/1.ttl. I do not know whether it works for path aliases.

Comment #11

mayankkandpal commented 11 July 2011 at 16:42

Richard, I think you meant classes and properties here instead of vocabularies:

Only projects should have aliases. This is because vocabularies don't need a special alias (the node/123 standard URLs will do).

We are providing a field in the Add vocabulary form where the user can fill in the desired URI or go with the Default URI so I guess we must have aliasing for Vocabulary.

Also, in your comment, it seems as if class and property are linked to a project whereas what I understand is that class and property are related to a vocabulary and a vocabulary in turn is related to a project.

Comment #12

mayankkandpal commented 13 July 2011 at 11:00

Status	File	Size
new	cleaned-1196510-12.patch	43.97 KB

I have not been successful in running coder upgrade on D7. So the code might still have indentation and whitespace issues. Sorry about that.

This patch has no dependency on evoc and content is stored in field API schema.

Comment #13

cygri commented 13 July 2011 at 18:30

Status: Needs work » Fixed

Committed: http://drupalcode.org/project/neologism.git/commit/ecb5280 … I'm afraid I forgot to put --author when committing to give you proper credit in the drupal.org system … won't happen again!

I fixed two problems in the code that the coder module complained about: "SELECT node.title FROM node …". Table names should always be in {curly brackets}.
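
The reason the brackets matter is that Drupal rewrites {table} to the real, possibly prefixed, table name before running the query. The following is a simplified stand-in for that rewriting (not Drupal's own implementation, and the function name is hypothetical):

```php
<?php
/**
 * Simplified illustration of table prefixing (not Drupal's actual code).
 */
function neologism_example_prefix_tables($sql, $prefix) {
  // Replace every {table} placeholder with the prefixed table name.
  return preg_replace_callback('/\{(\w+)\}/', function ($matches) use ($prefix) {
    return $prefix . $matches[1];
  }, $sql);
}
```

With a prefix of site1_, 'SELECT n.title FROM {node} n' becomes 'SELECT n.title FROM site1_node n', while the unbracketed 'SELECT node.title FROM node' is left untouched and would query a table that does not exist on a prefixed site.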

Coder_review complained about some more errors that were bogus. I used the dev version from git, which has these problems fixed. To get it, I said:

cd sites/all/modules
git clone --branch master http://git.drupal.org/project/coder.git

and then enabled the module. Then, I went to Configuration > Development > Coder, checked "Neologism" under "Specific modules", and clicked "Review".

I also used coder_format to fix all the whitespace notices:

php coder/scripts/coder_format/coder_format.php neologism

So, since I changed some stuff from your patch, you should do a git pull before continuing development (and definitely before rolling the next patch).

I'll file new issues related to functionality.

Thanks a lot for this patch, and congrats on your first committed patch on this project :-)

Comment #14

mayankkandpal commented 23 July 2011 at 09:01

Thanks for the detailed procedure, Richard. Future patches will not contain whitespace errors.

Comment #15

6 August 2011 at 09:02

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.
