Tables part 1: Joins

To describe your data to Views, you need to implement hook_views_tables() for your module. This hook returns an array of table objects. The keys to the array are the names of the tables, and each one must be unique.

The general structure of this system looks like this:

<?php
function example_views_tables() {
 
$tables['example_table_a'] = array(
   
//more...
 
);

  return
$tables;
}
?>

Your hook can return as many tables as it likes.

The first member of the array should be 'name', which is, in fact, the real name of the table. If you're not using an alias, this will simply be the name of the key:

<?php
function example_views_tables() {
 
$table = array(
   
'name' => 'example_table_a',
   
//more...
 
);
 
$tables["example_table_a"] = $table;

 
$table = array(
   
'name' => 'example_table_b',
   
//more...
 
);
 
$tables["example_table_b"] = $table;
  return
$tables;
}
?>

In general, the key to the array and the name of the table will match, but this will not always be the case. More on this later in the "Naming your table" section.

Joining your table

The most important thing Views needs to know about your data is about how it relates1 to the node table. Ideally, Views needs to be able to draw a straight line from your table to the 'node' table. It does not need to be a direct relationship; Views can take any number of steps to get from your table to 'node'. Bear in mind, however, that the more steps, the more JOIN statements go into the query, which means the poorer the query may perform.

When considering a relationship between your table and the node table, Views considers the node table to be on the leftmost side, and all other tables are joined to the right of it. When you describe this relationship to Views, you always tell Views that the next table to node is on the left, while your table is on the right.

Let's take the basic model of taxonomy as an example. Taxonomy has 2 important tables (there are others but we will ignore them for this example). One table, called term_data describes a single taxonomy term. It contains the name, the weight (sort order) of the term, the vocabulary the term belongs to and a description of the term. The other table we are interested in is called term_node, and it contains only two fields: a tid, which relates to the term_data table, and a nid, which relates to the node table. For every term associated with a node, there is a single entry in term_node describing this relationship.

In this diagram (which is vertical for space considerations, but staggered so that you can see the left -> right relationship) node is on the left. Directly to its right is term_node and then to its right is term_data.

This information appears in the array in the 'join'. The following example is the 'join' information for any table that relates directly to the node table via the nid field:

<?php
 
'join' => array(
   
'left' => array(
     
'table' => 'node',
     
'field' => 'nid',
    ),
   
'right' => array(
     
'field' => 'nid',
    ),
  ),
?>

Left join vs inner join

<?php
$table
= array(
  
'join' => array(
    
'type' => 'left',
    
//more...
    
),
  );
?>

For the most part, you don't need to worry about the 'type' field of a join, but you should at least think about it when describing your table. There are two types of joins available:

A left join (which is the default) tells SQL that a record for the table on the left will appear whether or not the table on the right has a matching record. For example, using a left join to to the term_node table above, the node record will a appear in the view whether or not it has any taxonomy terms attached to it.

An inner join, on the other hand, requires that the table on the right have a matching record, or no record will appear. This can have a detrimental effect on some views, because it can cause records to disappear. On the other hand, proper use of the inner join is significantly more efficient and can result in much faster queries, so you should carefully consider the use of an inner join. There are times when it may make sense; however, since Views is hiding this from your users, the UI will not make it easy to let the user decide which is appropriate, and if your software decides inappropriately, your view will seem broken.

To illustrate the difference, let's assume that you want to create a view that pulls up a list of articles and their terms. If you use a left join, then you will get all articles, whether or not they have a term associated. However, if you use an inner join, your list of articles will contain only those articles that have a term -- any term -- attached.

[Insert Diagram Here]

Note that there isn't a join that will cause a record to appear in the view if a record exists in the right table, but not the left2.

For those in need of basic join information, Ligaya Turmelle wrote a very clear introduction into SQL joins.

Naming your table

Earlier, it was noted that the name of your table and the key to the tables array are usually the same (i.e, array('example_table_a' => array('name' => 'example_table_a, ...))). Sometimes this is not the case because you might use the same table multiple times and alias it. If you don't understand why you would do this, don't worry about it, and come back to this later.

One really fast example, however, is that in Views, there are actually several instances of the term_node table we have been discussing, one for each vocabulary. Views provides a field, "Terms for Vocabulary 'foo'", and each one of these fields uses a different table. Obviously, they're all the term_node table, but it aliases each one:

<?php
  $tables
["term_node_$vid"] = array(
   
'name' => 'term_node',
   
//more...
?>

This is referred to as table aliasing or sometimes pseudo-tables.

Advanced usage: More complex joins

Sometimes the simple field to field relationship is not quite enough to describe a relationship. To cope with this, Views allows you to add 'extra' information to the join, in the form of an array:

<?php
 
'extra' => array('field' => 'value'),
?>

What does this mean? Let's take the moderately complex example of the Views Bookmarks module. The basic relationship looks a lot like the term_data relationship to node above, in that there is a views_bookmark_nodes table which relates directly to the node table, and there is a views_bookmarks table which is analogous to term_data -- it describes which of the available bookmarks were used. There is also a third field in the views_bookmark_nodes table, the uid which tells us who actually created the bookmarks.

Views Bookmark example

The reason this relationship can be complex is that the vast majority of the time, we are likely interested in only one group of bookmarks. Let's assume, for the moment, that the administrator has created a bookmark named "My favorites". Most of the Views that are built using bookmarks will only be interested in bookmarks of that type; bookmarks of another bookmark, say, "Todo" will be superfluous. In order to handle this, we use the 'extra' field to add additional information to the join.

To start off with, we have to know the identifier for the bookmark we're interested in. The views_bookmarks table contains the 'vbid' which is the views bookmark id, and it is just like a nid or a tid. To create this relationship, we create a pseudo-table:

<?php
  $tables
["views_bookmark_nodes_$vbid"] = array(
   
'name' => 'views_bookmark_nodes',
   
'join' => array(
     
'left' => array(
       
'table' => 'node',
       
'field' => 'nid'
     
),
     
'right' => array(
       
'field' => 'nid'
     
),
     
'extra' => array(
       
'vbid' => $vbid
     
),
    ),
?>

In the above example, the 'extra' tells us that the vbid field of the table on the right must equal $vbid. And in fact, the Views Bookmarks module creates one of these pseudotables for every type of bookmark it knows about. That way, when something uses "views_bookmarks_2" as the table, it will automatically only get bookmarks for vbid 2 in the query..

There is an even more advanced usage of the extra field. Sometimes a simple equality (i.e, vbid = 2) isn't right for the join. Sometimes a join might want something along the lines of the following silly examples: shoe_count > 2 or foo IN ('bar', 'baz', 'greedo'). However, the construct array('vbid' => 2) doesn't appear to allow for any such thing. In this case, one can use array('shoe_count > 2' => NULL).

Note also that the 'extra' being an array can support as many extra clauses as you need. The Views Bookmark module, for example, also adds 'uid' in some cases. For more information, see the module, which is an excellent example of more advanced Views' API usage.

The taxonomy term example

With all of this information in mind, let's put together the basic join data for term_node and term_data.

<?php
function taxonomy_views_tables() {
 
/**
   * The term_node table links directly to the node table,
   * via the nid field
   */
 
$tables['term_node'] = array(
   
"name" => "term_node",
   
"join" => array(
     
"left" => array(
       
"table" => "node",
       
"field" => "nid"
     
),
     
"right" => array(
       
"field" => "nid"
     
),
    ),
   
//more...
 
);
 
/**
   * The term_data table links to the term_node table
   * via the tid field.
   */
 
$tables['term_data'] = array(
   
"name" => "term_data",
   
"join" => array(
     
"left" => array(
       
"table" => "term_node",
       
"field" => "tid"
     
),
     
"right" => array(
       
"field" => "tid"
     
),
    ),
   
//more...
 
);
  return
$tables;
}
?>

The comments like //more... represent a large amount of information still to be discussed in part 2.

Footnotes

  1. Relating is a database term which can be interpreted as a connection. When two tables relate, each table will have a field that contains the same value.
  2. Sadly, there is no true 'right' join, even though there is a SQL keyword by this name. Some databases actually do support this (possibly including the more recent versions of MySQL), but it should be avoided for cross-database compatibility.

Joining on two fields at once

mikeryan - May 8, 2008 - 22:26

Maybe someone else has figured it out previously, but I couldn't find it... I needed to join to a table on two fields (uid and nid). og_uid JOIN og_membership ON og_uid.nid=og_membership.group_nid AND og_uid.uid=og_membership.group_uid is simple enough at the SQL level, but the array structure is set up for a single left-right relationship. But, peeking at how the join is constructed in _views_query->query(), I found that (at least in Views 1.x on Drupal 5) you can stuff all the complexity you want into the right side of the join:

<?php
  $tables
['og_membership'] = array(
   
'name' => 'og_membership',
   
'join' => array(
     
'left' => array(
       
'table' => 'og_uid',
       
'field' => 'nid',
      ),
     
'right' => array(
       
'field' => 'group_nid AND {og_uid}.uid = {og_membership}.group_uid',
      ),
    ),
...
?>

I haven't looked at the Views 2 API yet - I hope there's a cleaner solution (e.g., an array of 'left'/'right' members).

Mike
Cyrve

 
 

Drupal is a registered trademark of Dries Buytaert.