As a follow-up to #324313: Load multiple nodes and terms at once, we need a generic schema-based method to load multiple records.

Attached is an initial draft.

Needs accompanying tests.

Features:

* Returns a single record if requested by primary key(s), otherwise multiple records.
* Unserializes fields in results based on schema information.
* Accepts a single numeric criterion to load by ID, an array of fields values to load by values, or an array of conditions for more complex matching (e.g., using LIKE, BETWEEN, <, etc.).
* Accepts a list of fields to be returned. If none is given, returns all fields based on the schema.
* Optionally runs results through drupal_alter() before returning.

Function signature:

<?php
/**
* Load one or more records from the database based upon the schema.
*
* @param $table
*   The name of the table; this must exist in schema API.
* @param $alias
*   An alias for the table.
* @param $conditions
*   An array of conditions to apply to the query.
* @param $fields
*   Array: the names of the fields to be returned. If empty, all fields will be
*   returned.
* @return
*   If all primary keys are sent as conditions, a single record object.
*   Otherwise, an array of all matching records. False on failure.
*/
function drupal_read_records($table, $alias = NULL, $conditions = array(), $fields = array(), $alter = TRUE) {
}
?>

Usage:

<?php
// Load all records from a table:
$result = drupal_read_records('node');
// Load a specific record. Because we feed in all primary keys, a single
// record rather than an array of records will be returned.
// Load the node with nid 4:
$result = drupal_read_records('node', 'n', array('nid' => 4));
// When loading a result by a single numeric primary key, the value can be sent
// instead of an array of conditions.
// Load the node with nid 4:
$result = drupal_read_records('node', 'n', 4);
// Load all records matching a set of criteria.
// Load all nodes with nids greater than 4 and 'test' in the title.
$result = drupal_read_records('node', NULL, array('nid' => array('value' => 4, 'parameter' => '>'), 'title' => '%test%', 'parameter' => 'LIKE'));
?>
Files: 
CommentFileSizeAuthor
#15 load-delete-api-365899-15.patch16.15 KBnedjo
Failed: Failed to apply patch.
[ View ]
#13 load-delete-api.patch14.12 KBnedjo
Failed: Failed to apply patch.
[ View ]
#12 load-delete-api.patch14.06 KBnedjo
Failed: Failed to apply patch.
[ View ]
#10 load-delete-api.patch11.96 KBnedjo
Failed: 9655 passes, 3 fails, 2 exceptions
[ View ]
#6 load-delete-api.patch4.84 KBnedjo
Passed: 11371 passes, 0 fails, 0 exceptions
[ View ]
#1 drupal_read_records.patch4.01 KBnedjo
Failed: 10267 passes, 1 fail, 0 exceptions
[ View ]
drupal_read_records.patch3.85 KBnedjo
Failed: 10267 passes, 1 fail, 0 exceptions
[ View ]

Comments

StatusFileSize
new4.01 KB
Failed: 10267 passes, 1 fail, 0 exceptions
[ View ]

Adding some missing lines from the docs.

Other parameters we might want to add:

* range/limit
* sort order

Thought we'd need to be careful not to weigh the method down with too many arguments.

Note: despite my 'node' example, this method would be for loading objects that lack their own load APIs.

A few things that should be borne in mind here. drupal_write_record() may end up deprecated now that we have db_insert, db_update and db_merge. Similarly, I'm not sure we need to support as much flexibility as the current patch allows for - since db_select() already allows us a flexible way to build queries.

At the same time, there's also chx's proposal for #355515: Create a small nonfieldable item API which would definitely require a generic loader. And we'll hopefully have cache_get() and cache_get_multiple() soon if #111127: Cache node_load gets in which follow a similar pattern.

The primary use cases for this which I see are:
1. An easy way to load objects by id, by array of ids, or by conditions on a single table.
2. Ability to do this while also sharing a static cache whether loading by any of the above methods
3. ability for hook_$object_load_alter() functions to operate on multiple records to save database queries inside hook implementations.

When working through the multiple load patches (of which we now have node, taxonomy terms and vocabularies, files and users if you include pending patches), about 70% of each function is exactly the same. However those objects have enough inconsistency between them (node revisions, taxonomy and user case insensitivity issues) that actually abstracting the code out into helper functions isn't really practical at the moment. What we have done though, is separated 'potentially multiple' things which would be loaded from an array of IDs or $conditions, and the single case into separate functions, I think it makes sense to do the same here - so that the single case is as simple as possible, and so we can type hint parameters and have consistent return results.

So before seeing this patch, I'd considered a drupal_load_object() and drupal_load_object_multiple(), but as mentioned given up due to the inconsistencies in our actual core objects.

These would've looked like:

<?php
function drupal_load_object($table, $id, $reset = FALSE) {
 
// Just a convenience wrapper.
 
return reset(drupal_load_object_multiple($table, array($id), $reset);
}
function
drupal_load_object_multiple($table, array $ids = array(), array $conditions = array(), $reset = FALSE) {
}
function
hook_$table_load_alter($objects) {
}
?>

The limitations of this compared to the approach here would be

1. conditions would all be '=' - however you can do a straight db_query to find the relevant object ids, and load those once you have them, which adds a query in those cases, but saves type checking internally all the rest of the time.

2. I'm not sure how it'd work for primary keys on more than one column - would have to be restricted to a single column primary key for a single object load.

Other options might be an $oid = 'oid' parameter, or a registry hook which provided additional metadata about the table - i.e. which column is the object ID, if any columns need to be loaded using LIKE/LOWER etc.) There may well be a happy balance somewhere in the middle.

Ideally I'd love to be able to change at least some of the multiple load functions to use this API once it's in, and I think it has potential for use in plenty of other places as well.

edit: another thing we could consider is an optional cache_get() call in here as well. So that if we use this to load locale_records etc., we can grab them from memcache where possible as well. I don't have a clear idea how we could abstract that without cluttering the function signature though.

Issue tags:+i18n sprint

I think this may be really important to think of some hooks for translating objects / lists of objects on the fly, so just tagging it for i18n sprint

(And this is my request, that we can take out some fields, join in some tables, and add other fields here)

I've been working a bit more on this list concept and though I don't have a better patch, I have some more ideas:

1. We need some generic object API
I think we really need some object abstraction, and we really need these 'multiple object loading functions' that catch talks about: drupal_load_object() and drupal_load_object_multiple().

2. We should use query objects as a parameter or extend the SelectQuery object
The list loading function in this patch is limited because it doesn't support many query options. For getting a list query, either we could implement it as an extension to the SelectQuery object (ListQuery), or create a function that takes a query object as an argument, which would give us all the options.

3. We need a better implementation for pager queries (PDO Extension)
The big still unresolved trouble here is pager_query() which is sql (string) oriented. The main list loading functions all around are pager_query(), or query_range() ones.

4. We could (and should) add metadata to the lists
If you build a list of objects, and then pass it around, some metadata linked to the list would be great. We could do something forms-api like with a list having objects and '#properties' like '#object_type', '#format'. Then at some point in the future maybe we can just do

$build = list_build($list);
and then
drupal_render($build);

So I'm thinking of starting with some generic object API (object.inc), and maybe a hook_objects() for the modules to define their objects....

<?php
/*
* Implementation of hook_objects().
*/
node_objects() {
 
$objects['node'] = array(
   
'name' => t('node'),
   
'table' => 'node',
   
'key' => 'nid',
   
'translatable' => TRUE,
   
'access callback' => 'node_access'// Uniform access hook node access like, which implements permissions for 'view', 'update', 'delete' ops
   
'load callback' => 'node_load', // Uniform object loading that should take an object id
   
'multiple load callback' => // Same taking an array of object ids
 
);
  return
$objects;
}
?>

Title:Flexible drupal_read_records method for schema-based loading of multiple recordsAPI methods for schema-based load and delete operations
StatusFileSize
new4.84 KB
Passed: 11371 passes, 0 fails, 0 exceptions
[ View ]

Here an updated patch. I've changed 'read' to 'load', added a single as well as multiple loader, and added delete.

I'm using this patch in #361597: CRUD API for locale source and locale target strings and finding it greatly reduces duplicated code.

Looks good though I'm thinking of some improvements.

* None of these functions need an alias parameter
* The function to retrieve a single/multiple record should have a more different name, like: get_records / load_record
* We could skip the $alter parameter: queries already can be altered and we possibly would like to be able to alter all /none. Instead we should tag the queries (addTag) so query altering methods now it is a 'get_records' operation.
* We need support for range / limit, as most row loading operations use it. And also, as I mentioned before, for pager queries for this to be really reusable.
* When loading multiple records, if none found we should return an empty array so you can use the return straight in a foreach() and it will still evaluate to FALSE
* Unserialize should default to TRUE to be consistent with drupal_write_record which handles serialized fields automatically.

Jose's comments are good. Since db_select() handles aliasing internally, I don't see a need for it either, same with $alter.

However I don't think we need range/limit at all - or at least this should be left out of the patch until pager query/extender handling is actually in dbtng.

One other issue, with the existing multiple load functions in core, it saves a lot of messing about having the array of objects returned keyed by object ID - since you can then do array_keys($objects) and stick that into hook_$something_else_load_multiple() (i.e. node_load_multiple() gets passed to taxonomy_term_load_multiple() which in theory could be passed down to file_load_multiple()) - otherwise it's a lot of foreaching to pull out the IDs required.

It would be useful if the caller could pass an array of tags which should be added to the select builder. this encourages tagging of queries which makes them easily alterable in hook_query_TAG_alter().

StatusFileSize
new11.96 KB
Failed: 9655 passes, 3 fails, 2 exceptions
[ View ]

Reworking the approach based on feedback, adding some tests, not working yet, I'll fix up and explain changes then.

I think this is looking good, however I'm concerned about something: we are dealing with a lot of parameters and options, though we are missing a lot of normal query functionality (joins, complex conditions, ranges....);

So I propose, let's take a step back and think what this api adds: the ability to provide some uniform hook for altering (first the query and then the records). So I'm thinking why don't we pass a pre-built query which would give us the ability to add any query condtion, join, etc, here...

Thus we could use it with a simple table name, no conditions, or a pre-built query, we just need to pass the main table in the query for the alter hook names.

<?php
function drupal_load_records($table, $query = NULL) {
  if (!
$query) {
   
$query = db_select($table, 't');
   
$query->fields('t');
  }
 
$query->addTag('load_records');
 
$query->addTag('load_records_' . $table);
 
$records = $query->execute()->fetchAll();
 
drupal_alter('load_records', $records, $table);
 
drupal_alter('load_records_' . $table, $records);
  return
$records;
}
?>

Note: The $query->fields('t'), I found it while digging in the db code, that adds all the fields for that table, no need to check the schema.

Status:Needs work» Needs review
StatusFileSize
new14.06 KB
Failed: Failed to apply patch.
[ View ]

Here's a patch that provides a very flexible and powerful pair of API functions, with accompanying tests.

The drupal_load_records() function supports:

<?php
// Load all records from a table.
$record = drupal_load_records('taxonomy_vocabulary');
// Load a single record by primary key.
$record = drupal_load_record('taxonomy_vocabulary', $vid);
// Load a single record by primary key, specifying the primary key to avoid loading
// the schema.
$record = drupal_load_record('taxonomy_vocabulary', $vid, 'vid');
// Load a record and unserialize any serialized fields, determined by consulting the schema.
$record = drupal_load_record('taxonomy_vocabulary', $vid, 'vid', TRUE);
// Load multiple records by primary key. Results are keyed by primary key.
$records = drupal_load_records('taxonomy_vocabulary', array($vocabulary_1->vid, $vocabulary_2->vid));
// Load multiple records with a complex set of options.
$options = array(
 
// Specify one or more fields. If arrays of values are given, the 'IN' operator will be used.
 
'conditions' => array(
   
'name' => array('testing', 'test 2'),
  ),
 
// Specify fields to be returned.
 
'fields' => array(
   
'vid',
   
'name',
  ),
 
// Order results.
 
'order_by' => array(
    array(
'name', 'DESC'),
  ),
);
$records = drupal_load_records('taxonomy_vocabulary', $options);
// Load multiple records passing a SelectQuery object.
$query = db_select('taxonomy_vocabulary')
  ->
fields('taxonomy_vocabulary')
  ->
condition('vid', array($vocabulary_1->vid, $vocabulary_2->vid), 'IN');
$records = drupal_load_records('taxonomy_vocabulary', $query, 'vid');
?>

So we get a method that can be used for everything from the most simple loading to the most complex custom queries.

Whatever the request, it has all the benefits of a predictable API, including: schema-aware handling (e.g., unserializing, smart keying of results according to the number of primary key fields, etc.), a standard query alter tag, a common request and result format.

StatusFileSize
new14.12 KB
Failed: Failed to apply patch.
[ View ]

Here's a slight refinement, to make the method easier to use and more flexible.

In the previous version of the patch, loading by field other than primary key required:

<?php
$records
= drupal_load_records('taxonomy_vocabulary', array('conditions' => array('name' => 'testing')));
?>

Here, instead, the conditions can be in a more familiar form, already used in e.g., node loading:

<?php
$records
= drupal_load_records('taxonomy_vocabulary', array('name' => 'testing'));
?>

To distinguish between field-value pairs and query options (which have a leading #), we use element_children().

This change also means that '#conditions' fed to the array can have any valid operator (e.g., <, >, etc.) rather than being limited as before to = and IN.

The $query parameters can now be in the following forms:

  1. integer or array of integers--These will be taken as primary key values.
  2. field-value pairs, e.g. array('name' => 'cheese')
  3. query options: These are more complex query arguments (conditions, fields to be returned, range, order by, etc.).

An example of a load operation mixing field-value conditions with extended query options:

<?php
// Load multiple records with a complex set of options.
$options = array(
 
// Specify one or more fields. If arrays of values are given, the 'IN' operator will be used.
 
'name' => array('testing', 'test 2'),
 
// Specify fields to be returned. The # designates a query option rather than a field.
 
'#fields' => array(
   
'vid',
   
'name',
  ),
 
// Order results.
 
'#order_by' => array(
    array(
'name', 'DESC'),
  ),
);
$records = drupal_load_records('taxonomy_vocabulary', $options);
?>

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review
StatusFileSize
new16.15 KB
Failed: Failed to apply patch.
[ View ]

Refreshed after drupal_write_record() tests added in #369423: drupal_write_record() returns FALSE on insert when table has multi-field primary key.

Changes:

* Fix up some missing and incorrect code documentation.
* Implement $alter_results option to pass drupal_load_records() result through drupal_alter().
* Use assertIdentical() instead of assertTrue() in tests where appropriate.

Subscribing - this is interesting.

What are the benefits of the current approach with a query assembly in drupal_load_records() over the alternative described in #11?

Subscribing.

Status:Needs review» Needs work

The last submitted patch failed testing.

Status:Needs work» Needs review

Status:Needs review» Needs work

The last submitted patch failed testing.

Version:7.x-dev» 8.x-dev

Moving to Drupal 8

Issue summary:View changes
Status:Needs work» Fixed

Entity API is in core now, i.e. entity_load_multiple()

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.