The attached patch changes the field module integration so that, rather than loading entire entities to display a field, Views directly load the data it needs and hands that off to the field module to prepare. The current patch is a first pass, has only seen basic testing, and I welcome feedback.

In my basic testing, using Apache Bench, this is at least as fast as the current method. It appears to be a little faster when displaying nodes with more fields.

This patch grew out a discussion on IRC with regard to how projects like Searchlight (which rely on Views to generate SQL) could continue to utilize Views to query Field module data.

Comments

jmiccolis’s picture

StatusFileSize
new5.93 KB

Quick update to the patch, to fix a bug where it couldn't render multiple bundles at once.

adrian’s picture

StatusFileSize
new5.85 KB

here is a more complete patch that tries to use the right field handlers for the attached fields.

yhahn’s picture

Status: Active » Needs review
yhahn’s picture

Status: Needs review » Needs work

Needs a re-roll with some adjustments. Working on it.

yhahn’s picture

Status: Needs work » Needs review
StatusFileSize
new6.65 KB

Re-rolled patch. Notes:

  • Goes back to approach in #1. Discussed with Adrian and using native views handlers for fields doesn't make much sense.
  • Fixes bundle/revision field querying for taxonomy_term_data base table. It turns out that querying for bundle/revision data cannot always be done from information provided by hook_entity_info -- instead EntityController implementations may add custom data to loaded entities in the protected buildQuery() method. See taxonomy term entity controller for more info.

It's also worth noting that directly querying for field values by Views will be important for use of the 3.x branch's aggregate handling (group by, count, etc). Another reason to consider this change! : )

yched’s picture

The downside being, of course, that this loses the opportunity to get the values of all fields at once from either field cache or http://drupal.org/project/entitycache ?

[edit : and bypasses hook_[entity]_load(), which, well, has always been the case in Views so far]

bojanz’s picture

Went through the code, talked to dereine about it, and despite initial dislike I think it makes sense.

I like the entity_load approach. And it works perfectly for efq_views (where we already do an entity_load for each entity because EntityFieldQuery returns just stub ids).
The main issue is that drupal 7 has it's own entity / field loading code, that in parts duplicates what views does.
However, Views is complex enough that it can't rely on it, otherwise it would have to sacrifice certain features (such as grouping).
So, I'm +1 on this, and in efq_views I'll just override the changed parts and make it behave like the current code...

However, I think the code will fail on comments, since they have the same problem as taxonomy_terms for bundle (no column in the table, calculated by doing a join on the nid column, taking the node type, appending "comment_" to it).
See #938462: Fix EntityFieldQuery fatal error in certain cases and document bundle limitations for more info.

dawehner’s picture

Another advantage of using sql here is that click_sort works again.

bojanz’s picture

Status: Needs review » Needs work

So, we all agree we want this in.

Let's fix the comments issue and get it in.

We need to join the {node} field, take it's "type", and prepend "comment_node_" to it.
Maybe this could be done by passing a formula to add_field in query(), haven't tried it.
Either that, or we just take "type" and do the prepending in pre_render().

chx’s picture

Not doing an entity load completely kills entitycache. Do it if you must but think twice before doing it. http://drupal.org/project/entitycache

Edit: ops, yched said this already :)

catch’s picture

Yes this should be benchmarked with entitycache + memcache or apc caching backends rather than just core to get the full impact. Also I'd be interested to see what size database this was run on.

Also it's obvious from the patch but it doesn't just bypass entity cache but the core field cache as well (if you really wanted to you could do field_attach_load() on a stub entity to save a full entity load, although this then runs the risk of duplicating data loaded from elsewhere and that doesn't appear to be the intention here anyway).

bojanz’s picture

Views has always had it's own caching.
This (not being able to take advantage of entity / field caches) is unfortunate, but not having grouping, click sorting, and preventing other modules (like searchlight) from using it is even more unfortunate.

We'll always have efq_views where all the caches and entity / field APIs are used, at the expense of not having some features (like grouping & click sorting mentioned above). So everyone can chose what's more important to him/her.

Edit: talked with catch on IRC, and we agreed that it would be good to actually know what the Searchlight use case is, since it isn't explained here.
Also, an option to investigate would be to only avoid entity_load if grouping / tablesorting is in use.

catch’s picture

OK more questions then:

If tablesort or groups are selected as options, what stops those being added as joins? If there is a performance penalty there then it'd be unfortunate to apply to all views just to support ones that use those features.

Also what's the actual use case that searchlight has, that requires this change? Just using views to generate an sql string again doesn't seem like sufficient justification to chuck out entity_load() - searchlight is equally capable of overriding this behaviour as efq_views would be no? And why would a loaded entity be no good if it's the same data?

dawehner’s picture

Searchlight supports both solr and sphinx. Sphinx runs sql queries against drupal to index.

karens’s picture

I've been trying to understand the reason for using entity_load() instead of querying individual fields. It's a massive change from the way that D6 works and the rational is not explained anywhere. Subscribing so I can get up to speed.

merlinofchaos’s picture

I have zero sympathy for entity cache. Here is why:

If I can load all of the fields in one query, adding additional fields to that query has impact only insofar as it joins in more tables. Yes, it can impact the query a little bit.

Running my query, then going back and retrieving additional fields adds another operation. Cached or not I still have to perform another operation, and that other operation is not free. Plus, you load a LOT more data in your entity cache, increasing memory footprint.

I see zero benefit to respecting entity cache on fields where the data is right there in the database, waiting for us to take it.

I see lots of benefit in respecting entity cache on complex fields where you can't simply read the field from the database. But these fields are already hamstrung; you can't sort, filter or group on these fields anyway.

In short: entity_load() should be used by Views, IMO, only if necessary. "Necessary" can be defined by the field itself. Simple core fields such as text, number and date should definitely not be necessary.

Complex fields such as file and image probably should continue to use entity_load.

merlinofchaos’s picture

If I can load all of the fields in one query, adding additional fields to that query has impact only insofar as it joins in more tables. Yes, it can impact the query a little bit.

Note that this little bit is measurably small. There is a reasonable chance that the tables you're joining for said field were joined in already for sorting or filtering purposes. Even if not, joining tables on primary keys for the purposes of pulling data but NOT sorting/filtering on them is quite cheap. The chance of the total sum time spent on JOINs you can save being outweighed by the cost of fetching entities -- even from memcache -- seems very small to me. Remember that even if you memcache, fetching that entity from cache means unserializing it. Sure, maybe you don't have to do that because it's already statically cached.

merlinofchaos’s picture

Priority: Normal » Major

Bumping to major. I feel strongly that skipping entity_load for straightforward fields is the right option. Let's proceed with this and get it done.

attiks’s picture

subscribing, since I need to do grouping/aggregating on the sql level and it doesn't work for the moment since the grouping is done on the entity ids.

catch’s picture

Running my query, then going back and retrieving additional fields adds another operation. Cached or not I still have to perform another operation, and that other operation is not free. Plus, you load a LOT more data in your entity cache, increasing memory footprint.
[...]
I see lots of benefit in respecting entity cache on complex fields where you can't simply read the field from the database. But these fields are already hamstrung; you can't sort, filter or group on these fields anyway.

In short: entity_load() should be used by Views, IMO, only if necessary. "Necessary" can be defined by the field itself. Simple core fields such as text, number and date should definitely not be necessary.

Complex fields such as file and image probably should continue to use entity_load.

With the current patch you can add text fields to that too, or in fact any field that implements hook_field_load().

text_field_load() is cached by the field cache, and runs an (uncached) check_markup() to avoid additional round trips to the database if using a formatted text field type. With the current patch, by bypassing the field cache, you're left with no caching for check_markup(), so it's a completely false saving trying to bypass the API here.

So the 'straightforward' fields your left with are just list, number and unformatted text fields. taxonomy, node and user reference fields are also complex, but the bulk of the work there is done in hook_field_formatter_prepare_view() which so far is unaffected by this change.

Specifically this hunk:

-        $objects = entity_load($entity_type, $oids);
+      // Invoke field-type module's hook_field_load().
+      _field_invoke_multiple('load', $entity_type, $objects);
 
-        foreach ($oids as $key => $entity_id) {
+      // Invoke hook_field_attach_load(): let other modules act on loading the
+      // entitiy. << NOT CACHED ANY MORE.
+      module_invoke_all('field_attach_load', $entity_type, $objects);

So focusing entirely on entitycache here is a red herring, the field cache (which entity cache subsumes but it's the same principle) is the important thing being left out of the equation, since that's quite critical to avoid unnecessary overhead and additional round trips.

attiks’s picture

I understand the reasoning behind using the cache, but I think we also need to be able to act on the actual field values, in a custom module I defined a views table and on that data I can act on the actual values, but when I do the same based on nodes data it starts using the entity id's

I created a new view style plugin that allows me to automatically group by an interval on timestamp data, more information and the code I'm using can be seen at http://groups.drupal.org/node/105114. I want this to be as general as possible so I can easily reuse it.

karens’s picture

The first thing I had to do to get Date and Calendar working was override the default handling and join the date fields into the query. It doesn't do me any good to have the date information available in the themeing layer, I need the raw data in the table to do anything at all with dates. So it's not just about 'simple' vs 'complex' fields, it's that the information in entity_cache is only available in the themeing layer. If you need any of that data to manipulate the query, it doesn't do any good.

sun’s picture

subscribing

bojanz’s picture

So I'm guessing that to push this issue forward we need a benchmark comparing the proposed patch with the current solution, with emphasis on the field caching (which is still a valid concern).

merlinofchaos’s picture

Status: Needs work » Active

With the current patch you can add text fields to that too, or in fact any field that implements hook_field_load().

text_field_load() is cached by the field cache, and runs an (uncached) check_markup() to avoid additional round trips to the database if using a formatted text field type. With the current patch, by bypassing the field cache, you're left with no caching for check_markup(), so it's a completely false saving trying to bypass the API here.

At this point we are now overcaching. Sorry, I will not let Views' be hamstrung for this.

This discussion can end now. There is no need for benchmarks. I am playing the Dries' card. This is how Views' will work. If people don't like it, they can write a contrib module to screw Views by returning to previews behavior by swapping out field handlers.

merlinofchaos’s picture

Also unless filter.module has changed, check_markup calls are cached anyway, so all you're going to do is hit the database (or memcache).

sun’s picture

Filter module has changed - the filter cache is disabled by default now. That's because almost all filtered text lives in text fields now, and text fields are using the field cache already.

bojanz’s picture

Status: Active » Needs work

The main question here is how much d7 entity/field core abstraction can and should be used by Views.
Views has always been a query builder, and as such close to the metal, thus needing to avoid other abstractions.
In this case it's obvious that the halfway-approach used leads to problems and sacrifices certain features (grouping, click sorting, the "multiple field handler"), so it's not feasible for a module as flexible and feature rich as Views.

So, I'm glad we are continuing with the patch. Changing the status to "needs work" since we need to address the comments bundle issue I mentioned in #7/#9.

At the same time, I'm reminding everyone of my views backend, efq_views, which embraces every possible abstraction offered today (EntityFieldQuery, entity metadata..) with interesting results. Since it's not part of Views, I can easily sacrifice grouping, relationships, etc (although relationships are somewhat possible). Thus everyone has a choice when creating a new View, use an efq_views base table and get the benefits of entity caching, fields in no-sql databases, etc; or use a reglar views base table and get the features outlined above (grouping, relationships, etc etc).
That way everyone can decide what is more important to them on a per view basis.

dawehner’s picture

Status: Needs work » Needs review
StatusFileSize
new8.34 KB

* Rename $table to $base_table to make the code a bit cleaner.
* Fixed it to work with comments.
* fixed a minor missing space

dawehner’s picture

StatusFileSize
new8.08 KB

* "merge" the taxonomy/comment case
* removed the todo.

dawehner’s picture

StatusFileSize
new8.81 KB

* Fixed comment:

    // The bundle system will be fixed in d8, so so special cases should be needed.

* Fixed another comment
* Renamed _field_cache to _field_data

bojanz’s picture

Status: Needs review » Reviewed & tested by the community

Discussed on IRC.
dereine is awesome as always :)
Very happy about the bundles fix, didn't even think of that in #9.

dawehner’s picture

Status: Reviewed & tested by the community » Fixed

http://drupal.org/cvs?commit=456366

Thanks for all the work/discussion in the issue.
Next issue on the list: #895046: Implement views_handler_field_multiple

rszrama’s picture

Status: Fixed » Needs review

I'm not entirely sure, but I believe this patch is responsible for killing my Views, so I've put it to needs review. : )

Basically, I have Views that involve establishing a relationship (i.e. from an order to the line items referenced on it) and displaying fields from the referenced item (i.e. the price of the referenced line item) in a table. When Views attempts to render this field from the referenced item, it generates a host of errors including:

    * Notice: Undefined index: commerce_order in views_handler_field_field->query() (line 42 of /Users/rszrama/src/cvs/d7/views/3/modules/field/views_handler_field_field.inc).
    * Notice: Undefined index: entity keys in views_handler_field_field->query() (line 53 of /Users/rszrama/src/cvs/d7/views/3/modules/field/views_handler_field_field.inc).
    * Warning: Invalid argument supplied for foreach() in views_handler_field_field->query() (line 53 of /Users/rszrama/src/cvs/d7/views/3/modules/field/views_handler_field_field.inc).
    * Notice: Undefined index: commerce_order in views_handler_field_field->pre_render() (line 170 of /Users/rszrama/src/cvs/d7/views/3/modules/field/views_handler_field_field.inc).
    * Notice: Undefined index: entity keys in views_handler_field_field->pre_render() (line 177 of /Users/rszrama/src/cvs/d7/views/3/modules/field/views_handler_field_field.inc).

With the fatal error being:

PHP Fatal error:  Cannot access empty property in /Users/rszrama/src/cvs/d7/views/3/modules/field/views_handler_field_field.inc on line 177

I thought this could just be a problem in my relationship code for the line item reference fields, so I duplicated the problem with a View using nodes and product reference fields. Then I thought this could be a problem with all of my code, so I duplicated the problem by adding a field to users and creating a node View that adds an author relationship and then attempts to display my custom field from the user. Dead.

There appears to be a good bit of metadata missing in the field's definition now, and there are some attempts to load info arrays that result in missing data that cascades down to the fatal errors. This patch suspiciously rewrites a lot of the file in question, views_handler_field_field.inc, so that's why I'm hoping this is the issue. : )

rszrama’s picture

Status: Needs review » Needs work

And just to confirm, reverting this patch resolved my issue.

dawehner’s picture

Status: Needs work » Needs review
StatusFileSize
new2.58 KB

What about this one?

rszrama’s picture

Status: Needs review » Needs work

Still generates the same errors for views_handler_field_field.inc. I can't really just post my View for you to test with, because it involves setting up fields on something, but the quickest way to build a test is to add a field to users and then build a node View that tries to display one of those fields through a user relationship.

attiks’s picture

I can only confirm the bug with the scenario from #37, with or without the patch in #36

Fatal error: Cannot access empty property in /sites/all/modules/views/modules/field/views_handler_field_field.inc on line 177

I tried debugging a bit and it appears that $entity_type is empty but the $entity_info structure contains information about all entitities

dawehner’s picture

Okay what is the problem:

      $entity_type = $this->definition['entity_tables'][$this->view->base_table];

We can't assume that the entity-type of the current field is the one from the base table. So from the field-data defitintion there should be a way to get the entity.

dawehner’s picture

StatusFileSize
new4.73 KB

After some hours at the snowed snowy trainstation here is a patch

attiks’s picture

Patch is working with the scenario of #37.

In the views UI I created a relationship on Node: Author, for the fields I selected:
* Node: Title
* User: E-mail
* User: Uid
* (user) Fields: field_naam (custom textfield added to users)

But my query looks a bit strange:
* it contains two joins?
* it contains a lot more fields in the select?

SELECT node.title AS node_title, node.nid AS nid, users.mail AS users_mail, users.uid AS users_uid, users_node__field_data_field_naam.entity_id AS users_node__field_data_field_naam_entity_id, users_node__field_data_field_naam.etid AS users_node__field_data_field_naam_etid, users_node__field_data_field_naam.delta AS users_node__field_data_field_naam_delta, users_node__field_data_field_naam.language AS users_node__field_data_field_naam_language, users_node__field_data_field_naam.field_naam_value AS users_node__field_data_field_naam_field_naam_value, users_node__field_data_field_naam.field_naam_format AS users_node__field_data_field_naam_field_naam_format
FROM 
{node} node
LEFT JOIN {users} users_node ON node.uid = users_node.uid
INNER JOIN {users} users ON node.uid = users.uid
LEFT JOIN {field_data_field_naam} users_node__field_data_field_naam ON users_node.uid = users_node__field_data_field_naam.entity_id AND (users_node__field_data_field_naam.etid = :views_join_condition_0 AND users_node__field_data_field_naam.deleted = :views_join_condition_1)
LIMIT 10 OFFSET 0
dawehner’s picture

Status: Needs work » Needs review

The many fields are okay, they are all needed.
This isn't changed by this patch.

The additional join seems to be fine too, it comes from the fieldapi field. The first join comes from the node<->user join.

Thanks for testing!

attiks’s picture

I had a quick look at the code as well, but looks ok to me, so RTBC for me, but better wait for Ryan so he can test as well.

rszrama’s picture

I'm still getting the same errors on the shopping cart order View, but I'll try re-applying, trying with a user View, and debugging before I give up. Gotta nail this one. : D

rszrama’s picture

In case this helps, my $this->definition array is:

Array
(
    [field] => entity_id
    [table] => field_data_total
    [handler] => views_handler_field_field
    [click sortable] => 1
    [field_name] => total
    [additional fields] => Array
        (
            [0] => etid
        )

    [group] => Fields
    [title] => total
    [title short] => total
    [help] => Appears in: commerce_line_item:product
)

This results in an error on line 55 of views_handler_field_field.inc:

  $this->entity_type = $entity_type = $this->definition['entity_tables'][$base_table];

And cascades down to a fatal error. Looking into it...

rszrama’s picture

fwiw, I can at least get this to work using a field on the user account with a relationship to the user through a node's uid. Gonna look into my relationship code and compare it to the node author code to see if anything sticks out as missing.

Also, I'm wondering if I might just have had a caching issue turning up the same errors as before. At present I'm getting the following notices and no longer a WSOD:

* Debug: 'Missing join: commerce_line_item commerce_order' in views_get_table_join() (line 95 of /Users/rszrama/src/cvs/d7/views/3/includes/handlers.inc).
* Debug: 'Exception: SQLSTATE[42S22]: Column not found: 1054 Unknown column \'commerce_line_item.type\' in \'field list\'' in views_plugin_query_default->execute() (line 1197 of /Users/rszrama/src/cvs/d7/views/3/plugins/views_plugin_query_default.inc).
rszrama’s picture

Just to explain the setup here, I do see that my definition differs from user, and I'm wondering if there are recent changes to the relationship schema I may have missed.

I see that all that node.views.inc does to enable the relationship over a node's uid is:

  // uid field
  $data['node']['uid'] = array(
    'title' => t('Author'),
    'help' => t('Relate a node to the user who created it.'),
    'relationship' => array(
      'handler' => 'views_handler_relationship',
      'base' => 'users',
      'field' => 'uid',
      'label' => t('user'),
    ),
  );

With relationships through the line item reference field, it's not as straightforward since the relationship is coming through the value of a field instead of the value of a property on the entity itself. However, I'm not sure that explains the fact that I'm adding an explicit join to the table like node does for revisions but doesn't for users:

  // First indicate this field's table joins to the commerce_line_item table.
  $data[$table]['table']['join']['commerce_line_item'] = array(
    'left_field' => 'line_item_id',
    'field' => $key,
  );

Note that $table is the base table name (like field_data_line_items) and $key is the field name (like line_items_line_item_id). Then my code that actually adds the relationship has minor differences in that there are keys here that node isn't using for the uid relationship and that I'm using a 'base field' key instead of what I see as 'field' on the uid definition:

  // Then add the relationship data to the field itself.
  $data[$table][$key]['relationship'] = array(
    'title' => t('Referenced line item'),
    'help' => t('The line item referenced through this field.'),
    'base' => 'commerce_line_item',
    'base field' => 'line_item_id',
    'handler' => 'views_handler_relationship',
    'label' => t('Line Item'),
  );
bojanz’s picture

I'll try to debug this tonight if it doesn't get resolved until then.

One of the joins that attiks is seeing comes from #986992: Insane etid / {field_config_entity_type} abstraction

rszrama’s picture

Trying to work a little new information into this... the difference between my relationship and the node author relationship is that our relationship is established through a reference field. It's like this:

Order entity (has a) -> Line item reference field (that references a) -> Line item entity (with a) -> Price field

The View starts with the order, establishes a relationship through the line item reference field, and tries to display the price for that line item. Somewhere along the way, Views is trying to do a direct join between the order table and line item table that will always fail since there is no common key between them. Views should be (and I guess has been) joining from commerce_order to field_data_line_items and from that to commerce_line_item. Something about the new process here has altered how Views tries to build the relationship.

rszrama’s picture

Also, this is the query as it is currently being built w/ dereine's patch in #40 applied:

SELECT commerce_line_item_field_data_line_items.quantity AS commerce_line_item_field_data_line_items_quantity, commerce_line_item_field_data_line_items.line_item_label AS commerce_line_item_field_data_line_items_line_item_label, commerce_line_item_field_data_line_items__field_data_total.entity_id AS commerce_line_item_field_data_line_items__field_data_total_e, commerce_line_item_field_data_line_items__field_data_total.etid AS commerce_line_item_field_data_line_items__field_data_total_e_1, commerce_line_item_field_data_line_items__field_data_total.delta AS commerce_line_item_field_data_line_items__field_data_total_d, commerce_line_item_field_data_line_items__field_data_total.language AS commerce_line_item_field_data_line_items__field_data_total_l, commerce_line_item_field_data_line_items__field_data_total.total_price AS commerce_line_item_field_data_line_items__field_data_total_t, commerce_line_item_field_data_line_items__field_data_total.total_currency_code AS commerce_line_item_field_data_line_items__field_data_total_t_1, commerce_line_item.type AS commerce_line_item_type, commerce_line_item_field_data_line_items.line_item_id AS commerce_line_item_field_data_line_items_line_item_id
FROM 
{commerce_order} commerce_order
LEFT JOIN {field_data_line_items} field_data_line_items ON commerce_order.order_id = field_data_line_items.entity_id AND (field_data_line_items.etid = :views_join_condition_0 AND field_data_line_items.deleted = :views_join_condition_1)
INNER JOIN {commerce_line_item} commerce_line_item_field_data_line_items ON field_data_line_items.line_items_line_item_id = commerce_line_item_field_data_line_items.line_item_id
LEFT JOIN {field_data_total} commerce_line_item_field_data_line_items__field_data_total ON commerce_line_item_field_data_line_items.line_item_id = commerce_line_item_field_data_line_items__field_data_total.entity_id AND (commerce_line_item_field_data_line_items__field_data_total.etid = :views_join_condition_2 AND commerce_line_item_field_data_line_items__field_data_total.deleted = :views_join_condition_3)
WHERE (( (commerce_line_item_field_data_line_items.type IN  ('product')) ))
ORDER BY commerce_line_item_field_data_line_items_line_item_id ASC

If I revert and then roll back the patch in #31, the View works with the exact same query except the SELECT list of columns does not include commerce_line_item.type. As far as I can tell, everything else is the same. Now, that column should only be referenced by the filter on the View which limits it to only show product line items... it's interesting that the query now wants to include that in the column SELECT list when the WHERE condition (( (commerce_line_item_field_data_line_items.type IN ('product')) )) is all that is necessary to filter the View.

Is something about the changes to field.views.inc now mandating the type field be added even though it's unnecessary? And if so, it's the adding of that type field that is resulting in the missed join, right?

dawehner’s picture

StatusFileSize
new3.38 KB

What about this one.


diff -u modules/field/views_handler_field_field.inc modules/field/views_handler_field_field.inc
--- modules/field/views_handler_field_field.inc 2 Dec 2010 08:28:00 -0000
+++ modules/field/views_handler_field_field.inc
@@ -61,12 +61,12 @@
     switch ($entity_type) {
       case 'taxonomy_term':
       case 'comment':
-        $this->aliases[$this->definition['field_name'] . '_' . 'bundle'] = $this->query->add_field($this->table, 'bundle');
+        $this->aliases[$this->definition['field_name'] . '_' . 'bundle'] = $this->query->add_field($this->table_alias, 'bundle');
         break;
       default:
         foreach ($this->entity_info['entity keys'] as $key => $column) {
           if (!empty($column) && ($key == 'bundle' || $key == 'revision')) {
-            $this->aliases[$this->definition['field_name'] . '_' . $key] = $this->query->add_field($base_table, $column);
+            $this->aliases[$this->definition['field_name'] . '_' . $key] = $this->query->add_field($this->table_alias, $key);
           }
         }
         break;
dawehner’s picture

StatusFileSize
new3.66 KB

Next version

dawehner’s picture

StatusFileSize
new3.67 KB

Yet another version

das-peter’s picture

subscribe

dawehner’s picture

Regarding the groupby


[23:29] <dereine> i have a query against a field table, the field table has the primary key stored on the base table
[23:29] <dereine> so if the groupby has the primary key of the base table added it worked fine
[23:30] <dereine> but if the groupby has the primary key as part of the field table it doesn't work
[23:30] <merlinofchaos> dereine: Usually adding the primary key breaks group by unless you're intentionally grouping multi-value fields per node.
[23:31] <dereine> http://drupalbin.com/16845 vs http://drupalbin.com/16846
[23:32] <-- manuee has left this server (Remote host closed the connection).
[23:33] <dereine> merlinofchaos: the difference is just the vid/type in the base table vs the vid/type in the field table
[23:33] <dereine> i have no idea what's the difference
[23:34] <merlinofchaos> holy crap those are big queries
[23:36] <dereine> looks like yql
[23:37] <merlinofchaos> dereine: There's too much in those queries for me to get a handle on. The thing is, when using GROUP BY, you don't want the primary keys unless they're specifically added as fields, or they're going to break things. ARe they being added as fields in the UI?
[23:38] <dereine> merlinofchaos: no this comes from fieldapi fields
[23:38] <dereine> to construct the objects the entity-id/bundle/revision are needed
[23:39] <merlinofchaos> dereine: entities are going to have to suffer without that with GROUP BY, because the data ends up anonymized.
[23:39] <dereine> merlinofchaos: http://drupalbin.com/16848 does help you?
[23:39] <merlinofchaos> Simple query: SELECT node.type, COUNT(node.nid) FROM node GROUP BY node.type
[23:39] <dereine> the old version worked fine
[23:39] <merlinofchaos> If you try to add the primary key there, the group by breaks completely.
bendiy’s picture

He thinks that's a big query? Try dealing with an order/accounting system. :P

@dereine, #53 fixed the errors I was getting related to adding a product to the shopping cart in Drupal Commerce.

greg606’s picture

I have a major problem with displaying some fields and the last patch 936196-fix.patch didn't help at all.
Subscribe.
I described it here: http://drupal.org/node/987478

dawehner’s picture

The groupby problem will be fixed by the multiple handler.

bojanz’s picture

Status: Needs review » Reviewed & tested by the community

dereine tells me that Drupal Commerce was tested last night with the latest patch, and that everything is in order.

The patch looks fine, we have three fixes for three corner cases. Hopefully, these are the last ones.

Great job, everyone.

dawehner’s picture

Status: Reviewed & tested by the community » Fixed

Commited a while ago.

rszrama’s picture

Awesome news. Thanks so much for pursuing this. It's like you guys are my counselors, helping me with all my "relationship" problems. Bwahahaha... erm. Sorry. ; )

damien tournoud’s picture

Status: Fixed » Active

Ewww. This is seriously horrible. I tried to explain that *many times*, that mixing querying (ie. determining the list of objects to display and ordering them) and retrieval (loading fully loaded objects) is just a bad idea.

The multiple load pattern was *precisely* designed so that Views can stop being an horrible hack. Skipping all the load hooks is just really ugly.

Additionally, there have been no proof that this actually increases performance.

Let's please reconsider.

catch’s picture

Yes no-one has yet answered why grouping etc. can't add the joins to the query as oppposed to them being added unconditionally.

Also even if you insist on unconditionally adding the joins and skipping entity_load(), I will bet that any view which includes a formatted text area or any other hook_field_load() implementation will be significantly faster if field_attach_load() is used, rather than calling private field API functions and invoking hooks from views.

catch’s picture

I missed this first time around:

At this point we are now overcaching. Sorry, I will not let Views' be hamstrung for this.

This discussion can end now. There is no need for benchmarks. I am playing the Dries' card. This is how Views' will work. If people don't like it, they can write a contrib module to screw Views by returning to previews behavior by swapping out field handlers.

This is not 'overcaching', it is exactly the same amount of caching as D6. The only difference is that the cached data is stored with the rest of the cached field data in a single entry, instead of separately in the filter cache.

Also unless filter.module has changed, check_markup calls are cached anyway, so all you're going to do is hit the database (or memcache).

Sun already answered this perfectly well, but I'll repeat it here since it wasn't responded to:

Filter module has changed - the filter cache is disabled by default now. That's because almost all filtered text lives in text fields now, and text fields are using the field cache already.

If you want to call ignoring reviews and shutting down discussion (discussion that's backed up by actual data) "playing the Dries card" and "hamstringing Views" that's up to you. I will try to make this my last post on this issue.

attiks’s picture

My 2 cents:

#62 if you're talking about retrieving objects, you're right, but my use case (flot graphs - #961976: D7 Port) is to retrieve separate fields so I can manipulate them, read group by, count, ... . Before this patch I only got the eid fields in my query, not the actual values, so grouping didn't work. With this patch I'm at least able to rewrite the query in a sensible way.

In my opinion views is used in two different ways, one is to create lists of objects like explained in #62 and then it's probably wiser to use the field api's, but most people also uses views to display some fields (ex. titles of the last 5 blog items)

dave reid’s picture

Subscribe

damien tournoud’s picture

Grouping, sorting and filtering is completely unrelated to this issue.

We probably want to provide a "raw" field for grouping and Searchlight, but the standard field should use standard entity_load() mechanisms. Please don't throw out the baby with the bath water.

attiks’s picture

Damien, I'm guessing I'm missing something, without this patch the only things passed to $view->query are the selected fields and without this patch it means the eid of the field, i have to dynamically add grouping while the query is executed, but it only makes sense if I have the right column to act on.

I basically loop through all selected fields in the view and add grouping for all of them except the first one.

I you want I'll create a separate issue to discuss / clarify this

febbraro’s picture

subscribing

dawehner’s picture

Just to summarize:

* pre_render should do entity_load as before.
* the joins are still defined, so that click_sort, and groupby works in theory.
* for groupby use the click_sort "click_sort_column". Right?

yched’s picture

FYI - Discussion about removing the {field_config_entity_type}.etid abstraction : #986992: Insane etid / {field_config_entity_type} abstraction
If we do this, Views is the main (only ?) thing that gets broken in contrib. What do you guys think ?

bojanz’s picture

bojanz’s picture

Okay, coming back from #987478: Field language is not properly handled.
We are adding complexity to deal with all the issues that entity_load() handled for us automatically, and I'm not sure it's still justified, especially performance-wise, since that seems to be the main point here.

Quoting myself:

However, I've really grown to hate what we are doing here. It's nasty, and we are pretty much insane.
Note that we rebuild the fake entity object for each field, and call all the hooks for each field.
That can't be good for performance.

What I'd really like to see is #70 with some benchmarks comparing the approaches for common uses cases.
However, knowing how much free time we have VS how much work is left to be done to make Views 3 happen, that won't be for a while, and without that (concrete code + benchmarks), we are pretty much spinning in circles here. It's a complicated issue, and it makes me yearn for the simplicity of efq_views where there's no dillema of "let's double our queries, do everything twice and pray to the cache gods" VS "let's emulate a part of core and pile on hacks until we're kinda good, pray it works out".

With every additional comment here I'm feeling like I'm more harming than helping, so I'm just gonna concentrate on other release blockers.

rszrama’s picture

Hmm, I think the absence of entity_load just bit me again somewhere else. The problem I believe is related to the fact that fields are being displayed using legit display formatters that expect an $object_type and $object as parameters. I have code that depends on a property of the $object (specifically $product->sku) being present in the object, but Views is only providing a truncated object to the formatter (containing the product_id, type, and relevant field's keys). What's interesting is that this is happening even though the property I need is being displayed elsewhere on the View - so it's data that's coming from the query, but there's no way to anticipate it would be required by a particular field's display formatter to stuff it into this pseudo object.

Without rebuilding the entire $object from direct SQL, I don't see what you can do here besides an entity_load() and pass the actual object to the display formatter as expected. Otherwise you won't properly support using Field API display formatters, which seems like an unfortunate loss.

On a side note, I found it interesting in my variable dumps that the structure of the field's data passed to the formatter is different than it would have been if loaded normally, including an additional key that is defined in the schema but not present on normal load because the widget doesn't currently accommodate it. Thus, if a widget intentionally left data out of the array that was then present on the display formatter, you'd have an unexpected incongruity... but I'd have to get creative to decide how it would be harmful. : P

bojanz’s picture

Status: Active » Fixed

Ryan, thanks for reporting this.
This was the last straw, hence #1002744: Use entity_load for fieldapi field handlers.
The current patch includes both code paths (entity_load, and assembling the entity ourselves), configurable by setting "add fields to query" in the views field definition, but if entity_load proves to be not that big of a slowdown, then we can just remove the current "simulating" code, since the contrib modules really need the data in the query, and not the fake entities we are constructing for ourselves.

This whole issue would have been much easier to decide on in the beginning if someone told us "you're breaking multilingual fields and field formatters" :)

Marking this as fixed, so that communication can continue in the new issue.

alex_b’s picture

sub

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.