We try to support pretty much everything you can support with valid SQL.
More about the different parts later.

Proposed API:

// Aggregate the entities.
$query->aggregate($field, $function, &$alias = NULL);
// Groupby entities.
$query->groupby($field);
// Condition works as before.
$query->condition($field, $operator, $value)
// Sort has no meaning so it'd throw an exception
// But execute() can't  be used to avoid a mess
$query->executeAggregate();

Beside simple aggregation/groupy you might also want to filter/sort by aggregated values.

Proposed API for conditions/sort of aggregated values Nr.1 :

But the results so far can also be filtered and sorted. How do we do that? Add new sortAggregate($field, $function) methods? Would that do an automated aggregate($field, $function) as well? What about conditons? conditionAggregate($field, $function, $operator, $value)?

// Get the amount of published nodes by type, sorted by the amount of nodes,
but only for types with more then 3 nodes.

$query
  ->aggregate('nid', 'count')
  ->condition('status', 1)
  ->groupby('type')
  ->conditionAggregate('nid', 'count', 3)
  ->sortAggregate('nid', 'count', 'DESC')
  ->execute()
  

The main advantage of this idea would be, that you can build up your query in a row,
and not need some context in mind.

Proposed API for conditions/sort of aggregated values Nr.2:

// Get the amount of nodes by type, sorted by the amount of nodes,
but only for types with more then 3 nodes.
$query = entity_query('node')
  ->condition('status', 1)
  ->aggregate('nid', 'count', $alias)
  ->groupby('type')
entity_aggregate_query($query)
  ->condition($alias, 3)
  ->sort($alias, 'DESC')
  // The return value of this is the same as entity_query()->executeAggregate().
  ->execute();
  

One advantage of this way would be that it might be easier to distinct between content
entity query and config entity query. Yeah, just asking for entity_aggregate_query($query) where $query is a config entity would immediately throw an exception. But then again, conditionAggregate and sortAggregate is only two points vs this single one so this benefit is not huge.

Internally replace $alias with the aggregate function 'cos SQL doesn't allow for aliases...
In mysql you could use the field alias which is part of the select statement, though
this does not work for PGSQL, see http://www.postgresql.org/message-id/18195.1038530280@sss.pgh.pa.us

Proposed API for conditions/sort of aggregated values Nr. 3:


entity_aggregate_query('node')
  ->condition('status', 1)
  ->aggregate('nid', 'count', $alias)
  ->groupby('type')
  ->conditionAggregate($alias, 3)
  ->sortAggregate($alias, 'DESC');
  ->execute();

Result-Set:

As we can't choose a proper key once aggregation is on, the key of the result
will NOT be the entity id, but just an ascending number, see example.
The developer should be aware about that, so let's document that.


$result = $query
  ->aggregate('nid', 'count')
  ->condition('status', 1)
  ->groupby('type')
  ->executeAggregate();

$result[0] = array('count_nid' => 3, 'type' => 'page');
$result[1] = array('count_nid' => 1, 'type' => 'poll');
$result[2] = array('count_nid' => 4, 'type' => 'story');

Note that the attached patch is obsolete, so disregard it.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

chx’s picture

Status: Active » Needs review
Damien Tournoud’s picture

Status: Needs review » Active

This is really important.

We have seen in multiple entreprise project that there is a strong need for Drupal to cleanly support different storage backends. That means that Views needs to stop generating SQL queries itself, and just generates an EntityFieldQuery. As a consequence, that means that EntityFieldQuery needs to be rich enough so that most (if not all) types of queries.

Aggregation support is one of the features that is currently missing in EntityFieldQuery.

We designed this API so that it's sufficiently powerful while remaining sufficiently simple to be easily implemented on alternative storage backends.

bojanz’s picture

Status: Active » Needs review

This is definitely the last step needed to make EntityFieldQuery a truly powerful and versatile solution for querying entities and makes me optimistic about starting the effort to switch Views to it by default (with a "legacy" fallback if needed).

The patch in #0 also contains the EFQ relationship patch (RTBC), so I suggested to chx that he posts a version of the patch with just the aggregation, rolled on top of the relationship changes.

My main concern when first discussing the idea was whether other backends (such as mongo) would be able to implement it, but with Damien and chx on the problem, I have no doubt that's sufficiently taken into account.

DanZ’s picture

Does this mean that the current Views (Drupal 7, Views 3) cannot perform aggregation on non-DB (computed) entity properties?

bojanz’s picture

@DanZ
Correct.

dawehner’s picture

It's so great to see progress on getting EFQ as a full replacement of pure dbtng querying

+++ b/core/lib/Drupal/Core/Entity/Query/QueryBase.phpundefined
@@ -195,6 +212,18 @@ public function pager($limit = 10, $element = NULL) {
+  function aggregation($aggregation_function, $aggregation_property, $group_by_property, $langcode = NULL) {

Let's assume you want to have 3 aggregation min/max/average and 2 groupby values, how would this would look like for multiple groupby values, as you have to specify both aggregation and groupby here. Shouldn't we split them up?

Do we mixed up the relationship with the aggregation support issue :) Like 80% of the patch is about relationships

chx’s picture

Yeah the patch is mostly the relationships patch but the refactor from the relationship patch is necessary -- I am not touching the original code and the relationship patch is due core this weekend finally.

dawehner’s picture

Posted a new issue summary after quite some discussion with chx.

For everything you follows this issue, please read it and comment on it, especially on which of the three (or even more) sort/condition suggestions we should go for.

dawehner’s picture

Issue summary: View changes

after quite some discussion, now a proper suggestion.

chx’s picture

chx’s picture

FileSize
257.46 KB

I finally feel like there's something to post. The QueryAggregate results is not yet fully hashed and so the rather limited tests are unlikely to work yet but the API is in place, I feel like I am finished with rearranging the many pieces. I wish I had Traits so do not condemn QueryBase for not implementing interfaces or implementing too much of them. What any IDE will show depends on the wrapper (entity_query/entity_query_aggregate) @return interface anyways.

chx’s picture

FileSize
43.32 KB

Opsie.

Status: Needs review » Needs work

The last submitted patch, 1854708_10.patch, failed testing.

dawehner’s picture

If someone is reviewing that, please don't complain about documentation, as I'm fixing some those + write more tests.

DanZ’s picture

Just as something to keep in mind....

I needed a report on product purchases. Each record had a purchase price and a quantity. I wanted a View that would take a SUM() of r->price * r->qty, but there was no good way to do that. I defined a computed property of the entity, and that didn't work at all (#14). I ended up being able to compute it by defining a handler that did the computation with an SQL formula and using SUM aggregation on the formula, but that's messy.

This looks like the place to solve this mess for D8.

On the other hand, in some computations, SQL really is the best place to do it. It could avoid memory issues on the Web server by moving them to the DB server. So, it would be nice to have some sort of an option to generate a computed field from either PHP or SQL calculations, depending on the situation.

dawehner’s picture

Status: Needs work » Needs review
FileSize
62.98 KB

Status: Needs review » Needs work

The last submitted patch, drupal-1854708-15.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
64.46 KB

Here we go.

Status: Needs review » Needs work

The last submitted patch, 1854708_17.patch, failed testing.

chx’s picture

Status: Needs work » Needs review

Re #14 that definitely sounds out of scope. I'd be glad to explore expressions in a followup but the API is tricky to say the least -- see how field and function is separated here? That's because field transforms into a SQL table.column and so you need to find a way to specify an expression where the fields can be extracted and replaced with. Definitely followup -- please do file a followup and if you have an API idea do not hesitate to share with me. I'd be more than glad to work on implementing it.

Status: Needs review » Needs work

The last submitted patch, 1854708_17.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
63.67 KB

Fun with PHP versions. Also, I simplified the asserts.

Status: Needs review » Needs work

The last submitted patch, 1854708_20.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
71.52 KB

Poor ConfigQuery. And, I hate PHP 5.3.5 but oh well.

Status: Needs review » Needs work

The last submitted patch, 1854708_23.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
71.35 KB

I see that we will be here for a while.

Status: Needs review » Needs work

The last submitted patch, 1854708_25.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
68.16 KB

I simply hope we eventually go 5.3.10. For now, I simply removed the execute() override in QueryAggregateInterface as it is for doxygen only. I also resolved a lot of @todos in the code.

To make this work with 5.3.5 we would need to break out most of the QueryInterface into a QueryBaseInterface too minus execute and have Query and QueryAggregate interfaces inherit that and then move most of the SQL Query into a QueryBase class not implementing any interfaces and have both Query and QueryAggregate inherit it.

damiankloip’s picture

Status: Needs review » Needs work

Firstly, this is looking great.

+++ b/core/lib/Drupal/Core/Config/Entity/Query/QueryFactory.phpundefined
@@ -50,4 +51,8 @@ public function get($entity_type, $conjunction, EntityManager $entity_manager) {
+   public function getAggregate($entity_type, $conjunction, EntityManager $entity_manager) {
+    throw new QueryException('Aggregation over configuration enitties is not supported');
+  }

Indentation is a bit messed up, and needs a docblock.

+++ b/core/lib/Drupal/Core/Entity/Query/ConditionAggregateBase.phpundefined
@@ -0,0 +1,27 @@
+abstract class ConditionAggregateBase extends ConditionFundamentals implements ConditionAggregateInterface {

We should give this class a docblock too.

+++ b/core/lib/Drupal/Core/Entity/Query/ConditionAggregateInterface.phpundefined
@@ -0,0 +1,84 @@
+  /**
+   * Implements \Countable::count().
+   *
+   * Returns the size of this conditional. The size of the conditional is the
+   * size of its conditional array minus one, because one element is the the
+   * conjunction.
+   */
+  public function count();

Cant the interface just be interface ConditionAggregateInterface extends \Countable?. Then we don't need this method here really.

+++ b/core/lib/Drupal/Core/Entity/Query/ConditionBase.phpundefined
@@ -7,41 +7,8 @@
+abstract class ConditionBase extends ConditionFundamentals implements ConditionInterface {

docblock

+++ b/core/lib/Drupal/Core/Entity/Query/ConditionInterface.phpundefined
@@ -32,7 +32,7 @@ public function count();
+   * @param string|ConditionInterface $field

Does that need the full namespace? Not sure.

+++ b/core/lib/Drupal/Core/Entity/Query/QueryAggregateInterface.phpundefined
@@ -0,0 +1,160 @@
+   * SUM,AVG,MIN,MAX.COUNT.

Spaces after functions and there is currently a '.' between the last two.

+++ b/core/lib/Drupal/Core/Entity/Query/QueryAggregateInterface.phpundefined
@@ -0,0 +1,160 @@
+   * @todo What about GROUP_CONCAT support?

Do we know what we are doing with this?

+++ b/core/lib/Drupal/Core/Entity/Query/QueryBase.phpundefined
@@ -283,4 +329,77 @@ public function addMetaData($key, $object) {
+   * Generates an alias for a field and it's aggregated function.

This sounds weird.

+++ b/core/lib/Drupal/Core/Entity/Query/QueryFactory.phpundefined
@@ -49,4 +49,21 @@ public function get($entity_type, $conjunction = 'AND') {
+   * Returns a query object for a given entity type.

Returns an aggregate query object for a ...

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/ConditionAggregate.phpundefined
@@ -0,0 +1,85 @@
+class ConditionAggregate extends ConditionAggregateBase {

docblock.

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/ConditionAggregate.phpundefined
@@ -0,0 +1,85 @@
+    $sqlQuery = $conditionContainer instanceof SelectInterface ? $conditionContainer : $conditionContainer->sqlQuery;

I always like to use parenthesis around stuff like this (.. instanceof ..) ? .. But that is just presonal preference. I just find it easier to read.

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/ConditionAggregate.phpundefined
@@ -0,0 +1,85 @@
+        $type = strtoupper($this->conjunction) == 'OR' || $condition['operator'] == 'IS NULL' ? 'LEFT' : 'INNER';

Same with this - ()?

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/ConditionAggregate.phpundefined
@@ -0,0 +1,85 @@
+   * Implements Drupal\Core\Entity\Query\ConditionInterface::exists().
...
+   * Implements Drupal\Core\Entity\Query\ConditionInterface::notExists().

Needs prefix '\'

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/ConditionAggregate.phpundefined
@@ -0,0 +1,85 @@
+   * @param array $condition

Does this need a bit about the param too?

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/ConditionAggregate.phpundefined
@@ -0,0 +1,85 @@
+        break;
...
+        break;

I don't think we usually have spaces here in switches?

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/Query.phpundefined
@@ -7,13 +7,50 @@
+   * An array of fields keyed by the field alias. Each entry correlates to the
+   * arguments of \Drupal\Core\Database\Query\SelectInterface::addField(), so
+   * the first one is the table alias, the second one the field and the last
+   * one optional the field alias.

Does this have to be a 1 line summary, then a separate para?

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/Query.phpundefined
@@ -140,21 +230,80 @@ public function execute() {
+   * Actually create the result.

Does this need a bit of tlc?

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/Query.phpundefined
@@ -140,21 +230,80 @@ public function execute() {
+  protected function getSqlField($field, $langcode) {

No docblock.

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/Query.phpundefined
@@ -140,21 +230,80 @@ public function execute() {
+   * Does the query requires GROUP BY and ORDER BY MIN/MAX.

Probably needs 'returns' or 'determines' in this line.

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/QueryAggregate.phpundefined
@@ -0,0 +1,183 @@
+ * Definition of Drupal\field_sql_storage\Entity\QueryAggregate.

prefix '\' and Contains.

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/QueryAggregate.phpundefined
@@ -0,0 +1,183 @@
+    $alias = str_replace('.', '_', $sql_field);
+    if (substr($alias, 0, 6) === 'field_' && substr($field, -6) !== '_value' && substr($alias, -6) === '_value') {
+      $alias = substr($alias, 0, -6);
+    }

This looks pretty confusing, is there a better way to do this? Not sure.

dawehner’s picture

Status: Needs work » Needs review
FileSize
12.4 KB
69.89 KB

Thanks for the review! It's a bit hard to review part of it, as a lot of the initial not perfectly documented entity query is part of it. Hopefully we can improve it more later.

Cant the interface just be interface ConditionAggregateInterface extends \Countable?. Then we don't need this method here really.

Well, so we should at least me able to document what count() should do, what about the change in this version of the patch?

Does that need the full namespace? Not sure

Noone knows what the current standard is, I adapt it, so there is less problem to get that one in.

+ * SUM,AVG,MIN,MAX.COUNT.

I still want std_dev :)

Do we know what we are doing with this?

See #1903386: Adding groupbByConcat support to dbtng

This looks pretty confusing, is there a better way to do this? Not sure.

Maybe it already helps to document exactly what this is doing.

Status: Needs review » Needs work

The last submitted patch, drupal-1854708-29.patch, failed testing.

chx’s picture

FileSize
735 bytes
69.24 KB

> Cant the interface just be interface ConditionAggregateInterface extends \Countable?.

Yeah. PHP <5.3.9: Remove count() if you extend \Countable.

chx’s picture

Status: Needs work » Needs review
damiankloip’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
11.34 KB
70.56 KB

Ok, I have just been testing this out quite alot, and chx has answered any remaining question on this.

I consider this rtbc. I just fixed a couple of docs, as I thought that was easier than another round of typo fixes. No code changed.

EDIT: This functionality is great!

amateescu’s picture

I haven't read the whole patch, but the doc fixes from the interdiff in #33 look good so I can at least rtbc that :)

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 1854708-33.patch, failed testing.

damiankloip’s picture

I've had it with these CKEditor fails already!

damiankloip’s picture

Status: Needs work » Needs review

#33: 1854708-33.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, 1854708-33.patch, failed testing.

damiankloip’s picture

Status: Needs work » Reviewed & tested by the community

And back again now that we have done the random test failure dance.

webchick’s picture

Assigned: Unassigned » catch

This looks like a catch-patch. :)

dawehner’s picture

@danz

After some talking with chx we came to the conclusion that this is too much of an edge case to be supported by core.
One possible better solution would be to pre-calc some of the information and then run a normal EQ or do your needed stuff in contrib. We shouldn't rebuild everything of sql in EQ.

DanZ’s picture

After some talking with chx we came to the conclusion that this is too much of an edge case to be supported by core.

In that case, make sure that properties/fields that can't be aggregated don't look like they can be aggregated.

In 7.x, this can be done in the Views controller by overriding the use_group_by() function. Have it return FALSE if aggregation doesn't work on the field.

Yes, the documentation is terrible, but that's what it does.

damiankloip’s picture

@DanZ I don't think that is related to this issue. If we need to review this in views, we can, but in a different issue (feel free to open one). This is just about adding this aggregation support for entity query.

catch’s picture

Title: EntityQuery aggregation support » Change notice: EntityQuery aggregation support
Priority: Normal » Critical
Status: Reviewed & tested by the community » Active

I could only find 2 nitpicks, can be fixed in a follow-up since this will need a change notice anyway. This looks great and I wanted this in Drupal 7 the other week then realised I'd not reviewed this patch yet :(

+++ b/core/lib/Drupal/Core/Config/Entity/Query/QueryFactory.phpundefined
@@ -50,4 +52,24 @@ public function get($entity_type, $conjunction, EntityManager $entity_manager) {
+   public function getAggregate($entity_type, $conjunction, EntityManager $entity_manager) {

Arghh dreditor didn't copy the line but typo for 'enitties' I think it was.

+++ b/core/modules/field_sql_storage/lib/Drupal/field_sql_storage/Entity/Query.phpundefined
@@ -140,21 +232,80 @@ public function execute() {
+  protected function getSqlField($field, $langcode) {
+    if (!isset($this->tables)) {
+      $this->tables = new Tables($this->sqlQuery);
+    }
+    $base_property = "base_table.$field";
+    if (isset($this->sqlFields[$base_property])) {
+      return $base_property;
+    }
+    else {
+      return $this->tables->addField($field, 'LEFT', $langcode);
+    }

No docs at all?

Also before we close this it'd be great if we could use this somewhere - comment count?

dawehner’s picture

FileSize
1.04 KB

Opened a follow up for that one: http://drupal.org/node/1918254#comment-7070182

Comment count is probably a bad one, because comment as field is currently in work, I tried some queries in the storage controller, though this didn't worked.

dawehner’s picture

Started a small change notice: http://drupal.org/node/1918702

chx’s picture

Title: Change notice: EntityQuery aggregation support » EntityQuery aggregation support
Priority: Critical » Normal
Status: Active » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Updated issue summary.