Hi,

I have been struggling for several weeks now to build a View showing calculated data for stock items. Starting with a list price, discount, quantity and profit margin, I want to add columns showing cost price, cost price total, sales price total. I also want to show column aggregation, i.e. the total value of the transaction.

There are a plethora of different ways of trying to do this and not succeed. But first, to set the playing field: a node View will be comprised of fields that can be a combination of both native and CCK types. Then there are many ways to manipulate and aggregate the values of the fields. These, I believe, can be classified into three categories:

  1. Calculated fields: Views Math expression fields, Computed Fields (stored)
  2. Column aggregration: Views Calc, Views Group By
  3. Row aggregation: Views "Use grouping", Views Table Style "Group by", and configuring Attachment displays

Of course, Views provides a whole library of APIs for getting Drupal to do exactly what you want, but it makes sense to try and solve the problem with the tools at hand first.

In the use case described above, I need to do 1) and 2). In fact, I need to aggregate a column of calculate fields! So, the first step is to look at field calculation.

Field Calculation

When to do field calculations? Either it can be done and stored with the node as a Computed Field or, it can be done on the fly using Math expressions. Math expressions are not supported by Views Calc, so that is no good. Furthermore, the CTools Math function is fed rendered fields which can cause errors (#1218866: Math expressions choke on formatted fields).

The result of a Computed Field can be stored in the database when the node is saved, or generated on the fly when the node is viewed. Only stored fields are available to Views since Views does not use node_load for performance reasons. However, if you have many Computed Fields then there will be JOINs made across several tables; one for each field and the node plus whatever relationships the node is sourcing other values from. Too many of these will kill site performance.

Computed Fields can result in code duplication; each Computed Field could sum over a specific field from a list of nodes for instance. Then each Computed Field can contain code to loop and load the input nodes for processing. Fortunately, there is an API that gives custom modules an opportunity to refactor the code.

What are the options for the DIY enthuasist? This would involve looping through each row to perform calculations and adding custom columns to the View with the results. The Views API provides plenty of access points to the data and the displays, but my impression is that adding columns is not a trivial task and I have not found any good examples to guide me. One workaround could be to duplicate one of the input columns in the View (say 'List price') with the right formatting and with the desired title (say 'Cost price total') and then to overwrite the data in the hook_views_pre_render for example. Ugly. Is there a better way? Fortunately, this workaround does not generate any more SQL. See this interesting thread: http://stackoverflow.com/questions/1684971/too-many-tables-mysql-can-onl...

Column aggregation

Since Views Calc does not have support for summing Math expressions, mere mortals will despair at writing a hook_views_pre_render script that can do it instead. (Math expressions do not live with the rest of the View 'results'.) So that pretty much rules them out.

Views Calc seemed like it was fit for the bill for handling column aggregation, but it is buggy and in need of better integration with Views 3 as discussed in #1118604: D7 version broken - WSOD and other errors. For instance I have reported two bugs that are complete showstoppers for me.

For the DIY enthuasist, it is back to the Views APIs. Interestingly, I figured out how to override a Views Calc table from mymodule and with mymodule/views-calc-table--myview.tpl.php

/**
 * Implements hook_theme
 * See http://views-help.doc.logrus.com/help/views/api-default-views
 * See http://drupal.org/node/352970 (Views 2 theming)
 * Module weight has been set to 11 in the database.
 */
function mymodule_theme($existing) {
  return array(
      'views_calc_table__myview' => array (
      'path' => drupal_get_path('module', 'mymodule'),
      'arguments' => array('view' => NULL, 'options' => NULL, 'rows' => NULL),
      'template' => 'views-calc-table--myview',
      'original hook' => 'views-calc-table',
    ),
  );
}
/*
 * Implements moduleName_preprocess_hook
 * http://drupal.org/node/223430
 */
function mymodule_preprocess_views_calc_table__myview(&$vars) {
  template_preprocess_views_calc_table(&$vars);
}

However the bugs still remain, so calculating the totals will have to be done using the Views APIs in any case.

Row aggregation

This seems to work as expected. Views Calc supports Views Table "Group by" option.

In the use case described in the beginning, row aggregation would mean sorting the articles by type and providing inline sub-total rows with a grand total row at the bottom of the table.

article X1
article X2
---------
Subtotal
---------
article Y1
article Y2
---------
Subtotal
-----------
Grand total
===========

Alternatively, the "Use grouping" option could be used in a summary table to show subtotal prices for article Xs and article Ys, with an Attachment display using "Group by" showing a breakdown of each group:

SUMMARY PAGE
article Xs
article Ys
-----
Grand total
=====

ATTACHMENT
Group: article Xs
article X1
article X2
------
Total
======
Group: article Ys
article Y1
article Y2
------
Total
======

Or a combination of both. Of course, this is all in the future. A custom solution using Views APIs is the only way at the moment.

Epilogue

Am I completely off base?? My own conclusions are that one can get very lost and confused and waste a lot of time trying things that don't work. I actually toyed with the idea of not using Views at all for this use case.

I suspect I cannot tell the wood from the trees anymore and would appreciate some insight into how this problem should be approached. Is there an oracle that I have not yet visited?

In the meantime, what I will try now is to

  1. Create extra columns and overwrite them using hook_views_pre_render
  2. Override the Views template or replace it altogether and do the more advanced row aggregation as described above.

Some issues that are a priority for me:

  1. Views Calc really needs to be brought up to date and to work with Math expressions
  2. Make Math expressions look more like numeric fields so that style plugins like Views Calc do not need to implement special handling
  3. Or, document how to work with Math expressions in the APIs. Where do they live in the View object?
  4. The features described here are very Excel-like. It would be nice to have documented patterns for manipulating the data.
  5. In fact, could data manipulation in the View deserve its own section in the Views roadmap (#646284: Status of Views 3 Roadmap)??
  6. Views could warn about JOIN bloat even without running a preview.

Yes, I could make issues out of all these but my impression is that there is a lack of direction in this area, hence this long rant.

Well, that's it. This summary of my experiences with Views will surely contains some misconceptions and mistakes for which I would be grateful to be corrected on.

Good night.

Comments

merlinofchaos’s picture

I believe the best solution for you is to create simple versions of the fields you want to perform calculations on that do not contain commas, and set those fields to 'exclude'. Then use those fields to create your calculation.

kevin.mcnamee@mailbox.org’s picture

Yes, that makes more sense. But how does one aggregate math expression columns?

merlinofchaos’s picture

There's a little known, accidental, hidden feature that actually allows you to create a running total out of a math expression. In fact, I need to write something up about this, because I didn't even know it worked this way until I did it by accident.

If you use the token for the current math expression field, you will actually get the result from the last row.

If you then add another column to it, you can keep a running total.

The thing you can't currently do, that I know of, is provide a summary column at the bottom containing the running total (at this time) but it's possible that a smartly created area plugin could read the tokens and get the data you want. It would be a pretty small piece of code to do that.

merlinofchaos’s picture

Oh and since the math expression is evaluated during the render process, there's no way to aggregate it any other way -- obviously you can't aggregate a PHP expression in the database. :/

kevin.mcnamee@mailbox.org’s picture

Ok! And doing the calculations on the fly is faster than retrieving them from the database.

merlinofchaos’s picture

Not necessarily, but the calculations are pretty fast. If you're doing a LOT of records the database probably can do it faster, but at that point you'd need some pretty customized fields to do the calculation, and Views doesn't support that kind of SQL. :/

kevin.mcnamee@mailbox.org’s picture

Could you provide an
example of how to manipulate maths data via the APIs?

merlinofchaos’s picture

You would probably need to create a field handler. I won't get into how to create a field handler here, there are several examples you can basically copy. The short version is you need the hook_views_api(), hook_views_data(), hook_views_handlers() and the handler class.

In the query() method of the handler class, you can do something like:

  $this->ensure_my_table();
  // Simple formula to add two fields on the same table together.
  $formula = $this->table_alias . '.field1 + ' . $this->table_alias . '.field2';
  $this->field_alias = $this->query->add_field(NULL, $formula, $this->table . '_' . $this->field);

That's the basic way to add a formula field directly in SQL. There's a lot of things you can do there. Important note:

1) never reference a table directly. Always either use the table that's added with the field, or use ensure_table() and use the alias it returns. It is very important to do this because relationships can cause the same table to be added multiple times in different ways, and the base name of the table could be ambiguous.

2) You do have to provide some kind of default alias (that's the 3rd argument to add_field) if you don't provide a table (which is the 1st argument) when adding a formula.

kevin.mcnamee@mailbox.org’s picture

I'm not sure I follow. Let me clarify my question. If I use the built-in Views Math expressions to create aggregated column data, how do I access the result from say hook_views_pre_render or mymodule_preprocess_views_view_table__myview? As I mentioned this data is not present in view->result (hence the reason Views Calc does not support it I presume).

In general, I would also like to know why Math expressions are not stored and handled like other fields, such as CCK number data. For instance, there is no "Format" options à la number_field_formatter_info available for how controlling how the values are presented. It would also allow style plugins to provide generic handling for all table data (such as Views Calc). What is the thinking behind this?

dawehner’s picture

The result should be in $view->style_plugin->rendered_fields but pre_render is might to early for it.

merlinofchaos’s picture

dereine is correct. Since the math expression calculates during render, it happens during the render phase, and pre_render is too early.

In any phase where it would be available, $view->style_plugin->rendered_fields would contain the data.

Is there any more we can answer you with, or should this issue be marked fixed now?

kevin.mcnamee@mailbox.org’s picture

Status: Active » Fixed

This should be enough to on. Thanks.

kevin.mcnamee@mailbox.org’s picture

Status: Fixed » Active

The following code will augment the Views Calc table with the SUM of a Math expression column.

function mymodule_preprocess_views_calc_table__myview(&$variables) {
  template_preprocess_views_calc_table(&$variables);

  $expression_total = 0;
  $rendered_fields = $variables['view']->style_plugin->rendered_fields;
  foreach ($rendered_fields as $field) {
    $expression_total += $field['expression'];
  }
  $variables['totals']['SUM']['expression_1'] = (string)$expression_total;
}

However, the function is forced to work with rendered data which means that the Math expression cannot cannot use any formatting, e.g. a "Thousands separator". Otherwise, a value such as "35,123" will be treated as "35" when summing.

This leads to the very cumbersome solution of having to define the following columns:

  1. List price (formatted, displayed)
  2. List price (unformatted, hidden)
  3. Quantity
  4. Math expression [expression] (= List price unformatted x Quantity) (unformatted, hidden)
  5. Math expression [expression_1] (= [expression]) (formatted, displayed)

The same applies to the suggestion in #3, which is just as bad:

  1. List price (formatted, displayed)
  2. List price (unformatted, hidden)
  3. Quantity
  4. Math expression [expression] (= List price unformatted x Quantity) (unformatted, hidden)
  5. Math expression [expression_1] (= [expression]) (formatted, displayed)
  6. Math expression [expression_2] (= [expression_2] + [expression]) (formatted, hidden)

And that is just for aggregating List prices. Is this really how this is supposed to be done?? Is the alternative to start coding the suggestion in #8? Will that play nice with Views Calc?

dawehner’s picture

That's not a support issue queue for the views calc module. Please assume that we don't know every module on drupal.org

kevin.mcnamee@mailbox.org’s picture

I am not trying to get help with Views Calc. It is a plugin just like any other, but it happens to be the one I am using and serves to demonstrate the problem of performing column aggregations on Math expressions.

merlinofchaos’s picture

Status: Active » Fixed

I'm sorry but the system uses tokens in order to get other columns. I realize it's a nuisance that it's using rendered data, but that part of the system cannot be changed. There is now say "Oh and for this token we need a raw value". The tokens are what they are.

The fact is, raw values are not necessarily available and not necessarily in a form you want.

This isn't going to change. I realize it is a nuisance.

kevin.mcnamee@mailbox.org’s picture

OK, that's fine. Now I know that I have reached the technical limit of what I can do with Math expressions and that this behaviour is fixed for the forseeable future.

I assume then that the alternative to using Maths expressions is to engage with the Views APIs as per #8. Bring it on ...

Ashford’s picture

kmcnamee, please consider adding the main ideas here under Views_Calc in the Documentation with a comparison of each method -perhaps with a scenario of when you would use each method.

kevin.mcnamee@mailbox.org’s picture

I am getting to grips with the handler APIs now. If I can use the query() method to populate view->result with the product of two other fields (e.g. list price x quantity) then I expect the rest of the pieces will fall into place (i.e. Views Calc will display the SUM of the products).

Having worked with different development frameworks over the years (Zope, Rails), one is used to retreiving values from the database and post-processing them before presenting the result (a typical MVC architecture). With Views there is only the SQL query and the rendered data to work with (I stand happily corrected if this is not the case) which caused a lot of confusion until now.

merlinofchaos’s picture

That is correct. There isn't a phase between the data retrieved from the query and the actual rendering of the data. The ability to manipulate the raw data is definitely something that has been desired, and we've slowly been trying to make it more possible but it was never part of the initial designs.

kevin.mcnamee@mailbox.org’s picture

Using the suggestion in #8, I created a crude field handler with

  function query() {
    $this->ensure_my_table();

    $formula = 'node_data_field_sai_list_price.field_sai_list_price_value * node_data_field_sai_quantity.field_sai_quantity_value';
    $this->field_alias = $this->query->add_field(NULL, $formula, 'list_price_total');
  }

The resultant SQL query looks like

SELECT node.title AS node_title,
node.nid AS nid,
node_data_field_sai_list_price.field_sai_list_price_value AS node_data_field_sai_list_price_field_sai_list_price_value,
node.type AS node_type,
node.vid AS node_vid,
node_data_field_sai_quantity.field_sai_quantity_value AS node_data_field_sai_quantity_field_sai_quantity_value,
node_data_field_sai_list_price.field_sai_list_price_value * node_data_field_sai_quantity.field_sai_quantity_value AS list_price_total
 FROM node node 
 LEFT JOIN content_field_common_assembly_node node_data_field_common_assembly_node ON node.vid = node_data_field_common_assembly_node.vid
 LEFT JOIN content_field_sai_list_price node_data_field_sai_list_price ON node.vid = node_data_field_sai_list_price.vid
 LEFT JOIN content_field_sai_quantity node_data_field_sai_quantity ON node.vid = node_data_field_sai_quantity.vid
 WHERE (node.status = 1) AND (node.type in ('assembly_component')) AND (node_data_field_common_assembly_node.field_common_assembly_node_nid = 2249 )

This correctly populates the the view->result array with a 'list_price_total' element containing the product of the list price and quantity. However, when I try to use Views Calc to perform a SUM of the new field, I get the following error:

user warning: Unknown column 'views.list_price_total' in 'field list' query: SELECT DISTINCT NULL AS node_title, NULL AS node_data_field_sai_list_price_field_sai_list_price_value, SUM(node_data_field_sai_list_price.field_sai_list_price_value) AS SUM__node_data_field_sai_list_price_field_sai_list_price_value, NULL AS node_data_field_sai_quantity_field_sai_quantity_value, NULL AS list_price_total, SUM(views.list_price_total) AS SUM__list_price_total FROM node node LEFT JOIN content_field_common_assembly_node node_data_field_common_assembly_node ON node.vid = node_data_field_common_assembly_node.vid LEFT JOIN content_field_sai_list_price node_data_field_sai_list_price ON node.vid = node_data_field_sai_list_price.vid LEFT JOIN content_field_sai_quantity node_data_field_sai_quantity ON node.vid = node_data_field_sai_quantity.vid WHERE (node.status = 1) AND (node.type in ('assembly_component')) AND (node_data_field_common_assembly_node.field_common_assembly_node_nid = 2249 ) AND (node.status = 1) AND (node.type in ('assembly_component')) AND (node_data_field_common_assembly_node.field_common_assembly_node_nid = 2249 ) in /usr/local/apache2/htdocs/drupal6/sites/all/modules/views/plugins/views_plugin_query_default.inc on line 1150.

The reason that Views Calc is trying to sum over 'views.list_price_total' is because the mymodule_views_data places the field in the "Global" group:

  $data['views']['list_price_total'] = array(
    'title' => t('List price total'),
    'help' => t('Some help.'),
    'field' => array(
      'handler' => 'mymodule_handler_field_numeric',
    ),
  );

The reason for this is that Views Calc is performing a separate database query to SUM over all of the desired fields. To do this, the fields must actually exist as columns in the database. Calculated values are not available to Views Calc. So where oh where can summing over calculated fields be performed?

Well, on the Views Calc project page the answer is there for all to see. Views Calc has a feature called "Custom Dynamic Fields" which actually does exactly what I want. Wow. This seems obvious now.

In summary, field calculations can be performed (and viewed) using

  1. Computed fields (stored)
  2. Math expressions
  3. views_handler_field::query() override
  4. Views Calc "Custom Dynamic Fields" (CDF)

Only 1) and 4) can be used to aggregate column data (e.g. SUM) using Views Calc. Both methods can be used to perform calculations on calculated fields; however Views Calc CDF does not support relationships. In contrast, Computed Fields has an API that allows refactoring for complex use cases (e.g. instead of having to do A+B=D and A+B+C=E; one can do A+B=D and D+C=E). In fact, Computed fields are the most flexible solution for performing calculations. However, if there are many Computed fields to be displayed, then there can be many JOINs in the query which could affect site performance.
Math expressions can only manipulate rendered data; switch to Views Calc CDF if access to the raw data is needed.
Option 3) should not be used.

kevin.mcnamee@mailbox.org’s picture

Here is a distillation of the information in this thread.

Comparison of methods for performing field calculations à la Excel
Method Advantages Disadvantages Executed Relationship support Column aggregation (using Views Calc)
Views Math expression Simple. Quick. Only has access to rendered data which can cause it to fail. Real-time Yes (Transparent) No
Computed Fields (stored) Complete flexibility. Faster(?) as results are precalculated. Each field requires a JOIN which can lead to performance problems if there are many such fields. Pre-calculated Yes (DIY) Yes
Views Calc "Custom Dynamic Fields" Relatively simple and quick. No API => no code reuse (i.e. refactoring). Real-time No Yes
views_handler_field::query() override Complete flexibility. Complex. Calculations must be done using SQL via Views wrapper functions. Real-time Yes (DIY) No

The trade-offs between the different methods will require developers to give careful consideration to how they construct their Content Types, what fields to define, and what node (revision) references to create. For instance, the more atomic the field data is and the more node relationships that are used to store the data, then the harder Views will have to work to generate the calculated data on the fly. Computed fields are a way to precalculate some values, but too many computed fields will be counter-productive.

Instead, use Computed fields to reduce the number of fields to be retrieved from the database by Views, and use Views Calc CDF to generate as many calculated fields as desired.

If node relationships are involved and column aggregation via Views Calc is required, then Computed fields are guaranteed to work. An equivalent on-the-fly method would use one of the real-time methods above, plus a View override similar to that discussed in #13.

merlinofchaos’s picture

Component: Views Data » Documentation
Assigned: Unassigned » esmerel
Category: support » task
Status: Fixed » Active

I'm marking this active again.

You've done such an outstanding job compiling this data that I think we need to do something with it. Either documentation or a Drupal planet article. It definitely needs to be read by more people than just issue queue searchers.

Assigning to esmerel as a documentation task.

kevin.mcnamee@mailbox.org’s picture

I am glad that it will be of use to others.

Since Views Calc can only act on database columns, there is no advantage to using the query() override. Instead, the render() override will still allow users to perform calculations on the raw data of the fields retreived from the database. While users must add each field of interest to the View (via the UI), the calculations can be done using PHP rather than SQL which allows for more complex operations and better refactoring.

Method Advantages Disadvantages Executed Relationship support Column aggregation (using Views Calc)
views_handler_field::render() override Complete flexibility. Calculations can be performed on existing raw field data using PHP. Easier to refactor. Moderately complex. Real-time Yes (Transparent) No
kevin.mcnamee@mailbox.org’s picture

I have proposed two feature requests that I think would aid users in this area:

mroscar’s picture

hidden feature that actually allows you to create a running total out of a math expression... can you explain more?

dawehner’s picture

About the hidden feature: I created another issue to document that: #1547924: Document field_math feature previous value

DevElCuy’s picture

Views Raw SQL is another option: http://drupal.org/sandbox/sreynen/1542466

It is developed for D7 but small enough to be ported easily.

DanZ’s picture

Speaking as someone fairly new to this problem, let me emphasize some points:

#1: Standard Views aggregation does not work on non-DB fields! E.g., computed entity properties can't be aggregated. This is because that form of aggregation is handled entirely in the database via SQL. I wish I had known this before I spent several days trying to get it to work.

#2: Fields derived from SQL formulas do work with standard Views aggregation. These can come from Views Raw SQL or a custom handler. Here's one that works.

/**
 * Field handler: displays a price multiplied by the quantity.
 */
class uc_order_handler_field_money_total extends uc_order_handler_field_money_amount {

  /**
   * Overrides views_handler_field::query().
   */
  function query() {
    $this->ensure_my_table();

    $table = $this->table_alias;
    $field = $this->real_field;
    $params = $this->options['group_type'] != 'group' ? array('function' => $this->options['group_type']) : array();
    $this->field_alias = $this->query->add_field(NULL, "$table.$field * $table.qty", $this->table . '_' . $this->field, $params);
  }

}

The key is $this->query->add_field(). The second argument defines the formula. The third argument defines the alias name.

Note that the field definition in hook_views_data() must define 'real_field' for this to work.

#3 If you define a custom field handler, you'll need a matching custom sort handler if you want sorting to work. The key point is picking up the alias name to sort on.

class uc_order_handler_sort_total extends views_handler_sort {
  /**
   * Called to add the sort to a query.
   */
  function query() {
    $this->ensure_my_table();
    // Add the field.
    $this->query->add_orderby(NULL, NULL, $this->options['order'], $this->table . '_' . $this->field);
  }
}

The fourth argument of $this->query->add_orderby() defines the alias name to sort on.

#4 If you want to filter on a formula, add a filter handler that extends something like views_handler_filter_group_by_numeric. Because you will filter on a formula, the SQL can't use WHERE. It has to use HAVING. This handler does that.

class uc_order_handler_filter_total extends views_handler_filter_group_by_numeric {
  /**
   * Called to add the filter to a query.
   */
  function query() {
    $this->ensure_my_table();
    $field = $this->table . '_' . $this->field;

    $info = $this->operators();
    if (!empty($info[$this->operator]['method'])) {
      $this->{$info[$this->operator]['method']}($field);
    }
  }
}

#5 The query is built by calls to views_plugin_query methods. See those docs to learn how the add_field(), add_orderby(), and so forth all work.

RdeBoer’s picture

I thought this might be useful for those who read and contributed to this thread: http://drupal.org/project/views_aggregator

drupalchu’s picture

ohh man, as non sql i just got squirmy!

geek-merlin’s picture

Title: Best practice? Math expression fields, Computed Fields, Views Calc, hook_views_pre_render and more » Best ways to compute and aggregate views fields
Paul Lomax’s picture

This is pretty old guide, Views has a baked in 'Maths expression' field these days which can perform a lot of these tasks. Be aware that when working with currency fields you will likely have to create duplicated 'hidden' fields that output the raw currency values you can use to calculate with.

Just in case this helps anyone else looking for some up to date info.

Chris Matthews’s picture

Status: Active » Closed (outdated)

The Drupal 6 branch is no longer supported, please check with the D6LTS project if you need further support. For more information as to why this issue was closed, please see issue #3030347: Plan to clean process issue queue