Posted by hunziker on April 13, 2011 at 7:28am
14 followers
Jump to:
| Project: | Drupal Commerce |
| Version: | 7.x-1.x-dev |
| Component: | Developer experience |
| Category: | task |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | postponed |
Issue Summary
As discussed in other issues, there are some leaks in the current price implementation. Therefore I like to discussed possible solutions for the Drupal Commerce Version 2.x.
Concerns to discuss:
- Price storage (int vs. float/decimal)
- Component concept
- Multiple prices for the same products in different currencies
- Storage of the currency (Should we store for each price the related currency? / Performance vs. Redundancy)
- Database model normalization (Serialized fields are not normalized)
- Integration with other modules such as Views / Entity API / Rules.
Comments
#1
I actually don't see anything bad in your list. Those are design decisions that make a lot of sense to me.
#2
I try to explain why I have an uneasy feel with the price handling:
In Drupal the database is used to exchange date between different modules. So you do not need to use the API. In some cases (for example the Views module) it is much faster to use directly the database as the provided API. So if we use integers instead of a decimal, we provide a default price formatting, that no one expects and there for we are always depending on the API. We get a higher coupling of data with code. Additionally I don't see any advantage of using integers instead of decimals. (PHP can handle floats properly if you add the bcmath extension.)
The component concept is pretty nice. But as I explain to the previous point, we get a hight dependency between data and the code, when we use serialized data in the database. A really hard problem with the current concept of serialization is the computation of all taxes over a given time range. If you want to know how many tax of type X you have to pay for the month Y, you need to select the orders over the given month via SQL. Then you need to unserialize the data field with PHP and select the price component X. If the order has some price in the component X, you need to add them up. If you have 10'000 order in the database, this takes a lot of time. If we had a normalized database schema, we can easily implement such a use case by using the database joining mechanism. We can join the appropriate tables together and sum them up with a aggregation function. There are other use cases where we have problems to implement it with the serialized approach. By side the aggregation of different currencies is not that easy independent of the storage type (integer/decimal).
This is a new feature, but with the current component approach really hard to implement. You can have only one price per item. It would be nice if we can define multiple prices per item. A normalized database schema allows the implementation by minor changes.
At the moment for each price the currency is stored. When you have an order with different currency the first line's currency is taken. The other cases is not handled in the code yet, because there is no logic to resolve this case. Depend on this, we can't use this additional information. Why store it then? We get only maintaining problems with it, especially because it is redundant information. So I propose to remove it or then move it into a separate table and then you can have really two currencies per item. This allows then the above mentioned use case. As mentioned above summing up of amounts in different currency is near impossible. Hence we can go for a single currency store, but then we can remove the currency code. The conversion between the currency can then be done on the user interface layer.
I mentioned this point many times in this post.
As mentioned in the first point. With the group by feature of Views we get some problems of the price formatting. We could avoid this by choosing the right storage format. The rules modules allows the handling of entities and there attributes. When we choose a better format, we do not need to handle all the interaction manually. This can be done by a generic approach.
I hope my explanations are clear. My intention is not to corrupt the Drupal Commerce project. I like to point out what can be improved for a better solution. I have develop for many open source commerce solutions (Magento, osCommerce, xt:Commerce, Zen Cart, OXID eSales, Xonic, VirtueMart, Magento...) written in PHP. I see many things done well and many things done poorly in the mentioned solutions. Most of the Drupal Commerce project is really amazing compared to the existing solutions, but if there are some central things, like the price handling, which are not fully sophisticated designed, the solution has difficulties to get an acceptable market share. I like to help improving the solution with my know how gain over the past.
#3
+1 for database normalization - still having nightmares from UberCart's attributes
#4
Your premise is wrong. There is no such thing as
"In Drupal the database is used to exchange date between different modules."Querying the database directly is discouraged, and we are going toward less normalization (in the sense "data split in more tables"), not more.#5
Yep, we'll need to address some of these items going forward. Moving price components into a separate table has been on my hit list for some time for 2.x, primarily for the purpose of generating easier tax reports. Responding to points in particular,
To summarize, I do think there are some things to address in the component API / data storage, but I don't think there are things to address with the very basic price structure. I'd like to work out a solution for improved price component storage in 2.x, but I'm a little worried about the way components don't currently have forced rounding.
I'm putting this postponed b/c it's not a 1.x issue, but feel free to respond to my points above. I just might not get back to you immediately as I'm really trying to push out a beta3. : )
#6
To add my thoughts on price storage there are problems with storing net values with only two decimal places when working with VAT.
Although no longer a correct example there will be similar ones. The UK rate of VAT was 17.5% and £2.99 might be a common price of a product including VAT. This cannot be calculated if the net price only has two decimal places because the rounding should be applied to the inclusive of VAT price.
£2.54 x 1.175 = £2.98
£2.55 x 1.175 = £3.00
but
£2.545 x 1.175 = £2.99
#7
We are developing a project management system for real estate development. Evaluating the commerce price module we stumbled upon the maximum possible value 21,474,836.47 USD/EUR/...(2147483647 cents) using a 32bit system. We need more money... ;-) ...and exact VAT calculations as aforementioned by dwkitchen #6 are important for us too.
To get rid of this limitations concerning the price storage/implementation we vote for something like:
+1 BigInteger/bcmath/gmp and DECIMAL(19,4) on the database-side
(perhabs as an option if bcmath/gmp is not available and with fallback to an integer implementation)
Having 64bit php integer with DECIMAL(15,4) might be an alternative without bcmath/gmp. That's what we are trying in our project now.
#8
#7 has a good point: we should move from int to bigint on the database side (int is generally 32bit, bigint is generally 64bit; at least that's true on both MySQL and PostgreSQL).
Most of the world is 64bit today, so I don't think we should care that much about 32bit platforms. If you want bigger number handling, deploy on 64bit.
The type of scenario mentioned in #6 should not happen. The components that make up the price are always stored in the maximum floating-point precision (... that PHP is able to serialize).
#11
Here's a thought: GnuCash, the GPL accounting system, follows the same approach as Commerce in storing price values in integer form (rather than floats) to ensure proper rounding and arithmetic. But GnuCash takes it one step further by providing two fields in the database for each price: a value, and a denominator. The purpose of the denominator is to indicate what the decimal precision of the value field should be.
So for example, to represent $99.99 in GnuCash, the following is stored in the GnuCash data file:
value = 9999
denominator = 100
Then, it's just a simple matter of loading both values and dividing the value by the denominator.
Perhaps Commerce can implement a similar storage schema, and automatically perform the conversion for other modules when data is being inserted.
This would solve the potential issues involved with having a store that requires some products to have 2 points after the decimal, and others to have 3. This was raised by hunziker in issue: #1124416: Revise the way we handle price amount values (specifically this comment)
@Damien Tournoud: Also worth noting that GnuCash uses 64-bit integers for both the value and the denominator columns.
#12
Isn't that basically what we achieve by storing the currency code with the price amount? We know how the data is stored based on that, and because currency formatting is pluggable, the way the decimal points are expressed for products using the same currency can be modified there at the display layer. I'm not sure I agree with having different storage for the same currency, though I wouldn't rule it out.
#13
Oh true, that makes sense. Although the use case that I have in mind still might not be covered:
I'm working with a site built in Ubercart, and recently the client said they needed some products to have 3 decimal places, rather than 2 (but most just use 2). They sell products in very large bulk quantities, so the 3rd decimal place is necessary in some cases. It works in Ubercart with just a bit of custom formatting on the display side, because Ubercart uses a decimal storage type with 5 decimal points, so the database can handle it. But it wouldn't work in Commerce because both products would still be using the same currency, so that restricts the decimal points for all the products in a currency.
So I guess the method I'm suggesting would untether the "number of decimal points" from the currency, and instead make it unique to each product. This may create a bit more overhead, but it would provide a lot more flexibility in terms of product-by-product prices.
#14
I'm very interested in determining the best way to do this, as I'm currently in the process of building a bookkeeping module (Ledger) that has similar requirements for exact precisions. I started it with floating values, but quickly learned the pitfalls of doing so (see #1477534: Store values in integer form to avoid issues with float rounding). Luckily the module is still in DEV so I can switch things up easily. ;-)
I've been following Commerce's lead in a lot of the data architecture decisions I've been making in Ledger, but I think this one needs more consideration. It seems that looking into the way projects like GnuCash deal with decimals (and other things like multiple currencies and commodities) makes sense.
#15
Yeah, I see your point re: the ease of solving it at the data storage level. I'm not a huge fan of storing the data for the same currency in two different ways, but I wouldn't rule it out out of hand. For example, it can be a problem is you suddenly decide to change the precision of your storage as it is right now. It's almost like we'd need to recommend just using a new currency for that if it came to it; USD, US1, US2, etc. for each different precision under the current architecture. So the question is, would that really be any better? : P
#16
a big +10 for "storing decimal denominator" like said in #11
and confirming: not handling decimal prices out of the box is a knockout for commerce for quite some customers
i think i understand your point ryan, that this might fit in the existing model as another currency.
but in fact it is not: switching denominator is a trivial migration, while switching currency is not.
my gut feeling is denominator should be handles separately from currency, but only be configurable globally per currency, to keep things manageable.
#17
Since MySQL version 5.0.3 (see http://dev.mysql.com/doc/refman/5.0/en/problems-with-float.html) the floating point problem is solved in MySQL (when you are using "decimal" as field type). So we talk here about a PHP issue. Drupal 7 requires a relative new database version. We introduce here no new requirement. If someone uses other databases (such as PostgreSQL), she or he must check if the database systems support decimals correctly.
So why we do not solve the issue at the point, when PHP gets the data out of the database? As mentioned by #11, we can store a decimal denominator. But in contrast to #11 I propose to do the conversion between decimal an integer on the fly, when the price is selected.
The conversion can look like:
SELECT price * decimal_dominator AS price FROM ....So if you need to do some operation on the data, you have not to care about conversion things. The integration with views works perfectly, because only in case of operation like addition, subtraction, multiplication or division we get imprecise results.
If we need to do some operations directly on the data, we can use the database facilities to do that.
#18
I think I like where you're heading with that, hunziker. We'd just need to ensure that would work fine throughout Views / Rules / the Entity system. I fear we'd run into the same silly issues we have with price fields and the serialization / unserialization of data arrays. : (
#19
@rszrama: You may have right.
I checked how magento does solve the issue. As I can see they use a decimal (12,4). It seems they have no problem with that. And for regular calculation they use normal PHP operations. (+, *, / and -).
#20
@hunziker: the issue with using a decimal type field in the database, though, is that you are then restricted to that many decimal points for ALL of your currencies. So if you wanted to add a currency (or just a product) that required 5 decimal points, you're out of luck.
It seems to me that adding a denominator column is the best solution, because it allows all prices to define how many decimal points they need themselves, without relying on a constant value anywhere (whether it be in the database structure, or in Commerce configuration).
I like your idea about calculating the final price within the query also, although I think you meant to divide, not multiply. So it would look something like this:
SELECT price / denominator AS price ....#21
@m.stenta: Is it really a requirement that we can build applications with such small amounts?
The calculation depends on how you define the denominator.
#22
@hunziker: Why build an application with any restrictions if you can make them unrestricted without too much more overhead? I don't know how much overhead the extra denominator column would add, but it would certainly remove all restriction on decimal points across the board.
And of course I agree, in most cases no one will ever need to work with 0.00000001 units, but one "currency" that does require that much precision is Bitcoin. And who knows, maybe there will be a country that experiences hyper-deflation one day, and requires such small divisions. ;-)
I'm not sure I understand what you mean about defining the denominator, but I can't see where multiplication would work. Unless you defined the denominator as a decimal itself, but then you've got the same problem all over again: how many decimal points should it be? As far as I can tell, the denominator would have to be an integer value like 100, 1000, 10000, etc.
#23
For instance, bitcoins (yeah yeah) have 8 significant decimal places. A bitcoin might be worth 3$ US today, but it could (ha!) be 3000$ or 30,000$. Since there's a limited amount of bitcoins, those decimal places matter a lot in the long run.
#24
> Is it really a requirement that we can build applications with such small amounts?
prices with fractions of cents are not uncommon to me.
and further, configurable denominator would happily :-)) solve a VAT tax problem, stated short "trivial calculus hows that there are target total prices that can't be reached with any integer net price"
see #1362002: vat tax rounding problem
#25
Cross posting this #1350528: Add support for Commerce module price fields.
It seems there currently is no simple way to create a price based index for search and filtering by price is a requirement on many projects.
#26
I just finished converting the Ledger module to a 2-column integer storage scheme. The two columns are called value_num (numerator) and value_denom (denominator). Previously it was a single 'value' field of type 'float' (I quickly learned the pitfalls of float arithmetic). Here is the related issue: #1477534: Store values in integer form to avoid issues with float rounding
In Ledger's case, it mainly involved adding custom load/save logic to the entity controller class's load() and save() methods. load() simply loads both values, and divides value_num by value_denom, and sticks the result into $entity->value, which is what the original field was called. save() does some tricky string manipulation to convert it back to 2 numbers before sticking them back in the database. Not sure if it's the "best" way to do it, but it works. I'd love to hear opinions:
<?php
// Count the number of decimal places.
$precision = strlen(substr(strrchr($entity->value, '.'), 1));
// Calculate the denominator based on the precision.
$entity->value_denom = pow(10, $precision);
// Calculate the numerator based on the denominator.
$entity->value_num = $entity->value * $entity->value_denom;
?>
Then, I just had to use a custom Views field handler to perform the same basic calculation whenever the field was used. This is what it looks like:
ledger_handler_field_value.inc
<?php
/**
* @file
* Contains the value field handler.
* This handler can be used by modules that store values in value_num and value_denom fields.
*/
class ledger_handler_field_value extends views_handler_field_numeric {
// Add fields to the query.
function query() {
// Ensure the main table for this field is included.
$this->ensure_my_table();
// Formula for calculating the final value, by dividing value by denom.
$formula = $this->table_alias . '.value_num / ' . $this->table_alias . '.value_denom';
$this->field_alias = $this->query->add_field(NULL, $formula, $this->table . '_' . $this->field);
}
}
?>
Hopefully it wouldn't take much more than that to convert Commerce to the same model. As long as the load() function provides you with the same 'value' field ('price', 'amount', whatever it is) on the entity, all the rest of the code should work like before. Obviously you'll also have to get rid of the original code that was performing the conversion from minor units. I'm sure there's other stuff as well.
Let me know if I can help at all with this... if you do decide to take a 2-column approach.