Top Visitor Page problem with PostgreSQL 8.3
jaydub - March 2, 2008 - 14:20
| Project: | Drupal |
| Version: | 7.x-dev |
| Component: | statistics.module |
| Category: | bug report |
| Priority: | critical |
| Assigned: | Unassigned |
| Status: | active |
| Issue tags: | PostgreSQL Code Sprint, PostgreSQL Surge |
Description
As reported in http://drupal.org/node/220064 and in the PostgreSQL 8.3 release notes http://www.postgresql.org/docs/8.3/static/release-8-3.html there is a problem with certain queries in PostgreSQL 8.3 due to the fact that PostgreSQL 8.3 no longer automatically casts columns to match the data type in JOIN columns or in this case the column in a CONCAT operation.
The CONCAT function created for PostgreSQL in the Drupal install
if (!db_result(db_query("SELECT COUNT(*) FROM pg_proc WHERE proname = 'concat'"))) {
db_query('CREATE OR REPLACE FUNCTION "concat"(text, text) RETURNS text AS
\'SELECT $1 || $2;\'
LANGUAGE \'sql\''
);
}expects TEXT data types. In previous PostgreSQL versions a non-text column would have been automatically cast to text. This will no longer work in 8.3.
The attached patches fix one query from the statistics module where CONCAT is used. I have included a patch for 6.1 and for 5.7.
| Attachment | Size | Status | Test result | Operations |
|---|---|---|---|---|
| statistics-5.7.module.patch | 756 bytes | Idle | Failed: Failed to apply patch. | View details | Re-test |
| statistics-6.1.admin_.inc_.patch | 731 bytes | Idle | Failed: Failed to apply patch. | View details | Re-test |

#1
This issue still exists in HEAD so I rerolled the patch.
Works fine on MySQL 5.0 but this needs to be tested on PostgreSQL 8.3 and I don't have that setup at the moment.
So: Calling all Postgres fans...
- Arie
#2
Promoted to the PostgreSQL surge.
#3
the stats module tests are currently broken in HEAD, applying this patch didn't fix it :(
#4
ok so tests are failing because PDO doesn't like floats on int_unsigned columns. this maybe linked with this http://drupal.org/node/339588
#5
this patch fixes the what I was commenting about above.
Statistics Module: 36 passes, 0 fails, and 0 exceptions
#6
The last submitted patch failed testing.
#7
I find it hard to believe that this patch failed as all it does is type cast a float to a integer. by the looks of the testbed result, it looks rather like something went wrong
#8
Marked the following issues as a duplicate of this:
#312475: SQL error in postgres on /admin/logs/visitors
#297358: Statistics Module Not Postgres Ready
#150015: Error on page "Top visitors in the past 3 days" with PostgreSQL
#9
re-submitting previous patch that failed because there is no reason that it should have failed
#10
edit: posted on wrong node
updating status
#11
Expecting the DB layer to handle float to int type coercion is not reasonable. We can expect module authors to only pass in sane data. Trivial fix, the bot likes, it, yay.
#12
I've committed this to CVS HEAD but it probably needs work for Drupal 6. Updating version and status. Thanks!
#13
Apparently, the patch in #1 was supposed to also go in but wasn't merged with the patch in #9 (sorry Dries). Right now this is an incomplete commit to HEAD, so I'm bumping to critical. Reposting patch from #1 to bring it down to the bottom for review and retesting.
#14
I have tested this and it works as stated in comment #3-#5
#15
Committed #13 to HEAD. Marking back to needs porting for 6.x.
#16
#17
#18
I think this patch does not fix the problem correctly, because in PostgreSQL 'char' is equivalent to 'char(1)', e.g., compare
mysql> SELECT CAST(57 AS CHAR);
+------------------+
| CAST(57 AS CHAR) |
+------------------+
| 57 |
+------------------+
with
psql> SELECT CAST(57 AS CHAR);
bpchar
--------
5
(1 row)
I have posted a patch that works for me here: http://drupal.org/node/412402. For convenience, I attach it to this node, too. Tested with Drupal 6.x.
#19
doing a sub select will slow the query down, can we cast as text instead?
josh=# SELECT CAST(57 AS TEXT);text
------
57
(1 row)
#20
TEXT is not supported by MySQL. In fact, I have found no way to cast an integer to a string that is compatible with both MySQL and PostgreSQL.
#21
we could write a special query function for the search module in database.pgsql.inc and database.mysql-common.inc to cast specifically for each DB??
#22
I did a quick test by running the two queries of the last patch above against a table with ~10^6 rows. Curiously enough, the query with the subselect in the from clause is more or less twice as fast as the other (with uid cast to TEXT) in PostgreSQL 8.3. In MySQL 5.0, instead, the query using concat() is around 10 times faster than the other.
My test is certainly very limited in scope but it suggests that, if performance is an important factor, the way to go is to optimize the query according to the DBMS, i.e., something like:
if ($db_type == 'mysql' or $db_type == 'mysqli') {$sql_cnt = "SELECT COUNT(DISTINCT(CONCAT(uid, hostname))) FROM {accesslog}";
elseif ($db_type == 'pgsql') {
$sql_cnt = "SELECT COUNT(*) FROM (SELECT DISTINCT UID, HOSTNAME FROM {accesslog}) AS FOO";
}
else {
# default query
}
#23
OK, well regardless, it seems D7 has a bug in it in regards to PostgreSQL not type-casting like MySQL does so we'll need to go back there and fix it. I've noticed there are a few Drupal casting issues about so I've opened #419858: Database Type Casting Per Database Connection to maybe remedy this issue for D7. Lets get HEAD right then port back to D6.
Postponed till #419858: Database Type Casting Per Database Connection is resolved.
#24
Had this issue as well. Hoping it will get fixed with the next release..
#25
trim(cast(10 AS char(10)))
works for postgresql and mysql
Since nothing seems changed across version it could be ported.
see #449736
#26
I suggest changing the signature of the CONCAT() function in PostgreSQL. It is currently:
CREATE OR REPLACE FUNCTION "concat"(text, text) RETURNS text AS\'SELECT $1 || $2;\'
LANGUAGE \'sql\''
Let's change that to:
CREATE OR REPLACE FUNCTION "concat"(anyelement, anyelement) RETURNS text AS\'SELECT CAST($1 AS TEXT) || CAST($2 AS TEXT);\'
LANGUAGE \'sql\''
I have not tested this, but I believe it can solve this issue quite elegantly: no developer working on MySQL will figure out that we need to explicitely TRIM(CAST()) every whenever calling CONCAT() on non character columns.
#27
As discussed in IRC, @DamZ's function alteration won't work as anyelement parameters must be of the same datatype, however we can define a family of functions with the same name 'concat' that handle various datatypes as arguments:
CREATE OR REPLACE FUNCTION "concat"(text, int) RETURNS text AS 'SELECT $1 || $2::text;' LANGUAGE 'sql';CREATE FUNCTION "concat"(int, text) RETURNS text AS 'SELECT $1::text || $2;' LANGUAGE 'sql';
We'd need to add a function definition for every datatype possibility - well, the datatypes we use in the schema api.
#28
Solution start to look more complicated than the problem and pretty hakish.
After all... CONCATenating int and text (and CASTing) shouldn't be something doesn't look something that shine for good design.
I see adding tons of CONCAT just to overcome this as polluting the code.
This is not just a performance problem. This is going to break a postgresql feature: stricter data type checking.
Better getting rid of the casts. Or when there isn't any viable solution use a "local" trick as
trim(cast(10 AS char(10)))
Do we really need to write that query with that CAST?
I know it is an historical query... but well that stat doesn't shine. It is mixing and splitting anonymous users and IP.
If I want to know what are the top visitors I can't. The top visitor may have several entries depending on the IP. If I want to know what is the top IP... I can't. If I want to know how many "distinct users I can't.
BTW this query looks the only one using CAST in core.
#29
Time to get this in (if not already). Marking for PostgreSQL Code Sprint
#30
subscribing
#31
I think you can define the implicit cast in 8.3, although I haven't tried it yet (rebuilding my test system):
http://www.postgresql.org/docs/8.3/static/sql-createcast.html
#32
Yes, you can redefine the casts, but to do so you also need to redefine a bunch of old functions and you don't get a lot of benefit. If SQL rewriting was still a concern for D7, we could rewrite all concat() functions to use the || operator, which works as long as one input is text. That said, it's probably best to either abstract that in the SelectQuery interface somehow, or just nix any old queries using the concat() function.
#33
Marked #544848: admin/reports/visitors as a duplicate of this issue.
#34
I think it make sense to add implict numeric-to-string casts to PostgreSQL.