I discovered this issue while trying to figure out why #330633: Temporary file cleanup needs some love (UnitTest included!) was breaking the pgsql intall. I traced the error down to the fact that the files timestamp column is of type int_unsigned, and as such, this query:
SELECT fid FROM {files} WHERE status & :permanent != :permanent AND timestamp < :timestamp
causes this error. While, after changing the column type to integer (similar to the created column in the nodes table), the problem goes away.
There's a few issues here. Does the timestamp column in the files table needs to be specified as unsigned? If not, changing that will fix the immediate problem in #330633: Temporary file cleanup needs some love (UnitTest included!). The larger issue though, is that setting an unsigned flag for a column is causing PDOExceptions...
Comment | File | Size | Author |
---|---|---|---|
#46 | remove-pgsql-unsigned_int_1_5.patch | 5.68 KB | mikl |
#43 | remove-pgsql-unsigned_int_1_4.patch | 5.67 KB | mikl |
#35 | remove-pgsql-unsigned_int_1_3.patch | 5.67 KB | mikl |
#33 | remove-pgsql-unsigned_int_1_2.patch | 5.16 KB | mikl |
#32 | remove-pgsql-unsigned_int_1.patch | 5.03 KB | Josh Waihi |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedInteresting. Promoted to the PostgreSQL surge.
Comment #2
Josh Waihi CreditAttribution: Josh Waihi commentedas talked about in IRC with Crell, webchick and Damz, since it is a PostgreSQL API issue related with PDO, the best solution is to not use unsigned. this means there will be a loss of 1bit used to determine a positive or negative value - its kind of a bug anyway to use an unsigned column as a timestamp.
to make up for this bit, we'll make int -> bigint instead.
patch on its way, will put it through simpletests first
Comment #3
Josh Waihi CreditAttribution: Josh Waihi commentedholding off making a patch until unit tests pass first.
Comment #4
linuxpoet CreditAttribution: linuxpoet commentedGuys make this easy on yourself. Change the unsigned to a bigint and use a check constraint. Alternatively use a domain but check constraint should be faster.
Comment #5
holgerjakobs CreditAttribution: holgerjakobs commenteda timestamp should not be any integer type, but type timestamp, which is available in all standard conforming databases and allows calculating, comparison and validity checking.
Comment #6
Crell CreditAttribution: Crell commented@#5, sadly it does not exist in a standard way across all databases we wish to support. The date calculation functions are inconsistent, and timezone handling is horrid and not commonly supported.
We'd love to use real timestamps, if they didn't suck so much. :-)
@linuxpoet: Do you mean across all of Drupal, or just convert in the postgres driver (the current plan)? A patch for the latter would be awesome. Thanks. :-)
Comment #7
jhedstromAttached is a patch that converts 'unsigned' types to a simple check constraint. The logic was already there, but only being applied for columns of type serial.
Comment #8
Damien Tournoud CreditAttribution: Damien Tournoud commented@jhedstrom: what does that means in term of data size limit?
Comment #9
jhedstromGood call Damien...I forgot about that extra bit of storage saved on unsigned types. We can stick with the suggestion in #4, and change them all to bigint. If that makes sense, I can try and roll a patch for that.
Comment #10
Damien Tournoud CreditAttribution: Damien Tournoud commented@jhedstrom: I'm not even sure about that. In SQL, int(10) means "a ten-figure-long integer". So maybe it doesn't even matter. I would just like to have a confirmation.
Comment #11
jhedstromWhat if we kept the type the same, and simply bumped the storage limit up by one in the case of unsigned? For example, unsigned int(10) would become int(11) with the check constraint.
Comment #12
jhedstromWait, nvm, I think that missed the point...essentially I don't know enough about the internals to give the confirmation you're looking for.
Comment #13
Josh Waihi CreditAttribution: Josh Waihi commentedthere is another small issue here, as a signed int, the max number that can be stored is 2147483647, see @Shiny's post http://coffee.geek.nz/node/19437 and PHP will round down any number over that value! so even if we made the column a big int, which maybe we should, PHP would still round down unless its a string.
Comment #14
Josh Waihi CreditAttribution: Josh Waihi commentedphp will *silent* change the int to 2147483647 if it's greater than that, and used in sprintf('%d')
Comment #15
Josh Waihi CreditAttribution: Josh Waihi commentedok did anyone see this?: http://drupal.pastebin.com/f7e89895
that comes from system.install function system_install. no wonder PDO throughs the exception, it doesn't know about the datatype because natively it doesn't exist. and the term int_unsigned is false anyway because the column still is signed. postgres shouldn't be pushed around like that just to fit in with mysql.
since postgres doesn't natively support unsigned integers, I propose that we remove the 'CREATE DOMAIN ....' parts at least and let postgres just use int.
int means the highest number you can have is 2147483647 which is half of mysql's unsigned int so to combat that we can default unsigned integers to bigint which uses 8 bytes instead of 4 which means the highest number becomes 17179869176 ( I think) which is better.
Comment #16
Damien Tournoud CreditAttribution: Damien Tournoud commentedOk, agreed. So let's default:
int, CHECK > 0
bigint, CHECK > 0
bigint, CHECK > 0
About the latter: MySQL does all bigint arithmetics in a signed manner. And adds: "you should not use unsigned big integers larger than 9223372036854775807 (63 bits) except with bit functions!". So nobody will notice.
About the patch:
I think you want to remove the if statement.
Good... but we will also need an upgrade path. We will need to migrate those *_unsigned columns to their proper native types. And drop those domains.
Comment #17
jhedstromBelow is a query that will find all the fields using the domains to be dropped and altered to standard postgres types.
This query works on PG 8.2 and up, but it's been a long time since I've used any version less than that, so further testing/verification should be done on supported PG versions before it makes it into an update script.
I won't have time to add this to the patch until sometime next week, but figured I would post it here if anybody else wanted to jump on this.
Comment #18
catchWe don't support postgres below 8.3, so no need to test below that.
Comment #19
Josh Waihi CreditAttribution: Josh Waihi commented@jhedstrom: done.
Comment #20
Josh Waihi CreditAttribution: Josh Waihi commentedComment #21
jhedstromIt occurred to me after I posted that query above, that the database schema will need to be updated for the changed column types.
http://api.drupal.org/api/group/schemaapi/7
Also, the schema could be used to find the same results as the query posted above, but, in the event of a corrupted schema (if such a thing is possible), the above query would be more inclusive.
Comment #22
Damien Tournoud CreditAttribution: Damien Tournoud commented@jhedstrom: I don't see any need to change the schema.
On the patch in #19:
- "remove" should start with an uppercase, and the whole comment must end with a dot (please see http://drupal.org/node/1354 for more details).
- We need to wrap that function code in something like
if (db_engine() == 'pgsql')
.- Shouldn't we remove the old datatypes we don't use anymore?
Comment #23
jhedstrom@Damien
You're right. I forgot that, internal to the schema, this won't represent a datatype change.
Comment #24
Josh Waihi CreditAttribution: Josh Waihi commented@Damien
Done. :)
Comment #25
Josh Waihi CreditAttribution: Josh Waihi commented@Damien
Done. :)
Comment #26
CalebD CreditAttribution: CalebD commentedIt looks like there is some whitespace problems in system_update_7016. T
The two $ret[]s are one space too short of being align in the while loop.
Comment #27
Josh Waihi CreditAttribution: Josh Waihi commentedbetter formatting per CalebD's request
Comment #28
Dries CreditAttribution: Dries commentedElsewhere in Drupal, we simply define timestamp as:
so can't we simply get rid of the 'unsigned'?
Of course, this patch would still be useful otherwise.
Also, there are some code style issues here. The following line contains various issues, for example:
//only run querys if the driver used is postgres
Comment #29
Damien Tournoud CreditAttribution: Damien Tournoud commentedPlease properly reformat this text block.
Two small issues in the above lines: (1) comments are in the form "// My sentence is properly capitalized and ends with a dot.", and (2) it would be cleaner as
if (db_driver() == 'pgsql') {
.I think the '{}' should be removed.
Comment #30
Josh Waihi CreditAttribution: Josh Waihi commentedComment #31
Dries CreditAttribution: Dries commentedThere are still some code comments that are not conform. Search for "bump the".
Otherwise this looks good but I can't really test it -- no PostgreSQL installed on this machine and I'm _still_ at the airport.
Comment #32
Josh Waihi CreditAttribution: Josh Waihi commentedfair call, I've added a bit of a comment block to explain why we do what we're doing here
Comment #33
mikl CreditAttribution: mikl commentedHmm, I've looked over the patch, and I have made a few modifications to the phrasing of the comments and formatting of the SQL code, but otherwise, this looks fine. I'm currently waiting on the unit tests to run…
Comment #34
keith.smith CreditAttribution: keith.smith commentedPlease use proper case for things like "postgres" and "drupal" in code comments.
Comment #35
mikl CreditAttribution: mikl commentedI'm not entirely sure that "postgres" should be uppercase, since it's not the proper name, but I've changed it to PostgreSQL where applicable and corrected capitalisation of Drupal.
That aside, the tests still don't pass, but I don't think that it is a function of this patch, but rather that #337796: Make all tests pass on PostgreSQL hasn't been committed yet.
Comment #36
Josh Waihi CreditAttribution: Josh Waihi commentedI have run postgres unit tests with 100% pass rate - as @mikl said, other patches are required to do this. this is one of
Comment #37
CalebD CreditAttribution: CalebD commentedI do not see any other problems with this patch via visual inspection, but I'm not at home right now so I can't test on an actual PostgreSQL install.
Setting to code needs review. Will try to test it later tonight.
Comment #38
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #39
drewish CreditAttribution: drewish commentedI was going to try testing the upgrade path with this patch but got hamstrung by #351002: Cannot update D6 to D7 on pgsql
Comment #40
drewish CreditAttribution: drewish commentedwhoops, restoring the title.
Comment #41
drewish CreditAttribution: drewish commentedI think this is fundamentally sound but there's a few nitpicky things. The indenting of the
break;
lines in system_update_7016() is incorrect. They should be indented beyond thecase
statements.
Did some further testing:
* Upgraded a D6 database after using #349671: Make the postgresql driver independant of the schema and it seemed to work fine.
* Upgraded an existing D7 database.
* Tested a clean installation and it worked fine -- It also allowed me to apply the #330633: Temporary file cleanup needs some love (UnitTest included!) and get the cron tests to pass.
Comment #42
drewish CreditAttribution: drewish commentedBoth #330633: Temporary file cleanup needs some love (UnitTest included!) and #341910: file_space_used() not checking properly checking bitmapped status value (adds unit tests) are marked as postponed waiting on this.
Comment #43
mikl CreditAttribution: mikl commentedFixed the indendation as per #41.
Comment #44
drewish CreditAttribution: drewish commentedsorry, i just noticed that the breaks in includes/database/pgsql/schema.inc have the same indenting problem.
Comment #45
Crell CreditAttribution: Crell commentedI won't claim to be a Postgres expert, but on visual inspection #43 seems sane to me (modulo code formatting issues that drewish already seems to have under control).
Comment #46
mikl CreditAttribution: mikl commented#44: Oh, yeah. Missed that too. Re-rolling.
Comment #47
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks.
Comment #48
Josh Waihi CreditAttribution: Josh Waihi commentedthanks to mikl, drewish and Crell for keeping up the good work over chrsitmas!
Comment #49
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #50
Josh Waihi CreditAttribution: Josh Waihi commentedclosing
Comment #51
Scott_Marlowe CreditAttribution: Scott_Marlowe commentedI just had to point out that int(10) does NOT mean a 10 wide integer. It has no meaning outside of MySQL, such as on postgresql where int(10) generates an error, as it should. If you want a 10 wide int, use numeric(10). In MySQL, int(10), which certainly looks to mean a 10 wide int, actually means an int that's padded on retrieval with spaces to take up 10 characters width. I.e. it's a output formatting option, not an input / storage option.
I see this mistake all the time. And it's quite understandable. It's one of the most counterintuitive type definitions in existence.