I discovered this issue while trying to figure out why #330633: Temporary file cleanup needs some love (UnitTest included!) was breaking the pgsql intall. I traced the error down to the fact that the files timestamp column is of type int_unsigned, and as such, this query:

SELECT fid FROM {files} WHERE status & :permanent != :permanent AND timestamp < :timestamp

causes this error. While, after changing the column type to integer (similar to the created column in the nodes table), the problem goes away.

There's a few issues here. Does the timestamp column in the files table needs to be specified as unsigned? If not, changing that will fix the immediate problem in #330633: Temporary file cleanup needs some love (UnitTest included!). The larger issue though, is that setting an unsigned flag for a column is causing PDOExceptions...

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Title: pgsql: column type of int_unsigned causing pdoexception » PostgreSQL surge #5: column type of int_unsigned causing pdoexception

Interesting. Promoted to the PostgreSQL surge.

Josh Waihi’s picture

Assigned: Unassigned » Josh Waihi

as talked about in IRC with Crell, webchick and Damz, since it is a PostgreSQL API issue related with PDO, the best solution is to not use unsigned. this means there will be a loss of 1bit used to determine a positive or negative value - its kind of a bug anyway to use an unsigned column as a timestamp.
to make up for this bit, we'll make int -> bigint instead.

patch on its way, will put it through simpletests first

Josh Waihi’s picture

holding off making a patch until unit tests pass first.

linuxpoet’s picture

Guys make this easy on yourself. Change the unsigned to a bigint and use a check constraint. Alternatively use a domain but check constraint should be faster.

holgerjakobs’s picture

a timestamp should not be any integer type, but type timestamp, which is available in all standard conforming databases and allows calculating, comparison and validity checking.

Crell’s picture

@#5, sadly it does not exist in a standard way across all databases we wish to support. The date calculation functions are inconsistent, and timezone handling is horrid and not commonly supported.

We'd love to use real timestamps, if they didn't suck so much. :-)

@linuxpoet: Do you mean across all of Drupal, or just convert in the postgres driver (the current plan)? A patch for the latter would be awesome. Thanks. :-)

jhedstrom’s picture

Status: Active » Needs review
FileSize
727 bytes

Attached is a patch that converts 'unsigned' types to a simple check constraint. The logic was already there, but only being applied for columns of type serial.

Damien Tournoud’s picture

@jhedstrom: what does that means in term of data size limit?

jhedstrom’s picture

Status: Needs review » Needs work

Good call Damien...I forgot about that extra bit of storage saved on unsigned types. We can stick with the suggestion in #4, and change them all to bigint. If that makes sense, I can try and roll a patch for that.

Damien Tournoud’s picture

@jhedstrom: I'm not even sure about that. In SQL, int(10) means "a ten-figure-long integer". So maybe it doesn't even matter. I would just like to have a confirmation.

jhedstrom’s picture

What if we kept the type the same, and simply bumped the storage limit up by one in the case of unsigned? For example, unsigned int(10) would become int(11) with the check constraint.

jhedstrom’s picture

Wait, nvm, I think that missed the point...essentially I don't know enough about the internals to give the confirmation you're looking for.

Josh Waihi’s picture

there is another small issue here, as a signed int, the max number that can be stored is 2147483647, see @Shiny's post http://coffee.geek.nz/node/19437 and PHP will round down any number over that value! so even if we made the column a big int, which maybe we should, PHP would still round down unless its a string.

Josh Waihi’s picture

php will *silent* change the int to 2147483647 if it's greater than that, and used in sprintf('%d')

Josh Waihi’s picture

ok did anyone see this?: http://drupal.pastebin.com/f7e89895
that comes from system.install function system_install. no wonder PDO throughs the exception, it doesn't know about the datatype because natively it doesn't exist. and the term int_unsigned is false anyway because the column still is signed. postgres shouldn't be pushed around like that just to fit in with mysql.

since postgres doesn't natively support unsigned integers, I propose that we remove the 'CREATE DOMAIN ....' parts at least and let postgres just use int.
int means the highest number you can have is 2147483647 which is half of mysql's unsigned int so to combat that we can default unsigned integers to bigint which uses 8 bytes instead of 4 which means the highest number becomes 17179869176 ( I think) which is better.

Damien Tournoud’s picture

Ok, agreed. So let's default:

  • unsigned smallint to int, CHECK > 0
  • unsigned int to bigint, CHECK > 0
  • and unsigned bigint to bigint, CHECK > 0

About the latter: MySQL does all bigint arithmetics in a signed manner. And adds: "you should not use unsigned big integers larger than 9223372036854775807 (63 bits) except with bit functions!". So nobody will notice.

About the patch:

       if ($spec['type'] == 'serial') {
         $sql .= " CHECK ($name >= 0)";
       }
-      else {
-        $sql .= '_unsigned';
-      }

I think you want to remove the if statement.

-    // Create unsigned types.
-    if (!db_result(db_query("SELECT COUNT(*) FROM pg_constraint WHERE conname = 'int_unsigned_check'"))) {
-      db_query("CREATE DOMAIN int_unsigned integer CHECK (VALUE >= 0)");
-    }
-    if (!db_result(db_query("SELECT COUNT(*) FROM pg_constraint WHERE conname = 'smallint_unsigned_check'"))) {
-      db_query("CREATE DOMAIN smallint_unsigned smallint CHECK (VALUE >= 0)");
-    }
-    if (!db_result(db_query("SELECT COUNT(*) FROM pg_constraint WHERE conname = 'bigint_unsigned_check'"))) {
-      db_query("CREATE DOMAIN bigint_unsigned bigint CHECK (VALUE >= 0)");
-    }
-

Good... but we will also need an upgrade path. We will need to migrate those *_unsigned columns to their proper native types. And drop those domains.

jhedstrom’s picture

Below is a query that will find all the fields using the domains to be dropped and altered to standard postgres types.

This query works on PG 8.2 and up, but it's been a long time since I've used any version less than that, so further testing/verification should be done on supported PG versions before it makes it into an update script.

SELECT c.relname AS table, a.attname AS field, pg_catalog.format_type(a.atttypid, a.atttypmod) AS type FROM pg_catalog.pg_attribute a LEFT JOIN pg_class c ON (c.oid =  a.attrelid) WHERE pg_catalog.pg_table_is_visible(c.oid) AND c.relkind = 'r' AND pg_catalog.format_type(a.atttypid, a.atttypmod) LIKE '%unsigned%';

I won't have time to add this to the patch until sometime next week, but figured I would post it here if anybody else wanted to jump on this.

catch’s picture

We don't support postgres below 8.3, so no need to test below that.

Josh Waihi’s picture

FileSize
4.14 KB

@jhedstrom: done.

Josh Waihi’s picture

Status: Needs work » Needs review
jhedstrom’s picture

Status: Needs review » Needs work

It occurred to me after I posted that query above, that the database schema will need to be updated for the changed column types.

http://api.drupal.org/api/group/schemaapi/7

Also, the schema could be used to find the same results as the query posted above, but, in the event of a corrupted schema (if such a thing is possible), the above query would be more inclusive.

Damien Tournoud’s picture

@jhedstrom: I don't see any need to change the schema.

On the patch in #19:

 /**
+ * remove custom datatype *_unsigned in postgres
+ */

- "remove" should start with an uppercase, and the whole comment must end with a dot (please see http://drupal.org/node/1354 for more details).

- We need to wrap that function code in something like if (db_engine() == 'pgsql').

- Shouldn't we remove the old datatypes we don't use anymore?

jhedstrom’s picture

@Damien

You're right. I forgot that, internal to the schema, this won't represent a datatype change.

Josh Waihi’s picture

@Damien

Done. :)

Josh Waihi’s picture

Status: Needs work » Needs review

@Damien

Done. :)

CalebD’s picture

It looks like there is some whitespace problems in system_update_7016. T

+    }
+   $ret[] = update_sql('ALTER TABLE {' . $row->table . '} ALTER COLUMN ' . $row->field . ' TYPE ' . $datatype); 
+   $ret[] = update_sql('ALTER TABLE {' . $row->table . '} ADD CHECK (' . $row->field . ' >= 0)');
+  }

The two $ret[]s are one space too short of being align in the while loop.

Josh Waihi’s picture

better formatting per CalebD's request

Dries’s picture

Elsewhere in Drupal, we simply define timestamp as:

      'timestamp' => array(
        'description' => 'A Unix timestamp indicating when this batch was submitted for processing. Stale batches are purged at cron time.',
        'type' => 'int',
        'not null' => TRUE,
      ),

so can't we simply get rid of the 'unsigned'?

Of course, this patch would still be useful otherwise.

Also, there are some code style issues here. The following line contains various issues, for example:
//only run querys if the driver used is postgres

Damien Tournoud’s picture

Status: Needs review » Needs work
-    // We create some custom types and functions using global names instead of
+    // We create some functions using global names instead of
     // prefixing them like we do with table names. If this function is ever
     // called again (for example, by the test framework when creating prefixed
     // test databases), the global names will already exist. We therefore avoid
     // trying to create them again in that case.

Please properly reformat this text block.

+  //only run querys if the driver used is postgres
+  if (db_driver() != 'pgsql') return array();

Two small issues in the above lines: (1) comments are in the form "// My sentence is properly capitalized and ends with a dot.", and (2) it would be cleaner as if (db_driver() == 'pgsql') {.

+    $ret[] = update_sql('ALTER TABLE {' . $row->table . '} ALTER COLUMN ' . $row->field . ' TYPE ' . $datatype); 
+    $ret[] = update_sql('ALTER TABLE {' . $row->table . '} ADD CHECK (' . $row->field . ' >= 0)');

I think the '{}' should be removed.

Josh Waihi’s picture

Status: Needs work » Needs review
FileSize
4.69 KB
Dries’s picture

Status: Needs review » Needs work

There are still some code comments that are not conform. Search for "bump the".

Otherwise this looks good but I can't really test it -- no PostgreSQL installed on this machine and I'm _still_ at the airport.

Josh Waihi’s picture

Status: Needs work » Needs review
FileSize
5.03 KB

fair call, I've added a bit of a comment block to explain why we do what we're doing here

mikl’s picture

Hmm, I've looked over the patch, and I have made a few modifications to the phrasing of the comments and formatting of the SQL code, but otherwise, this looks fine. I'm currently waiting on the unit tests to run…

keith.smith’s picture

Status: Needs review » Needs work

Please use proper case for things like "postgres" and "drupal" in code comments.

mikl’s picture

Please use proper case for things like "postgres" and "drupal" in code comments.

I'm not entirely sure that "postgres" should be uppercase, since it's not the proper name, but I've changed it to PostgreSQL where applicable and corrected capitalisation of Drupal.

That aside, the tests still don't pass, but I don't think that it is a function of this patch, but rather that #337796: Make all tests pass on PostgreSQL hasn't been committed yet.

Josh Waihi’s picture

I have run postgres unit tests with 100% pass rate - as @mikl said, other patches are required to do this. this is one of

CalebD’s picture

Status: Needs work » Needs review

I do not see any other problems with this patch via visual inspection, but I'm not at home right now so I can't test on an actual PostgreSQL install.

Setting to code needs review. Will try to test it later tonight.

Damien Tournoud’s picture

Title: PostgreSQL surge #5: column type of int_unsigned causing pdoexception » PostgreSQL surge #5: remove *_unsigned types and rework their sizes
drewish’s picture

Title: PostgreSQL surge #5: remove *_unsigned types and rework their sizes » PostgreSQL surge #5: column type of int_unsigned causing pdoexception

I was going to try testing the upgrade path with this patch but got hamstrung by #351002: Cannot update D6 to D7 on pgsql

drewish’s picture

whoops, restoring the title.

drewish’s picture

Status: Needs review » Needs work

I think this is fundamentally sound but there's a few nitpicky things. The indenting of the break; lines in system_update_7016() is incorrect. They should be indented beyond the case statements.

Did some further testing:
* Upgraded a D6 database after using #349671: Make the postgresql driver independant of the schema and it seemed to work fine.
* Upgraded an existing D7 database.
* Tested a clean installation and it worked fine -- It also allowed me to apply the #330633: Temporary file cleanup needs some love (UnitTest included!) and get the cron tests to pass.

drewish’s picture

mikl’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
5.67 KB

Fixed the indendation as per #41.

drewish’s picture

Status: Reviewed & tested by the community » Needs work

sorry, i just noticed that the breaks in includes/database/pgsql/schema.inc have the same indenting problem.

Crell’s picture

I won't claim to be a Postgres expert, but on visual inspection #43 seems sane to me (modulo code formatting issues that drewish already seems to have under control).

mikl’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
5.68 KB

#44: Oh, yeah. Missed that too. Re-rolling.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks.

Josh Waihi’s picture

thanks to mikl, drewish and Crell for keeping up the good work over chrsitmas!

Damien Tournoud’s picture

Title: PostgreSQL surge #5: column type of int_unsigned causing pdoexception » Column type of int_unsigned causing pdoexception
Issue tags: +PostgreSQL Surge
Josh Waihi’s picture

Status: Fixed » Closed (fixed)

closing

Scott_Marlowe’s picture

I just had to point out that int(10) does NOT mean a 10 wide integer. It has no meaning outside of MySQL, such as on postgresql where int(10) generates an error, as it should. If you want a 10 wide int, use numeric(10). In MySQL, int(10), which certainly looks to mean a 10 wide int, actually means an int that's padded on retrieval with spaces to take up 10 characters width. I.e. it's a output formatting option, not an input / storage option.

I see this mistake all the time. And it's quite understandable. It's one of the most counterintuitive type definitions in existence.