Hello, I'm Piotr Krukowiecki, aka Cvbge, your new (for almost three months) PostgreSQL maintainer. As some people have noticed, there was a poll running on drupal.org last week. This text summarises feedback gathered in the poll's comments, lists problems with PostgreSQL support in Drupal, and shows my plans on how to improve it. It also includes a help request (last item, but probably most important!)

The results of the poll are: 88% of voters use MySQL only, and 11% of votes use PostgreSQL or both MySQL and PostgreSQL databases. Although some people find these PostgreSQL results better than they expected, I feel that the number of PostgreSQL users is lower than it could be. I'm going to undertake some steps to change this -- details later in this post.

Only 11%?

The comments posted for the poll were even more interesting than the poll results itself. The general message was that people would like to use PostgreSQL with Drupal, but were unable to. Reasons given were:

  • Bugs in core Drupal: It can't be denied that some parts of Drupal 4.6 do not work well with PostgreSQL databases. The bugs are being fixed, but it's still far from perfect. The next Drupal release series, 4.7, is going to be much better in this respect.
  • Bugs in contributed modules: As the poll shows, most users of Drupal (and thus modules authors) use MySQL to develop and test their code. They usually do not provide database schemas for PostgreSQL, which is very easy to fix, or use non-standard (MySQL specific) SQL queries. It'd be very hard for Drupal developers to check/fix all contributed modules, considering there are about 220 contributed modules for the entire 4.6 branch.
  • Not enough support for PostgreSQL: Lack of PostgreSQL install instructions (a patch for this is pending), no central place for PostgreSQL (wannabe) users, nearly no documentation, etc..
  • Last but not least: fewer hosts offering PostgreSQL databases. Not much we can do about this.

So what am I planning to do?

  • Fix more bugs in Drupal core! I don't see a good way to directly improve contributed module quality. This will depend mainly on the module authors who proactively address PostgreSQL installations.
  • Improve PostgreSQL support. This would include handbook pages, forums, and maybe a mailing list for the PostgreSQL database only.
  • Educate MySQL users on how to write 'standard' SQL code - a list of things to avoid when coding, 'strict mode' in MySQL 5, SQL validators, etc.

Help wanted!

  • You can help by testing patches. Even if you don't use PostgreSQL, you can still test them. Often, patches for PostgreSQL issues change code that also affects MySQL users.
  • I'm looking for people who have an existing Drupal 4.6 with PostgreSQL set up to test the update path from 4.6 to 4.7 works. Please send me messages using my Drupal.org contact form or reach me on IRC (nick Cvbge).

Comments

pz’s picture

Best way to improve support for contrib is probably to have clearly documented, somewhere module developers are sure to look, what needs to be done to ensure database independence. Now we have something about sql naming conventions but not much else that I am aware of.

Samat Jain’s picture

SQLite.

For those occasions where you'll have a site that is not edited often, and there is no need for the concurrency provided by a full RDBMS.

__
My Drupal websites: Personal home page | Rhombic Networks: specialized Drupal application hosting

chx’s picture

http://mbostrom.us/cowiki/31.html
--
Read my developer blog on Drupal4hu.

--
Drupal development: making the world better, one patch at a time. | A bedroom without a teddy is like a face without a smile.

dries’s picture

Feel free to extend the code-checker script to detect queries that are not ANSI SQL or otherwise not compatible with PostgreSQL. I wrote the code checker script about one year ago, and it scans all the contributed modules on a daily basis. The results are available on each project's page by clicking the "View code report"-link. Three randomly chosen examples:

  1. image project: http://drupal.org/files/projects/image.status
  2. e-commerce project: http://drupal.org/files/projects/ecommerce.status
  3. acidfree project: http://drupal.org/files/projects/acidfree.status

Thanks for being the PostreSQL maintainer and for feeding me with a steady stream of PostreSQL patches. Rock on.

NaX’s picture

I don't consider my self all that great when it comes to RDBMS but I have a few questions.

By conforming to the sql standard are we sometimes maybe loosing out on features that are common to both mysql and PostgreSQL just have a different syntax.

I don't know if this is possible, or how much work it would be, or if it is necessary, but what about a sql query translator.

This way a person would create their module using mysql and when some one uses that module on a PostgreSQL install of drupal it converts the mysql query into the relevant PostgreSQL. Their would have to be some sort of checking system were the translator checks each query. Or maybe a developers mode where it displays relevant messages when in development mode.

robb’s picture

Having written such translators(Sybase/PostgreSQL) in Perl for migration projects I would NOT recommend it. The translator needs to know subtle SQL quirks in every system it is working with and that is a maintenance nightmare.

It is usually better and easier to stick with standard SQL where possible and factor out the server specific SQL to a separate function that decides which algorithm to use based on supported databases. The programmer can then make the best possible use of the databases for those rare cases where performance or feature set require it.

stefano@civicspacelabs.org_home’s picture

Piotr, of course nothing personal with your task :) !!
Honestly asking, why should we use postgreSQL over mySQL?
With the latest version 5, mySql has catched back on missing features as triggers, views, inheritance, sequences, stored procedures, cursors, and user-defined data types [well actually some are coming with 5.1]. the latter also seems to be faster. Are there still reasons to go with postgreSQL? thanks!

Cvbge’s picture

I don't feel competent enough to make good comparison... Just google for 'why postgresql is better than "mysql 5"' ;) or '"postgresql 8.1" "mysql 5" comparison' or similar...

MySQL still lacks some of PostgreSQL's features (http://www.devx.com/dbzone/Article/29480/0/page/1 names few). New features implemented in MySQL 5.0 existed in Postgres for years, so there was more time to optimize and stabilize them, to find and fix bugs...

I think whether you're better using PostgreSQL or MySQL depends on your needs and other factors. I'd prefer not to start YAHW (yet another holy war) here ;)

robb’s picture

Depends on your needs.

One example is PostgreSQL's powerful view, triggers and rules system. This provides a level of control at the server side allowing many client side coding limitations to be overcome.

MySQL has come a long way with version 5.0 but the incompatibilities with SQL 92 in prior versions is still a royal pain to deal with. Not everyone has or can upgrade to MySQL 5. PostgreSQL is much more stable in its support of SQL standards.

MySQL is supposed to be faster, and its use as a caching front-end database is superb. But as applications and data structures become increasing complex PostgreSQLs richer SQL syntax and rich server side language (Perl, Python, PostgreSQL, etc.) can make for some blazing fast systems. Both will get the job done but by the end of the day I find my applications are cleaning, faster and more flexible when coded against PostgreSQL.

I tend to use PostgreSQL for all major applications, SqLite for utilities that I may distribute, and MySQL for applications that are not PostgreSQL ready.

They all have a place and I would hate to see either one fail. A comparison of the databases is almost meaningless without a need driving that comparison. Of course let's not forget SQLite which for embedded applications and stand alone utilities is a must have.

Remember PostgreSQL has a much more liberal license than MySQL. Commerical applications can be built around and into PostgreSQL much easier. That is important and needed for some people.

MVRider’s picture

I agree with you fully!

I found that PostgreSQL really steps ahead under serious load (i.e. reporting, computational processing, etc.) things that really tax a DB. And when it comes to stability and data integrity...I NEVER had a corrupt file in PostgreSQL and can't say the same for MySQL.

I do in fact use both depending on need.

Just my two cents not looking to debate either side.

genwolf’s picture

The missing features from Mysql 4x that exist in MySql 5 are a welcome development. But I had a look around at a few webhosting services last week and there has so far not been a real rush to offer the newest version.

And not meaning to be rude, but what does "seems to be faster" mean and what is it based on? There have been any number of tests that have shown MySql's vaunted speed is illusory when realworld loads and even marginally complex SQL is factored in.

If you are asking Why Postgresql? in relation to MySql you don't know what a Relational SQL Database is or what it can do, and if your needs are met by what is bassically a SQL accessible Data Store you should probably be asking why not SQLite? It has a license even more liberal than Postgres, it is now included with PHP, so you can find it wherever you can find PHP 5, and it offers a portable single file. It is far more SQL compliant and is blazingly fast.

democraticvictory’s picture

Please note that I am not taking sides here. I'd simply like to raise a few questions for consideration:

I have worked with SQL since C. J. Date published his work on relational database theory and IBM released the first relational database management system and taught this stuff at the university level for years.

I am out of date on some of this stuff now. But, I do know the questions to ask.

Here are my questions:

1. It used to be the case that supporting multiple SQL databases required an additional layer that allowed for connecting the various databases. This additional layer (or any generic connectivity to multiple databases) caused a significant performance hit (both RAM and CPU cycles).

Do we incur a performance hit by doing what is required to allow interchangeable DBMS under the covers?

2. It has been my experience that the additional coding and testing to reliably support multiple databases is significant and time is money. So, what development and testing overhead do we incur by doing this?

3. What advantages (if any) do we get by supporting Postgre SQL in particular as opposed to other databases? (It's pretty clear that mySQL dominates the marketplace, so why Postgre SQL?)

4. What about scaleability? If we are going to support multiple databases, then one obvious advantage would be scaleability to higher traffic sites. How well does mySQL scale up? Does Postgre SQL scale better than mySQL? If not, should we be looking at compatibility with a DBMS that does scale better? To get scaleability, do we have to support Oracle or some other proprietary DBMS.(I know, don't jump on me for asking the question.)

5. A typical reason for supporting multiple DBMS's is to increase marketshare for a pricey, proprietary software system where the additional sales will pay for the development and maintenance required. As far as I can tell, web hosts that offer Postgre also offer mySQL, although the reverse does not pertain. Do we significantly increase the marketshare of Drupal by offering it on additional DBMS platforms?

6. If we were to forget multiple DBMS support and focus on mySQL alone, what can we do to Drupal that we can't do if we support multiple DBMS's?

Specifically: Can the resources be better used to develop additional features instead?

7. Do we lose any important mySQL features by making DRUPAL comptible with multiple DBMS's?

If Drupal is coded to support multiple DBMS's such as Postgre SQL, do we eliminate the advantages Postgre might have to achieve compatibility?

8. Apparently, mySQL is just now getting into stored procedures, better transaction support, etc. Do we need these things now or can we wait for them to mature on mySQL?

9. What kind of automated support is available to verify that the coding done is compatible with multiple DBMS's?(You know, like a tool that reads the code and flags incompatibilities.)

10. Any other benefits / disadvantages for supporting multiple databases not covered by the above?

MVRider’s picture

I think we should really think about Drupal as a CMS system and not both a CMS and DB. By righting modules that are SQL compliant and not specific to MySQL or PostgreSQL or any other for that matter, will ensure a longer life for Drupal. Let's say for instance if MySQL somehow stops moving forward or something new comes out that is by far better then both options we have now.

Let's just write a very flexible and powerful CMS...not try to marry Drupal and MySQL.

Cheers!

oicu’s picture

Your point 1 is dead wrong in my experience - in fact, even if you only support one DBMS you should isolate the DB interaction code - apps that have SQL littered all through the code are horrible to maintain. I typically write a nice simple procedural or OO veneer over the DB and hide the SQL away inside just a few files, sometimes only one. That makes plugging in another DBMS almost trivial, and it makes maintaining the code lots easier too.

Cvbge’s picture

This is my private answer, as a person, not as a Drupal PostgreSQL maintainer. Please treat it with a pinch of salt. I said I didn't want to have a flame, but...

1. It used to be the case that supporting multiple SQL databases required an additional layer that allowed for connecting the various databases. This additional layer (or any generic connectivity to multiple databases) caused a significant performance hit (both RAM and CPU cycles).

Do we incur a performance hit by doing what is required to allow interchangeable DBMS under the covers?

Not really. But because mysql does not support full ANSI SQL we have to write more complicated queries sometimes. Sometimes we can't use some SQL features. This means it may be slower then it could. If we supported only PostgreSQL this would not happen.

2. It has been my experience that the additional coding and testing to reliably support multiple databases is significant and time is money. So, what development and testing overhead do we incur by doing this?

Nobody paid me a cent for doing my work on Drupal, ever. Considering that multi-db interface is already implemented, I'd say that overhead is not big. But don't worry, I'll manage.

3. What advantages (if any) do we get by supporting Postgre SQL in particular as opposed to other databases? (It's pretty clear that mySQL dominates the marketplace, so why Postgre SQL?)

PostgreSQL is the best, that's it. We're supporting the best database, so we're the best.

4. What about scaleability? If we are going to support multiple databases, then one obvious advantage would be scaleability to higher traffic sites. How well does mySQL scale up? Does Postgre SQL scale better than mySQL?

OF COURSE PostgreSQL scales better then MySQL. Everybody knows that.

If not, should we be looking at compatibility with a DBMS that does scale better? To get scaleability, do we have to support Oracle or some other proprietary DBMS.(I know, don't jump on me for asking the question.)

Oh please, not that horrible Oracle. Did you know how it treats epty strings? Or it's awfull security support? Better if you don't know.

Do we significantly increase the marketshare of Drupal by offering it on additional DBMS platforms?

I don't know, I don't care. Would we significantly increase the marketshare of Drupal by NOT offering it on additional DBMS platforms?

Ok. As the flame manual says, stop answering in the middle.

oicu’s picture

I am glad to see you working on this - I worked on a similar task for Bugzilla, which now supports PostgreSQL.

I have just tried to get the ecommerce modulw working with PostgreSQL, but gave up in disgust. After I fixed the schema I started getting weird constraint errors, and I had to switch to using MySQL. This experience has soured me somewhat, and I will be far less likely to recommend Drupal as a CMS to other people as a result, unless this gets fixed fairly quickly.

What is really annoying is the Drupal advertises that it supports PostgreSQL - that's what made me look at it in the first place.

zoo33’s picture

Ecommerce is a contributed module, which means that while the Drupal core developers take responsibility for making sure that Drupal core is PostgreSQL compatible, ecommerce depends on it's contributors (which is, of course, anyone who want to contribute) to handle that.

It can be a problem sometimes that the distinction between core and contributed modules is not all clear. In people's minds (mine too I guess), ecommerce.module, image.module etc are parts of Drupal, regardless if they are part of core or not.

There has been suggestions about two categories for contributed modules: official/supported and unofficial. One requirement for official modules could be that they are tested and functional with both MySQL and PostgreSQL.

About ecommerce, I'm looking at the PostgreSQL compatibility too. I'ts a slow progress though.

oicu’s picture

There has been suggestions about two categories for contributed modules: official/supported and unofficial. One requirement for official modules could be that they are tested and functional with both MySQL and PostgreSQL.

Highly desirable. In fact, this would affect whether or not I would use Drupal. I abandoned phpbb because many addons didn't support PostgreSQL, and would be sad if I had to do the same with Drupal.

sepeck’s picture

Feel free to contact the contributor and maintainer of modules that you feel would benefit from your attention and knowledge and supply patches to improve them. Without participation and actual work, it won't happen.

-sp
---------
Test site, always start with a test site.
Drupal Best Practices Guide -|- Black Mountain

-Steven Peck
---------
Test site, always start with a test site.
Drupal Best Practices Guide

markus_petrux’s picture

The subject says it all. Please, check it out. The backend is still in development though. Still, I believe most developers could use it to generate their SQL schemas for PostgreSQL (or even MSSQL).

Doubt is the beginning, not the end of wisdom.

Cvbge’s picture

Indeed, very nice.

Although has problems with some Drupal CREATE TABLE statesments ;)

--
Polska strona o Drupalu: http://drupal.cvbge.org

afaksmith’s picture

I found an interessting article at the MySQL Performance, pointing me to these benchmarks:MySQL vs. PostgreSQLRDBMS on Solaris vs. LinuxIt’s pretty interessting to see, that PostgreSQL is performing very well, where MySQL is suffering very quickl…Thanks..
=======================================================
Adam Smith

http://www.SelectWealthSystem.com/?t=wc