Download & Extend

Add transactions to key _save routines

Project:Drupal core
Version:7.x-dev
Component:database system
Category:task
Priority:critical
Assigned:Unassigned
Status:closed (fixed)
Issue tags:chx-and-david-performance-sprint, Security Advisory follow-up

Issue Summary

First of all: I don't know if I put correctly this bug/feature. It is intended to the developers of drupal, for the newest version, but at this time, I'm testing on 4.7.5 version. I've no time to test this on v5-rc2 (sorry).

I ve seen, that, in some cases (in my case in category nodes) , saving is a bit slower. This is OK, but I've seen a big problem when Drupal is exceeding the PHP max_execution_time, or the Apache max_time (for connection).

Think about this scenario:

- PHP "max_execution_time" set to 2 seconds.
- Try to save new nodes.

In some cases, you can save data, in others not. But, there are some cases that the data will be not 100% saved.

And here's the problem: In MySQL, the "node" table will get the new row about your new node, but the table "node_revisions" don't.

The "save" transaction gets broken, and the data corrupted. In this case, you will get an error when you try to view the node: "Access denied". And the same error raises if you try to see it with de superuser (or god user, user 0).

And, if you install some modules, you can obtain errors derived form this, because some tables got filled, but others not.

When I was developing programs in VB & MS-SQL Server, I had a pair of SQL instructions: BEGIN TRANS and COMMIT TRANS.
I don't know if there's something similar both in MySQL and PostgreSQL.

But if you can, I think you should put one "BEGIN TRANSACTION" at start of node saving. And one "COMMIT TRANSACTION" at its end.

This will ensure the data integrity of our databases. And, when Drupal exceeds the max execution time, MySQL will never save a transaction that it's not completed.

Comments

#1

Status:active» closed (works as designed)

drupal tries to remain slim and not handle everything that can possibly go wrong. your example of 2 sec max_execution_time is not realistic. we need real world benefits to justify added complexity. also remember database independance requirement.

#2

No, It is not about strange cases.

I've find this bug in a real (and more complex) drupal installation.
I have a max_execution_time of 30 seconds, but, some modules are slower when you have a big site.
Then, when I try to save, it expends from 20 to 40 seconds to complete.

In some cases, Drupal gets slower, and it eats a lot of memory (because the admins has installed some third-party modules, of course). Then, some times PHP cannot end the execution.

Finally, think in other drupal-admins, that aren't programmer, and they doesn't understand how work the databases. One day, he discovered something strange on Drupal: one node that he can't access, and he can't delete. This is a bit frustrating.
I've discovered how to repair them, because I searched the Drupal tables. But others can have the same problem and they can't do anything to solve it.

#3

Version:4.7.5» 5.6

Yeah, I don't understand why there is no transaction function built into Drupal. It seems they have included it for version 6 if you are using psgsql? I need a solution for MySQL and D5.6.

My issue is about importing data from another database. I could just write the SQL queries by hand, however, I would prefer to use Drupal's node save function. The problem is that I haven't found a way to wrap this function inside a transaction. I have node relationships that need to be maintained while I import the data. If one node can't get created for some reason, Drupal will still insert that node's "relationship" node and the database will have corrupt data. Not good...

#4

Version:5.6» 7.x-dev
Component:node system» database system
Category:bug report» feature request
Priority:minor» normal
Status:closed (works as designed)» active

Lets see what DB team thinks of this. We actually have transaction support now in the DB layer.

#5

Transaction support in version 6 & 7 right? Not D5?

From what I could tell it was only for D6+ with Postgresql databases. Shouldn't I also be able to convert my MySQL tables to innoDB and use some kind of built in Drupal transaction functions?

I guess this isn't a high priority for D5, but it is limitation of the technology stack that I am using.

#6

This is blocked on #301049: Transaction nesting not tracked by connection. Once that's in and the transaction API is exposed, however, node_save() is a perfect candidate for where we should add a transaction. (This is D7 only. D6 does not support transactions.)

#7

Title:node_save system needs a transaction» Add transactions to key _save routines
Category:feature request» task
Status:active» needs review

OK, here we go with transactions for node_save(), user_save(), and comment_save(). There are likely other places, but we'll get to those in later patches. :-)

AttachmentSizeStatusTest resultOperations
yay_transactions.patch36.94 KBIdleFailed: Failed to apply patch.View details

#8

Status:needs review» needs work

I'm all for actually using transactions. The only reason I haven't put forward such a patch is because our testing system simply won't test it. The testing system will, however, happily give the green light. Even if the patch is perfect, it's not a good idea to commit things that (1) aren't tested and (2) could easily be broken by later changes. That said, the value of this patch may exceed the risk of leaving the codepath untested on our main testing system.

Objections to the current patch itself:
* Doesn't ->rollback() throw an exception if you're not actually in a transaction? If not, that needs to change, and this needs to be ready to handle an exception on unsuccessful rollback. We also don't accomplish anything by handling a generic exception, rolling back, and then throwing an exception that won't be handled.
* Generically catching all exceptions is bad. Either this patch should catch specific exceptions or let all exceptions bubble up. If there's an unhandled exception, the transaction will roll back, anyway, by virtue of not being committed.

#9

My concern is that most Drupal devs don't know how to deal with exceptions, and until we have a good "exception culture" I'm wary about allowing arbitrary exceptions to just keep on throwing up god-knows how far.

I also, thinking about it, realized another issue. The watchdog() call would also get rolled back, as the transaction doesn't actually close until $transaction goes out of scope. When that happens, the watchdog() record will get rolled back, too. Oops.

David, I'll defer to you on the better way to do this since you've worked with transactions more than I have.

#10

My concern is that most Drupal devs don't know how to deal with exceptions, and until we have a good "exception culture" I'm wary about allowing arbitrary exceptions to just keep on throwing up god-knows how far.

Having a completely unhandled exception is at least as useful as what we get now with a fatal error. I also don't think we'll foster a good "exception culture" unless we provide incentive to properly handle exceptions.

I also, thinking about it, realized another issue. The watchdog() call would also get rolled back, as the transaction doesn't actually close until $transaction goes out of scope. When that happens, the watchdog() record will get rolled back, too. Oops.

David, I'll defer to you on the better way to do this since you've worked with transactions more than I have.

Well, we have a few options:

* Use a different database connection for logging. We could connect it on the first write to watchdog. Users of syslog wouldn't have this problem. Since watchdog isn't designed to be a very high-volume log, this wouldn't be a big problem.
* Move the guts of *_save to private functions, re-throw any exceptions from the catch block (where we schedule the roll back), and catch+log the exception in the new, outer function.
* Have watchdog() defer its database writes while in a transaction. There are many ways to do this; none are particularly clean.
* Have ->rollback() take watchdog-esque arguments that the transaction API then records until the transaction is over and then sends to watchdog. This is reasonably elegant because developers generally want to log on rollback.

I generally prefer solutions that keep watchdog from being touched in any way by transactional behavior.

#11

I should add that the "move the guts" option only works if another function hasn't also decided to call *_save while already in a transaction.

#12

Tagging for work this week.

#13

Status:needs work» needs review
AttachmentSizeStatusTest resultOperations
use-transactions.patch42.58 KBIdleFailed: Failed to apply patch.View details

#14

Conceptually I like #13. It provides a clean way to say "hey, transaction specific error stuff here" while still blowing away any other watchdog actions that get rolled back. My one problem is that it introduces a dependency in DBTNG to the watchdog function. Right now DBTNG is extremely self-contained, and touches higher level utilities at only 2 levels (one check_plain() call and one drupal_alter() call). Is there no way we can turn that around and make it a listener, like the query logger?

Hm. You know... I wonder if this is somewhere that #601020: Exception Handler Logging and Improvements would come in useful?

#15

I like the listener pattern, but how could Watchdog register itself as a DB-TNG listener? Every pattern I can think of would either offer a subset of watchdog functionality or make the decoupling from Watchdog superficial (where DB-TNG depends on a service exactly like Watchdog, even if it isn't watchdog).

As for "Watchdog Exception," it appears to merely streamline the process of logging exceptions to Watchdog. Part of the problem here is there isn't a good place for the transaction layer to throw an exception. If it throws an exception during the destructor, PHP breaks. It could throw the exception during the rollback request, but it's unclear how we could efficiently collect and pipe the data to Watchdog post-transaction when dealing with a set of nested transactions.

#16

Well, DBTNG could depend on a system similar to watchdog *IF* you want to get transaction logs. If you don't want transaction logs, it just drops the rollback exception information on the floor.

My thinking of watchdog_exception() is to make it do the special "record to memory and save after the transaction is over" stuff. Not sure if that's a good idea, but it came to mind as a maybe.

#17

AttachmentSizeStatusTest resultOperations
use_transactions.patch46.02 KBIdleUnable to apply patch use_transactions.patchView details

#18

Status:needs review» needs work

We're always retrieving the logging callback properties as a matched set, so let's just make them a single array property of Database rather than 3 separate properties.

The user_save hunk includes a crapload of tabs. :-(

Otherwise I like this approach.

#19

Status:needs work» needs review

Now using arrays for callbacks.

AttachmentSizeStatusTest resultOperations
use_transactions.patch45.84 KBIdleFailed: Failed to apply patch.View details

#20

Minus tabs.

AttachmentSizeStatusTest resultOperations
use_transactions.patch46.17 KBIdleFailed: Failed to apply patch.View details

#21

Here's a patch without whitespace changes. This is only for human review.

AttachmentSizeStatusTest resultOperations
use_transactions_no_whitespace.diff14.9 KBIdleFailed: Failed to apply patch.View details

#22

Status:needs review» reviewed & tested by the community

As soon as the bot approves, #2 has my blessing. (Ignore #21, it's a review-only patch.)

#23

Erk. #20 has my blessing, I mean. Not #2. :-)

#24

yay!

#25

Because of the indentation changes, this is a huge pain to re-roll. Can we get this committed, please?

#26

The security team received a report which would be fixed by this issue. Here is an excerpt from node_save().

<?php
drupal_write_record
('node', $node);
_node_save_revision($node, $user->uid);

module_invoke_all('node_' . $op, $node);

// Update the node access table for this node.
node_access_acquire_grants($node);
?>

If php fails between during hook_node(), we have created a node without its access control. Considering that we now explicitly support external storage for fields, this is perhaps more likely.

I'll add that economist.com is using this patch successfully on D6.

#27

Priority:normal» critical

#28

Status:reviewed & tested by the community» needs work

Somewhere along the line this broke and the testbot didn't notice due to the patch in #21. David, can you reroll?

#29

This needs a re-roll. Masked by #21.

#30

Status:needs work» reviewed & tested by the community

This is important so I rerolled.

AttachmentSizeStatusTest resultOperations
use_transactions.patch45.62 KBIdleFailed: 14510 passes, 2 fails, 6 exceptionsView details

#31

Status:reviewed & tested by the community» needs work

The last submitted patch failed testing.

#32

Status:needs work» needs review

Can't reproduce. Tried with MyISAM tables too.

#33

Status:needs review» needs work

The last submitted patch failed testing.

#34

Status:needs work» needs review

Crell points out i left out the test module. That would do it.

AttachmentSizeStatusTest resultOperations
use_transactions.patch46.8 KBIdlePassed: 14726 passes, 0 fails, 0 exceptionsView details

#35

Status:needs review» reviewed & tested by the community

chxsnack!

Note that most of this patch is just playing games with indentation as we add more try-catch blocks, so it's not as big as it seems.

#36

All also add that I don't believe a single substantive (non-whitespace and non-fuzz-related) line of this patch has changed since before code freeze.

#37

Given that this patch was ready to go by 10/15 and didn't make it into UNSTABLE-10 due to a test bot freak out, and given also that the consensus appears to be from two of the database system maintainers that this patch is basically required to make our new shiny transaction support actually do something, and also has thumbs-up from two folks on the security team, I'm inclined to commit this.

Leaving as RTBC for 48 hours in case Dries wants to chime in. #21 shows the changes without the whitespace differences for easier reviewing.

#38

This looks good to me. Cody-style is not 100% (documentation wraps too early), but that shouldn't be a showstopper IMO.

#39

Status:reviewed & tested by the community» fixed

Ok, committed to HEAD! Thanks!

Maybe we can file a novice patch for fixing all of our documentation to consistently wrap at 80 chars. Hm...

#40

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

nobody click here