In the current version of _block_rehash() all the blocks are deleted and then re-inserted into the database.
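In rough terms, that fragile pattern looks like the sketch below (a simplified, hypothetical reconstruction rather than the actual core code; the real function also handles weights, regions, and custom blocks):

<?php
// Hypothetical sketch of the delete-then-reinsert rehash. If PHP dies between
// the DELETE and the re-insert loop, every saved block placement is lost.
function _block_rehash_fragile() {
  global $theme_key;
  init_theme();

  // Wipe every saved row for the current theme first ...
  db_query("DELETE FROM {blocks} WHERE theme = '%s'", $theme_key);

  // ... then re-insert each block that a module still declares.
  foreach (module_list() as $module) {
    $module_blocks = module_invoke($module, 'block', 'list');
    if (!empty($module_blocks)) {
      foreach ($module_blocks as $delta => $block) {
        db_query("INSERT INTO {blocks} (module, delta, theme, status) VALUES ('%s', '%s', '%s', %d)",
          $module, $delta, $theme_key, 0);
      }
    }
  }
}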

With all the new regions the use of blocks has increased, and losing all that information could be very bad for a site.

Here is a patch which removes the delete and the locks, and also reduces the number of db queries during the rehash. If no modules/blocks have been added or removed, there will be zero inserts or deletes on the blocks table.

I found this problem when a hosting company killed my PHP process at the wrong time because of the overhead, and I lost nearly 100 blocks. As if losing all the blocks were not bad enough, I also lost a lot of PHP code that controlled when blocks should and should not be displayed.

I will implement this in HEAD in the next couple of days.


Comments

killes@www.drop.org’s picture

I think "in_database" should be renamed; it is pretty confusing.

gordon’s picture

FileSize
3.86 KB

I have changed in_database to db_action and set text flags to make it a little more verbose about what it is doing.

Dries’s picture

Status: Needs review » Needs work

Personally, I like the old approach better. I'm not a fan of the extra 'commands' that you tag onto the object. The proper solution would be to use transactions.

killes@www.drop.org’s picture

Status: Needs work » Needs review
FileSize
2.13 KB

We can't use transactions since MySQL only supports them with InnoDB.

The current "delete, then re-insert" scheme is just not fit for use in a PHP script, which has to be expected to die in all kinds of weird places.

Attached is an alternative patch which uses the same number of queries but does not use either DELETE or LOCK.

killes@www.drop.org’s picture

Status: Needs review » Needs work

Ok, this won't work: it won't delete blocks from modules that have been removed in the meantime.

gordon’s picture

FileSize
4.61 KB

Here is a modified version which uses an external index to track what needs to be deleted.

gordon’s picture

Status: Needs work » Needs review

Oops.

drumm’s picture

Version: 4.7.3 » x.y.z
Status: Needs review » Needs work

It looks like an unrelated patch slipped in here.

gordon’s picture

FileSize
3.86 KB

Oops,

Here it is again.

gordon’s picture

Status: Needs work » Needs review
gordon’s picture

Status: Needs review » Needs work

I found a bug; will upload a new patch.

gordon’s picture

Status: Needs work » Needs review
FileSize
3.87 KB

Here is a fixed version.

drumm’s picture

Status: Needs review » Needs work

Patch doesn't apply to HEAD.

gordon’s picture

Status: Needs work » Needs review
FileSize
3.98 KB

I have ported this to head.

Dries’s picture

Mmm, I don't see how this patch fixes the problem. There would still be a race condition, just a smaller window (less risk of losing data)?

killes@www.drop.org’s picture

I've tested the patch and it works as advertised.

The problem is that you lose all your data if Drupal dies after "DELETE FROM blocks". This is fixed by not executing that query and instead issuing targeted deletes only for modules that have been uninstalled.

The earlier race condition is back in some form. If you are really unlucky, you could get a double insert for a block in a newly enabled module. We probably should have a unique key on (module, theme, delta) to disallow this.

killes@www.drop.org’s picture

Status: Needs review » Reviewed & tested by the community

I've reviewed the block module code and didn't find a problem that would occur if you had two rows for the same block. All the SELECT queries that matter have a DISTINCT, and the UPDATEs will update both copies, should they exist. Hell, I even duplicated all my entries and don't get double blocks.

Adding a unique key on (theme, module, delta) fails since keys are restricted to 1000 bytes.

gordon’s picture

I have gone back over this and the race condition is microscopic.

This patch only does inserts and deletes on block removal or addition. If there are no changes then no queries are executed.

This patch is focused on data preservation, and with the increased use of regions/blocks the potential for loss of sensitive data is much higher.

drumm’s picture

This is definitely a good direction to be moving in. The old system isn't good. What can be done to move this along? Put more locks in?

gordon’s picture

I have been looking at this more, and because there are no indexes on the blocks table, there is actually no race condition.

Worst case, you will get the same block duplicated in the table. But because the in-memory array is indexed by both module and delta, only one occurrence of the block will ever be returned.

Neither user will see a crash or be returned any erroneous data.

gordon’s picture

bump

chx’s picture

Drumm, what are your problems with the patch?

drumm’s picture

I was leaving this one for Dries to do final review.

Dries’s picture

"I have been looking at this more, and because there are no indexes on the blocks table, there is actually no race condition."

I don't understand this ... it looks like a race condition to me.

killes@www.drop.org’s picture

Ok, let's rephrase this: There is a race condition (as on nearly every Drupal page, if you save it) but it doesn't matter. :P

Dries’s picture

No, it should be: there is a race condition and we can fix it 100% with a table lock. :)

killes@www.drop.org’s picture

No, we don't _fix_ it with a table lock; we only make it inconsequential: nothing bad will happen if there is actually a race. The same is true for Gordon's code.

Dries’s picture

Gerhard: why doesn't a table lock fix the problem?

killes@www.drop.org’s picture

First of all: the real problem is that the current code deletes user data if they are unlucky.

Second, the table lock doesn't really fix the problem of two people editing the form at the same time. It just makes sure there is no DB error, by making sure the two POST requests hit the database in sequence. The first user's data is still overwritten. Of course, this is true for all admin forms.

Kjartan’s picture

Seems like the race condition, even though it would be very rare, could easily be solved using some well-placed locks. The main issue of potentially losing all block data has been solved, and it's definitely a step in the right direction for preserving data.

killes@www.drop.org’s picture

Status: Reviewed & tested by the community » Needs work

21:04 <Natrak> killes: if you can use REPLACE for blocks you can avoid the locks i guess.. but imo it makes more sense to deal with them individually. Make the blocks patch use locks commit it, then if the no locks patch goes in re-work it to use db_query_replace (which seems a bit weird to me, but we will see)
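For illustration only, here is roughly what the REPLACE idea from the quote above could look like (column names are assumptions; on MySQL, REPLACE only collapses duplicates when a unique key covers these columns, which is exactly the key that failed above, and PostgreSQL has no REPLACE at all):

<?php
// Hypothetical REPLACE-based write for one block row; without a unique key on
// (module, delta, theme) this degrades into a plain INSERT.
db_query("REPLACE INTO {blocks} (module, delta, theme, status, weight, region)
  VALUES ('%s', '%s', '%s', %d, %d, '%s')",
  $module, $delta, $theme_key, $block['status'], $block['weight'], $block['region']);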

hunmonk’s picture

Priority: Critical » Normal

I don't think this qualifies as critical; while the results can be bad, it's an exceptional event. Downgrading.

RobRoy’s picture

I like this approach, and it is another improvement to _block_rehash(). I didn't like the massive DELETE. I haven't tested the patch, but the approach gets a +1. So is the next step to throw locks around a portion of the code, or should this be set back to review?

killes@www.drop.org’s picture

RobRoy: It seems Dries insists on locks, so it would be great if you could update the patch as indicated by Kjartan.

RobRoy’s picture

Status: Needs work » Needs review
FileSize
3.96 KB

Not sure if I understand the race condition completely, but here's a patch. Is this what you want?

RobRoy’s picture

FileSize
3.96 KB

Oops. Hadn't updated from CVS first.

drumm’s picture

Version: x.y.z » 6.x-dev

This is a relatively large refactoring. I'm going to say 6.0, but feel free to debate if you have good arguments.

killes@www.drop.org’s picture

Priority: Normal » Critical

I just want to relate that a client lost a few blocks when he and his minions added blocks from several machines. This needs fixing.

BioALIEN’s picture

Subscribing, and I think it should be tagged as critical.

Dries’s picture

Assigned: gordon » Dries
FileSize
779 bytes

I spent some time looking at the code, and figured out the source of the race condition. The proposed patches are bogus; a working patch is attached.

I've only tested it a little bit -- it could do with more extensive testing.

RobRoy’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
777 bytes

I noticed this condition a little while back, but somehow uploading a patch slipped my mind. This is a small change to fix the race condition. I re-rolled it, removing two trailing whitespace chars. +1

Dries’s picture

Version: 6.x-dev » 5.x-dev

Committed to CVS HEAD. We'll want to backport this to Drupal 5, but I'll leave that up to Gerhard to decide. Updating the version.

killes@www.drop.org’s picture

Umm, you probably meant "Neil". But yes, I think this should be in D4.7 too.

Oh, and while I agree that this fixes most of the problem, I still think that the "delete and then re-insert" part is crappy code. :p

killes@www.drop.org’s picture

Status: Reviewed & tested by the community » Fixed

Neil committed to D5, myself to 4.7.

drumm’s picture

Version: 5.x-dev » 6.x-dev
Status: Fixed » Needs work

Rolling back 5.

"When you use LOCK TABLES, you must lock all tables that you are going to use in your statements."

killes@www.drop.org’s picture

4.7 is also reverted.

drumm’s picture

So if we go the lock-moving route, we want to (a rough sketch follows the list):
1. Get the big list of blocks from module_invoke
2. Lock it
3. Do that first SELECT on blocks
4. Iterate through it and do whatever that big loop does.
5. That delete and insert thing.
6. Unlock.
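As a rough, hypothetical sketch of that lock placement (the function body and the reconcile step are placeholders, not an actual patch):

<?php
// Sketch only: lock once, read and reconcile inside the lock, then unlock.
function _block_rehash_locked() {
  global $theme_key;
  init_theme();

  // 1. Collect the declared blocks from each module before taking the lock.
  $declared = array();
  foreach (module_list() as $module) {
    if ($blocks = module_invoke($module, 'block', 'list')) {
      $declared[$module] = $blocks;
    }
  }

  // 2. Lock the table so no other request can interleave its own rewrite.
  db_lock_table('blocks');

  // 3.-5. Read the existing rows, compare them with $declared, and issue only
  // the INSERTs and DELETEs that are actually needed.
  $existing = db_query("SELECT * FROM {blocks} WHERE theme = '%s'", $theme_key);
  // ... reconcile $existing against $declared here ...

  // 6. Release the lock.
  db_unlock_tables();
}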

chx’s picture

Assigned: Dries » chx
Status: Needs work » Needs review
FileSize
4.62 KB

We can do better with the schema API. MySQL does not support transactions, but RENAME TABLE is atomic, so we use that. PostgreSQL has transactional DDL.
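The atomic-swap idea looks roughly like the MySQL-flavoured sketch below (the actual patch goes through the schema API; the helper function, scratch-table naming, and column list here are made up for illustration):

<?php
// Sketch only: build the new block list in a scratch table, then promote it
// to {blocks} with a single atomic RENAME TABLE statement.
function _block_rehash_swap($new_rows) {
  $tmp = 'blocks_' . uniqid();
  $old = 'blocks_old_' . uniqid();

  // Build the replacement table off to the side.
  db_query('CREATE TABLE {' . $tmp . '} LIKE {blocks}');
  foreach ($new_rows as $row) {
    db_query("INSERT INTO {" . $tmp . "} (module, delta, theme, status, weight, region)
      VALUES ('%s', '%s', '%s', %d, %d, '%s')",
      $row['module'], $row['delta'], $row['theme'], $row['status'], $row['weight'], $row['region']);
  }

  // One atomic statement retires the old table and promotes the new one, so a
  // dying PHP process can never leave the site with an empty blocks table.
  db_query('RENAME TABLE {blocks} TO {' . $old . '}, {' . $tmp . '} TO {blocks}');
  db_query('DROP TABLE {' . $old . '}');
}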

chx’s picture

FileSize
4.68 KB

Added an initial db_error check to the rename table functions for additional safety.

chx’s picture

FileSize
4.68 KB

$temporary_table_name needs no prefix.

gordon’s picture

This looks really good. This will remove any race conditions.

chx’s picture

FileSize
4.61 KB

uniqid is used for temp table names, simplifying the code.

chx’s picture

FileSize
4.44 KB

Added prefixing to src.

hunmonk’s picture

I've tested the Postgres-specific commands in psql, and they work just fine.

webchick’s picture

Status: Needs review » Needs work

Ok. Major bug with the latest patch, which otherwise looks like an elegant solution to this and probably other race conditions in core.

At admin/build/block, click on one of the themes that is not the current one, then click back to the current theme.

All your blocks are gone. ;)

chx’s picture

Status: Needs work » Needs review
FileSize
22.49 KB

This is trivial to fix: one INSERT ... SELECT (yes, PostgreSQL is fine with it) and we are there.
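The shape of such a fix would be roughly the following (the scratch-table variable and column list are placeholders, not the exact patch): copy every other theme's rows into the scratch table before the swap, so only the current theme's rows are rebuilt.

<?php
// Hypothetical: preserve the rows that belong to all the other themes.
db_query("INSERT INTO {" . $tmp . "} (module, delta, theme, status, weight, region)
  SELECT module, delta, theme, status, weight, region
  FROM {blocks} WHERE theme <> '%s'", $theme_key);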

chx’s picture

FileSize
4.75 KB

It would help if I did not mix up every single issue I work on. Here it is.

webchick’s picture

Status: Needs review » Reviewed & tested by the community

Ok, gave this another run-through; set and removed blocks in several themes, and it looks like it works. Marking RTBC.

David Strauss’s picture

I don't think messing with the DDL to update block data is a good idea. This patch may be better than what we have, but neither option is "safe."

Dries’s picture

I'll want to think about this some more ...

Dries’s picture

Status: Reviewed & tested by the community » Needs work

I agree that this patch doesn't solve the underlying problem. This patch only reduces the window for problems, it doesn't fix it.

chx’s picture

Status: Needs work » Reviewed & tested by the community

The original code leaves you without a single block if Drupal dies. This one does not leave you without blocks if Drupal dies, but if the database itself misbehaves you might end up in a scary place where you need to drop the blocks table and recreate it, and you are again without a single block. However, the latter requires far more things to go wrong (very, very wrong: it needs a database bug), while a PHP request can die for a hell of a lot of reasons.

Dries’s picture

Status: Reviewed & tested by the community » Needs work

This patch doesn't solve the underlying problem. This patch only reduces the window for problems, it doesn't fix it.

chx’s picture

Status: Needs work » Reviewed & tested by the community

Dries, there is no window; I do not know what you are saying. The blocks are saved to a temporary table, which is then renamed to the blocks table, and the old blocks table is moved aside/deleted in an atomic fashion.

chx’s picture

Priority: Critical » Normal
Status: Reviewed & tested by the community » Needs work

So what Dries is saying, as David Strauss explained to me, is that the old code deleted only the blocks for one theme, while this one moves every record, so there might be a very small window indeed (much smaller than the old one). Given that, and the dangers this carries as nnewton explained, I am killing this solution.

And I do not think this is critical.

chx’s picture

Priority: Normal » Critical
Status: Needs work » Needs review
FileSize
3.26 KB

I was convinced this is critical and here is something killes suggested.

chx’s picture

Typo fixed.

moshe weitzman’s picture

Looks like that insert and last_id() call can be replaced by drupal_write_record(). Anyway, the code looks good to me. I will try to test it.
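A minimal sketch of that suggestion, assuming Drupal 6's drupal_write_record() behaviour (field values here are illustrative): it builds the INSERT from the schema, fills unset columns with their schema defaults, and writes the new serial id back, so the separate last-insert-id call goes away.

<?php
$block = array(
  'module' => 'statistics',  // illustrative values only
  'delta'  => '0',
  'theme'  => 'garland',
  'status' => 0,
);
// Inserts the row into {blocks}, applying schema defaults for unset columns.
drupal_write_record('blocks', $block);
// $block['bid'] now contains the new primary key.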

joshk’s picture

Status: Needs review » Needs work

Applied patch, enabled statistics module, got this result:

* notice: Undefined variable: blocks in /Users/joshk/Sites/drupal-HEAD/modules/block/block.module on line 258.
* warning: usort() [function.usort]: The argument should be an array in /Users/joshk/Sites/drupal-HEAD/modules/block/block.admin.inc on line 20.
* warning: Invalid argument supplied for foreach() in /Users/joshk/Sites/drupal-HEAD/modules/block/block.admin.inc on line 49.
* warning: Invalid argument supplied for foreach() in /Users/joshk/Sites/drupal-HEAD/modules/block/block-admin-display-form.tpl.php on line 45.

Marking as needs work. :(

chx’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
5.46 KB

I know I should not RTBC my own patches, and yet here we are. I tested and commented this to death. (The name of the patch marks the place where I was writing it: on the stairs looking down at the Western Wall in Jerusalem.)

chx’s picture

Oh, one nice addition: I now pass the theme_key around in hook_block, so theoretically you can have different blocks defined per theme. Between module preprocess and this, I am now quite content with how modules can intervene in the theme.

Gábor Hojtsy’s picture

It would be good to find out why we had 0 as the default for block caching while the block code used per-role caching as the default. I guess we need to modify the default in the schema to be able to use drupal_write_record() to fill in the defaults properly. Otherwise this looks like a nice approach.

chx’s picture

No one cares about schema defaults; they are usually whatever 'empty' means in the context, which in this case is zero. There is no intention behind it. Whether this needs to be fixed is a whole other issue. Schema changes are in the patch, for both install and update.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Fixed

OK, I looked through where block rehash is used and tried out how the new code works, and everything seems to be fine as far as I can see. We don't touch what is there; we add new blocks from code and only remove those which were removed. This is also better than the code before, as it keeps the blocks if PHP dies in the middle of the request for some reason.

BTW, I removed the API modification to hook_block() as I don't think there is any point in doing it now.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.