Download & Extend

_block_rehash() deletes all blocks

Project:Drupal core
Version:6.x-dev
Component:block.module
Category:bug report
Priority:critical
Assigned:chx
Status:closed (fixed)

Issue Summary

In the current version of _block_rehash() all the blocks are deleted and then re-inserted into the database.

With all the new regions the use of blocks has increased, and lossing all the information could be very bad for a site.

Here is a patch which removes the delete, and locks, and also reduces the number of db queries during the rehash. If there are no new modules/blocks or removed modules/blocks then with this there will be zero inserts or deleted on the blocks table.

I found this problem, due to a hosting company killing my php process at the wrong time because of the overhead, and lost nearly 100 blocks. It was not bad enough that I lost all the blocks, but I also lost a lot of PHP which did some special methods for when to and not to display blocks.

I will implent this into HEAD in the next couple of days.

AttachmentSizeStatusTest resultOperations
block_rehash.patch.txt3.86 KBIgnored: Check issue status.NoneNone

Comments

#1

I think "in_database" shoudl be renamed, it is pretty confusing.

#2

I have changed in_database to db_action and set text flags to make it a little more verbose about what it is doing.

AttachmentSizeStatusTest resultOperations
block_rehash.patch_0.txt3.86 KBIgnored: Check issue status.NoneNone

#3

Status:needs review» needs work

Personally, I like the old approach better. I'm not a fan of the extra 'commands' that you tag onto the object. The proper solution would be to use transactions.

#4

Status:needs work» needs review

We cant use transactions since it is only supported by InnoDB if you use mysql.

The current scheme "delete, then re-insert" is just not fit for usage in a php script which needs to be expected to die in all kinds of wierd places.

Attached is an alternative patch which does have the same amount of queries but not use either DELETE or LOCK.

AttachmentSizeStatusTest resultOperations
no_delete_block.patch.txt2.13 KBIgnored: Check issue status.NoneNone

#5

Status:needs review» needs work

Ok, this won't work, it won't delete blocks from modules that have been deleted in the meantime.

#6

Here is a modified version which uses an external index to track what needs to be deleted.

AttachmentSizeStatusTest resultOperations
block_rehash.patch_1.txt4.61 KBIgnored: Check issue status.NoneNone

#7

Status:needs work» needs review

opps

#8

Version:4.7.3» x.y.z
Status:needs review» needs work

It looks like an unrelated patch slipped in here.

#9

Opps,

Here it is again.

AttachmentSizeStatusTest resultOperations
block_rehash.patch_2.txt3.86 KBIgnored: Check issue status.NoneNone

#10

Status:needs work» needs review

#11

Status:needs review» needs work

I found a bug, will upload a new patch

#12

Status:needs work» needs review

Here is a fixed version.

AttachmentSizeStatusTest resultOperations
block_rehash.patch_3.txt3.87 KBIgnored: Check issue status.NoneNone

#13

Status:needs review» needs work

Patch doesn't apply to HEAD.

#14

Status:needs work» needs review

I have ported this to head.

AttachmentSizeStatusTest resultOperations
block_rehash_head.patch.txt3.98 KBIgnored: Check issue status.NoneNone

#15

Mmm, I don't see how this patch fixes the problem. There would still be a race conditation, just a smaller window (less risky to loose data)?

#16

I've tested the patch and it works as advertised.

The problem is that you lose all your data if Drupal dies after "DELETE FROM blocks". This is fixed by not executing that query and only deleting specific deletes for modules that have been uninstalled.

The earlier race-condition is back in some form. If you are really unlucky, you would get a double-insert for a block in a newly enabled module. We probably should have a unique key on (module, theme, delta) to disallow this.

#17

Status:needs review» reviewed & tested by the community

I've reviewed the block module code and didn't find a problem that would occur if you had two rows for the same block. All the SELECT queries that matter have a DISTINCT and the Updates will update both copies, should they exist. Hell, i even duplicated all my entries and don't get double blocks.

Adding a unique key on (theme, module, delta) fails since keys are restricted to 1000 bytes.

#18

I have gone back over this and the race condition is microscopic.

This patch only does inserts and deletes on block removal or addition. If there are no changes then no queries are executed.

This patch is focused on data presevation and with the the increased use of regions/blocks the potentical for loss of sensitive data is much higher.

#19

This is a definately a good direction to be moving in. The old system isn't good. What can be done to move this along? Put more locks in?

#20

I have been looking that this more, and because of the fact that there are no indexes on the blocks table, there is actually no race condition.

Worst case is that you will get the same block duplicated on the table. But because the in memory array in indexed in both the module and delta there will only ever be 1 occurance of the block returned.

Neither user will crash, and return any erronous data.

#21

bump

#22

Drumm, what are your problems with the patch?

#23

I was leaving this one for Dries to do final review.

#24

I have been looking that this more, and because of the fact that there are no indexes on the blocks table, there is actually no race condition.

I don't understand this ... it looks like a race condition to me.

#25

Ok, let's rephrase this: There is a race condition (as on nearly every Drupal page, if you save it) but it doesn't matter. :P

#26

No, it should be: there is a race condition and we can fix it 100% with a table lock. :)

#27

No, we don't _fix_ it with a table lock, we only make it inconsequential, nothing bad will happen if there is actually a race. The same is true for Gordon's code.

#28

Gerhard: why does a table lock doesn't fix the problem?

#29

First of all: the real problem is that the current code deletes user data if they are unlucky.

Second, the table lock doesn't really fix the problem of two people editing the form at the same time. It just makes sure there is not db error by making sure the two POST events are put into the database in sequence. For the first user this still overwrites his data. Of course, this is true for all admin forms.

#30

Seems like the race condition, even though it would be very rare, could easilly be solved using some well placed locks. Seems the main issue of potentially loosing all block data has been solved, and its definately a step in the right direction to preserve data.

#31

Status:reviewed & tested by the community» needs work

21:04 <Natrak> killes: if you can use REPLACE for blocks you can avoid the locks i guess.. but imo it makes more sense to deal with them individually. Make the blocks patch use locks commit it, then if the no locks patch goes in re-work it to use db_query_replace (which seems a bit weird to me, but we will see)

#32

Priority:critical» normal

i don't think this qualifies as critical--while the results can be bad, it's an exceptional event. downgrading.

#33

I like the approach of this and is another improvement for _block_rehash(). I didn't like the massive DELETE. Haven't tested the patch, but the approach is a +1. So is the next step to throw locks around a portion of the code or should this be set back to review?

#34

RobRoy: It seems as Dries insists on locks so it would be great of you could update the patch as indicated by Kjartan.

#35

Status:needs work» needs review

Not sure if I understand the race condition completely, but here's a patch. This what you want?

AttachmentSizeStatusTest resultOperations
block_rehash_head.patch3.96 KBIgnored: Check issue status.NoneNone

#36

Oops. Hadn't updated from CVS first.

AttachmentSizeStatusTest resultOperations
block_rehash_head2.patch3.96 KBIgnored: Check issue status.NoneNone

#37

Version:x.y.z» 6.x-dev

This is a relatively large refactoring. I'm going to say 6.0, but feel free to debate if yo have good arguments.

#38

Priority:normal» critical

I just want to relate that a client lost a few blocks when he and his minions added blocks fro several machines. This needs fixing.

#39

Subscribing, and I think it it should be tagged as critical.

#40

Assigned to:gordon» Dries

I spent some time looking at the code, and figured out the source of the race condition. The proposed patches are bogus; a working patch is attached.

I've only tested it a little bit -- it could do with more extensive testing.

AttachmentSizeStatusTest resultOperations
block-lock.patch779 bytesIgnored: Check issue status.NoneNone

#41

Status:needs review» reviewed & tested by the community

I noticed this condition a little while back, but somehow it got lost in my brain to upload a patch. This is a small change to fix the race condition. I re-rolled removing two trailing whitespace chars. +1

AttachmentSizeStatusTest resultOperations
rehash.patch777 bytesIgnored: Check issue status.NoneNone

#42

Version:6.x-dev» 5.x-dev

Committed to CVS HEAD. We'll want to backport this to Drupal 5, but I'll leave that up to Gerhard to decide. Updating the version.

#43

Umm, you probably meant "Neil". But yes, I think this should be in D4.7 too.

Oh, and while I agree that this fixes most of the problem, I still think that the "delete and then re-insert" part is crappy code. :p

#44

Status:reviewed & tested by the community» fixed

Neil committed to D5, myself to 4.7.

#45

Version:5.x-dev» 6.x-dev
Status:fixed» needs work

Rolling back 5.

"When you use LOCK TABLES, you must lock all tables that you are going to use in your statements."

#46

4.7 is also reverted.

#47

So if we go the lock-moving-route, we want to:
1. Get the big list of blocks from module_invoke
2. Lock it
3. Do that first SELECT on blocks
4. Iterate through it and do whatever that big loop does.
5. That delete and insert thing.
6. Unlock.

#48

Assigned to:Dries» chx
Status:needs work» needs review

We can do better with schema API. MySQL does not support transaction but RENAME TABLE is atomic so we use that. PostgreSQL has transactional DDL.

AttachmentSizeStatusTest resultOperations
block_rehash.patch4.62 KBIgnored: Check issue status.NoneNone

#49

Added an initial db_error check to the rename table functions for additional safety.

AttachmentSizeStatusTest resultOperations
block_rehash_0.patch4.68 KBIgnored: Check issue status.NoneNone

#50

$temporary_table_name needs no prefix.

AttachmentSizeStatusTest resultOperations
block_rehash_1.patch4.68 KBIgnored: Check issue status.NoneNone

#51

This looks really good. This will remove an race conditions.

#52

uniqid is used for temp table names, simplifying the code.

AttachmentSizeStatusTest resultOperations
block_rehash_2.patch4.61 KBIgnored: Check issue status.NoneNone

#53

Added prefixing to src.

AttachmentSizeStatusTest resultOperations
block_rehash_3.patch4.44 KBIgnored: Check issue status.NoneNone

#54

i've tested the postgres specific commands in psql, and they work just fine.

#55

Status:needs review» needs work

Ok. Major bug with the latest patch, which otherwise looks like an elegant solution to this and probably other race conditions in core.

@ admin/build/block, click on one of the other themes that's not the current one, click back to the current theme.

All your blocks are gone. ;)

#56

Status:needs work» needs review

This is trivial to fix -- one INSERT SELECT (yes, postgresql is fine) and we are there.

AttachmentSizeStatusTest resultOperations
block_rehash_4.patch22.49 KBIgnored: Check issue status.NoneNone

#57

Would help if I would not every single issue I work on, here.

AttachmentSizeStatusTest resultOperations
block_rehash_5.patch4.75 KBIgnored: Check issue status.NoneNone

#58

Status:needs review» reviewed & tested by the community

Ok, gave this another run-through; set/removed blocks in several themes, looks like it works. Marking RTBC;

#59

I don't think messing with the DDL to update block data is a good idea. This patch may be better than what we have, but neither option is "safe."

#60

I'll want to think about this some more ...

#61

Status:reviewed & tested by the community» needs work

I agree that this patch doesn't solve the underlying problem. This patch only reduces the window for problems, it doesn't fix it.

#62

Status:needs work» reviewed & tested by the community

The original code leaves you without a single block if Drupal dies. This one does not leave without a block if Drupal dies but if the database bugs then you might end at a scary place where you need to drop the block table and recreate it -- and you again without a single block. However, the latter needs much much much more things to go wrong (very very wrong -- it needs a db bug) while a PHP request can die for a hell lot of reasons.

#63

Status:reviewed & tested by the community» needs work

This patch doesn't solve the underlying problem. This patch only reduces the window for problems, it doesn't fix it.

#64

Status:needs work» reviewed & tested by the community

Dries, there is no window, I do not know what are you saying. The blocks are saved to a temporary table which is then renamed to the blocks table and the blocks table is moved aside/deleted in an atomic fashion.

#65

Priority:critical» normal
Status:reviewed & tested by the community» needs work

So what Dries is saying, as David Strauss explained to me is that the old code deleted only blocks for one theme, this one moves every record so there might be a very small window indeed (much smaller than the old). Given that ... and what dangers this carries as nnewton explained... I am killing this solution.

And I do not think this is critical.

#66

Priority:normal» critical
Status:needs work» needs review

I was convinced this is critical and here is something killes suggested.

AttachmentSizeStatusTest resultOperations
block_rehash_diff-80963-66.patch3.26 KBIgnored: Check issue status.NoneNone

#67

typofixed.

AttachmentSizeStatusTest resultOperations
block_rehash_diff-80963-67.patch3.26 KBIgnored: Check issue status.NoneNone

#68

looks like that insert and last_id() call can be replaced by drupal_write_record(). anyway, the code looks good to me. i will try to test it.

#69

Status:needs review» needs work

Applied patch, enabled statistics module, got this result:

* notice: Undefined variable: blocks in /Users/joshk/Sites/drupal-HEAD/modules/block/block.module on line 258.
* warning: usort() [function.usort]: The argument should be an array in /Users/joshk/Sites/drupal-HEAD/modules/block/block.admin.inc on line 20.
* warning: Invalid argument supplied for foreach() in /Users/joshk/Sites/drupal-HEAD/modules/block/block.admin.inc on line 49.
* warning: Invalid argument supplied for foreach() in /Users/joshk/Sites/drupal-HEAD/modules/block/block-admin-display-form.tpl.php on line 45.

Marking as needs work. :(

#71

Status:needs work» reviewed & tested by the community

I know I should not RTBC my own patches and yet. I tested and commented this to death. (The name of the patch marks the place where I was writing it: on the stairs looking down the Western Wall in Jerusalem)

AttachmentSizeStatusTest resultOperations
block_diff_western_wall.patch5.46 KBIgnored: Check issue status.NoneNone

#72

Oh, one nice addition: I now pass around the theme_key in hook_block so theoretically you can have different blocks defined per theme. Between module preprocess and this I am now quite content of how modules can intervene to the theme.

AttachmentSizeStatusTest resultOperations
block_diff_western_wall.patch5.49 KBIgnored: Check issue status.NoneNone

#73

It would be good to find out, why did we have 0 as default for block caching while the block code used per role caching as default. I guess we need to modify the default in the schema to be able to use drupal_write_record() filling up the defaults properly. Otherwise this looks like a nice approach.

#74

Noone cares about schema defaults, they are usually whatever 'empty' is in the context, in this case, that's a zero. There is no intention behind it. It's a whole another issue whether this needs to be fixed. Schema changes are in the patch for install and for update as well.

#75

Status:reviewed & tested by the community» fixed

OK, looked through where block rehash was used, tried how the new code works, and all seem to be fine as far as I see. We don't touch what is there, we add new blocks from code and only remove those which were removed. This is also better then the code before as it keeps the blocks if PHP dies in the middle of the request for some reason.

BTW I removed the API modification on hook_block() as I don't think there is any point in doing it now.

#76

Status:fixed» closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.

nobody click here