While migrating lots of users (i.e. calling user_save() over and over), we noticed that performance and memory utilization were both poor. We've traced that a some seemingly useless call to field_attach_form() in user_save(). This patch removes it. That code says that it exists to help keep unwanted fields out of user.data but I fail to see how adding more stuff to $edit could achieve that. The specified goal of field_attach_form is "Add form elements for all fields for an entity to a form structure.". Thats not cleaning anything.
This patch also removes a useless user_load later on since we explicitly have a new user who's got nothing data to load anyway. Whats needed is a stub user object instead.
Comment | File | Size | Author |
---|---|---|---|
#61 | user_data.patch | 3.8 KB | catch |
#57 | user-data-721436-57.patch | 3.18 KB | David_Rothstein |
#51 | user_data.patch | 2.66 KB | catch |
#49 | user_data.patch | 2.65 KB | catch |
#43 | data_unpack.patch | 3.03 KB | catch |
Comments
Comment #1
chx CreditAttribution: chx commentedThis patch is two issues in one. Let's split them. I think this is the proper solution to the first problem.
Comment #2
moshe weitzman CreditAttribution: moshe weitzman commentedActually, that field_attach_form() did eventually do something useful. chx reworked it so it is fast. RTBC. thanks.
Comment #3
sunI quickly scanned user.test, and it seems like we don't have any tests for {users}.data handling... :-/
Comment #4
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks. We should have tests so marking this 'needs work'.
Comment #5
moshe weitzman CreditAttribution: moshe weitzman commentedComment #6
yched CreditAttribution: yched commentedShould be doable with a simpler
?
Powered by Dreditor.
Comment #7
catchI'm borderline spamming, but please add to the hate for user->data on #347988: Move $user->data into own table and not least #86299: Add "current password" field to "change password form", which also show the lack of tests.
Comment #8
catchGiven we were saving passwords in plain text for a couple of weeks there, I think it's time to remove the magic $data handling in user_save().
If you want to save stuff in there, you can add to $account->data in hook_user_presave(). Modules shouldn't have to remove their own form values just to avoid security holes or bloating of the serialized array. Patch does not remove the column, or the automatic loading into the user object, just the magical saving of cruft.
Comment #9
catchFound more cruft. This is like a cancer.
catch@catch-laptop:~/www/7$ diffstat user_data.patch
user.module | 44 --------------------------------------------
user.pages.inc | 2 --
2 files changed, 46 deletions(-)
Comment #10
catchFixed api docs. Moshe reminded me that user_cancel and contact module make use of the user->data 'feature', however I'm pretty sure user_cancel doesn't use the magic form handling, contact module might though. Let's see what breaks.
catch@catch-laptop:~/www/7$ diffstat user_data.patch
user.api.php | 14 ++++----------
user.module | 44 --------------------------------------------
user.pages.inc | 2 --
3 files changed, 4 insertions(+), 56 deletions(-)
Comment #11
sunHow much would break if we didn't remove it, but _move_ it into $user->data for real? Thus, only what's in $user->data gets that handling, nothing else.
Comment #12
moshe weitzman CreditAttribution: moshe weitzman commented@sun - we already have a patch that does more than that. Its a good patch for all the reasons in #8. Also see its suggestion of a presave for modules that really want it. I don't see much sense in putting in more effort to handle POST of edit['data'] field
why is bot not testing latest patch?
Comment #13
catchYeah I stick by #8, except you need to add to $edit['data'] not $account->data I think. I don't think we need to offer any more than that, and it's way better than requiring everything else to manually remove their stuff.
Comment #15
catchFixed the tests.
What this patch does:
1. Removes the behaviour of saving whatever happens to be in user form submissions into $user->data unless you explicitly take it out yourself. http://drupal.org/node/86299#comment-2669278 would have been one of the very worst core security vulnerabilities ever, had it not been caught during alpha.
2. Does not remove the $user->data column from {users} or the automatic loading of it into user objects. The only API change is that you now have to explicitly set $edit['data']['mymodulekey'] in hook_user_presave() instead of removing stuff there, if you want it to be saved.
3. Removes all the associated cruft we had accumulated in user_save() and elsewhere to avoid saving crap into this serialized field.
4. Is a performance improvement, in addition to security hardening, because it saves a drupal_write_record() on every user save - this existed only due to having to account for the wonky behaviour . We have shockingly bad insert performance in core which I'm trying to fix at the moment - it currently takes around 13 queries to do one user_save() - and that's with no contrib modules inserting their data.
Comment #17
sun#15: diediedie.patch queued for re-testing.
Comment #18
sunHm. Thinking out loud...
Doesn't this mean that $edit -- which may be $form_state['values'] when coming from a form submission -- must always have all properties assigned? I'm not 100% sure, but I was under the assumption that 'data' always contained the existing values already before...? This code looks like the 'contact' property's value is lost when $edit doesn't contain it?
Do we need to also account for $account->contact for programmatic updates?
Further, is it a good idea to store default configuration values when no user configuration data is given? I'd say that global defaults should be accounted for at run-time, no?
Lastly, very minor: duplicate semicolon here.
We can still keep this though - just removes all button elements from the form values.
Powered by Dreditor.
Comment #20
catchIn drupal_write_record(), if $object->field isn't set, then it won't be updated. However I guess it's possible that custom user forms might set partial $edit['data'], so we now pre-build the array from any existing information in user_user_presave().
Programmatic updates should also be covered due to the same change.
"Further, is it a good idea to store default configuration values when no user configuration data is given? I'd say that global defaults should be accounted for at run-time, no?"
I don't think it's a good idea, but that's what contact module does currently. I am only interested here in not breaking contact module, not in fixing its obvious deficiencies. Please open an issue for that though, would keep the data array cleaner.
Added back value cleaning, also removed double semicolon.
Comment #21
sunThanks!
This should happen in user_save(), I think (due to module weights & hook invocation order)
Just keep it where it was. It's irrelevant for this issue.
Powered by Dreditor.
Comment #22
catchWorks for me.
Comment #23
sunThank you!
Comment #24
BerdirThis is tricky, because it would change the current behavior when the global default changes. Then all users that had the same value as the global one would change to the new default.
Comment #25
chx CreditAttribution: chx commentedeither just
$edit['data'] = $data
or go withif (!isset($edit['data'])) $edit['data'] = array(); $edit['data'] += $data
(and actually the data variable is not even necessary just move the unserialize in).The former destroys the data passed in via edit the latter keeps it. Oh and if you want to keep your behaviour then array_merge.Comment #26
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks.
Comment #27
sund'oh! :)
Comment #28
sunSo how about this?
Comment #30
aspilicious CreditAttribution: aspilicious commented#28: drupal.user-data.28.patch queued for re-testing.
Comment #31
catchHmm, I think I'd prefer chx's latter approach except with the array_merge() - there's no reason to destroy $edit['data'] here.
Comment #32
David_Rothstein CreditAttribution: David_Rothstein commentedThis appears to have caused #748976: "Show/hide descriptions" links no longer work -- and in order to get that working again, we'll actually need to explicitly not destroy $edit['data'] here (at least as far as I can tell).
Also, we'll want to update the API docs for user_save(), since those currently say the following which is out of date:
Also, are there still situations where a not-fully-loaded user object is ever passed in to user_save()? - i.e., what happens with the following code if $account->data hasn't been loaded yet?
Doesn't that mean we'll lose some of its saved values?
Finally, a quick grep suggests there is more stuff left over that could be cleaned up here:
In profile_save_profile():
Probably isn't needed anymore?
Also in user_register_submit():
The same, I assume.
Comment #33
sunI assume this also caused some weird double-serialization - after a fresh install, I get the following in {users}.data for uid 1:
Of course, along with the error message:
Comment #34
jbrown CreditAttribution: jbrown commented#33 I am getting this also. It triggers #752366: Fatal error: Call to undefined function decode_entities() in devel.module
Comment #35
jbrown CreditAttribution: jbrown commentedRipping off http://drupal.org/node/745668#comment-2747534 then http://drupal.org/node/721436#comment-2740728 fixes it, but you have to reinstall, otherwise saving uid 1 results in:
Fatal error: Cannot unset string offsets in /home/jonny/sites/drupal-7.x-dev/modules/user/user.module on line 400
Comment #36
eojthebraveAlso getting the errors reported in #33. Attached patch fixes the double serialization issue.
Comment #37
jbrown CreditAttribution: jbrown commentedWorks for me.
Nice work eojthebrave!
Comment #38
catchLet's get that in to fix the critical bug, then we have extra cleanup to do once that's in, but I think 99% of that cleanup is cruft removal now so that shouldn't hold up the quick fix.
Comment #39
Dries CreditAttribution: Dries commentedCommitted the fix for the double initialization. Moving to 'needs work' for follow-up work.
Comment #40
catch#754446: $user->data serialization unpack problem was duplicate - it looks like at the moment $account->data can be either an array or serialized when passed into user->save which is just lovely....
chx didn't like marcingy's patch on that issue, but didn't say why, and just switching it to always unserialize breaks tests.
So we have achoice of either checking for both a serialized or unserialized ->data in user_save(), trying to track down every possible place it could be one or the other, or the change to drupal_unpack(). The former is pretty bad (although a small price to pay for removing all the other dozens of lines we had working around it), if we do the middle then we're probably only delaying contrib modules running into it, the latter I'm not sure why it was so unpopular with chx but let's at least see what the test bot says. It's possible user_load() could also do it just for itself if we don't put that in drupal_unpatck(). So re-uploading that patch alongside some of the fixes from David's #32. If this doesn't work then I'll work on one of the alternative options.
Comment #41
catchLet's see what the bot says - this isn't RTBC so please no-one mark as such.
Comment #42
Dries CreditAttribution: Dries commentedGood lord, I'm already loving this patch. :)
Comment #43
catchAdded @return documentation to drupal_unpack() which it had previously lacked. Also confirmed that marcingy's steps to reproduce in #754446: $user->data serialization unpack problem no longer trigger the error. Will try to grab chx so he has a chance to complain (again) about this approach but seems cleanest to me (apart from ripping out $user->data altogether which it's a bit late for now.
Comment #44
rfayI still get
on every page even with #43 applied. I think it's non-negotiable to solve the unpack warning. I don't have time to debug it right now, but thought it was important to apply the patch and check in.
Comment #45
catch@rfay - did you reinstall? I didn't get any such error..
Comment #46
rfay@catch: I did not reinstall, but did re-save the current user. I thought that would do it. I hate to reinstall and lose the state that is causing the problem.
Is it essential to reinstall? I'll be happy to if that's fundamental to the fix.
Comment #47
marcingy CreditAttribution: marcingy commented@rfay yes a reinstall is essential as once the data in $user->data is corrupted the unpack will always fail. #754446: $user->data serialization unpack problem provides information on how to recreate the state with great ease on a code base without this patch applied.
Comment #48
rfayOK (#44-47): I reinstalled and no longer have the error. I have a similar setup and am doing similar things that should result in similar results, but since I didn't look deeply into this, I just have to trust you (with good reason, of course) that this is a real fix.
Please disregard my interruption :-)
Comment #49
catchSame patch except this time no change to drupal_unpack(), we can just do this in the user entity controller and keep it self contained there.
Comment #50
eojthebraveresialize should be reserialize
Otherwise this looks good to me. Comment cleanup and removal of more unnecessary code.
132 critical left. Go review some!
Comment #51
catchFixed the typo.
Comment #52
andypostStill not fixed #748976: "Show/hide descriptions" links no longer work
Comment #53
mfbIt's just not ideal, performance-wise, to unserialize $user->data twice in a row, once by drupal_unpack() and once by unserialize(). On the other hand this is a critical bug fix so maybe it should be applied and cleaned up later?
Comment #54
catchWe run unserialize dozens of times each request for menu links etc, an extra one on users is fairly minor IMO, often there's only one user, rarely more than a few loaded per page. The security hardening and user_save() performance outweigh that IMO.
If we were to commit this as is, I think there's still a remaining issue with #748976: "Show/hide descriptions" links no longer work, haven't had time to look into that one yet, although I'm frankly a bit surprised that's being stored in user->data in the first place. If we were somehow able to remove the extra unserialize() it'd be nice though. One option would be to just keep all $user->data in $user->data and lose the drupal_unpack() - since we've change the API slightly already I'm not sure if that'd be acceptable, but it'd solve some of these remaining issues.
Comment #55
mfbThe solution for #748976: "Show/hide descriptions" links no longer work is that you also need to unserialize $user->data in session.inc _drupal_session_read(), i.e.
More double unserialization which again isn't ideal :)
Comment #56
catchThat convinces me we should cull the drupal_unpack() and have $user->data be the canonical place to find this stuff. That synchronises _save() and _load(), means we don't do double work etc. Given we made modules using this do a bit more work on _save() already, slightly changing the object structure doesn't seem too bad to fix the bugs. If someone wants to beat me to a patch that'd be great, otherwise I'll try to get to it.
Comment #57
David_Rothstein CreditAttribution: David_Rothstein commentedHm, what that really makes me wonder is why two totally different mechanisms are used to populate the global $user object on login vs. actually loading a user account.... however, I guess that's another can of worms.
We do need to fix the serialization, but as per #32, the main thing we need here to get #748976: "Show/hide descriptions" links no longer work working is to respect the $edit['data'] that is passed in to user_save(), rather than blowing it away. I've rerolled the patch and added that one-line fix. No other changes for now, although I definitely agree with the direction proposed in #56. The magical drupal_unpack() thing is really weird and messes up the namespace pretty badly. (I can only imagine what happens at the moment if someone tries to put something in
$account->data['roles']
, for example.)Comment #58
mfbLooks like drupal_unpack() is also used a couple times by comment module to "unpack" $user->data into the $comment object...
Comment #59
moshe weitzman CreditAttribution: moshe weitzman commentedre: #57, i tried to standardize the building of global $user at #361471: Global $user object should be a complete entity. Read it and weep :)
Latest patch looks rtbc to me.
Comment #60
mfb$user->data still needs to be unserialized in session.inc, otherwise you are array merging a string and array, when $user object was built by _drupal_session_read() as opposed to user_load()
Comment #61
catchHere's the same patch with the addition of an unserialize() in session.inc
I've opened #763048: Remove drupal_unpack($user) due to namespacing collisions to tackile the namespacing, left it as critical, although it's more like 'major' if we get that priority eventually. We should get this in first though.
Comment #62
Dries CreditAttribution: Dries commentedThis looks good now. Committed to CVS HEAD. Thanks.
Comment #64
rfayIt looks to me like this important change didn't get documented in the module upgrade guide. Where did it get documented? With multiple commits and 63 comments it's hard to figure out what the net result was.
Comment #65
moshe weitzman CreditAttribution: moshe weitzman commentedI think the upgrade should state: The user edit form no longer automatically saves unrecognized properties to the users.data column. Modules relying on this behavior are recommended to create own table for storage of user properties. The users.data column is deprecated and will be removed in Drupal 8.
Comment #66
Jackinloadup CreditAttribution: Jackinloadup commentedIs there a task issue already in place to make sure that the users.data column is removed in Drupal 8?
Comment #67
rfayAdded to the 6/7 module upgrade guide: http://drupal.org/update/modules/6/7#user_data_change
Blog post: http://randyfay.com/node/85