This is part of #1837054: [META] Refactor account workflow (and config)
Problem/Motivation
The registry patch prevents the use of alternate password hashing schemes, since password.inc is always parsed, you cannot include a file that defines functions with the same names.
Proposed resolution
A couple possible ways around this exist. One is to use singleton objects (per Crell). A more direct alternative is to require that any alternative framework have a way to define alternative, uniquely named, functions that are called.
Remaining tasks
Issue summary needs updating
remaining tasks needs to be itemized
patch in comment #75 need to be re-rolled for D8
(reviews needed, tests to be written or run, documentation to be written, etc.)
User interface changes
(new or changed features/functionality in the user interface, modules added or removed, changes to URL paths, changes to user interface text)
API changes
(API changes/additions that would affect module, install profile, and theme developers, including examples of before/after code if appropriate)
Original report by [username]
The registry patch prevents the use of alternate password hashing schemes, since password.inc is always parsed, you cannot include a file that defines functions with the same names.
A couple possible ways around this exist. One is to use singleton objects (per Crell). A more direct alternative is to require that any alternative framework have a way to define alternative, uniquely named, functions that are called.
Here's a patch that tries to fix this - feedback on the variable name and default base name?
Obviously, you'll have to make sure your function registry is rebuilt...
Also, I had trouble logging in from the /node page - put not from /user
Comment | File | Size | Author |
---|---|---|---|
#75 | pluggable-pass-259103-73.patch | 36.85 KB | pwolanin |
#73 | pluggable-pass-259103-71.patch | 36.84 KB | pwolanin |
#71 | pluggable-pass-259103-71.patch | 35.51 KB | pwolanin |
#69 | pluggable-pass-259103-69.patch | 36.9 KB | pwolanin |
#67 | pluggable-pass-259103-67.patch | 35.57 KB | pwolanin |
Comments
Comment #1
pwolanin CreditAttribution: pwolanin commentedComment #2
Dries CreditAttribution: Dries commentedFor the image handling, we also have the ability to pick different image libraries. I think we should use the same mechanism, or at least come up with a consistent pattern. Anything else will result in confusing user experience. Maybe have a look at it, and see if we can clean those things up a bit? That is not to say your current approach is bad. It might be the case that everything is nice and dandy.
Comment #3
pwolanin CreditAttribution: pwolanin commented@Dries - [edit]
I don't think you meant that post for here...right, good pointComment #4
pwolanin CreditAttribution: pwolanin commentedOk, previous patch was more complicated than needed (and didn't work). Attached much simpler patch should work.
Comment #5
Owen Barton CreditAttribution: Owen Barton commented@pwolanin: Actually, I think Dries did intend to post this here - the method used to switch password hashing is used for several other things in Drupal - I haven't checked the HEAD code recently, but image frameworks and the caching and session subsystems use(d?) similar (or identical) systems.
In other words, chances are some other things may be broken here too (and no-one noticed), or if not perhaps someone has fixed them in a different way. Either way, the point is that we should use be consistent across these if possible.
Comment #6
Owen Barton CreditAttribution: Owen Barton commentedSo it looks like cache and session both use the same style as password - from bootstrap.inc
Image uses:
most of which is 9+ months old. This looks more similar to what your patch is doing, so (assuming image stuff is working with the registry) it seems reasonable to sync these up. I think it may be worth looking at cache and session at the same time though, since there may be a helper function to help manage the optional incs and their selection that could be shared by these 3 or 4 instances.
Comment #7
Crell CreditAttribution: Crell commentedIf we're revisiting how we handle pluggable subystems, may I suggest a class-based approach as discussed here: http://www.garfieldtech.com/blog/pluggable-systems-howto . That article was written specifically for the password hashing system. :-)
Comment #8
pwolanin CreditAttribution: pwolanin commented@Crell - compared to the procedural-style system you suggest, I think my solution is cleaner. It has the caveat that the replacement code has to be part of a module so that the relevant file will be declared in a .info file to the registry processing mechanism. On the other hand, this means the file can be located anywhere, as opposed to having to be added to the /includes as in your example.
Note that we thus avoid having to rewrite all the password code as object methods, when there is no real object-ness to what we are doing. Obviously the case is quote different with the database layer when we may be holding onto connection information, prepared statements, etc.
@Grugnog - if I understand it right, the cache and session includes happen before the registry ever gets involved and since we are always using that code, it's fine to always include those files or their replacements. Password, image, etc, are different since it is likely that on many page loads we won't need that code at all.
Comment #9
pwolanin CreditAttribution: pwolanin commentedActually- I now realized my method is even better - for this reason: the registry code automatically scans /includes. Hence we can drop an alternate framework into there as a .inc file and it should just workTM.
Comment #10
pwolanin CreditAttribution: pwolanin commentedfrom moshe:
So, it seems like we really ought to standardize this as best as possible. What I proposed here would not be as nice for cache, etc, since we those functions are called from all over. How do we feel about the performance of 1 level of indirection - essentially the 1st method suggested here: http://www.garfieldtech.com/blog/pluggable-systems-howto but using module_invoke() and/or drupal_function_exists() to take advantage of the registry's ability to include files as needed?
Comment #11
Crell CreditAttribution: Crell commentedOf the three mechanisms I describe in that article, the first one is the slowest and least feature-ful. The only "advantage" is offers is that it doesn't have any classes in it, which is not an advantage because there's nothing wrong with classes. If we're going to standardize on something, let's standardize on one that is faster and offers more robust features.
Comment #12
pwolanin CreditAttribution: pwolanin commented@Crell - ok- I'll play around with the factory object or function technique. I'm not sure why it would be faster to include a file with a class than a file with functions?
Comment #13
Crell CreditAttribution: Crell commentedThe purely-procedural method has to do a check or an include_once() on each function call. The class-based mechanism can use just an isset() to see if it needs to be initialized.
Off the cuff, I would recommend using a factory function, as those are easier to read and more consistent with what the database layer is doing. Something like:
For lack of a better term, I'm using $segment to refer to the cache table. Referring to a table name directly is rather inflexible. Instead, a given cache implementation should be able to decide that pages should be cached to a dedicated SQLite database, nodes should be cached to a memcache server, and variables should be cached to the standard database. The "default" implementation just puts everything into the local database.
The password or session handling would have a different signature for the factory function, but would still operate with a factory function. So for instance:
Not sure if that makes any sense; I'm posting this while trying to resuscitate a database server that just died on me so I'm a bit distracted. :-)
Comment #14
chx CreditAttribution: chx commentedI would much prefer fixing the registry to work with
variable_get('session_inc', './includes/session.inc');
and similar pluggable systems. It scans includes/ currently which is sorta broken even with database drivers check your registry table, it only contains database.pgsql.inc not .mysql.inc.Larry, "there's nothing wrong with classes" -- I thought you were standing about a foot from me in Barcelona where I was ranting high and low for what, ten or twenty minutes about how OOP is hard and unnatural for anyone but CS majors? The only reason I let it happen and even helped with PDO is that PDO itself is OOP but that does not mean that every pluggable subsystem should use this now. Quite the opposite: OOP is the last measure we should consider.
Comment #15
pwolanin CreditAttribution: pwolanin commented@Crell - I'm a little unclear as to why (in terms PHP inner workings) require_once would be really slower - doesn't PHP have a quick internal lookup as to which files are already included? One can get the list from get_included_files(). In a procedural-style construction of this, one could also have a single, central, function to decide based on isset() whether require_once() is needed and/or return the function name.
Comment #16
Crell CreditAttribution: Crell commented@chx: I was about 3 feet from you, actually, and I still disagree with you. "Aw, OOP is so haaaaard" is a strawman argument, and one that is also untrue. You can write totally incomprehensible OOP and totally incomprehensible procedural code, or clear and obvious code in either one. A fair portion of Drupal's procedural code right now is incomprehensible, too, to someone who hasn't had their head buried in it for years. Knee-jerk avoidance of classes and objects is just as damaging to the code base as knee-jerk usage of them is. Right tool for the job.
Besides, what's wrong with CS majors? There's a lot of them out there, and PHP in general could use them. :-) Besides, using standard, industry-accepted patterns and practices lowers the bar to entry for Drupal development because people can recognize how things work more easily. "Oh, they're doing X, I get it now."
@pwolanin: require_once() actually does have a performance hit, especially on opcode caches. The internal lookup is not "quick", for internal engine reasons that I don't claim to fully understand but have seen Rasums et al repeat often. Yes, one could write a user-space wrapper around it. However, you're then still doing an indirect function call rather than a literal method, which is a very slight performance difference.
Much of it does boil down to a stylistic question, I grant. I find an architecture that wraps shared, swappable functionality into tidy objects that can be mixed and matched to be much easier to understand and wrap my head around than 500 verbosely named functions. It is more self-documenting, and more approachable for people who aren't spoon-coders. It also makes it easier to do things like specifying an engine when needed, and is easy to wrap factory functions around to keep the syntax simple. Avoiding it just because "OMG it has classes" is naive and short-sighted.
Comment #17
pwolanin CreditAttribution: pwolanin commented@Crell - after having a long discussion about this, I think my short term answer is that we should make as many as possible look basically like the image toolkit handing - where we have a dispatch function that is the central point where we decide which actual funciton to invoke. While the OOP version may have some advantages, it also means re-writing lots of code. So, my intention at this point is to get this working with the minimal changes to code. Obviously for caching, this may not be the final solution for D7 (especially if we need to have multiple caching handlers per page load).
Comment #18
Crell CreditAttribution: Crell commentedThe long promised RFC, for those who are interested: http://www.garfieldtech.com/blog/drupal-handler-rfc
Comment #19
pwolanin CreditAttribution: pwolanin commentedHere is in initial patch for passwords that basically follows the outline at http://www.garfieldtech.com/blog/pluggable-systems-howto with the addition of a declared interface to make it a well-defined API.
I put the interface and factory function in module.inc since they need to be always available, and are essentially declaring interfaces that modules may implement.
If this is a reasonable approach (at least to make these frameworks work with the registry) I can continue with others.
Comment #20
pwolanin CreditAttribution: pwolanin commentedlooking at the other things (image, sessions, smtp) maybe putting the password interface just in user module makes more sensefor password?
Comment #21
pwolanin CreditAttribution: pwolanin commentedthis covers both passwords and the smtp library.
All tests pass (PHP 5.2.5, MySQL 5.0) with this patch (7109 passes, 0 fails, 0 exceptions)
Comment #22
moshe weitzman CreditAttribution: moshe weitzman commentedThis is looking very nice. Subscribe.
Comment #23
Crell CreditAttribution: Crell commentedMethods should use camelCase(), not lower_case(). The factory functions should also be more compactly named. "drupal_mail_wrapper()" is needlessly verbose. drupal_mail() is sufficient.
This is a step in the right direction, but relying on the variable system and autoloader means we can't implement it early in the page request process. That means it won't extend to cache (the system that breaks the registry now) and session unless we require the use of the database, even for aggressive caching. (eaton's been haranguing me about that problem. :-) ) It also doesn't support multiple routing, so we still can't, for instance, route page caching and "other" caching to different caching mechanisms.
After a long nighttime session this past weekend with eaton and CitizenKane, I'm working on a 2.x branch of handler.module that should address both of those issues. It's all happening in CVS if you want to follow along. I could definitely use some feedback.
Comment #24
pwolanin CreditAttribution: pwolanin commented@Crell - drupal_mail_wrapper is the existing name of the D5/D6 function. user_mail() is already taken... Maybe drupal_smtp()?
For SMTP this supports multiple routing. For passwords, I don't know why multiple routing would make sense.
Comment #25
moshe weitzman CreditAttribution: moshe weitzman commentedIf I read the patch correctly, the multiple routing for SMTP assumes that the code sending the email is the decider of which driver to use. But thats not what we want in SMTP, nor in cache nor db layers. All those systems have a 'segment' which should route the call to the right driver. for SMPT, the segment is the $mail_key. for cache, it is the 'table'. for db layer, it is the $databases array.
All thats needed I think is to change the smtp_system variable into an array that is keyed by segment. Then the code to lookup the right driver becomes:
Comment #26
pwolanin CreditAttribution: pwolanin commentedHere's is a revised patch which fixes the method and variable names. One thing I noticed is that we should do a linear scale to log scale conversion for hash strength to make this more user-friendly.
Comment #27
pwolanin CreditAttribution: pwolanin commentedtalking to Moshe, here's a more flexible smtp lookup code.
Comment #28
Crell CreditAttribution: Crell commented@Moshe: The routing segment for the database would be the target, not the table. For cache, it would be "page", "menu", "form", etc.
I don't actually comprehend what pwloanin says in #26... :-)
Comment #29
pwolanin CreditAttribution: pwolanin commented@Crell - the phpass code stores the # of iterations in log2 form - i.e. if the value is 10, you perform 2^10 = 1024 iterations. Since alternate hashing schemes might have some arbitrarily different internal representation of strength, it makes more sense to represent to the developer a linear scale (which I think we all understand) which can be internally converted as needed. In this case converted by taking
log($strength, 2)
. So setting strength to 0.1 will make the iterative part of the code about 10x faster, while setting it to 10 will make it about 10x slower.The earlier version of the patch, setting strength to 0.5 would change the # of iterations from 2^14 to 2^7 which is a 128-fold difference in speed for the iterative part.
Comment #30
pwolanin CreditAttribution: pwolanin commentedall tests pass with this patch: 7109 passes, 0 fails, 0 exceptions
Comment #31
Anonymous (not verified) CreditAttribution: Anonymous commentedsubscribe.
Comment #32
robertDouglass CreditAttribution: robertDouglass commentedsub.
Comment #33
pwolanin CreditAttribution: pwolanin commentedre-ran all tests: 7184 passes, 0 fails, and 0 exceptions
Note sure why the testing server sees fails.
Comment #34
mfer CreditAttribution: mfer commentedsubscribing... I hope to review it in the next week.
Comment #35
catchI took a quick look through this and the approach looks sane to me, although I'm not the best person to review anything OOP by any means. Obviously I'm in favour of any patch with the word 'standardize' in the title. Will try to take a more in depth look at this soon.
Comment #36
RobLoachI love the approach taken here, but it seems like something that should be done in a number of different patches instead of all pluggable systems at once. I particularly like the replacement of the drupal_mail system. I absolutely love the push for a standard way of implementing these pluggable systems, and the approach that's being taken here. I'll take a closer look at this in the near future.
Comment #37
pwolanin CreditAttribution: pwolanin commented@Rob Loach - well this patch only covers 2 - so we could split the mail one off and do one per issue if you think that's really the right way to go.
Crell is interested in getting a unifying API in, so whatever we do here needs to be enough to get these working correctly, but may be intermediate on the way to another solution in the long term.
Comment #38
RobLoachHow would a unifying API for all external APIs work? Sounds like it would just be an empty interface, or something.
Comment #39
mfer CreditAttribution: mfer commented@Rob Loach see http://www.garfieldtech.com/blog/drupal-handler-rfc and http://drupal.org/project/handler
Comment #40
pwolanin CreditAttribution: pwolanin commentedok, so per comments above and conversation w/ catch, I'll split this into 2 separate issues and add another issue for any other framework so as to have minimal and independent patches. Attached is a patch for just the password part.
Note that while this looks like a somewhat big patch - it is almost entirely just a refactoring of code from a procedural to a class/method form. No new functionality is added.
The smtp library part is continued at: http://drupal.org/node/331180
Comment #41
catchCreated a new account and logged in. Ran user and system tests. No glaring errors in the patch, and it's 95% moving code around, so RTBC.
Comment #42
webchickThis belongs in the other patch. :)
This is kind of floating in outer space, explaining what is below and not actually documenting any particular variable/function/etc. So let's just make it regular comments and not PHPDoc (remove the second * in /**)
Could we add some PHPDoc above this that basically summarizs what it's for, and why one might implement it in their class? It doesn't have to be overly detailed, but just remember that most Drupal people do not understand this stuff so throwing them a bone is nice. :)
This is a little tweaky. Perhaps it's made worse by the fact that this function is used to generate passwords in D6 and below (I really like the move of that functionality to a function called user_*generate*_password() btw). So could we maybe add a verb to this function too to make it more clear what it's doing? (user_get_password()?)
Going to play around and see if I can break it with openID, etc. ;)
Comment #43
pwolanin CreditAttribution: pwolanin commentedtrying to think of some other factory function name - I think this is about the most concise while still being meaningful, so I'm not really convinced it need to change. Why is this worse than having
db_select()
?a possible alternative could be like:
Comment #44
pwolanin CreditAttribution: pwolanin commentedre-rolled using this naming scheme.
Comment #45
pwolanin CreditAttribution: pwolanin commentedComment #46
Anonymous (not verified) CreditAttribution: Anonymous commentedthis looks really nice.
all tests pass.
one tiny niggle is "$P$".
forgive my ignorance about salting best practice, but is "$P$" significant in any way, or is just a nice recognisable way to start the string? could we use a class constant with a descriptive name or a code comment for this?
Comment #47
pwolanin CreditAttribution: pwolanin commentedThe $P$ is necessary to be compatible with the framework - please read the entire prior ~300 comment thread before suggesting any changes to the underlying algorithms...
Comment #48
Anonymous (not verified) CreditAttribution: Anonymous commentedsorry if i wasn't clear, but i'm not at all suggesting any changes to underlying algorithms.
i was simply suggesting that '$P$' looks a little magic, so a comment or the use of a descriptively-named class constant might be useful.
Comment #49
pwolanin CreditAttribution: pwolanin commentedthe '$P$' is the hash identifier. Sure, we could add a comment to explain why that's needed.
Here's the original issue: http://drupal.org/node/29706
Comment #51
pwolanin CreditAttribution: pwolanin commentedhmm, apparently the testbed had a problem. Anyhow, re-rolled w/ and extra code comment and re-tested.
7299 passes, 0 fails, and 0 exceptions
Comment #52
catchThis all looks great, and it's been reviewed by people who'd spot any OO issues (which I wouldn't). Ran user tests and got 100% pass, plus it's primarily moving code around, so RTBC.
Comment #53
Damien Tournoud CreditAttribution: Damien Tournoud commentedWhy is AuthenticatorInterface in user.module, while its implementation, strangely named UserPassword is in password.inc? We need to think more about this. I recommend:
- to have password.inc containing a PasswordHasherInterface (the current AuthenticatorInterface) and the password_hasher() factory function (the current user_authenticator())
- to have a password.default.inc containing the current UserPassword, renamed DrupalPasswordHasher
Comment #54
pwolanin CreditAttribution: pwolanin commented@Damien - I didn't really think about re-architecting this files, but something like what you suggest might make sense. Also, I had at some point assumed that user module might define several interfaces. Do we really need another file for this? At the least the factory function shoul dlive in use module so we don't have to do a drupal_function_exists().
Comment #55
pwolanin CreditAttribution: pwolanin commentedafter discussion w/ Damien in IRc, we concluded that the factory funciton should stay in user.module fo rthe reason I suggest.
Comment #56
Crell CreditAttribution: Crell commentedRemember that functions don't magically lazy-load without the drupal_function_exists(). Classes and interfaces can live pretty much anywhere and will just magically work.
Just random reminder. :-)
Comment #57
pwolanin CreditAttribution: pwolanin commented@Crell - yes, exactly why I thought to keep the factory function in the .module file.
Comment #58
pwolanin CreditAttribution: pwolanin commented@Damien - I'm not sure I agree with your suggested naming changes for the interface, etc. With a few more minor code tweaks, we could use this same interface to handle local/remote/or other arbitrary authentication schemes.
I have these tweaks in mind for a follow-up issue - I have been trying to keep this issue as narrow as possible.
Comment #59
pwolanin CreditAttribution: pwolanin commentedThis atch moves code around as suggested so password.inc has the interface, and user.password.inc has the default implementation, but does not alter anything else wrt the prior RTBC patch.
Comment #60
catchPatch still looks fine and all the tests pass. I think we can leave further tweaking to another issue - since this is really just about stopping things getting broken by the registry. Marking back to RTBC.
Comment #62
chx CreditAttribution: chx commentedNote that I used
module_invoke(variable_get('field_storage_module', 'field_sql_storage'), $op, $arg1, $arg2);
in field module. That's like a Drupal-ish solution...Comment #63
pwolanin CreditAttribution: pwolanin commented@chx - that would be ok for mail, it would be much more of a pain here where we have several methods in the interface. And I think we should try to tackle these using a consistent pattern.
Here's a simple re-roll, but I'm going to make a few more changes beyond this simple change.
Comment #65
pwolanin CreditAttribution: pwolanin commentedreroll just for patch hunks that didn't apply
Comment #67
pwolanin CreditAttribution: pwolanin commentedHere's a re-worked patch that makes the class more self contained and makes it conform to an interface. The important change here is that we make no assumption that the "real" user password is stored in the local database - so for example an implementing class could actually talk to a remote (e.g. LDAP or other) database to authenticate the user.
note most of the changes to user.test are renaming
$user
(don't do it!) to$account
Comment #69
pwolanin CreditAttribution: pwolanin commentedre-roll from CVS (somehow patch doesn't like the removed file notation int he git diff). Also removed interface checking per chx.
Comment #71
pwolanin CreditAttribution: pwolanin commentedapparently patch thinks 2 additions at the end of a file conflict.
Comment #73
pwolanin CreditAttribution: pwolanin commentedargh - ok, rolled against a real CVS checkout again.
Comment #75
pwolanin CreditAttribution: pwolanin commentedoops
Comment #77
pwolanin CreditAttribution: pwolanin commentedComment #78
Dave ReidAdding tag.
Comment #79
YesCT CreditAttribution: YesCT commentedThis is part of a meta issue to rework account workflow #1837054: [META] Refactor account workflow (and config)
I started to updated the issue summary with the new issue summary template but the summary still needs work to reflect the current state of this issue, especially detailing the remaining tasks.
I added tags.
Comment #79.0
YesCT CreditAttribution: YesCT commentedUpdated issue summary with issue summary template, but all the comments need to be read and more info added still.
Comment #80
Crell CreditAttribution: Crell commentedIsn't this deprecated in favor of #1463624: Move password.inc into DIC
Comment #80.0
Crell CreditAttribution: Crell commentedUpdated issue summary added link to meta issue.
Comment #81
Crell CreditAttribution: Crell commentedThis was done as part of the linked issue.