Download & Extend

make pluggable password hashing framework more generic and use class auto-loading.

Project:Drupal core
Version:8.x-dev
Component:user.module
Category:feature request
Priority:normal
Assigned:pwolanin
Status:needs work
Issue tags:password.inc

Issue Summary

The registry patch prevents the use of alternate password hashing schemes, since password.inc is always parsed, you cannot include a file that defines functions with the same names.

A couple possible ways around this exist. One is to use singleton objects (per Crell). A more direct alternative is to require that any alternative framework have a way to define alternative, uniquely named, functions that are called.

Here's a patch that tries to fix this - feedback on the variable name and default base name?

Obviously, you'll have to make sure your function registry is rebuilt...

Also, I had trouble logging in from the /node page - put not from /user

AttachmentSizeStatusTest resultOperations
pluggable-passwords.patch6.21 KBIdleFAILED: [[SimpleTest]]: [MySQL] Unable to apply patch pluggable-passwords.patch.View details | Re-test

Comments

#1

Title:allow alternate ashing frameworks» fix alternate password hashing frameworks

#2

For the image handling, we also have the ability to pick different image libraries. I think we should use the same mechanism, or at least come up with a consistent pattern. Anything else will result in confusing user experience. Maybe have a look at it, and see if we can clean those things up a bit? That is not to say your current approach is bad. It might be the case that everything is nice and dandy.

#3

@Dries - [edit] I don't think you meant that post for here... right, good point

#4

Ok, previous patch was more complicated than needed (and didn't work). Attached much simpler patch should work.

AttachmentSizeStatusTest resultOperations
pluggable-passwords-259103-4.patch1.72 KBIdleFailed: Failed to apply patch.View details | Re-test

#5

@pwolanin: Actually, I think Dries did intend to post this here - the method used to switch password hashing is used for several other things in Drupal - I haven't checked the HEAD code recently, but image frameworks and the caching and session subsystems use(d?) similar (or identical) systems.

In other words, chances are some other things may be broken here too (and no-one noticed), or if not perhaps someone has fixed them in a different way. Either way, the point is that we should use be consistent across these if possible.

#6

So it looks like cache and session both use the same style as password - from bootstrap.inc

<?php
     
require_once variable_get('cache_inc', './includes/cache.inc');
...
      require_once
variable_get('session_inc', './includes/session.inc');
?>

Image uses:

<?php
function image_toolkit_invoke($method, $params = array()) {
  if (
$toolkit = image_get_toolkit()) {
   
$function = 'image_' . $toolkit . '_' . $method;
    if (
function_exists($function)) {
      return
call_user_func_array($function, $params);
    }
    else {
     
watchdog('php', 'The selected image handling toolkit %toolkit can not correctly process %function.', array('%toolkit' => $toolkit, '%function' => $function), WATCHDOG_ERROR);
      return
FALSE;
    }
  }
}
?>

most of which is 9+ months old. This looks more similar to what your patch is doing, so (assuming image stuff is working with the registry) it seems reasonable to sync these up. I think it may be worth looking at cache and session at the same time though, since there may be a helper function to help manage the optional incs and their selection that could be shared by these 3 or 4 instances.

#7

If we're revisiting how we handle pluggable subystems, may I suggest a class-based approach as discussed here: http://www.garfieldtech.com/blog/pluggable-systems-howto . That article was written specifically for the password hashing system. :-)

#8

@Crell - compared to the procedural-style system you suggest, I think my solution is cleaner. It has the caveat that the replacement code has to be part of a module so that the relevant file will be declared in a .info file to the registry processing mechanism. On the other hand, this means the file can be located anywhere, as opposed to having to be added to the /includes as in your example.

Note that we thus avoid having to rewrite all the password code as object methods, when there is no real object-ness to what we are doing. Obviously the case is quote different with the database layer when we may be holding onto connection information, prepared statements, etc.

@Grugnog - if I understand it right, the cache and session includes happen before the registry ever gets involved and since we are always using that code, it's fine to always include those files or their replacements. Password, image, etc, are different since it is likely that on many page loads we won't need that code at all.

#9

Actually- I now realized my method is even better - for this reason: the registry code automatically scans /includes. Hence we can drop an alternate framework into there as a .inc file and it should just workTM.

#10

Title:fix alternate password hashing frameworks» fix and standardize pluggable frameworks
Status:needs review» needs work

from moshe:

fyi there are 2 more pluggable systems in core: smtp and sessions. search for

variable_get('smtp_library', '')
variable_get('session_inc', './includes/session.inc');

So, it seems like we really ought to standardize this as best as possible. What I proposed here would not be as nice for cache, etc, since we those functions are called from all over. How do we feel about the performance of 1 level of indirection - essentially the 1st method suggested here: http://www.garfieldtech.com/blog/pluggable-systems-howto but using module_invoke() and/or drupal_function_exists() to take advantage of the registry's ability to include files as needed?

#11

Of the three mechanisms I describe in that article, the first one is the slowest and least feature-ful. The only "advantage" is offers is that it doesn't have any classes in it, which is not an advantage because there's nothing wrong with classes. If we're going to standardize on something, let's standardize on one that is faster and offers more robust features.

#12

@Crell - ok- I'll play around with the factory object or function technique. I'm not sure why it would be faster to include a file with a class than a file with functions?

#13

The purely-procedural method has to do a check or an include_once() on each function call. The class-based mechanism can use just an isset() to see if it needs to be initialized.

Off the cuff, I would recommend using a factory function, as those are easier to read and more consistent with what the database layer is doing. Something like:

<?php
function cache($segment, $driver = 'default') {
  static
$caches;

  if (empty(
$caches[$driver][$segment])) {
    if (
$driver == $default) {
     
$driver = variable_get(...);
    }
   
// Whatever initialization is needed.
 
}

  return
$caches[$driver][$segment];
}

cache('page')->set($path, $page);
cache('page')->empty();
cache('filter')->get(...);
?>

For lack of a better term, I'm using $segment to refer to the cache table. Referring to a table name directly is rather inflexible. Instead, a given cache implementation should be able to decide that pages should be cached to a dedicated SQLite database, nodes should be cached to a memcache server, and variables should be cached to the standard database. The "default" implementation just puts everything into the local database.

The password or session handling would have a different signature for the factory function, but would still operate with a factory function. So for instance:

<?php
function password($driver = 'default') {
  static
$drivers;

  if (empty(
$drivers[$driver])) {
   
// Initialization.
 
}

  return
$drivers[$driver];
}

password()->userCheckPassword($password, $account);
password()->userNeedsNewHash($account);
// etc.
?>

Not sure if that makes any sense; I'm posting this while trying to resuscitate a database server that just died on me so I'm a bit distracted. :-)

#14

I would much prefer fixing the registry to work with variable_get('session_inc', './includes/session.inc'); and similar pluggable systems. It scans includes/ currently which is sorta broken even with database drivers check your registry table, it only contains database.pgsql.inc not .mysql.inc.

Larry, "there's nothing wrong with classes" -- I thought you were standing about a foot from me in Barcelona where I was ranting high and low for what, ten or twenty minutes about how OOP is hard and unnatural for anyone but CS majors? The only reason I let it happen and even helped with PDO is that PDO itself is OOP but that does not mean that every pluggable subsystem should use this now. Quite the opposite: OOP is the last measure we should consider.

#15

@Crell - I'm a little unclear as to why (in terms PHP inner workings) require_once would be really slower - doesn't PHP have a quick internal lookup as to which files are already included? One can get the list from get_included_files(). In a procedural-style construction of this, one could also have a single, central, function to decide based on isset() whether require_once() is needed and/or return the function name.

#16

@chx: I was about 3 feet from you, actually, and I still disagree with you. "Aw, OOP is so haaaaard" is a strawman argument, and one that is also untrue. You can write totally incomprehensible OOP and totally incomprehensible procedural code, or clear and obvious code in either one. A fair portion of Drupal's procedural code right now is incomprehensible, too, to someone who hasn't had their head buried in it for years. Knee-jerk avoidance of classes and objects is just as damaging to the code base as knee-jerk usage of them is. Right tool for the job.

Besides, what's wrong with CS majors? There's a lot of them out there, and PHP in general could use them. :-) Besides, using standard, industry-accepted patterns and practices lowers the bar to entry for Drupal development because people can recognize how things work more easily. "Oh, they're doing X, I get it now."

@pwolanin: require_once() actually does have a performance hit, especially on opcode caches. The internal lookup is not "quick", for internal engine reasons that I don't claim to fully understand but have seen Rasums et al repeat often. Yes, one could write a user-space wrapper around it. However, you're then still doing an indirect function call rather than a literal method, which is a very slight performance difference.

Much of it does boil down to a stylistic question, I grant. I find an architecture that wraps shared, swappable functionality into tidy objects that can be mixed and matched to be much easier to understand and wrap my head around than 500 verbosely named functions. It is more self-documenting, and more approachable for people who aren't spoon-coders. It also makes it easier to do things like specifying an engine when needed, and is easy to wrap factory functions around to keep the syntax simple. Avoiding it just because "OMG it has classes" is naive and short-sighted.

#17

@Crell - after having a long discussion about this, I think my short term answer is that we should make as many as possible look basically like the image toolkit handing - where we have a dispatch function that is the central point where we decide which actual funciton to invoke. While the OOP version may have some advantages, it also means re-writing lots of code. So, my intention at this point is to get this working with the minimal changes to code. Obviously for caching, this may not be the final solution for D7 (especially if we need to have multiple caching handlers per page load).

#18

The long promised RFC, for those who are interested: http://www.garfieldtech.com/blog/drupal-handler-rfc

#19

Status:needs work» needs review

Here is in initial patch for passwords that basically follows the outline at http://www.garfieldtech.com/blog/pluggable-systems-howto with the addition of a declared interface to make it a well-defined API.

I put the interface and factory function in module.inc since they need to be always available, and are essentially declaring interfaces that modules may implement.

If this is a reasonable approach (at least to make these frameworks work with the registry) I can continue with others.

AttachmentSizeStatusTest resultOperations
pluggable-password-259103-19.patch28.5 KBIdleFailed: Failed to install HEAD.View details | Re-test

#20

looking at the other things (image, sessions, smtp) maybe putting the password interface just in user module makes more sensefor password?

AttachmentSizeStatusTest resultOperations
pluggable-password-259103-20.patch28.29 KBIdleFailed: Failed to install HEAD.View details | Re-test

#21

this covers both passwords and the smtp library.

All tests pass (PHP 5.2.5, MySQL 5.0) with this patch (7109 passes, 0 fails, 0 exceptions)

AttachmentSizeStatusTest resultOperations
pluggable-pass-smtp-259103-21a.patch35 KBIdleFailed: Failed to install HEAD.View details | Re-test

#22

This is looking very nice. Subscribe.

#23

Status:needs review» needs work

Methods should use camelCase(), not lower_case(). The factory functions should also be more compactly named. "drupal_mail_wrapper()" is needlessly verbose. drupal_mail() is sufficient.

This is a step in the right direction, but relying on the variable system and autoloader means we can't implement it early in the page request process. That means it won't extend to cache (the system that breaks the registry now) and session unless we require the use of the database, even for aggressive caching. (eaton's been haranguing me about that problem. :-) ) It also doesn't support multiple routing, so we still can't, for instance, route page caching and "other" caching to different caching mechanisms.

After a long nighttime session this past weekend with eaton and CitizenKane, I'm working on a 2.x branch of handler.module that should address both of those issues. It's all happening in CVS if you want to follow along. I could definitely use some feedback.

#24

@Crell - drupal_mail_wrapper is the existing name of the D5/D6 function. user_mail() is already taken... Maybe drupal_smtp()?

For SMTP this supports multiple routing. For passwords, I don't know why multiple routing would make sense.

#25

If I read the patch correctly, the multiple routing for SMTP assumes that the code sending the email is the decider of which driver to use. But thats not what we want in SMTP, nor in cache nor db layers. All those systems have a 'segment' which should route the call to the right driver. for SMPT, the segment is the $mail_key. for cache, it is the 'table'. for db layer, it is the $databases array.
All thats needed I think is to change the smtp_system variable into an array that is keyed by segment. Then the code to lookup the right driver becomes:

<?php
$config
= variable_get('smtp_system', array('default' => 'DrupalMailSend'));
$driver = isset($config[$segment]) ? $config[$segment] : $config['default'];
?>

#26

Status:needs work» needs review

Here's is a revised patch which fixes the method and variable names. One thing I noticed is that we should do a linear scale to log scale conversion for hash strength to make this more user-friendly.

AttachmentSizeStatusTest resultOperations
pluggable-pass-smtp-259103-25.patch34.86 KBIdleFailed: Failed to install HEAD.View details | Re-test

#27

talking to Moshe, here's a more flexible smtp lookup code.

AttachmentSizeStatusTest resultOperations
pluggable-pass-smtp-259103-27a.patch35.61 KBIdleFailed: Failed to install HEAD.View details | Re-test

#28

@Moshe: The routing segment for the database would be the target, not the table. For cache, it would be "page", "menu", "form", etc.

I don't actually comprehend what pwloanin says in #26... :-)

#29

@Crell - the phpass code stores the # of iterations in log2 form - i.e. if the value is 10, you perform 2^10 = 1024 iterations. Since alternate hashing schemes might have some arbitrarily different internal representation of strength, it makes more sense to represent to the developer a linear scale (which I think we all understand) which can be internally converted as needed. In this case converted by taking log($strength, 2). So setting strength to 0.1 will make the iterative part of the code about 10x faster, while setting it to 10 will make it about 10x slower.

The earlier version of the patch, setting strength to 0.5 would change the # of iterations from 2^14 to 2^7 which is a 128-fold difference in speed for the iterative part.

#30

all tests pass with this patch: 7109 passes, 0 fails, 0 exceptions

#31

subscribe.

#32

sub.

#33

re-ran all tests: 7184 passes, 0 fails, and 0 exceptions

Note sure why the testing server sees fails.

#34

subscribing... I hope to review it in the next week.

#35

I took a quick look through this and the approach looks sane to me, although I'm not the best person to review anything OOP by any means. Obviously I'm in favour of any patch with the word 'standardize' in the title. Will try to take a more in depth look at this soon.

#36

Status:needs review» needs work

I love the approach taken here, but it seems like something that should be done in a number of different patches instead of all pluggable systems at once. I particularly like the replacement of the drupal_mail system. I absolutely love the push for a standard way of implementing these pluggable systems, and the approach that's being taken here. I'll take a closer look at this in the near future.

#37

@Rob Loach - well this patch only covers 2 - so we could split the mail one off and do one per issue if you think that's really the right way to go.

Crell is interested in getting a unifying API in, so whatever we do here needs to be enough to get these working correctly, but may be intermediate on the way to another solution in the long term.

#38

How would a unifying API for all external APIs work? Sounds like it would just be an empty interface, or something.

#39

#40

Title:fix and standardize pluggable frameworks» fix pluggable password hashing framework
Status:needs work» needs review

ok, so per comments above and conversation w/ catch, I'll split this into 2 separate issues and add another issue for any other framework so as to have minimal and independent patches. Attached is a patch for just the password part.

Note that while this looks like a somewhat big patch - it is almost entirely just a refactoring of code from a procedural to a class/method form. No new functionality is added.

The smtp library part is continued at: http://drupal.org/node/331180

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-40.patch28.73 KBIdleFailed: Failed to install HEAD.View details | Re-test

#41

Status:needs review» reviewed & tested by the community

Created a new account and logged in. Ran user and system tests. No glaring errors in the patch, and it's 95% moving code around, so RTBC.

#42

Status:reviewed & tested by the community» needs work

+ *  The return value from drupal_smtp()->send(), if it is called.

This belongs in the other patch. :)

+/**
+ * User module interfaces and their corresponding factory functions.
+ */

This is kind of floating in outer space, explaining what is below and not actually documenting any particular variable/function/etc. So let's just make it regular comments and not PHPDoc (remove the second * in /**)

+interface PasswordInterface {

Could we add some PHPDoc above this that basically summarizs what it's for, and why one might implement it in their class? It doesn't have to be overly detailed, but just remember that most Drupal people do not understand this stuff so throwing them a bone is nice. :)

user_password()->hash($password)

This is a little tweaky. Perhaps it's made worse by the fact that this function is used to generate passwords in D6 and below (I really like the move of that functionality to a function called user_*generate*_password() btw). So could we maybe add a verb to this function too to make it more clear what it's doing? (user_get_password()?)

Going to play around and see if I can break it with openID, etc. ;)

#43

trying to think of some other factory function name - I think this is about the most concise while still being meaningful, so I'm not really convinced it need to change. Why is this worse than having db_select()?

a possible alternative could be like:

user_authenicator()->hashPassword($pass);
user_authenicator()->checkPassword($account, $pass);

#44

re-rolled using this naming scheme.

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-44.patch27.57 KBIdleFailed: Failed to install HEAD.View details | Re-test

#45

Status:needs work» needs review

#46

this looks really nice.

all tests pass.

one tiny niggle is "$P$".

+  private function generateSalt() {
+    $output = '$P$';

forgive my ignorance about salting best practice, but is "$P$" significant in any way, or is just a nice recognisable way to start the string? could we use a class constant with a descriptive name or a code comment for this?

#47

The $P$ is necessary to be compatible with the framework - please read the entire prior ~300 comment thread before suggesting any changes to the underlying algorithms...

#48

sorry if i wasn't clear, but i'm not at all suggesting any changes to underlying algorithms.

i was simply suggesting that '$P$' looks a little magic, so a comment or the use of a descriptively-named class constant might be useful.

#49

the '$P$' is the hash identifier. Sure, we could add a comment to explain why that's needed.

Here's the original issue: http://drupal.org/node/29706

#50

Status:needs review» needs work

The last submitted patch failed testing.

#51

Status:needs work» needs review

hmm, apparently the testbed had a problem. Anyhow, re-rolled w/ and extra code comment and re-tested.
7299 passes, 0 fails, and 0 exceptions

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-51.patch27.62 KBIdleFailed: Failed to apply patch.View details | Re-test

#52

Status:needs review» reviewed & tested by the community

This all looks great, and it's been reviewed by people who'd spot any OO issues (which I wouldn't). Ran user tests and got 100% pass, plus it's primarily moving code around, so RTBC.

#53

Status:reviewed & tested by the community» needs work

Why is AuthenticatorInterface in user.module, while its implementation, strangely named UserPassword is in password.inc? We need to think more about this. I recommend:

- to have password.inc containing a PasswordHasherInterface (the current AuthenticatorInterface) and the password_hasher() factory function (the current user_authenticator())
- to have a password.default.inc containing the current UserPassword, renamed DrupalPasswordHasher

#54

@Damien - I didn't really think about re-architecting this files, but something like what you suggest might make sense. Also, I had at some point assumed that user module might define several interfaces. Do we really need another file for this? At the least the factory function shoul dlive in use module so we don't have to do a drupal_function_exists().

#55

after discussion w/ Damien in IRc, we concluded that the factory funciton should stay in user.module fo rthe reason I suggest.

#56

Remember that functions don't magically lazy-load without the drupal_function_exists(). Classes and interfaces can live pretty much anywhere and will just magically work.

Just random reminder. :-)

#57

@Crell - yes, exactly why I thought to keep the factory function in the .module file.

#58

@Damien - I'm not sure I agree with your suggested naming changes for the interface, etc. With a few more minor code tweaks, we could use this same interface to handle local/remote/or other arbitrary authentication schemes.

I have these tweaks in mind for a follow-up issue - I have been trying to keep this issue as narrow as possible.

#59

Status:needs work» needs review

This atch moves code around as suggested so password.inc has the interface, and user.password.inc has the default implementation, but does not alter anything else wrt the prior RTBC patch.

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-59.patch30.54 KBIdleFailed: Failed to apply patch.View details | Re-test

#60

Status:needs review» reviewed & tested by the community

Patch still looks fine and all the tests pass. I think we can leave further tweaking to another issue - since this is really just about stopping things getting broken by the registry. Marking back to RTBC.

#61

Status:reviewed & tested by the community» needs work

The last submitted patch failed testing.

#62

Note that I used module_invoke(variable_get('field_storage_module', 'field_sql_storage'), $op, $arg1, $arg2); in field module. That's like a Drupal-ish solution...

#63

Status:needs work» needs review

@chx - that would be ok for mail, it would be much more of a pain here where we have several methods in the interface. And I think we should try to tackle these using a consistent pattern.

Here's a simple re-roll, but I'm going to make a few more changes beyond this simple change.

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-63.patch28.98 KBIdleFailed: Failed to apply patch.View details | Re-test

#64

Status:needs review» needs work

The last submitted patch failed testing.

#65

Status:needs work» needs review

reroll just for patch hunks that didn't apply

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-65.patch30.15 KBIdleFailed: Failed to apply patch.View details | Re-test

#66

Status:needs review» needs work

The last submitted patch failed testing.

#67

Title:fix pluggable password hashing framework» make pluggable password hashing framework more generic and use class auto-loading.
Status:needs work» needs review

Here's a re-worked patch that makes the class more self contained and makes it conform to an interface. The important change here is that we make no assumption that the "real" user password is stored in the local database - so for example an implementing class could actually talk to a remote (e.g. LDAP or other) database to authenticate the user.

note most of the changes to user.test are renaming $user (don't do it!) to $account

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-67.patch35.57 KBIdleFailed: Failed to apply patch.View details | Re-test

#68

Status:needs review» needs work

The last submitted patch failed testing.

#69

Status:needs work» needs review

re-roll from CVS (somehow patch doesn't like the removed file notation int he git diff). Also removed interface checking per chx.

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-69.patch36.9 KBIdleFailed: Failed to apply patch.View details | Re-test

#70

Status:needs review» needs work

The last submitted patch failed testing.

#71

Status:needs work» needs review

apparently patch thinks 2 additions at the end of a file conflict.

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-71.patch35.51 KBIdleFailed: Failed to install HEAD.View details | Re-test

#72

Status:needs review» needs work

The last submitted patch failed testing.

#73

Status:needs work» needs review

argh - ok, rolled against a real CVS checkout again.

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-71.patch36.84 KBIdleFailed: Invalid PHP syntax in modules/user/user.module.View details | Re-test

#74

Status:needs review» needs work

The last submitted patch failed testing.

#75

Status:needs work» needs review

oops

AttachmentSizeStatusTest resultOperations
pluggable-pass-259103-73.patch36.85 KBIdleFailed: Failed to apply patch.View details | Re-test

#76

Status:needs review» needs work

The last submitted patch failed testing.

#77

Version:7.x-dev» 8.x-dev
Category:bug report» feature request

#78

Adding tag.

nobody click here