This is part of #1837054: [META] Refactor account workflow (and config)

Problem/Motivation

The registry patch prevents the use of alternate password hashing schemes, since password.inc is always parsed, you cannot include a file that defines functions with the same names.

Proposed resolution

A couple possible ways around this exist. One is to use singleton objects (per Crell). A more direct alternative is to require that any alternative framework have a way to define alternative, uniquely named, functions that are called.

Remaining tasks

Issue summary needs updating
remaining tasks needs to be itemized
patch in comment #75 need to be re-rolled for D8
(reviews needed, tests to be written or run, documentation to be written, etc.)

User interface changes

(new or changed features/functionality in the user interface, modules added or removed, changes to URL paths, changes to user interface text)

API changes

(API changes/additions that would affect module, install profile, and theme developers, including examples of before/after code if appropriate)

Original report by [username]

The registry patch prevents the use of alternate password hashing schemes, since password.inc is always parsed, you cannot include a file that defines functions with the same names.

A couple possible ways around this exist. One is to use singleton objects (per Crell). A more direct alternative is to require that any alternative framework have a way to define alternative, uniquely named, functions that are called.

Here's a patch that tries to fix this - feedback on the variable name and default base name?

Obviously, you'll have to make sure your function registry is rebuilt...

Also, I had trouble logging in from the /node page - put not from /user

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

pwolanin’s picture

Title: allow alternate ashing frameworks » fix alternate password hashing frameworks
Dries’s picture

For the image handling, we also have the ability to pick different image libraries. I think we should use the same mechanism, or at least come up with a consistent pattern. Anything else will result in confusing user experience. Maybe have a look at it, and see if we can clean those things up a bit? That is not to say your current approach is bad. It might be the case that everything is nice and dandy.

pwolanin’s picture

@Dries - [edit] I don't think you meant that post for here... right, good point

pwolanin’s picture

Ok, previous patch was more complicated than needed (and didn't work). Attached much simpler patch should work.

Owen Barton’s picture

@pwolanin: Actually, I think Dries did intend to post this here - the method used to switch password hashing is used for several other things in Drupal - I haven't checked the HEAD code recently, but image frameworks and the caching and session subsystems use(d?) similar (or identical) systems.

In other words, chances are some other things may be broken here too (and no-one noticed), or if not perhaps someone has fixed them in a different way. Either way, the point is that we should use be consistent across these if possible.

Owen Barton’s picture

So it looks like cache and session both use the same style as password - from bootstrap.inc

      require_once variable_get('cache_inc', './includes/cache.inc');
...
      require_once variable_get('session_inc', './includes/session.inc');

Image uses:

function image_toolkit_invoke($method, $params = array()) {
  if ($toolkit = image_get_toolkit()) {
    $function = 'image_' . $toolkit . '_' . $method;
    if (function_exists($function)) {
      return call_user_func_array($function, $params);
    }
    else {
      watchdog('php', 'The selected image handling toolkit %toolkit can not correctly process %function.', array('%toolkit' => $toolkit, '%function' => $function), WATCHDOG_ERROR);
      return FALSE;
    }
  }
}

most of which is 9+ months old. This looks more similar to what your patch is doing, so (assuming image stuff is working with the registry) it seems reasonable to sync these up. I think it may be worth looking at cache and session at the same time though, since there may be a helper function to help manage the optional incs and their selection that could be shared by these 3 or 4 instances.

Crell’s picture

If we're revisiting how we handle pluggable subystems, may I suggest a class-based approach as discussed here: http://www.garfieldtech.com/blog/pluggable-systems-howto . That article was written specifically for the password hashing system. :-)

pwolanin’s picture

@Crell - compared to the procedural-style system you suggest, I think my solution is cleaner. It has the caveat that the replacement code has to be part of a module so that the relevant file will be declared in a .info file to the registry processing mechanism. On the other hand, this means the file can be located anywhere, as opposed to having to be added to the /includes as in your example.

Note that we thus avoid having to rewrite all the password code as object methods, when there is no real object-ness to what we are doing. Obviously the case is quote different with the database layer when we may be holding onto connection information, prepared statements, etc.

@Grugnog - if I understand it right, the cache and session includes happen before the registry ever gets involved and since we are always using that code, it's fine to always include those files or their replacements. Password, image, etc, are different since it is likely that on many page loads we won't need that code at all.

pwolanin’s picture

Actually- I now realized my method is even better - for this reason: the registry code automatically scans /includes. Hence we can drop an alternate framework into there as a .inc file and it should just workTM.

pwolanin’s picture

Title: fix alternate password hashing frameworks » fix and standardize pluggable frameworks
Status: Needs review » Needs work

from moshe:

fyi there are 2 more pluggable systems in core: smtp and sessions. search for

variable_get('smtp_library', '')
variable_get('session_inc', './includes/session.inc');

So, it seems like we really ought to standardize this as best as possible. What I proposed here would not be as nice for cache, etc, since we those functions are called from all over. How do we feel about the performance of 1 level of indirection - essentially the 1st method suggested here: http://www.garfieldtech.com/blog/pluggable-systems-howto but using module_invoke() and/or drupal_function_exists() to take advantage of the registry's ability to include files as needed?

Crell’s picture

Of the three mechanisms I describe in that article, the first one is the slowest and least feature-ful. The only "advantage" is offers is that it doesn't have any classes in it, which is not an advantage because there's nothing wrong with classes. If we're going to standardize on something, let's standardize on one that is faster and offers more robust features.

pwolanin’s picture

@Crell - ok- I'll play around with the factory object or function technique. I'm not sure why it would be faster to include a file with a class than a file with functions?

Crell’s picture

The purely-procedural method has to do a check or an include_once() on each function call. The class-based mechanism can use just an isset() to see if it needs to be initialized.

Off the cuff, I would recommend using a factory function, as those are easier to read and more consistent with what the database layer is doing. Something like:

function cache($segment, $driver = 'default') {
  static $caches;

  if (empty($caches[$driver][$segment])) {
    if ($driver == $default) {
      $driver = variable_get(...);
    }
    // Whatever initialization is needed.
  }

  return $caches[$driver][$segment];
}

cache('page')->set($path, $page);
cache('page')->empty();
cache('filter')->get(...);


For lack of a better term, I'm using $segment to refer to the cache table. Referring to a table name directly is rather inflexible. Instead, a given cache implementation should be able to decide that pages should be cached to a dedicated SQLite database, nodes should be cached to a memcache server, and variables should be cached to the standard database. The "default" implementation just puts everything into the local database.

The password or session handling would have a different signature for the factory function, but would still operate with a factory function. So for instance:


function password($driver = 'default') {
  static $drivers;

  if (empty($drivers[$driver])) {
    // Initialization.
  }

  return $drivers[$driver];
}

password()->userCheckPassword($password, $account);
password()->userNeedsNewHash($account);
// etc.

Not sure if that makes any sense; I'm posting this while trying to resuscitate a database server that just died on me so I'm a bit distracted. :-)

chx’s picture

I would much prefer fixing the registry to work with variable_get('session_inc', './includes/session.inc'); and similar pluggable systems. It scans includes/ currently which is sorta broken even with database drivers check your registry table, it only contains database.pgsql.inc not .mysql.inc.

Larry, "there's nothing wrong with classes" -- I thought you were standing about a foot from me in Barcelona where I was ranting high and low for what, ten or twenty minutes about how OOP is hard and unnatural for anyone but CS majors? The only reason I let it happen and even helped with PDO is that PDO itself is OOP but that does not mean that every pluggable subsystem should use this now. Quite the opposite: OOP is the last measure we should consider.

pwolanin’s picture

@Crell - I'm a little unclear as to why (in terms PHP inner workings) require_once would be really slower - doesn't PHP have a quick internal lookup as to which files are already included? One can get the list from get_included_files(). In a procedural-style construction of this, one could also have a single, central, function to decide based on isset() whether require_once() is needed and/or return the function name.

Crell’s picture

@chx: I was about 3 feet from you, actually, and I still disagree with you. "Aw, OOP is so haaaaard" is a strawman argument, and one that is also untrue. You can write totally incomprehensible OOP and totally incomprehensible procedural code, or clear and obvious code in either one. A fair portion of Drupal's procedural code right now is incomprehensible, too, to someone who hasn't had their head buried in it for years. Knee-jerk avoidance of classes and objects is just as damaging to the code base as knee-jerk usage of them is. Right tool for the job.

Besides, what's wrong with CS majors? There's a lot of them out there, and PHP in general could use them. :-) Besides, using standard, industry-accepted patterns and practices lowers the bar to entry for Drupal development because people can recognize how things work more easily. "Oh, they're doing X, I get it now."

@pwolanin: require_once() actually does have a performance hit, especially on opcode caches. The internal lookup is not "quick", for internal engine reasons that I don't claim to fully understand but have seen Rasums et al repeat often. Yes, one could write a user-space wrapper around it. However, you're then still doing an indirect function call rather than a literal method, which is a very slight performance difference.

Much of it does boil down to a stylistic question, I grant. I find an architecture that wraps shared, swappable functionality into tidy objects that can be mixed and matched to be much easier to understand and wrap my head around than 500 verbosely named functions. It is more self-documenting, and more approachable for people who aren't spoon-coders. It also makes it easier to do things like specifying an engine when needed, and is easy to wrap factory functions around to keep the syntax simple. Avoiding it just because "OMG it has classes" is naive and short-sighted.

pwolanin’s picture

@Crell - after having a long discussion about this, I think my short term answer is that we should make as many as possible look basically like the image toolkit handing - where we have a dispatch function that is the central point where we decide which actual funciton to invoke. While the OOP version may have some advantages, it also means re-writing lots of code. So, my intention at this point is to get this working with the minimal changes to code. Obviously for caching, this may not be the final solution for D7 (especially if we need to have multiple caching handlers per page load).

Crell’s picture

The long promised RFC, for those who are interested: http://www.garfieldtech.com/blog/drupal-handler-rfc

pwolanin’s picture

Status: Needs work » Needs review
FileSize
28.5 KB

Here is in initial patch for passwords that basically follows the outline at http://www.garfieldtech.com/blog/pluggable-systems-howto with the addition of a declared interface to make it a well-defined API.

I put the interface and factory function in module.inc since they need to be always available, and are essentially declaring interfaces that modules may implement.

If this is a reasonable approach (at least to make these frameworks work with the registry) I can continue with others.

pwolanin’s picture

looking at the other things (image, sessions, smtp) maybe putting the password interface just in user module makes more sensefor password?

pwolanin’s picture

this covers both passwords and the smtp library.

All tests pass (PHP 5.2.5, MySQL 5.0) with this patch (7109 passes, 0 fails, 0 exceptions)

moshe weitzman’s picture

This is looking very nice. Subscribe.

Crell’s picture

Status: Needs review » Needs work

Methods should use camelCase(), not lower_case(). The factory functions should also be more compactly named. "drupal_mail_wrapper()" is needlessly verbose. drupal_mail() is sufficient.

This is a step in the right direction, but relying on the variable system and autoloader means we can't implement it early in the page request process. That means it won't extend to cache (the system that breaks the registry now) and session unless we require the use of the database, even for aggressive caching. (eaton's been haranguing me about that problem. :-) ) It also doesn't support multiple routing, so we still can't, for instance, route page caching and "other" caching to different caching mechanisms.

After a long nighttime session this past weekend with eaton and CitizenKane, I'm working on a 2.x branch of handler.module that should address both of those issues. It's all happening in CVS if you want to follow along. I could definitely use some feedback.

pwolanin’s picture

@Crell - drupal_mail_wrapper is the existing name of the D5/D6 function. user_mail() is already taken... Maybe drupal_smtp()?

For SMTP this supports multiple routing. For passwords, I don't know why multiple routing would make sense.

moshe weitzman’s picture

If I read the patch correctly, the multiple routing for SMTP assumes that the code sending the email is the decider of which driver to use. But thats not what we want in SMTP, nor in cache nor db layers. All those systems have a 'segment' which should route the call to the right driver. for SMPT, the segment is the $mail_key. for cache, it is the 'table'. for db layer, it is the $databases array.
All thats needed I think is to change the smtp_system variable into an array that is keyed by segment. Then the code to lookup the right driver becomes:


$config = variable_get('smtp_system', array('default' => 'DrupalMailSend'));
$driver = isset($config[$segment]) ? $config[$segment] : $config['default'];

pwolanin’s picture

Status: Needs work » Needs review
FileSize
34.86 KB

Here's is a revised patch which fixes the method and variable names. One thing I noticed is that we should do a linear scale to log scale conversion for hash strength to make this more user-friendly.

pwolanin’s picture

talking to Moshe, here's a more flexible smtp lookup code.

Crell’s picture

@Moshe: The routing segment for the database would be the target, not the table. For cache, it would be "page", "menu", "form", etc.

I don't actually comprehend what pwloanin says in #26... :-)

pwolanin’s picture

@Crell - the phpass code stores the # of iterations in log2 form - i.e. if the value is 10, you perform 2^10 = 1024 iterations. Since alternate hashing schemes might have some arbitrarily different internal representation of strength, it makes more sense to represent to the developer a linear scale (which I think we all understand) which can be internally converted as needed. In this case converted by taking log($strength, 2). So setting strength to 0.1 will make the iterative part of the code about 10x faster, while setting it to 10 will make it about 10x slower.

The earlier version of the patch, setting strength to 0.5 would change the # of iterations from 2^14 to 2^7 which is a 128-fold difference in speed for the iterative part.

pwolanin’s picture

all tests pass with this patch: 7109 passes, 0 fails, 0 exceptions

Anonymous’s picture

subscribe.

robertDouglass’s picture

sub.

pwolanin’s picture

re-ran all tests: 7184 passes, 0 fails, and 0 exceptions

Note sure why the testing server sees fails.

mfer’s picture

subscribing... I hope to review it in the next week.

catch’s picture

I took a quick look through this and the approach looks sane to me, although I'm not the best person to review anything OOP by any means. Obviously I'm in favour of any patch with the word 'standardize' in the title. Will try to take a more in depth look at this soon.

RobLoach’s picture

Status: Needs review » Needs work

I love the approach taken here, but it seems like something that should be done in a number of different patches instead of all pluggable systems at once. I particularly like the replacement of the drupal_mail system. I absolutely love the push for a standard way of implementing these pluggable systems, and the approach that's being taken here. I'll take a closer look at this in the near future.

pwolanin’s picture

@Rob Loach - well this patch only covers 2 - so we could split the mail one off and do one per issue if you think that's really the right way to go.

Crell is interested in getting a unifying API in, so whatever we do here needs to be enough to get these working correctly, but may be intermediate on the way to another solution in the long term.

RobLoach’s picture

How would a unifying API for all external APIs work? Sounds like it would just be an empty interface, or something.

mfer’s picture

pwolanin’s picture

Title: fix and standardize pluggable frameworks » fix pluggable password hashing framework
Status: Needs work » Needs review
FileSize
28.73 KB

ok, so per comments above and conversation w/ catch, I'll split this into 2 separate issues and add another issue for any other framework so as to have minimal and independent patches. Attached is a patch for just the password part.

Note that while this looks like a somewhat big patch - it is almost entirely just a refactoring of code from a procedural to a class/method form. No new functionality is added.

The smtp library part is continued at: http://drupal.org/node/331180

catch’s picture

Status: Needs review » Reviewed & tested by the community

Created a new account and logged in. Ran user and system tests. No glaring errors in the patch, and it's 95% moving code around, so RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Needs work
+ *  The return value from drupal_smtp()->send(), if it is called.

This belongs in the other patch. :)

+/**
+ * User module interfaces and their corresponding factory functions.
+ */

This is kind of floating in outer space, explaining what is below and not actually documenting any particular variable/function/etc. So let's just make it regular comments and not PHPDoc (remove the second * in /**)

+interface PasswordInterface {

Could we add some PHPDoc above this that basically summarizs what it's for, and why one might implement it in their class? It doesn't have to be overly detailed, but just remember that most Drupal people do not understand this stuff so throwing them a bone is nice. :)

user_password()->hash($password)

This is a little tweaky. Perhaps it's made worse by the fact that this function is used to generate passwords in D6 and below (I really like the move of that functionality to a function called user_*generate*_password() btw). So could we maybe add a verb to this function too to make it more clear what it's doing? (user_get_password()?)

Going to play around and see if I can break it with openID, etc. ;)

pwolanin’s picture

trying to think of some other factory function name - I think this is about the most concise while still being meaningful, so I'm not really convinced it need to change. Why is this worse than having db_select()?

a possible alternative could be like:

user_authenicator()->hashPassword($pass);
user_authenicator()->checkPassword($account, $pass);
pwolanin’s picture

re-rolled using this naming scheme.

pwolanin’s picture

Status: Needs work » Needs review
Anonymous’s picture

this looks really nice.

all tests pass.

one tiny niggle is "$P$".

+  private function generateSalt() {
+    $output = '$P$';

forgive my ignorance about salting best practice, but is "$P$" significant in any way, or is just a nice recognisable way to start the string? could we use a class constant with a descriptive name or a code comment for this?

pwolanin’s picture

The $P$ is necessary to be compatible with the framework - please read the entire prior ~300 comment thread before suggesting any changes to the underlying algorithms...

Anonymous’s picture

sorry if i wasn't clear, but i'm not at all suggesting any changes to underlying algorithms.

i was simply suggesting that '$P$' looks a little magic, so a comment or the use of a descriptively-named class constant might be useful.

pwolanin’s picture

the '$P$' is the hash identifier. Sure, we could add a comment to explain why that's needed.

Here's the original issue: http://drupal.org/node/29706

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
27.62 KB

hmm, apparently the testbed had a problem. Anyhow, re-rolled w/ and extra code comment and re-tested.
7299 passes, 0 fails, and 0 exceptions

catch’s picture

Status: Needs review » Reviewed & tested by the community

This all looks great, and it's been reviewed by people who'd spot any OO issues (which I wouldn't). Ran user tests and got 100% pass, plus it's primarily moving code around, so RTBC.

Damien Tournoud’s picture

Status: Reviewed & tested by the community » Needs work

Why is AuthenticatorInterface in user.module, while its implementation, strangely named UserPassword is in password.inc? We need to think more about this. I recommend:

- to have password.inc containing a PasswordHasherInterface (the current AuthenticatorInterface) and the password_hasher() factory function (the current user_authenticator())
- to have a password.default.inc containing the current UserPassword, renamed DrupalPasswordHasher

pwolanin’s picture

@Damien - I didn't really think about re-architecting this files, but something like what you suggest might make sense. Also, I had at some point assumed that user module might define several interfaces. Do we really need another file for this? At the least the factory function shoul dlive in use module so we don't have to do a drupal_function_exists().

pwolanin’s picture

after discussion w/ Damien in IRc, we concluded that the factory funciton should stay in user.module fo rthe reason I suggest.

Crell’s picture

Remember that functions don't magically lazy-load without the drupal_function_exists(). Classes and interfaces can live pretty much anywhere and will just magically work.

Just random reminder. :-)

pwolanin’s picture

@Crell - yes, exactly why I thought to keep the factory function in the .module file.

pwolanin’s picture

@Damien - I'm not sure I agree with your suggested naming changes for the interface, etc. With a few more minor code tweaks, we could use this same interface to handle local/remote/or other arbitrary authentication schemes.

I have these tweaks in mind for a follow-up issue - I have been trying to keep this issue as narrow as possible.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
30.54 KB

This atch moves code around as suggested so password.inc has the interface, and user.password.inc has the default implementation, but does not alter anything else wrt the prior RTBC patch.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Patch still looks fine and all the tests pass. I think we can leave further tweaking to another issue - since this is really just about stopping things getting broken by the registry. Marking back to RTBC.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

chx’s picture

Note that I used module_invoke(variable_get('field_storage_module', 'field_sql_storage'), $op, $arg1, $arg2); in field module. That's like a Drupal-ish solution...

pwolanin’s picture

Status: Needs work » Needs review
FileSize
28.98 KB

@chx - that would be ok for mail, it would be much more of a pain here where we have several methods in the interface. And I think we should try to tackle these using a consistent pattern.

Here's a simple re-roll, but I'm going to make a few more changes beyond this simple change.

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
30.15 KB

reroll just for patch hunks that didn't apply

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Title: fix pluggable password hashing framework » make pluggable password hashing framework more generic and use class auto-loading.
Status: Needs work » Needs review
FileSize
35.57 KB

Here's a re-worked patch that makes the class more self contained and makes it conform to an interface. The important change here is that we make no assumption that the "real" user password is stored in the local database - so for example an implementing class could actually talk to a remote (e.g. LDAP or other) database to authenticate the user.

note most of the changes to user.test are renaming $user (don't do it!) to $account

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
36.9 KB

re-roll from CVS (somehow patch doesn't like the removed file notation int he git diff). Also removed interface checking per chx.

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
35.51 KB

apparently patch thinks 2 additions at the end of a file conflict.

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
36.84 KB

argh - ok, rolled against a real CVS checkout again.

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
36.85 KB

oops

Status: Needs review » Needs work

The last submitted patch failed testing.

pwolanin’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » feature
Dave Reid’s picture

Issue tags: +password.inc

Adding tag.

YesCT’s picture

This is part of a meta issue to rework account workflow #1837054: [META] Refactor account workflow (and config)

I started to updated the issue summary with the new issue summary template but the summary still needs work to reflect the current state of this issue, especially detailing the remaining tasks.

I added tags.

YesCT’s picture

Issue summary: View changes

Updated issue summary with issue summary template, but all the comments need to be read and more info added still.

Crell’s picture

Isn't this deprecated in favor of #1463624: Move password.inc into DIC

Crell’s picture

Issue summary: View changes

Updated issue summary added link to meta issue.

Crell’s picture

Issue summary: View changes
Status: Needs work » Fixed

This was done as part of the linked issue.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.