| Project: | Drupal core |
| Version: | 7.x-dev |
| Component: | openid.module |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | needs review |
| Issue tags: | needs backport to D7 |
Issue Summary
In the core openid Drupal module (modules/openid/openid.inc), the functions _openid_link_href and _openid_meta_httpequiv perform simple parsing of HTML in order to locate openid provider information; these calls contain regular expressions which depend on the order of the attributes for link and meta tags, respectively.
This breaks, for one, compatibility with Community-ID (http://source.keyboard-monkeys.org/projects/show/communityid), which outputs it's link tags with the "rel" attribute _following_ the "href" attribute - which is valid HTML, but gets missed by the existing parser code.
Example:
Works (Drupal finds the provider information):
<link rel="openid2.provider" href="https://open.login.yahooapis.com/openid/op/auth" />Doesn't work (Drupal does not find the provider information and returns an error):
<link href="https://open.login.yahooapis.com/openid/op/auth" rel="openid2.provider" />To reproduce, either install Community-ID and try to use an OpenID account with it to sign into Drupal; Or copy the contents of any OpenID URL page; find the link tag which has rel="openid2.provider" and reverse the order of the "rel" and "href" attributes as above. Doing this produces a "Sorry, that is not a valid OpenID..." message in response to the login attempt.
The fix is fairly simple: I rewrote these two functions to be more robust by not depending on the sequence of the attributes and being more lenient about white space in a couple places.
These patches take into account this issue - http://drupal.org/node/206476 - and afaict will not regress that. I also tested these changes against my own Community-ID RC3 install, Flickr.com and myid.net which all worked without issue (previously only Community-ID was broken - just showing that I did do some checking to try to ensure I didn't break anything else).
Applies to Drupal 6.12 and Drupal 7.x-dev as of 2009-05-21 (applicable section of code is the same, except for minor source formatting differences - which is why two different patches).
Please include in main drupal/openid 7.x tree - or let me know if anything about the above is not correct or confusing, etc.
| Attachment | Size | Status | Test result | Operations |
|---|---|---|---|---|
| openid-link-meta-parsing-6.x.patch | 2.28 KB | Idle | FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch openid-link-meta-parsing-6.x.patch. | View details | Re-test |
| openid-link-meta-parsing-7.x.patch | 2.29 KB | Idle | Failed: 11853 passes, 0 fails, 53 exceptions | View details | Re-test |
Comments
#1
#2
The last submitted patch failed testing.
#3
Would it be better to parse the page using DOMDocument::loadHTML() and then find the elements by traversing the DOM tree? Parsing HTML using regular expressions is generally not as easy as it may seem.
#4
is the 6.x version failing because the patch is marked as applying to 7? Just tried it on the current DRUPAL-6 & it applies properly.
#5
Just tried applying vs. HEAD, and it goes smoothly. All tests are passing for me, with 0 exceptions. Perhaps something elsewhere in the codebase was causing the exceptions when this was tested in June?
#6
The last submitted patch failed testing.
#7
openid-link-meta-parsing-6.x.patch queued for re-testing.
#8
This patch uses DOMDocument::loadHTML() for parsing the HTML. Parsing HTML using regular expressions is generally a bad idea.
#9
Reroll.
#10
The problem with the first drupal 6.x patch is that the source file name is incorrect. Applying the hereby attached patch should work correctly and fix problems with the community id openid link provided.
#11
The last submitted patch, openid-link-meta-parsing-6.x.patch, failed testing.
#12
The patch I just uploaded shouldn't be tested against the latest revision from CVS. So I'm not surprised it doesn't pass. It's a drupal 6 patch, HEAD is drupal 7. I created a new issue for the 6.x branch: http://drupal.org/node/865098
#13
For now the D7 patch in #9 is the one up for review.
#14
#9: openid-parse-html-2.patch queued for re-testing.
#15
No progress after 2010? D7 and D6 without this patch still fail on my CommunityID provider
#16
Rolled new patch with git. Applies to D7 and D8.
#17
#18
+++ b/core/modules/openid/openid.inc@@ -371,24 +371,32 @@ function _openid_nonce() {
+ foreach ($html_element->head->link as $link) {
...
+ foreach ($html_element->head->meta as $meta) {
If the DOM unexpectedly does not contain a HEAD or LINK tags, this will trigger notices. Let's add a !empty() condition here before attempting to iterate.
+++ b/core/modules/openid/openid.inc@@ -371,24 +371,32 @@ function _openid_nonce() {
+ if (preg_match('/(\s|^)' . $rel . '(\s|$)/i', $link['rel'])) {
+ return trim($link['href']);
Missing preg_quote() to $rel. Note that the second argument to it should always be specified ('/' in this case).
For the actual regex, I believe you rather want:
@^(?:\s*)?([rel])(?:\s*)?$@/i
Lastly, I'd rather return the more explicit $matches[1].
9 days to next Drupal core point release.
#19
Good point. I added an isset() (it seems that empty() checks for text content inside the element).
No (if I understand you correctly). The rel attribute may contain a space-separated list of link types (see the HTML5 spec), so if rel is e.g. "foo", it should match <link rel="foo bar">. That is not supported by your pattern, right? But I did add the
?:.I'm not sure what you mean? The regexp operates on the rel attribute, but we return the href attribute. The only $matches[1] that appear in the patch is the old code that is removed.
#20
Thanks for clarifying and adjusting!
Looks ready to fly for me.
#21
Committed/pushed to 8.x.
Moving to CNR against Drupal 7 since there's already a patch. Reports above suggest this is a bug rather than a task.