Section 4.2 of RFC 3986 describes scheme-less URLs (references that begin with '//') as relative references that take their scheme from the URL of the current document. All major browsers support this URL format.

Pathologic currently interprets scheme-less URLs as local paths because PHP's parse_url() function is unable to parse them.
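To see how a given PHP installation treats such a URL, a quick check along the following lines can help. This is only an illustrative sketch: example.com is a placeholder host, and no output is shown because the result depends on the PHP version (the PHP changelog for parse_url() notes that host recognition for URLs with an omitted scheme was fixed in 5.4.7).

```php
<?php
// Inspect what parse_url() makes of a scheme-less URL on this PHP version.
// Older versions report the whole string as 'path'; newer ones are expected
// to split out 'host' and 'path'.
$parts = parse_url('//example.com/files/logo.png');
var_dump($parts);

// A missing 'host' key means a caller would see this URL as a local path.
$looks_local = !isset($parts['host']);
var_dump($looks_local);
```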

I think scheme-less URLs could be supported in Pathologic by prefixing the scheme onto URLs that start with '//' before passing them to parse_url().
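A minimal sketch of that idea, assuming the scheme of the current request is already known. The function name example_parse_schemeless_url() and its $current_scheme parameter are hypothetical and are not taken from the attached patch or from Pathologic itself.

```php
<?php
// Hypothetical helper, not Pathologic's actual code: prefix a scheme onto
// scheme-less ("//host/path") URLs so that parse_url() sees a full URL.
function example_parse_schemeless_url($url, $current_scheme = 'http') {
  // A scheme-less (network-path) reference starts with two slashes.
  if (strpos($url, '//') === 0) {
    $url = $current_scheme . ':' . $url;
  }
  return parse_url($url);
}

$parts = example_parse_schemeless_url('//example.com/files/logo.png', 'https');
// $parts['scheme'] is 'https', $parts['host'] is 'example.com',
// and $parts['path'] is '/files/logo.png'.
```

URLs that already carry a scheme, and ordinary relative paths, reach parse_url() unchanged.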

Attached file (comment #1): schemeless-1617944-1.patch, 640 bytes, by JamesK

Comments

JamesK’s picture

Status: Active » Needs review
Garrett Albright’s picture

Huh, just had another new issue about the same sort of thing. Was this discussed in an article on a popular site recently or something?

At any rate, I agree Pathologic needs to be adapted to support these sorts of URLs. But I'm wondering about your rationale for prefixing http:// or https:// instead of just stopping processing and passing the path back through with its // prefix intact. The output of Pathologic will be cached, so if the node is first viewed over an HTTPS connection, all the paths will be prefixed with https:// even for later visitors who view it via HTTP, and vice versa. That seems to defeat the purpose of these sorts of URLs in the first place. Am I missing something?

gmclelland’s picture

I may not understand your question correctly, but my understanding is that with protocol-relative URLs caching wouldn't matter: as long as the URL was cached as //domain.com/something, the browser would determine whether to use http or https.

JamesK’s picture

"Was this discussed in an article on a popular site recently or something?"

I'm not sure. I just started using Pathologic and realized it was breaking my URLs, so I fixed it.

I would like to process the scheme-less URLs through Pathologic so that they will be checked against the list of base paths and recognized as local where appropriate.

Caching shouldn't be a problem because, as far as I can tell, Drupal maintains a separate cache entry for content requested over HTTPS. In my usage, I haven't come across any "document contains insecure content" errors so far.

JamesK’s picture

Re #3,

"if we used Protocol-relative URLs it wouldn't matter if it was cached as long as it was cached as //domain.com/something"

Pathologic outputs the full URL including scheme, but as I mention in #4, it shouldn't matter.

Garrett Albright’s picture

All right, I added some code, along with a test, to support this. However, I am doing it by passing the entire URL through without adding a scheme to it (or changing it in any other way).
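The pass-through behaviour described here could look roughly like the sketch below. It is not the committed Pathologic code: both function names are hypothetical, and the http://example.com/ base used for ordinary paths is a stand-in for whatever rewriting the filter would normally do.

```php
<?php
// Hypothetical check: does the URL start with "//" (scheme-less)?
function example_is_schemeless_url($url) {
  return strpos($url, '//') === 0;
}

// Hypothetical stand-in for the filter's per-URL processing.
function example_process_url($url) {
  if (example_is_schemeless_url($url)) {
    // Stop processing: return the URL exactly as written, so the browser
    // picks http or https itself, whatever ends up in the cache.
    return $url;
  }
  // Placeholder for normal processing of local paths (assumed base URL).
  return 'http://example.com/' . ltrim($url, '/');
}

$untouched = example_process_url('//cdn.example.com/lib.js');
$rewritten = example_process_url('node/1');
// $untouched is '//cdn.example.com/lib.js'
// $rewritten is 'http://example.com/node/1'
```

Because the cached output still begins with //, it stays correct for visitors on either protocol, which addresses the caching concern raised earlier in the thread.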

Garrett Albright’s picture

Status: Needs review » Fixed

Derp, forgot to change status.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.