Problem/Motivation
Drupal provides a way to convert a relative url to an absolute url, but no way to convert an absolute url to a relative url. Having the (limited) ability to convert an absolute url to a path on the current site is necessary in order to parse HTML messages for inline image attachments. (See #46).
bfroehle said in #15, .a function like
url_to_path() would be useful to the Central Authentication Service module.
It may also be helpful in a future 7.x/8.x port of filter_html_secure_image_process(), which duplicates much of this functionality.
Proposed resolution
it turns out that most of the necessary logic is contained within the conf_path() function, but that generally-useful part needs to be segregated from the rest of the function in order to be re-usable.
The patch under review modifies includes/bootstrap.inc by moving the re-usable code from conf_path() to a new find_conf_path() function.
The remainder of the needed functionality may be provided by a contrib module.
Remaining tasks
The API change node needs to be reviewed.
User interface changes
None.
API changes
No changes to existing code is needed. A new function is made available for developers who need it.
Original report by pillarsdotnet:
Drupal provides a way to convert a relative url to an absolute url, but no way to convert an absolute url to a relative url. This would be helpful, for example, when parsing an HTML mail body for image urls that could be converted to inline attachments.
| Comment | File | Size | Author |
|---|---|---|---|
| #62 | fix-conf_path-1113588-followup-62.patch | 910 bytes | pillarsdotnet |
| #53 | find_conf_path-1113588-53.patch | 2.17 KB | pillarsdotnet |
| #47 | url_to_path-1113588-47.patch | 5.49 KB | pillarsdotnet |
| #38 | url_to_path-1113588-38.patch | 5.2 KB | pillarsdotnet |
| #33 | url_to_path-1113588-33-d8.patch | 5.05 KB | pillarsdotnet |
Comments
Comment #1
pillarsdotnet commented(EDITED to reflect the status of the latest patch submission)
Re-factor the
conf_path()function to createfind_conf_path(),url_to_path(), andurl_to_realpath()functions.Comment #2
pillarsdotnet commentedComment #3
damien tournoud commentedTriaging.
Comment #4
pillarsdotnet commented@Damien -- Of course I know this is 8.x but the patch applies cleanly to both and I wanted a testbot check.
Revised patch fixes a couple of whitespace errors.
Comment #5
pillarsdotnet commentedCan we please leave at 7.x until the testbot returns something?
Comment #7
pillarsdotnet commentedAnother typo, sigh.
Comment #8
pillarsdotnet commentedCool. Testbot passed. Once we have a working 8.x testbot, I'll re-roll and re-submit, but for now the code is identical.
Comment #11
joachim commentedSeems a good idea in principle.
Just a few niggles...
Ew.
There's now conf_path and _conf_path? What's the difference?
Only one space after a full stop.
Powered by Dreditor.
Comment #12
pillarsdotnet commentedSuggestions taken.
P.S. -- Joachim: Please help me understand your "ew" response. Standards? Coding style? Documentation? -- Thanks.
P.P.S. --The function should probably be named
url_to_path()since it takes a (possibly absolute) URL and returns a local path, or FALSE if no equivalent local path could exist.Comment #13
pillarsdotnet commentedGo, testbot!
Comment #14
joachim commentedew was to the assigning and returning in the single line. It's quite hard to understand what's going on.
I think this is clearer:
$foo = complex($expression);
return $foo;
although unless $foo was a reference, just this will do:
return complex($expression);
Though the first version makes debugging easier! ;)
Comment #15
bfroehle commentedA function likeurl_to_path()would be useful to the Central Authentication Service module.Currently we have acaspath which redirects to the single sign-on server and then back to Drupal. We add a destination parameter to the return address so we can use drupal_goto() to redirect to a useful page after the login. Usingurl_to_path(referer_uri())would be great for getting the destination parameter based upon the HTTP Referrer.We found that refactoring the CAS code and passing a destination=... query string provided much better results in general. I think including this function in core would only encourage developers to use it.
Comment #15.0
pillarsdotnet commentedUse issue summary template.
Comment #16
bfroehle commentedShouldn't the default be array(), given the later use of $local?
drupal_parse_url() does not function the same as parse_url(). In particular, for external URLs it does not return the host, port, scheme, etc separately.
Powered by Dreditor.
Comment #17
bfroehle commentedAlso, what if Drupal is installed in a subdirectory. It's not clear to me that this would return the proper local path.
Comment #18
pillarsdotnet commentedSuggestions implemented and name changed.
Comment #19
pillarsdotnet commentedFound and fixed a bug, and wrapped the return value in drupal_realpath() for good measure.
Comment #20
bfroehle commentedWhat's the logic for this return? In particular, why is there a leading slash?
Based upon the name, I would have assumed that
$path == url_to_path(url($path))for most $paths.Powered by Dreditor.
Comment #21
pillarsdotnet commentedBetter doxygen comments.
The returned value should be directly usable by
fopen(),file_exists(), and the like.Perhaps this should use
realpath()instead ofdrupal_realpath()?Perhaps a better name would be
url_to_local_path()?Comment #22
pillarsdotnet commentedDid some testing and found I was using
parse_url()incorrectly.Replaced
drupal_realpath()withrealpath()because I can't imagine a need for stream wrappers with this function.@#20 -- Almost. Actually, for most paths,
url_to_path(url($path)) == realpath('./' . $path)So maybe this function should be called
url_to_realpath()?Comment #23
pillarsdotnet commentedDo we have a working D8 testbot now?
Comment #24
pillarsdotnet commentedAlternatively, why not supply both url_to_path() and url_to_realpath() functions?
Comment #25
damien tournoud commentedThis needs to take into account the base root directory, Drupal is not always install as the site root.
Also:
This looks like
DRUPAL_ROOT . '/' . $script_nameto me.Comment #26
pillarsdotnet commented@Damien -- I would expect that
realpath($path) == DRUPAL_ROOT . '/' . $pathfor most relative paths, but callingrealpath()has the advantage of resolving symlinks to a canonical path.Attached patch changes
substr($path, 1)toltrim($path, '/')because $path may or may not start with a leading slash.Here's the goal. If Drupal is installed at http://example.com/drupal and
conf_path()is'drupal/sites/example.com'then:url_to_path('http://external.com/drupal/foo')returnsFALSEurl_to_path('http://www.example.com/drupal/foo')returns'drupal/foo'url_to_path('http://example.com/foo')returns'foo'url_to_path('drupal/foo')returns'drupal/foo'url_to_path('/foo')returns'foo'#3 and #5 should perhaps return FALSE, but that would require logic not provided by the existing
conf_path().Comment #27
pillarsdotnet commentedComment #28
pillarsdotnet commented@Damien -- You're right, it only adds one line of code to take
base_path()into account.Revised patch so that if
conf_path()is'drupal/sites/example.com'as before,url_to_path('http://external.com/drupal/foo')returnsFALSEurl_to_path('http://www.example.com/drupal/foo')returns'drupal/foo'url_to_path('http://example.com/foo')returnsFALSEurl_to_path('drupal/foo')returns'drupal/foo'url_to_path('/foo')returnsFALSEurl_to_path('/drupal/foo')returns'drupal/foo'Comment #29
pillarsdotnet commentedI really need to write some tests.
As an aside,
Why
drupal_static()rather than plainstatic? Surely the value can't change within a single page request, can it?Comment #30
pillarsdotnet commentedAccording to rfay it isn't (currently) possible to test this on Drupal testbots, so removing "needs tests" tag.
See #1117308: It should be possible to test things that depend on base_path() and conf_path()
Comment #31
pillarsdotnet commentedFound and fixed another minor logic error.
Comment #32
catchThis needs both profiling of the modified functions, and ideally a comparison of strace output from apache before and after. See #1055862: Require sites.php to opt-in for multi-site support/functionality and #1055856: Optimize settings.php discovery for background.
At a quick look it doesn't look like it would conflict with those patches, but I need to look properly and haven't got time for that this morning.
Comment #33
pillarsdotnet commentedUpdated doxygen to acknowledge that realpath( string $path ) does a stat and returns
FALSEif the$pathdoes not exist. Also moved url_to_realpath() after url_to_path(). No other changes.@catch -- No profiling is necessary. The only core change is that part of conf_path() moves to a new function called find_conf_path(). If it kills performance to add just one function call, then I need to switch programming languages.
Comment #34
pillarsdotnet commentedTagging as per http://drupal.org/node/1050616#comment-4327740
Comment #35
catchWhy is url_to_realpath() in bootstrap.inc compared to common.inc? As far as I can see it's not called anywhere in core.
Comment #36
pillarsdotnet commented@catch -- huh? What are you looking at?
EDIT: -- Okay, I can see where, looking at the issue description and (only) the top of the patch, you'd get confused.
The patch modifies both bootstrap.inc and common.inc.
In bootstrap.inc, most of conf_path() is moved to a new function, find_conf_path(). The patched conf_path() calls find_conf_path().
In common.inc, two new functions are added: url_to_path() and url_to_realpath().
When writing url_to_path(), I needed part of the conf_path() function. I didn't want to copy part of conf_path() into url_to_path(), so I used that part to make a new find_conf_path(), instead.
Better?
Comment #37
pillarsdotnet commentedSee related issue #1160374: emogrifier path issue when base_path() and DRUPAL_ROOT contain same 'sitename' folder
Comment #38
pillarsdotnet commentedIt turns out that I can't use
DRUPAL_ROOTin conjunction withbase_path().Updated patch for 8.x:
Comment #39
pillarsdotnet commentedNote that the code in this issue has been used by Emogrifier since April 5, 2011:
That's 172 users, according to the project usage statistics.
Does that qualify as "Reviewed and Tested by the Community" ?
Comment #40
pillarsdotnet commentedComment #41
pillarsdotnet commentedApparently not, (sigh). Tagging instead.
Comment #42
geerlingguy commentedSubscribe, can be useful. Don't have time to test right now though.
Comment #43
pillarsdotnet commented#38: url_to_path-1113588-38.patch queued for re-testing.
Comment #44
pillarsdotnet commented#38: url_to_path-1113588-38.patch queued for re-testing.
Comment #45
damien tournoud commentedQuick review:
- The code style needs to be fixed everywhere (docblocks are not correctly formatted, ternary operators should not go to the line, etc.)
- url_to_path() is buggy as it uses $base_path which is the base path *of the current site* (ie. it is a part of $base_url). The base path could be different for other sites
- url_to_realpath() has very limited added value and should just go away
Other then that, this looks like a useful refactoring to me.
Comment #46
pillarsdotnet commentedI'll review and try to fix (I've learned some things since writing that code), but if I fail to see the same things you're seeing, I'd appreciate some specific examples.
Well, that's exactly what it's supposed to do. The object is to answer the question, "Could this url resolve to a file on the current site?" I'm not concerned with other sites that coincidentally share the same server.
Actually, that's the only function that is necessary for my use-case.
mailmime.inc:So if:
then the file is attached.
Comment #47
pillarsdotnet commentedOkay, I've tried to address your concerns, but I don't understand what is wrong about the formatting of my doc blocks. Can you point me to a specific example that violates coding standards?
Comment #48
pillarsdotnet commentedComment #49
bfroehle commentedIt is hard for me to imagine other (legitimate) use cases except the one you provide. In other circumstances the developer would often be much better off refactoring their code to avoid this kludge.
Comment #50
pillarsdotnet commentedI'm sorry that you find my solution inelegant. Can you suggest an alternative way for me to safely resolve a URL to a file within the current site? Note that unlike the solution @sun provides in his Local image input filter, I want to include the possibility that several domains point to the same Drupal site.
It is hard for me to imagine a case where copying 90% of a core function into module code just to avoid running the other 10% would be preferable to refactoring that core function into reusable parts.
Nevertheless, I am maintaining two such copies, both in Emogrifier and in Mail MIME.
I suppose that I could write an entire module just to supply the missing function, but that seems ... ugly.
Comment #50.0
pillarsdotnet commentedAcknowledge bfroehle's redaction and add another possible use-case.
Comment #51
lars toomre commented@pillarsdotnet - re: docblock formatting -
My understanding is that is that there should be no blank lines between two or more @param definitions. Also, there always should be a blank line before an @return definition. Finally, the first line of a docblock should begin with an active verb.
Hence, the docblock for find_conf_path() should include 'Finds ' and deletion of the blank line after first @param. The other two docblocks need a blank line before @return.
Comment #52
damien tournoud commentedI see.
Well, given that limited use case and the comments above, I recommend only getting in the
bootstrap.incpart.Comment #52.0
pillarsdotnet commentedCorrected awkward grammar.
Comment #53
pillarsdotnet commentedFair enough. Re-rolled. (Thanks, Lars!)
Note that it would be nice to consider other sites that coincidentally share the same server, but doing so with a reasonable level of confidence would require (at least) a DNS lookup starting from the global root servers, and possibly some assumptions regarding the webserver and filesystem configuration.
Comment #54
pillarsdotnet commentedUpdated title and summary.
Comment #55
damien tournoud commentedOk here.
Comment #55.0
damien tournoud commentedUpdated in response to Damien's suggestion in #52.
Comment #55.1
pillarsdotnet commentedLinked to bfroehle's redacted comment.
Comment #55.2
pillarsdotnet commentedUpdated to reflect RTBC status.
Comment #56
pillarsdotnet commentedThanks. Added an API change node (please review -- I don't know what belongs in most of the fields), updated the summary, and removed the "Needs committer feedback" tag.
Comment #57
bfroehle commentedObviously I was referring to your use in the one "non-other" circumstances. Nonetheless, it is too bad that there isn't another more robust way to accomplish the same thing --- like a custom theme_image which would alter the URL and record that the image should be attached.
Comment #58
pillarsdotnet commentedAre you referrring to this function?
Comment #59
catchCommitted to 8.x, thanks!
Comment #60
jessebeach commentedI noticed some strange behavior with multisite while installing a fresh site off of d8 head (commit 8157c28).
I have two sites in my sites directory:
d8.drupal.dev
default
When I tried to reinstall at http://d8.drupal.dev/install.php, after dropping the tables in my drupal_d8 database, I got a message from the installer that my site was already installed. On a whim, I changed the name of my default directory to default2 and tried installing. This brought me to the installer.
Looking further, it seems that this patch (commit b93f2bc), might be causing this behavior where the default site is being used in lieu of the more specific site I've indicated.
This is what I see in bootstrap.inc in debug
https://skitch.com/jesse.beach/fh6b1/netbeans-ide-7.0
The for loop is never entered and it jumps right to default:
https://skitch.com/jesse.beach/fh6nf/drupal-dev-d8-netbeans-ide-7.0
The more specific site is not used. The default site settings.php file gets loaded.
Comment #61
pillarsdotnet commented@jessebeach:
Please open a new issue and link it here. Also please verify whether reversing the commit patch solves your problem, as I do not see any possible way that your problem could be related to the change made in this issue.Note that none of the code lines referenced by your screenshots have been changed; they were just moved without modification from one place to another.EDIT: Retracted -- see next comment, and thanks for the report!
Comment #62
pillarsdotnet commentedBah! Now I see the problem. The parameters
$script_nameand$http_hostsomehow got reversed. (Scrambling to find/fix collateral breakage...)Comment #63
ksenzeeThat patch fixes the problem for me. I'm wondering how it got past testbot. Do we not have any multisite tests? And if not, how the heck would we write any?
Comment #64
jessebeach commentedRe #62:
That did it! I'm able to install from the specific site settings.php file now. Thanks @pillarsdotnet!
Comment #65
pillarsdotnet commentedNo.
Dunno. I don't see any open issues. Feel free to open one.
EDIT: Nevermind. Opened #1294974: PIFR should provide a way to test multisite installations.
Comment #66
catchI have a feeling we don't and potentially may never have multi-site tests - can someone open a followup issue to confirm that is really true.
I'm heading off so can't commit tonight, but if it's still here in the morning will rollback or commit whatever is here unless webchick beats me too it.
Comment #67
webchickCool, committed and pushed #62 to 8.x. I'll be around for the next 12-14 hours if anything else happens.
Thanks for the quick turnaround on the fix, pillarsdotnet!
Comment #68
webchick#1295046: Determine testing status of multisite installs (and if it is possible) filed for the follow-up. I didn't know if it should be major or not, so I left it at normal.
Comment #69
pillarsdotnet commentedAt least one of the following should be closed as a duplicate:
#1117308: It should be possible to test things that depend on base_path() and conf_path()
#1294974: PIFR should provide a way to test multisite installations.
#1295046: Determine testing status of multisite installs (and if it is possible)
Comment #70.0
(not verified) commentedAdded API change node.