Running 2 websites:
A: http://www.mywebsite.com (Drupal)
B: http://www.mywebsite.com/mysubsite (non-Drupal)
The Drupal website is multilingual. Currently I'm unable to link to pages of the subsite because Pathologic rewrites those URLs and adds the language code to them.
For example: a link to http://www.mywebsite.com/mysubsite/my-page gets rewritten to http://www.mywebsite.com/fr/mysubsite/my-page .
What are my options?
Comments
Comment #1
Garrett Albright commented:
Arg, yeah, that's a tough case. Could you maybe have the non-Drupal site use a separate subdomain instead of just a directory under your main domain name - so something like mysubsite.mywebsite.com?
Comment #2
rp7 commented:
Thought about that, but unfortunately it's not possible in my scenario (or at least in the client's scenario). It's not one subsite, it's dozens.
How do you like the idea of implementing a hook that makes it possible to skip Pathologic rewrites (for example, only if the hook implementation returns FALSE)?
I'm willing to write a patch for this.
Comment #3
freblasty commented:
This patch will allow modules to implement their own URL generator logic using hook_pathologic_url(). If a module has no interest in a particular URL, it can return NULL, which will cause Pathologic to move on to the next hook implementation.
The Pathologic URL generator logic is used as a fallback if none of the implementing modules generated a URL. This ensures that websites already using the Pathologic module won't break.
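To make the dispatch order concrete, here is a minimal standalone sketch of the "first non-NULL result wins, otherwise fall back" behaviour described above. It is not the patch itself: example_generate_url() and the two closures are hypothetical stand-ins for the hook implementations and for Pathologic's own generator.

```php
<?php
/**
 * Returns the first non-NULL URL produced by a generator, else the fallback's.
 *
 * $generators stands in for modules implementing the proposed hook;
 * $fallback stands in for Pathologic's built-in URL logic (both assumptions).
 */
function example_generate_url($original, array $generators, $fallback) {
  foreach ($generators as $generator) {
    $url = $generator($original);
    // NULL means "not interested"; anything else ends the search.
    if ($url !== NULL) {
      return $url;
    }
  }
  return $fallback($original);
}

// Hypothetical generator: claims subsite links and returns them unchanged.
$skip_subsite = function ($url) {
  return strpos($url, '/mysubsite/') !== FALSE ? $url : NULL;
};

// Hypothetical fallback: mimics the language-prefix rewrite from the report.
$default_rewrite = function ($url) {
  return str_replace('http://www.mywebsite.com/', 'http://www.mywebsite.com/fr/', $url);
};
```

With these, a subsite link passes through untouched while every other link still gets the default rewrite.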
Comment #4
rp7 commented:
+1 for commit
Comment #5
Garrett Albright commented:
I'm just not sold on the idea for this yet… It seems like an unnecessary complication. Can you change my mind?
Comment #6
Vacilando commented:
+1 for @freblasty's patch!
Comment #7
rp7 commented:
One use case is the one I described in my original post. I can imagine there are more websites which are set up like this.
Implementing this hook, I could just return the same URL & stop Pathologic from further processing it, which solves my problem.
Comment #8
rp7 commented:
against latest dev
Comment #9
freblasty commented:
Another use case is contextual link generators: adding node references to a text using Linkit will cause the text to contain anchor tags pointing to node/nid. However, when the Pathologic filter processes the text, it ignores the language of the node and applies the default display language to the URL (depending on your language detection setting).
Using the patch from #3 and #8 will allow module developers to write contextual link generators. In the situation above, a link generator could check the node language and generate the appropriate language URL.
In short: Pathologic can gain contextual knowledge about the URLs it is processing without having to worry about the implementation details.
Please note that the patch currently uses a hook but this could also be implemented as a plugin for pathologic.
Comment #10
Andrey Inkin commented:
The alternative solution would be to create a setting field 'Exclude URLs' and check for that field before processing a URL.
Here I've done it with regular expressions:
in settings you specify
/\/mysubsite\//
and in _pathologic_replace, before it goes into all the processing, it runs the URL against this regex.
Comment #11
freblasty commented:
I still prefer a hook or plugin over the regex solution because it allows more flexibility from a developer standpoint.
Comment #12
Garrett Albright commented:
RaF and freblasty, instead of hooks which return a completed URL, would it be acceptable to have hooks that alter the parameters that Pathologic eventually passes to url() itself? So the code I'd add to Pathologic would look something like…
And your modules would implement something that looked like this:
I think this is a bit more subtle than the approach in the patch, but will likely still allow you to accomplish what you need to. Please let me know.
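As a rough illustration of the module side of this idea, an implementation might look like the sketch below. The hook name mymodule_pathologic_alter(), its signature, and the "downloads/" path are assumptions made for the example; the only thing taken as given is that $url_params carries the 'path' and 'options' that would be handed to url(), where 'absolute' is a standard url() option.

```php
<?php
/**
 * Hypothetical alter-hook implementation (name and signature are assumed).
 *
 * $url_params holds what would be passed to url(): 'path' and 'options'.
 * $parts and $settings are accepted but unused in this sketch.
 */
function mymodule_pathologic_alter(&$url_params, $parts, $settings) {
  // Only touch links into a hypothetical "downloads" section.
  if (strpos($url_params['path'], 'downloads/') === 0) {
    // Ask url() for an absolute URL instead of a relative one.
    $url_params['options']['absolute'] = TRUE;
  }
}
```

Because the hook only adjusts parameters, every implementing module gets a turn, and the final URL is still built in one place.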
Comment #13
freblasty commented:
@Garrett, the patch's sole purpose was to give a clear view of the functionality we need from Pathologic. The code changes mentioned in #12 should provide the same functionality.
Just one remark/question: what if two or more modules rewrite the URL params? This could result in some weird-looking URLs. In the patches from #3 and #8, only the first hook that returns a non-NULL value is processed and all the others are ignored.
Comment #14
Andrey Inkin commented:
According to the issue description, you need to exclude certain URLs from parsing, and with the #10 approach this is done more efficiently: at the beginning of the function, rather than deconstructing the URL and then reconstructing it with no change.
Besides, if the subsite URL has a 'q' parameter in it, then the $url_params['path'] variable is going to have the wrong value, which is essentially why we want to exclude the URL from processing.
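The early check from #10 could be sketched roughly as follows. This is only an illustration of the idea, not the code that was posted: example_url_is_excluded() is a hypothetical helper, and $patterns stands in for whatever the 'Exclude URLs' setting would contain.

```php
<?php
/**
 * Hypothetical helper: TRUE if $url matches any of the exclusion patterns.
 *
 * Each pattern is a full PCRE expression including delimiters, as in #10.
 */
function example_url_is_excluded($url, array $patterns) {
  foreach ($patterns as $pattern) {
    if (preg_match($pattern, $url) === 1) {
      return TRUE;
    }
  }
  return FALSE;
}

// The pattern given in #10; a real setting could hold several.
$patterns = array('/\/mysubsite\//');
```

A caller would run this before any URL parsing and return the original URL unchanged when it yields TRUE.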
Comment #15
Garrett Albright commented:
It's a valid concern, but not one I expect to come up very often - I'll bet that a year after this code gets into the module, probably less than 5% of sites are going to be using any modules which implement this alter hook at all, much less more than one. And in the rare cases where more than one is necessary, they won't necessarily conflict with each other. Still, the precedent of "let whatever modules want to alter this go ahead and do so, regardless of potential conflicts" is a more Drupally concept than "see if any modules want to alter this and stop after the first one."
Pathologic does account for the fact that the path might be in the 'q' query parameter;
…So if that's the reason that you're wanting to exclude paths that Pathologic processes, then either that's not a real problem, or there's a bug going on - perhaps do further testing and create another issue if things still seem broken.
That being said, perhaps there could be a way that an alter hook implementation says "just return the original path" and url() is bypassed.
Comment #16
Andrey Inkin commented:
What it's doing there with the 'q' parameter {$parts['path'] = $parts['qparts']['q'];} is correct, because in Drupal the 'q' parameter can only be a path. But in other applications 'q' can be anything, so there are cases where the URL is parsed wrongly:
http://my-drupal-site.com/non-drupal-subsite/?q=some-parameter
it becomes
http://my-drupal-site.com/some-parameter
I still think it's better to not go into URL parsing if you just need the original URL.
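The effect is easy to reproduce outside Drupal. The sketch below only imitates the "use 'q' as the path if present" step with plain PHP functions; example_extract_path() is a hypothetical name and this is not Pathologic's actual parsing code.

```php
<?php
/**
 * Imitates treating a "q" query parameter as the path (illustration only).
 */
function example_extract_path($url) {
  $parts = parse_url($url);
  $path = isset($parts['path']) ? $parts['path'] : '';
  if (isset($parts['query'])) {
    parse_str($parts['query'], $qparts);
    // In Drupal "q" is always a path, so it replaces the parsed path.
    if (isset($qparts['q']) && is_string($qparts['q'])) {
      $path = $qparts['q'];
    }
  }
  return trim($path, '/');
}
```

For the subsite URL above, this yields some-parameter rather than non-drupal-subsite, which is how the rebuilt link ends up pointing at the wrong place.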
Comment #17
freblasty commented:
You're right, it's a Drupally concept. When you think about it, a developer who implements an alter hook is handling some sort of special case and will therefore check for specific conditions, which should reduce the chance of a conflict.
Do you mean some sort of "do not process URL" functionality?
Comment #18
Garrett Albright commented:
This will be a case where a module that implements the hook I'm proposing could step in and stop Pathologic from inserting an altered URL (see below). No, it won't stop Pathologic from picking apart and trying to process the URL in the first place, but I don't really care - performance has always been somewhat of a secondary concern for Pathologic, since the output of text formats is (almost) always cached and only one node's content is filtered at a time in (almost) all cases anyway.
Something like that. I'm thinking that maybe we create a $url_params['options']['use_original'] which is FALSE by default, but we'll check it after the alter hooks run, and if it's TRUE, we just return the original URL without running anything through url(). I'll add a $parts['original'] parameter too, which will have the original unadulterated URL. So a hook implementation could do something like this:
Comment #19
dankobiaka commented:
+1 for Andrey Inkin's solution.
The hook / alter approach is great for more complex problems, but why not use both? Being able to input a simple regex using the administration form is much simpler and adequate for most situations.
The issue Andrey has described with regard to the "q" parameter is a big concern for us. It is commonplace for search engines to use "q" for their query parameter, but Pathologic strips it out.
Can we get a solution put together ASAP?
Comment #20
Garrett Albright commented:
Because implementing both means implementing both and manually testing both and writing automated tests for both and supporting both. And seeing as how the hook approach encompasses the regex text box approach, I'd rather just implement/test/etc that. If you know how to write regular expressions, creating a Drupal module with a hook implementation should be well within your abilities as well, at any rate. So that is what I shall do, probably over the weekend.
"We?" Oh, sure! Let me just email you my bank transfer details, and I'll start coding as soon as I see the deposit come in. :P
Comment #21
dankobiaka commented:
Seems like there are several people in this thread alone, myself included, who'd be more than happy to help you get this solution developed. My team has created a workaround for this issue in our local version; I'd just like to have the module properly updated so we can continue to get updates for this useful module.
Comment #22
freblasty commented:
+1 for Garrett Albright's alter hook implementation.
I like the idea, and it should definitely be added to the alter hook functionality.
Comment #23
Garrett Albright commented:
Bagged and tagged! I'll make a new release soon.
I've added a pathologic.api.php file which documents the new hook pretty well and has a couple examples of what you should be able to do with it. Everyone, please give it a try and report back here if you spot anything not working as expected. Also added a couple of tests to the .test file, if anyone cares.
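For the subsite case from the original post, an implementation along the lines discussed in #18 might look roughly like this. It is a sketch rather than a copy of anything in pathologic.api.php: the function name, the signature, and the exact array keys are assumptions based on the discussion above, so check the API file before relying on them.

```php
<?php
/**
 * Hypothetical implementation for a module named "mysubsite_links".
 *
 * Assumes, per #18, that $parts['original'] holds the untouched URL and
 * that setting 'use_original' makes Pathologic skip url() entirely.
 */
function mysubsite_links_pathologic_alter(&$url_params, $parts, $settings) {
  // Leave links into the non-Drupal subsite exactly as the author wrote them.
  if (isset($parts['original']) && preg_match('#/mysubsite(/|$)#', $parts['original']) === 1) {
    $url_params['options']['use_original'] = TRUE;
  }
}
```

Links elsewhere on the site are not matched, so they keep going through the normal rewrite and pick up the language prefix as before.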
Thanks to all.