Closed (won't fix)
Project:
Drupal core
Version:
6.x-dev
Component:
base system
Priority:
Minor
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
6 Dec 2006 at 03:02 UTC
Updated:
12 Apr 2009 at 18:05 UTC
When using the following URL, Drupal renders the front page instead of giving a 404 error.
index.php/content/article/9/page3:1/
I disabled path and pathauto with no effect.
Comments
Comment #1
jose reyero commented
Where did you get that URL from? It doesn't look like one produced by Drupal core alone.
Also, I understand that you have some other contributed modules enabled. Thus, these shouldn't be filed as Drupal core bugs unless they have nothing to do with other modules (which may not be updated yet, by the way).
Comment #2
ByteEnable commented
It doesn't really matter where it came from; it's invalid and should return a 404, not render the front page. Only core modules were turned on.
Byte
Comment #3
dman commented
It certainly does matter where you got the URL from.
Drupal's URL resolution allows arbitrary unspecified arguments to be passed to page generation items, so any URL that extends underneath an existing valid URL will cascade back to find the best match.
That is, if /node/77 resolves OK, so does /node/77/edit (which is a supported special case), but so does /node/77/anything/you/like. Drupal goes looking for what you want (a module may have been implemented that catches that path) and will eventually decide to give you the /node/77 page anyway.
This 'problem' is often brought up in amateur SEO speculation, which misses the point that the nearest recursive match is one of the most robust fallbacks possible in content management.
If you invent a bad URL like that example, the server figures that you were asking for :
index.php/content/article/9/page3:1
.. no luck, or
index.php/content/article/9/ with parameters (page3:1)
.. no luck, or
index.php/content/article/ with parameters (9,page3:1)
.. no luck, or
index.php/content/ with parameters (article,9,page3:1)
.. no luck, or
index.php with parameters (content,article,9,page3:1)
SCORE!
So index.php gets served, and happens to ignore the parameters you chose to give it.
This is how every single page request happens throughout Drupal.
Without it, for example, this URL would give a 404:
http://drupal.org/search/node/SEO
because it's not a real page; it's the result of calling the 'search' routine with the parameters 'node' and 'SEO'.
All URLs are converted to query strings before Drupal resolves them. If you really want to disable this behaviour, you can do so by removing the rewrite rule in .htaccess.
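The prefix fallback described in this comment can be sketched roughly as follows. This is an illustration in Python, not Drupal's actual menu code: the `resolve` function and the `routes` table are hypothetical names invented for the example, and what happens when nothing matches (a 404 or the front page) is left to the caller, since that is exactly what this issue disputes.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical handler type: receives the leftover path segments as arguments.
Handler = Callable[[List[str]], str]


def resolve(path: str, routes: Dict[str, Handler]) -> Optional[Tuple[str, List[str]]]:
    """Return (matched_prefix, extra_args) for the longest registered prefix.

    Returns None when no prefix of the path is registered.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    # Try the full path first, then drop one trailing segment at a time.
    for cut in range(len(parts), 0, -1):
        prefix = "/".join(parts[:cut])
        if prefix in routes:
            return prefix, parts[cut:]
    return None


# Illustrative route table (assumed, not taken from Drupal).
routes: Dict[str, Handler] = {
    "node": lambda args: "node page",
    "search": lambda args: "search " + " ".join(args),
}

print(resolve("search/node/SEO", routes))            # ('search', ['node', 'SEO'])
print(resolve("node/77/anything/you/like", routes))  # ('node', ['77', 'anything', 'you', 'like'])
print(resolve("content/article/9/page3:1", routes))  # None
```

In this sketch the unmatched segments are simply handed on as arguments, which is why a handler that ignores them still returns a page.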
Comment #4
ByteEnable commented
Nah, it's not the server, it's Drupal. Have you tested the URL? Which rewrite in .htaccess is responsible for this behaviour?
Byte
Comment #5
jose reyero commented
Agree with dman.
You really can have all kinds of fabricated URLs that don't make full sense but still take you to a valid page.
So I wouldn't call this a bug. For the moment, I'm changing the priority to 'minor'.
Comment #6
ByteEnable commented
Have any of you *actually* tried to render the URL on Drupal 5.0 Beta 2? The page does not render well :). All these comments you people are making are just assumptions and suppositions, not hard facts about Drupal 5.0 Beta 2.
Byte
Comment #7
RobRoy commented
This is minor since it's pretty uncommon, but still worth checking out. I went to index.php/content/article/9/page3:1/ on my site and the theme does not get invoked, so not only is the content unthemed, it is also the front page instead of a 404. Why does this matter? Let's say you were moving over from an old CMS that generated URLs like this: any old Google references would generate tons of duplicate content for your site.
So this isn't super likely, but it is a bug and the SEO implications are pretty gnarly if you were moving from some crap CMS to Drupal.
Comment #8
RobRoy commented
Still get this on a fresh copy of HEAD.
Comment #9
dman commented
I see that with some server settings (but not all),
/index.php/bad/path/
and
/index/bad/path/
do indeed behave differently (and render wrongly) compared with the usual catch-all
/anything-else/bad/path/
However, seeing as my deep explanation into the practical mechanics of Drupal URL-resolution was just 'supposition' whilst ByteEnable claims to have the 'hard facts', I'll leave it to someone else to discover the easy fix for this for themselves. Meh.
Comment #10
brian_may commented
This bug means that broken links to other pages go undetected.
For example, our website had a link to node/47 instead of node/74.
If Drupal reported a 404 instead of falling back to the default front page, tools such as webcheck could pick up on the broken link. Unfortunately, it didn't, so a number of broken links went undetected.
Also, on our website we have redefined the front page to point elsewhere, so the default page isn't the front page.
My solution for now has been to change the node_page_default() function in node.module to
Presumably you could also change it so that it only displays node_page_default() if arg(1) == "", so it doesn't mess up if your front page requires this functionality (not tested).
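The change itself is not shown in this thread, and the following is not that patch. It is only a minimal Python sketch of the behaviour being asked for: show the default listing when no node ID is given, and return a 404 when the ID does not exist. The `node_page` function, the `nodes` dictionary, and the response strings are all hypothetical.

```python
from typing import Dict, List, Tuple


def node_page(args: List[str], nodes: Dict[int, str]) -> Tuple[int, str]:
    """Return (HTTP status, body) for a request to node/<args...>.

    `nodes` is a stand-in for the node table, mapping node ID to content.
    """
    if not args:
        # No node ID given: show the default listing.
        return 200, "default node listing"
    if args[0].isdecimal() and int(args[0]) in nodes:
        return 200, nodes[int(args[0])]
    # Unknown or malformed node ID: report it instead of falling back.
    return 404, "Page not found"


nodes = {74: "Article 74"}
print(node_page([], nodes))      # (200, 'default node listing')
print(node_page(["74"], nodes))  # (200, 'Article 74')
print(node_page(["47"], nodes))  # (404, 'Page not found')
```

With a response like the last one, a link checker would see the mistyped node/47 link as broken.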
Comment #11
xqus commented
First of all, index.php/content/article/9/page3:1/ is not a valid Drupal URL.
It should be index.php?q=content/article/9/page3:1/ or just domain.org/content/article/9/page3:1/
The reason this URL returns the front page is probably that the web server has a look-back feature (Apache certainly has): if it can't find a file, it goes up one level and tries that one before returning a 404. So in your example it will end up just serving index.php, which is indeed the front page.
If you are using Apache, you can try to set
AcceptPathInfo Off
in your .htaccess file.
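The look-back described in this comment can be sketched as below. It is a simplified Python model, not Apache's implementation: `split_path_info` and the `files` set are hypothetical, and rewrite rules are not modelled. The `accept_path_info` flag stands in for the effect of AcceptPathInfo Off, where a request with trailing path information after a real file is rejected.

```python
from typing import Optional, Set, Tuple


def split_path_info(
    url_path: str, files: Set[str], accept_path_info: bool = True
) -> Optional[Tuple[str, str]]:
    """Split a request path into (script, path_info); None means 404.

    `files` is a stand-in for the files that exist under the document root.
    """
    parts = [p for p in url_path.split("/") if p]
    # Walk up the path until an existing file is found.
    for cut in range(len(parts), 0, -1):
        script = "/" + "/".join(parts[:cut])
        if script in files:
            path_info = "/" + "/".join(parts[cut:]) if cut < len(parts) else ""
            if path_info and not accept_path_info:
                # Trailing path information is not accepted: treat as not found.
                return None
            return script, path_info
    return None


files = {"/index.php"}
url = "/index.php/content/article/9/page3:1/"
print(split_path_info(url, files))         # ('/index.php', '/content/article/9/page3:1')
print(split_path_info(url, files, False))  # None
```

In the first call index.php is served and the trailing segments are passed along as path information, which matches the behaviour reported at the top of this issue.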
Comment #12
RobRoy commented
Looks like #11 will do the trick, so I'm marking this "won't fix" as it's actually a web server issue, not a Drupal one.
Comment #13
brian_may commented
#11 does nothing to answer my concern in #10. If a user tries to reference a node that doesn't exist, then Drupal should produce a 404 error. Instead it jumps to the front page and pretends the node exists. This is not an Apache issue; Apache doesn't know which node numbers exist. Only Drupal does.
However I see that (a) several new releases of Drupal have been released since I last looked, and (b) maybe #10 is a different bug to the rest of the bug report. If after testing the latest version I find the problem discussed in #10 still exists I will open a new bug report.