When using the following URL, Drupal renders the front page instead of giving a 404 error.

index.php/content/article/9/page3:1/

I disabled path and pathauto with no effect.

Comments

jose reyero’s picture

Where did you get that URL from? It doesn't look like one produced by Drupal core alone.

Also, I understand that you have some other contributed modules enabled. These shouldn't be filed as Drupal core bugs unless they have nothing to do with other modules, which may not be updated yet, btw.

ByteEnable’s picture

It doesn't really matter where it came from; it's invalid and should invoke a 404, not render the front page. Only core modules were turned on.

Byte

dman’s picture

It certainly does matter where you got the URL from.

Due to the URL-resolution used in Drupal that allows arbitrary unspecified arguments to be passed to page generation items, any URL that extends underneath an existing valid URL will cascade back to find the best match.

That is, if /node/77 resolves OK, so does /node/77/edit (which is a supported special case), but so does /node/77/anything/you/like. Drupal will go looking for what you want (a module may have been implemented that catches that path) and will eventually decide to give you the /node/77 page anyway.

This 'problem' is often brought up in amateur SEO speculation and misses the point that the nearest recursive match is one of the most robust fallbacks possible in content management.

If you invent a bad URL like that example, the server figures that you were asking for:

index.php/content/article/9/page3:1
.. no luck, or
index.php/content/article/9/ with parameters (page3:1)
.. no luck, or
index.php/content/article/ with parameters (9,page3:1)
.. no luck, or
index.php/content/ with parameters (article,9,page3:1)
or
index.php with parameters (content,article,9,page3:1)
SCORE!
So index.php gets served, and happens to ignore the parameters you chose to give it.

This is how every single page request happens throughout Drupal.

Without it, for example, this URL would give a 404:
http://drupal.org/search/node/SEO
because it's not a real page, it's a result of calling the 'search' routine with the parameters 'node' and 'SEO'
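The fallback described above can be sketched in a few lines. This is only an illustration of the "nearest match" idea as explained in this comment, written in Python rather than Drupal's PHP; the handler table and the resolve() function are hypothetical names, not Drupal's actual menu code.

```python
# Hypothetical handler table: registered paths mapped to a label for the
# page they would serve. None of these names come from Drupal itself.
HANDLERS = {
    "index.php": "front page",
    "search": "search page",
    "node/77": "node 77",
}


def resolve(path, handlers):
    """Return (matched_path, extra_args) for the longest registered prefix.

    Returns (None, all_segments) when no prefix of the path is registered,
    which is the case where a 404 would be appropriate.
    """
    segments = [s for s in path.split("/") if s]
    # Try the full path first, then drop one trailing segment at a time.
    for cut in range(len(segments), 0, -1):
        candidate = "/".join(segments[:cut])
        if candidate in handlers:
            return candidate, segments[cut:]
    return None, segments
```

With this table, resolve("search/node/SEO", HANDLERS) matches "search" and hands over "node" and "SEO" as arguments, while the bad URL from this report falls all the way back to "index.php" with four unused arguments.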

All URLs are converted to query strings before Drupal resolution. If you really want to disable this behaviour, you can do so by removing the rewrite rule in .htaccess
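As a rough picture of that conversion, here is a minimal Python sketch. It assumes the rewrite has the common catch-all form that maps any path to index.php?q=<path>; the exact rule in a given .htaccess may differ, and both function names are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def clean_to_query(path):
    """Mimic a catch-all rewrite: /search/node/SEO -> index.php?q=search/node/SEO."""
    return "index.php?q=" + path.lstrip("/")


def query_to_path(url):
    """Recover the internal path from the q parameter; empty string if absent."""
    values = parse_qs(urlsplit(url).query).get("q", [""])
    return values[0]
```

So clean_to_query("/search/node/SEO") gives "index.php?q=search/node/SEO", and query_to_path() of that string gives back "search/node/SEO", which is the path that then goes through resolution.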

ByteEnable’s picture

Nah, it's not the server, it's Drupal. Have you tested the URL? Which rewrite in .htaccess is responsible for this behaviour?

Byte

jose reyero’s picture

Priority: Normal » Minor

Agree with dman,

You really can have all kinds of fabricated URLs that don't make full sense but still take you to a valid page.

So I wouldn't call this a bug. For the moment, I am changing the priority to 'minor'.

ByteEnable’s picture

Have any of you *actually* tried to render the URL on Drupal 5.0 Beta 2? The page does not render well :). All these comments you people are making are just assumptions and suppositions, not hard facts about Drupal 5.0 Beta 2.

Byte

RobRoy’s picture

Version: 5.0-beta2 » 5.x-dev

This is minor since it's pretty uncommon, but still worth checking out. I went to index.php/content/article/9/page3:1/ on my site and the theme does not get invoked, so not only is the content unthemed, it is also the front page instead of a 404. Why does this matter? Let's say you were moving over from an old CMS that generated URLs like this: any old Google references would generate tons of duplicated content for your site.

So this isn't super likely, but it is a bug and the SEO implications are pretty gnarly if you were moving from some crap CMS to Drupal.

RobRoy’s picture

Version: 5.x-dev » 6.x-dev

Still get this on a fresh copy of HEAD.

dman’s picture

I see that with some server settings (but not all)
/index.php/bad/path/
and
/index/bad/path/

do indeed behave differently (and render wrongly) compared with the usual catch-all
/anything-else/bad/path/

However, seeing as my deep explanation into the practical mechanics of Drupal URL-resolution was just 'supposition' whilst ByteEnable claims to have the 'hard facts', I'll leave it to someone else to discover the easy fix for this for themselves. Meh.

brian_may’s picture

This bug means that broken links to other pages go undetected.

For example, our website had a link to node/47 instead of node/74.

If Drupal reported a 404 instead of falling back to the default front page, then tools such as webcheck could pick up on the broken link. Unfortunately it didn't, so a number of broken links went undetected.

Also, on our website we have redefined the front page to point elsewhere, so the default page isn't even the front page.

My solution for now has been to change the node_page_default() function in node.module to

function node_page_default() {
  // Return a 404 straight away; the rest of the original body is never reached.
  return drupal_not_found();

   ...
}

Presumably you could also change it so that node_page_default() is only displayed if arg(1) == "", so it doesn't mess things up if your front page requires this functionality (not tested).
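The untested idea above amounts to a small decision rule. Here is a sketch of it in Python rather than PHP, purely to show the logic; node_page(), its arguments, and the set of existing node IDs are hypothetical stand-ins and not Drupal functions.

```python
def node_page(args, existing_nodes):
    """Decide what a 'node' path should render.

    args mirrors the path segments, e.g. "node/47" -> ["node", "47"].
    existing_nodes is a set of node IDs known to exist (an assumption
    standing in for a database lookup).
    """
    # No second segment: the plain "node" path shows the default listing.
    if len(args) < 2 or args[1] == "":
        return "default listing"
    # A numeric second segment must refer to a node that actually exists.
    if args[1].isdigit() and int(args[1]) in existing_nodes:
        return "node " + args[1]
    # Anything else is a broken link and should be reported as such.
    return "404"
```

With existing_nodes = {74}, the mistyped link node/47 comes back as "404" while node/74 and the bare node path still work, which is what a link checker needs to see.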

xqus’s picture

First of all, index.php/content/article/9/page3:1/ is not a valid Drupal URL.
It should be index.php?q=content/article/9/page3:1/ or just domain.org/content/article/9/page3:1/

The reason this URL returns the front page is probably that the web server has a look-back feature (Apache certainly has): if it can't find a file, it goes up one level and tries that one before returning a 404. So in your example it will end up just serving index.php, which is indeed the front page.

If you are using Apache, you can try to set

AcceptPathInfo Off

in your .htaccess file.
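To make the look-back behaviour concrete, here is a simplified Python model of it. It is a sketch under assumptions, not Apache's implementation: map_request() and the existing_files set are invented for illustration, and the accept_path_info flag only stands in for the effect of the AcceptPathInfo directive.

```python
def map_request(path, existing_files, accept_path_info=True):
    """Return (script, path_info) for a request path, or None for a 404.

    existing_files is a set of paths that exist on disk, e.g. {"/index.php"}.
    """
    path = "/" + path.lstrip("/")
    segments = path.strip("/").split("/")
    # Walk back from the full path until an existing file is found.
    for cut in range(len(segments), 0, -1):
        candidate = "/" + "/".join(segments[:cut])
        if candidate in existing_files:
            # Whatever trails the script name is passed along as path info.
            path_info = path[len(candidate):]
            if path_info and not accept_path_info:
                return None  # trailing path rejected: behaves like a 404
            return candidate, path_info
    return None
```

In this model the URL from the report maps to ("/index.php", "/content/article/9/page3:1/") when path info is accepted, and to None (a 404) when it is switched off, while a plain request for /index.php is unaffected either way.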

RobRoy’s picture

Status: Active » Closed (won't fix)

Looks like #11 will do the trick so marking this won't fix as it's actually a web server issue, not Drupal.

brian_may’s picture

#11 does nothing to answer my concern in #10. If a user tries to reference a node that doesn't exist, then Drupal should produce a 404 error. Instead it jumps to the front page and pretends the node exists. This is not an Apache issue; Apache doesn't know which node numbers exist. Only Drupal does.

However I see that (a) several new releases of Drupal have been released since I last looked, and (b) maybe #10 is a different bug to the rest of the bug report. If after testing the latest version I find the problem discussed in #10 still exists I will open a new bug report.