Hello,

Perhaps someone can help me out. I found this post that summarizes my issue, but it doesn't really explain the way to resolve it:
http://www.ubercart.org/forum/support/19084/stop_catalog_category_redirect

I have a client using Drupal 6 + Ubercart + the Catalog submodule of Ubercart to display categories of products. If someone goes to a good URL, there are no problems. For example, if they go to http://www.mystore.com/catalog/t-shirts, and that category exists, then no issues.

However, if someone goes to a URL that does NOT exist, let's say http://www.mystore.com/catalog/something-that-does-not-exist, then it takes the user to http://www.mystore.com/catalog rather than to a 404 Not Found page. The problem is that Google has somehow indexed a large handful of bad URLs, possibly from a previous version of the site, and when we go to those pages, they all redirect to the same place: mystore.com/catalog.

Google doesn't like this because they think we have a rather large number of pages with the exact same page content (in fact, the number of pages is infinite: mystore.com/catalog/mickey-mouse, mystore.com/catalog/m-i-s-s-i-s-s-i-p-p-i, and so on). We have used the redirect module to handle a handful of them, but it's not a total fix as new ones are found, and it doesn't alter the behavior.

Furthermore, I don't like this behavior either. I am using a very nifty module called search404 that intelligently suggests what actual page a user might be looking for, based on keywords it finds in the URL the user tried. So if there is a page at mystore.com/t-shirts, and the user tried mystore.com/shirts, then search404 would suggest the t-shirts page. But when the site just redirects to /catalog instead of serving up a 404 page, I lose that cool feature.

So, very long explanation. Sorry about that. But the question I have is: how can I alter the Catalog submodule of Ubercart in Drupal 6 to NOT behave this way? I want it to serve up a 404 instead of going to /catalog, and I'm at the point where I am willing to hack the uc_catalog module to do it (gasp).

My apologies if this is a duplicate post.

Thanks!

Comments

longwave

Status: Active » Fixed

This is a feature of Drupal itself, rather than a bug in Ubercart. Paths without aliases resolve to their parent path if the child does not exist - for example /catalog and /node/% are unaliased paths, so if you visit /catalog/something-that-does-not-exist or /node/1/something-that-does-not-exist you will see the parent page in both cases.
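To illustrate the fallback described above, here is a simplified PHP sketch of the matching idea: try the full request path first, then keep dropping trailing parts until a registered path matches. This is only a model, not Drupal's actual menu code; the function name sketch_match_router_path() and the list of registered paths are made up for illustration, and real Drupal ranks competing candidates more carefully than this.

```php
<?php
/**
 * Simplified model of parent-path fallback. NOT Drupal's real router code.
 *
 * @param array $router_paths
 *   Registered paths, where '%' stands for any single path part.
 * @param string $request_path
 *   The requested path, e.g. 'catalog/foo'.
 *
 * @return array|null
 *   The matched registered path plus the leftover parts, or NULL if
 *   nothing matches (the only case that would be a real 404).
 */
function sketch_match_router_path(array $router_paths, $request_path) {
  $parts = explode('/', trim($request_path, '/'));
  // Try the longest candidate first, then drop trailing parts one by one.
  for ($length = count($parts); $length > 0; $length--) {
    foreach ($router_paths as $router_path) {
      $pattern = explode('/', $router_path);
      if (count($pattern) != $length) {
        continue;
      }
      $matches = TRUE;
      foreach ($pattern as $i => $piece) {
        if ($piece !== '%' && $piece !== $parts[$i]) {
          $matches = FALSE;
          break;
        }
      }
      if ($matches) {
        // Leftover parts are passed along as extra arguments to the page.
        return array(
          'path' => $router_path,
          'extra' => array_slice($parts, $length),
        );
      }
    }
  }
  return NULL;
}

// Hypothetical registered paths, for illustration only.
$router_paths = array('catalog', 'node/%', 'node/%/edit');
print_r(sketch_match_router_path($router_paths, 'catalog/something-that-does-not-exist'));
print_r(sketch_match_router_path($router_paths, 'node/1/something-that-does-not-exist'));
```

In this model the bogus catalog URL still matches 'catalog', with the unknown part handed over as an extra argument, which is why the page that renders has to decide for itself whether to return a 404.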

Instead of fixing this by returning 404, you might want to try installing a module that provides a "canonical URL" such as canonical_url or nodewords. This should tell Google and other search engines that /catalog and /catalog/something-that-does-not-exist are the same page and the correct URL is just /catalog, and eventually they should forget about the unwanted pages. You can also use xmlsitemap module to give further hints to search engines about the URLs they should index.

If you really want to return 404s instead for these URLs, there is a sandbox module which claims to be able to do this at https://drupal.org/sandbox/pwolanin/1197046 but I have not tested this.

longwave

Alternatively, a quick hack to fix this in 6.x might be to override theme_uc_catalog_browse() in your template.php and call drupal_not_found() (or redirect to a 404 page) if arg(1) is not a number. I think this should also work.
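A minimal, untested sketch of that override might look like the following. It assumes a theme named "mytheme" (hypothetical), and that theme_uc_catalog_browse() takes the term ID as its only argument, which should be checked against your copy of uc_catalog. The helper mytheme_catalog_arg_is_valid() is made up for this example. The exit() call is there on the assumption that drupal_not_found() in Drupal 6 prints the complete 404 page itself.

```php
<?php
/**
 * Hypothetical helper: TRUE when the catalog argument is missing (the
 * root /catalog page) or consists only of digits (a term ID).
 */
function mytheme_catalog_arg_is_valid($arg) {
  return $arg === NULL || preg_match('/^[0-9]+\z/', (string) $arg) === 1;
}

/**
 * Overrides theme_uc_catalog_browse(). Goes in the theme's template.php.
 *
 * Assumption: the original theme function accepts a single $tid argument.
 */
function mytheme_uc_catalog_browse($tid = 0) {
  // arg(1) is the second part of the internal (unaliased) path, so a
  // working alias such as catalog/t-shirts shows up here as a number.
  if (!mytheme_catalog_arg_is_valid(arg(1))) {
    drupal_not_found();
    // Stop here so the catalog page is not rendered after the 404 page.
    exit();
  }
  // Valid request: fall through to Ubercart's default output.
  return theme_uc_catalog_browse($tid);
}
```

After adding an override like this, the theme registry needs to be cleared before Drupal will pick it up. Note that this only checks that the argument is numeric; it does not check that a term with that ID actually exists.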

TR

Status: Fixed » Active

Well, that's a Drupal thing that you don't like, not an Ubercart thing. Try it yourself; this issue has a URL of https://drupal.org/node/2083667 - go ahead and add on some bogus path like you did with the catalog, for instance https://drupal.org/node/2083667/m-i-s-s-i-s-s-i-p-p-i. What happens?

In fact, more generally, that's the way HTTP URLs work. To do something otherwise is best handled within Drupal core or by a contributed module.

404s are not great for your page ranking either. The best thing to do in your situation IMO is to add redirect entries in your .htaccess for those improperly indexed pages to force Google and other bots to remove the bad links from their index. As an example:

RewriteRule    ^catalog/bad-link$    http://example.com/catalog/good-link    [R=301,L]

or

RewriteRule    ^catalog/bad-link$    http://example.com/obsolete-product    [R=301,L]

(where /obsolete-product is a page you create that explains this is no longer a valid product and offers links to the catalog etc.)

And yeah, there's probably also a way to hack Ubercart to reject bad terms, but that's not something we're going to put into Ubercart for Drupal 6 because that version is almost end-of-life. And the catalog has been re-written in Views for Drupal 7, so changing this behavior falls to Views, not Ubercart, in D7.

TR

Status: Active » Fixed

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.