Hi,

I have a live site on which I ran into a bug every now and then, but since I could never replicate it or find anything in the logs I kind of gave up. Now that I have a similar fresh install with the same search modules set up, and I'm seeing exactly the same issue there, I started digging.

Like I said, both sites have the same modules and setup concerning search:
- i18n
- search (core)
- search config
- search 404
- global redirect (not 100% sure whether this is part of the issue)
- search lucene api
- search lucene api node filter
- search lucene content
- search lucene did you mean

The issue is as follows: when I enter an invalid URL I get 3 warnings, each starting with 'The page you requested does not exist. For your convenience, a search was performed using the query'. Yet the query differs in each warning:

  • qqqqqqqqqqqqwwq1q1aaaa1a123
  • OR a OR href OR nl OR search OR luceneapi OR node OR s1123 OR class OR luceneapi OR dym OR suggestion OR s1123 OR a OR
  • a OR href OR nl OR a OR href OR nl OR search OR luceneapi OR node OR a OR href OR nl OR sealer OR luceneapi OR node OR s1123 OR class OR luceneapi OR dym OR suggestion OR s1123 OR a OR class OR luceneapi OR dym OR suggestion OR sealer OR a OR a OR href OR nl OR search OR luceneapi OR node OR a OR href OR nl OR search OR lucentezza OR node OR s1123 OR class OR luceneapi OR dym OR suggestion OR s1123 OR a OR class OR luceneapi OR dym OR suggestion OR lucentezza OR a OR a OR href OR nl OR search OR luceneapi OR node OR a OR href OR nl OR search OR luceneapi OR nodig OR s1123 OR class OR luceneapi OR dym OR suggestion OR s1123 OR a OR class OR luceneapi OR dym OR suggestion OR nodig OR a OR s1123 OR a OR href OR nl OR search OR luceneapi OR node OR a OR href OR nl OR search OR luceneapi OR node OR s1123 OR classy OR luceneapi OR dym OR suggestion OR s1123 OR a OR class OR luceneapi OR dym OR suggestion OR classy OR a OR a OR href OR nl OR search OR luceneapi OR node OR a OR href OR nl OR search OR luceneapi OR node OR s1123 OR class OR lucentezza OR dym OR suggestion OR s1123 OR a OR class OR luceneapi OR dym OR suggestion OR lucentezza OR a OR dym OR a OR href OR nl OR search OR luceneapi OR node OR a OR href OR nl OR search OR luceneapi OR node OR s1123 OR class OR luceneapi OR dym OR question OR s1123 OR a OR class OR luceneapi OR dym OR suggestion OR question OR a OR s1123 OR a.

Also, the URL I get redirected to is as follows:

http://dev.website.com/nl/OR%20a%20OR%20href%20OR%20nl%20OR%20%3Ca%20hre...

Now, this is important: to replicate this you can't keep trying the same invalid URL, you have to test with a 'fresh' invalid URL each time.

As I started digging in search404.module I found the call search404_goto($lucene_dym_path); on line #195. If I read the preg_replace call just above it correctly, the $lucene_dym_path variable doesn't take the possibility of i18n into account.

I SUCK at regular expressions, but I've changed the preg_replace on line #194 to:

$lucene_dym_path = preg_replace('/^.*href="\/?[a-z]{0,64}\/search\/luceneapi_node\/([^"]*)".*$/i', '$1', $suggestions);

This takes into account a possible language path prefix of at most 64 characters, as can be set on /en/admin/settings/language/edit/en

This prevents a string like <a href="/nl/search/luceneapi_node/OR a OR href OR nl OR sealer OR luceneapi OR node OR s1123 OR class OR luceneapi OR dym OR suggestion OR s1123 OR a OR" class="luceneapi-dym-suggestion">sealer</a> from being passed to the search404_goto() function!
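To illustrate why the whole anchor tag can end up in search404_goto(): preg_replace() returns its subject unchanged when the pattern does not match. The sketch below compares a prefix-unaware pattern with the pattern from this report. The prefix-unaware pattern is an assumption made up for illustration, not the actual code on line #194 of search404.module, and the suggestion markup is shortened to a single keyword.

```php
<?php
// Suggestion markup in the shape quoted in this report, shortened to one keyword.
$suggestions = '<a href="/nl/search/luceneapi_node/sealer" class="luceneapi-dym-suggestion">sealer</a>';

// ASSUMPTION: a pattern that expects no language prefix (illustration only,
// not the module's real line #194).
$unprefixed_pattern = '/^.*href="\/search\/luceneapi_node\/([^"]*)".*$/i';

// The pattern proposed in this report, copied as posted.
$prefixed_pattern = '/^.*href="\/?[a-z]{0,64}\/search\/luceneapi_node\/([^"]*)".*$/i';

// No match: preg_replace() hands back the full HTML string untouched.
$without_prefix_support = preg_replace($unprefixed_pattern, '$1', $suggestions);

// Match: the whole string is replaced by the captured keywords.
$with_prefix_support = preg_replace($prefixed_pattern, '$1', $suggestions);

var_dump($without_prefix_support === $suggestions); // bool(true)
var_dump($with_prefix_support);                     // string(6) "sealer"
```

If the unmatched HTML is then used as the search path, its words (a, href, nl, search, luceneapi, node, class, ...) would plausibly show up as the OR-joined keywords seen in the warnings above.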

But there is still an issue I can't put my finger on. When I use a totally invalid URL like example.com/en/qwqwqwqwq, for which Lucene DYM has no suggestion, things are fine: I get redirected to a 404 page with a single error message 'The page you requested does not exist. For your convenience, a search was performed using the query xxxppp.'

Now if I use an invalid URL that is similar to an existing one, like example.com/en/producst, I get redirected to example.com/en/products with an error message 'The page you requested does not exist. For your convenience, a search was performed using the query productx.' But like I said, I am redirected to example.com/en/products and not to a search results page. This happens even though neither 'Jump directly to the search result when there is only one result' nor 'Jump directly to the first search result even when there are multiple results' is checked.

Hope I've explained this well. I know Lucene has reached EOL, but I love this module in combination with Search404 and have it implemented on multiple sites, so I hope to fix this!

Cheers

Comments

zyxware

Thanks Bartezz for the detailed bug report. I will need more help from you to get a working solution to this problem.

The regex you wrote assumes that i18n is set up. Other sites will not have i18n set up, so that has to be accounted for as well.

Regarding the second problem, can you try a debug_backtrace in drupal_goto and see what is causing the redirection you are seeing?
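One way to cover both cases is to make the language prefix an optional group. The following is only a sketch under assumptions: the helper name example_extract_dym_keys() is hypothetical, the link is assumed to be root-relative (Drupal not installed in a subdirectory), and the prefix is assumed to consist of letters, digits, hyphens or underscores (for example pt-br, which [a-z] alone would not match). It uses preg_match() rather than preg_replace() so that a non-matching string yields NULL instead of the raw HTML.

```php
<?php
/**
 * Hypothetical helper: pull the search keys out of a "did you mean" link.
 *
 * Returns the keys as a string, or NULL when the markup is not recognised.
 */
function example_extract_dym_keys($suggestions) {
  // "(?:/[a-z0-9_-]+)?" is the optional language prefix, e.g. "/nl" or "/pt-br".
  // "#" is used as delimiter so the slashes need no escaping.
  $pattern = '#href="(?:/[a-z0-9_-]+)?/search/luceneapi_node/([^"]*)"#i';
  if (preg_match($pattern, $suggestions, $matches)) {
    return $matches[1];
  }
  // Not a suggestion link: let the caller decide what to do.
  return NULL;
}

// Usage: the same helper handles sites with and without a path prefix.
$plain = example_extract_dym_keys('<a href="/search/luceneapi_node/sealer" class="luceneapi-dym-suggestion">sealer</a>');
$dutch = example_extract_dym_keys('<a href="/nl/search/luceneapi_node/sealer" class="luceneapi-dym-suggestion">sealer</a>');
```

A caller could then skip the redirect whenever NULL comes back, rather than forwarding whatever preg_replace() returned.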

Bartezz

Hi,

Thanks for getting back to me! Can you tell me exactly where and how you want me to place the debug_backtrace()?

Cheers

zyxware

@Bartezz - If you print the output of debug_backtrace() inside the drupal_goto() function in common.inc, you can see where you are being redirected from.
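A minimal sketch of that debugging step, assuming drupal_goto() lives in includes/common.inc as described above. The helper name example_caller_chain() is hypothetical. Since drupal_goto() sends a redirect, anything printed to the page is likely to be lost, so the sketch writes to the PHP error log with error_log() instead. Remove the temporary line again after testing.

```php
<?php
/**
 * Hypothetical helper: format the current call stack, one frame per line.
 */
function example_caller_chain() {
  $lines = array();
  foreach (debug_backtrace() as $i => $frame) {
    // 'file' and 'line' can be missing for frames called from PHP internals.
    $function = isset($frame['function']) ? $frame['function'] : '(unknown)';
    $file = isset($frame['file']) ? $frame['file'] : '(internal)';
    $line = isset($frame['line']) ? $frame['line'] : 0;
    $lines[] = sprintf('#%d %s() called at %s:%d', $i, $function, $file, $line);
  }
  return implode("\n", $lines);
}

// Temporary line to paste as the first statement inside drupal_goto():
//   error_log("drupal_goto() called via:\n" . example_caller_chain());
```

The frames listed right after drupal_goto() in the log show which module function asked for the redirect.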

zyxware

Priority: Major » Normal
Status: Active » Postponed (maintainer needs more info)

Any more information on this? Please also check it with the latest version of the module.

Bartezz

Sorry, I have been really busy; I will comment back on this ASAP!

zyxware

Category: bug » support
Status: Postponed (maintainer needs more info) » Closed (fixed)

Assuming that this problem no longer exists, I am closing this ticket. If it still persists, please feel free to re-open this issue.