Closed (fixed)
Project:
Search Lucene API
Version:
5.x-1.2
Component:
Code
Priority:
Critical
Category:
Bug report
Assigned:
Reporter:
Created:
22 Jul 2009 at 23:35 UTC
Updated:
18 Aug 2009 at 16:10 UTC
Some search results include items titled "Page not found." These titles link to search URLs.
For example, if I search for the keyword craven, two items in the result set are titled "Page not found" and each links to https://biblio.csusm.edu/archives/founding/search/luceneapi_node/craven
You can see this at: http://biblio.csusm.edu/archives/founding/search/luceneapi_node/craven
Thanks for your help....
Comments
Comment #1
cpliakas commented
Hi.
This means that there was an error loading the node. The first thing to try is to clear the search results cache because there may be a node ID in the index referencing a node that no longer exists in the system. You can do this via the Search Lucene Content admin page. Please let me know if this doesn't work and we can debug further. If possible, try to get the node IDs for the offending results and verify you can view the node.
Sorry for your troubles,
Chris
Comment #2
cpliakas commented
Ironically, this happened to me for the first time today. I got the same error as you did on a site that is under development. In my case the offending node ID had a corresponding entry in the {node} table; however, I got a "Page not found" when I visited the page explicitly through node/`nid`. I looked into the node_load() function and found that the SQL query does an INNER JOIN on the {users} table. The problem was that the node had a uid that no longer existed in the {users} table, so the INNER JOIN prevented the node from loading. I would check that on your end as well.
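To illustrate the INNER JOIN behavior described above, here is a minimal sketch using Python's built-in sqlite3 module. The two tables are heavily simplified stand-ins for {node} and {users}, not the real Drupal schema, and `load_node`, `find_orphaned_nids`, and `make_demo_db` are hypothetical helper names. The idea should carry over to a query against the real database.

```python
import sqlite3


def load_node(conn, nid):
    """Mimic a node load that INNER JOINs the author row (simplified)."""
    return conn.execute(
        "SELECT n.nid, n.title, u.name FROM node n "
        "INNER JOIN users u ON u.uid = n.uid WHERE n.nid = ?",
        (nid,),
    ).fetchone()


def find_orphaned_nids(conn):
    """Return nids whose uid has no matching row in the users table."""
    rows = conn.execute(
        "SELECT n.nid FROM node n "
        "LEFT JOIN users u ON u.uid = n.uid "
        "WHERE u.uid IS NULL ORDER BY n.nid"
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db():
    """Build an in-memory database with one healthy and one orphaned node."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE node (nid INTEGER PRIMARY KEY, uid INTEGER, title TEXT);
        INSERT INTO users VALUES (1, 'admin');
        INSERT INTO node VALUES (10, 1, 'Has author');
        INSERT INTO node VALUES (11, 99, 'Author deleted');
        """
    )
    return conn
```

With the demo data, `load_node(conn, 11)` returns nothing even though the row exists in the node table, and `find_orphaned_nids(conn)` lists it as a candidate to check.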
My code does not clean up these entries from the index when situations like this are encountered. Furthermore, the update errors out, which is not the correct behavior. The code should remove the documents corresponding to the nid from the search index and proceed to the next node. The result of the bug is that the entries are orphaned in the index but are still returned in search results. This affects all versions of Search Lucene API, and a fix will be reflected in the next minor release of the module.
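The intended behavior can be sketched roughly as follows. This is not the module's actual code: the index is modeled as a plain dict keyed by nid, and `update_index` and its `load_node` callback are hypothetical names used only for illustration.

```python
def update_index(nids, load_node, index):
    """Refresh index entries, dropping documents whose node fails to load.

    `index` is a dict mapping nid -> document, standing in for the real
    search index. `load_node` returns a dict for a loadable node or None.
    Returns the list of nids that were removed from the index.
    """
    removed = []
    for nid in nids:
        node = load_node(nid)
        if node is None:
            # Node could not be loaded: purge any stale document
            # and move on instead of aborting the whole update.
            index.pop(nid, None)
            removed.append(nid)
            continue
        index[nid] = {"title": node["title"]}
    return removed
```

The key point is that a failed load removes the stale document and continues with the next node rather than erroring out.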
Thanks for picking this up,
Chris
Comment #3
cpliakas commented
Fix committed to all branches of Search Lucene API: #241974 #241976 #241962. The index will have to be rebuilt to purge the orphaned nodes in the search index. The fix will be reflected in the 1.3 releases.
Comment #4
ianchan commented
Hi, sorry for not responding sooner. Thank you for issuing the fix so rapidly! I updated the module last week, right after the new version came out. However, I haven't been able to figure out how to completely rebuild the search cache. I tried re-indexing and even emptying the MySQL table, but I'm still getting the "Page not found" items in the results.
Thanks for your help!
Comment #5
cpliakas commented
No problem. The 2.0 API introduces a button to "wipe" the cache; however, this is unavailable in 1.0. The easiest way to accomplish a complete index rebuild for your version of Search Lucene API is to manually delete the files/luceneapi_node directory containing the index files and then re-index. As cron runs, the files/luceneapi_node directory will automatically be re-created and populated. Please let me know if the problem persists, and thanks for pointing out this issue.
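If you would rather script the deletion than remove the directory by hand, a minimal sketch might look like this. The `wipe_index_dir` helper is hypothetical, and it assumes you pass the path to your site's files directory, which varies between installations. Back up anything you need first, since the removal is permanent.

```python
import shutil
from pathlib import Path


def wipe_index_dir(files_dir, index_name="luceneapi_node"):
    """Delete the index directory under files_dir, if it exists.

    Returns True when a directory was removed and False when there was
    nothing to remove. The index is expected to be rebuilt on later
    cron runs, as described above.
    """
    target = Path(files_dir) / index_name
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

For example, `wipe_index_dir("/path/to/site/files")` would remove files/luceneapi_node under that path, where the path shown is only a placeholder.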