merging node search (standard search) with directory search

Project: Search Files
Version: 6.x-2.0-beta4
Component: User interface
Category: support request
Priority: normal
Assigned: Unassigned
Status: closed (duplicate)

Issue Summary

Hello,

I installed the Search Files module. Currently you can either search nodes with the standard Drupal search, or click on the "search in directories" tab and search for PDFs. I would like to merge the two into a single search that returns both relevant nodes and relevant PDFs in one result list, a bit like what Google search does on websites: the first result might be a page and the second a PDF, depending on relevance.

Is there a way to do this with Search Files, or is it something that would require development?

Thanks so much for your help and insights,

Comments

#1

I have the same question: how can the file hits be merged with the content hits into one list sorted by relevance?

#2

subscribe

#3

subscribe

#4

Status: active » closed (duplicate)

As pointed out in http://drupal.org/node/368195#comment-2072572, right now you have to alter core's search module:

<?php
function search_data($keys = NULL, $type = 'node') {

  if (isset($keys)) {
    if (module_hook($type, 'search')) {
      $results = module_invoke($type, 'search', 'search', $keys);

/* ### START MOD */
      // Include file_attachments results in node search
      if ($type == 'node') {
        $file_results = module_invoke('search_files_attachments', 'search', 'search', $keys);
        $results = array_merge($results, $file_results);
      }

      // Include file_directories results in node search
      if ($type == 'node') {
        $file_results = module_invoke('search_files_directories', 'search', 'search', $keys);
        $results = array_merge($results, $file_results);
      }
/* ### END MOD */

      if (isset($results) && is_array($results) && count($results)) {
?>

Best,
Paul

#5

The code in #4 will not work for any search that returns more than one page of results.

#6

@curtaindog: I'm not really experienced with the Drupal search yet, so I'd like to know why this won't work, or why in your opinion it shouldn't work the way it does at http://www.clearingstelle-eeg.de, where I implemented this code change. (The site is in German; try searching for the term "biomasse", which is one of the top terms. The search result page shows several pages with different content types as well as PDF documents.)

#7

@sin.star - First, I apologise that my comment was not more constructive; I just wanted to warn people that this is not an easy problem. In the thread you refer to, this comment and this comment suggest that there are problems with the result merge approach. Personally, I tried something similar and wound up getting duplicate elements in my search results. That being said, I did run your search and it appeared to work (though I don't know any German and even less about biomass, so I'm not the most reliable witness :)

From my understanding, the results are paginated before they are returned from the module_invoke call, so you get the effect of (page n of set x + page n of set y) rather than page n of (set x + set y). This in itself is probably not a big deal, although it would cause some result pages to be longer than others. More importantly, Drupal's pager widget pulls its data from the most recent search, so if the most recent search had only one match you'd only get a link to one page, regardless of the number of matches for the other search(es).
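To make the difference concrete, here is a small standalone sketch in plain PHP. It does not call Drupal at all: the two arrays, the page size of 2 and the helper names (page_of, merge_pages, page_merged) are made up for illustration, and the "pager" is reduced to a page count computed from one list.

```php
<?php
// Hypothetical stand-ins for the full match lists of two search modules.
$node_hits = array('node-1', 'node-2', 'node-3', 'node-4', 'node-5');
$file_hits = array('file-1', 'file-2');
$page_size = 2;

// Hypothetical helper: return page $page (0-based) of a list.
function page_of(array $items, $page, $page_size) {
  return array_slice($items, $page * $page_size, $page_size);
}

// Each list is paged on its own, then the two pages are concatenated:
// (page n of set x) + (page n of set y).
function merge_pages(array $x, array $y, $page, $page_size) {
  return array_merge(page_of($x, $page, $page_size), page_of($y, $page, $page_size));
}

// The lists are merged first and the combined list is paged:
// page n of (set x + set y).
function page_merged(array $x, array $y, $page, $page_size) {
  return page_of(array_merge($x, $y), $page, $page_size);
}

// Paging per list gives uneven pages: 4 items on page 0, 2 on page 1.
$per_list_page0 = merge_pages($node_hits, $file_hits, 0, $page_size);
$per_list_page1 = merge_pages($node_hits, $file_hits, 1, $page_size);

// Paging the merged list gives even pages of 2.
$merged_page0 = page_merged($node_hits, $file_hits, 0, $page_size);

// If the page count is taken from the last list searched (the file hits),
// it is smaller than what the node hits alone would need.
$pages_from_last_search = (int) ceil(count($file_hits) / $page_size);
$pages_needed_for_nodes = (int) ceil(count($node_hits) / $page_size);
```

With these made-up numbers the last search only accounts for one page while the node hits need three, so under the assumption described above the later node hits would have no pager link leading to them.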

Note that this is just what I've inferred from comments around the place, I haven't dived too deeply into the code, and I hope to be proven wrong.

CurtainDog

#8

Ok, here is a new hacked up version. Note that as each module is doing its own query we can't rely on the database for paging and thus we have to pull 'all' matches out, merge and repage them in the code. Ack! So to stop your site grinding to a halt I would limit the search to something like 100 matches per module and tell users to enter more search terms if the result they're looking for isn't returned.

So the first thing to do is to go into the do_search function and up the pager query to something more solid like 100 or 1000. We're paging in code so this number represents all the results we're going to get out of a particular module for a particular query, so it's got to be big enough to be useful but not so big that the search eats all your memory.

<?php
function search_data($keys = NULL, $type = 'node') {

  if (isset($keys)) {
    if (module_hook($type, 'search')) {
      // Massage the query, we want everything returned
      global $pager_page_array, $pager_total, $pager_total_items;
      $page_size = 10; // Default pager query returns 10 results per page
      $get_page = isset($_GET['page']) ? $_GET['page'] : '';
      $pager_page_array = explode(',', $get_page);
      $_GET['page'] = 0;

      // Invoke and merge search results
      $results = module_invoke($type, 'search', 'search', $keys);

      // Include file_attachments results in node search
      if ($type == 'node') {
        $file_results = module_invoke('search_files_attachments', 'search', 'search', $keys);
        // module_invoke() returns NULL when the hook is missing, so only merge arrays
        if (is_array($file_results)) {
          $results = array_merge($results, $file_results);
        }
      }

      // Include file_directories results in node search
      if ($type == 'node') {
        $file_results = module_invoke('search_files_directories', 'search', 'search', $keys);
        if (is_array($file_results)) {
          $results = array_merge($results, $file_results);
        }
      }

      // Massage the pager variables, they would've been overwritten by the last search
      $_GET['page'] = $get_page;
      $pager_total_items[0] = count($results);
      $pager_total[0] = ceil($pager_total_items[0] / $page_size);
      // Cast the requested page to an integer before clamping it to the valid range
      $pager_page_array[0] = max(0, min((int) $get_page, ((int) $pager_total[0]) - 1));
      $paged_results = array();
      $page_offset = $pager_page_array[0] * $page_size;
      for ($i = 0; $i < $page_size && $i + $page_offset < $pager_total_items[0]; ++$i) {
        $paged_results[] = $results[$i + $page_offset];
      }

      // Return to normal programming
      if (isset($paged_results) && is_array($paged_results) && count($paged_results)) {
        if (module_hook($type, 'search_page')) {
          return module_invoke($type, 'search_page', $paged_results);
        }
        else {
          return theme('search_results', $paged_results, $type);
        }
      }
    }
  }
}
?>

Actually, I like the results that come out of this better than the natural search files query as for my particular case there's always a lot of vanished files, which results in pages looking uneven. By paging at the end each page is nice and even.
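For anyone who wants to check the merge-then-page arithmetic without a Drupal install, here is a minimal standalone sketch of that one step. The function name merge_and_page and its return keys are made up for this example and are not part of core or Search Files; the result lists are plain number arrays standing in for search hits.

```php
<?php
// Hypothetical helper: merge several result lists, then cut out one page.
// Non-array entries (e.g. NULL from a module that is not enabled) are skipped.
function merge_and_page(array $result_sets, $page, $page_size = 10) {
  $merged = array();
  foreach ($result_sets as $set) {
    if (is_array($set)) {
      $merged = array_merge($merged, $set);
    }
  }

  $total_items = count($merged);
  $total_pages = (int) ceil($total_items / $page_size);
  // Clamp the requested page to the valid range (0 when there are no results).
  $page = max(0, min((int) $page, $total_pages - 1));

  return array(
    'items' => array_slice($merged, $page * $page_size, $page_size),
    'page' => $page,
    'total_pages' => $total_pages,
    'total_items' => $total_items,
  );
}

// Stand-in data: 12 "node" hits and 13 "file" hits, plus one missing module.
$node_hits = range(1, 12);
$file_hits = range(101, 113);

$last_page = merge_and_page(array($node_hits, $file_hits, NULL), 2);
$too_far = merge_and_page(array($node_hits, $file_hits, NULL), 99);
$empty = merge_and_page(array(), 0);
```

Here 25 merged hits at 10 per page give three pages, the last one holding five items, and an out-of-range page request is clamped to the last page. array_slice does the same job as the for loop in the code above.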

#9

Are there any plans to fold this sort of thing into D7?

#10

My question is: does it finally work with FileField from CCK?
And also: is it already merged with the standard search, so I can use it in my Views "search terms" exposed filter?

No one uses the standard search today,
and no one uses the Upload module to attach files to nodes.

There are FileField and Views exposed filters now, and it would be great to have it working with them.

Thanks.