attachments module improvements

miiimooo - July 3, 2009 - 14:39
Project: Search Files
Version: 6.x-2.0-beta1
Component: Code
Category: task
Priority: minor
Assigned: thl
Status: postponed
Description

Hi

So far I have used parts of the old search_attachments module to extend the normal search so that it also covers attachments. This module is looking very good, though, and I'm thinking of switching over to it. From looking at the code, I found two things that might be worth adding:

  1. caching of content:
    <?php
    $cache_text = cache_get("file_text:" . $fid);
    if ($cache_text && is_object($cache_text) && $cache_text->data) {
      $file_text .= " " . $cache_text->data;
    ...
    cache_set("file_text:" . $file->fid, $new_text);
    ?>
  2. the file system path doesn't look correct - shouldn't it be:
    <?php
    $full_path = $_SERVER['DOCUMENT_ROOT'] . '/' . file_directory_path() . '/' . $file->filepath;
    ?>

    My apologies if this has already been dealt with.

    All the best

#1

thl - August 22, 2009 - 22:31
Assigned to: Anonymous » thl
Status: active » postponed (maintainer needs more info)

1.) caching of content
What exactly do you believe is worth caching? If the output from all helper applications is cached in the database, we will likely end up doubling the disk space requirements.

2.) file system path
See #556790: Filepath/URI handling in Search Files Attachments
Fixed in CVS after Beta3.

#2

miiimooo - August 25, 2009 - 12:18

Re 1) The cache holds only the text version. In many cases a PDF or office document can be several MB while its extracted text is only a couple of KB. I have one here that uses around 300MB of disk space for documents, and the cache table is 1.7MB.
The main advantage of caching is that, when displaying search results, you don't have to run the helper for each result to get an excerpt; doing so can put considerable load on the CPU and make the search very slow.
As a compromise, maybe you could add a configuration switch. Then users could choose depending on their hosting environment.
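
To illustrate the idea, here is a minimal sketch of text caching with an on/off switch. It is not the module's code: the function name `file_text_cached`, the `$cache_enabled` flag, and the `$extract` callback are all hypothetical, and a plain array stands in for Drupal's `cache_get()`/`cache_set()` so that the snippet runs on its own.

```php
<?php
// Hypothetical sketch: none of these names come from the Search Files module.
// $cache is an in-memory stand-in for Drupal's cache_get()/cache_set().

/**
 * Returns the extracted text for a file, running the helper only on a cache miss.
 *
 * @param array $cache          Cache storage, passed by reference.
 * @param int   $fid            File ID used to build the cache key.
 * @param mixed $extract        Callback standing in for the expensive helper application.
 * @param bool  $cache_enabled  The proposed configuration switch (assumed name).
 */
function file_text_cached(array &$cache, $fid, $extract, $cache_enabled = true) {
  $key = 'file_text:' . $fid;
  if ($cache_enabled && isset($cache[$key])) {
    // Cache hit: skip the helper entirely.
    return $cache[$key];
  }
  $text = call_user_func($extract, $fid);
  if ($cache_enabled) {
    $cache[$key] = $text;
  }
  return $text;
}

// Usage: count how often the helper actually runs.
$cache = array();
$helper_runs = 0;
$extract = function ($fid) use (&$helper_runs) {
  $helper_runs++;
  return "text of file $fid";
};

$first = file_text_cached($cache, 7, $extract);   // cache miss, helper runs
$second = file_text_cached($cache, 7, $extract);  // cache hit, helper is skipped
$uncached = file_text_cached($cache, 7, $extract, false);  // switch off, helper runs again
```

With the switch on, only the first lookup pays for the helper; with it off, nothing is read from or written to the cache, so no extra storage is used.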

#3

thl - August 28, 2009 - 18:44
Priority:normal» minor
Status:postponed (maintainer needs more info)» postponed

Deferring performance improvements until after the 6.x-2.0 release.

 
 
