For anyone out there who would like to index Word 2007 files, here is how I have done it. It would be great if this could be included in the documentation on the front page as well as in the Autodetect Helpers section.

  1. Install docx2txt (Docx2txt SourceForge Project)
  2. Head to admin/settings/search_files/helpers/add
  3. Name: Word 2007, Extension: docx, Helper path: perl /usr/local/bin/docx2txt.pl %file% -

The /usr/local/bin/ prefix may not be required if docx2txt.pl is installed correctly.

The code for Autodetect can be added to search_files_install_auto_helper_app_configuration():

    // test for word docx2txt
    $location = trim(shell_exec('which docx2txt.pl'));
    $location = preg_replace("/^no .*$/", "", $location);
    $perl_location = trim(shell_exec('which perl'));
    $perl_location = preg_replace("/^no .*$/", "", $perl_location);
    if ($location && $perl_location) {
      search_files_helper_db_add("Word 2007 files", "docx", $perl_location .' '. $location ." %file%");
      drupal_set_message(t('Helper app docx2txt has been detected and configured'));
    }

Comment  File                     Size     Author
#4       search_files_docx.patch  1.04 KB  mstrelan

Comments

mstrelan commented:

Update: the autodetect code is missing a hyphen after %file%. New code below.

<?php
    // test for word docx2txt
    $location = trim(shell_exec('which docx2txt.pl'));
    $location = preg_replace("/^no .*$/", "", $location);
    $perl_location = trim(shell_exec('which perl'));
    $perl_location = preg_replace("/^no .*$/", "", $perl_location);
    if ($location && $perl_location) {
      search_files_helper_db_add("Word 2007 files", "docx", $perl_location .' '. $location ." %file% -");
      drupal_set_message(t('Helper app docx2txt has been detected and configured'));
    }
?>
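
To check from a terminal what the autodetect code above would find, here is a minimal shell sketch. It assumes a POSIX shell; the function name `docx_helper_path` is made up for illustration and is not part of the module. It uses `command -v` instead of `which`, so there is no "no ... in PATH" message to strip out.

```shell
#!/bin/sh
# Hypothetical helper (not part of search_files): print the helper path
# string the autodetect code would store, or return non-zero if either
# perl or docx2txt.pl cannot be found on PATH.
docx_helper_path() {
  script=$(command -v docx2txt.pl) || return 1
  perl_bin=$(command -v perl) || return 1
  # The trailing "-" tells docx2txt.pl to write the text to stdout.
  printf '%s %s %%file%% -\n' "$perl_bin" "$script"
}

docx_helper_path || echo "perl or docx2txt.pl not found on PATH" >&2
```

If nothing is printed, docx2txt.pl is probably not executable or not in a directory on the PATH, and the autodetect code would skip it for the same reason.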
bibo commented:

Status: Active » Needs review

Thanks for this, I added it to my configuration.

Setting to needs review since it... needs review :).
I don't have time to test it right away, but this would probably be a good addition to the module.

jrglasgow commented:

Status: Needs review » Needs work

Can you please write a patch for this?

mstrelan commented:

Status  File                     Size
new     search_files_docx.patch  1.04 KB

Sure. I'm having problems browsing CVS from Eclipse at the moment; it keeps hanging when I expand the modules directory. Anyway, here is a patch made from my local filesystem.

mstrelan commented:

Status: Needs work » Needs review

jrglasgow commented:

Status: Needs review » Fixed

This has been committed and will appear in the next release of 6.x-2.x and 7.x.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

babruix commented:

Works great, thanks for this!
By the way, does anybody know of a solution for xlsx2txt and pptx2txt?
I see the projects http://sourceforge.net/projects/xlsx2txt/ and http://sourceforge.net/projects/pptx2txt/. Has anybody tried them?
I should probably test them myself.

ken hawkins commented:

Thought others might appreciate more explicit steps to get docx2txt working. Perhaps this belongs in the handbook?

I'm doing this for 7.x on Ubuntu 12.04.

In terminal:

cd /usr/local/bin/
wget http://docx2txt.cvs.sourceforge.net/viewvc/docx2txt/docx2txt/docx2txt.pl

Then in Drupal:

/admin/config/search/search_files/helpers/add
Helper name: Word 2007
Extension: docx
Helper path: perl /usr/local/bin/docx2txt.pl %file% -
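
Before relying on the helper, it may be worth running the same command by hand. The sketch below assumes the module simply swaps %file% for the path of the file being indexed; that is my reading of the helper path field, not something checked against the module source. The function name `try_helper` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: substitute a real file path for %file% in a helper
# path string and run the result, to see what text the helper would emit.
try_helper() {
  template=$1
  file=$2
  case $template in
    *%file%*) ;;
    *) echo "helper path has no %file% placeholder" >&2; return 2 ;;
  esac
  # Split the template around the first %file% placeholder.
  before=${template%%"%file%"*}
  after=${template#*"%file%"}
  # Pass the path as "$1" so spaces in file names survive intact.
  sh -c "$before\"\$1\"$after" sh "$file"
}

# Example (the sample path is a placeholder, use any .docx you have):
# try_helper 'perl /usr/local/bin/docx2txt.pl %file% -' /path/to/sample.docx
```

If the document's text shows up in the terminal, the same helper path should work in the form above.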