For anyone who would like to index Word 2007 (.docx) files, here is how I have done it. It would be great if this could be included in the documentation on the front page as well as in the Autodetect Helpers section.
- Install docx2txt (see the Docx2txt SourceForge project)
- Head to admin/settings/search_files/helpers/add
- Set Name: Word 2007, Extension: docx, Helper path:
perl /usr/local/bin/docx2txt.pl %file% -
The /usr/local/bin/ prefix may not be required if docx2txt is installed correctly.
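Before saving the helper, it can be worth trying the command by hand. The following is a minimal sketch, not part of the module: it assumes that %file% is replaced with the path of the file being indexed, that the script lives at /usr/local/bin/docx2txt.pl as in the helper path above, and that sample.docx is a hypothetical test file in the current directory. The `expand_helper` function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: expand the %file% placeholder by hand and try the helper command.
# HELPER mirrors the helper path above; sample.docx is a hypothetical test file.
HELPER='perl /usr/local/bin/docx2txt.pl %file% -'

# Print the command line that results from substituting a file name for %file%.
# Assumes the file name contains no "|" or "&" characters (they are special to sed).
expand_helper() {
  printf '%s\n' "$HELPER" | sed "s|%file%|$1|"
}

expand_helper sample.docx

# Run the helper only if both the script and the sample file exist.
# The trailing "-" tells docx2txt.pl to write the text to standard output.
if [ -f /usr/local/bin/docx2txt.pl ] && [ -f sample.docx ]; then
  perl /usr/local/bin/docx2txt.pl sample.docx - | head -n 5
else
  echo "docx2txt.pl or sample.docx not found; nothing was run" >&2
fi
```

The first line printed is the expanded command, `perl /usr/local/bin/docx2txt.pl sample.docx -`. If the second step prints the first few lines of the document's text, the helper path should be usable as entered.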
And the code for Autodetect can be added to search_files_install_auto_helper_app_configuration()
// test for word docx2txt
$location = trim(shell_exec('which docx2txt.pl'));
$location = preg_replace("/^no .*$/", "", $location);
$perl_location = trim(shell_exec('which perl'));
$perl_location = preg_replace("/^no .*$/", "", $perl_location);
if ($location && $perl_location) {
  search_files_helper_db_add("Word 2007 files", "docx", $perl_location .' '. $location ." %file%");
  drupal_set_message(t('Helper app docx2txt has been detected and configured'));
}
| Comment | File | Size | Author |
|---|---|---|---|
| #4 | search_files_docx.patch | 1.04 KB | mstrelan |
Comments
Comment #1
mstrelan commented
Update: the autodetect code is missing a hyphen after %file%. The helper string passed to search_files_helper_db_add() should end with " %file% -".
Comment #2
bibo commented
Thanks for this, I added it to my configuration.
Setting to "needs review" since it... needs review :).
I don't have time to test it right away, but this would probably be a good addition to the module.
Comment #3
jrglasgow commented
Can you please write a patch for this?
Comment #4
mstrelan commented
Sure. I'm having problems browsing CVS from Eclipse at the moment; it keeps hanging when I expand the modules directory. Anyway, here is a patch made against my local filesystem.
Comment #5
mstrelan commented
Comment #6
jrglasgow commented
This has been committed and will appear in the next release of 6.x-2.x and 7.x.
Comment #8
babruix commented
Works great, thanks for this!
By the way, does anybody know of a solution for xlsx and pptx files?
I see there are two projects, http://sourceforge.net/projects/xlsx2txt/ and http://sourceforge.net/projects/pptx2txt/. Has anybody tried them?
I should probably test them myself.
Comment #9
ken hawkins commented
Thought others might appreciate more explicit steps to get docx2txt working; perhaps this belongs in the handbook?
I'm doing this for 7.x on Ubuntu 12.04.
In terminal:
Then in Drupal: