So I tried to debug a bit myself by adding print_r calls in a couple of spots. Anyway, it "looks" like this is the command line it is running...

/home1/stamfor3/bin/antiword -m MacRoman -a letter 'files/testdoc.doc' > 'files/media_mover/mm_antiword/1/testdoc.doc.pdf'

The location of antiword is correct.
The location it is putting the file seems fine (and it does create something), but the file is 0 length.
If I run the above command line from a shell in the Drupal base directory, it creates a file in the same place, but not 0 length (normal size).

I tried playing with the permissions of the antiword binary itself setting it to 777 temporarily in case the apache user had some permission issue, but same result.

So I'm not sure what to debug next. Media Mover consistently processes the Word doc and attaches it to the node, but the result is always 0 length.
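A zero-length file next to a working shell run often points at the environment: the shell creates the redirect target before the command starts, so a command that fails immediately still leaves an empty file, and its complaint goes to stderr, where nobody sees it. Below is a minimal sketch for reproducing that from a shell, assuming a POSIX sh and the standard env utility. The helper name run_bare is hypothetical, and a stripped environment only approximates what the web server process sees.

```shell
#!/bin/sh
# Hypothetical helper (not part of Media Mover or antiword).
# Runs a command with a nearly empty environment, writing stdout to
# OUT_FILE and stderr to ERR_FILE, and returns the command's exit status.
# Usage: run_bare OUT_FILE ERR_FILE command [args...]
run_bare() {
    out_file=$1
    err_file=$2
    shift 2
    # env -i drops inherited variables such as HOME and ANTIWORDHOME.
    # PATH is passed through so the command can still be located.
    env -i PATH="$PATH" "$@" >"$out_file" 2>"$err_file"
}
```

Running the command from this report through it (run_bare out.pdf err.log /home1/stamfor3/bin/antiword -m MacRoman -a letter files/testdoc.doc) and then reading err.log and the exit status should show whether the stripped environment is what breaks the conversion.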

When the problem is resolved I'll be happy to document it for you (readme, drupal docs, etc.)!

- Peter

Comments

arthurf’s picture

Status: Active » Postponed (maintainer needs more info)

I wonder if the MacRoman encoding is the issue. Can you run antiword from the command line and see if altering the encoding works?

winston’s picture

Hey arthur,

Figured out the problem after a bit more googling. MacRoman wasn't the problem. And yes, command line worked for me all along.

Anyway here was my clue...

Antiword looks for its mapping files in three directories, in the order given:
(1) The directory specified by $ANTIWORDHOME
(2) The directory specified by $HOME/.antiword
(3) Directory /usr/share/antiword

So I think what was happening is that $HOME resolved correctly when I ran from the command line, but not when run as the apache user (through Drupal). There is more than one way to fix this. One is to put the mapping and font files into /usr/share/antiword, but you may not have access to that on shared hosting. The second is to find where /usr/share/antiword appears in the antiword source and hard code the correct path for your server, but I'm guessing you don't want to put that in a readme!! The third is to add the line putenv("ANTIWORDHOME=/path/to/.antiword"); somewhere in the PHP code in mm_antiword.module, but of course the path will be different for every user.
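To see which of the three locations would actually be picked up for a given user, the lookup order quoted above can be mirrored in a few lines of shell. This is only a sketch: find_antiword_mapping_dir is a hypothetical helper, and it checks that each directory exists, not that it contains usable mapping files.

```shell
#!/bin/sh
# Hypothetical helper mirroring the documented lookup order:
#   1. $ANTIWORDHOME   2. $HOME/.antiword   3. /usr/share/antiword
# Prints the first candidate that exists as a directory; returns 1 if none do.
find_antiword_mapping_dir() {
    for candidate in "${ANTIWORDHOME:-}" "${HOME:+$HOME/.antiword}" /usr/share/antiword; do
        # Unset or empty variables produce an empty candidate, which is skipped.
        if [ -n "$candidate" ] && [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Run as the login user and again in the web server's environment, this should make the difference visible. The shell counterpart of the putenv line is to prefix the antiword command with ANTIWORDHOME=/path/to/.antiword.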

If you give me till the end of the weekend I can probably come up with a patch. The patch will need to provide the user of the module with a way to hard code their .antiword path (wherever they have the mapping and font files) and then do a putenv call in the appropriate spot.

Thanks,

Peter

arthurf’s picture

Status: Postponed (maintainer needs more info) » Active

Awesome - I'm also glad to help, but I'll take a patch any time somebody offers one :)

winston’s picture

Version: 6.x-1.0-beta2 » 6.x-1.x-dev
Status: Active » Needs review
Attached file (status: new, size: 6.54 KB)

Turned into a long weekend I guess :)

Anyway here is the patch. It fixes the above mentioned item. Also stumbled across two other trivial changes as I was doing it as follows...

There was a slight error in an existing formapi element (#descript instead of #description) so the description wasn't showing up.

Also, I added "-i 0" to the antiword command line as without it the second graphic in the test.doc provided with antiword (in the docs folder) wasn't showing up.

Went ahead and did it against dev - assuming that is best practice?

Thanks for an excellent module - I was looking for just this feature. It will be even more exciting if we can get a really lightweight Open Office implementation!

Peter

winston’s picture

Oh, almost forgot - updated the readme with more detail and instructions that may help less experienced users and those on shared hosting (obvious from the patch I presume!)

arthurf’s picture

Thanks for the very detailed patch! I'm wondering if one way to circumvent some of the issues around the path to antiword would be to optionally have the user download it to the module directory itself. Since the path to it should be relative to the root of Drupal, just doing drupal_get_path('module', 'mm_antiword') would be sufficient, or at least we could check there as a first line of defense. That way we avoid the apache user's $HOME vs. the user's $HOME. Also, the admin/settings page should probably do better checking to see that the path is valid; that might help clear up any issues.

winston’s picture

Attached file (status: new, size: 6.68 KB)

Hey Arthur,

Two things.

First, I had a slight bug in my patch. I was testing for the antiword mapping path dir even if the admin didn't set it. Technically I suppose it would work, but hardly the best way. Here is a new patch.

Second, I'm not fond of the idea of modules that want me to put other programs in the modules folder. Makes upgrading the module, especially for those who don't use CVS, more difficult. They can't simply delete the module folder and put a new one in. If we can come up with a standard for where addons go (fckeditor, flowplayer, antiword, etc.) so there is a common place to configure that then maybe (but that wouldn't be an issue for media mover). Also, I think the instructions to put antiword in the module path are not really an improvement for a less experienced user. You'd be telling them to install antiword, then move a folder on their server somewhere. If they can find the folder to move in the first place isn't it simpler and less error prone to just give them an admin setting to point to it?

Now, what we COULD do is add a bit of code to look for the mapping files more intelligently first. But even there I have concerns. For example, I suppose we could add code that iteratively goes up a few directories looking for a .antiword folder and, if one is found, uses it. However, I feel it is just cleaner to put that in the site admin's control and point it out in the readme. I know that if I had seen an antiword mapping files path as a configuration option I would have figured it out fairly quickly, and if I was confused I would first look for a readme.txt or install.txt. Either way, it never would have become a bug report/feature request.

Thanks,

Peter

arthurf’s picture

I get what you're saying about the upgrade path with binaries/files in the modules.

WYSIWYG is now using sites/all/libraries, sites/default/libraries to store JS editors... it might make sense to follow that pattern using sites/all/bin, sites/default/bin and then hunt on the system level.

That sound reasonable?

winston’s picture

Status: Needs review » Needs work

Hey Arthur,

Didn't forget about this. I have some qualms about this idea, but I do understand that perhaps you don't want to put something else in the admin ui to maintain.

I can live with it as it solves the problem and I would know what to do. However from the user perspective (assuming a user who doesn't have admin access to the machine such as shared hosting - other users probably wouldn't have this problem in the first place) you would end up with a procedure like this...

1. Find antiword binaries
2. Follow instructions to install
3. Figure out where the mapping files ended up (hopefully always $HOME/.antiword, perhaps using the env command to find ANTIWORDHOME)
4. Move that folder to a specific folder under the sites folder (possibly having to create it if it doesn't exist).

On the other hand with an admin setting to just point to the mapping files the workflow looks like this...

1. Find antiword binaries
2. Follow instructions to install
3. Figure out where the mapping files ended up...
4. Put a setting in the antiword admin screen

The only difference is #4, but I still feel that for someone who maybe was just barely able to execute steps 1 to 3 at a command line we're putting extra burden on them (albeit not a major one).

Also are you comfortable enough that the sites/all/libraries or sites/default/libraries is going to end up the standard? If it doesn't it might be a tedious change later on.

Anyway that said, let me know how you want to go. I'd still like to create the patch so if you guide me on your decision I'll execute it in the code and readme file.

winston’s picture

OK, so here is an "evil" thought...

We first look in the same folders where we know antiword will look ($ANTIWORDHOME, $HOME/.antiword, /usr/share/antiword).

If not in there (it won't be on a shared host install because $HOME will be the apache account's home folder) then...

We capture the directory where the module is installed in a variable.

Start from that directory and keep going up one directory level at a time. At each level up we look for a .antiword folder. If we get all the way up to the root directory we stop and raise an error/fail. However on a shared host we should get up to the users home folder and succeed! Once found we set our variable and proceed with the code I had.

No admin screen, no readme instructions (other than follow the antiword install). A few more function calls, but only the first time.
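The walk-up search described above can be sketched in shell to check the idea against a real directory layout before writing the PHP. The helper name find_antiword_upward is hypothetical, and the sketch assumes the starting directory is readable by whoever runs it.

```shell
#!/bin/sh
# Hypothetical helper: starting from directory $1, walk up one level at a
# time looking for a .antiword folder. Prints the first match and returns 0;
# returns 1 if the start directory is unusable or / is reached without a match.
find_antiword_upward() {
    # Resolve to an absolute physical path so dirname always makes progress.
    dir=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    while :; do
        # ${dir%/} avoids a doubled slash when dir is the root directory.
        if [ -d "${dir%/}/.antiword" ]; then
            printf '%s\n' "${dir%/}/.antiword"
            return 0
        fi
        if [ "$dir" = "/" ]; then
            return 1
        fi
        dir=$(dirname "$dir")
    done
}
```

Called with the module's directory, it should print the account's .antiword folder on a typical shared host where the site lives somewhere under the user's home folder. The result would then be the value handed to ANTIWORDHOME.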

Does that sound like it would work?