Hi, I'm using Drupal 7.14 on a Linux server.
I was able to set up Apache Solr and the Drupal module for it with no path problems. However, Tika has not been so simple. I have tried installing tika-app-1.2.jar outside my web directory, inside sites/all/library, and in the apachesolr_attachments/tika directory, all to no avail. When I test the settings I keep getting the error message:
Text can not be succesfully extracted. Please check your settings
and in /admin/reports/status:
Error
Apache Solr Attachments Java executable not found
Could not execute a java command. You may need to set the path of the correct java executable as the variable 'apachesolr_attachments_java' in settings.php.
I've tried giving the absolute path as:
/var/home/username/domain.com/www/sites/all/modules/apachesolr_attachments/tika
and tika jar file as tika-app-1.2.jar and tika-app-1.1.jar (I downloaded them both and installed them both in the same /tika directory).
I've chosen:
Extract using
Tika (local java application)
I AM able to make Tika run if I simply access the library via the command line. So:
$ java -jar tika-app-1.2.jar -t ../tests/test-tika.pdf
correctly returns:
Testing Apache Solr Attachments text extraction
I'm running out of options to test. I read an older issue about making sure settings.php knows where the java executable is, but typing 'java' on my server brings up the service.
$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
Please let me know if there is something else I could try.
Thank you.
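Since java runs fine from a login shell, one thing worth ruling out is that the web server process resolves it differently. A minimal sketch for finding the absolute path to try as 'apachesolr_attachments_java'. The `resolve_cmd` helper is a hypothetical name, and the `www-data` account in the commented line is an assumption (the Debian/Ubuntu default) that will differ on other systems:

```shell
#!/bin/sh
# Hypothetical helper: print the absolute path a command resolves to
# through PATH, or return non-zero if it is not found.
resolve_cmd() {
    command -v "$1" 2>/dev/null
}

# Path of java as seen by the current shell.
resolve_cmd java || echo "java not found in PATH=$PATH"

# The same lookup run as the web server account (assumed name: www-data).
# Uncomment and adjust the account name for your system:
# sudo -u www-data sh -c 'command -v java; echo "PATH=$PATH"'
```

If the two lookups disagree, or the second prints nothing, the absolute path from the first is the value to configure.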
Comments
Comment #1
dawnbuie commented
Also note I've read the readme.txt and have seen this last line
I did apply the solrconfig.tika.patch to the solrconfig.xml file in my working solr directory. The patch contents are:
I wonder if I'm missing any other files where I'm supposed to add the following path to my tika app?
/var/home/username/domain.com/www/sites/all/modules/apachesolr_attachments/tika/tika-app-1.2.jar
Thanks again. I'd love to be able to try this excellent module out.
Comment #2
scott.whittaker commented
I have the exact same issue as dawnbuie when attempting to get Drupal to execute Tika on OS X. Apachesolr works and is indexing content. Tika works on the command line. I keep getting the above message when hitting the "Test your tika extraction" button, and I can't get rid of the "java executable not found" message in the Status report. I have tried setting a variable $apachesolr_attachments_java in settings.php, but nothing I've tried removes that message. I have tried setting it to /usr/bin/java, the location of JAVA_HOME, and the location of tika.jar.
I'm out of ideas.
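For what it's worth, Drupal 7 does not pick up bare PHP variables from settings.php; variable overrides there go in the `$conf` array. A small sketch that prints the line to paste, assuming the module reads 'apachesolr_attachments_java' as an ordinary Drupal variable (as the status message suggests) and that /usr/bin/java is where java lives on the machine. `conf_line` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: print a settings.php override line for a
# Drupal 7 variable. Arguments: variable name, value.
conf_line() {
    printf "\$conf['%s'] = '%s';\n" "$1" "$2"
}

# /usr/bin/java is an example value; substitute the output of
# `command -v java` on your own server.
conf_line apachesolr_attachments_java /usr/bin/java
```

The value should point at the java binary itself, which is what the status message asks for, rather than at JAVA_HOME or the Tika jar.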
Comment #3
moehac commented
I'm having the same issue. Thanks for your help and consideration.
Comment #4
nick_vh
Can you please try with Tika 1.0? I'm not sure whether Tika 1.2 made some major changes.
Comment #5
nick_vh
Comment #6
jfhovinne commented
For my part, text extraction works. Here is my configuration:
apachesolr_attachments settings (nothing in settings.php):
I've manually patched solrconfig.xml and restarted Tomcat.
HTH
Jean-François
Comment #7
debzani commented
In my case, I have to put the Tika jar in a subfolder of the Apache Solr server installation, and then it works. The full path is as follows:
~\apache-solr-3.6.0\contrib\extraction\lib\
Comment #8
eminencehealthcare commented
I am experiencing this exact same issue.
Comment #9
_randy commented
I had this issue as well. I used the tika patch for solrconfig.xml and found myself in exactly the same situation.
I removed the tika handler from the solrconfig.xml file and manually added the configuration to the xml file with the rest of the Request Handlers. It seems that the tika Request Handler setup injected itself in the middle of the <query> XML tags.
Comment #10
paultrotter50 commented
I am also getting the error message "Text can not be succesfully extracted. Please check your settings" when I press "test your tika extraction". I have tried with tika-app-1.0.jar and tika-app-1.2.jar. I am using Tomcat 5.5.36 with Solr 3.6.2.
admin/config/search/apachesolr/settings shows my localhost server in green, so that appears to be working.
On /admin/config/search/apachesolr I have noticed that the 'value' of 'Schema' is 'drupal-4.1-solr-3.x' which seems strange as I'm using Drupal 7.
I have used the tika patch for solrconfig.xml.
I would really appreciate any suggestions as to what I might be doing wrong, or what I should try next.
Comment #11
Panther256 commented
I had to add the following line to my settings.php (at the bottom) to get Tika to work:
***** Just the line, I didn't need the PHP tags as shown above
Of course you will need to modify the path to your java.exe for your configuration.
-- Gene
Comment #12
jdu commented
I was having this same problem. It turns out that, on OS X, I needed to redirect the output of the shell_exec() call.
In apachesolr_attachments.index.inc, somewhere around line 135, do this: return shell_exec($cmd.' 2>&1');
This made it work with my MAMP setup, and I have since moved the same code to a LAMP environment with no issues.
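What the redirect changes: PHP's shell_exec() returns only what the command writes to standard output, so anything Java prints to standard error is lost unless it is merged in with 2>&1. The same behavior can be seen with shell command substitution, which also captures only stdout. `emit` below is a made-up stand-in for the java command, and its messages are invented for the demo:

```shell
#!/bin/sh
# Stand-in for a command that writes to both streams.
emit() {
    echo "extracted text"
    echo "warning from the JVM" >&2
}

# Capture stdout only (stderr is discarded here to keep the demo quiet).
out_only="$(emit 2>/dev/null)"

# With 2>&1, stderr is merged into the captured output.
both="$(emit 2>&1)"

printf 'stdout only: %s\n' "$out_only"
printf 'with 2>&1:\n%s\n' "$both"
```

Even where the redirect does not fix extraction by itself, it makes the underlying Java error visible in whatever shell_exec() returns.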
Comment #13
unqunq
EDIT: I got it to work on the AWS server by placing the app file just outside the site root folder. I thought I had it configured the same on my local machine but it looks like something was wrong locally. Now it passes the test and indexes all attachments.
I could not get it to work either.
I run a local Drupal 7 instance on my Mac and I configured it to use local Tika. The path is correct, and if I run
java -jar tika-app-1.4.jar
in a terminal I get the Tika CLI:
Drupal still complains that it cannot extract:
My java version:
Comment #14
bart atlas commented
Panther256's fix in comment #11 worked for me on WAMP. Much obliged!
Comment #15
nibo commented
I had the same problem on my system (CentOS 6.3).
To find out what the problem was, I did a dpm() of what shell_exec() in the apachesolr_attachments_extract_using_tika function was returning. The result was
But the problem was not the reserving of the memory space, but the SELinux module. Disabling it on my dev environment did the trick.
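A quick way to check whether SELinux is a factor before disabling anything. This sketch assumes the standard getenforce utility from the SELinux userland tools is what is installed; `selinux_mode` is a hypothetical wrapper so the check also behaves on machines without SELinux:

```shell
#!/bin/sh
# Hypothetical wrapper: print the SELinux mode (Enforcing, Permissive
# or Disabled), or "unavailable" when getenforce is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unavailable"
    fi
}

echo "SELinux mode: $(selinux_mode)"

# On a dev box, enforcement can be relaxed until the next reboot to see
# whether the Tika test starts passing (requires root):
# sudo setenforce 0
```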
Comment #16
revathi.b commented
I had to give permission to the jar file.
sudo chmod -R 775 tika-app-1.12.jar
This command resolved my problem.
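To check for this case before changing modes: Java needs read access to the jar, and the account that matters is the one the web server runs as, not your login. A sketch with a hypothetical `jar_readable` helper; the jar name is taken from the command above, and the `www-data` account in the commented line is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the path is a regular file
# that the current user can read.
jar_readable() {
    [ -f "$1" ] && [ -r "$1" ]
}

jar="tika-app-1.12.jar"
if jar_readable "$jar"; then
    echo "$jar is readable by $(id -un)"
else
    echo "$jar is missing or not readable by $(id -un)"
fi

# The same check as the web server account (assumed name: www-data):
# sudo -u www-data test -r "$jar" && echo "readable by www-data"
```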