Hello
I have created a content type with a file field through CCK, made many nodes of that type, and uploaded PDF files to them. I have installed the Apache Solr Search Integration module. I downloaded Solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/ and unzipped it to c:/solr, then copied schema.xml from the module into the c:/solr/example/conf/ directory. From the Command Prompt I went to the c:/solr/example directory and ran java -jar start.jar. I then opened http://localhost:8983/solr/admin/, and that URL works. After that I ran cron and tried to search for text from the PDF files, but the search does not work. Please guide me. Do I have to change something on the http://localhost/drupal/?q=admin/settings/apachesolr page or on http://localhost/drupal/?q=admin/settings/apachesolr/attachments?

Currently
http://localhost/drupal/?q=admin/settings/apachesolr page has
Solr host name: localhost
Solr port: 8983
Solr path: /solr

and
http://localhost/drupal/?q=admin/settings/apachesolr/attachments page has
PDF Helper: /usr/local/bin/pdftotext "%file%" -
Text Helper: /bin/cat "%file%"
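
For anyone debugging a similar setup, one way to confirm that the host name, port and path settings point at a running Solr is to join them into the admin URL and request it. This is only a rough sketch in Python; the function names solr_admin_url and solr_is_reachable are made up for illustration and are not part of the module.

```python
from urllib.error import URLError
from urllib.request import urlopen


def solr_admin_url(host, port, path):
    """Join the three settings into the Solr admin URL (hypothetical helper)."""
    return "http://{}:{}/{}/admin/".format(host, port, path.strip("/"))


def solr_is_reachable(url, timeout=5):
    """Return True if the URL answers with HTTP 200, False on a network error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    # Values taken from the settings listed above.
    url = solr_admin_url("localhost", 8983, "/solr")
    print(url, "reachable:", solr_is_reachable(url))
```

A reachable admin page only shows that Solr is running; it says nothing about whether the attachment helpers can extract text.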

Comments

peterzoe’s picture

Could you test the same with a .pdf attachment using the Upload module? I have a feeling that CCK FileField files are not being picked up. However, classic attachments are being picked up, indexed and searched (on my implementation at least).
cheers, Peter.

febbraro’s picture

Title: PDF Serach is not working » PDF Search is not working

The problem is that you are specifying a Linux/Unix path for a PDF-to-text converter in a Windows environment. You need to specify a Windows command-line executable that can extract the text from a PDF document to stdout.
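
A quick way to see whether a helper setting can work on a given machine is to pull the program name out of the command template and check that it exists there. The following is a minimal sketch in Python, not the module's own code. The function names are hypothetical, and the Windows path in the example is only an assumed install location for a pdftotext.exe build; substitute wherever your converter actually lives.

```python
import shutil


def helper_executable(template):
    """Return the program part of a helper command template."""
    template = template.strip()
    if template.startswith('"'):
        # Quoted program path, e.g. one containing spaces.
        return template[1:].split('"', 1)[0]
    return template.split(None, 1)[0]


def helper_is_available(template):
    """True if the helper program can be found and executed on this machine."""
    return shutil.which(helper_executable(template)) is not None


def build_command(template, file_path):
    """Fill the %file% placeholder the way the settings page describes."""
    return template.replace("%file%", file_path)


if __name__ == "__main__":
    templates = [
        '/usr/local/bin/pdftotext "%file%" -',  # value from the original post
        # Assumed Windows location, for illustration only:
        '"C:\\Program Files\\xpdf\\pdftotext.exe" "%file%" -',
    ]
    for template in templates:
        status = "found" if helper_is_available(template) else "NOT found"
        print(helper_executable(template), "->", status)
        print("  would run:", build_command(template, "sample.pdf"))
```

On the Windows machine described in the original post, the first template would be reported as not found, which matches the diagnosis above.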

dipen chaudhary’s picture

#2 nice catch :)

Should this be closed then?

JacobSingh’s picture

Status: Active » Closed (fixed)