Hello
I have created a content type with a file field added through CCK. I created many nodes of this type and uploaded PDF files to them. I have installed the Apache Solr Search Integration module. I downloaded Solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/ and unzipped it to c:/solr, then copied schema.xml from the module into the c:/solr/example/conf/ directory. From a Command Prompt, I went to the c:/solr/example directory and ran the command java -jar start.jar. I then opened the URL http://localhost:8983/solr/admin/, which is working. Then I ran cron. When I try to search for text from the PDF files, nothing is found. Please guide me. Do I have to change something on the http://localhost/drupal/?q=admin/settings/apachesolr page or on http://localhost/drupal/?q=admin/settings/apachesolr/attachments ?
Currently
http://localhost/drupal/?q=admin/settings/apachesolr page has
Solr host name: localhost
Solr port: 8983
Solr path: /solr
and
http://localhost/drupal/?q=admin/settings/apachesolr/attachments page has
PDF Helper: /usr/local/bin/pdftotext "%file%" -
Text Helper: /bin/cat "%file%"
Comments
Comment #1
peterzoe commented
Could you test the same with a .pdf attachment using the Upload module? I have a feeling that CCK FileField files are not being picked up. However, classic attachments are being picked up, indexed, and searched (on my implementation at least).
Cheers, Peter.
Comment #2
febbraro commented
The problem is that you are specifying a Linux/Unix path for the PDF-to-text converter in a Windows environment. You need to specify a Windows command-line executable that can extract the text from a PDF document to stdout.
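A quick way to confirm this diagnosis is to check whether the program named in a helper setting actually exists on the machine running Drupal. The following is only a rough sketch in Python, not part of the module: the function names are made up for illustration, and it assumes the helper setting is a command line whose first token is the executable and which contains the "%file%" placeholder, as in the settings quoted above.

```python
import shlex
import shutil


def helper_executable(template: str) -> str:
    """Return the program part of a helper command template.

    posix=False keeps backslashes in Windows paths intact and keeps a
    quoted path (one that contains spaces) together as a single token.
    """
    parts = shlex.split(template, posix=False)
    # In non-POSIX mode the surrounding quotes stay on the token, so strip them.
    return parts[0].strip('"')


def helper_is_available(template: str) -> bool:
    """True if the helper's executable can be found on this machine."""
    return shutil.which(helper_executable(template)) is not None


if __name__ == "__main__":
    # The setting from the original post; on a Windows host this is
    # expected to be reported as missing.
    setting = '/usr/local/bin/pdftotext "%file%" -'
    print(helper_executable(setting), helper_is_available(setting))
```

If the check reports the helper as missing, the PDF Helper setting needs to point to a Windows build of a converter such as pdftotext (the install location depends on your system), keeping the "%file%" placeholder and the trailing "-" so the text goes to stdout.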
Comment #3
dipen chaudhary commented
#2 nice catch :)
Should this be closed then?
Comment #4
JacobSingh commented