This project is not covered by Drupal’s security advisory policy.

The Apache Solr File module bridges the File entity and Apache Solr modules, allowing you to index and search for files. It is the successor to the apachesolr_media module.

This module lets site administrators index files of most document types so they can be included in site-wide search results. This is useful for enterprise websites that need to manage a large number of files, such as videos, PDFs, Excel, Word, and PowerPoint documents, and images. Indexing fields within file entities, including title, description, and taxonomy fields, is planned but not yet supported.

The difference between this module and Apache Solr attachments:

  • apachesolr_file indexes file entities, and search results link to the file entity page (DRUPAL_HOME/file/fid).
  • apachesolr_attachments indexes file field content inside nodes, and search results link to the file path or to DRUPAL_HOME/node/nid.
  • apachesolr_file always targets the latest Solr and Apachesolr module versions, so new features will be added soon.

How to enable Solr to index file content

  1. The latest Solr release is always recommended.
  2. The Apache Solr module 7.x-1.x-dev is recommended because it has the latest solrconfig.xml.
  3. Copy the "contrib" and "dist" directories from the Solr package to SOLR_HOME on your server.
  4. Find and edit these lines in solrconfig.xml:
      <lib dir="../../contrib/extraction/lib" />
      <lib dir="../../contrib/clustering/lib/" />
      <lib dir="../../dist/" /> <!-- You may need to add this line yourself. -->
    

    Make sure all the paths point to the directories you set up in the previous step. If everything is correct, opening Solr_SERVER_URL/update/extract shows something like:

    <response>
    <lst name="responseHeader">
    <int name="status">400</int>
    <int name="QTime">0</int>
    </lst>
    <lst name="error">
    <str name="msg">missing content stream</str>
    <int name="code">400</int>
    </lst>
    </response>
    

    (Read more about the ExtractingRequestHandler at http://wiki.apache.org/solr/ExtractingRequestHandler)

  5. Install and enable File entity 7.x-2.x-dev and apachesolr_file.
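
The checks in step 4 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the module: the function names are hypothetical, and it assumes the relative <lib> paths are resolved against a base directory you pass in (typically the Solr core's instance directory; verify this for your Solr version). It lists the <lib> directories declared in solrconfig.xml, reports those that do not exist, and tests whether a response body from /update/extract matches the "missing content stream" reply shown above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def lib_dirs(solrconfig_xml):
    """Return the dir attribute of every <lib> element in solrconfig.xml text."""
    root = ET.fromstring(solrconfig_xml)
    return [lib.get("dir") for lib in root.iter("lib") if lib.get("dir")]


def missing_lib_dirs(solrconfig_xml, base_dir):
    """Return the <lib> dirs that are not existing directories under base_dir.

    Assumption: base_dir is the directory Solr resolves relative <lib>
    paths against (usually the core's instance directory).
    """
    base = Path(base_dir)
    return [d for d in lib_dirs(solrconfig_xml) if not (base / d).is_dir()]


def extract_handler_ready(response_xml):
    """True if the body is the 400 'missing content stream' reply from step 4.

    That reply means the extract handler loaded and is only complaining
    that no file was sent with the request.
    """
    root = ET.fromstring(response_xml)
    status = root.findtext("lst[@name='responseHeader']/int[@name='status']")
    msg = root.findtext("lst[@name='error']/str[@name='msg']")
    return status == "400" and msg == "missing content stream"
```

To use it, read solrconfig.xml into a string and pass it to missing_lib_dirs() together with the base directory; an empty list means every declared directory was found. Pass the body returned by Solr_SERVER_URL/update/extract to extract_handler_ready() to confirm the handler is loaded.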

This issue may answer some common deployment questions:
http://drupal.org/node/1555126
