I have a situation where my CCK types have file attachments, and my users need to be able to search through those attachments. I understand that this is not currently supported, but is anything in the works? I know there is a SoC project that aims to do this, but my timeline is quite a bit faster than that. So, if there have been no other attempts, can you provide any pointers or ideas you've had about it? I'd be more than happy to develop and contribute that code.
Comments
Comment #1
janusman commented
I think there is a module that adds attachment information to the index. See http://interoperating.info/mark/search_attachments
If that module adds information to the normal Drupal search index during cron runs, then it will likely (?) make that information available to the Solr indexing process...
Comment #2
febbraro commented
I have used that search_attachments module before, and there are some very serious problems (which is why I had to write this). Specifically, if you try to index a 500-page PDF and insert it into the Drupal search index, it basically vomits all over the place :-) Not pretty, and it ensures that you will never get past that document to index the rest of your content.
Even though I had not gotten any response here, I went ahead and wrote the module to index attachments (both in $node->files and in any CCK filefield) and stuff them into Solr. It currently works great, but it could REALLY benefit from some refactoring of the apachesolr module and an additional hook or two so I can avoid some particular grossness.
Is anyone here interested? Please PM me or something. While I have momentum on this, I would love to get it integrated.
Comment #3
davidseth commented
I'm interested!
Comment #4
robertdouglass commented
febbraro, this is a duplicate now, right?