Hello,
I just installed this module, along with the file entity module. They installed fine, so I went to reindex to see if it grabbed files, and I got this error message:
SQLSTATE[42S02]: Base table or view not found: 1146 Table 'drupal_ripon.apachesolr_index_entities_file' doesn't exist
I didn't notice a readme or install file with the module, so I looked over the module page on drupal.org, and saw this step in the instructions:
Apply the solrconfig-solr3x.xml.patch in apachesolr_file module directory to solrconfig-solr3x.xml and rename it and put it in your SOLR_HOME/conf.
I'm not sure what that means... It sounds like you want me to apply a .patch file to the .xml file, but I'm not seeing any patch file included with this module. Is there just something I'm missing? Is this related to the database table not being created?
Thank you for any help you can provide.
Comment | File | Size | Author |
---|---|---|---|
#7 | solrconfig-solr3x.xml_.patch | 702 bytes | shenzhuxi |
Comments
Comment #1
shenzhuxi commented:
Sorry, I forgot to push to git.
I just fixed it.
Comment #2
georgedamonkey commented:
Excellent, thank you. Now, that takes a bit to roll over to drupal.org, right?
Comment #3
georgedamonkey commented:
I was just looking at this again, and with this error: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'drupal_ripon.apachesolr_index_entities_file' doesn't exist, I decided to look at the website's database. It took me a while to notice, but the table in the database for this module is named apachesolr_index_enities_file. There's a 't' missing. I renamed the table, and I no longer get that error.
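A one-letter difference like this is easy to miss by eye. As a minimal sketch (not part of the module), the following Python compares the table name from the error message against the tables that actually exist and reports near misses. The `existing_tables` list is a hypothetical stand-in for the output of `SHOW TABLES`, and `find_near_miss` is an illustrative helper name; only the two file-table names come from this thread.

```python
import difflib


def find_near_miss(expected, existing_tables, cutoff=0.95):
    """Return existing table names that closely resemble `expected`.

    An exact match means the table exists, so nothing is reported.
    """
    if expected in existing_tables:
        return []
    return difflib.get_close_matches(expected, existing_tables, n=3, cutoff=cutoff)


# Hypothetical table list standing in for the output of SHOW TABLES.
existing_tables = [
    "apachesolr_environment",
    "apachesolr_index_entities_node",
    "apachesolr_index_enities_file",  # the misspelled table from this thread
]

print(find_near_miss("apachesolr_index_entities_file", existing_tables))
```

The high cutoff is a deliberate choice so that sibling tables which differ only in their suffix are not reported as typos.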
Comment #4
shenzhuxi commented:
There was a typo in the table name in the .install file, so after you update the module, you need to uninstall and re-enable it. Sorry about that.
Comment #5
georgedamonkey commented:
Excellent. I did what you said, and it installed just fine. Thank you!
Comment #6
andreicio commented:
About the .patch file: I can see commits dated May 4, and the xml patch is still missing. Could you please post the patch here, attached to a comment? Or at least the code that needs to be added? I suppose it declares the /update/extract requestHandler, but I know way too little Solr to try it myself.
Thank you.
Comment #7
shenzhuxi commented:
I just added the patch.
Sorry, I forgot to commit it.
Comment #8
georgedamonkey commented:
Thank you.
I just applied the patch. After looking over the content of that patch file, it looks like that makes it so the text within a document is then indexed. Am I correct with this? If that's the case, should I re-index the site for it to take effect?
Comment #9
shenzhuxi commented:
Yes, you need to re-index.
Comment #10
georgedamonkey commented:
Hello,
After applying the patch, I now get this error as files get indexed:
Notice: unserialize(): Error at offset 0 of 7253 bytes in apachesolr_file_extract() (line 88 of /var/www/clients/client2/web2/web/sites/all/modules/apachesolr_file/apachesolr_file.module).
It gets repeated over and over.
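An unserialize() error at offset 0 means the very first byte of the string is not valid PHP-serialized data, which suggests Solr returned something other than the serialized response the module expects (an HTML error page, for instance). A rough way to check a captured response body is sketched below in Python; the function name and both sample bodies are illustrative assumptions, not output from a real server.

```python
import re

# PHP serialize() output starts with a type tag: a: (array), s: (string),
# i: (int), d: (float), b: (bool), O: (object), C: (custom object) or N; (null).
_PHP_SERIALIZED_START = re.compile(rb"^(?:[asidbOC]:|N;)")


def looks_php_serialized(body: bytes) -> bool:
    """Heuristic: does `body` begin like PHP serialize() output?"""
    return bool(_PHP_SERIALIZED_START.match(body))


# Illustrative bodies, not captured from a real server.
print(looks_php_serialized(b'a:1:{s:4:"text";s:5:"hello";}'))
print(looks_php_serialized(b"<html><body>HTTP ERROR 404</body></html>"))
```

This only inspects the leading type tag, so it can rule out obviously wrong responses but does not prove the whole body would unserialize cleanly.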
Comment #11
shenzhuxi commented:
Go to http://YOUR_SOLR_DEPLOY/update/extract/?extractOnly=true&wt=phps&extractFormat=text&stream.file=YOUR_FILE_ON_SOLR_SERVER
Check whether your file is parsed correctly.
http://wiki.apache.org/solr/ExtractingRequestHandler may be useful.
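To avoid typing that test URL by hand, it can be assembled with the query parameters properly escaped. This is a small Python sketch; the host `http://localhost:8983/solr` and the path `/tmp/sample.pdf` are placeholders standing in for YOUR_SOLR_DEPLOY and YOUR_FILE_ON_SOLR_SERVER, and `build_extract_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base: str, file_on_solr_server: str) -> str:
    """Build the extractOnly test URL for a file readable by the Solr server."""
    params = {
        "extractOnly": "true",
        "wt": "phps",
        "extractFormat": "text",
        "stream.file": file_on_solr_server,
    }
    # urlencode percent-escapes the file path so slashes and spaces are safe.
    return solr_base.rstrip("/") + "/update/extract/?" + urlencode(params)


# Placeholder host and path; substitute your own deployment and file.
print(build_extract_url("http://localhost:8983/solr", "/tmp/sample.pdf"))
```

Opening the printed URL in a browser (or fetching it with any HTTP client) shows whether Solr returns extracted text or an error page.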
Comment #12
georgedamonkey commented:
Well, it turns out that I made a mistake with one of your installation steps. I had the contrib and dist directories in my root apache-solr directory. Once I moved them to apache-solr/example/solr/ and re-indexed everything, it seems to be working great.
Now that I have that working, it has led to another question. Search results come up perfectly for me when I'm logged in as the administrative user. But, anonymous users get zero results when searching files.
Looking at permissions, I have all users set to view all files under File entity. I also have all users set to use search and advanced search.
Is there another area for apachesolr_file to allow anonymous users to search files?
Thanks again for all your help.
Comment #13
shenzhuxi commented:
After enabling "Use search" for the roles, it works fine. I just tested it.
Comment #14
georgedamonkey commented:
Odd... I made sure all users have 'use search', tried rebuilding permissions afterwards, and anonymous users still can't search files. Searching the rest of the site works fine; it's just files they can't search.
Comment #15
georgedamonkey commented:
So, I did some further testing. Anonymous users can view files, such as here:
http://beta.riponlibrary.org/file/957
So, it seems to just be that anonymous users cannot search for those files. Any ideas what I may have done wrong? Are there permission settings specific to apachesolr_file that I'm just not seeing?
Comment #16
georgedamonkey commented:
Well, I figured out the source of the problem. If I disable and uninstall the Apache Solr Access module, anonymous users can then search the site's files. I'm not sure what other ramifications I'll run into with that portion of the Solr module disabled, but the issue seems to stem from an incompatibility with that particular module.
Comment #17
shenzhuxi commented:
Maybe the priority of the modules changed after you reinstalled.
Comment #18
drzraf commented:
There is another potential error that isn't caught: if the extract handler has not been set up (e.g., the patch was not applied), the 404 returned by the "extractOnly" request is not handled, so the content is not indexed.
This should be handled by the module.
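The kind of guard being asked for can be illustrated outside the module. The following Python sketch shows the idea only and is not the module's code: the function name is hypothetical, and it assumes the caller already has the HTTP status and body of the extract request.

```python
from typing import Optional


def extraction_body_or_none(status: int, body: bytes) -> Optional[bytes]:
    """Return the response body only when the extract request succeeded.

    A non-200 status (for example 404 when /update/extract is not
    configured) or an empty body yields None, so the caller can skip
    the file and log the failure instead of trying to parse the body.
    """
    if status != 200 or not body:
        return None
    return body


# Illustrative calls with made-up responses.
print(extraction_body_or_none(404, b"<html>Not Found</html>"))
print(extraction_body_or_none(200, b'a:1:{s:4:"text";s:5:"hello";}'))
```

Returning None rather than raising keeps a single bad file from aborting a whole indexing run, which is one reasonable design choice among several.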
Comment #19
drzraf commented:
apachesolr_file_extract should return FALSE if the filesize() is greater than the multipartUploadLimitInKB; otherwise it may exceed the Apache memory limit while there's no chance to get the file extracted.
Comment #20
drzraf commented:
Could you elaborate on why indexing has to be done in 2 steps, using extractOnly first? Extraction applies to local files (cf. file_get_contents()), so relying only on the ExtractingRequestHandler would allow the use of exec(curl), which would be far less memory hungry.
Comment #21
shenzhuxi commented:
http://drupal.org/project/apachesolr_media 7.x-2.x uses the one-step approach, which required more modifications to the apachesolr module. It's more difficult for users to deploy.
The first release of the apachesolr_file module keeps the code minimal by reusing the apachesolr API, so it can't use ExtractingRequestHandler in one step.
Comment #22
drzraf commented:
That's quite disappointing (even if you did great work on this):
* apachesolr_media uses (used?) apachesolr_get_solr->addFile(), which AFAICT does not exist (anymore?)
* if it's not possible, then the current apachesolr API design is wrong
* the current way to index content is clearly suboptimal for a "successor" module.
Comment #24
cdcooper commented:
I see the .patch file and have run a command to apply the changes, but the response is 'no changes'.
The command I ran was git apply . I have both the patch and the original file in the same directory.