Hi,
Now that I have figured out how to adjust the amount of data crawled by Nutch and stored in Solr, and how to adjust the length at which that data is presented, one last piece remains:
How to adjust the teaser length while using the highlight feature.
Example:
I search "justice" on the site and get these returns:
Cases - All terms
03/22/2010 06/07/2010 Samuel A. Alito, Jr. 8-1 1 2 3 4 5 6 7 8 9 … next › last » Cases Justices Advocates ...
http://www.oyez.org/cases
Supreme Court Tour
Supreme Court Tour | The Oyez Project Skip to Navigation Oyez Site Feedback On The Docket Appellate.net Justia SCOTUSblog Cases Justices Advocates Benefactors About Tour Home › Supreme Court Tour › Supreme Court Tour Printer-friendly version Cases Justices Advocates Benefactors About Tour Footer Links ...
http://www.oyez.org/tour
Notice that the first result gives me only 100 characters or so, while the second gives me about 200. This is the issue. It happens because the word "justice" appears only once in the first result and twice in the second, and the snippet is cut off by some logic in ApacheSolr.
How can I be sure? Maybe it's just all there is?
Simple enough to find out! Just go to your Solr admin instance on port 8983. Type the exact keyword into the basic query box and get the results, then view the page source. There you will see exactly what Solr has in its database for that query. For me, since I modified Nutch/Solr to keep the whole page, I have, well, the whole page. So I know the ApacheSolr logic is doing this.
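The same check can be made without the admin form by building the query URL by hand. This is only a sketch under assumptions not stated in the thread: Solr listening on localhost:8983 with the classic /solr/select handler, and the page text stored in a field named "body" (a hypothetical name; use whatever your schema defines). The hl.* parameters are the ones documented on the Solr HighlightingParameters wiki page.

```php
<?php
// ASSUMPTION: local Solr on port 8983 with the classic /solr/select handler.
$base = 'http://localhost:8983/solr/select';

$params = array(
  'q'           => 'justice', // the exact keyword searched on the site
  'hl'          => 'true',    // ask Solr itself for highlighted snippets
  'hl.fl'       => 'body',    // ASSUMPTION: field holding the page text
  'hl.fragsize' => 300,       // desired snippet size in characters
  'hl.snippets' => 1,         // one snippet per result
);

// http_build_query() URL-encodes the values and joins the pairs with "&".
$url = $base . '?' . http_build_query($params);
echo $url . "\n";
```

Opening the printed URL in a browser shows both the stored document and the snippet Solr produces on its own, which helps separate what Solr does from what the Drupal module does afterwards.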
What have I done so far?
I have looked all over the code and found a few places of interest.
apachesolr_search.module line 1464-1477
search.module line 1200-1245
What are my results?
Nothing. I can't seem to adjust the length of the highlighted portion of ApacheSolr.
What I want:
I want to make the length of the entire search snippet standard for all results, to, let's say, 300 characters, AND keep the highlighting, minus the character limit logic it currently uses.
Sound tough? Heck yeah it is. This would complete my tutorial http://drupal.org/node/968308 on adjusting teaser lengths using SolrSearch. Any input would be super duper.
Comments
Comment #1
maxmmize commented
Based on this:
http://wiki.apache.org/solr/HighlightingParameters
I have been adjusting my slop to
I set it high to see results. I cleaned out my Nutch and Solr installs and rebuilt them to vanilla. Still no change in the results.
Comment #2
maxmmize commented
I turned off highlighting in solrconfig.xml (both in my lib and in my modules folder) and re-ran everything. I still get highlighting, so this proves that Drupal is doing it. Now I just need to find out where and how.
BTW, shouldn't turning this off prevent our module from running highlighting?
Comment #3
maxmmize commented
Found the issue:
apachesolr_search.module is not being overridden by solrconfig.xml for some reason (permissions?). I changed the hl.fragsize value in the params from NULL to 400 and wham, done. Possible bug?
Comment #4
maxmmize commented
Comment #5
jbrauer commented
The variables seem to be ignored if they are empty. For example in settings.php setting:
'apachesolr_hl_pretag' => NULL,
'apachesolr_hl_posttag' => NULL,
still provides the words highlighted with tags. But putting something like '--' as the value causes it to render -- for the pre or post tag. Any value seems to work as long as it's not NULL or ''.
Comment #6
jpmckinney commented
We need a patch for an issue to be "Needs review".
You can set these variables in your settings.php (or with strongarm). We should probably expose them in the UI.
Add feature in HEAD first.
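For readers following along, a minimal sketch of the settings.php approach mentioned above. The two tag variables are the ones named in this thread. The snippet-length variable name is an assumption and should be checked against the variable_get() calls in your copy of apachesolr_search.module before relying on it. Per comment #5, values must be non-empty or they appear to be ignored.

```php
<?php
// settings.php: Drupal reads overrides for stored variables from $conf.

// ASSUMPTION: the variable that feeds hl.fragsize is named like this;
// verify the exact name in apachesolr_search.module.
$conf['apachesolr_hl_textsnippetlength'] = 300;

// Named in this thread. NULL or '' seems to be ignored (comment #5),
// so use real markup here.
$conf['apachesolr_hl_pretag'] = '<strong>';
$conf['apachesolr_hl_posttag'] = '</strong>';
```

Setting the values here avoids editing the module file directly, so the change survives module updates.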
Comment #7
pwolanin commented
I don't think we should add a UI for this to the base module.
The OP seems to be discussing some other issue -