Hi,
Is it possible to trigger an immediate reindex of a node without waiting for Cron to run?
I have a site where I'm storing node rating information within Solr, as it's being used as one of the sorting options.
When a node rating changes I'm marking the node for reindexing via apachesolr_mark_node.
The issue I have is that the node rating information shown in the search results can be out of date, as it has to wait for cron to run (on a 1 hour interval) to update the node's Solr record.
I don't really want to trigger a full cron run every time a node is rated. Is there something I can call to trigger Solr to immediately index anything marked for indexing?
Cheers
Comments
Comment #1
pwolanin commented
Why do you run cron so infrequently? I'd run it e.g. every 5 minutes.
Comment #2
sepph commented
I've always assumed that running cron too frequently would be a bad idea, though I guess that depends on how much gets done on each cron run.
I'll try setting cron to run more frequently and possibly look into using Elysia Cron to control which tasks run on each run.
Comment #3
nick_vh
I assume this one is fixed then?
Comment #5
cpliakas commented
This is a perfect use case for #1586320: Add support for the ExternalFileField field type. It would decouple updating the rating from cron and indexing. Cron could still run every hour, while a separate routine updates only the ratings field every N minutes, without even having to reindex the piece of content being rated.
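To make the ExternalFileField idea more concrete, here is a minimal Python sketch of the kind of routine described above. It relies on Solr's ExternalFileField reading `key=value` lines from a file named `external_<fieldname>` in the index data directory. The field name `rating`, the document IDs, and the data directory are all hypothetical, and the sketch says nothing about how the module in #1586320 would actually implement this.

```python
import os
import tempfile


def write_external_ratings(ratings, data_dir, field_name="rating"):
    """Write ratings as key=value lines to external_<field_name> in data_dir.

    `ratings` maps a Solr document ID to a numeric rating. The ID format
    is an assumption; use whatever the index's unique key actually holds.
    """
    target = os.path.join(data_dir, "external_" + field_name)
    # Write to a temporary file first, then rename it into place, so that
    # Solr never reads a half-written ratings file.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir, prefix=".external_tmp_")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for doc_id in sorted(ratings):
            handle.write("%s=%s\n" % (doc_id, float(ratings[doc_id])))
    os.replace(tmp_path, target)
    return target
```

A routine like this could run every few minutes independently of Drupal's cron. Solr still has to reload the file before new values affect sorting, so the change is not visible the instant the file is written.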
Comment #6
martijn houtman commented
Running cron empties the cache, so there will be more cache misses if you run cron every x minutes. I'd suggest re-indexing through drush (drush solr-index). I run mine from the system cron (so not Drupal's cron).
But still, if you re-index every minute, a visitor might be looking at an old index for up to 1 minute. I've tried several things, but it's giving me a lot of trouble (old index entries showing up, although the node is no longer marked for indexing). What would be the apachesolr API call to manually re-index an entity?
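The system-cron approach from this comment can be sketched as a small Python wrapper around `drush solr-index`. The `--root` and `--uri` options are standard drush global options. The paths, the site URI, and the timeout are hypothetical placeholders, and this is only one way to wire it up.

```python
import subprocess


def build_index_command(drupal_root, uri=None, drush_bin="drush"):
    """Return the argument list for running `drush solr-index` on one site."""
    command = [drush_bin, "--root=" + drupal_root]
    if uri:
        # Only needed for multisite setups; the value is site-specific.
        command.append("--uri=" + uri)
    command.append("solr-index")
    return command


def run_index(drupal_root, uri=None, timeout=300):
    """Run the indexing command and return its exit code."""
    completed = subprocess.run(build_index_command(drupal_root, uri), timeout=timeout)
    return completed.returncode
```

A script that calls `run_index("/var/www/example")` (a made-up path) could then be scheduled from the system crontab every minute or so. That indexes whatever is marked for reindexing without a full Drupal cron run, and so without the cache clearing mentioned above.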
Comment #7
cpliakas commented
Hi Martijn Houtman.
Solr itself has a delay, so even if you send the node to Solr immediately, it won't be reflected in the results until the amount of time specified in solrconfig.xml has passed. I just want to make the distinction clear between sending the content to Solr immediately for indexing and when it actually becomes available. The former is a Drupal thing, the latter is not.
Thanks
Chris
Comment #8
martijn houtman commented
Hi cpliakas,
Thanks for your reply. Yeah, I am aware of that; we also set that timeout to a small value to keep the delay as short as possible. Solr is great for offloading our search, but our website's visitors should see their changes as soon as possible. That's why I don't want to wait for cron to run, but want to at least queue changes to Solr ASAP.
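As a side note on the Solr-side delay discussed here: besides the timeout in solrconfig.xml, Solr's XML update format accepts a `commitWithin` attribute (in milliseconds) on the `<add>` element, which asks Solr to commit that particular update within the given time. This is a different technique from the one described in this comment. The Python sketch below only builds such a message. The field names and document ID are hypothetical, and it does not reflect how the apachesolr module sends documents.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields, commit_within_ms=500):
    """Build a Solr XML <add> message for one document.

    `fields` maps field names to values; the names used by a real index
    depend on its schema, so the ones in the usage note are assumptions.
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")
```

For example, `build_add_message({"id": "node/1", "rating": 4.5})` produces a message that could be POSTed to the core's update handler. How quickly the change becomes searchable still depends on the Solr version and configuration.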
Comment #9
cpliakas commented
Understood. "Near real time" (NRT) searching is something people expect more and more. It might be worth following Solr 4.0, experimenting with its NRT implementation, seeing what the gotchas are, and determining what would need to be done in this module or an extension module to support that capability.
Comment #10
j0rd commented
Looking for this as well. I'm currently using Solr 4.x and NRT is enabled by default (I think), at least according to http://lucene.apache.org/solr/solrnews.html
Also, from what the thread below seems to suggest, setting the indexing interval to a couple of hundred milliseconds still lets Solr stay performant. There's no reason it needs to be minutes. I believe that thread is about 3.x and not 4.x.
http://lucene.472066.n3.nabble.com/When-Index-is-Updated-Frequently-td26...
So if you were going to go this route, you would need updates to be sent when a node is saved. Currently I don't believe this is an option in the ApacheSolr module.
SearchAPI provides this (I helped with the patch), but I don't believe the ApacheSolr module has this functionality or these options.
Going to change this from a "support request" to "feature request" and change the title.
Comment #11
j0rd commented
There's a really good issue queue (with code) about how to do this here:
#1816462: Possible to instantly index an entity / node?