Hello there,
I'm writing some customizations for the search engine and would like to take into account the number of times a node has been viewed or read. I already have that part implemented, but I'm running into some trouble with setting up reindexing.
Our cron runs 4 times an hour, and our server can safely process 50 nodes per run with Apache Solr: 50 x 4 = 200 nodes per hour. The problem is that on average we get about 300 page views an hour, and that number is growing. If I submit a node for reindexing every time it's viewed, the queue will always be full, which presents all kinds of problems.
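To make the arithmetic concrete, here is a rough sketch of how the backlog would grow. It assumes the worst case, where every page view queues one more node for reindexing; the function and parameter names are made up for illustration.

```python
def backlog_after(hours, views_per_hour=300, runs_per_hour=4, nodes_per_run=50):
    """Return the queue length after `hours`, assuming each view enqueues one node."""
    capacity_per_hour = runs_per_hour * nodes_per_run  # 4 * 50 = 200
    backlog = 0
    for _ in range(hours):
        # The queue grows by the views that cron could not process this hour.
        backlog = max(0, backlog + views_per_hour - capacity_per_hour)
    return backlog


print(backlog_after(24))  # 2400 under these assumptions
```

In practice repeated views of the same node would not each add a new queue entry, so the real backlog would be smaller, but it would still keep growing while views outpace capacity.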
So I need to come up with a way to moderate the submission for reindex. A couple of ways to do that are:
* Only allow submission of a node for reindex once a day
* Only submit a node for reindex if it has had 10+ views/reads since the last reindex
I can store the timestamp of when the node was last indexed as one of its Solr fields. Same with the number of views at the time of the last indexing.
I think it makes sense to do the check and submission for reindex at node load time.
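As a rough sketch of the two rules above, the check at node load time might look something like this. The function names are hypothetical, and the stored values (`last_indexed_ts`, `views_at_last_index`) are assumed to be available from wherever they end up being kept.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def should_reindex_daily(last_indexed_ts, now=None):
    """Rule 1: allow a reindex only if at least a day has passed since the last one."""
    now = time.time() if now is None else now
    return now - last_indexed_ts >= SECONDS_PER_DAY


def should_reindex_by_views(current_views, views_at_last_index, min_new_views=10):
    """Rule 2: allow a reindex only after at least `min_new_views` new views."""
    return current_views - views_at_last_index >= min_new_views
```

Either rule could be used on its own, or the two could be combined so that a node is resubmitted only when both are satisfied.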
My question is: How do I retrieve a particular node's values from the Apache Solr index?
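For reference, one generic way to read stored fields back is to query Solr's `/select` handler directly and ask for only the fields of interest. The sketch below just builds the request URL and parses a JSON response; the base URL, the `nid` field, and the two field names are assumptions for illustration and depend on the actual schema.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, nid, fields):
    """Build a /select URL that asks Solr for `fields` of the document with this nid.

    Assumes the schema has a `nid` field; adjust the query to match the real schema.
    """
    params = {
        "q": f"nid:{int(nid)}",
        "fl": ",".join(fields),  # return only these stored fields
        "wt": "json",
        "rows": 1,
    }
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


def first_doc(response_text):
    """Return the first matching document from a Solr JSON response, or None."""
    docs = json.loads(response_text)["response"]["docs"]
    return docs[0] if docs else None


# Hypothetical field names for the stored timestamp and view count.
url = build_select_url("http://localhost:8983/solr", 42,
                       ["is_last_indexed", "is_view_count"])
```

Note that only fields marked as stored in the schema come back in the response, and an extra HTTP request on every node load has its own cost.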
Thanks,
Andrey.
P.S. If you have a better idea of how to accomplish what I'm trying to do, please let me know.
| Comment | File | Size | Author |
|---|---|---|---|
| | Screen shot 2011-05-06 at 9.46.30 AM.png | 281.96 KB | mr.andrey |
Comments
Comment #1
mr.andrey commented
Ended up just using a custom table with a hook_nodeapi.
Now the node is reindexed if:
* node counter is < 10 and it has advanced 2+ views
* node counter is < 100 and it has advanced 5+ views
* node counter is >= 100 and it has advanced 10+ views
Seems reasonable for now.
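The tiers above can be sketched roughly as follows. This assumes "node counter" means the current view count and "advanced" means views gained since the last reindex; the function names are made up for illustration.

```python
def required_advance(view_count):
    """Return how many new views are needed before a node is reindexed."""
    if view_count < 10:
        return 2
    if view_count < 100:
        return 5
    return 10


def needs_reindex(view_count, views_at_last_index):
    """True if the counter has advanced enough since the last reindex."""
    return view_count - views_at_last_index >= required_advance(view_count)
```

The effect is that new or rarely viewed nodes get their counts refreshed quickly, while popular nodes are resubmitted less often per view.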
Cheers,
Andrey.
Comment #2
pwolanin commented
You must not be doing any caching?
Anyhow - sounds like a reasonable approach.
Comment #3
mr.andrey commented
What do you mean?
I'm not quite clear on the whole caching and read counter thing. I'm using Boost and Statistics Advanced Settings.
I've read that Boost isn't 100% friendly with the read counter, but haven't looked that deeply into it yet.