Add a bias on node type (and node-type exclusion) [#365901]

Comment	File	Size	Author
#18	nohang-exclude-365901-18.patch	2.86 KB	pwolanin
#17	nohang-365901-17.patch	721 bytes	pwolanin
#10	node-type-weights-365901-10.patch	6.93 KB	pwolanin
#4	365901-node-type-weights.patch	5.77 KB	damien tournoud
#3	365901-node-type-weights.patch	5.73 KB	damien tournoud
#1	365901-node-type-weights.patch	5.43 KB	damien tournoud

Comment #1

damien tournoud commented 29 January 2009 at 01:00

Status:	Active	» Needs review

Status	File	Size
new	365901-node-type-weights.patch	5.43 KB

Here is a first patch for this. Do we need a steepness, in addition to the boost?

Comment #2

pwolanin commented 29 January 2009 at 14:21

not sure why a steepness would be relevant here if you can set the boost per content type.

Why this instead of just saving an array and letting it be serialized/unserialized?

list($type_steepness, $type_boost) = explode(':', $type_boost);
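For illustration, a minimal sketch of the two storage approaches being compared here. The type names and values are hypothetical, not taken from the patch; in Drupal 6 the array form would normally be stored through variable_set(), which serializes arrays on its own.

```php
<?php
// Hypothetical per-type settings packed as "steepness:boost" strings.
$packed = array('page' => '2:8.0', 'story' => '1:4.0');

// Reading a packed value back means parsing it on every use.
list($type_steepness, $type_boost) = explode(':', $packed['page']);

// Alternative: keep a nested array and let serialize()/unserialize()
// handle the round trip, so no custom string format is needed.
$settings = array(
  'page' => array('steepness' => 2, 'boost' => 8.0),
  'story' => array('steepness' => 1, 'boost' => 4.0),
);
$stored = serialize($settings);
$restored = unserialize($stored);
```

The packed form yields strings that still need casting, while the array form keeps each value's type and needs no parsing code.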

Also, per Jacob we need a setting (potentially) to totally exclude certain node types from indexing.

Comment #3

damien tournoud commented 29 January 2009 at 15:37

Status	File	Size
new	365901-node-type-weights.patch	5.73 KB

Following an IRC discussion with Peter, here is a new version:

- use bf queries to set the type-specific boosts
- allow node types to be completely omitted from indexing
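A rough sketch of how per-type boosts could be turned into query-time clauses, assuming they are sent as Solr boost-query clauses of the form field:value^boost on the `type` field. The function name and the example boosts are hypothetical and not taken from the patch.

```php
<?php
/**
 * Builds one boost clause per content type (illustrative only).
 *
 * @param array $boosts
 *   Map of node type => boost value; 0 means "no bias for this type".
 *
 * @return array
 *   Clauses such as "type:page^2.0", one per boosted type.
 */
function example_type_boost_queries(array $boosts) {
  $clauses = array();
  foreach ($boosts as $type => $boost) {
    // Types without a boost add nothing to the query.
    if ($boost > 0) {
      $clauses[] = sprintf('type:%s^%.1f', $type, $boost);
    }
  }
  return $clauses;
}
```

Because the clauses are built when the query is sent, changing a boost in this scheme would not require re-indexing.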

Comment #4

damien tournoud commented 29 January 2009 at 16:50

Status	File	Size
new	365901-node-type-weights.patch	5.77 KB

And a fixed version.

Comment #5

JacobSingh commented 29 January 2009 at 16:58

nice implementation! This feature will be killer.

I feel there is a usability issue here though. It's certainly good to remove node types that are not going to be queried; however, we need to warn users that the nodes are being removed from the index when they choose to omit a type. I imagine many users would assume that "Omit" means "do not boost it", not "remove it".

Also, when they turn on a node type which they had previously not been using, we should warn them and/or force a re-index so they are not confused.

What do you think? How should we make this clear?

Comment #6

JacobSingh commented 29 January 2009 at 17:00

Wait, I just reviewed again. I didn't notice you set the previously omitted nodes to be re-indexed.

Sorry :/

I'll go back through and do a proper review with the patched code later on today.

Comment #7

pwolanin commented 29 January 2009 at 17:32

I think that a setting to omit from the index should really be a separate settings form from the boost. We should, by default, not add any bq for content types if we can avoid it.

Comment #8

damien tournoud commented 29 January 2009 at 18:49

This still needs review, but this is wrong:

+      $solr->deleteByQuery("type:$type");

This should take the site hash into account too.
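A minimal sketch of what a site-scoped delete query might look like. It assumes the index stores the site identifier in a field named `hash`; that field name, the function name and the sample values are assumptions for illustration and are not confirmed by this thread.

```php
<?php
/**
 * Builds a delete query limited to one site and one node type.
 *
 * Illustrative only: the "hash" field name is an assumption.
 */
function example_delete_query($type, $site_hash) {
  // Restricting on the site hash keeps documents that belong to
  // other sites sharing the same index from being deleted.
  return sprintf('hash:%s AND type:%s', $site_hash, $type);
}

// The resulting string would then be passed to $solr->deleteByQuery().
$query = example_delete_query('page', 'abc123');
```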

Comment #9

pwolanin commented 29 January 2009 at 22:25

Ah, sure, at least if you think you might have a multi-site index.

Note, however, that the delete index operation is not limited currently to the current site - so we could go with this for now, but handle it better when multi-site support goes back in.

Comment #10

pwolanin commented 2 February 2009 at 02:43

Status	File	Size
new	node-type-weights-365901-10.patch	6.93 KB

Here's a better patch that separates boosts from exclusion - also correctly handles the case where we 'Reset to defaults'

Comment #11

dww commented 3 February 2009 at 01:18

Title:	Add a biais on node type	» Add a bias on node type
Issue tags:		+drupal.org upgrade

I'll see if I can make time to review/test this, but I can't promise I will with all the other d6 upgrade issues on my plate... ;)

Comment #12

pwolanin commented 3 February 2009 at 13:28

Title:	Add a bias on node type	» Add a bias on node type (and node-type exclusion)

Comment #13

pwolanin commented 3 February 2009 at 14:07

Status:	Needs review	» Fixed

committed to 6.x

Comment #14

dreed47 commented 3 February 2009 at 16:11

I installed this patch and I see two issues with the node exclusion part.

First, the admin page at /admin/settings/apachesolr/index shows a count of all nodes in the system as though they are all to be indexed, even though I've set some node types to be excluded.

Second (and much more important), it seems as though the cron job is pulling nodes that should be excluded. For example, I have it set to process 50 nodes per cron run; it pulls the first 50 that it comes to, they are all excluded node types, so it indexes nothing and waits until the next cron run. For many people this may not be an issue, but for my current use case it is. Say I have 10k nodes of a type that I don't want to index and 1k nodes of a type that I do. The cron indexing should not have to loop through all 11k nodes.
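To make the request concrete, here is a small sketch of batch selection that applies the per-run limit only to indexable nodes. The function name and the in-memory queue are hypothetical; in the module itself the same effect would more likely come from filtering excluded types in the SQL query.

```php
<?php
/**
 * Picks up to $limit node IDs, skipping excluded types (illustrative).
 *
 * @param array $queue
 *   Map of nid => node type, in indexing order.
 * @param array $excluded_types
 *   Node types that must never be indexed.
 * @param int $limit
 *   Maximum number of nodes to index per cron run.
 */
function example_next_batch(array $queue, array $excluded_types, $limit) {
  $batch = array();
  foreach ($queue as $nid => $type) {
    if (in_array($type, $excluded_types, TRUE)) {
      // Excluded nodes do not count against the per-run limit.
      continue;
    }
    $batch[] = $nid;
    if (count($batch) >= $limit) {
      break;
    }
  }
  return $batch;
}
```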

Comment #15

dww commented 3 February 2009 at 16:49

Status:	Fixed	» Active

Haven't confirmed myself, but sounds like #14 brings up an important bug in how this works. ;)

Comment #16

pwolanin commented 3 February 2009 at 16:57

@dww - yes, I'm aware of those issues - already had imagined we might need a follow-on patch. I'm not convinced that the node_load of non-indexed nodes is a problem, but much more serious is that indexing may hang forever if all the nodes selected for indexing on a given cron run are excluded.
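One way to picture the hang: if the "last indexed" position only moves when a document is actually sent, a batch made up entirely of excluded nodes is selected again on every cron run. The sketch below shows the position advancing for skipped rows as well. The function name and row layout are hypothetical, and this is not necessarily how the follow-up patch solves it.

```php
<?php
/**
 * Processes one batch and returns the documents plus the new position.
 *
 * @param array $rows
 *   Rows with 'nid', 'type' and 'changed' keys, ordered by 'changed'.
 * @param array $excluded_types
 *   Node types that must not be indexed.
 * @param int $last_changed
 *   Position reached by the previous run.
 */
function example_index_batch(array $rows, array $excluded_types, $last_changed) {
  $documents = array();
  foreach ($rows as $row) {
    // Advance the position for every row seen, indexed or not, so an
    // all-excluded batch cannot be picked up again on the next run.
    $last_changed = max($last_changed, $row['changed']);
    if (!in_array($row['type'], $excluded_types, TRUE)) {
      $documents[] = $row['nid'];
    }
  }
  return array($documents, $last_changed);
}
```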

Comment #17

pwolanin commented 3 February 2009 at 17:08

Status:	Active	» Needs review

Status	File	Size
new	nohang-365901-17.patch	721 bytes

This might be a sufficient fix to prevent the really critical bug.

Comment #18

pwolanin commented 3 February 2009 at 21:51

Status	File	Size
new	nohang-exclude-365901-18.patch	2.86 KB

A little better refactoring - a separate hook for node exclusion.

Comment #19

damien tournoud commented 4 February 2009 at 09:04

Looks like a good idea at first sight.

I commented (on IRC) on the previous version of the patch that we probably don't want to output:

watchdog('Apache Solr', 'Adding @count documents.', array('@count' => count($documents)));

When count($documents) == 0 ;)
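As a sketch of the suggestion, the message could be built only when there is something to send. The helper name is hypothetical; in the module the returned string would correspond to the watchdog() call quoted above.

```php
<?php
/**
 * Returns the log message for a batch, or NULL when the batch is empty.
 */
function example_batch_log_message(array $documents) {
  $count = count($documents);
  if ($count == 0) {
    // Nothing was sent to Solr, so there is nothing to report.
    return NULL;
  }
  return sprintf('Adding %d documents.', $count);
}
```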

Comment #20

dww commented 4 February 2009 at 09:31

Yup, +1 on the concept here. Code appears good on visual inspection, though I haven't tested it. I guess I should really set up a Solr instance on my laptop to test stuff like this. ;)

Comment #21

pwolanin commented 4 February 2009 at 13:25

@Damien - is it bad to watchdog that we sent 0? might help with debugging. Committing as is - we can revisit the watchdog call if needed.

@dww - it's really easy to run the example (Jetty) server locally. grab me in IRC if you want assistance.

Comment #22

pwolanin commented 8 February 2009 at 18:38

Status:	Needs review	» Fixed

see follow-up patch: http://drupal.org/node/370796

Comment #23

22 February 2009 at 18:40

Status:	Fixed	» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Comment #24

kentr commented 23 October 2010 at 22:08

One thing we need for d.o is a bias on the node type so that project nodes appear first in the search result (have you tried recently to search for "views" or "cck" here?).

This can't really be done at query time (we can't map types to weights using any Solr function)

Just want to confirm: the content type bias is only applied at index time, not at query time (so I must re-index to see the effect)?

Thanks.

[Edit] Found the answer: Content type bias is done at query time.
