Restrict MLT list to nodes of certain types, or the same type as current

David Stosik - February 11, 2009 - 15:38
Project:Apache Solr Search Integration
Version:6.x-1.x-dev
Component:More Like This
Category:feature request
Priority:normal
Assigned:Unassigned
Status:needs review
Issue tags:drupal.org redesign
Description

Would it be possible to add an option to the MLT block configuration that restricts results to certain node types, or to the current node's type?

#1

mikejoconnor - February 12, 2009 - 18:11

As long as you're using the MoreLikeThis handler, you should be able to use an fq query to restrict the results to specific nodes. See http://wiki.apache.org/solr/MoreLikeThisHandler and http://wiki.apache.org/solr/MoreLikeThis
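To make the fq idea concrete, here is a minimal sketch of an MLT request that carries a filter query. The base URL, the /mlt handler path, the document id and the title/body field names are assumptions for illustration; only the parameter names (q, mlt.fl, fq, rows) come from Solr itself.

```php
<?php
// Sketch only: host, handler path, id and field names are hypothetical.
function example_mlt_url($base_url, $doc_id, array $types) {
  $params = array(
    // The document to find similar items for.
    'q' => 'id:' . $doc_id,
    // Fields used to compute similarity (assumed field names).
    'mlt.fl' => 'title,body',
    // Filter query: keep only suggestions of the given node types.
    'fq' => 'type:(' . implode(' OR ', $types) . ')',
    'rows' => 5,
  );
  return $base_url . '/mlt?' . http_build_query($params);
}

echo example_mlt_url('http://localhost:8983/solr', '42', array('story', 'page')), "\n";
```

The fq parameter narrows the candidate set without affecting how similarity is scored, which is why it suits a "restrict by type" option.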

#2

mikejoconnor - February 13, 2009 - 09:57

Here is a patch to add a simple sub query field to the recommendations block. The sub query field is added as a fq parameter, to limit the mlt results.

AttachmentSize
mlt_subquery.diff 2.13 KB

#3

mikejoconnor - February 13, 2009 - 10:16
Status:active» needs review

Version 2,

Removed check_plain

Fixed the settings form validation

AttachmentSize
mlt_subquery.diff 2.27 KB

#4

pwolanin - February 15, 2009 - 00:49

I think having a bare text field is pretty user-unfriendly. How about a multi-select or checkboxes for all the available types? Also, this should probably take into account types that are already being excluded from the index.

#5

mikejoconnor - February 19, 2009 - 03:55

I considered that, however I thought that someone might want to do more than just filtering on type, such as only show items for a certain category, user, cckfield, date range, etc.

#6

pwolanin - February 19, 2009 - 14:51

@mikejoconnor - I think for that sort of thing they need a custom module to modify the query. The UI should just have some basic, user-friendly options, I think.

#7

JacobSingh - February 19, 2009 - 15:16

I'm in agreement. Our beta tests suggest that screen needs a redesign as few of our users were able to figure it out. Having a default created helped a lot in this regard.

I think we could:
1). Add a hook as Peter is suggesting (I think)

2). Add a variable which can be set in settings.php if someone so desires and document it.

#8

pwolanin - February 19, 2009 - 17:54

@Jacob - we should already be running the alter hook on the query, right? If not, then I'd make that the way to alter it.

#9

JacobSingh - February 20, 2009 - 05:52

Ah, yeah it does.

Mike, how do you feel about this? Since there is already a provision to do it, perhaps providing some new interface using the facet registry would be a better way to go?

Best,
Jacob

#10

mikejoconnor - February 25, 2009 - 20:20

Overall the current solution isn't a good one; it was more of a proof of concept and a starting point. I think the MLT UI needs a lot of love. In my opinion, limiting this to a simple list of node types is very short-sighted.

I really like the idea of adding items from the facet registry, and combining them with a select list, radio buttons, checkboxes, or an autocomplete text field.

#11

Nick_vh - May 14, 2009 - 15:47

Added a checkbox selection of content types to the Apache Solr More Like This block, so it is easier for people to customize their More Like This block.

Please review this patch. Diffed against the latest cvs checkout

AttachmentSize
morelikethis_added_content_type_selection.patch 7.15 KB

#12

pwolanin - May 14, 2009 - 20:14
Status:needs review» needs work

MLT module is gone in the latest CVS (combined with framework module). Are you using the DRUPAL-6--1 branch?

#13

Nick_vh - May 15, 2009 - 08:41

I checked out the latest head and I still see the MLT module? And I'm sure it is the HEAD. Please clarify?

#14

pwolanin - May 15, 2009 - 15:03

Right. Do not use HEAD - the active development branch is DRUPAL-6--1

#15

Nick_vh - May 15, 2009 - 22:25

That explains.. :-)

#16

Scott Reynolds - May 15, 2009 - 22:40

Happy to announce you can now do this with Apache Solr Views

http://drupal.org/cvs?commit=212234

#17

ceardach - July 4, 2009 - 00:00
Status:needs work» needs review

Now that MLT can be used in a view, are there any other changes that need to be done to close this issue?

#18

janusman - July 10, 2009 - 14:13
Status:needs review» needs work

I'm thinking this "needs work" as opposed to review.

#19

Bèr Kessels - July 14, 2009 - 08:32

@Scott in #16, do you mean that this issue can be closed? That a version of this was committed?

#20

Scott Reynolds - July 14, 2009 - 15:10

Sorry, I added noise. I shouldn't have. I was just excited about this feature going into the Apache Solr Views project. This has not been committed to the Apache Solr Search Integration project.

#21

robertDouglass - July 17, 2009 - 10:11

This is a visual review of #11.

This looks like a bug.

<?php
-    $fields = array('mlt.mintf', 'mlt.mindf', 'mlt.minwl', 'mlt.maxwl', 'mlt.maxqt', 'mlt.boost', 'mlt.qf');
+    $fields = array('mlt.mintf', 'mlt.mindf', 'mlt.minwl', 'mlt.maxwl', 'mlt.maxqt', 'mlt.boost', 'mlt.fq');
?>

And if it is, it looks like it's still there:

<?php
// current apachesolr.module
  try {
    $solr = apachesolr_get_solr();
    $fields = array(
      'mlt_mintf' => 'mlt.mintf',
      'mlt_mindf' => 'mlt.mindf',
      'mlt_minwl' => 'mlt.minwl',
      'mlt_maxwl' => 'mlt.maxwl',
      'mlt_maxqt' => 'mlt.maxqt',
      'mlt_boost' => 'mlt.boost',
      'mlt_qf' => 'mlt.qf',
    );
?>

These are extraneous comments, right?

<?php
+    // additional fq terms
+    // fq=+popularity:[10 TO *] +section:0
+    // in our case this will be fq=mlt_fq +type:blog +type:page
?>

Please pay attention to whitespace issues around Drupal coding style:

<?php
+    if(!empty($block['mlt_fqtype'])){
+      $subfq .= implode(' OR type:',$block['mlt_fqtype']);
+        'fq' => $block['mlt_fq'].$subfq,
// Should look like this
+    if (!empty($block['mlt_fqtype'])) {
+      $subfq .= implode(' OR type:', $block['mlt_fqtype']);
+        'fq' => $block['mlt_fq'] . $subfq,
?>

Isn't there a logic error here? If you implode on ' OR type:', won't the first one in the array not get the proper prefix?

<?php
+      $subfq .= implode(' OR type:', $block['mlt_fqtype']);
?>
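A small sketch of the concern, assuming $subfq does not already end in a "type:" prefix before the implode() result is appended (the quoted excerpt does not show how $subfq is initialized). The helper name is made up for illustration.

```php
<?php
$types = array('blog', 'page', 'story');

// Pattern from the patch: the glue only goes between items, so the
// first type ends up without a "type:" prefix.
$broken = implode(' OR type:', $types);
// $broken is "blog OR type:page OR type:story"

// One possible fix: prefix every item first, then join with " OR ".
function example_type_filter(array $types) {
  $clauses = array();
  foreach ($types as $type) {
    $clauses[] = 'type:' . $type;
  }
  return implode(' OR ', $clauses);
}

$fixed = example_type_filter($types);
// $fixed is "type:blog OR type:page OR type:story"
```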

Why do we need the subquery text field if we're providing checkboxes?

<?php
  $types = node_get_types('names');
  $form['advanced']['mlt_fqtype'] = array(
+    '#type' => 'checkboxes',
+    '#title' => t('Subquery for selected types'),
+    '#description' => t('This can be used to filter the result set. Exampe. type:story will limit the suggestions to story nodes. Note: If the list is empty, all content types will be selected'),
+    '#options' => $types,
+    '#default_value' => isset($block['mlt_fqtype']) ? $block['mlt_fqtype'] : array(''),
+  );
+  $form['advanced']['mlt_fq'] = array(
+    '#type' => 'textfield',
+    '#title' => t('Results subquery'),
+    '#description' => t('This can be used to filter the result set. Exampe. type:story will limit the suggestions to story nodes'),
+    '#default_value' => isset($block['mlt_fq']) ? check_plain($block['mlt_fq']) : '',
+  );
?>

#22

robertDouglass - July 17, 2009 - 10:14
Version:6.x-1.x-dev» 6.x-2.x-dev

While I understand mikejoconnor's desire to have a flexible system for these queries, and while jacobsingh points out that variables can be set in settings.php, and pwolanin points out that queries can be modified, I think the feature request for restricting by content type from the admin section is a valid request that fits the 80/20 rule of 80% of use cases with 20% of the work. To get in, the design requirement is that it can still be modified programmatically (via modify_query). I'm also moving this to the 6.2 branch.
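For reference, a rough sketch of what "modified programmatically" could look like from a custom module. Everything here is an assumption to verify against the version you run: the hook name and signature hook_apachesolr_modify_query(&$query, &$params, $caller), the $caller value used to spot MLT requests, and treating $params['fq'] as a flat list of filter strings. The module name "mymodule" is hypothetical.

```php
<?php
/**
 * Hypothetical implementation of the query alter hook in "mymodule".
 *
 * Assumed signature: (&$query, &$params, $caller). Not confirmed by this
 * thread; check apachesolr.module before relying on it.
 */
function mymodule_apachesolr_modify_query(&$query, &$params, $caller) {
  // Assumption: MLT requests can be recognized by $caller. The value
  // 'apachesolr_mlt' is made up for illustration.
  if ($caller !== 'apachesolr_mlt') {
    return;
  }
  // Assumption: fq is a flat list of filter strings. Restrict the
  // suggestions to content by one author.
  $params['fq'][] = 'uid:1';
}
```

This keeps site-specific filters (author, taxonomy term, date range) in code, while the block form only needs to cover content types.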

#23

socki - September 24, 2009 - 18:44

Here's an initial patch to expose this functionality to each More Like This block.

The patch does the following:

  • Adds a simple criteria text box to the block configuration page where you can enter facet name/value pairs
  • Loops over the facet name/value pairs entered, and adds them as filters to the main suggestions query

This permits the user to enter a criteria which would restrict the results. For example:

type:article

or

type:article title:growth
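As a rough illustration of how such a criteria string breaks down into name/value pairs (this is a standalone sketch, not the code from the attached patch, and the function name is made up):

```php
<?php
// Split "field:value field:value" into pairs; anything without a
// field and a value is skipped.
function example_parse_criteria($criteria) {
  $filters = array();
  // Split on any run of whitespace and ignore empty pieces.
  $pairs = preg_split('/\s+/', trim($criteria), -1, PREG_SPLIT_NO_EMPTY);
  foreach ($pairs as $pair) {
    // Limit to 2 parts so a value may itself contain a colon.
    $parts = explode(':', $pair, 2);
    if (count($parts) === 2 && $parts[0] !== '' && $parts[1] !== '') {
      $filters[] = array('field' => $parts[0], 'value' => $parts[1]);
    }
  }
  return $filters;
}

print_r(example_parse_criteria('type:article title:growth'));
```

A bare keyword such as "growth" yields no pair here, which is exactly the case the textbox cannot yet handle.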

What I'd still like it to do:

  • Have a nicer interface than simply the textbox
  • Permit a simple keyword in addition to a facet name/value pair

One problem I'm having with allowing a simple keyword: I've attempted to do this (as seen in the commented-out code of the patch) by adding a subquery. For example:

    $query = apachesolr_drupal_query('id:' . $id);

    $sub_query = apachesolr_drupal_query('-title:strategies');

    // apply any additional block-specific filters to the query
    foreach (explode(' ', $settings['mlt_res_criteria']) as $idx => $criteria) {
      list($field, $value) = explode(':', $criteria);
      if ($field && $value) {
        $sub_query->add_filter($field, $value);
      }
    }
    $query->add_subquery($sub_query);

The issue is that it doesn't seem to matter what I enter into the apachesolr_drupal_query() call for the subquery: as long as something is entered, the query returns nothing. If I instead leave that blank, but keep the subquery doing the filtering, that works fine.

AttachmentSize
apachesolr.diff 2.9 KB

#24

socki - September 24, 2009 - 20:27
Status:needs work» needs review

#25

pwolanin - September 25, 2009 - 01:21

From the comment above - the last patch doesn't work?

At one point there was a more expanded functionality like this in the MLT module when it was separate. I guess I'm not sure whether this is a general site need, or a site-specific need that should be handled by a little bit of custom code.

#26

socki - September 28, 2009 - 20:41

Hi, I'm not sure I understand the question. If you are asking me if the patch above works, are you referring to #11? If so, I'm not certain how that could work right now given that there is no separate apachesolr_mlt.module in the current release. The mlt block has been incorporated into the basic apachesolr.module.

The patch that I submitted in #23 basically exposes a text field whereby you can add some additional filtering for the MLT block. The rationale being that you might only want related content of a specific type to show up. I believe the patch with this basic functionality works.

Note: The patch is against the 6.x-1.0-RC2 release, and the 2.0-dev code appears similar, so the patch might work there as well, though I have not tested it against 2.0.

The additional comments that I added afterwards were more about making the interface a bit nicer for the user, rather than just exposing a textbox. This piece might not be necessary; it is more of a nice-to-have.

thoughts?

#27

robertDouglass - September 30, 2009 - 10:13

@socki have you tried solving this need using apachesolr_views as per Scott Reynolds? http://drupal.org/project/apachesolr_views

For the apachesolr module I'd like to reiterate my design requirements:
- a per-block variable that can contain a filter string
- a getter and setter function for that variable that takes the block module/delta and knows how to set the variable name
- a way to parse and apply the contents of that variable to the query on any mlt block
- a series of checkboxes for content type on the block configuration form that allow the admin to limit the mlt suggestions to a specific content type. I still feel that content types will cover 80% of people's needs.

To keep the block form from clobbering whatever else comes along, the variable should store an array that has a structure something like this:

<?php
$mlt_query = array(
  'form' => array('type:page', 'type:story'),
  'custom' => array('uid:1', 'uid:3', 'tid:17'),
);
?>

The 'form' part is set by the block admin form and the 'custom' part is set by other modules calling the API (the getter setter functions mentioned above). At query time the whole thing is combined into one query.
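A minimal sketch of the getter/setter idea, under stated assumptions: the function names and the variable naming scheme are hypothetical, and a plain in-memory array stands in for Drupal's variable storage so the example runs on its own.

```php
<?php
// Stand-in for Drupal's variable storage (assumption for this sketch).
$GLOBALS['example_store'] = array();

// Hypothetical naming scheme: one variable per block, built from the
// block's module and delta.
function example_mlt_filter_name($module, $delta) {
  return 'example_mlt_filters_' . $module . '_' . $delta;
}

// Always returns both parts, even if only one has been set so far.
function example_mlt_get_filters($module, $delta) {
  $name = example_mlt_filter_name($module, $delta);
  $default = array('form' => array(), 'custom' => array());
  if (isset($GLOBALS['example_store'][$name])) {
    return $GLOBALS['example_store'][$name] + $default;
  }
  return $default;
}

// $source is 'form' or 'custom'; only that part is overwritten, so the
// block form cannot clobber filters set by other modules.
function example_mlt_set_filters($module, $delta, $source, array $filters) {
  $all = example_mlt_get_filters($module, $delta);
  $all[$source] = array_values($filters);
  $GLOBALS['example_store'][example_mlt_filter_name($module, $delta)] = $all;
}

example_mlt_set_filters('apachesolr', 'mlt-001', 'custom', array('uid:1', 'tid:17'));
example_mlt_set_filters('apachesolr', 'mlt-001', 'form', array('type:page', 'type:story'));
print_r(example_mlt_get_filters('apachesolr', 'mlt-001'));
```

Saving the form a second time replaces only the 'form' entries; the 'custom' entries survive.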

#28

socki - November 2, 2009 - 17:31

Here is the patch that I'm working on. I have attached two separate files, though the logic for both is identical. Basically I have two things that I'm trying to get accomplished here. As per the discussion above, I'm hoping that something along these lines can find its way into the module going forward.

1) The handling of the fields is nearly how it was described by @robertDouglass. The one variation is that, rather than creating separate _get and _set functions, the two additional fields were added to the serialized structure that the rest of each block's data is stored in. This was done easily by just assigning a default value in the apachesolr_mlt_block_defaults function and adding the corresponding fields to the apachesolr_mlt_block_form function.

This part works in both the 1.x and 2.x patches.

2) The MLT block is then filtered with the addition of some code to the apachesolr_mlt_suggestions function. Basically, the approach currently taken is as such:

    // If types are available via array
    if (is_array($settings['mlt_res_types'])) {
      $_apply_sub = FALSE; // By default we will not apply the subquery
      $sub_query = apachesolr_drupal_query();
      // Loop over content type restrictions
      foreach ($settings['mlt_res_types'] as $type => $enabled) {
        // If at least one content type was selected, limit results based on them
        if ($enabled) {
          $_apply_sub = TRUE;
          $sub_query->add_filter('type', $type);
        }
      } // end - loop
      // If we're restricting the results
      if ($_apply_sub) {
        $query->add_subquery($sub_query);
      }
    } // end - is array

The code above loops over the content types enabled for the particular MLT block and adds it as a filter. This should suffice for about 80% of users of the block.

    // If additional criteria were specified
    if (trim($settings['mlt_res_criteria']) !== "") {
      $_criterias = explode(' ', trim($settings['mlt_res_criteria']));

      $sub_query = apachesolr_drupal_query();

      foreach ($_criterias as $idx => $_criteria) {
        if ($_str_pos = strpos($_criteria, ':')) {
          // field:value pair, added as a filter
          list($field, $value) = explode(':', $_criteria);
          $sub_query->add_filter($field, $value);
        } elseif ($_str_pos = strpos($_criteria, '^')) {
          // term^boost, passed through as a boost query
          $params['bq'][] = $_criteria;
        } else {
          // bare keyword, filtered against title and body
          $sub_query->add_filter('title', $_criteria);
          $sub_query->add_filter('body', $_criteria);
        }
      }

      $query->add_subquery($sub_query);
    } // end - if

As an added bonus, users would have the ability to tweak the results even further by entering additional criteria. The way this currently functions is that it accepts basic Solr query syntax, which it then breaks apart and parses as needed to allow for boosting of terms and additional keywords.

This is only functioning in the 1.x branch.

The reason this appears not to work in the 2.x branch is that the logic within the apachesolr_modify_query function is different. In the 1.x branch, queries are added to the parameters as such:

  if ($query && ($fq = $query->get_fq())) {
    $params['fq'] = $fq;
  }

In the 2.x branch, the queries are parsed and added to the parameters like this:

  if ($query && ($fq = $query->get_fq())) {
    foreach ($fq as $delta => $values) {
      foreach ($values as $value) {
        $params['fq'][$delta][] = $values;
      }
    }
  }

It seems the issue may be because $values in the 2.x version is expected to be an array, but it is not.

Question is, in the 2.x implementation, how should I be adding these additional filters so that they can be parsed and subsequently filtered correctly?

Thanks in advance for your help.

AttachmentSize
1.x rc3 patch (works) 5.01 KB
2.x dev patch (needs work) 7.56 KB

#29

robertDouglass - November 24, 2009 - 15:42

wrt 6.2, the subqueries functionality was simply broken until recently.

#30

robertDouglass - November 24, 2009 - 22:22

I worked on this extensively and, based on the work from @socki, came up with an approach that's simple and effective. It has the list of checkboxes for types. These get OR'd together. Then there is a textfield where you can write arbitrary query stuff. This gets AND'd to the previous query (the type filters). Note that you can do your own AND/OR grouping in the textbox, as well as range queries, boosting, negatives, etc.
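The combination described here can be sketched roughly as follows. This is not the code from mlt.patch: the function name is made up, the checkbox array shape (type => value, with 0 for unchecked) is an assumption, and the AND is obtained by emitting two separate filter queries, relying on Solr intersecting multiple fq parameters.

```php
<?php
// Build a list of filter queries: one OR'd clause for the checked
// types, plus the free-text filter as its own entry.
function example_mlt_filter_queries(array $type_checkboxes, $custom = '') {
  $type_clauses = array();
  foreach ($type_checkboxes as $type => $enabled) {
    // Unchecked boxes are assumed to carry an empty value such as 0.
    if ($enabled) {
      $type_clauses[] = 'type:' . $type;
    }
  }
  $fq = array();
  if ($type_clauses) {
    $fq[] = implode(' OR ', $type_clauses);
  }
  $custom = trim($custom);
  if ($custom !== '') {
    // Passed through untouched so grouping, ranges and negatives
    // written by the admin keep their meaning.
    $fq[] = $custom;
  }
  return $fq;
}

print_r(example_mlt_filter_queries(
  array('story' => 'story', 'page' => 0, 'blog' => 'blog'),
  'uid:1 OR uid:3'
));
```

Keeping the two parts as separate filter queries avoids having to wrap the free text in parentheses, which matters for purely negative filters such as -title:strategies.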

AttachmentSize
mlt.patch 3.73 KB

#31

robertDouglass - November 24, 2009 - 22:27

#30 is for 6.2, in case it wasn't clear.

#32

Nick_vh - November 24, 2009 - 22:39

Great work! This has been a long ride, but I'll test it and I expect nothing else than happiness! :)

#33

robertDouglass - November 25, 2009 - 14:34
Version:6.x-2.x-dev» 6.x-1.x-dev

Applied to 6.2. Please review for 6.1.

AttachmentSize
mlt.patch 3.73 KB
 
 
