I want one search page to return only results from a CCK type named "jobs", and another search page to use the default behavior.
How do I do this? I am also using the Location module and would like to join the results with a simple proximity search.

1. How do I index attached files?
2. How do I create my own search form and call something like do_search()? Do you have Views hooks, so I can build a search query using Views?
3. How do I join the results with the location table?

The search form would have 2 fields:

Keywords
Enter City, State

Thank you. I cannot find a group to ask these questions.

Bill

Comments

cpliakas’s picture

Hi Bill.

Sorry for the lack of documentation. This module was built in my free time outside of work, so priority #1 was getting the code stable. Extensive documentation is in the works, but I need to take a deep breath before I embark on that adventure. I promise that it will be available soon.

Seth Cohn of CommonPlaces e-Solutions, LLC actually did something very similar to what you are trying to accomplish using the module's Facet API. It really showed some flexibility in the API that I never intended, but the result was being able to use facets as "tabs". It is really too much to describe here, but a module based on his code should be available fairly soon. What I would recommend first is implementing hook_luceneapi_document_alter() to add the fields to the index. For example, after adding the jobs field to the index, you will be able to search it using the following query: jobs:"Job you are looking for". I would also recommend adding fields for City and State. Take a look at luceneapi_node_luceneapi_facet() as well as README.txt for a brief tutorial on the facet API. Again, more detailed documentation is on the way.

In terms of views hooks, that is also a separate module that is being explored, but Search Lucene API has an abstraction layer to Zend's query API, so you can create and manipulate queries programmatically. This would mesh very well with the various views hooks, although it has not been implemented. Any contributions in this area would be greatly appreciated.

Indexing files requires being able to parse the files into text first. Many times this requires external programs, but if you are trying to parse MS 2007 documents, the ZF components actually have document parsers built in. Once you extract the text, you can add it to the index as a field via hook_luceneapi_document_alter().

The next step is to get some solid documentation online, and then we will determine the best way to start dialogue with the Drupal community. I am working with other developers who are currently implementing the API to contribute real world solutions making this module more than just a content indexer. I am sorry that I cannot be of more help right now, but check back in a little bit as the documentation is coming.

billnbell’s picture

I get most of what you are saying. I am better with Zend directly than with the module. I cannot find any documentation on facets. The README does not help me much. It needs more documentation and examples; examples will help people use the module more often. Just some sample code would be very helpful. How about a simple Drupal example:

1. How to create a custom form field and pass it to Lucene and limit results to content type "xyz". Somehow there has to be an easy way to get a list of node ids without hacking your module.
2. hook_luceneapi_document_alter() example would be very useful on how to add fields.
3. Facet example with luceneapi_node_luceneapi_facet() - just a simple one (very simple!!)

Any ideas on how to take a list of node.nid and join it with a SQL query - maybe with a temp table would be very good as well.... implode(",", $nodes).

Thanks on the Zend parsers. I didn't see that. I heard there was a way to output PDF, but other than Xpdf I couldn't find anything else that works well.

I am building a large web site for 2 companies and could really use some assistance. Solr is not an API and appears harder to use.

Thanks

cpliakas’s picture

Hi Bill.

Here are a couple of examples to get you going. First, let's start with adding the field to the document. In the example, I am going to assume that the CCK field is stored in $node->field_location[0]['value'] and we want to add it to the Lucene field location. It is fairly simple to add it via hook_luceneapi_document_alter().

<?php
/**
 * Implementation of hook_luceneapi_document_alter().
 */
function mymodule_luceneapi_document_alter($doc, $node, $module, $type) {
  // bail if we are not dealing with nodes
  if ('node' != $type) {
    return;
  }

  // add the field to the document as a keyword.  Feel free to use another field type.
  luceneapi_field_add($doc, 'keyword', 'location', $node->field_location[0]['value']);
}

The following example creates a facet to be able to filter by location using wildcards.

<?php

/**
 * Implementation of hook_luceneapi_facet().
 */
function mymodule_luceneapi_facet($op, $module, $type) {
  // bail if we are not dealing with nodes
  if ('node' != $type) {
    return;
  }

  switch ($op) {
    // name of the facet, will show up in /admin/settings/luceneapi_node
    case 'name':
      return 'Location';

    // define the facet's form element, just a simple FAPI array
    case 'facet':
      $form = array();
      $form['location'] = array(
        '#type' => 'textfield',
        '#title' => t('Location'),
        '#size' => 30,
        '#default_value' => luceneapi_facet_value('location', ''),
        '#description' => t('Enter a location, wildcards allowed.'),
      );
      return $form;

    // callback function, a facet handler to process the value submitted by the user
    case 'callback':
      return array('mymodule_facet_handler' => array());
  }
}

/**
 * Adds query to filter by location.
 */
function mymodule_facet_handler() {
  // gets facet value passed by user
  $location = luceneapi_facet_value('location', '');

  // returns subquery to filter by location, in this case it is a wildcard query
  // NOTE: see luceneapi.query.inc for available query types and parameter definitions
  return luceneapi_query_get('wildcard', $location, 'location');
}

There are a bunch of different ways to do the custom form you are talking about, but you could always pass the search query to search_data($keys, 'luceneapi_node');. It will search the contents field by default, but you can change the default field to location by calling Zend_Search_Lucene::setDefaultSearchField('location'); beforehand. Since you are looking to add extra data via a custom SQL query, hook_luceneapi_result_alter() is what you are looking for. You can get the node ID from $result, execute your query, and then add the data back to $result. Since it is passed by reference, the array will be passed to the theme function and displayed. If you want to limit the results to the content type xyz, you can always append a subquery in hook_luceneapi_query_alter() to do the filtering.

/**
 * Implementation of hook_luceneapi_query_alter().
 */
function mymodule_luceneapi_query_alter($query, $module, $type) {
  if ('node' == $type) {
    $term_subquery = luceneapi_query_get('term', 'xyz', 'type');
    luceneapi_add_subquery($query, $term_subquery, 'required');
  }
}
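For the custom SQL step described above (joining the node IDs from the results with the location table), here is a minimal sketch of building the query. The function name is hypothetical, and the {location_instance} and {location} table and column names are assumptions about the Location module's schema, so check them against your install. It only builds the SQL string and argument list, using Drupal 6 style %d placeholders instead of a raw implode() of user-influenced values.

```php
<?php
/**
 * Builds a query that joins a list of node IDs with location data.
 *
 * Hypothetical helper: the table and column names are assumptions.
 *
 * @return
 *   array($sql, $args), or NULL if there are no node IDs.
 */
function mymodule_location_sql(array $nids) {
  // Cast everything to integers and drop duplicates.
  $nids = array_values(array_unique(array_map('intval', $nids)));
  if (empty($nids)) {
    return NULL;
  }

  // One %d placeholder per node ID.
  $placeholders = implode(',', array_fill(0, count($nids), '%d'));

  $sql = "SELECT li.nid, l.city, l.province FROM {location_instance} li "
       . "INNER JOIN {location} l ON l.lid = li.lid "
       . "WHERE li.nid IN ($placeholders)";

  return array($sql, $nids);
}
```

Inside Drupal, the returned pair could then be passed along as db_query($sql, $args), for example from your hook_luceneapi_result_alter() implementation once you have collected the node IDs.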

As you can probably see, the luceneapi_query_get() function is just a factory function for the Zend_Search_Lucene_Search_Query objects. The advantage is that it handles exceptions gracefully and sends errors to watchdog so you don't have to. See the luceneapi_throw_error() docblock for the other benefits. To get the error messages piped to the screen in your development environment, set the error mode to Debug in /admin/settings/luceneapi.

If the hooks aren't what you are looking for, feel free to build a custom query and pass it to $index->find() as you are a Zend guy. I came over to Drupal from the Zend Framework, so I understand wanting to program using straight Zend. Although there is an abstraction layer, I really tried to keep it close to Zend since they did an amazing job with their API. The advantages of doing things through the abstraction layer are graceful error handling as mentioned above, and consistent UTF-8 handling without you having to think about it.
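For the straight-Zend route, a rough sketch might look like the following. The function name is hypothetical, and $index is assumed to be an already opened Zend_Search_Lucene index; how you obtain it from the module is not shown here. The "type" field name is taken from the hook_luceneapi_query_alter() example. Note that this path bypasses the abstraction layer, so you have to catch Zend's exceptions yourself.

```php
<?php
/**
 * Searches $index for $keys, restricted to one content type.
 *
 * Hypothetical helper; $index is assumed to be an open
 * Zend_Search_Lucene index.
 */
function mymodule_find_by_type($index, $keys, $content_type) {
  // Parse the user's keywords with Zend's query parser.
  $user_query = Zend_Search_Lucene_Search_QueryParser::parse($keys);

  // Exact-match term query on the "type" field.
  $type_term = new Zend_Search_Lucene_Index_Term($content_type, 'type');
  $type_query = new Zend_Search_Lucene_Search_Query_Term($type_term);

  // Passing TRUE marks each subquery as required.
  $query = new Zend_Search_Lucene_Search_Query_Boolean();
  $query->addSubquery($user_query, TRUE);
  $query->addSubquery($type_query, TRUE);

  return $index->find($query);
}
```

Each returned hit exposes the document's stored fields as properties, so a list of node IDs can be collected from the hits if the node ID is stored in the index (an assumption to verify for your setup).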

Hope this helps,
Chris

sethcohn’s picture

Chris and I just reviewed the code required to implement faceted tabs (identical to the way Drupal puts tabs into search now) using LuceneAPI, and it's pretty trivial (once we figured out a method that worked, which took some brainstorming initially), so I'll whip up a sample contrib module for this and post it in a new issue. Lucene Extras, Chris' planned place for this sort of contrib module, will get a real release of it sooner or later.

cpliakas’s picture

Status: Active » Closed (fixed)

shunting’s picture

Suppose I want to treat a CCK field as a facet (seems like the simplest use case), what is the hook for that? document_alter is for an external document, right? What about a node? Thanks for a very, very intriguing module....

shunting’s picture

This sample code worked for me, with one exception (below).

When I added 'location' as a CCK field to the blog content type (it could be any content type), the facet search successfully searched on keywords with facet values in the location field only.

(What would be nice is an example with multiterm (OR-ed) values, not just a single string....)

The exception: luceneapi_add_subquery should be luceneapi_subquery_add. I left that function commented out, though, since it looks like the filtering is already handled by the contributed luceneapi_node module.
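On the multi-term (OR-ed) case, one possible approach is to build a query string in Lucene query syntax and hand it to the search as keys. This is only a sketch: the function name is hypothetical, and it assumes the field was indexed as a tokenized type so that phrase matching works.

```php
<?php
/**
 * Builds an OR-ed query string such as: location:"a" OR location:"b".
 *
 * Hypothetical helper, not part of the module.
 */
function mymodule_or_query($field, array $values) {
  $clauses = array();
  foreach ($values as $value) {
    // Remove double quotes so each phrase stays well formed.
    $value = trim(str_replace('"', '', $value));
    if ($value === '') {
      continue;
    }
    $clauses[] = $field . ':"' . $value . '"';
  }
  return implode(' OR ', $clauses);
}
```

For example, mymodule_or_query('location', array('Boston', 'New York')) returns location:"Boston" OR location:"New York", which could then be passed to search_data($keys, 'luceneapi_node') as described in #3.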

shunting’s picture

More confusing terminology. The sentence:

... [Lucene] will search the contents field by default ...

applies to Lucene fields, not to Drupal fields, which is confusing enough. Even more confusingly, luceneapi_node.module indexes Drupal fields, but only those built into nodes, like author and change date, and not CCK fields specific to content type. And even more confusingly, contents:foo will find "foo" anywhere in content, including CCK fields -- but won't display the hit that way.

Not a knock on the software at all or even the documentation! It looks to me like we're dealing with an intrinsically confusing situation here.

* * *

Am I right in thinking that hook_luceneapi_document_alter applies to a "document" from Lucene's perspective, and that therefore a Drupal node is a "document"? Or does the hook only apply to external documents?

alex72rm’s picture

Version: 6.x-1.0-rc7 » 6.x-1.6

@#6:

I don't understand why, but the code as stated in #3 doesn't work for me (I'm working with the 5.x branch).

Are there any modifications I need to apply to make it run correctly?

Thanks a lot

alex72rm’s picture

Status: Closed (fixed) » Active

The call in #3:

luceneapi_field_add($doc, 'keyword', 'location', $node->field_location[0]['value']);

Does adding this call require a re-index?

Maybe this is the reason for no search results when a "location" field is filled with a value.

cpliakas’s picture

Hi Alex.

This hook gets invoked when a document is being added to the index. After you implement this hook, you have to reindex and run cron. Also, I made a mistake in the above post. You should probably use "text" or "unstored" instead of "keyword". "keyword" fields are best used for things like node IDs or exact matches as opposed to textual data.
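With that correction applied, the hook from #3 might look like the sketch below. The field name field_location is the same assumption as in the original example, and the empty-value check is an addition of mine so that nodes without a location are skipped.

```php
<?php
/**
 * Implementation of hook_luceneapi_document_alter().
 */
function mymodule_luceneapi_document_alter($doc, $node, $module, $type) {
  // Bail if we are not dealing with nodes.
  if ('node' != $type) {
    return;
  }

  // Skip nodes that have no location value (added check).
  if (empty($node->field_location[0]['value'])) {
    return;
  }

  // Add the field as "text" rather than "keyword".
  luceneapi_field_add($doc, 'text', 'location', $node->field_location[0]['value']);
}
```

Remember to reindex and run cron after enabling it, otherwise existing nodes will not have the new field.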

Hope this helps,
Chris

cpliakas’s picture

Status: Active » Closed (fixed)

This thread only works with the 1.0 API, and is not applicable to 2.0.

BenK’s picture

Status: Closed (fixed) » Active

Hi everyone,

I'm very intrigued by the possibilities of the Search Lucene API module and saw a reference to a contributed "Lucene Extras" module in Comment #4 above. Has the "Lucene Extras" module been released? I'm interested in trying out the faceted tabs feature....

Cheers,
Ben

P.S. I re-opened this issue because of the reference to Comment #4... if you'd rather that I open a new support request, just let me know. :-) Thanks!

cpliakas’s picture

Status: Active » Closed (fixed)

Hi Ben.

Unfortunately the module never got off the ground for a number of reasons. The guy who came up with the cool faceted tabs stuff no longer works with me, but I can try to dig up some old code to point you in the right direction. I am going to close this issue and open a new one at #649458: How do I display facets as tabs?.

Thanks for bringing this up,
Chris

T-MaK’s picture

Hi cpliakas,

How can I display results by content type?

Thanks.