cck.inc file to provide generic CCK support

nedjo - March 29, 2007 - 00:33
Project: Solr
Version: 5.x-1.x-dev
Component: Code
Category: feature request
Priority: normal
Assigned: Unassigned
Status: active
Description

Nice module! Cleanly and simply coded. Ideally we'd make Drupal core search flexible enough to handle external indexers. Until then, this is a very promising start.

I'm particularly interested to see the ease with which we can let Solr/Lucene know about fields to be indexed.

Now, to the feature request. In CCK we have node types with well-defined fields. It would seem fairly straightforward, then, to implement generic CCK support. This would have two parts:

  1. All CCK node types are automatically indexed, passing data on their fields.
    We iterate through the fields in a CCK node type and present the results, something vaguely like:

    <?php
    // Collect the rendered CCK fields and emit one Solr field per entry.
    $output = '';
    $fields = _content_field_view($node);
    foreach ($fields as $name => $field) {
      $output .= '<field name="'. $name .'">'. $field['#value'] .'</field>';
    }
    ?>

  2. Users can access an advanced search form, which presents each of the available searchable fields.

    Here possibly we could get away with using an existing cck function to generate the appropriate form, then alter it (unset the form elements we don't need). Something like:

    <?php
    $form = _content_widget_invoke('form', $node);
    // Unset stuff...
    ?>

    I suppose we would use the has_title and has_body values of each content type to determine whether to offer 'title' and 'body' fields to search.
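
A rough sketch of the second part, under stated assumptions: instead of altering a CCK-generated form, it builds a plain Forms API array from scratch. The function name example_advanced_search_form() is hypothetical, the $type array stands in for a content type's has_title and has_body values, and $fields is an assumed map of field names to labels. In Drupal the labels would normally be wrapped in t().

```php
<?php
/**
 * Hypothetical builder for an advanced search form array.
 *
 * $type:   array with 'has_title' and 'has_body' flags (a stand-in for the
 *          content type definition).
 * $fields: searchable field name => human-readable label.
 */
function example_advanced_search_form(array $type, array $fields) {
  $form = array();
  // Only offer title/body inputs when the content type actually has them.
  if (!empty($type['has_title'])) {
    $form['title'] = array('#type' => 'textfield', '#title' => 'Title');
  }
  if (!empty($type['has_body'])) {
    $form['body'] = array('#type' => 'textfield', '#title' => 'Body');
  }
  // One text input per searchable field.
  foreach ($fields as $name => $label) {
    $form[$name] = array('#type' => 'textfield', '#title' => $label);
  }
  $form['submit'] = array('#type' => 'submit', '#value' => 'Search');
  return $form;
}

$form = example_advanced_search_form(
  array('has_title' => TRUE, 'has_body' => FALSE),
  array('field_price' => 'Price')
);
print_r(array_keys($form));
```

A real implementation could replace the plain text inputs with widgets that match each field, for example a select list for a field populated through a select.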

Thoughts? Does this sound like a good direction? Pointers or pitfalls?

#1

nedjo - March 29, 2007 - 05:26

Thinking about this a bit more, I'm looking at the $node->content array as a possible source for search indexing that would include CCK fields as well as others.

Before passing the node to indexing, we would build its content array:

<?php
$node = node_build_content($node, $teaser, $page);
?>

Then we can iterate through the content array and consider each part a distinct field. Here's a rough idea:

<?php
$fields = array();
foreach (element_children($node->content) as $key) {
  $fields[$key] = $node->content[$key]['#value'];
}
?>

From here we pass the fields to be indexed separately, as I've suggested in this issue: http://drupal.org/node/131999.
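
A minimal sketch of passing the fields separately, assuming each value is a rendered HTML string as produced by the loop above. The function name example_solr_doc_xml() is hypothetical and not part of the Solr module. The markup is stripped and the text escaped before it goes into Solr's XML update format.

```php
<?php
/**
 * Hypothetical helper: render a flat array of field values as a Solr <doc>.
 */
function example_solr_doc_xml(array $fields) {
  $xml = '<doc>';
  foreach ($fields as $name => $value) {
    // Rendered content is HTML: drop the tags, decode entities, and
    // collapse runs of whitespace so only plain text is indexed.
    $text = html_entity_decode(strip_tags((string) $value), ENT_QUOTES, 'UTF-8');
    $text = trim(preg_replace('/\s+/', ' ', $text));
    if ($text === '') {
      // Skip fields that contain no text at all.
      continue;
    }
    $xml .= '<field name="' . htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8') . '">'
      . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</field>';
  }
  return $xml . '</doc>';
}

$fields = array(
  'body' => '<p>Tom &amp; Jerry</p>',
  'field_price' => '<div class="field">12</div>',
  'empty' => '   ',
);
echo '<add>' . example_solr_doc_xml($fields) . '</add>', "\n";
```

Escaping matters here because a raw '&' or '<' in a field value would make the update message invalid XML.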

The advantage of this is that we don't need any CCK-specific code; we rely on existing implementations of node views, in CCK and elsewhere.

Figuring out how to generate an advanced search form, though, would be a lot trickier. This is going to differ by node type (e.g., each node type has a distinct set of fields). The only method we have for determining what keys are in the content array is through generating it for a given node. That is, we don't have a way of saying "what keys are there in the content array in general for node type A?".

So, using the content array probably warrants more consideration, but maybe my first take, above, is the better way to go for CCK, since, unlike relying on the content array, it takes advantage of our full knowledge of CCK field definitions (and could be extended, e.g., to present a select list of options for a field that is populated through a select).

#2

hickory - April 5, 2007 - 15:58

I haven't used CCK much, so I can't really comment on how this would work, but using node_build_content seems to make sense.

Presumably CCK nodes don't have handler modules that could define the search form, but you could maybe build the search form in the settings page by querying across all the nodes of that type, seeing which fields were available and letting the admin choose which ones should be indexed/searchable.
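
One way to sketch the "query across all the nodes of that type" idea, assuming the content arrays have already been built for a sample of nodes. The function name example_collect_content_keys() is hypothetical; the '#' test roughly mirrors the one element_children() applies.

```php
<?php
/**
 * Hypothetical helper: count how often each content key occurs across the
 * already-built content arrays of nodes of one type.
 */
function example_collect_content_keys(array $contents) {
  $counts = array();
  foreach ($contents as $content) {
    foreach (array_keys($content) as $key) {
      // Keys starting with '#' are element properties, not child elements.
      if (is_string($key) && $key !== '' && $key[0] === '#') {
        continue;
      }
      $counts[$key] = isset($counts[$key]) ? $counts[$key] + 1 : 1;
    }
  }
  // Sort by key so the admin screen lists the fields in a stable order.
  ksort($counts);
  return $counts;
}

$contents = array(
  array('#weight' => 0, 'body' => array(), 'field_price' => array()),
  array('body' => array(), 'field_color' => array()),
);
print_r(example_collect_content_keys($contents));
```

The counts could help an admin decide which fields are common enough to be worth indexing.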

#3

hickory - July 24, 2007 - 13:57

This is also complicated by needing to define all the possible fields in Solr's schema.xml before indexing content.

#4

robertDouglass - October 6, 2007 - 12:39

Two things. First, can't we use the dynamic field wildcards (*_i, *_t, *_d) to add fields without touching the Solr config? Second, we could have an admin screen that intelligently prints suggested Solr field definitions for the admin to paste into the file.
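
Both ideas can be sketched in a few lines. Everything here is an assumption made for illustration: the function names are hypothetical, the 'integer', 'text' and 'decimal' kinds are invented labels, *_d is taken to mean a double as in Solr's example schema, and the type names 'sint', 'text' and 'sdouble' must match whatever fieldType definitions the actual schema contains.

```php
<?php
/**
 * Hypothetical helper: append the dynamic-field suffix for a value kind,
 * so the field matches a *_i, *_t or *_d wildcard.
 */
function example_solr_field_name($name, $kind) {
  $suffixes = array('integer' => '_i', 'text' => '_t', 'decimal' => '_d');
  // Unknown kinds fall back to plain text.
  $suffix = isset($suffixes[$kind]) ? $suffixes[$kind] : '_t';
  return $name . $suffix;
}

/**
 * Hypothetical helper: print suggested explicit field definitions for an
 * admin to paste into the Solr configuration.
 *
 * $fields: field name => value kind.
 */
function example_suggest_schema_fields(array $fields) {
  // Assumed Solr type names; adjust to the fieldTypes actually defined.
  $types = array('integer' => 'sint', 'text' => 'text', 'decimal' => 'sdouble');
  $lines = array();
  foreach ($fields as $name => $kind) {
    $type = isset($types[$kind]) ? $types[$kind] : 'text';
    $lines[] = '<field name="' . htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8')
      . '" type="' . $type . '" indexed="true" stored="true"/>';
  }
  return implode("\n", $lines);
}

echo example_solr_field_name('field_price', 'decimal'), "\n";
echo example_suggest_schema_fields(array('field_price' => 'decimal', 'field_color' => 'text')), "\n";
```

The wildcard route needs no configuration change per field, while the explicit definitions keep field names free of suffixes at the cost of a manual paste.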

#5

hickory - October 9, 2007 - 12:48

Using wildcards sounds like a good idea, and producing settings for the configuration file is something I've considered before as well.

#6

robertDouglass - October 10, 2007 - 09:45

I'm working on this in the context of my new solr module. Will share the goodies when ready =)

 
 
