Comments

jpmckinney’s picture

Title: preliminary Drupal 7 port » PATCH - Very preliminary Drupal 7 port
Category: task » feature

Like tjwallace said, this is not a finished port. Many theme functions and queries still need to be rewritten.

The current patch removes _sendRawGet and _sendRawPost from Drupal_Apache_Solr_Service to get search working. We may want to track down how these functions were causing search to fail.

There are at least four TODO/XXX in the current patch that need to be fixed.

-    $document->moderate = $node->moderate;
+    // TODO: $node->moderate has been removed
+    //$document->moderate = $node->moderate;
+    // TODO: use db_insert()
+  // TODO: update to use new API
+  $result = db_query_range("SELECT asn.nid, asn.changed FROM {apachesolr_search_node} asn ". $join_sql ."WHERE (asn.changed > :last_change OR (asn.changed = :last_change AND asn.nid > :last_nid)) AND asn.status = 1 ". $exclude_sql ."ORDER BY asn.changed ASC, asn.nid ASC", 0, $limit, $args);
+      // TODO: use db_select
       $result = db_query("SELECT i.field_name, f.multiple, f.type AS field_type, i.widget_type, i.label, i.type_name AS content_type FROM {content_node_field_instance} i INNER JOIN {content_node_field} f ON i.field_name = f.field_name;");

Here's my review up to apachesolr_search.admin.inc in the patch:

DONE: The .info part of the patch won't apply as the patch is against a downloaded version of the module, rather than against CVS. Also, need to add:

files[] = apachesolr.module

"Index Actions" was changed from a header to a fieldset, as the new FAPI wasn't accepting the old format. Perhaps someone more familiar with the D7 FAPI can restore the header.

   $form['action'] = array(
-    '#value' => '<h3>'. t('Index Actions') .'</h3>',
+    '#type' => 'fieldset',
+    '#title' => t('Index Actions'),
   );

I'm not sure what the appropriate $build_mode argument to node_build_content should be.

-    $node->build_mode = NODE_BUILD_SEARCH_INDEX;
-    $node = node_build_content($node, FALSE, FALSE);
+    $node->build_mode = 'search_index';
+    node_build_content($node);

DONE: Be careful about codings standards http://drupal.org/coding-standards e.g.:

+        db_update('apachesolr_search_node')->condition('nid', $nids, 'IN')->fields(array('changed'=>time()))->execute();

DONE: Make $nid plural here. Also, there's no need for the quotes around :new and :old, I think (occurs elsewhere, too):

+    $nid = db_select('node')->fields('node', array('nid'))->where("type = ':new' OR type = ':old'", array(':new'=>$info->type,':old'=>$info->old_type));

DONE: What happened to $excluded_types? (see SQL injection comment)

-  $total = db_result(db_query("SELECT COUNT(asn.nid) FROM {apachesolr_search_node} asn ". $join_sql ."WHERE asn.status = 1 " . $exclude_sql, $excluded_types));
+  $total = db_query("SELECT COUNT(asn.nid) FROM {apachesolr_search_node} asn ". $join_sql ."WHERE asn.status = 1 " . $exclude_sql)->fetchField();

RETRACTED: Change :last_change to :last_changed (occurs elsewhere):

+  $remaining = db_query("SELECT COUNT(asn.nid) FROM {apachesolr_search_node} asn ". $join_sql ."WHERE (asn.changed > :last_change OR (asn.changed = :last_change AND asn.nid > :last_nid)) AND asn.status = 1 "  . $exclude_sql, $args)->fetchField();

DONE: Don't call time() twice, as it may not be the same time:

-    $time = time();
-    db_query("UPDATE {apachesolr_search_node} SET changed = %d WHERE changed > %d", $time, $time);
+    db_update('apachesolr_search_node')->condition('changed', time(), '>')->fields(array('changed'=>time()))->execute();

SQL injection possibility?

-    $exclude_sql = "AND n.type NOT IN(". db_placeholders($excluded_types, 'varchar') .") ";
-    $args = array_merge($args, $excluded_types);
+    $exclude_sql = "AND n.type NOT IN('" . implode("','", $excluded_types) . "') ";

DONE: Should be removed. Let Solr autocommit:

+        // TODO: is this neccessary?
+        $solr->commit();

DONE: I think join is synonymous in DBTNG with innerJoin, so I prefer the shorter method name:

+    $na = $query->innerJoin('node', 'n', 'n.nid = asn.nid');

DONE: Expand $na into $n_alias (occurs more than once):

+    $na = $query->innerJoin('node', 'n', 'n.nid = asn.nid');

DONE: Respect 80-column limit for comments as in original code:

+  // Make sure no node ends up with a timestamp that's in the future by using time() rather than the node's changed or created timestamp.

hook_node_delete and hook_node_update don't take a second argument. If $set_message is being used, we may need to do some refactoring to keep that functionality:

+function apachesolr_node_delete($node, $set_message = TRUE) {
tjwallace’s picture

Status: Active » Needs work
StatusFileSize
new77.33 KB

Ok I have applied most of jpmckinney's comments. I also have gotten search results to show up now!

pwolanin’s picture

great - I will try to put a version of this into CVS HEAD tonight.

BenK’s picture

Great work... subscribing...

janusman’s picture

Awesome!

davidseth’s picture

I have been keenly waiting for this! Great Work! I will see if I can help out, can anyone give me some pointers as to where they would like some help?

pwolanin’s picture

page callbacks should return structured data - this looks rather wrong:

+    // XXX: bit of a hack
+    $output .= drupal_render(drupal_get_form('apachesolr_index_action_form'));
pwolanin’s picture

some other parts of the patch don't apply:

apachesolr_search.module.rej

***************
*** 167,189 ****
        }
  
        // Collect the search results. See search_data().
-       $content = apachesolr_search_search('search', $keys);
-       if (isset($content) && is_array($content) && count($content)) {
          if (module_hook($type, 'search_page')) {
-           return module_invoke($type, 'search_page', $content);
          }
          else {
-           return drupal_get_form('search_form', NULL, $keys, $type) . theme('search_results', $content, $type);
          }
        }
- 
-       // See search_view().
-       if ($content) {
-         $content = theme('box', t('Search results'), $content);
-       }
-       else {
-         $content = theme('box', t('Your search yielded no results'), variable_get('apachesolr_search_noresults', apachesolr_search_n
oresults()));
-       }
      }
      elseif ($type != 'node') {
        // Ignore $type == node. Since we override the menu path to search to point
--- 177,196 ----
        }
  
        // Collect the search results. See search_data().
+       $results = apachesolr_search_search('search', $keys);
+       if (isset($results) && is_array($results) && count($results)) {
          if (module_hook($type, 'search_page')) {
+           return module_invoke($type, 'search_page', $results);
          }
          else {
+           $build['search_form'] = drupal_get_form('search_form', NULL, $keys, $type);
+           $build['search_results'] = array(
+             '#theme' => 'search_results_listing',
+             '#content' => theme('search_results', array('results' => $results, 'type' => $type)),
+           );
+           return $build;
          }
        }
      }
      elseif ($type != 'node') {
        // Ignore $type == node. Since we override the menu path to search to point

apachesolr.module.rej

***************
*** 1135,1141 ****
      list($module, $filepath, $class) = variable_get('apachesolr_service_class', array('apachesolr', 'Drupal_Apache_Solr_Service.php', 'Drupal_Apache_Solr_Service'));
      include_once(drupal_get_path('module', $module) .'/'. $filepath);
      try {
-       $solr_cache[$host][$port][$path] = new $class($host, $port, $path);
      }
      catch (Exception $e) {
        watchdog('Apache Solr', nl2br(check_plain($e->getMessage())), NULL, WATCHDOG_ERROR);
--- 1135,1144 ----
      list($module, $filepath, $class) = variable_get('apachesolr_service_class', array('apachesolr', 'Drupal_Apache_Solr_Service.php', 'Drupal_Apache_Solr_Service'));
      include_once(drupal_get_path('module', $module) .'/'. $filepath);
      try {
+       $solr = new $class($host, $port, $path);
+       // Set some non-default behaviors.
+       $solr->setCollapseSingleValueArrays(FALSE);
+       $solr_cache[$host][$port][$path] = $solr;
      }
      catch (Exception $e) {
        watchdog('Apache Solr', nl2br(check_plain($e->getMessage())), NULL, WATCHDOG_ERROR);
pwolanin’s picture

committing parts of the patch that apply (with a few tweaks) to CVS HEAD - it's surely pretty broken, but as least then we can all work from a common checkout.

pwolanin’s picture

Title: PATCH - Very preliminary Drupal 7 port » preliminary Drupal 7 port
jpmckinney’s picture

Category: task » feature

Did you sync HEAD with 6.x-2.x before applying the patch? The patch may not have applied cleanly because HEAD was behind the latest 6.x-2.x. EDIT: Nevermind, having checked the CVS log HEAD was indeed synced with 6.x-2.x.

jpmckinney’s picture

Category: feature » task

Checking the CVS log http://drupal.org/project/cvs/204268, it looks like a different patch was applied than the one in this thread: http://drupal.org/cvs?commit=365998 That's fine with me, I'd rather not commit this patch right away.

pwolanin’s picture

@jpmckinney - I started with the patch above, but made a few manual edits. For example, not removing the methods overrides to use drupal_tttp_request() since we need to make that work. Also made some edits to the .info files.

jpmckinney’s picture

@pwolanin When I look at the commit, I only see changes to files within the contrib/ directory. I just checked out HEAD, and I can't find any trace of the above patch. Are you sure you committed it (or some parts of it)?

pwolanin’s picture

Category: feature » task

@jpmckinney - you are totally right - I ran the commit inside the contrib subdir by mistake.

Just committed the rest.

robeano’s picture

This is great! It looks like pwolanin made the rest of the commits just now. Thanks so much for getting the port to D7 jump started.

jpmckinney’s picture

I'm just confirming that the patch applied at http://drupal.org/cvs?commit=366106 differs with the last attached patch in Drupal_Apache_Solr_Service.php and in the .info files. It also differs in some changes to apachesolr_search_view, as cited in the apachesolr_search.module.rej failure above.

@pwolanin I don't see what part of the attached patch above would cause the patch failure for apachesolr.module.rej that you cite.

dmitry_bezer’s picture

StatusFileSize
new81.29 KB

Next step.
includes updates in theming DB access and taxonomy related code

pwolanin’s picture

@jpmckinney - I'm not sure either about what caused the conflict, but that's easy to fix.

@dmitry_bezer Looks like Drupal core is using node_view() instead of node_build_content() for indexing : http://api.drupal.org/api/function/_node_index_node/7 so we sould revist how we are building content for indexing.

Also it would be preferable for all queries to use named placeholder:

-  db_query("UPDATE {apachesolr_search_node} SET changed = %d WHERE nid IN (SELECT nid FROM {og_ancestry})", REQUEST_TIME);
+  db_query("UPDATE {apachesolr_search_node} SET changed = ? WHERE nid IN (SELECT nid FROM {og_ancestry})", array(REQUEST_TIME));

also, db_query must only be used for SELECT - convert to db_update (or db_merge) for any update queries.

however, this looks like good progress, so committing to HEAD so it's available for further refinement.

dmitry_bezer’s picture

Title: PATCH - Very preliminary Drupal 7 port » preliminary Drupal 7 port
Category: feature » task

@pwolanin - thank you for your comments,
So the next thing i'm going to do is change node_build_content() -> node_view() and convert all the queries into new style DB Api calls.
Hope I will post the patch soon

dmitry_bezer’s picture

StatusFileSize
new20.05 KB

* non-select queries converted to DB API calls
* node_view() is used for indexing instead of node_build_content()
* "Use Apache Solr for taxonomy links" option is partially working now

pwolanin’s picture

there is a minor white space formatting issue with:

$sel_query->addExpression( 'GREATEST(n.created, n.changed, COALESCE(c.last_comment_timestamp, 0))', 'changed' );

COALESCE is not listed on http://drupal.org/node/773090 as a supported function, but I don't see an obvious alternative. A quick search suggests that MySQL, postgres, sqlite, MSSQL, and Oracle all support coalesce.

Also, I think ideally we'd use dynamic queries also for any query using JOINs

pwolanin’s picture

committing to HEAD with some white space cleanup. queries with joins need to be fixed still.

pwolanin’s picture

Yes, I think we need the COALESCE, just looking out for DBTNG changes or weirdness - I'll ping Crell about it

dmitry_bezer’s picture

StatusFileSize
new14.47 KB

* queries using joins converted (except ones in cck-related code)
* more taxonomy related fixes
* empty search page fixed
* minor improvements

pwolanin’s picture

I checked with Crell in IRC, and he says that dynamic queries are not required for simple JOINS in SELECT queries.

Some of these changes look like line endings are changing or trailing space added?

-            $content = theme('apachesolr_browse_blocks', array('blocks' => $blocks));
+            $content = theme('apachesolr_browse_blocks', array('blocks' => $blocks)); 
dmitry_bezer’s picture

StatusFileSize
new13.67 KB

line endings fixed

dmitry_bezer’s picture

fyi: I'm currently rewriting apachesolr CCK code into Field API. The patch is coming soon.

pwolanin’s picture

Thanks. committed after removing a stray tab.

dmitry_bezer’s picture

StatusFileSize
new16.28 KB

Initial Field API support implementation. Actually just a reimplementation of old CCK code. Adds facets for fields of 'list', 'list_text' and 'list_number' types. Old CCK code is not removed yet. Any comments are welcome.

pwolanin’s picture

dmitry, can you explain a little more the new code in the patch?

Also, was this deliberate?

-    $menu['search/apachesolr_search'] = $menu['search'];
+    //$menu['search/apachesolr_search'] = $menu['search'];
dmitry_bezer’s picture

StatusFileSize
new24.82 KB

As i said, basically it simply reimplements the old CCK code. So actually there is no new code =) On indexing it adds to the solr document fields of certain types that can only contain predefined set of values. It also adds corresponding blocks you can use to filter your search results. This fields functionality is very similar to how taxonomy is used in search. Actually this adds the same functionality that you have for taxonomy (in terms of search) for any field with predefined values.

menu code: yes, it was deliberate. That line was added by mistake and caused problem with breadcrumbs

This new patch also fixes a minor problem with menu and removes some tab characters and spaces on the line ends.

pwolanin’s picture

Looks like apachesolr_cck_fields() is still used?

pwolanin’s picture

committing this patch - looks like a bunch of the admin paths need more work though. Let me take a quick pass.

pwolanin’s picture

StatusFileSize
new7.63 KB

committing this path fix-up patch.

pwolanin’s picture

StatusFileSize
new4.6 KB

minor changes to settings page UI + CHANGELOG, committed

dmitry_bezer’s picture

thanks Peter,

I realized that i forgot that submodules in apachesolr/contrib the still require some work. will post the patch soon

dmitry_bezer’s picture

StatusFileSize
new26.25 KB

* lots of small fixes and improvements
* apachesolr_commentsearch and apachesolr_nodeaccess were also ported

dmitry_bezer’s picture

Here is what i think should be done next:
* remove old CCK code, it is still there only for reference.
* port the rest of submodules in contrib/ - I tried to port the apachesolr_date but was unable to install the dev version of Date module
* use it, test it, find bugs, submit patches =)

pwolanin’s picture

StatusFileSize
new24.91 KB

re-roll of the patch above with whitespace, etc, cleanup.

Also, reverted a part that's removing a date fix.

pwolanin’s picture

Wow, working with this code is a real PITA (especially the new field API stuff).

As I dig into it, however, it seems clear to me that we should treat taxonomy terms via the Field API and hence remove a variety of the special casing for them.

pwolanin’s picture

StatusFileSize
new48.61 KB

Going to commit this patch - moving in the right direction, though it's still a mess.

pwolanin’s picture

StatusFileSize
new11.93 KB

Another iteration - committing this (not sure it works yet).

das-peter’s picture

StatusFileSize
new1.26 KB

Another patch to get __makeHttpRequest running again. drupal_http_request has changed its parameter structure.

pwolanin’s picture

Thanks, committed that

pwolanin’s picture

StatusFileSize
new14.42 KB

more updates. committing to CVS

das-peter’s picture

StatusFileSize
new1.43 KB

Two syntax issues fixed - introduced with latest commit ;)

das-peter’s picture

Looks like the commend module should be added as dependency in the apachesolr.info.
The method apachesolr_rebuild_index_table() relies on tables created by the comment module.

Another solution would be to check if the comment module is enabled and change the query if necessary.
But then there should be a way to detect if the comment module is enabled, to recreate the index table.

pwolanin’s picture

ok, thanks - this is obviously a work in progress.

comment module shoudl NOT be required - the query was supposed to have been fixed to work in the absence of comment

pwolanin’s picture

StatusFileSize
new23.79 KB

reworking taxonomy so that it's handled as a normal field, fixed various bugs.

committing this to CVS.

pwolanin’s picture

Status: Needs work » Needs review
StatusFileSize
new17.06 KB

Here's the APi change we've all been waiting for - put the params array inside the query object, and use the query object itself to execute searches.

pwolanin’s picture

committed #52 - but still looking for feedback on current progress.

pwolanin’s picture

StatusFileSize
new2.53 KB

Here's a change we should also make in 6.x-2.x: move the OR facet code to the finalize hook.

pwolanin’s picture

oops - that patch was backward. - use -R

committed it to CVS

pwolanin’s picture

StatusFileSize
new6.27 KB

taxonomy fixes. Not really tested yet.

pwolanin’s picture

StatusFileSize
new1.86 KB

#56 was already committed - here are some minor fixes after review with Robert. Committing to CVS.

pwolanin’s picture

Committing this one line fix also:

Index: apachesolr.module
===================================================================
RCS file: /cvs/drupal-contrib/contributions/modules/apachesolr/apachesolr.module,v
retrieving revision 1.38
diff -u -p -r1.38 apachesolr.module
--- apachesolr.module	25 Aug 2010 07:30:42 -0000	1.38
+++ apachesolr.module	25 Aug 2010 09:14:54 -0000
@@ -1839,6 +1839,7 @@ function apachesolr_node_fields() {
         'indexing_callback' => 'apachesolr_term_reference_indexing_callback',
         'index_type' => 'integer',
         'facet_block_callback' => 'apachesolr_search_taxonomy_facet_block',
+        'name_callback' => 'apacehsolr_term_reference_name',
       ),
       'node_reference' => array(
         'display_callback' => 'apachesolr_cck_nodereference_field_callback',
yngens’s picture

thanks for the great job. subscribing

pwolanin’s picture

StatusFileSize
new74.21 KB

Working with Crell - here's a big patch that makes Field API handling more generic, adds a framework for handling settings for multiple Solr servers, and moves the nodeaccess module out of the contrib dir.

pwolanin’s picture

StatusFileSize
new7.42 KB

Fix up server add/edit/delete. Committed this to CVS.

pwolanin’s picture

StatusFileSize
new4.8 KB

fix up requirements checking, add button to test a set of server settings. committing to CVS.

pwolanin’s picture

StatusFileSize
new1.11 KB

oops - indexing code doesn't account for schema changes.

pwolanin’s picture

Status: Needs review » Fixed
StatusFileSize
new10.5 KB

final patch for this issue.

fixes various settings forms and variables, avoids a notice, removes use of md5() in favor of drupal_hash_base64().

committing to CVS.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.