
apachesolr_entity_update() and $status_callback($id, $type) break the node

Project: Apache Solr Search Integration
Version: 6.x-3.x-dev
Component: Code
Category: bug report
Priority: critical
Assigned: Unassigned
Status: closed (fixed)

Issue Summary

During a node insert, the hook_nodeapi() implementation in apachesolr.module calls apachesolr_entity_update().

This function determines the status of the node ($node->status). In this case the variable $entity is an alias of $node, so the status is already available on it. However, $status_callback($id, $type) reloads the node and breaks it in some cases.

I tested this by commenting out the code that looks up the status callback and using this instead:

<?php
$status = TRUE;
$status = $status && $entity->status;
?>

I believe that in Drupal 6 we don't need a separate process to determine the node status.

On my website the callback function apachesolr_index_node_status_callback($entity_id, $entity_type) destroys my articles.

Please take a look.

Attachment: node-status-0.patch (1.28 KB)
Test status: Idle
Test result: PASSED: [[SimpleTest]]: [MySQL] 514 pass(es).

Comments

#1

Status: needs review » needs work

Removing so much code? You are removing the status callback altogether?
Please explain in more detail what the problem is and how you want to fix it.

#2

As you can see, I remove a lot of code, including the status callback. I don't know yet how Drupal 7 handles the status of an entity (or node), but in Drupal 6 it is easy: it's in the $node object.

I figured out that Drupal 6 will always call this function:

function apachesolr_index_node_status_callback($entity_id, $entity_type)

Because hook_nodeapi() is only for nodes.

+++ b/apachesolr.module
@@ -1908,19 +1908,8 @@ function apachesolr_entity_update($entity, $type) {
-    $status_callbacks = apachesolr_entity_get_callback($type, 'status callback', $bundle);

This line looks up the callback function apachesolr_index_node_status_callback().

+++ b/apachesolr.module
@@ -1908,19 +1908,8 @@ function apachesolr_entity_update($entity, $type) {
-          $status = $status && $status_callback($id, $type);

This line calls apachesolr_index_node_status_callback().

So if we only want to determine whether the node status is === 1, we don't need this. In fact, when I removed these lines my nodes started to save correctly, and faster.

The function apachesolr_index_node_status_callback() contains this:

  $node = node_load($entity_id, NULL, TRUE);
  $status = ($node->status == 1 ? 1 : 0);
  return $status;

During the insert, before the node is completely saved, we call this function to check and validate the status. But the status is always there already: it comes in the $node object, and this is only for nodes. I didn't remove the callback for terms or anything else.

For me this change means better performance and a safer way to save a new node or edit an existing one.

#3

I think I'm having the same problem. When I save a new node of a type that's enabled in the Apache Solr Search Configuration (bottom of /admin/settings/apachesolr), all CCK field values are lost. If the node is then edited and saved, it works correctly.

I haven't been able to completely figure out what's happening but it seems like the call to node_load() inside apachesolr_index_node_status_callback() is causing the trouble.

#4

Okay, I updated the weight of the cck module ('content') to -10 so its hook_nodeapi() definitely runs before apachesolr's and that fixed the problem for me. Something to keep in mind.
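
The weight change described above could be sketched as a Drupal 6 update function. This is only a sketch under assumptions: `mymodule` and the update number 6001 are hypothetical placeholders, and the code runs only inside a bootstrapped Drupal 6 site, where `update_sql()` is available.

```php
<?php
/**
 * Hypothetical update function for a custom module's .install file.
 *
 * Lowers the weight of the CCK module ('content') so that its
 * hook_nodeapi() runs before apachesolr's. 'mymodule' and 6001 are
 * placeholders, not names taken from this issue.
 */
function mymodule_update_6001() {
  $ret = array();
  // update_sql() runs the query and returns an array describing the result.
  $ret[] = update_sql("UPDATE {system} SET weight = -10 WHERE name = 'content' AND type = 'module'");
  return $ret;
}
```

Raising the weight of apachesolr instead should have the same effect on ordering; which module to move is a judgment call for the site.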

#5

<?php
/**
 * Status callback for ApacheSolr, for nodes.
 */
function apachesolr_index_node_status_callback($entity_id, $entity_type) {
  // Make sure we have a boolean value.
  // Anything different from 1 becomes zero
  $node = node_load($entity_id, NULL, TRUE);
  $status = ($node->status == 1 ? 1 : 0);
  return $status;
}
?>

I'm trying to understand it here, but it does not make a lot of sense. Can we see if changing this to a direct SQL query resolves the issue?

#6

Drupal.org has this problem too, causing #1825814: Apachesolr 6.x-3.x and CCK cause corrupt field caching.

CCK caches field values on node load, see content_load(). If something triggers a node load before CCK has gotten to saving the values, then CCK will cache empty values.
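
The ordering problem can be illustrated with a toy model. The sketch below is not CCK code: the class `SketchFieldStore`, its methods, and the field name `field_subtitle` are all made up for illustration, and it only assumes a cache that is filled on load and not cleared on insert.

```php
<?php
/**
 * Toy model of a field cache that is filled on load. This is NOT CCK code;
 * the class and method names are made up for illustration.
 */
class SketchFieldStore {
  private $storage = array(); // Values written to the "database".
  private $cache = array();   // Values cached by load().

  // On a cache miss, caches whatever is currently in storage, even nothing.
  public function load($nid) {
    if (!isset($this->cache[$nid])) {
      $this->cache[$nid] = isset($this->storage[$nid]) ? $this->storage[$nid] : array();
    }
    return $this->cache[$nid];
  }

  // Writes the values without clearing the cache.
  public function insert($nid, array $values) {
    $this->storage[$nid] = $values;
  }

  // Drops the cached entry, so the next load() reads storage again.
  public function clearCache($nid) {
    unset($this->cache[$nid]);
  }
}

$store = new SketchFieldStore();

// Bad order: a load happens in the middle of the save, before the insert.
$store->load(1);
$store->insert(1, array('field_subtitle' => 'Hello'));
$stale = $store->load(1);

// Clearing the cache after the insert makes the saved values visible again.
$store->clearCache(1);
$fresh = $store->load(1);
```

In this model `$stale` is served from the cache filled by the early load, while `$fresh` is read from storage after the cache entry is dropped.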

#7

Priority: normal » critical

Updating to critical since a bad field cache looks like data loss, and becomes a likely-permanent data loss if the node is saved.

Ways to solve this:

  • Don't call node_load() in the middle of node save; you won't get a complete node. Rewrite to run later or get the data another way.
  • or, make sure apachesolr's weight in the system table is higher than content's.
  • or, in CCK, have content_insert() call cache_clear_all(), like content_update() and content_delete() do.

#8

I'm in favor of a direct SQL query. It's pluggable enough that someone can easily modify it to their needs. Let me make a patch.
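
A direct query along these lines might look like the following sketch. The function name `mymodule_node_status_sketch` is hypothetical and this is not the committed patch; it assumes a bootstrapped Drupal 6 site, where `db_query()` takes printf-style placeholders and `db_result()` returns the first column of the first row.

```php
<?php
/**
 * Sketch of a node status callback that reads the status column directly.
 *
 * The function name is hypothetical and this is not the committed patch.
 * It assumes Drupal 6, where db_query() takes printf-style placeholders
 * and db_result() returns the first column of the first row.
 */
function mymodule_node_status_sketch($entity_id, $entity_type) {
  // No node_load() here, so no load hooks run and no field cache is filled.
  $status = db_result(db_query("SELECT status FROM {node} WHERE nid = %d", $entity_id));
  // Anything different from 1 (including a missing row) becomes zero.
  return ($status == 1 ? 1 : 0);
}
```

The signature mirrors apachesolr_index_node_status_callback($entity_id, $entity_type), so a function like this could be swapped in wherever the status callback is registered.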

#9

Status: needs work » needs review
Attachment: 1693092-9.patch (632 bytes)
Test status: Idle
Test result: FAILED: [[SimpleTest]]: [MySQL] 518 pass(es), 2 fail(s), and 0 exception(s).

#10

Status: needs review » needs work

The last submitted patch, 1693092-9.patch, failed testing.

#11

Try this.

Attachment: 1693092-10.patch (655 bytes)
Test status: Idle
Test result: PASSED: [[SimpleTest]]: [MySQL] 520 pass(es).

#12

Status: needs work » needs review
Attachment: 1693092-11.patch (651 bytes)
Test status: Idle
Test result: PASSED: [[SimpleTest]]: [MySQL] 520 pass(es).

#13

Yeah, I figured out that I forgot the argument.

#14

Killua, you can mark it as RTBC and I'll commit your patch; it's a little cleaner than mine ;-) But don't forget to confirm first that this actually solves the problem. I think it does.

#15

Yes, I saw that, hehe... and just so you know, I have this patch on my production site.

It's the same, only with the $node object, to keep it, I don't know, more the "Drupal way".

Attachment: 1693092-12.patch (683 bytes)
Test status: Idle
Test result: FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in 1693092-12.patch.

#16

No, that is not necessary. I prefer the patch in #11.

#17

Status: needs review » needs work

The last submitted patch, 1693092-12.patch, failed testing.

#18

Status: needs work » reviewed & tested by the community

I'm actually using a version of this patch in production, so I guess it is fine. It solves tons of conflicts with CCK. For me this is a patch to be committed ASAP.

#19

Status: reviewed & tested by the community » fixed

Committed to 6.x-3.x

#20

Status: fixed » closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

#21

Just in case anyone is trying to find this:

The patch at #11 helped me fix the problem with CCK fields not being saved. At first I did not think this was an ApacheSOLR problem at all. It worked fine for most content types because filefield_paths was clearing the cache after apachesolr caused it to be set. But for a new profile type (from the content_profile module), this caused me a lot of headaches, and after hours of debugging I isolated the problem to ApacheSOLR and not content_profile as I originally thought.

Thanks for the patch!
