Updated: Comment #31
Problem/Motivation
Two things that we should have done and failed to when we recently updated Search module to use plugins instead of hooks, for the Node module:
a) We missed updating the Node entity class's preDelete() method line where it deleted search index information for the node. It needed the "type" parameter updated from 'node' to 'node_search', which is the new name of the node search plugin, when calling search_update_index(). The current code is not removing node search information, so it would accumulate cruft in the search index as nodes were deleted and their information was not removed.
b) We had the node module implementing its own update hook to update the search index instead of using a method on the Node entity class. This is kind of silly/inefficient.
Proposed resolution
a) Fix the Node::preDelete() method call to search_update_index().
b) Replace the node_node_update() hook implementation with code in the Node::postSave() method.
c) Test both update and delete to verify search index information is updated/deleted.
Remaining tasks
Commit the patch, which accomplishes all the tasks.
User interface changes
None.
API changes
None.
Related Issues
(A list of related issues.)
Original report by jhodgdon
In D7, node_delete_multiple did this for each node as it was deleted:
if (module_exists('search')) {
search_reindex($nid, 'node');
}
In the conversion of Node module to use generic entity stuff in D8, this was apparently inadvertently removed.
I think we need to add a hook_node_delete() that will make a similar call to search_reindex() (which still exists and is still the same as it was in D7).
Actually, this could be a good Novice project?
Comments
Comment #1
BerdirI think this should be a Node::postDelete(), just like node_node_update() should be in postSave(), seems strange that node implements it's own hooks when there's a proper place to deal with it...
Comment #2
jhodgdonYeah, that is a good idea. The node module implementing its own hooks is just due to straight porting from the D7 days.
Wait! preDelete already has this:
So this issue is not valid... oh, except it shouldn't be passing in 'node', but 'node_search' now.
So... This issue needs to do the following:
a) Fix 'node' to 'node_search' in the above code in core/modules/node/lib/Drupal/node/Entity/Node.php::preDelete()
b) Do something similar to move the node_node_update() function currently in node.module into the postSave() method.
Comment #3
slv_ CreditAttribution: slv_ commentedCatching up with core contrib, I'll give this a go!
Comment #4
slv_ CreditAttribution: slv_ commentedPatch attached.
I replaced the call to search_reindex with node_reindex_node_search, so that the class is decoupled from the search module, and the logic goes through the wrapper in node.module instead.
Also, maybe we should move the reindex from preDelete() to postDelete() method? It kind of seems more logical to me, but thought I'd ask first. Happy to re-roll if you think that'd be better.
Comment #6
jhodgdonActually, search_reindex() is not the same as node_reindex_node_search(). The search_reindex() function actually removes information from the index (despite its name), and the node function just marks stuff as "this needs to be reindexed".
So in the delete method, we really just need the 'node' to be changed to 'node_search' in the call to search_reindex().
Thanks!
Comment #7
BerdirYou need to wrap this in a if (!$update) so that the comment is still true.
You can ignore the test errors, those are just a testbot fluke, happens sometimes. Not going to re-test and this needs to be fixed anyway.
Comment #8
slv_ CreditAttribution: slv_ commentedRight, this should do then.
jhodgdon, thanks for that. I just got fooled by something and overlooked what the function was actually doing.
Berdir, sorted that in this patch, but I might be missing something. So, if we're reindexing when the node is updating, shouldn't that be done with an "if ($update)", instead of an "if (!$update)".
Also, I replaced preDelete() with postDelete(), as this behavior makes more sense to happen *after* the nodes has been deleted. Just realized this was mentioned in #1 too.
Thanks!
Comment #9
slv_ CreditAttribution: slv_ commentedComment #10
BerdirHm, I think I was wrong twice :)
Yes, if ($update) makes more sense. No idea what I was thinking. Also wondering why no tests failed. Really seems that we should have some better test coverage there?
And preDelete() makes more sense when you think about it in a data consistency way, we need to remove references to something before we remove it.
Comment #11
jhodgdonUmmm
bool $update: TRUE if the entity has been updated, or FALSE if it has been inserted.
So this seems to be backwards in the postSave() function:
I agree with the other changes here.
Regarding tests... Here's a thought for a combined test for updating and deleting... It would:
a) Create a node.
b) Index it for search.
c) Search and verify it appears in search results.
d) Update the node.
e) Run the indexer again so it gets updated in the index.
f) Search and verify new text appears in search results.
g) Using a raw query, find the information in the search index tables.
h) Delete the node. Do not run any reindexing process.
i) Run the same query as (g) and verify the node is gone. Also try a search and verify it doesn't appear.
For whoever is writing the test: you can look at the existing Search module tests for how to do all of this, and given that all the other node-search-related tests are in the Search module, and that you'll probably want to extend SearchTestBase, it makes sense for this one to go there too.
When making a test like this for a bug, it is very helpful to make a patch containing just the test and upload it so we can verify the bug is properly tested, and then a patch with the test plus the fix so we can verify that the test passes.
slv_: You're doing great on this so far! Want to write some tests too? If not, you can gracefully bow out and someone else can do the tests. Thanks!
Comment #12
slv_ CreditAttribution: slv_ commentedRight, patch attached with if ($update) this time.
I haven't put the postDelete code back into preDelete() though. I was thinking about it in the way that if for some reason the process gets interrupted, the node might not be deleted, but it might have been removed from the search index, if doing it in preDelete().
The other way around, in postDelete, the worst case is that the node still appears on the index, if something fails after the node has been deleted. I assume it would be removed in a general reindex then.
It makes sense in both ways I'd say, so I'm fine with anything really. Happy to re-roll and put it back in preDelete().
Jhodgdon, happy to write tests, too.
Comment #13
slv_ CreditAttribution: slv_ commentedTests attached.
I've added 2 tests instead of one, as it seems a bit cleaner what each of them does, and we're testing different behaviors anyways (updating index info vs removing it on node delete).
It should fail, as we haven't pushed the actual fix yet.
Patch with fix + tests in next post.
Comment #14
slv_ CreditAttribution: slv_ commentedAnd... full patch.
Comment #16
slv_ CreditAttribution: slv_ commentedRight, never committing again a patch made after "git add -N". It doesn't quite work.
Attaching the full patch again (test + fix). Hope that's ok.
Comment #17
Berdirwhat you said about preDelete() isn't a problem, the whole thing is in a db transaction and rolled back in case of an error. preDelete() is correct :)
Also, you should provide a test-only.patch that only contains the test, to make sure that it covers the bug.
Comment #18
slv_ CreditAttribution: slv_ commentedk. Thought the search_index and the actual node removal were in *different* transactions. That works for me then =).
I'll commit the test apart too. I did in #13 in fact, but failed because of what I mention in #16 and then Just committed the whole thing.
Thanks!
Comment #19
slv_ CreditAttribution: slv_ commentedRight. Tests attached, without fix.
Comment #20
slv_ CreditAttribution: slv_ commentedTests patch attached.
Comment #21
slv_ CreditAttribution: slv_ commentedAnd the whole lot, again.
Comment #22
slv_ CreditAttribution: slv_ commentedSomething happened on #19. It didn't come up for me, so I tried again. #19 and #20 are the same therefore.
Comment #23
penyaskitoI'm not sure if it is a good idea to add an if condition in a test. It should be indexed, so let's assert that it is found and then assert that it's gone after delete.
For every patch, you should create an interdiff for making reviews easier. In the case of bugs like this, also uploading a patch only with the test (should be red then) for assuring that the bug is well-tested. This will also help reviewers for getting this in. Thanks for sprinting!
Comment #24
slv_ CreditAttribution: slv_ commentedThanks for the feedback. Will get rid of the if / else, as the same can be done just with asserts anyways, and add an interdiff.
The patch with just the test is already added in #19 and #20, that's why it fails.
Comment #25
slv_ CreditAttribution: slv_ commentedRight. The standalone tests attached. (No interdiff, that'll be added on the full patch).
Comment #26
slv_ CreditAttribution: slv_ commentedFull patch + interdiff.
Comment #27
jhodgdonThanks for the patches! This is looking good, assuming the tests pass/fail as appropriate. Note that we expect the Update parts of the tests to pass without the patch (this patch is moving that from a hook to a method, but it was most likely working before), but the Delete parts of the tests should fail without the patch.
A documentation/naming nitpick:
This is not grammatically right or very accurate. How about:
Tests that on node update/delete, search index information is updated.
The test method names seem a bit overly verbose too:
testSearchInfoIsUpdatedOnNodeChange(
could just be
testSearchIndexUpdateOnNodeChange
etc.
Comment #28
slv_ CreditAttribution: slv_ commentedWill polish that. Thanks Jennifer. And yes, only the delete test failed, the update one was fine as expected.
Comment #29
slv_ CreditAttribution: slv_ commentedRight. So... standalone test again.
Comment #30
slv_ CreditAttribution: slv_ commentedFull patch + interdiff.
Comment #31
BerdirOk, I think this is good to go :)
Comment #32
jhodgdonI just updated the issue summary and am also updating the title. Ready to go!
Comment #33
slv_ CreditAttribution: slv_ commented#30: 2098015-search-index-update-on-node-changes-tests-and-fix-30.patch queued for re-testing.
Still not committed it seems, so just re-queueing to check all is still fine.
Comment #34
Xano#30: 2098015-search-index-update-on-node-changes-tests-and-fix-30.patch queued for re-testing.
Comment #35
alexpottNice work! Committed 00780f2 and pushed to 8.x. Thanks!
Does this need backporting to D7?
Comment #36
slv_ CreditAttribution: slv_ commentedThanks Alex! As stated in the issue summary by jhodgdon, it doesn't need backport, as it's something that was removed inadvertedly just in D8.
Comment #37
jhodgdonAgreed. In node_delete_multiple() in D7, we have:
which takes care of the "remove search info" part, and there are no methods to use instead of hooks in D7.
Comment #38.0
(not verified) CreditAttribution: commentedAdd an issue summary