Problem/Motivation
It's possible for comments to develop bad references: a reference to a parent comment, host entity or author that is not valid (e.g. deleted). This shouldn't ever happen in theory, but experience has shown that in practice due to bugs or data migrations it can (see e.g. #2906470: Orphan comments and entries in comment_entity_statistics after comment field instance has been deleted).
If a reference is invalid, this can lead to fatal errors when rendering comments.
For example, if a parent comment has somehow disappeared then the Drupal standard installation gives a "Fatal error: Call to a member function url() on null in /Volumes/SSD2/www/8/core/modules/rdf/rdf.module on line ..." when that comment gets loaded.
Proposed resolution
1) Create a Kernel test that creates a comment with invalid references and renders it.
2) Fix whatever needs fixing for the rendering to happen without error. These are checking for NULL returns from getParentComment in template_preprocess_comment. Checking for NULL returns from getCommentedEntity in CommentLazyBuilders::renderLinks, CommentViewBuilder::getBuildDefaults, and CommentViewBuilder::buildComponents. And finally checking that the user entity exists in Comment::getAuthorName.
Remaining tasks
* review
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#120 | 2614720-120.patch | 8.63 KB | quietone |
#120 | interdiff-113-120.txt | 594 bytes | quietone |
#113 | 2614720-113-8.9.patch | 8.64 KB | quietone |
#113 | interdiff-103-113.txt | 540 bytes | quietone |
#105 | 2614720-105.patch | 8.64 KB | quietone |
Comments
Comment #2
roderikStatus change in separate comment, to add attribution...
Comment #3
roderikAdding the fix helps, though. (See, now I'm all confused by these new fields...)
Comment #4
roderikPriority major at least.
Comment #5
dawehnerNote: you can save a line here with :
if ($entity = $comment->getCommentedEntity()) {
and
if ($parent = $comment->getParentComment()) {
Comment #8
roderik@dawehner: Oh ok, I thought we were explicitly not doing that. Then I either misremembered or it's changed since.... 2012?....
Also: I think I fixed the test to pass/fail properly. Interdiffs inline, to not confuse them with the double patches:
For #3:
Fixed test:
Comment #9
roderikFail, test, fail!
Comment #12
roderikpass, test, pass!
Comment #14
larowlanThanks, although i don't think this qualifies as major, the test is pretty contrived, you're adding a storage handler method to deliberately corrupt the comment. Is there a way to trigger this from standard install, from the ui?
Comment #15
roderikI don't think there is a way to trigger this from standard install. And if migrate doesn't import orphaned comments, then you're theoretically safe. My reasoning for putting it major is just
Plus that there's a fix which is fairly trivial at least for D8 core. (Admittedly that shouldn't affect severity.)
If I misunderstood, or there's too little time & there's pressure to declassify things, go ahead.
---
(In case you're curious how I got to this: I'm just porting some D7 code which clones site content from live to new-dev site and scrubs that content. That D7 code says
...and I was wondering how to port these safety measures. (Dropping these is not an option, I think.) Then I looked and found out D8 doesn't even load these comments without crashing.)
Comment #16
roderikHave the test clean up after itself; don't leak configuration changes to... whereever.
Comment #18
catchI think an exception is the right thing here, otherwise there'd be no way to know that you have orphaned comments.
If we really don't want to throw the exception though, we should at least trigger_error() with E_USER_WARNING.
Comment #19
roderikThis has grown bigger than I hoped.
1)
In response to #18: I think we only want to throw exceptions in places where they are caught. The basic reason for this ticket is to make Drupal not crash unexpectedly.
And I'm at a loss on how to do this. (Note: beginner-intermediary D8 core person.) I agree with warning or logging, but if I do a trigger_error() with E_USER_WARNING., tests will fail. And the intention is to write a test that ensures Drupal stays working (not crashing) with orphaned comments -- even though a E_USER_WARNING is triggered.
Then I thought "maybe I can find a logical place to just emit a log message with severity 'warning' instead". Like in CommentStorage::load(). But... nothing I look at, seems to have a logger-anything injected (or a reachable container object). So... I must be missing a mechanism here.
2)
I don't know why I didn't catch this before... my test successfully loaded and deleted comments, but rendering a node page with a comment would still result in uncaught exceptions. Not good either.
So I tested that too and fixed corresponding fatal errors.
Then I rendered a single comment where the commented entity does not exist anymore, and things really got nasty. Fixed all the ensuing fatals there too... though I admit that there's no practical application for rendering a single comment in this case. If you want me to back those changes out, I will.
3)
This also contains a fix in WebTestBase::drupalBuildEntityView(); a call signature (apparently someone still had the old signature in their head when contributing #2273277: Figure out a solution for the problematic interaction between the render system and the theme system when using #pre_render and this call was never executed until now). Could have opened a different issue, but meh.
Comment #20
roderikComment #24
alexpottDiscussed with @xjm, @cilefen, @amateescu, @fago, @berdir and @plach. We agreed that this is not major because there's no clear path to orphaned comments. @roderik is this a theoretical issue or did it happen on a real site for you?
Comment #27
KlemenDEV CreditAttribution: KlemenDEV as a volunteer commentedI can confirm this is not a theoretical issue. This is an actual problem. I migrated a website to Drupal 8 from Drupal 7 and there are some orphaned comments without parent comments and there are many pages that fail with a fatal error and 500 response code. There are many cases in which this can happen.
I had to modify core modules in order to make website usable. I think this is a major issue for websites that experience similar issues as we had.
Comment #30
quietone CreditAttribution: quietone at Acro Commerce commentedThis looks like this is test of rolling back a full migration using the UI to fail in #16 at #2687849: Add back rollbacks on migrate_drupal_ui. Since this is blocking a Major, changing to Major and tagging as blocker.
The error is
Comment #31
quietone CreditAttribution: quietone at Acro Commerce commentedComment #32
ofry CreditAttribution: ofry as a volunteer and commentedComment #33
ofry CreditAttribution: ofry as a volunteer and commentedComment #35
Dylan Donkersgoed CreditAttribution: Dylan Donkersgoed at Northern Commerce commentedAnother reroll here. This is actually a reroll of 19 again as I couldn't get 33 to apply to any commit from 8.8.x, even one that was identical (aside from commit hash) to one that worked for 8.6.x. Not sure what I was doing wrong. This reroll required some manual adjustments as well to make it compatible with the current source. I have tested it and it still seems to fix the issue at least.
Comment #36
Dylan Donkersgoed CreditAttribution: Dylan Donkersgoed at Northern Commerce commentedDisregard that last patch - I generated it before committing a fix for a typo which causes a fatal error. Here's a working one.
Comment #37
mohit_aghera CreditAttribution: mohit_aghera as a volunteer and at Axelerant commentedComment #41
quietone CreditAttribution: quietone as a volunteer commentedTriaging this as part of the Bug Smash Initiative.
I followed the steps to reproduce that are provided by maisplia in the IS of the related issue, #2906470: Orphan comments and entries in comment_entity_statistics after comment field instance has been deleted and can confirm that it is possible to create orphaned comments via the UI. I began with a minimal install, 9.1.x, and created three comments on a custom node type, 1 by uid 1 and the other two by an authenticated user. Then deleted the comment field instance. Looking in the db there are 3 entries in comment, comment__comment_body and comment_field-data and 1 row in comment_entity_statistics. After the deletion the page loaded without a visible error for either user. There were no error message at /admin/reports/dblog either.
This issue was to fix fatal errors when loading orphaned comments and my test was not able to generate errors. Therefor, I believe this is fixed in 9.1.x.
Comment #42
quietone CreditAttribution: quietone as a volunteer commentedRepeated the same test as above with 8.8.x and 8.9.x . Again no errors. So, this has been fixed.
In the related issue #2906470: Orphan comments and entries in comment_entity_statistics after comment field instance has been deleted masipila did not mention errors when reloading pages with orphan comments. And I am sure he would have - he is very thorough. That was created in Sep 2017 so I suspect that this has been fixed for a while, by what issue I don't know.
Because it is close to bedtime I am not closing this now. I will come back tomorrow afternoon and review.
Comment #43
quietone CreditAttribution: quietone as a volunteer commentedAh, another day and I realized I had not tested with nested comments, which is what the IS clearly says. I completely missed it! Yes, deleting a parent comment does cause fatal errors. Sorry for the noise everyone.
And while here I decided to reroll.
Comment #45
larowlanthanks for reviving this!
this should use [] syntax, shows how old this patch is 😂
🧐 nit: we could break in the previous if and avoid an else here
similarly here
Should we return access denied if the commented entity is gone?
same here re [] vs array()
won't the call to ->getEntityTypeId here fail if $commented_entity is falsey? ie we need to check $commented_entity before we check the entity type, not after
Comment #46
Deepak Goyal CreditAttribution: Deepak Goyal at Srijan | A Material+ Company for Drupal India Association commentedComment #47
Deepak Goyal CreditAttribution: Deepak Goyal at Srijan | A Material+ Company for Drupal India Association commentedHi @larowlan
Made changes as you suggested please review.
#45.4 is still pending I am no getting this.
Comment #48
quietone CreditAttribution: quietone as a volunteer commentedSetting to NR for the testbot.
Comment #50
quietone CreditAttribution: quietone as a volunteer commentedBeen dabbling on this for the past week and the other day I finally realized why I am not making progress. The short answer is that when I run any Functional Comment test that uses \Drupal\Tests\comment\Functional\CommentTestBase::postComment the test fails on this line
$this->assertFieldByName('op', t('Save'), 'Save button found.');
.I am working on HEAD, on a fresh minimal install.
Comment #51
quietone CreditAttribution: quietone as a volunteer commented45.3 There was another case where this could be applied, the 'parent' case.
45.4. That makes sense here but not yet changed. But is it something for a followup so an admin can delete orphans?
Instead of checking after each return of buildLinks this patch changes buildLinks to return an empty array if the commented_entity does not exist.
Many changes to the new test and there are more changes here converting array() to [] and other coding standards. So sorry for the noise.
Comment #53
quietone CreditAttribution: quietone as a volunteer commentedThere are changes in comment, getAuthorName and getAuthorEmail that seem totally unrelated to orphans. I could be totally wrong but I want to see what test failures this causes. I would do this locally but there is something about this patch that plays havoc with my setup making this practically impossible to work on.
Comment #54
quietone CreditAttribution: quietone as a volunteer commentedComment #55
quietone CreditAttribution: quietone as a volunteer commentedSimilarly with the changes to access logic. The problem is not the access to the comment but that it is an orphan. Let's see what happens if the changes to CommentAccessControlHandler.php are removed.
Comment #56
quietone CreditAttribution: quietone as a volunteer commentedOK, trimmed this back to the bare minimum and with a fail patch. This removes the changes to files, CommentViewBuilder.php, CommentInterface.php, comment.module and comment.tokens.inc.
Comment #70
quietone CreditAttribution: quietone as a volunteer commentedThanks to pameeela another duplicate has been found, #2565247: Fatal error: Call to a member function url() on null. Added as related and adding contributors
Comment #71
quietone CreditAttribution: quietone as a volunteer commentedThe duplicate #2565247: Fatal error: Call to a member function url() on null has a slightly different change to rdf.module and it needs to be checked which change is preferred.
Comment #72
quietone CreditAttribution: quietone as a volunteer commentedFound what looks like another duplicate of this issue #3044158: Orphaned comments Error: Call to a member function getOwner().
That has a one line change to hasParentComment() and no test. Checking in hasParentComment seems like a better place to check for an orphan, so, let's remove the fix here and test with that fix.
Comment #73
quietone CreditAttribution: quietone as a volunteer commentedHaha, There is no need to change Comment at all. The change in rdf.module will allow the test to pass.
Comment #74
benjifisherI am reviewing this today, as part of DrupalCon Global.
Comment #75
benjifisherLooking at the table of patches, I see that the early ones were about 5 KB, then there were a few around 20 KB, and now we are down to 7.64 KB, most of which is tests. Good job!
I have not tested manually, but the fix is simple and looks right. Most or all of my suggestions are related to comments.
The patch to the
rdf
module is small. Here is the whole thing:That comment applies to
toString(TRUE)
both places that it is used. Putting the comment inside theif
block makes it look as though it only applies to the first one.The rest of this point is out of scope for this issue, unless it is worth adding a follow-up issue and an
@see
comment. The combinationtoString(TRUE)->getGeneratedUrl()
asks for the bubbleable metadata and then throws it away. It was added, along with the comment, in #2631774: Impossible to update Comment entity with REST (HTTP PATCH): bundle field not allowed to be updated, but EntityNormalizer::denormalize() requires it. It was discussed in Comment 39 of that issue and following comments, so it did not slip in by accident. It is also mentioned in the Proposed resolution section of the issue summary.I find this very confusing. If we do not ask for the bubbleable metadata and then throw it away, how does it "leak"? Does it somehow get "used up" when we ask for it? Is it really
getCommentedEntity()
andgetParentComment()
that are creating this unwanted cache metadata?According to Comment 40 on that issue, #2626298: REST module must cache only responses to GET requests independently fixes this metadata problem. Maybe we no longer need this ugly work-around (IMO).
This is a really nice explanation. I commend you for not saying "because **** happens". I just want to polish it a bit.
Why CommentOrphanedTest instead of OrphanedCommentTest?
Consider shortening "will officially never" to "should never". Maybe switch the order of the two sentences.
Consider "Experience" instead of "Practice". Start a new sentence instead of using a semicolon. I think it should be "such code" instead of "that code".
Why install
comment_test
? (Answer: because it definesCommentTestSqlEntityStorage
.)The comment makes it look as if all four of those modules implement
hook_comment_storage_load()
. Maybe something like "Install node, comment, comment_test, and all modules that implement hook_comment_storage_load()".My personal preference is to use the imperative (as here, "Test"), but isn’t the Drupal standard to use the present tense ("Tests")?
If you prefer to use
service()
all the time, that is fine, but we could useDrupal::entityTypeManager()
instead:Thanks for this comment!
I would rather exit early if
$storage
does not implement the interface.Is this any different from testing
$storage instanceof Drupal\Core\Entity\Sql\SqlEntityStorageInterface
? If we had ause
statement, then we could simplify this to$storage instanceof SqlEntityStorageInterface
.Do we have to navigate away from
$this->node
before reloading it? Are we sure that$this->node->id()
is not2
?Out of scope: do the Comment tests do this enough that it is worth adding a
drupalGetNode()
helper method? (I count 21 times, including this patch.)This line needs a comment: we are now invalidating the entity field, having previously invalidated the user and parent.
But
orphanComments()
does not return anything, so now$ids
isNULL
.Since
$ids
isNULL
, this loads all comments, right? Is that the intention?Either use something more specific than
$entities
, like$comments
, or else eliminate the intermediate variable.It is not semi random: it invalidates the most recent comment (max
cid
), and we rely on that.The first time we call this, we provide
[]
as the first argument. I would say "If empty" instead of "If not provided".Comment #76
benjifisherP.S.
The class comment in #75.2: "An orphan comment is comment with ...". Add "a" before or after the line break.
Comment #77
quietone CreditAttribution: quietone as a volunteer commented@benjifisher, thanks for the review.
Now that the unnecessary changes have been removed it is easy to see that this patch fixes rdf_comment_storage_load so that is handles NULL returns from getCommentedEntity and getParentComment. The interface documents that those methods can return NULL so rdf should be doing that. Therefor I made a decision to move this patch to #2978320: [backport] rdf comment storage load should not load NULL comments and continue work there. And I started that but didn't get back here to leave a comment until now.
However, there is still a need to make sure that loading an orphaned comment doesn't cause a fatal. Exploring that can continue here. For example, are there other cases where getCommentedEntity and getParentComment do not properly handle NULL?
And now what about the review in #75 and #76? To make reviews easier I made the recommended changes for the points in #75 and #76. If the tests pass this patch needs to move to #2978320: [backport] rdf comment storage load should not load NULL comments. Then, I think this needs to be postponed, until that issue is committed.
Comment #78
samiullah CreditAttribution: samiullah at Salsa Digital commentedAfter removing the comment field and navigating to the comment, we saw website encountered error
PHP error is seen in dblog as well
Comment #79
benjifisher@samiullah:
Can you describe more explicitly what you did and what the results were? I can guess what you mean by "the comment field" that you removed, but I would rather not have to guess. What error messages did you get? (If you paste the error message into a comment, then others who run into the same problem will be able to find this issue with their favorite search engine.)
I think what you are describing is a different bug, but I am not sure. If it is different, then we should have a separate issue for it.
@quietone:
If you want to move the patch and the discussion to the other issue, that is fine with me. I like the way you addressed #75.6. Several other points from that review look as if they have not been addressed (3, 4, 8, 9).
Comment #83
quietone CreditAttribution: quietone as a volunteer commentedClosed #3044158: Orphaned comments Error: Call to a member function getOwner() as a duplicate, adding credit for HiMyNameIsSeb, dionsj, and mgp_novicell. Since this is still a work in progress here is the fix suggested from that issue.
Comment #84
quietone CreditAttribution: quietone as a volunteer commentedI haven't really been happy with this and today and spent time reworking the test. The new and improved version creates two comments and the second one is orphaned. It uses a data provider that changes one of the properties (entity_id, cid, or pid) of comment 2 so that it is orphaned. Then, when running the test caused failures each failure was fixed in turn. So, now I am sure there are no out of scope changes.
Still to do in #79, the final paragraph.
Comment #86
quietone CreditAttribution: quietone as a volunteer commentedFixing tests and fix coding standards.
Comment #87
quietone CreditAttribution: quietone as a volunteer commented#75
3. comment_test does not need to be installed, this just uses the test storage class. So, maybe that file needs to move somewhere else.
4. Comment changes and uses 'Tests'.
8. That bit of code reworked. It now loads the two comments created in the test individually.
9. Forgot to do this one.
Setting NW for 9 and maybe 3.
Comment #88
quietone CreditAttribution: quietone as a volunteer commentedThis address #75.9. The comment storage test class is now in the new test, CommentOrphanedTest, because that is the only place it is used. The comments in that class are updated.
Comment #89
quietone CreditAttribution: quietone as a volunteer commentedUploaded the wrong patch - ignore #88.
This address #75.9. The comment storage test class is now in the new test, CommentOrphanedTest, because that is the only place it is used. The comments in that class are updated.
Note that this patch includes the fix from #3163924: buildEntityView calls resetCache with an array of ids not entities.
Comment #90
quietone CreditAttribution: quietone as a volunteer commentedActually, the comment storage test class is also used in #2978320: [backport] rdf comment storage load should not load NULL comments, so moved it outside the CommentOrphanedTest.php
Comment #91
quietone CreditAttribution: quietone as a volunteer commentedActually, I just rewrote the test in #2978320: [backport] rdf comment storage load should not load NULL comments and it no longer uses comment storage.
Still, this is ready for review.
Comment #92
jonathanshawNit: I would fine is_null($commented_entity) a little easier to follow.
This variable is very confusing. It sounds like it's an entity type id, but actually it's either boolean or NULL. Could it be simple a boolean called $is_a_node?
Presumably this isn't mean to be in the patch.
Comment #93
quietone CreditAttribution: quietone as a volunteer commented@jonathanshaw, thanks.
1. Fixed.
2. Fixed, the variable is now $is_node.
3. Ah, yes that is the fix from #3163924: buildEntityView calls resetCache with an array of ids not entities, let's try without that. That requires changing the call to buildEntityView. Made a fail patch to be sure this change works. The file 'interdiff.txt' is between the fail patch and the patch.
Comment #95
jonathanshawYou've done a fantastic job of sticking with this unglamorous issue for a long time, quietone. Sorry to raise doubt when you're so close to the end ...
I don't see that this needs to be a functional test. The UI seems to be only being used to set up the comments, and that can be better done via the API. The actual assertions are all via the API. Kernel tests are so much faster, and robust to UI changes.
Unimportant nit: Do we need to commit to the ids? Could we just use loadMultiple() to get all comments?
I'm not sure you need this; isolation between scenarios of a data provider is pretty good.
If you make the test a kernel test, you may not even need the complexity of this storage handler. It's usually possible to make invalid data with dysfunctional references via the entity API, validation is opt-in.
I'm not sure we've really dealt with this comment of @catch in #18. Would it make sense to log a warning if a comment has a bad reference?
Comment #96
quietone CreditAttribution: quietone as a volunteer commented@jonathanshaw, thanks again.
95.
1. The reason this is a functional test is because a) I was working from what was here, b) there has been no suggestion in this issue that the test should be a kernel test and c) I don't know how to use the Entity API to make orphan comments. If that is the desired direction here then either a) I need to find support for learning how to do that or b) someone else will have to finish this.
2-5. TODO when a decision is made on the type of test.
6. The IS was updated in #41 on 23 June 2020. Can you be more specific about what needs to be updated?
Edit: improvements
Comment #97
jonathanshawUpdated IS based on my understanding of your work, hope I got it right.
Comment #98
jonathanshawYou're right, it's not that simple. Comment:preSave() wille error if you try to save a comment with a reference to an invalid parent via the entity API. You do need to tickle the database directly like the orphanComments method does.
But I don't see why the orphanComments method needs to be on a storage handler rather than just part of the test case; I can't see that getting committed as is.
Maybe this would this work in a kernel test:
Comment #99
quietone CreditAttribution: quietone as a volunteer commented@jonathanshaw, thanks you. After I thought about it some more I was thinking of doing the same thing you suggest.
This patch removes the functional test and the test storage class and replaces it with a Kernel Test. Because this is a new test a fail patch is added.
Comment #102
quietone CreditAttribution: quietone as a volunteer commentedNamespace error.
Comment #104
jonathanshawNice.
The signature and logic of this method are now significantly more complex than they need to be. We either need to make a strong case for why this is desirable, or simplify it.
I suggest that in CommentViewBuilder we check for invalid references and simply log a warning if one is found. I don't think this needs test coverage, that seems like overkill.
Otherwise RTBC.
Comment #105
quietone CreditAttribution: quietone as a volunteer commented1. Simplified it so much that the helper method no longer exists.
2. Ah, where to add this, exactly? What is the message?
Comment #106
jonathanshawThat interdiff made me chuckle.
Regarding #18, I've become uncomfortable with the precedent. If we systematically tested for valid references every time we worked with an entity reference and logged invalids, we'd add lines of code to thousands of places all over Drupal. If people wanted to be warned about invalid entities on their site, it would be easy to make a module for that.
Comment #108
catchOK #106 is a good point. It's not really different from other entity references where we're quite permissive.
Committed 9beb71e and pushed to 9.1.x. Cherry-picked to 9.0.x and 8.9.x, Thanks!
Comment #112
xjmThis broke HEAD on 8.9.x:
https://www.drupal.org/pift-ci-job/1803301
The
$modules
test property was made protected in D9, but should be public for D8.Comment #113
quietone CreditAttribution: quietone as a volunteer commented#112. Here is an 8.9.x patch.
Comment #114
jonathanshawComment #115
jonathanshawSorry for noise, I requeued for 8.9 instead.
Comment #117
catchWhoops thanks everyone. 8.9.x is green and a one word change, so committed/pushed to there now.
Comment #118
catchNope... https://www.drupal.org/pift-ci-job/1803553
Comment #120
quietone CreditAttribution: quietone as a volunteer commentedRemove return type declaration.
Comment #121
longwaveThe void return type thing is unfortunate, never easy to remember when doing backports unless you explicitly test on PHP 7.0.
Comment #122
larowlanCommitted 78cbebb and pushed to 8.9.x. Thanks!