Problem/Motivation
In a project we discovered a potential bug regarding file translations:
* 3 languages: german, french and italian
* base language german ( same error with english base language )
Steps to reproduce :
* enable all language modules
* make project 3 lingual
* create a content type with a file field ( not image )
* make this content type translatable ( including the file )
* create a german base one and translate it into french ( also replace the file )
* translate it also into italian ( also replacing the file )
* edit the german node again and you see that the german file is gone
Discoveries :
* While the initial translation into french works, the file will be removed from the file_usage table. The status of the first file in file_managed will be set to 0
* The french translation file will be created using the german langcode "de" in the file_managed table ( perhaps thats correct but its strange )
* The translation into italian then even deletes the german file in the file_managed table
* If you add the file again to the german node after it got deleted it works like expected.
* Additional problem: after adding one translation and then deleting that again, the file you uploaded from the translation is not removed from the file_managed and file_usage table, but it is from the file system ..
There is data loss in this issue.
Proposed resolution
This makes two changes to fix the mentioned bugs:
* The FileFieldItemList class should never look at anything but the current translation. There are only old values if the translation already existed.
* We have to call delete() on field items that are part of deleted translations, so that they work as expected. We've already seen this problem in other issues, e.g. #2539634: PathItem::delete() never runs because the path field type is a computed field in disguise
Remaining tasks
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#18 | 2639352-18.patch | 8.18 KB | swentel |
#18 | interdiff.txt | 836 bytes | swentel |
#13 | 2639352-13.patch | 8.15 KB | swentel |
#13 | interdiff-2639352.txt | 800 bytes | swentel |
#5 | 2639352-5.patch | 5.65 KB | swentel |
Comments
Comment #2
cilefen CreditAttribution: cilefen commentedThis is a thoughtful report and it is Critical unless someone can show it is not.
Comment #3
swentel CreditAttribution: swentel commentedCould reproduce the data loss, but with an additional detail for point 1: the file isn't deleted from file_managed when creating the first translation, the entry is still there, but the status is set to 0 (however, at some point, cron will clean this up of course). When creating the second translation, it's gone from the managed_file table, as well as from the file system. What's even more annoying is that there's still a reference in the dedicated reference table of the field to a non existing record.
So you definitely need to test this with 3 translations, subtle things start kicking in at the first translation, the second translation is even worse.
Trying to write a test to expose the failure.
(edit, changed the first sentence regarding the detail, should recheck the usage table at some point)
Comment #4
swentel CreditAttribution: swentel commentedComment #5
swentel CreditAttribution: swentel commentedTest that exposes the data loss.
Comment #6
swentel CreditAttribution: swentel commentedPasting the asserts that will fail in the code for reference.
This fails after the first translation
This fails even more after the second translation
Comment #7
swentel CreditAttribution: swentel commentedSome basic debugging: open FileUsageBase and comment out lines 36 & 37 and all will be fine (naturally). It's called from DatabaseFileUsageBackend(). We need to prevent decrementing it on the first translation here.
Comment #9
swentel CreditAttribution: swentel commentedRelated, but with different scenarios and on the image field: #2617174: Image field's file translation doesn't work as expected
Comment #10
swentel CreditAttribution: swentel commentedAdditional problem: after adding one translation and then deleting it again, the file you uploaded from the translation is not removed from the file_managed and file_usage table, but it is from the file system ..
Comment #11
swentel CreditAttribution: swentel commentedThis makes the test pass, however, I don't think the fix is right enough as I think it won't remove the file from the system in case you would edit the original source node, maybe other tests catch this one already, or I'll try to add additional tests.
Comment #13
swentel CreditAttribution: swentel commentedComment #15
Gábor HojtsyWow. Nice find. Adding tags. Adding issue summary template. Please fill in proposed resolution as well.
Comment #16
swentel CreditAttribution: swentel commentedThis should fix the fails in UserPictureTest - updated the proposed solution in the IS.
Added some references in the patch to a good movie :)
Comment #17
swentel CreditAttribution: swentel commentedNote, as I've said already, I think the current patch now prevents from deleting the file from the system when you're deleting it from the source node (and probably from the translated node too), so we'll need an additional test and fix too.
Comment #18
swentel CreditAttribution: swentel commentedBetter assert
Comment #21
swentel CreditAttribution: swentel commentedOk, getting highly confused now, $translation_sync seems to be empty - back to drawing board it seems.
Comment #22
Gábor Hojtsy#2650468: Translation of image and file fields is setting file status to temporary and causes loss of data was marked a duplicate of this one.
Comment #23
BerdirI might be missing something here, but this makes the test pass. Lets see what else happens.
Basically, since we're called for every translation, we shouldn't fall back to the original translation of the entity. If the translation didn't exist, then there are no original ID's.
Wondering if there are reverse problems, with not removing usages when translations are removed or if this really works. Go testbot.
Comment #24
swentel CreditAttribution: swentel commentedAmazing, good to have a fresh mind look at this one, thanks :)
I've been wondering this myself already too, I think it might be good to write additional tests for that, just to be sure.
Comment #25
tduong CreditAttribution: tduong at MD Systems GmbH commentedJust added a fourth language test.
Comment #26
tduong CreditAttribution: tduong at MD Systems GmbH commentedSorry, mismatched the instruction for the new test to be add. Restarted:
ContentEntityStorageBase::invokeFieldMethod()
Test:
Comment #27
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded test only patch.
Comment #29
BerdirComment #30
swentel CreditAttribution: swentel commentedLooks great. Since we're calling a lot of things on original here, I guess I just have one more question: I'm not sure exactly what happens in case we delete the 'original/source' node, does another one become the source ? Haven't looked exactly how/if content_translation handles that, so maybe just one additional test which deletes the original one then at the end - or maybe I'm even asking a stupid question :)
Comment #31
BerdirAFAIK, it's not possible to delete the source translation, that will delete the whole entity. Haven't tried, but I've seen issues about allowing that :)
That said, deleting the source translation and making sure that all files are then temporary makes sense.
I've also asked @plach to review this.
Comment #32
tduong CreditAttribution: tduong at MD Systems GmbH commentedAdded test deleting all translations.
Comment #33
dawehnerIMHO we should also add a dedicated test outside of the file module to ensure that those methods are executed.
It is not entirely obvious that removing an entire entity will let $entity->getTranslationLanguages() return an empty list of translations, so really a dedicated test for that itself would be nice.
Comment #34
BerdirYeah, a dedicated test makes sense, will see where we could do that. Maybe EntityTranslationTest.
It's not entirely obvious because there is no such thing. You can't save an entity without any translations, there's always the source translation that can't be removed. The only way to do that is to delete the whole entity and then the delete() methods are called directly.
Comment #35
Gábor Hojtsy@swentel: indeed, its currently not possible to delete the original language variant without deleting the whole entity with all the translations. There is an issue to allow to reassign the default translation to some other translation within the entity. See #2485499: Allow source translations to be removed and #2443991: Allow default_langcode field value to be changed.
Comment #36
BerdirI'd expect that won't allow you to actually delete a source translation either, you could just switch to another to make that the source and then delete the old source language?
So I think nothing changes for our case here and if it really does, then we now have enough test coverage here so it would fail and would have to be updated.
Comment #37
tduong CreditAttribution: tduong at MD Systems GmbH commentedUploaded patch:
Drupal\entity_test\Plugin\Field\FieldType\FieldTestItem
Drupal\system\Tests\Entity\EntityTranslationTest
Comment #38
BerdirFeedback on the comments.
We create a single entity, with both fields, not for.
We remove it from the entity, not the field.
the "except ..." part doesn't really make sense to me, just leave that out.
I'd also move the comment about untranslatable to the line where you assert that.
I'd write this instead:
// Delete the entity, which removes all remaining translations.
// All languages have been deleted now.
// The untranslatable field is shared and only deleted once, for the default lancode.
I think to_delete is a bit confusing. It's already being deleted. I'd just leave that variable out and put \Drupal::state() directly on the line where you add it to the array.
Comment #39
tduong CreditAttribution: tduong at MD Systems GmbH commentedDone.
Comment #40
Berdirnitpick: form => from. missed that before.
first line unfortunately still > 80 characters.
same here, > 80 characters.
Comment #41
tduong CreditAttribution: tduong at MD Systems GmbH commentedRefactored.
Comment #43
tduong CreditAttribution: tduong at MD Systems GmbH commentedSorted $actual and $expected arrays for test comparison.
Comment #44
plachOverall this makes a lot of sense to me, thanks!
I have only one serious remark. It may be valid or not, but would be good to check.
I'm wondering whether the FALSE argument here may be problematic when the default language is changed as part of the save operation. Also because
$langcodes
includes the default language.Missing trailing dot.
We've written this code so many times, it should really become a trait sooner or later :)
typo
Comment #45
Berdir1. My understanding now is that that's currently not possible. That said, there's no reason to pass FALSE there, I think we just copied that from somewhere. So it should work even if you did that.
Comment #46
tduong CreditAttribution: tduong at MD Systems GmbH commentedFixed #44.2 and #44.4. May the trait task be a new issue ?
Comment #47
tduong CreditAttribution: tduong at MD Systems GmbH commented#44.1 done.
Comment #48
swentel CreditAttribution: swentel commentedI think the trait is fine for another issue imo.
Only one small nitpick before RTBC.
Shouldn't go over 80 chars
Comment #49
tduong CreditAttribution: tduong at MD Systems GmbH commentedRerolled and done.
Comment #50
Gábor HojtsyI would personally have shortened the sentence to "Tests entity translation statuses after removing two translation." to fit on one line, but that sounds like a nitpick with a critical issue :) Fix looks good and the additional test coverage is great.
Comment #51
swentel CreditAttribution: swentel commentedOr maybe "Tests entity translation statuses after removing two translations" (plural) :)
Comment #52
plachYou can't change the
default_langcode
flag, but you can change the value associated tolangcode
for the default language, i.e. the default entity language can be modified.Yup, it was only a consideration :)
Comment #53
catchSorry to hold this up on comments, but this one confused me. Patch looks great otherwise.
This says isPermanent would crash the test suite, but then it calls it. Does it mean 'if the file did not exist'? Also isn't $file && $file->isPermanent() enough of a check?
Comment #54
swentel CreditAttribution: swentel commentedThe reason I added that was because I wanted to have more asserts after this one in the initial test-only patch to prove that more things started to go wrong. Oherwise, the test suite crashed on the fact that $file didn't exist at all and calling that method would crash it.
Of course, now with the fixes in place, we can actually just call $file->isPermanent() directly and even remove that $file check.
New patch, cleaning up those asserts a bit, still RTBC I think.
Comment #55
BerdirYes, that makes sense.
Comment #56
Gábor HojtsyLooks good, yay!
Comment #57
catchCommitted/pushed to 8.1.x and cherry-picked to 8.0.x. Thanks!
Comment #60
Gábor HojtsyYay, woot!
Comment #61
azinck CreditAttribution: azinck commentedTook this out as a follow-up: #2787187: Data loss: Deleting a translation of an entity deletes all file_usage entries for files referenced by the entity
Comment #62
lawxen CreditAttribution: lawxen commentedI'm working on a project that someone else did,An error was encountered on commerce checkout completion, what I did is just add an file field(multi value) to order type.
Error mesaage:
I haven't got what cause this problem, but the code change of this issue's commit has some flaw here:
if ($original->hasTranslation($langcode))
It didn't check whether the $original is null.
Comment #63
lawxen CreditAttribution: lawxen commentedI have create a new issue for it