Updated: Comment #N
This is part of: #2313159: [meta] Make multilingual views work
Originally, this was a spin-off from #2218025: Field language filter/sort/etc. for Views do not work and are not needed comments #1 and #2.
Problem/Motivation
Node views normally output one row per node translation, without additional filtering enabled.
If you add a "contains" filter by Title, you will end up filtering the translations down to fewer rows. However, if you try the same thing for Body "contains", you will end up getting more rows. This is counter-intuitive and needs to be fixed.
Steps to reproduce
1. Install in English, standard profile.
2. Turn on Language and content Translation modules.
3. Add French and Spanish languages.
4. Set up Page nodes to be translatable, with Body translated.
5. Make one Page node, and "translate" it. Set the title and body to:
English: Food in Paris
French: Nouriture en Paris
Spanish: Comida en Paris
5. Make a view of Page nodes, showing teasers.
Try these filters:
- No filtering: You will see all 3 translations (as expected).
- Title contains "Food", "Comida", or "Noriture": You will select just one translation for each filter (as expected)
- Body contains "Food", "Comida", or "Noriture": You will see all 3 translations for each filter (expect: select 1 as in title filter)
- Title contains "Paris": All 3 translations (as expected).
- Body contains "Paris": 9 rows, with each translation repeated 3 times (expected: 3 translations, as in title filter).
Underlying problem
The reason is that the join is between {node} and {node__body}, and it is ignoring language code. So the query becomes:
SELECT node_field_data.title AS node_field_data_title, node.nid AS nid, node_field_data.created AS node_field_data_created
FROM
{node} node
INNER JOIN {node_field_data} node_field_data ON node.nid = node_field_data.nid
LEFT JOIN {node__body} node__body ON node.nid = node__body.entity_id AND node__body.deleted = '0'
WHERE (( (node_field_data.status = '1') AND (node_field_data.type IN ('article')) AND (node__body.body_value LIKE '%Paris%' ESCAPE '\\') ))
ORDER BY node_field_data_created DESC
LIMIT 10 OFFSET 0
and so there are 3 rows for the {node} join with {node_field_data}, multiplied by 3 form {node} join with {node__body}, resulting in 9 output rows.
But this is wrong. Each row of {node__body} has a langcode on it, and this should be joined to the {node_field_data}.langcode column, so that each body is correlated with the translation it is, in reality, attached to.
The same thing would also, presumably, happen with other entities.
Proposed resolution
When joining field tables to entity tables in Views, instead of joining on the entity base_table, join on the data_table, making sure that the language columns are matched as well as the ID fields.
If there is no data_table, continue to join on the base_table. Note that all fieldable Core entities with Views support now have data_table; contrib entities without data_table will continue to have this bug, but on #2321381: Document that fieldable translatable entities should use data tables we have at least documented that they need data tables, so ... use at your own risk etc.
Remaining tasks
There is a patch on #3 that contains a failing test for the original reported issue with Node entities.
We need to (a) make additional tests for other entities and (b) make a patch that joins field tables using the entity field data tables instead of the base tables, to fix the bug.
User interface changes
You will be able to do filtering on your fields that is actually sensible in the presence of translations.
API changes
None
Comment | File | Size | Author |
---|---|---|---|
#28 | interdiff.txt | 1.19 KB | jhodgdon |
#28 | 2218065-field-filters-28.patch | 56.14 KB | jhodgdon |
#24 | interdiff-code-changes-only.txt | 6.43 KB | jhodgdon |
Comments
Comment #1
jhodgdonThis issue applies to field contains filters, not just language, but it's related to having translated nodes.
And I think it's pretty much major... I am going to verify it is still a problem though, and then probably write a test.
Comment #2
jhodgdonI just tested it again, and have added Steps to Reproduce to the issue summary. It is still a problem.
Comment #3
jhodgdonOK, here's a patch with a failing test.
Comment #4
jhodgdonAt this point... Having written a test to illustrate the problem, I am not sure how to fix it.
As I pointed out when I originally added this issue, the joins set up for fields in Views are generic to the Field module, not specific to the Node module. So the way the Field module does its Views integration, it is ignoring the fact that individual field entries in field tables are associated with particular translations of entities, and it is just joining to the base table for the entity (node), not the correct table (node_field_data).
This problem may be specific to nodes... I am not sure if other entities have two tables like this, one for the base item and one for the translation information? Anyway, I really don't know how to fix it, so I'm unassigning it.
But Views on Nodes with any field filters on them are currently unusable on multilingual sites, as it is, because instead of just filtering (reducing the number of results), they also multiply the number of results due to the bad join.
Comment #6
jhodgdonWell, the test failed the way it should have. Setting this back to "Active" since there is no patch yet to fix the issue.
Comment #7
jhodgdonThis has been added as a child issue of #2313159: [meta] Make multilingual views work. Since it is now part of the master plan there, updating title and summary (it is still the same issue, it's just that we've worked out how to make fixing it part of the master plan).
Also postponed until #1498664: Refactor user entity properties to multilingual is finished.
Comment #8
jhodgdonComment #9
andypostComment #10
plachThe last patch has test failures.
Comment #11
jhodgdonThe last patch on #3 is *only* a failing test, illustrating the problem with Node views. It doesn't test views with other entities, and has no code fix.
Comment #12
jhodgdonI talked to Gabor today in IRC and I think what we need to do on this issue is:
- If an entity has a data table (all Core fieldable ones do now), join field data to the data table.
- If an entity does not have a data table, join the way we do now. Contrib entities intending to be multilingual that don't use data tables do so at their own risk!
Comment #13
jhodgdonComment #14
jhodgdonI think I'm ready to take this on!
Comment #15
jhodgdonOK, here's another failing test added, for Taxonomy; have tested in my local environment and both Node and Taxonomy tests in this patch fail in the right way to demonstrate the bug.
We also need tests for Comment and User.
I haven't worked on the fix yet either, but I actually think that should be straightforward.
Comment #17
jhodgdonOK, here's a patch with an additional (failing as I'd expect) test for Comment.
I've made a test for User as well, but I'm running into problems, because neither the Entity view mode display choice nor the Field view mode display choice for User in Views is working correctly. So I'm going to upload that as a separate patch; we can add this test later if/when the underlying issues are fixed:
#2321645: Default entity view for User should show user name
#2217569: Fields row plugin: Translation is non-uniform for base fields, Field UI fields, links; no way to choose "this row's language"
Anyway, three tests is pretty good, right? Should be able to move towards making a patch for the actual issue now. So what we want to do here is patch the code and make the Node, Comment, and Taxonomy tests pass. Getting User to pass is not going to be possible right now.
Comment #20
jhodgdonOK, I think making the tests is probably about as far as I can take this. The Views join syntax is inscrutable and undocumented.
What needs to happen is in core/modules/field/field.views.inc in field_views_data(), or rather in the function it calls: field_views_field_default_views_data().
There it's doing:
What it needs to be doing instead is joining to the $entity_type->getDataTable() (if it exists), and instead of just matching the entity ID, it also needs to match the language code (if it exists).
I tried to do this, but what I tried only succeeded in not only breaking all views with fields, but also getting some kind of weird error when it tried to output an exception:
Call to a member function getTrace() on a non-object in ....../core/lib/Drupal/Core/Controller/ExceptionController.php
Alternatively, we could patch the hook_views_data() docs so they explain how to do complex joins, which they totally don't now. I also looked at the Join handler plugin in Views but it was not much more help.
Anyway... The tests in 2218065-field-filters-FAIL-comment-node-taxonomy-tests.patch need to pass... that is as far as I can take it right now. Any takers?
Comment #21
dawehnerSo here is a bit of work.
As we realized we should probably always join on the langcode ... otherwise we result in order of magnitude of duplicates.
Work done:
This still fails because the generated queries don't include any langcode filtering at the end of the day.
Comment #22
jhodgdonUm. No. This is not right.
This is adding to the auto-join between node data and the node_field_data. But this is wrong. The langcode column in node is "the original language of the node". The langcode column in node_field_data is "the language of this translation". They absolutely should not ever be joined. That will completely break everything. The joins in the *ViewsData classes are correct as they are -- we do want the joins there to just join to the field data tables on the entity ID, which is what they are doing.
So now I see the syntax of 'extra'. Let me take this on again... and I'll file a separate issue to document 'extra' better in hook_views_data(). :)
Comment #23
plachI think we should avoid automatic filtering and just update the test query to have a language filter. Multilingual use cases are so vast that we need everything as much configurable as possible.
Comment #24
jhodgdonOK, after some hair pulling and help from dawehner in IRC, I got this working, at least locally!
Some notes:
- There's a change in the Join handler -- to let it support 'left_field' syntax in 'extra' in joins.
- The default field join uses the data table if available, and base table if not. It does assume there is a langcode field in the data table; I think/hope this will be OK.
Anyway... The interdiff here is the code changes; full patch is code + tests. In other words, the interdiff is to my test-only patch in #17, not to dawehner's in #21, which wasn't my starting point.
Hopefully this won't break anything else...
Oh and regarding #23, there is no automatic language filtering in my patch. The tests check whether filtering by "contains" on fields work, and there is no language filtering.
Comment #25
jhodgdonOh, and just to clarify: the JoinHandlerBase code came from dawehner's code in his patch. :)
Comment #27
jhodgdonHah! That fixed most of it. Down to two fails in Drupal\field\Tests\Views\ApiDataTest -- I'll take a look shortly.
Comment #28
jhodgdonSo there were some tests in Field module looking for the exact format of the field views data, which was changed in this patch, so had to update the patch. This should pass everything now hopefully!
Comment #29
jhodgdonFollow-up to add docs to hook_views_data():
#2321995: More info needed in hook_views_data docs/sample body
Comment #30
dawehnerNext time just assume a little bit more knowledge here ;)
We could just check for the langcode entity key, right? In this case we maybe even would use that, as it might be stored somewhere else?
ah, cool!!
Comment #33
Gábor HojtsyComment #34
alexpottCommitted 62bb5c4 and pushed to 8.0.x. Thanks!
Comment #36
Gábor HojtsyYayayay, amazing, thanks!