Problem/Motivation

Quoting #3201646-21: Add support for image caption (<img data-caption>):

Tested this thoroughly. Works beautifully! 🤩

We will need a follow-up to make formatting inside the caption work though — but that's an upstream bug that we were warned about. Quoting @Reinmar from Slack:

The PoC of drupalimage implementation is still in progress. I heard it’s close to be finished but then we found that in CKE4 inline formatting inside captions can be stored in data-caption attributes. This is not that straightforward now.

Steps to reproduce

  1. Create image caption: enter foo bar
  2. Make bar bold
  3. Save the content
  4. Observe that the stored data contains just data-caption="foo bar" instead of data-caption="foo &lt;strong&gt;bar&lt;/strong&gt;"

Proposed resolution

TBD

Remaining tasks

  1. Per #14 + #15 + #16: this issue is hard-blocked on #2441811: Upgrade filter system to HTML5 getting fixed in Drupal core.

User interface changes

TBD

API changes

TBD

Data model changes

TBD

Issue fork ckeditor5-3222808

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Wim Leers created an issue. See original summary.

Wim Leers’s picture

Status: Active » Postponed

Wim Leers credited Reinmar.

Wim Leers’s picture

Wim Leers’s picture

Issue summary: View changes
Wim Leers’s picture

Title: [needs upstream feature] Follow-up for #3201646: markup in image captions is lost » Follow-up for #3201646: markup in image captions is lost
Status: Postponed » Active
Issue tags: -Needs upstream feature

@Reinmar just informed @lauriii and I that this was solved upstream 4 days ago: https://github.com/ckeditor/ckeditor5/commits/ci/763

Wim Leers’s picture

@Reinmar said we might/should be able to reuse the upstream test coverage for this specific functionality. So: TBD whether it's actually feasible to do that!

lauriii made their first commit to this issue’s fork.

lauriii’s picture

Status: Active » Needs work

I made some progress on this but got stuck on integrating this with CKEditor attribute encoding. I pinged @Reinmar in Slack for additional info on CKEditor 5 attribute encoding.

Wim Leers’s picture

Title: Follow-up for #3201646: markup in image captions is lost » [blocked on upstream] Follow-up for #3201646: markup in image captions is lost
Status: Needs work » Postponed

Wim Leers’s picture

@Reinmar responded to @lauriii's questions about encoding. Turns out HTML5 does not require < and > to be encoded in attribute values!!! 🤯 TIL! 🤓

Or rather… YIL, because over at #2441811-49: Upgrade filter system to HTML5 I basically asked that and @longwave pointed out in #2441811-50: Upgrade filter system to HTML5 that this is no longer true in HTML5!

lauriii’s picture

Based on WHATWG parsing rules, we would only have to worry about encoding quotation mark and ampersand. This is already handled by CKEditor 5 so we can get rid of the additional encoding that currently exists in the MR.

Based on some testing, captions encoded by CKEditor 4 are correctly decoded by CKEditor 5. There are some edge cases we still should test.

The tests are failing because Drupal\Component\Utility\Xss::filter is unable to handle < and > within attributes. We need to either fix that as part of #2441811: Upgrade filter system to HTML5 or another upstream issue.

Wim Leers’s picture

D'oh 😬

Well… at least that issue is well on its way!

Are you saying this issue is hard-blocked on #2441811: Upgrade filter system to HTML5?

lauriii’s picture

I think this issue is hard blocked on #2441811: Upgrade filter system to HTML5. I aded failing tests for that issue so that we can start working on a solution for the edge case.

Wim Leers’s picture

Title: [blocked on upstream] Follow-up for #3201646: markup in image captions is lost » [blocked on upstream] [PP-1] Follow-up for #3201646: markup in image captions is lost
Issue summary: View changes

👍

effulgentsia’s picture

I think this issue is hard blocked on #2441811: Upgrade filter system to HTML5

The relevant part for this issue got split out into #3227831: Xss::filter() does not handle HTML tags inside attribute values. However, why is this issue hard blocked on it? Why can't we escape the entities in the attribute values ourselves, either in the plugin, or at any point prior to calling Xss::filter(), even if CKE doesn't escape them?

effulgentsia’s picture

One way you can do #18 is by calling:

$html = WHAT_CKE_GIVES_US;
...
use Masterminds\HTML5;
$html5 = new HTML5(['encode_entities' => TRUE]);
$dom = $html5->loadHTML($html);
$html = $html5->saveHTML($dom);
....
// now we can call Xss::filter($html)
longwave’s picture

In #2441811: Upgrade filter system to HTML5, Html::normalize() is basically the code above without the encode_entities flag, maybe we should just enable it always there?

Wim Leers’s picture

@effulgentsia How would the CKEditor 5 module do that?

There is no reliably API to intercept data this early, i.e. upon form submission.

We'd almost need to override \Drupal\text\Plugin\Field\FieldType\TextItem::preSave() to make that work reliably (because note it could happen in any other text editor too — it is valid HTML5 after all).

The only alternative would be to do something like overriding ['#value_callback' => '\Drupal\Core\Render\Element\Textarea::valueCallback'] to do some extra processing there? That could be an okay work-around initially maybe … until it's solved in core?

I like @longwave's suggestion in #20 — that seems like the simplest solution and the strongest guarantee?

Wim Leers’s picture

Title: [blocked on upstream] [PP-1] Follow-up for #3201646: markup in image captions is lost » Follow-up for #3201646: markup in image captions is lost
Status: Postponed » Needs work

Great news from the CKEditor 5 team!

News regarding attribute values encoding.

We created a simple PoC of a custom HTML writer: https://github.com/ckeditor/ckeditor5/compare/ck/custom-html-writer-poc. It can be plugged by replacing editor.data.processor._htmlWriter with the instance of this new writer.

We also decided to make this property public so such overrides are more legit: https://github.com/ckeditor/ckeditor5/issues/10619. This is merged already and will be released at the end of October.

The key part of the HTML writer that is interesting to you are these lines: https://github.com/ckeditor/ckeditor5/compare/ck/custom-html-writer-poc#...

Using this writer should resolve your problems with data-caption. This implementation passes all our existing tests for HtmlWriter, but these tests are for sure incomplete. Since we were basing on innerHTML’s behavior (so, a native behavior), we could trust that it just works. Thus, IDK the completeness of the PoC that we provided… My biggest worry would be about XML namespaces, but I suppose there may be more.

That means we can move forward here again! 👍

lauriii’s picture

Assigned: Unassigned » lauriii
Wim Leers’s picture

🥳

lauriii’s picture

Title: Follow-up for #3201646: markup in image captions is lost » [PP-1] Follow-up for #3201646: markup in image captions is lost
Wim Leers’s picture

Status: Needs work » Postponed (maintainer needs more info)

I don't understand.

After lengthy investigations and discussions, the conclusion was that changing the Drupal Xss::filter() logic is infeasible and unrealistic.

The CKEditor 5 team worked to make it possible for us to override their HtmlWriter (that's what is described in #22).

I see you gave that a try in https://git.drupalcode.org/project/ckeditor5/-/merge_requests/75/diffs?c.... Are you concluding that this is impossible to solve in CKEditor 5? 😱😢

If so: why are you concluding that? Why can this work just fine in CKEditor 4, and why can't we make CKEditor 5 process data-caption attribute values the same way?

I am not seeing the change in direction/the new conclusion explained either here or in #3247246: Attribute value encoding not compatible with Xss::filter(), so I hope you can elaborate 🙏🤓 I have the feeling it's obvious to you, but it's certainly not to me 😅

lauriii’s picture

@Wim Leers Does #3247246-7: Attribute value encoding not compatible with Xss::filter() help explain this?

Manually tested to apply this on top of #3247246: Attribute value encoding not compatible with Xss::filter() to confirm that \Drupal\Tests\ckeditor5\FunctionalJavascript\CKEditor5Test::testImageCaption passes.

Wim Leers’s picture

Yes, thank you! 😊

lauriii’s picture

Title: [PP-1] Follow-up for #3201646: markup in image captions is lost » Follow-up for #3201646: markup in image captions is lost
Status: Postponed (maintainer needs more info) » Needs review
Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

RTBC as soon as the coding standards nits are addressed and the slightly expanded documentation!

Just talked to @lauriii, he confirmed that he did thoroughly review the pieces of code coming from upstream 👍🥳

Wim Leers’s picture

Crediting all participants :)

  • Wim Leers committed 07dbfdc on 1.0.x authored by lauriii
    Issue #3222808 by lauriii, Wim Leers, longwave, effulgentsia, Reinmar:...
Wim Leers’s picture

Status: Reviewed & tested by the community » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.