Closed (duplicate)
Project:
Wysiwyg
Version:
6.x-2.0
Component:
Editor - TinyMCE
Priority:
Normal
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
10 Sep 2009 at 16:21 UTC
Updated:
18 Mar 2010 at 19:58 UTC
Jump to comment: Most recent file
Comments
Comment #1
twodThis is partially a problem with the editors themselves. See this TinyMCE forum thread.
It's not easy to place entities such as the break comment at the right position in WYSIWYG mode since you can only click inside paragraphs.
By default, the editors also wrap "orphaned" text or elements in paragraphs, so even if we put it outside of all paragraphs in source mode, the editor wraps it again.
In TinyMCE you can use the
forced_root_blockoption to change how this behaves. If you wish to experiment with this setting, you can use the method in comment #30 of #313497: Allow configuration of advanced editor settings to inject it.If you set it to false you can correctly position the break comment in Source mode and it will stay that way when switching back and forth.
However, as an experiment I put together a patch which attempts to "break out" of the paragraph. It seems to be working fairly well.
Btw, this problem is also shared by the Column break module and its Drupal plugin for Wysiwyg.
Comment #2
matthew petty commentedThank you! I looked for a long time and I wish I had found that thread earlier about TinyMCE custom settings earlier.
FYI, I applied your patch and noticed no change in behavior. Maybe I am missing something...
Comment #3
twodYou should notice that
<p><!--break--></p>is changed to<!--break-->.The patch also tries to change
<p><!--break-->Something</p>to<!--break--><p>Something</p>and also
<p>Something <!--break-->and maybe something else</p>to<p>Something and maybe something else</p><!--break-->Note that this does not work if paragraphs were nested (or there were divs instead of paragraphs) as it only checks the first parent.
This patch is not intended to be committed, more a way to study the issue.
Comment #4
matthew petty commentedTwoD, I hope you can help me again. :)
I've switched to FCKEditor for other reasons, and the original issue is present in that editor as well. I've applied your patch, but it moves the
tag to the very bottom of the body field.
Comment #5
twodI'd like some help testing the following patch.
It's an attempt to treat the tag/placeholder conversion as a pure string manipulation operation using Regular Expressions without Parsing HTML the Cthulhu way. It *should* avoid the problems discussed here as far as is possible without a deep analysis of all the content. I hope it will also partially fix the problems in #510552: Invalid XHTML: missing trailing slashes, absolute urls and uppercase tags and #495828: In TinyMCE in IE7 Relative Links Become Absolute Links if Teaser Break is Enabled, but those require changes to tinymce-3.js as well.
Note that this is just a test. It's a major change to the plugin and Sun needs a say in this too. I left some commented logging code (for Firebug) and the old code in there to make it easier to test.
Here are some simple test cases I've been using to verify the basics are working. Simply disable the editor, paste one of the test-cases in the original textarea, re-enable the editor and make sure the placeholders look ok (no paragraph fixing on attach), disable the editor and you should see that the markup has been corrected to not wrap the breaks in paragraphs. Doing any more complex checks (like making sure the break tag is on the root level and not wrapped by additional tags) is way beyond what we're capable of without using the DOM. The user has to be aware of where he/she inserts the breaks and make it as easy as possible for the plugin to get it right.
I now need to know the results from these tests match the results when real node contents are used.
Comment #6
dude4linux commentedThe patch installation failed.
I edited the file break.js and applied the changes manually.
Family just arrived so I'll have to test later and report the results.
Comment #7
twodAh, good to know, thanks for testing. Have fun with your family and happy holidays!
Comment #8
sunI'd like to see/hear an actual bug report here first.
That is, because I'm quite confident that a markup of
<p><!--break--></p>can be on-purpose/wanted. Likewise, you can place the break tag at the beginning, at the end, or at an arbitrary position within a paragraph.In any case, you need Drupal's HTMLCorrector filter enabled, so it doesn't really matter whether there is a non-closed paragraph or any other HTML element.
Just like this code block will be automatically closed by Drupal's HTMLCorrector filter when I forget the closing tag...Comment #9
twodI'm still in favor of replacing the jQuery/DOM logic with pure string manipulations (RegExp or otherwise) as it circumvents all the issues related to browsers mangling various parts of the contents. This issue was also found in the TinyMCE implementation, where only changing Teaser Break didn't help. (#495828: In TinyMCE in IE7 Relative Links Become Absolute Links if Teaser Break is Enabled)
The mangling problem will also appear in any Drupal plugin implementing similar functionality by looking at this module, and Wysiwyg will get the blame for it happening unless we set an example on how to treat the content string. The regular expressions from my patch above could of course be simplified to just replace all occurrences of the placeholder and do nothing else, requiring a HTMLCorrector filter to be run. But then we're back to this issue where users are wondering why the breaks keep jumping into paragraphs, defying their wishes.
Comment #10
dude4linux commentedTwoD,
It looks like the changes to break.js proposed in your last patch above have cleared up the missing closing tags < /p > and < img ... />. I had to open each page in the editor and then submit it without making any changes. Fortunately, I didn't have a huge number of pages to correct. In one case, I had to open and save a page twice to eliminate all the unclosed paragraph tags.
The bad news is that I found that teaser break also causes problems with unclosed < div>'s. I found this thread while searching for a solution for missing < /div> tags causing problems with certain themes, specifically zen-classic. While cutting and pasting material from other sources, it is easy to pickup an unmatched < div> or < /div> tag. My testing confirmed that inserting a teaser break between a valid pair of < div> tags will cause the undesired behavior.
< div>
< p>Teaser Paragraph< /p>
< !--break-- >
< p>Main Paragraph< /p>
< /div>
results in an error "missing < /div>" when viewing the teaser on the home page. I assume similar issues would occur when inserting a teaser break within any pair of block tags.
Note: I had to insert a space after each < to keep the tags visible.
Comment #11
dude4linux commentedI downloaded and installed the HTMLCorrector module for 5.x. It appears to be supplying the missing < /div> and the zen_classic theme no longer breaks.
Comment #12
twodThanks for testing! Good to know it works with real content too. It's very easy to stare oneself blind at certain tests and miss obvious problems under other conditions.
The unclosed div problem was expected. Like I said above, the regular expressions are only designed to deal with the p tags immediately surrounding the placeholder tag. Trying anything more advanced would be "Cthulhu parsing".
HTML Correctors should indeed be able to take care of unclosed tags like those divs or paragraphs left when the page is split. The easy way to code this would be to reduce it to a single regular expression which just cares about the image placeholder and leave the rest up to the markup correctors. I think we should at least take care of removing the paragraphs around the placeholder as those are [indirectly] created with it. That would be enough to make the markup valid in the most common cases if a corrector is not used. The users will of course need to be educated about where it's appropriate to place the breaks, but that is also needed without Wysiwyg module.
If we had good access to the relevant selection object during insert operations in Drupal plugins we could probably do some logic to split any parent tags into two groups before and after the break. But then we'd need to make sure all editors send the appropriate reference, and it would still not work for 'pure markup' editors which don't have a WYSIWYG mode. (But then again, this is a WYSIWYG module hehe.)
Thinking about it, we'd have to do some string manipulation if we're to support other output formats [from the editor perspective] than HTML anyway. The native BBCode plugin for TinyMCE will reduce the
<img src=".../spacer.gif" class="wysiwyg-break drupal-content" />placeholder to[img]/sites/all/modules/wysiwyg/plugins/break/images/spacer.gif[/img]before it even reaches the plugin. There's no feasible way to turn that into a DOM and back again just to do what we need.@sun. I don't see the case where
<p><!--break--></p>(or any variant with a break in the paragraph) would be useful, which is why I made the expressions 'push' them out of the paragraphs when possible. The hard cases where there is no opening/closing tag right next to the break are left up to the correctors, but educated users will hopefully not create these if there's no corrector.</end rant>;)Comment #13
aschiwi commentedAlso, when I have Teaser Break active and edit a node in IE8 there are
<br />tags added in all kinds of places.Comment #14
twod@aschiwi, are you sure that problem does not come from also having the "Line break converter" filter enabled on the input format using the editor? You don't want that filter on as the editor already takes care of inserting the proper tags for breaks and paragraphs.
Comment #15
aschiwi commented@TwoD: Alright, I didn't know that - thanks!
Comment #16
gregarios commentedI am also having this issue, but my problem is with IMG tags. When the teaser-break button is simply enabled, all my IMG tags are non-xhtml compliant, like
<img src="http://url.com/img.jpg">even when not using the teaser-break at all. However, when I turn off the teaser-break button in the WYSIWYG Editor settings, then my images are correctly formatted as<img src="http://url.com/img.jpg" />.This has been tested on at least 3 different Drupal sites, using TinyMCE 3.2.7.
Actually... mine looks like a different issue. Opening a new one...
Comment #17
gregarios commentedComment #18
sunIIRC the real cause was jQuery, which bases its .html() on the doctype being used. Can we temporarily hi-jack jQuery?
Or can we permanently hi-jack jQuery? i.e. replace jQuery.fn.html() ?
Comment #19
sunAfter discussion in IRC, we want to try whether something like http://www.stevetucker.co.uk/page-innerxhtml.php is able to solve the problem.
Comment #20
jide commentedAs said in #510552: Invalid XHTML: missing trailing slashes, absolute urls and uppercase tags, comment #16, avoiding DOM completely could fix this as far as the Teaser break plugin is concerned.
But if a library is needed for other cases, I think this one is the best around :
http://xhtmljs.codeplex.com/
BSD license though.
Comment #21
sunAlthough both issues contain valuable information, marking as duplicate of #510552: Invalid XHTML: missing trailing slashes, absolute urls and uppercase tags, as we think that we won't go with the approach of this issue.
You can follow up on that issue to track its status instead. If any information from this issue is missing in the other issue, please make sure you provide it over there.
Comment #22
m4oliveisubscribe