In node bodies, you can use <!--break--> to decide the teaser chop off point. It is in the form of an HTML comment for historical reasons. Today however, there is no need for this tag to be a comment. It is removed from the node body and teaser when output, so it works in BBcode, textile or any other format too.

So, let's change it to the easier to use <break>. HTML is the most used format around, and it does no harm in other formats anyway.

The attached patch maintains backwards compatibility with the old break tag, but the help texts only speak of <break>.
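To make the proposal concrete, here is a minimal sketch of a teaser splitter that accepts both the new and the old marker. It is written in Python for illustration only and is not the attached patch or Drupal's actual code; `split_teaser` and `BREAK_MARKER` are hypothetical names, and the fallback when no marker is present is an assumption of this sketch.

```python
import re

# Hypothetical pattern: accepts the proposed <break> and the legacy <!--break-->.
BREAK_MARKER = re.compile(r"<break>|<!--break-->")


def split_teaser(body):
    """Return (teaser, full_body).

    The teaser is everything before the first marker. The full body is the
    original text with every marker removed, so the marker never reaches output.
    """
    match = BREAK_MARKER.search(body)
    if match is None:
        # Assumption of this sketch: with no marker, the whole body is the teaser.
        return body, body
    teaser = body[:match.start()]
    full_body = BREAK_MARKER.sub("", body)
    return teaser, full_body


print(split_teaser("Intro.<break>Rest."))
print(split_teaser("Intro.<!--break-->Rest."))
```

Both calls give the same result, which is the backwards-compatible behaviour described above.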

Comments

Not to nitpick, but shouldn't the tag really be <break/> instead of <break>? :)

*cough*

It is removed from the node body and teaser when output, so it works in BBcode, textile or any other format too.

What's so bad about an HTML comment style break tag? Here are just a few points to consider, IMO:

1. "... easier to use ..."
If you use a WYSIWYG editor like TinyMCE, you don't need to care about the style of the break tag. If you use a textarea to enter your page content, you save a few characters of typing, but on the other hand an HTML comment style tag is actually easier to spot (esp. among many HTML tags) when you come back to modify your page.

2. "HTML is the most used format around ..."
I agree that this tag could be nice in HTML, but if you really want an HTML style break tag it should at least be fully compliant (as Arto pointed out), which means <break /> instead of <break>. Otherwise I can't see any advantage in using a tag that is only HTML-like. Even if it is removed on output, it is still stored in the database. You don't know what users might do with it; for example, someone could import/export the node directly from the database and, voila, there is a tag that looks like HTML but isn't!

Version:x.y.z» 5.x-dev

An HTML comment style tag is:

  • Unnecessary: it is removed from the output anyway and can take any form we wish.
  • Complicated: it adds 5 punctuation characters to type, and is asymmetrical to boot.
  • Inconsistent: Having to explain to the end user that the "break tag" uses a different syntax from the "bold tag" or "link tag" is confusing. This is important as we're dealing with the out-of-the-box experience here.

As for Arto's comment, that is ridiculous as there is absolutely no reason to require the XHTML closing slash. Plus, practical use here on drupal.org shows that even experienced users have problems getting their HTML syntax right, so forget the use of XHTML. Just recently I had to close an unclosed anchor tag on the theme upgrading guide. That is the page maintained and used by people who code HTML for a living.

Even if it is removed on output, it is still stored in the database. You don't know what users might do with it; for example, someone could import/export the node directly from the database and, voila, there is a tag that looks like HTML but isn't!

This is a poor argument because we already have our own special flavour of HTML: linebreaks and paragraphs are added, urls are converted to links, invalid tags are stripped out, XSS is removed and depending on what contrib filters you install, code can be escaped, words can be auto-linked, images can be inserted, and much, much more.

If you take the content that is stored in the database and display it literally, you will not only get something completely different from what was displayed on the Drupal site, you will also open yourself to security issues.

The proposed tag should be easier to use. I think we should deprecate the old tag, and provide a database upgrade path for existing sites though.

What about using something like [[break]] for greater visibility in WYSIWYG editors? I know there is the drupalbreak plugin for TinyMCE, but making a more transparent break tag like [[break]] would be visible in all WYSIWYG editors as opposed to <break> which would only be visible in code view. This way it would make it easier for less skilled users (who may be more likely to be working on a site with a WYSIWYG editor) to move the break tag around.

Attachment (status: new, size: 2.78 KB)

I still insist we should stick to <break>. I only provided backwards compatibility for easier transition, but really, we can just provide an update to convert the old tags to the new ones.

Patch attached.
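For readers wondering what "an update to convert the old tags" amounts to, here is a rough sketch in Python. It is not the code of the patch or of any Drupal update function; `convert_break_markers` is a hypothetical helper, and a plain list of strings stands in for the stored node bodies.

```python
OLD_MARKER = "<!--break-->"
NEW_MARKER = "<break>"


def convert_break_markers(bodies):
    """Return (converted_bodies, number_of_bodies_changed)."""
    # Rewrite every stored body, replacing the old marker with the new one.
    converted = [body.replace(OLD_MARKER, NEW_MARKER) for body in bodies]
    # Count how many bodies were actually touched, so the change can be reported.
    changed = sum(1 for old, new in zip(bodies, converted) if old != new)
    return converted, changed


stored = ["Intro<!--break-->Rest", "No marker here"]
print(convert_break_markers(stored))
```

Note that such a conversion rewrites stored user content, so it would have to be a deliberate decision rather than a side effect.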

What's so bad about an HTML comment style break tag? Here are just a few points to consider, IMO:

1. "... easier to use ..."
If you use a WYSIWYG editor like TinyMCE, you don't need to care about the style of the break tag. If you use a textarea to enter your page content, you save a few characters of typing, but on the other hand an HTML comment style tag is actually easier to spot (esp. among many HTML tags) when you come back to modify your page.

2. "HTML is the most used format around ..."
I agree that this tag could be nice in HTML, but if you really want an HTML style break tag it should at least be fully compliant (as Arto pointed out), which means <break /> instead of <break>. Otherwise I can't see any advantage in using a tag that is only HTML-like. Even if it is removed on output, it is still stored in the database. You don't know what users might do with it; for example, someone could import/export the node directly from the database and, voila, there is a tag that looks like HTML but isn't!

Very valid points. I'd stick with the html comment.

Priority:Normal» Critical

IMO this is a critical improvement for breaking with bad old habits in 5.0.

Priority:Critical» Normal

I have to chime in with those who don't find <!--break--> to be a bad habit in the first place.

An HTML comment is HTML, and is still valid and parsable in HTML and XHTML. A custom HTML-ish tag is not still HTML and is not still valid HTML or parsable XHTML.

I really don't see how <break> is more usable than <!--break-->. It's 5 fewer characters; that's it. Big deal. If we're worried about people getting confused by the word "tag", then just call it the "break marker".

The comment tag is actually more robust, because if the data makes it through processing and the comment tag wasn't filtered or split properly (not all content goes through the normal body/teaser splitter, remember) then the comment tag breaks nothing while an errant invalid tag results in invalid XHTML and potentially other issues.

This isn't critical. It's not even necessary, IMO.

Version:5.x-dev» 6.x-dev
Status:Needs review» Fixed

Committed to CVS HEAD.

Version:6.x-dev» 5.x-dev
Status:Fixed» Needs review

I very much disagree. We're talking about the default, out-of-the-box experience for Drupal.

I recently made a site for people with novice web skills. Explaining to them how to use simple HTML is still doable, but something like "<!--break-->" just doesn't mesh with non-coders. They'd have to write it down somewhere and would copy it over character by character every time, with little arrows saying "dash" and "exclamation mark" and "no spaces!". The form <break> still visually looks like a tag. Once you add an exclamation mark and dashes, it becomes coder talk.

As far as processing goes, there is no difference. If the break tag somehow ends up in the output, it means you are printing it unfiltered and are open to some serious security issues. Both before and after this patch, the filter system would still remove it (filter_xss for the comment, the HTML filter for the tag). And if you use the normal node viewing process, it should be removed explicitly by node.module anyway (there is a str_replace).

In 6.0, I want to see us add a jQuery powered teaser/body splitter, which can be disabled when more advanced editing is available (e.g. WYSIWYG). But I think we should keep our default approach easy and simple.

Status:Needs review» Fixed

Oops, simultaneous submission. Thanks Dries.

IMHO, this is a silly patch. If anything at all, the tag should be changed to something like [teaser_break]. At least use brackets to distinguish it from real HTML. [teaser_break] is clearly not regular HTML and is also self-explanatory.

Byte

I'd just like to voice my own -1 to this already-committed patch. It's a maze of contradictions, solely inspired by Steven's use case: that a novice site he built couldn't figure out [!--break--]. Fine, that's understandable. But then he advocates this new change with "HTML is the most used format around", but refuses to believe that valid XHTML is useful, claiming that "there is absolutely no reason to require the XHTML closing slash". What is your rationale when XHTML does require that [br /] and [img /] both have closing slashes? Is it solely to satisfy your novice user site? Fine, no problem, but then don't rationalize the feature as "HTML" - it's not, especially since Drupal does nothing with HTML (we care to send valid XHTML, not HTML). Likewise, selling something as HTML, which is meant to be interpreted and displayed, and then saying that it's stripped from the content, no harm no foul, is contradictory.

But then, you argue that the element is removed from output. That's a bum reasoning too - this is data in the database, and you simply can't assume that Drupal will be the only application ever using it. With HTML comments, at least the /most prevalently displayed format/ could continue to validate properly, since the break tag was in a comment. With this new change, any other application must be specifically told to strip [break] if they're planning on using the data and wish it to validate properly.

Finally, you commit the egregious sin of modifying the user's data with this change, even though you advocate that the old comment will continue to work (system_update_1018). That's making a gross assumption about the user's data and how the user makes do with it.

No, I am NOT suggesting that [!--break--] is the awesomest thing in existence. I am merely suggesting that this was the wrong way to "fix it" when all your rationale ("it's HTML, whee!") was sold to us with unspoken ("my novice users can't understand [!--break--] or [break /] for that matter") and contradictory ("it's HTML! No, not XHTML! HTML! I know, I know, we spit XHTML but I'd rather my users learn the incorrect [img] than [img /]") motives.

To specifically address, "This is a poor argument because we already have our own special flavour of HTML: ... and much, much more. If you take the content that is stored in the database and display it literally, you will not only get something completely different from what was displayed on the Drupal site, you will also open yourself to security issues." I can only say "And"? I don't find the followup particularly compelling: they're all value-adding (and note that urlfilter-in-core only happened recently) to the original content, but they're all at the request of the user and thus, the assumption is made that other applications will access and interpret the same thing. This change, especially the forced str_replace in the system_update, is an application-specific one not requested by the user and which must now be specifically addressed outside the normal operating antics.

Whatever the outcome of this I think any break command pretty much stinks from an end-user point of view. It is very difficult to get most community-type end users to correctly use it and/or use it at all. For 6.0 I hope we finally see separate teaser/body input fields. I actually managed to do this with 4.7 without hacking core, so I'm sure it can be done very easily.

In the meantime, changing what people are already familiar with for such a modest 'gain' seems like a dubious thing to do...

A bit late into the action, but I think the new <break> tag opens up a can of worms ..

it is a non html tag that looks exactly like html but really messes up rich text editors and html validation and maybe even the output...

suppose you use this tag in a node with a full html or php format, is it stripped then?
will it be reliably stripped if someone has other filters than the default

if the old tag slips through the filters, and also while you are editing, you at least have valid html/xhtml ...

I vote to undo this patch, it does solve the one problem of being simpler, but it introduces a lot of new problems and possible breakage.

Status:Fixed» Needs review

reopening this issue

Priority:Normal» Critical
Status:Needs review» Active

ok chx... active is a better status :)
Critical too, because it directly affects what end users will encounter in the normal use of the site

I also think this was an unwise change. If anything [break] is more in line with all the other filters we employ.

I wasn't too fond of <!--break--> either, but I definitely think <break> is the wrong thing.

I'm also worried about the fact that this was changed in between a beta and an rc.

if anything, it was 'fixing' something that wasn't broken.

I'm in favor of rolling this patch back. I agree with the arguments against it.

Despite the fact I'm personally quite happy with the html comment style of break marker I can see Steven's point that it's verbose and hard to use. But I also agree with the assertion that this is not a good fix. The sort of users that can't handle a comment break marker are those that find the whole concept of a break marker hard to handle. They need hand holding and are better served by wysiwyg editors.

+1 for a rollback

This patch received no positive reviews from anybody who reviewed it (aside from the core maintainers). It has received 11 negative reviews. The only question remaining in my mind is whether we need to provide an upgrade path for those users who have allowed their data to be altered by system_update_1018().

-1 on the exact proposed change.

If most (all other?) filters use [filter foo] syntax, then probably we ought to stay consistent.

I am also concerned about using filter strings which might occur in normal text where they are not meant to invoke a filter. Suppose I wanted to instruct the user of some application to press the break key? I have actually seen printed user manuals where non-symbol/alphanumeric keys on the keyboard were all specified as <shift>, <insert> and yes, <break>.

Maybe it's an extreme example, but we should try to avoid tripping up users, especially novices, with things like filter strings. The more cryptic the filter string syntax, the less likely something like this will happen. Nothing's perfect, of course, but we should at least be cognizant of such possibilities and weigh them fairly when picking a syntax.
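The user-manual case can be shown with a few lines of Python. This is only an illustration with a deliberately naive splitter (`naive_teaser` is a made-up helper, not Drupal code), assuming the marker is matched as a literal string:

```python
def naive_teaser(body, marker="<break>"):
    """Return everything before the first literal marker (whole body if absent)."""
    # str.partition returns the full string as the first item when the
    # marker is not found, so no special case is needed.
    return body.partition(marker)[0]


manual = "To stop the program, hold <ctrl> and press <break>. Then restart it."
print(naive_teaser(manual))
```

The teaser is cut off right before the key name, even though the author never meant to mark a break there.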

I'm also in favor of separate teaser and body fields. With Javascript goodness, we can even prepopulate the teaser field on the fly if the user so desires.

Chris: could you clarify your position a bit? The patch is already in RC1, the proposed change at the moment is to roll it back. Your arguments didn't make it 100% clear which you are in favor of.

Robert: sorry, the "proposed" word was superfluous and misleading. -1 on the exact change that was committed and in RC1.

I think we should roll it back, yes.

Some use cases involve having databases that are shared by Drupal and legacy systems. An HTML comment is innocuous, well-documented and widely used outside Drupal. A pseudotag is not. +1 for rolling this back.

Unlike what some people suggested on IRC, I did not force this patch into core with a handgun to Dries' head. He agreed with me that <break> is much easier to use and remember for end-users and in line with our principle to prefer simple things (KISS).

My long term goal is indeed to get a textarea splitter into core where you can easily split a post and join it again. However, this is something which we cannot do for 5.0. So, I did the next best thing: get rid of the cryptic syntax and improve usability for now.

So, with the risk of sounding like a broken record, I'll address some of the earlier points:

The sort of users that can't handle a comment break marker are those that find the whole concept of a break marker hard to handle. They need hand holding and are better served by wysiwyg editors.

The practice of using "Smart HTML" (i.e. where linebreaks are added, URLs are autolinked, ...) has existed for ages and is still the most popular way of commenting on blogs. People are familiar with tags, even if they can't use them 100% properly. Having to remember <break> to insert a break is much easier if you don't have to remember 4 unrelated dashes and an exclamation mark to go along with it.

Some people here make it sound as if there is a chasm between the wise web developer and the oafish non-developer who can only communicate by banging his hands on the keyboard or use WYSIWYG. This is not true. However, for a non-developer, <!--break--> is a lot more cryptic than <break>.

This is e.g. why people have trouble remembering passwords, but rarely forget phone numbers or PIN codes: they are used to dealing with numbers, but not complex alphanumeric or even symbolic sequences.

I'm not doing this just because a single user on one of my sites complained. I believe this significantly improves the out-of-the-box usability of Drupal 5.0 for a large amount of our users.

If most (all other?) filters use [filter foo] syntax, then probably we ought to stay consistent.

No, because there is no technical reason to use square brackets. The filter system makes it perfectly possible to introduce custom or enhanced HTML-like tags (see codefilter, which enhances <code> tags). Not a single core filter uses square brackets either. AFAICT, this practice only started because the filter system was not mature enough then, or because the developer simply assumes that HTML-like filter tags will not work.

We can't help it if contrib authors deliberately introduce yet another syntax.

"there is absolutely no reason to require the XHTML closing slash". What is your rationale when XHTML does require that [br /] and [img /] both have closing slashes?

The tag is removed before it is filtered and never appears in output. So, we are left with the argument that "we should keep valid XHTML in the database?" Why? XHTML validity is about much more than closing tags. It's about which tags go where... and after the content is filtered, it looks completely different from an XHTML point of view than before. Ensuring XHTML validity on both ends

The only people who add closing slashes to tags are web developers, and even they suck at it. Given the fact that I routinely correct bad HTML on Drupal.org, whose audience is highly technical, I think it's fair to assume that the typical Drupal site (whose audience is not technical) has absolutely nothing to gain from requiring the syntax <break /> over <break>.

Finally, you commit the egregious sin of modifying the user's data with this change, even though you advocate that the old comment will continue to work (system_update_1018). That's making a gross assumption about the user's data and how the user makes do with it.

The old syntax was removed by Dries' suggestion. I wanted to do this myself, but I didn't at first in an attempt to placate the masses. I guess that didn't work out.

However, the update makes absolutely no assumptions that Drupal has not been making for several years already. Find me a node that is broken by this update, which was not broken before. Test it and verify. I'm confident you won't find any.

Some use cases involve having databases that are shared by Drupal and legacy systems. An HTML comment is innocuous, well-documented and widely used outside Drupal. A pseudotag is not. +1 for rolling this back.

Do me a favor. Take the content of your site, as it was typed into the database, and do something with it. Anything.

Oops, the linebreaks don't show up. Those URLs aren't clickable. That horrible img that that guy posted in his signature, but which was blocked by the HTML filter? You see that. If you are technical (you are, because you care about XHTML), you probably used codefilter.

The fact is that all content on a Drupal site is created with the intention of publishing it as it appears after being filtered. That's how our workflow is.

If you wish to do something else with the content, you either need to accept that it looks completely different, or postprocess it. Removing the break tag is just another necessary step in this procedure.

I won't spend any more time on this issue as I simply keep repeating the same thing over and over again, and people are expecting things that are simply not true. I do know that I will be running this patch on each and every one of my sites if it gets rolled back, because I see no real, practical disadvantages.

I do think there is a degree of both resistance-to-change and fear-of-HTML-which-is-not-HTML here. For the latter, I can only say that our current filtering pipeline has been pretty much unchanged since Drupal 4.4. Since then, we only added input formats. It still works fine, and advanced filters like codefilter.module can do their job without hindering others.

but there is a technical reason to use square brackets -> they don't break html when unfiltered, they are hardly used in real life, they are easy to filter (easier to filter than html brackets, especially if you have to check for both html and bbcode like tags at the same time)

Your point that <break> is easier than <!--break--> is not in dispute... it is easier. I also do not care about changing the data in the database.

But the short term drawbacks are such that I still would like this change to be rolled back. Those drawbacks in my opinion are: breaks backwards compatibility, introduces a non HTML tag that looks like an HTML tag, it is not consistent with other methods, does not work well with rich text editors without requiring modifications to those, it is a non standard solution (why not use <hr> if we must have a teaser break, or use <div class="teaser">Teaser here</div>), there are better solutions (modules like excerpt, superteaser)

The user point of view:

- If your users are not HTML coders then you install a WYSIWYG editor. In this case it's a nightmare for these users to insert a <tag> (change mode to 'source', find the right position, insert the tag, go back to 'wysiwyg' mode). For these users a mark like [break] or /break/ is the simplest solution.

- If your users have a minimal background <break> or <!--break--> is not a problem.

The best solution (IMHO) is to accept each solution:
- a real xhtml (invisible) tag: <!--break--> or <break /> (for technical users)
- a plain text tag like [break] (for non technical users)

Priority:Critical» Normal

Sorry, I really don't think this is critical.

My 2c: Yeah, a filter break tag like [break] is what I rallied for in #6 for the WYSIWYG editor reason. I know there is a drupalbreak plugin for TinyMCE, but having a visible break tag for ALL rich-text editors as well would improve usability. When I think of usability in this case, I'm thinking of the most novice user who knows zero about HTML. In these instances I assume the site admin has set up SOME sort of WYSIWYG editor and like stated above, expecting the most-novice user to switch to HTML mode to type in <break> is not too user-friendly.

However, with CCK you can easily setup a content type with a separate teaser/body field so this issue is really not that big of a deal, just wanted to give my point of view.

I was actually working on making the <!--break--> tag visible in FCKeditor. Making that visible is no different from making a [[break]] placeholder visible; it's just some work. <break> is no different. I believe that it should be possible for TinyMCE too, but I'm not familiar with the inner workings of that module.

Priority:Normal» Critical

I just noticed that the current RC and HEAD checkout have no backwards compatibility for <!--break-->, whereas the initial patch supported both <!--break--> and <break>.

I think it should at least be backwards compatible for now (that's why I marked it critical again)

Actually, [break] would always be visible in a WYSIWYG editor since it is just plain text, while <break> is HTML so would be hidden behind the scenes, thus requiring each rich-text editor to implement a solution to make it visible to the end-user. That's the issue I was trying to bring up.

For WYSIWYG, there should be a button to insert a break automatically, and it should be displayed in a nice fashion (horizontal rule with 'read more' label or something). This functionality has existed for a while now.

I know TinyMCE has the Drupalbreak plugin so we're golden on that end, I just wasn't sure about the other editors. I'm going to go concentrate on some other stuff since I could pretty much care less about how this one goes.

For WYSIWYG, there should be a button to insert a break automatically, and it should be displayed in a nice fashion (horizontal rule with 'read more' label or something). This functionality has existed for a while now.

Let me get this straight... The problem isn't that we are using an HTML-invalid chunk of text for a tag... It's that everyone else isn't smart enough to work around what we've done?

Seriously, how freakin' hard is it to just use ? Why was the

tag so horribly repugnant that we had to break WYSIWYG editors to get rid of it?

Case in point; <break/> was filtered out of that comment transparently. ;-)

@39: Well, I guess the problem was noticed by the RTE users first, so they should fix it :P (note the smiley, please)

Why do we try to keep drupal clean and standards happy if we introduce non-standard tags / funny workarounds in the way we handle our content?

More than XHTML validation which, as Steven points out, is about much more than just closing tags, adding this (self-censored grumbling) syntax even stops the content from being well-formed XML once wrapped in a root element, although this is far less taxing than achieving XHTML compliance.
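The well-formedness point is easy to check for yourself. Here is a small sketch using Python's standard `xml.etree.ElementTree` parser; `is_well_formed_fragment` is a hypothetical helper written for this comment, and the sample fragments are made up.

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment):
    """Wrap the fragment in a root element and report whether it parses as XML."""
    try:
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


# The HTML comment is ignored by the parser, so the fragment stays well-formed.
print(is_well_formed_fragment("<p>Intro</p><!--break--><p>Rest</p>"))
# The unclosed <break> element makes the parser fail at the closing root tag.
print(is_well_formed_fragment("<p>Intro</p><break><p>Rest</p>"))
# A self-closing <break /> parses as XML, although it is still not an XHTML element.
print(is_well_formed_fragment("<p>Intro</p><break /><p>Rest</p>"))
```

This only tests XML well-formedness of the stored input, not XHTML validity, and says nothing about the rendered output, where the marker is removed anyway.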

Beyond that, lessening the evil thus wrought would probably be allowing both the valid-XML syntax we've known previously and upon which some modules can build (like valid_node), and the new syntax.

Is there actually a case for removing the normal syntax instead of just allowing the new one ? Is there a specific defect to be expected from maintaining the previous syntax ?

An argument that also seems to have been very little developed is the fact that this change has been enacted to simplify syntax for non-savvy contributors. This is the typical - and valid - argument for wiki-like syntaxes, but then why go 1% of the route, breaking compatibility, and leave the 99% remaining syntax to be done (liquid wiki, anyone ?) ?

IMHO, simplifying user input is a worthy goal, as I've had contributors complaining about having to use XHTML fragments as input instead of just plain tag soup or a wiki syntax (hence valid_node.module), but just breaking XML well-formedness without any added value does not seem a proper way to go.

Obviously, I'm voting -1 on the patch and +1 on rolling it back, even though one of my sites already fell victim to system_update_1018().

IMHO, this doesn't seem like much of an issue, other than that it requires WYSIWYG editors to specifically implement support for it so that it won't break.

The only concern (that I can think of) is that people upgrading from 4.x to 5.0 should be well-informed of this change. It took me a while to figure that out when using 5.0 RC1.

Just to balance the voting going on here: I agree that it was a bit unwise to make this change in an RC, but it's definitely a big usability improvement - Steven is right on this one as well. Also, it reminds me of <front> and the similarity is a good thing even for advanced users.

Guys: to those who are arguing against this patch (which, so far, seems to be about 90% of the audience), please note that "it's invalid XML or XHTML" is an irrelevant argument that will fall on deaf ears: Steven has already said, numerous times, that a) it's removed on output, b) anything that cares about the raw data in the database should care about the <break> tag too. As such, arguments against this patch (of which, don't get me wrong, I am a vocal proponent, but most of these Followups are only helping in Quantity, not in Quality) need to be a bit stronger and more convincing.

(And, for what it's worth, <front> is a bad comparison. It has nothing to do with content.)

Add my -1 to this patch.

1) <!--break--> is already in use.
2) <!--break--> is natural.
3) <!--break--> so what that it is not output to the display.
4) <break> does cause already implemented code to break even though it isn't output to the display.
5) <break> most of us do not want it.
6) [break] is an alternative to <break> but isn't needed since...
7) <!--break--> support for this should never be removed.

Morbus,

Your answer to the (valid-XHTML / well-formed XML) fragment argument centers on output, but this is not where the problem is : as Steven already pointed out, this marker is removed upon node rendering.

The problem is of concern to modules processing node content, which all of a sudden can no longer rely on input being well-formed, which is a significant regression on the march towards valid content whatever the user input, which anyone can reasonably expect from an evolving CMS.

Would Drupal be the one to go backwards? Say it ain't so.

fgm: Steven's response to that would likely be that these modules will need to specifically cater to the <break> tag, as they are no different than WYSIWYG editors that should have magical buttons applied to them (#37) or other external processors (#30: "...or postprocess it. Removing the break tag is just another necessary step in this procedure.").

  1. <!--break--> is already in use. This is totally invalid, it's against any development.
  2. <!--break--> is natural. Really? Natural how? Why not <!-- -- --> then for example which looks like a horizontal line so it's also natural.
  3. <!--break--> so what that it is not output to the display. Great. <break> is not outputted either.
  4. <break> does cause already implemented code to break even though it isn't output to the display. Since when is Drupal BC? You knock on the wrong door.
  5. <break> most of us do not want it. Who is us? Steven's answers placated many developers including me.
  6. [break] is an alternative to <break> but isn't needed since... since we do not support bbcode like stuff.
  7. <break> support for this should never be removed. Again, BC.

The last is !--break-- of course.

One additional thing strikes me, in addition to the technical aspect. Dries said the current (ok, past) marker should be deprecated (http://drupal.org/node/87145#comment-156903), not removed.

Maybe allowing both the old, correct, syntax and the new one during the lifetime of D5 would allow time for both points of view to evolve : Steven could expand on the grand view he mentions (http://drupal.org/node/87145#comment-169474) for D6 and convince everyone, while some more generic input conversion could be cooked at the same time, allowing for instance by default modules to receive valid content whatever the user input format used, including such tag soup, to continue progress towards valid output, which is something drupal currently doesn't do yet, experiments like htmltidy.module and valid_node.module notwithstanding.

Status:Active» Closed (won't fix)

This is a clear example of a color of the bikeshed issue.

The number of developers who chime in to a form API or menu API discussion is at most a few dozen. But everybody can have an opinion on the break tag, and I mean every Drupal user out there. This discussion therefore is moot.

1) <!--break--> is already in use. This is totally invalid, it's against any development.: I'm using it now. It was the suggested use for an issue I had with one of my pages. I'm sure that since it was the suggested fix that others are using it.

2) <!--break--> is natural. Really? Natural how? Why not <!-- -- --> then for example which looks like a horizontal line so it's also natural.: It is natural because I'm already thinking in HTML tag input mode and I wouldn't use some other rendition because of item 1). Since I'm already thinking HTML then <break> too closely resembles <br> and one could easily become brain dead with the difference and this is especially true with newbie HTML coders. The argument that Drupal already changes HTML may be a valid argument but *NOT* in support of adding yet another difference as I would like to see fewer changes to HTML especially when the Full HTML Input Format is chosen.

3) <!--break--> so what that it is not output to the display. Great. <break> is not outputted either.: So, yes, it means that the Drupal output side isn't what is broken.

4) <break> does cause already implemented code to break even though it isn't output to the display. Since when Drupal is BC? You knock on the wrong door.: Implemented third party code. <break> causes third party code that used to coexist with Drupal not to coexist any longer.

The reasons to keep the <!--break--> tag as it is in my opinion:

I have found it useful when I train end-users that the break tag is an html comment tag. This allows for them to understand that comments are things that can control things behind the scenes and keeps them separate from html tags that they use for design of their text.

It is a good thing to use current standards, and this sort of thing belongs in an html comment. That's what it is, a user comment about content. Comment tags have been used as controls for server side includes and for triggering features of CMSs for a long time. This seems like a good thing and I wonder why there is any need to move away from it.

It is good to get users to think about the break tag differently than a br, p, span or other tag that is meant to affect the visual display of the actual content (as opposed to a tag that determines an attribute of the content for the cms to act on).

But, I am also a bit strange in that I tend to avoid any wysiwyg editors. I find that they all end up confusing users rather than educating them or giving them the feeling of control over their content that is the goal of using a CMS. (Using quicktags.module I can give them an easy way to modify their content and let them learn html at the same time.)

To address some of the points made in support of this change

An HTML comment style tag is:

* Unnecessary: it is removed from the output anyway and can take any form we wish.
* Complicated: it adds 5 punctuation characters to type, and is asymmetrical to boot.
* Inconsistent: Having to explain to the end user that the "break tag" uses a different syntax from the "bold tag" or "link tag" is confusing. This is important as we're dealing with the out-of-the-box experience here.

Unnecessary: some tag is necessary, so the question is why make a change like this? Why is the change necessary? I feel the burden is on those supporting a change to defend why it is needed.

Complicated and too many characters? This is far from a valid reason to make this sort of change. Why not just make it one character? Users can be taught that this is what is called an html comment and get an understanding of something that applies to any web document, not just drupal. I see that as a positive thing.

Inconsistent: From my experience training people in using drupal over the past few years, they never have a hard time once they understand the distinction between an html comment and an html tag that affects rendering/style of content. I have never had this issue damage a user's initial experiences with drupal.

And finally, issues that affect how end users learn and grow by using a tool are far from a color of the bikeshed issue. This is about developers making changes to a codebase that are not necessary and will cause confusion to end-users that have made a commitment to drupal and are looking forward to upgrades that don't force them to unlearn valuable knowledge.

Two more points:

It has been said that since the filters act on this text that the arguments about what is semantically better are moot. This might be true from a developers standpoint, but hardly from the end-user's standpoint.

My concerns are with what the user inputs in the node content, how the end-user thinks about the content and the cms and how that reflects in the codes used.

What they might be able to apply from earlier experience and what might be the most useful and platform agnostic/web standard way of doing something is where I would rather us look.

Secondly, unless the old break tag is also supported what does this mean to end users when they upgrade? Will the upgrade script now have to sort through the content of nodes to change node content from the old break tag to the new one? That seems like a lot of needless db updates. What will that mean to server load for people on shared servers?

If both tags are going to be supported, won't that end up causing more confusion for end users in the long run? more code to maintain in the long term?

What valid reasons are there to make this sort of last minute change?

Let me chime in on this.

-1 for this patch. Please roll it back.

It is true the new tag is not output, but this is only valid if the database is not shared with another application.

A [break] tag is more consistent with the rest of the filters we use (img_assist, adsense, and a bunch more), and can be input equally easily with or without a WYSIWYG editor in place.

+1 on no functionality changes during a freeze.

I wrote:

If most (all other?) filters use [filter foo] syntax, then probably we ought to stay consistent.

Steven replied:

No, because there is no technical reason to use square brackets. The filter system makes it perfectly possible to introduce custom or enhanced HTML-like tags (see codefilter, which enhances tags). Not a single core filter uses square brackets either. AFAICT, this practice only started because the filter system was not mature enough then, or because the developer simply assumes that HTML-like filter tags will not work.

We can't help it if contrib authors deliberately introduce yet another syntax.

Steven mixes the arguments up completely here. He first claims this change is for usability, but then says there is no technical reason to use square brackets. Note that I said if most filters use square brackets, we should be consistent. The plain fact is, consistency adds to usability, regardless of any technical arguments one way or another. If most of the filters used most commonly by users use square brackets, whether they are core filters or not, then that is what users expect for consistent behavior. What core does is almost irrelevant.

chx later writes:

This is a clear example of a color of the bikeshed issue.

I think making a statement like that might encourage most non-core developers to simply shut up. While it might be true that many people commenting on this issue simply have an opinion about the color of the bikeshed, the fact is some of us have a whole lot more experience and expertise in usability issues than others. I count myself among the resident usability experts (we can argue that point elsewhere if you wish), but I'm not sure chx or Steven are expert, but am willing to be convinced. My point is -- not that I do not respect chx's and Steven's opinions, or the vast amount of incredible work that they've done -- but rather, it's unfair and unwise to simply cut people off using the bikeshed argument.

I agree with those who say <break> is simpler than <!--break-->. But is it enough simpler to make the change and inconsistency worth the trouble? If we're going to change things in the name of usability, then let's be sure we are really pursuing real, better usability for the majority of our audiences, and not just our own personal small sample.

I'm not arguing the technical merits at all, here.

I'd also like to see this patch rolled back.

Just some thoughts...

And if we completely deprecate <!--break-->? Stop for a minute and think: what do we really use <!--break--> for? To do some "adjustments" to the teaser output of some nodes?
What nodes and with what regularity? Blog posts? Articles? Book pages?

If we watch closely, Drupal has a feature that does auto-breaking (?q=admin/content/node-settings) for a maximum number of characters. If this auto-breaking was smart enough we would never want the <!--break-->.

But you ask, "Oh! And if we want a smaller teaser?" In this case, you probably want a *different* teaser and not just a smaller one. And for this to happen you need the excerpt.module, which allows the configuration and editing of the node teaser! Anyway this should be a core function and not a contrib one.

Summary of what should be the future:
* Removal of the <!--break--> tag
* Definition of maximum number of characters per node type (and not global)
* Development of a smarter auto-breaking (many improvements already done)
* Inclusion of excerpt.module in core, allowing the definition of manual and automatic teasers

Any questions?!

Note to myself, always do a preview. <!--break--> tag missing in some sentences...

-1
For reasons I explained at length in this related thread I really think this move is a bad idea.
Arguing about {break} vs [break] may be a 'colour of the bikeshed issue' (a new phrase for me :) but comparing current, working, valid markup (or markdown) with this new alternative which deliberately breaks XHTML validation, WYSIWYG editors and third-party integration for no appreciable benefit is the difference between code that compiles and code that doesn't.

This is NOT an HTML tag. Encoding it as one is bad code. Come up with something, anything else, OK, but please make it XHTML-valid and do not create mutant pseudo-markup that will cause future compatibility issues.
One scenario is that these users who you think are too stupid to comprehend how to copy a comment tag properly will learn that <break> is part of this mysterious HTML language, with annoying misconceptions and re-education needed. Maintainers for years to come will be cursing Drupal for introducing this stupid lie into the mainstream. Compare with the <wbr> non-tag debacle.

<br class='pagebreak' /> would work for me

From a technical processing standpoint, it's just a strpos() call in node_teaser(). That means there is no PHP-centric reason for one magic incantation over another.
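To illustrate how little the processing side cares about the exact spelling, here is a minimal Python sketch of the idea (not Drupal's actual PHP code). The function name `split_teaser`, the 600-character fallback and the plain truncation are assumptions made for illustration; `str.find` plays the role that strpos() plays in PHP.

```python
def split_teaser(body, delimiter="<!--break-->", max_length=600):
    """Return (teaser, cleaned_body) for a node body.

    Hypothetical helper: the name, the default max_length and the plain
    truncation fallback are illustrative assumptions, not Drupal's code.
    """
    position = body.find(delimiter)  # stands in for strpos() in PHP
    if position == -1:
        # No delimiter present: fall back to a simple length-based cut.
        return body[:max_length], body
    teaser = body[:position]
    # The delimiter is removed from the full body as well.
    cleaned = teaser + body[position + len(delimiter):]
    return teaser, cleaned


teaser, full = split_teaser("Intro text.<!--break-->The rest of the post.")
print(teaser)  # Intro text.
print(full)    # Intro text.The rest of the post.
```

Swapping the `delimiter` argument for `<break>` or `[break]` changes nothing in the logic, which is the point being made: the choice is about what sits in the database and what users type, not about parsing.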

From a WYSIWYG editor standpoint, for any WYSIWYG to support "put teaser break here" it will need a function to insert a magic incantation. However, it will also need to know how to render that, or not render it, in WYSIWYG mode. Since there's no standard rendered form for a page break, they'll have to invent one. I've not written a WYSIWYG editor, but I imagine finding something that doesn't look like HTML (which is rendered as HTML) would be easier than a pseudo-HTML, since you wouldn't need to mix the renderers. (Someone who works on a WYSIWYG, please correct me if that is in error.)

From a "correctness" standpoint, I don't buy the "it's data so it doesn't matter if it's valid or compatible" argument. Two weeks ago, I needed to run an XHTML-correctness fixing script against the data on a site, at a client's specific request. Having to hack such a script up to ignore a <break> tag (or a <break /> tag, which is no better here) because it looks like HTML, tastes like HTML, but isn't HTML would make my life more difficult. That harms developer usability.

It's also out of character for Drupal. One of the things I like about Drupal is that it goes out of its way to comply with open standards; XHTML output by default, accessibility extras automatically, using @media in CSS, even going as far as using all the various different HTTP return codes. So why then make an exception and say that in this one case we won't make sure that we follow standards, we'll actually go out of our way to change something to not be standards-compatible and dismiss it as OK because "it's just data"? That really doesn't sit well with me. And if we pull data out to anything else at all, then having renderable text (like [break]) in the database is one more step to have to fix when doing, say, an export. Has anyone considered the impact on import/export routines to other systems? We have a nice new API module for that; this could break it unpleasantly.

From a user-usability standpoint, users who are too ignorant to know what an HTML comment is won't understand what HTML is in the first place either. (I'm using the word "ignorant" here in the descriptive sense of simply not knowing, not saying that they're stupid or lesser.) For them, it's just an arbitrary string of text with magical meaning they don't understand that they are going to copy and paste out of the help text at the top of the page. Whether it has square brackets, angled brackets, or angled brackets and dashes doesn't actually make a difference. Copy and paste doesn't care, and it's all gibberish to this user anyway no matter what you do.

For users who would know the difference between a tag and a comment in the first place, for whom it's not arbitrary gibberish, they would recognize the difference between a tag and a comment and, I believe, understand why a comment makes more sense than a fake-tag. They would at least understand what a comment tag is and be able to recognize if they forgot a dash (or for that matter forgot a closing angle bracket, or whatever other magic incantation we use). Using an HTML comment is not a usability problem for these users in the first place.

I am all in favor of rolling back this patch completely and leaving it as an HTML comment. I do not see how it is a usability problem (users would either recognize it and know what it is or wouldn't know either way as it's just magic gibberish), using a renderable string makes import/export (and possibly WYSIWYGs) more complicated, and using a pseudo-HTML tag breaks Drupal's strong pro-standards policy and makes life more difficult for developers.

A possible workaround to make <break> liveable with the WYSIWYG problems is <!--<break>-->, which I've tested with the current implementation via the default Drupal 5 input form. I don't have WYSIWYG installed, so someone else needs to verify that this workaround actually achieves the goal.

Umm...the point is to make the <break> tag more usable. I don't think <!--<break>--> is going in the right direction. WYSIWYG's can be made to deal with the new tag, but it will take a lot of work on contrib modules with very little gain in usability. Some points people have made are valid, but there are a lot of incorrect assertions about this.

RobRoy: his comment says that <!--<break>--> works /right now/ and is "valid" HTML or XHTML.

Hrm. Although, I'm confused how it would work - would seem that teasers would have the start of an HTML comment that is never finished (unless our output filters remove HTML comments...).

I checked the output source of both the teaser and full mode. The comment pieces were not present.

-1 for the change
+1 to rollback before 5.0.0

It's a needless change, IMO. I don't even see why it was brought up, myself.

If it must be done, then Drupal should simply add an option in the settings near the teaser length setting for what tag the site administrator wants to use (freeform text field) and display that in the format help under text areas. That's if you guys insist on this silly change. I really don't think it's needed and feel it was committed far too quickly. This changes content creation patterns. For a CMS, that's a big change, even if it's just a few letters here and there.

Bad monkeys, no banana.

Another +1 for the rollback. We gain nothing in terms of usability with the change from <!--break--> to <break>; both are tokens users must learn. It breaks WYSIWYG editors that have *already* been ported to Drupal 5.

API compatibility during the code freeze should mean that 'a module I port to the next version of drupal, that works correctly, should not require more changes to be compatible with subsequent betas, RCs, etc.' If we want to change the delimiter token in Drupal, and provide some slick Javascript insertion tool as has been mentioned, by all means. Let's do it in Drupal 6.

Another -1 for the change; +1 to rollback before 5.0.0

While I agree that <!--break--> is a bit cumbersome, there are some very good reasons to keep it for now, and as far as I can see only spurious arguments in favor of changing it at this time (particularly the proposed syntax). This is especially true if this is only an interim solution until the next rev of Drupal, as Steven suggests:

My long term goal is indeed to get a textarea splitter into core where you can easily split a post and join it again. However, this is something which we cannot do for 5.0. So, I did the next best thing: get rid of the cryptic syntax and improve usability for now

The problem is it’s not the “next best thing,” especially if <!--break--> is disavowed and not just deprecated. Clearly from the number of posts and requests for rollback, this issue needs a better discussion; it should not be surreptitiously slipped in between beta and RC. Furthermore, if the next release of Drupal is going to eliminate the break tag altogether, then this change becomes an unnecessary change for the sake of change, and that is very disruptive to the very people this change is purported to help: naive users.

I feel this change is ill-advised, but if it is to be made, then <!--break--> should continue to be supported and the DB should not be automatically updated (i.e. don’t needlessly mess with the user’s data without their consent).
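Continuing to support the old tag alongside a new one could look roughly like the following Python sketch. This is only an illustration of the idea, not the committed patch: the function name `find_break` and the tuple of accepted delimiters are assumptions.

```python
def find_break(body, delimiters=("<!--break-->", "<break>")):
    """Return (position, delimiter) for the earliest accepted delimiter.

    Returns (-1, None) when the body contains none of them.
    Hypothetical helper; the accepted spellings are an assumption.
    """
    best = (-1, None)
    for delimiter in delimiters:
        position = body.find(delimiter)
        if position != -1 and (best[0] == -1 or position < best[0]):
            best = (position, delimiter)
    return best


print(find_break("Old style<!--break-->rest"))  # (9, '<!--break-->')
print(find_break("New style<break>rest"))       # (9, '<break>')
print(find_break("No marker at all"))           # (-1, None)
```

With a lookup like this, existing content keeps working without any database rewrite, at the cost of a slightly longer code path and two spellings to document.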

Status:Closed (won't fix)» Needs review
Attached file (status: new, size: 1 KB)

The title is 'more usable'. Then let's make it so. Let there be a compromise. Let's introduce a variable, without an interface. Advanced users can go into settings.php, do an override and manually run one update without an interface. Less advanced users can wait for the 'break tag' project to emerge, which I am sure will happen if the patch gets in. Steven's concern was with the update path and editing existing posts. The workflow so far dealt with existing posts. Update path? Well, advanced users can re-run their SQL to change back to break before update, and less advanced users can use the break module update routine.

Status:Needs review» Reviewed & tested by the community

This will open up the possibility for a module that configures the break tag for people who need a different one. All they need to do is update the nodes similar to system_update_1018 whenever the break tag is changed.
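As a rough picture of what such a module would have to do, here is a Python sketch (Drupal itself is PHP, and this is not the patch under review). The settings key `break_tag`, the dict standing in for settings.php overrides, and the in-memory list standing in for stored node bodies are all assumptions for illustration.

```python
DEFAULT_BREAK_TAG = "<!--break-->"


def get_break_tag(settings):
    """Look up the delimiter, falling back to the historical default.

    `settings` is a plain dict standing in for overrides in settings.php;
    the key name 'break_tag' is a hypothetical choice for this sketch.
    """
    return settings.get("break_tag", DEFAULT_BREAK_TAG)


def convert_break_tag(bodies, old_tag, new_tag):
    """Rewrite stored node bodies from one delimiter to another."""
    return [body.replace(old_tag, new_tag) for body in bodies]


settings = {"break_tag": "<break>"}  # site-specific override
stored = ["Intro<!--break-->Rest", "No delimiter here"]
converted = convert_break_tag(stored, DEFAULT_BREAK_TAG, get_break_tag(settings))
print(converted)  # ['Intro<break>Rest', 'No delimiter here']
```

The important design point is that the variable and the content conversion travel together: changing the setting without rewriting existing bodies would leave old posts with a delimiter that is no longer recognised.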

I would be very pleased with that.

A big +1 on this. I agree with Steven that the break syntax is less than ideal, and should be changed. But enough valid points have been raised about problems caused by the new syntax that we should pay attention.

#72 seems like the perfect solution to me. We'll never get agreement on this, so why not make it configurable?

I strongly agree with the importance of not breaking the XML well-formedness of the database without a very good reason, and this is not one.
It's fine to make the break tag configurable, but by all means keep it to <!--break--> by default in 5.0! People can change it to <break> in their settings.php if such is their fancy, but please leave people's database content alone if they don't want it changed.
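The well-formedness concern can be checked mechanically. The sketch below uses Python's standard `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the wrapping `<root>` element are assumptions added so that a body fragment can be parsed as a single document.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML once wrapped in a root."""
    try:
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<p>Intro</p><!--break--><p>Rest</p>"))  # True
print(is_well_formed("<p>Intro</p><break><p>Rest</p>"))       # False
print(is_well_formed("<p>Intro</p><break/><p>Rest</p>"))      # True
```

Note that this only tests well-formedness. A self-closed `<break/>` parses, but it is still not an element defined by XHTML, so it would not pass validation against the XHTML DTD.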

+1 on rolling back the change and applying chx's patch to make the break tag configurable. That's a viable compromise, it won't break anything unless people choose to change the break tag...

+1 on the configurability patch by Karoly.

This opens the page to WYSIWYG editors to change the tag to what they want too.

The default should be the same as today, i.e. <!--break-->. This way, no upgrade path is needed until contrib module(s) emerge that can change the tag, as well as update the database.

Since there is no user interface for this at present, then we don't need to worry about documenting to the end user the fact that changing this requires database changes, ...etc. The contrib modules can worry about that, and perform an update (or ask for confirmation) along with doing the change to the variable.

If we watch closely, Drupal has a feature that does auto-breaking
?q=admin/content/node-settings for a maximum number of characters. If this auto-breaking was smart enough we would never want the...

It's an art to determine what should be part of a teaser and what should not be. You cannot solely rely on software to do this. It's not only a question of length.

Version:5.x-dev» 5.0-rc1

Moving to 5.0-RC1, and giving chx's patch my +1.

Version:5.0-rc1» 5.x-dev

Let's roll it back.

+1 for making this something that can be configured by a simple edit of a file
+1 for the idea of eventually making an interface to this
+1 for keeping the default as a valid html comment, as it is now
+1 for not changing this in a default setup, so users can continue to use the knowledge they have
and finally
+1 for not forcing me to put invalid code in my database

I run many Drupal sites, and have never once received a complaint from a user that <!--break--> was "too hard" for the non-techies.

Aside from that, my main concern is that the XHTML tag <br/> is commonly called "break" when spoken aloud -- such as during a tech support call with a user. How will we distinguish <br/> from <break> or <break/> in such a verbal context? "No, I don't mean the break tag that you use for a line break in your XHTML. I'm talking about the break tag that looks like XHTML but is really just a special pseudo-tag that exists only in Drupal.... You know, Drupal. The content management software. No, Drupal isn't inside your browser. It's on our server.... No, the other break tag is for your browser, but this is a special break tag that is used on our server by Drupal...."

UGH! Now that is user-hostile!

With the old-fashioned <!--break-->, you can say "the one with the exclamation mark in it" or (for more clueful users) "the one inside an XHTML comment." To a novice user, it doesn't look like the other XHTML tags (even though technically it is valid XHTML). In other words, there is a visible syntax distinction for the novice between what they will perceive as a "Drupal tag" vs. a "browser tag".

In either case, the help prompts in the node edit screen have perfectly good syntax help for the teaser break feature. It wasn't broken, so it didn't need to be fixed, IMO.

-1 for the change
+1 for rolling back to <!--break-->

Syscrusher

I would be +1 in favor of Karoly's patch, if the default were left as is current, i.e. <!--break-->. I favor not changing the default because it appears there are a number of people who depend on it being that way in their databases, and because the effort to retrain users to use the new format is not insignificant. This is one of those surprises (from the user's point of view) that is unpleasant.

Attached file (status: new, size: 1.01 KB)

+1 for compromise but prefer the patch look like the one I've attached.

The more advanced users are the ones that should be changing the settings; the original patch still leaves some poor unsuspecting soul out in the cold. He updates to version 5.x, his DB contains <!--break--> and now his teasers are broken.

#72 is committable. #85 is not without reverting the system.update that has already happened in system_update_1018().

I prefer #85, if I have to choose. After Drupal for 6+ years using <!--break--> to suddenly change it in the middle of an RC seems ludicrous to me, especially if it breaks RSS compliance? #85 doesn't need an update, because we don't support HEAD -> HEAD upgrades.

The "real" way to fix this though is probably to make separate teaser and body fields. Then people can break wherever they want, or type something else entirely.

I don't know if that would be too big of a change to do now though; I suspect it would, but reverting the <break> change seems like it needs to happen either way.

Attached file (status: new, size: 2.08 KB)

Duh. I misunderstood Morbus. Yes, #85 needs to revert the previous update so that it's not executed for people who are upgrading from 4.7.

Btw, ignore me re: RSS compliance. I had scanned the issue and misread. Replace that with "I need to do special processing in external applications for output I get from Drupal to be XHTML compliant." Also, +1 to what syscrusher said, which is an extremely valid point.

+1 on the rollback and +1 on #85. If the default is going to stay the same I think it's best that it be the default from 4.7 and not 5rc1. Also, Update #1018 from the update.php script needs to be removed so that update doesn't happen. Or, do we undo it with another update? Would it be good to undo an update that penalizes 4.7 users in updating rather than us 5rc1ers?

+1 on the compromise: of course, it's still not a good thing to allow the disputed syntax introduced in RC1, but the evil is limited by the ability to maintain proper input syntax.

My preference obviously goes to the #85 patch, since it includes the proper behaviour (the one most followups in this issue have chosen) by default, while still allowing Steven to have the format he thinks he needs for his users. But I thank chx for daring to introduce this setting in #72, considering his position, and making it possible to get Drupal out of this situation.

Regarding Webchick's argument, it's true that any old HEAD version needn't include compatibility, and none of the commenters on this issue will really be bothered, but AIUI the problem was introduced in RC1, which has been massively downloaded, not just by coders, so it might make sense to introduce the required system_update_* to repair sites affected by the RC1 move.

Also, note that page http://drupal.org/node/86549 must be fixed if the proper syntax is restored, or if the setting is introduced.

I'm thinking that we'll need to create an install update to revert the DB's that have already been updated. Plus, the user instruction change was missed in theme_node_preview. So the #88 patch is still lacking. Later ...

Status:Reviewed & tested by the community» Needs work

The current patch misses:
- <break> (node.module)
- <break> (book.module)
- <!--break--> (blogapi.module)

I would suggest an additional DB update since the first one was included in a release candidate.

Category:feature» bug
Status:Needs work» Needs review
Attached file (status: new, size: 6.54 KB)

Thanks for identifying those other places!

I'm definitely not crazy about the idea of adding a second "undo" update, since our policy is not to support updates between non-releases. Cruft city. However, I can see why people would want that, given how late this change was added. So here's a patch that does that; I'll roll a second one that merely rolls back the previous change. Committers can take their pick.

Also, this is now a bug report, not a feature request.

Attached file (status: new, size: 6.3 KB)

Bah. NON-major.

Attached file (status: new, size: 6.44 KB)

And here's one without the revert update.

Shouldn't the variable be used everywhere if we go that route?

(also note that <!--break--> appears twice in blogapi.module)

Status:Needs review» Needs work

Bah. Of course. Sorry.

Attached file (status: new, size: 6.87 KB)

Ok, think I got them all.

The first break in blogapi is in the help, which I just removed since no other help pages have that in there before their "for more information" stuff. It is a string change, but that string would need to change either way to throw the variable in there, or to change it to the "<break>" syntax if that ends up being kept.

Attached file (status: new, size: 7.3 KB)

And the non-revert version.

Status:Needs work» Needs review

If we're talking about usability here, why not just get rid of the darn thing altogether? Based on my recollection this isn't listed on any of the 'help' sections when you are writing a post in the first place, so you either have to know about it or poke around on Drupal.org and find it. I don't often use it and I've had to go look it up a few times and I've used Drupal for years now.

There was a very simple module written, ironically enough, by Steven I think, called "excerpt". I'm not sure if it has been upgraded to 5.0. It's pretty simple: it exposes the teaser field on the editing screen. If you fill in something there then congratulations, you've set your teaser. If you don't, it defaults to whatever the settings are for that node type. With possibly a small bit of explanatory text this would be the most usable solution we could get without doing a bunch of ajax wysiwyg stuff. It also happens to be the approach that virtually every blogging tool out there uses to handle the teaser vs body concept.
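The behaviour described here, an explicit teaser field with an automatic fallback, can be sketched in a few lines of Python. This is not the excerpt module's code; the function name `choose_teaser` and the `default_length` of 600 characters are assumptions standing in for the per-node-type setting.

```python
def choose_teaser(body, manual_teaser="", default_length=600):
    """Prefer a hand-written teaser; otherwise trim the body automatically.

    Hypothetical helper: default_length stands in for whatever teaser
    length the site has configured for the node type.
    """
    if manual_teaser.strip():
        return manual_teaser
    return body[:default_length]


print(choose_teaser("A long article body...", "Hand-written summary."))
# Hand-written summary.
print(choose_teaser("A long article body...", default_length=6))
# A long
```

No delimiter is stored in the body at all in this approach, which sidesteps the whole question of how the break tag should be spelled.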

Why not roll the code from excerpt into core and ditch the break tag altogether?

Mostly because we're in RC, and can't introduce big UI changes like that this late in the cycle. But yeah, that's the direction I think we should aim for in 6.x.

There was a very simple module written ironically enough by Steven I think called "excerpt". I'm not sure if it has been upgraded to 5.0.

Updated for 5.0 and patch posted to the issue queue.

A question from IRC:

"What happens if someone changes the variable and they already have content?"

This is why we don't provide an interface for the variable. Someone has to specifically edit settings.php to add it in, or else a module would have to do it (whose author presumably would run that little one-line update script as part of the process).

Status:Needs review» Reviewed & tested by the community

Works as advertized on HEAD as of now. Before patch <break> is the break. After it's back to <!--break-->

+1 for the #100 no revert version. Though, I think instructions or a contrib module need to be given to revert because there are a fair number of people who would need it. But, I like this version because this is all between releases and just shouldn't be there.

After all, using an RC is at your own risk because it is not a release, and pre-release versions are not recommended for production sites for reasons just like this.

I tested the #99 patch as well with good results.

I'll commit to putting instructions about the variable in the Converting 4.7.x modules to 5.x page for potential module authors if this goes in.

Status:Reviewed & tested by the community» Closed (fixed)

We continue over at http://drupal.org/node/106947