Finishing up on a simple-syntax text filter module, something resembling a Textile subset, which seems to be coming together nicely despite my lack of experience with PHP and REGEX and...
One approach a filter I studied took, which I adopted, was to convert all of the \r\n and \n\r and \n characters sent to the filter by the user text area to \r characters, and explode()ing the text into paragraphs on each \r\r.
Each chunk can then be processed and encapsulated into <p>text</p> tags, easily enough. In addition, I added code which preserves additional \r line breaks as <br /> tags rather than throwing them away.
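For anyone following along, here is a minimal sketch of the approach just described. The function name simple_filter_paragraphs is made up for illustration, the <p> tags carry no id, and skipping empty chunks is an assumption of this sketch rather than something the filter I studied necessarily did.

```php
<?php
// Hypothetical helper name; a sketch of the normalize-and-explode approach.
function simple_filter_paragraphs($text) {
  // Normalize every line-ending variant to a single \r.
  // The two-character sequences are listed first so they are replaced
  // before the lone \n.
  $text = str_replace(array("\r\n", "\n\r", "\n"), "\r", $text);

  $output = '';
  // Two consecutive \r characters mark a paragraph boundary.
  foreach (explode("\r\r", $text) as $par) {
    if ($par === '') {
      continue; // assumption: skip chunks left empty by extra blank lines
    }
    $output .= "\n\n<p>$par</p>";
  }

  // Any \r still left inside a paragraph becomes a line break.
  return str_replace("\r", "\n<br />", $output);
}
```

Feeding it "one\r\ntwo\r\n\r\nthree" should give two paragraphs, the first containing a single <br /> between "one" and "two". Note that the sketch does no HTML escaping of the user's text.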
My problem/question involves this: To format the resultant HTML output, one example used ONLY \n (newline--or linefeed, in some terminology), like so:
$output .= "\n\n<p id=\"someid\">$par</p>";
Similarly, each leftover \r would be converted to a break with something like:
$output = str_replace("\r", "\n<br />", $output);
My question is whether that \n (and not \r\n or \n\r) is universally right for most text viewers. To be clear, this is purely for the HTML formatting, and NOT the way the HTML displays in the browser, of course. (It looks fine when displayed on Win98 Firefox View Page Source, but... is it right on other text viewers?)
Curtis
Comments
?
I'm slightly confused by what you mean when you talk about newlines for HTML formatting.
First, when you're sending HTML data to a client, the client is supposed to treat \n and \r like any other white space. Generally, that means that the web browser will completely ignore any and all \n and \r characters in your output. Except in very special situations, the presence of \n and/or \r in your output should have absolutely no effect on the resultant formatting of the page.
As for parsing incoming data and formatting it as HTML, that's quite another story.
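I normally strip out any \r characters that might be lurking around in the input. (Since nearly all of the insane newline sequences you're likely to see revolve around the famed \n character, \r can usually be safely ignored. The exception is text that uses a lone \r as its line ending, as classic Mac OS did.)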
Once that's done, I simply explode based on \n. {$lines = explode("\n", $input);} (Note the double quotes: in single quotes PHP treats \n as a literal backslash followed by an n.)
Then I just foreach all of the lines, wrapping good lines in <p> tags, counting up N empty lines between paragraphs, and adding style="margin-top: Nem;" to the next actual paragraph.
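Roughly, the steps above could be sketched like this. It's only an illustration: the function name lines_to_paragraphs is hypothetical, and I'm assuming blank lines before the first paragraph are simply dropped.

```php
<?php
// Hypothetical function name; a sketch of the line-by-line approach.
function lines_to_paragraphs($input) {
  // Drop every \r so that only \n remains as the line separator.
  // Text that uses a lone \r as its only line ending would not be split.
  $input = str_replace("\r", '', $input);

  $output = '';
  $empty = 0; // empty lines seen since the last paragraph
  foreach (explode("\n", $input) as $line) {
    if (trim($line) === '') {
      $empty++;
      continue;
    }
    // Add N em of top margin for N blank lines between two paragraphs.
    // Assumption: blank lines before the first paragraph add no margin.
    $style = '';
    if ($empty > 0 && $output !== '') {
      $style = " style=\"margin-top: {$empty}em;\"";
    }
    $output .= "<p$style>" . $line . "</p>\n";
    $empty = 0;
  }
  return $output;
}
```

With "a\r\nb\r\n\r\n\r\nc" as input, the third paragraph should come out with style="margin-top: 2em;" and the first two with no style at all.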
Anyway, as I alluded above, this was all something of a shot in the dark, since I don't really know what you're asking. ;)
Alex Markley--
--A warehouse is just like any other house, except it turns EVIL under the light of a full moon.
Sorry if I was unclear...
Sorry if I was unclear, Alex.
Yes, I understand that browsers generally ignore the white space in the HTML. (\n and \r)
So I COULD just send the HTML as tags without regard to code formatting, but the resultant HTML code (NOT the way it looks in a browser) is ugly and more difficult to debug.
For example, we wind up with a list which looks (in HTML code) like:
Instead of:
As I write this thing, I'm attempting to keep in mind both the browser display formatting and the layout of the HTML code.
My question was essentially: How should I format the code output, with \r or \n or \r\n?
The answer I seem to be getting is the latter, since \r\n is the format DOS/Windows uses. (It displays properly in notepad in that format.)
So my code puts out a \r\n<ul> and not just the tag. The code to preserve the user's white space was a bit tricky, but I managed.
Thanks for taking the time to reply.
Curtis
fair question
I use Windows for development myself, but for the last decade I haven't used anything other than plain \n.
Folk who care about looking at the code should be using something a bit smarter than Notepad (which is the ONLY place I've seen this still make a difference for ages).
\r\n is an archaic stumbling block, not needed in practice since we moved away from teletypes.
You want newlines, use \n (thanks for making the effort - it does help us all).
You want indenting, use a decent text editor of your choice and HTML Tidy.
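If it helps, here's a rough sketch of what I mean for the list case. The helper name render_list and its $eol parameter are invented for this example; the parameter is only there so you can compare "\n" against "\r\n" output yourself.

```php
<?php
// Hypothetical helper; $eol is the newline sequence written into the
// generated HTML source. It defaults to a plain \n.
function render_list(array $items, $eol = "\n") {
  $output = $eol . '<ul>';
  foreach ($items as $item) {
    // One list item per source line; no indenting is attempted here.
    $output .= $eol . '<li>' . $item . '</li>';
  }
  $output .= $eol . '</ul>';
  return $output;
}
```

Calling render_list(array('one', 'two')) puts each tag on its own line in the source. Passing "\r\n" as the second argument gives the DOS-style variant, should you ever need it for a particular viewer.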
.dan.
.dan. is the New Zealand Drupal Developer working on Government Web Standards
Aha
"So my code puts out a \r\n<ul> and not just the tag. The code to preserve the user's white space was a bit tricky, but I managed."
Aha. That sounds like an admirable goal. I should mention, however, that there are plenty of very good text editors for Windows that interpret \n just fine. (My dad uses VIM on Windows, and WordPad seems to work well too.) Notepad is one of the only ones that still doesn't "get it".
And +1 to the poster above me. Avoid \r in your newline sequences if you can help it.
Alex Markley--
--A warehouse is just like any other house, except it turns EVIL under the light of a full moon.
I'd be more than delighted...
I'd be more than delighted, Dan and Alex, to simply emit \n characters. As you say, good editors have no problem with them.
I note that whenever I save HTML or view source, Notepad has no trouble reading it, but I'll assume, in light of your comments, that's because IE and Firefox are smart enough to translate, and not because everyone dutifully formats their source code using return characters.
Thanks for your comments.
Curtis