inability to handle line breaks
Roland684 - July 28, 2008 - 14:48
| Project: | String Overrides |
| Version: | 5.x-1.x-dev |
| Component: | Code |
| Category: | bug report |
| Priority: | normal |
| Assigned: | Unassigned |
| Status: | patch (to be ported) |
Description
It is not possible to replace a string which contains line breaks.
I've been trying to replace the "Your search yielded no results" tips of the search. I can only get it to work when I remove the line breaks from the (appropriate text in the) search.module file.
I first thought it could be a LF vs CRLF issue, but have not been able to pinpoint it using the templates and configuration.
The html output seems to contain CRLF's, while the original text only contains LF's. A transformation seems to be done somewhere (This could even be the "view source" window of the browser)

#1
Meanwhile, we have been able to identify the exact issue. The string we're trying to override contains line breaks. We filled out the string overrides admin form, on a Windows machine. On my local development machine, running Windows, this works fine. On the live server, running Linux, it doesn't. Moreover, when I import the database from the live server to my local machine, it again works on my local machine. When I converted the file containing the string we're overriding to Windows line endings, it all of a sudden worked on the live server as well. I'm not sure if the real reason is that we're doing this from a Windows machine, or that HTTP is defined to use CRLF line breaks, but I suspect the latter.
So in conclusion: string override seems to distinguish between line breaks from different platforms, which I feel it shouldn't. Until this is fixed, you cannot reliably use string override this way, and upgrading Drupal becomes dangerous (because the file will have Unix line endings again.)
#2
Drupal convention specifies that files should end in LF only. Since this module operates on Drupal strings using the t() functions, it seems that the best course of action is to convert line endings to LF regardless of platform.
Attached is a patch to stringoverrides.admin.inc that strips carriage returns when the data is uploaded. Since managing line breaking characters is essentially a protocol conversion (HTTP uses CR LF, Drupal uses LF) it seems appropriate to do that in stringoverrides_admin_submit().
This patch has been tested on CVS DRUPAL-6--1 and on the current 6.x-1.7 release.
It has been tested on Mac and Linux. It has NOT been tested on a Windows server.
#3
Hi mbutcher,
Thank you for your effort! I tested your patch on my Windows-based development system. If search.module (in this case) is indeed stored using Unix line endings, it works. However, if I store it using Windows line endings, it does not work.
You write that the Drupal convention is to use LF only. (Just out of curiosity: where did you find that?) However, if I extract the Drupal distribution on my Windows machine (using WinZip in this case), I get files with Windows line endings, so I wonder if it is indeed ok to assume that Drupal files are indeed stored using Unix line endings on Windows servers...
So I guess the real question is: can you assume a certain line ending style, or should you do the overrides in a way that line ending style doesn't matter? If it isn't too hard on performance, I would vote for the latter, because these kind of issues are hard to find (I can tell you from experience :-).
#4
The Drupal convention is to use LF only:
http://drupal.org/coding-standards#comment-80929
I think the Coder module even enforces this.
I suppose our alternatives are...
The first option is much easier, but makes more assumptions about what the files will contain.
Thoughts?
#5
My thoughts:
- I agree that the LF only thing for Drupal files is probably a good idea
- So it should be added to Windows installation notes that files must be Unix format
- Then you would have to convert strings entered in the String Overrides admin page to LF only
- However, these would then no longer work against other strings that are entered through the web UI, because those would still be CR/LF (HTTP standard)
I have a feeling that the only way might be to have String Overrides do its work in a platform-independent way. (Convert both translatable strings and string overrides to the same line endings, probably LF.) Any other choice will cause platform dependencies, I think: either overrides against strings in files work correctly, or overrides against strings entered in the web browser work correctly, but not both.
#6
Drupal convention for LF only endings is fine, so assuming that the code in the file uses LF,not CRLF, is safe, IMO. It's just the incoming string we can't safely make assumptions about, so that needs to be normalized to LF-only on save. That's what the patch in #2 is doing.
That said, I don't like that the replacement line is split across 2 lines at the =. That's not Drupal convention, AFAIK. The line being removed should also be outright removed, not just commented out.
#7
Hi Crell,
You write: "It's just the incoming string we can't safely make assumptions about, so that needs to be normalized to LF-only on save."
In fact, there are two forms of incoming strings (as in: through the web browser). The first one, which is what I think you're talking about, is the strings coming in through the admin page of String Overrides. The second form are strings coming in through any other Drupal web page, for example through admin pages of other modules. The latter form can also contain strings that you may want to translate using String Overrides. However, if they contain line breaks, those line breaks will be CRLF because of the way HTTP works.
The suggested patch takes care of the first form, but not the second. I don't even know if that's possible, I don't know the internals of Drupal well enough to judge that.
So my question is: can we make the second form work as well? Or doesn't this occur in practice?
#8
As far as I'm aware, String Overrides, like the translation system itself, only deals with strings in-code using t(). It doesn't deal with admin-entered content, nor does it have to. You can just, um, edit the content. :-) So the second case described in #7 is irrelevant to this issue.
#9
Hi Crell,
That's in fact good news, because then I think we have a good solution :-)
#10
If it's not too much of a hassle, could the patch in #2 be backported to D5?
#11
It needs to be applied to Drupal 6 first.
#12
So is anything happening here? Any chance of this getting committed? :-)
#13
How does t() handle the line breaks?
#14
+1 for patch in #2, works well for me.
#15
I've also tried out the patch in #2 and it works as expected. It was impossible to replace the following string until the patch was applied, and now with the patch it is being replaced.
<ul><li>Check if your spelling is correct.</li>
<li>Remove quotes around phrases to match each word individually: <em>"blue smurf"</em> will match less than <em>blue smurf</em>.</li>
<li>Consider loosening your query with <em>OR</em>: <em>blue smurf</em> will match less than <em>blue OR smurf</em>.</li>
</ul>
On a related note, I've created a feature request for #368465: Partial string replacement to simplify things instead of having to override a whole multi-line string.
#16
Committed to DRUPAL-6, will need a backport to 5.
http://drupal.org/cvs?commit=168877
You guys mind testing it a bit before the release is made?
#17
Thanks Rob. I have tried the latest CVS code out on a new D6 site (running on a MacOS X server) and have successfully replaced the search results string as well as a module help page string.
#18
I have used this patch on Drupal 6.9/Linux and it works fine to replace the blue smurf text.
#19
urra for the patch
working here too on windows
+1(i didnt apply the patch but used the latest dev)
#20
+1 on patch for Drupal 6.12 - worked to replace the blue smurf text.