Updated: Comment #N
Problem/Motivation
In many cases, transliterating one character can result in multiple characters of output. For instance, you might transliterate ö to oe in German.
If a caller of transliteration methods needs to limit the length of the resulting string from transliterating a string that contained ö somewhere in the middle, they might not want to split between the o and the e in the output. However, they would not have any way to know where it was and wasn't OK to truncate.
Therefore, it would be useful if the transliterate methods themselves had an optional maximum length parameter, which would ensure the return value was less than this length, split on the boundary between transliterations of individual characters.
Proposed resolution
Add an optional max_length parameter, which would return this length or fewer characters, ensuring that the full transliteration of each character (rather than partials) got into the return value. Callers who don't care about splitting on these boundaries could call the function without this parameter, and they could then truncate their output after the fact.
Remaining tasks
Add the parameter. Write a test that verifies it works.
User interface changes
None.
API changes
New optional parameter to the transliteration methods, with backwards-compatible behavior (no truncation).
Related Issues
None.
Original report by alippai
As the transliteration can differ in the output string length from the input we should provide a maxlength parameter.
E.g.: I think we shouldn't split "oe" from ö->oe conversion in German, if we limit the outputs length later.
Comment | File | Size | Author |
---|---|---|---|
#6 | 1919086-maxlength-transliteration.patch | 3.6 KB | jhodgdon |
Comments
Comment #1
jhodgdonIt seems to me that anyone calling the transliteration methods could just chop the resulting string to the maximum length that they need. I don't see why the transliteration class needs to do that itself?
Comment #2
jhodgdonComment #3
amateescu CreditAttribution: amateescu commentedI agree with #1, this is really the responsability of the caller, not of the system that actually performs the transliteration.
Comment #4
alippai CreditAttribution: alippai commentedThe caller can't determine the correct truncation of the output. If you want to transliterate the string "ö" (it results in "oe") and truncate it to one character the correct output is "" and not "o" in my opinion.
Comment #5
jhodgdonThat is an excellent point in #4, which I missed in the original feature request report. Maybe we should reconsider, in which case we would want to have two additional arguments, I think: one for the max length to return, and a boolean to require that the split happen on an input character boundary if TRUE and just truncate if not... I'm not sure that the case in #4 is universally desirable, but I can see that if it was desirable, it would need to happen in the transliteration class itself.
It shouldn't be that hard to do... anyone want to make a patch?
Comment #6
jhodgdonI've added an issue summary. And here's a patch, including an addition to the tests. Tests pass on my system...
Hopefully it is not too late in the D8 process to get this in. It is a relatively small change. Someone want to review it quickly?
Comment #7
amateescu CreditAttribution: amateescu commentedIt shouldn't be too late for this kind of small additions, and the patch looks good to me.
Comment #8
alippai CreditAttribution: alippai commentedI'd go with default value
PHP_INT_MAX
for $max_length and concatenation could go in theelse
branch. With or without these I'm okay with the patch. +1 from me.Comment #9
jhodgdonI think a default value of NULL to indicate "we don't want to do this" is more consistent with the rest of the Drupal API.
Comment #10
alexpottCommitted c9ab009 and pushed to 8.x. Thanks!
As outlined in #4 this change is very hard to implement outside of the transliteration service.
Comment #11
jhodgdonI added a note/code with the new parameter to the change notice that we had before (from the original issue):
https://drupal.org/node/1842748
Comment #12
xjmUntagging. Please remove the "Needs change notification" tag when the change notice task is complete.
Comment #13.0
(not verified) CreditAttribution: commentedAdded issue summary