Patch with three improvements to views_trim_text() attached:

Step 1: changed "/(.*)\b.+/us" to "/(.*)\p{Z}.+/us"

The current implementation of views_trim_text() has problems with non-ASCII characters such as German umlauts (e.g. 'Ä'): \b matches next to them, so they are treated as word boundaries, which results in "broken" words. Replacing \b with \p{Z} (any Unicode separator character) solves this; see http://de3.php.net/manual/en/regexp.reference.unicode.php. Unicode properties are supported since PHP 4.4, which is OK since that is the minimum requirement for Drupal 6.
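To make the difference concrete, here is a minimal sketch that runs both patterns on a string cut off in the middle of "Äpfel". The sample string and variable names are made up for illustration and are not taken from views.module. Whether \b breaks the word depends on the PCRE build and PHP version in use, so only the result of the \p{Z} pattern is stated.

```php
<?php
// "Viele Äpf": a UTF-8 string cut off inside the word "Äpfel".
// The umlaut is written as a byte escape so the file encoding does not matter.
$value = "Viele \xC3\x84pf";

// Old pattern: the result depends on whether \b treats "Ä" as a word
// character in the PCRE build in use, so no expected output is given here.
preg_match('/(.*)\b.+/us', $value, $old);

// New pattern: only a Unicode separator (such as a space) ends a word.
preg_match('/(.*)\p{Z}.+/us', $value, $new);

// $new[1] is "Viele": the partial word "Äpf" is dropped as a whole.
```

If \b considers "Ä" a non-word character, $old[1] ends right after the umlaut, which is the "broken" word described above.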

Step 2: changed "/(.*)\p{Z}.+/us" to "/(.*)\p{Z}/us"

The trailing ".+" was an unnecessary requirement because (.*) is greedy. Dropping it is also required for step 3:

Step 3: Do not remove the last word if the shortened text is cut off exactly at a word boundary
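Steps 2 and 3 together could look roughly like the sketch below. The function name sketch_trim_words() and its parameters are hypothetical; this is not the code from the patch or from views_trim_text(), only an illustration that assumes the shortened text is matched with preg_match() and the first capture group is kept.

```php
<?php
// Hypothetical helper for illustration; not the actual views_trim_text() code.
function sketch_trim_words($orig_value, $max_length) {
  // Take the first $max_length UTF-8 characters of the original text.
  preg_match('/^.{0,' . (int) $max_length . '}/us', $orig_value, $m);
  $value = $m[0];

  // $value is a byte prefix of $orig_value, so byte-based substr() and
  // strlen() are enough to get the part that was cut off.
  $remains = substr($orig_value, strlen($value));

  // Step 3: the cut is "exact" if nothing remains or the remainder starts
  // with a separator, i.e. the last word in $value is complete.
  $exact_cutoff = !strlen($remains) || preg_match('/^\p{Z}/us', $remains);

  // Step 2: without the trailing ".+", a separator at the very end of
  // $value also matches, so a trailing space is removed as well.
  if (!$exact_cutoff && preg_match('/(.*)\p{Z}/us', $value, $matches)) {
    $value = $matches[1];
  }
  return $value;
}

$text = "Viele \xC3\x84pfel sind rot";      // "Viele Äpfel sind rot"
$exact   = sketch_trim_words($text, 11);    // cut right after "Äpfel"
$partial = sketch_trim_words($text, 9);     // cut inside "Äpfel"
$space   = sketch_trim_words($text, 6);     // cut right after the first space
```

With these inputs, $exact keeps the complete last word ("Viele Äpfel"), while $partial and $space both come out as "Viele". The last case is where the trailing ".+" would have prevented a match and left the space in place.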

Comment   File                 Size      Author
          views.module.patch   1.28 KB   sbusch

Comments

jehu commented:

subscribe

dawehner commented:

Status: Active » Needs work
+++ views.module	(working copy)
@@ -1315,12 +1315,27 @@
+
+      $remains = substr($orig_value, strlen($value)); // we can use non-unicode aware substr() and strlen() here
+      $exact_cutoff = !strlen($remains) || preg_match("/^\p{Z}/us", $remains); // strlen() check only for safety reason
+

It would be nicer if the comments were placed above the lines they describe.

+++ views.module	(working copy)
@@ -1315,12 +1315,27 @@
+      else if ($exact_cutoff)
+      {

It should be elseif, and the { should be on the same line.
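For reference, the two style remarks applied to the quoted lines might look like the sketch below. The input values, the first if branch and the $result variable are placeholders added only so the fragment runs on its own; the excerpt does not show what the real branches do.

```php
<?php
// Hypothetical inputs for illustration only.
$orig_value = 'one two three';
$value = 'one two';

// $value is a byte prefix of $orig_value, so the non-Unicode-aware
// substr() and strlen() can be used here.
$remains = substr($orig_value, strlen($value));
// The strlen() check is only there for safety.
$exact_cutoff = !strlen($remains) || preg_match('/^\p{Z}/us', $remains);

if ($value === $orig_value) {
  // Placeholder branch; the real condition is not part of the excerpt.
  $result = 'unchanged';
}
elseif ($exact_cutoff) {
  $result = 'exact';
}
else {
  $result = 'partial';
}
```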

Besides these code style issues, I think it is necessary to write good test coverage for this function. The function does so much and can break really quickly when you change something.

esmerel commented:

Status: Needs work » Closed (won't fix)

No activity on patch for more than 3 months.