unicode.inc contains global functions that cannot be lazy-loaded. Convert these to a unicode utility class.

Followups:

#1985104: Clean up Unicode function names

Files: 
CommentFileSizeAuthor
#50 drupal-unicode_component-1938670-50.patch72.73 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 55,514 pass(es).
[ View ]
#50 interdiff.txt1.34 KBParisLiakos
#49 drupal-unicode_component-1938670-49.patch72.73 KBParisLiakos
FAILED: [[SimpleTest]]: [MySQL] 55,450 pass(es), 1 fail(s), and 0 exception(s).
[ View ]
#49 interdiff.txt6.38 KBParisLiakos
#45 1938670-45.patch70.89 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 55,619 pass(es).
[ View ]
#45 interdiff.txt1.02 KBParisLiakos
#44 1938670-43.patch71.4 KBRobLoach
FAILED: [[SimpleTest]]: [MySQL] 55,334 pass(es), 12 fail(s), and 6 exception(s).
[ View ]
#42 1938670-42.patch71.4 KBRobLoach
PASSED: [[SimpleTest]]: [MySQL] 55,448 pass(es).
[ View ]
#42 interdiff-do-not-test.patch5.38 KBRobLoach
#41 drupal-unicode_component-1938670-41.patch71.41 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 56,316 pass(es).
[ View ]
#36 drupal-unicode_component-1938670-35.patch72.48 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 55,463 pass(es).
[ View ]
#36 interdiff.txt2.12 KBParisLiakos
#34 30-34-interdiff.txt414 bytesalexpott
#34 drupal-unicode_component-1938670-34.patch72.51 KBalexpott
PASSED: [[SimpleTest]]: [MySQL] 55,905 pass(es).
[ View ]
#30 drupal-unicode_component-1938670-30.patch72.04 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 55,459 pass(es).
[ View ]
#30 interdiff.txt4.35 KBParisLiakos
#27 1938670.patch71.41 KBRobLoach
PASSED: [[SimpleTest]]: [MySQL] 55,249 pass(es).
[ View ]
#25 drupal-unicode_component-1938670-25.patch71.34 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 55,809 pass(es).
[ View ]
#23 drupal-unicode_component-1938670-23.patch48.82 KBParisLiakos
PASSED: [[SimpleTest]]: [MySQL] 55,986 pass(es).
[ View ]
#23 interdiff.txt11.82 KBParisLiakos
#19 drupal-unicode_component-1938670-19.patch45.81 KBParisLiakos
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]
#9 drupal_1938670_04.patch201.24 KBpp
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_1938670_04.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#8 drupal_1938670_03.patch201.23 KBXano
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_1938670_03.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]
#5 drupal_1938670_02.patch198.98 KBXano
FAILED: [[SimpleTest]]: [MySQL] 52,605 pass(es), 8 fail(s), and 6 exception(s).
[ View ]
#4 drupal_1938670_01.patch177.31 KBXano
FAILED: [[SimpleTest]]: [MySQL] 52,621 pass(es), 8 fail(s), and 6 exception(s).
[ View ]
#2 drupal_1938670_00.patch199.86 KBXano
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_1938670_00.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Comments

Title:Convert unicode.inc to \Drupal\Utility\UnicodeConvert unicode.inc to \Drupal\Component\Utility\Unicode
Status:Active» Needs review

StatusFileSize
new199.86 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_1938670_00.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

Status:Needs review» Needs work

The last submitted patch, drupal_1938670_00.patch, failed testing.

Status:Needs work» Needs review
StatusFileSize
new177.31 KB
FAILED: [[SimpleTest]]: [MySQL] 52,621 pass(es), 8 fail(s), and 6 exception(s).
[ View ]

Re-roll.

StatusFileSize
new198.98 KB
FAILED: [[SimpleTest]]: [MySQL] 52,605 pass(es), 8 fail(s), and 6 exception(s).
[ View ]

Added a missing file deletion.

There is also a file ./core/includes/unicode.entities.inc, but I couldn't find any usage of it.

Status:Needs review» Needs work

The last submitted patch, drupal_1938670_02.patch, failed testing.

StatusFileSize
new201.23 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_1938670_03.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

This patch fixes all issues, but it requires #1938228: Convert TypedDataTest to make use of DrupalUnitTestBase to be committed first.

Status:Needs work» Needs review
StatusFileSize
new201.24 KB
FAILED: [[SimpleTest]]: [MySQL] Unable to apply patch drupal_1938670_04.patch. Unable to apply patch. See the log in the details link for more information.
[ View ]

The patch doesn't work. I re-rolled it.

And what about core/includes/unicode.entities.inc? Is it necessary?

Status:Needs review» Needs work

Given that this is a strategy pattern to begin with, we should split the different implementations to different classes and register the proper one in the container.

@Damien Tournoud You are right but this issue only resolves "unicode functions will be a class" nothing else. I think "X module/class will use "lazy loaded" Unicode class" is a different issue. I think your suggestion is too big step (or I don't understand you, please clarify the issue, e.g. give me a link which helps me to understand).

Given that this is a strategy pattern to begin with, we should split the different implementations to different classes and register the proper one in the container.

@amateescu disagreed that this should be registered as a service.

@msonnabaum was working on a String class. We still have to decide whether to move string manipulation functions there or keep them in Unicode, but I vote for moving them to String.

Status:Needs work» Needs review
Issue tags:-PHPUnit Blocker

#9: drupal_1938670_04.patch queued for re-testing.

Status:Needs review» Needs work
Issue tags:+PHPUnit Blocker

The last submitted patch, drupal_1938670_04.patch, failed testing.

this patch is unreviewable..i think we should keep procedural functions as component wrappers initially..and mark them deprecated
Also, unicode_requirements should move to system.install (not here of course, after the conversion)

Assigned:Xano» ParisLiakos

i am gonna give this a shot

this patch is unreviewable..i think we should keep procedural functions as component wrappers initially..and mark them deprecated

I don't agree. It's fairly trivial to convert all usages and it's easy to review (we did the same with #1969540: Convert token.inc to a service.

Also, unicode_requirements should move to system.install (not here of course, after the conversion)

Agreed. That's what I realized when writing the patch as well.

i am gonna give this a shot

Thanks! I don't really have time to help with this anytime soon anyway.

Status:Needs work» Needs review
StatusFileSize
new45.81 KB
FAILED: [[SimpleTest]]: [MySQL] Drupal installation failed.
[ View ]

@Xano: The token patch was 3 times smaller than the one above:) Which also mean it will break more patches

Ok, so here it is:

drupal_convert_to_utf8() => Unicode::convertToUtf8()
drupal_truncate_bytes() => Unicode::truncateBytes()
truncate_utf8() => String::truncate()
mime_header_encode() => Unicode::mimeHeaderEncode()
mime_header_decode() => Unicode::mimeHeaderDecode()
decode_entities() => Unicode::decodeEntities()
drupal_strlen() => String::length()
drupal_strtoupper() => String::toUpperCase()
drupal_strtolower() => String::toLowerCase()
drupal_ucfirst() => String::upperCaseFirst()
drupal_substr() => String::substract()

Ah. also..._mimeHeaderDecode method needs a name:P

Status:Needs review» Needs work

The last submitted patch, drupal-unicode_component-1938670-19.patch, failed testing.

Very well done! I enjoy this patch a lot. The following are a couple notes I had. Some quick fixes, some questions, mostly praise ;-) ...

+++ b/core/includes/bootstrap.incundefined
@@ -668,7 +653,8 @@ function drupal_environment_initialize() {
   // Detect string handling method.
-  unicode_check();
+  require_once DRUPAL_ROOT . '/core/lib/Drupal/Component/Utility/Unicode.php';
+  Unicode::check();

Is the ClassLoader not available at this point? We couldn't move it to after the loader is initialized, could we?

+++ b/core/includes/unicode.incundefined
@@ -207,33 +140,18 @@ function drupal_convert_to_utf8($data, $encoding) {
- * @param $len
+ * @param int $len
  *   An upper limit on the returned string length.
  *
- * @return
+ * @return string
  *   The truncated string.
+ *
+ * @see \Drupal\Component\Utility\Unicode::truncateBytes().

Could we also add @deprecated here? @deprecated as of Drupal 8.0. Use Drupal\Component\Utility\Unicode::truncateBytes() instead? Something along those lines.

+++ b/core/lib/Drupal/Component/Utility/String.phpundefined
@@ -105,4 +107,270 @@ public static function placeholder($text) {
+  public static function toLowerCase($text) {
+    global $multibyte;
+    if (Unicode::$status == Unicode::STATUS_MULTIBYTE) {
+      return mb_strtolower($text);
+    }
+    else {
+      // Use C-locale for ASCII-only lowercase

Was working on this patch a bit last week and had second thoughts on it due to the global use of $multibyte.

In this instance, global $multibyte could be removed completely, but in other functions where it's actually used, it feels rather messy. Should String become an actual object with a multibyte getter/setter? Really unsure about that. This might just be an interim solution... We'll have to think about that. Perhaps push it to a follow up....

+++ b/core/lib/Drupal/Component/Utility/String.phpundefined
@@ -105,4 +107,270 @@ public static function placeholder($text) {
+  public static function substract($text, $start, $length = NULL) {
+    if (Unicode::$status == Unicode::STATUS_MULTIBYTE) {
+      return $length === NULL ? mb_substr($text, $start) : mb_substr($text, $start, $length);
+    }

Unicode::$status... Should we switch that to Unicode::getStatus() and Unicode::setStatus()?... Global and static states on classes seem quite messy. Maybe the $status becomes a parameter that's passed in with a default state? Maybe we just leave it like this for now.

The problem with this is that during testing, the $status would stay the same state between tests, unless we manually wipe it, which is dirty. We'll have to figure something out.

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,311 @@
+   * @see http://unicode.org/glossary
+   */
+  const PREG_CLASS_WORD_BOUNDARY = <<< 'EOD'
+\x{0}-\x{2F}\x{3A}-\x{40}\x{5B}-\x{60}\x{7B}-\x{A9}\x{AB}-\x{B1}\x{B4}
+\x{B6}-\x{B8}\x{BB}\x{BF}\x{D7}\x{F7}\x{2C2}-\x{2C5}\x{2D2}-\x{2DF}
+\x{2E5}-\x{2EB}\x{2ED}\x{2EF}-\x{2FF}\x{375}\x{37E}-\x{385}\x{387}\x{3F6}
+\x{482}\x{55A}-\x{55F}\x{589}-\x{58A}\x{5BE}\x{5C0}\x{5C3}\x{5C6}

Definitely like the move from global define() functions to a constant.

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,311 @@
+
+  /**
+   * Holds the multibyte capabilities of the current enviroment.
+   *
+   * @var int
+   */
+  public static $status = 0;

This is what I was talking about... I understand the reason for it, but it might be better solved with Polymorphism. We could figure that out in a follow up issue, I suppose.

+++ b/core/includes/theme.maintenance.incundefined
@@ -28,7 +30,7 @@ function _drupal_maintenance_theme() {
   // Install and update pages are treated differently to prevent theming overrides.
   if (defined('MAINTENANCE_MODE') && (MAINTENANCE_MODE == 'install' || MAINTENANCE_MODE == 'update')) {
diff --git a/core/includes/unicode.inc b/core/includes/unicode.inc
index 70a8fde..3b3a8da 100644
--- a/core/includes/unicode.inc
+++ b/core/includes/unicode.incundefined
@@ -5,66 +5,8 @@
@@ -5,66 +5,8 @@
  * Provides Unicode-related conversions and operations.

Although this handles unicode.inc, what are your thoughts on unicode.entities.inc? Should that be done in a follow up issue?

hi:) thanks!

Is the ClassLoader not available at this point? We couldn't move it to after the loader is initialized, could we?

Yeah unfortunately its not available..i ll try to see if this can be moved after loader init.

Agreed about the deprecated note, will be in next patch.

In this instance, global $multibyte could be removed completely, but in other functions where it's actually used, it feels rather messy. Should String become an actual object with a multibyte getter/setter? Really unsure about that. This might just be an interim solution... We'll have to think about that. Perhaps push it to a follow up....

The multibyte global is now a Unicode property ($status) :) i think i forgot to remove this line.

Unicode::$status... Should we switch that to Unicode::getStatus() and Unicode::setStatus()?... Global and static states on classes seem quite messy

+1 agreed, did not think it much, i just stuck as public temporarily to see if ti works.

Although this handles unicode.inc, what are your thoughts on unicode.entities.inc? Should that be done in a follow up issue?

Already opened: #1981190: Remove unicode.entities.inc

Status:Needs work» Needs review
StatusFileSize
new11.82 KB
new48.82 KB
PASSED: [[SimpleTest]]: [MySQL] 55,986 pass(es).
[ View ]

I think this is a lot better now:) also installation is fixed

Issue tags:+Needs tests

Also the unicode DUTB should be converted to PHPunit and new ones should be added for new methos in the String class

StatusFileSize
new71.34 KB
PASSED: [[SimpleTest]]: [MySQL] 55,809 pass(es).
[ View ]
  1. moved tests around from DUTB to phpunit
  2. renamed String::toUpperCase to String::upperCase, to be consistent with String::upperCaseFirst
  3. which lead to rename String::toLowerCase to String::lowerCase
  4. moved decodeEntities() to String from Unicode, since it is more or less, checkPlain's reverse
  5. added a few tests for Unicode's status getters setters
  6. moved preg_replace_callback callbacks to anonymous functions...it is really ugly having one of them as static method in a class..not to mention not re-usable at all..it really makes sense

i messed up with interdiff..sorry

Status:Needs review» Reviewed & tested by the community

Tests pass. Looks great!

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,306 @@
+  public static function setStatus($status) {
+    if (!in_array($status, array(self::STATUS_SINGLEBYTE, self::STATUS_MULTIBYTE, self::STATUS_ERROR))) {
+      throw new \InvalidArgumentException('Invalid status value for unicode support.');
+    }
+    static::$status = $status;

Nicely done.

+++ b/core/tests/Drupal/Tests/Component/Utility/StringTest.phpundefined
@@ -106,4 +112,258 @@ function testPlaceholder() {
+   * Tests String::decodeEntities().
+   *
+   * @dataProvider providerDecodeEntities
+   */
+  public function testDecodeEntities($text, $expected) {
+    $this->assertEquals(String::decodeEntities($text), $expected);

Love the use of @dataProvider here. Much nicer to read that the old tests.

Status:Reviewed & tested by the community» Needs review
StatusFileSize
new71.41 KB
PASSED: [[SimpleTest]]: [MySQL] 55,249 pass(es).
[ View ]

Switched to setExpectedException:

<?php
 
/**
   * Tests Unicode::getStatus() and Unicode::setStatus().
   *
   * @dataProvider providerTestStatus
   */
 
public function testStatus($value, $expected, $invalid = FALSE) {
    if (
$invalid) {
     
$this->setExpectedException('InvalidArgumentException');
    }
   
Unicode::setStatus($value);
   
$this->assertEquals(Unicode::getStatus(), $expected);
  }
?>

Assigned:ParisLiakos» Unassigned
Status:Needs review» Reviewed & tested by the community

woot had no idea you can do this..thanks!!

Status:Reviewed & tested by the community» Needs work

I get a local failure... with one of the singlebyte lowercase tests...

1) Drupal\Tests\Component\Utility\StringTest::testLowerCase with data set #1 ('FrançAIS is ÜBER-åwesome', 'français is über-åwesome')
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'fran�ais is �ber-�wesome'
+'français is über-åwesome'

My PHP version is 5.3.15...

And actually the above output reminds me...

+++ b/core/tests/Drupal/Tests/Component/Utility/StringTest.phpundefined
@@ -106,4 +112,258 @@ function testPlaceholder() {
+    $this->assertEquals(String::decodeEntities($text), $expected);
...
+    $this->assertEquals(String::lowerCase($text), $expected);
...
+    $this->assertEquals(String::upperCase($text), $expected);
...
+    $this->assertEquals(String::upperCaseFirst($text), $expected);
...
+    $this->assertEquals(String::length($text), $expected);
...
+    $this->assertEquals(String::substract($text, $start, $length), $expected);
...
+    $this->assertEquals(String::truncate($text, $max_length, $wordsafe, $add_ellipsis), $expected);
+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.phpundefined
@@ -0,0 +1,76 @@
+    $this->assertEquals(Unicode::getStatus(), $expected);
...
+    $this->assertEquals(Unicode::mimeHeaderEncode($value), $encoded);
+    $this->assertEquals(Unicode::mimeHeaderDecode($encoded), $value);

In PHPUnit assertions the expected value comes first...

Status:Needs work» Needs review
StatusFileSize
new4.35 KB
new72.04 KB
PASSED: [[SimpleTest]]: [MySQL] 55,459 pass(es).
[ View ]

hmm this is weird, but since simpletest does not catch it, my guess is an enviroment specific php bug, since it happens on single byte mode.
Unfortunately i dont have any available 5.3 installation to test it:(

I fixed PHPUnit assertions! i did not know that. i guess i should pay more attention to docs:) thanks!

Well something has changed in the tests as the old Drupal\system\Tests\System\UnicodeUnitTest passes for me...

Status:Needs review» Reviewed & tested by the community

Pass on PHP 5.4 as well...

$ curl https://drupal.org/files/drupal-unicode_component-1938670-30.patch | git apply
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 73769  100 73769    0     0  98797      0 --:--:-- --:--:-- --:--:--  118k
$ cd core
core$ phpunit
PHPUnit 3.7.15 by Sebastian Bergmann.
Configuration read from /var/www/drupal/8/core/phpunit.xml.dist
...............................................................  63 / 154 ( 40%)
............................................................... 126 / 154 ( 81%)
............................
Time: 1 second, Memory: 9.75Mb
OK (154 tests, 321 assertions)

Status:Reviewed & tested by the community» Needs work

Well I'm still getting local failures and I recently fixed the phpunit timer test failing on windows machines... It's important we don't have these as this prevents people from using phpunit properly.

Status:Needs work» Needs review
StatusFileSize
new72.51 KB
PASSED: [[SimpleTest]]: [MySQL] 55,905 pass(es).
[ View ]
new414 bytes

Well that was fun tracking down... we need to setlocale in Drupal's bootstrap.php for PHPUnit. So we have a known environment... as the comment in drupal_environment_initialize() says:

+// Set sane locale settings, to ensure consistent string, dates, times and
+// numbers handling.

+1 for sanity :)

Status:Needs review» Reviewed & tested by the community

gasp! thank you for tracking this down:D

StatusFileSize
new2.12 KB
new72.48 KB
PASSED: [[SimpleTest]]: [MySQL] 55,463 pass(es).
[ View ]

So, alexpott asked me to do some profiling aroung String::lowercase because of caseFlip
And indeed he was right.

Singlebyte

8.x HEAD:

Total Incl. MemUse (bytes): 8,480 bytes
Total Incl. PeakMemUse (bytes): 2,536 bytes
Number of Function Calls: 8

patch #34:

Total Incl. MemUse (bytes): 11,072 bytes
Total Incl. PeakMemUse (bytes): 3,392 bytes
Number of Function Calls: 10

patch below:

Total Incl. MemUse (bytes): 9,880 bytes
Total Incl. PeakMemUse (bytes): 2,720 bytes
Number of Function Calls: 9

Multibyte

8.x HEAD:

Total Incl. MemUse (bytes): 4,160 bytes
Total Incl. PeakMemUse (bytes): 1,064 bytes
Number of Function Calls: 4

patch #34 and patch below (same code):

Total Incl. MemUse (bytes): 5,392 bytes
Total Incl. PeakMemUse (bytes): 1,448 bytes
Number of Function Calls: 5

General Note:
Here we are slower also because of the additional Unicode::getStatus() call..before we just had a global. Also generally multibyte is a lot faster (and most systems will use multibyte) because we rely on extensions

Status:Reviewed & tested by the community» Needs work

+++ b/core/includes/unicode.incundefined
@@ -540,92 +345,12 @@ function drupal_ucfirst($text) {
+ * @deprecated as of Drupal 8.0.
+ *   Use \Drupal\Component\Utility\String::substract() instead.
...
+  return String::substract($text, $start, $length);
+++ b/core/lib/Drupal/Component/Utility/String.phpundefined
@@ -105,4 +124,269 @@ public static function placeholder($text) {
+    return static::upperCase(static::substract($text, 0, 1)) . static::substract($text, 1);
...
+  public static function substract($text, $start, $length = NULL) {
...
+      $ellipsis = static::substract('…', 0, $max_length);
...
+        $string = static::substract($string, 0, $max_length);
...
+      $string = static::substract($string, 0, $max_length);
+++ b/core/tests/Drupal/Tests/Component/Utility/StringTest.phpundefined
@@ -106,4 +112,258 @@ function testPlaceholder() {
+  /**
+   * Tests String::substract().
+   *
+   * @dataProvider providerSubstract
+   */
+  public function testSubstract($text, $start, $length, $expected) {
+    $this->assertEquals($expected, String::substract($text, $start, $length));
+  }
+
+  /**
+   * Data provider for testSubstract().
+   *
+   * @see testSubstract()
+   */
+  public function providerSubstract() {

I think we should follow PHP here... and call this String::substr()

Status:Needs work» Needs review
Issue tags:-Needs tests

lets hear other opinions, i would really hate String::substr

msonnabaum proposed String::slice in IRC, which i find really good

Status:Needs review» Needs work
Issue tags:+Needs tests

After polling on IRC there was a strong feeling from @webchick, @timplunkett, @larowlan than we not use String::semantic_func_name()... ie String::length() but convert drupal_strlen to String::strlen()...

to quote @webchick:

Ok well once again, we are not inventing a DX wrapper language around PHP. If we want to bring PHP more inline with other languages, file upstream patches.

...and now we've agreed to put these function in Unicode so drupal_strlen becomes Unicode:strlen()

... and also agreed to remove the @deprecated from the procedural functions

Issue tags:-Needs tests

Removing tag added back unnecessarily...

Status:Needs work» Needs review
StatusFileSize
new71.41 KB
PASSED: [[SimpleTest]]: [MySQL] 56,316 pass(es).
[ View ]

ok here is the patch with what we decided on IRC
at least it is visible now on why we wrap standard php functions..for unicode support...Unicode::

sorry interdiff does not make any sense here

StatusFileSize
new5.38 KB
new71.4 KB
PASSED: [[SimpleTest]]: [MySQL] 55,448 pass(es).
[ View ]

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+   *   The length of the string.
+   */
+  public static function strlen($text) {
+    if (static::getStatus() == static::STATUS_MULTIBYTE) {
+      return mb_strlen($text);

Should be strLen to go by lowerCamel function naming.

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+   *
+   * @return string
+   *   The string in uppercase.
+   */
+  public static function strtoupper($text) {
+    if (static::getStatus() == static::STATUS_MULTIBYTE) {
+      return mb_strtoupper($text);

strToUpper()

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+   * @return string
+   *   The string in lowercase.
+   */
+  public static function strtolower($text) {
+    if (static::getStatus() == static::STATUS_MULTIBYTE) {
+      return mb_strtolower($text);

strToLower

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+   *   The string with the first letter as uppercase.
+   */
+  public static function ucfirst($text) {
+    return static::strtoupper(static::substr($text, 0, 1)) . static::substr($text, 1);

ucFirst

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+   *   The number of characters to read.
+   *
+   * @return string
+   *   The shortened string.
+   */
+  public static function substr($text, $start, $length = NULL) {
+    if (static::getStatus() == static::STATUS_MULTIBYTE) {

subStr()

well, i specifically avoided that, (no matter how much i agree and as much as i liked original String::length), after a long irc conversation yesterday night, where the bottom line was for php function wrappers to be Class::<php_func_name>, eg Unicode::strlen because that is the PHP name and makes more sense to other PHP devs.

lets move on with #41 to avoid any more bikeshed here and make possible for more PHPunit tests to come:)

see also #39
Paris

StatusFileSize
new71.4 KB
FAILED: [[SimpleTest]]: [MySQL] 55,334 pass(es), 12 fail(s), and 6 exception(s).
[ View ]

Deal. Here it is with this docfix:

  * @return
  *   The shortened string.
  *
- * @see \Drupal\Component\Utility\Unicode::strtoupper().
+ * @see \Drupal\Component\Utility\Unicode::substr().
  * @ingroup php_wrappers
  */
function drupal_substr($text, $start, $length = NULL) {

Put a follow up to discuss the function names at #1985104: Clean up Unicode function names.

StatusFileSize
new1.02 KB
new70.89 KB
PASSED: [[SimpleTest]]: [MySQL] 55,619 pass(es).
[ View ]

great, thanks! added it as followup.
also found out a couple more leftovers:)

+++ b/core/tests/bootstrap.phpundefined
@@ -20,3 +20,8 @@
+
+// Set sane locale settings, to ensure consistent string, dates, times and
+// numbers handling.
+// @see drupal_environment_initialize()
+setlocale(LC_ALL, 'C');

Any thoughts on:

  1. Moving this into the Unicode::check() function
  2. Leaving UnicodeTest::setUp() in with the call to Unicode::check()

It feels rather weird to have too much stuff in bootstrap.php. If we were separate the Unicode component from bootstrap.php, tests would fail. Keeping setlocale() in setUp() ensures the setlocale() function is still called since it would be in check().

re #46 the setlocale affects far more than just the functions used in the Unicode... I think if we want to move it we should explore this in a followup. And we need to move it from drupal_environment_initialize() as well.

I agree with alex and the comment, that this is needed in other places then unicode handling.

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+   * Get the current status of unicode/multibyte support on this enviroment.

Nitpick-alarm: Missing @return

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+    if (!in_array($status, array(self::STATUS_SINGLEBYTE, self::STATUS_MULTIBYTE, self::STATUS_ERROR))) {

Just in general I prefer to use static instead of self, as you might extend the class ... you never know.

+++ b/core/lib/Drupal/Component/Utility/Unicode.phpundefined
@@ -0,0 +1,566 @@
+      // Use C-locale for ASCII-only uppercase
...
+      // Case flip Latin-1 accented letters
...
+      // Use C-locale for ASCII-only lowercase
...
+      // Case flip Latin-1 accented letters

Missing dots.

+++ b/core/tests/Drupal/Tests/Component/Utility/UnicodeTest.phpundefined
@@ -0,0 +1,298 @@
+  public function providerTestStatus() {
...
+  public function providerTestMimeHeader() {
...
+  public function providerStrtolower() {
...
+  public function providerStrtoupper() {
...
+  public function providerUcfirst() {
...
+  public function providerStrlen() {
...
+  public function providerSubstr() {
...
+  public function providerTruncate() {

I really like all this unit tests. These methods needs @return documentation.

StatusFileSize
new6.38 KB
new72.73 KB
FAILED: [[SimpleTest]]: [MySQL] 55,450 pass(es), 1 fail(s), and 0 exception(s).
[ View ]

thanks for the review!

StatusFileSize
new1.34 KB
new72.73 KB
PASSED: [[SimpleTest]]: [MySQL] 55,514 pass(es).
[ View ]

meh, fixing one more issue with docs

Status:Needs review» Reviewed & tested by the community

Title:Convert unicode.inc to \Drupal\Component\Utility\UnicodeChange notice: Convert unicode.inc to \Drupal\Component\Utility\Unicode
Priority:Normal» Critical
Status:Reviewed & tested by the community» Active
Issue tags:+Needs change record

Committed 686c269 and pushed to 8.x. Thanks!

Title:Change notice: Convert unicode.inc to \Drupal\Component\Utility\UnicodeConvert unicode.inc to \Drupal\Component\Utility\Unicode
Priority:Critical» Normal
Status:Active» Fixed
Issue tags:-Needs change record

Status:Fixed» Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Issue tags:+Kill includes

Tagging for posterity

Issue summary:View changes

added naming followup