Support from Acquia helps fund testing for Drupal Acquia logo

Comments

sun’s picture

FileSize
5.05 KB

Obviously forgot to inject some more $this->.

sun’s picture

FileSize
5.62 KB

drupal_attributes() seems to be the only location that needs some tweaking in manual testing.

Status: Needs review » Needs work

The last submitted patch, drupal8.t.2.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
6.1 KB

Stop-gap fix for #type 'select' options/optgroup handling.

Status: Needs review » Needs work

The last submitted patch, drupal8.t.4.patch, failed testing.

plach’s picture

Gábor Hojtsy’s picture

Subscribe.

donquixote’s picture

Funny - I thought function t and class t would nameclash.
Apparently they do not - lesson learned.

(and sub)

pounard’s picture

I'd rather extend the whole t() function as a unique component first (singleton), and maybe later using the flyweight pattern for each string (meaning a t object singleton without any context, but one single instance being used anywhere to avoid memory bloating). This could be an irrevelant note, but just a though when I read this code. It's more important to make the whole translation API pluggable rather than each string (looks like the problem is taken upside down here IMHO).

donquixote’s picture

[singleton rant on]
The discussion (if it is to become one) about singletons has moved here:
http://groups.drupal.org/node/163219
And btw, the current patch with its static variable is totally ok with me. If you call this a singleton, then I think I can take it all back.
[/singleton rant off]

pounard’s picture

[rant answer]
I do not agree. A singleton is an object, and can be subject to dependency injection easily, more easily than procedural functions. You can use inheritance, or make it implement an interface: doing this with static functions would require either static inheritance either that you replace on the fly the loaded code (just as the lock.inc or session.inc files). Database connection with DBTNG is a singleton (one for each connected database).

Singleton can also be a class shared in memory instead of duplicating new instances, when the object is actually context-less (the flyweight pattern is a good example of this usage).
Another usage can be for the null object pattern, stub neutral implementations can be shared, neutral means no context: it is a class, contracting with a an interface or using inheritance, except that for this particular one you will share an instance instead of creating many.

Singleton can be used with the factory pattern, you can have a singleton at runtime that can be different on each site instance (or even between each execution of the same site) depending on the factory context (here for the translation it can be the library used, either Drupal core, either PHP : Gettext). In this case, the factory and the singleton can be totally different objects. Let me quote some guy "there two kind of objects, one that do something, one other that create the objects that do something". I don't see where one cannot be a singleton instanciated by the one other.

These are common design pattern, well used (a lot). You cannot just say "singleton is crap, it's raw procedural static functions with an accessor", it is so wrong to say so, it has many usages, some are good, some not, but I was speaking of it in a particular context: on a living site, at runtime, I can easily call t() hundreds of time on the same page, I'd really not like to see hundreds of string recopy in order to only do a strtr() while I'm ensured that with the current algorithm it will be only done once for each string: IMHO there is probably an nice, elegant, and simple way to share (at least a part of) those.

Please read what I said before trying to make me looking dumber than you: I don't like childish debates.
[/rant answer]

EDIT: I am wrong on some details: indeed most people always take the patterns to the letter, I don't. Lot of people defending dependency injections says that singleton makes it hard: it is true. I doesn't make it impossible thought. It does introduce a global state component, but depending on the usage, it's maybe what we need. Please just read again and replace the word singleton by "a common shared component instance that doesn't bloat memory, we can fetch somewhere using some kind of factory method that is not mandatory static, on which we could share some pieces of code without instanciating more objects". It only differs to singleton by the "not mandatory static" piece of words.

Re-EDIT: Sorry for polluting.
@sun this seems like a good idea to attempt to modernize this kind of code, and you are, I think, definitely going in the right path using OOP code. I just disagree on a detail absolutely not on the overall design.

franz’s picture

On the first @todo

Couldn't $args go into __construct() ?

pounard’s picture

Status: Needs review » Needs work
FileSize
6.93 KB

May be something like this.

It shares one instance per langcode: this means that custom strings is not a static property anymore. It allows to share the static cache (and could also share much more information if needed) for all strings, for each langcode on a langcode basis: this code is just a sample don't take it as right for the context.

It returns the real string instead of an object, this is arguable but I don't think the end user needs a 't' instance.

It keeps flags (useless because called only once?) for sanitized and localized state. The rest is pretty much the same code just organized differently.

I also started to fix the static property access with the instance operator, but it's not needed anymore because it's shared.

pounard’s picture

Status: Needs work » Needs review

Switching as need review for pure curiosity.

Gábor Hojtsy’s picture

Status: Needs work » Needs review

Would be great if someone could inherit from DrupalTranslatableString (previously T) and implement a different method for string lookups. Either different cache solutions or different backends. There is a request to support gettext .mo files for example at #1187084: Improving Drupal performance using native gettext for translations.

pounard’s picture

I don't know enought the gettext library, but I assume that the .mo always are compiled on a per-language basis: if that's the case my pattern (once instance per language) fits with the need.

@sun what do you think?

pounard’s picture

@#15 How do you plan to update compiled .mo files, incremental rebuild just as some other pieces of cache (like hook implements) or would it be the user responsability to add it?

I saw this issue #361597: CRUD API for locale source and locale target strings and I think this plugable mecanism needs multiple extension points here and in the locale module: the locale module may (should?) be kept as the base for others, Drupal need to keep its UI for editing, importing and saving strings, while a PHP : Gettext implementation would need to be able to plug itself somewhere where the locale module save its cache in a incremental manner for being able to update its compiled .mo files and replace the locale() function in the same time.

The current function originally wrote by Sun and that I renamed getLocalizedString() is wrong, it should be where the different implementations derivates from each other:

  • Core should provide a null pattern implementation: english only, as a pass-through, only dealing with the arguments
  • Locale module should provide its own implementation: the actual locale() method would then be this method for this particular implementation.
  • Other modules could provide their own: the PHP : Gettext integration for example

Definitely need to find a way to get rid of the function_exists('locale') call and raise this test at instanciation time. Of course it wouldn't be the same test anymore but more a configuration option reading to fetch the right class to use (more or less like the cache backends do with the _cache_get_object() function): we need a real factory.

pounard’s picture

Gábor Hojtsy’s picture

Let's not get into any details of how someone would do a .mo implementation, that is totally out of scope here. As long as we can use our own implementation of the class on a site (that is a module can override the default or the class provided by locale module), then it should be doable. This issue should not dive into the details of other possible implementations unless those are really required to figure out parts of the API.

pounard’s picture

Yes maybe that's not the time, but this is that need design for having a consistent and full plugable system: the other opened issues maybe are the place to think about this. Anyway, I did another PoC patch that allows a configuration variable to pragmatically set the implementation to use.

Status: Needs review » Needs work

The last submitted patch, drupal8.t-almost_flyweight_pattern-2.patch, failed testing.

pounard’s picture

Status: Needs work » Needs review

Notes about my latest implementation:

  1. The setString() method reset the object state and prepare it for treating the current string: the usage is localized to the t() function and should remain here, it may be renamed translate() and return the translated string directly, thus making the object stateless regarding to the given string.
  2. Considering the first point, the magic method usage should be dismissed: it's slower and doesn't make sense in the current context anymore if the translate() method does everything we need.
  3. Thereof, the only contextual information remaining in the object would be the language itself.
  4. Custom strings shouldn't be loaded within the object itself, but should be given at instanciation time: thus allowing a better injection capacity: those strings could come from another source than configuration (potentially).
  5. My factory is definitely wrong: it's a PoC implementation, the goal here is to demonstrate that code is clearer if the factory lives by itself (encapsulation allows to focus on ponctual problems), it's easy to patch without risking the rest of the code's integrity.
  6. The whole needs testing to ensure no performance regression: I can't be sure here, I think the impact may be minimal, but only real measures will tell.

What do you think?

EDIT: The test fail is probably due to the fact that the function_exists('locale') is not being done for each string anymore: an early t() call may affect the instanciated almost singleton by instanciating the wrong implementation: but for this one I'm not sure, needs debugging.

Gábor Hojtsy’s picture

Regarding ideal OOP implementation, we should take into account the strong need for performance here. t() is one of the most used functions in a Drupal page request. Proving that this solution does not degrade page performance (much) is important or at least that the developer experience improvements far outweight whatever degradation can be measured. One prime reason t() did not change much (and one of the primary reasons st() exists as an installer counterpart) is to avoid too much functionality in t() and minimize the runtime cost incurred.

pounard’s picture

Making the objects contextless will avoid a lot of variable copy (even if PHP is able to do copy-on-write), remove some accessors and magic function usage: this can only be good for performances.

New patch attached implementing my latest notes: code base really is smaller (and I think performances will be better too), working while browsing on my test site, I don't know if tests will pass though.

pounard’s picture

Previous patch did contain a typo error that forgot to sanitize arguments: this will fail, attaching the fixed patch.

EDIT: Is there any way to cancel previous patch queing for testing?

Status: Needs review » Needs work

The last submitted patch, drupal8.t-almost_flyweight_pattern-contextless-2.patch, failed testing.

sun’s picture

I already mentioned on g.d.o that #4 needs benchmarks (optionally also #25 in case it is functional).

Do NOT work further on this patch unless we have benchmarks, which prove that a class t is not utopian from a performance perspective.

pounard’s picture

@#27 This was originally your patch, may I recall this detail to you. Design first, optimize later: when you design, design must be smart enough to avoid obvious bottlenecks, but only once you agreed on a design you can go to micro-optimization. But I agree, this needs some benchmarks.

franz’s picture

I believe @sun intended to use benchmarks to compare design proposals critically under a performance criteria. Isn't that so?

sun’s picture

"Needs benchmarks" is not about micro-optimization. All of the current patches will most likely not fly due to serious performance regressions of instantiating thousands of t() classes in a single request. That's currently assumed, but not verified yet. It has to be verified first, before exploring possible architectural alternatives.

As long as a performance decrease is a possible (or verified) issue, it does not make any sense to think about or even implement code for pluggable string translation classes with nice interfaces and whatnot. The entire idea of a modern t() is appealing, but Drupal core will not accept it if it makes Drupal slower - especially with regard to monolingual English sites.

pounard’s picture

I would tend to think this opposite, benchmark must be good for multilanguage AND mono languages sites, especially NOT English sites. English is a language like any other, and in the future I would like to see the possibility to translate a site from source code English to human English without patching the source. Source code strings are source code strings, they are not English strings: monolingual sites without local translation are the exception, not the opposite. Having them faster than the others is a bonus, not a prerequisite.

pounard’s picture

Did tests with apache benchmark, I wont give you the full output because it seems a bit too verbose, but results are those (using 1000 requests with 7 concurent requests, on my dev box):

Test URL for EN is: http://d8-git.blaster:80/
Test URL for FR is: http://d8-git.blaster:80/?q=fr

Home page has 10 strings (which is probably not enough to really test).

Current 8.x cold start, EN:
Req/s: 66.84, Processing median time: 105
Current 8.x, EN:
Req/s: 56.78, Processing median time: 123
My patch cold start, EN:
Req/s: 61.76, Processing median time: 113
My patch, EN:
Req/s: 53.88, Processing median time: 130

Current 8.x cold start, FR:
Req/s: 63.31, Processing median time: 110
Current 8.x, FR:
Req/s: 57.51, Processing median time: 122
My patch cold start, FR:
Req/s: 61.63, Processing median time: 112
My patch, FR:
Req/s: 58.82, Processing median time: 118

Note that this figures doesn't make any sense: my box does a lot of other stuff in the background, and the fact that cold start is faster than the second test denotes that my apache is probably strugling because misconfigured: this is what really makes the difference. This is not enough, but this shows that for EN only site, core may be faster (and again, figures are too tight), but for FR site, it's pretty much the same thing. In both case figures really are too tight to denote a real difference.

If anyone feels like doing some PHP profiling...

EDIT: For fun I did it with http://d8-git.blaster/?q=fr/user/register:
I experienced a 25% performance degrade, but figures inverted (~20 strings instead of 10): my patch version was 10% faster. The conclusion is that, with 20 strings on string, my patch doesn't degrade server capacity at all, but displaying a form with 5 elements does degrade performance to 25% less requests per second.

This means that I could probably raise the number of t() call to 1 thousand before actually seeing a real difference. Note that I'm using PHP 5.3, a lot faster that PHP 5.2 for functions calls (4 times faster than PHP 5.2 actually, average) and for OOP stuff (and also has a better garbage collector).

Gábor Hojtsy’s picture

Drupal 8 already requires PHP 5.3.2 at least so we can assume that: http://drupalcode.org/project/drupal.git/blob/refs/heads/8.x:/install.php

pounard’s picture

Is that true? I though decision hasn't been made yet about PHP required version. If so this is good news IMHO.

Gábor Hojtsy’s picture

@pounard: that landed in #576508: Require PHP 5.3.

podarok’s picture

subscribe

Gábor Hojtsy’s picture

Status: Needs work » Postponed

Marking postponed for the lack of work. If people are interested in pursuing this, feel free to change status.

pounard’s picture

I am! But I unfortunately don't have much time.

pounard’s picture

@Gábor Hojtsy I recently worked on a custom PHP software, and I had (not really "had to make" because it was for educational purposes but still) to make a gettext compliant API, and it's quite easy, a way more understable.

What I propose for D8 is splitting the t() function into 4 functions:

/**
 * See PHP gettext extension for more information.
 */
interface IntlBackendInterface
{
    /**
     * @see http://www.php.net/manual/en/function.gettext.php
     */
    public function gettext($text);

    /**
     * @see http://www.php.net/manual/en/function.ngettext.php
     */
    public function ngettext($singular, $plural, $n);

    /**
     * This doesn't exists in PHP gettext, but exists in real gettext library.
     * @see http://www.php.net/manual/en/book.gettext.php#89975
     * @see https://developer.mozilla.org/en/gettext
     */
    public function pgettext($msgctxt, $text);

    /**
     * This doesn't exists in PHP gettext, but exists in real gettext library.
     * @see http://www.php.net/manual/en/book.gettext.php#89975
     * @see https://developer.mozilla.org/en/gettext
     */
    public function npgettext($msgctxt, $singular, $plural, $n);
}

class Intl
{
    /**
     * @var IntlBackendInterface
     */
    protected $intlBackend;

    /**
     * Proceed to variables substitution.
     * @param string $text Already translated text
     * @param array $variables
     */
    protected function substitute($text, array $variables)
    {
        return strtr($text, $variables);
    }

    /**
     * Translate text
     * @param string $text
     * @param array $variables
     */
    public function t($text, array $variables = null)
    {
        if (empty($variables)) {
            return $this->intlBackend->gettext($text);
        } else {
            return $this->substitute($this->intlBackend->gettext($text), $variables);
        }
    }

    /**
     * Translate plural formula
     * @param string $singular
     * @param string $plural
     * @param int $n
     * @param array $variables
     */
    public function nt($singular, $plural, $n, array $variables = null)
    {
        if (empty($variables)) {
            return $this->intlBackend->ngettext($singular, $plural, $n);
        } else {
            return $this->substitute($this->intlBackend->ngettext($singular, $plural, $n), $variables);
        }
    }

    /**
     * Translate text with context (p as particular)
     * @param string $msgctxt
     * @param string $text
     * @param array $variables
     */
    public function pt($msgctxt, $text, array $variables = null)
    {
        if (empty($variables)) {
            return $this->intlBackend->pgettext($text);
        } else {
            return $this->substitute($this->intlBackend->pgettext($text), $variables);
        }
    }

    /**
     * Translate plural formula with context (p as particular)
     * @param string $msgctxt
     * @param string $singular
     * @param string $plural
     * @param int $n
     * @param array $variables
     */
    public function pnt($msgctxt, $singular, $plural, $n, array $variables = null)
    {
        if (empty($variables)) {
            return $this->intlBackend->npgettext($context, $singular, $plural, $n);
        } else {
            return $this->substitute($this->intlBackend->npgettext($context, $singular, $plural, $n), $variables);
        }
    }

This model is more predictable for outside users because function signatures and naming actually fits the real gettext API.

Backend interface is trivial, also more understable, and doesn't carry the variables replacement itself so is really simple to implement (gettext doesn't support variables either if I remember well).

And, more importantly:


// Good by, totally useless and horrible API (for handling contextes):
t("my @string", array(
  '@string' => $string,
), array(
  'context' => 'my_context',
));

// And welcome new one:
ct('my_context', 'my @string', array(
  '@string' => $string,
));

// Good by:
format_plural(2, '@my @count string', '@my @count strings', array(
  '@my' => $my,
), array(
  'context' => 'my_context',
);

// Welcome to:
pnt('my_context', '@my @count string', '@my @count strings', 2, array(
  '@my' => $my,
);

This of course breaks API compat, but right now, almost nobody uses the context, with a nicer syntax maybe people will switch more easily to it (Commerce is aawful for that, they don't use context at all actually they started to use it, but not everywhere, there is still a lot of ambiguities). Plus, format_plural() is not that muched used compared to a single t() so replacements would be done easily.

My sources are:
http://www.gnu.org/software/gettext/manual/gettext.html
http://www.php.net/manual/en/book.gettext.php
http://www.php.net/manual/en/book.gettext.php#89975
https://developer.mozilla.org/en/gettext

Gábor Hojtsy’s picture

Issue tags: +language-ui

Adding UI language translation tag.

sun’s picture

pounard’s picture

Don't restart from the beginning, please review all proposals. I'm just waiting for feedback else I'd have restarted a long time from now.

EDIT: I'd really like to evaluate API function signature identical to the PHP Gettext API, because it's the closest possible from the Gettext library. It can do pretty much everything that we can do, and fitting with the same API function signatures (except for function/method names of course) would allow to write backends really easily for any people not used to Drupal, but used to Gettext. It would also bring consistency.

Something else that worth looking at is what Symfony 2 actually does for translation, it might be really interesting.

YesCT’s picture

Is #39 the place to look for what to review?

This could really use an issue summary update with the template (http://drupal.org/node/1427826). It's time consuming and sometimes difficult, but an appreciated task.

xjm’s picture

xjm’s picture

Issue tags: +needs profiling
Gábor Hojtsy’s picture

Version: 8.x-dev » 9.x-dev

Not going to happen anymore in Drupal 8 given the API freeze.

Gábor Hojtsy’s picture

Issue summary: View changes

Updated issue summary.

Gábor Hojtsy’s picture

Version: 9.x-dev » 8.0.x-dev
Issue summary: View changes
Status: Active » Closed (duplicate)