Because some strings changed between Drupal versions (for example, $ became %), my database of strings (I use the localization module to translate the strings) now has lots of obsolete translations. It is impossible for me to search for and delete every one of them separately. Is there any alternative way to clean old translations out of my database of strings?

Comments

yngens’s picture

One more question. Does the fact that my Drupal site has lots of extra obsolete translated strings negatively affect the general functioning of the website? Does it slow the site down, for instance?

ray007’s picture

To the first question: you can only clean out the database if all your translation strings come from po-files you upload and you don't use the "translate strings" feature of the locale module. If that is the case, you can empty the translations table and then re-import all po-files.

Sometime in the future, when the potx module supports exporting po-files, this task should become a lot easier and should even be automatable on the server. The big question here would be what to do with translations of currently deactivated modules; that should probably be an option. Another problem would be strings that come from calling the t() function in templates, so it's probably not too easy ...

To the second question: I _believe_, it would take a long time for your translation table to grow so large that it would have a measurable performance impact, but that's only a gut-feeling and I haven't done any benchmarks or other measurements.

yngens’s picture

ray007, thank you very much for your explanations. Can anyone elaborate further on this part:

To the second question: I _believe_, it would take a long time for your translation table to grow so large that it would have a measurable performance impact, but that's only a gut-feeling and I haven't done any benchmarks or other measurements.

I am not good at running benchmarks; frankly, I do not know how to do it. ray007 reassures me that this will not have a "measurable impact" on the website's performance any time soon, but I believe this is an important question for all Drupalers. Thanks!

Gábor Hojtsy’s picture

Category: support » task

1. Export all translations as a PO file with the built-in Drupal interface.
2. Export all module templates as a POT file on the potx interface.
3. Run msgmerge on the templates with the translations (this will move the unused translations to comments at the bottom of the resulting file). See http://drupal.org/node/11311
4. Empty the locale strings tables (but not the meta table).
5. Import the file you got in step 3.

Of course it is important to take a backup of your site before doing all this.
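
To make step 3 easier to picture, here is a minimal Python sketch of the idea behind it. It is not msgmerge and not part of Drupal or potx: the function names (`po_quote`, `split_translations`, `to_po`) are hypothetical, translations are modelled as a plain dict, and fuzzy matching and untranslated template entries are left out. It only shows how translations whose source string no longer appears in the template end up as `#~` comment entries at the bottom of the file.

```python
def po_quote(text):
    """Quote a string the way PO files expect (escape \\, " and newlines)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def split_translations(template_msgids, translations):
    """Split translations into (still_used, obsolete).

    template_msgids: source strings found in the current POT template.
    translations:    dict mapping source string -> translated string.
    """
    wanted = set(template_msgids)
    still_used = {}
    obsolete = {}
    for msgid, msgstr in translations.items():
        if msgid in wanted:
            still_used[msgid] = msgstr
        else:
            obsolete[msgid] = msgstr
    return still_used, obsolete


def to_po(still_used, obsolete):
    """Render live entries first, then obsolete ones as '#~' comments."""
    lines = []
    for msgid, msgstr in still_used.items():
        lines.append("msgid " + po_quote(msgid))
        lines.append("msgstr " + po_quote(msgstr))
        lines.append("")
    for msgid, msgstr in obsolete.items():
        lines.append("#~ msgid " + po_quote(msgid))
        lines.append("#~ msgstr " + po_quote(msgstr))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    template = ["Save", "%count items"]
    existing = {"Save": "Uložit", "$count items": "$count položek"}
    used, old = split_translations(template, existing)
    print(to_po(used, old))
```

Because the obsolete entries are only commented out, they stay in the merged file as a record of what was dropped, while an import in step 5 would pick up only the live entries.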

At some later stage, some contributed module will provide better cleanup services. It is not exceptionally hard to implement with potx, but IMHO it is not a short task either (we need to somehow back up the strings we removed for you; just removing them might not be a good idea).

yngens’s picture

thank you, Gabor. going to do this soon.

hass’s picture

Isn't this covered by http://drupal.org/node/148753, or by the check box autolocale already has? Just have a look... I truncated some tables and re-imported all current po files, and everything is clean. But I'm not doing my own translations without contributing them, so if you have your own manually created translations and they are not contributed, you lose those strings with my truncate-table approach...

As an aside: burn in hell if you haven't contributed missing translated strings on d.o :-)

JirkaRybka’s picture

I'm trying to deal with this sort of thing WITH custom translations present. Hopefully I'll be able to provide some in-Drupal overview of *very* outdated strings, and later a way to prune them ("at your own risk"). But this is a long-term effort, as my underlying idea is tracking how many Drupal versions back a string was last used. D6 has already started collecting data for this (possible) future usage, and excludes outdated strings from cache processing to avoid a performance impact. See http://drupal.org/node/171646 and http://drupal.org/node/175798

hass’s picture

@JirkaRybka: install the POTX module on your box and extract your translations, one file per module. Save each PO file in the module's directory, then truncate the table and re-import... Don't forget to share your translation on d.o if it is missing from the module, or you may need to do the same job again.

JirkaRybka’s picture

Default translations do not fit all sites. I need to explain features with respect to my particular site's workflow, add my own help links and such. That's why there's a UI for translations, isn't it?

hass’s picture

I cannot say what you are talking about in detail... but your own module should have its own translation file where you can explain everything you want. Changing other strings (core or modules) into something else sounds wrong if the translation does not say the same thing as the original... If you think a string explains something badly, why are you not working together with the CZ translation team or the module maintainer to make it sound better?

I hope you are not doing things like changing a "Save" button into "Save may be nice - click here"? It sounds like something similar...

JirkaRybka’s picture

No time to work on the entire translation (I also encountered contribs with no translation, and a lot of weirdness in the available translations), but every time I see something *weird* and not fitting on the site's front-end, I go and correct it. I also encountered quite a few strings where t() received a variable, so no translation was provided. I'm not sure whether this still happens or not (that was 4.7), but again, no time to test everything over again with "official" translations. I consider translations similar to themes: not exactly content, but customizable, yes. (An even simpler process than with themes.)

The places where I needed to customize were: new user registration (the site's policy on usernames in the form field's description, otherwise users ignored it); the privatemsg module (the previous admin named the whole feature "Naturistova pošta", i.e. "[sitename]'s post", so I needed to keep that consistent); Themes (translates to Czech as "Téma", which, seen for the first time, almost surely means only "Topic" and nothing else, and confuses users); filter tips (I need to say *why* and *for what purposes* the individual formats should be used, not just list the allowed tags; I previously got a lot of complaints about this from end users, let alone the Texy filter providing *no* syntax overview back then, and there's no custom help going into that); "Login or register to post comments", which is horribly long in the official Czech translation and so breaks the theme; the same for the image_assist popup window (again, overly long wording breaks the layout); the description on "menu settings" of the node form (I needed to put in a bold warning for other admins about the site's policy of *not* putting single regular nodes into the menu, to keep its size sane); the explanation for "Search yielded no results", which needed to reflect the (in my country very important) fact that only whole words are searched (which is really silly and unusable in Czech, by the way), and also needed a link to the pre-Drupal site containing more useful stuff, plus more site-orientation-friendly examples than "blue smurf", which is ridiculous in the context; and then a variety of strings non-existent in official core but needed for custom hacks/bugfixes (most of these, luckily, are now in core itself for D6)... Enough?

Sure, this goes beyond the basic "import po's and forget", but it is really needed to give the site its own consistent look & feel... No offence meant, but I don't think that the official .po will ever be sufficient for all sites. If I proposed a patch for core to put some admin-editable text inside this or that form field description, do you think it would get committed? I don't think so. Alter the form through a custom module? But why? The locale module already offers a simple way to customize virtually *any* string through a nice UI.

We're off-topic here, BTW.

hass’s picture

You could get a core CZ translator and make the translation better :-). I don't think you will get a solution for older strings. We can only go forward for now, and for your examples this means a lot of manual work on your installation or losing all or some of your strings.

I thought about going the same way as you a long time ago, but I saw the troubles you now have... and ended up providing patches for the German core translation. Almost everything has been accepted, or accepted with minor tweaks, and has been committed. I think I would have lost more of my life if I had created my own custom German translation. It is horrible to figure out what has been changed by me and what is core...

Yep, OT.

JirkaRybka’s picture

Maintaining a custom theme and one small custom module through Drupal versions is a much, much worse nightmare than just "import only new translations, keep existing ones" on update. Also, translations might differ depending on the target audience: a web developers' site needs different explanations than a Grandmothers' Knitting Club ;-) Also, English is quite uniform, but in Czech we have entirely different phrasing for "official", "friendly", "book/art" and other types of documents.

But OK, your opinion is also valid. Different approach.

beginner’s picture

subscribing.

hass’s picture

Version: 5.x-1.x-dev » 7.x-1.x-dev

Gábor Hojtsy’s picture

Title: How to clean up database of translated strings? » Add string cleanup feature for lingering unused strings
Category: task » feature

Most of what's explained in #4 can be implemented inside the potx module, and would probably serve users much better. Except that some modules use strings which are not extractable. So we should instead mark strings for deletion and use Drupal core's built-in "string version marking" feature to keep track of which ones are in fact used. Then we can remove unused strings after a while. (I'm not sure this workflow will suffice, but to be sure we do not remove strings that are in use, I can only imagine something like this.)

Or maybe let the admin export all the strings removed, so they can import some of them back if they are in fact needed.

Retitling for this.
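
The "mark first, remove later, and let the admin export what was removed" idea could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `SourceString`, `find_cleanup_candidates` and `export_candidates` are hypothetical names, not Drupal or potx APIs, and the last-used version is assumed to be a plain dotted number such as "5.7".

```python
from dataclasses import dataclass


@dataclass
class SourceString:
    msgid: str
    last_used_version: str  # assumed purely numeric and dotted, e.g. "5.7"


def major(version):
    """Return the major version number of a dotted numeric version string."""
    return int(version.split(".")[0])


def find_cleanup_candidates(strings, current_version, max_major_lag=1):
    """Return strings last seen at least max_major_lag major versions ago."""
    current = major(current_version)
    return [
        s for s in strings
        if current - major(s.last_used_version) >= max_major_lag
    ]


def export_candidates(candidates):
    """List candidates as comment lines so an admin can review them first."""
    return "\n".join(
        f"# last used in {s.last_used_version}: {s.msgid}" for s in candidates
    )


if __name__ == "__main__":
    strings = [
        SourceString("Save", "6.2"),
        SourceString("$count items", "5.7"),
    ]
    candidates = find_cleanup_candidates(strings, "6.2")
    print(export_candidates(candidates))
```

Nothing is deleted in this sketch; it only selects and lists candidates, which matches the concern above that strings still in use must not be removed by accident.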

heyyo’s picture

Anything new on this request? I'm really interested in cleaning my database: it has been through several versions of Drupal, several Drupal modules have been updated, and some are even obsolete. (I used this module http://drupal.org/project/variable_clean to clean obsolete variables.)
When we have lots of modules installed, the method described in #4 takes very long.

Gábor Hojtsy’s picture

@heyyo: On the method in #4, step 1 is one button click. Step 2 can be done with a single potx-cli.php execution on the command line. Steps 3, 4 and 5 are short things too, I think. Unfortunately, I can currently only suggest this workaround for the issue; there is no solution yet. There is/was also active discussion of this missing feature at #1001554: Make it possible to fetch .po files for dev modules/core.

heyyo’s picture

Thanks for your answer, but I'm not sure I understand what you mean in step 2 by "one command line potx-cli.php"?

Gábor Hojtsy’s picture

@heyyo: this module includes a potx-cli.php that can be used on the command line, just look into the package :)

devad’s picture

Status: Active » Closed (won't fix)

I suppose we can close this issue with a reference to the Cleanup Translations module, which does pretty much the same thing as requested here. If not, please feel free to reopen.

donquixote’s picture

From https://www.drupal.org/project/cleanup_translations

40 sites report using this module

Can we really assume this is a sufficient solution?