The releases of the Localized Drupal Distribution are listed multiple times. Just open the release drop-down to see it.

Attached file (comment #3): L10nInstallReleases.png, 115.39 KB, uploaded by Gábor Hojtsy

Comments

Zoltán Balogh’s picture

Project: Localization server » Drupal.org site moderators
Version: 6.x-3.x-dev »
Component: Database » Localize.drupal.org
Gábor Hojtsy’s picture

Priority: Normal » Major

Whaaaa, that looks pretty bad.

Gábor Hojtsy’s picture

Priority: Major » Normal
File: L10nInstallReleases.png (115.39 KB)

Well, in fact, turns out to be "normal", just see how it happens:

This would only happen for distros built on drupal.org, because they have multiple download variants. Other projects don't do this. And we don't have many distros like that on drupal.org (yet). So at least this happens for perfectly "normal" reasons, but it needs to be special-cased in some way.

Gábor Hojtsy’s picture

Title: Multiple releases in Localized Drupal Distribution » Drupal.org built distributions have multiple release copies on localize.drupal.org
Project: Drupal.org site moderators » Localization server
Version: » 6.x-3.x-dev
Component: Localize.drupal.org » Code

Retitling for our current understanding. Moving to the queue where the fix/code needs to be written.

drumm’s picture

Version: 6.x-3.x-dev » 7.x-1.x-dev
Assigned: Unassigned » drumm
Issue summary: View changes
drumm’s picture

The releases that have been added by the newer REST release importing look good, so this only needs data cleanup.

drumm’s picture

Here's a draft of the queries to run:

// Delete dependent rows first; the release rows go last because every join starts from them.
// db_query() already executes the statement in Drupal 7, so no ->execute() call is needed.
db_query("DELETE fm FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_file} lpf ON lpf.rid = lsr.rid INNER JOIN {file_managed} fm ON fm.fid = lpf.fid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE f FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_file} lpf ON lpf.rid = lsr.rid INNER JOIN {files} f ON f.fid = lpf.fid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lpf FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_file} lpf ON lpf.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lpr FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_release} lpr ON lpr.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lse FROM {l10n_server_release} lsr INNER JOIN {l10n_server_error} lse ON lse.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lsf FROM {l10n_server_release} lsr INNER JOIN {l10n_server_file} lsf ON lsf.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lsl FROM {l10n_server_release} lsr INNER JOIN {l10n_server_line} lsl ON lsl.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lsr FROM {l10n_server_release} lsr WHERE lsr.download_link LIKE '%-core.tar.gz'");
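
A quick check of which download variants the `LIKE '%-core.tar.gz'` pattern covers. Since `-no-core.tar.gz` also ends in `-core.tar.gz`, both extra variants match and the plain tarball does not. This is only a sketch: the file names are made up, and Python's `str.endswith` stands in for the SQL pattern.

```python
# Hypothetical file names for one distribution release; only the suffixes matter.
links = [
    "example-7.x-1.0.tar.gz",          # plain profile package
    "example-7.x-1.0-core.tar.gz",     # variant packaged with Drupal core
    "example-7.x-1.0-no-core.tar.gz",  # variant packaged without Drupal core
]


def matches_cleanup_pattern(link: str) -> bool:
    """Stand-in for SQL LIKE '%-core.tar.gz': '%' matches any prefix."""
    return link.endswith("-core.tar.gz")


to_delete = [link for link in links if matches_cleanup_pattern(link)]
to_keep = [link for link in links if not matches_cleanup_pattern(link)]

print("would be deleted:", to_delete)
print("would be kept:", to_keep)
```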
Gábor Hojtsy’s picture

I think there is some value in parsing at least two variants of a distro (the standalone one and maybe the no-core one), so you get information on strings used specifically in distros. For example, when looking up where a string is used, https://localize.drupal.org/translate/source-details/12 shows the string is part of both the project module and the drupal.org testing profile. Admittedly, it is a sizable data management penalty to pay.

As for reasons to parse the -core variant: the installer in D8 and l10n_update in D7 can currently download only one .po file during installation, and that would be the complete translation file for the distro (the .po packaged for the -core.tar.gz). Then again, the clients could be made more efficient instead of the server needing to provide this variant too.

So these are the reasons this was not attended to in the past 4 years (apart from not having time, that is :D). I think this would at least be best checked against how D8 core and l10n_update work with a distro and which translation they attempt to download for it. It may be that they only download the core translation for the base Drupal version anyway and no distro-specific stuff (with an opportunity to download the module/profile translations later in the process). However, profiles may provide additional install screens, etc., so it needs to work there somehow.

Gábor Hojtsy’s picture

That said, based on @drumm, this triplication of release info does not seem to be the case anymore since the REST export is used? So it's not something we'd "keep" anyway, since it was only happening for a while, right?

@drumm: your draft queries would still leave data in l10n_server_line referring to files that no longer exist; that would also need to be cleaned up. Strings should not be cleaned up, because they should appear in other projects anyway (the pure distro package and the core/module releases).

drumm’s picture

Gábor Hojtsy wrote: "That said, based on @drumm, this triplication of release info does not seem to be the case anymore since the REST export is used? So it's not something we'd 'keep' anyway, since it was only happening for a while, right?"

A good example is https://localize.drupal.org/translate/projects/cartaro/releases. Releases 1.1 and above do not have the -core and -no-core files. The REST fetching looks like it hard-codes releases being named short_name-version. A follow-up issue could change this to prefer -core when available, and fall back to the usual naming.
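
The follow-up idea (prefer the -core file, fall back to the usual short_name-version name) could look roughly like this. It is a sketch only: `pick_download` and the file list are hypothetical and are not part of the REST importer.

```python
from typing import Iterable, Optional


def pick_download(short_name: str, version: str, available: Iterable[str]) -> Optional[str]:
    """Return the preferred tarball name for a release, or None if neither exists.

    Hypothetical helper: prefers '<short_name>-<version>-core.tar.gz' and
    falls back to the usual '<short_name>-<version>.tar.gz'.
    """
    files = set(available)
    base = f"{short_name}-{version}"
    preferred = f"{base}-core.tar.gz"
    fallback = f"{base}.tar.gz"
    if preferred in files:
        return preferred
    if fallback in files:
        return fallback
    return None


# Made-up file listings for two releases of a made-up project.
old_release = ["example-7.x-1.0.tar.gz", "example-7.x-1.0-core.tar.gz", "example-7.x-1.0-no-core.tar.gz"]
new_release = ["example-7.x-1.1.tar.gz"]

print(pick_download("example", "7.x-1.0", old_release))
print(pick_download("example", "7.x-1.1", new_release))
```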

drumm’s picture

Gábor Hojtsy wrote: "your draft queries would still leave data in l10n_server_line referring to files that no longer exist; that would also need to be cleaned up."

I think it will get them all. The query I used, converted to SELECT:

mysql> SELECT count(1) FROM l10n_server_release lsr INNER JOIN l10n_server_line lsl ON lsl.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz';
+----------+
| count(1) |
+----------+
| 36792748 |
+----------+

And when joining via l10n_server_file.fid instead:

mysql> SELECT count(1) FROM l10n_server_release lsr INNER JOIN l10n_server_file lsf ON lsf.rid = lsr.rid INNER JOIN l10n_server_line lsl ON lsl.fid = lsf.fid WHERE lsr.download_link LIKE '%-core.tar.gz';
+----------+
| count(1) |
+----------+
| 36792748 |
+----------+

The same number of rows would be affected.
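
The comparison above can be reproduced on a toy data set. This is a sketch using an in-memory SQLite database instead of the real MySQL tables; the rows are made up and only the columns used in the two queries are modelled. Equal counts suggest that deleting by rid leaves no lines behind that are reachable only through l10n_server_file.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE l10n_server_release (rid INTEGER PRIMARY KEY, download_link TEXT);
CREATE TABLE l10n_server_file (fid INTEGER PRIMARY KEY, rid INTEGER);
CREATE TABLE l10n_server_line (fid INTEGER, rid INTEGER);

-- Made-up rows: one plain release and its two extra variants.
INSERT INTO l10n_server_release VALUES
  (1, 'example-7.x-1.0.tar.gz'),
  (2, 'example-7.x-1.0-core.tar.gz'),
  (3, 'example-7.x-1.0-no-core.tar.gz');
INSERT INTO l10n_server_file VALUES (10, 1), (20, 2), (21, 2), (30, 3);
INSERT INTO l10n_server_line VALUES (10, 1), (20, 2), (20, 2), (21, 2), (30, 3);
""")

# Path 1: join lines directly on the release ID.
by_rid = db.execute("""
    SELECT count(1) FROM l10n_server_release lsr
    INNER JOIN l10n_server_line lsl ON lsl.rid = lsr.rid
    WHERE lsr.download_link LIKE '%-core.tar.gz'
""").fetchone()[0]

# Path 2: join lines through the file table.
by_fid = db.execute("""
    SELECT count(1) FROM l10n_server_release lsr
    INNER JOIN l10n_server_file lsf ON lsf.rid = lsr.rid
    INNER JOIN l10n_server_line lsl ON lsl.fid = lsf.fid
    WHERE lsr.download_link LIKE '%-core.tar.gz'
""").fetchone()[0]

print(by_rid, by_fid)
```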

drumm’s picture

Staging row counts before execution:

mysql> SELECT count(1) FROM l10n_server_release\G
*************************** 1. row ***************************
count(1): 66613
1 row in set (0.01 sec)

mysql> SELECT count(1) FROM l10n_server_line\G
*************************** 1. row ***************************
count(1): 45206861
1 row in set (11.69 sec)

mysql> SELECT count(1) FROM l10n_server_file\G
*************************** 1. row ***************************
count(1): 6851234
1 row in set (1.16 sec)

mysql> SELECT count(1) FROM l10n_server_error\G
*************************** 1. row ***************************
count(1): 674859
1 row in set (0.12 sec)

mysql> SELECT count(1) FROM l10n_packager_release\G
*************************** 1. row ***************************
count(1): 95484
1 row in set (0.02 sec)

mysql> SELECT count(1) FROM l10n_packager_file\G
*************************** 1. row ***************************
count(1): 4623221
1 row in set (0.78 sec)

mysql> SELECT count(1) FROM files\G
*************************** 1. row ***************************
count(1): 4624058
1 row in set (0.78 sec)

mysql> SELECT count(1) FROM file_managed\G
*************************** 1. row ***************************
count(1): 2329304
1 row in set (0.39 sec)
drumm’s picture

The bulk of the ~1.5h time is taken in l10n_server_line. Splitting up those deletes might speed it up, or at least not lock the table for 1.5h.

db_query("DELETE f FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_file} lpf ON lpf.rid = lsr.rid INNER JOIN {files} f ON f.fid = lpf.fid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lpf FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_file} lpf ON lpf.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lpr FROM {l10n_server_release} lsr INNER JOIN {l10n_packager_release} lpr ON lpr.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lse FROM {l10n_server_release} lsr INNER JOIN {l10n_server_error} lse ON lse.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
db_query("DELETE lsf FROM {l10n_server_release} lsr INNER JOIN {l10n_server_file} lsf ON lsf.rid = lsr.rid WHERE lsr.download_link LIKE '%-core.tar.gz'");
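// Delete l10n_server_line rows for 10 releases at a time, so no single DELETE holds the table for long.
$all_rids = db_query("SELECT rid FROM {l10n_server_release} lsr WHERE lsr.download_link LIKE '%-core.tar.gz'")->fetchCol();
foreach (array_chunk($all_rids, 10) as $rids) {
  db_query("DELETE FROM {l10n_server_line} WHERE rid IN (:rids)", array(':rids' => $rids));
}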
db_query("DELETE lsr FROM {l10n_server_release} lsr WHERE lsr.download_link LIKE '%-core.tar.gz'");
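
The batching idea in isolation, as a sketch under stated assumptions: it uses Python with an in-memory SQLite database in place of Drupal and MySQL, made-up rows, and a hypothetical `chunks` helper that plays the role of array_chunk(). Each batch is committed separately, so no single statement has to touch every matching row.

```python
import sqlite3


def chunks(items, size):
    """Yield consecutive slices of at most `size` items (like PHP's array_chunk)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE l10n_server_release (rid INTEGER PRIMARY KEY, download_link TEXT)")
db.execute("CREATE TABLE l10n_server_line (rid INTEGER, value TEXT)")  # 'value' is a placeholder column

# Made-up data: releases 1-25 are '-core' variants, 26-30 are plain; 3 lines each.
for rid in range(1, 31):
    suffix = "-core.tar.gz" if rid <= 25 else ".tar.gz"
    db.execute("INSERT INTO l10n_server_release VALUES (?, ?)", (rid, f"example-{rid}{suffix}"))
    db.executemany("INSERT INTO l10n_server_line VALUES (?, ?)",
                   [(rid, f"string {n}") for n in range(3)])
db.commit()

rids = [row[0] for row in db.execute(
    "SELECT rid FROM l10n_server_release WHERE download_link LIKE '%-core.tar.gz'")]

batches = 0
for batch in chunks(rids, 10):
    placeholders = ",".join("?" * len(batch))  # one '?' per release ID in this batch
    db.execute(f"DELETE FROM l10n_server_line WHERE rid IN ({placeholders})", batch)
    db.commit()  # end the transaction after every batch
    batches += 1

remaining = db.execute("SELECT count(1) FROM l10n_server_line").fetchone()[0]
print(batches, remaining)
```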
Gábor Hojtsy’s picture

Well, unless it's done on cron or something similar, cleaning up a few projects at a time (and then taking a few hours off), it will be blocking anyway. It may be best to get it over with faster (especially if it helps packaging work again), e.g. by taking the site down for 2 hours or so.

drumm’s picture

It doesn't seem to be locking out SELECTs, and each iteration takes less than 30s, often less than 10s. I don't think we need to do a downtime.

Gábor Hojtsy’s picture

ok, fine with me :)

drumm’s picture

After counts:

mysql> SELECT count(1) FROM l10n_server_release\G
*************************** 1. row ***************************
count(1): 64001
1 row in set (0.06 sec)

mysql> SELECT count(1) FROM l10n_server_line\G
*************************** 1. row ***************************
count(1): 8414644
1 row in set (12.71 sec)

mysql> SELECT count(1) FROM l10n_server_file\G
*************************** 1. row ***************************
count(1): 1331178
1 row in set (2.41 sec)

mysql> SELECT count(1) FROM l10n_server_error\G
*************************** 1. row ***************************
count(1): 202672
1 row in set (0.34 sec)

mysql> SELECT count(1) FROM l10n_packager_release\G
*************************** 1. row ***************************
count(1): 92858
1 row in set (0.13 sec)

mysql> SELECT count(1) FROM l10n_packager_file\G
*************************** 1. row ***************************
count(1): 4369033
1 row in set (27.66 sec)

mysql> SELECT count(1) FROM file_managed\G
*************************** 1. row ***************************
count(1): 2307106
1 row in set (1.11 sec)
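
For reference, the rows removed per table follow from subtracting these counts from the staging counts posted before execution. The sketch below only redoes that arithmetic with the numbers from this thread; `files` is left out because no after count was posted for it. The l10n_server_line difference (36792217) is within 531 rows of the 36792748 returned by the earlier SELECT.

```python
# Row counts copied from the "before" and "after" listings in this thread.
before = {
    "l10n_server_release": 66613,
    "l10n_server_line": 45206861,
    "l10n_server_file": 6851234,
    "l10n_server_error": 674859,
    "l10n_packager_release": 95484,
    "l10n_packager_file": 4623221,
    "file_managed": 2329304,
}
after = {
    "l10n_server_release": 64001,
    "l10n_server_line": 8414644,
    "l10n_server_file": 1331178,
    "l10n_server_error": 202672,
    "l10n_packager_release": 92858,
    "l10n_packager_file": 4369033,
    "file_managed": 2307106,
}

removed = {table: before[table] - after[table] for table in after}

for table, count in removed.items():
    share = count / before[table]
    print(f"{table}: {count} rows removed ({share:.1%} of the table)")
```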
drumm’s picture

The reductions in l10n_server_line, l10n_server_file, and l10n_server_error are higher than I was expecting. These are 2 extra copies of core, plus one extra copy of the modules, for each distro release, so it does make some sense.

Otherwise, this looks okay. Needs some click testing to double check, and packaging up in a hook_update_N().

drumm’s picture

Project: Localization server » localize.drupal.org

Since this is Drupal.org-specific, putting the updates here.

  • drumm committed ba53877 on 7.x-1.x
    Issue #1261810: Drupal.org built distributions have multiple release...
drumm’s picture

Status: Active » Fixed
Issue tags: +needs drupal.org deployment
drumm’s picture

Issue tags: -needs drupal.org deployment

Now deployed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.