This issue was originally opened to update the stale countries list in Drupal 7. The aim was to ensure that this list:

  • Includes all of the new countries defined by the International Organization for Standardization (ISO) in ISO 3166-1.
  • Uses the official short country name as defined by the ISO 3166-1 standard.

This means that this issue contradicts the core string freeze policy, but Dries #31 (pre "Netherlands Antilles (historical)") and Gábor Hojtsy #34 have given their approval to this.

Other issues arising from this discussion are:

Proposed resolution

To update the core listings in both Drupal 7 and 8 to match the current ISO standard.
To bypass associated issues with deleting "Netherlands Antilles" by renaming this to "Netherlands Antilles (historical)". This is the only country deletion that has occurred to date within the core maintained list.

Remaining tasks

Drupal 7: Decide on how we should handle "Netherlands Antilles" if "Netherlands Antilles (historical)" is not acceptable.
Drupal 8: Update scripts for removing "Netherlands Antilles".

Please refer to the other issues to continue discussions on these topics.

User interface changes

The following string changes will occur if the proposed resolution is accepted.

Drupal 7 minimal patch

Additional strings

  • Bonaire, Sint Eustatius and Saba
  • South Sudan
  • Sint Maarten (Dutch part)

Drupal 7 complete patch and Drupal 8 patch.

Also includes the following changes

  • Aland Islands to Åland Islands
  • Bolivia to Bolivia, Plurinational State of
  • Brunei to Brunei Darussalam
  • Congo (Brazzaville) to Congo
  • Congo (Kinshasa) to Congo, The Democratic Republic of the
  • Falkland Islands to Falkland Islands (Malvinas)
  • Iran to Iran, Islamic Republic of
  • Ivory Coast to Côte d'Ivoire
  • Laos to Lao People's Democratic Republic
  • Macao S.A.R., China to Macao
  • Macedonia to Macedonia, The Former Yugoslav Republic of
  • Micronesia to Micronesia, Federated States of
  • Moldova to Moldova, Republic of
  • Netherlands Antilles to Netherlands Antilles (historical)
  • North Korea to Korea, Democratic People's Republic of
  • Palestinian Territory to Palestinian Territory, Occupied
  • Reunion to Réunion
  • Russia to Russian Federation
  • Saint Helena to Saint Helena, Ascension and Tristan da Cunha
  • South Korea to Korea, Republic of
  • Taiwan to Taiwan, Province of China
  • Tanzania to Tanzania, United Republic of
  • U.S. Virgin Islands to Virgin Islands, U.S.
  • Vatican to Holy See (Vatican City State)
  • Venezuela to Venezuela, Bolivarian Republic of
  • Vietnam to Viet Nam
  • British Virgin Islands to Virgin Islands, British

Original report by TR


Comments

TR’s picture

Assigned: Unassigned » TR
Status: Active » Needs review
Anonymous’s picture

All works fine for me.

Only thing I would be worried about is the effect of any deletions on existing sites, in this case there's one deletion 'AN' for Netherlands Antilles as it's now split into two (http://www.iso.org/iso/pressrelease.htm?refid=Ref1383).

If there are no consequences from this removal then I'd say it's fine and a good idea to update the iso.inc, otherwise there should be some thought put into an upgrade path - I have one client who's in the Netherlands Antilles, but not on a Drupal site :(

TR’s picture

Status: Needs review » Needs work

Good point. I'll work on a hook_update() and post a new patch.

catch’s picture

Alan D.’s picture

Cut n paste from the above issue, for easier issue searches:

ISO country changes, as per http://www.iso.org/iso/iso_3166-1_newsletter_vi-8_split_of_the_dutch_ant...

Deleted

* "Netherlands Antilles" (AN)

Added

* "Curaçao (CW)"
* "Sint Maarten (Dutch part) (SX)"
* "Bonaire, Saint Eustatius and Saba (BQ)"

I think there may be a number of unhappy users if a lot of the common names like "Bolivia" and "Venezuela" get changed to the longer official names such as "Bolivia, Plurinational State of" and "Venezuela, Bolivarian Republic of". I've been traveling in South America for 3 months and I have never once heard the official short names used...

I know that webform uses the iso.inc for its country list, as does the countries module (deleting will have no effect on this one), so there are at least two database references to handle here. There may be more. It may be safest to leave the older reference in for Drupal 7 and remove it in Drupal 8.

TR’s picture

ISO sets the standards, we use them. The ISO 3166 set of country names is the closest thing there is to a worldwide agreed-upon list of names, and it is the countries themselves and the international community, through the UN, that has decided what they want their names to be. It is not up to us to decide in this forum that ISO is wrong and to "correct" their choice - that would be a subjective political decision rather than an objective choice of standardized names.

Once we start down the road of judging what a name "should" be there's no going back and you might as well throw out any pretense of conforming to a standard. Why not use Burma instead of Myanmar? The US and many EU nations insist on the former, because they don't recognize/approve of the government that changed the name. Maybe Tibet deserves to be listed separately? And why not the United States of America instead of just United States? Should we remove the word "Democratic" from the names of those countries which clearly *aren't* ? Why does Drupal refuse to recognize Transnistria?

As far as Bolivia and Venezuela, those names were chosen by the governments of those countries. They were both changed in the standard just last year, on 2010-02-03 (see http://www.iso.org/iso/iso_3166-2_newsletter_ii-1_corrected_2010-02-19.pdf). We should respect those countries' choice of how they want to be named.

It's a pretty easy decision - either conform to the ISO standard or not.

Any module or site that disapproves of the official names is free to modify them using hook_countries_alter(). Drupal itself should stick to the standard.

I am adamantly opposed to putting this change off until Drupal 8. The political divisions in this world are constantly changing. I'm pretty sure that was known when it was decided to put this list into Drupal. This list was meant to be complete and to keep up with changes in the world, not to be stagnant for the next two years until a major release. It would be totally unacceptable to tell residents of Sint Maarten, for example, that Drupal won't support their country until sometime after 2013.

Yes, this change may have minor consequences for contributed modules; they will have to be able to deal with new countries that didn't exist last year, and they will have to deal with countries that don't exist anymore this year. That's what hook_countries_alter() will easily solve, and that is the nature of a list of countries - it will *always* be changing. Drupal core should keep up with the changes. The only implication for core is that sites which had chosen a country of "Netherlands Antilles" (deleted) will have to switch to using their actual current country name. This can be easily handled with a system_update_N().
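
As a minimal sketch of the hook_countries_alter() route mentioned above: a site or module that prefers the common names could relabel individual codes while core keeps the ISO names. The module name "mymodule" and the chosen labels are hypothetical, and the t() stub is only there so the snippet runs outside Drupal.

```php
<?php
// Stand-in for Drupal's t() so the sketch runs outside Drupal (assumption:
// inside Drupal the real t() is already defined and this stub is skipped).
if (!function_exists('t')) {
  function t($string) {
    return $string;
  }
}

/**
 * Implements hook_countries_alter().
 *
 * "mymodule" is a hypothetical module name.
 */
function mymodule_countries_alter(&$countries) {
  // Site-specific preferences: ISO 3166-1 alpha-2 code => preferred label.
  $preferred = array(
    'BO' => t('Bolivia'),
    'VE' => t('Venezuela'),
  );
  foreach ($preferred as $code => $name) {
    // Only relabel codes that the incoming list actually contains.
    if (isset($countries[$code])) {
      $countries[$code] = $name;
    }
  }
}

// Usage outside Drupal: simulate the list that would be passed to the hook.
$countries = array(
  'BO' => 'Bolivia, Plurinational State of',
  'SX' => 'Sint Maarten (Dutch part)',
  'VE' => 'Venezuela, Bolivarian Republic of',
);
mymodule_countries_alter($countries);
print_r($countries);
```

Codes that are not listed in $preferred, such as SX here, keep whatever name core provides.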

Anonymous’s picture

I have to agree with @TR:

>It's a pretty easy decision - either conform to the ISO standard or not.

>Any module or site that disapproves of the official names is free to modify them using hook_countries_alter(). Drupal itself should stick to the standard

It's our job to implement chosen standards, not to debate those standards, that's what the ISO do.

Alan D.’s picture

One of the reasons I wrote the countries module was to handle different naming preferences, so no big issue with the name side of things. This provides an interface to a database that is used to update the core country list via the alter hook. While I personally agree with using ISO standards, many clients do not, which is the bottom line for many of us in the long run.

I'd run the string changes past one of the locale/language maintainers, as we are well past the string freeze date.

The only issue I personally have with the patch is that deleting AN is potentially going to lead to data loss. Webform, countries, probably location, maybe gmap, ....., may already have references to this country. Even core references can not be 100% converted as 1 > 3 countries is not possible without further import, like more detailed address info or user input. Thus a simple update patch is not possible. :(

Anonymous’s picture

>One of the reasons I wrote the countries module was to handle different naming preferences, so no big issue with the name side of things. This
>provides an interface to a database that is used to update the core country list via the alter hook. While I personally agree with using ISO standards, many clients do
>not, which is the bottom line for many of us in the long run.

Interesting... reason I ended up in this conversation is because I'm needing to make fields for countries and languages and have been going through the options. Tried country cck field and taxonomies but hadn't seen your module. Just been trying it out and it does some of what I need - "country of origin" ok as just need to choose one out of a list of countries, "subtitles" I can't seem to get working as I only see one language in the list, the one I have installed rather than a list of languages. I have been trying out an unofficial D7 port of the iso_639 cck field but that needs hacking too to get working for all languages :(

I also have to work out how to create an easily multiple selectable list of countries grouped by region so whole regions can be clicked then a fly-out list of countries can be done so you can untick one or two - a bit like when you install software packages through a gui installer and you can select some/none/all. At the moment I'm just using a taxonomy multiple select with two level hierarchy for prototyping, but it's not ideal obviously.

I know this isn't the right place for this particular discussion but there seem to be so many different ways of skinning the kittens it's hard to know which way to go! I've got video formats to sort out next... h264 but that's another story ;)

>I'd run the string changes past one of the locale/language maintainers, as we are well past the string freeze date.

Sounds good.

>The only issue I personally have with the patch is that deleting AN is potentially going to lead to data loss. Webform, countries, probably location, maybe gmap, .....,
>may already have references to this country. Even core references can not be 100% converted as 1 > 3 countries is not possible without further import, like more
>detailed address info or user input. Thus a simple update patch is not possible. :(

As I pointed out in my original reply, some kind of upgrade path is needed. It broke for me when location module started using this list a few days ago, and however interesting and compelling this debate is, it'd be good to get some kind of resolution to it so I know how best to implement what I'm doing right now.

rooby’s picture

Subscribing so I can make the location module conform when this is resolved.

Anonymous’s picture

I was wondering the other day what was happening here... any further thoughts/news?

droplet’s picture

Component: other » language system
Assigned: TR » Unassigned
Status: Needs work » Needs review
FileSize
16.23 KB
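// Generate the country array entries from the ISO 3166-1 semicolon-delimited list.

print '<pre>';

// Each data line has the form "COUNTRY NAME;CODE". Requires allow_url_fopen.
$lines = file('http://www.iso.org/iso/list-en1-semic-3.txt');
if ($lines === FALSE) {
  $lines = array();
}

$iso = array();
foreach ($lines as $line) {
  $country = explode(';', trim($line));
  // Skip any header or blank lines: keep only rows with a two-letter code.
  if (count($country) < 2 || !preg_match('/^[A-Z]{2}$/', $country[1])) {
    continue;
  }
  // Re-capitalize the upper-case ISO name one word at a time.
  $country_name = preg_replace_callback('/\w+/', 'format_name', $country[0]);
  $iso[$country[1]] = addslashes($country_name);
}

// Sort by country code.
ksort($iso);

foreach ($iso as $country_code => $country_name) {
  printf("'%s' => \$t('%s'),\n", $country_code, $country_name);
}

print '</pre>';

/**
 * Converts one upper-case word from the ISO list to mixed case.
 */
function format_name($match) {
  // Words that stay lower case.
  $exclude = array('and', 'not');
  if (in_array(strtolower($match[0]), $exclude)) {
    return strtolower($match[0]);
  }

  // Special case for "Heard Island and McDonald Islands".
  if ($match[0] === 'MCDONALD') {
    return 'McDonald';
  }

  return ucfirst(strtolower($match[0]));
}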

plach’s picture

Status: Needs work » Needs review

We had to face a similar update issue in #385296: Standardize the language selector for the Philippine language. which actually stopped it, since we have not decided yet how to treat obsolete language codes.

I've been thinking about this for a while: iso.inc contains a list of known language codes, which we use to prepopulate some properties of the language object. OTOH we also support custom languages, which internally are treated exactly in the same way as the languages from iso.inc. So actually there should be no problem in changing the language codes returned by _locale_get_predefined_list, since obsolete languages would simply be treated as custom ones.

In the same way, if we are able to ensure that core can handle obsolete country codes as "custom" country codes, for instance by implementing hook_countries_alter in core, it should not be too problematic if we have to remove an entry from iso.inc.

Anyway, the patch in #13 has some capitalization issues and is missing an upgrade path.
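
A rough sketch of the "treat obsolete codes as custom codes" idea, assuming the site can somehow produce the list of country codes its stored data still references. The function name and both extra parameters are hypothetical; nothing like this exists in core.

```php
<?php
/**
 * Re-adds withdrawn codes that stored data still references.
 *
 * @param array $countries
 *   Current list, keyed by ISO 3166-1 alpha-2 code.
 * @param array $codes_in_use
 *   Codes found in site data (hypothetical input).
 * @param array $obsolete
 *   Labels for withdrawn codes, keyed by code (hypothetical input).
 *
 * @return array
 *   The list with any still-referenced obsolete codes restored.
 */
function example_restore_obsolete_countries(array $countries, array $codes_in_use, array $obsolete) {
  foreach ($codes_in_use as $code) {
    // Restore a code only if it is missing and we know a label for it.
    if (!isset($countries[$code]) && isset($obsolete[$code])) {
      $countries[$code] = $obsolete[$code];
    }
  }
  return $countries;
}

// Usage: AN is gone from the current list but one record still uses it.
$current = array(
  'BQ' => 'Bonaire, Sint Eustatius and Saba',
  'CW' => 'Curaçao',
  'SX' => 'Sint Maarten (Dutch part)',
);
$restored = example_restore_obsolete_countries(
  $current,
  array('AN', 'CW'),
  array('AN' => 'Netherlands Antilles')
);
print_r($restored);
```

Inside Drupal the same loop could sit in a hook_countries_alter() implementation, so existing references keep a label without the code being offered by the default list.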

plach’s picture

Status: Needs review » Needs work
TR’s picture

Component: language system » other
Assigned: Unassigned » TR
Status: Needs review » Needs work

@droplet: What's wrong with my original patch? Did you look at it? By posting a second patch you're implying that the data I used in my patch is wrong. It is not. It would be far more helpful if you made the effort to verify the data in my patch and confirm that it is correct or point out any errors that may be there.

Your patch has a number of problems:

1) You haven't addressed the need for a hook_update_N(), which is the reason the thread status was set to "Needs work".

2) Your capitalization scheme is wrong and you're not following Drupal coding conventions for use of quotes. For example: $t('Korea, Democratic People\'S Republic Of').

3) Your patch is really impossible to review for accuracy because you're deleting the entire contents of the _country_get_predefined_list() function and replacing it with your own list - no one is going to compare all 248 countries "before" and "after" to try to figure out what changes you've made, then try to compare it with the actual ISO standards (like I did with my patch) to make sure your data is correct.

4) I made an effort to improve the documentation comments and bring them up to Drupal's documentation standards. You did not.

Country names aren't part of the language system at all, so I think it's inappropriate to change the category. Assigning this back to myself to reflect my intention of writing AND TESTING a hook_update_N() and addressing any remaining issues with my original patch.

plach’s picture

@TR:

All true but your patch in the OP is not in unified format :)

droplet’s picture

@TR,
you're welcome to fix my patch or your own issue :)
I'm copying my work from #1075420: 4 country changes for iso.inc with a few updates.

I'm not sure it really needs a hook_update_N(). When a country is removed, you can't assign a new one to its users (it's not a 1:1 mapping). Asking users to reset their country isn't something hook_update_N() can do, I think. (Users can choose a country again without hook_update_N(), and I see no errors.)

Waiting for your work :)

TR’s picture

@plach: True, I don't know why I did it that way this one time. However, the testbot was able to recognize, apply, and test the patch without a problem, so it seems it's an acceptable format nonetheless.

TR’s picture

Version: 7.x-dev » 8.x-dev
Status: Needs work » Needs review
FileSize
7.98 KB

Here is a re-roll of the original patch against the current D8 HEAD. Note that one item in the original patch has since been fixed in a separate issue: #1313342: Curacao should appear in standard_country_list(). I would prefer to take care of *all* these issues at once, rather than deal with them piecemeal in a dozen separate issues.

TR’s picture

Title: includes/iso.inc contains inaccurate country data » core/includes/standard.inc contains inaccurate country data
Issue tags: +Needs backport to D7

Updating issue title to reflect current filename in D8 HEAD.
Tagging for backport to D7.

aspilicious’s picture

Status: Needs review » Needs work
+++ b/core/includes/standard.inc
@@ -6,14 +6,18 @@
+ * @link http://drupal.org/project/location the location project @endlink ¶

trailing whitespace


TR’s picture

Status: Needs work » Needs review
FileSize
7.98 KB

Whoops, sorry about that. Here's a new patch without the trailing whitespace.

Alan D.’s picture

Status: Needs review » Needs work

Libyan Arab Jamahiriya is now just Libya (again): http://www.iso.org/iso/nl_vi-11_name_change_for_libya.pdf

Did any of the core maintainers actually give feedback about string changes? I see this as a non-issue personally, but I guess if I was running a Chinese website and suddenly 8 or 9 countries started coming through with Latin characters...

TR’s picture

Status: Needs review » Needs work
FileSize
7.85 KB

Wow, the UN works fast sometimes. Fixed Libya. Thanks for pointing that out.

I think Latin characters should be a non-issue because:
1) standard.inc in D8 and iso.inc in D7 already use Latin characters, i.e. Saint Barthélemy and Curaçao. This patch isn't breaking new ground. And,
2) #1313342: Curacao should appear in standard_country_list() was committed recently, which introduced additional Latin characters. So the core committers have already blessed the notion of Latin characters in standard.inc (and iso.inc).

This patch adds two more country names with Latin characters: "Côte d'Ivoire" and "Réunion". But note that these are the standardized ISO 3166-1 country names for those countries. The normal Drupal translation mechanism works for translating these into languages like Chinese, because standard.inc (and iso.inc) wrap the country names in $t().

TR’s picture

Status: Needs work » Needs review

And back to needs review for the testbot.

Alan D.’s picture

Status: Needs work » Needs review

I would love to see this back-ported to D6/7, although D7 has a workaround via the countries module.

But can we back-port the string change to D6/7 after the string freeze? This is against the very core coding policies, but I think that this is worth an exception, at least for the new countries.

TR’s picture

Big picture: I think it was a bad idea to hard-code countries into Drupal in the first place. Things like the string freeze, a multi-year release cycle, and lack of revisioning for built-in data pretty much ensures that Drupal core will always be way out of date. As I mentioned earlier in the thread, country names are inherently dynamic data; the fact that Libya has changed so much is a great example. In the 10 months since I first posted this issue, Libya has had a revolution and now has a new government and has gone through the process of getting its ISO country name changed. Are we going to wait until Drupal 8 is released before we recognize this? Does Drupal really have more inertia than the UN?

So what can be done to fix this? First and foremost is to promptly address any changes. (Read: Please commit my patch!) In the long run, however, I think the whole notion of distributing data wrapped in code needs to be reconsidered. Country names, IMO, should be versioned data stored in the DB, and not stored in include files with no context of what the names used to be or what they have been changed to. Without versioning, contributed modules can't properly track changes in core.

plach’s picture

Issue tags: +D8MI

No time to review this, but #568986: Dynamically update standard language list from localization server seems in line with the arguments in #27. Probably we should open an issue to try and feed our country list from a web service or at least make it dynamic.

Adding to the D8MI initiative to see if we can get some attention.

Alan D.’s picture

It is a bit of a difficult one. Moving this to a web service would add a slight layer of complication to the code and only really results in removing the string freeze exception via proxy. Also, we have multiple sites running behind firewalls that would not be able to use a web service for updates.

You must remember that the UN is fairly inert too, so if Drupal continues to make consistent periodical releases, and there is an exception to the string freeze on languages / countries (and whatever other lists make it into core), this would keep 99.9% of users happy. Contrib can cater for the needs of the other 0.1%.

As an aside, core has code from many contributed modules, so why does the location module have the honor of being the only module that is credited as the source within core (via search of the project URL)? Particularly as this list is in the public domain and there is no IP directly related to that project.

BTW, happy with the patch - I've been tracking the ISO changes for a year or two now. Let's get this in and close this thread for the meantime!

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Looks good. Agreed with Alan D. on how we can handle the updates in the future. (Not sure this has a lot of relation to D8MI though, keeping tag anyway.)

Dries’s picture

I think this patch looks good.

One could argue that we need an upgrade path here as some country codes cease to exist, and they may be used in various database tables.

I don't think there is anything we can do in core (?), but we should document this change so site owners can take steps to ensure data integrity.

Gábor Hojtsy’s picture

Status: Reviewed & tested by the community » Needs work

@Dries: That is true. Looking at the patch, AN is removed and SX and BQ are added (with CW added in a separate issue). These three are a split of AN. We can theoretically pick any one of them in the update path and put out a note to the user in case we did not pick the right one, I guess. The only place country is used in core is the global site country setting, I believe, so we'd need to update that.
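
A minimal sketch of that "pick one and tell the user" update path, written as a plain function so it runs outside Drupal. In Drupal 7 this logic would presumably live in a system_update_N() that reads and writes the site default country with variable_get()/variable_set(); the variable name is assumed to be site_default_country, and the CW fallback is an arbitrary choice for illustration.

```php
<?php
/**
 * Picks a replacement when the stored default country is the withdrawn 'AN'.
 *
 * @param string $current
 *   The stored country code.
 * @param string $fallback
 *   Replacement code to use for 'AN' (arbitrary default, an assumption).
 *
 * @return array
 *   array(new code, message). The message is NULL when nothing changed.
 */
function example_migrate_default_country($current, $fallback = 'CW') {
  if ($current !== 'AN') {
    return array($current, NULL);
  }
  $message = 'The default country "Netherlands Antilles" (AN) no longer exists and was changed to '
    . $fallback . '. Please review this setting.';
  return array($fallback, $message);
}

// Usage: a site that had selected the Netherlands Antilles.
list($code, $message) = example_migrate_default_country('AN');
echo $code, "\n", $message, "\n";
```

The returned message is what the update would show to the administrator, since the code cannot know which of the three successors is correct.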

Alan D.’s picture

Or an alternative to avoid the need for an update:

-    'AN' => $t('Netherlands Antilles'),
+    'AN' => $t('!country (historical)', array('!country' => $t('Netherlands Antilles'))),

This string is 33 characters long, making it the 4th longest string in the array. While this is officially deleted from the ISO table, it is assigned as a "Transitionally reserved code element", of which there are codes up to 2 decades old still listed (e.g. BU - Burma, removed in 1989, not to be confused with MM - Myanmar).

Gábor Hojtsy’s picture

That could also work, since it is pretty impossible to tell which portion we should migrate the country info to. I do think we can just do $t('Netherlands Antilles (historical)') all in one. I would not stress over it being a new string.

Gábor Hojtsy’s picture

Issue tags: +language-content

Tagging for the content handling leg of D8MI.

Gábor Hojtsy’s picture

Issue tags: -language-content

Uhm, wrong issue :)

@Alan D.: can you post an update with that suggestion to rename as historical? I think that would be good to go.

Alan D.’s picture

Status: Needs review » Needs work
FileSize
7.42 KB

This is a re-roll of TR's patch, but focused only on countries and corrections to BQ and addition of SS.

[edit]
For a complete list of changes see the issue summary.

Alan D.’s picture

Status: Needs work » Needs review
Alan D.’s picture

Here are two Drupal 7 patches. The first does the additions only and does not change any existing names. The second does the additions and the updates.

Gábor Hojtsy’s picture

Status: Needs review » Reviewed & tested by the community

Looks good for D8 again (looks like only minor changes from previous patch). And with keeping the historic country name we keep compatibility and let people choose the right new one when/if they need, no need for an update.

Damien Tournoud’s picture

Status: Reviewed & tested by the community » Needs review
-    'AN' => $t('Netherlands Antilles'),
+    'AN' => $t('Netherlands Antilles (historical)'),

This one has been withdrawn in 1993. There is no reason to keep it in there. We don't have listing for many other historical names, even withdrawn *after* this one. (Example: Serbia and Montenegro, ex 'CS', withdrawn in 2006).

-    'MK' => $t('Macedonia'),
+    'MK' => $t('Macedonia, The Former Yugoslav Republic of'),

Many people would consider this a regression. See Debian and Mozilla. The Debian project picked "Macedonia, Republic of", which seems to be consensual.

Also, the Debian project maintains a really nice set of XML files for ISO codes, and we should consider just using those. I checked and they match the patches posted here (with the minor exception of the accent missing on "Réunion" in the Debian XML files, which is definitely a bug there).

Gábor Hojtsy’s picture

@Damien: how would you do an update routine for the historic country code for a country that was split in different codes?

Damien Tournoud’s picture

Either we put all the ISO-3166-3 data in our tables, or we put none of it. I don't see a reason to keep data in there just because it used to be there. We can ask the affected users to upgrade manually (we don't have an upgrade path for Sudan vs. South Sudan either, keeping all the people affected to Sudan is equally wrong).

I opened a Debian bug for the spelling of "Réunion".

Damien Tournoud’s picture

FileSize
530 bytes

Attached is a simple parser that generates this list based on the iso-3166.xml file from Debian.
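
For readers without the attachment, here is a sketch of what such a parser might look like. It is not the attached file. It assumes the Debian XML uses iso_3166_entry elements with alpha_2_code and name attributes, and it parses a two-entry inline sample instead of the real file.

```php
<?php
// Two-entry sample in the attribute layout assumed for Debian's XML file.
$xml = <<<XML
<iso_3166_entries>
  <iso_3166_entry alpha_2_code="SS" alpha_3_code="SSD" numeric_code="728" name="South Sudan" official_name="Republic of South Sudan"/>
  <iso_3166_entry alpha_2_code="CW" alpha_3_code="CUW" numeric_code="531" name="Curaçao"/>
</iso_3166_entries>
XML;

/**
 * Builds a code => name array from ISO 3166 XML (hypothetical helper).
 */
function example_parse_iso_3166($xml) {
  $countries = array();
  $doc = simplexml_load_string($xml);
  if ($doc === FALSE) {
    // Unparseable input: return an empty list rather than failing.
    return $countries;
  }
  foreach ($doc->iso_3166_entry as $entry) {
    $code = (string) $entry['alpha_2_code'];
    $name = (string) $entry['name'];
    if ($code !== '' && $name !== '') {
      $countries[$code] = $name;
    }
  }
  // Sort by country code.
  ksort($countries);
  return $countries;
}

// Emit lines in the same shape as the entries in the core list.
foreach (example_parse_iso_3166($xml) as $code => $name) {
  printf("'%s' => \$t('%s'),\n", $code, addslashes($name));
}
```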

TR’s picture

Why would we take the data from Debian, when we have the authoritative data available directly from ISO? The @see link in the patch takes you right to the ISO data. http://www.iso.org/iso/country_codes/iso_3166_code_lists.htm

Re: Macedonia

>Many people would consider this a regression. See Debian and Mozilla. The Debian project picked "Macedonia, Republic of", which seems to be consensual.

The keyword here is "picked". It's not our job to decide what the names "should" be. As I said in #6:

ISO sets the standards, we use them. The ISO 3166 set of country names is the closest thing there is to a worldwide agreed-upon list of names, and it is the countries themselves and the international community, through the UN, that has decided what they want their names to be. It is not up to us to decide in this forum that ISO is wrong and to "correct" their choice - that would be a subjective political decision rather than an objective choice of standardized names.

See #6 for further argument along this line. Do we really have to rewind this discussion back to where it was 11 months ago?

Damien Tournoud’s picture

>Why would we take the data from Debian, when we have the authoritative data available directly from ISO?

Because ISO data is not readily available, and Debian is doing a great job at maintaining accurate, ISO-compliant lists of language, territory, currency, script codes (*and* their translations in many languages).

This is precisely what we failed to do all those years, and is precisely the point of this issue.

TR’s picture

I disagree that ISO data is "not readily available". The link in #45 takes you to a page where the most current, up-to-date data is available directly from ISO as HTML, XML, or Text (name and alpha-2, semi-colon delimited).

Debian is a secondary source, which according to #41 has also been modified so that it does not agree with the ISO 3166-1 standard. Plus, there is certainly a lag between when the official ISO data (see link in #45) changes and when Debian publishes its new version. Relying on a secondary source rather than the primary source also exposes us to bugs like #43, where the Debian data does not agree with the official, standardized ISO data.

I see two distinct issues here:
1) Should we update the core Drupal list of countries to conform to the ISO standard?
I think the answer to this is unequivocally YES.

2) Do we need a new mechanism for including country data in Drupal?
Again, I think the answer is YES. Please read my post in #27.

This issue is about 1). While I think 2) is important in the long run, let's not allow that to distract us from fixing what is a simple, minor correction and improvement of the country data currently being distributed with Drupal.

Damien Tournoud’s picture

Status: Needs review » Needs work

#41-1 is still valid and requires a change in that patch. If we want to stick to the official ISO-3166-1 list, we need to remove 'AN' from the list.

Debian has a very well maintained secondary database of this, we should use it instead of pretending that:

  1. we follow ISO-3166 (we are not, ISO-3166 only contains the uppercase version of the official short names), and
  2. we are doing a good job at maintaining our secondary database (we are not, our version contains entries that are 19 years out-of-date)

ISO-3166 data is not readily available (the full database in a closed format is available from ISO for 162,00 CHF), incomplete for our use case and hard to track. We should stop hurting ourselves.

Alan D.’s picture

Status: Needs work » Needs review

Regarding #41 & #48

Debian has a very well maintained secondary database of this, we should use it instead of pretending that. This one (Netherlands Antilles) has been withdrawn in 1993

The ISO only officially removed this in ISO 3166-1 Newsletter VI-8 (2010-12-15) as a result of the other constitutional changes happening in this part of the world. If your ISO source removed this 17 years earlier, I would not trust it. As it is a territory, this may have been removed then added before being split again over the 17 year gap.

>The Debian project picked "Macedonia, Republic of",

I have to agree with TR. The ISO is based off UN data that is itself not 100% agreed on by the political parties. However, it is the best source. There are about 20 entries that I would not pick personally due to the common usage of the old or shortened versions in common English. Some examples are: Bolivia, North/South Korea, Venezuela. I know of many countries that would exclude Palestine outright.

>ISO-3166 data is not readily available (the full database in a closed format is available from ISO for 162,00 CHF), incomplete for our use case and hard to track. We should stop hurting ourselves.

See

Last night, I copied the data via Firefox, pasted this into Excel with the core list via a couple of string replacements in a text editor and did a comparison using "= IF(LOWER(TRIM())=LOWER(TRIM()),1,0)". This was the first time that I tried this and it took all of 15 minutes, give or take. This is a one-off process once we get the list updated. Then track the update list that publishes all changes in the ISO database via email or http://www.iso.org/iso/country_codes/updates_on_iso_3166.htm

I do not mind that this data is in core; it is simple and fairly trivial to keep accurate (if we move on issues like this in a timely fashion). There should not be any issue from removing a country code in core, but there are many other uses, including currency mapping in contrib. So data integrity is an issue. So I would say: only remove countries during major release cycles, nearing the first Drupal alpha release in case there are others, and add a note to the upgrade notes that the country was deleted and that any modules that rely on the data should update this themselves during their major upgrade.
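
The same trimmed, case-insensitive comparison can be sketched in code instead of a spreadsheet. The function names and the sample data below are made up for illustration; the real inputs would be the core list and the current ISO list, both keyed by alpha-2 code.

```php
<?php
/**
 * Normalizes a name the way the spreadsheet formula does: trim, lower-case.
 */
function example_normalize($name) {
  $name = trim($name);
  // Prefer the multibyte-aware function so names like "RÉUNION" fold correctly.
  return function_exists('mb_strtolower') ? mb_strtolower($name, 'UTF-8') : strtolower($name);
}

/**
 * Reports codes added to, removed from, or renamed in the ISO list.
 */
function example_compare_country_lists(array $core, array $iso) {
  $report = array('added' => array(), 'removed' => array(), 'renamed' => array());
  foreach ($iso as $code => $name) {
    if (!isset($core[$code])) {
      $report['added'][$code] = $name;
    }
    elseif (example_normalize($core[$code]) !== example_normalize($name)) {
      // Keep both names so a reviewer can see the change.
      $report['renamed'][$code] = array($core[$code], $name);
    }
  }
  foreach ($core as $code => $name) {
    if (!isset($iso[$code])) {
      $report['removed'][$code] = $name;
    }
  }
  return $report;
}

// Usage with a small made-up sample of both lists.
$core = array('AN' => 'Netherlands Antilles', 'BN' => 'Brunei', 'SD' => 'Sudan');
$iso = array('BN' => 'BRUNEI DARUSSALAM', 'SD' => ' SUDAN ', 'SS' => 'SOUTH SUDAN');
$report = example_compare_country_lists($core, $iso);
print_r($report);
```

Entries that differ only in case or surrounding whitespace, like SD in the sample, are treated as unchanged.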

Removing this list from core would remove the issues that are preventing these patches from being committed, but I think that can simply be resolved via actually having a policy to reflect that this data should be treated differently from core.

Draft policy for handling country data

The country list maintained by core contains the latest representation of all the countries defined in the International Organization for Standardization (ISO) ISO 3166-1:2006 standard up to the point of the string freeze.

However, this data is exempt from the string freeze policy. During the release cycle of a major release, additional countries may be added and country names may be altered to reflect the latest data available from the ISO. Countries that are removed from the ISO 3166-1 standard during the maintenance cycle are marked as "(historical)", but are not removed until the next major release.

Patches that add countries, or that rename a country significantly to reflect a broader constitutional change beyond the borders of the existing country, should be tagged with "release blocker" to force the inclusion of this data into the next release cycle.

The upgrade process between major releases should try and predict the most common result based on the population. If a country splits into two, then the existing settings will be updated to the new country that has the highest population.

Examples:

Sudan was split into Sudan (Pop: 31 million) and South Sudan (Pop: 8 million). Since the major population base is Sudan, no update script is required.

The addition of South Sudan would be marked as a "release blocker"

Saint Helena was renamed to Saint Helena, Ascension and Tristan da Cunha. This includes an extension of the territorial boundaries and is considered a "release blocker". This is the only example to date where a naming change reflects a significant territorial change.

The renaming of Syria to Syrian Arab Republic and then later renaming back to Syria would not be considered a release blocker, but we would expect these changes to be included in a timely fashion.
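As a starting point for the examples still to be added, the population rule from the draft policy could be sketched roughly as follows. The function name is hypothetical, and the figures are the ones quoted in the Sudan example above.

```php
<?php
/**
 * Picks the successor country with the largest population.
 *
 * Hypothetical helper; $successors maps country code => population.
 */
function example_pick_successor(array $successors) {
  // Highest population first; arsort() preserves the country-code keys.
  arsort($successors);
  reset($successors);
  return key($successors);
}

// Populations in millions, as quoted in the draft policy.
$sudan_split = array('SD' => 31, 'SS' => 8);
$target = example_pick_successor($sudan_split);
echo $target, "\n";
```

Because the result for the Sudan split is the existing code, stored values already point at the right country, which is why the policy says no update script is required in that case.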

TODO
Add examples of how to handle the changes, the alter hook and maybe contrib countries module.
Add a summary of the UN and ISO policies for naming and their release cycle.

Alan D.’s picture

Status: Needs work » Needs review

One final push to get things moving...

Naming

I hope that the country naming issue is resolved. It is nice to be able to answer questions like "Why isn't Wales or Scotland listed?", "Why is Palestine listed?", or even "Why is Israel listed?" with a generic answer: that we base things off the globally accepted standard defined primarily by decisions within the UN.

Palestine and Israel are great examples for the use of a UN-based ISO standard. The State of Palestine is recognized by only about half of the world's countries, and Israel is not recognized by 33 countries, including 1 UN member.

I can imagine an answer like "We base things off a computer system that you are unlikely to know anything about." as a recipe for a very perplexed response.

Country removal issue

I have created the issue #1436754: Handle countries that have been removed from the CLDR standard (e.g: Netherlands Antilles) or changed (e.g.: Czechoslovakia (cs) => Czech (cz) | Slovakia (sk)) to separate the removal of "Netherlands Antilles" that is now holding things up.

Note that Curaçao was listed alongside Netherlands Antilles, so another option is to do nothing with Netherlands Antilles for the time being. Netherlands Antilles was dissolved on 10 October 2010, and Curaçao was one of the official entries created, along with Sint Maarten and "Bonaire, Sint Eustatius and Saba". They should never have been listed together (not post ISO 3166-1:2006 anyway).

To get this moving again, happy to re-roll without renaming Netherlands Antilles.

Alan D.’s picture

Issue summary: View changes

Updated issue summary to help track the individual sub-issues raised within this issue thread.

Alan D.’s picture

Status: Needs review » Needs work

This may be a Windows issue, but the Aland Islands to Åland Islands change results in a list where Åland Islands is listed last, as Å is considered greater than Z on Windows using natcasesort().

Unless there is a built-in safe mechanism for sorting these correctly, I would leave Åland Islands as Aland Islands for Drupal 7 and look at resolving this correctly for Drupal 8 (e.g. sort based on transliterated strings).
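To illustrate the transliterated-sort idea, here is a minimal sketch that sorts on an ASCII key while keeping the original labels. The function name is hypothetical and the replacement map is a tiny hand-written sample, not a real transliteration table; the file is assumed to be saved as UTF-8.

```php
<?php
/**
 * Sorts country names by an ASCII sort key, keeping the original labels.
 *
 * Hypothetical helper; the map below is only a small illustrative sample.
 */
function example_sort_countries(array $countries) {
  $map = array('Å' => 'A', 'ô' => 'o', 'ç' => 'c', 'é' => 'e');
  $keys = array();
  foreach ($countries as $code => $name) {
    // Build the sort key; the displayed label is left untouched.
    $keys[$code] = strtr($name, $map);
  }
  natcasesort($keys);
  $sorted = array();
  foreach ($keys as $code => $unused) {
    $sorted[$code] = $countries[$code];
  }
  return $sorted;
}

$list = array(
  'ZW' => 'Zimbabwe',
  'AX' => 'Åland Islands',
  'AL' => 'Albania',
  'AF' => 'Afghanistan',
);
$sorted = example_sort_countries($list);
print_r($sorted);
```

Sorting on the key rather than the label keeps "Åland Islands" among the other A entries without changing what users see.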

tim.plunkett’s picture

FileSize
6.67 KB

Reuploading the patch from the other issue so it doesn't get lost. Leaving at CNW.

sun’s picture

I agree with @Damien Tournoud in that we need an automated solution for this. Manually updating the list does not work out.

  1. I'd like to see a script for the ISO/UN data, similar to the one @Damien coded for Debian's iso-3166.xml.
  2. I'd like to see the results of both scripts as patches, and an interdiff between the two, so all of us can see facts.
  3. Based on that, we want to decide which data source is easier to use for updating the list programmatically.
  4. We need to invent a simple solution that allows us to skip/remove country codes on new sites, while retaining them on existing/upgraded sites.

    The removed entries could potentially live in a separate standard.old.inc file, which performs some additional version comparison logic to re-add entries which have been removed from the official list.

    Particular entries only need to be re-added, if a Drupal site already existed prior to removal; i.e., when the site was installed at a time when the removed entries still existed.

    We can also explore a more sophisticated system/language module update solution, by defining foreign keys for langcode in all tables (like we did for text formats), retrieving that data from the database schema, and automatically checking which removed codes need to be retained. Obviously, that lookup operation would have to be "cached" in a variable or similar.
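A rough sketch of the standard.old.inc idea, under stated assumptions: the function names, the 'removed_in' key and the version number 8000 are all invented for illustration and do not correspond to anything in core.

```php
<?php
/**
 * Hypothetical registry of removed entries: code => name plus the schema
 * version at which the entry was dropped from the main list.
 */
function example_removed_countries() {
  return array(
    'AN' => array('name' => 'Netherlands Antilles', 'removed_in' => 8000),
  );
}

/**
 * Re-adds removed entries for sites installed before the removal.
 */
function example_country_list(array $current, $installed_version) {
  foreach (example_removed_countries() as $code => $info) {
    // Only sites installed before the removal may hold data for this code.
    if ($installed_version < $info['removed_in'] && !isset($current[$code])) {
      $current[$code] = $info['name'];
    }
  }
  return $current;
}

$current = array('CW' => 'Curaçao');
$old_site = example_country_list($current, 7000);
$new_site = example_country_list($current, 8001);
```

A site installed before the removal keeps AN in its list, while a fresh install never sees it.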

Alan D.’s picture

Manually updating the list does not work out.

It is sad that we cannot push through zero to three one-line patches per year, and that is the reason that this doesn't work... AN is a different matter, but the others are just string changes or new entities, which mainly involve correcting the list prior to its inclusion in core.

Points 1-3
The non-official ISO lists have been shown to have their own customizations to naming, which has been deemed unacceptable. This is an extremely political topic! Otherwise, we risk insulting millions from a country that none of us have potentially heard of.

From #41

Many people would consider this a regression [renaming Macedonia to "Macedonia, The Former Yugoslav Republic of"]. See Debian and Mozilla. The Debian project picked "Macedonia, Republic of", which seems to be consensual.

See http://drupal.org/node/1446414 as just one example of potential issues, this one involving Macedonia.

However, it could be an option to subscribe and pay for the ISO's database to use as a base source and periodically update from this directly or provide this as a web service.

Alan D.’s picture

And maybe as there is no real usage of this data in core, should we simply remove the country listing from core and let contrib handle this?

D6 has the countries_api module; D7 has the countries module.

droplet’s picture

This is an extremely political topic!

I see a new name in the #53 list will make their residents unhappy.

Alan D.’s picture

But we can blame the UN :)

sun’s picture

Status: Needs work » Needs review
FileSize
8.37 KB

Let's keep the update/upgrade discussion in #1436754: Handle countries that have been removed from the CLDR standard (e.g: Netherlands Antilles) or changed (e.g.: Czechoslovakia (cs) => Czech (cz) | Slovakia (sk))

Attached patch updates standard.inc to the latest Debian ISO-3166 data and adds a script to automate the update.

sun’s picture

FileSize
8.36 KB

Fixed trailing white-space.

sun’s picture

FileSize
8.24 KB

d'oh. Also removed the duplicate Debian repository $uri. (originally caused by bogus link in #44)

Alan D.’s picture

This breaks the sorting of country data

- 'AX' => $t('Aland Islands'),
+ 'AX' => $t('Åland Islands'),

From memory, single quotes should be wrapped in double quotes for translations:

- 'CI' => $t('Ivory Coast'),
+ 'CI' => $t('Côte d\'Ivoire'),

shouldn't this be

+ 'CI' => $t("Côte d'Ivoire"),
+ 'KP' => $t("Korea, Democratic People's Republic of"),

from

+ $out .= ' ' . var_export($code, TRUE) . ' => $t(' . var_export($name, TRUE) . '),' . "\n";

And some documentation that explains why we are not using the official standard, rather a slightly modified one from a third party source, ready to pop in the documentation page about countries that are used on drupal.org maybe.
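The quoting point above could be handled in the generator along these lines. This is only a sketch, not the script from the patch: the function names are hypothetical, and it falls back to var_export() whenever the name contains characters that would need escaping inside double quotes.

```php
<?php
/**
 * Exports a label as a PHP string literal, preferring double quotes when the
 * label contains an apostrophe. Hypothetical helper.
 */
function example_export_label($name) {
  $has_single = strpos($name, "'") !== FALSE;
  // Characters that are special inside a double-quoted PHP string.
  $unsafe_in_double = strpbrk($name, '"$\\') !== FALSE;
  if ($has_single && !$unsafe_in_double) {
    return '"' . $name . '"';
  }
  return var_export($name, TRUE);
}

/**
 * Builds one line of the generated country array.
 */
function example_export_line($code, $name) {
  return '    ' . var_export($code, TRUE) . ' => $t(' . example_export_label($name) . '),' . "\n";
}

echo example_export_line('CI', "Côte d'Ivoire");
echo example_export_line('AD', 'Andorra');
```

Names without an apostrophe keep the single-quoted form that var_export() produces, so the rest of the generated file is unchanged.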

sun’s picture

Assigned: TR » sun
FileSize
9.32 KB

The good news is that the script totally works and I was able to only adjust the script according to #62 and run it again.

Added some docs, and fixed the quoting issue.

The sorting is NIH. natcasesort() should be able to handle that. If it doesn't, that's a separate issue (which existed before already).

sun’s picture

FileSize
9.82 KB

This one goes one step further and ensures that already existing country codes are not removed with the script.
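One way to picture the "never remove existing codes" behaviour is an array union in which upstream names win but codes found only in the existing list survive. This is a sketch of the idea with a hypothetical function name and sample data, not the logic of the attached script.

```php
<?php
/**
 * Merges upstream country data into the existing list without dropping codes.
 *
 * Hypothetical helper illustrating the behaviour described above.
 */
function example_merge_countries(array $existing, array $upstream) {
  // With the + operator the left operand wins, so upstream names replace
  // existing ones, while codes that exist only in $existing are kept.
  $merged = $upstream + $existing;
  ksort($merged);
  return $merged;
}

$existing = array('AN' => 'Netherlands Antilles', 'BN' => 'Brunei');
$upstream = array('BN' => 'Brunei Darussalam', 'SS' => 'South Sudan');
$merged = example_merge_countries($existing, $upstream);
print_r($merged);
```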

Alan D.’s picture

The sorting is NIH. natcasesort() should be able to handle that. If it doesn't, that's a separate issue (which existed before already).

This doesn't work on Windows or *nix boxes. Basically, "Å != A". Drupal core doesn't have any non-ASCII names out of the box here, so this limitation is not visible. So this would introduce a bug with the ordering of the core Regional settings Default country and every contrib module that relies on the core country listing. Åland Islands will be listed last.

AFAIK, there is no native sorting support in PHP for this. Transliteration in core maybe :)

Minor point, isn't the new doco policy to list all @see last?

sun’s picture

1) Again, fixing the natcasesort() to account for Unicode in territory names is a different issue. This issue is about updating the country data in standard.inc.

2) Nope, the documentation standard is to use @see where appropriate. It usually comes last when it applies to the entire phpDoc block. In this case, the @see is directly relevant for the paragraph right above it.

Alan D.’s picture

Regarding @see, good to know.

It appears that the sorting limitation has not been exposed to users very often (yet).

#1380432: Provide sorting support for accentuated (or UTF-8) letters..
#118898: Alphabet sorting of vocabulary terms or menu items does not work well with accented characters (Terms/Menus in Drupal 5)

AFAIK, the database-driven sorts (ORDER BY) are fine with the transition to UTF-8, although I have only tried on MySQL 4/5.

YesCT’s picture

Issue tags: -Needs backport to D7, -D8MI

#64: drupal8.iso-3166.64.patch queued for re-testing.

Status: Needs review » Needs work

The last submitted patch, drupal8.iso-3166.64.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
Issue tags: +Needs backport to D7, +D8MI

#64: drupal8.iso-3166.64.patch queued for re-testing.

sun’s picture

The way this issue is progressing is exactly the reason for why we have completely stale and outdated and invalid data.

Xano’s picture

Status: Needs review » Needs work

If we are willing to support ISO 3166-3 next to ISO 3166-1, we do not have to remove countries. Instead, we update their codes to reflect ISO changes, and provide a way for contributed modules to do the same. That way, we prevent code from ending up in a state like "Err, yeah, we know this was a country once, but now, err, we don't know. Tough luck!". See #2 in #1436754: Handle countries that have been removed from the CLDR standard (e.g: Netherlands Antilles) or changed (e.g.: Czechoslovakia (cs) => Czech (cz) | Slovakia (sk)) for a bit more background information.

sun’s picture

Status: Needs work » Needs review
  1. This patch does not remove existing entries. The update script explicitly takes care of that.
  2. The issue you referred to was spawned by this issue as a follow-up. Separate topic, not to be resolved here.
sun’s picture

#64: drupal8.iso-3166.64.patch queued for re-testing.

sun’s picture

Issue tags: -Needs backport to D7

In case anyone wants to ship with more up to date country data in D8, now is the time to mark #64 RTBC.

YesCT’s picture

Status: Needs review » Reviewed & tested by the community

code looks good.
follow-up issues are open already (or were already existing).

TR’s picture

Status: Reviewed & tested by the community » Needs work

I strongly disagree that this is RTBC.

I am the person who opened this issue two years ago. I supplied the original patch which fixed the core list of countries and made it conform with ISO standards. As such, it goes without saying that I am fully in favor of fixing the broken list of countries that is currently in core, but the patch under consideration does not do this.

The original patch was derailed by the desire to include an updating mechanism, i.e. how does existing country data in the database get updated when the list of countries in standard.inc gets changed. That requirement has since been abandoned.

The patch from @sun currently under consideration differs from mine in #24 in that @sun takes his country data from Debian rather than ISO. The two are different - ISO is the primary source; Debian has made changes to the ISO data and Debian data lags behind the official ISO data. For a complete discussion of this, please read my posts #6, #27, #45, #47 plus ALL the other associated replies. THESE CONCERNS HAVE NOT BEEN ADDRESSED, and are ignored by the current patch.

In order to fully understand this issue, you have to read this ENTIRE thread, not just glance at the latest patch.

@sun's patch also includes a script which allows individual sites to update their own country data, but this script does NOT address the updating issues like what happens to abandoned country names and how does core (and contributed) country data get updated if the core .inc file gets changed. @sun himself said in #59 that the updating problem should be left to another issue. Why is this script still part of the patch?

So in short, my objections to the patch in #64 are that it 1) takes data from a secondary source (Debian) which does not agree with the primary source (ISO), 2) includes an "update" script that doesn't do anything other than update the .inc file (ensuring that different installations with the same version of Drupal have different country lists, and that historical country data stored in user profiles or used in other contexts is out of sync with the .inc file after the shell script is run).

As far as I'm concerned, the patch in #24 satisfies all the current requirements in the issue summary above, and the patch in #64 is just wrong and overreaches. If an update script is really desired, I'd be happy to add one that takes the data from the ISO XML; Debian's data is wrong and shouldn't be used.

TR’s picture

I'm curious as to how/why the revision history for this issue, http://drupal.org/node/1068840/revisions, shows that I edited the issue summary on 13/2/2013. I did not.

Gábor Hojtsy’s picture

@TR: you changed the issue status which is part of the main issue metadata.

Alan D.’s picture

I think that sun's compromise is (generally) fine.

1) The list does get updated at least, from 15 or so errors down to 1 (or 0 if you like Debian's interpretation).

2) This list is consistent across all installations by default.

* If you need to share data and you run the shell script, just ensure that this is part of the deployment process on all sites.

* Drupal is still one small player in a global market. Any scripts should have built in error checking. For example, you should expect SS to come from a news feed and this is currently not present.

3) There are no negative repercussions for upgrades.

From memory, the only country removed from the ISO standard is AN, and it is not removed here. Any issues here can be taken up in #1436754: Handle countries that have been removed from the CLDR standard (e.g: Netherlands Antilles) or changed (e.g.: Czechoslovakia (cs) => Czech (cz) | Slovakia (sk))

4) Uses one of the free lists of ISO data out there.

As far as I know, this list is the only free service from ISO. It cannot be used directly, as the names are in uppercase only, while country names use mixed case. Scraping HTML would raise copyright issues.

And sad to say, but Debian maintainers are at least more active.

BUT

1) If the script is used and something was changed, then it must be run after every update of core. I think that this would affect some users. With Scotland debating some degree of sovereignty, there is the possibility that the number of affected users could be high before D9 is out :(

Is there a better way of handling this, using configuration files or something? By default, the file would contain the corrected list without AN, and an upgrade script from D7 could append AN. Then a minor change to core would make it use this list rather than just the country array. Or at least var_export to a custom include, something that is not going to be overridden.

This is my only concern re-RTBC this issue.

Similar issue: #1632236: Convert built-in language list to CMI

YesCT’s picture

re #78 @TR and @Gabor, yep. I've run into that before too. #1217286: Posting a comment changes too many issue node revision properties

sun’s picture

Status: Needs work » Needs review
Issue tags: +Proudly Found Elsewhere

Clarifications:

  1. The update script is primarily meant for updating the country codes in a Drupal core issue/patch, to prevent this situation from happening again. This operation can happen at any time, including point releases of existing stable releases, since the script does not remove preexisting country data. Developers/site-builders MAY also use the script if Drupal core happens to lag behind and they urgently need latest country data. The script reads and writes PHP code, so it can only be used by developers in the first place.
  2. The incompatible format of the official ISO data has been discussed many times already, so I do not understand why we're circling back into that topic again. Debian's data is directly derived from ISO and in the format we need, and it went through sufficient FOSS community discussions already; in particular with regard to country labels in the official ISO data that are potentially offending for citizens of a particular territory; see #41.
  3. The country data is declaration, not configuration. What would be configuration is a UI that would allow you to disable/hide selected countries on your own/particular site (because you do not expect users/data that needs them), but that could only have an impact on your site's UI; at the API/code level, the complete country data must be available, as it is typically used when interacting with other web services and APIs. An administrative UI to disable/hide countries is a separate issue though, but one we can explore.
Alan D.’s picture

Is it possible to use YAML files without a UI? Sorry, I haven't had time to research many changes in D8 myself due to bottlenecks at work.

If so, then that was the only change that I was suggesting. This avoids additional update steps when the core file gets overridden during updates. Not everyone uses deployment scripts / update scripts.

The country data is declaration, not configuration.

No, it is both as this is the list that should drive all country lists on Drupal.

i.e. Drupal's site country, Webform's country lists, Countries country fields, ....

An administrative UI to disable/hide countries is a separate issue

I think that this is a contrib issue, the Countries module already does this, or alter hooks if you don't need the overhead :)

iMiksu’s picture

I want to point out a related issue I posted separately (sadly I missed this issue): #1917200: Country name changed for "Palestinian Territory"

Alan D.’s picture

Status: Needs review » Needs work

The English short name is "Palestine, State of" as per http://www.iso.org/iso/iso_3166-1_newsletter_vi-14_name_change_state_of_...

sun’s picture

Status: Needs work » Needs review
FileSize
9.11 KB

I've re-executed the script.

Still works excellently, despite being almost one year old. :)

And that's literally all we should do now and in the future: Re-execute the script. If the country data is still outdated, contribute upstream to Debian. FOSS.

sun’s picture

FileSize
9.09 KB

Sorry, I had an outdated copy of the XML file in my filesystem.

Turns out that Palestine has been adjusted upstream already.

Alan D.’s picture

And one less inconsistency with the ISO standard with this change :)

@sun
If a YAML file without an admin UI is out, could an include be used instead? Young shops / individuals are likely to do manual core updates, so having a "patched" core file will lead to some minor issues as these users inevitably forget to re-run the script.

Something like changing the script to create the file DRUPAL_ROOT . '/core/includes/standard.countries.inc', just var_export'ing the $countries array into it, and this:

function standard_country_list() {
  static $countries;

  if (isset($countries)) {
    return $countries;
  }
  $t = get_t();

  // __FILE__ is this file's own path, so resolve the generated include
  // relative to its directory.
  $override = dirname(__FILE__) . '/standard.countries.inc';
  if (file_exists($override)) {
    // The generated file is expected to assign $countries in this scope.
    include $override;
  }
  else {
    $countries = array(
      'AD' => $t('Andorra'),
      ....
    );
  }
  // Sort the list.
  natcasesort($countries);
  ....
}
sun’s picture

I do not see why novice developers ever would or should execute this script.

We have other scripts in core already. Before you execute a script, you better know what you're doing.

Lastly, no one needs to execute this script, if we regularly execute it for point releases in the future. Again, the patch in #87 is fully automated, a matter of seconds.

Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

I already explained in #48 why I believe this is a good idea.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed to 8.x. Thanks!

giorgio79’s picture

I guess it's late notice, but re this comment at #47

"Because ISO data is not readily available, and Debian is doing a great job at maintaining accurate, ISO-compliant lists of language, territory, currency, script codes (*and* their translations in many languages)."

The emerging de facto standard for locale and i18n data (including countries) is http://cldr.unicode.org/. All the big boys like Apple, Google, IBM (and Debian :) ) use this, and it also contains all the translations of the country names.

The countries drupal project is looking into leveraging this set of data as well here #1157504: New countries

sun’s picture

@giorgio79:
If the data is publicly available, scriptable, and contains the information we need, then we can always change the update script to use that resource instead.

Do you want to create a follow-up issue for discussing that and link to it from here? :)

For now, it was most important to get our data finally updated.

@all: Note that we should still try to resolve #1436754: Handle countries that have been removed from the CLDR standard (e.g: Netherlands Antilles) or changed (e.g.: Czechoslovakia (cs) => Czech (cz) | Slovakia (sk)) for D8. It doesn't make sense to include bogus/outdated data in our default list. However, we'd have to provide a careful update/upgrade path for existing sites, which might have other data associated to the old/bogus values. That's the larger discussion we need to have over in that issue.

@webchick: Is there a handbook page for Drupal core maintainers somewhere that describes the procedure for creating a new stable/point release? This update script should be invoked whenever we'll create a new point release in the future.

webchick’s picture

Yep, it's at http://drupal.org/node/721106 although I haven't used it now in close to a year, so no idea how out of date it is.

sun’s picture

wowza. No idea what the appropriate place would be in there... ;)

Perhaps we should rather just roll with filing individual issues when necessary...

webchick’s picture

That would be my recommendation. ;)

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

JohnAlbin’s picture

Status: Closed (fixed) » Fixed

Good job on making a semi-automated way to create a country list! +1

But I consider the use of Debian as the upstream source for our country data as a MAJOR regression.

See the follow-up issue for more information: #1938892: Switch from ISO-3166-1 country data to CLDR unicode data

Pancho’s picture

Issue tags: +country list

Retroactively tagging.

TR’s picture

Version: 8.x-dev » 7.x-dev
Assigned: sun » Unassigned
Status: Fixed » Needs review
Issue tags: +Needs backport to D7
FileSize
9.11 KB

The list of countries in D7 is still broken - fixing that was the reason this issue was opened two and a half years ago.

Attached is the D8 patch from #87 backported to work in D7, with no other changes made.

Pancho’s picture

Status: Needs review » Postponed

I agree that some kind of backport to D7 would be nice.
But after this patch we switched to CLDR, see #1938892: Switch from ISO-3166-1 country data to CLDR unicode data. And now we still have to figure out a number of important things, including the correct upgrade path.

So what we can't do is just backporting this patch to D7.
We need to postpone this until both #1436754: Handle countries that have been removed from the CLDR standard (e.g: Netherlands Antilles) or changed (e.g.: Czechoslovakia (cs) => Czech (cz) | Slovakia (sk)) and #2036219: [policy] Inclusion criteria for CLDR territories in CountryManager::getStandardList() are fixed in D8. Only then we know what we can backport and decide what we want to backport. Sorry.

TR’s picture

Status: Postponed » Needs review

Yes, I'm fully aware of the other issue, and I don't see how that prevents backporting this patch. D7 is broken in this regard and has been for a very long time. The patch in this issue went in without an upgrade path, and so did the CLDR patch. We are getting further and further away from the original intent of this issue, which was simply to fix the extremely outdated country information that exists in D7. The whole "upgrade" issue has been raised at several points and used to derail the simple fix, then later on the whole "upgrade" issue was ignored in order to push through an ill-conceived alternative. Please be consistent - if #87 was good enough for D8, it's good enough for D7. The CLDR patch does not backport well because of the big changes in D8 since this current issue was originally opened, but if you desire to backport CLDR then it will certainly be easier to do if D7 has the same changes that were in D8 prior to the CLDR patch. I personally don't see the need to backport CLDR at all.

At some point I would like to see someone acknowledge that it was a mistake in the first place to embed dynamic string data (country names) into Drupal code where it can't be modified (string freeze) or effectively updated (no versioning for modules to use) - then let's get past that mistake and figure out how we are going to support the user base in a timely manner until we get around to correcting the architecture.

Here's what I wrote two years ago (see #6 above):

I am adamantly opposed to putting this change off until Drupal 8. The political divisions in this world are constantly changing. I'm pretty sure that was known when it was decided to put this list into Drupal. This list was meant to be complete and to keep up with changes in the world, not to be stagnant for the next two years until a major release. It would be totally unacceptable to tell residents of Sint Maarten, for example, that Drupal won't support their country until sometime after 2013.

No one disagreed with that, but that point seems to have been lost because this issue has morphed and moved to another thread where the original intent, context, and arguments were lost. (Which is why I'm reopening it here and not in the CLDR thread.)

It's really sad to me that this community has failed to accomplish something this simple in all this time. It seems that some are using this issue for political activism - I guess it's hipper to care about the people of Taiwan than the people of Sint Maarten or Transnistria, and it's more fashionable to ignore the interests of China than to ignore the interests of the United States and the UK ("why'd you say Burma?").

Setting to needs review again - test failure seems to be testbot error ("Base table or view not found: 1146 Table 'drupaltestbotmysql.simpletest945528cache_bootstrap' ")

Alan D.’s picture

Regarding string freeze

This means that this issue contradicts the core string freeze policy, but Dries #31 and Gábor Hojtsy #34 have given their approval to this.

Regarding Transnistria
I think the outcome will be a loose following of the ISO-defined countries, merged with the CLDR naming. So Transnistria is not included, nor is Kosovo.

In general
If only China, USA and Russia didn't have such strong influence at the UN we probably wouldn't really have to deal with this

i.e. Kosovo, Palestine and Taiwan

So I guess that the change to CLDR naming will mean that any changes here are now deferred, or we get a change to the ISO name and then another to the CLDR naming afterwards. And I'm not sure if it is right to say hipper, just more western.

All I can say is there are going to be very vocal users from Greece in the queues when Drupal 8 comes out ;)

Pancho’s picture

Re #102:

if #87 was good enough for D8, it's good enough for D7

Tim,
you've been contributing to Drupal for quite some time. If it weren't so, I'd say: "okay, I understand that, when I was new to Drupal I also didn't understand why some things are not simply getting fixed right away."
But you've done a lot for Ubercart and other contrib projects, so you should know the difference between committing something to the trunk and backporting something to a stable version with an installed code base of half a million.
I must admit that I somehow expected such kind of answer, and that's why in #101 I tried to explain it in detail. But that should really suffice.

If there was an urgent need to backport at least the absolute minimum now, then it would be:
- adding the undisputable codes 'BQ', 'SS' and 'SX' (as you do)
- and updating all entries to the CLDR (!) names, so they match the names in D8 now
such as the enclosed patch does.
I'm not completely sure it's worth it, but at least it wouldn't do any harm regarding the upgrade path we still have to come up with.
But at this point we certainly can't do more. Without an upgrade path, we can't throw out a record, neither can we allow a script to do the same. And we won't replace strings by something that then unnecessarily gets changed again in D8.

At some point I would like to see someone acknowledge that it was a mistake in the first place to embed dynamic string data (country names) into Drupal code

From today's perspective I'd agree. But this is old code, much older than your issue, and at that time we didn't have the possibilities we have now, we didn't have config that seems the best solution for D8.
Now do you want someone to say: sorry, back then, that's been no good decision? I mean your patch doesn't fix that either.
I guess you're just angry about something. So pleeease relax a bit, and feel free to find the super-duper D8 solution that gets this fixed once and forever. :)

Pancho’s picture

Just fixed the docs.

Pancho’s picture

Issue summary: View changes

Mistakenly got the replacement order of Virgin Islands, British switched in the list above.

Liam Morland’s picture

Issue summary: View changes
Issue tags: -
FileSize
1.11 KB

Can we at least get a simple/minimal version of this committed? The attached patch adds all codes currently missing: BQ, SS, and SX using the ISO names. It also moves CW so that it is in alphabetical order.
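Until something lands in core, a D7 site could add the same three codes itself through hook_countries_alter(). The sketch below uses a hypothetical module name ("mymodule") and plain strings so that it runs stand-alone; in a real module the labels would normally be passed through t().

```php
<?php
/**
 * Implements hook_countries_alter() for a hypothetical module "mymodule".
 */
function mymodule_countries_alter(&$countries) {
  $missing = array(
    'BQ' => 'Bonaire, Sint Eustatius and Saba',
    'SS' => 'South Sudan',
    'SX' => 'Sint Maarten (Dutch part)',
  );
  // Array union only adds codes that are not defined yet, so nothing that
  // core already provides is overwritten.
  $countries += $missing;
}

// Stand-alone demonstration; inside Drupal the hook is invoked for you.
$countries = array('CW' => 'Curaçao');
mymodule_countries_alter($countries);
print_r($countries);
```

Because the union skips existing keys, the hook becomes a no-op for these codes once core ships them.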

mgifford’s picture

Title: core/includes/standard.inc contains inaccurate country data » includes/standard.inc contains inaccurate country data
Status: Needs review » Reviewed & tested by the community

Looks good.

Xano’s picture

Status: Reviewed & tested by the community » Closed (duplicate)

All these changes have already been made to \Drupal\Core\Locale\CountryManager. Sorry, folks! Thanks for the hard work.

mgifford’s picture

For D7?

Xano’s picture

Status: Closed (duplicate) » Reviewed & tested by the community

Oh, sorry! I overlooked that this is a backport.

mgifford’s picture

NP.

Liam Morland’s picture

Title: includes/standard.inc contains inaccurate country data » Country codes missing from includes/standard.inc
Alan D.’s picture

CV: "Cape Verde" becomes "Cabo Verde", if people want to stay in sync with the ISO.

Change: 2013-11-26
Ref: https://www.iso.org/obp/ui/#iso:code:3166:CV

Liam Morland’s picture

If you read above you will see name changes are controversial. I suggest you open a new issue for CV so that this issue can stay focused on adding the missing codes.

Alan D.’s picture

I was simply pinging the thread with the change. Regarding the pain to get these changes in... oh how I am so painfully aware of this. ;)

Data sources in play are behind the ISO standard in this case, as of right now. The change is language neutral, so even Wikipedia is out of date atm.

For D8 I think sun is going to run the script just before the 8.0 release to update the corresponding code. At that point I'm sure Debian will be up to date, though I have no idea about the CLDR updates. I think that the Unicode version is going to end up as the data source; I gave up following these threads closely nearly a year ago.

I guess this country name change can be used as a guide to how reliable these are going to be.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 106: core_1068840_simple_codes.patch, failed testing.

mgifford’s picture

Status: Needs work » Needs review
Liam Morland’s picture

Status: Needs review » Reviewed & tested by the community

Setting back to RTBC.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs review

Hate to do this, but I'm really confused why the patch in #106 is using different country names for some of the new countries than Drupal 8 does. The current Drupal 8 list goes back to June of last year (#1938892: Switch from ISO-3166-1 country data to CLDR unicode data) and came after a lot of discussion... and it doesn't make sense to me that Drupal 7 and Drupal 8 would differ on what the correct name of a country is (!).

In particular, Drupal 8 currently uses "Caribbean Netherlands" (rather than "Bonaire, Sint Eustatius and Saba") and "Sint Maarten" (rather than "Sint Maarten (Dutch part)")...

As for updating existing country names (in addition to adding new ones), I think that kind of string breakage is completely legit in Drupal 7 if the old name is truly incorrect (https://drupal.org/node/1527558), but agree it's fine to get the new ones in first.

Liam Morland’s picture

Either way is fine with me. I just want the codes to go in. Attached is a patch that is identical to #106 except that the newly added codes use the same names as in D8.

Liam Morland’s picture

If there is any controversy about #124, please at least commit this patch, which only adds South Sudan. As far as I know, there is only one English name for this country.

Alan D.’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs backport to D7

Best to keep in sync with the new names that are in d8 imho.

I think that kind of string breakage is completely legit in Drupal 7 if the old name is truly incorrect

As per the summary:

Dries #31 and Gábor Hojtsy #34 have given their approval to this string freeze exception.

Avoid "correct names"; rather, I'd reckon it is better to use "CLDR names" or "Unicode Consortium names" to keep things as apolitical as possible.

Anyways, it has been 3 years and 3 months since the Drupal 7 release with incorrect data, so let's get this puppy in.

The seagulls on Clipperton Island will love D8 ;)

Liam Morland’s picture

Title: Country codes missing from includes/standard.inc » Country codes missing from includes/iso.inc

Thanks.

David_Rothstein’s picture

Status: Reviewed & tested by the community » Active
Issue tags: +7.28 release notes

Committed #124 to 7.x - thanks!

So, should this go back to "active" to discuss whether any of the existing country names are egregiously broken enough that it's worth breaking existing translatable strings for... or would that be better in a different issue?

  • Commit 0736eee on 7.x by David_Rothstein:
    Issue #1068840 by sun, TR, Liam Morland, Pancho, Alan D., tim.plunkett,...
Alan D.’s picture

The CLDR names could be considered a regression, and "Å" upsets PHP sorting, so imho this should be closed.
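The sorting concern can be illustrated with a small sketch. The three entries are sample data only, and the second half assumes the optional intl extension (the `Collator` class) is installed, which is not a given on every host.

```php
<?php
// Sample entries only; this file must be saved as UTF-8.
$countries = array(
  'ZW' => 'Zimbabwe',
  'AX' => 'Åland Islands',
  'AL' => 'Albania',
);

// Byte-wise string sort: the UTF-8 encoding of "Å" starts with byte 0xC3,
// which is greater than any ASCII letter, so it lands after "Zimbabwe".
asort($countries, SORT_STRING);
$byte_sorted = array_values($countries);
print implode(', ', $byte_sorted) . "\n";
// Albania, Zimbabwe, Åland Islands

// Assumption: the intl extension is available. A locale-aware collator
// compares by language rules instead of raw bytes.
if (class_exists('Collator')) {
  $collator = new Collator('en');
  $collator->asort($countries);
  print implode(', ', $countries) . "\n";
}
```

Any comparison that works on raw bytes pushes accented names to the end of a select list, which is why a rename to "Åland Islands" is more than a cosmetic change for sites that sort without locale-aware collation.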

At least two are wrong in the CLDR with regard to the state-preferred naming: Cabo Verde and State of Palestine.

However, if they are going to change, they should change to the D8 version; a manual diff is below:

D8 Additions (note all invalid ISO codes bar XK)

+ 'AC' => t('Ascension Island'),
+ 'CP' => t('Clipperton Island'),
+ 'DG' => t('Diego Garcia'),
+ 'EA' => t('Ceuta and Melilla'),
+ 'IC' => t('Canary Islands'),
+ 'QO' => t('Outlying Oceania'),
+ 'TA' => t('Tristan da Cunha'),
+ 'XK' => t('Kosovo'),

D8 names moved into sync with the CLDR repo / ISO standard

- 'AX' => t('Aland Islands'),
+ 'AX' => t('Åland Islands'),

- 'CI' => t('Ivory Coast'),
+ 'CI' => t('Côte d’Ivoire'),

- 'RE' => t('Reunion'),
+ 'RE' => t('Réunion'),

D8 names moved in sync with CLDR but not the ISO standard (many are neither atm)

- 'CC' => t('Cocos (Keeling) Islands'),
+ 'CC' => t('Cocos [Keeling] Islands'),

- 'CD' => t('Congo (Kinshasa)'),
+ 'CD' => t('Congo - Kinshasa'),

- 'CG' => t('Congo (Brazzaville)'),
+ 'CG' => t('Congo - Brazzaville'),

- 'HK' => t('Hong Kong S.A.R., China'),
+ 'HK' => t('Hong Kong SAR China'),

- 'MF' => t('Saint Martin (French part)'),
+ 'MF' => t('Saint Martin'),

- 'MM' => t('Myanmar'),
+ 'MM' => t('Myanmar [Burma]'),

- 'MO' => t('Macao S.A.R., China'),
+ 'MO' => t('Macau SAR China'),

- 'PN' => t('Pitcairn'),
+ 'PN' => t('Pitcairn Islands'),

- 'PS' => t('Palestinian Territory'),
+ 'PS' => t('Palestinian Territories'),

- 'ST' => t('Sao Tome and Principe'),
+ 'ST' => t('São Tomé and Príncipe'),

- 'UM' => t('United States Minor Outlying Islands'),
+ 'UM' => t('U.S. Outlying Islands'),

- 'VA' => t('Vatican'),
+ 'VA' => t('Vatican City'),
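A diff like the one above does not have to be compiled by hand. As a sketch, assuming both lists are available as plain code => name arrays, PHP's array functions can separate added codes from renamed ones. The two arrays here are small excerpts from the diff plus one unchanged entry, and the variable names are made up for illustration.

```php
<?php
// Excerpt of the Drupal 7 names (this file must be saved as UTF-8).
$d7 = array(
  'AX' => 'Aland Islands',
  'FR' => 'France',
  'VA' => 'Vatican',
);

// Excerpt of the Drupal 8 names.
$d8 = array(
  'AX' => 'Åland Islands',
  'FR' => 'France',
  'VA' => 'Vatican City',
  'XK' => 'Kosovo',
);

// Codes present in D8 but not in D7.
$added = array_diff_key($d8, $d7);

// Codes present in both lists whose names differ.
$renamed = array_diff_assoc(array_intersect_key($d8, $d7), $d7);

foreach ($added as $code => $name) {
  print "+ '$code' => $name\n";
}
foreach ($renamed as $code => $name) {
  print "- '$code' => {$d7[$code]}\n";
  print "+ '$code' => $name\n";
}
```

Swapping the two arguments of `array_diff_key()` would list codes that exist only in the older list, which is the case that needs an upgrade path.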

Liam Morland’s picture

Status: Active » Fixed

Thanks for the commit. This issue is about the missing codes. Discussion of renames should be done in a different thread.

Alan D.’s picture

Title: Country codes missing from includes/iso.inc » core/includes/standard.inc contains inaccurate country data

Reverting the title to refocus the thread on its original issue. Although I reckon it should stay closed ;)

mgifford’s picture

@Alan D - can you open up a new issue for the changes in #131?

Alan D.’s picture

:/ I'd want to go back to the ISO standard! Best to let sleeping dogs lie imho.

Liam Morland’s picture

I understand that D8 will use CLDR. Is D7 supposed to use CLDR or ISO? That needs to be clarified by the project leadership, then there can be an issue to make D7 align with whichever standard is chosen as the reference.

Alan D.’s picture

Drupal 7 uses the ISO 3166-1 alpha-2.

Liam Morland’s picture

In that case, I think there should be a new issue with a title something like "Correct ISO 3166-1 alpha-2 data in _country_get_predefined_list()". That issue might be a child of this, but I think this issue is too long and a new issue would help to regain focus.

Alan D.’s picture

There was a string freeze exception added 2 years back, but I'm not sure if the powers that be will want a string change from D7.23 "Hong Kong S.A.R., China" to D7.?? "Hong Kong", followed by another string change for D8.0 to "Hong Kong SAR China". Or "Vatican" to "Holy See (Vatican City State)" to "Vatican City", as another example.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

David_Rothstein’s picture