I use phpMyAdmin to create and manage MySQL databases. Whenever I install phpMyAdmin and then create a database, the default collation is latin1_swedish_ci. My question is, should I change this if the site is strictly English without any need for special characters? Are there any standards as to which collation to choose? I would be inclined to change it to utf8_general_ci or utf8_general_cs. But since the default is always latin1_swedish_ci I assume that there is a reason for this. Perhaps the general collation has more rules, and so the database runs better with a 'simpler' collation? Or is it just the makers of PhpMyAdmin or MySQL are Swedes?

Comments

mooffie:

[...] the default collation is latin1_swedish_ci. My question is, should I change this if [...]

You don't need to change this. You may, but you don't need to.

The database 'default collation' determines the collation of newly created tables. But when Drupal ("modules", to be exact) creates tables it explicitly asks MySQL to create them with a utf8 collation, so this 'default collation' isn't used.
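
To make the difference concrete, here is a rough sketch in Python that only builds the MySQL statements as strings; it does not connect to a server. The table name `notes` and its column are made up for illustration, and this is not necessarily the exact statement Drupal sends.

```python
# Sketch only: builds MySQL CREATE TABLE statements as plain strings.
# The table name and column definition below are hypothetical.

def create_table_sql(table, columns, charset=None, collation=None):
    """Return a CREATE TABLE statement.

    With no charset/collation, the new table inherits the database default.
    With them, the explicit table-level setting takes precedence.
    """
    sql = f"CREATE TABLE {table} ({', '.join(columns)})"
    if charset:
        sql += f" DEFAULT CHARACTER SET {charset}"
    if collation:
        sql += f" COLLATE {collation}"
    return sql

# Inherits whatever the database default is (e.g. latin1_swedish_ci).
inherits = create_table_sql("notes", ["body VARCHAR(255)"])

# Asks for utf8 explicitly, so the database default is not used.
explicit = create_table_sql(
    "notes", ["body VARCHAR(255)"], charset="utf8", collation="utf8_general_ci"
)

print(inherits)
print(explicit)
```

The second statement is the kind of thing meant by "explicitly asks MySQL": the table carries its own character set and collation, whatever the database default happens to be.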

But since the default is always latin1_swedish_ci I assume that there is a reason for this.

Not really.

Or is it just the makers of PhpMyAdmin or MySQL are Swedes?

I guess so.

Mike_Waters:

Or is it just the makers of PhpMyAdmin or MySQL are Swedes?

In many case-insensitive collations (_ci), some similar-looking characters are considered to be the same, for the purposes of sorting and comparing. For example:
a == a
A == a
á == a

This is probably due to the fact that the languages those collations are based upon do not use these characters.

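Because the Swedish language *does* use some of these 'special' characters (å, ä and ö are separate letters of its alphabet), those letters are distinct in the latin1_swedish_ci collation. Other accented letters, such as á, are still treated as their base letter:
a == a
á == a
ä != a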
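
A toy model in Python may help show how a collation decides that two characters are "the same". Each character is mapped to a sort weight, and two strings are equal when their weights match. Both tables here are tiny hand-written excerpts covering only a, ä and å; they are assumptions for illustration, not MySQL's real weight tables.

```python
# Toy model: a collation maps each character to a sort weight, and two
# strings compare equal when their weight sequences are identical.
# Both tables are hand-written excerpts (assumptions for illustration),
# not dumps of MySQL's real weight tables.

# Swedish-style rules: case is ignored, but ä and å are letters of their own.
SWEDISH_STYLE_CI = {"a": "A", "A": "A", "ä": "Ä", "Ä": "Ä", "å": "Å", "Å": "Å"}

# Accent-folding rules: ä and å collapse onto plain A.
ACCENT_FOLDING_CI = {"a": "A", "A": "A", "ä": "A", "Ä": "A", "å": "A", "Å": "A"}

def weights(text, table):
    """Translate text into sort weights; unlisted characters keep their own value."""
    return [table.get(ch, ch) for ch in text]

def collation_equal(left, right, table):
    """Compare two strings the way the given weight table sees them."""
    return weights(left, table) == weights(right, table)

for name, table in (("swedish-style", SWEDISH_STYLE_CI),
                    ("accent-folding", ACCENT_FOLDING_CI)):
    print(name, collation_equal("A", "a", table), collation_equal("ä", "a", table))
```

Both tables ignore case, but only the Swedish-style one keeps ä apart from a.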

Using latin1_swedish_ci allows you the most leeway without having to go to UTF-8 and incur the overhead associated with Unicode.

There are a few other non-Unicode collations with similar benefits; I've heard latin1_hungarian_ci is one of them.