It's no problem if your site language is English. But I do have a problem, because my site language is not English. I see three solutions:
- Convert: mother tongue names -> ASCII names (“kind of” mother tongue). Sometimes easy, sometimes impossible (dependent on language), but either way, it's awkward, monster-like :-).
- Provide translation: native names -> English names. This is painful, costly, sometimes impossible. Completely unnecessary if your site has only one language.
- Allow UTF-8 names of bundles (content types) and fields.
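To make the first option concrete, here is a minimal sketch of converting a native-language label to an ASCII name. It is written in Python purely for illustration; the function name `to_ascii_machine_name` is hypothetical and this is not how Drupal generates names. It shows why the result is sometimes acceptable and sometimes unusable, depending on the language.

```python
import re
import unicodedata


def to_ascii_machine_name(label: str) -> str:
    """Best-effort conversion of a native-language label to [a-z0-9_]."""
    # Split accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", label)
    # Drop everything that is not plain ASCII (marks, non-Latin scripts).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single underscore.
    return re.sub(r"[^a-z0-9]+", "_", ascii_only.lower()).strip("_")


print(to_ascii_machine_name("Café Crème"))  # accents are stripped
print(to_ascii_machine_name("Новости"))     # nothing survives
```

For a Latin-based language the accents are simply lost; for a name written in Cyrillic the result is an empty string, which is the "sometimes impossible" case.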
Comments
Comment #1
yched commented:
There cannot be UTF-8 in bundle names: we generate database table names from them, and possibly variables and function names.
A bundle is a machine name. Untranslatable, a-z and _. Just like current node type names.
If that's not in the API docs, we should make that clear.
Unless I'm missing something, this is a won't fix.
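The restriction described above can be sketched as a simple pattern check. This is an illustration in Python, not the actual Drupal code; the exact pattern (a leading letter or underscore, then letters, digits, or underscores) and the name `is_machine_name` are assumptions.

```python
import re

# Assumed pattern: lowercase ASCII letters, digits and underscores,
# not starting with a digit.
MACHINE_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def is_machine_name(name: str) -> bool:
    """Return True if the whole string matches the assumed pattern."""
    return MACHINE_NAME.fullmatch(name) is not None


for candidate in ["article", "field_body_2", "artykuł", "My Field"]:
    print(candidate, is_machine_name(candidate))
```

Under this assumption any name containing a non-ASCII letter, an uppercase letter, or a space is rejected.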
Comment #2
mki commented:
According to the MySQL 4.1 manual:
According to the PostgreSQL 7.3 manual:
Bundle (content type) and field names should be considered data, not algorithms, classes, functions, variables, or other hard-coded identifiers. Please distinguish these two areas. As data, these names ARE translatable.
Yes, these names are symbolic names. But there are plenty of symbolic names, most notably URIs, domain names, and directory/file names. And they ARE translatable.
So I would be happy with non-Latin bundle/field names. Please consider this seriously before marking it "won't fix".
Comment #3
bjaspan commented:
Bundle and field names are NEVER shown to end users as data. We are working seriously on making field data translatable (#367595: Translatable fields), but I cannot see any reason to expend serious effort to make field names translatable.
That said, I'm not sure why we actually care that field and bundle names only contain [a-z0-9_]. I thought perhaps that PHP would not accept UTF-8 identifiers, but apparently it does:
So, yched: What would happen if we simply removed the preg check in field_create_field() that requires fields to be alphanumeric? If a particular database can't support it, then "don't use utf-8 field names with that database."
Comment #4
KarenS commented:
We need to check what characters work as:
1) Function names
2) Table names
3) Column names
4) 'pseudo' field and table names in Views
5) Others, I feel there might be more
Plus we need to at least keep spaces out of the names (a common problem).
Also, do we care about case? Nix cares but Windows ignores it so 'MyField' will be the same as 'myfield' in Windows but two different fields in Nix.
We still have this check for content type names in the current code (or did last time I checked), so does the same argument apply there?
Comment #5
KarenS commented:
Edit: you said field and bundle names, so forget the last comment :)
Comment #6
KarenS commented:
One more thing: some databases care about case. I think I remember that DB2 required all column names to be uppercased, so you couldn't have both MyField and myfield; both would have to become MYFIELD. So we need to think about ways that this could become a problem in various databases.
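One way to reason about the case problem is to group names by their case-folded form and flag any group with more than one member. The sketch below is in Python for illustration only; `case_collisions` is a hypothetical helper, and how a given database actually folds identifiers is an assumption that would need checking per backend.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that become identical once case is ignored."""
    groups = defaultdict(list)
    for name in names:
        # casefold() is a stand-in for whatever folding a case-insensitive
        # filesystem or database applies to identifiers.
        groups[name.casefold()].append(name)
    # Keep only the groups where two or more names collide.
    return [group for group in groups.values() if len(group) > 1]


print(case_collisions(["MyField", "myfield", "title"]))
```

Restricting names to lowercase, as the existing check does, sidesteps this class of collision entirely.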
Comment #7
mki commented:
There is one place in Drupal 6 and Drupal 7.x-dev where users can see the content type name: the URL http://example.com/node/add/content-type-name. This is very important; please take a look at URL as UI (written in 1999, but still very true).
Note #91744: Component based translation for paths. But this is not the point of this issue.
Let's explain this issue by way of analogy. In String translation: why using t() for user specified text is evil? (2006-11-10) Gábor Hojtsy wrote:
In #141461: Object translation option #1: locale system, optimization strategies (2007-05-04) Gábor Hojtsy also wrote:
To sum up, I believe that we shouldn't assume that developers will use only English or the Latin alphabet.
If your website contains simple content types, like Story, Article, and the like, that's OK. But if your website contains content types and fields that are specific to some area of knowledge (for example: medicine, economics, geography), then you have real trouble translating these names to English or "translating" them to the Latin alphabet.
When I think about Drupal as a data repository (RDF, SPARQL, linked data, different storage engines, etc.) for different areas of knowledge, not as a simple website like a blog, this becomes a real problem.
I'm happy with a naming convention for bundle/field names: word_word_word (PHP, database) and word-word-word (URL). That's OK as long as I can name things in my own language, without putting effort into an English translation that I will never use or cannot even provide.
Comment #8
mki commented:
The PHP manual makes it clear:
(There is also a PHP issue, "UNICODE support to name variables and other PHP labels", which has the status "won't fix".)
So there is no official support for Unicode identifiers in PHP. Even if such code works, it is a dirty hack that may stop working someday or somewhere. I'm not happy with such a solution on a production website.
I wonder if some kind drupaller could tell me why and where Drupal uses bundle/field name as variable, function or class name; why this is not considered as just configuration that appears only in database. Maybe this issue should be marked "by design", but first I'm trying to understand what's going on. (See also these comments).
Comment #9
bjaspan commented:
Okay, so UTF-8 for object properties is out, so UTF-8 for field names is out.
Bundle names show up in table names but not (as far as I can remember at the moment) in object property names. It appears that at the moment we are not checking anything about bundle names, so I doubt we would reject a UTF-8 bundle name; it would just work. But that's not a promise.
The one valid use case you have identified so far for actually caring about this issue is field and bundle names in URLs. The URLs are not actually part of Field API; they are part of the CCK UI (which may be merged into core at some point, but isn't yet). There is no particular reason we cannot have a "UI name" for fields, like Label is for humans, which the UI puts into URLs instead of the field or bundle name.
So, I suggest identifying any other locations where field or bundle names are exposed to end users, and retargeting this issue to address them specifically instead of suggesting that field or bundle names themselves be allowed to contain UTF-8.
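The "UI name" idea can be sketched as a lookup that sits between the URL and the machine name. The Python below is only an illustration of the concept: the `UI_NAMES` registry, its sample entry, and both helper functions are hypothetical and do not correspond to any existing Field API or CCK code.

```python
from urllib.parse import quote, unquote

# Hypothetical registry: ASCII machine name -> native-language UI name.
UI_NAMES = {"drug_interaction": "interakcje-leków"}


def add_url(machine_name: str) -> str:
    """Build a node/add path, preferring the UI name when one is defined."""
    # Fall back to the machine name with underscores shown as hyphens.
    segment = UI_NAMES.get(machine_name, machine_name.replace("_", "-"))
    # Percent-encode the segment as UTF-8 so it is a valid URL path.
    return "/node/add/" + quote(segment)


def resolve(segment: str) -> str:
    """Map a URL path segment back to the machine name."""
    ui_name = unquote(segment)
    for machine_name, candidate in UI_NAMES.items():
        if candidate == ui_name:
            return machine_name
    # No UI name registered: assume the hyphenated machine name was used.
    return ui_name.replace("-", "_")


print(add_url("drug_interaction"))
print(resolve("interakcje-lek%C3%B3w"))
```

With this split the stored machine name stays within [a-z0-9_], while the string users see in the address bar can be in any language.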
Comment #10
sun.core commented:
I agree with Barry.
Comment #11
mki commented:
PHP 6 allows Unicode characters in PHP identifiers. More information: http://schlueters.de/blog/archives/116-Unicode-identifiers.html
Comment #12
bjaspan commented:
"Postponed until we require PHP 6" is sorta like "postponed until the Second Coming." But whatever.
Comment #13
mki commented