What is the upper case of ‘a’? Obviously, it’s ‘A’. The lower case of ‘Z’? It’s ‘z’, naturally. Finding the upper or lower case of a Latin character seems like a trivial task. However, in rare cases, the result depends on the language in which the character is used.
Take the letter ‘i’. In most languages, its upper case is ‘I’. However, in Azeri and Turkish, the upper case of ‘i’ is ‘İ’ (and conversely the lower case of ‘İ’ is ‘i’). What is the lower case of ‘I’, then? It’s ‘ı’, the dotless version of ‘i’. One cannot deny a certain consistency in the way the two languages handle letter casing: the minuscule dotless ‘ı’ becomes the majuscule dotless ‘I’, while the minuscule ‘i’ with a dot becomes the majuscule ‘İ’ with a dot.
Another interesting example is the group of letters ‘ð’ (Icelandic et al.), ‘đ’ (Croatian, Vietnamese et al.), and ‘ɖ’ (African languages): all three letters have “the same” upper case (‘Ð’, ‘Đ’, ‘Ɖ’). Therefore it is impossible to tell what the lower case of the majuscule ‘D’ with a stroke is by looking only at its graphical representation.
Interestingly enough, Unicode handles these two examples in two different ways.
On one hand, there are three Unicode characters:
‘Ð’ (U+00D0 LATIN CAPITAL LETTER ETH),
‘Đ’ (U+0110 LATIN CAPITAL LETTER D WITH STROKE),
and ‘Ɖ’ (U+0189 LATIN CAPITAL LETTER AFRICAN D)
which look the same, but each of them has a different lower case.
Therefore, given any of the Unicode characters,
there is no ambiguity as to its lower case.
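
The point can be checked with a minimal Python sketch. It assumes only the standard `unicodedata` module and the default, language-independent mapping applied by `str.lower()`; the helper name `describe` is made up for this illustration.

```python
import unicodedata

# The three capital letters that look like a D with a stroke.
CAPITALS = ["\u00D0", "\u0110", "\u0189"]


def describe(char: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


for capital in CAPITALS:
    # str.lower() uses Unicode's default case mapping, which is defined
    # per code point and does not depend on the language of the text.
    print(f"{describe(capital)} -> {describe(capital.lower())}")
```

Each capital maps to its own lower-case code point (U+00F0, U+0111, and U+0256 respectively), even though the three capitals are visually indistinguishable in most fonts.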
On the other hand, Unicode contains
only one minuscule dotted ‘i’ and only one majuscule dotless ‘I’, shared by all languages,
and thus the task of finding the lower case of ‘I’
depends on the language in which the letter is used.
The latter ambiguity has some consequences in programming,
especially in case-insensitive string comparisons:
whether two strings compare equal depends on how the conversion to upper/lower case is implemented.
A lot of buggy code has been written that works everywhere except Azerbaijan and Turkey,
since many programmers tend to assume
that the lower case of ‘I’ is always ‘i’ and the upper case of ‘i’ is always ‘I’.
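
As a rough illustration of how such a bug arises, here is a Python sketch. Python's `str.lower()` and `str.upper()` always apply the default mapping, so the Azeri/Turkish rules have to be layered on top. The helpers `turkic_lower` and `turkic_upper` are hypothetical names invented for this example, not part of any library, and they only handle precomposed characters (they ignore, for instance, an ‘I’ followed by a combining dot above).

```python
DOTTED_CAPITAL_I = "\u0130"  # İ
DOTLESS_SMALL_I = "\u0131"   # ı


def turkic_lower(text: str) -> str:
    """Lower-case text using the Azeri/Turkish rules for I and İ (sketch)."""
    # Map the two special capitals first, then fall back to the default mapping.
    return text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I).lower()


def turkic_upper(text: str) -> str:
    """Upper-case text using the Azeri/Turkish rules for i and ı (sketch)."""
    return text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I").upper()


# A typical "case-insensitive" comparison that assumes I <-> i.
default_match = "FILE".lower() == "file"
turkic_match = turkic_lower("FILE") == "file"

print(default_match)         # comparison under the default mapping
print(turkic_match)          # comparison under the Azeri/Turkish mapping
print(turkic_lower("FILE"))  # contains the dotless ı
print(turkic_upper("file"))  # contains the dotted İ
```

Under the default mapping the comparison succeeds; under the Azeri/Turkish rules ‘FILE’ lower-cases to ‘fıle’, so the same comparison fails. Code that compares identifiers, file extensions, or protocol keywords is usually better off with a fixed, language-independent mapping, keeping language-specific casing for text shown to the user.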