2008-09-03

Problem with 'œ' and 'Œ' characters

For now, French users of Novelang who want "œ" and "Œ" have to type "«oelig»" and "«OElig»" (yes, angled quotes included). That's especially tedious for Mac users, who would rather just type Alt-o and Shift-Alt-O.

The Unicode specification puts œ and Œ ('LATIN SMALL LIGATURE OE' and 'LATIN CAPITAL LIGATURE OE') in the Latin Extended-A block. All other letters with French accents are part of Latin-1 Supplement. Unfortunately, the commonly favoured ISO-8859-1 encoding doesn't include "œ" and "Œ". As a consequence, those characters may look fine in a text editor configured to save files in ISO-8859-1, but they turn into question marks when the document is reopened.

The Latin-1 Supplement seems to offer characters that look the same: 'STRING TERMINATOR' (U+009C) and 'PARTIAL LINE BACKWARD' (U+008C). But I don't think it's a good idea to use them, as their names suggest they have another purpose. (They are control codes; they probably look right only because Windows-1252 uses bytes 0x9C and 0x8C for œ and Œ.)

Googling on "latin-extended-b iso-8859-1" I discovered this page listing all differences between ANSI (aka Windows-1252), Mac Roman and ISO-8859-1. Very useful! It seems that ISO-8859-1 was not such a clever choice, but I can't find any multiplatform 8-bit encoding that includes every commonly used French character.
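To make the encoding mismatch concrete, here is a small Python sketch (not part of Novelang; the helper name `try_encode` is made up for illustration) that checks which of the three encodings mentioned above can represent œ and Œ, and at which byte. It relies only on Python's standard codecs `iso-8859-1`, `cp1252` and `mac_roman`.

```python
import unicodedata

# The three 8-bit encodings compared in the post.
ENCODINGS = ("iso-8859-1", "cp1252", "mac_roman")

def try_encode(char, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent char."""
    try:
        return char.encode(encoding)
    except UnicodeEncodeError:
        return None

for char in "œŒ":
    print(f"U+{ord(char):04X} {unicodedata.name(char)}")
    for encoding in ENCODINGS:
        encoded = try_encode(char, encoding)
        shown = encoded.hex() if encoded is not None else "not representable"
        print(f"  {encoding:<11} {shown}")

# A lossy save substitutes '?' for anything the encoding cannot represent.
print("cœur".encode("iso-8859-1", errors="replace"))
```

Windows-1252 and Mac Roman both have the ligatures, but at different byte values, so neither one works as a common 8-bit format across platforms.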
