fr.xml hyphenation rules, which fortunately come under the GPL license. Hyphenation takes apostrophes into account. That's because with a "remain character count" of three it is correct to hyphenate a word like "l'attrait" like this: "
fr.xml was quite clear, with many occurrences of the APOSTROPHE character (U+0027), which is also called "single quote" and looks symmetrical. But hyphenation occurs after the FO-generating XSL has replaced the <apostrophe-wordmate> element with the RIGHT SINGLE QUOTATION MARK character (U+2019), which looks better than APOSTROPHE but was not understood by the hyphenation rules, causing a potential hyphenation bug on every word with a "relooked" apostrophe. I spent a lot of time trying to hack the rules, which were in fact correct, and finally the solution was to replace every APOSTROPHE with RIGHT SINGLE QUOTATION MARK (the ’ XML entity). Because hyphenation then worked better, it changed the word distribution and raised another problem: some proper nouns got hyphenated. The FOP documentation describes an <exceptions> element containing words that must not be hyphenated at all. At first it didn't work, and I had to trace into the FOP code to find out that every word in the exception list must be lower-cased. So Novelang could support:
- An exception list declared in the Book file itself.
- Automatic replacement of the quoting character.
hyphenation.dtd). Generating temporary files may seem inelegant, but it makes debugging easier than in-memory structures and playing with custom URL protocols. Hyphenation would become really simple for French users! Now this opens another interesting question: how to handle documents with several languages?
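The two features listed above (an exception list and automatic replacement of the quoting character) could be prepared along these lines before the temporary hyphenation file is written. This is only a rough sketch: the class `HyphenationPrep` and its method names are hypothetical and not part of Novelang or FOP, and it assumes the replacement is applied to pattern text only, not to XML markup that might use single quotes as attribute delimiters.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper, not an existing Novelang or FOP class.
final class HyphenationPrep {

    private static final char APOSTROPHE = '\'';                      // U+0027
    private static final char RIGHT_SINGLE_QUOTATION_MARK = '\u2019'; // U+2019

    private HyphenationPrep() {}

    /**
     * Replaces every APOSTROPHE in the given pattern text with
     * RIGHT SINGLE QUOTATION MARK, so that the rules match the
     * character the stylesheet actually emits.
     */
    static String normalizeApostrophes(String patternText) {
        return patternText.replace(APOSTROPHE, RIGHT_SINGLE_QUOTATION_MARK);
    }

    /**
     * Lower-cases each exception word and joins the words with newlines,
     * ready to be written as the text content of an <exceptions> element.
     */
    static String exceptionsBody(List<String> words) {
        return words.stream()
                .map(word -> word.toLowerCase(Locale.ROOT))
                .collect(Collectors.joining("\n"));
    }
}
```

Lower-casing in one place means the Book file could list proper nouns with their natural capitalization, leaving the conversion to the generated file.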