2008-07-24

Character escaping

I just fixed a few bugs, and the literal form now supports nested less-than and greater-than signs, except when three greater-than signs appear in sequence at the beginning of a line. It is very sweet (at least for the Novelang documentation) that this is now a correct literal block (starting with '<<<' and ending with '>>>', both at the beginning of a line):
<<<
<<<
 >>>
>> >
>>>
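To make the rule concrete, here is a minimal Python sketch of how such a block could be recognized. It is not Novelang's actual parser: the function name `extract_literals` is made up for illustration, and it assumes that a line consisting only of '<<<' opens a literal and that the first following line starting with '>>>' closes it.

```python
def extract_literals(source):
    """Return the content lines of each literal block found in `source`."""
    blocks = []
    current = None  # None while we are outside a literal block
    for line in source.splitlines():
        if current is None:
            # Assumed opening rule: the line is exactly '<<<'.
            if line == "<<<":
                current = []
        elif line.startswith(">>>"):
            # Assumed closing rule: '>>>' at the very beginning of a line.
            blocks.append(current)
            current = None
        else:
            # Anything else, including a nested '<<<' or an indented ' >>>',
            # is plain literal content.
            current.append(line)
    return blocks


sample = "<<<\n<<<\n >>>\n>> >\n>>>"
print(extract_literals(sample))  # [['<<<', ' >>>', '>> >']]
```

In this sketch an unterminated block is silently dropped; a real parser would probably report an error instead.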
This dramatically reduces the need for character escaping. Of course, there will always be some weird language to quote that has three greater-than signs at the beginning of a line, and there may be other weird characters that the encoding doesn't support. So we're hitting the character escaping problem again.

In the refactoring-characterescape branch I already pushed a new character escaping scheme based on the tilde '~' character, but using the same character to open and close an escape makes the document source much less readable. Of course, this is because I'm using character escaping as a workaround until I implement a better literal. But that unreadable stuff is like a warning that the tilde character is inappropriate. And I realize that it's commonly used in programming languages, so it would have to be escaped inside literals, which gets tedious when you copy-paste from your favorite programming language. As a Mac user I'm a bit tied to the Mac keyboard layout, but I think the left- and right-pointing double angle quotation marks (don't laugh, that's the official Unicode name) are OK. Instead of this:
~escapecode~
I'm about to switch to this:
«escapecode»
The benefit is obvious when several escaped characters are juxtaposed:
«escape1»«escape2»«escape3»
is better than
~escape1~~escape2~~escape3~
On a Mac AZERTY keyboard the two characters are obtained with Alt-7 and Shift-Alt-7. There must be something similar on other platforms (Windows, QWERTY). Anyway, this doesn't have to be used often, so it's OK to use a weird character that doesn't appear in common text or programming languages. It would then be possible to document Novelang correctly by giving a sample of a literal like this:
<<<
<<<
Some literal here.
«greaterthan»«greaterthan»«greaterthan»
>>>
Or even like this:
<<<
Escape character like this: «lpdaqm»escapecode«rpdaqm».
>>>
Of course, lpdaqm and rpdaqm stand for "left-pointing double angle quotation mark" and "right-pointing double angle quotation mark". I prefer to avoid acronyms, but this name is really too crazy.
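
As a rough illustration of how the «escapecode» form could be expanded, here is a small Python sketch. It is only a guess at the mechanism, not the code in the refactoring-characterescape branch: the escape names greaterthan, lpdaqm and rpdaqm come from the samples above, while the `ESCAPES` table and the `expand_escapes` function are hypothetical.

```python
import re

# Hypothetical lookup table. The names come from the samples in this post;
# the table itself is an assumption made for illustration.
ESCAPES = {
    "greaterthan": ">",
    "lpdaqm": "«",
    "rpdaqm": "»",
}

# An escape is an opening « followed by a name and a closing ».
ESCAPE_PATTERN = re.compile(r"«([^«»]+)»")


def expand_escapes(text):
    """Replace every known «name» with the character it stands for."""
    def replace(match):
        # Unknown names are left untouched rather than guessed at.
        return ESCAPES.get(match.group(1), match.group(0))
    return ESCAPE_PATTERN.sub(replace, text)


print(expand_escapes("«greaterthan»«greaterthan»«greaterthan»"))
# >>>
print(expand_escapes("Escape character like this: «lpdaqm»escapecode«rpdaqm»."))
# Escape character like this: «escapecode».
```

Because the opening and closing delimiters differ, juxtaposed escapes stay unambiguous to read, and the characters produced by an expansion are not scanned again.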
