2008-12-09

Ongoing grammar refactoring

As refactoring goes on, it's time to answer some questions. The main new features (URLs and list items inside a paragraph) have a great on the whole design. Both may take place inside a paragraph; both start (and end) with a line break. That's a major change, as in former design the paragraph was a single piece as long as there were no two contiguous line breaks. While it's not a part of the grammar itself (as in the Novelang.g file), URLs support being "decorated" with a preceding double-quoted block, or an angled-bracketed block. So we should consider that URLs inherently spread over many lines and expose a double-quoted block. List items inside a paragraph are called "small list items". They cannot spread over more than one line (no break inside). For this reason, a small list item may not contain a paragraph. So we have a new beast: text blocks which don't contain line breaks as paragraphs do. For this reason, they cannot contain URLs or small lists. We'll call monoblock a text block with no line break. We'll call spreadblock a text with line breaks (as it may spread over several lines). We can't have URLs inside small lists. But's that's ok because there is another thing called "big list items" that's a plain paragraph with a special indicator at its start. By now (master branch), titles for sections and chapters are casual paragraphs. I wonder if it now makes sense to have URLs or small lists inside a title? Another limit I put arbitrarily is to forbid URLs and small lists inside double-quoted blocks. This sounds right for the URL because of the double-quoted block that may decorate a URL (it's logically impossible to have double-quoted block inside double-quoted block without an additional delimiter). For the small list, it's more for typographical sanity. Does it make sense to extend this to any asymmetrical delimiter (emphasis, interpolated clause...)? Sometimes it's useful to emphasize a whole paragraph, including its URLs and small lists. Because I'd like Novelang grammar to be twisted and abused in any technically-feasible way (for creating idioms), it doesn't make sense to forbid titles to be paragraphs, nor to forbid asymmetrically-delimited blocks (except double-quoted blocks) to spread over several lines and contain URLs and small list items. Depending on the stylesheet, things like this could make sense:
=== This is a section with embedded small list items:
- item1
- item2
- item3

This is a paragraph.
Rendering may look like:
This is a section with embedded small list items: item1, item2, item3.
Same for URLs. Because URL href must stand on its own line, this encourages adding a display text:
=== "This is URL display text"
http://foo.com
At the end, it seems that technically feasible things drive to well-designed grammar! Thanks to ANTLR for helping me to express the grammar so clearly.

No comments: