2009-05-17

Missing closing delimiters

By now, a block with a missing closing delimiter was properly detected as an error, but the error message was ugly. See, for this:


Something -- missing

You got:


line 0:-1 mismatched input '' expecting HYPHEN_MINUS

Not a great deal here, but pretty annoying in a 1000-line long source document.

After a close look, it looked very complex to determine where the error was coming from. Considering this case:


There " is ( something " missing

… The problem is obviously with the unclosed parenthesis. It’s easy to see (for a human) because parenthesis are paired delimiters: there is an opening and a closing one. The double quotes " is single in the sense it may be used for both opening and closing a block, depending on the context. In the example above, the Novelang parser started evaluating a parenthesized block, and the double quote looked like an unclosed block. How to handle this correctly?

— In order to avoid grammar bloat, the grammar emits some kind of events, telling it started parsing a block with such or such delimiter. The position of every token for a start delimiter is kept. If something goes wrong, the error message(s) will report the position of the unclosed delimiter.

— Event consistency check is scoped: if an unclosed delimiter is detected inside a paragraph, this should have no influence on the way unclosed delimiters are handled inside another paragraph.

— When trying to figure where is the opening delimiter with no closing counterpart, the trick is, to look at paired delimiters first. If something went wrong with paired delimiters, just report the errors about them. Otherwise, report errors with single delimiters.

I just checked this new feature into Github and the results are pretty good. Given source document like this (line numbers added for clarity):

1   ( s
2  t -- u
3  v )
4  
5  // w
6  x [ y
7  z //

Instead of a bunch of nonsense, Novelang now returns following problems:


2:2: Missing delimiter. For '--' there should be a matching '--' or '-_'
7:4: no viable alternative at input '\n'
6:2: Missing delimiter. For '[' there should be a matching ']'

This will be available in the next version (0.28.0). Keep informed reading this blog!

No comments: