The Novelang blog: May 2009

2009-05-30

Image processing

In my endless quest for nice libraries to integrate into Novelang, I’ve been wondering about image processing. This makes sense for technical documentation with screen captures; often it is useful to rescale or fade them. Compression is useful, too, but it should probably be left to the rendering stage.

Whenever a captured image is updated, you need to process it again with Gimp or whatever. This should be done automatically! I’m looking for a scripting language to do clever things. I have no idea how to integrate it into Novelang; maybe with some special files, to avoid mixing macro-instructions with content.

The language itself could be something like this:

  {
    rescale( 40% )
    fade( SOUTHWEST, 3px )
  }
  ./my-image.png

I want something clever with an explicit representation of pipeline processing. And, yes, it should be all in Java and with a GPL-compatible license. Am I asking too much here?
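A minimal sketch of what an explicit pipeline could look like in plain Java, assuming each step is just a function from image to image. The names `ImagePipeline` and `rescale` are hypothetical, the rescale is a crude nearest-neighbor one, and no fade step is shown because its exact meaning is not defined here.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch: a pipeline is an ordered list of image-to-image steps.
final class ImagePipeline {

  private final List<UnaryOperator<BufferedImage>> steps;

  @SafeVarargs
  ImagePipeline(UnaryOperator<BufferedImage>... steps) {
    this.steps = Arrays.asList(steps);
  }

  /** Feeds the image through every step, in declaration order. */
  BufferedImage apply(BufferedImage image) {
    BufferedImage current = image;
    for (UnaryOperator<BufferedImage> step : steps) {
      current = step.apply(current);
    }
    return current;
  }

  /** Nearest-neighbor rescale; a factor of 0.4 stands for "rescale( 40% )". */
  static UnaryOperator<BufferedImage> rescale(double factor) {
    return source -> {
      int width = Math.max(1, (int) Math.round(source.getWidth() * factor));
      int height = Math.max(1, (int) Math.round(source.getHeight() * factor));
      BufferedImage target =
          new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
      for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
          // Pick the closest source pixel, clamped to the source bounds.
          int sourceX = Math.min(source.getWidth() - 1, (int) (x / factor));
          int sourceY = Math.min(source.getHeight() - 1, (int) (y / factor));
          target.setRGB(x, y, source.getRGB(sourceX, sourceY));
        }
      }
      return target;
    };
  }
}
```

With this shape, `new ImagePipeline(ImagePipeline.rescale(0.4)).apply(image)` plays the role of the braces block above, and further steps would simply be more arguments to the constructor.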

I’ve found an amazing piece of software: ImageJ, a public domain tool for image processing with a huge number of macros and plugins. It seems widely used in science. The bad news: image transparency doesn’t look like a great concern. The Alpha Channel plugin is the best I’ve found so far, despite its rough edges.

NetKernel has a pipeline image processing feature that looks like what I want. But I don’t like their everything-is-a-String approach.

Maybe I shouldn’t be so ambitious, and just start hacking “the smallest thing that could possibly work” for solving my own problem instead of looking for a save-the-world solution.

By the way, this is an interactive rendering effect editor based on BeanShell that may ease some pain while hacking image filters.

2009-05-28

Space character and related stuff

Blocks of literal

Most of the time, the text inside a block of literal should be kept in one piece. A blatant example is a numeric value and its unit.

Compact several spaces into one, for the same reason as above.

Trim leading and trailing spaces. Otherwise they offer a suspicious means to override text layout.

Replace spaces by no-break spaces.

With the low line character _ standing for the no-break space, we’d like to obtain this transformation:

` 20   m  ` -> `20_m`
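The three rules can be sketched on a plain string as follows. This is only an illustration: the class and method names are hypothetical, and the sketch works on the literal's text rather than on Novelang's tree.

```java
// Hypothetical sketch of the three rules applied to the text of one literal.
final class LiteralSpaces {

  /** U+00A0 NO-BREAK SPACE. */
  static final char NO_BREAK_SPACE = '\u00A0';

  static String normalize(String literal) {
    // Rule: trim leading and trailing spaces.
    String trimmed = literal.trim();
    // Rules: compact each run of spaces into one, and make it non-breaking.
    return trimmed.replaceAll(" +", String.valueOf(NO_BREAK_SPACE));
  }
}
```

Calling `LiteralSpaces.normalize(" 20   m  ")` yields `20`, one no-break space, then `m`.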

This means a long block of literal with several spaces (transformed into no-break spaces) could become very cumbersome and mess up the layout. So we need a hint to allow line breaks at some places. This can be done by splitting the big block of literal into several small ones, which are not separated by spaces.

With the vertical bar character | standing for the zero-width space, we get this transformation:

   `Y.O.U.``A.R.E.``B.E.A.U.T.I.F.U.L` 
-> `Y.O.U.|A.R.E.|B.E.A.U.T.I.F.U.L`

See more about the zero-width space here. A quick test shows that FOP supports it.

Implementation will be done at tree-mangling level. A whitespace between two consecutive blocks of literal will be replaced by a special node meaning that a break is allowed here. The special node will be replaced by a zero-width space at rendering time.
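A rough sketch of these two phases, using a flat list of strings instead of a real tree. The marker name `BREAK_ALLOWED` and both method names are assumptions made for the example, not Novelang identifiers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: mark allowed breaks first, render them later.
final class BreakableLiterals {

  /** Stand-in for the special node meaning "a line break is allowed here". */
  static final String BREAK_ALLOWED = "<BREAK_ALLOWED>";

  /** U+200B ZERO WIDTH SPACE. */
  static final String ZERO_WIDTH_SPACE = "\u200B";

  /** Tree-mangling phase: puts a marker between consecutive literals. */
  static List<String> insertBreakMarkers(List<String> literals) {
    List<String> nodes = new ArrayList<>();
    for (int i = 0; i < literals.size(); i++) {
      if (i > 0) {
        nodes.add(BREAK_ALLOWED);
      }
      nodes.add(literals.get(i));
    }
    return nodes;
  }

  /** Rendering phase: each marker becomes a zero-width space. */
  static String render(List<String> nodes) {
    StringBuilder out = new StringBuilder();
    for (String node : nodes) {
      out.append(BREAK_ALLOWED.equals(node) ? ZERO_WIDTH_SPACE : node);
    }
    return out.toString();
  }
}
```

Keeping the marker separate from the character lets another renderer choose a different representation for the same node.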

Apostrophe

This technique could be useful to keep the apostrophe character stuck to a word when it is in last position. For now, Novelang does not take care of the whitespace after or before the apostrophe.

he's here     -> he’s here
houses' roofs -> houses’roofs
during '60    -> during’60

This is because whitespaces are used as separators, but don’t carry “real” information (except in a few cases, like indentation for embedded lists). Before discarding WHITESPACE nodes, the ones immediately preceding or following an apostrophe could become an EXPLICIT_WHITESPACE to be rendered as, yes, a space character.
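A minimal sketch of that promotion step on a flat token list. Only `WHITESPACE` and `EXPLICIT_WHITESPACE` come from the text above; the other token kinds and the method name are hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: promote whitespace next to an apostrophe, drop the rest.
final class ApostropheSpaces {

  enum Kind { WORD, APOSTROPHE, WHITESPACE, EXPLICIT_WHITESPACE }

  static List<Kind> mangle(List<Kind> tokens) {
    List<Kind> result = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      Kind current = tokens.get(i);
      if (current != Kind.WHITESPACE) {
        result.add(current);
        continue;
      }
      boolean afterApostrophe =
          i > 0 && tokens.get(i - 1) == Kind.APOSTROPHE;
      boolean beforeApostrophe =
          i + 1 < tokens.size() && tokens.get(i + 1) == Kind.APOSTROPHE;
      if (afterApostrophe || beforeApostrophe) {
        result.add(Kind.EXPLICIT_WHITESPACE);
      }
      // Any other WHITESPACE is a mere separator and is discarded.
    }
    return result;
  }
}
```

For `houses' roofs`, the whitespace after the apostrophe survives as an `EXPLICIT_WHITESPACE`, while the separator in `he's here` is still discarded.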

2009-05-27

Pretty color palette for tags

The default representation of tags attempts to help locate them at a glance, with nice colors. “Nice colors” means a lot of care.

Defining a color palette from scratch is tricky. Colors must be distinguishable from one another. They must spread evenly over the visible spectrum, but this is not easy because the visual effect depends on the display. For this reason, I use the 140 colors of the SVG specification (the same ones are used in the CSS spec). Much of the hard work is done there, including finding pretty names.

But that’s not all. Because the small rectangle of the tag contains text, too, there must be a foreground color. First I tried to compute it, using a simple algorithm (increasing Red, Green and Blue by 50% each and applying modulus 255). The text was always barely readable. Not really good.

Another problem was the choice of the color for each tag. I’ve chosen to pick the color of each tag from a predefined list. When all the colors have been used, we start over from the beginning. This round-robin algorithm for choosing colors is OK, but among the 140 colors, many look quite the same. Colors like mistyrose and lavenderblush are very close, and if we have only 10 tags, it’s a pity to see two tags looking the same. So it makes sense to edit the color list in order to make the first ones look very different. In addition, because those first colors will be picked most often, they must be in the same tone (mild saturation).
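The round-robin choice boils down to a modulo on the tag's index, as in this small sketch. `TagPalette` and `colorFor` are hypothetical names, and the color list stands for the hand-ordered one.

```java
import java.util.List;

// Hypothetical sketch of the round-robin color choice.
final class TagPalette {

  private final List<String> colors;

  TagPalette(List<String> colors) {
    this.colors = colors;
  }

  /** The n-th tag gets the n-th color, wrapping around at the end of the list. */
  String colorFor(int tagIndex) {
    return colors.get(tagIndex % colors.size());
  }
}
```

Since tag 0 always gets the first entry, the order of the list decides which colors show up in documents with few tags.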

In the end, if there are more than 10 or 20 tags, similar colors will be unavoidable. But, since we display text (and a thin border), there is a foreground color to choose. This gives 140 × 139 = 19460 possibilities! Of course background and foreground cannot be the same (hence the 139) and many possibilities are unreadable. But, given a color like white, those colors look quite similar: mintcream, honeydew, ghostwhite, floralwhite, seashell, azure, linen, aliceblue, cornsilk, oldlace, ivory, snow, whitesmoke. Wow!

Maybe there is a clever algorithm to detect which foreground colors give best contrast and distinguishability, but I didn’t find it. It seems much more convenient to let a human do the job.
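One candidate metric, not used by the palette editor described here, is the contrast ratio defined by the WCAG 2.0 accessibility guidelines. It only measures how readable a foreground is on a background, not how distinguishable two tags are from each other, so it could at best pre-filter the unreadable pairs before a human picks. The sketch below assumes colors are given as `0xRRGGBB` integers; the class name is hypothetical.

```java
// Sketch of the WCAG 2.0 contrast ratio between two sRGB colors (0xRRGGBB).
final class Contrast {

  /** Relative luminance as defined by WCAG 2.0, between 0 (black) and 1 (white). */
  static double luminance(int rgb) {
    double r = linear((rgb >> 16) & 0xFF);
    double g = linear((rgb >> 8) & 0xFF);
    double b = linear(rgb & 0xFF);
    return 0.2126 * r + 0.7152 * g + 0.0722 * b;
  }

  /** Converts one 8-bit sRGB channel to its linear value. */
  private static double linear(int channel) {
    double c = channel / 255.0;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  }

  /** Ratio from 1 (identical luminance) to 21 (black on white). */
  static double ratio(int rgbA, int rgbB) {
    double a = luminance(rgbA);
    double b = luminance(rgbB);
    return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
  }
}
```

With this metric, mintcream (`0xF5FFFA`) on white scores very close to 1, which matches the impression that such pairs are unreadable.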

Since editing some lines of code would require switching back and forth between the code editor and the web browser, I wrote a palette editor based on an HTML page. It looks like this:

It’s easy to change the order of appearance of a color with a drag and drop:

And, after clicking on one color, you set the foreground with an alt-click on the desired color.

Don’t forget to save using the Save button (File > Save in the Web browser’s menu won’t work). Yes, the color palette editor only runs on Firefox for now.

The color palette is located in:

src/main-resources/style/javascript/colors.htm

This new feature (and the beautiful color palette) will be available in the next release of Novelang (0.29.0).

2009-05-24

Novelang-0.28.0 released!

Latest release available here.

Now tags are handled as query parameters. This is much faster on big documents, and it works for every kind of document.

See documentation for details, and the list of other enhancements.

2009-05-17

Missing closing delimiters

Until now, a block with a missing closing delimiter was properly detected as an error, but the error message was ugly. For this:


Something -- missing

You got:


line 0:-1 mismatched input '' expecting HYPHEN_MINUS

Not a big deal here, but pretty annoying in a 1000-line source document.

After a close look, determining where the error comes from turned out to be very complex. Consider this case:


There " is ( something " missing

… The problem is obviously with the unclosed parenthesis. It’s easy to see (for a human) because parentheses are paired delimiters: there is an opening one and a closing one. The double quote " is single, in the sense that it may be used for both opening and closing a block, depending on the context. In the example above, the Novelang parser started evaluating a parenthesized block, and the double quote looked like an unclosed block. How to handle this correctly?

— In order to avoid grammar bloat, the grammar emits some kind of events, telling that it started parsing a block with this or that delimiter. The position of every token for a start delimiter is kept. If something goes wrong, the error message(s) will report the position of the unclosed delimiter.

— Event consistency check is scoped: if an unclosed delimiter is detected inside a paragraph, this should have no influence on the way unclosed delimiters are handled inside another paragraph.

— When trying to figure out which opening delimiter has no closing counterpart, the trick is to look at paired delimiters first. If something went wrong with paired delimiters, just report the errors about them. Otherwise, report errors with single delimiters.
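The three points above can be sketched as follows for one scope (one paragraph). This is an illustration under simplifying assumptions, not Novelang's code: `DelimiterCheck`, `Event` and the message format are hypothetical, only `(`/`)` and `[`/`]` are treated as paired, and a single delimiter is simply assumed to toggle between open and closed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: find unclosed delimiters within one paragraph.
final class DelimiterCheck {

  /** A delimiter token together with the position where it was seen. */
  static final class Event {
    final String symbol;
    final int line;
    final int column;

    Event(String symbol, int line, int column) {
      this.symbol = symbol;
      this.line = line;
      this.column = column;
    }
  }

  /** Paired delimiters: opening symbol mapped to its closing symbol. */
  private static final Map<String, String> PAIRED = new HashMap<>();
  static {
    PAIRED.put("(", ")");
    PAIRED.put("[", "]");
  }

  static List<String> unclosed(List<Event> paragraph) {
    Deque<Event> openPaired = new ArrayDeque<>();
    Map<String, Event> openSingle = new LinkedHashMap<>();
    for (Event event : paragraph) {
      if (PAIRED.containsKey(event.symbol)) {
        openPaired.push(event);
      } else if (PAIRED.containsValue(event.symbol)) {
        // A closing delimiter only closes the innermost matching opener.
        if (!openPaired.isEmpty()
            && PAIRED.get(openPaired.peek().symbol).equals(event.symbol)) {
          openPaired.pop();
        }
      } else if (openSingle.remove(event.symbol) == null) {
        // Single delimiter seen while not open: it opens a block.
        openSingle.put(event.symbol, event);
      }
    }
    // Paired delimiters first; single ones only if every pair is closed.
    Collection<Event> culprits =
        openPaired.isEmpty() ? openSingle.values() : openPaired;
    List<String> messages = new ArrayList<>();
    for (Event event : culprits) {
      messages.add(event.line + ":" + event.column
          + ": Missing delimiter for '" + event.symbol + "'");
    }
    return messages;
  }
}
```

Calling `unclosed` once per paragraph gives the scoping: a stray delimiter in one paragraph cannot affect the report for the next one.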

I just checked this new feature into GitHub and the results are pretty good. Given a source document like this (line numbers added for clarity):

1   ( s
2  t -- u
3  v )
4  
5  // w
6  x [ y
7  z //

Instead of a bunch of nonsense, Novelang now reports the following problems:


2:2: Missing delimiter. For '--' there should be a matching '--' or '-_'
7:4: no viable alternative at input '\n'
6:2: Missing delimiter. For '[' there should be a matching ']'

This will be available in the next version (0.28.0). Stay informed by reading this blog!

2009-05-10

Novelang-0.27.1 released!

Just a fix after I messed up the MIME type for rendered documents. As usual, available here.

Novelang-0.27.0 released!

Latest release available here.

Novelang-0.27.0 enhances the tag feature with a standard HTML stylesheet displaying the list of user-defined tags. In a source document, tags are words preceded by an at sign @. Levels, paragraphs, paragraphs inside angled bracket pairs (aka blockquotes) and cell rows (aka tables) may be tagged.

  @javascript @performance
By now this feature all relies on Javascript 
running inside the Web browser.

HTML generated using the default stylesheet renders tags like this, with a nice color set making tags distinguishable at a glance:

It is now possible to hide all the text which is not tagged, by selecting tags in a list which appears in the top-right corner of the HTML document, with a fixed position that keeps it always visible and a disclosure box which hides the list by default:

If a level or a set of paragraphs inside angled bracket pairs has at least one of the requested tags, it is displayed with all of its content. If a paragraph has at least one of the requested tags, it is displayed, along with all its parents (levels or sets of paragraphs).

For now this feature relies entirely on Javascript running inside the Web browser. This doesn’t scale to big documents (with lots of paragraphs and levels). For some big document with HTML generation taking about 13 s, selecting one tag takes more than 70 s and triggers several “slow script” warnings.

A more suitable approach would be to trim the AST (Abstract Syntax Tree) server-side. This requires passing parameters to the query. Because the trimming happens before rendering, tag-based filtering would work for any format other than HTML for free.
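A minimal sketch of such a trimming pass, applying the display rule stated earlier: a tagged node is kept with all of its content, and an untagged node is kept only as the parent of something that matched. The `Node` class and the `trim` method are hypothetical stand-ins for the real AST.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of server-side, tag-based tree trimming.
final class TagFilter {

  /** Simplified tree node: a level, a blockquote or a paragraph. */
  static final class Node {
    final String text;
    final Set<String> tags;
    final List<Node> children;

    Node(String text, Set<String> tags, List<Node> children) {
      this.text = text;
      this.tags = tags;
      this.children = children;
    }
  }

  /** Returns the trimmed tree, or null if nothing matches the requested tags. */
  static Node trim(Node node, Set<String> requested) {
    if (!Collections.disjoint(node.tags, requested)) {
      // At least one requested tag: keep the node with all of its content.
      return node;
    }
    List<Node> kept = new ArrayList<>();
    for (Node child : node.children) {
      Node trimmed = trim(child, requested);
      if (trimmed != null) {
        kept.add(trimmed);
      }
    }
    // An untagged node survives only as the parent of a kept descendant.
    return kept.isEmpty() ? null : new Node(node.text, node.tags, kept);
  }
}
```

Because the result is still a tree of the same shape, any renderer can consume it unchanged, which is what would make the filtering format-independent.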

There would be less to do in Javascript; it would just update the tag list in order to reflect document’s state.