Novelang-0.53.4 released!

Just released Novelang-0.53.4!

Summary of changes:

  • Fixed logging configuration with --log-dir option.

Download it from here.



Novelang-0.53.3 released!

Just released Novelang-0.53.3!

Summary of changes:

  • Fixed barcode generation.
  • Minor logging enhancements.

Download it from here.


Browser compatibility for HTML documentation

Since version 0.53.0, the Novelang HTML documentation displays correctly on the following browsers:

  • Safari 5.0 (Mac OS X 10.6.5, Windows XP 32-bit SP3).
  • Firefox 3.6 (Mac OS X 10.6.5, Windows XP 32-bit SP3).
  • Google Chrome 8.0 (Windows XP 32-bit SP3).
  • Internet Explorer 8 (Windows XP 32-bit SP3).

The website also passed W3C validation for HTML 4.0 Transitional.

Some links about CSS layout

Three Column Stretch: stretching isn’t that good, it makes the lines of the main text too long.

Piefecta – A superb 3-col tableless layout – long right col looks good on Safari and zooms well. It claims to deal with various browser bugs. Probably the finest piece of engineering, but too many fixes make it unreadable. We don’t care about supporting IE6. License: unknown.

CSS Fixed Layout #3.1 (Fixed-Fixed-Fixed) zooms correctly. Much simpler. License: unknown.

Elastic-fluid hybrid got it right. It scales up and down in width, staying within fixed limits. Read the author’s comments.

Also check this tutorial about elastic layout.

The Holy Grail 3 column Liquid Layout has great explanations. (Only 1 error on the W3C validator, but the element reported as missing appears as it should – validator bug?) License: free to use, linkback appreciated. After a close look, it turned out that forcing min-width and max-width for the column (worst case: nesting another div) and using proper text alignment does the job.


Novelang-0.53.2 released!

Just released Novelang-0.53.2!

Summary of changes:

  • Indicating error location when something goes bad during XSL transformation.
  • Minor fixes on HTML documentation.

Download it from here.



Novelang-0.53.1 released!

Just released Novelang-0.53.1!

Summary of changes:

  • Minor cosmetic changes for HTML generation.

Download it from here.


Novelang-0.53.0 released!

Just released Novelang-0.53.0!

Summary of changes:

  • Experimental support for Multipage. See the result in Novelang documentation.
  • Small logging enhancements.
  • More restrictive rules when applying XSL stylesheets. Generation now breaks on warnings. This might break existing incorrect stylesheets.
  • Changed default representation of Fragment Identifiers, both Implicit and Explicit. Removed leading double reverse solidus \\ when rendering (still required in document sources).

Download it from here.



Rule-based number spelling

Novelang comes with a Numbering class which formats an integer value in words. This adds a bit of magic when the stylesheet writes "Chapter forty-two" from a stupid counter.

Currently the Numbering class only supports French and English, and values from 0 to 50 (all values are hardcoded). The ICU project offers the RuleBasedNumberFormat which supports rule-based formatting. This makes it easy to support much greater ranges.
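To give an idea of what "rule-based" means compared to a hardcoded table, here is a minimal hand-rolled sketch for English. It is neither Novelang's Numbering class nor ICU's RuleBasedNumberFormat; the class name, the method name and the 0 to 999 range are assumptions made for illustration only.

```java
// Hypothetical sketch: composes number names from a few rules
// instead of hardcoding every value.
final class RuleBasedSpellingSketch {

  private static final String[] UNITS = {
      "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
      "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
      "sixteen", "seventeen", "eighteen", "nineteen" };

  private static final String[] TENS = {
      "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
      "eighty", "ninety" };

  /** Spells an integer between 0 and 999 (inclusive) in English. */
  static String spell(int n) {
    if (n < 0 || n > 999) {
      throw new IllegalArgumentException("Unsupported value: " + n);
    }
    if (n < 20) {
      return UNITS[n]; // Rule 1: irregular names below twenty.
    }
    if (n < 100) {
      // Rule 2: tens, plus a hyphenated unit when there is one.
      String tens = TENS[n / 10];
      return n % 10 == 0 ? tens : tens + "-" + UNITS[n % 10];
    }
    // Rule 3: hundreds, then recurse on the remainder.
    String hundreds = UNITS[n / 100] + " hundred";
    return n % 100 == 0 ? hundreds : hundreds + " " + spell(n % 100);
  }

  public static void main(String[] args) {
    System.out.println("Chapter " + spell(42)); // prints "Chapter forty-two"
  }
}
```

Three rules cover a thousand values, where a hardcoded table would need a thousand entries per language.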


XSL mockup for multipage rendering

Here is how an XSL would render a multipage document.

First, let’s consider the whole document defining the opus:

== One

Some text of level one.

== Two

Some text of level two.

=== Two-one

Some text of level two-one.

The XML form of the document above is:

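<?xml version="1.0" encoding="UTF-8" ?>
<opus>
  <level>
    <title>One</title>
    <paragraph>Some text of level one.</paragraph>
  </level>
  <level>
    <title>Two</title>
    <paragraph>Some text of level two.</paragraph>
    <level>
      <title>Two-one</title>
      <paragraph>Some text of level two-one.</paragraph>
    </level>
  </level>
</opus>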

Let’s take for granted that Novelang supports XSL metadata. Our multipage-enabled stylesheet would define an embedded stylesheet that transforms a whole opus into a simple map of page names and page paths. A path is whatever the stylesheet may reprocess, but an XPath expression is quite good. For the document above, here is what our map could look like, if we want to support 2 levels:

page1 -> /opus/level[1]
page2 -> /opus/level[2]
page3 -> /opus/level[2]/level[1]

Please note that, at this point, the decision to support a given depth, or to exclude some tagged levels, entirely belongs to the page-extracting stylesheet.
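The page map above comes from an embedded stylesheet, but the traversal itself is simple enough to sketch in plain Java. This is only an illustration of the depth-limited walk: the Level class, the pageMap method and the page1, page2 naming are hypothetical and not part of Novelang.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: builds a map of page names to XPath-like paths.
final class PageMapSketch {

  /** Minimal stand-in for a document level; only nesting matters here. */
  static final class Level {
    final List<Level> children = new ArrayList<>();

    Level add(Level child) {
      children.add(child);
      return this;
    }
  }

  /** Walks the tree depth-first, going no deeper than maxDepth levels. */
  static Map<String, String> pageMap(Level opus, int maxDepth) {
    Map<String, String> pages = new LinkedHashMap<>();
    collect(opus, "/opus", 1, maxDepth, pages);
    return pages;
  }

  private static void collect(Level parent, String parentPath, int depth,
      int maxDepth, Map<String, String> pages) {
    if (depth > maxDepth) {
      return;
    }
    for (int i = 0; i < parent.children.size(); i++) {
      // XPath positions start at 1.
      String path = parentPath + "/level[" + (i + 1) + "]";
      pages.put("page" + (pages.size() + 1), path);
      collect(parent.children.get(i), path, depth + 1, maxDepth, pages);
    }
  }

  public static void main(String[] args) {
    // Same shape as the sample document: One, Two, and Two-one inside Two.
    Level opus = new Level()
        .add(new Level())
        .add(new Level().add(new Level()));
    pageMap(opus, 2).forEach((name, path) ->
        System.out.println(name + " -> " + path));
  }
}
```

With a depth of 2 this prints the same three lines as the map above; with a depth of 1 only the two top-level pages remain.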

By merging the page map with the opus, we get the XML input for the rendering of one page. Novelang knows which page it is either because it is iterating over all known pages of the map (batch mode), or because the page name is a part of the request issued to the HTTP dæmon.


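<?xml version="1.0" encoding="UTF-8" ?>
<opus>
  <meta>
    <page>
      <name>page2</name>
      <path>/opus/level[2]</path>
    </page>
  </meta>
  <level>
    <title>One</title>
    <paragraph>Some text of level one.</paragraph>
  </level>
  <level>
    <title>Two</title>
    <paragraph>Some text of level two.</paragraph>
    <level>
      <title>Two-one</title>
      <paragraph>Some text of level two-one.</paragraph>
    </level>
  </level>
</opus>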

(Note: the n: namespace prefix doesn’t appear here for brevity.)

The stylesheet gets this whole document as input for every page. All that changes is the (name, path) pair in the meta/page element. The stylesheet needs to know which page it is rendering, and the whole document tree as well, in order to create a navigation bar or any kind of header or footer corresponding to a specially-titled or tagged level of the document.

This involves some XSL trickery: evaluating an XPath expression at runtime. While it’s not part of the XPath 1.0 specification, it is part of the semi-official EXSLT community initiative. The dyn:evaluate function (http://www.exslt.org/dyn/functions/evaluate) does that for us. It works well with Xalan-2.7.1, which is the XSLT engine bundled with Novelang (it works slightly better than the JDK’s one).

In the stylesheet below, we save useful expressions into variables.

The root template prints those variables, then a pseudo-navigation bar made of nested lists.

The nested loop for iterating over level elements is rather ugly but it makes sense as we don’t want infinite depth of titles in a navigation bar.

The title-with-locator template just adds bold on the title in the navigation bar that corresponds to current page.

All other templates mimic Novelang’s standard rendering.

<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dyn="http://exslt.org/dynamic"
>
  <!-- Be sure to use Xalan-2.7.1 (not JDK's default). -->

  <!--
    Here, expect a meta section, embedding a stylesheet 
    that extracts the pages we'll find in the meta section 
    of input document.
  -->

  <xsl:output method="html" />

  <xsl:variable name="page-name" select="/opus/meta/page/name" />
  <xsl:variable name="page-path" select="/opus/meta/page/path" />
  <xsl:variable name="page-nodeset" 
      select="dyn:evaluate( $page-path )" />
  <xsl:variable name="page-id" 
      select="generate-id( $page-nodeset )" />

  <xsl:template match="meta/page" >
    $page-name=<xsl:value-of select="$page-name" />
    $page-path=<xsl:value-of select="$page-path" />
    $page-id=<xsl:value-of select="$page-id" />
  </xsl:template>

  <xsl:template match="/opus" >
    <html>
    <body>
      <xsl:apply-templates select="meta" />

      <!-- Navigation bar -->
      <ul>
        <xsl:for-each select="level">
          <li>
            <xsl:call-template name="title-with-locator"/>
          </li>
          <xsl:if test="level">
            <ul>
              <xsl:for-each select="level">
                <li>
                  <xsl:call-template name="title-with-locator"/>
                </li>
              </xsl:for-each>
            </ul>
          </xsl:if>
        </xsl:for-each>
      </ul>

      <!-- Document body, same templates as usual -->
      <xsl:apply-templates select="$page-nodeset" />
    </body>
    </html>
  </xsl:template>

  <xsl:template match="paragraph" >
    <p>
      <xsl:value-of select="." />
    </p>
  </xsl:template>

  <xsl:template match="title" />

  <xsl:template match="level" >
    <h2><xsl:value-of select="title" /></h2>
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template match="level/level" >
    <h3><xsl:value-of select="title" /></h3>
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template name="title-with-locator" >
    <xsl:choose>
      <xsl:when test="generate-id( . ) = $page-id" >
        <b><xsl:call-template name="title-alone" /></b>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="title-alone" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template name="title-alone" >
    Title: <xsl:value-of select="title" />
  </xsl:template>

</xsl:stylesheet>


Finally, this is what the rendering looks like:


Novelang-0.52.0 released!

Just released Novelang-0.52.0!

Summary of changes:

  • Added n:block-inside-asterisk-pairs. The default stylesheet renders it as bold.

Download it from here.


Grammar pattern: twin delimiters

This post describes a tricky point of Novelang’s grammar design: how to handle twin delimiters like // in a non-ambiguous manner for an ANTLR grammar. It’s a useful refresher before adding the long-awaited ** (asterisk pair) delimiter.

The problem

For paired delimiters like ( and ) or [ and ] it’s easy to know when to “open” or “close” a block, and support nested blocks. In contrast, a twin delimiter is an opening one if not preceded by a closing one inside the same block, regardless of what happens in subblocks. This is a complicated way to say we support this kind of nesting:

// block-1 ( block-2 //block-3// ) //

+ block-inside-solidus-pairs
  + block-inside-parenthesis
    + block-inside-solidus-pairs

We also support this:

block-1 // block-2 // block-3 // block-4 //

+ block-inside-solidus-pairs
+ block-inside-solidus-pairs

(We have only one level of nesting here. 2 levels of nesting is counter-intuitive and would have required very complex lookahead.)

The pattern

The pattern is to define special grammatical elements for use inside a block defined by a twin delimiter, in order to propagate the fact that this delimiter cannot appear again, unless inside some other subblock.

Taking “XXX” for the name of some twin delimiter, here is a simplified version of the grammar for spreadblocks. The term “spreadblock” stands for a block that may spread on several lines (containing single line breaks).

  : ... mixedDelimitedSpreadblock

mixedDelimitedSpreadblock
  : word ( punctuationSign | delimitedSpreadblock ) ...

delimitedSpreadblock
  : xxxSpreadblock
  | parenthesizedSpreadblock
  | squareBracketsSpreadblock
  | doubleQuotedSpreadblock
  | hyphenPairSpreadblock

parenthesizedSpreadblock
  : '(' spreadblockBody ')' // Same for other paired delimiters.

spreadblockBody
  : ... mixedDelimitedSpreadblock

xxxSpreadblock
  : XXX spreadblockBodyNoXxx XXX

spreadblockBodyNoXxx
  : ... mixedDelimitedSpreadblockNoXxx ...

mixedDelimitedSpreadblockNoXxx
  : ... delimitedSpreadblockNoXxx ...

delimitedSpreadblockNoXxx
  : parenthesizedSpreadblock
  | squareBracketsSpreadblock
  | doubleQuotedSpreadblock
  | hyphenPairSpreadblock

This is more or less the same for tightblocks. “Tightblocks” stand for blocks containing no line breaks, like cells and embedded lists.

acell  // Same for embedded list items.
  : ... mixedDelimitedTightblock ...

  : word ( punctuationSign | delimitedTightblock | ... ) ...
  : word ( punctuationSign | delimitedSpreadblock | ... ) ...

  : xxxTightblock
  | parenthesizedTightblock
  | squareBracketsTightblock
  | doubleQuotedTightblock
  | hyphenPairTightblock

  : XXX tightblockBodyNoXxx XXX

  : ... mixedDelimitedTightblockNoXxx ...

  : word ( punctuationSign | delimitedTightblockNoXxx ) ...

  : parenthesizedTightblock 
  | squareBracketsTightblock
  | doubleQuotedTightblock
  | hyphenPairTightblock
  ; // That's all.

Thought it was over? There is another kind of block, the delimitedTightblockNoSeparator used inside the subblockAfterTilde which reflects each block inside ~x~y~z! But at this point you probably got the idea.

Yes, this makes the grammar quite verbose, but factoring it would reduce ANTLR’s ability to check for inconsistencies. Anyway, the slightest addition requires writing test cases for every logical path inside each ANTLR grammar rule.
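For readers who prefer code to grammar rules, the "NoXxx" propagation can be mimicked in a tiny hand-written recursive-descent parser. This is only a sketch of the idea, not Novelang's ANTLR grammar: the class name, the output notation, and the restriction to // and parentheses are assumptions made for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a boolean flag plays the role of the "NoXxx" rules.
final class TwinDelimiterSketch {

  private final String text;
  private int pos = 0;

  private TwinDelimiterSketch(String text) {
    this.text = text;
  }

  /** Returns a flat textual dump of the block structure. */
  static String parse(String text) {
    TwinDelimiterSketch parser = new TwinDelimiterSketch(text);
    String result = parser.body(false);
    if (parser.pos < text.length()) {
      throw new IllegalArgumentException("Unexpected input at " + parser.pos);
    }
    return result;
  }

  /** When noSolidus is true, "//" closes the enclosing block. */
  private String body(boolean noSolidus) {
    List<String> items = new ArrayList<>();
    while (pos < text.length()) {
      char c = text.charAt(pos);
      if (text.startsWith("//", pos)) {
        if (noSolidus) {
          break; // This is our caller's closing delimiter.
        }
        pos += 2;
        String inner = body(true); // Inside: "//" cannot open again.
        expect("//");
        items.add("solidus[" + inner + "]");
      } else if (c == '(') {
        pos++;
        String inner = body(false); // Subblock: "//" may open again.
        expect(")");
        items.add("paren[" + inner + "]");
      } else if (c == ')') {
        break;
      } else if (Character.isWhitespace(c)) {
        pos++;
      } else {
        items.add(word());
      }
    }
    return String.join(" ", items);
  }

  private String word() {
    int start = pos;
    while (pos < text.length()
        && !Character.isWhitespace(text.charAt(pos))
        && text.charAt(pos) != '(' && text.charAt(pos) != ')'
        && !text.startsWith("//", pos)) {
      pos++;
    }
    return text.substring(start, pos);
  }

  private void expect(String delimiter) {
    if (!text.startsWith(delimiter, pos)) {
      throw new IllegalArgumentException(
          "Expected '" + delimiter + "' at " + pos);
    }
    pos += delimiter.length();
  }

  public static void main(String[] args) {
    System.out.println(parse("// block-1 ( block-2 //block-3// ) //"));
    System.out.println(parse("block-1 // block-2 // block-3 // block-4 //"));
  }
}
```

The first call yields one solidus block containing a parenthesized block that itself contains a solidus block; the second yields two sibling solidus blocks, matching the two trees shown in "The problem". The flag is reset when entering parentheses, which is what the grammar achieves by referencing the plain (non-NoXxx) rules from inside paired delimiters.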


Novelang-0.51.1 released!

Just released Novelang-0.51.1!

Summary of changes:

  • Upgraded from FOP-0.95 to FOP-1.0. FOP is the library for generating PDF documents.
  • Various other library upgrades that shouldn’t affect normal users.

Download it from here.



Novelang-0.51.0 released!

Just released Novelang-0.51.0!

Summary of changes:

  • Fixed: list with double hyphen and number sign was using a “plus sign” everywhere (source documents and XML elements). This might break existing documents and stylesheets using this brand new feature.

Download it from here.


Novelang-0.50.2 released!

Just released Novelang-0.50.2!

Summary of changes:

  • Fixed: support paragraphs as lists (n:list-with-triple-hyphen and n:list-with-double-hyphen-and-plus-sign) inside n:paragraphs-inside-angled-bracket-pairs.

Download it from here.


Novelang-0.50.1 released!

Just released Novelang-0.50.1!

Summary of changes:

  • Minor fix on JavaShell for cleaner shutdown when there is no default JmxKit. This may only affect users of the Novelang-attirail subproject.
  • Fixed documentation generation where release notes for SNAPSHOT versions appeared for non-SNAPSHOT versions.

Download it from here.



Novelang-0.50.0 released!

Just released Novelang-0.50.0!

Summary of changes:

  • Embedded numbered lists (n:embedded-list-with-number-sign).
  • Paragraphs as numbered lists (n:list-with-double-hyphen-and-plus-sign).
  • Switched to Maven 3. This required no change but future build features may not work with formerly-used Maven 2.2.1.

Download it from here.



Novelang-0.49.0 released!

Just released Novelang-0.49.0!

Summary of changes:

  • In default stylesheet for HTML and PDF, the first n:cell-row element renders as a table header, if there is more than one. This might break existing documents.

Download it from here.


Maven cheat sheet (update)

This is an update of the previous Maven cheat sheet.


mvn -e --batch-mode clean release:prepare -Dnovelang.build.distribution -DreleaseVersion=M.m.f > build-release-prepare.log

mvn release:perform -Dnovelang.build.distribution -DreleaseVersion=M.m.f > build-release-perform.log


Novelang-0.48.0 released!

Just released Novelang-0.48.0!

Summary of changes:

  • Fixed startup option in documentation.
  • Tags and location for lines of literal. Required an intermediate n:raw-lines element nested inside n:lines-of-literal. This might break existing stylesheets.
  • Location for cell rows with vertical line.

Download it from here.



Random text generator for French

This is a cool one. As it takes random phrases from classical French literature, punctuation signs and "typographic grey" look natural.


Cheatsheet template

Wikipedia definitely offers the right template for a cheat sheet. Short and easy to render with Novelang.


Novelang-0.47.0 released!

Just released Novelang-0.47.0!

Summary of changes:

  • Feature removal: relative identifier. Never used, and would make multipage rendering much more complicated.
  • Small enhancements to Novelang-attirail, the reusable library.

Download it from here.



Technical study: multi-page HTML rendering

What we need

How hard would it be to render a single Novelang document over multiple HTML pages? Better ask: how cool would that be? Think about the Novelang documentation taking one single huge page. This makes non-linear reading quite uncomfortable. Of course, multi-page rendering should work for both batch and interactive mode.

Technical implications

For batch rendering, there can be a simple approach. Xalan (XSLT rendering engine) offers the redirect extension for redirecting output into a given file.

<xsl:template match="/doc/foo">
  <redirect:write select="@file">
    <xsl:apply-templates />
  </redirect:write>
</xsl:template>

Unfortunately, this is not suitable for interactive rendering. For interactive rendering, the rendering process must know both:

  • The requested page, through a URL aware of the page (as sub-part of the whole document).
  • The whole document, because we may need to render links to other chapters or whatever.

The same need arises for batch rendering but with Xalan’s Redirect extension mentioned above, the whole logic gets buried inside the XSLT (which probably makes it quite complex).

Obviously, we need a Renderer to work the same way for batch and interactive rendering, i.e. there should be no special handling of interactive or batch rendering in the XSL stylesheet. (But multi-page rendering would require a special stylesheet anyway, at least for generating navigation.)

While XSLT-based rendering is the most common case in Novelang, it’s better to think about the general contract of a org.novelang.rendering.Renderer. As it already does, the Renderer should spit bytes into a java.io.OutputStream with no knowledge of whether it is a file or a socket. The job of creating the output (which means choosing a file name in the case of batch rendering) is left to some upstream object opening the OutputStream. Currently, this is done by org.novelang.batch.DocumentGenerator or org.novelang.daemon.DocumentHandler, which both end up calling DocumentProducer, passing it the OutputStream.

So the rendering stage needs additional logic. Interactive rendering implies extracting the requested page from the URL. Batch rendering implies finding the list of pages in order to create the corresponding files on the filesystem.

New Renderer contract

There can’t be a unique way to split a document into pages, so we have new responsibilities for our Renderer:

  • Given a document tree (as a org.novelang.common.SyntacticTree) it calculates a list of page identifiers.
  • Given the same document tree, plus a page identifier, it renders the corresponding page to an OutputStream.

With something like a single empty page identifier, we should get the same single-page rendering as we have now.
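Expressed as Java, the contract could look like the sketch below. It is not the actual org.novelang.rendering.Renderer: the interface name, method names and the generic tree parameter are hypothetical, and the toy implementation treats a list of chapter texts as the "document tree" just to exercise both responsibilities.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the two-step contract described above.
final class MultipageRendererSketch {

  /** T stands for the document tree type. */
  interface Renderer<T> {
    List<String> pageIdentifiers(T documentTree);

    void render(T documentTree, String pageIdentifier, OutputStream out)
        throws IOException;
  }

  /** Toy renderer: one page per chapter, "" renders the whole document. */
  static final class ChapterRenderer implements Renderer<List<String>> {
    @Override
    public List<String> pageIdentifiers(List<String> chapters) {
      List<String> identifiers = new ArrayList<>();
      for (int i = 1; i <= chapters.size(); i++) {
        identifiers.add("page" + i);
      }
      return identifiers;
    }

    @Override
    public void render(List<String> chapters, String pageIdentifier,
        OutputStream out) throws IOException {
      String text;
      if (pageIdentifier.isEmpty()) {
        text = String.join("\n", chapters); // Single-page rendering.
      } else {
        int index = pageIdentifiers(chapters).indexOf(pageIdentifier);
        if (index < 0) {
          throw new IllegalArgumentException("Unknown page: " + pageIdentifier);
        }
        text = chapters.get(index);
      }
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
  }

  /** The caller owns the stream; the renderer never knows what it is. */
  static <T> String renderToString(Renderer<T> renderer, T tree, String page) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      renderer.render(tree, page, out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    List<String> document = Arrays.asList("One", "Two");
    ChapterRenderer renderer = new ChapterRenderer();
    for (String page : renderer.pageIdentifiers(document)) {
      System.out.println(page + ": " + renderToString(renderer, document, page));
    }
  }
}
```

Batch mode would iterate over pageIdentifiers and open one file per page, while the HTTP daemon would call render once with the identifier extracted from the URL; in both cases the Renderer itself stays unaware of the difference.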

For an XSLT-based Renderer, we should embed page identifier generation in the same XSL stylesheet (as a part of the already-discussed stylesheet metadata):

<xsl:stylesheet [namespaces blah blah] >

      Some XSL transformations here,
      starting from  element. 


Before page rendering occurs, Novelang asks the Renderer for page identifiers. The default XSL-based Renderer applies the content of the element (if there is one) as a stylesheet on the whole document tree. Then it obtains a list of page identifiers as follows:

  <page name="Home" >/n:opus</page>
  <page name="ChapterOne" >/n:opus/n:level[1]</page>
  <page name="ChapterTwo" >/n:opus/n:level[2]</page>

Of course each page name is unique. In order to achieve this with no tweak, the document tree may embed unique identifiers by extending the semantics of n:implicit-identifier or by adding a new n:unique-identifier element. Node paths seem easy to generate.

Now for each page, Novelang creates the corresponding file out of the page name. If the embedded stylesheet chose filesystem-friendly names, those will be used verbatim (otherwise we may apply some variant of URL encoding). And, for each page, Novelang calls the Renderer with the whole document tree again, and passes additional metadata elements to tell the Renderer which page it is rendering. Input XML looks like this:


This should be enough for the Renderer to figure out how to render only the page of interest. It might need to peek elsewhere in the document tree (like for a footer with a copyright notice, or to find other chapter names for a navigation bar).

Mix with other features (present or future)

There is an additional role for node identifiers: they might help to “enhance” internal links by adding the prefix corresponding to the target page. (The internal link feature is still in its inception phase. It just seems easier to implement it right after multipage rendering.)

Unique page names

Novelang’s Fragment Identifier is the perfect candidate to generate page identifiers. Unfortunately, composite identifiers contain the \ character. Should we escape it, or mix it with some weird pseudo-directory feature? But maybe it’s time to remove relative identifiers, which never proved useful and don’t guarantee identifier uniqueness anyway.

It’s easy to create a new element by adding a simple counter to a colliding identifier. The value for some given document fragment may change across several generations, when adding fragments with colliding identifiers. This won’t be a problem because internal links (links defined by the document itself) prohibit the use of unique identifiers. Remember: unique identifiers are only for pure HTML links.

If there is a chance that a foreign HTML document links to the HTML anchor defined by the unique identifier (in a pure WWWW – World Wide Web Way) then the document author should use explicit identifiers.

New URL scheme

With single-page rendering, the rendered document has the same name as the source document (with the difference of the extension). Multi-page adds a new “dimension”. Because the name of the page may collide with another document’s name, the name of the originating document prefixes the page name. Let’s look at different options:


Let’s see which character we could use (only checked on Mac OS X, to do: check on Windows):

Character  Escaped?  Comments
~          No        Already used for Novelang meta pages.
-          No        Already used for Novelang identifiers.
^          No        Meaningless in that context.
#          No        Fragment in URL.
,          No        Hard to distinguish from full stop . character.
!          No        Hard to read.
_          No        Too common in file names.
+          No        Used in URL encoding. Usage unrelated to “plus” meaning.
%          No        Used in URL encoding.
=          Yes       Used in URL encoding. Usage unrelated to “equality” meaning.
$          Yes       Overused.
;          Yes       Hard to read.
|          Yes       Hard to read.
'          Yes       Hard to read.
&          Yes       Already used for URL parameters.
?          Yes       Already used for URL parameters, DOS wildcard.
@          Yes       Inverted meaning if page name appears second.
{          Yes       Weird because unpaired. Meaningful otherwise.
§          Yes       Mac OS X console doesn’t like it.
:          -         Path separator on Unix.

The “Escaped?” column tells whether the character requires escaping on the Mac OS X console.

Finally, it turns out that -- looks the best, especially with a variable-width font like in Mac OS X Finder or Windows Explorer.

Special case: if the page identifier was blank, the page separator doesn’t appear so we would still have:


This naming scheme also implies that all pages appear flatly in the same directory. This should help when resolving resource names.
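The naming rule fits in a few lines of Java. This is a sketch under assumptions: the method name, the sample document and page names, and the placement of the extension after the page name are all hypothetical; only the -- separator and the blank-identifier special case come from the text above.

```java
// Hypothetical sketch of the flat "document--page" naming scheme.
final class PageFileNameSketch {

  static final String PAGE_SEPARATOR = "--";

  /** Builds the file name for one page of a rendered document. */
  static String fileName(
      String documentName, String pageIdentifier, String extension) {
    boolean blank = pageIdentifier == null || pageIdentifier.isEmpty();
    // A blank page identifier means single-page rendering: no separator.
    String page = blank ? "" : PAGE_SEPARATOR + pageIdentifier;
    return documentName + page + "." + extension;
  }

  public static void main(String[] args) {
    // prints "mydocument--ChapterOne.html"
    System.out.println(fileName("mydocument", "ChapterOne", "html"));
    // prints "mydocument.html"
    System.out.println(fileName("mydocument", "", "html"));
  }
}
```

Because the document name always comes first, all pages of a document sort together in a directory listing.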


Novelang-0.46.1 released!

Just released Novelang-0.46.1!

Summary of changes:

  • Added source packaging for Novelang-attirail subproject.

Download it from here.



Novelang-0.46.0 released!

Just released Novelang-0.46.0!

Summary of changes:

New experimental features for code reuse:

  • Novelang-attirail subproject aggregating various tools. It’s not part of the standard distribution; for now it requires a separate build.
  • Pluggable logging implementation.
  • Java code all under org.novelang package (was novelang).

Download it from here.



Novelang-0.45.0 released!

Just released Novelang-0.45.0!

Summary of changes:

  • Added Greek and Polish characters to the grammar.

Download it from here.



Novelang-0.44.5 released!

Just released Novelang-0.44.5!

Summary of changes:

  • Fixed release notes generation.

Download it from here.


Novelang-0.44.4 released!

Just released Novelang-0.44.4!

Summary of changes:

  • Fixed a few references to old "Part" and "Book" terms, and file suffixes as well.

Download it from here.


Script for renaming to new extensions

Here is a Bash script (tested on Mac OS X) that renames every .nlp into .novella and .nlb into .opus. It also changes file content. Use with care.


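# Assumed definition, reconstructed from the description above.
SED="-e s/\.nlp/.novella/g -e s/\.nlb/.opus/g"
for file in `find src modules \( -name '*.nlp' -o -name '*.nlb' \) `
do
  newfile=` echo "$file" | sed $SED `
  echo "$file -> $newfile"
  sed $SED < "$file" > "$newfile"
  rm "$file"
done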


Novelang-0.44.3 released!

Just released Novelang-0.44.3!

Summary of changes:

  • Fixed Nhovestone report generation.

Download it from here.



Maven cheat sheet (0.44.2)

There is an updated version of this post.

This is a list of useful Maven commands. They work with Novelang-0.44.2. Later versions will probably make some of them less verbose, using some default parameters.

Convention: the Novelang/$ represents the command prompt, with working directory being Novelang’s home directory. Subdirectories appear when needed.

Plugin versions

Stay up-to-date by listing more recent plugins (there is another goal for dependencies):

Novelang/$ mvn versions:display-plugin-updates

Show dependency tree:

Novelang/$ mvn dependency:tree

Feed local repository with fresh artifacts

Novelang/$ mvn clean install 

Force child modules version

Force the version of every child module to the one of the parent:

Novelang/$ mvn -N versions:update-child-modules

Performing a release (may be specific to Novelang-0.44.2)

First, clean previous POM backup files:

Novelang/$ mvn release:clean

Then prepare the release. This does the following:

  • Check VCS state. Includes: no uncommitted file; remote repository sync’ed with local.
  • Change the POM versions to release version (shown as M.m.f in the snippet below).
  • Run the build, using declared .
  • Commit changed POMs to local SCM.
  • Tag the SCM locally.
  • Push the changes to the remote repository, including tags (failing on a conflict).
  • Revert SCM versions to development version.

Novelang/$ mvn -e --batch-mode release:prepare -Drelease=false -DlocalCheckout=true -DreleaseVersion=M.m.f -DdevelopmentVersion=SNAPSHOT -Dtag=release-M.m.f > build-release-prepare.log

This part is likely to fail. If something goes wrong:

  • Reset git in the --hard way, to the version immediately before Maven’s changes.
  • Delete the release-M.m.f tag from the local git repository.
  • Delete release-M.m.f tag on remote git repository: git push -v github :refs/tags/release-M.m.f
  • Force synchronization between local git repository and remote one. This may be done by committing an innocuous change, then pushing it with --force option (better idea, anyone?).
  • Call again: Novelang/$ mvn release:clean
  • Make sure that everything is OK with gitk.

It might be useful to reset all POM versions (like after some POM or branch hacking): set the root pom.xml version to SNAPSHOT and run mvn versions:update-child-modules.

Once this is done, our git repositories contain good, tagged stuff. Last step is to perform the final build.

Novelang/$ mvn release:perform > build-release-perform.log

(There is no additional parameter to pass; release:prepare created some POM copies with relevant information.)

The release:perform goal performs a fresh checkout in Novelang/target/checkout where every pom.xml contains the expected M.m.f version. The build calls deploy:deploy on Novelang-documentation and Novelang-distribution, which upload relevant files to SourceForge and send email notifications.

Useful links

Using master password.

Mini-guide about Maven release plugin.


Resume from a given module folder instead of restarting the build from the beginning:

Novelang/$ mvn reactor:resume -Dfrom=bar 

Novelang-0.44.2 released!

Just released Novelang-0.44.2!

Summary of changes:

  • Another fix for a build problem. Now the deploy:deploy goal should work properly when called from release:perform.

Download it from here.


Novelang-0.44.1 released!

Just released Novelang-0.44.1!

Summary of changes:

  • Fixed build problem when deploying files and sending announcements.

Download it from here.



Novelang-0.44.0 released!

Just released Novelang-0.44.0!

Summary of changes:

  • Renamed Part into Novella and Book into Opus. Nicer, clearer. New recommended file suffixes are .novella and .opus. Old .nlp and .nlb suffixes still supported.
  • Switched build system from Ant to Maven. This should be transparent for users.

Download it from here.



Syntax highlighter for HTML

The SyntaxHighlighter project looks nice. It has a "copy to clipboard" feature (implemented in Flash). With some additional hacking, this would save from keeping Novelang's nasty zero-width spaces added for correct line wrapping.


Zipper for faster tree modifications

Novelang uses immutable trees to represent a document and transform it. While immutable data structures have well-known advantages, Novelang's tree library requires updating every parent node on each change to any child node. Clever guys found how to save those updates when performing multiple local modifications. This relies on a tree structure called Zipper. Here is a very clear explanation, the original paper (site currently down), the Scala implementation and the Clojure one.


Google's font directory

This is an amazing initiative from Google: a directory for Web fonts. Fonts are available under SIL Open Font License 1.1. There are some beautiful fonts available, with a nice and clear browsing interface. One can download font sources from here.


Novelang-0.43.0 released!

Download Novelang-0.43.0 here!
  • Added nohead option to insert command.
  • Fixed some bugs around identifiers.
  • Introduced detection of colliding explicit identifiers. This has no useful purpose for now but will serve as a basis for implementing internal links.
  • Small performance enhancement on HTML document rendering in a Web browser: don’t use JavaScript to set collapsible descriptors hidden.


MinorThird's Mixup

Could this be useful in Novelang? The Mixup language performs complex queries on pure text. It's "like a regex query, but while regex operates at character level, Mixup operates at token level." Mixup is part of the MinorThird suite and available under BSD license.


Novelang-0.42.0 released!

Download Novelang-0.42.0 here!
  • Now requires Java 6.
  • New Nhovestone report: Novelang has its own benchmark!
  • Added stylesheet html-FR.xsl for French punctuation.
  • Performance enhancement on rendered HTML page: when containing many tags it should load faster. Instead of dynamically computing styles on the Web browser, HTML rendered by the server directly includes those styles.
  • Various performance enhancements on document generation. With the same amount of memory (-Xmx parameter), Novelang handles documents twice as big and serves them 20 % faster than the previous version. The benchmark ran against versions 0.41.0 and 0.38.1. This includes buffered reading of Part files, multithreaded Part rendering, and reduced memory consumption when dealing with AST (Abstract Syntax Tree).



“Nhovestone” is the name of Novelang’s dedicated benchmark tool, and also a geeky pun.

Nhovestone aims to highlight performance variations across versions using only a few (carefully selected) measurements:

  • How does response time evolve when increasing the number of documents aggregated in a single Book?
  • How does response time evolve when increasing the size of one single document?

Nhovestone doesn’t try to generate an absolute performance index. This is because such an index makes sense only when computed from always the same source documents and the same hardware.

How it works

Nhovestone focuses on HTML generation using the default stylesheet, because HTML is great for fast edit-and-review roundtrips. It uses the Novelist to generate pseudo-random text with a realistic structure. For each benchmarked Novelang version, Nhovestone starts a JVM with a small amount of memory (currently -Xmx32M). With little memory the breaking point appears sooner. Nhovestone increases the size of the source document(s) in a linear fashion and, after each increase, measures how long a call to a Novelang instance takes.

Performance degradation

Response time starts to increase exponentially as the document becomes fairly big relative to available memory. This triggers a lot of CPU-intensive garbage collection, which consumes a lot of time. Nhovestone detects that a running Novelang HTTP daemon gets “strained” when response time gets above a dynamically-computed threshold. The threshold comes from the straight line drawn from a linear regression on the first half of the measurements, with a slope made steeper by a fixed coefficient. When a response time appears above this straight line, the Novelang HTTP daemon is considered strained and further measurements are not worthwhile.
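The strain detection can be sketched as follows. This is not Nhovestone's actual code: the class and method names are hypothetical, the measurements are assumed to be taken at regular steps (so the index serves as the x value), the slope is assumed positive, and the whole series is analyzed after the fact instead of after each call.

```java
// Hypothetical sketch: least-squares line on the first half of the
// measurements, slope multiplied by a fixed coefficient, then look for
// the first later measurement above that line.
final class StrainDetectionSketch {

  /** Returns the index of the first strained measurement, or -1. */
  static int firstStrainedIndex(double[] responseTimes, double coefficient) {
    int half = responseTimes.length / 2;
    if (half < 2) {
      return -1; // Not enough points for a regression.
    }
    double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
    for (int i = 0; i < half; i++) {
      sumX += i;
      sumY += responseTimes[i];
      sumXY += i * responseTimes[i];
      sumXX += (double) i * i;
    }
    // Ordinary least squares for y = intercept + slope * x.
    double slope = (half * sumXY - sumX * sumY) / (half * sumXX - sumX * sumX);
    double intercept = (sumY - slope * sumX) / half;
    double steeperSlope = slope * coefficient;

    for (int i = half; i < responseTimes.length; i++) {
      if (responseTimes[i] > intercept + steeperSlope * i) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    double[] times = { 10, 12, 14, 16, 18, 20, 22, 60 };
    System.out.println(firstStrainedIndex(times, 1.5)); // prints 7
  }
}
```

In the sample, the first four points give a line of slope 2 and intercept 10; with a coefficient of 1.5 the threshold at index 7 is 31, so the measurement of 60 is flagged while the earlier, linear ones are not.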

Adding Parts

This is the first scenario: for each new measurement, there is an additional Part file. All Parts are more or less equal in size and complexity (including level depth). The graph below shows that performance degradation stays linear until the 300th call. Then, version 0.41.0 starts suffering before older versions. It’s likely that new features require additional memory so starvation occurs sooner.

Increasing the size of the same Part

This is the second scenario: the generated document comes from a single Part file of a size increasing before each call. Each fragment added to the Part has the same size and structure as in the previous test, but all 3 versions show fatigue much sooner (at least 7.5 times). This shows that creating a Part takes much more temporary memory than the finished Part itself.


These figures are strongly connected to the volume and the structure of the underlying document. Experience shows that small increments generate more measurements (before the fatal strain) and therefore show a more readable trend. They also reduce measurement artefacts that could fool strain detection.

Report generation

JFreeChart generates those graphs. JFreeChart is probably the best charting library for Java at this time, at least on the OSS marketplace. It is stable and highly configurable.

The next step: embed those graphs in a Novelang-generated PDF and publish it as a complement to the existing documentation.


The Novelist: random text generation

Novelang already does all the typesetting for you. What’s next? Writing text, of course! That is the job of the just-started Novelist subproject, which aims to generate big documents for testing Novelang under heavy load.

Based on French metrics, random text looks like this:

Uomuecto eaufues xuner ig ocanerr, ebanu otpaa. Uuse, on eian aibtd, rttaintlufe elvettarrh, yrn enemlcmlun, ebcazepuer madscg, êiiovemtt teeost eseeerde? Fetn eearréetcs emrseoss icia ntmvesrud. Aoasro cênit ctainetda aèugedet css eali, unero aaie eneoden, nrortio. Oovlod; tfsmenco méttsna, eesdis uoeaeanao rcuent, desungtt av au oneerao, dxuaste umeinétniu lccdeiilne rliùearde veyiritisac yàslu. Iinmseuo odiapqied cmiiapearlo ebnjtus uauueis, libginmasa edrc emaèi sllieyr sode!

It is based on a simplistic distribution algorithm. Word count and letter count are drawn from a uniform distribution in a pre-defined range (something like 5-20 for words and 2-12 for letters). Letters come from a frequency table giving the percentage of appearance for each letter.
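Here is a minimal Python sketch of that algorithm. It is not the Novelist's code. The frequency table is truncated and its weights are illustrative, not measured French metrics. It also assumes that "word count" means words per sentence.

```python
import random

# Illustrative, truncated frequency table (percentages); an assumption,
# not the table used by the Novelist.
LETTER_FREQUENCIES = {
    "e": 14.7, "s": 7.9, "a": 7.6, "i": 7.5, "t": 7.2,
    "n": 7.1, "r": 6.6, "u": 6.3, "l": 5.5, "o": 5.4,
}


def random_word(rng, min_letters=2, max_letters=12):
    """Pick a uniform length, then draw each letter by its frequency."""
    length = rng.randint(min_letters, max_letters)  # bounds are inclusive
    letters = rng.choices(
        list(LETTER_FREQUENCIES),
        weights=list(LETTER_FREQUENCIES.values()),
        k=length,
    )
    return "".join(letters)


def random_sentence(rng, min_words=5, max_words=20):
    """Pick a uniform word count, then join random words into a sentence."""
    count = rng.randint(min_words, max_words)
    words = [random_word(rng) for _ in range(count)]
    return " ".join(words).capitalize() + "."
```

Passing a seeded generator, as in `random_sentence(random.Random(42))`, makes the output reproducible, which is convenient when the same document must be regenerated for several benchmarked versions.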

While the result doesn’t look much like real text, it’s good enough to stress basic parsing and typesetting.

There has been a lot of research about text analysis, first for cryptography, next for natural language analysis and Web crawling. Among all of them, there is a nifty one: n-grams, which describe all the different letter sequences of a fixed length in a given text. The demo on Wolfram Alpha is gorgeous. It shows how fast combinations grow: a simple sentence like “ceramics come from” contains 69 3-grams. Google’s n-grams database (ranging from 1-grams to 5-grams) weighs 24 GiB gzipped and contains nearly 1 billion 3-grams. Amazingly, this number doesn’t increase so much for 4-grams and 5-grams.
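To make the idea concrete, here is a small Python sketch that counts letter n-grams with a sliding window. It is an illustration only. Tools differ in how they treat spaces, case and word boundaries, so its counts will not necessarily match those of other implementations.

```python
from collections import Counter


def letter_ngrams(text, n):
    """Count every run of n consecutive characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))
```

For example, `letter_ngrams("banana", 3)` counts "ana" twice, and "ban" and "nan" once each.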


Novelang-0.41.1 released!

Download Novelang-0.41.1 here !
  • Fixed bug with Promoted Tags, not detected under some circumstances.


Novelang-0.41.0 released!

Download Novelang-0.41.0 here !
  • New feature: Promoted Tags. Implicit Tags matching Explicit Tags become Promoted Tags.
  • Support lines of literal inside paragraphs inside angled bracket pairs.
  • Minor enhancements on HTML default stylesheet.


Novelang-0.40.1 released!

Download Novelang-0.40.1 here !
  • Fixed display bug on generated documentation.

Novelang-0.40.0 released!

Download Novelang-0.40.0 here !
  • Brand new stylesheet for HTML.


HTML default stylesheet improvements

A new default HTML stylesheet will be available soon. It should improve Novelang usability a lot. Key features are:

  • A better look.
  • Scaling up with metadata-oriented features.

If a picture is worth a thousand words:

Fluid layout

New layout supports horizontal resize. The column for rendered text may span from 500 to 1000 pixels.

Lines of literal (<pre> tag) wrap if they are too long. Wrapping only occurs with the white-space : pre-line style, which discards indentation by default. To prevent this, some JavaScript replaces every space character inside a <pre> with a non-breakable space immediately followed by a zero-width space. This produces clean-looking wrapping, but text copied to the clipboard contains unwanted characters.
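The substitution itself is simple. The stylesheet does it in JavaScript on the page; the Python sketch below only illustrates the string-level transformation. The helper that cleans up copied text is hypothetical and not part of Novelang.

```python
NBSP = "\u00a0"  # non-breaking space: keeps indentation visible
ZWSP = "\u200b"  # zero-width space: gives the browser a place to wrap


def make_wrappable(pre_text):
    """Replace each space by a non-breaking space plus a zero-width space."""
    return pre_text.replace(" ", NBSP + ZWSP)


def clean_clipboard(copied_text):
    """Hypothetical helper: restore plain spaces in text copied from the page."""
    return copied_text.replace(NBSP + ZWSP, " ")
```

Applying `clean_clipboard` to the result of `make_wrappable` gives back the original text, as long as the original did not already contain that two-character sequence.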

Overall look

Titles are indented. This is a compromise with the Descriptor feature (described later).

Line spacing is constant, even between two paragraphs, or between a paragraph and an embedded list. There is a slight loss of information (it may be hard to see where a paragraph begins) but it increases readability overall.


Font choice has a huge impact on overall look. Choosing fonts is hard, because font rendering is hardly ever the same across Web browsers. Font readability also changes a lot, depending on line spacing, contrast, and other fonts around.

The convention is: serif font for the rendered document, sans-serif for extra information like actions and tags. Literal (<pre> and <code>) shows with a fixed-width font, which is serif, too.

After experimenting with a lot of combinations, finally, the winners are:

—  Palatino Linotype for the rendered document. This gorgeous font is a bit more readable than Times New Roman. It’s available on all platforms, and looks gorgeous with appropriate contrast (dark grey over light grey instead of black over white).

—  Lucida Grande with Tahoma as second choice. Lucida Grande is highly readable (was chosen as default for Mac OS X), but sophisticated enough not to look “poor” beside Palatino.

—  Courier New is not new at all, but it blends harmoniously with text in Palatino. Those fonts display much better on Mac OS X, or with Safari on Windows XP.


Descriptors appeared in Novelang-0.39.0, as an experimental feature. They now display with a nice fade and animation, in order to preserve the user’s visual landmarks. Descriptors have a vertical bar that helps to see the scope of the Descriptor. This vertical bar only shows when the Descriptor is disclosed.

Descriptor disclosers now appear close to the Tag column. This avoids polluting the left margin.

Scalable lists for metadata

On big documents, there can be so many tags that they don’t fit in the height of a Web browser’s window. But most of the time, they all fit, so it’s convenient to have all of them at a fixed position. How to deal with the exception without hurting the common case? A second scrollbar inside a browser’s frame looks confusing. But the scrollbar has a great feature: it shows that some items are out of sight. One trick could be displaying a huge popup, but this probably means a lot of work for a poor result.

Finally, the solution comes with a fade to grey at the end of the list to show that not all items are visible. A tiny button “unpins” the tag list from the top of the Web browser’s window and lets it go to the document’s beginning. So, when entering the “1 % case” we still have a standard behavior.

Here is the Tag tab in its default pinned state (note the fade at the bottom and scrollbar position):

Unpinning causes it to scroll with the rest of the document:

In addition to Tags, there will be, in a (hopefully near) future, more metadata like Identifiers. Tags show up under a tab bar where it’s easy to add new tabs.


Novelang-0.39.2 released!

Download Novelang-0.39.2 here !
  • Reject diacritics in tags.
  • Filter on implicit tags as on explicit tags.


Novelang-0.39.1 released!

Download Novelang-0.39.1 here !
  • Fixed documentation generation.

Novelang-0.39.0 released!

Download Novelang-0.39.0 here !
  • Experimental feature: descriptors in default HTML stylesheet.
  • Support Unicode files starting with a BOM (Byte Order Mark).
  • Reject TAB character.
  • Display Unicode name of a rejected character (as long as it belongs to the set of 16 bit Unicode characters as defined in Unicode 5.2 standard). This feature is experimental and may wreck existing error messages.