Just released Novelang-0.53.4!
Summary of changes:
- Fixed logging configuration with --log-dir option.
Download it from here.
Enjoy!
Just released Novelang-0.53.3!
Summary of changes:
Download it from here.
Enjoy!
Since version 0.53.0 Novelang HTML documentation displays correctly on following browsers:
The website also passed W3C validation for HTML 4.0 Transitional.
Three Column Stretch: the stretch isn’t that good, it makes lines of main text too long.
Piefecta – A superb 3-col tableless layout – long right col looks good on Safari and zooms well. It claims to deal with various browser bugs. Probably the finest piece of engineering, but too many fixes make it unreadable. We don’t care about supporting IE6. License: unknown.
CSS Fixed Layout #3.1 (Fixed-Fixed-Fixed) zooms correctly. Much simpler. License: unknown.
Elastic-fluid hybrid got it right. It scales up and down in width, staying within fixed limits. Read the author’s comments.
Also check this tutorial about elastic layout.
The Holy Grail 3 column Liquid Layout has great explanations. (Only 1 error on W3C validator but the element reported to be missing appears as it should – validator bug?) License: free to use, linkback appreciated. After a close look, it turned out that forcing min-width and max-width for the column (worst case: nesting another div) and using proper text alignment does the job.
Just released Novelang-0.53.2!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.53.1!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.53.0!
Summary of changes:
\\ when rendering (still required in document sources).
Download it from here.
Enjoy!
Novelang comes with a Numbering class which formats an integer value in words. This adds a bit of magic when the stylesheet writes "Chapter forty-two" from a stupid counter.
Currently the Numbering class only supports French and English, and values from 0 to 50 (all values are hardcoded). The ICU project offers the RuleBasedNumberFormat which supports rule-based formatting. This makes it easy to support much greater ranges.
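To give an idea of what "rule-based" buys us compared to a hardcoded table, here is a minimal Python sketch for English only. It is neither the Numbering class nor ICU's RuleBasedNumberFormat; the names `UNITS`, `TENS` and `in_words` are made up for the illustration, and the 0 to 999 range is an arbitrary choice.

```python
# Hypothetical sketch: three small rules replace a table of hardcoded values.
UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = {
    2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
    6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety",
}


def in_words(value: int) -> str:
    """Format an integer between 0 and 999 in English words."""
    if not 0 <= value <= 999:
        raise ValueError("only 0..999 is supported by this sketch")
    if value < 20:
        # Rule 1: small values are irregular, so they stay in a table.
        return UNITS[value]
    if value < 100:
        # Rule 2: tens, optionally followed by a hyphen and a unit.
        tens, unit = divmod(value, 10)
        return TENS[tens] if unit == 0 else f"{TENS[tens]}-{UNITS[unit]}"
    # Rule 3: hundreds, then recurse on the remainder.
    hundreds, rest = divmod(value, 100)
    prefix = f"{UNITS[hundreds]} hundred"
    return prefix if rest == 0 else f"{prefix} {in_words(rest)}"


print("Chapter " + in_words(42))
```

Supporting another language means another set of rules, which is precisely the part a rule-based formatter takes care of.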
Here is how an XSL would render a multipage document.
First, let’s consider the whole document defining the opus:
== One
Some text of level one.
== Two
Some text of level two.
=== Two-one
Some text of level two-one.
The XML form of the document above is:
<?xml version="1.0" encoding="UTF-8" ?>
<opus>
<level>
<title>One</title>
<paragraph>Some text of level one.</paragraph>
</level>
<level>
<title>Two</title>
<paragraph>Some text of level two.</paragraph>
<level>
<title>Two-one</title>
<paragraph>Some text of level two-one.</paragraph>
</level>
</level>
</opus>
Let’s take for granted that Novelang supports XSL metadata. Our multipage-enabled stylesheet would define an embedded stylesheet that transforms a whole opus into a simple map of page names and page paths. A path is whatever the stylesheet may reprocess, but an XPath expression is quite good. For the document above, here is what our map could look like, if we want to support 2 levels:
page1 -> /opus/level[1]
page2 -> /opus/level[2]
page3 -> /opus/level[2]/level[1]
Please note that, at this point, the decision to support a given depth, or to exclude some tagged levels, belongs entirely to the page-extracting stylesheet.
By merging the page map with the opus, we get the XML input for the rendering of one page. Novelang knows which page it is either because it is iterating over all known pages of the map (batch mode), or because the page name is a part of the request issued to the HTTP dæmon.
<opus>
<meta>
<page>
<name>page2</name>
<path>/opus/level[2]</path>
</page>
</meta>
<level>
<title>One</title>
<paragraph>Some text of level one.</paragraph>
</level>
<level>
<title>Two</title>
<paragraph>Some text of level two.</paragraph>
<level>
<title>Two-one</title>
<paragraph>Some text of level two-one.</paragraph>
</level>
</level>
</opus>
(Note: the n: namespace prefix doesn’t appear here for brevity.)
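The two steps described above (extracting a page map, then merging one of its entries into the opus) can be sketched in a few lines of Python with the standard ElementTree module. This is only an illustration of the data flow, not the way Novelang does it: `page_map` and `with_page_meta` are hypothetical names, and the page1, page2... naming simply mimics the map shown earlier.

```python
import copy
import xml.etree.ElementTree as ET

OPUS = """<opus>
  <level><title>One</title><paragraph>Some text of level one.</paragraph></level>
  <level><title>Two</title><paragraph>Some text of level two.</paragraph>
    <level><title>Two-one</title><paragraph>Some text of level two-one.</paragraph></level>
  </level>
</opus>"""


def page_map(opus, max_depth=2):
    """Return (name, path) pairs for every level down to max_depth."""
    pages = []

    def visit(element, path, depth):
        for index, level in enumerate(element.findall("level"), start=1):
            level_path = f"{path}/level[{index}]"
            pages.append((f"page{len(pages) + 1}", level_path))
            if depth < max_depth:
                visit(level, level_path, depth + 1)

    visit(opus, "/opus", 1)
    return pages


def with_page_meta(opus, name, path):
    """Return a copy of the opus with a meta/page element as first child."""
    merged = copy.deepcopy(opus)
    meta = ET.Element("meta")
    page = ET.SubElement(meta, "page")
    ET.SubElement(page, "name").text = name
    ET.SubElement(page, "path").text = path
    merged.insert(0, meta)
    return merged


opus = ET.fromstring(OPUS)
pages = page_map(opus)
merged = with_page_meta(opus, *pages[1])
print(ET.tostring(merged, encoding="unicode"))
```

The depth limit lives in `page_map` only, which matches the idea that the page-extracting step alone decides which levels become pages.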
The stylesheet gets this whole document as input for every page. All that changes is the (name, path) pair in the meta/page element. The stylesheet needs to know which page it is rendering, and the whole document tree as well, in order to create a navigation bar or any kind of header or footer corresponding to a specially-titled or tagged level of the document.
This involves some XSL trickery: evaluating an XPath expression at runtime. While it’s not part of the XPath 1.0 specification, it is part of the semi-official EXSLT community initiative. The dyn:evaluate function (http://www.exslt.org/dyn/functions/evaluate) does that for us. It works well with Xalan-2.7.1, which is the XSLT engine bundled with Novelang (it works slightly better than the JDK’s one).
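For readers less familiar with XSLT, the same idea (a path stored as text, evaluated against the tree at runtime, then used to highlight the current page) looks like this in Python. It is a rough analogy under stated assumptions: ElementTree only supports a subset of XPath, so the hypothetical `evaluate` helper turns the absolute path into a relative one first, and object identity plays the role of generate-id.

```python
import xml.etree.ElementTree as ET

OPUS = """<opus>
  <level><title>One</title></level>
  <level><title>Two</title>
    <level><title>Two-one</title></level>
  </level>
</opus>"""


def evaluate(root, path):
    """Resolve an absolute path like /opus/level[2] against the root element."""
    prefix = "/" + root.tag
    if path != prefix and not path.startswith(prefix + "/"):
        raise ValueError(f"path does not start at {prefix}: {path}")
    # ElementTree wants paths relative to the element it searches from.
    return root.find("." + path[len(prefix):])


def navigation(root, page_path):
    """List titles over two levels, marking the current page with asterisks."""
    current = evaluate(root, page_path)
    titles = []
    for level in root.findall("level"):
        for candidate in [level] + level.findall("level"):
            title = candidate.find("title").text
            titles.append(f"*{title}*" if candidate is current else title)
    return titles


opus = ET.fromstring(OPUS)
print(navigation(opus, "/opus/level[2]"))
```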
In the stylesheet below, we save useful expressions into variables.
The root template prints those variables, then a pseudo-navigation bar made of nested lists.
The nested loop for iterating over level elements is rather ugly but it makes sense as we don’t want infinite depth of titles in a navigation bar.
The title-with-locator template just adds bold on the title in the navigation bar that corresponds to current page.
All other templates mimic Novelang’s standard rendering.
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dyn="http://exslt.org/dynamic"
extension-element-prefixes="dyn"
>
<!-- Be sure to use Xalan-2.7.1 (not JDK's default). -->
<!--
Here, expect a meta section, embedding a stylesheet
that extracts the pages we'll find in the meta section
of input document.
-->
<xsl:output method="html" />
<xsl:variable name="page-name" select="/opus/meta/page/name" />
<xsl:variable name="page-path" select="/opus/meta/page/path" />
<xsl:variable name="page-nodeset"
select="dyn:evaluate( $page-path )" />
<xsl:variable name="page-id"
select="generate-id( $page-nodeset )" />
<xsl:template match="meta/page" >
$page-name=<xsl:value-of select="$page-name" />
$page-path=<xsl:value-of select="$page-path" />
$page-id=<xsl:value-of select="$page-id" />
</xsl:template>
<xsl:template match="/opus" >
<html>
<xsl:apply-templates select="meta" />
<!-- Navigation bar -->
<ul>
<xsl:for-each select="level">
<li>
<xsl:call-template name="title-with-locator"/>
</li>
<xsl:if test="level">
<ul>
<xsl:for-each select="level">
<li>
<xsl:call-template name="title-with-locator"/>
</li>
</xsl:for-each>
</ul>
</xsl:if>
</xsl:for-each>
</ul>
<!-- Document body, same templates as usual -->
<xsl:apply-templates select="$page-nodeset" />
</html>
</xsl:template>
<xsl:template match="paragraph" >
<p>
<xsl:value-of select="." />
</p>
</xsl:template>
<xsl:template match="title" />
<xsl:template match="level" >
<h2><xsl:value-of select="title" /></h2>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="level/level" >
<h3><xsl:value-of select="title" /></h3>
<xsl:apply-templates/>
</xsl:template>
<xsl:template name="title-with-locator" >
<xsl:text>
</xsl:text>
<xsl:choose>
<xsl:when test="generate-id( . ) = $page-id" >
<b><xsl:call-template name="title-alone" /></b>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="title-alone" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="title-alone" >
Title: <xsl:value-of select="title" />
</xsl:template>
</xsl:stylesheet>
Finally, this is what the rendering looks like:
Just released Novelang-0.52.0!
Summary of changes:
n:block-inside-asterisk-pairs. Default stylesheet renders it as bold.
Download it from here.
Enjoy!
This post describes a tricky point of Novelang’s grammar design: how to handle twin delimiters like // in a non-ambiguous manner for an ANTLR grammar. It’s a useful refresher before adding the long-awaited ** (asterisk pair) delimiter.
The problem
For paired delimiters like ( and ) or [ and ] it’s easy to know when to “open” or “close” a block, and support nested blocks. In contrast, a twin delimiter is an opening one if not preceded by a closing one inside the same block, regardless of what happens in subblocks. This is a complicated way to say we support this kind of nesting:
// block-1 ( block-2 //block-3// ) //
+ block-inside-solidus-pairs
block-1
+ block-inside-parenthesis
block-2
+ block-inside-solidus-pairs
block-3
We also support this:
block-1 // block-2 // block-3 // block-4 //
block-1
+ block-inside-solidus-pairs
block-2
block-3
+ block-inside-solidus-pairs
block-4
(We have only one level of nesting here. 2 levels of nesting is counter-intuitive and would have required very complex lookahead.)
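The rule can be checked with a tiny hand-written parser. The Python sketch below is not Novelang's ANTLR grammar and ignores everything but words, parentheses and //. The `tokenize` and `parse` names are invented for the example. Each recursive call remembers which delimiter closes the current block, so a // closes the block only when the innermost open block is a solidus pair, and opens a new one otherwise.

```python
import re


def tokenize(text):
    """Split text into //, parentheses and plain words."""
    return re.findall(r"//|[()]|[^\s()/]+", text)


def parse(tokens, position=0, closer=None):
    """Parse until `closer` is met; return (items, position after the block)."""
    items = []
    while position < len(tokens):
        token = tokens[position]
        if token == closer:
            return items, position + 1
        if token == "//":
            # Not the closer of the current block, so it opens a new block.
            body, position = parse(tokens, position + 1, closer="//")
            items.append(("block-inside-solidus-pairs", body))
        elif token == "(":
            body, position = parse(tokens, position + 1, closer=")")
            items.append(("block-inside-parenthesis", body))
        elif token == ")":
            raise ValueError("unbalanced closing parenthesis")
        else:
            items.append(token)
            position += 1
    if closer is not None:
        raise ValueError(f"missing closing {closer!r}")
    return items, position


nested, _ = parse(tokenize("// block-1 ( block-2 //block-3// ) //"))
flat, _ = parse(tokenize("block-1 // block-2 // block-3 // block-4 //"))
print(nested)
print(flat)
```

The `closer` argument does at runtime what the grammar has to spell out statically with dedicated rules.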
The pattern
The pattern is to define dedicated grammar elements for the inside of a block opened by a twin delimiter. They propagate the fact that this delimiter cannot appear again, unless inside some other subblock.
Taking “XXX” for the name of some twin delimiter, here is a simplified version of the grammar for spreadblocks. The term “spreadblock” stands for a block that may spread on several lines (containing single line breaks).
paragraph
: ... mixedDelimitedSpreadblock
;
mixedDelimitedSpreadblock
: word ( punctuationSign | delimitedSpreadblock ) ...
;
delimitedSpreadblock
: xxxSpreadblock
| parenthesizedSpreadblock
| squareBracketsSpreadblock
| doubleQuotedSpreadblock
| hyphenPairSpreadblock
;
parenthesizedSpreadblock
: '(' spreadblockBody ')' // Same for other paired delimiters.
;
spreadblockBody
: ... mixedDelimitedSpreadblock
;
xxxSpreadblock
: XXX spreadblockBodyNoXxx XXX
;
spreadblockBodyNoXxx
: ... mixedDelimitedSpreadblockNoXxx ...
;
mixedDelimitedSpreadblockNoXxx
: ... delimitedSpreadblockNoXxx ...
;
delimitedSpreadblockNoXxx
: parenthesizedSpreadblock
| squareBracketsSpreadblock
| doubleQuotedSpreadblock
| hyphenPairSpreadblock
;
This is more or less the same for tightblocks. “Tightblocks” stand for blocks containing no line breaks, like cells and embedded lists.
cell // Same for embedded list items.
: ... mixedDelimitedTightblock ...
;
mixedDelimitedTightblock
: word ( punctuationSign | delimitedTightblock | ... ) ...
;
delimitedTightblock
: xxxTightblock
| parenthesizedTightblock
| squareBracketsTightblock
| doubleQuotedTightblock
| hyphenPairTightblock
;
xxxTightblock
: XXX tightblockBodyNoXxx XXX
;
tightblockBodyNoXxx
: ... mixedDelimitedTightblockNoXxx ...
;
mixedDelimitedTightblockNoXxx
: word ( punctuationSign | delimitedTightblockNoXxx ) ...
;
delimitedTightblockNoXxx
: parenthesizedTightblock
| squareBracketsTightblock
| doubleQuotedTightblock
| hyphenPairTightblock
;
// That's all.
Thought it was over? There is another kind of block, the delimitedTightblockNoSeparator used inside the subblockAfterTilde which reflects each block inside ~x~y~z! But at this point you probably got the idea.
Yes this makes the grammar quite verbose, but factoring it would reduce ANTLR’s ability to check for inconsistencies. Anyways, the slightest addition brings the need of writing test cases for every logical path inside each ANTLR grammar rule.
Just released Novelang-0.51.0!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.50.2!
Summary of changes:
n:list-with-triple-hyphen and n:list-with-double-hyphen-and-plus-sign) inside n:paragraphs-inside-angled-bracket-pairs.
Download it from here.
Enjoy!
Just released Novelang-0.50.1!
Summary of changes:
JavaShell for cleaner shutdown when there is no default JmxKit. This may only affect users of the Novelang-attirail subproject.
Download it from here.
Enjoy!
Just released Novelang-0.50.0!
Summary of changes:
n:embedded-list-with-number-sign).
n:list-with-double-hyphen-and-plus-sign).
Download it from here.
Enjoy!
Just released Novelang-0.49.0!
Summary of changes:
n:cell-row element renders as a table header, if there is more than one. This might break existing documents.
Download it from here.
Enjoy!
This is an update of the previous Maven cheat sheet.
Release
mvn -e --batch-mode clean release:prepare -Dnovelang.build.distribution -DreleaseVersion=M.m.f > build-release-prepare.log
mvn release:perform -Dnovelang.build.distribution -DreleaseVersion=M.m.f > build-release-perform.log
Just released Novelang-0.48.0!
Summary of changes:
n:raw-lines element nested inside n:lines-of-literal. This might break existing stylesheets.
Download it from here.
Enjoy!
This is a cool one. As it takes random phrases from classical French literature, punctuation signs and "typographic grey" look natural.
Wikipedia definitely offers the right template for a cheat sheet. Short and easy to render with Novelang.
Just released Novelang-0.47.0!
Summary of changes:
Download it from here.
Enjoy!
What we need
How hard would it be to render a single Novelang document over multiple HTML pages? Better ask: how cool would that be? Think about Novelang documentation taking one single huge page. This makes non-linear reading quite uncomfortable. Of course, multi-page rendering should work for both batch and interactive mode.
Technical implications
For batch rendering, there can be a simple approach. Xalan (XSLT rendering engine) offers the redirect extension for redirecting output into a given file.
<xsl:template match="/doc/foo">
<redirect:write select="@file">
<foo-out>
<xsl:apply-templates/>
</foo-out>
</redirect:write>
</xsl:template>
Unfortunately, this is not suitable for interactive rendering. For interactive rendering, the rendering process must know both:
The same need arises for batch rendering but with Xalan’s Redirect extension mentioned above, the whole logic gets buried inside the XSLT (which probably makes it quite complex).
Obviously, we need a Renderer to work the same way for batch and interactive rendering, i.e. there should be no special handling of interactive or batch rendering in the XSL stylesheet. (But multi-page rendering would require a special stylesheet anyways, at least for generating navigation.)
While XSLT-based rendering is the most common case in Novelang, it’s better to think about the general contract of an org.novelang.rendering.Renderer. As it already does, the Renderer should spit bytes into a java.io.OutputStream with no knowledge of whether it is a file or a socket. The job of creating the output (which means choosing a file name in the case of batch rendering) is left to some upstream object opening the OutputStream. Currently, this is done by org.novelang.batch.DocumentGenerator or org.novelang.daemon.DocumentHandler, which both end by calling DocumentProducer, passing it the OutputStream.
So the rendering stage needs additional logic. Interactive rendering implies extracting the requested page from the URL. Batch rendering implies finding the list of pages in order to create the corresponding files on the filesystem.
New Renderer contract
There can’t be a unique way to split a document into pages, so we have new responsibilities for our Renderer:
- Given a document tree (as an org.novelang.common.SyntacticTree) it calculates a list of page identifiers.
- Given the same document tree, plus a page identifier, it renders the corresponding page to an OutputStream.
With something like a single empty page identifier, we should get the same single-page rendering as we have now.
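The two responsibilities fit in a very small interface. The real contract would of course be a Java interface on org.novelang.rendering.Renderer; the Python sketch below only illustrates its shape. `ChapterRenderer` is a toy implementation invented for the example, and its "document tree" is a plain dictionary of chapter titles and texts rather than a SyntacticTree.

```python
import io
from abc import ABC, abstractmethod
from typing import BinaryIO, Dict, List


class Renderer(ABC):
    """Sketch of the proposed contract: list pages, then render one of them."""

    @abstractmethod
    def page_identifiers(self, tree) -> List[str]:
        """Calculate the page identifiers for the given document tree."""

    @abstractmethod
    def render(self, tree, page_identifier: str, output: BinaryIO) -> None:
        """Render one page into a byte stream, be it a file or a socket."""


class ChapterRenderer(Renderer):
    """Toy renderer producing one page per chapter."""

    def page_identifiers(self, tree: Dict[str, str]) -> List[str]:
        return list(tree)

    def render(self, tree, page_identifier, output):
        # An empty identifier means "the whole document", as with single-page rendering.
        titles = list(tree) if page_identifier == "" else [page_identifier]
        for title in titles:
            output.write(f"<h2>{title}</h2><p>{tree[title]}</p>\n".encode("utf-8"))


tree = {"One": "Some text of level one.", "Two": "Some text of level two."}
renderer = ChapterRenderer()
for identifier in renderer.page_identifiers(tree):
    # Batch mode would open one file per page here; a buffer stands in for it.
    buffer = io.BytesIO()
    renderer.render(tree, identifier, buffer)
    print(identifier, "->", buffer.getvalue().decode("utf-8").strip())
```

Interactive mode would skip the loop and call `render` once, with the page identifier extracted from the request.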
For an XSLT-based Renderer, we should embed page identifiers generation in the same XSL stylesheet (as a part of already-discussed stylesheet metadata ):
<xsl:stylesheet [namespaces blah blah] >
<nlm:multipage>
<!--
Some XSL transformations here,
starting from element.
-->
</nlm:multipage>
...
Before page rendering occurs, Novelang asks the Renderer for page identifiers. The default XSL-based Renderer applies the content of the nlm:multipage element (if there is one) as a stylesheet on the whole document tree. Then it obtains a list of page identifiers as follows:
<pages>
<page name="Home" >/n:opus</page>
<page name="ChapterOne" >/n:opus/n:level[1]</page>
<page name="ChapterTwo" >/n:opus/n:level[2]</page>
</pages>
Of course each page name is unique. In order to achieve this with no tweak, the document tree may embed unique identifiers by extending the semantic of n:implicit-identifier or by adding a new n:unique-identifier element. Node paths seem easy to generate.
Now for each page, Novelang creates the corresponding file out of the page name. If the stylesheet in the nlm:multipage element chose filesystem-friendly names, those will be used verbatim (otherwise we may apply some variant of URL encoding). And, for each page, Novelang calls the Renderer with the whole document tree again, and passes additional metadata elements to tell the Renderer which page it is rendering. Input XML looks like this:
<n:opus>
<n:meta>
<n:page-name>ChapterOne</n:page-name>
<n:page-path>/n:opus/n:level[1]</n:page-path>
</n:meta>
</n:opus>
This should be enough for the Renderer to figure out how to render only the page of interest. It might need to peek elsewhere in the document tree (for a footer with a copyright notice, or to find other chapter names for a navigation bar).
Mix with other features (present or future)
There is an additional role for node identifiers: they might help to “enhance” internal links by adding the prefix corresponding to the target page. (The internal link feature is still in its inception phase. It just seems easier to implement it right after multipage rendering.)
Unique page names
Novelang’s Fragment Identifier is the perfect candidate to generate page identifiers. Unfortunately, composite identifiers contain the \ character. Should we escape it, or mix it with some weird pseudo-directory feature? But maybe it’s time to remove relative identifiers, which never proved useful and don’t guarantee identifier uniqueness anyways.
It’s easy to create a new element by adding a simple counter to a colliding identifier. The value for a given document fragment may change across several generations, when adding fragments with colliding identifiers. This won’t be a problem because internal links (links defined by the document itself) are prohibited from using unique identifiers. Remember: unique identifiers are only for pure HTML links.
If there is a chance that a foreign HTML document links to the HTML anchor defined by the unique identifier (in a pure WWWW – World Wide Web Way) then the document author should use explicit identifiers.
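The "simple counter" approach, and the instability it brings, can be shown with a few lines of Python. This is a sketch under assumptions: the `unique_identifiers` name and the underscore separator are made up, and identifiers are taken in document order.

```python
def unique_identifiers(identifiers):
    """Make identifiers unique, in document order, by appending a counter."""
    taken = set()
    result = []
    for identifier in identifiers:
        candidate, counter = identifier, 0
        while candidate in taken:
            # Collision: try the next counter value until the name is free.
            counter += 1
            candidate = f"{identifier}_{counter}"
        taken.add(candidate)
        result.append(candidate)
    return result


before = unique_identifiers(["intro", "usage", "intro"])
# A new colliding fragment inserted at the beginning shifts the counters.
after = unique_identifiers(["intro", "intro", "usage", "intro"])
print(before)
print(after)
```

In the second run the last fragment no longer gets the same identifier as in the first one, which is exactly why such identifiers are not safe targets for links written by hand.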
New URL scheme
With single-page rendering, the rendered document has the same name as the source document (with the difference of the extension). Multi-page adds a new “dimension”. Because the name of the page may collide with another document’s name, the name of the originating document prefixes the page name. Let’s look at different options:
/main/documentation~syntax.html
/main/documentation!syntax.html
/main/documentation,syntax.html
/main/documentation^syntax.html
/main/documentation--syntax.html
Let’s see which character we could use (only checked on Mac OS X, to do: check on Windows):
| Character | Escaped? | Comments |
| ~ | No | Already used for Novelang meta pages. |
| - | No | Already used for Novelang identifiers. |
| ^ | No | Meaningless in that context. |
| # | No | Fragment in URL. |
| , | No | Hard to distinguish from full stop . character. |
| ! | No | Hard to read. |
| _ | No | Too common in file names. |
| + | No | Used in URL encoding. Usage unrelated to “plus” meaning. |
| % | No | Used in URL encoding. |
| = | Yes | Used in URL encoding. Usage unrelated to “equality” meaning. |
| $ | Yes | Overused. |
| ; | Yes | Hard to read. |
| | | Yes | Hard to read. |
| ' | Yes | Hard to read. |
| & | Yes | Already used for URL parameters. |
| ? | Yes | Already used for URL parameters, DOS wildcard. |
| @ | Yes | Inverted meaning if page name appears second. |
| { | Yes | Weird because unpaired. Meaningful otherwise. |
| § | Yes | Mac OS X console doesn’t like it. |
| : | - | Path separator on Unix. |
The “Escaped?” column tells whether the character requires escaping on the Mac OS X console.
Finally, it turns out that -- looks the best, especially with a variable-width font like in Mac OS X Finder or Windows Explorer.
Special case: if the page identifier was blank, the page separator doesn’t appear so we would still have:
/main/documentation.html
This naming scheme also implies that all pages appear flatly in the same directory. This should help when resolving resource names.
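As a quick illustration of the naming scheme, here is a Python sketch. The `page_file_name` function and its parameters are hypothetical; only the -- separator and the blank-identifier special case come from the discussion above.

```python
def page_file_name(document, page_identifier, extension="html", separator="--"):
    """Build the flat file name of a page from its originating document name."""
    if page_identifier == "":
        # Blank page identifier: no separator, same name as single-page rendering.
        return f"{document}.{extension}"
    return f"{document}{separator}{page_identifier}.{extension}"


print("/main/" + page_file_name("documentation", "syntax"))
print("/main/" + page_file_name("documentation", ""))
```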
Just released Novelang-0.46.1!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.46.0!
Summary of changes:
New experimental features for code reuse:
org.novelang package (was novelang).
Download it from here.
Enjoy!
Just released Novelang-0.45.0!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.44.5!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.44.4!
Summary of changes:
Download it from here.
Enjoy!
Here is a Bash script (tested on Mac OS X) that renames every .nlp into .novella and .nlb into .opus. It also changes file content. Use with care.
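#!/bin/sh
# Rename *.nlp into *.novella and *.nlb into *.opus, fixing references inside files too.
SED='s/\.nlp/.novella/g;s/\.nlb/.opus/g'
# Quote the patterns so that the shell doesn't expand them before find sees them.
find src modules \( -name '*.nlp' -o -name '*.nlb' \) | while read -r file
do
  newfile=$(echo "$file" | sed "$SED")
  echo "$file -> $newfile"
  sed "$SED" < "$file" > "$newfile"
  rm "$file"
done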
Just released Novelang-0.44.3!
Summary of changes:
Download it from here.
Enjoy!
There is an updated version of this post.
This is a list of useful Maven commands. They work with Novelang-0.44.2. Later versions will probably make some of them less verbose, using some default parameters.
Convention: the Novelang/$ represents the command prompt, with working directory being Novelang’s home directory. Subdirectories appear when needed.
Plugin versions
Stay up-to-date by listing more recent plugins (there is another goal for dependencies):
Novelang/$ mvn versions:display-plugin-updates
Show dependency tree:
Novelang/$ mvn dependency:tree
Feed local repository with fresh artifacts
Novelang/$ mvn clean install
Force child modules version
Force the version of every child module to the one of the parent:
Novelang/$ mvn -N versions:update-child-modules
Performing a release (may be specific to Novelang-0.44.2)
First, clean previous POM backup files:
Novelang/$ mvn release:clean
Then prepare the release. This does the following:
M.m.f in the snippet below).
Novelang/$ mvn -e --batch-mode release:prepare -Drelease=false -DlocalCheckout=true -DreleaseVersion=M.m.f -DdevelopmentVersion=SNAPSHOT -Dtag=release-M.m.f > build-release-prepare.log
This part is likely to fail. If something goes wrong:
--hard way, to the version immediately before Maven’s changes.
release-M.m.f tag local git repository.
release-M.m.f tag on remote git repository: git push -v github :refs/tags/release-M.m.f
--force option (better idea, anyone?).
Novelang/$ mvn release:clean
Make sure that everything is OK with gitk.
It might be useful to reset all POM versions (like after some POM or branch hacking): set root pom.xml version to SNAPSHOT and run mvn versions:update-child-modules.
Once this is done, our git repositories contain good, tagged stuff. Last step is to perform the final build.
Novelang/$ mvn release:perform > build-release-perform.log
(There is no additional parameter to pass; release:prepare did create some POM copies with relevant information.)
The release:perform goal performs a fresh checkout in Novelang/target/checkout where every pom.xml contains the expected M.m.f version. The build calls deploy:deploy on Novelang-documentation and Novelang-distribution, which upload relevant files to SourceForge and send email notifications.
Useful links
Mini-guide about Maven release plugin.
Untested
Resume from a given module folder instead of restarting the build from the beginning:
Novelang/$ mvn reactor:resume -Dfrom=bar
Just released Novelang-0.44.2!
Summary of changes:
deploy:deploy goal should work properly when called from release:perform.
Download it from here.
Enjoy!
Just released Novelang-0.44.1!
Summary of changes:
Download it from here.
Enjoy!
Just released Novelang-0.44.0!
Summary of changes:
.novella and .opus. Old .nlp and .nlb suffixes still supported.
Download it from here.
Enjoy!
Novelang uses immutable trees to represent a document and transform it. While immutable data structures have well-known advantages, Novelang's tree library requires updating every parent node on each change to any child node. Clever guys found how to save those changes when performing multiple local modifications. This relies on a tree structure called Zipper. Here is a very clear explanation, the original paper (site currently down), the Scala implementation and the Clojure one.
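To make the idea concrete, here is a minimal Zipper over an immutable tree, sketched in Python. It has nothing to do with Novelang's tree library: `Tree` and `Zipper` are hypothetical types written for this example, supporting only descent, replacement and rebuilding. The focus can be replaced as many times as needed; parents are rebuilt only when moving back up.

```python
from typing import NamedTuple, Tuple


class Tree(NamedTuple):
    label: str
    children: Tuple["Tree", ...] = ()


class Zipper(NamedTuple):
    focus: Tree
    # One (parent label, left siblings, right siblings) step per level descended.
    path: Tuple[Tuple[str, Tuple[Tree, ...], Tuple[Tree, ...]], ...] = ()

    def down(self, index):
        """Move the focus to the child at the given index."""
        kids = self.focus.children
        step = (self.focus.label, kids[:index], kids[index + 1:])
        return Zipper(kids[index], self.path + (step,))

    def replace(self, tree):
        """Swap the focused subtree; no parent is rebuilt yet."""
        return Zipper(tree, self.path)

    def up(self):
        """Rebuild the parent node around the (possibly new) focus."""
        label, left, right = self.path[-1]
        return Zipper(Tree(label, left + (self.focus,) + right), self.path[:-1])

    def root(self):
        """Walk back to the top and return the resulting tree."""
        zipper = self
        while zipper.path:
            zipper = zipper.up()
        return zipper.focus


document = Tree("opus", (Tree("a"), Tree("b", (Tree("c"),))))
changed = Zipper(document).down(1).down(0).replace(Tree("x")).replace(Tree("C")).root()
print(changed)
```

The two successive replacements cost one rebuild of each parent, not two, and the untouched sibling is shared between the old and the new tree.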
nohead option to insert command.
html-FR.xsl for French punctuation.
-Xmx parameter), Novelang handles documents twice as big and serves them 20 % faster than the previous version. The benchmark ran against versions 0.41.0 and 0.38.1. This includes buffered reading of Part files, multithreaded Part rendering, and reduced memory consumption when dealing with AST (Abstract Syntax Tree).
“Nhovestone” is the name of Novelang’s dedicated benchmark tool, and also a geeky pun .
Nhovestone aims to highlight performance variations across versions using only a few (carefully selected) measurements:
- How does response time evolve when increasing the number of documents aggregated in a single Book?
- How does response time evolve when increasing the size of one single document?
Nhovestone doesn’t try to generate an absolute performance index. This is because such an index makes sense only when always computed from the same source documents and on the same hardware.
How it works
Nhovestone focuses on HTML generation using the default stylesheet, because HTML is great for fast edit-and-review roundtrips. It uses the Novelist to generate pseudo-random text with a realistic structure. For each benchmarked Novelang version, Nhovestone starts a JVM with a small amount of memory (currently -Xmx32M). With little memory, the breaking point appears sooner. Nhovestone increases the size of the source document(s) in a linear fashion and, after each increase, measures how long a call to a Novelang instance takes.
Performance degradation
Response time starts to increase exponentially as the document becomes fairly big with regard to available memory. This triggers a lot of CPU-intensive garbage collection, which consumes a lot of time. Nhovestone detects that a running Novelang HTTP daemon gets “strained” when response time gets above a dynamically-computed threshold. The threshold comes from the straight line drawn from a linear regression on the first half of the measurements, with a slope made steeper by a fixed coefficient. When a response time appears above this straight line, the Novelang HTTP daemon is strained and further measurements are not worthwhile.
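Here is how such a strain detection could look, as a Python sketch and not as Nhovestone's actual code. The assumptions are loud ones: the function names, the coefficient of 1.5 and the warm-up of 4 measurements are invented, and "first half of the measurements" is read as the first half of the measurements collected so far.

```python
def linear_regression(points):
    """Least-squares fit of y = slope * x + intercept over (x, y) points."""
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def first_strained(response_times, coefficient=1.5, warmup=4):
    """Return the index of the first measurement above the threshold, or None."""
    for index in range(warmup, len(response_times)):
        history = response_times[:index]
        # Fit on the first half of what was measured before this call.
        first_half = list(enumerate(history[: max(2, len(history) // 2)]))
        slope, intercept = linear_regression(first_half)
        # The threshold line keeps the intercept but gets a steeper slope.
        threshold = intercept + slope * coefficient * index
        if response_times[index] > threshold:
            return index
    return None


print(first_strained([10, 12, 14, 16, 18, 20, 22, 24]))
print(first_strained([10, 12, 14, 16, 18, 20, 60]))
```

Fitting only on the early measurements keeps the reference line from being dragged upwards by the very degradation we want to detect.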
Adding Parts
This is the first scenario: for each new measurement, there is an additional Part file. All Parts are more or less equal in size and complexity (including level depth). The graph below shows that performance degradation stays linear until the 300th call. Then, version 0.41.0 starts suffering before older versions. It’s likely that new features require additional memory so starvation occurs sooner.
Increasing the size of the same Part
This is the second scenario: the generated document comes from a single Part file of a size increasing before each call. Each fragment added to the Part has the same size and structure as in the previous test, but all 3 versions show fatigue much sooner (at least 7.5 times). This shows that creating a Part takes much more temporary memory than the finished Part itself.
Tuning
These figures are strongly connected to the volume and the structure of underlying document. Experience shows that small increments generate more measurements (before the fatal strain) and therefore show a more readable trend. They also reduce measurement artefacts that could fool strain detection.
Report generation
JFreeChart generates those graphs. JFreeChart is probably the best charting library for Java at this time, at least on the OSS marketplace. It is stable and highly configurable.
The next step: embed those graphs in a Novelang-generated PDF and publish it as a complement of existing documentation.
Novelang already does all the typesetting for you. What’s next? Writing text, of course! This is the job of the just-started Novelist subproject, which aims to generate big documents for Novelang testing under heavy load.
Based on French metrics, random text looks like this:
Uomuecto eaufues xuner ig ocanerr, ebanu otpaa. Uuse, on eian aibtd, rttaintlufe elvettarrh, yrn enemlcmlun, ebcazepuer madscg, êiiovemtt teeost eseeerde? Fetn eearréetcs emrseoss icia ntmvesrud. Aoasro cênit ctainetda aèugedet css eali, unero aaie eneoden, nrortio. Oovlod; tfsmenco méttsna, eesdis uoeaeanao rcuent, desungtt av au oneerao, dxuaste umeinétniu lccdeiilne rliùearde veyiritisac yàslu. Iinmseuo odiapqied cmiiapearlo ebnjtus uauueis, libginmasa edrc emaèi sllieyr sode!
It is based on a simplistic distribution algorithm. Word count and letter count come from a uniform distribution in a pre-defined range (something like 5-20 for words and 2-12 for letters). Letters come from a frequency table giving the percentage of appearance for each letter.
While the result doesn’t look much like real text, it’s good enough to stress basic parsing and typesetting.
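The algorithm is short enough to be sketched in Python. This is not the Novelist's code: `random_sentence` is a made-up name, and the weights in `LETTER_WEIGHTS` are illustrative placeholders, not a measured French frequency table.

```python
import random

# Placeholder weights for a handful of letters (assumption, not real metrics).
LETTER_WEIGHTS = {
    "e": 15, "a": 8, "s": 8, "i": 7, "t": 7, "n": 7, "r": 7,
    "u": 6, "l": 5, "o": 5, "d": 4, "c": 3, "m": 3, "p": 3,
}


def random_sentence(rng, words=(5, 20), letters=(2, 12)):
    """Build one sentence: uniform word and letter counts, weighted letters."""
    alphabet = list(LETTER_WEIGHTS)
    weights = list(LETTER_WEIGHTS.values())
    generated = [
        "".join(rng.choices(alphabet, weights=weights, k=rng.randint(*letters)))
        for _ in range(rng.randint(*words))
    ]
    return " ".join(generated).capitalize() + "."


rng = random.Random(42)  # Seeded, so that a test document can be regenerated.
sentence = random_sentence(rng)
print(sentence)
```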
There has been a lot of research about text analysis, first for cryptography, next for natural language analysis and Web crawling. Among all of them, there is a nifty one: n-grams, which describe all the different letter sequences of a fixed length in a given text. The demo on Wolfram Alpha is gorgeous. It shows how combinations grow fast: a simple sentence like “ceramics come from” contains 69 3-grams. Google’s n-grams database (ranging from 1-grams to 5-grams) weighs 24 GiB gzip’ed and contains nearly 1 billion 3-grams. Amazingly, this number doesn’t increase so much for 4-grams and 5-grams.
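Extracting letter n-grams takes only a few lines of Python. The sketch below makes one assumption of its own: it collects the distinct n-grams inside each word, without spanning spaces, so its counts are not meant to match what the Wolfram Alpha demo reports. The `letter_ngrams` name is hypothetical.

```python
def letter_ngrams(text, n):
    """Return the set of distinct letter sequences of length n inside words."""
    grams = set()
    for word in text.lower().split():
        for start in range(len(word) - n + 1):
            grams.add(word[start:start + n])
    return grams


print(sorted(letter_ngrams("come come", 3)))
```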
A new default HTML stylesheet will be available soon. It should improve Novelang usability a lot. Key features are:
If a picture is worth a thousand words:
Fluid layout
New layout supports horizontal resize. The column for rendered text may span from 500 to 1000 pixels.
Lines of literal (<pre> tag) wrap if they are too long. Wrapping only occurs with the white-space: pre-line style, which discards indentation by default. To prevent this, some JavaScript replaces every space character inside a <pre> with a non-breakable space, immediately followed by a zero-width space. This causes clean-looking wrapping, but text copied to the clipboard contains unwanted characters.
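The replacement itself is a plain string substitution. The stylesheet does it in JavaScript; the Python sketch below only shows the transformation and its inverse, with hypothetical function names. The inverse is one possible way to clean up text pasted from the clipboard.

```python
NBSP = "\u00a0"  # Non-breakable space: preserved even where spaces collapse.
ZWSP = "\u200b"  # Zero-width space: invisible, but allows a line break.


def make_wrappable(literal):
    """Replace each space so that indentation survives and lines can wrap."""
    return literal.replace(" ", NBSP + ZWSP)


def clean_copied_text(text):
    """Undo the replacement on text copied from the rendered page."""
    return text.replace(NBSP + ZWSP, " ")


line = "    indented line of literal"
assert clean_copied_text(make_wrappable(line)) == line
print(repr(make_wrappable("a b")))
```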
Overall look
Titles are indented. This is a compromise with the Descriptor feature (described later).
Line spacing is constant, even between two paragraphs, or between a paragraph and an embedded list. There is a slight loss of information (it may be hard to see where a paragraph begins) but this globally increases readability.
Fonts
Font choice has a huge impact on overall look. Choosing fonts is hard stuff, because font rendering is hardly the same across Web browsers. Font readability also changes a lot, depending on line spacing, contrast, and other fonts around.
The convention is: serif font for rendered document, sans-serif for extra information like actions and tags. Literal (<pre> and <code>) shows with a fixed-width font, which is serif, too.
After experimenting with a lot of combinations, finally, the winners are:
- Palatino Linotype for the rendered document. This gorgeous font is a bit more readable than Times New Roman. It’s available on all platforms, and looks gorgeous with appropriate contrast (dark grey over light grey instead of black over white).
- Lucida Grande with Tahoma as second choice. Lucida Grande is highly readable (it was chosen as the default for Mac OS X), but sophisticated enough not to look “poor” beside Palatino.
- Courier New is not new at all, but it mixes harmoniously with text in Palatino. Those fonts display much better on Mac OS X, or with Safari on Windows XP.
Descriptors
Descriptors appeared in Novelang-0.39.0, as an experimental feature. They now display with a nice fade and animation, in order to preserve the user’s visual landmarks. Descriptors have a vertical bar that helps to see the scope of the descriptor. This vertical bar only shows when the Descriptor is disclosed.
Descriptor disclosers now appear close to Tag column. This avoids polluting the left margin.
Scalable lists for metadata
On big documents, there can be so many tags that they don’t fit in the height of a Web browser’s window. But most of the time, they all fit, so it’s convenient to have all of them at a fixed position. How to deal with the exception without hurting the common case? Having a 2nd scrollbar in a browser’s frame looks confusing. But the scrollbar has a great feature: it shows that some items are out of sight. One trick could be displaying a huge popup, but this probably means a lot of work for a poor result.
Finally, the solution comes with a fade to grey at the end of the list to show that not all items are visible. A tiny button “unpins” the tag list from the top of the Web browser’s window and lets it go to the document’s beginning. So, when entering the “1 % case” we still have a standard behavior.
Here is the Tag tab in its default pinned state (note the fade at the bottom and scrollbar position):
Unpinning causes it to scroll with the rest of the document:
In addition to Tags, there will be, in a (hopefully near) future, more metadata like Identifiers. Tags show up under a tab bar where it’s easy to add new tabs.