This page aims at gathering thoughts about the handling of footnotes and before-floats, and in particular make them play well with the Knuth algorithm. Before-floats were first implemented during a Google Summer of Code project, whose corresponding pages are left as is for archiving purpose.

Introduction

The Knuth algorithm was designed to layout paragraphs in an optimal manner, by considering the paragraph on its whole. The idea is to assign demerits to each line, and find the paragraph layout which leads to the least total demerits.

Without footnotes nor before-floats, there is a direct correspondence between words and lines, lines and pages, paragraphs and page-sequences. So the Knuth algorithm may also be applied at the page-level.

However, footnotes and before-floats introduce some difficulties:

  • they are anchored to some point in the normal flow, should ideally appear on the same page as the anchor, may appear on following pages but in no case on previous pages;
  • they provide additional flows of typographic material which must be correctly wired together;
  • they are more constrained than the normal flow.

Footnotes must not be deferred on following pages, unless there is really no other possibility. This means that an underfull page with no deferred footnote will be preferred to a full page with deferred footnotes. For nicer visual results, it may be possible to put the unfilled space between the normal flow and the footnote-separator, instead of at the bottom of the page. That way the underfull page might even become unnoticeable:
http://people.apache.org/~vhennebert/wiki/Floats/underfullPageWithFootnote.png

For before-floats the constraint is not so strong. For double-sided documents it is even perfectly acceptable to have the anchor on an even page and the float on the following odd page; the reader won't even have to turn a page to find the figure:
http://people.apache.org/~vhennebert/wiki/Floats/beforeFloatsMayBeDeferred.png

Unlike footnotes, it is best to defer before-floats in order to avoid underfull pages.

How to Handle Out-of-lines

For simplification, we will call a deferring node a node containing a deferred footnote.

Give Underfull Nodes a Chance to be Considered

Currently too-short nodes are registered separately from normal nodes, and discarded as soon as a normal node has been found for a given page. This means that even if a normal node has a deferred footnote, it will be preferred to a non-deferring too-short node.

To solve that we should register as feasible breaks non-deferring too-short nodes, if there is no possibility to make a non-deferring normal node. Otherwise the normal node should be preferred.

Still Accept Deferred Footnotes

An easy solution would be to simply forbid breaks with deferred footnotes. That is, not register such nodes as feasible breaks. But this is too restrictive, as can be seen on the following picture:
http://people.apache.org/~vhennebert/wiki/Floats/allowDeferredFootnotes.png

Granted, this is a degenerated case, but it will occur sooner or later...

A Possible Solution

So, what can we do? Playing with demerits is not enough: even if a non-deferring too-short node has lower demerits than a deferring normal one, this does not guarantee that it will be chosen. Indeed if pages following the too-short node have higher demerits than the pages following the normal one, they will end up with higher total demerits; which will lead to the normal node being chosen.

The idea would be to register too-short nodes as normal nodes if (and only if) they contain non-deferred footnotes. Otherwise they would be handled as usual. Once such a too-short node has been found, it would become forbidden to create deferring normal nodes, as a non-deferring possibility exists.

Regarding demerits, a too-short node would at least have the maximum demerits that a normal node can have. That is, demerits corresponding to an adjustment ratio of 1 (or -1), a node ending at a penalty element (max possible penalty = 1000), flagged (demerits += 50), of incompatible fitness with the previous node (+50). Let's call it maxnormal (we would then have maxnormal = 1212301). The demerits of the too-short node would be proportional to the ratio of unfillness, and would grow up to maxshort. That way, non-deferring normal nodes would be preferred to too-short ones (if they don't lead to further problematic breaks, and thus to higher total demerits).

A Word About Fill Ratio

During the GSoC the idea came up that it might be desirable to allow underfull pages. Indeed above a certain fill ratio (90%–95%) underfull pages may become unnoticeable. Doing this would ease the layout of complex documents: instead of always restarting the breaking process from the last too-short node because no more feasible break is possible, a layout could be found just in one pass. This could especially make sense for novel-like documents, which would have many lines with no available shrinking/stretching between them. Unless the page height is set to be an exact number of times the line height, we will always have too-short nodes.

However, this does not change anything regarding out-of-lines objects: a non-100% fill ratio just allows some too-short nodes to be handled as normal nodes. This moves the limit between too-short nodes and normal nodes by offering more layout possibilities, but this won't prevent non-deferring too-short nodes below the minimum fill ratio from being discarded in favor of deferring normal nodes (or too-short nodes above the fill ratio).

  • No labels