This page is a work in progress. It aims to document how FOP's layout engine processes footnotes, mainly to save interested contributors the hassle of finding out the hard way.
Introduction
XSL-FO defines the fo:footnote formatting object to allow authors to insert citations into their documents.
These footnotes are attached to the line area in which they would appear, and are to be rendered in a reserved footnote-area on the after-edge of the region-reference-area, preceded by a footnote-separator if one is defined in the source.
This page intends to describe only the approach with respect to the layout engine. The initial validation of out-of-line descendants during parsing is considered out of scope.
FOP's layout engine has to deal with footnote content in two phases (most general use case; see further below for the special cases):
- Line layout: the footnotes are attached to the box representing the line area that holds the surrounding inline content in the source
- Page layout: the footnotes are placed on the page that contains the line area that holds the anchor
Line layout
During the initial collection of the inline KnuthElements, if footnotes are encountered, a KnuthInlineBox that holds a reference to the FootnoteBodyLayoutManager is generated for the anchor. If the anchor is empty, a dummy auxiliary box is inserted to make sure the footnote is rendered at the appropriate place (see: org.apache.fop.layoutmgr.inline.FootnoteLayoutManager.getNextKnuthElements()).
The LineLayoutManager is responsible for making sure that the LMs for the footnote-bodies are attached to the appropriate line box (see: org.apache.fop.layoutmgr.inline.LineLayoutManager.postProcessLineBreaks()).
Special cases: lists and tables
As lists and tables 'aggregate' their content and return KnuthBoxes encompassing multiple line areas, they (currently) have to take care of propagating the footnotes from the contained line boxes upward, so that the PageBreaker will be able to access them. TODOs have been left in the code, and concerns have been raised that the related classes should actually remain footnote-agnostic. Ultimately, this duplicates part of what LineLayoutManager already does.
One possible strategy to avoid that duplication would be to push this logic out of the LMs and into a factory that constructs boxes based on (lists of) element lists. All this logic would then be contained in a single class. The LMs, instead of explicitly instantiating the boxes, would just pass the sublist(s) that represent the contained content, and get back the appropriate type of box to add to their element list.
see also: Bugzilla 37579
Page layout
Initial pass - Layout of the footnote bodies
During an initial pass over the block list returned by the FlowLayoutManager, the PageBreaker first triggers line layout of all the footnote-bodies, so that the footnotes' lists of line boxes are directly accessible in the corresponding content boxes. This is deferred from line layout because the context will not always be the same: in a multi-column layout, the footnotes span the whole region-reference-area rather than following the flow's column-count.
If footnotes were encountered, layout of the footnote-separator is done here, so that its block-progression-dimension will be readily available when needed later by the PageBreakingAlgorithm.
see: org.apache.fop.layoutmgr.PageBreaker.getNextKnuthElements()
Calculating the breaks
Variables
The variables that play a part in footnote processing are mostly private members of PageBreakingAlgorithm:
- footnotesPending: flag indicating whether footnotes have been met
- totalFootnotesLength: the total length of all footnotes that have been met
- insertedFootnotesLength: the total length of all footnote parts that have been met
- footnotesList: the list of content lists of all footnotes that have been met
- footnoteListIndex: the index of the current footnote
- footnoteElementIndex: the index of the last added part of the current footnote
- newFootnotes: flag that is true if any new footnotes have been met (will be false if the footnote content consists solely of deferred parts)
- firstNewFootnoteIndex: the index of the first new (non-deferred) footnote
- KnuthPageNode.totalFootnotes: the total length of inserted footnote parts at this node
- KnuthPageNode.footnoteListIndex: the index of the current footnote at this node
- KnuthPageNode.footnoteElementIndex: the index of the last added part of the current footnote at this node
Processing
A (line) box with anchors triggers PageBreakingAlgorithm.handleFootnotes(), which:
- adds the corresponding element lists to footnotesList
- computes the total length of each of the element lists
- for each element list, stores the accumulated length of all preceding notes plus its own, in lengthList
Additionally, totalFootnotesLength is increased with the length of each footnote.
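The bookkeeping described above can be sketched as follows. This is a simplified illustration, not FOP's actual code: the class FootnoteLengthSketch is hypothetical, and each footnote is modelled as a plain list of integer element lengths rather than a list of Knuth elements. Only the roles of lengthList and totalFootnotesLength are taken from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; footnotes are modelled as lists of element lengths.
public class FootnoteLengthSketch {
    // Accumulated length up to and including each footnote (role of lengthList).
    private final List<Integer> lengthList = new ArrayList<>();
    // Sum of the lengths of all footnotes met so far (role of totalFootnotesLength).
    private int totalFootnotesLength = 0;

    // Registers the footnotes anchored in one line box.
    // Each inner list holds the element lengths of one footnote body.
    public void handleFootnotes(List<List<Integer>> footnotesOfBox) {
        for (List<Integer> footnote : footnotesOfBox) {
            int noteLength = 0;
            for (int elementLength : footnote) {
                noteLength += elementLength;
            }
            // Length of all preceding notes, or 0 for the very first note.
            int preceding = lengthList.isEmpty()
                    ? 0
                    : lengthList.get(lengthList.size() - 1);
            lengthList.add(preceding + noteLength);
            totalFootnotesLength += noteLength;
        }
    }

    public List<Integer> getLengthList() {
        return lengthList;
    }

    public int getTotalFootnotesLength() {
        return totalFootnotesLength;
    }

    public static void main(String[] args) {
        FootnoteLengthSketch sketch = new FootnoteLengthSketch();
        // First box: a two-line footnote and a one-line footnote.
        sketch.handleFootnotes(List.of(List.of(12000, 12000), List.of(12000)));
        // Second box: a three-line footnote.
        sketch.handleFootnotes(List.of(List.of(12000, 12000, 12000)));
        System.out.println(sketch.getLengthList());
        System.out.println(sketch.getTotalFootnotesLength());
    }
}
```

Storing accumulated rather than individual lengths means the length of any run of consecutive footnotes is a single subtraction, which is presumably why the break computation can cheaply ask how much footnote content has been met up to a given note.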
For all following legal breaks, this will result in PageBreakingAlgorithm.computeDifference() taking into account the additional width required for the footnote separator and the footnotes up to that point.
If the total length of content + separator + all footnotes does not fit within the available width, and it is allowed to defer part of the footnotes to the following page, the footnote length will be split here.
The chosen strategy is to:
- first try adding all footnotes that can no longer be deferred (i.e. were already carried over from a previous break)
- then add whole footnotes, until we reach one that doesn't fit in its entirety
- from that last footnote, try adding more parts/lines, until we reach the one that doesn't fit anymore
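The three steps above can be sketched as a greedy computation. This is a minimal illustration under stated assumptions, not the logic of getFootnoteSplit() itself: the class and method names are hypothetical, footnotes are modelled as lists of line lengths, carried-over footnotes are placed unconditionally, and legal break points, penalties and the separator are ignored.

```java
import java.util.List;

// Hypothetical sketch of the split strategy; not FOP's implementation.
public class FootnoteSplitSketch {

    private static int sum(List<Integer> lines) {
        int total = 0;
        for (int line : lines) {
            total += line;
        }
        return total;
    }

    // Returns the footnote length placed on the current page.
    // carriedOver: footnotes deferred from a previous break (assumed mandatory here).
    // fresh: footnotes met for the first time, in document order.
    // available: length left for footnote content on this page.
    public static int computeSplit(List<List<Integer>> carriedOver,
                                   List<List<Integer>> fresh,
                                   int available) {
        int split = 0;
        // Step 1: footnotes that can no longer be deferred go first.
        for (List<Integer> note : carriedOver) {
            split += sum(note);
        }
        // Step 2: add whole footnotes while they fit in their entirety.
        int index = 0;
        while (index < fresh.size()
                && split + sum(fresh.get(index)) <= available) {
            split += sum(fresh.get(index));
            index++;
        }
        // Step 3: from the first footnote that did not fit, add lines
        // one by one until the next line would overflow.
        if (index < fresh.size()) {
            for (int line : fresh.get(index)) {
                if (split + line > available) {
                    break;
                }
                split += line;
            }
        }
        return split;
    }

    public static void main(String[] args) {
        List<List<Integer>> carriedOver = List.of(List.of(10));
        List<List<Integer>> fresh =
                List.of(List.of(10, 10), List.of(10, 10, 10), List.of(10));
        System.out.println(computeSplit(carriedOver, fresh, 55));
    }
}
```

With 55 units available, the sketch places the carried-over note (10), the first new note (20) and two lines of the second new note (20), for a split of 50. The remaining line and the last note would be deferred.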
The adjustmentRatio for the break is updated with the stretch or shrink of the footnote separator, depending on whether the computed difference is positive (stretch) or negative (shrink).
The eventual demerits for the break are increased only in case footnote content is being deferred:
- if the current footnote is not the last - the more footnotes are being deferred, the less favorable the break
- if the current part is not the last of the current footnote - a fixed increase due to the footnote split
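The two rules above might be expressed as follows. This is only a sketch: the class, the method signature and both constant values are hypothetical and chosen for illustration, and should not be read as the values used by PageBreakingAlgorithm.

```java
// Hypothetical sketch of the extra demerits for deferred footnote content.
public class FootnoteDemeritsSketch {
    // Illustrative values only (assumptions, not taken from FOP).
    static final int DEFERRED_FOOTNOTE_DEMERITS = 10000;
    static final int SPLIT_FOOTNOTE_DEMERITS = 5000;

    // footnoteCount: number of footnotes met so far.
    // currentFootnoteIndex: index of the last footnote (partly) placed.
    // currentFootnoteComplete: true if that footnote was placed in full.
    public static int footnoteDemerits(int footnoteCount,
                                       int currentFootnoteIndex,
                                       boolean currentFootnoteComplete) {
        int demerits = 0;
        int lastIndex = footnoteCount - 1;
        if (currentFootnoteIndex < lastIndex) {
            // The more footnotes are deferred, the less favorable the break.
            demerits += (lastIndex - currentFootnoteIndex)
                    * DEFERRED_FOOTNOTE_DEMERITS;
        }
        if (!currentFootnoteComplete) {
            // Fixed increase because the current footnote is split.
            demerits += SPLIT_FOOTNOTE_DEMERITS;
        }
        return demerits;
    }

    public static void main(String[] args) {
        // Three footnotes met, only part of the first one placed.
        System.out.println(footnoteDemerits(3, 0, false));
        // All three footnotes placed in full: nothing is deferred.
        System.out.println(footnoteDemerits(3, 2, true));
    }
}
```

Under these assumed constants, a break that defers two whole footnotes and splits the current one costs 25000 extra demerits, while a break that places everything costs none.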
[TBD]
Remaining Issues
Footnotes and multi-column flows
Since there is no hard distinction between page breaks and column breaks, each column acts as its own page with respect to footnotes. The best node for each column is determined by looking only at the footnotes whose anchors are in that particular column. This leads to overlaps between footnotes and the flow content.
see: Bugzilla 51304
Infinite loops in footnote deferral
There are currently multiple open Bugzilla reports about infinite loops being triggered under certain circumstances. Closer inspection reveals that the culprit is the footnote deferral mechanism, very likely in the interaction between PageBreakingAlgorithm.getFootnoteSplit() and .createFootnotePages(). Certain assumptions are made in getFootnoteSplit() that do not always seem to hold. One such assumption is that the local variable splitLength will eventually always exceed the availableLength.
see: Bugzilla 47424 or Bugzilla 48397
Space resolution between footnotes
When the PageBreakingAlgorithm adds a box's footnotes, space-resolution is triggered for each footnote separately. This does not take into account potential stacking constraints between the footnotes.
see code comment in: PageBreakingAlgorithm.handleFootnotes()
Footnote splits and changing page-ipd
No Bugzilla yet. Unverified, but likely to lead to trouble, as the decisive logic for footnote splits is entirely encapsulated in PageBreakingAlgorithm.
The implementation for changing page-ipd works such that the algorithm instance that is responsible for the part before a change in page-ipd knows nothing about the one that processes what follows, and vice versa. This will almost certainly manifest itself as disappearing (deferred) footnote content.