Knuth element generation for tables
Overview
At the moment we see two fundamental approachs for generating Knuth elements for tables. Note: Some of the considerations made here can also be applied to lists.
1. Combined element list
In this approach the individual element lists of each cell are combined to a single element list that can be used in page breaking. The approach is described in detail further down.
2. Combined stretch boxes
In this approach for each cell a single box plus stretchable penalty is created to represent the cell as a whole. All the boxes for a row are combined to one which is used for page breaking. The approach is described here: link
Combined element list
Simple case
<fo:table-row> <fo:table-cell line-height="15pt" id="cell1"> <fo:block>block1</fo:block> <fo:block>block2</fo:block> <fo:block>block3</fo:block> </fo:table-cell> <fo:table-cell id="cell2"> <fo:block> <fo:external-graphic src="something" block-progression-dimension.minimum="15pt" block-progression-dimension.optimum="30pt" block-progression-dimension.maximum="45pt"/> </fo:block> </fo:table-cell> </fo:table-row>
Cell1 list:
1) box w=15000 2) penalty w=0 p=0 --legal break 3) box w=15000 4) penalty w=0 p=0 --legal break 5) box w=15000 --cut-off point 6) penalty w=0 p=INF 7) glue w=0 y=INF z=0 8) penalty w=0 p=-INF flagged
Cell2 list:
1) box w=30000 2) penalty w=0 p=INF 3) glue w=0 y=15000 z=15000 (stretch for previous box) --cut-off point 4) penalty w=0 p=INF 5) glue w=0 y=INF z=0 6) penalty w=0 p=-INF flagged
Creating a combined element list here is easy when we omit stretch and shrink values:
step list between Cell1 and Cell2:
0, 15000, 30000, 45000
(the step list represents the dotted green lines on the graphic on TableLayout)
combined list for row without stretch/shrink:
1) box w=15000 [C1: 1-2] [C2: empty] 2) penalty w=0 p=0 3) box w=15000 [C1: 3-4] [C2: 1-3] 4) penalty w=0 p=0 5) box w=15000 [C1: 5-5] [C2: empty]
The next step
Let's modify the original example a bit:
<fo:table-row> <fo:table-cell line-height="15pt" id="cell1"> <fo:block>block1</fo:block> <fo:block>block2</fo:block> <fo:block>block3</fo:block> </fo:table-cell> <fo:table-cell id="cell2a"> <fo:block> <fo:external-graphic src="something" block-progression-dimension="33pt"/> </fo:block> <fo:block line-height="8pt" font-size="8pt">Caption for the above image</fo:block> </fo:table-cell> </fo:table-row>
Cell2a list:
1) box w=33000 2) penalty w=0 p=0 --legal break 3) box w=8000 --cut-off point 4) penalty w=0 p=INF 5) glue w=0 y=INF z=0 6) penalty w=0 p=-INF flagged
step list between Cell1 and Cell2a 0, 15000, 30000, 33000, 41000, 45000
There are suddenly a lot more possible break points.
The challenge was to come up with an algorithm to create the combined list for the more complicated case. Here are the expected results from the broken lists:
- break at first legal break: Part 1 (height -> 15), Part 2 (height -> 41)
- break at second legal break: Part 1 (height -> 30), Part 2 (height -> 41)
- break at third legal break: Part 1 (height -> 33), Part 2 (height -> 15)
- break at fourth legal break: Part 1 (height -> 41), Part 2 (height -> 15)
- no breaks: Part 1 (height -> 45)
The proposed algorithm
Luca Furini came up with the following algorithm:
This could be the procedure that creates the combined elements, at least when there is no stretch nor shrink; with this assumption, the resulting sequence will contain only boxes and penalties.
First of all, we must know the total height of the row, if it is not split between pages.
totalHeight = 0 for each cell-list height[i] = compute the total height of each cell-list if (height[i] > totalHeight) totalHeight = height[i]
The sequence we are going to create must be totalHeight long when it is not divided.
step = 0 addedBoxHeight = 0; while step < totalHeight step = nextStep() penaltyHeight = step + maxRemainingHeight() - totalHeight boxHeight = step - addedBoxHeight - penaltyHeight addedBoxHeight += boxHeight sequence.add(new box(boxHeight)) sequence.add(new penalty(penaltyHeight, 0))
For each cell-list, we must know, besides the total height, the height of the elements already combined into the new ones.
The nextStep() method looks at each cell-list, computing the height of the first sub-sequence, chooses the smallest increase and updates in each list the height of the elements already combined; maxRemainingHeight() returns the greatest difference between the total height of a cell-list and the height of the elements already combined.
Handling the second example above
Using the example above, the behaviour of this algorithm is this:
totalHeight = 45 1st step: step = 15 maxRemainingHeight = 41 addedBoxHeight = 0 penaltyHeight = 15 + 41 - 45 = 11 boxHeight = 15 - 0 - 11 = 4 2nd step: step = 30 maxRemainingHeight = 41 addedBoxHeight = 4 penaltyHeight = 30 + 41 - 45 = 26 boxHeight = 30 - 4 - 26 = 0 3rd step: step = 33 maxRemainingHeight = 15 addedBoxHeight = 30 penaltyHeight = 33 + 15 - 45 = 3 boxHeight = 33 - 4 - 3 = 26 4th step: step = 41 maxRemainingHeight = 15 addedBoxHeight = 30 penaltyHeight = 41 + 15 - 45 = 11 boxHeight = 41 - 30 - 11 = 0 5th step: step = 45 maxRemainingHeight = 0 addedBoxHeight = 30 penaltyHeight = 45 + 0 - 45 = 0 boxHeight = 45 - 30 - 0 = 15
Combined elements, with their length and what happens if a penalty is used:
box 4 penalty 11 ----> 15+41 box 0 penalty 26 ----> 30+41 box 26 penalty 3 ----> 33+15 box 0 penalty 11 ----> 41+15 box 15 ----> 45
Handling the easiest example
The easiest example would look like this us with the algorithm above:
Table with one cell (three one-line blocks) height = (3,3,3) (each line has a height of 1 in this example) totalHeight = 3 Step1: stepw = 1; maxRemHeight = 2 penaltyHeight = 1 + 2 - 3 = 0; boxHeight = 1 - 0 - 0 = 1 addedBoxHeight now 1 Step 2: stepw = 1 + 1 = 2 maxRemHeight = 1 penaltyHeight = 2 + 1 - 3 = 0 boxHeight = 2 - 1 - 0 = 1 addedBoxHeight now 2 Step 3: stepw = 2 + 1 = 3 maxRemHeight = 0 penaltyHeight = 3 + 0 - 3 = 0 boxHeight = 3 - 2 - 0 = 1 addedBoxHeight now 3
breaks: 1/2, 2/1, 3/0
box 1 penalty 0 box 1 penalty 0 box 1 (penalty 0)
Handling row spanning with this approach
The meaning of "totalHeight" in the above algorithm needs to be extended for the combined list. totalHeight will not only be the total height of a single row, but of a row group. A row group in this context is the minimum number of consecutive rows which contains all spanned grid units of its cells. An example:
+----------+-------+----------+ | | | | +----------+-------+----------+ <-- here's the delimiter between two row groups | | | | | | | +-------+----------+ | | | | | +-------+ | | | | | +----------+-------+ | | | | | +----------+-------+----------+
Here we'd have two such row groups, the first is equivalent to the first row, the second contains rows 2 to 5. Note that we can't just sum up the heights of the individual cells in the row group per column. We have to add the space used by the cell borders (and optional cell spacing in separate mode), too, because they are also elements that create steps in the algorithm.
Determining the row height for each row (with row spanning)
for each row effRowHeight = minoptmax(0, 0, INF) (=auto height) if (bpd of row != auto) effRowHeight = minoptmax(bpd of row) for each cell ending in the current row effCellHeight = minoptmax(0, 0, INF) (=auto height) if (bpd of cell != auto) effCellHeight = minoptmax(bpd of cell) MinOptMax contentHeight = calcContentHeightOfCell() if collapsing border model then contentHeight += half of before and after borders for that cell (assuming no break happens) else contentHeight += before and after borders of that cell plus half the BPD border-separation //handle spanned cells contentHeight -= sum of all heights of previous occupied spanned grid units check if cell content overflows available cell BPD extend effCellHeight based on the contentHeight (handling auto height) check if cell content overflows available row BPD extend effRowHeight based on the effCellHeight remember the newly determined row height (effRowHeight)
Further details on nextStep()
There are a few additional things that have to be done in nextStep() (or at least in its context):
- Propagating hard breaks from child elements
- Creating elements in a way that keeps are fulfilled
- Determining the different effective borders for each grid unit
- Handling table-header|footer
More complex example with row spans, headers, footers and borders
This example tries to examine a few more aspects of the whole process.
<fo:table border="solid 5"> <fo:table-header border="solid 4"> <fo:table-row> <fo:table-cell line-height="8" border="solid 1" number-columns-spanned="2"> <fo:block>HEADER</fo:block> </fo:table-cell> </fo:table-row> </fo:table-header> <fo:table-footer border="solid 4"> <fo:table-row> <fo:table-cell line-height="8" border="solid 1" number-columns-spanned="2"> <fo:block>FOOTER</fo:block> </fo:table-cell> </fo:table-row> </fo:table-footer> <fo:table-body border="solid 3"> <fo:table-row border="solid 2"> <fo:table-cell line-height="10" number-rows-spanned="2" border="solid 1"> <fo:block>block1</fo:block> <fo:block>block2</fo:block> <fo:block>block3</fo:block> </fo:table-cell> <fo:table-cell line-height="15" border="solid 1"> <fo:block>blockA</fo:block> </fo:table-cell> </fo:table-row> <fo:table-row> <fo:table-cell line-height="20" border="solid 1"> <fo:block>blockB</fo:block> </fo:table-cell> </fo:table-row> </fo:table-body> </fo:table>
An illustration of this example:
http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders1.png
element lists for the cells
Cell1: length(30,30,30) box(10) penalty(0, 0) box(10) penalty(0, 0) box(10) Cell2: length(15,15,15) box(15) Cell3: length(20,20,20) box(20)
Normal borders (assuming no breaks)
Cell1 before: 4 (body border-after of table-header is dominant) Candidates: border-after header cell, border-after header row, border-after header body, border-before cell1, border-before row1, border-before body after: 4 (body border-before of table-footer is dominant) Cell2: before: 4 (body border-after of table-header is dominant) after: 2 (row border is dominant) Candidates: border after cell2, border-after row1, border before cell 3, border before row2 Cell3: before: 2 (row border is dominant) after: 4 (body border-before of table-footer is dominant)
Table header and footer width
Header is 5 + 8 = 13 Footer is 8 + 5 = 13
Row heights
row1: 15 + (2/2) => (16,16,16) row2: 20 + (2/2) => (21,21,21) row group (37,37,37)
Steps
Step 1: stepw= 10 maxRemainingHeight = 37 addedBoxHeight = 0 penalty = 10 + 37 - 37 = 10 effPenalty: w = 10 (space for block1) + 13 (header) + 4 (border-after-header) + 4 (border-before-footer) + 13 (footer) box = 10 - 0 - 10 = 0 Step 2: stepw = 15 maxRemaningHeight = 20 addedBoxHeight = 0 penalty = 15 + 20 - 37 = -2 effPenalty: w = -2 (calc penalty) + 13 + 4 + 4 + 13 box = 15 - 0 - (-2) = 17 Step 3: stepw = 10 + 10 = 20 maxRemainingHeight = 20 addedBoxHeight = 17 penalty = 20 + 20 - 37 = 3 effPenalty: w = 3 (calc penalty) + 13 + 4 + 4 + 13 box = 20 - 17 - 3 = 0 Step 4: stepw = 10 + 10 + 10 = 30 maxRemaningHeight = 20 addedBoxHeight = 17 penalty = 30 + 20 - 37 = 13 effPenalty: w = 14 (calc penalty) + 13 + 4 + 4 + 13 box = 30 - 17 - 13 = 0 Step 5: stepw = 15 + 2 + 20 = 37 maxRemainingHeight = 0 addedBoxHeight = 17 penalty = 37 + 0 - 37 = 0 (last step, omit penalty) box = 37 - 17 - 0 = 20
End result (with raw penalties)
box(0) penalty(10) -> 10/37 box(17) penalty(-2) -> 15/20 box(0) penalty(3) -> 20/20 box(0) penalty(13) -> 30/20 box(20) -> 37
End result (effective)
box(0) penalty(44) //incl. header and footer -> 44/71 box(17) penalty(32) //incl. header and footer -> 49/54 box(0) penalty(37) //incl. header and footer -> 54/54 box(0) penalty(47) //incl. header and footer -> 64/54 box(20) box(17) //header box(17) //footer -> 71
An example with interaction between the row borders and the header and footer borders: TableLayout KnuthElementsForTables/RowBorder
An example with interaction between the row borders and the header and footer borders taking the former example a step further:
TableLayout KnuthElementsForTables/RowBorder2
The next example converts the previous to the separate border model and shows how to handle it:
TableLayout KnuthElementsForTables/RowBorder3
Knowns problems
- There's a problem when a cell is broken over more than two pages. In this case the steps may not be accurate anymore for breaking between the second and third page. This is only the case when the step generator has to create penalties. If we wanted to fix this, we would have to discard all elements starting on the second page, so the elements can be recreated based on the earlier break decision.