Knuth element generation for tables

Overview

At the moment we see two fundamental approachs for generating Knuth elements for tables. Note: Some of the considerations made here can also be applied to lists.

1. Combined element list

In this approach the individual element lists of each cell are combined to a single element list that can be used in page breaking. The approach is described in detail further down.

2. Combined stretch boxes

In this approach for each cell a single box plus stretchable penalty is created to represent the cell as a whole. All the boxes for a row are combined to one which is used for page breaking. The approach is described here: link

Combined element list

Simple case

<fo:table-row>
  <fo:table-cell line-height="15pt" id="cell1">
    <fo:block>block1</fo:block>
    <fo:block>block2</fo:block>
    <fo:block>block3</fo:block>
  </fo:table-cell>
  <fo:table-cell id="cell2">
    <fo:block>
      <fo:external-graphic src="something" 
        block-progression-dimension.minimum="15pt" 
        block-progression-dimension.optimum="30pt" 
        block-progression-dimension.maximum="45pt"/>
    </fo:block>
  </fo:table-cell>
</fo:table-row>

Cell1 list:

1) box w=15000
2) penalty w=0 p=0
--legal break
3) box w=15000
4) penalty w=0 p=0
--legal break
5) box w=15000
--cut-off point
6) penalty w=0 p=INF
7) glue w=0 y=INF z=0
8) penalty w=0 p=-INF flagged

Cell2 list:

1) box w=30000
2) penalty w=0 p=INF
3) glue w=0 y=15000 z=15000 (stretch for previous box)
--cut-off point
4) penalty w=0 p=INF
5) glue w=0 y=INF z=0
6) penalty w=0 p=-INF flagged

Creating a combined element list here is easy when we omit stretch and shrink values:

step list between Cell1 and Cell2:
0, 15000, 30000, 45000

(the step list represents the dotted green lines on the graphic on TableLayout)

combined list for row without stretch/shrink:

1) box w=15000 [C1: 1-2] [C2: empty]
2) penalty w=0 p=0
3) box w=15000 [C1: 3-4] [C2: 1-3]
4) penalty w=0 p=0
5) box w=15000 [C1: 5-5] [C2: empty]

The next step

Let's modify the original example a bit:

<fo:table-row>
  <fo:table-cell line-height="15pt" id="cell1">
    <fo:block>block1</fo:block>
    <fo:block>block2</fo:block>
    <fo:block>block3</fo:block>
  </fo:table-cell>
  <fo:table-cell id="cell2a">
    <fo:block>
      <fo:external-graphic src="something" 
        block-progression-dimension="33pt"/>
    </fo:block>
    <fo:block line-height="8pt" font-size="8pt">Caption for the above image</fo:block>
  </fo:table-cell>
</fo:table-row>

Cell2a list:

1) box w=33000
2) penalty w=0 p=0
--legal break
3) box w=8000
--cut-off point
4) penalty w=0 p=INF
5) glue w=0 y=INF z=0
6) penalty w=0 p=-INF flagged

step list between Cell1 and Cell2a 0, 15000, 30000, 33000, 41000, 45000

There are suddenly a lot more possible break points.

The challenge was to come up with an algorithm to create the combined list for the more complicated case. Here are the expected results from the broken lists:

  • break at first legal break: Part 1 (height -> 15), Part 2 (height -> 41)
  • break at second legal break: Part 1 (height -> 30), Part 2 (height -> 41)
  • break at third legal break: Part 1 (height -> 33), Part 2 (height -> 15)
  • break at fourth legal break: Part 1 (height -> 41), Part 2 (height -> 15)
  • no breaks: Part 1 (height -> 45)

The proposed algorithm

Luca Furini came up with the following algorithm:

This could be the procedure that creates the combined elements, at least when there is no stretch nor shrink; with this assumption, the resulting sequence will contain only boxes and penalties.

First of all, we must know the total height of the row, if it is not split between pages.

    totalHeight = 0
    for each cell-list
        height[i] = compute the total height of each cell-list
        if (height[i] > totalHeight)
            totalHeight = height[i]

The sequence we are going to create must be totalHeight long when it is not divided.

    step = 0
    addedBoxHeight = 0;
    while step < totalHeight
        step = nextStep()
        penaltyHeight = step + maxRemainingHeight() - totalHeight
        boxHeight = step - addedBoxHeight - penaltyHeight
        addedBoxHeight += boxHeight
        sequence.add(new box(boxHeight))
        sequence.add(new penalty(penaltyHeight, 0))

For each cell-list, we must know, besides the total height, the height of the elements already combined into the new ones.

The nextStep() method looks at each cell-list, computing the height of the first sub-sequence, chooses the smallest increase and updates in each list the height of the elements already combined; maxRemainingHeight() returns the greatest difference between the total height of a cell-list and the height of the elements already combined.

Handling the second example above

Using the example above, the behaviour of this algorithm is this:

totalHeight = 45

1st step: step = 15
          maxRemainingHeight = 41
          addedBoxHeight = 0
          penaltyHeight = 15 + 41 - 45 = 11
          boxHeight = 15 - 0 - 11 = 4

2nd step: step = 30
          maxRemainingHeight = 41
          addedBoxHeight = 4
          penaltyHeight = 30 + 41 - 45 = 26
          boxHeight = 30 - 4 - 26 = 0

3rd step: step = 33
          maxRemainingHeight = 15
          addedBoxHeight = 30
          penaltyHeight = 33 + 15 - 45 = 3
          boxHeight = 33 - 4 - 3 = 26

4th step: step = 41
          maxRemainingHeight = 15
          addedBoxHeight = 30
          penaltyHeight = 41 + 15 - 45 = 11
          boxHeight = 41 - 30 - 11 = 0

5th step: step = 45
          maxRemainingHeight = 0
          addedBoxHeight = 30
          penaltyHeight = 45 + 0 - 45 = 0
          boxHeight = 45 - 30 - 0 = 15

Combined elements, with their length and what happens if a penalty is used:

 box      4
 penalty 11 ----> 15+41
 box      0
 penalty 26 ----> 30+41
 box     26
 penalty  3 ----> 33+15
 box      0
 penalty 11 ----> 41+15
 box     15 ----> 45

Handling the easiest example

The easiest example would look like this us with the algorithm above:

Table with one cell (three one-line blocks)
height = (3,3,3) (each line has a height of 1 in this example)
totalHeight = 3

Step1:
stepw = 1;
maxRemHeight = 2
penaltyHeight = 1 + 2 - 3 = 0;
boxHeight = 1 - 0 - 0 = 1
addedBoxHeight now 1

Step 2:
stepw = 1 + 1 = 2
maxRemHeight = 1
penaltyHeight = 2 + 1 - 3 = 0
boxHeight = 2 - 1 - 0 = 1
addedBoxHeight now 2

Step 3:
stepw = 2 + 1 = 3
maxRemHeight = 0
penaltyHeight = 3 + 0 - 3 = 0
boxHeight = 3 - 2 - 0 = 1
addedBoxHeight now 3

breaks: 1/2, 2/1, 3/0

box     1
penalty 0
box     1
penalty 0
box     1
(penalty 0)

Handling row spanning with this approach

The meaning of "totalHeight" in the above algorithm needs to be extended for the combined list. totalHeight will not only be the total height of a single row, but of a row group. A row group in this context is the minimum number of consecutive rows which contains all spanned grid units of its cells. An example:

+----------+-------+----------+
|          |       |          |
+----------+-------+----------+ <-- here's the delimiter between two row groups
|          |                  |
|          |                  |
|          +-------+----------+
|          |       |          |
|          +-------+          |
|          |       |          |
+----------+-------+          |
|          |       |          |
+----------+-------+----------+

Here we'd have two such row groups, the first is equivalent to the first row, the second contains rows 2 to 5. Note that we can't just sum up the heights of the individual cells in the row group per column. We have to add the space used by the cell borders (and optional cell spacing in separate mode), too, because they are also elements that create steps in the algorithm.

Determining the row height for each row (with row spanning)

for each row
  effRowHeight = minoptmax(0, 0, INF) (=auto height)
  if (bpd of row != auto)
    effRowHeight = minoptmax(bpd of row)
  for each cell ending in the current row
    effCellHeight = minoptmax(0, 0, INF) (=auto height)
    if (bpd of cell != auto)
      effCellHeight = minoptmax(bpd of cell)
    MinOptMax contentHeight = calcContentHeightOfCell()
    if collapsing border model then
      contentHeight += half of before and after borders for that cell (assuming no break happens)
    else
      contentHeight += before and after borders of that cell plus half the BPD border-separation
    //handle spanned cells
    contentHeight -= sum of all heights of previous occupied spanned grid units

    check if cell content overflows available cell BPD
    extend effCellHeight based on the contentHeight (handling auto height)

    check if cell content overflows available row BPD
    extend effRowHeight based on the effCellHeight
    
  remember the newly determined row height (effRowHeight)

Further details on nextStep()

There are a few additional things that have to be done in nextStep() (or at least in its context):

  • Propagating hard breaks from child elements
  • Creating elements in a way that keeps are fulfilled
  • Determining the different effective borders for each grid unit
  • Handling table-header|footer

More complex example with row spans, headers, footers and borders

This example tries to examine a few more aspects of the whole process.

<fo:table border="solid 5">
  <fo:table-header border="solid 4">
    <fo:table-row>
      <fo:table-cell line-height="8" border="solid 1" number-columns-spanned="2">
        <fo:block>HEADER</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-header>
  <fo:table-footer border="solid 4">
    <fo:table-row>
      <fo:table-cell line-height="8" border="solid 1" number-columns-spanned="2">
        <fo:block>FOOTER</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-footer>
  <fo:table-body border="solid 3">
    <fo:table-row border="solid 2">
      <fo:table-cell line-height="10" number-rows-spanned="2" border="solid 1">
        <fo:block>block1</fo:block>
        <fo:block>block2</fo:block>
        <fo:block>block3</fo:block>
      </fo:table-cell>
      <fo:table-cell line-height="15" border="solid 1">
        <fo:block>blockA</fo:block>
      </fo:table-cell>
    </fo:table-row>
    <fo:table-row>
      <fo:table-cell line-height="20" border="solid 1">
        <fo:block>blockB</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-body>
</fo:table>

An illustration of this example:

http://people.apache.org/~jeremias/fop/KnuthBoxesForTablesWithBorders1.png

element lists for the cells

Cell1: length(30,30,30)
box(10)
penalty(0, 0)
box(10)
penalty(0, 0)
box(10)

Cell2: length(15,15,15)
box(15)

Cell3: length(20,20,20)
box(20)

Normal borders (assuming no breaks)

Cell1
before: 4 (body border-after of table-header is dominant)
Candidates: border-after header cell, border-after header row, border-after header body, border-before cell1, border-before row1, border-before body
after: 4 (body border-before of table-footer is dominant)

Cell2:
before: 4 (body border-after of table-header is dominant)
after: 2 (row border is dominant)
Candidates: border after cell2, border-after row1, border before cell 3, border before row2

Cell3:
before: 2 (row border is dominant)
after: 4 (body border-before of table-footer is dominant)

Table header and footer width

Header is 5 + 8 = 13
Footer is 8 + 5 = 13

Row heights

row1: 15 + (2/2) => (16,16,16)
row2: 20 + (2/2) => (21,21,21)
row group (37,37,37)

Steps

Step 1:
stepw= 10
maxRemainingHeight = 37
addedBoxHeight = 0
penalty = 10 + 37 - 37 = 10
effPenalty: w = 10 (space for block1) + 13 (header) + 4 (border-after-header) + 4 (border-before-footer) + 13 (footer)
box = 10 - 0 - 10 = 0

Step 2:
stepw = 15
maxRemaningHeight = 20
addedBoxHeight = 0
penalty = 15 + 20 - 37 = -2
effPenalty: w = -2 (calc penalty) + 13 + 4 + 4 + 13
box = 15 - 0 - (-2) = 17

Step 3:
stepw = 10 + 10 = 20
maxRemainingHeight = 20
addedBoxHeight = 17
penalty = 20 + 20 - 37 = 3
effPenalty: w = 3 (calc penalty) + 13 + 4 + 4 + 13
box = 20 - 17 - 3 = 0

Step 4:
stepw = 10 + 10 + 10 = 30
maxRemaningHeight = 20
addedBoxHeight = 17
penalty = 30 + 20 - 37 = 13
effPenalty: w = 14 (calc penalty) + 13 + 4 + 4 + 13
box = 30 - 17 - 13 = 0

Step 5:
stepw = 15 + 2 + 20 = 37
maxRemainingHeight = 0
addedBoxHeight = 17
penalty = 37 + 0 - 37 = 0 (last step, omit penalty)
box = 37 - 17 - 0 = 20

End result (with raw penalties)

box(0)
penalty(10) -> 10/37
box(17)
penalty(-2) -> 15/20
box(0)
penalty(3)  -> 20/20
box(0)
penalty(13) -> 30/20
box(20)
            -> 37

End result (effective)

box(0)
penalty(44) //incl. header and footer -> 44/71
box(17)
penalty(32) //incl. header and footer -> 49/54
box(0)
penalty(37) //incl. header and footer -> 54/54
box(0)
penalty(47) //incl. header and footer -> 64/54
box(20)
box(17) //header
box(17) //footer
                                      -> 71

An example with interaction between the row borders and the header and footer borders: TableLayout KnuthElementsForTables/RowBorder

An example with interaction between the row borders and the header and footer borders taking the former example a step further:
TableLayout KnuthElementsForTables/RowBorder2

The next example converts the previous to the separate border model and shows how to handle it:
TableLayout KnuthElementsForTables/RowBorder3

Knowns problems

  • There's a problem when a cell is broken over more than two pages. In this case the steps may not be accurate anymore for breaking between the second and third page. This is only the case when the step generator has to create penalties. If we wanted to fix this, we would have to discard all elements starting on the second page, so the elements can be recreated based on the earlier break decision.
  • No labels