Interdependent page and line breaking

Introduction

Different page widths in a page sequence cause an interdependence between page and line breaking. It is no longer possible to determine the best linebreaks for the paragraphs first, and then the best pagebreaks. Here I wish to formulate an algorithm that determines the best total fit for both page breaking and line breaking in the presence of such an interdependence.

As a basis I use the abstract representation of Knuth and Plass of inline text and of vertical page elements, in terms of boxes, glues and penalties.

According to that representation paragraphs with inline text have legal linebreak points. I consider those legal linebreak points also as legal pagebreak points. In addition, there are legal pagebreak points between the vertical elements such as paragraphs and blocks.

I try to model the algorithm after that of Knuth and Plass, and to formulate the modifications that are needed for my goal of interdependent page and line breaking. The algorithm contains a main loop over the legal pagebreak points. For each legal pagebreak point the algorithm contains a loop over the active nodes. For each active node the best page layout between the active node and the current pagebreak point is determined. Its cost is expressed in demerits. The algorithm maintains a set of active page nodes. A node is a data structure that is connected with a feasible pagebreak point. A feasible pagebreak point is a pagebreak point up to which there exists a page layout within the tolerance. A node is kept on the active list if the content between that node's pagebreak point and the current pagebreak point fits on a page without squeezing beyond the specified shrink components of the content.

Within the inner loop, we consider the page and paragraph layout between the active and the current pagebreak point. If the active breakpoint is within a paragraph, we calculate the best line breaks from that breakpoint to the end of the paragraph. For all complete paragraphs on the page we use the best paragraph layout. If the current pagebreak point is within a paragraph, we calculate the best line breaks from the start of the paragraph to that breakpoint.

Because the page layout may prefer longer or shorter paragraphs than the best paragraph layout, we need to calculate for each paragraph the best layout for each possible number of lines. Because some inline content may modify the line height, we may even need to calculate for each paragraph the best layout for each different possible value of the height of the paragraph. In the interest of fewer computing resources, we may ignore the latter possibilities, and work only with the best number of lines for each paragraph.

In page independent linebreaking, for each feasible breakpoint the best node is retained, which represents the best layout of the paragraph up to that point, and which, due to the dynamic principle, is part of the best layout of the whole paragraph if that layout uses this breakpoint. If line numbers matter, the best node for each line number is retained. In page dependent linebreaking, even that is not enough. We must retain the best node for each vertical offset on the page, because that is the quantity that influences page breaking. This strategy invalidates the reusability of the linebreaking calculations, because selecting the best node involves the previous paragraphs on the page. In fact, because the nodes of a paragraph are independent of the preceding paragraphs, we only need to take the vertical offset from the top of the paragraph into account, which preserves the reusability of the results. Later we can combine the results with the offsets of the preceding paragraphs.

When we reach the current pagebreak point, we have a number of best nodes for each paragraph on the page. We make all possible combinations, and calculate the page demerits of each. Then we select the best page layout and construct the pagebreak node for it.

When the linewidths depend on the page number, we need to remember the best pagebreak node for each feasible pagebreak point for each page number. Otherwise, we only need to remember the best pagebreak node for each feasible pagebreak point. Note that the latter condition is true in the presence of out-of-line elements, because those are related to the content of the page, not the the page number.

One issue that will have to be addressed is that widow or keep-with-previous settings may invalidate some previously believed legal breakpoints. In such cases, active nodes which contain those breakpoints in their chains will have to be deactivated; if they were the chosen best nodes, some other nodes will somehow have to be retrieved to replace them.

Optimization opportunity 1: We may need to reuse many times the best layout from a breakpoint to the end of the paragraph, and the best layout from the start of the paragraph to a breakpoint. Therefore we need to store a reference to the best end node for either case in the active node. If we wish to take into account the different possible heights of the part of the paragraph, we need to store references to a set of best end nodes. Especially for long page sequences, a page breakpoint may be feasible both for an odd and an even page. In that case, we need to store different end points for each, due to the different line widths on odd and even pages. This optimization is certainly true for the start of the paragraph. There will be many page layouts on which the whole paragraph fits.

In general we need to determine for each pair of nodes if they are equivalent in the sense that the node with the higher amount of demerits may be dropped in favour of the other node. One can generalize this to the notion of the context. We need to identify all elements which determine the context, and be able to compare two contexts and say if they are equivalent or not.

Optimization opportunity 2: Do we need to consider each active node? Or can we already determine that some active nodes will never give a better layout than others? Suppose that a paragraph has two feasible breakpoints A and B, which have an equal number of lines, or even the same height, before and after the page breakpoint. Suppose that B has a higher amount of demerits than A. Can we then conclude that B will never be part of the best layout, because a better layout can always be achieved with A? Yes, we can.

Note that the arguments for both optimization opportunities are no longer completely correct in the presence of footnotes, before-floats and side-floats. An inline element with such an out-of-line element attached to it changes the height of the part of the paragraph to which it belongs.

In order to explore the above algorithm in more detail, and to study possible optimizations, I will focus on a few specific cases.

Case 1

In the first case I consider a page sequence whose content consists solely of paragraphs of text. There are two different page widths, one for the odd pages, one for the even ones. Every line has the same height.

In the first step we consider the first legal linebreak point in the first paragraph. This is also the first legal pagebreak point. The active page node list contains one node, that for the pseudo-pagebreak before the first page. For this pair of current pagebreak point and active page node we calculate the best page layout. The page content is the start of the first paragraph, up to the first legal linebreak point. We calculate the best linebreaking for this partial paragraph. The active line node list contains one node, that for the pseudo-linebreak before the first line. In almost all cases, except in the case of an extraordinary short line width, the linebreak point is not feasible. Therefore the pagebreak point is also not feasible.

In the second step we consider the second legal linebreak point in the first paragraph, which is also the second legal pagebreak point. The active page node list still contains its single starting node. For this pair of current pagebreak point and active page node we again calculate the best page layout. The page content is now the start of the first paragraph, up to the second legal linebreak point. Again we calculate the best linebreaking for this partial paragraph. The active line node list still contains its single starting node. Again in almost all cases the linebreak point and the pagebreak point are not feasible.

In this way we continue. At a certain step we reach the first feasible linebreak point, which ends the first line. The page now contains one line, and the pagebreak point is still not feasible.

In most cases the whole paragraph will fit on the page, and the legal pagebreak point after it is not feasible. So we do another paragraph, and another. Finally we will reach a feasible pagebreak point, which ends page 1. The corresponding linebreak point ends a line which is able to fill the page height with maximally tolerable stretching. In the next steps we will find more feasible pagebreak points. They correspond to different layouts of the last paragraph that starts on this page, with the same number of lines, and to layouts with more lines, which also fit on the page, until the maximal shrink is exceeded. At this point the starting node is deactivated, and the active page node list only contains nodes that end page 1.

As soon as we have the first active page node of page 1, we will be doing linebreaking calculations on page 2, with a different linewidth. When the linebreak point following that node is the current page and linebreak node, we consider it as a feasible pagebreak point on page 1, with the starting node as the active node. But with the next active page node, the page node ending page 1, we consider it as a feasible pagebreak point on page 2 and we calculate a possible layout of page 2. This calculation is very similar to that for the first pagebreak point, and both the line and the pagebreak points are not feasible.

It should be noted that for each feasible pagebreak node, we restart the linebreaking calculations of that page. Of course we must optimize this. We could attach a data structure to the starting page node which records the results of the calculations done. Those results consist of the active line node list, the last linebreak point considered, and the widths calculated up to that point. During uninterrupted linebreak calculation these data are held in loop variables. Now we must store them so that they can be retrieved when we resume the calculation in the consideration of the next current pagebreak point. On the first page, the starting page node is the only node for which this must be done. At other pages any active page node is considered several times as the start of a new page, and any active node must store the results of the corresponding linebreak calculations. In addition the starting node of each paragraph must store such data. These are used when the paragraph starts on the page. When the linebreak calculations for the complete paragraph have been done, the data structure contains the paragraph's end nodes. The data must also register the linewidth configuration for which they are valid. In our simple case, a pagebreak point may be feasible for both an odd and an even page. Then it must store two sets of linebreak data, one for each page width.

It should also be noted that a paragraph which contains the active page node or the current pagebreak point, is effectively split into two and each part is laid out separately. This makes it possible that we carry along layout solutions which have no chance of being part of the best solution. In the introduction we pointed out such a case:
Suppose two pagebreak points A and B have an equal number of lines before and after the pagebreak, and A has the fewest demerits. If we would consider the paragraph as a whole, we would soon notice that B can be dropped. In the above algorithm we will not notice this unless we pay special attention to this possibility. Of course, in the end we will calculate the overall best layout. This argument is strictly only true if the parts before and after the pagebreak have equal heights. If they do not, we are correct in carrying both solutions along, because they have a different contribution to the pagebreak calculations. In our current simple case we assume that the same number of lines have the same height.