Whitespace Management Extension
Overview
What is meant by whitespace management? Whitespace management is the ability to dynamically select alternative content so that a page is fully populated. XSL-FO 1.1 provides no means of fulfilling this requirement, but it was discussed at the XSL-FO 2.0 requirements meeting, and it is something that users of digital publishing tools often expect.
To explain the requirement a bit more, let's consider an example extension element, fox:best-fit.
<fox:best-fit>
  <fox:alternative>
    <fo:block>Short Message</fo:block>
  </fox:alternative>
  <fox:alternative>
    <fo:block>2 Line Message 2 Line Message</fo:block>
  </fox:alternative>
  <fox:alternative>
    <fo:block>3 Line Message 3 Line Message 3 Line Message</fo:block>
  </fox:alternative>
  <fox:alternative>
    <fo:block>Long Message. Long Message. Long Message. Long Message.</fo:block>
  </fox:alternative>
</fox:best-fit>
When the renderer encounters a fox:best-fit element, it analyses the space available in the block-progression-dimension (BPD) within the current page and selects the alternative that fits "best" within the remaining BPD. By best fit, I mean the alternative whose BPD is less than the remaining BPD but greater than the BPD of every other alternative that is also less than the remaining BPD.
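The selection rule just described can be sketched in a few lines. This is only an illustration of the rule as worded, not FOP code: the function name select_best_fit and the idea of passing pre-measured BPD values as plain numbers are assumptions made for the example. It follows the wording literally, so an alternative must be strictly smaller than the remaining BPD; allowing an exact fit would mean changing the comparison to "greater than".

```python
from typing import Optional, Sequence


def select_best_fit(alternative_bpds: Sequence[float],
                    remaining_bpd: float) -> Optional[int]:
    """Return the index of the tallest alternative that is smaller than
    remaining_bpd, or None when no alternative fits.

    Hypothetical helper: the BPD of each alternative is assumed to have
    been measured already.
    """
    best_index = None
    for index, bpd in enumerate(alternative_bpds):
        if bpd >= remaining_bpd:
            continue  # too tall for the space left on the page
        if best_index is None or bpd > alternative_bpds[best_index]:
            best_index = index
    return best_index
```

With four alternatives measuring 12, 24, 36 and 60 units and 40 units left on the page, the third alternative (index 2) would be chosen.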
Questions
- I know that in this particular requirement such an alternative block will not be broken across pages. But since the XSL working group has it on their requirements list, is it conceivable that those blocks will need to be broken across pages?
- Are there better suited names for the new elements?
- fox:best-fit-block and fox:best-fit-alternative perhaps? "alternative" seems somewhat too generic. "best-fit-block" stresses that this is about block-progression-direction only.
[CB] Yes I agree the names aren't great. I just made them up to get the point across. The 2 names you suggest do sound better to me.
What about rewording this example using a single, standard block containing inline / block level alternatives? Would it be a different use case, a more general (or more particular) one, or just a different "perspective"? [LF]
[JM] Right. Thinking some more, there's at least one interesting use case in inline-progression-direction (actually a real-life use case from a project I'm working on, although we don't use FO there). Imagine a task where you have to format an address inside a table-cell with limited room. You'll want to write the full name and address if possible, but you'll also want to apply some shortening rules in case the strings are very long:
<fo:table-cell>
  <fo:block>
    <fox:best-fit>
      <fox:best-fit-alternative>John Ronald Reuel Tolkien</fox:best-fit-alternative>
      <fox:best-fit-alternative>John R. R. Tolkien</fox:best-fit-alternative>
      <fox:best-fit-alternative>J. R. R. Tolkien</fox:best-fit-alternative>
    </fox:best-fit>
  </fo:block>
  <fo:block>
    <fox:best-fit>
      <fox:best-fit-alternative>Literature Street 76</fox:best-fit-alternative>
      <fox:best-fit-alternative>Literature Str. 76</fox:best-fit-alternative>
    </fox:best-fit>
  </fo:block>
</fo:table-cell>
As we cannot, in general, know how many lines each alternative will create, we may need a way to define a priority among the alternatives (a priority attribute?) so that, if both the n-th alternative and the m-th one lead to the creation of K lines, the application knows which one to use. [LF]
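To make the priority idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the Candidate record, the convention that a lower priority value means "preferred", and the choice of "fewest lines first" as the primary criterion. The discussion above only establishes that priority should break ties between alternatives producing the same number of lines.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    text: str
    lines: int     # lines this alternative would produce in the cell
    priority: int  # hypothetical attribute; lower value = preferred


def pick_by_priority(candidates: Sequence[Candidate],
                     max_lines: int) -> Optional[Candidate]:
    """Among alternatives needing at most max_lines, take the one with
    the fewest lines; equal line counts are resolved by priority."""
    fitting = [c for c in candidates if c.lines <= max_lines]
    if not fitting:
        return None
    return min(fitting, key=lambda c: (c.lines, c.priority))
```

In the address example, if the full name needed two lines while both shortened forms needed one, a one-line limit would leave two candidates with the same line count, and the priority would decide in favour of "John R. R. Tolkien".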
Instead of creating new extension element types, why not use fo:multi-switch and fo:multi-case (instead of the proposed fox:best-fit and fox:best-fit-alternative), along with a new extension property, fox:best-fit on fo:multi-switch, which triggers an implicit process for setting the currently-visible-multi-case trait on the fo:multi-switch to select none or one of the alternative fo:multi-case sub-trees to contribute generated areas? The structure and function of these existing FO element types exactly match the proposed new element types. [GA]
Thoughts for a possible implementation
First thought is to implement it similarly to a block-container with stretch and shrink. The implementation could measure the min/opt/max of every alternative and could calculate a combined min/opt/max from that. Creating Knuth elements for this combined min/opt/max is easy (a single box plus glue for the whole best-fit block with no break opportunity in between). After the page breaking, generating the contents of a particular alternative is easy and can be done the same way as for block-container.
What's critical is the optimum BPD for the whole block. Assume you are creating a catalog in which each article starts with the same larger structure, followed by some variable-length content. Since the first block always takes some space, you will want to keep it together. This can lead to larger white areas on a page after the end of an article. You may be able to fill that with additional content, an additional picture perhaps. But that should only happen if there's really room to waste, so in this case the optimum value is 0pt. Maybe each alternative would have to have an id attribute, and the best-fit block would contain a reference to the preferred alternative. The preferred alternative's optimum height would then become the best-fit block's optimum height. In the case where you don't want any additional content in the normal case, you'd specify an empty alternative and assign its id to the best-fit block.
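The following sketch shows one way the combined min/opt/max and the resulting "single box plus glue" could be derived. It is not FOP code. The MinOptMax record, the combine and knuth_elements functions, the tuple representation of the elements, and the decision to put the optimum into the box and the stretch/shrink into a zero-width glue are all assumptions; the text above does not fix these details. Lengths are plain integers in an arbitrary unit.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MinOptMax:
    minimum: int
    optimum: int
    maximum: int


def combine(alternatives: Dict[str, MinOptMax], preferred_id: str) -> MinOptMax:
    """Span all alternatives; the optimum comes from the preferred one."""
    lowest = min(a.minimum for a in alternatives.values())
    highest = max(a.maximum for a in alternatives.values())
    return MinOptMax(lowest, alternatives[preferred_id].optimum, highest)


def knuth_elements(combined: MinOptMax) -> List[Tuple]:
    """One box followed by one glue, with no break opportunity between.

    The glue is ("glue", width, stretch, shrink).
    """
    box = ("box", combined.optimum)
    glue = ("glue", 0,
            combined.maximum - combined.optimum,
            combined.optimum - combined.minimum)
    return [box, glue]
```

For the catalog case, an empty alternative with id "none" (0/0/0) marked as preferred, next to a picture alternative of 80/100/120, would give a combined 0/0/120: the block costs nothing by default but may stretch up to the picture's maximum when there is room to waste.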
The general situation in which the alternative blocks could be broken seems quite similar to the MultiLayoutSequence one. [LF]
[VH] Unless I missed something this will work only if the min/opt/max of the several alternatives are overlapping. If each alternative has only a fixed length, you can't create a min/opt/max representing them, because glue is a continuous value (not a discrete one). Suppose you have 3 alternatives of fixed lengths 8, 10 and 12. You can't simply create a min/opt/max of 8/10/12: what if the glue is stretched so that the block ends up having a size of 11? There is no suitable alternative for such a length.
[JM] I see the problem. However, I don't think it would be a problem to simply add additional whitespace for the surplus stretch (in your example, length 10 is chosen and 1 unit whitespace added). We should probably look at how we do the area generation in this case. I'd keep it simple and let the best-fit element create a viewport/reference pair. You could then let the display-align/text-align properties define how the content should be placed inside the container.
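The fixed-length case from this exchange can be written down as a small sketch: pick the largest alternative that fits the length the page breaker settled on, and report the surplus as whitespace. The function name place_discrete is hypothetical, and how the surplus is distributed (display-align, text-align) is left out.

```python
from typing import Optional, Sequence, Tuple


def place_discrete(lengths: Sequence[int],
                   target: int) -> Optional[Tuple[int, int]]:
    """Return (chosen_length, leftover_whitespace) for the largest fixed
    length not exceeding target, or None if nothing fits."""
    fitting = [length for length in lengths if length <= target]
    if not fitting:
        return None
    chosen = max(fitting)
    return chosen, target - chosen
```

For alternatives of fixed lengths 8, 10 and 12 and a block that ends up with size 11, this gives length 10 plus 1 unit of whitespace, as in the reply above.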
However, if the situation is similar to the one in the picture below, it becomes possible to create a combined min/opt/max:
If this is not the case, I don't see any other possibility than computing break points once for each alternative, which will quickly lead to combinatorial explosion. If we implemented a best-fit strategy in addition to the current total-fit one, we could reduce the complexity a lot by making a decision at the end of each page. Anyway, it seems to me that this whole approach is only of interest with a best-fit strategy.
[JM] With the addition of the inline-progression-direction variant, the implementation gets more complicated. Since it's about two different orientations, I assume the two implementations would have to be done separately, i.e. best-fit-block and best-fit-inline.
Possible implementation
Although my requirements are slightly different from (or should I say simpler than) what has been said above, the core idea is still the same. This work should be considered one possible interpretation of the above discussion, with some differences tailored to particular needs. It is still a work in progress, and any issues or limitations will be addressed as more tests are performed. The end goal is to implement a high-quality, highly extensible whitespace management extension that conforms to the FOP architecture and best practices.
1- Fitting strategy:
- The selection of the best alternative is based on the chosen fitting strategy, which is simply a property of fox:best-fit and can take one of the following values: {“first-fit”, “smallest-fit”, “biggest-fit”}.
- first-fit: as the name implies, the first alternative that fits into the available space is selected; the rest are ignored.
- smallest-fit: the best alternative is the one that occupies the least amount of space and still fits inside the current page.
- biggest-fit: the best alternative is the one that occupies the greatest amount of space and still fits inside the current page.
- More fitting strategies will be added as I develop my concept further. For example, it might be useful to define strategies that select more than one alternative, depending on certain shared traits or on whether their combined width is less than a certain threshold.
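The three strategies listed above can be sketched as follows. This illustrates their described behaviour only and is not the extension's actual code: the function name choose_alternative is hypothetical, each alternative is reduced to a single pre-measured size, and "fits" is taken to mean "no larger than the available space".

```python
from typing import Optional, Sequence


def choose_alternative(strategy: str,
                       sizes: Sequence[int],
                       available: int) -> Optional[int]:
    """Return the index of the selected alternative, or None if none fits."""
    fitting = [(i, size) for i, size in enumerate(sizes) if size <= available]
    if not fitting:
        return None
    if strategy == "first-fit":
        return fitting[0][0]          # document order wins
    if strategy == "smallest-fit":
        return min(fitting, key=lambda pair: pair[1])[0]
    if strategy == "biggest-fit":
        return max(fitting, key=lambda pair: pair[1])[0]
    raise ValueError(f"unknown fitting strategy: {strategy}")
```

For alternatives of sizes 30, 10, 20 and 50 with 35 units available, first-fit and biggest-fit would both select the first alternative, while smallest-fit would select the second.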
2- Best fit penalty:
- By definition, an alternative is not allowed to be broken across pages or columns; it must be kept together in one chunk, as if it had the keep-together property implicitly set to always. To achieve that, I have added a new penalty type (BestFitPenalty) that keeps a record of the set of alternatives waiting to be evaluated. It also informs the layout manager which alternative was chosen when addAreas() is called.
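The "record now, report later" role described for the penalty can be pictured with the sketch below. It is a stand-in written for this page, not the BestFitPenalty class itself: the class name BestFitPenaltySketch, its fields, the resolve method and the use of a first-fit rule are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BestFitPenaltySketch:
    """Holds the pending alternatives until page breaking has decided."""
    alternative_sizes: List[int]   # one unbreakable size per alternative
    chosen: Optional[int] = None   # index read back at area-generation time

    def resolve(self, available: int) -> Optional[int]:
        """Pick the first alternative that fits and remember the choice."""
        self.chosen = next(
            (i for i, size in enumerate(self.alternative_sizes)
             if size <= available),
            None,
        )
        return self.chosen
```

A layout manager would call resolve once the available space is known and consult chosen later, when areas are generated, to add only the selected alternative.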