Improved Keeps

So far, keeps in FOP have been handled internally as a simple boolean. Recently, many users have run into the following error message:

java.lang.RuntimeException: Some content could not fit into a line/page after 50 attempts. Giving up to avoid an endless loop.

This is essentially bug #39840. It shows the need for a more sophisticated representation of keeps, or at least for a distinction between "auto", "always" and any integer value. As a first step, all integer values can be treated as having the same strength, without entirely forbidding a break if one is necessary to avoid an overflow. The meaning of the value "always" will be refined so that it no longer causes a RuntimeException when a section of a document cannot be placed on a page due to keep constraints. Instead, the content will simply overflow, with a warning to the user, and be handled according to the overflow property.

Discussions

Tasks

  1. Write test cases that document the desired behaviour.
  2. Find a suitable in-LM representation for the keep values.
     One possibility is to simply use the Property values, but that may not be convenient for programming as it may not be type-safe enough. Another is to use an integer and restrict the integer range a little for our purposes (int KEEP_AUTO = Integer.MIN_VALUE, int KEEP_ALWAYS = Integer.MAX_VALUE). Finally, we could define a hierarchy of constant classes (constant = no modification of values after creation, like java.lang.String).
  3. Deal with combining keep-together specifiers. Outer stronger keeps need to override inner weaker keeps.
  4. Adjust element list creation by creating the right penalty values.
     Once the right representation for keep values is found, this shouldn't be difficult.
  5. Adjust overflow handling in the page breaker according to the newly defined behaviour above.
  6. Revise the special penalty values used in the table LM, for example.
     In the table LM, special penalty values are used to keep the first parts of each cell of a row together to avoid unwanted breaks.
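Tasks 2 and 3 can be illustrated with a minimal sketch that combines the integer-range idea with an immutable value class. The class name KeepStrength and all of its methods are hypothetical and not part of FOP; only the KEEP_AUTO and KEEP_ALWAYS constants come from the text above. The combine rule assumes that "outer stronger keeps override inner weaker keeps" means that the stronger of the two values wins.

```java
/** Hypothetical immutable keep value (a sketch, not an existing FOP class). */
final class KeepStrength {
    // Integer range restricted as suggested in task 2.
    static final int KEEP_AUTO = Integer.MIN_VALUE;
    static final int KEEP_ALWAYS = Integer.MAX_VALUE;

    static final KeepStrength AUTO = new KeepStrength(KEEP_AUTO);
    static final KeepStrength ALWAYS = new KeepStrength(KEEP_ALWAYS);

    private final int strength;

    private KeepStrength(int strength) {
        this.strength = strength;
    }

    /** Returns the shared constants for the two special values. */
    static KeepStrength of(int value) {
        if (value == KEEP_AUTO) {
            return AUTO;
        }
        if (value == KEEP_ALWAYS) {
            return ALWAYS;
        }
        return new KeepStrength(value);
    }

    boolean isAuto() {
        return strength == KEEP_AUTO;
    }

    boolean isAlways() {
        return strength == KEEP_ALWAYS;
    }

    int strength() {
        return strength;
    }

    /**
     * Combines this (outer) keep with an inner keep.
     * Assumption: the stronger of the two values applies.
     */
    KeepStrength combine(KeepStrength inner) {
        return this.strength >= inner.strength ? this : inner;
    }
}
```

With this representation, an outer keep of strength 5 combined with an inner keep of strength 2 yields strength 5, and "auto" never weakens another keep because it is the smallest possible value.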

Further tasks

  • Investigate the ideal approach to compressing the keep value range to the penalty range used internally.
  • Investigate how to handle keep-*.within-page, which is currently unhandled.
  • Investigate how best to implement inline keeps.

Example of Integer Values in Keeps

http://people.apache.org/~jeremias/fop/keep-levels.png

SVG version

Random thoughts

The above example shows that, for a given sequence of formatting objects, we need to collect the different integer values on the keeps (Set<Integer>). In the first attempt, we break as if every keep other than "auto" were specified as keep="always".

If that results in an overflow, we set the minimum strength to the second-lowest integer value (2 in the above case). In terms of the Knuth model, the penalties for the breaks with strength 1 are set to p=INFINITY-1 (discouraged but not illegal break). All others remain at p=INFINITY.

If that still results in an overflow, the minimum strength is set to the next higher integer value in the set (3 in the above case), and so on. If there is still an overflow when the minimum strength has reached the highest integer in the set, Integer.MAX_VALUE is assumed as the minimum strength and only keeps with strength "always" are set to p=INFINITY. If there is still an overflow, it is reported to the user by normal means.

This means we may have to build a special KnuthPenalty class which knows that it's a "keep penalty" and holds the highest applicable keep strength. The breaking algorithm has to be extended so it can re-run the breaking process with changed settings for the minimum keep strength.
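The retry procedure described above could be sketched roughly as follows. Everything here is hypothetical: the class KeepRelaxation, its method names, and the overflowsAt callback (which stands in for a full run of the breaking algorithm) are illustrative assumptions, and INFINITY = 1000 is only a stand-in for whatever constant the Knuth elements actually use.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.OptionalInt;
import java.util.TreeSet;
import java.util.function.IntPredicate;

/** Hypothetical sketch of the "raise the minimum keep strength and retry" idea. */
final class KeepRelaxation {
    static final int INFINITY = 1000;                // assumed stand-in value
    static final int KEEP_AUTO = Integer.MIN_VALUE;  // "auto": no keep at all
    static final int KEEP_ALWAYS = Integer.MAX_VALUE;

    /** Penalty of a keep break for the current minimum strength. */
    static int penaltyFor(int keepStrength, int minStrength) {
        // Keeps below the minimum become discouraged but legal breaks.
        return keepStrength >= minStrength ? INFINITY : INFINITY - 1;
    }

    /**
     * Minimum strengths to try, in order: every distinct integer strength
     * in ascending order, then Integer.MAX_VALUE ("always" only).
     */
    static List<Integer> schedule(Collection<Integer> keepStrengths) {
        TreeSet<Integer> sorted = new TreeSet<>(keepStrengths);
        sorted.remove(KEEP_AUTO);
        sorted.remove(KEEP_ALWAYS);
        List<Integer> result = new ArrayList<>(sorted);
        result.add(KEEP_ALWAYS);
        return result;
    }

    /**
     * Re-runs the (simulated) breaker with rising minimum strength.
     * Returns the first minimum strength without overflow, or empty if
     * even the last attempt overflows and must be reported to the user.
     */
    static OptionalInt relax(Collection<Integer> keepStrengths, IntPredicate overflowsAt) {
        for (int minStrength : schedule(keepStrengths)) {
            if (!overflowsAt.test(minStrength)) {
                return OptionalInt.of(minStrength);
            }
        }
        return OptionalInt.empty();
    }
}
```

For keeps with strengths 1, 2 and 3, the attempts use minimum strengths 1, 2, 3 and finally Integer.MAX_VALUE. In the first attempt every keep is at or above the minimum, so all keep penalties are INFINITY, which matches treating them all as "always".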

Implementation Stages

Advanced keeps could be implemented in two stages:

  1. First, we just distinguish between "always" and any integer. "always" will result in p=INFINITY and <integer> will result in p=INFINITY-1. That resolves at least the problem of overflowing pages in most cases.
  2. As a second step, the full implementation is put in place.
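As an illustration of the first stage only, the mapping from a keep value to a penalty might look like the sketch below. The class StageOneKeepPenalty and its method are hypothetical, INFINITY = 1000 is an assumed stand-in value, and mapping "auto" to a penalty of 0 (an ordinary break) is an assumption that the text above does not state.

```java
/** Hypothetical stage-one mapping from a keep value to a penalty. */
final class StageOneKeepPenalty {
    static final int INFINITY = 1000; // assumed stand-in value

    static int penaltyFor(String keepValue) {
        String value = keepValue.trim();
        if (value.equals("auto")) {
            return 0; // assumption: no keep means an ordinary break
        }
        if (value.equals("always")) {
            return INFINITY; // break is forbidden
        }
        // Any integer counts the same in stage one; this call only
        // validates the value and throws NumberFormatException otherwise.
        Integer.parseInt(value);
        return INFINITY - 1; // discouraged but still legal break
    }
}
```

Because all integers collapse to the same penalty, a keep of 1 and a keep of 500 behave identically at this stage. Telling them apart is left to the second stage.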