This page documents aspects around images in Apache FOP. Older design documentation can be found here: http://xmlgraphics.apache.org/fop/design/images.html

Current status

Some of the content below is slightly outdated. The "current problems" are mostly "past problems" now that the image loader framework in XML Graphics Commons has been introduced. Performance and memory consumption have improved as expected. However, the various renderers still handle images in different ways. As an example: Barcode4J currently makes calls against the Graphics2DAdapter interface, the ImageAdapter interface, the PSRenderer class and can still use the fallback via SVG. The coupling is too high. The PDFRenderer also still takes a slightly different approach to image handling than, say, the PSRenderer. Now, with the new intermediate format, all the code that directly depends on the Renderer interface becomes a problem for code reuse.

Unification of image handling is being worked on as part of the implementation of the new intermediate format (AreaTreeIntermediateXml/NewDesign). The proposal is described on the ImageSupport ImageHandler page.

Current problems

Certain renderers can embed images directly (JPEG, EPS and certain TIFF subformats, for example) but other renderers still require a decoded bitmap image. Currently, the cache only provides the first requested variant of an image. If the PDF renderer renders an FO file with a JPEG image and the same document is then rendered with the Java2DRenderer, there will be a problem because JPEGImage loaded the original data and did not decode the JPEG image.
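The requirements section further down proposes storing one cache entry per URI and flavor. A minimal sketch of that idea follows, so that the raw JPEG data and a decoded bitmap of the same image can coexist. All class and enum names here are hypothetical and are not part of FOP or XML Graphics Commons.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical image flavors; the names are illustrative only. */
enum ImageFlavorKind { RAW_JPEG, DECODED_BITMAP }

/** Cache key combining the image URI with the requested flavor. */
final class ImageCacheKey {
    private final String uri;
    private final ImageFlavorKind flavor;

    ImageCacheKey(String uri, ImageFlavorKind flavor) {
        this.uri = Objects.requireNonNull(uri);
        this.flavor = Objects.requireNonNull(flavor);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ImageCacheKey)) {
            return false;
        }
        ImageCacheKey other = (ImageCacheKey) o;
        return uri.equals(other.uri) && flavor == other.flavor;
    }

    @Override
    public int hashCode() {
        return Objects.hash(uri, flavor);
    }
}

/** Minimal cache holding one entry per (URI, flavor) pair. */
final class FlavorAwareImageCache {
    private final Map<ImageCacheKey, Object> entries = new HashMap<>();

    /** Returns the cached variant, loading it through the supplier on first use. */
    Object get(String uri, ImageFlavorKind flavor, Supplier<Object> loader) {
        return entries.computeIfAbsent(new ImageCacheKey(uri, flavor), k -> loader.get());
    }

    int size() {
        return entries.size();
    }
}
```

With such a key, a PDF run asking for the raw JPEG and a Java2D run asking for the decoded bitmap would each get their own entry instead of competing for a single slot.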

Another problem is with color spaces. Details here.

Format support matrix

The following matrix tries to show all the possible combinations. A more graphical view of the whole thing can be seen here (SVG, 45KB).

JPEG

fo:external-graphic only

Renderer	required/preferred variant	Comments
PDF	1:1 embedding
PostScript	1:1 embedding	Requires PostScript Level 2
Java2D	decoded bitmap
PCL	decoded bitmap
AFP	decoded bitmap
SVG	referenced or RFC2396 data URL
RTF	1:1 embedding

1:1 embedding through FOP's own code. No support for decoding JPEG images through an image library, yet.

PNG

fo:external-graphic only

Renderer	required/preferred variant	Comments
PDF	decoded bitmap, possibly 1:1 embedding
PostScript	decoded bitmap, possibly 1:1 embedding for PS Level 3
Java2D	decoded bitmap
PCL	decoded bitmap
AFP	decoded bitmap
SVG	referenced or RFC2396 data URL
RTF	1:1 embedding

BMP

fo:external-graphic only

Renderer	required/preferred variant	Comments
PDF	decoded bitmap
PostScript	decoded bitmap
Java2D	decoded bitmap
PCL	decoded bitmap
AFP	decoded bitmap
SVG	referenced or RFC2396 data URL
RTF	1:1 embedding

GIF

fo:external-graphic only

Renderer	required/preferred variant	Comments
PDF	decoded bitmap
PostScript	decoded bitmap
Java2D	decoded bitmap
PCL	decoded bitmap
AFP	decoded bitmap
SVG	referenced or RFC2396 data URL
RTF	decoded bitmap

TIFF

fo:external-graphic only

Renderer	required/preferred variant	Comments
PDF	decoded bitmap	1:1 embedding for CCITT encoded images
PostScript	decoded bitmap	1:1 embedding for CCITT encoded images (NYI)
Java2D	decoded bitmap
PCL	decoded bitmap
AFP	1:1 embedding for CCITT encoded images
SVG	referenced or RFC2396 data URL
RTF	decoded bitmap

1:1 embedding through the help of Batik's TIFF codec plus FOP's own code.

SVG

fo:instream-foreign-object and fo:external-graphic

Renderer	required/preferred variant	Comments
PDF	native conversion with Batik
PostScript	native conversion with Batik
Java2D	native conversion with Batik
PCL	conversion to bitmap with Batik	HP/GL Graphics2D implementation only for the simplest of SVGs available
AFP	conversion to bitmap with Batik	GOCA implementation in the works
SVG	referenced or embedded
RTF	conversion to bitmap with Batik

For output formats (like PCL and RTF) for which no native conversion is available, we need an alternative that provides the SVG as a bitmap image. This is currently implemented in AbstractGenericSVGHandler and, for RTF, in SVGConverter.

For PDF, it would be interesting to have a native picture painted into a Form XObject so such an image can be preprocessed and more easily reused. The difficulty lies in features like links, which would need to be handled separately since they are not part of a Form object.

Similarly for PostScript, the SVG could be rendered as an EPS file which could be reused within the document.

EPS

fo:external-graphic only

Renderer	required/preferred variant	Comments
PDF	embedded	PDF support is deprecated and not supported by newer Acrobat Readers
PostScript	embedded
Java2D	not supported
PCL	not supported
SVG	not supported
RTF	not supported

If we ever have a PostScript interpreter available to FOP we can support EPS images for other output formats. An alternative could be to extract the TIFF previews provided by certain EPS images but this is better solved through a better suited image format.

The FOray project has the beginnings of a PostScript interpreter with a proof-of-concept implementation for rendering graphics. But making it usable would take a lot of work.

MathML

fo:instream-foreign-object and fo:external-graphic

MathML is internally converted to SVG in the MathML extension and subsequently handled as such. So see the SVG section for details. Same problems, too. The alternative is to render MathML directly using Java2D.

One small issue here: a math expression usually has a baseline. This baseline should be aligned with the FO baseline.

Barcode4J

fo:instream-foreign-object only

Renderer	required/preferred variant	Comments
PDF	painted using Java2D or internally converted to SVG
PostScript	internally converted to EPS
Java2D	direct painting to Graphics2D
PCL	direct painting to Graphics2D
AFP	direct painting to Graphics2D	See Bugzilla #41995 for an alternative using BCOCA
SVG	internally converted to SVG (NYI)
RTF	internally converted to bitmap

The new FOP extension is available since Barcode4J 2.0alpha1.

Other foreign XML formats

The easiest way is to convert to SVG internally and let the renderers handle that format. Examples for this section: Example plan extension, JCharts support etc.

It is better to have those extensions work directly on Java2D, which makes it possible to bypass Batik and speeds things up.
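As a rough illustration of an extension painting directly to Java2D, the sketch below defines a hypothetical painter contract (loosely inspired by the Graphics2DImagePainter idea mentioned on this page, but not its actual signature) and a helper that renders the painter into a bitmap for output formats that cannot take vector graphics. The interface, the class names and the pixel-based sizing are assumptions made for the example.

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

/** Hypothetical painter contract for foreign-object extensions. */
interface ForeignObjectPainter {
    /** Intrinsic size in pixels (an assumption of this sketch). */
    Dimension getSize();

    /** Paints the content into the given area of the graphics context. */
    void paint(Graphics2D g2d, Rectangle2D area);
}

/** Example content: two black bars, painted without going through SVG. */
final class TwoBarsPainter implements ForeignObjectPainter {
    public Dimension getSize() {
        return new Dimension(40, 20);
    }

    public void paint(Graphics2D g2d, Rectangle2D area) {
        g2d.setColor(Color.BLACK);
        double barWidth = area.getWidth() / 4;
        g2d.fill(new Rectangle2D.Double(area.getX(), area.getY(),
                barWidth, area.getHeight()));
        g2d.fill(new Rectangle2D.Double(area.getX() + 2 * barWidth, area.getY(),
                barWidth, area.getHeight()));
    }
}

final class PainterBitmapFallback {
    /** Renders the painter onto a white bitmap of its intrinsic size. */
    static BufferedImage toBitmap(ForeignObjectPainter painter) {
        Dimension size = painter.getSize();
        BufferedImage image = new BufferedImage(
                size.width, size.height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2d = image.createGraphics();
        try {
            g2d.setColor(Color.WHITE);
            g2d.fillRect(0, 0, size.width, size.height);
            painter.paint(g2d, new Rectangle2D.Double(0, 0, size.width, size.height));
        } finally {
            g2d.dispose();
        }
        return image;
    }
}
```

The same painter could be handed to any Graphics2D implementation a renderer exposes, while the bitmap helper covers formats such as RTF.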

Requirements for the whole solution

Extensions which support foreign XML formats should be able to convert their content at least to SVG. Generating bitmaps is also desirable so output formats like RTF can also be supported. Rendering to Java2D may be preferred to SVG as it can reduce some overhead.
Renderers need to expose APIs to output directly supported formats other than bitmap formats. EPS for PostScript, Graphics2D for at least Java2DRenderer but possibly also for PDF and PS. See Graphics2DImagePainter/Graphics2DAdapter as an example for a solution which is already implemented for some renderers.
Different renderers support different source formats/flavors for the images to be embedded. The current cache only supports exactly one flavor. If the same image is rendered with another renderer this might result in problems.
- The image cache should store one entry per URI and flavor.
- Examples of possible flavors are: raw/undecoded, RenderedImage/BufferedImage, Graphics2DImagePainter, EPS, XML (SVG, MathML...), RFC2396 URL, etc.
- Renderers would provide a prioritized list of supported/preferred flavors. The image package would then do necessary decoding/conversions and deliver the best flavor it can deliver in a particular case.
During layout only the image dimensions need to be determined so the image can be properly placed. The actual image data only needs to be available during rendering. Ideally, the InputStream to load the dimensions from should be available to the component fully loading the image later on to avoid additional round-trips to fetch the image. Should additional flavors be needed, the InputStream can probably be reopened.
- To handle all kinds of formats, we may need a special PushBackInputStream which supports arbitrarily sized buffers to reset a stream to position 0. PNG is a case where the resolution information of an image is not guaranteed to be within the first 4KB of the file. See also here. The alternative (now chosen) is to use ImageIO's ImageInputStream which allows caching of content already read either in memory or in a temp file. Specialized implementations (like a ThresholdImageInputStream, switching from memory to file when a certain limit is reached) could be implemented if optimizations are needed.
- When someone works with the XML-based intermediate format to represent the area tree, the layout and the rendering might happen in different VMs. In that case the actual image data might never be loaded at all, but the InputStream still needs to be closed properly!
- We may have to make some distinction between fast and slow connections. If a resource is loaded from a local file, simply reopening the stream could be faster, since we can rely on the operating system's cache. For resources loaded over the (Inter)net, local buffering makes sense, be it in memory or even in a temporary file (as ImageIO likes to do).
- All the little modules should be dynamically registerable. The current hard-coding in ImageFactory is bad.
- We need transparent support for GZIPped content (like SVGZ).
- Support baseline adjustments (for MathML).
- Optional: Some people have told us in the past that they do dynamic image generation (like for charts and such). We told them to implement this in a servlet, but it could be worthwhile (if easily implemented) to let a plug-in generate a dynamic image based on a given URI in some flavor (SVG, bitmap...).
- Optional: If possible, the package should be able to change between multiple implementations of the same document format while parsing. We've had cases where ImageIO gave better results than our internal codecs and vice versa.
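For the GZIP requirement in the list above, one possible approach is to peek at the first two bytes of a stream and wrap it in a GZIPInputStream only when the GZIP magic number (0x1f 0x8b) is present. This is a minimal sketch using only java.io and java.util.zip; the class and method names are made up for the example.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

final class GzipAwareStreams {
    /**
     * Returns a stream that delivers decompressed content if the input
     * starts with the GZIP magic bytes, and the unchanged content otherwise.
     */
    static InputStream unwrapIfGzipped(InputStream in) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(in);
        buffered.mark(2);               // remember the start of the stream
        int b1 = buffered.read();
        int b2 = buffered.read();
        buffered.reset();               // rewind so no bytes are lost
        boolean gzipped = (b1 == 0x1f && b2 == 0x8b);
        return gzipped ? new GZIPInputStream(buffered) : buffered;
    }
}
```

Because the check happens on the content rather than on the file extension, an SVGZ resource served under any name would be handled the same way as a plain SVG.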

Random thoughts

It might be good to separate the image dimension object for the layout process from the actual decoded image, thus providing separate caches for both. [DONE]
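Determining only the dimensions, without decoding pixel data, can be sketched with plain ImageIO and its ImageInputStream. This is an illustration using the standard javax.imageio API, not the code used by the image loader framework; the class name and the choice to return null for unknown formats are assumptions of this sketch.

```java
import java.awt.Dimension;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

final class ImageDimensionProbe {
    /**
     * Reads only the pixel dimensions of the first image in the stream.
     * Returns null if no registered ImageIO reader recognizes the format.
     */
    static Dimension probe(InputStream in) throws IOException {
        try (ImageInputStream iis = ImageIO.createImageInputStream(in)) {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                return null;
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true: header only
                reader.setInput(iis, true, true);
                return new Dimension(reader.getWidth(0), reader.getHeight(0));
            } finally {
                reader.dispose();
            }
        }
    }
}
```

The resulting Dimension is small enough to be cached independently of any decoded image data.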

For high-volume PostScript environments (or PPML) it might be worthwhile not to fully load images at all but to simply insert resource placeholders (DSC comment %%IncludeResource) into the stream. This would speed up the rendering process considerably for environments where such an approach is possible. [DONE]
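The placeholder idea can be sketched as a helper that writes a DSC %%IncludeResource comment instead of the image data. The use of the "form" resource category and the resource name are assumptions for this example; this page does not state which category FOP uses.

```java
import java.io.IOException;
import java.io.Writer;

final class DscResourcePlaceholder {
    /**
     * Writes a DSC comment asking a later processing step to supply
     * the named form resource at this point in the stream.
     */
    static void writeIncludeForm(Writer out, String formName) throws IOException {
        if (formName.isEmpty() || formName.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("form name must be a single non-empty token");
        }
        out.write("%%IncludeResource: form " + formName + "\n");
    }
}
```

A downstream print workflow would then be responsible for resolving the named resource, so the renderer never has to load the image itself.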

If it is known which renderer the document will be rendered to during the layout stage, the images could be loaded in a separate thread after the dimensions have been determined while the actual layout continues. [ignored for now]

We need to get rid of our byte array approach for storing decoded images. This should be done entirely using Java2D/AWT means, i.e. RenderedImage/BufferedImage. [DONE]
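To illustrate the difference, the following sketch converts a raw interleaved RGB byte array into a BufferedImage. The assumed layout (3 bytes per pixel, row-major, no padding) is chosen for the example and does not necessarily match what the old FOP code stored.

```java
import java.awt.image.BufferedImage;

final class RgbBytes {
    /** Converts interleaved 8-bit RGB data (row-major, no padding) to a BufferedImage. */
    static BufferedImage toBufferedImage(byte[] rgb, int width, int height) {
        if (rgb.length != width * height * 3) {
            throw new IllegalArgumentException("expected " + (width * height * 3) + " bytes");
        }
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = (y * width + x) * 3;
                int r = rgb[i] & 0xFF;       // mask to treat bytes as unsigned
                int g = rgb[i + 1] & 0xFF;
                int b = rgb[i + 2] & 0xFF;
                image.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return image;
    }
}
```

Once the data lives in a BufferedImage, color model and sample layout travel with the image instead of being implied by convention.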
