General Information

OASIS Standards

The most comprehensive information on the ODF format is found in the OASIS Standards, which are freely available online.  The current OASIS Standard is ODF 1.2, approved on 2011-09-29 and in the process of being promulgated as ISO/IEC International Standard IS 26300:2015.  The major implementations of ODF 1.2 are the descendants of OpenOffice.org 3.x (especially current versions of LibreOffice and Apache OpenOffice) and the Word, Excel, and PowerPoint applications of Microsoft Office 2013.  There are also varying degrees of support in Google Docs, in the Microsoft Office web applications, and in emerging device applications.

The current specifications can be downloaded from OASIS at <http://docs.oasis-open.org/office/v1.2/os/>.  If you will be working with the specification often, it is convenient to download the file OpenDocument-v1.2-os.zip, which contains all of the specifications in their PDF, ODT, and HTML forms along with the various schema files.  The Part 1 document has the important details.  Part 2 is about OpenFormula only.  Part 3 is about the use of Zip for packaging ODF documents as multi-part collections.  The unnumbered Part provides a combined Table of Contents for the three parts and also has some conformance statements that are important overall.

For ODF 1.2, the schemas are not included in the text of the specification, but they are important companions to it.  The schemas are how one determines whether an element or attribute occurrence is optional and whether the occurrence of one XML feature depends on the occurrence of another.  Versions of the schemas that are searchable and navigable as hypertexts have been placed at <http://nfoworks.org/notes/2014/05/n140504d.htm> (OpenDocument-v1.2-manifest-schema.rng) and <http://nfoworks.org/notes/2014/05/n140504f.htm> (OpenDocument-v1.2-schema.rng).
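As a rough illustration of reading optionality out of a Relax NG schema, the following Python sketch classifies the attributes in a small grammar by whether they sit directly inside an <optional> pattern.  The grammar fragment, the "example:" attribute names, and the function name are hypothetical and are not taken from the ODF schemas.  The real OpenDocument-v1.2-schema.rng also uses references, choices, and repetition patterns that this sketch does not follow.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Hypothetical grammar fragment, for illustration only (not from the ODF schema).
SAMPLE_RNG = """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="example-attlist">
    <attribute name="example:required-attr"/>
    <optional>
      <attribute name="example:optional-attr"/>
    </optional>
  </define>
</grammar>"""


def classify_attributes(rng_text):
    """Map each attribute name to True when its direct parent is <optional>."""
    root = ET.fromstring(rng_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    optional_tag = "{%s}optional" % RNG_NS
    result = {}
    for attr in root.iter("{%s}attribute" % RNG_NS):
        parent = parent_of.get(attr)
        result[attr.get("name")] = parent is not None and parent.tag == optional_tag
    return result


if __name__ == "__main__":
    for name, is_optional in classify_attributes(SAMPLE_RNG).items():
        print(name, "optional" if is_optional else "required")
```

A direct-parent check is only a first approximation.  An attribute can also become optional through an enclosing <choice> or <zeroOrMore>, or through the way its <define> is referenced elsewhere in the grammar.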

The most widely implemented version of ODF prior to ODF 1.2 is the OASIS Standard ODF 1.1, approved on 2007-02-01 and available at <http://docs.oasis-open.org/office/v1.1/errata01/os/> with its incorporated 2013 Errata.  This is the version that is aligned with the ISO/IEC International Standard 26300:2006/Amd 1:2012, although there are a few additional errata.  ODF 1.1 documents can be found in the wild, and there are some breaking changes between ODF 1.1 and ODF 1.2.  In particular, ODF 1.1 did not specify spreadsheet formulas, so implementations used a formula form that predates the OpenFormula introduced in ODF 1.2.  Some commercial software that remains in use (such as Microsoft Office 2007) has its support for OpenDocument files based on ODF 1.1.
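Because documents of both versions circulate, a consumer may want to check which version a document declares.  The Python sketch below reads the office:version attribute from the root element of an ODF XML stream such as content.xml.  The function name and the two sample strings are illustrative assumptions.  The attribute is understood to be mandatory only from ODF 1.2 onward, so a missing value is a hint that the document predates ODF 1.2 rather than proof; confirm the exact rule against the specification.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def declared_odf_version(xml_text):
    """Return the office:version value on the root element, or None if absent."""
    root = ET.fromstring(xml_text)
    return root.get("{%s}version" % OFFICE_NS)


# Illustrative root elements only; real documents carry far more content.
sample_12 = (
    '<office:document-content '
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'office:version="1.2"/>'
)
sample_unversioned = (
    '<office:document-content '
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"/>'
)

if __name__ == "__main__":
    print(declared_odf_version(sample_12))
    print(declared_odf_version(sample_unversioned))
```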

Exploring and Processing ODF Files

OASIS OpenDocument Essentials

An introductory book can be obtained here:  OASIS OpenDocument Essentials (HTML, PDF).  It is part of this overall material: http://books.evc-cit.info/index.html. [Note: This otherwise-excellent 2005 book is based on ODF 1.0 and relies on OpenOffice.org 1.9 with its many deviations and implementation-dependent provisions.  (For example, external DTDs are not used in ODF, which has Relax NG schemas.)  The material should be used cautiously and checked against the ODF 1.1 specification, the current International Standard, and OASIS Standard ODF 1.2 (especially its OpenFormula part), which is becoming the next version of the International Standard.  Current versions of OpenOffice.org descendants are in the 4.x range.]

Apache Project Resources

For implementation details, there is the source code and documentation of the Apache OpenOffice (AOO) project and the http://openoffice.org site. 

AOO is complex software.  An alternative source for some implementation fundamentals is the Apache ODF Toolkit podling.  Although the ODF Toolkit consists primarily of Java code, it is more concise and makes it easier to grasp the design concepts involved in consuming and producing ODF files.  The Toolkit also includes a validator that may be both usable and informative.

ODF Conformance/Compliance Assurance Helix

The scope for Corinthia includes:

Many office document programs claim to read/write to the ISO open standards for office documents, OpenDocument Format (ODF) and Office Open XML (OOXML), but do not document which parts are left unimplemented. Furthermore, the standards have a large number of "implementation defined" parts, making real-world congruence chancy. The Corinthia toolkit wants to put this unacknowledged aspect into the open and provide "compliance sheets" for document formats, as known from industry computer protocols.

Corinthia aims at generating a large set of test documents, which can be used to verify the "compliance sheets". The code can work as test case for other applications (or entities tendering for OOXML/ODF based systems) as well.

It is proposed to address this situation for OpenDocument Format using an Assurance Helix:

The Assurance Helix is independently usable by any party to develop comparisons and calibrations of particular ODF-supporting implementations.  The development of compliance sheets is backed up by the Assurance Helix, and the helix is usable as backup to other demonstrations of conformance and to the identification of deviations and extensions.

Further detailing of the Assurance Helix involves the following aspects.

  1. Conformance Requirements
    Conformance cases for ODF documents are complex.  A matrix of the overall conformance cases is used to characterize how the major format variations are treated.
  2. Single File Documents
    The ODF Specification provides for single XML files as carriers of complete ODF documents.  Such documents are not quite as flexible as their multi-part counterparts carried inside of an ODF Package (a special usage of Zip).  The single file documents are very useful, however:
  3. ODF Packaging
    The specific format employed for ODF Documents conveyed in ODF 1.2 Packages requires iterative development and verification of features for the packaging structures themselves, as defined in ODF 1.2 Part 3 and supplemented by a few requirements in other parts.  A number of unique features (such as application of digital signatures and use of encryption/decryption) apply to packages.  These are defined for broader use than the specific application as a carrier of ODF 1.2 documents.
  4. Multi-Part ODF Document Packages
    Single-file documents all have multi-part flavors that employ ODF Packaging.  There are portions of the Assurance Helix for those cases.  Multi-part forms have important additional cases involving use of embedded materials, cross-references among materials, and additional ways of linking and carrying metadata.  These are the most common form of ODF documents "in the wild."  Multi-part documents will be the stress cases for interoperability, successful round-trip usage in collaborative work, and the ability to substitute implementations while preserving fidelity to the intended document.
  5. Extended Features and Extension Mechanisms
    There are systematic provisions for the presence of extensions in the file formats for ODF documents. Providing benign or gracefully-reduced functions in the face of extensions that are not understood is an important factor in both the definition of extensions and in the recognition of them.  Extensions can be peppered almost anywhere in the above cases and are provided for in the conformance requirements.  The identification of extensions, the source of their definitions, and behavior in the face of them are also factors in the development of compliance information.
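To make the packaging aspects above more concrete, here is a minimal Python sketch of the kind of check a package-level test might perform.  It builds a tiny package in memory, then verifies that the "mimetype" entry comes first and is stored uncompressed, and lists the entries declared in META-INF/manifest.xml.  The helper names and the sample package are assumptions for illustration.  The sample is not claimed to be a conformant ODF document, and the checks cover only a small part of what ODF 1.2 Part 3 requires.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_sample_package():
    """Build a tiny illustrative package in memory (not a complete document)."""
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<manifest:manifest xmlns:manifest="%s">'
        '<manifest:file-entry manifest:full-path="/" manifest:media-type="%s"/>'
        '<manifest:file-entry manifest:full-path="content.xml" '
        'manifest:media-type="text/xml"/>'
        '</manifest:manifest>' % (MANIFEST_NS, ODT_MIMETYPE)
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry is written first and without compression.
        zf.writestr("mimetype", ODT_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", "<placeholder/>")
        zf.writestr("META-INF/manifest.xml", manifest)
    return buf.getvalue()


def inspect_package(data):
    """Return (mimetype_is_first_and_stored, mimetype_text, manifest_entries)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        first_ok = (
            len(infos) > 0
            and infos[0].filename == "mimetype"
            and infos[0].compress_type == zipfile.ZIP_STORED
        )
        names = zf.namelist()
        mimetype = zf.read("mimetype").decode("ascii") if "mimetype" in names else None
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    # Map each declared path to its declared media type.
    entries = {
        e.get("{%s}full-path" % MANIFEST_NS): e.get("{%s}media-type" % MANIFEST_NS)
        for e in root.iter("{%s}file-entry" % MANIFEST_NS)
    }
    return first_ok, mimetype, entries


if __name__ == "__main__":
    ok, mimetype, entries = inspect_package(build_sample_package())
    print("mimetype first and stored:", ok)
    print("declared mimetype:", mimetype)
    for path, media_type in entries.items():
        print(path, "->", media_type)
```

The same inspect_package function can be pointed at the bytes of a real .odt, .ods, or .odp file to compare how different implementations lay out their packages.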

Questions and Answers

 

Information Store