The Feed Object Model
The Feed Object Model is the set of objects you will use to interact with Atom documents. It contains classes such as "Feed", "Entry" and "Person", all of which are modeled closely after the elements and documents defined by the two Atom specifications. The Javadocs for the Feed Object Model can be found here.
Extensions to the Atom format can be either dynamic or static. Dynamic extensions use a generic API for setting and getting the elements and attributes of the extension.
Alternatively, you can implement a static extension in three steps: implement a class that extends the ExtensibleElementWrapper class, implement an ExtensionFactory for it, and register the ExtensionFactory with Abdera.
Customizing the Parser
When parsing an Atom document, the parser uses a set of default configuration options that will adequately cover most application use cases. There are, however, times when the parsing options need to be adjusted. The ParserOptions class can be used to tweak the operation of the parser in a number of important ways.
- setAutodetectCharset - Abdera will, by default, attempt to automatically detect the character set used in an XML document. It does so by looking at the XML prolog, the Byte Order Mark, or the first few bytes of the document. The process works reasonably well for the overwhelming majority of cases, but it does cause a bit of a performance hit. The autodetection algorithm can be disabled by calling options.setAutodetectCharset(false). This only has an effect when parsing an InputStream.
- setCharset - This option allows you to manually set the character set the parser should use when decoding an InputStream.
- setCompressionCodecs - Abdera is capable of parsing InputStreams that have been compressed using the GZIP or Deflate algorithms (typically used as HTTP transfer encodings). setCompressionCodecs can be used to specify which encodings have been applied.
- setFilterRestrictedCharacters - By default, Abdera will throw a parse exception if any characters not allowed in XML are detected. After calling setFilterRestrictedCharacters(true), the parser will automatically filter out invalid XML characters.
- setFilterRestrictedCharacterReplacement - When setFilterRestrictedCharacters has been set to "true", Abdera will, by default, replace the character with an empty string. Alternatively, you can use setFilterRestrictedCharacterReplacement to specify a replacement character.
- setParseFilter - Sets the ParseFilter applied to the stream of parse events. See the description of ParseFilters below.
- setResolveEntities - There are a number of named character entities allowed by HTML and XHTML that are not supported in XML without a DTD. However, it is not uncommon to find these entities being used without a DTD. Abdera will, by default, automatically handle these entities by replacing them with the appropriate character equivalent. To disable automatic entity resolution call setResolveEntities(false). Doing so will cause Abdera to return an error whenever a named character entity is used.
- registerEntity - When setResolveEntities is true, registerEntity can be used to register a new custom named entity reference.
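To make the effect of setFilterRestrictedCharacters and setFilterRestrictedCharacterReplacement more concrete, the following is a minimal standard-library sketch of the idea. It is not Abdera's implementation: the class and method names are hypothetical, and it assumes that "characters not allowed in XML" means anything outside the XML 1.0 Char production.

```java
final class RestrictedCharFilterSketch {

    // Assumption: "allowed" follows the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isAllowedInXml10(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Replaces every disallowed code point with the given replacement.
    // Passing an empty string drops the character, which mirrors the
    // default behavior described in the list above.
    static String filter(String input, String replacement) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (isAllowedInXml10(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "a\u0001b\u000Bc";
        System.out.println(filter(dirty, ""));
        System.out.println(filter(dirty, "?"));
    }
}
```

In Abdera itself the filtering happens inside the parser while the InputStream is decoded, so application code only needs to set the options.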
A ParseFilter is used to filter the stream of parse events. With a whitelist-style filter, for example, only the elements added to the ParseFilter are parsed and added to the Feed Object Model instance. All other elements are silently ignored. Because ignored elements are never built into objects, the savings in CPU and memory can be significant for large documents.
There are three basic types of ParseFilters:
- WhiteListParseFilter - Only elements and attributes listed in the filter will be parsed.
- BlackListParseFilter - Elements and attributes listed in the filter will be ignored.
- CompoundParseFilter - Allows multiple parse filters to be applied.
Developers can also create their own ParseFilter instances by implementing the ParseFilter or ListParseFilter interfaces, or by extending the AbstractParseFilter or AbstractListParseFilter abstract base classes.
Using a CompoundParseFilter, a developer can apply multiple ParseFilters at once.
By default, the CompoundParseFilter will accept an element or attribute if it is acceptable to any of the ParseFilters in its collection. This default can be modified by explicitly setting the condition parameter:
- Condition.ACCEPTABLE_TO_ALL: Accepts the element or attribute only if it is acceptable to all contained ParseFilters
- Condition.ACCEPTABLE_TO_ANY: Accepts the element or attribute if it is acceptable to any of the contained ParseFilters
- Condition.UNACCEPTABLE_TO_ALL: Accepts the element or attribute only if it is unacceptable to all contained ParseFilters
- Condition.UNACCEPTABLE_TO_ANY: Accepts the element or attribute if it is unacceptable to any of the contained ParseFilters
Note that the UNACCEPTABLE_TO_* conditions will accept an element or attribute based on a negative result. This is particularly useful when building blacklist-based filters, where an item is only acceptable if it does not meet an explicitly stated condition.
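The four conditions can be modeled with plain predicates. The sketch below is an illustration only: it uses java.util.function.Predicate over element names written as strings, whereas Abdera's filters work on qualified names, and every class, enum and method name in it is hypothetical rather than part of the Abdera API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class FilterConditionSketch {

    // Hypothetical stand-in for the condition parameter described above.
    enum Condition { ACCEPTABLE_TO_ALL, ACCEPTABLE_TO_ANY, UNACCEPTABLE_TO_ALL, UNACCEPTABLE_TO_ANY }

    // Whitelist: accept only the listed names.
    static Predicate<String> whiteList(String... names) {
        Set<String> allowed = new HashSet<>(Arrays.asList(names));
        return allowed::contains;
    }

    // Blacklist: accept everything except the listed names.
    static Predicate<String> blackList(String... names) {
        Set<String> blocked = new HashSet<>(Arrays.asList(names));
        return name -> !blocked.contains(name);
    }

    // Combines several filters according to the chosen condition.
    static Predicate<String> compound(Condition condition, List<Predicate<String>> filters) {
        return name -> {
            switch (condition) {
                case ACCEPTABLE_TO_ALL:
                    return filters.stream().allMatch(f -> f.test(name));
                case ACCEPTABLE_TO_ANY:
                    return filters.stream().anyMatch(f -> f.test(name));
                case UNACCEPTABLE_TO_ALL:
                    return filters.stream().noneMatch(f -> f.test(name));
                case UNACCEPTABLE_TO_ANY:
                    return filters.stream().anyMatch(f -> !f.test(name));
                default:
                    throw new IllegalArgumentException("Unknown condition: " + condition);
            }
        };
    }

    public static void main(String[] args) {
        List<Predicate<String>> filters =
                Arrays.asList(whiteList("feed", "entry", "title"), whiteList("entry", "id"));
        for (Condition c : Condition.values()) {
            System.out.println(c + " accepts 'id': " + compound(c, filters).test("id"));
        }
    }
}
```

Note how the two UNACCEPTABLE_TO_* branches accept on a negative result, which is the behavior the preceding paragraph describes for blacklist-style filtering.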
Serializing Atom Documents
Abdera uses a flexible mechanism for serializing Atom documents to a Java OutputStream or Writer. A developer can use the default serializer or select an alternative Abdera writer implementation.
The default serializer outputs valid but unformatted XML; there are no line breaks or indents. Using the NamedWriter mechanism, it is possible to select alternative serializers. Abdera ships with two alternative serializers: PrettyXML and JSON. Developers can provide additional serializers by implementing the NamedWriter interface.
For more on the JSON output, please see the section regarding the JSON Extension.
Implementing a NamedWriter
Your custom NamedWriter can be registered with Abdera by adding a file called META-INF/services/org.apache.abdera.writer.NamedWriter to the classpath and listing the fully-qualified class name of each named writer, one per line.
Registration allows the named writer to be accessed via the Abdera WriterFactory.
Using the StreamWriter interface
The org.apache.abdera.writer.StreamWriter interface was added to Abdera after the release of 0.3.0. It provides an alternative means of writing out Atom documents using a streaming interface that avoids the need to build up a complex, in-memory object model. It is well suited for applications that need to quickly produce potentially large Atom documents.
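The streaming style of output can be illustrated with the JDK's own StAX writer. This sketch deliberately does not use Abdera's StreamWriter, whose methods are not reproduced here; it only shows the general technique of emitting elements one at a time without building an object model first. The class name, method names and sample values are assumptions made for the example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class StreamingAtomSketch {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Writes a small feed element by element; nothing is held in memory
    // apart from the output buffer itself.
    static String writeFeed(String id, String title, String updated, int entryCount) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartDocument();
            w.writeStartElement("feed");
            w.writeDefaultNamespace(ATOM_NS);
            writeTextElement(w, "id", id);
            writeTextElement(w, "title", title);
            writeTextElement(w, "updated", updated);
            for (int i = 1; i <= entryCount; i++) {
                w.writeStartElement("entry");
                writeTextElement(w, "id", id + "/" + i);
                writeTextElement(w, "title", "Entry " + i);
                writeTextElement(w, "updated", updated);
                w.writeEndElement();
            }
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write feed", e);
        }
        return out.toString();
    }

    // Writes <name>text</name>; the writer escapes markup characters in text.
    private static void writeTextElement(XMLStreamWriter w, String name, String text)
            throws XMLStreamException {
        w.writeStartElement(name);
        w.writeCharacters(text);
        w.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(writeFeed("urn:example:feed", "A & B", "2007-10-31T12:11:12.123Z", 2));
    }
}
```

For a feed with thousands of entries, writing directly to an OutputStream in this fashion keeps memory use roughly constant, which is the motivation given above for the StreamWriter interface.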
Text and Content Options
Atom allows for a broad range of text and content options, and the choices can be confusing. Text constructs such as atom:title, atom:rights, atom:subtitle and atom:summary can contain plain text, escaped HTML or XHTML markup. The atom:content element can contain plain text, escaped HTML, XHTML markup, arbitrary XML markup, any arbitrary text-based format, Base64-encoded binary data or referenced external content. Abdera provides methods for dealing with each of these options.
Atom Date Constructs
The Atom format requires that all dates and times be formatted to match the date-time construct from RFC 3339. The basic format is YYYY-MM-DD'T'HH:mm:ss.ms'Z', where 'Z' is either the literal value 'Z' or a timezone offset in the form +HH:mm or -HH:mm. Examples: 2007-10-31T12:11:12.123Z and 2007-10-31T12:11:12.123-08:00. Abdera provides the AtomDate class for working with timestamps.
The AtomDate class automatically converts all dates to UTC and always includes milliseconds in the output string representation.
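The normalization described here (convert to UTC, always print milliseconds) can be reproduced with java.time for illustration. This is a sketch of the behavior, not the AtomDate class itself; the class and method names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class Rfc3339Sketch {

    // Always three fractional digits and a literal 'Z', since the value
    // is shifted to UTC before formatting.
    private static final DateTimeFormatter UTC_MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");

    // Parses an RFC 3339 style timestamp and re-serializes it in UTC.
    static String toUtcString(String timestamp) {
        OffsetDateTime parsed = OffsetDateTime.parse(timestamp);
        return parsed.withOffsetSameInstant(ZoneOffset.UTC).format(UTC_MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(toUtcString("2007-10-31T12:11:12.123-08:00"));
        System.out.println(toUtcString("2007-10-31T12:11:12Z"));
    }
}
```

The first call prints 2007-10-31T20:11:12.123Z (eight hours added to reach UTC) and the second prints 2007-10-31T12:11:12.000Z (milliseconds filled in).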
Date Construct Extensions
The Atom format explicitly allows the Atom Date Construct to be reused by extensions. This means you can create your own extension elements that use the same syntax rules as the atom:updated, atom:published and app:edited elements. Such extensions can use either the dynamic or the static extension API.
Static Date Constructs can be created by extending the DateTimeWrapper abstract class.
Atom defines the notion of a Person Construct to represent people and entities. A Person Construct consists of a required name, an optional email address and an optional URI.
Person Construct Extensions
The Atom format explicitly allows the Atom Person Construct to be reused by extensions. This means you can create your own extension elements that use the same syntax rules as the atom:author and atom:contributor elements. Such extensions can use either the dynamic or the static extension API.
Static Person Constructs can be created by extending the PersonWrapper abstract class.
Atom link elements are similar in design to the link tag used in HTML and XHTML. They can be added to feed, entry and source objects.
The rel attribute specifies the meaning of the link. The value of rel can be either a simple name or an IRI. Simple names MUST be registered with IANA. Note that each value in the IANA registry has a full IRI equivalent, e.g., the value "http://www.iana.org/assignments/relation/alternate" is equivalent to the simple name "alternate". Any rel attribute value that is not registered MUST be an IRI.
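Because a registered relation can appear in either form, code that compares rel values usually normalizes them first. The following is a small sketch of that normalization; the class and method names are hypothetical, and treating a missing rel as "alternate" follows the Atom format specification rather than anything stated on this page.

```java
final class LinkRelSketch {
    static final String IANA_PREFIX = "http://www.iana.org/assignments/relation/";

    // Reduces the full IANA IRI form to the simple name; other values
    // (simple names and extension IRIs) are returned unchanged.
    static String canonicalRel(String rel) {
        if (rel == null || rel.isEmpty()) {
            // Assumption taken from the Atom spec: an absent rel means "alternate".
            return "alternate";
        }
        if (rel.startsWith(IANA_PREFIX) && rel.length() > IANA_PREFIX.length()) {
            return rel.substring(IANA_PREFIX.length());
        }
        return rel;
    }

    // Two rel values are equivalent when their canonical forms match.
    static boolean sameRel(String a, String b) {
        return canonicalRel(a).equals(canonicalRel(b));
    }

    public static void main(String[] args) {
        System.out.println(sameRel("alternate", IANA_PREFIX + "alternate"));
        System.out.println(sameRel("self", "http://example.org/rel/self"));
    }
}
```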
IRIs and URIs
Atom allows the use of Internationalized Resource Identifiers (IRIs). An IRI is a URI that has been extended to allow non-ASCII characters.
IRIs allow for internationalization but can be difficult to handle due to a variety of issues involving proper Unicode normalization, conversion to URI form, etc. Abdera includes an IRI implementation that (fortunately) handles most of these details for you.
Note that the toURL() method automatically calls the ASCII conversion process to produce a valid ASCII URL.
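The ASCII conversion step can be illustrated with java.net.URI, which percent-encodes non-ASCII characters as UTF-8 when asked for an ASCII string. This sketch is not Abdera's IRI class, and it only covers the path, query and fragment: an internationalized host name would additionally need IDN (Punycode) handling, for example via java.net.IDN. The class and method names are hypothetical.

```java
import java.net.URI;

final class IriToUriSketch {

    // Converts an IRI to its ASCII-only URI form by percent-encoding
    // every non-ASCII character as UTF-8 octets.
    static String toAsciiUri(String iri) {
        return URI.create(iri).toASCIIString();
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent.
        System.out.println(toAsciiUri("http://example.org/r\u00e9sum\u00e9"));
    }
}
```

The example prints http://example.org/r%C3%A9sum%C3%A9, since the accented letter is two bytes (C3 A9) in UTF-8.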
Atom supports the use of the xml:base attribute to specify the Base URI of relative references. Abdera provides a means of automatically resolving relative references using the base URI.
Everywhere a relative IRI reference can be used, there will be a method for retrieving the resolved absolute IRI based on the in-scope base URI.
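The resolution rule itself is the standard one for relative references. The sketch below shows it with java.net.URI rather than with Abdera's own methods; the base and reference values are made up for the example, and the class name is hypothetical.

```java
import java.net.URI;

final class BaseUriSketch {

    // Resolves a (possibly relative) reference against the in-scope base URI.
    static URI resolve(String base, String reference) {
        return URI.create(base).resolve(reference);
    }

    public static void main(String[] args) {
        String feedBase = "http://example.org/feeds/main.atom";

        // A relative reference is resolved against the base.
        System.out.println(resolve(feedBase, "../entries/1"));

        // A nested xml:base is itself resolved against the outer base first.
        URI entryBase = resolve(feedBase, "entries/");
        System.out.println(resolve(entryBase.toString(), "1.html"));

        // An absolute reference is returned unchanged.
        System.out.println(resolve(feedBase, "http://other.example/x"));
    }
}
```

The three calls print http://example.org/entries/1, http://example.org/feeds/entries/1.html and http://other.example/x respectively.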
Abdera allows developers to use XPath statements to navigate the Feed Object Model.
All XPath statements are executed relative to the object they are evaluated against. For example, when evaluated against an entry, the path "a:title" selects the atom:title Text element of that entry, and the statement "a:author/a:name" returns the a:name element of the first a:author element in the entry.
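The same two paths can be tried against a plain DOM tree with the JDK's XPath support. This sketch does not use Abdera's XPath interface; it assumes the prefix "a" is bound to the Atom namespace, as in the paths quoted above, and the class name and sample entry are made up for the example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class AtomXPathSketch {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Binds the prefix "a" to the Atom namespace for use in XPath expressions.
    static final NamespaceContext ATOM_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "a".equals(prefix) ? ATOM_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return ATOM_NS.equals(uri) ? "a" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return ATOM_NS.equals(uri)
                    ? Collections.singletonList("a").iterator()
                    : Collections.<String>emptyIterator();
        }
    };

    // Evaluates the path relative to the document's root element and
    // returns the string value of the first matching node.
    static String evaluate(String xml, String path) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(ATOM_CONTEXT);
            return xpath.evaluate(path, doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String entry = "<entry xmlns='http://www.w3.org/2005/Atom'>"
                + "<title>Hello</title>"
                + "<author><name>Alice</name></author>"
                + "<author><name>Bob</name></author>"
                + "</entry>";
        System.out.println(evaluate(entry, "a:title"));
        System.out.println(evaluate(entry, "a:author/a:name"));
    }
}
```

The second expression matches two nodes, and the string result is taken from the first one in document order, which is why only the first author's name is returned.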
Custom XPath Functions
By default, Abdera's XPath implementation supports all of the standard functions defined by the XPath standards. With a little extra work, it is possible to extend the implementation to support custom XPath functions and variables.
Abdera also provides mechanisms for transforming Abdera objects using XSLT.
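As a rough picture of what such a transformation involves, the sketch below applies a small stylesheet to a serialized feed using the JDK's javax.xml.transform API. It does not show Abdera's own transformation helpers; the stylesheet, sample feed and class name are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class AtomXsltSketch {

    // Stylesheet that prints the title of every entry, one per line.
    static final String TITLES_XSLT =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:a='http://www.w3.org/2005/Atom'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'>"
            + "<xsl:for-each select='a:feed/a:entry'>"
            + "<xsl:value-of select='a:title'/><xsl:text>&#10;</xsl:text>"
            + "</xsl:for-each>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML document and returns the output.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String feed = "<feed xmlns='http://www.w3.org/2005/Atom'><title>Feed</title>"
                + "<entry><title>First</title></entry>"
                + "<entry><title>Second</title></entry></feed>";
        System.out.print(transform(feed, TITLES_XSLT));
    }
}
```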
Signatures and Encryption
Abdera supports digital signatures and encryption of Atom documents.
Signing an Atom Document
A number of options can be configured to adjust the way the document is signed:
- setSigningAlgorithm - Specify the algorithm used to sign the entry. Any of the algorithms supported by the Apache XML Security implementation can be used.
- setSigningKey - Sets the private key used to sign the document
- setCertificate - Sets the X.509 Certificate to be associated with the signature
- setPublicKey - Sets the public key to be associated with the signature (alternative to setting the X.509 cert)
- addReference - Allows a developer to add additional href references to be covered by the signature
- setSignLinks - When set to true, Abdera will automatically sign resources referenced by atom:link elements and the atom:content src attribute
- setSignLinkRels - When setSignLinks is set to true, lists the rel attribute values of link elements that should be included in the signature
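The roles of the signing key and the public key in the list above can be seen in a bare-bones example using java.security.Signature. This is only a sketch of the key roles: it produces a raw signature over the serialized bytes, not an XML Signature embedded in the document as Abdera's security module does, and the algorithm choice (SHA256withRSA) and all names are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

final class DetachedSignatureSketch {

    static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The private key plays the role of the signing key.
    static byte[] sign(byte[] data, PrivateKey signingKey) {
        try {
            Signature signature = Signature.getInstance("SHA256withRSA");
            signature.initSign(signingKey);
            signature.update(data);
            return signature.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The public key (or the one inside an X.509 certificate) verifies it.
    static boolean verify(byte[] data, byte[] signatureBytes, PublicKey publicKey) {
        try {
            Signature signature = Signature.getInstance("SHA256withRSA");
            signature.initVerify(publicKey);
            signature.update(data);
            return signature.verify(signatureBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair keys = newRsaKeyPair();
        byte[] entry = "<entry xmlns='http://www.w3.org/2005/Atom'/>".getBytes(StandardCharsets.UTF_8);
        byte[] sig = sign(entry, keys.getPrivate());
        System.out.println("valid: " + verify(entry, sig, keys.getPublic()));
    }
}
```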
Encrypting an Abdera Document
Any valid Java crypto provider can be used. In these examples, we are using the Bouncy Castle provider.
Decrypting an Atom Document
Using the Diffie-Hellman Key Exchange protocol for encryption
The Diffie-Hellman Key Exchange Protocol is a popular means of establishing a shared secret for encryption. Abdera's Security Module includes utilities that make it easy to use Diffie-Hellman.
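For readers unfamiliar with the protocol, the sketch below shows the exchange itself using the JDK's javax.crypto.KeyAgreement. It does not use Abdera's Diffie-Hellman utilities; the class and method names are hypothetical, and hashing the shared secret with SHA-256 to obtain an AES key is a simplification chosen for the example, not a recommendation of a key derivation scheme.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.KeyAgreement;
import javax.crypto.SecretKey;
import javax.crypto.interfaces.DHPublicKey;
import javax.crypto.spec.SecretKeySpec;

final class DiffieHellmanSketch {

    // First party: generates a key pair with fresh 2048-bit DH parameters.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("DH");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Second party: must reuse the DH parameters of the peer's public key.
    static KeyPair newKeyPairFor(PublicKey peerPublicKey) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("DH");
            generator.initialize(((DHPublicKey) peerPublicKey).getParams());
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Combines our private key with the peer's public key. Both sides
    // arrive at the same secret without ever transmitting it.
    static SecretKey deriveAesKey(PrivateKey ownPrivateKey, PublicKey peerPublicKey) {
        try {
            KeyAgreement agreement = KeyAgreement.getInstance("DH");
            agreement.init(ownPrivateKey);
            agreement.doPhase(peerPublicKey, true);
            byte[] shared = agreement.generateSecret();
            // Simplified derivation: first 16 bytes of SHA-256(shared secret).
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(shared);
            return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair alice = newKeyPair();
        KeyPair bob = newKeyPairFor(alice.getPublic());
        SecretKey aliceKey = deriveAesKey(alice.getPrivate(), bob.getPublic());
        SecretKey bobKey = deriveAesKey(bob.getPrivate(), alice.getPublic());
        System.out.println("keys match: " + Arrays.equals(aliceKey.getEncoded(), bobKey.getEncoded()));
    }
}
```

Only the public keys cross the wire; the derived key can then serve as the secret key for encrypting and decrypting the document.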
Various settings can be changed to affect the way documents are encrypted.
- setDataEncryptionKey - The secret key used to encrypt the data
- setKeyEncryptionKey - A secret key used to encrypt the data encryption key when the DEK is to be transmitted with the encrypted data
- setKeyCipherAlgorithm - The algorithm used to encrypt the data encryption key
- setDataCipherAlgorithm - The algorithm used to encrypt the data
- setIncludeKeyInfo - True if information about the secret key used to encrypt the data should be transmitted with the encrypted data
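The relationship between the data encryption key (DEK) and the key encryption key (KEK) in the list above is sometimes called envelope encryption. The sketch below illustrates it with the JDK's javax.crypto classes only. The algorithm choices (AES-GCM for the data, AES key wrap for the DEK) and all names are assumptions made for the example and are not necessarily what Abdera or XML Encryption uses by default.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class EnvelopeEncryptionSketch {

    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the DEK with the KEK so the DEK can travel with the data.
    static byte[] wrapKey(SecretKey dek, SecretKey kek) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.WRAP_MODE, kek);
            return cipher.wrap(dek);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recovers the DEK on the receiving side using the shared KEK.
    static SecretKey unwrapKey(byte[] wrappedDek, SecretKey kek) {
        try {
            Cipher cipher = Cipher.getInstance("AESWrap");
            cipher.init(Cipher.UNWRAP_MODE, kek);
            return (SecretKey) cipher.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts or decrypts the document bytes with the DEK (AES-GCM).
    static byte[] crypt(int mode, byte[] input, SecretKey dek, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, dek, new GCMParameterSpec(128, iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey kek = newAesKey();   // known to sender and receiver
        SecretKey dek = newAesKey();   // generated per document
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        byte[] plain = "<entry xmlns='http://www.w3.org/2005/Atom'/>".getBytes(StandardCharsets.UTF_8);
        byte[] cipherText = crypt(Cipher.ENCRYPT_MODE, plain, dek, iv);
        byte[] wrappedDek = wrapKey(dek, kek);

        // Receiver: unwrap the DEK, then decrypt the data.
        SecretKey recovered = unwrapKey(wrappedDek, kek);
        byte[] decrypted = crypt(Cipher.DECRYPT_MODE, cipherText, recovered, iv);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```

When no KEK is configured, the DEK must already be known to both parties, for instance as the result of a key exchange.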