The Feed Object Model

The Feed Object Model is the set of objects you will use to interact with Atom documents. It contains classes such as "Feed", "Entry", and "Person", all of which are modeled closely after the elements and documents defined by the two Atom specifications. The Feed Object Model is also documented in the Abdera Javadocs.

Extensions

Extensions to the Atom format can be either dynamic or static. Dynamic extensions use a generic API for setting and getting the elements and attributes of the extension.

An extension element in an entry
Using the dynamic API to add the extension to an entry

Alternatively, you can implement a static extension in three steps: implement a class that extends the ExtensibleElementWrapper class, implement an ExtensionFactory for it, and register the ExtensionFactory with Abdera. The files involved are listed below.

foo/MyListElement.java
foo/MyListExtensionFactory.java
META-INF/services/org.apache.abdera.factory.ExtensionFactory
Using the static extension

Customizing the Parser

When parsing an Atom document, the parser uses a set of default configuration options that will adequately cover most application use cases. There are, however, times when the parsing options need to be adjusted. The ParserOptions class can be used to tweak the operation of the parser in a number of important ways.

Using the ParserOptions
  • setAutodetectCharset - Abdera will, by default, attempt to automatically detect the character set used in an XML document. It does so by looking at the XML prolog, the Byte Order Mark, or the first few bytes of the document. The process works reasonably well in the overwhelming majority of cases, but it does incur a small performance cost. The autodetection algorithm can be disabled by calling options.setAutodetectCharset(false). This only has an effect when parsing an InputStream.
  • setCharset - This option allows you to manually set the character set the parser should use when decoding an InputStream.
  • setCompressionCodecs - Abdera is capable of parsing InputStreams that have been compressed using the GZIP or Deflate algorithms (typically used as HTTP transfer encodings). setCompressionCodecs can be used to specify which encodings have been applied.
  • setFilterRestrictedCharacters - By default, Abdera will throw a parse exception if any characters not allowed in XML are detected. After calling setFilterRestrictedCharacters(true), the parser will automatically filter out invalid XML characters.
  • setFilterRestrictedCharacterReplacement - When setFilterRestrictedCharacters has been set to "true", Abdera will, by default, replace the character with an empty string. Alternatively, you can use setFilterRestrictedCharacterReplacement to specify a replacement character.
  • setParseFilter - See below
  • setResolveEntities - There are a number of named character entities allowed by HTML and XHTML that are not supported in XML without a DTD. However, it is not uncommon to find these entities being used without a DTD. Abdera will, by default, handle these entities by replacing them with the appropriate character equivalent. To disable automatic entity resolution, call setResolveEntities(false). Doing so will cause Abdera to return an error whenever a named character entity is used.
  • registerEntity - When setResolveEntities is true, registerEntity can be used to register a new custom named entity reference.
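
To make the setFilterRestrictedCharacters and setFilterRestrictedCharacterReplacement options more concrete, here is a minimal JDK-only sketch of what filtering restricted characters means. It is not Abdera's implementation: the class and method names are hypothetical, and it simply applies the Char production from XML 1.0 to a string, substituting a replacement (possibly empty) for anything outside that production.

```java
final class RestrictedCharFilter {
    /** Returns true if the code point matches the XML 1.0 "Char" production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces every code point that XML 1.0 disallows with the given replacement. */
    static String filter(String input, String replacement) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement); // an empty string drops the character
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // NUL and vertical tab are not legal XML 1.0 characters.
        System.out.println(filter("ti\u0000tle\u000B", "")); // prints "title"
        System.out.println(filter("a\u0001b", "?"));         // prints "a?b"
    }
}
```

An empty replacement mirrors the default described above; passing a visible character instead makes it easier to see where input was altered.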

ParseFilters

A ParseFilter is used to filter the stream of parse events. In the example below, only the elements added to the ParseFilter will be parsed and added to the Feed Object Model instance. All other elements will be silently ignored. For large documents, the resulting savings in CPU and memory can be significant.

Using a ParseFilter
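
The idea of filtering parse events can be sketched with the JDK's StAX reader alone. The following is an illustration of the whitelist technique, not Abdera's ParseFilter API: the class and method names are hypothetical, and it assumes that the children of a rejected element are skipped along with it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class WhitelistSketch {
    static final String ATOM = "http://www.w3.org/2005/Atom";

    /** Returns the local names of the elements that pass the whitelist, in document order. */
    static List<String> acceptedElements(String xml, Set<QName> whitelist) throws XMLStreamException {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> accepted = new ArrayList<>();
        int skipDepth = 0; // greater than zero while inside a rejected element
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                if (skipDepth > 0 || !whitelist.contains(reader.getName())) {
                    skipDepth++; // rejected, or nested inside a rejected element
                } else {
                    accepted.add(reader.getLocalName());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT && skipDepth > 0) {
                skipDepth--;
            }
        }
        reader.close();
        return accepted;
    }

    public static void main(String[] args) throws XMLStreamException {
        Set<QName> whitelist = new HashSet<>(Arrays.asList(
            new QName(ATOM, "feed"), new QName(ATOM, "entry"), new QName(ATOM, "title")));
        String xml = "<feed xmlns='" + ATOM + "'><title>t</title><entry><title>e</title>"
                   + "<summary>s</summary></entry></feed>";
        System.out.println(acceptedElements(xml, whitelist)); // [feed, title, entry, title]
    }
}
```

Because rejected elements never reach the code that would build objects for them, the cost of the skipped subtrees is limited to tokenizing them.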

There are three basic types of ParseFilters:

  • WhiteListParseFilter - Only elements and attributes listed in the filter will be parsed.
  • BlackListParseFilter - Elements and attributes listed in the filter will be ignored.
  • CompoundParseFilter - Allows multiple parse filters to be applied.

Developers can also create their own ParseFilter instances by implementing the ParseFilter or ListParseFilter interfaces, or extending the AbstractParseFilter or AbstractListParseFilter abstract base classes:

MyCustomParseFilter.java

Using a CompoundParseFilter, a developer can apply multiple ParseFilters at once:

Using a CompoundParseFilter

By default, the CompoundParseFilter will accept an element or attribute if it is acceptable to any of the ParseFilters in its collection. This default can be modified by explicitly setting the condition parameter:

  • Condition.ACCEPTABLE_TO_ALL: Accepts the element or attribute only if it is acceptable to all contained ParseFilters
  • Condition.ACCEPTABLE_TO_ANY: Accepts the element or attribute if it is acceptable to any of the contained ParseFilters
  • Condition.UNACCEPTABLE_TO_ALL: Accepts the element or attribute only if it is unacceptable to all contained ParseFilters
  • Condition.UNACCEPTABLE_TO_ANY: Accepts the element or attribute if it is unacceptable to any of the contained ParseFilters

Note that the UNACCEPTABLE_TO_* conditions will accept an element or attribute based on a negative result. This is particularly useful when building blacklist-based filters, where an item is only acceptable if it does not meet an explicitly stated condition.
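
The four conditions can be summarized as a small truth-table sketch. This is a JDK-only illustration, not Abdera's CompoundParseFilter: filters are modeled as java.util.function.Predicate instances, the enum here is a hypothetical stand-in that only borrows the constant names, and the result for an empty filter list follows Java stream semantics rather than anything documented by Abdera.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class CompoundFilterSketch {
    enum Condition { ACCEPTABLE_TO_ALL, ACCEPTABLE_TO_ANY, UNACCEPTABLE_TO_ALL, UNACCEPTABLE_TO_ANY }

    /** Combines the verdicts of the individual filters according to the condition. */
    static <T> boolean accepts(Condition condition, List<Predicate<T>> filters, T item) {
        switch (condition) {
            case ACCEPTABLE_TO_ALL:   return filters.stream().allMatch(f -> f.test(item));
            case ACCEPTABLE_TO_ANY:   return filters.stream().anyMatch(f -> f.test(item));
            case UNACCEPTABLE_TO_ALL: return filters.stream().noneMatch(f -> f.test(item));
            case UNACCEPTABLE_TO_ANY: return filters.stream().anyMatch(f -> !f.test(item));
            default: throw new IllegalArgumentException("unknown condition: " + condition);
        }
    }

    public static void main(String[] args) {
        List<Predicate<String>> filters =
            Arrays.asList(name -> name.equals("title"), name -> name.startsWith("t"));
        // "tag" is rejected by the first filter and accepted by the second.
        for (Condition c : Condition.values()) {
            System.out.println(c + " -> " + accepts(c, filters, "tag"));
        }
    }
}
```

For the item "tag" in the usage above, the two ANY conditions accept it and the two ALL conditions reject it, since exactly one of the two filters finds it acceptable.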

Serializing Atom Documents

Abdera uses a flexible mechanism for serializing Atom documents to a Java OutputStream or Writer. A developer can use the default serializer or select an alternative Abdera writer implementation.

Using the default serializer

The default serializer will output valid but unformatted XML; there will be no line breaks or indents. Using the NamedWriter mechanism, it is possible to select alternative serializers. Abdera ships with two alternative serializers: PrettyXML and JSON. Developers can provide additional serializers by implementing the NamedWriter interface.

  • The PrettyXML Writer outputs formatted XML containing line breaks and indentation.
  • The JSON Writer outputs the Atom document converted to a JSON format.

For more on the JSON output, please see the section regarding the JSON Extension.

Implementing a NamedWriter

Implementing a NamedWriter

Your custom NamedWriter can be registered with Abdera by adding a file called META-INF/services/org.apache.abdera.writer.NamedWriter to the classpath and listing the fully-qualified class name of each named writer, one per line.

META-INF/services/org.apache.abdera.writer.NamedWriter

Registration will allow the named writer to be accessed via the Abdera WriterFactory:

META-INF/services/org.apache.abdera.writer.NamedWriter

Using the StreamWriter interface

The org.apache.abdera.writer.StreamWriter interface was added to Abdera after the release of 0.3.0. It provides an alternative means of writing out Atom documents using a streaming interface that avoids the need to build up a complex, in-memory object model. It is well suited for applications that need to quickly produce potentially large Atom documents.

From examples/src/main/java/org/apache/abdera/examples/simple/StreamWriterExample.java

Text and Content Options

Atom allows for a broad range of text and content options, and the choices can often be confusing. Text constructs such as atom:title, atom:rights, atom:subtitle and atom:summary can contain plain text, escaped HTML or XHTML markup. The atom:content element can contain plain text, escaped HTML, XHTML markup, arbitrary XML markup, any arbitrary text-based format, Base64-encoded binary data or referenced external content. Abdera provides methods for dealing with these options.

Text Constructs

Text construct options
Resulting atom:title elements
Getting the text value

Content

Content options
Resulting atom:content elements

Atom Date Constructs

The Atom format requires that all dates and times match the date-time construct from RFC 3339. The basic format is YYYY-MM-DD'T'HH:mm:ss.ms'Z', where the trailing 'Z' is either the literal value 'Z' (meaning UTC) or a timezone offset of the form +HH:mm or -HH:mm. Examples: 2007-10-31T12:11:12.123Z and 2007-10-31T12:11:12.123-08:00. Abdera provides the AtomDate class for working with timestamps.

Using the AtomDate

The AtomDate class automatically converts all dates to UTC and always includes milliseconds in the output string representation.
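
The normalization described above (convert to UTC, always emit milliseconds) can be reproduced with java.time for illustration. This is a JDK-only sketch of the behavior, not the AtomDate class itself; the class and method names are hypothetical.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class Rfc3339Sketch {
    // Fixed three-digit milliseconds and a literal 'Z', since the value is always shifted to UTC first.
    private static final DateTimeFormatter UTC_MILLIS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSS'Z'");

    /** Parses an RFC 3339 timestamp and re-serializes it in UTC with millisecond precision. */
    static String toUtcString(String rfc3339) {
        OffsetDateTime parsed = OffsetDateTime.parse(rfc3339);
        return parsed.withOffsetSameInstant(ZoneOffset.UTC).format(UTC_MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(toUtcString("2007-10-31T12:11:12.123-08:00")); // 2007-10-31T20:11:12.123Z
        System.out.println(toUtcString("2007-10-31T12:11:12Z"));          // 2007-10-31T12:11:12.000Z
    }
}
```

Both inputs denote an instant unambiguously, so shifting to UTC changes only the textual form, never the point in time.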

Setting date elements on an entry

Date Construct Extensions

The Atom format explicitly allows the Atom Date Construct to be reused by extensions. This means you can create your own extension elements that use the same syntax rules as the atom:updated, atom:published and app:edited elements. Such extensions can use the dynamic and static extension APIs:

Creating a Date Construct Extension
Resulting extension element

Static Date Constructs can be created by extending the DateTimeWrapper abstract class.

Implementing a static Date Construct Extension

Person Constructs

Atom defines the notion of a Person Construct to represent people and entities. A Person Construct consists of a required name, plus an optional email address and an optional URI.

Using the Person Construct

Person Construct Extensions

The Atom format explicitly allows the Atom Person Construct to be reused by extensions. This means you can create your own extension elements that use the same syntax rules as the atom:author and atom:contributor elements. Such extensions can use the dynamic and static extension APIs:

Creating a Person Construct Extension
Resulting extension element

Static Person Constructs can be created by extending the PersonWrapper abstract class.

Implementing a static Person Construct Extension

Atom Links

Atom link elements are similar in design to the link tag used in HTML and XHTML. They can be added to feed, entry and source objects.

Adding Atom Link elements to an entry

The rel attribute specifies the meaning of the link. The value of rel can either be a simple name or an IRI. Simple names MUST be registered with IANA. Note that each of the values in the IANA registry has a full IRI equivalent, e.g., the value "http://www.iana.org/assignments/relation/alternate" is equivalent to the simple name "alternate". Any rel attribute value that is not registered MUST be an IRI.

Specifying a custom rel attribute value
Resulting link element

IRIs and URIs

Atom allows the use of Internationalized Resource Identifiers (IRIs). An IRI is a URI that has been extended to allow non-ASCII characters.

An example IRI

IRIs allow for internationalization but can be difficult to handle due to a variety of issues involving proper Unicode normalization, conversion to URI form, etc. Abdera includes an IRI implementation that (fortunately) handles most of these details for you.

Using the Abdera IRI implementation
Output

Note that the toURL() method automatically calls the ASCII conversion process to produce a valid ASCII URL.
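
For comparison, the JDK's java.net.URI performs a similar ASCII conversion for the path, query and fragment of an identifier. The sketch below is not the Abdera IRI class; it uses only java.net.URI, the class name is hypothetical, and it does not convert internationalized host names (the JDK offers java.net.IDN for that separately).

```java
import java.net.URI;

final class IriSketch {
    /** Normalizes to NFC and percent-encodes non-ASCII characters as UTF-8 octets. */
    static String toAsciiUri(String iri) {
        return URI.create(iri).toASCIIString();
    }

    public static void main(String[] args) {
        // "\u00e9" is LATIN SMALL LETTER E WITH ACUTE, encoded in UTF-8 as the bytes C3 A9.
        System.out.println(toAsciiUri("http://example.org/r\u00e9sum\u00e9"));
        // prints http://example.org/r%C3%A9sum%C3%A9
    }
}
```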

Base URIs

Atom supports the use of the xml:base attribute to specify the Base URI of relative references. Abdera provides a means of automatically resolving relative references using the base URI.

An Atom entry using relative references
Resolving the absolute IRI for the link

Everywhere a relative IRI reference can be used, there will be a method for retrieving the resolved absolute IRI based on the in-scope base URI.
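
The resolution rules themselves can be tried out with java.net.URI. This is a JDK-only sketch rather than Abdera's API: the class and method names are hypothetical, and nested xml:base values are modeled as a simple list ordered from the outermost element to the innermost.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

final class BaseUriSketch {
    /** Folds nested xml:base values, outermost first, into the in-scope base URI. */
    static URI inScopeBase(URI documentUri, List<String> xmlBaseValues) {
        URI base = documentUri;
        for (String value : xmlBaseValues) {
            base = base.resolve(value); // each xml:base may itself be relative to its parent's base
        }
        return base;
    }

    public static void main(String[] args) {
        URI base = inScopeBase(URI.create("http://example.org/feed.atom"),
                               Arrays.asList("http://example.org/blog/", "entries/"));
        System.out.println(base.resolve("1.atom"));  // http://example.org/blog/entries/1.atom
        System.out.println(base.resolve("../feed")); // http://example.org/blog/feed
    }
}
```

An absolute reference is returned unchanged by resolve, so the same call can be applied to every href without first checking whether it is relative.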

Using XPath

Abdera allows developers to use XPath statements to navigate the Feed Object Model.

Using XPath to navigate an entry

All XPath statements are executed relative to the type of object specified. In the example above, the path "a:title" is used to get the atom:title Text element from the entry. The statement "a:author/a:name" will return the a:name element of the first a:author element in the entry.
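
The same two expressions can be evaluated against a plain DOM tree with the JDK's javax.xml.xpath package, which shows how the relative evaluation and the "first node" rule behave. This sketch does not use Abdera's XPath interface; the class name is hypothetical, and the "a" prefix is bound to the Atom namespace explicitly.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class XPathSketch {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Binds the prefix "a" to the Atom namespace for use in XPath expressions. */
    static final NamespaceContext ATOM_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "a".equals(prefix) ? ATOM_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return ATOM_NS.equals(uri) ? "a" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return ATOM_NS.equals(uri) ? Collections.singletonList("a").iterator()
                                       : Collections.<String>emptyIterator();
        }
    };

    static Element parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed XPath steps to match
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
    }

    /** Evaluates the expression relative to the context element and returns its string value. */
    static String evaluate(String expression, Element context) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(ATOM_CONTEXT);
        return xpath.evaluate(expression, context);
    }

    public static void main(String[] args) throws Exception {
        Element entry = parse("<entry xmlns='" + ATOM_NS + "'><title>Example</title>"
            + "<author><name>James</name></author><author><name>Bob</name></author></entry>");
        System.out.println(evaluate("a:title", entry));         // Example
        System.out.println(evaluate("a:author/a:name", entry)); // James
    }
}
```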

Custom XPath Functions

By default, Abdera's XPath implementation supports all of the standard functions defined by the XPath standards. With a little extra work, it is possible to extend the implementation to support custom XPath functions and variables.

Implementing and using a custom XPath function
Using custom XPath variables

Using XSLT

Abdera also provides mechanisms for transforming Abdera objects using XSLT.

Source Atom Document
XSLT Stylesheet
Using XSLT to transform Abdera objects
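
As a point of reference, the JDK's javax.xml.transform API applies a stylesheet to a serialized Atom document as sketched below. This is a JDK-only illustration, not Abdera's transformation API; the class name and the stylesheet, which extracts only the entry title as text, are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltSketch {
    /** A stylesheet that emits the text of the entry's atom:title and nothing else. */
    static final String TITLE_ONLY =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:a='http://www.w3.org/2005/Atom'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='/a:entry/a:title'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the source document and returns the serialized result. */
    static String transform(String stylesheet, String sourceXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(sourceXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String entry = "<entry xmlns='http://www.w3.org/2005/Atom'><title>Example</title></entry>";
        System.out.println(transform(TITLE_ONLY, entry)); // Example
    }
}
```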

Signatures and Encryption

Abdera supports digital signatures and encryption of Atom documents.

Signing an Atom Document

Initialize the signing key
Create the entry to sign
Prepare the digital signature options
Sign the entry
Verify a signature

A number of options can be configured to adjust the way the document is signed:

  • setSigningAlgorithm - Specify the algorithm used to sign the entry. Any of the algorithms supported by the Apache XML Security implementation can be used.
  • setSigningKey - Sets the private key used to sign the document
  • setCertificate - Sets the X.509 Certificate to be associated with the signature
  • setPublicKey - Sets the public key to be associated with the signature (alternative to setting the X.509 cert)
  • addReference - Allows a developer to add additional href references to be covered by the signature
  • setSignLinks - When set to true, Abdera will automatically sign resources referenced by atom:link elements and the atom:content src attribute
  • setSignLinkRels - When setSignLinks is set to true, lists the rel attribute values of link elements that should be included in the signature
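
For readers who want to see the underlying mechanism, the following sketch produces and verifies an enveloped XML Digital Signature over an Atom entry using only the JDK's javax.xml.crypto.dsig API. It is not Abdera's signature API and does not use the options listed above: the class and method names are hypothetical, the RSA-SHA256 algorithm is an assumption chosen for the example, and the public key is embedded as a KeyValue rather than an X.509 certificate.

```java
import java.io.StringReader;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.util.Collections;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.SignedInfo;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.keyinfo.KeyInfo;
import javax.xml.crypto.dsig.keyinfo.KeyInfoFactory;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XmlSignatureSketch {
    private static final String RSA_SHA256 = "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256";

    static KeyPair newKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Appends an enveloped ds:Signature element to the document's root element. */
    static void sign(Document doc, KeyPair keys) throws Exception {
        XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM");
        // URI "" plus the enveloped transform: sign the whole document except the signature itself.
        Reference ref = fac.newReference("",
            fac.newDigestMethod(DigestMethod.SHA256, null),
            Collections.singletonList(
                fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null)),
            null, null);
        SignedInfo info = fac.newSignedInfo(
            fac.newCanonicalizationMethod(CanonicalizationMethod.INCLUSIVE,
                                          (C14NMethodParameterSpec) null),
            fac.newSignatureMethod(RSA_SHA256, null),
            Collections.singletonList(ref));
        KeyInfoFactory kif = fac.getKeyInfoFactory();
        KeyInfo keyInfo = kif.newKeyInfo(
            Collections.singletonList(kif.newKeyValue(keys.getPublic())));
        fac.newXMLSignature(info, keyInfo)
           .sign(new DOMSignContext(keys.getPrivate(), doc.getDocumentElement()));
    }

    /** Returns true if the document carries a signature that validates against the given key. */
    static boolean verify(Document doc, PublicKey key) throws Exception {
        Node signature = doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature").item(0);
        if (signature == null) {
            return false; // unsigned document
        }
        DOMValidateContext context = new DOMValidateContext(key, signature);
        return XMLSignatureFactory.getInstance("DOM").unmarshalXMLSignature(context).validate(context);
    }

    public static void main(String[] args) throws Exception {
        KeyPair keys = newKeyPair();
        Document entry = parse("<entry xmlns='http://www.w3.org/2005/Atom'><title>Example</title></entry>");
        sign(entry, keys);
        System.out.println("valid: " + verify(entry, keys.getPublic()));
    }
}
```

Verifying against a key supplied by the caller, rather than the key embedded in the signature, is a deliberate choice in this sketch: a signature only establishes trust if the verifier already trusts the key.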

Encrypting an Abdera Document

Any valid Java crypto provider can be used. In these examples, we are using the Bouncy Castle provider.

Prepare the crypto provider
Generate Encryption Key
Create the entry to encrypt
Prepare the encryption options
Encrypt the document using the generated key

Decrypting an Atom Document

Prepare the encryption options

Using the Diffie-Hellman Key Exchange protocol for encryption

The Diffie-Hellman Key Exchange Protocol is a popular means of establishing a shared secret for encryption. Abdera's Security Module includes utilities that make it easy to use Diffie-Hellman.

Prepare the Diffie-Hellman Key Exchange Session
Person A encrypts the document
Person B decrypts the document
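
The exchange between Person A and Person B can be sketched with the JDK's own crypto classes. This is not Abdera's Security Module API: the class and method names are hypothetical, and hashing the shared secret with SHA-256 into an AES-128 key used in GCM mode is an assumption made for the example, not something the Abdera documentation prescribes.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyAgreement;
import javax.crypto.SecretKey;
import javax.crypto.interfaces.DHPublicKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class DhSketch {
    /** Person A chooses the group parameters and generates a key pair. */
    static KeyPair initiatorKeys() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("DH");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    /** Person B generates a key pair in the same group as A's public key. */
    static KeyPair responderKeys(PublicKey initiatorPublic) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("DH");
        generator.initialize(((DHPublicKey) initiatorPublic).getParams());
        return generator.generateKeyPair();
    }

    /** Combines our private key with the peer's public key, then hashes the secret into an AES key. */
    static SecretKey sharedKey(PrivateKey ours, PublicKey theirs) throws Exception {
        KeyAgreement agreement = KeyAgreement.getInstance("DH");
        agreement.init(ours);
        agreement.doPhase(theirs, true);
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(agreement.generateSecret());
        return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
    }

    /** Returns a fresh 12-byte IV followed by the AES-GCM ciphertext and tag. */
    static byte[] encrypt(SecretKey key, byte[] plaintext) throws Exception {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] body = cipher.doFinal(plaintext);
        byte[] message = Arrays.copyOf(iv, iv.length + body.length);
        System.arraycopy(body, 0, message, iv.length, body.length);
        return message;
    }

    static byte[] decrypt(SecretKey key, byte[] message) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, message, 0, 12));
        return cipher.doFinal(message, 12, message.length - 12);
    }

    public static void main(String[] args) throws Exception {
        KeyPair a = initiatorKeys();
        KeyPair b = responderKeys(a.getPublic());
        byte[] sealed = encrypt(sharedKey(a.getPrivate(), b.getPublic()), "<entry/>".getBytes("UTF-8"));
        byte[] opened = decrypt(sharedKey(b.getPrivate(), a.getPublic()), sealed);
        System.out.println(new String(opened, "UTF-8")); // <entry/>
    }
}
```

Only the public keys cross the wire. Each side derives the same AES key locally, which is why the secret key never needs to be transmitted alongside the encrypted document.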

Customizing EncryptionOptions

Various settings can be changed to affect the way documents are encrypted.

  • setDataEncryptionKey - The secret key used to encrypt the data
  • setKeyEncryptionKey - A secret key used to encrypt the data encryption key when the DEK is to be transmitted with the encrypted data
  • setKeyCipherAlgorithm - The algorithm used to encrypt the data encryption key
  • setDataCipherAlgorithm - The algorithm used to encrypt the data
  • setIncludeKeyInfo - True if information about the secret key used to encrypt the data should be transmitted with the encrypted data