SCA Atom Binding Scenarios

This page describes in further detail the Atom Binding scenarios in the Tuscany Web 2.0 roadmap at http://tuscany.apache.org/sca-java-roadmap.html.

A broader view of all the scenarios is located at http://cwiki.apache.org/confluence/display/TUSCANYWIKI/Scenarios.

HTTP caching is described in the HTTP specification at http://tools.ietf.org/html/rfc2616. The use of caching with Atom is described in the Atom Publishing Protocol specification at http://tools.ietf.org/html/rfc5023.

Understanding Data Caching using ETags, Last-Modified, and other Header Commands

Atom feeds and entries can be very large. Because Atom is one of the bindings Tuscany supports, many requests may carry large amounts of data.

Hypertext Transfer Protocol (HTTP), the basis of the web, helps limit the amount of data exchanged between a client and a server by attaching validator headers to a resource. These headers are ETag and Last-Modified. When a request carries a conditional header such as If-Match, If-None-Match, If-Modified-Since, or If-Unmodified-Since, the client and the server can avoid shipping large pieces of data when updated data is not needed.
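
As a rough illustration of the rule described above, the following Java sketch maps the ETag-based conditional headers to a status code. The class and method names are hypothetical and this is not Tuscany's implementation; weak validators (the W/ prefix) and the date-based headers are left out to keep it short.

```java
import java.util.Arrays;

/** Sketch only: decides an HTTP status from ETag preconditions (hypothetical helper). */
final class EtagPreconditions {
    static final int OK = 200;
    static final int NOT_MODIFIED = 304;
    static final int PRECONDITION_FAILED = 412;

    /**
     * @param currentEtag ETag of the resource as the server holds it now
     * @param ifMatch     value of the If-Match request header, or null if absent
     * @param ifNoneMatch value of the If-None-Match request header, or null if absent
     */
    static int statusForGet(String currentEtag, String ifMatch, String ifNoneMatch) {
        // If-Match: proceed only when the client's tag still matches the resource.
        if (ifMatch != null && !matches(ifMatch, currentEtag)) {
            return PRECONDITION_FAILED;
        }
        // If-None-Match on a GET: a match means the client's cached copy is current.
        if (ifNoneMatch != null && matches(ifNoneMatch, currentEtag)) {
            return NOT_MODIFIED;
        }
        return OK;
    }

    /** True when the header lists the tag, or is the wildcard "*". */
    private static boolean matches(String headerValue, String etag) {
        return Arrays.stream(headerValue.split(","))
                .map(String::trim)
                .anyMatch(tag -> tag.equals("*") || tag.equals(etag));
    }
}
```

A client that cached an entry with ETag "v1" and sends If-None-Match: "v1" gets 304 and no body while the entry is unchanged; once the server's tag becomes "v2", the same request gets 200 and the new body.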

The following entry scenarios show how Tuscany supports this form of caching through ETags, Last-Modified, and other Header Commands.

Posting new entry data to a feed
(Show entry data post request, item does not exist on server, server response code 200, return entry data body)
Updating existing entry data in a feed
(Show data update put, If-Match precondition, item is newer and matching, matching return code 412)
Requesting existing entry data
(Show get via ETag, If-None-Match precondition, modified entry data item, matching entry body returned)
Requesting stale entry data
(Show get via ETag, If-None-Match precondition, unmodified entry data item, not modified return code 304)
Requesting up-to-date entry data
(Show request via last-modified date, entry data is unmodified, Not modified return code 304)
Requesting out-of-date entry data
(Show request via last-modified date, entry data is modified, updated data is returned)

These items are implemented as a result of JIRA TUSCANY-2477. The test case ProviderEntryEntityTagsTest.java validates these scenarios via JTest.
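
The two Last-Modified scenarios in the list above (up-to-date and out-of-date entry data) can be sketched in Java as follows. This is a minimal illustration with hypothetical names, assuming the client sends its cached date in an If-Modified-Since header in the usual HTTP date format; it is not taken from the Tuscany code.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Sketch only: date-based cache validation (hypothetical helper). */
final class LastModifiedCheck {

    /**
     * @param lastModified    when the server last changed the entry
     * @param ifModifiedSince value of the If-Modified-Since header, or null if absent
     * @return 304 when the client's copy is still current, otherwise 200
     */
    static int statusForGet(Instant lastModified, String ifModifiedSince) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request: always send the body
        }
        Instant since = ZonedDateTime
                .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
        // HTTP dates have one-second resolution, so drop fractions before comparing.
        boolean modified = lastModified.truncatedTo(ChronoUnit.SECONDS).isAfter(since);
        return modified ? 200 : 304;
    }
}
```

With an entry last updated at 2008-08-08T18:40:30.484Z, a request carrying If-Modified-Since: Fri, 08 Aug 2008 18:40:30 GMT yields 304, while a request carrying a much older date yields 200 with the entry body.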

Additionally, the following Feed scenarios are provided in the test case ProviderFeedEntityTagsTest.java (with input from Luciano on the Tuscany dev list on 2008-08-02).

Test feed basics
(Request Atom feed. Check that feed is non-null, has Id, title, and updated values. Check for ETag and Last-Modified headers)
Test Unmodified If-Match predicate
(Request feed based on existing ETag. Use If-Match predicate in request header. Expect status 200 and feed body.)
Test Unmodified If-None-Match predicate
(Request feed based on existing ETag. Use If-None-Match predicate in request header. Expect status 304, item not modified, no feed body.)
Test Unmodified If-Unmodified-Since predicate
(Request feed based on very current Last-Modified. Use If-Unmodified-Since predicate in request header. Expect status 304, item not modified, no feed body.)
Test Unmodified If-Modified-Since predicate
(Request feed based on very old Last-Modified. Use If-Modified-Since predicate in request header. Expect status 200, feed in body.)
Test Modified If-None-Match predicate
(Request feed based on existing ETag. Use If-None-Match predicate in request header. Expect status 200, feed in body.)
Test Modified If-Match predicate
(Request feed based on existing ETag. Use If-Match predicate in request header. Expect status 412, precondition failed, no feed in body.)
Test Modified If-Unmodified-Since predicate
(Request feed based on very recent Last-Mod date. Use If-Unmodified-Since predicate in request header. Expect status 304, no feed in body.)
Test Modified If-Modified-Since predicate
(Request feed based on very old Last-Mod date. Use If-Modified-Since predicate in request header. Expect status 200, feed in body.)

Support for Web 2.0 data caching via ETags and Last-Modified fields allows the Tuscany user to save bandwidth and avoid re-requesting data. Especially for content feeds, which can have very large data objects, the ability to cache improves server performance and reduces network bottlenecks. A full end-to-end demonstration of these network savings is being created via JIRA TUSCANY-2537, which will show caching in the feed aggregator sample.

Support for Negotiated Content Types

Requests for data are now answered with negotiated content types. In other words, the requester can state which content types it prefers, and the responder can provide the data in different content types. The preference is expressed in the "Accept" request header.

These data binding types are supported:

Atom XML format (Request header Accept=application/atom+xml)
Atom in JSON format (Request header Accept=application/atom+json)

The following content types can be requested in the different data bindings:

Atom entry data (MIME type application/atom+xxx;type=entry where xxx=xml or json)
Atom feed data (MIME type application/atom+xxx;type=feed where xxx=xml or json)

For example, suppose the requester asks for an Atom entry with no Accept header, or with an Accept header value of application/atom+xml. The returned response body contains:

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title type="text">customer Fred Farkle</title>
  <updated>2008-08-08T18:40:30.484Z</updated>
  <author>
    <name>Apache Tuscany</name>
  </author>
  <content type="text">Fred Farkle</content>
  <id>urn:uuid:customer-91d349b3-4b8b-4cfa-b9e9-d999f9937085</id>
  <link href="urn:uuid:customer-91d349b3-4b8b-4cfa-b9e9-d999f9937085" rel="edit" />
  <link href="urn:uuid:customer-91d349b3-4b8b-4cfa-b9e9-d999f9937085" rel="alternate" />
</entry>

In contrast, suppose the requester asks for an Atom entry with an Accept header value of application/atom+json. The returned response body contains:

{
 "id":"urn:uuid:customer-91d349b3-4b8b-4cfa-b9e9-d999f9937085",
 "title":"customer Fred Farkle",
 "content":"Fred Farkle",
 "updated":"2008-08-08T18:40:30.484Z",
 "authors":[{
   "name":"Apache Tuscany"
  }
 ],
 "links":[{
   "href":"urn:uuid:customer-91d349b3-4b8b-4cfa-b9e9-d999f9937085",
   "rel":"edit"
  },{
   "href":"urn:uuid:customer-91d349b3-4b8b-4cfa-b9e9-d999f9937085",
   "rel":"alternate"
  }
 ]
}
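
The selection between the two formats shown above can be sketched in Java as follows. The class name and method are hypothetical, not Tuscany's code. The sketch follows the behavior described above (no Accept header means Atom XML); treating wildcard ranges as Atom XML and ignoring q-values are assumptions made to keep it short.

```java
import java.util.Locale;

/** Sketch only: picks a response format from an Accept header (hypothetical helper). */
final class AtomContentNegotiation {
    static final String ATOM_XML = "application/atom+xml";
    static final String ATOM_JSON = "application/atom+json";

    /** Returns the media type to respond with, or null when none is acceptable. */
    static String choose(String acceptHeader) {
        if (acceptHeader == null || acceptHeader.trim().isEmpty()) {
            return ATOM_XML; // no stated preference: default format
        }
        for (String range : acceptHeader.split(",")) {
            // Drop parameters such as ";type=entry" or ";q=0.8".
            int semicolon = range.indexOf(';');
            String mediaType = (semicolon < 0 ? range : range.substring(0, semicolon))
                    .trim().toLowerCase(Locale.ROOT);
            if (mediaType.equals(ATOM_JSON)) {
                return ATOM_JSON;
            }
            // Assumption: wildcard ranges fall back to the default XML format.
            if (mediaType.equals(ATOM_XML) || mediaType.equals("*/*")
                    || mediaType.equals("application/*")) {
                return ATOM_XML;
            }
        }
        return null; // caller would typically answer 406 Not Acceptable
    }
}
```

The first listed range that the sketch recognizes wins, so a header of application/atom+json, application/atom+xml selects JSON.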

The ability to view entries and feeds in multiple data formats gives the Tuscany user flexibility in parsing and processing data returned by a service or collection.

Service and Workspace Document Support (application/atomsvc+xml)

This item is implemented by TUSCANY-2597.

Prior to this implementation, a dummy service document was provided when you visited an Atom feed service address with an "atomsvc" extension. For example, running the Atom service binding unit tests, one could visit http://localhost:8080/customer/atomsvc and receive the following service document:

   <?xml version='1.0' encoding='UTF-8'?>
   <service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
      <workspace>
         <atom:title type="text">resource</atom:title>
         <collection href="http://luck.ibm.com:8084/customer">
            <atom:title type="text">collection</atom:title>
            <accept>application/atom+xml;type=entry</accept>
            <categories />
         </collection>
      </workspace>
   </service>

This dummy implementation did not provide a true collection name, URL to the collection, accept MIME types or categories.

Following the inclusion of TUSCANY-2597 and the new implementation, the Tuscany Atom binding correctly populates an atomsvc document with information from the feed and gives correct information for discovery. Now, running the Atom service binding unit tests, one can visit http://localhost:8080/customer/atomsvc and receive the following service document:

<?xml version='1.0' encoding='UTF-8'?>
<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace xml:base="http://localhost:8084/">
    <atom:title type="text">workspace</atom:title>
    <collection href="http://localhost:8084/customer">
      <atom:title type="text">customers</atom:title>
      <accept>application/atom+xml; type=feed</accept>
      <accept>application/json; type=feed</accept>
      <accept>application/atom+xml; type=entry</accept>
      <accept>application/json; type=entry</accept>
      <categories />
    </collection>
  </workspace>
</service>

The service document is now properly populated with URLs, titles, accept MIME types and categories. These are the elements needed for collection discovery and visitation.
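
To show how a client might use such a document for discovery, here is a minimal Java sketch that reads a service document with the JDK's DOM parser and lists each collection's href with its accepted MIME types. The class and method names are hypothetical; hrefs are returned as written, without resolving them against xml:base.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch only: collection discovery from an Atom service document (hypothetical helper). */
final class ServiceDocumentReader {
    private static final String APP_NS = "http://www.w3.org/2007/app";

    /** Maps each collection href to the list of its accept values, in document order. */
    static Map<String, List<String>> acceptsByCollection(String serviceXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // service documents rely on namespaces
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(serviceXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList collections = doc.getElementsByTagNameNS(APP_NS, "collection");
            for (int i = 0; i < collections.getLength(); i++) {
                Element collection = (Element) collections.item(i);
                List<String> accepts = new ArrayList<>();
                NodeList acceptNodes = collection.getElementsByTagNameNS(APP_NS, "accept");
                for (int j = 0; j < acceptNodes.getLength(); j++) {
                    accepts.add(acceptNodes.item(j).getTextContent().trim());
                }
                result.put(collection.getAttribute("href"), accepts);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable service document", e);
        }
    }
}
```

Applied to the populated service document above, the result has a single key, http://localhost:8084/customer, mapped to the four accept values.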

Support for JavaScript for Atom collection

This item is implemented by TUSCANY-2568.

"- A proper Javascript object model for an Atom collection and Atom entries to facilitate the use of Atom in Javascript clients, modeled after the Abdera model for collection and entry."

This item provides a full JavaScript object model for Atom Feeds, Entries, and other data objects. This benefits customers and client developers by providing an easy model to use in HTML, JSP, scripts, GUIs, and other client-side technology.

For example, prior to this feature, developers had to write DOM code to manipulate nodes in the XML document that represented the current feed:

   var entries = feed.getElementsByTagName("entry");              
   var list = "";
   for (var i=0; i<entries.length; i++) {
      var item = entries[i].getElementsByTagName("content")[0].firstChild.nodeValue;
      list += item + ' <br>';
   }

Using the new JavaScript client object model, the code is greatly simplified and easier to understand:

   var entries = feed.getEntries();              
   var list = "";
   for (var i=0; i<entries.length; i++) {
      var item = entries[i].getContent();
      list += item + ' <br>';
   }

A full example showing how to use this client model is given in implementation-widgets-runtime. The store.html page shows the older style of XML document manipulation. The storeJS.html page shows the newer style of JavaScript object manipulation.

(A note on implementation. Although we explored using the Google GData JavaScript client to manipulate data, we decided to build the JavaScript object model from scratch. This module is available as tuscanyAtom.js. This module is embedded into a client, and then all Atom objects, Feed, Entry, Person, Link, etc. are available from the client program.)

Support for postMedia, putMedia and other Streaming Data

This item is implemented by TUSCANY-2567.

The Atom Publishing Protocol provides for a separate space for storage of media resources. The purpose of a separate space for these resources is to keep feeds and entries from growing too large with the contents of typically large items. Media resources can be any binary object that is supported by a MIME type, but typically media resources include video, image, and audio type files.

Media resources are maintained by the collection implementor, and are given a dual identity. There is a location in the media repository, typically where one places the media files, and there is a location in the feed or entry space. This second reference is known as a media link entry.

The package org.apache.tuscany.sca.binding.atom.collection (in the tuscany-binding-atom-abdera module) defines the following methods on the MediaCollection interface:

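    /**
     * Creates a new media entry.
     *
     * @param title title for the new media entry
     * @param slug short name suggested for the media resource
     * @param contentType MIME type of the media, for example image/png
     * @param media stream carrying the binary media data
     * @return the Atom entry that describes the stored media
     */
    Entry postMedia(String title, String slug, String contentType, InputStream media);

    /**
     * Updates an existing media entry.
     *
     * @param id identifier of the media entry to update
     * @param contentType MIME type of the replacement media
     * @param media stream carrying the replacement binary data
     * @throws NotFoundException if no media entry with the given id exists
     */
    void putMedia(String id, String contentType, InputStream media) throws NotFoundException;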

These two methods are used to create (post) new media files and to update (put) existing media. The media resources may be retrieved (get) or removed (delete) via the normal HTTP get and delete operations, using the links returned in the entry by the post and get operations.

For instance, when creating a media resource, one typically posts the following information via an HTTP post request:

POST /edit/ HTTP/1.1
Host: media.example.org
Content-Type: image/png
Slug: The Beach
Authorization: Basic ZGFmZnk6c2VjZXJldA==
Content-Length: nnn

...binary data...

In turn, the Tuscany invocation framework invokes the postMedia method shown above on the media collection implementation. The media collection implementation may then take the binary data from the media InputStream and store it in a media repository. The media collection implementation should construct a proper Entry item via XML construction or an Atom model framework such as Apache Abdera. The Entry should contain the required elements (title, id, updated, summary, content, edit link, edit-media link) in order to provide the proper Atom Publishing Protocol response headers and Entry data as given here:

HTTP/1.1 201 Created
Content-Length: nnn
Content-Type: application/atom+xml;type=entry;charset="utf-8"
Location: http://example.org/media/edit/the_beach.atom

<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <title>The Beach</title>
   <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
   <updated>2005-10-07T17:17:08Z</updated>
   <author><name>Daffy</name></author>
   <summary type="text" />
   <content type="image/png"
      src="http://media.example.org/the_beach.png"/>
   <link rel="edit-media"
      href="http://media.example.org/edit/the_beach.png" />
   <link rel="edit"
      href="http://example.org/media/edit/the_beach.atom" />
</entry>

Note that the edit link provides the Atom Feed link to the media entry, and the edit-media link provides the media repository link to the media entry. Use these links to get and delete media.

A special convention has been implemented to allow the media collection implementation to return properties in the response header. The summary element of the returned Entry may contain a set of key=value properties separated by commas. For example, to return Content-Type and Content-Length values in the response header, one can place this text in the summary element of the Entry returned by postMedia: Content-Type=image/jpg,Content-Length=21642.
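
The key=value convention described above can be read with a few lines of Java. This is an illustrative sketch with hypothetical names, not the parsing code used by the binding; it assumes values never contain a comma, since the comma is the separator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: reads "key=value,key=value" text from an entry summary (hypothetical helper). */
final class SummaryProperties {

    /** Returns the properties in the order they appear; malformed pairs are skipped. */
    static Map<String, String> parse(String summary) {
        Map<String, String> properties = new LinkedHashMap<>();
        if (summary == null) {
            return properties;
        }
        for (String pair : summary.split(",")) {
            int equals = pair.indexOf('=');
            if (equals <= 0) {
                continue; // no key present, so this is not a property
            }
            properties.put(pair.substring(0, equals).trim(),
                           pair.substring(equals + 1).trim());
        }
        return properties;
    }
}
```

For the summary text in the example above, the map holds Content-Type with the value image/jpg and Content-Length with the value 21642, each of which could then be copied into a response header.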

The putMedia method acts in much the same way, but the URI to the item should contain an ID to the media being updated. For instance, if the usual post and get feed URI is http://localhost:8084/receipt, then to update the media given above one would put to the URI http://localhost:8084/receipt/urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a. Here is an example of a put request:

PUT /edit/urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a HTTP/1.1
Host: media.example.org
Content-Type: image/png
Authorization: Basic ZGFmZnk6c2VjZXJldA==
Content-Length: nnn

...binary data...

After the above put request, the Atom binding invokes the putMedia method of the media collection implementation. The media collection should update the media if the ID exists (a 200 OK status code is returned), or throw a NotFoundException if the ID does not exist (a 404 Not Found status code is returned).
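
The post and put behavior described above can be sketched with an in-memory repository. Everything in this Java example is a simplified stand-in: MediaNotFoundException replaces Tuscany's NotFoundException, postMedia returns an id string rather than an Atom Entry and omits the title and slug parameters, and the class does not implement the real MediaCollection interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Stand-in for Tuscany's NotFoundException so the sketch compiles on its own. */
class MediaNotFoundException extends Exception {
    MediaNotFoundException(String id) {
        super("No media with id " + id);
    }
}

/** Sketch only: keeps media bytes in memory, keyed by a generated id. */
final class InMemoryMediaStore {
    private final Map<String, byte[]> media = new HashMap<>();
    private final Map<String, String> contentTypes = new HashMap<>();

    /** Stores new media and returns its id (a real postMedia returns an Entry). */
    String postMedia(String contentType, InputStream in) throws IOException {
        String id = "urn:uuid:" + UUID.randomUUID();
        media.put(id, readAll(in));
        contentTypes.put(id, contentType);
        return id;
    }

    /** Replaces existing media; an unknown id is reported so the caller can answer 404. */
    void putMedia(String id, String contentType, InputStream in)
            throws IOException, MediaNotFoundException {
        if (!media.containsKey(id)) {
            throw new MediaNotFoundException(id);
        }
        media.put(id, readAll(in));
        contentTypes.put(id, contentType);
    }

    /** Number of bytes currently stored for the id. */
    int length(String id) {
        return media.get(id).length;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        return out.toByteArray();
    }
}
```

A production implementation would write to a real media repository and build the media link entry with the edit and edit-media links; the sketch only shows the update-or-not-found decision.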

The above scenarios are documented in the MediaCollectionTestCase unit test case in the binding-atom-abdera module in the Tuscany code base.

Security on Sensitive Commands (Delete, DeleteAll, etc.)

This item is implemented by TUSCANY-2569.