Release date: 2018-01-17
The Apache Taverna (incubating) team is pleased to announce the release of:
- apache-taverna-server 3.1.0-incubating
This announcement is also available at: https://s.apache.org/taverna-server-3.1.0
Apache Taverna Server enables you to set up a dedicated server for remotely executing Taverna workflows, exposed as REST Web APIs and WSDL Web Services. This enables integration of Taverna workflows into web portals and mobile applications, and allows desktop users to execute workflows remotely on local server or cloud infrastructure. Taverna Server can be configured for POSIX user isolation (e.g. using sudo) and can restrict which workflows may be executed.
Taverna Server runs workflows defined with Apache Taverna Language in SCUFL2 format (.wfbundle). It can also execute many Taverna 2 workflows (.t2flow), depending on which activities they require.
Download apache-taverna-server 3.1.0-incubating: https://taverna.incubator.apache.org/download/server/
Taverna Java libraries
Apache Taverna Server relies on these earlier releases:
- Apache Taverna Command-line Tool, which can run Taverna workflows from a command prompt or shell script.
- Apache Taverna Language, a Java API to create, inspect, modify and convert SCUFL2 workflow definitions and Research Object Bundles.
For a complete list of Apache Taverna libraries, see https://taverna.incubator.apache.org/download/ and https://taverna.incubator.apache.org/code
Major changes since Taverna Server 2.5.4 are described below:
Taverna 3 execution
Workflows are now executed with Apache Taverna Command-line Tool 3.1.0-incubating, which adds support for executing SCUFL2 workflows using the Apache Taverna Engine.
Source Code Name Changes
Package names have changed to org.apache.taverna.server.* and source code modules have been reorganized. See the Javadoc for details.
Taverna Server Client
This release adds the taverna-server-client library, which can be used by Java clients to connect to the Taverna Service REST API.
Removed / Retired Features
These activities are no longer supported directly by Apache Taverna:
For interested developers, the source code for these activities is available separately as part of Taverna Plugin Bioinformatics and Taverna Extras.
For details of bug fixes since Taverna Server 3.0, see the JIRA issue tracker.

Building

To build Apache Taverna Server you will need:
- Java 1.8 (tested with OpenJDK 1.8)
- Apache Maven 3.3.9 or newer
- a Java Servlet container, e.g. Apache Tomcat
See the included taverna-server README for details on how to build the source code.
For full documentation of installing, configuring and using Taverna Server, see https://taverna.incubator.apache.org/documentation/server/
Please subscribe to and contact the dev@taverna mailing list for any questions, suggestions, and discussions about Apache Taverna.
Bugs and feature plans are tracked in the JIRA Issue tracker under the corresponding components. Anyone may add, comment or work on an issue!
Apache Taverna is an effort undergoing incubation at The Apache Software Foundation (ASF) sponsored by the Apache Incubator PMC. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
The identifier rules include important aspects such as global uniqueness, structure and context.
When we mint identifiers as part of a Taverna workflow run (which might be running on a desktop computer behind a firewall), we mint structured URIs using a combination of UUIDs and structural information.
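As a sketch, minting such a run identifier only needs a fresh UUID and a structural path segment. The base URI and path layout below are illustrative assumptions, not the server's actual configuration:

```java
import java.net.URI;
import java.util.UUID;

public class RunIdMinter {
    // Hypothetical base URI; a real deployment would configure its own authority.
    private static final URI BASE = URI.create("http://example.org/taverna/");

    /** Mint a globally unique, structured identifier for a new workflow run. */
    public static URI mintRunUri() {
        UUID runId = UUID.randomUUID();
        // Structural segment "run/" followed by a fresh UUID per run
        return BASE.resolve("run/" + runId + "/");
    }

    public static void main(String[] args) {
        System.out.println(mintRunUri());
    }
}
```

Because the UUID is generated locally, this works even on a desktop machine behind a firewall, with no central registry involved.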
A URI like this identifies the Taverna workflow "run" 385c794c-ba11-4007-a5b5-502ba8d14263a - each run gets a new UUID.
One advantage here is that our server, while not having access to the distributed workflow runs across the world, can still at least say that it is a workflow run - so it redirects to:
Each run generates workflow data items with internal identifiers, e.g.,
Here you would think c2f58d3e-8686-40a5-b1cd-b797cd18fbb7 would be sufficient to identify this data item, but we choose to include the run UUID as well, as data is always generated as part of a workflow run and this can help find the data.
Because of "list/" we can also say that it's a collection (it will always be a collection) - meaning that the guesser service can say it's a prov:Collection (but it does not say what elements are part of this collection).
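The structural conventions above ("run/" for runs, "list/" for collections) are enough for a type-guessing service to work without dereferencing anything. A minimal sketch, assuming the path layout shown in these examples; the vocabulary terms returned are illustrative:

```java
import java.net.URI;

/** Sketch of a 'guesser' service that infers a type purely from URI structure.
    The path conventions and returned type names are assumptions for illustration. */
public class UriGuesser {
    public static String guessType(URI uri) {
        String path = uri.getPath();
        if (path.contains("/list/")) {
            // "list/" always marks a collection, though not its members
            return "prov:Collection";
        }
        if (path.contains("/run/")) {
            // structurally a workflow run, even if we can't dereference it
            return "wfprov:WorkflowRun";
        }
        return "unknown";
    }
}
```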
For the data, we deliberately did NOT include the UUID of the workflow DEFINITION that was run, as that could be leaking information, e.g., that you are running someone else's workflow.
Similarly, elements of a collection have their own UUID-based identifiers, as they could be part of multiple collections.
But when the hierarchy is fixed, and the identifier only makes sense within a hierarchical structure, then I don't mind it being present in the identifier.
Hierarchical structures go hand-in-hand with typing, e.g., the slightly lengthy identifier:
It's "OK" that the URL is not very friendly, as it would only appear within a detailed provenance trace - but it's important that the identifier is unique and consistent wherever the particular workflow definition is run - that way we can relate provenance across workflow runs.
Here you can see multiple global and local identifiers and associated types:
- workflow bundle: 01348671-5aaa-4cc2-84cc-477329b70b0d (a UUID)
- workflow: Hello_Anyone (a potentially nested workflow in the bundle)
- processor: Concatenate_two_strings (a step in the workflow)
- input port: string1 (a particular input parameter of that step)
This being an HTTP URI, the 'guesser' service can recreate the above structure, even without knowing anything else about the workflow definition.
The local identifier 'string1' depends on each of the above to be contextualized and globally unique.
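Such a hierarchical identifier can be assembled mechanically from its parts. A sketch, assuming path segment names like those in the example above:

```java
import java.net.URI;
import java.util.UUID;

/** Builds hierarchical, SCUFL2-style identifiers from their constituent parts.
    The namespace and segment names follow the pattern of the example above
    and should be treated as assumptions, not a normative scheme. */
public class HierarchicalId {
    static final URI NS = URI.create("http://ns.taverna.org.uk/2010/workflowBundle/");

    public static URI inputPortUri(UUID bundle, String workflow,
                                   String processor, String port) {
        // Each level contextualizes the next: bundle > workflow > processor > port
        return NS.resolve(bundle + "/workflow/" + workflow
                + "/processor/" + processor + "/in/" + port);
    }
}
```

Note how the local name 'string1' only becomes globally unique once prefixed by every level above it.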
While we could just have minted UUIDs for each part of the workflow, e.g.,
this actually adds a new problem: now the identifiers all look the same and are useless for manual debugging (e.g., just having a "P" prefix for a protein type helps humans wade through debug output).
Another problem comes with versioning: when does our workflow processor step input port "change" and get a new identifier? What if the entire rest of the workflow evolves? Do we re-assign every UUID across the structure, or can the same UUID appear in many workflow versions? Each approach has advantages and disadvantages.
By making the unit of change the workflow bundle (that is, the ZIP file of the workflow definition you can download), we also get a single point at which to update the hierarchical identifiers for all the constituent parts on any change. This means, code-wise, we have to be careful to use relative identifiers internally and absolute identifiers externally.
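That internal/external split maps neatly onto java.net.URI's resolve/relativize pair, as this sketch shows (the base URI here is illustrative):

```java
import java.net.URI;

/** Internally the bundle stores relative identifiers; externally they are
    resolved against the bundle's base URI. Bumping the bundle version then
    only changes the base, never the relative parts. */
public class RelativeIds {
    /** External form: relative identifier resolved against the bundle base. */
    public static URI toAbsolute(URI bundleBase, String relative) {
        return bundleBase.resolve(relative);
    }

    /** Internal form: absolute identifier relativized against the bundle base. */
    public static URI toRelative(URI bundleBase, URI absolute) {
        return bundleBase.relativize(absolute);
    }
}
```

For resolve() to append rather than replace, the base URI must end with a trailing slash, which is a common pitfall with java.net.URI.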
BTW, we keep a provenance trace of those UUID ancestors.