As described under ACE-347 we want a new Management Agent (MA). The initial driver was mostly about reducing complexity and improving extensibility. However, in the meantime we have also discussed, on several occasions, improving its capabilities and its manageability from application code.

This page describes the requirements, analysis and design for such an MA rewrite; (custom) launchers are out of scope. The MA will still be based on the standard OSGi DeploymentAdmin (DA) specification, assuming OSGi Core/Cmpn 4.3 or above.

Requirements

  1. The MA must be a regular bundle that can run in any standard OSGi container using a standard lifecycle, allowing it to be used in as many deployment scenarios as possible.
  2. The MA must have a small footprint in terms of size and resource requirements, allowing it to be used on resource constrained devices.
  3. The MA must be self contained and not require packages or services beyond the framework, allowing it to be easily portable and manageable.
  4. The MA must not run or expose services that are commonly used by applications and thus may introduce conflicts, allowing it to be compatible with all types of use-cases.
  5. The MA must be compatible with standard resource processors, allowing the use of 3rd party code.
  6. The MA standard subsystems must be replaceable through extension, allowing the introduction of prepackaged custom implementations.
  7. The MA standard functions must be fully manageable through a control API, allowing an application to take full control of its operation.
  8. The MA should publish events for all relevant lifecycle and status changes, allowing an application to observe them.
  9. The MA should support resume semantics on DP downloads, allowing control over downloads and minimizing bandwidth usage in network failure scenarios.
  10. The MA should be able to report on the estimated download sizes of DP downloads, allowing a controller to make decisions on downloads.
  11. The MA should be able to handle a 'backoff' from the server, allowing a server to signal that it should report back later but not the next day or week.

Analysis

MA control API

The MA operations should be manageable by an external process, allowing full control over checks, updates, downloads etc. This may be the deployed application or even some process outside the JVM operating through JMX or DMT. Therefore a ControlAPI is defined that will be published as a service and at least supports;

  • controlling check/update tasks
  • controlling download process
  • controlling feedback push
  • controlling MA configuration
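As an illustration of how application code might use such a service, here is a minimal sketch. The interface `AgentControlSketch`, its method names and the `BudgetController` class are all hypothetical and are not taken from the actual agent API; the operations are also modeled as blocking calls for brevity, whereas real downloads would likely run asynchronously.

```java
/** Hypothetical shape of the control service; all names are illustrative only. */
interface AgentControlSketch {
    /** Returns the highest version available on the server, or null if up to date. */
    String checkForUpdate();

    /** Returns the (estimated) size of the deployment package for a version. */
    long estimatedSize(String version);

    /** Downloads the deployment package for a version (blocking in this sketch). */
    void download(String version);

    /** Installs a previously downloaded deployment package. */
    void install(String version);
}

/** Example custom controller: only updates when the package fits a size budget. */
final class BudgetController {
    private final AgentControlSketch agent;
    private final long maxSize;

    BudgetController(AgentControlSketch agent, long maxSize) {
        this.agent = agent;
        this.maxSize = maxSize;
    }

    /** Runs one check/download/install cycle; returns true if an update was installed. */
    boolean runOnce() {
        String version = agent.checkForUpdate();
        if (version == null) {
            return false; // nothing to do
        }
        if (agent.estimatedSize(version) > maxSize) {
            return false; // too large for this device or connection, skip for now
        }
        agent.download(version);
        agent.install(version);
        return true;
    }
}
```

A controller like this could replace the default scheduling behaviour, for example to ask a user for confirmation before calling `install`.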

MA updates

The MA will be a standard bundle and thus can be updated. However, by specification this cannot be part of a standard deployment session, and no solution is given as to how and when the MA update is shipped and handled. Therefore this topic has a separate analysis and design.

MA deployment admin

The MA code deployment logic is based on the Deployment Admin spec and will leverage the existing Apache Felix implementation. In order to satisfy requirement 3 this implementation should be embedded as part of the MA bundle allowing full control over module and service layer.

MA packages

The MA should expose as little as possible to ensure maximum compatibility. However, for Resource Processors (RPs) to work, at least the DA packages must be exported. Therefore the MA will export these compendium packages;

  • org.osgi.service.deploymentadmin;version=1.1
  • org.osgi.service.deploymentadmin.spi;version=1.0

Note: RPs may require additional packages (e.g. log/event/cm) to resolve. These must be shipped as part of the DP. As ACE 'magically' adds those RPs, these package requirements may not always be obvious. However, this is not an MA concern.

MA Configuration

The MA will have several configuration options which may be provided at startup (e.g. system properties or maybe extension) supporting initial provisioning, but may also be updated throughout its lifecycle (e.g. control API). Using CM would violate requirements 2, 3 and possibly 4. Therefore, the MA will implement its own simple form of persisted configuration using bundle data. Configuration must at least cover;

  • configuring schedules/strategies/logging (autodownload/streaming)
  • configuring default subsystem implementations (identification/discovery/connections)
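A minimal sketch of such a self-managed configuration store, assuming a plain `java.util.Properties` file in the bundle's private data area. The class name `AgentConfigStore` and the file name are hypothetical; in a real bundle the directory would be obtained from `BundleContext.getDataFile("")`, which is passed in here as a plain `File` so the sketch has no OSGi dependency.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

/** Hypothetical sketch: agent configuration persisted as a properties file. */
final class AgentConfigStore {
    private final File file;
    private final Properties values = new Properties();

    /** dataDir stands in for the bundle data area. */
    AgentConfigStore(File dataDir) throws IOException {
        this.file = new File(dataDir, "agent-config.properties");
        if (file.isFile()) {
            try (InputStream in = new FileInputStream(file)) {
                values.load(in);
            }
        }
    }

    /** Persisted value first, then a system property (initial provisioning), then the default. */
    synchronized String get(String key, String defaultValue) {
        String value = values.getProperty(key);
        if (value == null) {
            value = System.getProperty(key);
        }
        return value != null ? value : defaultValue;
    }

    /** Updates a value (for example through the control API) and writes it to disk immediately. */
    synchronized void put(String key, String value) throws IOException {
        values.setProperty(key, value);
        try (OutputStream out = new FileOutputStream(file)) {
            values.store(out, "Management Agent configuration");
        }
    }
}
```

Letting persisted values take precedence over system properties means that a value changed at runtime survives a restart, while system properties still cover first-time provisioning.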

MA Events

OSGi CMPN 4.3 114.11 states that the DA must publish events to the Event Admin (EA) for the purpose of, for example, reporting progress. But there may be no EA deployed or it may be stopped as part of an update session. Providing an EA from the DA violates requirements 2, 3 and possibly 4. Thus we will publish events only if an EA is provided by the application and not require the package import. As a consequence, but in line with the spec intention, these events must be regarded as informational only.

The same semantics must apply for custom MA events that will be defined and published to allow applications to monitor MA lifecycle and status updates that are outside the DA spec. These may include;

  • download started
  • download failed (message) (both with auto-install and download-then-install)
  • download complete
  • download paused
  • download resumed
  • download cancelled

MA Subsystems

The MA functionality is provided by a set of logical subsystems with distinct responsibilities. Customization can happen through configuration and MA extension. Below is a high-level overview of these subsystems, which will be detailed during design.

  • ControlApi; provides control API, core functionality, can not be replaced, provides published service
  • AgentImpl; provides core functions, core functionality, can not be replaced, can not be directly accessed
  • IdentificationHandler; specifies immutable agent identity, default configurable functionality, can be replaced through MA extension
  • DiscoveryHandler; specifies server endpoint(s), default configurable functionality, can be replaced through MA extension
  • ConnectionHandler; handles/secures connections, default configurable functionality, can be replaced through MA extension

MA Extensions

Replacing an MA subsystem means that a default implementation is replaced by a custom one at packaging time, effectively creating a custom agent bundle. This guards requirement 3 and ensures that the custom MA can be updated through the standard process.

MA Resumable Downloads

Support for resumable downloads allows the agent to resume a download after either a network failure or an intentional pause. This has to be modeled into the deployment REST API and handled by the agent.

This can be modeled using HTTP range headers. For example;

> GET /deployment/target/version/4.0.0?current=3.0.0 HTTP/1.1
> host: localhost
> Range: bytes=500-999

< HTTP/1.1 206 Partial Content
< Content-Range: bytes 500-999/*

Note: As the ACE server at present only generates DPs on the fly, the asterisk is used to signify an unknown entity size.
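On the agent side, resuming comes down to sending the right Range header and checking that the returned Content-Range continues where the local data ends. A minimal sketch, assuming the header formats shown in the example above; the class `RangeSupport` and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for resuming a DP download with HTTP range headers. */
final class RangeSupport {
    // Matches values such as "bytes 500-999/*" or "bytes 500-999/1000".
    private static final Pattern CONTENT_RANGE =
        Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+|\\*)");

    /** Range request header value to continue after bytesAlreadyStored bytes. */
    static String resumeHeader(long bytesAlreadyStored) {
        return "bytes=" + bytesAlreadyStored + "-";
    }

    /** Returns {first, last, total}; total is -1 when the server sent '*'. */
    static long[] parseContentRange(String headerValue) {
        Matcher m = CONTENT_RANGE.matcher(headerValue.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported Content-Range: " + headerValue);
        }
        long total = "*".equals(m.group(3)) ? -1L : Long.parseLong(m.group(3));
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)), total };
    }

    /** True when the returned range starts exactly where the local data ends. */
    static boolean canAppend(long bytesAlreadyStored, String contentRange) {
        return parseContentRange(contentRange)[0] == bytesAlreadyStored;
    }
}
```

If `canAppend` returns false, the safest reaction is probably to discard the partial file and restart the download from the beginning.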

DP Download Size

Agent controllers on devices that have limited bandwidth or connectivity can query the server about the (estimated) size of a deployment package. This has to be modeled into the deployment REST API and handled by the agent.

This can be done by leveraging the HTTP HEAD request and adding an additional informational header. For example;

> HEAD /deployment/target/version/4.0.0?current=3.0.0 HTTP/1.1
> host: localhost

< HTTP/1.1 200 OK
< X-ACE-DpSize: 220

Note: If the server could determine actual size the Content-Length header could be used. However, the server does not pre-generate DPs thus an estimate is the best it can do.
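A minimal sketch of how an agent could read the size, assuming the `X-ACE-DpSize` header from the example above and preferring `Content-Length` should the server ever provide it. The class `DpSizeProbe` and its method names are hypothetical, and the unit of the estimate is whatever the server defines.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper that asks the server for the size of a DP. */
final class DpSizeProbe {

    /** Prefers an exact Content-Length, falls back to the estimate header, -1 when unknown. */
    static long sizeFrom(String contentLength, String dpSizeHeader) {
        Long exact = parse(contentLength);
        if (exact != null) {
            return exact;
        }
        Long estimate = parse(dpSizeHeader);
        return estimate != null ? estimate : -1L;
    }

    /** Parses a non-negative number; returns null for missing or malformed values. */
    private static Long parse(String value) {
        if (value == null) {
            return null;
        }
        try {
            long parsed = Long.parseLong(value.trim());
            return parsed >= 0 ? Long.valueOf(parsed) : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Issues a HEAD request and reads the size headers without downloading the body. */
    static long probe(URL dpUrl) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) dpUrl.openConnection();
        try {
            connection.setRequestMethod("HEAD");
            return sizeFrom(connection.getHeaderField("Content-Length"),
                            connection.getHeaderField("X-ACE-DpSize"));
        } finally {
            connection.disconnect();
        }
    }
}
```

A controller can compare the result against its own limits and treat -1 as "size unknown, decide by policy".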

BackOff exceptions

Servers handling many targets may get stormed and need to tell targets to report back after a certain amount of time. This keeps targets from polling the server continuously, without forcing them to wait for their (possibly long) configured interval before reporting back. This has to be modeled into the deployment REST API and handled by the agent.

It seems most appropriate to leverage the HTTP 503 status code and add an additional informational header. For example;

> GET /deployment/target/version/1.0.0 HTTP/1.1
> host: localhost

< HTTP/1.1 503 Service Unavailable
< X-ACE-BackOff: 300

Note: Marcel found there is actually a standard Retry-After header for this, defined in HTTP/1.1 (RFC 2616, section 14.37).

> GET /deployment/target/version/1.0.0 HTTP/1.1
> host: localhost

< HTTP/1.1 503 Service Unavailable
< Retry-After: 300
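On the agent side this could be handled roughly as follows. This is a minimal sketch that assumes the Retry-After value is given in seconds, as in the example above; the HTTP-date form of the header is not handled and falls back to the default interval. The class `BackOffPolicy` and its method name are hypothetical.

```java
/** Hypothetical helper that decides when the agent should contact the server again. */
final class BackOffPolicy {

    /**
     * Returns the number of seconds to wait before the next poll.
     * Uses Retry-After only on a 503 response; otherwise the configured default applies.
     */
    static long nextDelaySeconds(int statusCode, String retryAfter, long defaultIntervalSeconds) {
        if (statusCode != 503 || retryAfter == null) {
            return defaultIntervalSeconds;
        }
        try {
            long seconds = Long.parseLong(retryAfter.trim());
            return seconds >= 0 ? seconds : defaultIntervalSeconds;
        } catch (NumberFormatException e) {
            // Not a number, for example an HTTP-date: not handled in this sketch.
            return defaultIntervalSeconds;
        }
    }
}
```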

Design

See the API in the org.apache.ace.agent project

3 Comments

  1. this looks quite good. I have some additional questions:

    1. the GrandCentralController is an example or a default implementation of an AgentControl client. An alternative implementation could be part of an application which is controlling the SW update, e.g. by asking the user for confirmation etc. Is this correct?
    2. To the MA Events: Who/which subsystem/service is sending these events? You mentioned above that these events are only raised to inform interested "objects". As I understand this, the events could also be used for continuing the update process. For example, calling the method downloadVersion starts the download (I suppose the download is executed in the background). When the download is finished, an event is raised, and the install* method could be called. Or do I have to register a CompletionListener?

    Can you please add some example code to get a better understanding of your ideas?
    I think I understood your proposal and so far I agree to it.

    Do you also plan to collect all the MA code in one project as you already started with the agent project? I would support this over having the target deployment code in a separate deployment project, as it is currently.

    1. On 1) Yes, the default "grandcentral" controller can be configured/disabled and the control API will be exposed as a service so that a custom controller can take full control.

      On 2) Yes, the download is asynchronous. You can wait using a CompletionListener or block on the result method. Haven't worked out the event mechanism yet as I think we do not want to include EventAdmin in the agent, but we need to address this anyway for DeploymentAdmin.

      > Can you please add some example code to get a better understanding of your ideas?
      Will add an update shortly

      > Do you also plan to collect all the MA code in one project as you already started with the agent project
      Yes

  2. Hi,

    Thanks for these efforts. I am interested in utilising the ACE system, but the current implementation is too heavyweight for inclusion in my limited target device in which disk space is at a particular premium.

    Therefore, it's great to see requirement number 2 at so prominent an index. To achieve this, I see you intend to remove the bundled EA? Is there any further information or ideas about how you intend to reduce the resource requirements of the MA distribution?

    One other thing occurs: will resumable download support require disk space on the target?
    EDIT: One more thing occurs. Resource-constrained devices may well (and indeed do, in our case) not offer very relevant JVMs. For example, our target device runs JamVM 1.4.3 and GNU Classpath 0.91 (e.g. sporadic Java 1.4.2 API adherence).

    Many thanks, Dan.