Repositories currently store an unlimited number of versioned blobs. https://issues.apache.org/jira/browse/ACE-331 describes our desire to limit that to the last N versions. This page describes the design changes needed to support that.

Configuration changes

The first thing we should address is how to configure the number of versions. The repository is already a managed service factory, so the number of versions can easily be configured this way, per repository. If you omit this configuration, the default should be the current behaviour: unlimited versions. When the configuration changes, we should apply the change as soon as possible, trimming the number of stored versions if necessary.
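As a minimal sketch of the configuration handling described above, the limit could be read from the properties dictionary that a managed service factory receives on update. The property key "limit", the class name and the UNLIMITED sentinel are assumptions made for illustration; this page does not define them.

```java
import java.util.Dictionary;

class VersionLimitConfig {
    /** Hypothetical property key; the real key name is not defined on this page. */
    static final String KEY_LIMIT = "limit";

    /** Sentinel for "keep every version", used when the property is omitted. */
    static final long UNLIMITED = Long.MAX_VALUE;

    /**
     * Reads the version limit from a configuration dictionary.
     * A missing dictionary or missing key means unlimited versions.
     */
    static long parseLimit(Dictionary<String, ?> properties) {
        Object value = (properties == null) ? null : properties.get(KEY_LIMIT);
        if (value == null) {
            return UNLIMITED; // default: current behaviour
        }
        long limit;
        try {
            limit = Long.parseLong(value.toString().trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a number: " + value, e);
        }
        if (limit < 1) {
            throw new IllegalArgumentException("Limit must be at least 1: " + limit);
        }
        return limit;
    }
}
```

In an actual managed service factory, the invalid cases would more likely be reported as a configuration error from the update callback than as an IllegalArgumentException; the exception type here only keeps the sketch free of OSGi dependencies.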

Limiting the versions

The normal way to add a new version to a repository is by committing it. If we know the limit, it's easy to remove one old version every time a new one comes in (if we're at the limit).
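The trimming step can be sketched as follows, assuming versions are consecutive numbers starting at 1 and are held in a sorted map. The class LimitedStore and its methods are hypothetical and much simpler than the real repository interface, which works with streams and checks the version a commit is based on. The same trim routine serves both cases: a commit removes at most one old version, while lowering the limit through configuration may remove several.

```java
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class LimitedStore {
    // Version number -> blob, sorted so the oldest version is always first.
    private final TreeMap<Long, byte[]> versions = new TreeMap<>();
    private long limit;

    LimitedStore(long limit) {
        this.limit = limit;
    }

    /** Stores a new version and drops the oldest one if the limit is exceeded. */
    synchronized long commit(byte[] blob) {
        long next = versions.isEmpty() ? 1 : versions.lastKey() + 1;
        versions.put(next, blob);
        trim();
        return next;
    }

    /** Applies a changed limit immediately, trimming old versions if needed. */
    synchronized void setLimit(long newLimit) {
        limit = newLimit;
        trim();
    }

    /** Returns a snapshot of the version numbers currently stored. */
    synchronized SortedSet<Long> versions() {
        return new TreeSet<>(versions.keySet());
    }

    // Removes the lowest-numbered versions until at most 'limit' remain.
    private void trim() {
        while (versions.size() > limit) {
            versions.pollFirstEntry();
        }
    }
}
```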

Consequences for replication

Repositories also have a replication interface, which is used by a task that tries to replicate a repository with a remote server. This task first asks the remote server which versions it has, then compares them with the local versions, and finally fetches every missing version. All we need to do here is make sure that we only ever fetch versions that are in our limited range. We do that by taking the union of the remote and local versions and then synchronising only the ones from (highest - limit + 1) to (highest), a range that holds exactly limit versions.
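The selection of versions to fetch could look roughly like this. The sketch assumes version numbers are non-negative and that the window should hold exactly limit version numbers, so its lower bound is highest - limit + 1. The class and method names are hypothetical and not part of the existing replication interface.

```java
import java.util.SortedSet;
import java.util.TreeSet;

class ReplicationWindow {
    /**
     * Returns the remote versions that are missing locally and fall inside
     * the window of the last 'limit' version numbers known to either side.
     */
    static SortedSet<Long> versionsToFetch(SortedSet<Long> local,
                                           SortedSet<Long> remote,
                                           long limit) {
        TreeSet<Long> missing = new TreeSet<>();

        // The highest version is taken from the union of both sides.
        TreeSet<Long> union = new TreeSet<>(local);
        union.addAll(remote);
        if (union.isEmpty()) {
            return missing;
        }
        long highest = union.last();

        // Lower bound of the window; never below 0 (assumed minimum version).
        long lowest = Math.max(0L, highest - limit + 1);

        // Only remote versions at or above the lower bound are candidates.
        for (long version : remote.tailSet(lowest)) {
            if (!local.contains(version)) {
                missing.add(version);
            }
        }
        return missing;
    }
}
```

For example, with local versions 1 to 3, remote versions 1 to 7 and a limit of 3, only versions 5, 6 and 7 are fetched; versions 4 and below are outside the window and are never transferred.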
