A specification for changes needed in FSFS and the FS API in order to support moves.

See also the specification of the semantics of moves: MoveDev, "How to Add Moves to Svn", section "Move Semantics".

FS API

These new APIs are required:

  1. A way to find what has moved between two repository trees. (In the repos layer, responding to the "report", this needs to be a comparison between a mixed-revision state and a single-revision tree.)
  2. A way to record moves during a commit.

1. Query

New query API to find the same node-line in another revision: See under “Commit Editor and Query Functions (FSFS)”.

The query API could range from small-scale forms, such as querying the path at which a given node-line-id lives in a particular revision, to large-scale forms, such as asking for all the moves between a given pair of revisions across the whole versioned tree. The form suggested here is at an intermediate level: the children of a directory.

  • Given two versioned directories, DIR1@REV1 and DIR2@REV2 (REV1 != REV2), return a list of (NAME1, NODE-LINE-ID, NAME2) tuples, one for each node that is an immediate child of DIR1@REV1 and/or of DIR2@REV2 and is moved in REV2 relative to REV1. In a given entry, NAME1 is null if the node moved into DIR2 from elsewhere, and NAME2 is null if the node moved out of DIR1.

That form reports renames directly and enables the caller to build up a mapping of cross-directory moves by combining the results of multiple queries.
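
As an illustration only, the directory-level query can be modelled on plain dictionaries. Everything in this sketch is hypothetical and not part of any existing FS API: the function name `moved_children`, the representation of a directory as a mapping from child name to node-line-id, and the `alive1`/`alive2` sets (all node-line-ids present anywhere in REV1 and REV2), which are used to tell "moved out" from "deleted" and "moved in" from "added". It also assumes DIR1 and DIR2 are the same directory node-line.

```python
from typing import Dict, List, Optional, Set, Tuple

# (NAME1, NODE-LINE-ID, NAME2); a name is None when the node is not a
# child of that directory in that revision.
MoveEntry = Tuple[Optional[str], str, Optional[str]]


def moved_children(
    dir1: Dict[str, str],  # child name -> node-line-id in DIR1@REV1
    dir2: Dict[str, str],  # child name -> node-line-id in DIR2@REV2
    alive1: Set[str],      # assumption: all node-line-ids present in REV1
    alive2: Set[str],      # assumption: all node-line-ids present in REV2
) -> List[MoveEntry]:
    """Return the children that are moved in REV2 relative to REV1."""
    ids1 = {nid: name for name, nid in dir1.items()}
    ids2 = {nid: name for name, nid in dir2.items()}
    moves: List[MoveEntry] = []
    for nid in sorted(set(ids1) | set(ids2)):
        name1, name2 = ids1.get(nid), ids2.get(nid)
        if name1 is not None and name2 is not None:
            if name1 != name2:
                moves.append((name1, nid, name2))  # renamed in place
        elif name1 is not None:
            if nid in alive2:
                moves.append((name1, nid, None))   # moved out of DIR1
            # otherwise the node was deleted, which is not a move
        elif nid in alive1:
            moves.append((None, nid, name2))       # moved into DIR2
        # otherwise the node was newly added, which is not a move
    return moves
```

A caller could pair up a (NAME1, ID, null) entry from one directory with a (null, ID, NAME2) entry from another directory to reconstruct a cross-directory move.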

Since that form only looks at a directory's children, we will also need a single-node query. It could be in this form:

  • Given an existing node PATH1@REV1 and an existing revision REV2, return PATH2 at which the same node-line lives in REV2, or null if it does not.
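
A minimal sketch of the single-node query, again on a toy model rather than the real FS layer. The name `find_node_line` and the representation of each revision as a mapping from path to node-line-id are assumptions made for illustration; a real implementation would presumably not scan the whole tree.

```python
from typing import Dict, Optional

Tree = Dict[str, str]  # hypothetical model: path -> node-line-id


def find_node_line(
    trees: Dict[int, Tree], path1: str, rev1: int, rev2: int
) -> Optional[str]:
    """Return the path in REV2 of the node-line at PATH1@REV1, or None."""
    node_line = trees[rev1][path1]  # raises KeyError if PATH1@REV1 is absent
    for path, nid in trees[rev2].items():
        if nid == node_line:
            return path
    return None
```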

2. Commit

The following new method is needed:

  • svn_fs_move(svn_fs_root_t *root, from_path, to_path)

Move the subtree at 'from_path' to 'to_path' within a transaction. 'root' is the root of a transaction. 'from_path' is a path that exists within the transaction. 'to_path' is a path that does not exist within the transaction, but whose parent path exists there and is a directory. If these conditions are not met, return an error.

[### Here I assumed we'll choose the sequential model for this API, but that may not be the best option. A 'commutative' model is also being considered, in which we would define move as copying from the base revision rather than from the current state of the transaction.]
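
The preconditions above, under the sequential model, can be sketched on a toy transaction. This is not the proposed C implementation: `txn_move`, `MoveError` and the dictionary layout (path -> (kind, node-id)) are hypothetical. The check that 'to_path' is not inside 'from_path' is an extra assumption that the text above does not state.

```python
from typing import Dict, Tuple

Txn = Dict[str, Tuple[str, str]]  # hypothetical: path -> (kind, node-id)


class MoveError(Exception):
    """Raised when the preconditions for a move are not met."""


def txn_move(txn: Txn, from_path: str, to_path: str) -> None:
    """Move the subtree at from_path to to_path within the transaction."""
    if from_path not in txn:
        raise MoveError("from_path does not exist: " + from_path)
    if to_path in txn:
        raise MoveError("to_path already exists: " + to_path)
    parent = to_path.rsplit("/", 1)[0] or "/"
    if parent not in txn or txn[parent][0] != "dir":
        raise MoveError("parent of to_path is not a directory: " + parent)
    # Extra assumption, not in the spec text: a subtree cannot move into itself.
    if to_path.startswith(from_path.rstrip("/") + "/"):
        raise MoveError("to_path lies inside from_path")

    # Sequential model: re-key the subtree as it currently is in the
    # transaction, preserving each node's identity.
    subtree = [p for p in txn if p == from_path or p.startswith(from_path + "/")]
    moved = {to_path + p[len(from_path):]: txn.pop(p) for p in subtree}
    txn.update(moved)
```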

FSFS

Changes needed to extend FSFS format 6 (or 7?).

  • A lazy child of a copied node always gets a new copy-id, never the copy-id of its parent, when un-lazified. If "move" implies "preserve node-id and copy-id", which is the sanest interpretation, then we have to make this change because the current copy-id inheritance logic can cause unique key clashes once moves are enabled.
  • New FS vtable methods to implement the interfaces defined in the 'FS' section.
  • 'changes' list: record a move as a 'move' entry.
  • Adjust the 'rebase' code to take account of moves.
  • Adjust implementation of existing query APIs to report moves as copy-and-delete, for back-compat.
  • (Possibly) A mechanism to flag whether move-aware semantics are in use.
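
The back-compat item in the list above (reporting moves as copy-and-delete) might look roughly like the following. The change-record layout, the field names and the function `moves_as_copy_and_delete` are all hypothetical, as is the choice of the base revision as the copy source; this is a sketch of the idea, not of the FSFS 'changes' format.

```python
from typing import List


def moves_as_copy_and_delete(changes: List[dict], base_rev: int) -> List[dict]:
    """Rewrite 'move' records as an add-with-history plus a delete."""
    out: List[dict] = []
    for ch in changes:
        if ch["kind"] == "move":
            # Assumption: the copy source is the move source at the base revision.
            out.append({"kind": "add", "path": ch["to"],
                        "copyfrom": (ch["from"], base_rev)})
            out.append({"kind": "delete", "path": ch["from"]})
        else:
            out.append(dict(ch))  # other change kinds pass through unchanged
    return out
```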

New svn_fs_move implementation

[? not sure] The moved node gets a new node-rev-id using the same rules as for a content modification (regardless of whether the content is being modified). Thus it gets a new revision-id and keeps the same (node-id, copy-id), unless it was a lazy-copied child, in which case it gets a new (unique) copy-id.

[Reason for creating a new node-revision: ...]

New svn_fs query implementations

[TODO]

Adjust the 'rebase' code to take account of moves

The 'rebase' (aka 'merge') that happens just before a commit is finalized must be adjusted to take account of moves. This should simply fail the commit if a move conflicts with some edits. More precisely, it should fail if any path edited in the transaction is within a subtree that was moved in the recent commits, or if any path edited in the recent commits is within a subtree that was moved in the transaction.

(The moves considered should be the collapsed result of all the recent commits; not the moves in each recent commit separately.)
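
The conflict rule can be sketched as a pair of path-containment checks. The function names and the inputs are assumptions for illustration: edits are given as sets of paths, and moves as sets of the source paths of moved subtrees (already collapsed across the recent commits), all expressed relative to the transaction's base revision.

```python
from typing import List, Set, Tuple


def is_within(path: str, root: str) -> bool:
    """True if path is root itself or lies inside the subtree at root."""
    return path == root or path.startswith(root.rstrip("/") + "/")


def rebase_conflicts(
    txn_edits: Set[str],
    txn_moves: Set[str],     # source paths of subtrees moved in the txn
    recent_edits: Set[str],
    recent_moves: Set[str],  # collapsed moves from the recent commits
) -> List[Tuple[str, str]]:
    """Return (edited path, moved subtree) pairs that must fail the commit."""
    conflicts: List[Tuple[str, str]] = []
    for edits, moves in ((txn_edits, recent_moves), (recent_edits, txn_moves)):
        for path in sorted(edits):
            for root in sorted(moves):
                if is_within(path, root):
                    conflicts.append((path, root))
    return conflicts
```

In this model the commit would be rejected whenever the returned list is non-empty.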

In future we might want to make it automatically merge a move with some edits inside the moved subtree, but there is currently no compelling reason to do so.

Flag Move-Aware Semantics

Consider whether we need to identify move-aware revisions as such. This might be a single flag for the whole repository (perhaps a format upgrade), or a cut-off revision after which all further revisions are move-aware, or a flag per revision identifying whether a move-aware client made the commit.

The purpose of this information would be to be able to know explicitly that copy + delete does not mean 'move' in those commits. Is this distinction necessary? How would the information be used? (Client side? Server side?) What are the pros and cons?
