A specification for changes needed in FSFS and the FS API in order to support moves.
See also the specification of the semantics of moves, in the "Move Semantics" section of the page "MoveDev How to Add Moves to Svn".
FS API
These new APIs are required:
- A way to find what has moved between two repository trees. (In the repos layer, responding to the "report", this needs to be a comparison between a mixed-revision state and a single-revision tree.)
- A way to record moves during a commit.
1. Query
A new query API is needed to find the same node-line in another revision; see under “Commit Editor and Query Functions (FSFS)”.
The query API could take several forms, from the small scale (querying the path at which a given node-line-id lives in a particular revision) to the large scale (asking for all the moves between a given pair of revisions across the whole versioned tree). The form suggested here is at an intermediate level: the children of a directory.
- Given two versioned directories, DIR1@REV1 and DIR2@REV2 (REV1 != REV2), return a list of (NAME1, NODE-LINE-ID, NAME2) tuples containing each node that is an immediate child of DIR1@REV1 and/or of DIR2@REV2 and is moved in REV2 relative to REV1. In a given entry, NAME1 or NAME2 is null if the node moved into DIR2 or out of DIR1 respectively.
That form reports renames directly and enables the caller to build up a mapping of cross-directory moves by combining the results of multiple queries.
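As an illustration of the directory-level form and of combining several queries, here is a minimal sketch. It is a toy model, not FS code: each revision is assumed to be a plain dict mapping a path to a node-line-id, and the names children, moved_children and combine_moves are hypothetical.

```python
import posixpath

def children(tree, directory):
    """Map node-line-id -> name for the immediate children of `directory`."""
    return {
        node_line_id: posixpath.basename(path)
        for path, node_line_id in tree.items()
        if path != directory and posixpath.dirname(path) == directory
    }

def moved_children(tree1, dir1, tree2, dir2):
    """Return (NAME1, NODE-LINE-ID, NAME2) tuples for the moved children."""
    kids1, kids2 = children(tree1, dir1), children(tree2, dir2)
    ids1, ids2 = set(tree1.values()), set(tree2.values())
    result = []
    for node_line_id in sorted(set(kids1) | set(kids2)):
        if node_line_id not in ids1 or node_line_id not in ids2:
            continue  # present in only one tree: added or deleted, not moved
        name1, name2 = kids1.get(node_line_id), kids2.get(node_line_id)
        if name1 != name2:  # renamed, moved out (name2 None) or moved in (name1 None)
            result.append((name1, node_line_id, name2))
    return result

def combine_moves(queries):
    """Merge several (dir1, dir2, tuples) results into {node-line-id: (path1, path2)}."""
    moves = {}
    for dir1, dir2, entries in queries:
        for name1, node_line_id, name2 in entries:
            src, dst = moves.get(node_line_id, (None, None))
            if name1 is not None:
                src = posixpath.join(dir1, name1)
            if name2 is not None:
                dst = posixpath.join(dir2, name2)
            moves[node_line_id] = (src, dst)
    return moves

# Toy trees: /A/f moves to /B/f, and /A/g is renamed to /A/g2.
rev1 = {"/": "n0", "/A": "n1", "/A/f": "n2", "/A/g": "n3", "/B": "n4"}
rev2 = {"/": "n0", "/A": "n1", "/A/g2": "n3", "/B": "n4", "/B/f": "n2"}

moves = combine_moves([
    ("/A", "/A", moved_children(rev1, "/A", rev2, "/A")),
    ("/B", "/B", moved_children(rev1, "/B", rev2, "/B")),
])
```

In this model the query on /A reports the rename directly, and reports the node that left with a null NAME2; the query on /B supplies the other half of that cross-directory move.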
Since that form only looks at a directory's children, we will also need a single-node query. It could be in this form:
- Given an existing node PATH1@REV1 and an existing revision REV2, return PATH2 at which the same node-line lives in REV2, or null if it does not.
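A minimal sketch of the single-node form, again assuming a toy model in which a revision is a dict from path to node-line-id. The function name find_node_line is hypothetical, and a real implementation would not scan the whole tree.

```python
def find_node_line(tree1, path1, tree2):
    """Return PATH2 at which the node-line of PATH1@REV1 lives in REV2, or None."""
    node_line_id = tree1[path1]  # raises KeyError if PATH1 does not exist in REV1
    for path2, other_id in tree2.items():
        if other_id == node_line_id:
            return path2
    return None

# Toy trees: /A/f moves to /B/f, and /A/g is deleted.
rev1 = {"/": "n0", "/A": "n1", "/A/f": "n2", "/A/g": "n3", "/B": "n4"}
rev2 = {"/": "n0", "/A": "n1", "/B": "n4", "/B/f": "n2"}

moved_to = find_node_line(rev1, "/A/f", rev2)
deleted = find_node_line(rev1, "/A/g", rev2)
```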
2. Commit
The following new method is needed:
- svn_fs_move(svn_fs_root_t *root, from_path, to_path)
Move the subtree at 'from_path' to 'to_path' within a transaction. 'root' is the root of a transaction. 'from_path' is a path that exists within the transaction. 'to_path' is a path that does not exist within the transaction, but whose parent path does exist there and is a directory. If any of these conditions is not met, return an error.
[### Here I assumed we'll choose the sequential model for this API, but that may not be the best option. A 'commutative' model is also being considered, in which we would define move as copying from the base revision rather than from the current state of the transaction.]
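The preconditions above, under the sequential model, can be sketched against a toy transaction. This is a model only, not the proposed implementation: the transaction is assumed to be a dict mapping a path to a (node-line-id, kind) pair, and MoveError, is_within and move are hypothetical names. The check that 'to_path' is not inside 'from_path' is an extra assumption that the conditions above do not state.

```python
import posixpath

class MoveError(Exception):
    """Raised when a precondition of the move is not met."""

def is_within(path, ancestor):
    """True if `path` is `ancestor` itself or lies below it."""
    return ancestor == "/" or path == ancestor or path.startswith(ancestor + "/")

def move(txn, from_path, to_path):
    """Move the subtree at from_path to to_path in the current txn state."""
    if from_path not in txn:
        raise MoveError("from_path does not exist: " + from_path)
    if to_path in txn:
        raise MoveError("to_path already exists: " + to_path)
    parent = posixpath.dirname(to_path)
    if parent not in txn or txn[parent][1] != "dir":
        raise MoveError("parent of to_path is not an existing directory")
    # Assumption, not stated in the spec: a subtree cannot move into itself.
    if is_within(to_path, from_path):
        raise MoveError("to_path is inside from_path")
    # Re-key every path in the subtree; node-line-ids are preserved.
    for path in [p for p in txn if is_within(p, from_path)]:
        txn[to_path + path[len(from_path):]] = txn.pop(path)

txn = {
    "/": ("n0", "dir"),
    "/A": ("n1", "dir"),
    "/A/f": ("n2", "file"),
    "/B": ("n3", "dir"),
}
move(txn, "/A", "/B/A")
```

Because the move acts on the current state of the transaction, a later move in the same transaction sees /B/A and not /A. That is the difference from the 'commutative' model mentioned above.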
FSFS
Changes needed to extend FSFS format 6 (or 7?).
- A lazy child of a copied node always gets a new copy-id, never the copy-id of its parent, when un-lazified. If "move" implies "preserve node-id and copy-id", which is the sanest interpretation, then we have to make this change because the current copy-id inheritance logic can cause unique key clashes once moves are enabled.
- New FS vtable methods to implement the interfaces defined in the 'FS' section.
- 'changes' list: record a move as a 'move'.
- Adjust the 'rebase' code to take account of moves.
- Adjust implementation of existing query APIs to report moves as copy-and-delete, for back-compat.
- (Possibly) A mechanism to flag whether move-aware semantics are in use.
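The back-compat item in the list above (reporting moves as copy-and-delete through the existing query APIs) could look roughly like the following sketch. The record layout is purely hypothetical and is not the FSFS 'changes' format: each change is assumed to be a dict with an "action" key, and the copy source revision is assumed to be the base revision passed in by the caller.

```python
def legacy_changes(changes, base_rev):
    """Rewrite 'move' records as an add-with-copyfrom plus a delete."""
    legacy = []
    for change in changes:
        if change["action"] == "move":
            legacy.append({
                "action": "add",
                "path": change["to"],
                "copyfrom": (change["from"], base_rev),
            })
            legacy.append({"action": "delete", "path": change["from"]})
        else:
            legacy.append(change)  # other changes pass through unchanged
    return legacy

changes = [
    {"action": "modify", "path": "/trunk/README"},
    {"action": "move", "from": "/trunk/a.c", "to": "/trunk/b.c"},
]
reported = legacy_changes(changes, base_rev=41)
```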
New svn_fs_move implementation
[? not sure] The moved node gets a new node-rev-id using the same rules as for a content modification (regardless of whether the content is being modified). Thus it gets a new revision-id and keeps the same (node-id, copy-id), unless it was a lazy-copied child, in which case it gets a new (unique) copy-id.
[Reason for creating a new node-revision: ...]
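The tentative id rule above can be written down as a small sketch. It assumes a node-rev-id is modelled as a (node-id, copy-id, revision-id) triple; the names NodeRevId and moved_node_rev_id and the copy-id allocator are hypothetical, and the rule itself is still marked as unsure.

```python
import itertools
from collections import namedtuple

NodeRevId = namedtuple("NodeRevId", "node_id copy_id rev_id")

def moved_node_rev_id(old, new_rev_id, lazy_copied_child, fresh_copy_id):
    """Return the node-rev-id that a moved node gets under the tentative rule."""
    # A lazy-copied child gets a new unique copy-id; any other node keeps its own.
    copy_id = fresh_copy_id() if lazy_copied_child else old.copy_id
    return NodeRevId(old.node_id, copy_id, new_rev_id)

# Hypothetical allocator that hands out unused copy-ids.
_counter = itertools.count(100)

def allocate_copy_id():
    return "c%d" % next(_counter)

plain = moved_node_rev_id(NodeRevId("n7", "c3", "r10"), "r12", False, allocate_copy_id)
lazy = moved_node_rev_id(NodeRevId("n8", "c3", "r10"), "r12", True, allocate_copy_id)
```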
New svn_fs query implementations
[TODO]
Adjust the 'rebase' code to take account of moves
The 'rebase' (aka 'merge') that happens just before a commit is finalized must be adjusted to take account of moves. This should simply fail the commit if a move conflicts with some edits. More precisely, it should fail if any path edited in the transaction is within a subtree that was moved in the recent commits, or if any path edited in the recent commits is within a subtree that was moved in the transaction.
(The moves considered should be the collapsed result of all the recent commits; not the moves in each recent commit separately.)
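The failure condition can be sketched as a pair of subtree checks. This sketch assumes that all paths are expressed in the namespace of the transaction's base revision, that each moved subtree is identified by its source root, and that the moves of the recent commits have already been collapsed into one set. The function names are hypothetical.

```python
def is_within(path, subtree_root):
    """True if `path` is `subtree_root` itself or lies below it."""
    return (subtree_root == "/" or path == subtree_root
            or path.startswith(subtree_root + "/"))

def edits_hit_moves(edited_paths, moved_roots):
    """True if any edited path is inside any moved subtree."""
    return any(is_within(edit, root)
               for edit in edited_paths for root in moved_roots)

def rebase_must_fail(txn_edits, txn_moves, recent_edits, recent_moves):
    """Apply the rule in both directions: txn vs. recent, and recent vs. txn."""
    return (edits_hit_moves(txn_edits, recent_moves)
            or edits_hit_moves(recent_edits, txn_moves))

# The transaction edits a file inside a subtree that a recent commit moved.
conflict = rebase_must_fail(
    txn_edits={"/trunk/lib/io.c"}, txn_moves=set(),
    recent_edits=set(), recent_moves={"/trunk/lib"},
)
# "/trunk/library" is a sibling of "/trunk/lib", not a path inside it.
no_conflict = rebase_must_fail(
    txn_edits={"/trunk/library/io.c"}, txn_moves=set(),
    recent_edits=set(), recent_moves={"/trunk/lib"},
)
```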
In future we might want to make it automatically merge a move with some edits inside the moved subtree, but there is currently no compelling reason to do so.
Flag Move-Aware Semantics
Consider whether we need to identify move-aware revisions as such. This might be a single flag for the whole repository (perhaps a format upgrade), or a cut-off revision after which all further revisions are move-aware, or a flag per revision identifying whether a move-aware client made the commit.
The purpose of this information would be to be able to know explicitly that copy + delete does not mean 'move' in those commits. Is this distinction necessary? How would the information be used? (Client side? Server side?) What are the pros and cons?