Ev2 Representation of Moves
Table of Contents |
---|
Summary
Ev2 was designed with an intention to support “move” semantics, but the current design of its move() method does not satisfy the requirements. Here I explain why, and present a solution.
...
- An edit must be able to express any change that can be constructed in a WC and committed.
- An edit must be able to express any change that can be constructed in a repo across multiple revisions.
- Ev2 uses sequential description: a succession of incremental edits applied to an initial state gives a final state.
- Each node should be changed only once.
Given these constraints, not all combinations of moves can be expressed using a “move source to destination” operation, with or without a “rotate” operation, without using temporary paths.
...
- The “move away” end of a move is (or can be) a separate operation from the corresponding “move here”.
- While a subtree is “in limbo” after being “moved away” from its source location and before being “moved here” at its destination, the behaviour is as if it has been moved to a temporary path. The temporary path is in a different name-space so that it cannot conflict with any real path, even a real path outside the scope of the edit.
Why is This Necessary?
Express Any Change
...
Through a series of simple steps of the form “move X to Y”, some quite funky overall changes can be created. For example, starting from this state*:
No Format |
---|
|
+-- A
|
+-- B
|
the following sequence:
No Format |
---|
move /A/B /X
move /A /X/B
move /X /A
|
results in swapping the nodes at paths A and B:
No Format |
---|
| |
+-- A mv--\ /--> +-- A
| X |
+-- B mv--/ \--> +-- B
|
...
- Notation: In the diagrams, a capital letter represents the name of a node; the node's identity (its node-copy-id) is not shown. In sequential instructions of the form “move /A/B /X”, the “/A/B” represents the source path of a node in the tree state that is being incrementally modified, and “/X” represents the destination path that it will have in the new state after this instruction.
Temporary Paths
Paths in the Ev2 editor operations refer to the current state of the tree being edited, at that moment in the sequence of edit operations. (The sole exception is the copy-from path, which is relative to a given committed revision.)
...
There are advantages to placing a node in its final position exactly once. It was claimed that Google's BigTable implementation of Svn's back-end would have benefited from knowing that once a node has been edited then it is in its final state. Ev2 aims to provide that guarantee, where Ev1 could not.
...
- tree changes before content changes
- alter/add/copy a directory (if required) before touching its (immediate) children
A Move Event is Not Adequate
...
Example 1: Insert a directory level
No Format |
---|
| |
+--A mv--\ (add) +--A
\ |
\--> +--B
|
...
Example 2: Swap two siblings
No Format |
---|
| |
+--A mv--\ /--> +--A
| X |
+--B mv--/ \--> +--B
|
...
Example 3: Swap two directory levels
No Format |
---|
| |
+--A mv--\ /--> +--A
| X |
+--B mv--/ \--> +--B
|
...
While this does enable Example 2 (swap two siblings) and Example 3 (swap two directory levels) and many other cases, it does not help with inserting a directory level (Example 1), and it has been shown [1] to be incapable of resolving other more involved cases involving swapping or rotation. One specific example is swapping A with A/B/C [2]:
No Format |
---|
| |
+-- A mv--\ /--> +-- A
| \ / |
+-- B mv--- X ---> +-- B
| / \ |
+-- C mv--/ \--> +-- C
|
...
One solution is to describe the two halves of each move separately:
No Format |
---|
move-away SOURCE-PATH
…
move-here DESTINATION-PATH
|
...
Of course there needs to be a way to link each move-away with the corresponding move-here. Remembering that each edit step refers to the current state in a sequence of states, we cannot simply specify the path corresponding to the other end of the move like this:
No Format |
---|
move-away SOURCE-PATH to DESTINATION-PATH
…
move-here DESTINATION-PATH from SOURCE-PATH
|
because the problem cases are when the destination path does not yet exist at the time of a move-away, or the source path no longer exists at the time of a move-here. What we can do is use some other unique reference that is unique within the edit, like this:
No Format |
---|
move-away SOURCE-PATH as identifier ID1
…
move-here DESTINATION-PATH from identifier ID1
|
...
We could have distinct operations for direct and indirect moves:
No Format |
---|
move SOURCE-PATH DESTINATION-PATH
move-away SOURCE-PATH as identifier ID1
move-here DESTINATION-PATH from identifier ID1
|
...
- Brane wrote: [The inclusion of an explicit direct move] also makes validating the drive easier. Move away without a matching moved here (or the converse) is clearly invalid. It must be trivial for the receiver to detect that. Making the temporary locations explicit makes that so much easier. Regarding direct move without intermediate state, IMO the driver should be required to to use that whenever it can. Driver always has enough info to know that receiver can process such a move. If it cannot, that indicates a bug in the driver.
We should probably just leave it as a "quality of implementation" issue for the editor driver to prefer single move(a,b) instructions over pairs of move-away, move-here. We could try to come up with a requirement that it must do so in certain cases (starting with collapsing adjacent pairs of move-away, move-here), but I think this may be rather difficult to define fully, and of limited benefit.
...
- mv-away must come before the matching mv-here
- The edit should provide a sequential procedure that the consumer must be able to follow without having to buffer an arbitrary amount of state.
- mv-here & cp & add must be in nesting order: create (or put in place) the parent before its children
- mv-away must come before deleting a parent
- Receiver needs to know that it must preserve this path when we delete its parent.
- mv-away must come before mv-away of a parent
- If we allowed “mv-away A; …; mv-away A/B” then the child path “A/B” would have to be specified not relative to the current state of the edit, as all other operative paths are, but in some other way, because the parent has gone into temporary namespace, and has perhaps been replaced so that “A/B” now refers to some other node.
- There is a general rule that all edits within a moved directory “A” must come after A is moved its destination, but a mv-away of a subtree of A is not considered an edit for this purpose.
- ### Reconsider this rule. s/must/may/ ?
Examples Solved
Example 1: Insert a directory level
No Format |
---|
| |
+--A mv--\ add--> +--A
\ |
\--> +--B
|
- alter-dir / (children={A})
- mv-away A (id=“original A”)
- add-directory A (children={B})
- mv-here A/B (id=“original A”)
Example 2: Swapping two siblings
No Format |
---|
| |
+--A --\ /--> +--A
| X |
+--B --/ \--> +--B
|
- alter-dir / (children={A,B})
- mv-away A (id=“original A”)
- mv-away B (id=“original B”)
- mv-here A (id=“original B”)
- mv-here B (id=“original A”)
Example 3 can also be solved in this way, except for some ordering restriction issues that are discussed below.
...
Example 1b: Remove a directory level (by deletion)
No Format |
---|
| |
+--A (del) /--> +--A
| /
+--B mv--/
|
...
Example 1c: Remove a directory level (by move-away)
No Format |
---|
| |
+--A mv-->? /--> +--A
| /
+--B mv--/
|
...
Why do we have this ordering rule?
No Format |
---|
If any path is added (with add_*) or deleted/moved/rotated, then
an svn_editor_alter_directory() call must be made for its parent
directory with the target/eventual set of children.
|
...
First let's address some simple problems in the wording above. Inside an add-directory, all the children have to be added, and that would imply we need an alter-directory as well as the add-directory, violating the Once Rule. It omits “copied” – just an oversight? It does not specify when the call must occur – presumably it must be before any such modifications to the children. To remedy those three initial problems, I suppose something like this is intended:
No Format |
---|
Either alter_directory() or add_directory() must be called on a
directory, declaring its final set of children, before calling
delete(), move_away(), move_here(), copy(), or add_*() on any child.
(For delete() or move_away(), it must be alter_directory(), as
the children of add_directory() must not be deleted or moved away.)
|
...
Example 3: Swapping two directory levels
No Format |
---|
| |
+--A --\ /--> +--A
| X |
+--B --/ \--> +--B
|
- alter-dir A (children={B})
- mv-away A/B (id=”original A/B”)
- alter-dir / (children={A})
- mv-away A (id=”original A”)
- mv-here A (id=”original A/B”)
- mv-here A/B (id=”original A”)
There is a potential problem here:
- We make an edit within subtree A (the “move-away A/B”) before moving A.
This violates the assumed constraint that we should disallow edits within a subtree before moving the subtree.
...
The path-based Once Rule was written something like this:
No Format |
---|
A path should never be referenced more than once by the add_*, alter_*, and delete operations (the "Once Rule"). The source path of a copy (and its children, if a directory) may be copied many times, and are otherwise subject to the Once Rule. The destination path of a copy [or move_here] may have alter_* operations applied, but not add_* or delete. If the destination path of a copy or move is a directory, then its children are subject to the Once Rule. The source path of a move_away() (and its child paths) may be replaced using add_*() or copy() or move_here() (where these new or copied nodes are subject to the Once Rule).
|
...
- The Once Rule applies per node, rather than per path. The definition of a “node” is such that “move” moves an existing node (with any children) to a new path and does not create a new node, while “add” creates a new node and “copy” creates a new node (with any children), each new node being different from all other nodes that are or were in the tree even if it replaces an existing node at the same path.
- One of the following actions can be applied, just Once, directly to each node:
- create (only if it does not exist in the initial tree)
- remove (only if it exists in the initial tree or is brought in as a child of a copy)
- modify (only if it exists in the initial tree or is brought in as a child of a copy)
- A node may be created by one of:
- add_*()
- copy(), optionally followed by alter_*()
- Its children (recursively) come with it, and are then subject to the Once Rule as “child of a copy” nodes.
- A node (with any children) may be removed by one of:
- delete()
- add_*() replacing this node
- copy() replacing this node
- move_here() replacing this node
When removing a directory, each child (recursively) is considered to be removed at this time as well, and, as such, must not have been touched by any other operation. However, a previous child node may have been moved away [or deleted?] before this deletion.
...
- A node may be modified by either or both of the following, in either order:
- move_away() followed by move_here()
- If the node is a directory, its children (recusively) move with it.
- Only if it exists in the initial tree: not allowed for a child of a copy.
- No operation may touch this node or any children (recursively) between move-away and move-here.
- alter_*()
- move_away() followed by move_here()
- alter_directory() is required before “editing” any children
- ### Not sure exactly what we mean here.
- ### The rules would be simpler without this requirement.
- The source of a copy operation may be a node in the initial tree being edited. Such a node (and its children, if a directory) may be copied many times, in addition to being subject to the Once Rule as existing nodes.
In tabular form, the sequences of operations that would be allowed on a node in this scheme are:
...
- Driver describes it as an add or delete.
- Driver describes it as a move (or rather as half of a move).
Advantages to describing it as a move include:
- The consumer can then choose whether to warn the user, raise a tree conflict, or even perhaps commit or update another subtree in order to complete the move.
The consumer can convert a move-out to a delete trivially.
...