Ev2

The following is taken from svn_editor.h; see that file for more thorough and current information.

Summary: What is "editing"?

In Subversion, we have a number of occasions where we transform a tree from one state into another. This process is called "editing" a tree.

In processing a `commit' command:

  • The client examines its working copy data to determine the set of changes necessary to transform its base tree into the desired target.
  • The client networking library delivers that set of changes/operations across the wire as an equivalent series of network requests (for example, to svnserve as an ra_svn protocol stream, or to an Apache httpd server as WebDAV commands).
  • The server receives those requests and applies the sequence of operations on a revision, producing a transaction representing the desired target.
  • The Subversion server then commits the transaction to the filesystem.

In processing an `update' command, the process is reversed:

  • The Subversion server module talks to the filesystem and computes a set of changes necessary to bring the client's working copy up to date.
  • The server serializes this description of changes, and delivers it to the client.
  • The client networking library receives that reply, producing a set of changes/operations to alter the working copy into the revision requested by the update command.
  • The working copy library applies those operations to the working copy to align it with the requested update target.

The series of changes (or operations) necessary to transform a tree from one state into another is passed between subsystems using this "editor" interface. The "receiver" edits its tree according to the operations described by the "driver".

"Ev2" is the second generation interface for performing these tree edits.

Implementation Plan

See the Ev2 implementation plan for current status and future implementation plans.

Tricky scenarios to model

The time traveler: swap(A, A/B/C)

svn mv A tmp; svn mv tmp/B/C A; svn mv tmp/B A/B; svn mv tmp A/B/C; svn ci

http://mid.gmane.org/87bo6rewwp.fsf@ntlworld.com (variant: change the last mv to 'svn rm tmp')
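
To see what the command sequence above ends up doing, here is a minimal sketch that replays the four moves on a toy tree. The tree is just a map from path to a label recording where the node lived at the start; the move() helper is hypothetical and models only renames, not any Subversion API.

```python
def move(tree, src, dst):
    """Rename src (and everything beneath it) to dst in a path -> label map."""
    if src not in tree:
        raise KeyError(f"no such path: {src}")
    if dst in tree:
        raise ValueError(f"destination exists: {dst}")
    moved = {}
    for path, node in tree.items():
        if path == src or path.startswith(src + "/"):
            moved[dst + path[len(src):]] = node
        else:
            moved[path] = node
    return moved


# Labels record where each node lived before the edit (toy data).
tree = {"A": "A@start", "A/B": "A/B@start", "A/B/C": "A/B/C@start"}

# The same four moves as the svn mv sequence above.
for src, dst in [("A", "tmp"), ("tmp/B/C", "A"), ("tmp/B", "A/B"), ("tmp", "A/B/C")]:
    tree = move(tree, src, dst)

for path in sorted(tree):
    print(path, "<-", tree[path])
```

In this model the final tree has the old A/B/C at A, the old A/B still at A/B, and the old A at A/B/C, so the top and bottom nodes have traded places.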

The 6-node swap

move_tests.py nested_replaces

http://mid.gmane.org/87obauwh4w.fsf@ntlworld.com

The 9-node swap

svnmucc_tests.py nested_replaces

http://mid.gmane.org/20130625215307.GA47970@minotaur.apache.org

The constant grandchild: swap(A, A/B) without moving A/B/B

svnmucc rm A copy HEAD A/B A rm A/B copy HEAD A A/B rm A/B/B copy HEAD A/B/B A/B/B

The trick here is preserving A/B/B even though it has the same basename as A/B (where node A/ is destined to be its parent at the end of the juggling).
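
The svnmucc sequence above can be replayed on a toy model to check where each node ends up. This is only a sketch: rm() and cp() are hypothetical helpers acting on a path-to-label map, and the HEAD tree is assumed to contain just A, A/B and A/B/B.

```python
def subtree(tree, root):
    """Return the entries at root and beneath it."""
    return {p: n for p, n in tree.items() if p == root or p.startswith(root + "/")}


def rm(tree, path):
    """Delete path and everything beneath it."""
    for p in subtree(tree, path):
        del tree[p]


def cp(tree, head, src, dst):
    """Copy src (with its children) from the HEAD tree to dst."""
    for p, n in subtree(head, src).items():
        tree[dst + p[len(src):]] = n


# Assumed HEAD contents; labels record each node's origin.
head = {"A": "A@HEAD", "A/B": "A/B@HEAD", "A/B/B": "A/B/B@HEAD"}
tree = dict(head)

rm(tree, "A")
cp(tree, head, "A/B", "A")
rm(tree, "A/B")
cp(tree, head, "A", "A/B")
rm(tree, "A/B/B")
cp(tree, head, "A/B/B", "A/B/B")

for path in sorted(tree):
    print(path, "<-", tree[path])
```

In this model A and A/B have swapped while A/B/B still holds the node it held at HEAD. Each rm is needed because the preceding copy drags children along with it.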

Suggested changes

danielsh: that's mostly a few concerns I raised over the last week (28 June 2013); more details, links, etc to follow

Remove rotate().

It's not needed; N move() calls are simpler.

http://mid.gmane.org/20130627100619.GA3011@lp-shahaf.local (near the end)

Consider a simple swap: rotate(A, A/B/C). There are a couple of definitions of how this swap might be performed:

  1. A is reparented under B, B is reparented under C, and C is reparented at the root.
  2. C/B exists and A is reparented there, and C is reparented at the root.

Either definition is possible, but the natural ambiguity leads us away from this solution. The "all-move" approach is much more transparent:

  1. move A/B/C@original to A@new
  2. move A/B@original to A/B@new
  3. move A@original to A/B/C@new

It is now quite obvious that we are affecting three nodes, rather than an apparent 2-node swap.

NOTE: each move's source path must be defined against the original state for this approach to work.
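
The three moves listed above can be sketched as follows. This is a deliberately simplified model, not the Ev2 interface: apply_moves() is a hypothetical helper, each move relocates a single node (children are listed as their own moves), and every source is looked up in the original tree.

```python
def apply_moves(original, moves):
    """Build the new tree from (src@original, dst@new) pairs."""
    new = {}
    for src, dst in moves:
        if dst in new:
            raise ValueError(f"{dst} is installed twice")
        # src is always resolved against the original tree, never the new one.
        new[dst] = original[src]
    return new


original = {"A": "node-1", "A/B": "node-2", "A/B/C": "node-3"}
moves = [("A/B/C", "A"), ("A/B", "A/B"), ("A", "A/B/C")]

swapped = apply_moves(original, moves)
# Because sources never depend on earlier moves, the order is irrelevant here.
same = apply_moves(original, list(reversed(moves)))
```

Because no source lookup depends on an earlier move, this model needs no temporary name, and the result is the same in either order.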

Make the SRC argument of move() relative to the start state, rather than to the current state

It's not possible to represent the time traveler otherwise: by the time the new A gets installed in the tree, A/B@start no longer has a name in the being-edited ("in-flight txn") tree.

gstein: the "no longer has a name" problem can be fixed by using rotate() (but see above for a definitional problem with rotate)

http://mid.gmane.org/20130627100619.GA3011@lp-shahaf.local (in particular, first paragraph)

See also http://mid.gmane.org/20130627122602.GF3011@lp-shahaf.local

See also http://mid.gmane.org/20130625223945.GG4806@lp-shahaf.local for some history

Switching to the original state for the move source implies that the Receiver must have access to the original state. Consider the following two moves:

  1. move(A@original, B@new)
  2. move(B@original, C@new)

The first move means that B@original is no longer in the current state. The Receiver will need to locate B@original in some other way.

The FS editor can easily locate B@original from the appropriate revision root.
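
A minimal sketch of the two-move case, assuming a receiver that keeps an untouched snapshot of the original tree (playing the role of the revision root). The Receiver class and node labels are hypothetical and exist only for illustration.

```python
class Receiver:
    """Toy receiver that keeps an immutable snapshot of the original tree."""

    def __init__(self, tree):
        self.original = dict(tree)  # never modified after construction
        self.current = dict(tree)   # the in-flight tree being edited

    def move(self, src, dst):
        # Resolve src against the original state, not the current one.
        node = self.original[src]
        # Remove the node from its old path only if it still lives there.
        if self.current.get(src) == node:
            del self.current[src]
        self.current[dst] = node    # replaces whatever was at dst


r = Receiver({"A": "node-a", "B": "node-b"})
r.move("A", "B")
wrong_node = r.current["B"]   # a current-state lookup of B now finds A's node
r.move("B", "C")              # the snapshot still yields the original B
```

After the first move, looking up B in the current tree returns the node that used to be A, which is why the second move has to consult the snapshot.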

The WC's update editor has a much more difficult time. There are a few ways that the WC could solve this problem:

  1. The incoming changes are placed into a client-side TXN. At the end of the update, when complete() is called, the TXN becomes BASE along with the regular post-update revision bump.
  2. Each time a node is replaced (SVN_IS_VALID_REVNUM(replaces_rev)), the WC sets aside information about the replaced node in case it needs to be accessed later.
  3. The WC grows a "checkpoint" feature wherein it can store an arbitrarily deep stack of working copy states, and every update/merge/switch implicitly creates a checkpoint. This is the most generic solution, but also the most complicated one. It does have a number of additional benefits:
    • Allows rollback to the working copy state (possibly including local modifications) immediately prior to the operation, thus not losing local mods upon a broken update.
    • Provides an "svn checkpoint" client feature.
    • Provides infrastructure for a future "svn shelve" feature.

All of the above approaches allow the WC to retain the original state should it be needed for a later move.

Add a "replaced node's information" struct

Anything that takes replaces_rev should instead take a struct describing the possibly-replaced node (whether the replacement is performed using an add, copy, or move). It should include: replaces_rev; the SHA-1 of the replaced node (when it is a file); and a tristate answering "will I refer to this node as a moved-from or copy-src later within this edit?".

Note: if the operation is not a replacement, then the pointer can just be NULL (today, we pass SVN_INVALID_REVNUM for replaces_rev).

http://mid.gmane.org/20130627173345.GL2950@tarsus.local2
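
As a rough illustration of the proposed shape, here is a sketch of such a record. All names (ReplacedNode, ReferencedLater, describe_add) are hypothetical, and reading the tristate as yes/no/unknown is an assumption; the real interface is C and nothing here comes from svn_editor.h.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReferencedLater(Enum):
    """Assumed tristate: will this node be a move/copy source later in the edit?"""
    NO = 0
    YES = 1
    UNKNOWN = 2


@dataclass(frozen=True)
class ReplacedNode:
    replaces_rev: int
    sha1: Optional[str] = None  # only meaningful when the replaced node is a file
    referenced_later: ReferencedLater = ReferencedLater.UNKNOWN


def describe_add(path: str, replaced: Optional[ReplacedNode] = None) -> str:
    """Describe an add; None means the operation is not a replacement."""
    if replaced is None:
        return f"add {path}"
    return f"add {path}, replacing r{replaced.replaces_rev}"
```

Passing None takes over the role that SVN_INVALID_REVNUM plays today, so a caller cannot supply replacement details without also saying which revision is replaced.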

gstein: I thought the SHA1 might be useful, but that would only provide file contents. We'd still need to get the original properties from somewhere. It seems that a WC-based receiver is still going to need to perform a lot of work to preserve the replaced-node. As such, it might be easier to just add the tristate, rather than create a structure.

Ambiguity: add_symlink() or add_file()?

Avoid ambiguity on the driver's side: whether to call add_file() [like the representation] or add_symlink() [like the high-level node kind that FS may have someday]. The same problem will happen with a 1.9 driver communicating the addition of a file node which is svn:special=foo for a value of 'foo' defined in 1.10.

See http://mid.gmane.org/20130628042034.GR3011@lp-shahaf.local and its thread.

Ordering

Walk the @end tree. Every node is mentioned as the destination of an operation (add, mv, cp, rm) exactly once. A node that was installed may not be modified again (create(A);move(A,B); is invalid). If A is an ancestor of B (in the @end tree), then A is visited before B.

http://mid.gmane.org/87sj03cxfp.fsf@ntlworld.com

http://mid.gmane.org/51CA321E.8000900@wandisco.com
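
The ordering rule could be checked mechanically along these lines. This is a sketch under assumptions: check_ordering() is a hypothetical helper that sees only the destination paths (in the @end tree) in the order the driver sends them, and it enforces "at most once" and "ancestors first" among the paths it is given.

```python
def check_ordering(destinations):
    """Raise ValueError if the destination sequence violates the ordering rule."""
    mentioned = set(destinations)
    seen = set()
    for path in destinations:
        if path in seen:
            raise ValueError(f"{path} is the destination of more than one operation")
        parts = path.split("/")
        # Every ancestor that is part of this edit must already have been visited.
        for i in range(1, len(parts)):
            ancestor = "/".join(parts[:i])
            if ancestor in mentioned and ancestor not in seen:
                raise ValueError(f"{path} is visited before its ancestor {ancestor}")
        seen.add(path)


check_ordering(["A", "A/B", "A/B/C"])   # valid: parents come first
```

A sequence such as ["A/B", "A"] or ["A", "A"] raises ValueError in this model.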
