...

  1. Wildcard support and performance
    • Performance when a large number of paths needs to be checked. Affected operations are mainly checkout / export and log.
    • Support for wildcards. Subversion should support
      • "*" for exactly one arbitrary path segment (with no "/" in it) and
      • "**" for an arbitrary number (zero or more) of path segments.
      • Classic wildcard patterns like "*foo*.bar", including escaping via the "\" prefix, shall be supported. The asterisks in such a pattern match zero or more characters other than "/", which makes the whole-segment form "/*/" a mere special case.
        All wildcard usage applies to full path segments only, i.e. a '*' never matches a '/', except in "/**/", which matches zero or more full segments. For example, "/*/**/*" matches any path that contains at least 2 segments and is equivalent to "/**/*/*" as well as "/*/*/**".
  2. Better access control
    • The right to know about the existence of a node (a.k.a. lookup access rights and/or directory traversal rights) is implied by read access and cannot be manipulated separately.
      • See Issue 3380 for previous discussion of this topic.
      • See the authz-overhaul branch for an attempt at implementing this distinction.
      • This thread on the dev@ mailing list is a recent example of the problems caused by implicit lookup access.
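
The wildcard semantics from item 1 can be illustrated with a minimal Python sketch. It is not the Subversion implementation; the names `matches` and `_segment_regex` are hypothetical, and the sketch assumes that patterns and paths are plain "/"-separated strings.

```python
import re

def _segment_regex(segment):
    """Translate one pattern segment into a regex (hypothetical helper).

    '*' matches zero or more characters other than '/';
    a backslash makes the following character literal.
    """
    out, i = [], 0
    while i < len(segment):
        ch = segment[i]
        if ch == "\\" and i + 1 < len(segment):
            out.append(re.escape(segment[i + 1]))
            i += 2
            continue
        out.append("[^/]*" if ch == "*" else re.escape(ch))
        i += 1
    return re.compile("".join(out))

def matches(pattern, path):
    """Return True if the segmented path matches the segmented pattern."""
    pats = [p for p in pattern.split("/") if p]
    segs = [s for s in path.split("/") if s]

    def walk(pi, si):
        if pi == len(pats):
            return si == len(segs)
        if pats[pi] == "**":
            # '**' consumes zero or more full segments.
            return any(walk(pi + 1, k) for k in range(si, len(segs) + 1))
        # Any other pattern segment consumes exactly one path segment.
        return (si < len(segs)
                and _segment_regex(pats[pi]).fullmatch(segs[si]) is not None
                and walk(pi + 1, si + 1))

    return walk(0, 0)
```

With this sketch, `matches("/*/**/*", "/a/b")` holds while `matches("/*/**/*", "/a")` does not, which mirrors the "at least 2 segments" example above.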

Wildcard Support and Performance

...

  1. An ACL is relevant to a user if at least one ACE in that ACL mentions the user, one of their aliases, or a group that they are a member of.
  2. Only path rules with ACLs relevant to the given user may match a path.
  3. If a path rule matches a given repository path, its ACL applies to that path.
  4. If no path rule matches a given repository path, the parent path's ACL applies.
  5. If no ACL is given for the repository root, a default ACL denying everybody access to the root path applies.
  6. If repository-specific path rules as well as global path rules match a given path, only the repository-specific ones will be considered.
  7. If multiple path rules match a given repository path, only the one specified last in the authz file shall apply.
  8. If multiple ACEs of a given ACL apply to a user, the union of all individually granted access rights is granted.
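
As a rough illustration of how these rules interact, the following Python sketch resolves the access rights for one user and path. Everything in it is an assumption made for the example: the `PathRule` type, the `resolve` function, the "@group" notation for principals, and the restriction to "*" as the only wildcard (one full segment). Aliases are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PathRule:
    repo: Optional[str]   # None marks a global rule (hypothetical encoding)
    pattern: str          # e.g. "/trunk/*"; "*" matches exactly one segment
    acl: list             # ACEs as (principal, rights) pairs, e.g. ("@devs", "rw")

def _segments(path):
    return [s for s in path.split("/") if s]

def _matches(pattern, path):
    pats, segs = _segments(pattern), _segments(path)
    return len(pats) == len(segs) and all(p in ("*", s) for p, s in zip(pats, segs))

def resolve(rules, groups, repo, user, path):
    """Return the set of rights granted to user on path in repo."""
    # Rule 1: the user and all groups they belong to.
    who = {user} | {"@" + g for g, members in groups.items() if user in members}
    # Rule 2: only rules whose ACL is relevant to the user may match.
    relevant = [(i, r) for i, r in enumerate(rules)
                if r.repo in (None, repo) and any(p in who for p, _ in r.acl)]
    segs = _segments(path)
    while True:
        current = "/" + "/".join(segs)
        hits = [(i, r) for i, r in relevant if _matches(r.pattern, current)]
        # Rule 6: repository-specific matches hide global ones.
        hits = [h for h in hits if h[1].repo is not None] or hits
        if hits:
            # Rule 7: the rule specified last in the file wins.
            _, rule = max(hits, key=lambda h: h[0])
            # Rule 8: union of the rights of all ACEs that apply to the user.
            rights = set()
            for principal, granted in rule.acl:
                if principal in who:
                    rights |= set(granted)
            return rights
        if not segs:
            return set()      # Rule 5: default deny at the root.
        segs.pop()            # Rule 4: fall back to the parent path.
```

The list index stands in for the position of a rule in the authz file, so rule 7 reduces to taking the matching rule with the highest index.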

Design

The idea is to combine the following approaches:

  • Preprocessed, tree-like data structures applied to segmented paths
  • Reduction of the tree to what applies to the current user
  • Pre-calculated recursive rights for early exit

General Workflow

Putting caching aside, the workflow involves three data models, each building on top of the previous one.

  • Parsing an authz file (from file system or repository), validating its contents and creating a pre-processed in-memory representation. In comparison to the old svn_config_t based code, additional restrictions apply:
    • No rule may appear more than once in the authz file.
    • Value placeholders (%(name)s) are not expanded.
  • Filtered path rule tree
    • prefix tree with one node per segment
    • created on demand per user and repository
    • contains only rules that apply to the respective user and repository
    • multiple instances are cached in the svn_authz_t structure alongside the single "full model"
    • ACLs are reduced to access rights + order ID
    • each node knows min / max rights on all sub-nodes
  • Lookup state
    • access rights according to the latest matching path rule
    • list of tree nodes that may match sub-paths as we may need to follow multiple patterns
    • temporary data structure that is reused between queries to save on allocation and construction overhead
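
To make the "min / max rights on all sub-nodes" idea more concrete, here is a small Python sketch of a per-user prefix tree with early exit. It is only an illustration under simplifying assumptions: paths are literal (no wildcards, so no order IDs and no list of candidate nodes are needed), rights are a bitmask, and the names `Node`, `build_tree`, `check`, `READ` and `WRITE` are hypothetical rather than taken from the Subversion code.

```python
READ, WRITE = 1, 2

class Node:
    def __init__(self):
        self.children = {}     # path segment -> Node
        self.rights = None     # None: no rule here, inherit from the parent
        self.min_rights = 0    # rights granted on every node of the subtree
        self.max_rights = 0    # rights granted on at least one node of the subtree

def build_tree(rules):
    """Build the tree from {path: rights}, already filtered for one user."""
    root = Node()
    for path, rights in rules.items():
        node = root
        for seg in filter(None, path.split("/")):
            node = node.children.setdefault(seg, Node())
        node.rights = rights
    _finalize(root, 0)         # no rule for the root means no access
    return root

def _finalize(node, inherited):
    # Resolve inheritance, then aggregate min / max rights bottom-up.
    if node.rights is None:
        node.rights = inherited
    node.min_rights = node.max_rights = node.rights
    for child in node.children.values():
        _finalize(child, node.rights)
        node.min_rights &= child.min_rights
        node.max_rights |= child.max_rights

def check(root, path, required, recursive=False):
    """Check whether all bits in required are granted on path (or its whole subtree)."""
    node = root
    for seg in filter(None, path.split("/")):
        if node.min_rights & required == required:
            return True        # early exit: granted everywhere below this node
        if node.max_rights & required != required:
            return False       # early exit: granted nowhere below this node
        child = node.children.get(seg)
        if child is None:
            return node.rights & required == required
        node = child
    granted = node.min_rights if recursive else node.rights
    return granted & required == required
```

A recursive check such as `check(root, "/trunk", READ, recursive=True)` needs only the pre-computed `min_rights` of one node instead of a walk over the whole subtree.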

Data models

These are persistent in the sense that we will cache and reuse them. They do not cover transient data models that various algorithms may use e.g. during authz parsing.

...