Discussion:
authz changes between 1.9 and 1.10
Philip Martin
2018-07-20 12:59:04 UTC
Permalink
In 1.9 it was possible to repeat, or reopen, a section:

[/some/path]
user = r
[/some/path]
otheruser = rw

This was equivalent to a single section:

[/some/path]
user = r
otheruser = rw

In 1.10 this is rejected by the parser and cannot be used. Is this a
bug in 1.10 or an acceptable behaviour change?

In 1.9 any repeat acl lines that were the exact same match, such as:

[/some/path]
user = rw
user = r

resulted in the last line overriding all the other lines, so user=r in
the example above. In 1.10 the lines combine, so user=rw in the example
above. Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?

Finally, issue 4762. In 1.9 if both global and per-repository sections
matched they were combined, so:

[/some/path]
user = rw
[repos:/some/path]
user = r

resulted in user=rw. The issue 4762 problem is that in 1.10 the
per-repository section overrides any global section, so user=r above. I
believe this is a 1.10 bug and that the 1.9 behaviour should be
reinstated. However, consider glob rules:

[:glob:/some/*]
user = rw
[:glob:repos:/some/*]
user = r

At present the per-repository section override the global section just
like the buggy behaviour for non-glob sections. If we fix 4762 to
reinstate the combining for non-glob sections should we change the
behaviour of glob sections so they combine as well? What about a
non-glob and glob section:

[/some/path]
user = rw
[:glob:repos:/some/path]
user = r

Should these combine?

Glob sections are new so they could have different behaviour from
non-glob sections, but is that what we want? There is a wiki page
https://cwiki.apache.org/confluence/display/SVN/AuthzImprovements but
given issue 4762 I not sure whether it describes the correct behaviour.
--
Philip
Branko Čibej
2018-07-21 11:38:38 UTC
Permalink
Post by Philip Martin
[/some/path]
user = r
[/some/path]
otheruser = rw
[/some/path]
user = r
otheruser = rw
In 1.10 this is rejected by the parser and cannot be used. Is this a
bug in 1.10 or an acceptable behaviour change?
It's an intentional change that is documented in the design wiki page.

The old behaviour was not by design but a side effect of the way our
config parser works. For defining ACLs, it is far too sloppy and makes
large authz files hard to debug. It's also impossible to decide which
rule overrides, which was fine before we had glob patterns but is quite
important now (see below).
Post by Philip Martin
[/some/path]
user = rw
user = r
resulted in the last line overriding all the other lines, so user=r in
the example above. In 1.10 the lines combine, so user=rw in the example
above. Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?
This is also documented in the design page (Inheritance and
Disambiguation, item 8).

I can't think of a good reason to change the behaviour though. This
change prevents giving access to a group and then restricting it for one
member of that group. Perhaps we should revert to the 1.9 behaviour ...
I wish Stefan2 would comment.
Post by Philip Martin
Finally, issue 4762. In 1.9 if both global and per-repository sections
[/some/path]
user = rw
[repos:/some/path]
user = r
resulted in user=rw. The issue 4762 problem is that in 1.10 the
per-repository section overrides any global section, so user=r above. I
believe this is a 1.10 bug and that the 1.9 behaviour should be
reinstated.
See Inheritance and Disambiguation, items 6 and 7:  "If
repository-specific path rules as well as global path rules match a
given path, only the repository-specific ones will be considered." and
"If multiple path rules match a given repository path, only the one
specified last in the authz file shall apply."

So this is as designed. If this is a design bug, I wish someone had
pointed it out a few years ago ...

We have to be careful when changing the behaviour that we don't end up
reading the whole authz representation for each request, since this
would drastically reduce the performance improvements in 1.10. Of course
correctness comes before performance.
Post by Philip Martin
[:glob:/some/*]
user = rw
[:glob:repos:/some/*]
user = r
At present the per-repository section override the global section just
like the buggy behaviour for non-glob sections. If we fix 4762 to
reinstate the combining for non-glob sections should we change the
behaviour of glob sections so they combine as well? What about a
[/some/path]
user = rw
[:glob:repos:/some/path]
user = r
Should these combine?
At the point of the decision, there are no wildcard patterns any more:
we already have a matched path and ACL. What's different in the 1.10
implementation is that, as per Item 7, the resolver will only use the
last-defined and most specific path or pattern to find the access rule —
so a glob pattern that happens to match the whole path will override
anything else, being the best match.

Note that we also don't have a rule that a non-glob rule that matches
the same path as a glob rule should override: we have Item 7 instead,
and the right way to handle this in authz files is to define glob rules
first, non-glob rules last.
Post by Philip Martin
Glob sections are new so they could have different behaviour from
non-glob sections, but is that what we want? There is a wiki page
https://cwiki.apache.org/confluence/display/SVN/AuthzImprovements but
given issue 4762 I not sure whether it describes the correct behaviour.
It describes designed behaviour. If we change it, we should do it
carefully, as I wrote above. Also I think it turns out that the authz
section in the release notes misses a behaviour change or two. It should
probably include the whole Inheritance and Disambiguation list, however
we end up changing it.

-- Brane
Philip Martin
2018-07-24 22:37:06 UTC
Permalink
Post by Branko Čibej
It's an intentional change that is documented in the design wiki page.
And only there, it didn't make the release notes.
Post by Branko Čibej
This is also documented in the design page (Inheritance and
Disambiguation, item 8).
But not explicitly as a change from 1.9, and again not in the release
notes.
Post by Branko Čibej
Post by Philip Martin
Finally, issue 4762. In 1.9 if both global and per-repository sections
See Inheritance and Disambiguation, items 6 and 7:  "If
repository-specific path rules as well as global path rules match a
given path, only the repository-specific ones will be considered." and
"If multiple path rules match a given repository path, only the one
specified last in the authz file shall apply."
So this is as designed. If this is a design bug, I wish someone had
pointed it out a few years ago ...
Again, it's not explicitly documented as a change from 1.9 and it's not
in the release notes.
Post by Branko Čibej
It describes designed behaviour. If we change it, we should do it
carefully, as I wrote above. Also I think it turns out that the authz
section in the release notes misses a behaviour change or two. It should
probably include the whole Inheritance and Disambiguation list, however
we end up changing it.
The most important thing is to document the change in behaviour of the
non-glob rules between 1.9 and 1.10.

The problem I have is that I still don't know if the changes are
intentional. Of these undocumented (in the release notes) changes there
is one that appears to be intentional and two that could be accidental.
At least the first, intentional, change produces a run-time error if it
occurs, the other two just lead to different access being granted, one
less access the other more access. Anyone using a non-trivial authz
file in 1.9 has to be very careful upgrading to 1.10.

Is it worth me working on a fix? Can we declare 1.10.0 and 1.10.1 buggy
and change the behaviour in future 1.10.x? Or are we stuck with 1.10
being different from 1.9?
--
Philip
Daniel Shahaf
2018-07-25 06:21:34 UTC
Permalink
Post by Philip Martin
Post by Branko Čibej
It describes designed behaviour. If we change it, we should do it
carefully, as I wrote above. Also I think it turns out that the authz
section in the release notes misses a behaviour change or two. It should
probably include the whole Inheritance and Disambiguation list, however
we end up changing it.
The most important thing is to document the change in behaviour of the
non-glob rules between 1.9 and 1.10.
+1; we should document any incompatible changes (regardless of whether
they were intentional or not).
Post by Philip Martin
The problem I have is that I still don't know if the changes are
intentional. Of these undocumented (in the release notes) changes there
is one that appears to be intentional and two that could be accidental.
At least the first, intentional, change produces a run-time error if it
occurs, the other two just lead to different access being granted, one
less access the other more access. Anyone using a non-trivial authz
file in 1.9 has to be very careful upgrading to 1.10.
Sounds like we should encourage people to write unit tests for their
authz files. This would be fairly easy to implement using 'svnauthz
accessof'. We could ship something in tools/ that takes two
inputs, an authz file and a set of expectations, and validates the authz
file against the expectations.
Post by Philip Martin
Is it worth me working on a fix? Can we declare 1.10.0 and 1.10.1 buggy
and change the behaviour in future 1.10.x? Or are we stuck with 1.10
being different from 1.9?
(I don't know.)
Branko Čibej
2018-07-25 11:15:52 UTC
Permalink
Post by Daniel Shahaf
Post by Philip Martin
Post by Branko Čibej
It describes designed behaviour. If we change it, we should do it
carefully, as I wrote above. Also I think it turns out that the authz
section in the release notes misses a behaviour change or two. It should
probably include the whole Inheritance and Disambiguation list, however
we end up changing it.
The most important thing is to document the change in behaviour of the
non-glob rules between 1.9 and 1.10.
+1; we should document any incompatible changes (regardless of whether
they were intentional or not).
Post by Philip Martin
The problem I have is that I still don't know if the changes are
intentional. Of these undocumented (in the release notes) changes there
is one that appears to be intentional and two that could be accidental.
At least the first, intentional, change produces a run-time error if it
occurs, the other two just lead to different access being granted, one
less access the other more access. Anyone using a non-trivial authz
file in 1.9 has to be very careful upgrading to 1.10.
Sounds like we should encourage people to write unit tests for their
authz files. This would be fairly easy to implement using 'svnauthz
accessof'. We could ship something in tools/ that takes two
inputs, an authz file and a set of expectations, and validates the authz
file against the expectations.
I think serious admins already do this :) But sure, if someone has the
time and inclination to write such tools, they'll surely be useful.


-- Brane
Branko Čibej
2018-07-25 11:14:16 UTC
Permalink
Post by Philip Martin
Post by Branko Čibej
It's an intentional change that is documented in the design wiki page.
And only there, it didn't make the release notes.
Yes ...
Post by Philip Martin
Post by Branko Čibej
This is also documented in the design page (Inheritance and
Disambiguation, item 8).
But not explicitly as a change from 1.9, and again not in the release
notes.
... and yes ...
Post by Philip Martin
Post by Branko Čibej
Post by Philip Martin
Finally, issue 4762. In 1.9 if both global and per-repository sections
See Inheritance and Disambiguation, items 6 and 7:  "If
repository-specific path rules as well as global path rules match a
given path, only the repository-specific ones will be considered." and
"If multiple path rules match a given repository path, only the one
specified last in the authz file shall apply."
So this is as designed. If this is a design bug, I wish someone had
pointed it out a few years ago ...
Again, it's not explicitly documented as a change from 1.9 and it's not
in the release notes.
Post by Branko Čibej
It describes designed behaviour. If we change it, we should do it
carefully, as I wrote above. Also I think it turns out that the authz
section in the release notes misses a behaviour change or two. It should
probably include the whole Inheritance and Disambiguation list, however
we end up changing it.
The most important thing is to document the change in behaviour of the
non-glob rules between 1.9 and 1.10.
The problem I have is that I still don't know if the changes are
intentional. Of these undocumented (in the release notes) changes there
is one that appears to be intentional and two that could be accidental.
At least the first, intentional, change produces a run-time error if it
occurs, the other two just lead to different access being granted, one
less access the other more access. Anyone using a non-trivial authz
file in 1.9 has to be very careful upgrading to 1.10.
Is it worth me working on a fix? Can we declare 1.10.0 and 1.10.1 buggy
and change the behaviour in future 1.10.x? Or are we stuck with 1.10
being different from 1.9?
It's worth working on a fix, IMO. My suggestion would be:

* Keep the single-rule behaviour as it turned up in 1.10, just
document it. It's necessary for glob rules and making an exception
for "old-style" rules will limit the ways we can improve the authz
system in future. Also it'd make the next point quite a bit harder.
* Change the ACE override/merge behaviour back to the 1.9 way, as
there's no reason I can think of that it could be either.
* #4762 takes a bit of thought. It's relatively easy to revert to 1.9
behaviour for non-glob rules, because it can be done at parsing
time. For glob rules, it'd have to be done at resolve time, as
otherwise there's not consistent meaning of exact path match.
* Document everything in release notes ... or maybe provide a link to
the (or a new, more user-friendly) Wiki page.
Philip Martin
2018-07-30 00:38:58 UTC
Permalink
Post by Branko Čibej
* Keep the single-rule behaviour as it turned up in 1.10, just
document it. It's necessary for glob rules and making an exception
for "old-style" rules will limit the ways we can improve the authz
system in future. Also it'd make the next point quite a bit harder.
Agreed. It is a regression from 1.9 but it's a hard error so it will
not change behaviour silently.
Post by Branko Čibej
* Change the ACE override/merge behaviour back to the 1.9 way, as
there's no reason I can think of that it could be either.
I have a patch to do this. At present I am distinguishing between glob
and non-glob rules and only apply the 1.9 way to non-glob rules. This
means that a 1.9 authz file retains the 1.9 behaviour, and that the 1.10
glob rules retain the the 1.10 behaviour for anyone who has started
using them. But is also means glob and non-glob rules behave
differently.

Is it better to preserve as much 1.10 behaviour as possible because
people may have started using it, or is is better to have consistency
between glob and non-glob rules?
Post by Branko Čibej
* #4762 takes a bit of thought. It's relatively easy to revert to 1.9
behaviour for non-glob rules, because it can be done at parsing
time. For glob rules, it'd have to be done at resolve time, as
otherwise there's not consistent meaning of exact path match.
I have a patch to do this as well, still testing. Again I am
distinguishing between glob and non-glob rules, so the inheritance only
applies to non-glob rules. And once again I wonder if it would be
better for 1.10 glob rules to change?
--
Philip
Branko Čibej
2018-07-30 01:07:32 UTC
Permalink
Post by Philip Martin
Post by Branko Čibej
* Keep the single-rule behaviour as it turned up in 1.10, just
document it. It's necessary for glob rules and making an exception
for "old-style" rules will limit the ways we can improve the authz
system in future. Also it'd make the next point quite a bit harder.
Agreed. It is a regression from 1.9 but it's a hard error so it will
not change behaviour silently.
Post by Branko Čibej
* Change the ACE override/merge behaviour back to the 1.9 way, as
there's no reason I can think of that it could be either.
I have a patch to do this. At present I am distinguishing between glob
and non-glob rules and only apply the 1.9 way to non-glob rules. This
means that a 1.9 authz file retains the 1.9 behaviour, and that the 1.10
glob rules retain the the 1.10 behaviour for anyone who has started
using them. But is also means glob and non-glob rules behave
differently.
Is it better to preserve as much 1.10 behaviour as possible because
people may have started using it, or is is better to have consistency
between glob and non-glob rules?
It's definitely better to have consistent behaviour across all rule
types. Otherwise there'll be no end of user questions about it ... and
we'll keep second-guessing ourselves, too. Also imagine this:

[:glob:/path]
foo = rw
foo = r

What are the access rights for 'foo'? It's a glob rule but it doesn't
have any wildcards in the path, so it's functionally equivalent to
[/path] ... but with different behaviour about calculating the access
rights? I shudder to think that users would figure this out and start
using the :glob: prefix to select a different consolidation algorithm.
Post by Philip Martin
Post by Branko Čibej
* #4762 takes a bit of thought. It's relatively easy to revert to 1.9
behaviour for non-glob rules, because it can be done at parsing
time. For glob rules, it'd have to be done at resolve time, as
otherwise there's not consistent meaning of exact path match.
I have a patch to do this as well, still testing. Again I am
distinguishing between glob and non-glob rules, so the inheritance only
applies to non-glob rules. And once again I wonder if it would be
better for 1.10 glob rules to change?
See above. The same considerations apply.

-- Brane
Daniel Shahaf
2018-07-30 05:50:44 UTC
Permalink
Post by Branko Čibej
Post by Philip Martin
Post by Branko Čibej
* Keep the single-rule behaviour as it turned up in 1.10, just
document it. It's necessary for glob rules and making an exception
for "old-style" rules will limit the ways we can improve the authz
system in future. Also it'd make the next point quite a bit harder.
Agreed. It is a regression from 1.9 but it's a hard error so it will
not change behaviour silently.
Post by Branko Čibej
* Change the ACE override/merge behaviour back to the 1.9 way, as
there's no reason I can think of that it could be either.
I have a patch to do this. At present I am distinguishing between glob
and non-glob rules and only apply the 1.9 way to non-glob rules. This
means that a 1.9 authz file retains the 1.9 behaviour, and that the 1.10
glob rules retain the the 1.10 behaviour for anyone who has started
using them. But is also means glob and non-glob rules behave
differently.
Is it better to preserve as much 1.10 behaviour as possible because
people may have started using it, or is is better to have consistency
between glob and non-glob rules?
It's definitely better to have consistent behaviour across all rule
types.
+1
Post by Branko Čibej
Otherwise there'll be no end of user questions about it ... and
[:glob:/path]
foo = rw
foo = r
Here,

- The 1.9 behaviour is reasonable: it follows a simple rule (the same
one our CLI --option=argument parsing uses, and our
~/.subversion/config files parsing too, I believe), so an admin might
have reasonably assumed it to be intentional. It _is_ a feature of Subversion
that all ConfigParser-ish config files are parsed the same way, after all.

- 1.10 makes an incompatible change to the 1.9 semantics. The change
is undocumented in the release notes, so a 1.9 admin can't be
expected to be aware of the change, but _is_ documented in the design
wiki, so a 1.10 admin may have seen it in the design wiki and started
to rely on it.

We effectively made two contradictory compatibility promises; we can't
honour both of them. I think our only option is to make it a syntax error.

"In the face of ambiguity, refuse the temptation to guess."

Cheers,

Daniel

P.S. I _would_ have had this discussion during the design phase if I had
noticed the issue back then. Sorry to have missed it.
Philip Martin
2018-07-30 12:51:06 UTC
Permalink
Post by Branko Čibej
It's definitely better to have consistent behaviour across all rule
types.
+1
I like the idea of achieving consistency by making the duplicate entries
into an error: it changes the behaviour of 1.10 but anyone affected gets
an error. It's also a simpler version of my existing patch.

Consistency is more of a problem for the inheritance case:

[/path]
userA = rw
[repo:/path]
userB = rw

because any change will silently change the behaviour of 1.10. The glob
implementation made file order (sequence number in the implementation)
important and my experimental inheritance patch arbitrarily picks the
per-repository sequence number when inheriting but without any real
justification. The choice has no effect when using the glob
implementation on a 1.9 authz file, but the choice starts to matter when
glob rules are present.

The release notes for 1.10 didn't specify how glob rules are
prioritised, so anyone using them had to read the design docs or
experiment. How inheritance affects glob rules is the outstanding
behaviour question. Do the glob inherit like non-glob rules? How does
the sequence number of inherited rules get defined? One option is to
make :glob: into an error, and introduce :GLOB: with defined rules for
inheritance.

Here's a quick patch to implement the new duplicate error, and to
inherit non-glob rules only:

Index: subversion/libsvn_repos/authz.c
===================================================================
--- subversion/libsvn_repos/authz.c (revision 1837006)
+++ subversion/libsvn_repos/authz.c (working copy)
@@ -872,6 +872,24 @@ finalize_tree(node_t *node,
combine_right_limits(sum, local_sum);
}

+static authz_acl_t *
+combine_inherit(const authz_acl_t *prev_acl, const authz_acl_t *acl,
+ apr_pool_t *result_pool)
+{
+ authz_acl_t *combined = apr_palloc(result_pool, sizeof(authz_acl_t));
+
+ *combined = *acl;
+
+ combined->has_anon_access |= prev_acl->has_anon_access;
+ combined->anon_access |= prev_acl->anon_access;
+ combined->has_authn_access |= prev_acl->has_authn_access;
+ combined->authn_access |= prev_acl->authn_access;
+
+ combined->user_access = apr_array_append(result_pool, acl->user_access,
+ prev_acl->user_access);
+ return combined;
+}
+
/* From the authz CONFIG, extract the parts relevant to USER and REPOSITORY.
* Return the filtered rule tree.
*/
@@ -912,6 +930,9 @@ create_user_authz(authz_full_t *authz,
AUTHZ_ANY_REPOSITORY));
SVN_ERR_ASSERT_NO_RETURN(strcmp(acl->rule.repos,
AUTHZ_ANY_REPOSITORY));
+
+ if (!acl->glob_rule && !prev_acl->glob_rule)
+ acl = combine_inherit(prev_acl, acl, result_pool);
apr_array_pop(acls);
}
}
Index: subversion/libsvn_repos/authz.h
===================================================================
--- subversion/libsvn_repos/authz.h (revision 1837006)
+++ subversion/libsvn_repos/authz.h (working copy)
@@ -254,6 +254,8 @@ typedef struct authz_acl_t
matches. */
int sequence_number;

+ svn_boolean_t glob_rule;
+
/* The parsed rule. */
authz_rule_t rule;

Index: subversion/libsvn_repos/authz_parse.c
===================================================================
--- subversion/libsvn_repos/authz_parse.c (revision 1837006)
+++ subversion/libsvn_repos/authz_parse.c (working copy)
@@ -127,6 +127,12 @@ typedef struct ctor_baton_t
svn_membuf_t rule_path_buffer;
svn_stringbuf_t *rule_string_buffer;

+ /* Accumulates the access entries when processing a rule. */
+ apr_hash_t *rule_entries;
+
+ /* For duplicating the access entries when processing a rule. */
+ apr_pool_t *rule_entries_pool;
+
/* The parser's scratch pool. This may not be the same pool as
passed to the constructor callbacks, that is supposed to be an
iteration pool maintained by the generic parser.
@@ -229,6 +235,9 @@ create_ctor_baton(apr_pool_t *result_pool,
svn_membuf__create(&cb->rule_path_buffer, 0, parser_pool);
cb->rule_string_buffer = svn_stringbuf_create_empty(parser_pool);

+ cb->rule_entries = NULL;
+ cb->rule_entries_pool = svn_pool_create(parser_pool);
+
cb->parser_pool = parser_pool;

insert_default_acl(cb);
@@ -684,6 +693,8 @@ rules_open_section(void *baton, svn_stringbuf_t *s

SVN_ERR(check_open_section(cb, section));

+ svn_pool_clear(cb->rule_entries_pool);
+
/* Parse rule property tokens. */
if (*rule != ':')
glob = FALSE;
@@ -741,6 +752,8 @@ rules_open_section(void *baton, svn_stringbuf_t *s
SVN_ERR(parse_rule_path(&acl.acl.rule, cb, glob, rule, rule_len,
section->data));
SVN_ERR(check_unique_rule(cb, &acl.acl.rule, section->data));
+ cb->rule_entries = svn_hash__make(cb->rule_entries_pool);
+ acl.acl.glob_rule = glob;
}
else if (0 == strcmp(section->data, aliases_section))
{
@@ -820,6 +833,13 @@ add_access_entry(ctor_baton_t *cb, svn_stringbuf_t

SVN_ERR_ASSERT(acl != NULL);

+ if (svn_hash_gets(cb->rule_entries, name))
+ return svn_error_createf(SVN_ERR_AUTHZ_INVALID_CONFIG, NULL,
+ _("Duplicate entry '%s' in authz rule [%s]"),
+ name, section->data);
+ svn_hash_sets(cb->rule_entries,
+ apr_pstrmemdup(cb->rule_entries_pool, name, name_len), "");
+
if (inverted)
{
++name;
--
Philip
Daniel Shahaf
2018-07-30 20:03:26 UTC
Permalink
Post by Philip Martin
@@ -820,6 +833,13 @@ add_access_entry(ctor_baton_t *cb, svn_stringbuf_t
SVN_ERR_ASSERT(acl != NULL);
+ if (svn_hash_gets(cb->rule_entries, name))
+ return svn_error_createf(SVN_ERR_AUTHZ_INVALID_CONFIG, NULL,
+ _("Duplicate entry '%s' in authz rule [%s]"),
+ name, section->data);
A comment here explaining why this case needs to *remain* an error would be
helpful, otherwise I'm sure in a year or two someone will think "Hey, this case
is just a syntax error now, so it's fair game to make it a non-error and give
it a defined meaning"…

Maybe you were already planning to add a comment later.

Cheers,

Daniel
Post by Philip Martin
+ svn_hash_sets(cb->rule_entries,
+ apr_pstrmemdup(cb->rule_entries_pool, name, name_len), "");
+
if (inverted)
{
++name;
--
Philip
Branko Čibej
2018-07-31 07:38:06 UTC
Permalink
Post by Philip Martin
Post by Branko Čibej
It's definitely better to have consistent behaviour across all rule
types.
+1
I like the idea of achieving consistency by making the duplicate entries
into an error: it changes the behaviour of 1.10 but anyone affected gets
an error. It's also a simpler version of my existing patch.
[/path]
userA = rw
[repo:/path]
userB = rw
because any change will silently change the behaviour of 1.10. The glob
implementation made file order (sequence number in the implementation)
important and my experimental inheritance patch arbitrarily picks the
per-repository sequence number when inheriting but without any real
justification.
Shouldn't it be the other way around? At request processing time, only
one rule will match. If it's a per-repository rule, it should be
possible to check if a generic rule with the same path exists. This part
can even be pre-calculated in the parser, so expensive lookup at
processing time can be avoided.
Post by Philip Martin
The choice has no effect when using the glob
implementation on a 1.9 authz file, but the choice starts to matter when
glob rules are present.
The release notes for 1.10 didn't specify how glob rules are
prioritised, so anyone using them had to read the design docs or
experiment. How inheritance affects glob rules is the outstanding
behaviour question. Do the glob inherit like non-glob rules? How does
the sequence number of inherited rules get defined? One option is to
make :glob: into an error, and introduce :GLOB: with defined rules for
inheritance.
I think it's fair to say that the current behaviour in 1.10.x is a bug,
then explain in an announcement what we're doing about it.

It was never the intention of the new authz code to *silently* change
behaviour from 1.9.

-- Brane
Stefan Fuhrmann
2018-09-08 09:17:18 UTC
Permalink
Most of these issues have already been addressed by Brane,
but as he wished for my input, here it is.

These are the guiding principles for the 1.10 authz design:

(1) ACLs are only evaluated on a per-user bases; ACLs that
    don't mention this user (or any of their groups)  are ignored.
    Rationale: We don't want to explicitly repeat inherited access
    specs that don't change for the respective path / section.

(2) A more specific rule (fully) replaces any more general rule.
    Rationale: You want to specify access exactly where it applies
    and be sure that those are exactly the rights that will be applied.

(3) If there are multiple, equally specific rules, the last one
    replaces any previous one.
    Rationale: The last one, so you can specify catch-all denial ACLs.
    Replacement, again, to ensure that exactly the rights specified
    last will be in effect.

Those principles seemed quite reasonable and, most importantly,
workable with all the potential confusion caused by glob-rules.

Given that the previous authz code lacked similarly explicit
guidelines, I felt that any discrepancy between 1.9 and 1.10
would be a strong indication of a undesirable behavior in 1.9.
I still think this is true but we must neither break 1.9 authz.
Post by Philip Martin
[/some/path]
user = r
[/some/path]
otheruser = rw
[/some/path]
user = r
otheruser = rw
In 1.10 this is rejected by the parser and cannot be used. Is this a
bug in 1.10 or an acceptable behaviour change?
Collides with rule (3), which is essential for reasoning on glob-rules.
Since it seemed an accidental feature inherited from the config parser,
we explicitly flagged it as invalid in 1.10. It would otherwise become

[/some/path]
otheruser = rw
Post by Philip Martin
[/some/path]
user = rw
user = r
resulted in the last line overriding all the other lines, so user=r in
the example above. In 1.10 the lines combine, so user=rw in the example
above. Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?
Ouch. That is a bad one and an oversight in the design - I think.

According to (3), 1.9 behaves correctly. I guess we consider it
an unordered collection in 1.10 and then a union is the only thing
that approximates intent when a user is a member of different
groups with different access rights. Strict ordering becomes
very useful when assigning rights to groups:

[/some/path]
@Users = rw
@BadUsers = r

I guess the fix in 1.10 is simple but will change 1.10 behavior.
My proposal here:

* apply replacement semantics here as in 1.9
* error out / warn on repeated lines with the same user or group
  (this is hardly ever intentional)
* provide a script / tool that takes 1.10 authz and checks for
  changed behavior ("r" after "rw" rules etc.)

The last one is a bit of work but would be really handy.
Post by Philip Martin
Finally, issue 4762. In 1.9 if both global and per-repository sections
[/some/path]
user = rw
[repos:/some/path]
user = r
resulted in user=rw. The issue 4762 problem is that in 1.10 the
per-repository section overrides any global section, so user=r above. I
believe this is a 1.10 bug and that the 1.9 behaviour should be
reinstated.
According to (2), 1.10 behaves correctly: "user" has rw access,
except for a specific repository. I was not aware that 1.9 has
different behavior.

Now, the crux is how to unbreak 1.9. I suggest the following.

* Introduce a resolution option in the authz code:
  - "union" (1.9 behavior)
  - "most specific" (1.10 behavior)
  - "error out" (new, will work in all non-ambiguous cases)
* default to the "error out"
* optionally specify the desired behavior as a config option
Post by Philip Martin
Glob sections are new so they could have different behaviour from
non-glob sections, but is that what we want? There is a wiki page
https://cwiki.apache.org/confluence/display/SVN/AuthzImprovements but
given issue 4762 I not sure whether it describes the correct behaviour.
User documentation on the new authz is inadequate.

-- Stefan^2.
Branko Čibej
2018-12-02 07:25:03 UTC
Permalink
(1) ACLs are only evaluated on a per-user bases; ACLs that
    don't mention this user (or any of their groups)  are ignored.
    Rationale: We don't want to explicitly repeat inherited access
    specs that don't change for the respective path / section.
This is not entirely true, as seen in the fix for SVN-4793. If a user is
"not mentioned" in an inverted selector, those rights do propagate to
the global level. For example:

[groups]
readers = foo, bar

[/]
~@readers = rw
@readers = r


In this case 'user' has read-write access to '[/]' even though she's not
mentioned anywhere in the authz file or the specific ACL for '[/]'.
   [/some/path]
   user = rw
   user = r
resulted in the last line overriding all the other lines, so user=r in
the example above.  In 1.10 the lines combine, so user=rw in the example
above.  Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?
Ouch. That is a bad one and an oversight in the design - I think.
According to (3), 1.9 behaves correctly. I guess we consider it
an unordered collection in 1.10 and then a union is the only thing
that approximates intent when a user is a member of different
groups with different access rights. Strict ordering becomes
[/some/path]
@Users = rw
@BadUsers = r
We already have strict ordering within an ACL (authz_acl_t in
libsvn_repos/authz.h):

/* All other user- or group-specific access rights.
Aliases are replaced with their definitions, rules for the same
user or group are merged. */
apr_array_header_t *user_access;



The "merge" semantics was intentional; if we decide it's a bug (and I
think it is), it's fairly easy to change. I would lean in the direction
of making repeating the same access entry selection a hard error at
parsing time. This requires changes to the merge semantics implemented
in add_access_entry() and merge_alias_ace() in
libsvn_repos/authz_parse.c. The nice part is that it would catch errors
like this:

[aliases]
afoo = foo
abar = bar

[/]
&afoo = rw
foo = r
~&abar = rw
~bar = r


With the current implementation we translate the ACL to:

[/]
foo = rw
foo = r
~bar = rw
~bar = r


and even with strict ordering I'd say this is a bug and not intentional.
I guess the fix in 1.10 is simple but will change 1.10 behavior.
* apply replacement semantics here as in 1.9
* error out / warn on repeated lines with the same user or group
  (this is hardly ever intentional)
^^^ this
* provide a script / tool that takes 1.10 authz and checks for
  changed behavior ("r" after "rw" rules etc.)
The last one is a bit of work but would be really handy.
The remaining ambiguity that I would prefer _not_ to resolve at parsing
time is this:

[groups]
@readers = foo, bar, user
@writers = baz, quz, user

[/]
@writers = rw
@readers = r


With strict ordering and replacement semantics 'user' would get
read-only rights. But that doesn't seem right; surely if someone is a
member of both 'readers' and 'writers' groups, they should get merged
access rights of both?

Note that the current behaviour is to merge.

-- Brane
Branko Čibej
2018-12-02 07:43:23 UTC
Permalink
Post by Branko Čibej
(1) ACLs are only evaluated on a per-user bases; ACLs that
    don't mention this user (or any of their groups)  are ignored.
    Rationale: We don't want to explicitly repeat inherited access
    specs that don't change for the respective path / section.
This is not entirely true, as seen in the fix for SVN-4793. If a user is
"not mentioned" in an inverted selector, those rights do propagate to
[groups]
readers = foo, bar
[/]
@readers = r
In this case 'user' has read-write access to '[/]' even though she's not
mentioned anywhere in the authz file or the specific ACL for '[/]'.
   [/some/path]
   user = rw
   user = r
resulted in the last line overriding all the other lines, so user=r in
the example above.  In 1.10 the lines combine, so user=rw in the example
above.  Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?
Ouch. That is a bad one and an oversight in the design - I think.
According to (3), 1.9 behaves correctly. I guess we consider it
an unordered collection in 1.10 and then a union is the only thing
that approximates intent when a user is a member of different
groups with different access rights. Strict ordering becomes
[/some/path]
@Users = rw
@BadUsers = r
We already have strict ordering within an ACL (authz_acl_t in
/* All other user- or group-specific access rights.
Aliases are replaced with their definitions, rules for the same
user or group are merged. */
apr_array_header_t *user_access;
The "merge" semantics was intentional; if we decide it's a bug (and I
think it is), it's fairly easy to change. I would lean in the direction
of making repeating the same access entry selection a hard error at
parsing time. This requires changes to the merge semantics implemented
in add_access_entry() and merge_alias_ace() in
libsvn_repos/authz_parse.c. The nice part is that it would catch errors
[aliases]
afoo = foo
abar = bar
[/]
&afoo = rw
foo = r
~&abar = rw
~bar = r
[/]
foo = rw
foo = r
~bar = rw
~bar = r
and even with strict ordering I'd say this is a bug and not intentional.
Note that this should also be an error:

[/]
$anonymous = r
~$authenticated = rw



-- Brane
Branko Čibej
2018-12-02 09:38:23 UTC
Permalink
Post by Branko Čibej
Post by Branko Čibej
(1) ACLs are only evaluated on a per-user bases; ACLs that
    don't mention this user (or any of their groups)  are ignored.
    Rationale: We don't want to explicitly repeat inherited access
    specs that don't change for the respective path / section.
This is not entirely true, as seen in the fix for SVN-4793. If a user is
"not mentioned" in an inverted selector, those rights do propagate to
[groups]
readers = foo, bar
[/]
@readers = r
In this case 'user' has read-write access to '[/]' even though she's not
mentioned anywhere in the authz file or the specific ACL for '[/]'.
   [/some/path]
   user = rw
   user = r
resulted in the last line overriding all the other lines, so user=r in
the example above.  In 1.10 the lines combine, so user=rw in the example
above.  Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?
Ouch. That is a bad one and an oversight in the design - I think.
According to (3), 1.9 behaves correctly. I guess we consider it
an unordered collection in 1.10 and then a union is the only thing
that approximates intent when a user is a member of different
groups with different access rights. Strict ordering becomes
[/some/path]
@Users = rw
@BadUsers = r
We already have strict ordering within an ACL (authz_acl_t in
/* All other user- or group-specific access rights.
Aliases are replaced with their definitions, rules for the same
user or group are merged. */
apr_array_header_t *user_access;
The "merge" semantics was intentional; if we decide it's a bug (and I
think it is), it's fairly easy to change. I would lean in the direction
of making repeating the same access entry selection a hard error at
parsing time. This requires changes to the merge semantics implemented
in add_access_entry() and merge_alias_ace() in
libsvn_repos/authz_parse.c. The nice part is that it would catch errors
[aliases]
afoo = foo
abar = bar
[/]
&afoo = rw
foo = r
~&abar = rw
~bar = r
[/]
foo = rw
foo = r
~bar = rw
~bar = r
and even with strict ordering I'd say this is a bug and not intentional.
[/]
$anonymous = r
~$authenticated = rw
I have a patch ready, here are some examples of what it does (currently,
all these examples are valid and produce merged access rights):

$ cat authz.conf
[/]
user = rw
user = r
$ svnauthz validate authz.conf
svnauthz: E220003: Error while parsing authz file: 'authz.conf':
svnauthz: E220003: Duplicate access entry 'user' in rule [/]

$ cat authz.conf
[/]
$authenticated = rw
~$anonymous = r
$ svnauthz validate authz.conf
svnauthz: E220003: Error while parsing authz file: 'authz.conf':
svnauthz: E220003: Duplicate access entry '~$anonymous' (matches '$authenticated') in rule [/]

$ cat authz.conf
[aliases]
resu = user

[/]
~&resu = rw
~user = r
$ svnauthz validate authz.conf
svnauthz: E220003: Error while parsing authz file: 'authz.conf':
svnauthz: E220003: Duplicate access entry '~&resu' (matches '~user') in rule [/]



-- Brane
Julian Foad
2018-12-02 14:15:35 UTC
Permalink
I have a patch ready, here are some examples of what it does [...]
Not following the details, all I would ask ("all", he says) is that a complete description of the new semantics be available for review, and should be published along with the changes. Otherwise I fear the community has no real way of reviewing the actual semantics against the intended semantics.

One basis we have for such documentation could be the document that Doug Robinson contributed in dev@ thread "Subversion AuthZ Wildcards" on 2017-02-27 [1]. AFAIK it was the best or only documentation we had of the new semantics.

Might you or anyone else be prepared to do that?

[1] https://svn.haxx.se/dev/archive-2017-02/0188.shtml or https://lists.apache.org/thread.html/be50f6e5b1a92a244033bd7f13449c8d02e92bbe7e29dc89209f62f8@%3Cdev.subversion.apache.org%3E
--
- Julian
Branko Čibej
2018-12-02 15:49:58 UTC
Permalink
Post by Julian Foad
I have a patch ready, here are some examples of what it does [...]
Not following the details, all I would ask ("all", he says) is that a complete description of the new semantics be available for review, and should be published along with the changes. Otherwise I fear the community has no real way of reviewing the actual semantics against the intended semantics.
Might you or anyone else be prepared to do that?
Oooh, Yeah. I started by making the page tree in confluence make some
kind of sense, and found we have two roadmap pages, both by the same
author. :)  There are a few pages I haven't been able to classify, but
certainly a new chapter is necessary to gather actual implemented
feature documentation (which may, or may not, be different from the
related design notes).

-- Brane
Branko Čibej
2018-12-02 16:28:01 UTC
Permalink
Post by Branko Čibej
Post by Julian Foad
I have a patch ready, here are some examples of what it does [...]
Not following the details, all I would ask ("all", he says) is that a complete description of the new semantics be available for review, and should be published along with the changes. Otherwise I fear the community has no real way of reviewing the actual semantics against the intended semantics.
Might you or anyone else be prepared to do that?
Oooh, Yeah. I started by making the page tree in confluence make some
kind of sense, and found we have two roadmap pages, both by the same
author. :)  There are a few pages I haven't been able to classify, but
certainly a new chapter is necessary to gather actual implemented
feature documentation (which may, or may not, be different from the
related design notes).
Seriously though: I started this document
https://cwiki.apache.org/confluence/x/7IjQBQ
Yes, it's empty. It will improve.

-- Brane
Branko Čibej
2018-12-05 12:44:28 UTC
Permalink
Post by Branko Čibej
Seriously though: I started this document
https://cwiki.apache.org/confluence/x/7IjQBQ
Yes, it's empty. It will improve.
It is starting to improve. I'll be grateful to anyone who cares to take
a look to debug the syntax BNF.


-- Brane
Julian Foad
2018-12-05 13:27:48 UTC
Permalink
Post by Branko Čibej
Post by Branko Čibej
https://cwiki.apache.org/confluence/x/7IjQBQ
It is starting to improve. I'll be grateful to anyone who cares to take
a look to debug the syntax BNF.
I appreciate the right-bracket rule change is in that domain.

Could I nevertheless encourage you to jump ahead to writing the high-level rules, especially those that affected by the changes you propose in this thread relating to override/merge behaviours etc.?

That's the part we need in order to review the proposed changes.
--
- Julian
Branko Čibej
2018-12-05 13:37:52 UTC
Permalink
Post by Julian Foad
Post by Branko Čibej
Post by Branko Čibej
https://cwiki.apache.org/confluence/x/7IjQBQ
It is starting to improve. I'll be grateful to anyone who cares to take
a look to debug the syntax BNF.
I appreciate the right-bracket rule change is in that domain.
Could I nevertheless encourage you to jump ahead to writing the high-level rules, especially those that affected by the changes you propose in this thread relating to override/merge behaviours etc.?
That's the part we need in order to review the proposed changes.
One thing at a time. FWIW, the high-level rules are *already* documented
in our wiki, as are changes from the previous authz incarnation (modulo
bugs). I'll really just summarize them in a more user-friendly way. All
known docs and issues are also linked from that page.

In fact the documentation we have now, as well as test coverage, is much
better than what we had before 1.10, when it was limited to half a
chapter in The Book and to reading the code. It's kind of "fun" trying
to document changes from undocumented behaviour.

-- Brane
Branko Čibej
2018-12-05 13:40:02 UTC
Permalink
Post by Julian Foad
Post by Branko Čibej
Post by Branko Čibej
https://cwiki.apache.org/confluence/x/7IjQBQ
It is starting to improve. I'll be grateful to anyone who cares to take
a look to debug the syntax BNF.
I appreciate the right-bracket rule change is in that domain.
Could I nevertheless encourage you to jump ahead to writing the high-level rules, especially those that affected by the changes you propose in this thread relating to override/merge behaviours etc.?
That's the part we need in order to review the proposed changes.
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.

-- Brane
Julian Foad
2018-12-05 13:44:50 UTC
Permalink
Post by Branko Čibej
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.
At first glance, I saw what looks like a bug: it says a comment must have a non-white-space immediately after the '#'... I didn't review further.
--
- Julian
Branko Čibej
2018-12-05 13:59:16 UTC
Permalink
Post by Julian Foad
Post by Branko Čibej
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.
At first glance, I saw what looks like a bug: it says a comment must have a non-white-space immediately after the '#'... I didn't review further.
The <text> production used imply "optional". I already fixed that and
how the comment definition reads:

<comment> ::= "#" <opt-text> <line-end>


-- Brane
Branko Čibej
2018-12-05 14:04:47 UTC
Permalink
Post by Branko Čibej
Post by Julian Foad
Post by Branko Čibej
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.
At first glance, I saw what looks like a bug: it says a comment must have a non-white-space immediately after the '#'... I didn't review further.
The <text> production used imply "optional". I already fixed that and
<comment> ::= "#" <opt-text> <line-end>
However <key> /is/ wrong ... it implies that keys can't contain "#" or
"[", where actually they can, they just can't start with one of those
two. Whet they really can't contain is "=".

I will have to check the code though ...

-- Brane
Julian Foad
2018-12-05 14:05:12 UTC
Permalink
Post by Branko Čibej
Post by Julian Foad
Post by Branko Čibej
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.
At first glance, I saw what looks like a bug: it says a comment must have a non-white-space immediately after the '#'... I didn't review further.
The <text> production used imply "optional". I already fixed that and
<comment> ::= "#" <opt-text> <line-end>
The problem is <opt-text> isn't allowed to start with whitespace.
--
- Julian
Branko Čibej
2018-12-05 14:23:34 UTC
Permalink
Post by Julian Foad
Post by Branko Čibej
Post by Julian Foad
Post by Branko Čibej
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.
At first glance, I saw what looks like a bug: it says a comment must have a non-white-space immediately after the '#'... I didn't review further.
The <text> production used imply "optional". I already fixed that and
<comment> ::= "#" <opt-text> <line-end>
The problem is <opt-text> isn't allowed to start with whitespace.
Eh, you're right; I redefined <opt-text> to fix that, which also fixes
<key-value>.

-- Brane
Branko Čibej
2018-12-06 08:01:42 UTC
Permalink
Post by Branko Čibej
Post by Julian Foad
Post by Branko Čibej
Post by Julian Foad
Post by Branko Čibej
Yes, but I'm asking for a review of the BNF to make sure that it doesn't
contain silly bugs.
At first glance, I saw what looks like a bug: it says a comment must have a non-white-space immediately after the '#'... I didn't review further.
The <text> production used imply "optional". I already fixed that and
<comment> ::= "#" <opt-text> <line-end>
The problem is <opt-text> isn't allowed to start with whitespace.
Eh, you're right; I redefined <opt-text> to fix that, which also fixes
<key-value>.
I've decided that documenting the syntax of authz files at this level
doesn't really belong in this document. So I started this:

https://cwiki.apache.org/confluence/x/oYvQBQ

and will refer to that page instead, pointing out the differences.

-- Brane

Branko Čibej
2018-12-02 16:23:05 UTC
Permalink
Post by Branko Čibej
Post by Branko Čibej
Post by Branko Čibej
(1) ACLs are only evaluated on a per-user bases; ACLs that
    don't mention this user (or any of their groups)  are ignored.
    Rationale: We don't want to explicitly repeat inherited access
    specs that don't change for the respective path / section.
This is not entirely true, as seen in the fix for SVN-4793. If a user is
"not mentioned" in an inverted selector, those rights do propagate to
[groups]
readers = foo, bar
[/]
@readers = r
In this case 'user' has read-write access to '[/]' even though she's not
mentioned anywhere in the authz file or the specific ACL for '[/]'.
   [/some/path]
   user = rw
   user = r
resulted in the last line overriding all the other lines, so user=r in
the example above.  In 1.10 the lines combine, so user=rw in the example
above.  Is this a bug in 1.10, or a bug in 1.9 that is fixed in 1.10, or
an acceptable behaviour change?
Ouch. That is a bad one and an oversight in the design - I think.
According to (3), 1.9 behaves correctly. I guess we consider it
an unordered collection in 1.10 and then a union is the only thing
that approximates intent when a user is a member of different
groups with different access rights. Strict ordering becomes
[/some/path]
@Users = rw
@BadUsers = r
We already have strict ordering within an ACL (authz_acl_t in
/* All other user- or group-specific access rights.
Aliases are replaced with their definitions, rules for the same
user or group are merged. */
apr_array_header_t *user_access;
The "merge" semantics was intentional; if we decide it's a bug (and I
think it is), it's fairly easy to change. I would lean in the direction
of making repeating the same access entry selection a hard error at
parsing time. This requires changes to the merge semantics implemented
in add_access_entry() and merge_alias_ace() in
libsvn_repos/authz_parse.c. The nice part is that it would catch errors
[aliases]
afoo = foo
abar = bar
[/]
&afoo = rw
foo = r
~&abar = rw
~bar = r
[/]
foo = rw
foo = r
~bar = rw
~bar = r
and even with strict ordering I'd say this is a bug and not intentional.
[/]
$anonymous = r
~$authenticated = rw
I have a patch ready, here are some examples of what it does (currently,
https://issues.apache.org/jira/browse/SVN-4794
Loading...