Re: [OAUTH-WG] Namespacing "type" in RAR

Justin Richer Tue, 21 Jul 2020 11:06:27 -0700

I’m suggesting that API designers avoid using such glyphs in their “type” 
values if they want to avoid such human-copy errors, like they would need to do 
for most other strings in their system. If that means they stick to ASCII or 
put a note on the developer page that says “hey copy and paste this value, 
don’t try to re-type it” or whatever, that’s up to the AS.
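
A minimal sketch of the kind of glyph at issue, in Python purely for illustration. The sample values and the helper name `is_ascii_type` are hypothetical and do not come from the RAR draft. The same rendered "é" can be stored as one code point or as two, and an exact comparison tells them apart:

```python
import unicodedata

# Two spellings of a hypothetical "type" value that render identically.
precomposed = "caf\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# Exact comparison sees two different strings of different lengths.
print(precomposed == decomposed)           # False
print(len(precomposed), len(decomposed))   # 4 5

# They only become equal if someone normalizes them first (NFC here).
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True


def is_ascii_type(value: str) -> bool:
    """Hypothetical check an API designer might run on their own "type" values."""
    return value.isascii()


print(is_ascii_type("payment_initiation"), is_ascii_type(precomposed))  # True False
```

Sticking to values that pass a check like `is_ascii_type` is one way for an API designer to sidestep the problem without the AS normalizing anything.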


You’d have the same kind of issue around “similar-looking” characters, like the 
semicolon vs. the Greek question mark. Should the AS look for those and try to 
“fix” the inputs? I would argue not: the AS should be strict in matching these 
values because it could have security implications. 
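
To make the look-alike case concrete, here is a small sketch in Python. The registered value and the function name `type_matches` are made up for illustration. It compares a semicolon with the Greek question mark under strict matching:

```python
import unicodedata

SEMICOLON = "\u003b"             # ";"
GREEK_QUESTION_MARK = "\u037e"   # looks like ";" in most fonts

print(unicodedata.name(SEMICOLON))             # SEMICOLON
print(unicodedata.name(GREEK_QUESTION_MARK))   # GREEK QUESTION MARK

registered = "records;v1"   # hypothetical value documented by the AS


def type_matches(requested: str) -> bool:
    """Strict match: no normalization and no attempt to "fix" the input."""
    return requested == registered


print(type_matches("records;v1"))        # True
print(type_matches("records\u037ev1"))   # False

# Unicode NFC normalization maps U+037E to U+003B, so an AS that
# normalized its inputs would silently accept the look-alike value.
print(unicodedata.normalize("NFC", "records\u037ev1") == registered)  # True
```

The last line shows why "fixing" inputs is a policy decision and not a neutral cleanup step: normalization changes which requests are accepted.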

This isn’t a problem unique to RAR, or OAuth for that matter. We can, and I 
think should, add guidance to the RAR document for all of these points. 

 — Justin

> On Jul 21, 2020, at 1:55 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
> 
> In unicode, a glyph can be represented by more than one code point. When 
> reading the docs and entering a value, the developer will not know which code 
> point the AS intended. 
> 
> Are you suggesting that AS documentation would have the bytes rather than 
> glyphs? Or not use glyphs that have multiple code points? Or that they only 
> use English?
> 
> 
> 
> On Tue, Jul 21, 2020 at 10:34 AM Justin Richer <jric...@mit.edu> wrote:
> Right, and I’m saying that all three of those would be DIFFERENT “type” 
> values, because they’re different strings. The fact that when treated as URIs 
> they would be equivalent is irrelevant. Just like “foo”, “Foo”, and “FOO” 
> would be different “type” values, per the spec. Nothing is stopping an AS 
> from treating them as equivalent internally, but that seems a bit dangerous 
> to me. I’d love to see a formal breakdown of that, though.
> 
> As for the unicode example, if we define things as using byte comparisons, 
> then that becomes an issue for proper documentation and configuration — and 
> again, probably a good place to have recommendations for picking type value 
> strings so as to avoid such problems.
> 
> In short, I don’t think we should have any requirements on canonicalization 
> for these values.
> 
>  — Justin
> 
>> On Jul 21, 2020, at 1:03 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>> 
>> 
>> The following are the same URI, but are different strings:
>> 
>>      “https://schema.example.org/v1”
>>      “HTTPS://schema.example.org/v1”
>>      “https://SCHEMA.EXAMPLE.ORG/v1”
>> 
>> Before comparing them to each other, they must be canonicalized so that they 
>> become the same string.
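
A minimal sketch of the difference between the two comparisons, assuming Python's standard `urllib.parse` and a deliberately simplified normalization that only lowercases the scheme and host. The helper name `normalize_scheme_and_host` is hypothetical, and this is not full URI normalization:

```python
from urllib.parse import urlsplit, urlunsplit

candidates = [
    "https://schema.example.org/v1",
    "HTTPS://schema.example.org/v1",
    "https://SCHEMA.EXAMPLE.ORG/v1",
]

# Compared as plain strings, all three are distinct values.
print(len(set(candidates)))  # 3


def normalize_scheme_and_host(uri: str) -> str:
    """Lowercase only the scheme and host; leave path, query, fragment alone."""
    parts = urlsplit(uri)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(),
         parts.path, parts.query, parts.fragment)
    )


# Compared after this partial normalization, they collapse to one value.
print(len({normalize_scheme_and_host(u) for u in candidates}))  # 1
```

Which of the two results is the right one for "type" is exactly the question under discussion; the sketch only shows that the answers differ.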
>> 
>> From earlier in this thread, I am NOT suggesting that it must be a URI, nor 
>> that it is required:
>> 
>> Since the type represents a much more complex object than a JWT claim, a 
>> client developer's tooling could pull down the JSON Schema (or some such) 
>> for a type used in their source code, and provide autocompletion and 
>> validation which would improve productivity and reduce errors. An AS that is 
>> using a defined type could use the schema for input validation. Neither of 
>> these would be at run time. JSON Schema allows comments and examples.
>> 
>> What is the harm in non-normative language around a retrievable URI?
>> 
>> On Tue, Jul 21, 2020 at 9:58 AM Justin Richer <jric...@mit.edu> wrote:
>> String comparison works just fine when the strings happen to be URIs, and 
>> you aren’t treating them as URIs:
>> 
>>      “https://schema.example.org/v1”
>> 
>> Is different from 
>> 
>>      “https://schema.example.org/v2”
>> 
>> And both are different from
>> 
>>      “https://schema.example.org:443/v1/”
>> 
>> All of these are strings, and the strings happen to be URIs but that’s 
>> irrelevant to the comparison process. Can you please help me understand why 
>> doing a string comparison on these values does not work in exactly the same 
>> way it would for “foo”, “bar”, and “baz” values? Why would these need to be 
>> canonicalized to be compared? The definition of a JSON string is an ordered 
>> set of unicode code points, and this can be compared byte-wise. (Or 
>> code-point-wise, whatever’s most correct here.) Can you give me 
>> counter-examples as to where string comparison doesn’t work? And can you 
>> help me understand how this same worry doesn’t apply to all of the rest of 
>> the values in the RAR specification, which are also strings and will need to 
>> be compared?
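
As a sketch of what "just string comparison" could look like at an AS, assuming a hypothetical in-memory table (`SUPPORTED_TYPES`, `lookup`, and the handler labels are illustrative names, not part of RAR). The value is looked up exactly as received and never parsed as a URI:

```python
# Hypothetical table of "type" values an AS has chosen to support.
SUPPORTED_TYPES = {
    "https://schema.example.org/v1": "schema v1 handler",
    "https://schema.example.org/v2": "schema v2 handler",
    "foo": "local foo handler",
}


def lookup(type_value: str):
    """Exact string lookup; returns None for anything not listed verbatim."""
    return SUPPORTED_TYPES.get(type_value)


print(lookup("https://schema.example.org/v1"))        # schema v1 handler
print(lookup("https://schema.example.org:443/v1/"))   # None
print(lookup("Foo"))                                  # None

# Comparing code points and comparing UTF-8 bytes give the same answer.
a, b = "https://schema.example.org/v1", "https://schema.example.org/v2"
print((a == b) == (a.encode("utf-8") == b.encode("utf-8")))  # True
```

Under this approach the URI-shaped values behave exactly like "foo": either the string is in the table or it is not.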
>> 
>> I’m still very confused as to the URI retrieval issue here, if there even is 
>> one. It sounds like we’re both saying that it could be useful if type values 
>> are retrievable when they’re URIs, but that would be something to augment a 
>> process and not required for the RAR spec. I’m against requiring the value 
>> to be a URI and against requiring the AS to process that URI as a URI at 
>> runtime. Anything that an AS wants to do with the “type” value, including 
>> providing additional tooling and validation, is up to the AS and outside of 
>> the spec.
>> 
>>  — Justin
>> 
>>> On Jul 21, 2020, at 12:35 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>> 
>>> This statement:
>>> 
>>> “compare two strings so that they’re exact”
>>> 
>>> does not work for either Unicode or URIs. A string, and a canonicalized 
>>> Unicode string are not the same thing. Similar for a URI. I have assumed 
>>> you understand the canonicalization requirement, but it does not sound like 
>>> you do. Would you like examples?
>>> 
>>> 
>>> wrt. the AS and URI, *you* keep saying that *I* said the AS would retrieve 
>>> the URI. I HAVE NOT SAID THAT!
>>> 
>>> I am suggesting that the URI MAY be retrievable, and I gave examples on how 
>>> that would be useful for tooling for client developers, and for an AS in 
>>> doing input validation. The URI would NOT be retrieved at run time.
>>> 
>>> 
>>> On Tue, Jul 21, 2020 at 7:35 AM Justin Richer <jric...@mit.edu> wrote:
>>> If we treat all the strings as just strings, without any special internal 
>>> format to be specified or detected, then comparing the strings is a 
>>> well-understood and well-documented process. I also think that we shouldn’t 
>>> invent anything here, so if there’s a better way to say “compare two 
>>> strings so that they’re exact” then that’s what I mean. Sorry if that was 
>>> unclear.
>>> 
>>> I’m saying the AS should not retrieve the URI passed in the “type” value. 
>>> You brought that up and then described the process that the AS would take 
>>> to do so. I have said from the start that the use of a URI is for name 
>>> spacing and not for addressing content to be fetched, so I’m confused why 
>>> you think I intend otherwise.
>>> 
>>>  — Justin
>>> 
>>>> On Jul 20, 2020, at 2:59 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>> 
>>>> Canonicalization of URIs and unicode is fairly well specified. I was not 
>>>> suggesting we invent anything there.
>>>> 
>>>> A byte comparison, as you suggested earlier, will be problematic, as I 
>>>> have pointed out.
>>>> 
>>>> I'm confused why you are still talking about the AS retrieving a URI.
>>>> 
>>>> On Mon, Jul 20, 2020 at 4:42 AM Justin Richer <jric...@mit.edu> wrote:
>>>> Since this is a recommendation for namespace, we could also just say 
>>>> collision-resistant like JWT, and any of those examples are fine. But that 
>>>> said, I think there’s something particularly compelling about URIs since 
>>>> they have somewhat-human-readable portions. But again, I’m saying it 
>>>> should be a recommendation to API developers and not a requirement in the 
>>>> spec. In the spec, I argue that “type” should be a string, full stop.
>>>> 
>>>> If documentation is so confusing that developers are typing in the wrong 
>>>> strings, then that’s bad documentation. And likely a bad choice for the 
>>>> “type” string on the part of the AS. You’d have the same problem with any 
>>>> other value the developer’s supposed to copy over.  :)
>>>> 
>>>> I agree that we should call out explicitly how they should be compared, 
>>>> and I propose we use one of the handful of existing string-comparison 
>>>> RFCs here instead of defining our own rules.
>>>> 
>>>> While the type could be a dereferenceable URI, requiring action on the AS 
>>>> is really getting into distributed authorization policies. We tried doing 
>>>> that with UMA1’s scope structures and it didn’t work very well in practice 
>>>> (in my memory and experience). Someone could profile “type” on top of this 
>>>> if they wanted to do so, with support at the AS for that, but I don’t see 
>>>> a compelling reason for that to be a requirement as that’s a lot of 
>>>> complexity and a lot more error states (the fetch fails, or it doesn’t 
>>>> have a policy, or the policy’s in a format the AS doesn’t understand, or 
>>>> the AS doesn’t like the policy, etc). 
>>>> 
>>>> An AS is always free to implement its types in such a fashion, and that 
>>>> could make plenty of sense in a smaller ecosystem. And this is yet another 
>>>> reason that we define “type” as being a string to be interpreted and 
>>>> understood by the AS — so that an AS that wants to work this way can do so.
>>>> 
>>>>  — Justin
>>>> 
>>>> PS: thanks for pointing out the error in the example in XYZ, I’ll fix that 
>>>> prior to publication.
>>>> 
>>>>> On Jul 18, 2020, at 8:58 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>> 
>>>>> Justin: thanks for kindly pointing out which mail list this is.
>>>>> 
>>>>> To clarify, public JWT claims are not just URIs, but any 
>>>>> collision-resistant namespace: 
>>>>> "Examples of collision-resistant namespaces include: Domain Names, Object 
>>>>> Identifiers (OIDs) as defined in the ITU-T X.660 and X.670 
>>>>> Recommendation series, and Universally Unique IDentifiers (UUIDs) 
>>>>> [RFC4122]."
>>>>> 
>>>>> I think letting the "type" be any JSON string and doing a byte-wise 
>>>>> comparison will be problematic. A client developer will be reading 
>>>>> documentation to learn what the types are, and typing it in. Given the 
>>>>> wide set of whitespace characters, and unicode equivalence, different 
>>>>> byte streams will all look the same, and a byte-wise comparison will fail.
>>>>> 
>>>>> Similarly for URIs. If it is a valid URI, then a byte-wise comparison is 
>>>>> not sufficient. Canonicalization is required. 
>>>>> 
>>>>> These are not showstopper issues, but the specification should call out 
>>>>> how type strings are compared, and provide caveats to an AS developer.
>>>>> 
>>>>> I have no idea why you would think the AS would retrieve a URL.
>>>>> 
>>>>> Since the type represents a much more complex object than a JWT claim, a 
>>>>> client developer's tooling could pull down the JSON Schema (or some such) 
>>>>> for a type used in their source code, and provide autocompletion and 
>>>>> validation which would improve productivity and reduce errors. An AS that 
>>>>> is using a defined type could use the schema for input validation. 
>>>>> Neither of these would be at run time. JSON Schema allows comments and 
>>>>> examples.
>>>>> 
>>>>> What is the harm in non-normative language around a retrievable URI?
>>>>> 
>>>>> BTW: the example in 
>>>>> https://oauth.xyz/draft-richer-transactional-authz#rfc.section.2 has 
>>>>> not been updated with the "type" field.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Sat, Jul 18, 2020 at 8:10 AM Justin Richer <jric...@mit.edu> wrote:
>>>>> Hi Dick,
>>>>> 
>>>>> This is a discussion about the RAR specification on the OAuth list, and 
>>>>> therefore doesn’t have anything to do with alignment with XAuth. In fact, 
>>>>> I believe the alignment is the other way around, as doesn’t XAuth 
>>>>> normatively reference RAR at this point? Even though, last I saw, it uses 
>>>>> a different top-level structure for conveying things, I believe it does 
>>>>> say to use the internal object structures. I am also a co-author on RAR 
>>>>> and we had already defined a “type” field in RAR quite some time ago. You 
>>>>> did notice that XYZ’s latest draft added this field to keep the two in 
>>>>> alignment with each other, which has always been the goal since the 
>>>>> initial proposal of the RAR work, but that’s a time lag and not a display 
>>>>> of new intent. 
>>>>> 
>>>>> In any event, even though I think the decision has bearing in both 
>>>>> places, this isn’t about GNAP. Working on RAR’s requirements has brought 
>>>>> up this interesting issue of what should be in the type field for RAR in 
>>>>> OAuth 2.
>>>>> 
>>>>> I think that it should be defined as a string, and therefore compared as 
>>>>> a byte value in all cases, regardless of what the content of the string 
>>>>> is. I don’t think the AS should be expected to fetch a URI for anything. 
>>>>> I don’t think the AS should normalize any of the inputs. I think that any 
>>>>> JSON-friendly character set should be allowed (including spaces and 
>>>>> Unicode characters), and since RAR already requires the JSON objects to be 
>>>>> form-encoded, this shouldn’t cause additional trouble when adding them in 
>>>>> to OAuth 2’s request structures.
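
A rough sketch of why form encoding should be safe for arbitrary "type" strings, using only Python's standard library. The parameter name `authorization_details` and the sample object are assumptions made for this illustration, not taken from this thread:

```python
import json
from urllib.parse import parse_qs, urlencode

# Hypothetical request object with spaces and a non-ASCII character in "type".
details = [{"type": "dossier médical v1", "actions": ["read"]}]

# Client side: serialize to JSON, then form-encode the JSON text.
body = urlencode(
    {"authorization_details": json.dumps(details, ensure_ascii=False)}
)
print(body)

# AS side: form-decode, then parse the JSON text.
received = json.loads(parse_qs(body)["authorization_details"][0])

# The "type" string survives the round trip unchanged, code point for code point.
print(received[0]["type"] == details[0]["type"])  # True
```

The encoding layers are reversible, so the AS compares the same string the client wrote; any normalization would have to be added deliberately.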
>>>>> 
>>>>> The idea of using a URI would be to get people out of each other’s 
>>>>> namespaces. It’s similar to the concept of “public” vs “private” claims 
>>>>> in JWT:
>>>>> 
>>>>> https://tools.ietf.org/html/rfc7519#section-4.2
>>>>> 
>>>>> What I’m proposing is that if you think it’s going to be a 
>>>>> general-purpose type name, then we recommend you use a URI as your 
>>>>> string. And beyond that, that’s it. It’s up to the AS to figure out what 
>>>>> to do with it, and RAR stays out of it.
>>>>> 
>>>>>  — Justin
>>>>> 
>>>>>> On Jul 17, 2020, at 1:25 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>>> 
>>>>>> Hey Justin, glad to see that you have aligned with the latest XAuth 
>>>>>> draft on a type property being required.
>>>>>> 
>>>>>> I like the idea that the value of the type property is fully defined by 
>>>>>> the AS, which could delegate it to a common URI for reuse. This gets 
>>>>>> GNAP out of specifying access requests, and enables other parties to 
>>>>>> define access without any required coordination with IETF or IANA.
>>>>>> 
>>>>>> A complication in mixing plain strings and URIs is the canonicalization. 
>>>>>> A plain string can be a fixed byte representation, but a URI requires 
>>>>>> canonicalization for comparison. Mixing the two requires URI detection 
>>>>>> at the AS before canonicalization, and an AS MUST do canonicalization of 
>>>>>> URIs.
>>>>>> 
>>>>>> If the URI is retrievable, it can provide machine and/or human readable 
>>>>>> documentation in JSON schema or some such, or any other content type. 
>>>>>> Once again, the details are out of scope of GNAP, but we can provide 
>>>>>> examples to guide implementers.
>>>>>> 
>>>>>> Are you still thinking that bare strings are allowed in GNAP, and are 
>>>>>> defined by the AS?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Fri, Jul 17, 2020 at 8:39 AM Justin Richer <jric...@mit.edu> wrote:
>>>>>> The “type” field in the RAR spec serves an important purpose: it defines 
>>>>>> what goes in the rest of the object, including what other fields are 
>>>>>> available and what values are allowed for those fields. It provides an 
>>>>>> API-level definition for requesting access based on multiple dimensions, 
>>>>>> and that’s really powerful and flexible. Each type can use any of the 
>>>>>> general-purpose fields like “actions” and/or add its own fields as 
>>>>>> necessary, and the “type” parameter keeps everything well-defined.
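
One way to picture "type" selecting the definition for the rest of the object is the sketch below. The table `TYPE_DEFINITIONS`, the type names, the field names other than "actions", and the helper `unexpected_fields` are all hypothetical:

```python
# Hypothetical per-type definitions: which extra fields each "type" allows.
TYPE_DEFINITIONS = {
    "photo-api": {"actions", "locations"},
    "https://schema.example.org/v1": {"actions", "datatypes"},
}


def unexpected_fields(obj: dict) -> set:
    """Return the fields that the object's "type" does not define.

    Raises KeyError if the "type" value is not known to this AS.
    """
    allowed = TYPE_DEFINITIONS[obj["type"]]
    return set(obj) - allowed - {"type"}


print(unexpected_fields({"type": "photo-api", "actions": ["read"]}))   # set()
print(unexpected_fields({"type": "photo-api", "datatypes": ["x"]}))    # {'datatypes'}
```

The same field name can be valid under one type and rejected under another, which is the sense in which "type" keeps the object well-defined.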
>>>>>> 
>>>>>> The question, then, is what defines what’s allowed to go into the “type” 
>>>>>> field itself? And what defines how that value maps to the requirements 
>>>>>> for the rest of the object? The draft doesn’t say anything about it at 
>>>>>> the moment, but we should choose the direction we want to go. On the 
>>>>>> surface, there are three main options:
>>>>>> 
>>>>>> 1) Require all values to be registered. 
>>>>>> 2) Require all values to be collision-resistant (eg, URIs).
>>>>>> 3) Require all values to be defined by the AS (and/or the RS’s that it 
>>>>>> protects).
>>>>>> 
>>>>>> Are there any other options?
>>>>>> 
>>>>>> Here are my thoughts on each approach:
>>>>>> 
>>>>>> 1) While it usually makes sense to register things for interoperability, 
>>>>>> this is a case where I think that a registry would actually hurt 
>>>>>> interoperability and adoption. Like a “scope” value, the RAR “type” is 
>>>>>> ultimately up to the AS and RS to interpret in their own context. We 
>>>>>> :want: people to define rich objects for their APIs and enable 
>>>>>> fine-grained access for their systems, and if they have to register 
>>>>>> something every time they come up with a new API to protect, it’s going 
>>>>>> to be an unmaintainable mess. I genuinely don’t think this would scale, 
>>>>>> and that most developers would just ignore the registry and do what they 
>>>>>> want anyway. And since many of these systems are inside domains, it’s 
>>>>>> completely unenforceable in practice.
>>>>>> 
>>>>>> 2) This seems reasonable, but it’s a bit of a nuisance to require 
>>>>>> everything to be a URI here. It’s long and ugly, and a lot of APIs are 
>>>>>> going to be internal to a given group, deployment, or ecosystem anyway. 
>>>>>> This makes sense when you’ve got something reusable across many 
>>>>>> deployments, like OIDC, but it’s overhead when what you’re doing is tied 
>>>>>> to your environment.
>>>>>> 
>>>>>> 3) This allows the AS and RS to define the request parameters for their 
>>>>>> APIs just like they do today with scopes. Since it’s always the 
>>>>>> combination of “this type :AT: this AS/RS”, name spacing is less of an 
>>>>>> issue across systems. We haven’t seen huge problems with scope value 
>>>>>> overlap in the wild; it does occur from time to time, but it’s more 
>>>>>> than manageable. A client isn’t going to just “speak RAR”, it’s going to 
>>>>>> be speaking RAR so that it can access something in particular.
>>>>>> 
>>>>>> And all that brings me to my proposal: 
>>>>>> 
>>>>>> 4) Require all values to be defined by the AS, and encourage 
>>>>>> specification developers to use URIs for collision resistance.
>>>>>> 
>>>>>> So officially in RAR, the AS would decide what “type” means, and nobody 
>>>>>> else. But we can also guide people who are developing general-purpose 
>>>>>> interoperable APIs to use URIs for their RAR “type” definitions. This 
>>>>>> would keep those interoperable APIs from stepping on each other, and 
>>>>>> from stepping on any locally-defined special “type” structure. But at 
>>>>>> the end of the day, the URI carries no more weight than just any other 
>>>>>> string, and the AS decides what it means and how it applies.
>>>>>> 
>>>>>> My argument is that this seems to have worked very, very well for 
>>>>>> scopes, and the RAR “type” is cut from similar descriptive cloth.
>>>>>> 
>>>>>> What does the rest of the group think? How should we manage the RAR 
>>>>>> “type” values and what they mean?
>>>>>> 
>>>>>>  — Justin
>>>>>> _______________________________________________
>>>>>> OAuth mailing list
>>>>>> OAuth@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/oauth
>>>>> 
>>>> 
>>> 
>> 
>

_______________________________________________
OAuth mailing list
OAuth@ietf.org
https://www.ietf.org/mailman/listinfo/oauth
