An explanation of the issues in Unicode can be found here: https://en.wikipedia.org/wiki/Unicode_equivalence#Character_duplication
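As a rough illustration only (a sketch using Python's standard library, not anything proposed for the spec), the snippet below shows how two strings that render identically can fail an exact comparison, and how the scheme/host case differences in the URIs quoted below disappear after a simple case normalization. The sample string "café" and the helper names `nfc_equal` and `lower_scheme_and_host` are assumptions made up for this example.

```python
import unicodedata
from urllib.parse import urlsplit, urlunsplit

# Two spellings of the same visible text "café" (illustrative values).
composed = "caf\u00e9"     # é as a single code point, U+00E9
decomposed = "cafe\u0301"  # e followed by a combining acute accent, U+0301


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after Unicode NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def lower_scheme_and_host(uri: str) -> str:
    """Lowercase only the scheme and authority of a URI (hypothetical helper).

    Simplification: the whole authority is lowercased, so this assumes the URI
    carries no case-sensitive userinfo. Path, query and fragment are untouched.
    """
    parts = urlsplit(uri)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, parts.fragment)
    )


print(composed == decomposed)            # False: the code points differ
print(nfc_equal(composed, decomposed))   # True: equal after normalization
print(lower_scheme_and_host("HTTPS://schema.example.org/v1"))  # https://schema.example.org/v1
print(lower_scheme_and_host("https://SCHEMA.EXAMPLE.ORG/v1"))  # https://schema.example.org/v1
```

This covers only case normalization of the scheme and host. It does not remove a default port or a trailing slash, so “https://schema.example.org:443/v1/” would still compare as a different string.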
On Tue, Jul 21, 2020 at 10:03 AM Dick Hardt <dick.ha...@gmail.com> wrote:
>
> The following are the same URI, but are different strings:
>
> “https://schema.example.org/v1”
> “HTTPS://schema.example.org/v1”
> “https://SCHEMA.EXAMPLE.ORG/v1”
>
> Before comparing them to each other, they must be canonicalized so that they become the same string.
>
> From earlier in this thread, I am NOT suggesting that it must be a URI, nor that it is required:
>
> Since the type represents a much more complex object than a JWT claim, a client developer's tooling could pull down the JSON Schema (or some such) for a type used in their source code, and provide autocompletion and validation which would improve productivity and reduce errors. An AS that is using a defined type could use the schema for input validation. Neither of these would be at run time. JSON Schema allows comments and examples.
>
> What is the harm in non-normative language around a retrievable URI?
>
> On Tue, Jul 21, 2020 at 9:58 AM Justin Richer <jric...@mit.edu> wrote:
>
>> String comparison works just fine when the strings happen to be URIs, and you aren’t treating them as URIs:
>>
>> “https://schema.example.org/v1”
>>
>> Is different from
>>
>> “https://schema.example.org/v2”
>>
>> And both are different from
>>
>> “https://schema.example.org:443/v1/”
>>
>> All of these are strings, and the strings happen to be URIs but that’s irrelevant to the comparison process. Can you please help me understand why doing a string comparison on these values does not work in exactly the same way it would for “foo”, “bar”, and “baz” values? Why would these need to be canonicalized to be compared? The definition of a JSON string is an ordered set of unicode code points, and this can be compared byte-wise. (Or code-point-wise, whatever’s most correct here.)
>> Can you give me counter-examples as to where string comparison doesn’t work? And can you help me understand how this same worry doesn’t apply to all of the rest of the values in the RAR specification, which are also strings and will need to be compared?
>>
>> I’m still very confused as to the URI retrieval issue here, if there even is one. It sounds like we’re both saying that it could be useful if type values are retrievable when they’re URIs, but that would be something to augment a process and not required for the RAR spec. I’m against requiring the value to be a URI and against requiring the AS to process that URI *as a URI* at runtime. Anything that an AS wants to do with the “type” value, including providing additional tooling and validation, is up to the AS and outside of the spec.
>>
>> — Justin
>>
>> On Jul 21, 2020, at 12:35 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>
>> This statement:
>>
>> “compare two strings so that they’re exact”
>>
>> does not work for either Unicode or URIs. A string, and a canonicalized Unicode string are not the same thing. Similar for a URI. I have assumed you understand the canonicalization requirement, but it does not sound like you do. Would you like examples?
>>
>> wrt. the AS and URI, *you* keep saying that *I* said the AS would retrieve the URI. I HAVE NOT SAID THAT!
>>
>> I am suggesting that the URI MAY be retrievable, and I gave examples on how that would be useful for tooling for client developers, and for an AS in doing input validation. The URI would NOT be retrieved at run time.
>>
>> On Tue, Jul 21, 2020 at 7:35 AM Justin Richer <jric...@mit.edu> wrote:
>>
>>> If we treat all the strings as just strings, without any special internal format to be specified or detected, then comparing the strings is a well-understood and well-documented process.
>>> I also think that we shouldn’t invent anything here, so if there’s a better way to say “compare two strings so that they’re exact” then that’s what I mean. Sorry if that was unclear.
>>>
>>> I’m saying the AS should *not* retrieve the URI passed in the “type” value. You brought that up and then described the process that the AS would take to do so. I have said from the start that the use of a URI is for name spacing and not for addressing content to be fetched, so I’m confused why you think I intend otherwise.
>>>
>>> — Justin
>>>
>>> On Jul 20, 2020, at 2:59 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>
>>> Canonicalization of URIs and unicode is fairly well specified. I was not suggesting we invent anything there.
>>>
>>> A byte comparison, as you suggested earlier, will be problematic, as I have pointed out.
>>>
>>> I'm confused why you are still talking about the AS retrieving a URI.
>>>
>>> On Mon, Jul 20, 2020 at 4:42 AM Justin Richer <jric...@mit.edu> wrote:
>>>
>>>> Since this is a recommendation for namespace, we could also just say collision-resistant like JWT, and any of those examples are fine. But that said, I think there’s something particularly compelling about URIs since they have somewhat-human-readable portions. But again, I’m saying it should be a recommendation to API developers and not a requirement in the spec. In the spec, I argue that “type” should be a string, full stop.
>>>>
>>>> If documentation is so confusing that developers are typing in the wrong strings, then that’s bad documentation. And likely a bad choice for the “type” string on the part of the AS. You’d have the same problem with any other value the developer’s supposed to copy over. :)
>>>>
>>>> I agree that we should call out explicitly how they should be compared, and I propose we use one of the handful of existing string-comparison RFCs here instead of defining our own rules.
>>>>
>>>> While the type could be a dereferenceable URI, requiring action on the AS is really getting into distributed authorization policies. We tried doing that with UMA1’s scope structures and it didn’t work very well in practice (in my memory and experience). Someone could profile “type” on top of this if they wanted to do so, with support at the AS for that, but I don’t see a compelling reason for that to be a requirement as that’s a lot of complexity and a lot more error states (the fetch fails, or it doesn’t have a policy, or the policy’s in a format the AS doesn’t understand, or the AS doesn’t like the policy, etc).
>>>>
>>>> An AS is always free to implement its types in such a fashion, and that could make plenty of sense in a smaller ecosystem. And this is yet another reason that we define “type” as being a string to be interpreted and understood by the AS — so that an AS that wants to work this way can do so.
>>>>
>>>> — Justin
>>>>
>>>> PS: thanks for pointing out the error in the example in XYZ, I’ll fix that prior to publication.
>>>>
>>>> On Jul 18, 2020, at 8:58 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>
>>>> Justin: thanks for kindly pointing out which mail list this is.
>>>>
>>>> To clarify, public JWT claims are not just URIs, but any collision-resistant namespace:
>>>> "Examples of collision-resistant namespaces include: Domain Names, Object Identifiers (OIDs) as defined in the ITU-T X.660 and X.670 Recommendation series, and Universally Unique IDentifiers (UUIDs) [RFC4122]."
>>>>
>>>> I think letting the "type" be any JSON string and doing a byte-wise comparison will be problematic. A client developer will be reading documentation to learn what the types are, and typing it in. Given the wide set of whitespace characters, and unicode equivalence, different byte streams will all look the same, and a byte-wise comparison will fail.
>>>>
>>>> Similarly for URIs. If it is a valid URI, then a byte-wise comparison is not sufficient. Canonicalization is required.
>>>>
>>>> These are not showstopper issues, but the specification should call out how type strings are compared, and provide caveats to an AS developer.
>>>>
>>>> I have no idea why you would think the AS would retrieve a URL.
>>>>
>>>> Since the type represents a much more complex object than a JWT claim, a client developer's tooling could pull down the JSON Schema (or some such) for a type used in their source code, and provide autocompletion and validation which would improve productivity and reduce errors. An AS that is using a defined type could use the schema for input validation. Neither of these would be at run time. JSON Schema allows comments and examples.
>>>>
>>>> What is the harm in non-normative language around a retrievable URI?
>>>>
>>>> BTW: the example in https://oauth.xyz/draft-richer-transactional-authz#rfc.section.2 has not been updated with the "type" field.
>>>>
>>>> On Sat, Jul 18, 2020 at 8:10 AM Justin Richer <jric...@mit.edu> wrote:
>>>>
>>>>> Hi Dick,
>>>>>
>>>>> This is a discussion about the RAR specification on the OAuth list, and therefore doesn’t have anything to do with alignment with XAuth. In fact, I believe the alignment is the other way around, as doesn’t XAuth normatively reference RAR at this point? Even though, last I saw, it uses a different top-level structure for conveying things, I believe it does say to use the internal object structures. I am also a co-author on RAR and we had already defined a “type” field in RAR quite some time ago. You did notice that XYZ’s latest draft added this field to keep the two in alignment with each other, which has always been the goal since the initial proposal of the RAR work, but that’s a time lag and not a display of new intent.
>>>>>
>>>>> In any event, even though I think the decision has bearing in both places, this isn’t about GNAP. Working on RAR’s requirements has brought up this interesting issue of what should be in the type field for RAR in OAuth 2.
>>>>>
>>>>> I think that it should be defined as a string, and therefore compared as a byte value in all cases, regardless of what the content of the string is. I don’t think the AS should be expected to fetch a URI for anything. I don’t think the AS should normalize any of the inputs. I think that any JSON-friendly character set should be allowed (including spaces and unicodes), and since RAR already requires the JSON objects to be form-encoded, this shouldn’t cause additional trouble when adding them in to OAuth 2’s request structures.
>>>>>
>>>>> The idea of using a URI would be to get people out of each other’s namespaces. It’s similar to the concept of “public” vs “private” claims in JWT:
>>>>>
>>>>> https://tools.ietf.org/html/rfc7519#section-4.2
>>>>>
>>>>> What I’m proposing is that if you think it’s going to be a general-purpose type name, then we recommend you use a URI as your string. And beyond that, that’s it. It’s up to the AS to figure out what to do with it, and RAR stays out of it.
>>>>>
>>>>> — Justin
>>>>>
>>>>> On Jul 17, 2020, at 1:25 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>>
>>>>> Hey Justin, glad to see that you have aligned with the latest XAuth draft on a type property being required.
>>>>>
>>>>> I like the idea that the value of the type property is fully defined by the AS, which could delegate it to a common URI for reuse. This gets GNAP out of specifying access requests, and enables other parties to define access without any required coordination with IETF or IANA.
>>>>>
>>>>> A complication in mixing plain strings and URIs is the canonicalization.
>>>>> A plain string can be a fixed byte representation, but a URI requires canonicalization for comparison. Mixing the two requires URI detection at the AS before canonicalization, and an AS MUST do canonicalization of URIs.
>>>>>
>>>>> The URI is retrievable, it can provide machine and/or human readable documentation in JSON schema or some such, or any other content type. Once again, the details are out of scope of GNAP, but we can provide examples to guide implementers.
>>>>>
>>>>> Are you still thinking that bare strings are allowed in GNAP, and are defined by the AS?
>>>>>
>>>>> On Fri, Jul 17, 2020 at 8:39 AM Justin Richer <jric...@mit.edu> wrote:
>>>>>
>>>>>> The “type” field in the RAR spec serves an important purpose: it defines what goes in the rest of the object, including what other fields are available and what values are allowed for those fields. It provides an API-level definition for requesting access based on multiple dimensions, and that’s really powerful and flexible. Each type can use any of the general-purpose fields like “actions” and/or add its own fields as necessary, and the “type” parameter keeps everything well-defined.
>>>>>>
>>>>>> The question, then, is what defines what’s allowed to go into the “type” field itself? And what defines how that value maps to the requirements for the rest of the object? The draft doesn’t say anything about it at the moment, but we should choose the direction we want to go. On the surface, there are three main options:
>>>>>>
>>>>>> 1) Require all values to be registered.
>>>>>> 2) Require all values to be collision-resistant (eg, URIs).
>>>>>> 3) Require all values to be defined by the AS (and/or the RS’s that it protects).
>>>>>>
>>>>>> Are there any other options?
>>>>>>
>>>>>> Here are my thoughts on each approach:
>>>>>>
>>>>>> 1) While it usually makes sense to register things for interoperability, this is a case where I think that a registry would actually hurt interoperability and adoption. Like a “scope” value, the RAR “type” is ultimately up to the AS and RS to interpret in their own context. We :want: people to define rich objects for their APIs and enable fine-grained access for their systems, and if they have to register something every time they come up with a new API to protect, it’s going to be an unmaintainable mess. I genuinely don’t think this would scale, and that most developers would just ignore the registry and do what they want anyway. And since many of these systems are inside domains, it’s completely unenforceable in practice.
>>>>>>
>>>>>> 2) This seems reasonable, but it’s a bit of a nuisance to require everything to be a URI here. It’s long and ugly, and a lot of APIs are going to be internal to a given group, deployment, or ecosystem anyway. This makes sense when you’ve got something reusable across many deployments, like OIDC, but it’s overhead when what you’re doing is tied to your environment.
>>>>>>
>>>>>> 3) This allows the AS and RS to define the request parameters for their APIs just like they do today with scopes. Since it’s always the combination of “this type :AT: this AS/RS”, name spacing is less of an issue across systems. We haven’t seen huge problems in scope value overlap in the wild, though it does occur from time to time it’s more than manageable. A client isn’t going to just “speak RAR”, it’s going to be speaking RAR so that it can access something in particular.
>>>>>>
>>>>>> And all that brings me to my proposal:
>>>>>>
>>>>>> 4) Require all values to be defined by the AS, and encourage specification developers to use URIs for collision resistance.
>>>>>>
>>>>>> So officially in RAR, the AS would decide what “type” means, and nobody else. But we can also guide people who are developing general-purpose interoperable APIs to use URIs for their RAR “type” definitions. This would keep those interoperable APIs from stepping on each other, and from stepping on any locally-defined special “type” structure. But at the end of the day, the URI carries no more weight than just any other string, and the AS decides what it means and how it applies.
>>>>>>
>>>>>> My argument is that this seems to have worked very, very well for scopes, and the RAR “type” is cut from similar descriptive cloth.
>>>>>>
>>>>>> What does the rest of the group think? How should we manage the RAR “type” values and what they mean?
>>>>>>
>>>>>> — Justin
>>>>>> _______________________________________________
>>>>>> OAuth mailing list
>>>>>> OAuth@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/oauth