I’m suggesting that API designers avoid using such glyphs in their “type” values if they want to avoid such human-copy errors, like they would need to do for most other strings in their system. If that means they stick to ASCII or put a note on the developer page that says “hey copy and paste this value, don’t try to re-type it” or whatever, that’s up to the AS.
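As a rough illustration of the human-copy error in question, here is a small Python sketch. The "type" values and helper names are made up for the example and are not taken from RAR or any draft; it only assumes that the AS compares "type" values strictly, code point by code point.

```python
import unicodedata

# Hypothetical "type" value as an AS might publish it in its documentation,
# using the precomposed code point U+00E9 for the accented "e".
DOCUMENTED_TYPE = "caf\u00e9-payments"

# The same glyphs as a developer's editor might produce when re-typing:
# a plain "e" followed by U+0301 COMBINING ACUTE ACCENT.
RETYPED_TYPE = "cafe\u0301-payments"


def types_match(expected: str, received: str) -> bool:
    """Strict comparison: code point by code point, no normalization."""
    return expected == received


def would_match_after_nfc(expected: str, received: str) -> bool:
    """Diagnostic only: do the values agree once both are NFC-normalized?"""
    nfc_expected = unicodedata.normalize("NFC", expected)
    nfc_received = unicodedata.normalize("NFC", received)
    return nfc_expected == nfc_received


# The two values render identically but are different strings.
print(types_match(DOCUMENTED_TYPE, RETYPED_TYPE))
# A normalization check shows why the strict comparison rejected the value.
print(would_match_after_nfc(DOCUMENTED_TYPE, RETYPED_TYPE))
# An ASCII-only value leaves no room for this class of mistake.
print(types_match("payment_initiation", "payment_initiation"))
```

In this sketch the normalization helper is only a debugging aid for whoever writes the documentation or error messages; the matching itself stays strict.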
You’d have the same kind of issue around “similar-looking” characters, like the semicolon vs. the Greek question mark. Should the AS look for those and try to “fix” the inputs? I would argue not: the AS should be strict in matching these values because it could have security implications. This isn’t a problem unique to RAR, or OAuth for that matter. We can, and I think should, add guidance to the RAR document for all of these points.

— Justin

> On Jul 21, 2020, at 1:55 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>
> In Unicode, a glyph can be represented by more than one code point. When reading the docs and entering a value, the developer will not know which code point the AS intended.
>
> Are you suggesting that AS documentation would have the bytes rather than glyphs? Or not use glyphs that have multiple code points? Or that they only use English?
>
> On Tue, Jul 21, 2020 at 10:34 AM Justin Richer <jric...@mit.edu> wrote:
> Right, and I’m saying that all three of those would be DIFFERENT “type” values, because they’re different strings. The fact that when treated as URIs they would be equivalent is irrelevant. Just like “foo”, “Foo”, and “FOO” would be different “type” values, per the spec. Nothing is stopping an AS from treating them as equivalent internally, but that seems a bit dangerous to me. I’d love to see a formal breakdown of that, though.
>
> As for the Unicode example, if we define things as using byte comparisons, then that becomes an issue for proper documentation and configuration — and again, probably a good place to have recommendations for picking type value strings so as to avoid such problems.
>
> In short, I don’t think we should have any requirements on canonicalization for these values.
>
> — Justin
>
>> On Jul 21, 2020, at 1:03 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>
>> The following are the same URI, but are different strings:
>>
>> “https://schema.example.org/v1”
>> “HTTPS://schema.example.org/v1”
>> “https://SCHEMA.EXAMPLE.ORG/v1”
>>
>> Before comparing them to each other, they must be canonicalized so that they become the same string.
>>
>> From earlier in this thread, I am NOT suggesting that it must be a URI, nor that it is required:
>>
>> Since the type represents a much more complex object than a JWT claim, a client developer's tooling could pull down the JSON Schema (or some such) for a type used in their source code, and provide autocompletion and validation which would improve productivity and reduce errors. An AS that is using a defined type could use the schema for input validation. Neither of these would be at run time. JSON Schema allows comments and examples.
>>
>> What is the harm in non-normative language around a retrievable URI?
>>
>> On Tue, Jul 21, 2020 at 9:58 AM Justin Richer <jric...@mit.edu> wrote:
>> String comparison works just fine when the strings happen to be URIs, and you aren’t treating them as URIs:
>>
>> “https://schema.example.org/v1”
>>
>> Is different from
>>
>> “https://schema.example.org/v2”
>>
>> And both are different from
>>
>> “https://schema.example.org:443/v1/”
>>
>> All of these are strings, and the strings happen to be URIs but that’s irrelevant to the comparison process. Can you please help me understand why doing a string comparison on these values does not work in exactly the same way it would for “foo”, “bar”, and “baz” values? Why would these need to be canonicalized to be compared?
>> The definition of a JSON string is an ordered set of Unicode code points, and this can be compared byte-wise. (Or code-point-wise, whatever’s most correct here.) Can you give me counter-examples as to where string comparison doesn’t work? And can you help me understand how this same worry doesn’t apply to all of the rest of the values in the RAR specification, which are also strings and will need to be compared?
>>
>> I’m still very confused as to the URI retrieval issue here, if there even is one. It sounds like we’re both saying that it could be useful if type values are retrievable when they’re URIs, but that would be something to augment a process and not required for the RAR spec. I’m against requiring the value to be a URI and against requiring the AS to process that URI as a URI at runtime. Anything that an AS wants to do with the “type” value, including providing additional tooling and validation, is up to the AS and outside of the spec.
>>
>> — Justin
>>
>>> On Jul 21, 2020, at 12:35 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>
>>> This statement:
>>>
>>> “compare two strings so that they’re exact”
>>>
>>> does not work for either Unicode or URIs. A string, and a canonicalized Unicode string are not the same thing. Similar for a URI. I have assumed you understand the canonicalization requirement, but it does not sound like you do. Would you like examples?
>>>
>>> wrt. the AS and URI, *you* keep saying that *I* said the AS would retrieve the URI. I HAVE NOT SAID THAT!
>>>
>>> I am suggesting that the URI MAY be retrievable, and I gave examples on how that would be useful for tooling for client developers, and for an AS in doing input validation. The URI would NOT be retrieved at run time.
>>>
>>> On Tue, Jul 21, 2020 at 7:35 AM Justin Richer <jric...@mit.edu> wrote:
>>> If we treat all the strings as just strings, without any special internal format to be specified or detected, then comparing the strings is a well-understood and well-documented process. I also think that we shouldn’t invent anything here, so if there’s a better way to say “compare two strings so that they’re exact” then that’s what I mean. Sorry if that was unclear.
>>>
>>> I’m saying the AS should not retrieve the URI passed in the “type” value. You brought that up and then described the process that the AS would take to do so. I have said from the start that the use of a URI is for name spacing and not for addressing content to be fetched, so I’m confused why you think I intend otherwise.
>>>
>>> — Justin
>>>
>>>> On Jul 20, 2020, at 2:59 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>
>>>> Canonicalization of URIs and Unicode is fairly well specified. I was not suggesting we invent anything there.
>>>>
>>>> A byte comparison, as you suggested earlier, will be problematic, as I have pointed out.
>>>>
>>>> I'm confused why you are still talking about the AS retrieving a URI.
>>>>
>>>> On Mon, Jul 20, 2020 at 4:42 AM Justin Richer <jric...@mit.edu> wrote:
>>>> Since this is a recommendation for namespace, we could also just say collision-resistant like JWT, and any of those examples are fine. But that said, I think there’s something particularly compelling about URIs since they have somewhat-human-readable portions. But again, I’m saying it should be a recommendation to API developers and not a requirement in the spec. In the spec, I argue that “type” should be a string, full stop.
>>>>
>>>> If documentation is so confusing that developers are typing in the wrong strings, then that’s bad documentation. And likely a bad choice for the “type” string on the part of the AS. You’d have the same problem with any other value the developer’s supposed to copy over. :)
>>>>
>>>> I agree that we should call out explicitly how they should be compared, and I propose we use one of the handful of existing string-comparison RFCs here instead of defining our own rules.
>>>>
>>>> While the type could be a dereferenceable URI, requiring action on the AS is really getting into distributed authorization policies. We tried doing that with UMA1’s scope structures and it didn’t work very well in practice (in my memory and experience). Someone could profile “type” on top of this if they wanted to do so, with support at the AS for that, but I don’t see a compelling reason for that to be a requirement as that’s a lot of complexity and a lot more error states (the fetch fails, or it doesn’t have a policy, or the policy’s in a format the AS doesn’t understand, or the AS doesn’t like the policy, etc).
>>>>
>>>> An AS is always free to implement its types in such a fashion, and that could make plenty of sense in a smaller ecosystem. And this is yet another reason that we define “type” as being a string to be interpreted and understood by the AS — so that an AS that wants to work this way can do so.
>>>>
>>>> — Justin
>>>>
>>>> PS: thanks for pointing out the error in the example in XYZ, I’ll fix that prior to publication.
>>>>
>>>>> On Jul 18, 2020, at 8:58 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>>
>>>>> Justin: thanks for kindly pointing out which mail list this is.
>>>>>
>>>>> To clarify, public JWT claims are not just URIs, but any collision-resistant namespace:
>>>>> "Examples of collision-resistant namespaces include: Domain Names, Object Identifiers (OIDs) as defined in the ITU-T X.660 and X.670 Recommendation series, and Universally Unique IDentifiers (UUIDs) [RFC4122]."
>>>>>
>>>>> I think letting the "type" be any JSON string and doing a byte-wise comparison will be problematic. A client developer will be reading documentation to learn what the types are, and typing it in. Given the wide set of whitespace characters, and Unicode equivalence, different byte streams will all look the same, and a byte-wise comparison will fail.
>>>>>
>>>>> Similarly for URIs. If it is a valid URI, then a byte-wise comparison is not sufficient. Canonicalization is required.
>>>>>
>>>>> These are not showstopper issues, but the specification should call out how type strings are compared, and provide caveats to an AS developer.
>>>>>
>>>>> I have no idea why you would think the AS would retrieve a URL.
>>>>>
>>>>> Since the type represents a much more complex object than a JWT claim, a client developer's tooling could pull down the JSON Schema (or some such) for a type used in their source code, and provide autocompletion and validation which would improve productivity and reduce errors. An AS that is using a defined type could use the schema for input validation. Neither of these would be at run time. JSON Schema allows comments and examples.
>>>>>
>>>>> What is the harm in non-normative language around a retrievable URI?
>>>>>
>>>>> BTW: the example in https://oauth.xyz/draft-richer-transactional-authz#rfc.section.2 has not been updated with the "type" field.
>>>>>
>>>>> On Sat, Jul 18, 2020 at 8:10 AM Justin Richer <jric...@mit.edu> wrote:
>>>>> Hi Dick,
>>>>>
>>>>> This is a discussion about the RAR specification on the OAuth list, and therefore doesn’t have anything to do with alignment with XAuth. In fact, I believe the alignment is the other way around, as doesn’t XAuth normatively reference RAR at this point? Even though, last I saw, it uses a different top-level structure for conveying things, I believe it does say to use the internal object structures. I am also a co-author on RAR and we had already defined a “type” field in RAR quite some time ago. You did notice that XYZ’s latest draft added this field to keep the two in alignment with each other, which has always been the goal since the initial proposal of the RAR work, but that’s a time lag and not a display of new intent.
>>>>>
>>>>> In any event, even though I think the decision has bearing in both places, this isn’t about GNAP. Working on RAR’s requirements has brought up this interesting issue of what should be in the type field for RAR in OAuth 2.
>>>>>
>>>>> I think that it should be defined as a string, and therefore compared as a byte value in all cases, regardless of what the content of the string is. I don’t think the AS should be expected to fetch a URI for anything. I don’t think the AS should normalize any of the inputs. I think that any JSON-friendly character set should be allowed (including spaces and Unicode characters), and since RAR already requires the JSON objects to be form-encoded, this shouldn’t cause additional trouble when adding them in to OAuth 2’s request structures.
>>>>>
>>>>> The idea of using a URI would be to get people out of each other’s namespaces.
>>>>> It’s similar to the concept of “public” vs “private” claims in JWT:
>>>>>
>>>>> https://tools.ietf.org/html/rfc7519#section-4.2
>>>>>
>>>>> What I’m proposing is that if you think it’s going to be a general-purpose type name, then we recommend you use a URI as your string. And beyond that, that’s it. It’s up to the AS to figure out what to do with it, and RAR stays out of it.
>>>>>
>>>>> — Justin
>>>>>
>>>>>> On Jul 17, 2020, at 1:25 PM, Dick Hardt <dick.ha...@gmail.com> wrote:
>>>>>>
>>>>>> Hey Justin, glad to see that you have aligned with the latest XAuth draft on a type property being required.
>>>>>>
>>>>>> I like the idea that the value of the type property is fully defined by the AS, which could delegate it to a common URI for reuse. This gets GNAP out of specifying access requests, and enables other parties to define access without any required coordination with IETF or IANA.
>>>>>>
>>>>>> A complication in mixing plain strings and URIs is the canonicalization. A plain string can be a fixed byte representation, but a URI requires canonicalization for comparison. Mixing the two requires URI detection at the AS before canonicalization, and an AS MUST do canonicalization of URIs.
>>>>>>
>>>>>> The URI is retrievable, it can provide machine and/or human readable documentation in JSON schema or some such, or any other content type. Once again, the details are out of scope of GNAP, but we can provide examples to guide implementers.
>>>>>>
>>>>>> Are you still thinking that bare strings are allowed in GNAP, and are defined by the AS?
>>>>>>
>>>>>> On Fri, Jul 17, 2020 at 8:39 AM Justin Richer <jric...@mit.edu> wrote:
>>>>>> The “type” field in the RAR spec serves an important purpose: it defines what goes in the rest of the object, including what other fields are available and what values are allowed for those fields. It provides an API-level definition for requesting access based on multiple dimensions, and that’s really powerful and flexible. Each type can use any of the general-purpose fields like “actions” and/or add its own fields as necessary, and the “type” parameter keeps everything well-defined.
>>>>>>
>>>>>> The question, then, is what defines what’s allowed to go into the “type” field itself? And what defines how that value maps to the requirements for the rest of the object? The draft doesn’t say anything about it at the moment, but we should choose the direction we want to go. On the surface, there are three main options:
>>>>>>
>>>>>> 1) Require all values to be registered.
>>>>>> 2) Require all values to be collision-resistant (e.g., URIs).
>>>>>> 3) Require all values to be defined by the AS (and/or the RS’s that it protects).
>>>>>>
>>>>>> Are there any other options?
>>>>>>
>>>>>> Here are my thoughts on each approach:
>>>>>>
>>>>>> 1) While it usually makes sense to register things for interoperability, this is a case where I think that a registry would actually hurt interoperability and adoption. Like a “scope” value, the RAR “type” is ultimately up to the AS and RS to interpret in their own context. We :want: people to define rich objects for their APIs and enable fine-grained access for their systems, and if they have to register something every time they come up with a new API to protect, it’s going to be an unmaintainable mess.
>>>>>> I genuinely don’t think this would scale, and that most developers would just ignore the registry and do what they want anyway. And since many of these systems are inside domains, it’s completely unenforceable in practice.
>>>>>>
>>>>>> 2) This seems reasonable, but it’s a bit of a nuisance to require everything to be a URI here. It’s long and ugly, and a lot of APIs are going to be internal to a given group, deployment, or ecosystem anyway. This makes sense when you’ve got something reusable across many deployments, like OIDC, but it’s overhead when what you’re doing is tied to your environment.
>>>>>>
>>>>>> 3) This allows the AS and RS to define the request parameters for their APIs just like they do today with scopes. Since it’s always the combination of “this type :AT: this AS/RS”, name spacing is less of an issue across systems. We haven’t seen huge problems in scope value overlap in the wild; though it does occur from time to time, it’s more than manageable. A client isn’t going to just “speak RAR”, it’s going to be speaking RAR so that it can access something in particular.
>>>>>>
>>>>>> And all that brings me to my proposal:
>>>>>>
>>>>>> 4) Require all values to be defined by the AS, and encourage specification developers to use URIs for collision resistance.
>>>>>>
>>>>>> So officially in RAR, the AS would decide what “type” means, and nobody else. But we can also guide people who are developing general-purpose interoperable APIs to use URIs for their RAR “type” definitions. This would keep those interoperable APIs from stepping on each other, and from stepping on any locally-defined special “type” structure. But at the end of the day, the URI carries no more weight than just any other string, and the AS decides what it means and how it applies.
>>>>>>
>>>>>> My argument is that this seems to have worked very, very well for scopes, and the RAR “type” is cut from similar descriptive cloth.
>>>>>>
>>>>>> What does the rest of the group think? How should we manage the RAR “type” values and what they mean?
>>>>>>
>>>>>> — Justin
>>>>>> _______________________________________________
>>>>>> OAuth mailing list
>>>>>> OAuth@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/oauth