Benjamin Kaduk has entered the following ballot position for draft-ietf-regext-rdap-sorting-and-paging-17: Discuss
When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-regext-rdap-sorting-and-paging/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Should we say something about which order the sorting criteria are
applied (first to last vs last to first) when multiple sortItems are
specified in a query?

I recognize that in the HATEOAS model, the actual JSONPaths reported by
the server should be used by the client to determine what a given sort
property does, but it also seems like it would be confusing for this
document to specify (e.g.) an "email" property with specific JSONPath,
and then have a server go off and use "email" to mean something else,
even if that is just the addition of "pref" as discussed at the end of
Section 2.3.1. Do we want to try to have the properties defined by this
document be universally defined and encourage the use of new/different
property names for variations on them? (The answer may well be "no", but
the answer is not intuitively clear to me.)

To put it another way, is the list in Section 2.3.1 normative, or just
an example?

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1

   However, there are some drawbacks associated with the use of the HTTP
   header. First, the header properties cannot be set directly from a
   web browser. Moreover, in an HTTP session, the information on the
   status (i.e.
   the session identifier) is usually inserted in the header or a
   cookie, while the information on the resource identification or the
   search type is included in the query string. The second approach is
   therefore not compliant with the HTTP standard [RFC7230]. As a
   result, this document describes a specification based on the use of
   query parameters.

A few more words (section number from 7230?) on why the second approach
is not compliant with HTTP might help the reader, though it isn't
strictly necessary (we're not using it, after all).

Section 2.1

   *  "jsonPath": "String" (OPTIONAL) the JSONPath of the RDAP field
      corresponding to the property;

What is this path relative to? (Does the client have to know from the
other context what type of object it refers to?)

   *  "links": "Link[]" (OPTIONAL) an array of links as described in
      [RFC8288] containing the query string that applies the sort
      criterion.

Just to check: this is going to have the same structure for a Link
object that draft-ietf-regext-rdap-partial-response does? (I am not
coming up with a great way to deduplicate the definitions, off the top
of my head.)

   o  "pageSize": "Numeric" (OPTIONAL) a numeric value representing the
      number of objects returned in the current page. It MUST be
      provided if and only if the total number of objects exceeds the
      page size. This property is redundant for RDAP clients because
      the page size can be derived from the length of the search
      results array but, it can be helpful if the end user interacts
      with the server through a web browser;

If it's redundant, we should probably say something about error handling
for when the things that are supposed to be identical have different
values.

Section 2.3

   Except for sorting IP addresses, servers MUST implement sorting
   according to the JSON value type of the RDAP field the sorting
   property refers to. That is, JSON strings MUST be sorted
   lexicographically and JSON numbers MUST be sorted numerically.
   If IP addresses are represented as JSON strings, they MUST be sorted
   based on their numeric conversion.

There are more JSON types than string and number; are those other types
guaranteed to not appear in sortable RDAP fields? (I can't see how such
a guarantee could be made, given that servers can define their own
sorting properties.)

   If the "sort" parameter reports an allowed sorting property, it MUST
   be provided in the "currentSort" field of the "sorting_metadata"
   element.

nit: is "reports" the best word to describe this behavior (which, IIUC,
is "present in the query component of the request URL")?

Section 2.3.1

   In the "sort" parameter ABNF syntax, property-ref represents a
   reference to a property of an RDAP object. Such a reference could be
   expressed by using a JSONPath. The JSONPath in a JSON document

nit: is there a missing word here ("a JSONPath expression")?

   o  Note that some of the object specific properties are also defined
      as query paths. The object specific properties include:

nit: the list structure in this item does not seem parallel to the
structure of the first item.

      as two representations of the same value. By default, the
      unicodeName value MUST be used while sorting. When the
      unicodeName is unavailable, the value of the ldhName MUST be used
      instead;

I'm not entirely sure how much value "by default" adds here. Would the
meaning be different if we said "The unicodeName value MUST be used
while sorting if it is present; when the unicodeName is unavailable, the
value of the ldhName is used instead"?

   o  The jCard "sort-as" parameter MUST be ignored for the sorting
      capability described in this document;

It's a little bit of a juxtaposition to refer to jCard here in the prose
but vcard in the table.

   o  Even if a nameserver can have multiple IPv4 and IPv6 addresses,
      the most common configuration includes one address for each IP
      version. Therefore, the assumption of having a single IPv4 and/or
      IPv6 value for a nameserver cannot be considered too stringent.

I disagree with the flat assertion that it "cannot be considered too
stringent". It can be so considered, as a matter of difference of
opinion; what is appropriate to do here is to say that this
document/protocol makes the assumption (especially since we go on to
describe the exception-handling procedure when the assumption is
violated).

   o  Multiple events with a given action on an object might be
      returned. If this occurs, sorting MUST be applied to the most
      recent event;

This makes a lot of sense as the default and I don't propose changing it
now, but I do wonder how hard it would be to add support later for
sorting on (say) the oldest event instead.

   The "jsonPath" field in the "sorting_metadata" element is used to
   clarify the RDAP field the sorting property refers to. The mapping
   between the sorting properties and the JSONPaths of the RDAP fields
   is shown below:
   [...]
         name                 $.domainSearchResults[*].unicodeName

This seems to ignore the subtlety regarding unicodeName vs ldhName. Is
there a way it could be expressed in JSONPath?

   o  Nameserver

         name                 $.domainSearchResults[*].unicodeName

Presumably this is supposed to be nameserverSearchResults?

Section 2.4

I think we want another introductory paragraph like:

% The cursor parameter is used by the server to preserve information
% about the pagination state of a given query's results across calls to
% the search API, so that successive requests by the client can return
% page N, N+1, N+2, etc. Its value is only required to be interpretable
% by the server and could be implemented, for example, as an opaque
% database lookup key. If a server does use a method for generating
% cursor values that involves internal structure, such as the one
% described below, the server needs to recognize that the value supplied
% by a client could have been modified (maliciously), and implement
% appropriate bounds-checking and similar measures when parsing received
% values.
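
As a rough illustration of the bounds-checking point in the suggested
paragraph above, here is a minimal Python sketch, not a definitive
implementation. It assumes a cursor that carries an offset and a limit
as JSON, protected by an HMAC tag so that tampering is detected before
the payload is parsed. All names (encode_cursor, decode_cursor,
SECRET_KEY, MAX_OFFSET, MAX_LIMIT) and the JSON layout are hypothetical
and not taken from the draft; a server could equally use an opaque
database lookup key.

```python
import base64
import hashlib
import hmac
import json
from typing import Tuple

# Hypothetical values for illustration only; a real server would load
# a private key from configuration and pick its own limits.
SECRET_KEY = b"example-only-key"
MAX_OFFSET = 1_000_000
MAX_LIMIT = 100
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def encode_cursor(offset: int, limit: int) -> str:
    """Serialize the pagination state and prepend an HMAC-SHA256 tag."""
    payload = json.dumps({"offset": offset, "limit": limit}).encode("utf-8")
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + payload).decode("ascii")


def decode_cursor(cursor: str) -> Tuple[int, int]:
    """Verify and parse a client-supplied cursor; raise ValueError if bad."""
    try:
        raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    except ValueError as exc:  # covers binascii.Error and encoding errors
        raise ValueError("malformed cursor") from exc

    tag, payload = raw[:TAG_LEN], raw[TAG_LEN:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # Constant-time comparison; reject before touching the payload.
    if len(tag) != TAG_LEN or not hmac.compare_digest(tag, expected):
        raise ValueError("cursor failed integrity check")

    try:
        data = json.loads(payload)
        offset, limit = data["offset"], data["limit"]
    except (ValueError, KeyError, TypeError) as exc:
        raise ValueError("malformed cursor payload") from exc

    # Bounds-check even authenticated values (defense in depth).
    if type(offset) is not int or type(limit) is not int:
        raise ValueError("cursor fields must be integers")
    if not (0 <= offset <= MAX_OFFSET and 1 <= limit <= MAX_LIMIT):
        raise ValueError("cursor out of bounds")
    return offset, limit
```

A server following this sketch would answer any ValueError from
decode_cursor with an error response rather than passing the values on
to the backend query.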

The current wording strongly suggests that base64-encoding a meaningful
value that the client could inspect or even construct is required, and I
do not think that is very maintainable or what was intended, given the
current second paragraph ("servers can change the method over time
without announcing anything to clients").

(side note) I'm also pretty partial to the way JMAP discusses returning
(paginated, but non-uniformly) changes to a given data stream, e.g., at
https://www.rfc-editor.org/rfc/rfc8620.html#section-5.2 -- any given
state is named, and you can get "stuff starting at <named state>" and
the name to use for the state as of the current reply.

Section 4

If the server doesn't have access to an efficient (e.g.) counting
operation on the backend, would we recommend that the server not support
sorting/pagination, since there's not much benefit from having the
server pull up all the results and count them just to be able to return
the total count value back to the client, and then go do the same work
again when the client asks for the next page of results?

Section 7

I suggest noting that (encoded) structured "cursor" values present a new
attack surface on the server that needs to be protected.

   results in a response. However, this last security policy can result
   in a higher inefficiency if the RDAP server does not provide any
   functionality to return the truncated results.

I'm not sure I understand (or agree with) this last sentence -- it seems
that unilateral silent truncation of results by the server leads to not
just inefficiency but also potential security considerations in its own
right, with the client not knowing that it has incomplete results. Also,
if the server is truncating the results, by definition it "has
functionality to return the truncated results" -- that's what it's
doing! So I assume the intent was to say something about negotiating or
indicating that the results are truncated, not actually doing the
truncation.

   The new parameters presented in this document provide RDAP operators
   with a way to implement a server that reduces inefficiency risks.

[same question about "inefficiency" being the right word]

Appendix B

   o  It does not allow direct navigation to arbitrary pages because
      the result set must be scrolled in sequential order starting from
      the initial page;

(side note) I didn't follow the references, so maybe this was covered
there, but I don't quite follow why direct navigation is impossible. If
you use a key field for seeking, can't you just start in the middle from
some known value for that key field?

Appendix C.2

   total count. Therefore, as "totalCount" is an optional response
   information, fetching always the total number of rows has been

I'm not entirely sure in what sense "optional response information" is
intended -- my reading of Section 2.1 is that it's mandatory to return
totalCount if the client included the 'count' query parameter.

_______________________________________________
regext mailing list
regext@ietf.org
https://www.ietf.org/mailman/listinfo/regext