On Fri, Jun 24, 2022 at 11:12:05AM +0200, Morten Brørup wrote: > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > Sent: Friday, 24 June 2022 10.14 > > > > On Thu, Jun 23, 2022 at 09:04:31PM +0200, Morten Brørup wrote: > > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > > > Sent: Thursday, 23 June 2022 18.43 > > > > > > > > This RFC shows one possible approach for escaping strings for the > > json > > > > output of telemetry library. For now this RFC supports escaping > > strings > > > > for the cases of returning a single string, or returning an array > > of > > > > strings. Not done is escaping of strings in objs/dicts [see more > > below > > > > on TODO] > > > > > > Very good initiative. > > > > > > > > > > > As well as telemetry lib changes, this patchset includes unit tests > > for > > > > the above and also little bit of cleanup to the json tests. > > > > > > > > TODO: > > > > Beyond what is here in this RFC: > > > > > > > > 1. we need to decide what to do about name/value pairs. Personally, > > I > > > > think we should add the restriction to the > > "rte_tel_data_add_obj_*" > > > > APIs > > > > to only allow a defined subset of characters in names: e.g. > > > > alphanumeric > > > > chars, underscore and dash. That means that we only need to > > escape > > > > the data part in the case of string returns. > > > > > > I agree about only allowing a subset of characters in names, so JSON > > (and other) encoding is not required. > > > > > > However, I think we should be less restrictive, and also allow > > characters commonly used for separation, indexing and wildcard, such as > > '/', '[', ']', and '*', '?' or '%'. > > > > > > Obviously, we should disallow characters requiring escaping in not > > just JSON, but also other foreseeable encodings and protocols. So > > please bring your crystal ball to the discussion. ;-) > > > > > Exactly why I am looking for feedback - and why I'm looking to have an > > explicit allowed list of characters rather than trying to just block > > the > > known-bad in json ones. > > > > For your suggestions: +1 to separators and indexing, i.e. '[', ']' and > > '/', > > though I would probably also add ',' and maybe '.' (unless it's likely > > to > > cause issues with some protocol we are likely to want to use). > > After having slept on it, I think we should also allow characters that could > appear in IP and MAC addresses, i.e. '.' and ':' (and '/' for subnetting). > > > For the wildcarding, I find it hard to see why we would want those? > > Initially, I thought a wildcard might be useful as a placeholder in templates. > > But it might also be useful for partial IP or MAC addresses. E.g.: > - The SmartShare Systems OUI could be represented by the MAC address > "00:1F:B4:??:??:??". > - A default gateway address in a template configuration could be > "192.168.*.1". > > On the other hand, wildcard characters could be disallowed or require > escaping in other (non-JSON) protocols. > > So I'm just being a bit creative here, throwing out ideas in our search for > the right balance in the restrictions. >
I could see those characters certainly being needed in data values, but do you foresee them being required in the names of fields? > > > > The other advantage of using an allowlist of characters is that it > > makes it > > possible to expand over time, compared to a blocklist which always runs > > the > > risk of breaking something if you expand it. Therefore I suggest we > > keep > > the list as small as we need right now, and expand it only as we need. > > +1 > >From previous on-list discussion, I take it that SNMP is a possible target protocol you might have in mind. Any other protocols you can think of and what restrictions (if any) would SNMP or those other protocols add? /Bruce