[RFC PATCH 0/6] add json string escaping to telemetry
    Bruce Richardson 
    bruce.richardson at intel.com
       
    Fri Jun 24 11:17:04 CEST 2022
    
    
  
On Fri, Jun 24, 2022 at 11:12:05AM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > Sent: Friday, 24 June 2022 10.14
> > 
> > On Thu, Jun 23, 2022 at 09:04:31PM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > > > Sent: Thursday, 23 June 2022 18.43
> > > >
> > > > This RFC shows one possible approach for escaping strings for the
> > json
> > > > output of telemetry library. For now this RFC supports escaping
> > strings
> > > > for the cases of returning a single string, or returning an array
> > of
> > > > strings. Not done is escaping of strings in objs/dicts [see more
> > below
> > > > on TODO]
> > >
> > > Very good initiative.
> > >
> > > >
> > > > As well as telemetry lib changes, this patchset includes unit tests
> > for
> > > > the above and also little bit of cleanup to the json tests.
> > > >
> > > > TODO:
> > > > Beyond what is here in this RFC:
> > > >
> > > > 1. we need to decide what to do about name/value pairs. Personally,
> > I
> > > >    think we should add the restriction to the
> > "rte_tel_data_add_obj_*"
> > > > APIs
> > > >    to only allow a defined subset of characters in names: e.g.
> > > > alphanumeric
> > > >    chars, underscore and dash. That means that we only need to
> > escape
> > > >    the data part in the case of string returns.
> > >
> > > I agree about only allowing a subset of characters in names, so JSON
> > (and other) encoding is not required.
> > >
> > > However, I think we should be less restrictive, and also allow
> > characters commonly used for separation, indexing and wildcard, such as
> > '/', '[', ']', and '*', '?' or '%'.
> > >
> > > Obviously, we should disallow characters requiring escaping in not
> > just JSON, but also other foreseeable encodings and protocols. So
> > please bring your crystal ball to the discussion. ;-)
> > >
> > Exactly why I am looking for feedback - and why I'm looking to have an
> > explicit allowed list of characters rather than trying to just block
> > the
> > known-bad in json ones.
> > 
> > For your suggestions: +1 to separators and indexing, i.e. '[', ']' and
> > '/',
> > though I would probably also add ',' and maybe '.' (unless it's likely
> > to
> > cause issues with some protocol we are likely to want to use).
> 
> After having slept on it, I think we should also allow characters that could appear in IP and MAC addresses, i.e. '.' and ':' (and '/' for subnetting).
> 
> > For the wildcarding, I find it hard to see why we would want those?
> 
> Initially, I thought a wildcard might be useful as a placeholder in templates.
> 
> But it might also be useful for partial IP or MAC addresses. E.g.:
> - The SmartShare Systems OUI could be represented by the MAC address "00:1F:B4:??:??:??".
> - A default gateway address in a template configuration could be "192.168.*.1".
> 
> On the other hand, wildcard characters could be disallowed or require escaping in other (non-JSON) protocols.
> 
> So I'm just being a bit creative here, throwing out ideas in our search for the right balance in the restrictions.
> 
I could see those characters certainly being needed in data values, but do
you foresee them being required in the names of fields?
> > 
> > The other advantage of using an allowlist of characters is that it
> > makes it
> > possible to expand over time, compared to a blocklist which always runs
> > the
> > risk of breaking something if you expand it. Therefore I suggest we
> > keep
> > the list as small as we need right now, and expand it only as we need.
> 
> +1
>
>From previous on-list discussion, I take it that SNMP is a possible target
protocol you might have in mind. Any other protocols you can think of and
what restrictions (if any) would SNMP or those other protocols add?
/Bruce 
    
    
More information about the dev
mailing list