[PATCH v2 1/5] telemetry: escape special char when tel string
Bruce Richardson
bruce.richardson at intel.com
Wed Jun 22 11:19:48 CEST 2022
On Wed, Jun 22, 2022 at 08:57:43AM +0100, Power, Ciara wrote:
> Hi folks,
>
> > -----Original Message-----
> > From: Morten Brørup <mb at smartsharesystems.com>
> > Sent: Saturday 18 June 2022 10:59
> > To: fengchengwen <fengchengwen at huawei.com>; Stephen Hemminger
> > <stephen at networkplumber.org>; Richardson, Bruce
> > <bruce.richardson at intel.com>
> > Cc: thomas at monjalon.net; ferruh.yigit at xilinx.com; Laatz, Kevin
> > <kevin.laatz at intel.com>; andrew.rybchenko at oktetlabs.ru;
> > jerinj at marvell.com; sachin.saxena at oss.nxp.com;
> > hemant.agrawal at nxp.com; dev at dpdk.org; Power, Ciara
> > <ciara.power at intel.com>
> > Subject: RE: [PATCH v2 1/5] telemetry: escape special char when tel string
> >
> > +CC: Ciara Power, Telemetry library maintainer
> >
> > > From: fengchengwen [mailto:fengchengwen at huawei.com]
> > > Sent: Saturday, 18 June 2022 05.52
> > >
> > > On 2022/6/18 1:05, Stephen Hemminger wrote:
> > > > On Fri, 17 Jun 2022 12:25:04 +0100
> > > > Bruce Richardson <bruce.richardson at intel.com> wrote:
> > > >
> > > >> On Fri, Jun 17, 2022 at 01:16:08PM +0200, Morten Brørup wrote:
> > > >>>> From: Chengwen Feng [mailto:fengchengwen at huawei.com]
> > > >>>> Sent: Friday, 17 June 2022 11.46
> > > >>>>
> > > > >>>> This patch supports escaping special characters (including:
> > > > >>>> \",\\,/,\b,\f,\n,\r,\t) in telemetry strings.
> > > > >>>> This patch is used to support telemetry xxx-dump commands, whose
> > > > >>>> strings may include special characters.
> > > >>>>
> > > >>>> Signed-off-by: Chengwen Feng <fengchengwen at huawei.com>
> > > >>>> ---
> > > > >>>>  lib/telemetry/telemetry.c | 96 +++++++++++++++++++++++++++++++++++++--
> > > > >>>>  1 file changed, 93 insertions(+), 3 deletions(-)
> > > > >>>>
> > > > >>>> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> > > > >>>> index c6fd03a5ab..0f762f633e 100644
> > > > >>>> --- a/lib/telemetry/telemetry.c
> > > > >>>> +++ b/lib/telemetry/telemetry.c
> > > > >>>> @@ -215,6 +215,94 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len)
> > > > >>>>  	return used;
> > > > >>>>  }
> > > > >>>>
> > > > >>>> +static bool
> > > > >>>> +json_is_special_char(char ch)
> > > > >>>> +{
> > > > >>>> +	static unsigned char is_spec[256] = { 0 };
> > > > >>>> +	static bool init_once;
> > > > >>>> +
> > > > >>>> +	if (!init_once) {
> > > > >>>> +		is_spec['\"'] = 1;
> > > > >>>> +		is_spec['\\'] = 1;
> > > > >>>> +		is_spec['/'] = 1;
> > > > >>>> +		is_spec['\b'] = 1;
> > > > >>>> +		is_spec['\f'] = 1;
> > > > >>>> +		is_spec['\n'] = 1;
> > > > >>>> +		is_spec['\r'] = 1;
> > > > >>>> +		is_spec['\t'] = 1;
> > > > >>>> +		init_once = true;
> > > > >>>> +	}
> > > > >>>> +
> > > > >>>> +	return (bool)is_spec[(unsigned char)ch];
> > > > >>>> +}
> > > >>
> > > > >> According to the JSON spec at [1], the characters that need to be
> > > > >> escaped are:
> > > > >> a) any characters <0x20
> > > > >> b) inverted commas/quote character \"
> > > > >> c) the "reverse solidus character", better known to you and me as
> > > > >> the back-slash.
> > > > >>
> > > > >> Therefore, I think this table generation could be simplified, but
> > > > >> also expanded using this. For completeness we should also see about
> > > > >> handling all control characters if they are encountered.
> > > >>
> > > >> [1] https://www.rfc-editor.org/rfc/rfc8259.txt
> > > >>
> > > >> /Bruce
> > > >
> > > > > Since it is trivial, could it be an initializer?
> > > >
> > > > static const uint8_t is_spec[256] = {
> > > > > [0 ... 0x1f] = 1,
> > > > ['\"' ] = 1,
> > > > ['\\' ] = 1,
> > > > ['/'] = 1,
> > > >
> > > > etc
> > > >
> > > > Or we could change the telemetry API to disallow control characters?
> > >
> > > I was thinking about converting 0~0x20, but I don't think there's a
> > > scenario.
> > >
> > > > I prefer to change the telemetry API to disallow control characters, and
> > > > this may not be a violation of the ABI; if yes, dpdk-telemetry.py
> > > > will return an error.
> >
> > I agree with Chengwen Feng. The telemetry data type is STRING, not BLOB.
> >
> > So we need to define exactly what the STRING type contains.
> >
> > I hope we can all agree that control characters should be disallowed.
> >
> > The more complicated question is: Do we want to use the ASCII character set
> > only, or do we want to use UTF-8 encoded Unicode?
> >
> > Personally, I think UTF-8 encoded Unicode is more future-proof, and would
> > vote for that.
> >
> > But I wouldn't reject limiting it to ASCII, and perhaps in the future introduce
> > another data type for UTF-8 strings.
> >
> > UTF-8 is the modern choice, but it is incompatible with old stuff, e.g. many
> > SNMP MIBs.
> >
> [CP]
>
> Just from looking at the spec [1], I would say UTF-8, as it seems to suggest its use for JSON (section 8.1).
>
> [1] https://www.rfc-editor.org/rfc/rfc8259.txt
>
> > >
> > > So I think we could add declaring and checking functions to make sure
> > > telemetry strings do not contain control characters.
> [CP]
>
> I am not sure why we don't want these at all - I thought we do want some of them, like tab (\u0009) for example.
>
> <snip>
>
> In general, I think Bruce's suggestion of using a customised printf function instead of snprintf would be a good way forward, to scan the chars as they are being copied in.
>
I'm hoping to have some time to try and prototype this myself soon, and
send out a draft patch to this mailing list for consideration.
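
As a rough illustration of the escaping rules discussed above (not the
prototype itself), copying a string while escaping it per RFC 8259
section 7 might look something like the sketch below. The function name
tel_json_escape and its signature are made up for this example and are not
part of the telemetry library. Bytes >= 0x80 are passed through unchanged
on the assumption that the input is already valid UTF-8.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (name and signature are assumptions, not DPDK API).
 * Copies src into out, escaping the characters RFC 8259 requires:
 * the quote, the backslash, and every control character below 0x20.
 * Returns the number of bytes written, excluding the terminating NUL,
 * or -1 if the escaped string does not fit; out is then set to "".
 */
static int
tel_json_escape(char *out, size_t len, const char *src)
{
	size_t used = 0;

	if (len == 0)
		return -1;

	for (; *src != '\0'; src++) {
		unsigned char ch = (unsigned char)*src;
		const char *rep = NULL;
		char tmp[7]; /* room for "\u00XX" plus NUL */
		size_t n;

		switch (ch) {
		case '"':  rep = "\\\""; break;
		case '\\': rep = "\\\\"; break;
		case '\b': rep = "\\b"; break;
		case '\f': rep = "\\f"; break;
		case '\n': rep = "\\n"; break;
		case '\r': rep = "\\r"; break;
		case '\t': rep = "\\t"; break;
		default:
			if (ch < 0x20) {
				/* other control characters: generic \u00XX form */
				snprintf(tmp, sizeof(tmp), "\\u%04x", (unsigned int)ch);
				rep = tmp;
			}
			break;
		}

		if (rep == NULL) {
			/* ordinary character: copy through unchanged */
			tmp[0] = (char)ch;
			tmp[1] = '\0';
			rep = tmp;
		}

		n = strlen(rep);
		if (used + n >= len) { /* keep one byte for the NUL */
			out[0] = '\0';
			return -1;
		}
		memcpy(out + used, rep, n);
		used += n;
	}

	out[used] = '\0';
	return (int)used;
}
```

Note that '/' is deliberately left alone here, since the spec allows but
does not require it to be escaped.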
/Bruce