[PATCH 1/2] telemetry: use malloc instead of variable length array

Bruce Richardson bruce.richardson at intel.com
Tue Apr 4 19:25:42 CEST 2023


On Tue, Apr 04, 2023 at 09:44:46AM -0700, Tyler Retzlaff wrote:
> On Tue, Apr 04, 2023 at 05:28:29PM +0100, Bruce Richardson wrote:
> > On Tue, Apr 04, 2023 at 09:24:44AM -0700, Tyler Retzlaff wrote:
> > > On Tue, Apr 04, 2023 at 09:47:21AM +0100, Bruce Richardson wrote:
> > > > On Mon, Apr 03, 2023 at 01:19:12PM -0700, Stephen Hemminger wrote:
> > > > > On Mon,  3 Apr 2023 09:30:23 -0700
> > > > > Tyler Retzlaff <roretzla at linux.microsoft.com> wrote:
> > > > > 
> > > > > >  __json_snprintf(char *buf, const int len, const char *format, ...)
> > > > > >  {
> > > > > > -	char tmp[len];
> > > > > > +	char *tmp = malloc(len);
> > > > > >  	va_list ap;
> > > > > > -	int ret;
> > > > > > +	int ret = 0;
> > > > > > +
> > > > > > +	if (tmp == NULL)
> > > > > > +		return ret;
> > > > > >  
> > > > > >  	va_start(ap, format);
> > > > > >  	ret = vsnprintf(tmp, sizeof(tmp), format, ap);
> > > > > >  	va_end(ap);
> > > > > >  	if (ret > 0 && ret < (int)sizeof(tmp) && ret < len) {
> > > > > >  		strcpy(buf, tmp);
> > > > > > -		return ret;
> > > > > >  	}
> > > > > > -	return 0; /* nothing written or modified */
> > > > > > +
> > > > > > +	free(tmp);
> > > > > > +
> > > > > > +	return ret;
> > > > > >  }
> > > > > 
> > > > > Not sure why it needs a tmp buffer anyway?
> > > > 
> > > > The temporary buffer is to ensure that in the case that the data doesn't
> > > > fit in the buffer, the buffer remains unmodified. This is so that, when
> > > > building up the json response, we always have a valid json string.
> > > 
> > > i guessed this, but you've now confirmed it. it makes sense in general
> > > that if the callee signals an error to the caller, the caller shall not
> > > observe any side effects; to do so is to take a dependency on what is
> > > more often than not an internal implementation detail.
> > > 
> > > > 
> > > > For example, suppose we are preparing a response with an array of two
> > > > strings. After the first string has been processed, the output buffer
> > > > contains: '["string1"]'. When json_snprintf is being called to add string2,
> > > > there are a couple of things to note:
> > > > * the text to be inserted will be put not at the end of the string, but
> > > >   before the closing "]".
> > > > * the actual text to be inserted will be ',"string2"]', so ensuring that
> > > >   the final buffer is valid.
> > > > However, the error case is problematic. While we can catch the case where
> > > > the string to be inserted overflows/has been truncated, doing a regular
> > > > snprintf means that our output buffer could contain invalid json, as our
> > > > end-terminator would have been overwritten, e.g. '["string1","string2'
> > > > To guarantee the output from telemetry is always valid json, even in case
> > > > of truncation, we use a temporary buffer to do the write initially, and if
> > > > it doesn't get truncated, we then copy that to the final buffer.
> > > > 
> > > > That's the logic for this temporary buffer. Now, thinking about it
> > > > yesterday evening, there are other ways in which we can do this, which can
> > > > avoid this temporary buffer.
> > > > 1. We can do the initial snprintf to an empty buffer to get the length that
> > > >    way. This will still be slower, as it means that we need to do printf
> > > >    processing twice rather than using memcpy to copy the result. However, it's
> > > >    probably less overhead than malloc and free.
> > > > 2. AFAIK, the normal case for this function being called is with a single
> > > >    terminator at the end of the string. We can take advantage of that, by
> > > >    checking if the '\0' is just one character into the string we are
> > > >    printing and, if so, storing that one character. If we have a snprintf error
> > > >    leading to truncation, it then allows us to restore the original string.
> > > > 
> > > > My suggestion is to use a combination of these methods. In json_snprintf
> > > > check if the input buffer is empty or has only one character in it, and use
> > > > method #2 if so. If that's not the case, then fall back to method #1 and do
> > > > a double snprintf.
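
To make method #1 above a bit more concrete, here is a minimal sketch of the
"measure first, then write" idea. It is only an illustration, not the telemetry
code: the name json_snprintf_measured is hypothetical, and it assumes a C99
vsnprintf, which may be called with a NULL buffer and size 0 to get the
required length.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the DPDK function): format into buf only if the
 * whole result, including the terminator, fits in len bytes. Otherwise buf
 * is left untouched and 0 is returned.
 */
static int
json_snprintf_measured(char *buf, int len, const char *format, ...)
{
	va_list ap;
	int needed;

	assert(buf != NULL);

	/* First pass: no output buffer, just compute the formatted length. */
	va_start(ap, format);
	needed = vsnprintf(NULL, 0, format, ap);
	va_end(ap);

	if (needed <= 0 || needed >= len)
		return 0; /* error, empty, or would truncate: buf unmodified */

	/* Second pass: the output is now known to fit. */
	va_start(ap, format);
	needed = vsnprintf(buf, len, format, ap);
	va_end(ap);

	return needed;
}
```

The cost is that the printf processing runs twice, but no temporary buffer is
needed and the closing "]" is never overwritten unless the new text fits.
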
> > > > 
> > > > Make sense? Any other suggestions?
> > > 
> > > your suggestion seems okay to me, aside from that there's always using
> > > some fixed sized buffer but i'm guessing this being json it's difficult
> > > to choose a reasonable constant size for a stack allocated buffer.
> > > 
> > Yes, choosing a reasonable size is very difficult. We could be snprintf-ing
> > a string containing a json-ized object a couple of KB long.
> 
> haven't checked recently, but i wonder what our normal usermode stack
> frame size limit is, which is why alloca() would be scary.
> 
> > 
> > I think suggestion #2 above should cover most cases, in which case using
> > your original suggestion of malloc would be ok too for the rare case (if
> > ever) where we don't just have one terminator on the end.
> 
> maybe a dumbed-down compromise is to have a fixed stack limit and then
> if it is exceeded always just go to malloc/free?
> 
Perhaps. If you like, I can have a try at implementing my own suggestions
above tomorrow. I'd like it if we could get the "single-character-saving" option
working, because that would be the most efficient method of all.
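
As a rough idea of what I mean by single-character saving, something along
these lines. This is only a sketch under assumptions: the name
json_snprintf_save1 and the -1 "not applicable" return value are made up for
illustration, and a real version would fall back to another method in that
case rather than report it to the caller.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print directly into buf, but remember the single
 * character (if any) in front of the terminator so that the original string
 * can be restored if the output is truncated.
 * Returns the length written, 0 if nothing was changed, or -1 (an assumed
 * convention) if buf holds more than one character and the trick does not
 * apply.
 */
static int
json_snprintf_save1(char *buf, int len, const char *format, ...)
{
	va_list ap;
	char saved;
	int ret;

	assert(buf != NULL);

	/* Applies only to an empty string or a single character plus '\0'. */
	if (len < 1 || (buf[0] != '\0' && (len < 2 || buf[1] != '\0')))
		return -1;
	saved = buf[0];

	va_start(ap, format);
	ret = vsnprintf(buf, len, format, ap);
	va_end(ap);

	if (ret > 0 && ret < len)
		return ret; /* everything fitted */

	/* Error or truncation: put the original short string back. */
	buf[0] = saved;
	if (saved != '\0')
		buf[1] = '\0';
	return 0;
}
```

For the array example, buf would point at the closing "]" of '["string1"]',
so on truncation the "]" and its terminator are put back and the output stays
valid json.
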

/Bruce