[RFC] eal_debug: do not use malloc in rte_dump_stack
Morten Brørup
mb at smartsharesystems.com
Sat Jan 29 09:25:26 CET 2022
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, 29 January 2022 02.11
>
> The glibc backtrace_symbols() calls malloc which makes it
> dangerous to use rte_dump_stack() in a signal handler that
> is handling errors that maybe due to memory corruption.
Yes. We have experienced that problem with backtrace_symbols(); so as a workaround, our failure signal handler dumps all other information first, and calls backtrace_symbols() last, in case it crashes.
>
> Instead, use dladdr() to lookup up symbols incrementally.
I took a brief look at the dladdr() source code, and it looks good to me.
>
> The format of the messages is based on what X org server
> has been doing for many years. It changes from bottom up
> to top down order.
Good idea. Seems more logical.
>
> Bugzilla ID: 929
> Signed-off-by: Stephen Hemminger <stephen at networkplumber.org>
> ---
> lib/eal/linux/eal_debug.c | 45 ++++++++++++++++++++++++++++-----------
> 1 file changed, 32 insertions(+), 13 deletions(-)
>
> diff --git a/lib/eal/linux/eal_debug.c b/lib/eal/linux/eal_debug.c
> index 64dab4e0da24..bf232f72f402 100644
> --- a/lib/eal/linux/eal_debug.c
> +++ b/lib/eal/linux/eal_debug.c
> @@ -4,6 +4,7 @@
>
> #ifdef RTE_BACKTRACE
> #include <execinfo.h>
> +#include <dlfcn.h>
> #endif
> #include <stdarg.h>
> #include <signal.h>
> @@ -18,26 +19,44 @@
>
> #define BACKTRACE_SIZE 256
>
> -/* dump the stack of the calling core */
> +/* Dump the stack of the calling core
> + *
> + * Note: this requires some careful usage in order to
> + * stay safe in case where called from a signal
> + * handler and the malloc pool may be corrupted.
> + */
> void rte_dump_stack(void)
> {
> #ifdef RTE_BACKTRACE
> void *func[BACKTRACE_SIZE];
> - char **symb = NULL;
> - int size;
> + int i, size;
>
> size = backtrace(func, BACKTRACE_SIZE);
> - symb = backtrace_symbols(func, size);
> -
> - if (symb == NULL)
> - return;
>
> - while (size > 0) {
> - rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL,
> - "%d: [%s]\n", size, symb[size - 1]);
> - size --;
> + for (i = 0; i < size; i++) {
> + void *pc = func[i];
> + const char *fname;
> + Dl_info info;
> +
> + if (dladdr(pc, &info) == 0) {
> + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL,
> + "%d: ?? [%p]\n", i, pc);
> + continue;
> + }
> +
> + fname = (info.dli_fname && *info.dli_fname) ?
> info.dli_fname : "(vdso)";
> + if (info.dli_saddr != NULL)
> + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL,
> + "%d: %s (%s+%#tx) [%p]\n",
> + i, fname, info.dli_sname,
> + (ptrdiff_t)((uintptr_t)pc -
> (uintptr_t)info.dli_saddr),
> + pc);
> + else
> + rte_log(RTE_LOG_ERR, RTE_LOGTYPE_EAL,
> + "%d: %s (%p+%#tx) [%p]\n",
> + i, fname, info.dli_fbase,
> + (ptrdiff_t)((uintptr_t)pc -
> (uintptr_t)info.dli_fbase),
> + pc);
> }
> -
> free(symb);
Probably something is lost in formatting here, but free(symb) must also be removed.
> #endif /* RTE_BACKTRACE */
> }
> --
> 2.34.1
>
Great improvement, Stephen!
Acked-by: Morten Brørup <mb at smartsharesystems.com>
More information about the dev
mailing list