[dpdk-dev] [PATCH 2/8] trace: simplify trace point registration
David Marchand
david.marchand at redhat.com
Tue May 5 10:23:19 CEST 2020
On Tue, May 5, 2020 at 9:33 AM Jerin Jacob <jerinjacobk at gmail.com> wrote:
> > > What the proposed patch here.
> > > # Making N constructors from one
> > > # Grouping global variable and register function under a single Marco
> > > and making it as N constructors.
> > > Why can we do the same logic for rte_log?
> >
> > rte_log is simple, there is nothing to simplify.
>
> Why not make, rte_log_register() and the global variable under a macro?
> That's something done by the proposed patch.
At the moment, there is not much that would go into such a macro, but
I had started to do some cleanups on logtype registration.
I did not finish because the question of the default log level is
still unclear (and I did not take the time).
Having the logtype definition as part of the macro would be fine to me.
https://patchwork.dpdk.org/patch/57743/
> > > > rte_trace requires 3 macros calls per trace type:
> > > > RTE_TRACE_POINT_REGISTER, RTE_TRACE_POINT_DEFINE, RTE_TRACE_POINT_ARGS
> > > > This patch is unifying the first 2 macro calls to make usage simpler,
> > > > and ease rte_trace adoption.
> > >
> > > RTE_TRACE_POINT_ARGS is NOP and for the syntax.
> > > It is similar to rte_log. rte_log don't have RTE_TRACE_POINT_REGISTER instead
> > > it is creating global variable see, "int otx2_logtype_base;
> > >
> > > >
> > > > Note: the other usability weirdness is mandating declaring each trace
> > > > function with a magic double underscore prefix which appears nowhere else.
> > > >
> > > >
> > > > > Analyze the impact wrt boot time and cross-platform pov as the log
> > > > > has a lot of entries to test. If the usage makes sense then it should make sense
> > > > > for rte_log too. I would like to NOT have trace to be the first
> > > > > library to explode
> > > > > with the constructor scheme. I suggest removing this specific patch from RC2 and
> > > > > revisit later.
> > > >
> > > > You still did not give any argument against increasing the number
> > > > of constructors.
> > >
> > > If you are proposing the new scheme, you have to prove the overhead
> > > with a significant number of constructors
> > > and why it has differed from existing scheme of things. That's is the
> > > norm in opensource.
> >
> > I say there is no overhead.
>
> Please share the data.
Measured time between first rte_trace_point_register and last one with
a simple patch:
diff --git a/lib/librte_eal/common/eal_common_trace.c
b/lib/librte_eal/common/eal_common_trace.c
index 875553d7e..95618314b 100644
--- a/lib/librte_eal/common/eal_common_trace.c
+++ b/lib/librte_eal/common/eal_common_trace.c
@@ -8,6 +8,7 @@
#include <regex.h>
#include <rte_common.h>
+#include <rte_cycles.h>
#include <rte_errno.h>
#include <rte_lcore.h>
#include <rte_per_lcore.h>
@@ -23,6 +24,9 @@ static RTE_DEFINE_PER_LCORE(int, ctf_count);
static struct trace_point_head tp_list = STAILQ_HEAD_INITIALIZER(tp_list);
static struct trace trace = { .args = STAILQ_HEAD_INITIALIZER(trace.args), };
+uint64_t first_register;
+uint64_t last_register;
+
struct trace *
trace_obj_get(void)
{
@@ -43,6 +47,8 @@ eal_trace_init(void)
/* Trace memory should start with 8B aligned for natural alignment */
RTE_BUILD_BUG_ON((offsetof(struct __rte_trace_header, mem) % 8) != 0);
+ trace_err("delta=%"PRIu64, last_register - first_register);
+
/* One of the trace point registration failed */
if (trace.register_errno) {
rte_errno = trace.register_errno;
@@ -425,6 +431,9 @@ __rte_trace_point_register(rte_trace_point_t
*handle, const char *name,
goto fail;
}
+ if (first_register == 0)
+ first_register = rte_get_tsc_cycles();
+
/* Check the size of the trace point object */
RTE_PER_LCORE(trace_point_sz) = 0;
RTE_PER_LCORE(ctf_count) = 0;
@@ -486,6 +495,8 @@ __rte_trace_point_register(rte_trace_point_t
*handle, const char *name,
STAILQ_INSERT_TAIL(&tp_list, tp, next);
__atomic_thread_fence(__ATOMIC_RELEASE);
+ last_register = rte_get_tsc_cycles();
+
/* All Good !!! */
return 0;
free:
I started testpmd 100 times for static and shared gcc builds
(test-meson-builds.sh) on a system with a 2.6GHz xeon.
v20.05-rc1-13-g08dd97130 (before patch):
static: count=100, min=580812, max=1482326, avg=1764932
shared: count=100, min=554648, max=1344163, avg=1704638
v20.05-rc1-14-g44250f392 (after patch):
static: count=100, min=668273, max=1530330, avg=1682548
shared: count=100, min=554634, max=1330264, avg=1672398
--
David Marchand
More information about the dev
mailing list