[dpdk-dev] [PATCH] hash: document breakage with multi-writer thread
Wang, Yipeng1
yipeng1.wang at intel.com
Thu Jun 4 22:22:06 CEST 2020
> -----Original Message-----
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Sent: Thursday, June 4, 2020 12:34 PM
> To: Wang, Yipeng1 <yipeng1.wang at intel.com>; Stephen Hemminger
> <stephen at networkplumber.org>
> Cc: Gobriel, Sameh <sameh.gobriel at intel.com>; Richardson, Bruce
> <bruce.richardson at intel.com>; dev at dpdk.org; De Lara Guarch, Pablo
> <pablo.de.lara.guarch at intel.com>; nd <nd at arm.com>; Honnappa
> Nagarahalli <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>
> Subject: RE: [PATCH] hash: document breakage with multi-writer thread
>
> <snip>
>
> > > > >
> > > > > > Subject: [PATCH] hash: document breakage with multi-writer
> > > > > > thread
> > > > > >
> > > > > > The code in rte_cuckoo_hash multi-writer support is broken if
> > > > > > write operations are called from a non-EAL thread.
> > > > > >
> > > > > > rte_lcore_id() wil return LCORE_ID_ANY (UINT32_MAX) for non
> > > > > > EAL thread and that leads to using wrong local cache.
> > > > > >
> > > > > > Add error checks and document the restriction.
> > > > > Having multiple non-EAL writer threads is a valid use case.
> > > > > Should we fix the
> > > > issue instead?
> > > >
> > > > Discovered this the hard way...
> > > >
> > > > Fixing is non-trivial. Basically, the local cache has to be take
> > > > out and that leads to having to do real locking or atomic operations.
> > > Looking at rte_hash_create function:
> > >
> > > if (params->extra_flag &
> > > RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
> > > use_local_cache = 1;
> > > writer_takes_lock = 1;
> > > }
> > >
> > > The writer locks are in place already. The code to handle the case
> > > when local cache is taken out is also there.
> > > What we need is another input flag that says 'multi writer + non-eal
> > threads'
> > > which would set 'use_local_cache = 0' and 'writer_takes_lock = 1'.
> > > Not sure, it would be valuable addition. But looks like this is what
> > > you were expecting when you had enabled
> > > 'RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD'. Many other APIs in
> DPDK
> > do
> > > not provide this kind of MT safety.
> >
> > [Wang, Yipeng]
> > If possible, we can try to not add new flags, because there are
> > already a lot of flag options.
> > How about in the code, we check if the writer is a non-eal or not by
> > checking the rte_lcore_id, and operate on the global queue?
> > Could this work?
> > If(h->use_local_cache) {
> > lcore_id = rte_lcore_id();
> > if(lcore_id == LCORE_ID_ANY) { // this is non-eal threads
> > <call rte_ring_mp/mc_* to directly operate on global queue>
> > }
> > Else {
> > <original path>
> > }
> > }
> The other thing I wanted to do was saving on the memory allocated for the
> local cache when the writers are non-eal threads. Without knowing the kind
> of threads upfront, we might have to create the local cache when a writer
> adds the entry first time.
I got what you mean. If people only use non-eal threads, we could save the space of local cache completely.
Creating local cache during the first write is one solution. But the current rte_hash always allocate things during
table creation time. This provides guarantee that the program won't fail in the middle due to memory allocation issue.
Meanwhile I would rather be wasting some space than adding another option flag related to multi-threading.
In my opinion, all those flags are already confusing enough. It would also be harder to maintain in future.
More information about the dev
mailing list