[dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore

Liang, Cunming cunming.liang at intel.com
Fri Dec 19 02:28:47 CET 2014



> -----Original Message-----
> From: Walukiewicz, Miroslaw
> Sent: Thursday, December 18, 2014 8:20 PM
> To: Liang, Cunming; dev at dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
> 
> I have another question regarding your patch.
> 
> Could we extend the values returned by rte_lcore_id() so they are set per thread
> (a DPDK lcore is really a pthread, just one started on a specific core) instead of
> creating a linear thread id?
[Liang, Cunming] As you said, __lcore_id is already per thread.
Semantically, it stands for the logical CPU id.
When multiple threads run on the same lcore, they should get the same value back from rte_lcore_id().
It has the same effect as 'sched_getcpu()', but at a lower cost.
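As a rough illustration of that semantic (a minimal sketch with made-up names, not the actual EAL code): the id lives in thread-local storage, so every pthread has its own copy, but all pthreads started on the same core are simply initialized with the same value, and reading it is far cheaper than a sched_getcpu() syscall.

    #include <limits.h>

    /* per-thread storage, analogous in spirit to EAL's __lcore_id */
    static __thread unsigned my_lcore_id = UINT_MAX;

    /* every pthread launched on 'core_id' stores the same value */
    static void my_thread_init(unsigned core_id)
    {
        my_lcore_id = core_id;
    }

    /* cheap TLS read, no syscall needed */
    static unsigned my_lcore_id_get(void)
    {
        return my_lcore_id;
    }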
> 
> The patch would be much simpler and would work the same way. The only change
> would be extending rte_lcore_id when rte_pthread_create() is called.
[Liang, Cunming] I did consider using rte_lcore_id() to provide a unique id per pthread rather than adding a new API.
But then the name 'lcore' would no longer identify a CPU id. That may impact existing user applications that rely on its exact meaning.
What do you think?
> 
> The value __lcore_id really has the __thread attribute, which means it is valid not
> only per CPU core but also per thread.
> 
> The mempools, timers, statistics would work without any modifications in that
> environment.
> 
> I do not see any reason why legacy DPDK applications would not work in that
> model.
> 
> Mirek
> 
> > -----Original Message-----
> > From: Liang, Cunming
> > Sent: Monday, December 15, 2014 12:53 PM
> > To: Walukiewicz, Miroslaw; dev at dpdk.org
> > Subject: RE: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
> >
> > Hi Mirek,
> >
> > That sounds great.
> > Looking forward to it.
> >
> > -Cunming
> >
> > > -----Original Message-----
> > > From: Walukiewicz, Miroslaw
> > > Sent: Monday, December 15, 2014 7:11 PM
> > > To: Liang, Cunming; dev at dpdk.org
> > > Subject: RE: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
> > >
> > > Hi Cunming,
> > >
> > > The timers could be used by any application/library started as a standard
> > > pthread.
> > > Each pthread needs to be assigned some identifier, the same way as you are
> > > doing it for mempools (rte_linear_thread_id and rte_lcore_id are good
> > > examples).
> > >
> > > I made a series of patches extending the rte_timer API to work with such an
> > > identifier while keeping the existing API working as well.
> > >
> > > I will send it soon.
> > >
> > > Mirek
> > >
> > >
> > > > -----Original Message-----
> > > > From: Liang, Cunming
> > > > Sent: Friday, December 12, 2014 6:45 AM
> > > > To: Walukiewicz, Miroslaw; dev at dpdk.org
> > > > Subject: RE: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
> > > >
> > > > Thanks Mirek. That's a good point which wasn't mentioned in the cover
> > > > letter.
> > > > For 'rte_timer', I only expect it to be used within the 'legacy per-lcore'
> > > > pthreads.
> > > > I'd appreciate it if you could give me some cases that this doesn't fit.
> > > > If 'rte_timer' has to be used from multiple pthreads, there are some
> > > > prerequisites and limitations (see the sketch after this list):
> > > > 1. Make sure the thread-local variable 'lcore_id' is set correctly (e.g. do
> > > > pthread init by rte_pthread_prepare).
> > > > 2. As 'rte_timer' is not preemptable, when using
> > > > rte_timer_manage()/rte_timer_reset() from multiple pthreads, make sure
> > > > they're not on the same core.
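A rough sketch of those two constraints (hedged, not code from the patchset; timer_cb is a placeholder callback and rte_pthread_prepare() is the API proposed in this RFC):

    #include <rte_timer.h>
    #include <rte_lcore.h>
    #include <rte_cycles.h>

    static void timer_cb(struct rte_timer *tim, void *arg) { /* ... */ }

    /* at most one such pthread per physical core, so that rte_timer_manage()
     * and rte_timer_reset() never preempt each other on the same core */
    static void *timer_thread(void *arg)
    {
        struct rte_timer *tim = arg;

        /* prerequisite 1: the thread-local lcore_id must already be valid,
         * e.g. set by the proposed rte_pthread_prepare() */
        rte_timer_reset(tim, rte_get_timer_hz(), PERIODICAL,
                        rte_lcore_id(), timer_cb, NULL);

        for (;;)
            rte_timer_manage();   /* prerequisite 2: sole timer runner on this core */

        return NULL;
    }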
> > > >
> > > > -Cunming
> > > >
> > > > > -----Original Message-----
> > > > > From: Walukiewicz, Miroslaw
> > > > > Sent: Thursday, December 11, 2014 5:57 PM
> > > > > To: Liang, Cunming; dev at dpdk.org
> > > > > Subject: RE: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
> > > > >
> > > > > Thank you Cunming for explanation.
> > > > >
> > > > > What about DPDK timers? They also depend on rte_lcore_id() to avoid spinlocks.
> > > > >
> > > > > Mirek
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cunming Liang
> > > > > > Sent: Thursday, December 11, 2014 3:05 AM
> > > > > > To: dev at dpdk.org
> > > > > > Subject: [dpdk-dev] [RFC PATCH 0/7] support multi-pthread per lcore
> > > > > >
> > > > > >
> > > > > > Scope & Usage Scenario
> > > > > > ========================
> > > > > >
> > > > > > DPDK usually pins one pthread per core to avoid task-switch overhead. This
> > > > > > gains a lot of performance, but it's not efficient in all cases. In some
> > > > > > cases, it may be too expensive to dedicate a whole core to a lightweight
> > > > > > workload. It's a reasonable demand to have multiple threads per core, with
> > > > > > each thread sharing the CPU according to an assigned weight.
> > > > > >
> > > > > > In fact, nothing prevents the user from creating normal pthreads and using
> > > > > > cgroups to control the CPU share. One purpose of this patchset is to close
> > > > > > the gaps in using more DPDK libraries from normal pthreads. In addition, it
> > > > > > demonstrates a performance gain from proactively 'yield'ing when idle-looping
> > > > > > in packet IO. It also provides several 'rte_pthread_*' APIs to make life easier.
> > > > > >
> > > > > >
> > > > > > Changes to DPDK libraries
> > > > > > ==========================
> > > > > >
> > > > > > Some DPDK libraries must run in the DPDK environment.
> > > > > >
> > > > > > # rte_mempool
> > > > > >
> > > > > > The rte_mempool documentation mentions that a thread not created by the EAL
> > > > > > must not use mempools. The root cause is that the mempool uses a per-lcore
> > > > > > cache internally, and 'rte_lcore_id()' will not return a correct value for
> > > > > > such a thread.
> > > > > >
> > > > > > The patchset changes this a little. The index into the mempool cache is no
> > > > > > longer an lcore_id. Instead, it is a linear number generated by an allocator.
> > > > > > Each legacy EAL per-lcore thread applies for a unique linear id during
> > > > > > creation. A normal pthread that expects to use rte_mempool has to apply for
> > > > > > a linear id explicitly. Now the mempool cache is effectively per-thread, and
> > > > > > the linear id actually identifies the linear thread id.
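A minimal sketch of that cache-selection change (illustrative names only, not the exact patch, and assuming per-lcore caches are enabled; the real field layout may differ):

    #include <rte_mempool.h>
    #include <rte_lcore.h>

    /* per-thread linear id, handed out by an allocator instead of being the lcore id */
    static __thread unsigned _linear_tid = LCORE_ID_ANY;

    /* pick the cache slot by linear thread id; an unregistered thread gets no
     * cache and falls back to the underlying ring */
    static inline struct rte_mempool_cache *
    mempool_cache_get(struct rte_mempool *mp)
    {
        if (_linear_tid >= RTE_MAX_LCORE)
            return NULL;
        return &mp->local_cache[_linear_tid];
    }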
> > > > > >
> > > > > > However, there's another problem: rte_mempool is not preemptable. The
> > > > > > problem comes from rte_ring, so it is discussed together in the next section.
> > > > > >
> > > > > > # rte_ring
> > > > > >
> > > > > > rte_ring supports multi-producer enqueue and multi-consumer dequeue, but it
> > > > > > is not preemptable. There was a conversation about this before:
> > > > > > http://dpdk.org/ml/archives/dev/2013-November/000714.html
> > > > > >
> > > > > > Let's say there are two pthreads running on the same core, both enqueuing on
> > > > > > the same rte_ring. If the 1st pthread is preempted by the 2nd pthread while
> > > > > > it has already modified prod.head, the 2nd pthread will spin until the 1st
> > > > > > one is scheduled again, which wastes time. It is even worse if the 2nd
> > > > > > pthread has a strictly higher priority.
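A simplified sketch of the multi-producer enqueue pattern behind that spin (illustrative only, not the real rte_ring code; assume r points to the ring and n is the number of objects): the head is reserved first with a CAS, and the tail can only advance in order, so a preempted producer blocks every later producer on the same core.

    uint32_t prod_head, prod_next;

    /* step 1: reserve slots by moving prod.head with a CAS */
    do {
        prod_head = r->prod.head;
        prod_next = prod_head + n;
    } while (!rte_atomic32_cmpset(&r->prod.head, prod_head, prod_next));

    /* step 2: copy the objects into the reserved slots */
    /* ... */

    /* step 3: wait for earlier enqueues to publish their slots; if the thread
     * owning an earlier reservation was preempted on this very core, we spin
     * here until the scheduler runs it again */
    while (r->prod.tail != prod_head)
        rte_pause();
    r->prod.tail = prod_next;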
> > > > > >
> > > > > > But that doesn't mean we can't use it. We just need to narrow down the
> > > > > > situations in which it is used by multiple pthreads on the same core.
> > > > > > - It CAN be used in any single-producer or single-consumer situation.
> > > > > > - It MAY be used by multi-producer/consumer pthreads whose scheduling
> > > > > > policies are all SCHED_OTHER (CFS). Users SHOULD be aware of the performance
> > > > > > penalty before using it.
> > > > > > - It MUST NOT be used by multi-producer/consumer pthreads if any of their
> > > > > > scheduling policies is SCHED_FIFO or SCHED_RR.
> > > > > >
> > > > > >
> > > > > > Performance
> > > > > > ==============
> > > > > >
> > > > > > It loses some performance by introducing task switching. From the packet IO
> > > > > > perspective, we can gain some of it back by improving the effective IO rate.
> > > > > > When a pthread idle-loops on an empty rx queue, it should proactively yield.
> > > > > > We can also slow down rx for a little while to take more advantage of bulk
> > > > > > receiving in the next loop. In practice, increasing the rx ring size also
> > > > > > helps to improve the overall throughput.
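A minimal sketch of the proactive-yield idea (port_id, queue_id and BURST_SIZE are placeholders, not values from the patchset):

    #include <sched.h>
    #include <rte_ethdev.h>

    #define BURST_SIZE 32               /* placeholder burst size */

    uint8_t port_id = 0;                /* placeholder port */
    uint16_t queue_id = 0;              /* placeholder queue */
    struct rte_mbuf *pkts[BURST_SIZE];
    uint16_t nb_rx;

    for (;;) {
        nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, BURST_SIZE);
        if (nb_rx == 0) {
            /* empty rx queue: give the core to sibling pthreads instead of
             * busy-polling, then retry (likely with a fuller bulk next time) */
            sched_yield();
            continue;
        }
        /* ... process and transmit nb_rx packets ... */
    }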
> > > > > >
> > > > > >
> > > > > > Cgroup Control
> > > > > > ================
> > > > > >
> > > > > > Here's a simple example: four pthreads doing packet IO on the same core,
> > > > > > where we expect the CPU shares to be in a 1:1:2:4 ratio.
> > > > > > > mkdir /sys/fs/cgroup/cpu/dpdk
> > > > > > > mkdir /sys/fs/cgroup/cpu/dpdk/thread0
> > > > > > > mkdir /sys/fs/cgroup/cpu/dpdk/thread1
> > > > > > > mkdir /sys/fs/cgroup/cpu/dpdk/thread2
> > > > > > > mkdir /sys/fs/cgroup/cpu/dpdk/thread3
> > > > > > > cd /sys/fs/cgroup/cpu/dpdk
> > > > > > > echo 256 > thread0/cpu.shares
> > > > > > > echo 256 > thread1/cpu.shares
> > > > > > > echo 512 > thread2/cpu.shares
> > > > > > > echo 1024 > thread3/cpu.shares
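One step the example leaves implicit (assuming the cgroup v1 cpu controller as above): each pthread's TID still has to be written into its group's tasks file for the shares to take effect, e.g.

    > echo <tid-of-thread0> > thread0/tasks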
> > > > > >
> > > > > >
> > > > > > -END-
> > > > > >
> > > > > > Any comments are welcome.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > Cunming Liang (7):
> > > > > >   eal: add linear thread id as pthread-local variable
> > > > > >   mempool: use linear-tid as mempool cache index
> > > > > >   ring: use linear-tid as ring debug stats index
> > > > > >   eal: add simple API for multi-pthread
> > > > > >   testpmd: support multi-pthread mode
> > > > > >   sample: add new sample for multi-pthread
> > > > > >   eal: macro for cpuset w/ or w/o CPU_ALLOC
> > > > > >
> > > > > >  app/test-pmd/cmdline.c                    |  41 +++++
> > > > > >  app/test-pmd/testpmd.c                    |  84 ++++++++-
> > > > > >  app/test-pmd/testpmd.h                    |   1 +
> > > > > >  config/common_linuxapp                    |   1 +
> > > > > >  examples/multi-pthread/Makefile           |  57 ++++++
> > > > > >  examples/multi-pthread/main.c             | 232 ++++++++++++++++++++++++
> > > > > >  examples/multi-pthread/main.h             |  46 +++++
> > > > > >  lib/librte_eal/common/include/rte_eal.h   |  15 ++
> > > > > >  lib/librte_eal/common/include/rte_lcore.h |  12 ++
> > > > > >  lib/librte_eal/linuxapp/eal/eal_thread.c  | 282 +++++++++++++++++++++++++++---
> > > > > >  lib/librte_mempool/rte_mempool.h          |  22 +--
> > > > > >  lib/librte_ring/rte_ring.h                |   6 +-
> > > > > >  12 files changed, 755 insertions(+), 44 deletions(-)
> > > > > >  create mode 100644 examples/multi-pthread/Makefile
> > > > > >  create mode 100644 examples/multi-pthread/main.c
> > > > > >  create mode 100644 examples/multi-pthread/main.h
> > > > > >
> > > > > > --
> > > > > > 1.8.1.4


