[dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

Iremonger, Bernard bernard.iremonger at intel.com
Thu Feb 16 12:39:09 CET 2017


Hi Ferruh,

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, February 16, 2017 9:14 AM
> To: Yigit, Ferruh <ferruh.yigit at intel.com>
> Cc: Thomas Monjalon <thomas.monjalon at 6wind.com>; Iremonger, Bernard
> <bernard.iremonger at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; Doherty, Declan
> <declan.doherty at intel.com>; DPDK <dev at dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
> burst functions
> 
> On Wed, Feb 15, 2017 at 06:01:45PM +0000, Ferruh Yigit wrote:
> > On 6/16/2016 7:38 PM, thomas.monjalon at 6wind.com (Thomas Monjalon)
> > wrote:
> > > 2016-06-16 16:41, Iremonger, Bernard:
> > >> Hi Thomas,
> > >> <snip>
> > >>> 2016-06-16 15:32, Bruce Richardson:
> > >>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard
> > >>>> wrote:
> > >>>>>> Why does this particular PMD need spinlocks when doing RX and
> > >>>>>> TX, while other device types do not? How is adding/removing
> > >>>>>> devices from a bonded device different to other control
> > >>>>>> operations that can be done on physical PMDs? Is this not
> > >>>>>> similar to say bringing down or hotplugging out a physical
> > >>>>>> port just before an RX or TX operation takes place?
> > >>>>>> For all other PMDs we rely on the app to synchronise control
> > >>>>>> and data plane operation - why not here?
> > >>>>>>
> > >>>>>> /Bruce
> > >>>>>
> > >>>>> This issue arose during VM live migration testing.
> > >>>>> For VM live migration it is necessary (while traffic is
> > >>>>> running) to be able to remove a bonded slave device, stop it,
> > >>>>> close it and detach it. If a slave device is removed from a
> > >>>>> bonded device while traffic is running, a segmentation fault
> > >>>>> may occur in the rx/tx burst function. The spinlock has been
> > >>>>> added to prevent this occurring.
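> > >>>>>
> > >>>>> For illustration, here is a minimal sketch of the per-queue
> > >>>>> trylock pattern being discussed; the structure and function
> > >>>>> names are illustrative only, not the actual bonding PMD code:
> > >>>>>
> > >>>>> #include <rte_spinlock.h>
> > >>>>> #include <rte_mbuf.h>
> > >>>>>
> > >>>>> struct bond_rx_queue {
> > >>>>>         rte_spinlock_t lock; /* shared with slave add/remove */
> > >>>>>         /* ... rest of the queue state ... */
> > >>>>> };
> > >>>>>
> > >>>>> static uint16_t
> > >>>>> bond_rx_burst_locked(void *queue, struct rte_mbuf **bufs,
> > >>>>>                 uint16_t nb_pkts)
> > >>>>> {
> > >>>>>         struct bond_rx_queue *bd_rx_q = queue;
> > >>>>>         uint16_t nb_rx = 0;
> > >>>>>
> > >>>>>         /* Non-blocking: if the control plane holds the lock
> > >>>>>          * while changing the slave list, drop this burst
> > >>>>>          * rather than touch a slave that may be going away. */
> > >>>>>         if (rte_spinlock_trylock(&bd_rx_q->lock)) {
> > >>>>>                 /* ... rte_eth_rx_burst() on each active slave,
> > >>>>>                  * filling bufs up to nb_pkts ... */
> > >>>>>                 rte_spinlock_unlock(&bd_rx_q->lock);
> > >>>>>         }
> > >>>>>         return nb_rx;
> > >>>>> }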
> > >>>>>
> > >>>>> The bonding device already uses a spinlock to synchronise
> > >>>>> between the add and remove functionality and the
> > >>>>> slave_link_status_change_monitor code.
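> > >>>>>
> > >>>>> On the control-plane side, the add/remove path would take the
> > >>>>> same per-queue lock before mutating the slave list (again an
> > >>>>> illustrative sketch, using the struct from the sketch above,
> > >>>>> not the actual code):
> > >>>>>
> > >>>>> static void
> > >>>>> bond_deactivate_slave_locked(struct bond_rx_queue *bd_rx_q)
> > >>>>> {
> > >>>>>         rte_spinlock_lock(&bd_rx_q->lock);
> > >>>>>         /* ... unlink the slave from the active slave array,
> > >>>>>          * safe against a concurrent rx/tx burst ... */
> > >>>>>         rte_spinlock_unlock(&bd_rx_q->lock);
> > >>>>> }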
> > >>>>>
> > >>>>> Previously testpmd did not allow stop, close or detach of a
> > >>>>> PMD while traffic was running. Testpmd has been modified with
> > >>>>> the following patchset:
> > >>>>>
> > >>>>> http://dpdk.org/dev/patchwork/patch/13472/
> > >>>>>
> > >>>>> It now allows stop, close and detach of a PMD provided it is
> > >>>>> not forwarding and is not a slave of a bonded PMD.
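> > >>>>>
> > >>>>> For example, a sequence along these lines (port 1 here is
> > >>>>> illustrative; it must not be forwarding and must not be a
> > >>>>> bonding slave) is now accepted at the testpmd prompt:
> > >>>>>
> > >>>>> testpmd> port stop 1
> > >>>>> testpmd> port close 1
> > >>>>> testpmd> port detach 1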
> > >>>>>
> > >>>> I will admit to not being fully convinced, but if nobody else has
> > >>>> any serious objections, and since this patch has been reviewed
> > >>>> and acked, I'm ok to merge it in. I'll do so shortly.
> > >>>
> > >>> Please hold on.
> > >>> Seeing locks introduced in the Rx/Tx path raises an alert.
> > >>> We clearly need a design document to explain where locks can be
> > >>> used and what the responsibilities of the control plane are.
> > >>> If everybody agrees in this document that DPDK can have some locks
> > >>> in the fast path, then it is OK to merge it.
> > >>>
> > >>> So I would say NACK for 16.07 and maybe postpone to 16.11.
> > >>
> > >> Looking at the documentation for the bonding PMD.
> > >>
> > >>
> > >> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
> > >>
> > >> In section 10.2 it states the following:
> > >>
> > >> Bonded devices support the dynamical addition and removal of slave
> > >> devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove
> > >> APIs.
> > >>
> > >> If a slave device is added or removed while traffic is running,
> > >> there is the possibility of a segmentation fault in the rx/tx burst
> > >> functions. This is most likely to occur in the round robin bonding
> > >> mode.
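> > >>
> > >> As a usage sketch of the remove-stop-close-detach sequence
> > >> described above (port ids are illustrative and use the uint8_t
> > >> port id type of this DPDK era; error handling is trimmed):
> > >>
> > >> #include <rte_ethdev.h>
> > >> #include <rte_eth_bond.h>
> > >>
> > >> /* Release one slave from a running bonded device, e.g. as a
> > >>  * step in VM live migration. */
> > >> static int
> > >> release_slave(uint8_t bonded_port, uint8_t slave_port)
> > >> {
> > >>         char devname[RTE_ETH_NAME_MAX_LEN];
> > >>
> > >>         if (rte_eth_bond_slave_remove(bonded_port, slave_port) != 0)
> > >>                 return -1;
> > >>         rte_eth_dev_stop(slave_port);
> > >>         rte_eth_dev_close(slave_port);
> > >>         return rte_eth_dev_detach(slave_port, devname);
> > >> }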
> > >>
> > >> This patch set fixes what appears to be a bug in the bonding PMD.
> > >
> > > It can be fixed by removing this statement in the doc.
> > >
> > > One of the design principle of DPDK is to avoid locks.
> > >
> > >> Performance measurements have been made with this patch set
> > >> applied and without the patches applied, using 64 byte packets.
> > >>
> > >> With the patches applied the following drop in performance was
> > >> observed:
> > >>
> > >> % drop for fwd+io:	0.16%
> > >> % drop for fwd+mac:	0.39%
> > >>
> > >> This patch set has been reviewed and acked, so I think it should
> > >> be applied in 16.07.
> > >
> > > I understand your point of view and I gave mine.
> > > Now we need more opinions from others.
> > >
> >
> > Hi,
> >
> > These patches have been sitting in patchwork for a long time. The
> > discussion never concluded and the patches have been deferred each
> > release.
> >
> > I think we should make a decision about them:
> >
> > 1- We can merge them in this release; they fix a valid problem, and
> > the patches are already acked.
> >
> > 2- We can reject them. If not having them for more than six months
> > has not caused a problem, perhaps they are not really that required.
> > And if somebody needs them in the future, we can resurrect them from
> > patchwork.
> >
> > I vote for option 2, any comments?
> >
> +1 on option 2. They are obviously not badly needed if nobody has
> asked for them for over six months.
> 
> 	/Bruce

I am ok with option 2, provided they can be retrieved if needed.

Regards,

Bernard.



