[dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions

Ferruh Yigit ferruh.yigit at intel.com
Mon Feb 20 12:15:05 CET 2017


On 2/16/2017 11:39 AM, Iremonger, Bernard wrote:
> Hi Ferruh,
> 
>> -----Original Message-----
>> From: Richardson, Bruce
>> Sent: Thursday, February 16, 2017 9:14 AM
>> To: Yigit, Ferruh <ferruh.yigit at intel.com>
>> Cc: Thomas Monjalon <thomas.monjalon at 6wind.com>; Iremonger, Bernard
>> <bernard.iremonger at intel.com>; Ananyev, Konstantin
>> <konstantin.ananyev at intel.com>; Doherty, Declan
>> <declan.doherty at intel.com>; DPDK <dev at dpdk.org>
>> Subject: Re: [dpdk-dev] [PATCH v3 3/4] bonding: take queue spinlock in rx/tx
>> burst functions
>>
>> On Wed, Feb 15, 2017 at 06:01:45PM +0000, Ferruh Yigit wrote:
>>> On 6/16/2016 7:38 PM, thomas.monjalon at 6wind.com (Thomas Monjalon)
>>> wrote:
>>>> 2016-06-16 16:41, Iremonger, Bernard:
>>>>> Hi Thomas,
>>>>> <snip>
>>>>>> 2016-06-16 15:32, Bruce Richardson:
>>>>>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard
>>>>>>> wrote:
>>>>>>>>> Why does this particular PMD need spinlocks when doing RX and
>>>>>>>>> TX, while other device types do not? How is adding/removing
>>>>>>>>> devices from a bonded device different to other control
>>>>>>>>> operations that can be done on physical PMDs? Is this not
>>>>>>>>> similar to, say, bringing down or hotplugging out a physical
>>>>>>>>> port just before an RX or TX operation takes place?
>>>>>>>>> For all other PMDs we rely on the app to synchronise control
>>>>>>>>> and data plane operation - why not here?
>>>>>>>>>
>>>>>>>>> /Bruce
>>>>>>>>
>>>>>>>> This issue arose during VM live migration testing.
>>>>>>>> For VM live migration it is necessary (while traffic is running)
>>>>>>>> to be able to remove a bonded slave device, stop it, close it and
>>>>>>>> detach it.
>>>>>>>> If a slave device is removed from a bonded device while traffic is
>>>>>>>> running, a segmentation fault may occur in the rx/tx burst
>>>>>>>> functions. The spinlock has been added to prevent this from
>>>>>>>> occurring.
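
As an illustration of the kind of per-queue locking being discussed, here is
a minimal sketch (illustrative names, not the actual code from the patch) of
an rx burst guarded by a queue spinlock, using a trylock so the data path
never waits on the control plane:

/* Minimal sketch, not the bonding PMD code: a per-queue lock shared
 * between the data path and slave add/remove. */
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_spinlock.h>

struct bond_rx_queue_sketch {
    rte_spinlock_t lock;          /* also taken by slave add/remove */
    uint16_t queue_id;
    uint16_t nb_slaves;
    uint16_t slave_port_ids[8];
};

static uint16_t
bond_rx_burst_sketch(struct bond_rx_queue_sketch *q,
                     struct rte_mbuf **pkts, uint16_t nb_pkts)
{
    uint16_t nb_rx = 0;
    uint16_t i;

    /* Never block the fast path: if the control plane holds the lock
     * (a slave is being added or removed), skip this poll iteration. */
    if (!rte_spinlock_trylock(&q->lock))
        return 0;

    /* While the lock is held the slave table cannot change under us. */
    for (i = 0; i < q->nb_slaves && nb_rx < nb_pkts; i++)
        nb_rx += rte_eth_rx_burst(q->slave_port_ids[i], q->queue_id,
                                  pkts + nb_rx, nb_pkts - nb_rx);

    rte_spinlock_unlock(&q->lock);
    return nb_rx;
}
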
>>>>>>>>
>>>>>>>> The bonding device already uses a spinlock to synchronise between
>>>>>>>> the add and remove functionality and the
>>>>>>>> slave_link_status_change_monitor code.
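
For completeness, the control-plane side of such a per-queue locking scheme
would take every queue lock before touching the slave table, roughly along
these lines (purely illustrative, reusing the sketch above):

/* Sketch only: the slave remove path takes all per-queue locks so no
 * rx/tx burst can be walking the slave table while it is modified. */
struct bond_private_sketch {
    uint16_t nb_rx_queues;
    struct bond_rx_queue_sketch *rx_queues[16];
    /* equivalent tx-queue bookkeeping elided */
};

static void
bond_slave_remove_sketch(struct bond_private_sketch *p, uint16_t slave_id)
{
    uint16_t i;

    for (i = 0; i < p->nb_rx_queues; i++)
        rte_spinlock_lock(&p->rx_queues[i]->lock);

    /* ... drop slave_id from every queue's slave_port_ids[] here ... */
    (void)slave_id;

    for (i = 0; i < p->nb_rx_queues; i++)
        rte_spinlock_unlock(&p->rx_queues[i]->lock);
}
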
>>>>>>>>
>>>>>>>> Previously testpmd did not allow stop, close or detach of a PMD
>>>>>>>> while traffic was running. Testpmd has been modified with the
>>>>>>>> following patchset:
>>>>>>>>
>>>>>>>> http://dpdk.org/dev/patchwork/patch/13472/
>>>>>>>>
>>>>>>>> It now allows stop, close and detach of a PMD provided it is not
>>>>>>>> forwarding and is not a slave of a bonded PMD.
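
The application-level check involved is roughly the following (hypothetical
helper names, not the actual testpmd functions from that patchset):

/* Hypothetical application-side guard in the spirit of the testpmd
 * change: refuse to stop a port that is still forwarding or is still a
 * slave of a bonded device. */
#include <stdio.h>
#include <stdint.h>
#include <rte_ethdev.h>

/* Placeholders: real implementations depend on the application's own
 * port bookkeeping. */
static int app_port_is_forwarding(uint16_t port_id) { (void)port_id; return 0; }
static int app_port_is_bonding_slave(uint16_t port_id) { (void)port_id; return 0; }

static int
app_safe_port_stop(uint16_t port_id)
{
    if (app_port_is_forwarding(port_id)) {
        printf("Port %u is forwarding, stop traffic first\n", port_id);
        return -1;
    }
    if (app_port_is_bonding_slave(port_id)) {
        printf("Port %u is a bonded slave, remove it from the bond first\n",
               port_id);
        return -1;
    }
    rte_eth_dev_stop(port_id);   /* safe: not forwarding, not enslaved */
    return 0;
}
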
>>>>>>>>
>>>>>>> I will admit to not being fully convinced, but if nobody else has
>>>>>>> any serious objections, and since this patch has been reviewed
>>>>>>> and acked, I'm ok to merge it in. I'll do so shortly.
>>>>>>
>>>>>> Please hold on.
>>>>>> Seeing locks introduced in the Rx/Tx path is an alert.
>>>>>> We clearly need a design document to explain where locks can be
>>>>>> used and what the responsibilities of the control plane are.
>>>>>> If everybody agrees in this document that DPDK can have some locks
>>>>>> in the fast path, then OK to merge it.
>>>>>>
>>>>>> So I would say NACK for 16.07 and maybe postpone to 16.11.
>>>>>
>>>>> Looking at the documentation for the bonding PMD:
>>>>>
>>>>> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html
>>>>>
>>>>> In section 10.2 it states the following:
>>>>>
>>>>> Bonded devices support the dynamic addition and removal of slave
>>>>> devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove
>>>>> APIs.
>>>>>
>>>>> If a slave device is added or removed while traffic is running, there
>>>>> is the possibility of a segmentation fault in the rx/tx burst
>>>>> functions. This is most likely to occur in the round robin bonding
>>>>> mode.
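
For reference, the remove-while-traffic-is-running sequence under discussion
uses the public API roughly as follows (port ids and error handling are
illustrative; the port id width follows the releases discussed in this
thread):

/* Illustrative only: the remove / stop / close / detach sequence used
 * during VM live migration, with made-up port ids. */
#include <stdio.h>
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_eth_bond.h>

static int
remove_and_detach_slave(uint8_t bonded_port_id, uint8_t slave_port_id)
{
    char devname[64];

    /* The step that races with the rx/tx burst functions when no locking
     * is in place: the slave table changes while traffic is running. */
    if (rte_eth_bond_slave_remove(bonded_port_id, slave_port_id) != 0) {
        printf("failed to remove slave %u from bonded port %u\n",
               slave_port_id, bonded_port_id);
        return -1;
    }

    rte_eth_dev_stop(slave_port_id);
    rte_eth_dev_close(slave_port_id);

    if (rte_eth_dev_detach(slave_port_id, devname) != 0) {
        printf("failed to detach port %u\n", slave_port_id);
        return -1;
    }

    printf("detached device %s\n", devname);
    return 0;
}
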
>>>>>
>>>>> This patch set fixes what appears to be a bug in the bonding PMD.
>>>>
>>>> It can be fixed by removing this statement in the doc.
>>>>
>>>> One of the design principles of DPDK is to avoid locks.
>>>>
>>>>> Performance measurements have been made with 64 byte packets, both
>>>>> with and without this patch set applied.
>>>>>
>>>>> With the patches applied the following drop in performance was
>>>>> observed:
>>>>>
>>>>> % drop for fwd+io:  0.16%
>>>>> % drop for fwd+mac: 0.39%
>>>>>
>>>>> This patch set has been reviewed and ack'ed, so I think it should
>>>>> be applied in 16.07.
>>>>
>>>> I understand your point of view and I gave mine.
>>>> Now we need more opinions from others.
>>>>
>>>
>>> Hi,
>>>
>>> These patches have been sitting in patchwork for a long time. The
>>> discussion never concluded and the patches have been deferred release
>>> after release.
>>>
>>> I think we should make a decision about them:
>>>
>>> 1- We can merge them in this release; they fix a valid problem, and
>>> the patches are already acked.
>>>
>>> 2- We can reject them; if not having them for more than six months has
>>> not caused a problem, perhaps they are not really needed. And if
>>> somebody needs them in the future, we can resurrect them from
>>> patchwork.
>>>
>>> I vote for option 2, any comments?
>>>
>> +1 on option 2. They are obviously not badly needed if nobody has asked
>> for them in over six months.
>>
>> 	/Bruce
> 
> I am ok with option 2, provided they can be retrieved if needed.

Patches marked as rejected in patchwork.

For future reference, patchwork ids:
http://dpdk.org/dev/patchwork/patch/13482/
http://dpdk.org/dev/patchwork/patch/13483/
http://dpdk.org/dev/patchwork/patch/13484/
http://dpdk.org/dev/patchwork/patch/13485/

Thanks,
ferruh


