[dpdk-dev] [PATCH] Revert "bonding: use existing enslaved device queues"

Ilya Maximets i.maximets at samsung.com
Fri Oct 28 08:14:17 CEST 2016


On 25.10.2016 09:26, Ilya Maximets wrote:
> On 24.10.2016 17:54, Jan Blunck wrote:
>> On Wed, Oct 19, 2016 at 5:47 AM, Ilya Maximets <i.maximets at samsung.com> wrote:
>>> On 18.10.2016 18:19, Jan Blunck wrote:
>>>> On Tue, Oct 18, 2016 at 2:49 PM, Ilya Maximets <i.maximets at samsung.com> wrote:
>>>>> On 18.10.2016 15:28, Jan Blunck wrote:
>>>>>> If the application already configured queues the PMD should not
>>>>>> silently claim ownership and reset them.
>>>>>>
>>>>>> What exactly is the problem when changing MTU? This works fine from
>>>>>> what I can tell.
>>>>>
>>>>> Following scenario leads to APP PANIC:
>>>>>
>>>>>         1. mempool_1 = rte_mempool_create()
>>>>>         2. rte_eth_rx_queue_setup(bond0, ..., mempool_1);
>>>>>         3. rte_eth_dev_start(bond0);
>>>>>         4. mempool_2 = rte_mempool_create();
>>>>>         5. rte_eth_dev_stop(bond0);
>>>>>         6. rte_eth_rx_queue_setup(bond0, ..., mempool_2);
>>>>>         7. rte_eth_dev_start(bond0);
>>>>>         * RX queues still use 'mempool_1' because reconfiguration doesn't affect them. *
>>>>>         8. rte_mempool_free(mempool_1);
>>>>>         9. On any rx operation we'll get PANIC because of using freed 'mempool_1':
>>>>>          PANIC in rte_mempool_get_ops():
>>>>>          assert "(ops_index >= 0) && (ops_index < RTE_MEMPOOL_MAX_OPS_IDX)" failed
>>>>>
>>>>> You may just start OVS 2.6 with DPDK bonding device and attempt to change MTU via 'mtu_request'.
>>>>> Bug is easily reproducible.
>>>>>
>>>>
>>>> I see. I'm not 100% that this is expected to work without leaking the
>>>> driver's queues though. The driver is allowed to do allocations in
>>>> its rx_queue_setup() function that are being freed via
>>>> rx_queue_release() later. But rx_queue_release() is only called if you
>>>> reconfigure the
>>>> device with 0 queues.
> 
> It's not true. Drivers usually calls 'rx_queue_release()' inside
> 'rx_queue_setup()' function while reallocating the already allocated
> queue. (See ixgbe driver for example). Also all HW queues are
> usually destroyed inside 'eth_dev_stop()' and reallocated in
> 'eth_dev_start()' or '{rx,tx}_queue_setup()'.
> So, there is no leaks at all.
> 
>>>> From what I understand there is no other way to
>>>> reconfigure a device to use another mempool.
>>>>
>>>> But ... even that wouldn't work with the bonding driver right now: the
>>>> bonding master only configures the slaves during startup. I can put
>>>> that on my todo list.
> 
> No, bonding master reconfigures new slaves in 'rte_eth_bond_slave_add()'
> if needed.
> 
>>>> Coming back to your original problem: changing the MTU for the bond
>>>> does work through rte_eth_dev_set_mtu() for slaves supporting that. In
>>>> any other case you could (re-)configure rxmode.max_rx_pkt_len (and
>>>> jumbo_frame / enable_scatter accordingly). This does work without a
>>>> call to rte_eth_rx_queue_setup().
>>>
>>> Thanks for suggestion, but using of rte_eth_dev_set_mtu() without
>>> reconfiguration will require to have mempools with huge mbufs (9KB)
>>> for all ports from the start. This is unacceptable because leads to
>>> significant performance regressions because of fast cache exhausting.
>>> Also this will require big work to rewrite OVS reconfiguration code
>>> this way.
>>> Anyway, it isn't the MTU only problem. Number of rx/tx descriptors
>>> also can't be changed in runtime.
>>>
>>>
>>> I'm not fully understand what is the use case for this 'reusing' code.
>>> Could you, please, describe situation where this behaviour is necessary?
>>
>> The device that is added to the bond was used before and therefore
>> already has allocated queues. Therefore we reuse the existing queues
>> of the devices instead of borrowing the queues of the bond device. If
>> the slave is removed from the bond again there is no need to allocate
>> the queues again.
>>
>> Hope that clarifies the usecase
> 
> So, As I see, this is just an optimization that leads to differently
> configured queues of same device and possible application crash if the
> old configuration doesn't valid any more.
> 
> With revert applied in your usecase while adding the device to bond
> it's queues will be automatically reconfigured according to configuration
> of the bond device. If the slave is removed from the bond all its'
> queues will remain as configured by bond. You can reconfigure them if
> needed. I guess, that in your case configuration of slave devices,
> actually, matches configuration of bond device. In that case slave
> will remain in the same state after removing from bond as it was
> before adding.

So, Jan, Declan, what do you think about this?

Best regards, Ilya Maximets.


More information about the dev mailing list