[dpdk-dev] [PATCH] Revert "bonding: use existing enslaved device queues"

Ilya Maximets i.maximets at samsung.com
Fri Nov 11 10:16:38 CET 2016


On 28.10.2016 09:14, Ilya Maximets wrote:
> On 25.10.2016 09:26, Ilya Maximets wrote:
>> On 24.10.2016 17:54, Jan Blunck wrote:
>>> On Wed, Oct 19, 2016 at 5:47 AM, Ilya Maximets <i.maximets at samsung.com> wrote:
>>>> On 18.10.2016 18:19, Jan Blunck wrote:
>>>>> On Tue, Oct 18, 2016 at 2:49 PM, Ilya Maximets <i.maximets at samsung.com> wrote:
>>>>>> On 18.10.2016 15:28, Jan Blunck wrote:
>>>>>>> If the application already configured queues the PMD should not
>>>>>>> silently claim ownership and reset them.
>>>>>>> What exactly is the problem when changing MTU? This works fine from
>>>>>>> what I can tell.
>>>>>> Following scenario leads to APP PANIC:
>>>>>>         1. mempool_1 = rte_mempool_create()
>>>>>>         2. rte_eth_rx_queue_setup(bond0, ..., mempool_1);
>>>>>>         3. rte_eth_dev_start(bond0);
>>>>>>         4. mempool_2 = rte_mempool_create();
>>>>>>         5. rte_eth_dev_stop(bond0);
>>>>>>         6. rte_eth_rx_queue_setup(bond0, ..., mempool_2);
>>>>>>         7. rte_eth_dev_start(bond0);
>>>>>>         * RX queues still use 'mempool_1' because reconfiguration doesn't affect them. *
>>>>>>         8. rte_mempool_free(mempool_1);
>>>>>>         9. On any rx operation we'll get PANIC because of using freed 'mempool_1':
>>>>>>          PANIC in rte_mempool_get_ops():
>>>>>>          assert "(ops_index >= 0) && (ops_index < RTE_MEMPOOL_MAX_OPS_IDX)" failed
>>>>>> You may just start OVS 2.6 with DPDK bonding device and attempt to change MTU via 'mtu_request'.
>>>>>> Bug is easily reproducible.
>>>>> I see. I'm not 100% that this is expected to work without leaking the
>>>>> driver's queues though. The driver is allowed to do allocations in
>>>>> its rx_queue_setup() function that are being freed via
>>>>> rx_queue_release() later. But rx_queue_release() is only called if you
>>>>> reconfigure the
>>>>> device with 0 queues.
>> It's not true. Drivers usually calls 'rx_queue_release()' inside
>> 'rx_queue_setup()' function while reallocating the already allocated
>> queue. (See ixgbe driver for example). Also all HW queues are
>> usually destroyed inside 'eth_dev_stop()' and reallocated in
>> 'eth_dev_start()' or '{rx,tx}_queue_setup()'.
>> So, there is no leaks at all.
>>>>> From what I understand there is no other way to
>>>>> reconfigure a device to use another mempool.
>>>>> But ... even that wouldn't work with the bonding driver right now: the
>>>>> bonding master only configures the slaves during startup. I can put
>>>>> that on my todo list.
>> No, bonding master reconfigures new slaves in 'rte_eth_bond_slave_add()'
>> if needed.
>>>>> Coming back to your original problem: changing the MTU for the bond
>>>>> does work through rte_eth_dev_set_mtu() for slaves supporting that. In
>>>>> any other case you could (re-)configure rxmode.max_rx_pkt_len (and
>>>>> jumbo_frame / enable_scatter accordingly). This does work without a
>>>>> call to rte_eth_rx_queue_setup().
>>>> Thanks for suggestion, but using of rte_eth_dev_set_mtu() without
>>>> reconfiguration will require to have mempools with huge mbufs (9KB)
>>>> for all ports from the start. This is unacceptable because leads to
>>>> significant performance regressions because of fast cache exhausting.
>>>> Also this will require big work to rewrite OVS reconfiguration code
>>>> this way.
>>>> Anyway, it isn't the MTU only problem. Number of rx/tx descriptors
>>>> also can't be changed in runtime.
>>>> I'm not fully understand what is the use case for this 'reusing' code.
>>>> Could you, please, describe situation where this behaviour is necessary?
>>> The device that is added to the bond was used before and therefore
>>> already has allocated queues. Therefore we reuse the existing queues
>>> of the devices instead of borrowing the queues of the bond device. If
>>> the slave is removed from the bond again there is no need to allocate
>>> the queues again.
>>> Hope that clarifies the usecase
>> So, As I see, this is just an optimization that leads to differently
>> configured queues of same device and possible application crash if the
>> old configuration doesn't valid any more.
>> With revert applied in your usecase while adding the device to bond
>> it's queues will be automatically reconfigured according to configuration
>> of the bond device. If the slave is removed from the bond all its'
>> queues will remain as configured by bond. You can reconfigure them if
>> needed. I guess, that in your case configuration of slave devices,
>> actually, matches configuration of bond device. In that case slave
>> will remain in the same state after removing from bond as it was
>> before adding.
> So, Jan, Declan, what do you think about this?
> Best regards, Ilya Maximets.

More information about the dev mailing list