[dpdk-dev] [PATCH v2] kni: Fix request overwritten
Ferruh Yigit
ferruh.yigit at intel.com
Mon Oct 4 16:51:13 CEST 2021
On 10/4/2021 3:25 PM, Elad Nachman wrote:
Can you please try to not top post, it will make impossible to follow this
discussion later from the mail archives.
> 1. Userspace will get an error
So there is nothing special with returning '-EAGAIN', user will only observe an
error.
Wasn't initial intention to use '-EAGAIN' to try request again?
> 2. Waiting with rtnl locked causes a deadlock; waiting with rtnl unlocked
> for interface down command causes a crash because of a race condition in
> the device delete/unregister list in the kernel.
>
Why waiting with rthnl lock causes a deadlock? As said below we are already
doing it, why it is different with retry logic?
I agree to not wait with rtnl unlocked.
> FYI,
>
> Elad.
>
> בתאריך יום ב׳, 4 באוק׳ 2021, 17:13, מאת Ferruh Yigit <
> ferruh.yigit at intel.com>:
>
>> On 10/4/2021 2:09 PM, Elad Nachman wrote:
>>> Hi,
>>>
>>> EAGAIN is propogated back to the kernel and to the caller.
>>>
>>
>> So will the user get an error, or it will be handled by the kernel and
>> retried?
>>
>>> We cannot retry from the kni kernel module since we hold the rtnl lock.
>>>
>>
>> Why not? We are already waiting until a command time out, like
>> 'kni_net_open()'
>> can retry if 'kni_net_process_request()' returns '-EAGAIN'. And we can
>> limit the
>> number of retry for safety.
>>
>>> FYI,
>>>
>>> Elad
>>>
>>> בתאריך יום ב׳, 4 באוק׳ 2021, 16:05, מאת Ferruh Yigit <
>>> ferruh.yigit at intel.com>:
>>>
>>>> On 9/24/2021 11:54 AM, Elad Nachman wrote:
>>>>> Fix lack of multiple KNI requests handling support by introducing a
>>>>> request in progress flag which will fail additional requests with
>>>>> EAGAIN return code if the original request has not been processed
>>>>> by user-space.
>>>>>
>>>>> Bugzilla ID: 809
>>>>
>>>> Hi Eric,
>>>>
>>>> Can you please test this patch, if it solves the issue you reported?
>>>>
>>>>>
>>>>> Signed-off-by: Elad Nachman <eladv6 at gmail.com>
>>>>> ---
>>>>> kernel/linux/kni/kni_net.c | 9 +++++++++
>>>>> lib/kni/rte_kni.c | 2 ++
>>>>> lib/kni/rte_kni_common.h | 1 +
>>>>> 3 files changed, 12 insertions(+)
>>>>>
>>>>
>>>> <...>
>>>>
>>>>> @@ -123,7 +124,15 @@ kni_net_process_request(struct net_device *dev,
>>>> struct rte_kni_request *req)
>>>>>
>>>>> mutex_lock(&kni->sync_lock);
>>>>>
>>>>> + /* Check that existing request has been processed: */
>>>>> + cur_req = (struct rte_kni_request *)kni->sync_kva;
>>>>> + if (cur_req->req_in_progress) {
>>>>> + ret = -EAGAIN;
>>>>
>>>> Overall logic in the KNI looks good to me, this helps to serialize the
>>>> requests
>>>> even for async ones.
>>>>
>>>> But can you please clarify how it behaves in the kernel side with
>> '-EAGAIN'
>>>> return type? Will linux call the ndo again, or will it just fail.
>>>>
>>>> If it just fails should we handle the re-try on '-EAGAIN' within the kni
>>>> module?
>>>>
>>>>
>>
>>
More information about the dev
mailing list