[PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
Ferruh Yigit
ferruh.yigit at amd.com
Mon Mar 6 12:00:36 CET 2023
On 3/6/2023 10:22 AM, Konstantin Ananyev wrote:
>
>
>>>>>> In the proactive error handling mode, the PMD will set the data path
>>>>>> pointers to dummy functions and then try recovery, in this period the
>>>>>> application may still invoking data path API. This will introduce a
>>>>>> race-condition with data path which may lead to crash [1].
>>>>>>
>>>>>> Although the PMD added delay after setting data path pointers to cover
>>>>>> the above race-condition, it reduces the probability, but it doesn't
>>>>>> solve the problem.
>>>>>>
>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>> requirements are added:
>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>> report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>> 2. The application should stop data path API invocation when process
>>>>>> the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>>> report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>> 4. The application should enable data path API invocation when process
>>>>>> the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>
>>>>
>>>> How this is solving the race-condition, by pushing responsibility to
>>>> stop data path to application?
>>>
>>> Exactly, it becomes application responsibility to make sure data-path is
>>> stopped/suspended before recovery will continue.
>>>
>>
>> From documentation of the feature:
>>
>> ``
>> Because the PMD recovers automatically,
>> the application can only sense that the data flow is disconnected for a
>> while and the control API returns an error in this period.
>>
>> In order to sense the error happening/recovering, as well as to restore
>> some additional configuration, three events are available:
>> ``
>>
>> It looks like initial design is to use events mainly inform application
>> about what happened and mainly for re-configuration.
>>
>> Although I am don't disagree to involve the application, I am not sure
>> that is part of current design.
>
> I thought we all agreed that initial design contain some fallacies that
> need to fixed, no?
> Statement that with current rte_ethdev design error recovery can be done
> without interaction with the app (to stop/suspend data/control path)
> is the main one I think.
> It needs some interaction with app layer, one way or another.
>
>>>>
>>>> What if application is not interested in recovery modes at all and not
>>>> registered any callback for the recovery?
>>>
>>>
>>> Are you saying there is no way for application to disable
>>> automatic recovery in PMD if it is not interested
>>> (or can't full-fill per-requesties for it)?
>>> If so, then yes it is a problem and we need to fix it.
>>> I assumed that such mechanism to disable unwanted events already exists,
>>> but I can't find anything.
>>> Wonder what would be the easiest way here - can PMD make a decision
>>> based on callback return value, or do we need a new API to
>>> enable/disable callbacks, or ...?
>>>
>>>
>>
>> As far as I can see automatic recovery is not configurable by app.
>>
>> But that is not all, PMD sends events to application but PMD can't know
>> if application is handling them or not, so with current design PMD can't
>> rely on to app.
>
> Well, PMD invokes user provided callback.
> One way to fix that problem - if there is no callback provided,
> or callback returns an error code - PMD can assume that recovery
> should not be done.
> That is probably not the best design choice, but at least it will allow
> to fix the problem without too many changes and introducing new API.
> That could be sort of a 'quick fix'.
> In a meanwhile we can think about new/better approach for that.
>
-rc2 for 23.03 is a few days away.
What do you think to have 'quick fix' as modifying how driver updates
burst ops to prevent the race condition, for this release?
And plan a design update for the next release?
>>
>>>> I think driver should not rely on application for this, unless
>>>> application explicitly says (to driver) that it is handling recovery,
>>>> right now there is no way for driver to know this.
>>>
>>> I think it is visa-versa:
>>> application should not enable auto-recovery if it can't meet
>>> per-requeststies for it (provide appropriate callback).
>>>
>>
>> I agree on above, we are saying similar thing in different perspective.
>
> Ok, that's good we are on the same page.
>
>
>>
>>>
>>>>
>>>>>> Also, this patch introduce a driver internal function
>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>
>>>>>> [1]
>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kaladi@intel.com/
>>>>>>
>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>> Cc: stable at dpdk.org
>>>>>>
>>>>>> Signed-off-by: Chengwen Feng <fengchengwen at huawei.com>
>>>>>> ---
>>>>>> doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>> lib/ethdev/ethdev_driver.c | 8 +++++++
>>>>>> lib/ethdev/ethdev_driver.h | 10 ++++++++
>>>>>> lib/ethdev/rte_ethdev.h | 32
>>>>>> +++++++++++++++----------
>>>>>> lib/ethdev/version.map | 1 +
>>>>>> 5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>
>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> index c145a9066c..e380ff135a 100644
>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>>>> in PASSIVE mode,
>>>>>> the PMD automatically recovers from error in PROACTIVE mode,
>>>>>> and only a small amount of work is required for the application.
>>>>>>
>>>>>> -During error detection and automatic recovery,
>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>> -(which will prevent the crash),
>>>>>> -and also make sure the control path operations fail with a return
>>>>>> code ``-EBUSY``.
>>>>>> -
>>>>>> -Because the PMD recovers automatically,
>>>>>> -the application can only sense that the data flow is disconnected
>>>>>> for a while
>>>>>> -and the control API returns an error in this period.
>>>>>> +During error detection and automatic recovery, the PMD sets the
>>>>>> data path
>>>>>> +pointers to dummy functions and also make sure the control path
>>>>>> operations
>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>
>>>>>> In order to sense the error happening/recovering,
>>>>>> as well as to restore some additional configuration,
>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>
>>>>>> ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>> Notify the application that an error is detected
>>>>>> - and the recovery is being started.
>>>>>> + and the recovery is about to start.
>>>>>> Upon receiving the event, the application should not invoke
>>>>>> - any control path function until receiving
>>>>>> + any control and data path API until receiving
>>>>>> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>
>>>>>> .. note::
>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>
>>>>>> ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>> Notify the application that the recovery from error is successful,
>>>>>> - the PMD already re-configures the port,
>>>>>> - and the effect is the same as a restart operation.
>>>>>> + the PMD already re-configures the port.
>>>>>> + The application should restore some additional configuration,
>>>>>> and then
>>>>>> + enable data path API invocation.
>>>>>>
>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>> Notify the application that the recovery from error failed,
>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>> *dev, const char *ring_name,
>>>>>> return rc;
>>>>>> }
>>>>>>
>>>>>> +void
>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>> +{
>>>>>> + if (dev == NULL)
>>>>>> + return;
>>>>>> + eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>> +}
>>>>>> +
>>>>>> const struct rte_memzone *
>>>>>> rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>>>> *ring_name,
>>>>>> uint16_t queue_id, size_t size, unsigned int align,
>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>> rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>> char *name,
>>>>>> uint16_t queue_id);
>>>>>>
>>>>>> +/**
>>>>>> + * @internal
>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>> + *
>>>>>> + * @param dev
>>>>>> + * Pointer to struct rte_eth_dev.
>>>>>> + */
>>>>>> +__rte_internal
>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>> +
>>>>>> /**
>>>>>> * @internal
>>>>>> * Atomically set the link status for the specific device.
>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>> */
>>>>>> RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>> /** Port recovering from a hardware or firmware error.
>>>>>> - * If PMD supports proactive error recovery,
>>>>>> - * it should trigger this event to notify application
>>>>>> - * that it detected an error and the recovery is being started.
>>>>>> - * Upon receiving the event, the application should not invoke
>>>>>> any control path API
>>>>>> - * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>>>> receiving
>>>>>> - * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>> - * The PMD will set the data path pointers to dummy functions,
>>>>>> - * and re-set the data path pointers to non-dummy functions
>>>>>> - * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>> - * It means that the application cannot send or receive any
>>>>>> packets
>>>>>> - * during this period.
>>>>>> + *
>>>>>> + * If PMD supports proactive error recovery, it should trigger
>>>>>> this
>>>>>> + * event to notify application that it detected an error and the
>>>>>> + * recovery is about to start.
>>>>>> + *
>>>>>> + * Upon receiving the event, the application should not invoke any
>>>>>> + * control and data path API until receiving
>>>>>> + * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>> + * event.
>>>>>> + *
>>>>>> + * Once this event is reported, the PMD will set the data path
>>>>>> pointers
>>>>>> + * to dummy functions, and re-set the data path pointers to valid
>>>>>> + * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>> event.
>>>>>> + *
>>>>>> * @note Before the PMD reports the recovery result,
>>>>>> * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>> again,
>>>>>> * because a larger error may occur during the recovery.
>>>>>> */
>>>>>> RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>> /** Port recovers successfully from the error.
>>>>>> - * The PMD already re-configured the port,
>>>>>> - * and the effect is the same as a restart operation.
>>>>>> + *
>>>>>> + * The PMD already re-configured the port:
>>>>>> * a) The following operation will be retained: (alphabetically)
>>>>>> * - DCB configuration
>>>>>> * - FEC configuration
>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>> * (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>> * c) Any other configuration will not be stored
>>>>>> * and will need to be re-configured.
>>>>>> + *
>>>>>> + * The application should restore some additional configuration
>>>>>> + * (see above case b/c), and then enable data path API invocation.
>>>>>> */
>>>>>> RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>> /** Port recovery failed.
>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>> --- a/lib/ethdev/version.map
>>>>>> +++ b/lib/ethdev/version.map
>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>> rte_eth_devices;
>>>>>> rte_eth_dma_zone_free;
>>>>>> rte_eth_dma_zone_reserve;
>>>>>> + rte_eth_fp_ops_setup;
>>>>>> rte_eth_hairpin_queue_peer_bind;
>>>>>> rte_eth_hairpin_queue_peer_unbind;
>>>>>> rte_eth_hairpin_queue_peer_update;
>>>>>> --
>>>>> Acked-by: Konstantin Ananyev <konstantin.ananyev at huawei.com>
>>>>>
>>>>>> 2.17.1
>>>>>
>>>>
>>>
>
More information about the dev
mailing list