[PATCH 1/5] ethdev: fix race-condition of proactive error handling mode

Konstantin Ananyev konstantin.v.ananyev at yandex.ru
Sun Mar 5 16:23:44 CET 2023


>>>>
>>>> In the proactive error handling mode, the PMD will set the data path
>>>> pointers to dummy functions and then try recovery, in this period the
>>>> application may still invoking data path API. This will introduce a
>>>> race-condition with data path which may lead to crash [1].
>>>>
>>>> Although the PMD added delay after setting data path pointers to
>>>> cover the above race-condition, it reduces the probability, but it
>>>> doesn't solve the problem.
>>>>
>>>> To solve the race-condition problem fundamentally, the following
>>>> requirements are added:
>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>      report RTE_ETH_EVENT_ERR_RECOVERING event.
>>> Do you mean to say, PMD should set the data path pointers after calling the
>> call back function?
>>> The PMD is running in the context of multiple EAL threads. How do these
>> threads synchronize such that only one thread sets these data pointers?
>>
>> As I understand this event callback supposed to be called in the context of EAL
>> interrupt thread (whoever is more familiar with original idea, feel free to correct
>> me if I missed something).
> I could not figure this out. It looks to be called from the data plane thread context.
> I also have a thought on alternate design at the end, appreciate if you can take a look.
>   
>> How it is going to signal data-path threads that they need to stop/suspend
>> calling data-path API - that's I suppose is left to application to decide...
>> Same as right now it is application responsibility to stop data-path threads
>> before doing dev_stop()/dev/_config()/etc.
> Ok, good, this expectation is not new. The application must have a mechanism already.
> 
>>
>>
>>>
>>>> 2. The application should stop data path API invocation when process
>>>>      the RTE_ETH_EVENT_ERR_RECOVERING event.
>>> Any thoughts on how an application can do this?
> We can ignore this question as there is already similar expectation set for earlier functionalities.
> 
>>>
>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>      report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> 4. The application should enable data path API invocation when process
>>>>      the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>> Do you mean to say that the application should not call the datapath APIs
>> while the PMD is running the recovery process?
>>
>> Yes, I believe that's the intention.
> Ok, this is good and makes sense.
> 
>>
>>>>
>>>> Also, this patch introduce a driver internal function
>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>
>>>> [1]
>>>>
>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2
>>>> -
>>>> ashok.k.kaladi at intel.com/
>>>>
>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>> Cc: stable at dpdk.org
>>>>
>>>> Signed-off-by: Chengwen Feng <fengchengwen at huawei.com>
>>>> ---
>>>>    doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>    lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>    lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>    lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>>    lib/ethdev/version.map                  |  1 +
>>>>    5 files changed, 46 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> index c145a9066c..e380ff135a 100644
>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>> in PASSIVE mode,  the PMD automatically recovers from error in
>>>> PROACTIVE mode,  and only a small amount of work is required for the
>> application.
>>>>
>>>> -During error detection and automatic recovery, -the PMD sets the
>>>> data path pointers to dummy functions -(which will prevent the
>>>> crash), -and also make sure the control path operations fail with a return
>> code ``-EBUSY``.
>>>> -
>>>> -Because the PMD recovers automatically, -the application can only
>>>> sense that the data flow is disconnected for a while -and the control
>>>> API returns an error in this period.
>>>> +During error detection and automatic recovery, the PMD sets the data
>>>> +path pointers to dummy functions and also make sure the control path
>>>> +operations failed with a return code ``-EBUSY``.
>>>>
>>>>    In order to sense the error happening/recovering,  as well as to
>>>> restore some additional configuration, @@ -653,9 +648,9 @@ three events
>> are available:
>>>>
>>>>    ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>       Notify the application that an error is detected
>>>> -   and the recovery is being started.
>>>> +   and the recovery is about to start.
>>>>       Upon receiving the event, the application should not invoke
>>>> -   any control path function until receiving
>>>> +   any control and data path API until receiving
>>>>       ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>
>>>>    .. note::
>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>
>>>>    ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>       Notify the application that the recovery from error is successful,
>>>> -   the PMD already re-configures the port,
>>>> -   and the effect is the same as a restart operation.
>>>> +   the PMD already re-configures the port.
>>>> +   The application should restore some additional configuration, and
>>>> + then
>>> What is the additional configuration? Is this specific to each NIC/PMD?
>>> I thought, this is an auto recovery process and the application does not require
>> to reconfigure anything. If the application has to restore the configuration, how
>> does auto recovery differ from typical recovery process?
>>>
>>>> +   enable data path API invocation.
>>>>
>>>>    ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>       Notify the application that the recovery from error failed, diff
>>>> --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index
>>>> 0be1e8ca04..f994653fe9 100644
>>>> --- a/lib/ethdev/ethdev_driver.c
>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>> *dev, const char *ring_name,
>>>>    	return rc;
>>>>    }
>>>>
>>>> +void
>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) {
>>>> +	if (dev == NULL)
>>>> +		return;
>>>> +	eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); }
>>>> +
>>>>    const struct rte_memzone *
>>>>    rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>> *ring_name,
>>>>    			 uint16_t queue_id, size_t size, unsigned int align, diff -
>> -git
>>>> a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index
>>>> 2c9d615fb5..0d964d1f67 100644
>>>> --- a/lib/ethdev/ethdev_driver.h
>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>> @@ -1621,6 +1621,16 @@ int
>>>>    rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char
>>>> *name,
>>>>    		 uint16_t queue_id);
>>>>
>>>> +/**
>>>> + * @internal
>>>> + * Setup eth fast-path API to ethdev values.
>>>> + *
>>>> + * @param dev
>>>> + *  Pointer to struct rte_eth_dev.
>>>> + */
>>>> +__rte_internal
>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>> +
>>>>    /**
>>>>     * @internal
>>>>     * Atomically set the link status for the specific device.
>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
>>>> 049641d57c..44ee7229c1 100644
>>>> --- a/lib/ethdev/rte_ethdev.h
>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>    	 */
>>>>    	RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>    	/** Port recovering from a hardware or firmware error.
>>>> -	 * If PMD supports proactive error recovery,
>>>> -	 * it should trigger this event to notify application
>>>> -	 * that it detected an error and the recovery is being started.
>>>> -	 * Upon receiving the event, the application should not invoke any
>>>> control path API
>>>> -	 * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>>> -	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>> -	 * The PMD will set the data path pointers to dummy functions,
>>>> -	 * and re-set the data path pointers to non-dummy functions
>>>> -	 * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> -	 * It means that the application cannot send or receive any packets
>>>> -	 * during this period.
>>>> +	 *
>>>> +	 * If PMD supports proactive error recovery, it should trigger this
>>>> +	 * event to notify application that it detected an error and the
>>>> +	 * recovery is about to start.
>>>> +	 *
>>>> +	 * Upon receiving the event, the application should not invoke any
>>>> +	 * control and data path API until receiving
>>>> +	 * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>> RTE_ETH_EVENT_RECOVERY_FAILED
>>>> +	 * event.
>>>> +	 *
>>>> +	 * Once this event is reported, the PMD will set the data path pointers
>>>> +	 * to dummy functions, and re-set the data path pointers to valid
>>>> +	 * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>> event.
>>> Why do we need to set the data path pointers to dummy functions if the
>> application is restricted from invoking any control and data path APIs till the
>> recovery process is completed?
>>
>> You are right, in theory it is not mandatory.
>> Though it helps to flag a problem if user will still try to call them while recovery is
>> in progress.
> Ok, may be in debug mode.
> I mean, we have already set an expectation to the application that it should not call and the application has implemented a method to do the same. Why do we need to complicate this?
> If the application calls the APIs, it is a programming error.


My preference would be to keep it this way for both debug and non-debug 
mode.
It doesn't cost anything to us in terms of perfomance, but helps to 
catch problems with wrong behaving app.

> 
>> Again, same as we doing in dev_stop().
> 
>>
>>>
>>>> +	 *
>>>>    	 * @note Before the PMD reports the recovery result,
>>>>    	 * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>> again,
>>>>    	 * because a larger error may occur during the recovery.
>>>>    	 */
>>>>    	RTE_ETH_EVENT_ERR_RECOVERING,
>>> I understand this is not a change in this patch. But, just wondering, what is the
>> purpose of this? How is the application supposed to use this?
>>>
>>>>    	/** Port recovers successfully from the error.
>>>> -	 * The PMD already re-configured the port,
>>>> -	 * and the effect is the same as a restart operation.
>>>> +	 *
>>>> +	 * The PMD already re-configured the port:
>>>>    	 * a) The following operation will be retained: (alphabetically)
>>>>    	 *    - DCB configuration
>>>>    	 *    - FEC configuration
>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>    	 *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>    	 * c) Any other configuration will not be stored
>>>>    	 *    and will need to be re-configured.
>>>> +	 *
>>>> +	 * The application should restore some additional configuration
>>>> +	 * (see above case b/c), and then enable data path API invocation.
>>>>    	 */
>>>>    	RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>    	/** Port recovery failed.
>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
>>>> 357d1a88c0..c273e0bdae 100644
>>>> --- a/lib/ethdev/version.map
>>>> +++ b/lib/ethdev/version.map
>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>    	rte_eth_devices;
>>>>    	rte_eth_dma_zone_free;
>>>>    	rte_eth_dma_zone_reserve;
>>>> +	rte_eth_fp_ops_setup;
>>>>    	rte_eth_hairpin_queue_peer_bind;
>>>>    	rte_eth_hairpin_queue_peer_unbind;
>>>>    	rte_eth_hairpin_queue_peer_update;
>>>> --
>>>> 2.17.1
>>>
> 
> Is there any reason not to design this in the same way as 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?

I suppose it is a question for the authors of original patch...

> We could have a similar API 'rte_eth_dev_recover' to do the recovery functionality.

I suppose such approach is also possible.
Personally I am fine with both ways: either existing one or what you 
propose, as long as we'll fix existing race-condition.
What is good with what you suggest - that way we probably don't need to
worry how to allow user to enable/disable auto-recovery inside PMD.

Konstantin




More information about the dev mailing list