[dpdk-dev] [PATCH v7 1/4] ethdev: support device reset and recovery events
Ray Kinsella
mdr at ashroe.eu
Wed Feb 2 12:44:28 CET 2022
Ferruh Yigit <ferruh.yigit at intel.com> writes:
> On 1/28/2022 12:48 PM, Kalesh A P wrote:
>> From: Kalesh AP <kalesh-anakkur.purayil at broadcom.com>
>> Adding support for the device reset and recovery events in the
>> rte_eth_event framework. FW error and FW reset conditions would be
>> managed internally by the PMD without needing application intervention.
>> In such cases, PMD would need reset/recovery events to notify application
>> that PMD is undergoing a reset.
>> While most of the recovery process is transparent to the application since
>> most of the driver ensures recovery from FW reset or FW error conditions,
>> the application will have to reprogram any flows which were offloaded to
>> the underlying hardware.
>> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil at broadcom.com>
>> Signed-off-by: Somnath Kotur <somnath.kotur at broadcom.com>
>> Reviewed-by: Ajit Khaparde <ajit.khaparde at broadcom.com>
>> ---
>> doc/guides/prog_guide/poll_mode_drv.rst | 24 ++++++++++++++++++++++++
>> lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>> 2 files changed, 42 insertions(+)
>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>> b/doc/guides/prog_guide/poll_mode_drv.rst
>> index 6831289..9ecc0e4 100644
>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>> @@ -623,3 +623,27 @@ by application.
>> The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger
>> the application to handle reset event. It is duty of application to
>> handle all synchronization before it calls rte_eth_dev_reset().
>> +
>> +Error recovery support
>> +~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +When the PMD detects a FW reset or error condition, it may try to recover
>> +from the error without needing the application intervention. In such cases,
>> +PMD would need events to notify the application that it is undergoing
>> +an error recovery.
>> +
>> +The PMD should trigger RTE_ETH_EVENT_ERR_RECOVERING event to notify the
>> +application that PMD detected a FW reset or FW error condition. PMD may
>> +try to recover from the error by itself. Data path may be quiesced and
>> +control path operations may fail during the recovery period. The application
>> +should stop polling till it receives RTE_ETH_EVENT_RECOVERED event from the PMD.
>> +
>> +The PMD should trigger RTE_ETH_EVENT_RECOVERED event to notify the application
>> +that it has recovered from the error condition. PMD re-configures the port
>> +to the state prior to the error condition. Control path and data path are up now.
>> +Since the device has undergone a reset, flow rules offloaded prior to reset
>> +may be lost and the application should recreate the rules again.
>> +
>> +The PMD should trigger RTE_ETH_EVENT_INTR_RMV event to notify the application
>> +that it has failed to recover from the error condition. The device may not be
>> +usable anymore.
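
To make the event sequence described above concrete, here is a minimal sketch of how an application might react to the three events. It is only an illustration under stated assumptions: the enum is a stand-in so the snippet compiles without DPDK, and `struct port_state` and `handle_port_event()` are hypothetical names, not part of ethdev or of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the events discussed in the patch. Hypothetical mock so
 * the sketch compiles without DPDK; a real application would use
 * enum rte_eth_event_type from <rte_ethdev.h> instead. */
enum mock_eth_event {
	MOCK_EVENT_INTR_RMV,       /* recovery failed, device may be unusable */
	MOCK_EVENT_ERR_RECOVERING, /* PMD detected a FW reset or error */
	MOCK_EVENT_RECOVERED,      /* PMD restored the port */
};

/* Hypothetical per-port state kept by the application. */
struct port_state {
	bool polling_enabled;   /* data path may poll the port */
	bool flows_need_replay; /* offloaded flow rules must be recreated */
	bool removed;           /* device is no longer usable */
};

/* Update application state for one event; returns 0 on success. */
static int
handle_port_event(uint16_t port_id, enum mock_eth_event ev,
		  struct port_state *st)
{
	(void)port_id; /* unused in this sketch */

	switch (ev) {
	case MOCK_EVENT_ERR_RECOVERING:
		/* Stop polling until the PMD reports recovery. */
		st->polling_enabled = false;
		break;
	case MOCK_EVENT_RECOVERED:
		/* Port is back, but offloaded flow rules may be lost. */
		st->polling_enabled = true;
		st->flows_need_replay = true;
		break;
	case MOCK_EVENT_INTR_RMV:
		/* Recovery failed; treat the device as gone. */
		st->polling_enabled = false;
		st->removed = true;
		break;
	}
	return 0;
}
```

In a real application the handler would presumably be registered through rte_eth_dev_callback_register(), and the flags would likely need atomic access if the callback runs in a different thread than the polling loop.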
>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>> index 147cc1c..a46819f 100644
>> --- a/lib/ethdev/rte_ethdev.h
>> +++ b/lib/ethdev/rte_ethdev.h
>> @@ -3818,6 +3818,24 @@ enum rte_eth_event_type {
>> RTE_ETH_EVENT_DESTROY, /**< port is released */
>> RTE_ETH_EVENT_IPSEC, /**< IPsec offload related event */
>> RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected */
>> + RTE_ETH_EVENT_ERR_RECOVERING,
>> + /**< port recovering from an error
>> + *
>> + * PMD detected a FW reset or error condition.
>> + * PMD will try to recover from the error.
>> + * Data path may be quiesced and Control path operations
>> + * may fail at this time.
>> + */
>> + RTE_ETH_EVENT_RECOVERED,
>> + /**< port recovered from an error
>> + *
>> + * PMD has recovered from the error condition.
>> + * Control path and Data path are up now.
>> + * PMD re-configures the port to the state prior to the error.
>> + * Since the device has undergone a reset, flow rules
>> + * offloaded prior to reset may be lost and
>> + * the application should recreate the rules again.
>> + */
>> RTE_ETH_EVENT_MAX /**< max value of this enum */
>
>
> Also, the ABI check complains about the 'RTE_ETH_EVENT_MAX' value change; cc'ed more
> people to evaluate whether it is a false positive:
>
>
> 1 function with some indirect sub-type change:
>   [C] 'function int rte_eth_dev_callback_register(uint16_t, rte_eth_event_type, rte_eth_dev_cb_fn, void*)' at rte_ethdev.c:4637:1 has some indirect sub-type changes:
>     parameter 3 of type 'typedef rte_eth_dev_cb_fn' has sub-type changes:
>       underlying type 'int (typedef uint16_t, enum rte_eth_event_type, void*, void*)*' changed:
>         in pointed to type 'function type int (typedef uint16_t, enum rte_eth_event_type, void*, void*)':
>           parameter 2 of type 'enum rte_eth_event_type' has sub-type changes:
>             type size hasn't changed
>             2 enumerator insertions:
>               'rte_eth_event_type::RTE_ETH_EVENT_ERR_RECOVERING' value '11'
>               'rte_eth_event_type::RTE_ETH_EVENT_RECOVERED' value '12'
>             1 enumerator change:
>               'rte_eth_event_type::RTE_ETH_EVENT_MAX' from value '11' to '13' at rte_ethdev.h:3807:1
I don't immediately see what problem this would cause.
There are no array sizes, for instance, that depend on the value of RTE_ETH_EVENT_MAX.
Looks safe?
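
For reference, the kind of dependency that would make a MAX change unsafe looks roughly like the sketch below. All types here are hypothetical and are not taken from ethdev; the two enums stand in for the "before" and "after" versions of a public header.

```c
#include <stddef.h>

/* Hypothetical "before" and "after" versions of a public enum. */
enum ev_old { OLD_EV_A, OLD_EV_B, OLD_EV_MAX };                     /* MAX == 2 */
enum ev_new { NEW_EV_A, NEW_EV_B, NEW_EV_C, NEW_EV_D, NEW_EV_MAX }; /* MAX == 4 */

/* A public struct whose layout depends on MAX. Growing the enum moves
 * 'tail' and changes the struct size, so binaries built against the old
 * header would disagree with a library built against the new one. */
struct stats_old {
	unsigned long count[OLD_EV_MAX];
	int tail;
};

struct stats_new {
	unsigned long count[NEW_EV_MAX];
	int tail;
};

/* Offset of 'tail' as seen by code compiled against each header. */
static size_t tail_offset_old(void) { return offsetof(struct stats_old, tail); }
static size_t tail_offset_new(void) { return offsetof(struct stats_new, tail); }
```

If nothing exposed through the ABI is sized or indexed this way, the reported change to the MAX enumerator would appear to be harmless.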
--
Regards, Ray K