[dpdk-dev] [PATCH] ethdev: claim device reset as async

Andrew Rybchenko arybchenko at solarflare.com
Fri Apr 5 10:15:53 CEST 2019


On 4/5/19 1:01 AM, Thomas Monjalon wrote:
> Hi,
>
> You forgot to Cc Andrew, co-maintainer of ethdev.
>
> 20/03/2019 05:54, Qi Zhang:
>> Device reset should be implemented in an async way since it is
>> possible to be invoked in interrupt thread and sometimes to reset a
>> device need to wait for some dependency, for example, a VF expects for
>> PF ready or a NIC function as part of a SOC wait for the whole system
>> reset complete, and all these time-consuming tasks will block the
>> interrupt thread.
>> The patch rename rte_eth_dev_reset to rte_eth_dev_reset_async and
>> rework the implementation. It will spawn a new thread which will call
>> ops->dev_reset, and when finished it will raise the event
>> RTE_ETH_EVENT_RESET_COMPLETE. The application should always wait for
>> this event before it continues to configure and restart the device.
>>
>> Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
>> ---
>> - * When this function is called, it first stops the port and then calls the
>> - * PMD specific dev_uninit( ) and dev_init( ) to return the port to initial
>> - * state, in which no Tx and Rx queues are setup, as if the port has been
>> - * reset and not started. The port keeps the port id it had before the
>> - * function call.
>> - *
>> - * After calling rte_eth_dev_reset( ), the application should use
>> - * rte_eth_dev_configure( ), rte_eth_rx_queue_setup( ),
>> - * rte_eth_tx_queue_setup( ), and rte_eth_dev_start( )
>> - * to reconfigure the device as appropriate.
>> - *
>> - * Note: To avoid unexpected behavior, the application should stop calling
>> - * Tx and Rx functions before calling rte_eth_dev_reset( ). For thread
>> - * safety, all these controlling functions should be called from the same
>> - * thread.
>> + * @note
>> + * Device reset may have the dependency, for example, a VF reset expects
>> + * PF ready, or a NIC function as a part of a SOC need to wait for other
>> + * parts of the system be ready, these are time-consuming tasks and will
>> + * block current thread.
>> + *
>> + * As the name, rte_eth_dev_reset_async is an async API, it will spwan a
>> + * new thread to call ops->dev_reset, once it is finished, it will raise
>> + * the RTE_ETH_EVENT_RESET_COMPLETE event to notify application.  That makes
>> + * things easy for an application that want to reset the device from the
>> + * interrupt thread since typically a RTE_ETH_EVENT_INTR_RESET handler is
>> + * invoked in interrupt thread. The typical implementation of ops->dev_reset
>> + * will do some hardware reset operations through calling dev_uninit() and
>> + * dev_init().
>> + *
>> + * Application should not assume device reset is finished after
>> + * rte_eth_dev_reset_async return, it should always wait for a
>> + * RTE_ETH_EVENT_RESET_COMPLETE event and check the reset result.
>> + * If reset success, application should call rte_eth_dev_configure( ),
>> + * rte_eth_rx_queue_setup( ), rte_eth_tx_queue_setup( ),
>> + * and rte_eth_dev_start( ) to reconfigure the device as appropriate.
>> + *
>> + * @Note
>> + * To avoid unexpected behavior, the application should stop calling
>> + * Tx and Rx functions before calling rte_eth_dev_reset_async( ).
>>    *
>>    * @param port_id
>>    *   The port identifier of the Ethernet device.
>> @@ -1880,12 +1892,10 @@ void rte_eth_dev_close(uint16_t port_id);
>>    *   - (0) if successful.
>>    *   - (-EINVAL) if port identifier is invalid.
>>    *   - (-ENOTSUP) if hardware doesn't support this function.
>> - *   - (-EPERM) if not ran from the primary process.
>> - *   - (-EIO) if re-initialisation failed or device is removed.
>>    *   - (-ENOMEM) if the reset failed due to OOM.
>> - *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
>> + *   - (<0) other errors from low level driver.
>>    */
>> -int rte_eth_dev_reset(uint16_t port_id);
>> +int rte_eth_dev_reset_async(uint16_t port_id);
> Sorry I didn't check whether this API is better or not,
> but I know it cannot be accepted before proposing a deprecation notice.
> Perhaps you may keep the old API and just add the new one.
>
> Honestly, I never really agreed with the purpose of the reset API.
> So making it async or not, I have no real opinion...
> ... but spawning a new thread at each function call, I feel it is bad.

I agree with Thomas.

It looks very dangerous from locking point of view when a PMD
callback is executed from dynamically created thread.



More information about the dev mailing list