[dpdk-dev] [PATCH] dmadev: introduce DMA device library

fengchengwen fengchengwen at huawei.com
Tue Jul 6 10:20:38 CEST 2021


On 2021/7/5 18:52, Bruce Richardson wrote:
> On Sun, Jul 04, 2021 at 03:00:30PM +0530, Jerin Jacob wrote:
>> On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengchengwen at huawei.com> wrote:

[snip]

>>> + *
>>> + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
>>> + * code.
>>> + * When using cookies, comply with the following rules:
>>> + * a) Cookies for each virtual queue are independent.
>>> + * b) For a virt queue, the cookie are monotonically incremented, when it reach
>>> + *    the INT_MAX, it wraps back to zero.
> 
> I disagree with the INT_MAX (or INT32_MAX) value here. If we use that
> value, it means that we cannot use implicit wrap-around inside the CPU and
> have to check for the INT_MAX value. Better to:
> 1. Specify that it wraps at UINT16_MAX which allows us to just use a
> uint16_t internally and wrap-around automatically, or:
> 2. Specify that it wraps at a power-of-2 value >= UINT16_MAX, giving
> drivers the flexibility at what value to wrap around.

+1 for option 1.
BTW: option 2 seems a little complicated for both drivers and applications.
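
To make the cost difference concrete, a minimal sketch (standard C only, no
dmadev types assumed): a uint16_t cookie wraps implicitly, while an INT_MAX
bound needs a branch on every increment:

#include <stdint.h>
#include <limits.h>

/* Option 1: wrap at UINT16_MAX -- the CPU wraps for free. */
static inline uint16_t
next_cookie_u16(uint16_t last)
{
	return last + 1;	/* 65535 + 1 wraps back to 0 automatically */
}

/* Wrapping at INT_MAX needs an explicit check on every enqueue. */
static inline int32_t
next_cookie_int_max(int32_t last)
{
	return (last == INT_MAX) ? 0 : last + 1;
}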

>> When the system is configured as IOVA as VA
>> 1) Device supports any VA address like memory from rte_malloc(),
>> rte_memzone(), malloc, stack memory
>> 2) Device support only VA address from rte_malloc(), rte_memzone() i.e
>> memory backed by hugepage and added to DMA map.
>>
>> When the system is configured as IOVA as PA
>> 1) Devices support only PA addresses .
>>
>> IMO, Above needs to be  advertised as capability and application needs
>> to align with that
>> and I dont think application requests the driver to work in any of the modes.
>>
>>
> 
> I don't think we need this level of detail for addressing capabilities.
> Unless I'm missing something, the hardware should behave exactly as other
> hardware does taking in iova's.  If the user wants to check whether virtual
> addresses to pinned memory can be used directly, the user can call
> "rte_eal_iova_mode". We can't have a situation where some hardware uses one
> type of addresses and another hardware the other.
> 
> Therefore, the only additional addressing capability we should need to
> report is that the hardware can use SVM/SVA and use virtual addresses not
> in hugepage memory.
> 

I discussed the addressing capability in a previous thread.
Indeed, we can reduce it to just one capability.
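
For reference, a minimal sketch of the application-side check Bruce describes;
rte_eal_iova_mode() and rte_malloc_virt2iova() are existing EAL calls, and
dma_addr_of() is just a hypothetical helper:

#include <stdint.h>
#include <rte_eal.h>
#include <rte_malloc.h>

/* Pick the address to hand to the DMA enqueue for a rte_malloc()'d buffer:
 * its VA when EAL runs in IOVA-as-VA mode, otherwise its IOVA. */
static inline rte_iova_t
dma_addr_of(const void *buf)
{
	if (rte_eal_iova_mode() == RTE_IOVA_VA)
		return (rte_iova_t)(uintptr_t)buf;
	return rte_malloc_virt2iova(buf);
}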

>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>> + *
>>> + * Enqueue a fill operation onto the DMA virt queue
>>> + *
>>> + * This queues up a fill operation to be performed by hardware, but does not
>>> + * trigger hardware to begin that operation.
>>> + *
>>> + * @param dev_id
>>> + *   The identifier of the device.
>>> + * @param vq_id
>>> + *   The identifier of virt queue.
>>> + * @param pattern
>>> + *   The pattern to populate the destination buffer with.
>>> + * @param dst
>>> + *   The address of the destination buffer.
>>> + * @param length
>>> + *   The length of the destination buffer.
>>> + * @param flags
>>> + *   An opaque flags for this operation.
>>
>> PLEASE REMOVE opaque stuff from fastpath it will be a pain for
>> application writers as
>> they need to write multiple combinations of fastpath. flags are OK, if
>> we have a valid
>> generic flag now to control the transfer behavior.
>>
> 
> +1. Flags need to be explicitly listed. If we don't have any flags for now,
> we can specify that the value must be given as zero and it's for future
> use.
> 

+1, I will delete the flags parameter.
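
For example, with the flags argument dropped, a zero-fill enqueue would read
roughly as below (signature from the RFC under review and still subject to
change; 'dst' and 'len' are whatever address/length the application prepared):

	dma_cookie_t c;

	c = rte_dmadev_fill(dev_id, vq_id, 0 /* pattern */, dst, len);
	if (c < 0)
		return c;	/* enqueue failed, e.g. the virt queue is full */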

Currently the fence is implemented as a dedicated op. If we later need more
flags, we would have to add a new op for each of them, which is not a good way
to extend the API.

So I think we should change the fence op into an extra_flags op:
	rte_dmadev_fence(uint16_t dev_id, uint16_t vq_id)
to
	rte_dmadev_extra_flags(uint16_t dev_id, uint16_t vq_id, uint64_t flags);

A fence could then be added with: rte_dmadev_extra_flags(dev_id, vq_id, RTE_DMA_FLAG_FENCE);
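
At the call site that would read roughly as follows (rte_dmadev_extra_flags()
and RTE_DMA_FLAG_FENCE are only the names proposed here; the copy/perform
calls follow the RFC under review):

	rte_dmadev_copy(dev_id, vq_id, src1, dst1, len1);
	rte_dmadev_extra_flags(dev_id, vq_id, RTE_DMA_FLAG_FENCE);
	rte_dmadev_copy(dev_id, vq_id, src2, dst2, len2);	/* starts only after the first copy completes */
	rte_dmadev_perform(dev_id, vq_id);	/* ring the doorbell */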
	
>>> +/**
>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice.
>>> + *
>>> + * Returns the number of operations that failed to complete.
>>> + * NOTE: This API was used when rte_dmadev_completed has_error was set.
>>> + *
>>> + * @param dev_id
>>> + *   The identifier of the device.
>>> + * @param vq_id
>>> + *   The identifier of virt queue.
>>> + * @param nb_status
>>> + *   Indicates the size  of status array.
>>> + * @param[out] status
>>> + *   The error code of operations that failed to complete.
>>> + * @param[out] cookie
>>> + *   The last failed completed operation's cookie.
>>> + *
>>> + * @return
>>> + *   The number of operations that failed to complete.
>>> + *
>>> + * NOTE: The caller must ensure that the input parameter is valid and the
>>> + *       corresponding device supports the operation.
>>> + */
>>> +__rte_experimental
>>> +static inline uint16_t
>>> +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
>>> +                          const uint16_t nb_status, uint32_t *status,
>>> +                          dma_cookie_t *cookie)
>>
>> IMO, it is better to move cookie/rind_idx at 3.
>> Why it would return any array of errors? since it called after
>> rte_dmadev_completed() has
>> has_error. Is it better to change
>>
>> rte_dmadev_error_status((uint16_t dev_id, uint16_t vq_id, dma_cookie_t
>> *cookie,  uint32_t *status)
>>
>> I also think, we may need to set status as bitmask and enumerate all
>> the combination of error codes
>> of all the driver and return string from driver existing rte_flow_error
>>
>> See
>> struct rte_flow_error {
>>         enum rte_flow_error_type type; /**< Cause field and error types. */
>>         const void *cause; /**< Object responsible for the error. */
>>         const char *message; /**< Human-readable error message. */
>> };
>>
> 
> I think we need a multi-return value API here, as we may add operations in
> future which have non-error status values to return. The obvious case is
> DMA engines which support "compare" operations. In that case a successful

Just curious, what is the application scenario for the 'compare' operation?

> compare (as in there were no DMA or HW errors) can return "equal" or
> "not-equal" as statuses. For general "copy" operations, the faster
> completion op can be used to just return successful values (and only call
> this status version on error), while apps using those compare ops or a
> mixture of copy and compare ops, would always use the slower one that
> returns status values for each and every op..

In the current design, rte_dmadev_completed_fails applies only to failure
scenarios. Do you mean that for 'compare' operations the status is always
non-zero, regardless of whether the two buffers match?
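
For context, the completion path in this version is intended to be used
roughly like this (signatures follow the RFC under review and may still
change; BURST is an application-chosen batch size):

	bool has_error = false;
	uint32_t status[BURST];
	dma_cookie_t last_ok, last_fail;
	uint16_t n, nb_fail;

	n = rte_dmadev_completed(dev_id, vq_id, &last_ok, BURST, &has_error);
	/* the n requests up to and including last_ok completed successfully */

	if (has_error) {
		nb_fail = rte_dmadev_completed_fails(dev_id, vq_id, BURST,
						     status, &last_fail);
		/* status[i] holds the error code of each failed request */
	}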

> 
> The ioat APIs used 32-bit integer values for this status array so as to
> allow e.g. 16-bits for error code and 16-bits for future status values. For
> most operations there should be a fairly small set of things that can go
> wrong, i.e. bad source address, bad destination address or invalid length.
> Within that we may have a couple of specifics for why an address is bad,
> but even so I don't think we need to start having multiple bit
> combinations.
> 
>>> +{
>>> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
>>> +       return (*dev->completed_fails)(dev, vq_id, nb_status, status, cookie);
>>> +}
>>> +
>>> +struct rte_dmadev_stats {
>>> +       uint64_t enqueue_fail_count;
>>> +       /**< Conut of all operations which failed enqueued */
>>> +       uint64_t enqueued_count;
>>> +       /**< Count of all operations which successful enqueued */
>>> +       uint64_t completed_fail_count;
>>> +       /**< Count of all operations which failed to complete */
>>> +       uint64_t completed_count;
>>> +       /**< Count of all operations which successful complete */
>>> +};
>>
>> We need to have capability API to tell which items are
>> updated/supported by the driver.
>>
> 
> I also would remove the enqueue fail counts, since they are better counted
> by the app. If a driver reports 20,000 failures we have no way of knowing
> if that is 20,000 unique operations which failed to enqueue or a single
> operation which failed to enqueue 20,000 times but succeeded on attempt
> 20,001.
> 

This case does exist. The application may just emit a DEBUG trace rather than
recording the failure, so I would recommend keeping the counter so that we at
least know it happened after a long run.
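
For reference, the application-side counting Bruce refers to would only be a
couple of lines (a sketch; a negative cookie on a failed enqueue follows this
version of the proposal, and app_stats is an application-defined structure):

	dma_cookie_t c = rte_dmadev_copy(dev_id, vq_id, src, dst, len);
	if (c < 0)
		app_stats.enqueue_fail++;	/* counted per attempt by the app */
	else
		app_stats.enqueued++;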

> 


