[dpdk-dev] [PATCH] dmadev: introduce DMA device library

Jerin Jacob jerinjacobk at gmail.com
Thu Jul 8 20:35:40 CEST 2021


On Thu, Jul 8, 2021 at 8:41 AM fengchengwen <fengchengwen at huawei.com> wrote:
>

> >>>
> >>> It's just more conditionals and branches all through the code. Inside the
> >>> user application, the user has to check whether to set the flag or not (or
> >>> special-case the last transaction outside the loop), and within the driver,
> >>> there has to be a branch whether or not to call the doorbell function. The
> >>> code on both sides is far simpler and more readable if the doorbell
> >>> function is exactly that - a separate function.
> >>
> >> I disagree. The reason is:
> >>
> >> We will have two classes of applications
> >>
> >> a) do dma copy request as and when it has data(I think, this is the
> >> prime use case), for those,
> >> I think, it is considerable overhead to have two function invocation
> >> per transfer i.e
> >> rte_dma_copy() and rte_dma_perform()
> >>
> >> b) do dma copy when the data is reached to a logical state,  like copy
> >> IP frame from Ethernet packets or so,
> >> In that case, the application will have  a LOGIC to detect when to
> >> perform it so on the end of
> >> that rte_dma_copy() flag can be updated to fire the doorbell.
> >>
> >> IMO, We are comparing against a branch(flag is already in register) vs
> >> a set of instructions for
> >> 1) function pointer overhead
> >> 2) Need to use the channel context again back in another function.
> >>
> >> IMO, a single branch is most optimal from performance PoV.
> >>
> > Ok, let's try it and see how it goes.
>
> Test result show:
> 1) For Kunpeng platform (ARMv8) could benefit very little with doorbell in flags
> 2) For Xeon E5-2690 v2 (X86) could benefit with separate function
> 3) Both platform could benefit with doorbell in flags if burst < 5
>
> There is a performance gain in small bursts (<5). Given the extensive use of bursts
 in DPDK applications and users are accustomed to the concept, I do
not recommend
> using the 'doorbell' in flags.

There is NO concept change between one option vs other option. Just
argument differnet.
Also, _perform() scheme not used anywhere in DPDK. I

Regarding performance, I have added dummy instructions to simulate the real work
load[1], now burst also has some gain in both x86 and arm64[3]

I have modified your application[2] to dpdk test application to use
cpu isolation etc.
So this is gain in flag scheme ad code is checked in to Github[2[

[1]
static inline void
delay(void)
{
        volatile int k;

        for (k = 0; k < 16; k++) {


      }

}

__rte_noinline
int
hisi_dma_enqueue(struct dmadev *dev, int vchan, void *src, void *dst,
int len, const int flags)
{
         delay();

        *ring = 1;

         return 0;
}

__rte_noinline
int
hisi_dma_enqueue_doorbell(struct dmadev *dev, int vchan, void *src,
void *dst, int len, const int flags)
{
        delay();

        *ring = 1;

        if (unlikely(flags == 1)) {
                rte_wmb();
                *doorbell = 1;
        }
      return 0;
}


[2]

https://github.com/jerinjacobk/dpdk-dmatest/commit/4fc9bc3029543bbc4caaa5183d98bac93c34f588

Update results [3]

Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

echo "dma_perf_autotest" | ./build/app/test/dpdk-test --no-huge -c 0xf00

core=24 Timer running at 2600.00MHz
   test_for_perform_after_multiple_enqueue: burst=1 cycles=46.000000
      test_for_last_enqueue_issue_doorbell: burst=1 cycles=45.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=2 cycles=90.000000
      test_for_last_enqueue_issue_doorbell: burst=2 cycles=89.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=3 cycles=134.000000
      test_for_last_enqueue_issue_doorbell: burst=3 cycles=133.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=4 cycles=177.000000
      test_for_last_enqueue_issue_doorbell: burst=4 cycles=176.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=5 cycles=221.000000
      test_for_last_enqueue_issue_doorbell: burst=5 cycles=221.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=6 cycles=265.000000
      test_for_last_enqueue_issue_doorbell: burst=6 cycles=265.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=7 cycles=333.000000
      test_for_last_enqueue_issue_doorbell: burst=7 cycles=309.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=8 cycles=375.000000
      test_for_last_enqueue_issue_doorbell: burst=8 cycles=373.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=9 cycles=418.000000
      test_for_last_enqueue_issue_doorbell: burst=9 cycles=414.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=10 cycles=462.000000
      test_for_last_enqueue_issue_doorbell: burst=10 cycles=458.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=11 cycles=507.000000
      test_for_last_enqueue_issue_doorbell: burst=11 cycles=501.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=12 cycles=552.000000
      test_for_last_enqueue_issue_doorbell: burst=12 cycles=546.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=13 cycles=593.000000
      test_for_last_enqueue_issue_doorbell: burst=13 cycles=590.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=14 cycles=638.000000
      test_for_last_enqueue_issue_doorbell: burst=14 cycles=634.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=15 cycles=681.000000
      test_for_last_enqueue_issue_doorbell: burst=15 cycles=678.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=16 cycles=725.000000
      test_for_last_enqueue_issue_doorbell: burst=16 cycles=722.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=17 cycles=770.000000
      test_for_last_enqueue_issue_doorbell: burst=17 cycles=767.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=18 cycles=815.000000
      test_for_last_enqueue_issue_doorbell: burst=18 cycles=812.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=19 cycles=857.000000
      test_for_last_enqueue_issue_doorbell: burst=19 cycles=854.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=20 cycles=902.000000
      test_for_last_enqueue_issue_doorbell: burst=20 cycles=899.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=21 cycles=945.000000
      test_for_last_enqueue_issue_doorbell: burst=21 cycles=943.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=22 cycles=990.000000
      test_for_last_enqueue_issue_doorbell: burst=22 cycles=988.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=23 cycles=1033.000000
      test_for_last_enqueue_issue_doorbell: burst=23 cycles=1031.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=24 cycles=1077.000000
      test_for_last_enqueue_issue_doorbell: burst=24 cycles=1075.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=25 cycles=1121.000000
      test_for_last_enqueue_issue_doorbell: burst=25 cycles=1119.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=26 cycles=1166.000000
      test_for_last_enqueue_issue_doorbell: burst=26 cycles=1163.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=27 cycles=1208.000000
      test_for_last_enqueue_issue_doorbell: burst=27 cycles=1208.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=28 cycles=1252.000000
      test_for_last_enqueue_issue_doorbell: burst=28 cycles=1252.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=29 cycles=1295.000000
      test_for_last_enqueue_issue_doorbell: burst=29 cycles=1295.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=30 cycles=1342.000000
      test_for_last_enqueue_issue_doorbell: burst=30 cycles=1340.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=31 cycles=1386.000000
      test_for_last_enqueue_issue_doorbell: burst=31 cycles=1384.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=32 cycles=1429.000000
      test_for_last_enqueue_issue_doorbell: burst=32 cycles=1428.000000
-------------------------------------------------------------------------------



octeontx2:

See https://doc.dpdk.org/guides/prog_guide/profile_app.html section
62.2.3. High-resolution cycle counter


meson --cross config/arm/arm64_octeontx2_linux_gcc
-Dc_args='-DRTE_ARM_EAL_RDTSC_USE_PMU' build

 echo "dma_perf_autotest" | ./build/app/test/dpdk-test --no-huge -c 0xff0000
RTE>>dma_perf_autotest^M
lcore=16 Timer running at 2400.00MHz
   test_for_perform_after_multiple_enqueue: burst=1 cycles=105.000000
      test_for_last_enqueue_issue_doorbell: burst=1 cycles=105.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=2 cycles=207.000000
      test_for_last_enqueue_issue_doorbell: burst=2 cycles=207.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=3 cycles=309.000000
      test_for_last_enqueue_issue_doorbell: burst=3 cycles=310.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=4 cycles=411.000000
      test_for_last_enqueue_issue_doorbell: burst=4 cycles=410.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=5 cycles=513.000000
      test_for_last_enqueue_issue_doorbell: burst=5 cycles=512.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=6 cycles=615.000000
      test_for_last_enqueue_issue_doorbell: burst=6 cycles=615.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=7 cycles=717.000000
      test_for_last_enqueue_issue_doorbell: burst=7 cycles=716.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=8 cycles=819.000000
      test_for_last_enqueue_issue_doorbell: burst=8 cycles=818.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=9 cycles=921.000000
      test_for_last_enqueue_issue_doorbell: burst=9 cycles=922.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=10 cycles=1023.000000
      test_for_last_enqueue_issue_doorbell: burst=10 cycles=1022.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=11 cycles=1126.000000
      test_for_last_enqueue_issue_doorbell: burst=11 cycles=1124.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=12 cycles=1227.000000
      test_for_last_enqueue_issue_doorbell: burst=12 cycles=1227.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=13 cycles=1329.000000
      test_for_last_enqueue_issue_doorbell: burst=13 cycles=1328.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=14 cycles=1431.000000
      test_for_last_enqueue_issue_doorbell: burst=14 cycles=1430.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=15 cycles=1534.000000
      test_for_last_enqueue_issue_doorbell: burst=15 cycles=1534.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=16 cycles=1638.000000
      test_for_last_enqueue_issue_doorbell: burst=16 cycles=1640.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=17 cycles=1746.000000
      test_for_last_enqueue_issue_doorbell: burst=17 cycles=1739.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=18 cycles=1847.000000
      test_for_last_enqueue_issue_doorbell: burst=18 cycles=1841.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=19 cycles=1950.000000
      test_for_last_enqueue_issue_doorbell: burst=19 cycles=1944.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=20 cycles=2051.000000
       test_for_last_enqueue_issue_doorbell: burst=20 cycles=2045.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=21 cycles=2154.000000
      test_for_last_enqueue_issue_doorbell: burst=21 cycles=2148.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=22 cycles=2257.000000
      test_for_last_enqueue_issue_doorbell: burst=22 cycles=2249.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=23 cycles=2358.000000
      test_for_last_enqueue_issue_doorbell: burst=23 cycles=2352.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=24 cycles=2459.000000
      test_for_last_enqueue_issue_doorbell: burst=24 cycles=2454.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=25 cycles=2562.000000
      test_for_last_enqueue_issue_doorbell: burst=25 cycles=2555.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=26 cycles=2665.000000
      test_for_last_enqueue_issue_doorbell: burst=26 cycles=2657.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=27 cycles=2766.000000
      test_for_last_enqueue_issue_doorbell: burst=27 cycles=2760.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=28 cycles=2867.000000
      test_for_last_enqueue_issue_doorbell: burst=28 cycles=2861.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=29 cycles=2970.000000
      test_for_last_enqueue_issue_doorbell: burst=29 cycles=2964.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=30 cycles=3073.000000
      test_for_last_enqueue_issue_doorbell: burst=30 cycles=3065.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=31 cycles=3174.000000
      test_for_last_enqueue_issue_doorbell: burst=31 cycles=3168.000000
-------------------------------------------------------------------------------
   test_for_perform_after_multiple_enqueue: burst=32 cycles=3275.000000
      test_for_last_enqueue_issue_doorbell: burst=32 cycles=3269.000000
-------------------------------------------------------------------------------
Test OK
RTE>




> And also user may confuse about the doorbell operations.
>
> Kunpeng platform test result:
>     [root at SZ tmp]# ./a1 1
>     burst = 1
>     perform_after_multiple_enqueue: burst:1 cost:0s.554422us
>     doorbell_for_every_enqueue: burst:1 cost:0s.450927us
>     last_enqueue_issue_doorbell: burst:1 cost:0s.450479us
>     [root at SZ tmp]#
>     [root at SZ tmp]# ./a1 2
>     burst = 2
>     perform_after_multiple_enqueue: burst:2 cost:0s.900884us
>     doorbell_for_every_enqueue: burst:2 cost:0s.866732us
>     last_enqueue_issue_doorbell: burst:2 cost:0s.732469us
>     [root at SZ tmp]# ./a1 5
>     burst = 5
>     perform_after_multiple_enqueue: burst:5 cost:1s.732410us
>     doorbell_for_every_enqueue: burst:5 cost:2s.115479us
>     last_enqueue_issue_doorbell: burst:5 cost:1s.759349us
>     [root at SZ tmp]# ./a1 10
>     burst = 10
>     perform_after_multiple_enqueue: burst:10 cost:3s.490716us
>     doorbell_for_every_enqueue: burst:10 cost:4s.194691us
>     last_enqueue_issue_doorbell: burst:10 cost:3s.331825us
>     [root at SZ tmp]# ./a1 30
>     burst = 30
>     perform_after_multiple_enqueue: burst:30 cost:9s.61761us
>     doorbell_for_every_enqueue: burst:30 cost:12s.517082us
>     last_enqueue_issue_doorbell: burst:30 cost:9s.614802us
>
> X86 platform test result:
>     fengchengwen at SZ:~/tmp$ ./a1 1
>     burst = 1
>     perform_after_multiple_enqueue: burst:1 cost:0s.406331us
>     doorbell_for_every_enqueue: burst:1 cost:0s.331109us
>     last_enqueue_issue_doorbell: burst:1 cost:0s.381782us
>     fengchengwen at SZ:~/tmp$ ./a1 2
>     burst = 2
>     perform_after_multiple_enqueue: burst:2 cost:0s.569024us
>     doorbell_for_every_enqueue: burst:2 cost:0s.643449us
>     last_enqueue_issue_doorbell: burst:2 cost:0s.486639us
>     fengchengwen at SZ:~/tmp$ ./a1 5
>     burst = 5
>     perform_after_multiple_enqueue: burst:5 cost:1s.166384us
>     doorbell_for_every_enqueue: burst:5 cost:1s.602369us
>     last_enqueue_issue_doorbell: burst:5 cost:1s.209392us
>     fengchengwen at SZ:~/tmp$ ./a1 10
>     burst = 10
>     perform_after_multiple_enqueue: burst:10 cost:2s.229901us
>     doorbell_for_every_enqueue: burst:10 cost:3s.754802us
>     last_enqueue_issue_doorbell: burst:10 cost:2s.328705us
>     fengchengwen at SZ:~/tmp$
>     fengchengwen at SZ:~/tmp$ ./a1 30
>     burst = 30
>     perform_after_multiple_enqueue: burst:30 cost:6s.132817us
>     doorbell_for_every_enqueue: burst:30 cost:9s.944619us
>     last_enqueue_issue_doorbell: burst:30 cost:7s.73551us
>
>
> test-code:
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <stdbool.h>
> #include <time.h>
> #include <sys/time.h>
>
> struct dmadev;
>
> unsigned int dev_reg[10240];
> volatile unsigned int *ring;
> volatile unsigned int *doorbell;
>
> void init_global(void)
> {
>         ring = &dev_reg[100];
>         doorbell = &dev_reg[10000];
> }
>
> #define rte_wmb() asm volatile("dmb oshst" : : : "memory")
> //#define rte_wmb() asm volatile ("" : : : "memory")
>
> typedef int (*enqueue_t)(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags);
> typedef void (*perform_t)(struct dmadev *dev, int vchan);
>
> struct dmadev {
>         enqueue_t enqueue;
>         perform_t perform;
>         char rsv[512];
> };
>
> int hisi_dma_enqueue(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags)
> {
>         *ring = 1;
> }
>
> int hisi_dma_enqueue_doorbell(struct dmadev *dev, int vchan, void *src, void *dst, int len, int flags)
> {
>         *ring = 1;
>         if (flags == 1) {
>                 rte_wmb();
>                 *doorbell = 1;
>         }
> }
>
> void hisi_dma_perform(struct dmadev *dev, int vchan)
> {
>         rte_wmb();
>         *doorbell = 1;
> }
>
> struct dmadev devlist[64];
>
> void init_devlist(bool enq_doorbell)
> {
>         int i;
>         for (i = 0; i < 64; i++) {
>                 devlist[i].enqueue = enq_doorbell ? hisi_dma_enqueue_doorbell : hisi_dma_enqueue;
>                 devlist[i].perform = hisi_dma_perform;
>         }
> }
>
> static inline int dma_enqueue(int dev_id, int vchan, void *src, void *dst, int len, int flags)
> {
>         struct dmadev *dev = &devlist[dev_id];
>         return dev->enqueue(dev, vchan, src, dst, len, flags);
> }
>
> static inline void dma_perform(int dev_id, int vchan)
> {
>         struct dmadev *dev = &devlist[dev_id];
>         return dev->perform(dev, vchan);
> }
>
> #define MAX_LOOPS       90000000
>
> void test_for_perform_after_multiple_enqueue(int burst)
> {
>         struct timeval start, end, delta;
>         unsigned int i, j;
>         init_devlist(false);
>         gettimeofday(&start, NULL);
>         for (i = 0; i < MAX_LOOPS; i++) {
>                 for (j = 0; j < burst; j++)
>                         (void)dma_enqueue(10, 0, NULL, NULL, 0, 0);
>                 dma_perform(10, 0);
>         }
>         gettimeofday(&end, NULL);
>         timersub(&end, &start, &delta);
>         printf("perform_after_multiple_enqueue: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec);
> }
>
> void test_for_doorbell_for_every_enqueue(int burst)
> {
>         struct timeval start, end, delta;
>         unsigned int i, j;
>         init_devlist(true);
>         gettimeofday(&start, NULL);
>         for (i = 0; i < MAX_LOOPS; i++) {
>                 for (j = 0; j < burst; j++)
>                         (void)dma_enqueue(10, 0, NULL, NULL, 0, 1);
>         }
>         gettimeofday(&end, NULL);
>         timersub(&end, &start, &delta);
>         printf("doorbell_for_every_enqueue: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec);
> }
>
> void test_for_last_enqueue_issue_doorbell(int burst)
> {
>         struct timeval start, end, delta;
>         unsigned int i, j;
>         init_devlist(true);
>         gettimeofday(&start, NULL);
>         for (i = 0; i < MAX_LOOPS; i++) {
>                 for (j = 0; j < burst - 1; j++)
>                         (void)dma_enqueue(10, 0, NULL, NULL, 0, 0);
>                 dma_enqueue(10, 0, NULL, NULL, 0, 1);
>         }
>         gettimeofday(&end, NULL);
>         timersub(&end, &start, &delta);
>         printf("last_enqueue_issue_doorbell: burst:%d cost:%us.%uus \n", burst, delta.tv_sec, delta.tv_usec);
> }
>
> void main(int argc, char *argv[])
> {
>         if (argc < 2) {
>                 printf("please input burst parameter!\n");
>                 return;
>         }
>         init_global();
>         int burst = atol(argv[1]);
>         printf("burst = %d \n", burst);
>         test_for_perform_after_multiple_enqueue(burst);
>         test_for_doorbell_for_every_enqueue(burst);
>         test_for_last_enqueue_issue_doorbell(burst);
> }
>
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>> <snip>
> >>>>>>>>> + +/** + * @warning + * @b EXPERIMENTAL: this API may change
> >>>>>>>>> without prior notice.  + * + * Returns the number of operations
> >>>>>>>>> that failed to complete.  + * NOTE: This API was used when
> >>>>>>>>> rte_dmadev_completed has_error was set.  + * + * @param dev_id
> >>>>>>>>> + *   The identifier of the device.  + * @param vq_id + *   The
> >>>>>>>>> identifier of virt queue.
> >>>>>>>> (> + * @param nb_status
> >>>>>>>>> + *   Indicates the size  of status array.  + * @param[out]
> >>>>>>>>> status + *   The error code of operations that failed to
> >>>>>>>>> complete.  + * @param[out] cookie + *   The last failed
> >>>>>>>>> completed operation's cookie.  + * + * @return + *   The number
> >>>>>>>>> of operations that failed to complete.  + * + * NOTE: The
> >>>>>>>>> caller must ensure that the input parameter is valid and the +
> >>>>>>>>> *       corresponding device supports the operation.  + */
> >>>>>>>>> +__rte_experimental +static inline uint16_t
> >>>>>>>>> +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id, +
> >>>>>>>>> const uint16_t nb_status, uint32_t *status, +
> >>>>>>>>> dma_cookie_t *cookie)
> >>>>>>>>
> >>>>>>>> IMO, it is better to move cookie/rind_idx at 3.  Why it would
> >>>>>>>> return any array of errors? since it called after
> >>>>>>>> rte_dmadev_completed() has has_error. Is it better to change
> >>>>>>>>
> >>>>>>>> rte_dmadev_error_status((uint16_t dev_id, uint16_t vq_id,
> >>>>>>>> dma_cookie_t *cookie,  uint32_t *status)
> >>>>>>>>
> >>>>>>>> I also think, we may need to set status as bitmask and enumerate
> >>>>>>>> all the combination of error codes of all the driver and return
> >>>>>>>> string from driver existing rte_flow_error
> >>>>>>>>
> >>>>>>>> See struct rte_flow_error { enum rte_flow_error_type type; /**<
> >>>>>>>> Cause field and error types. */ const void *cause; /**< Object
> >>>>>>>> responsible for the error. */ const char *message; /**<
> >>>>>>>> Human-readable error message. */ };
> >>>>>>>>
> >>>>>>>
> >>>>>>> I think we need a multi-return value API here, as we may add
> >>>>>>> operations in future which have non-error status values to return.
> >>>>>>> The obvious case is DMA engines which support "compare" operations.
> >>>>>>> In that case a successful compare (as in there were no DMA or HW
> >>>>>>> errors) can return "equal" or "not-equal" as statuses. For general
> >>>>>>> "copy" operations, the faster completion op can be used to just
> >>>>>>> return successful values (and only call this status version on
> >>>>>>> error), while apps using those compare ops or a mixture of copy and
> >>>>>>> compare ops, would always use the slower one that returns status
> >>>>>>> values for each and every op..
> >>>>>>>
> >>>>>>> The ioat APIs used 32-bit integer values for this status array so
> >>>>>>> as to allow e.g. 16-bits for error code and 16-bits for future
> >>>>>>> status values. For most operations there should be a fairly small
> >>>>>>> set of things that can go wrong, i.e. bad source address, bad
> >>>>>>> destination address or invalid length.  Within that we may have a
> >>>>>>> couple of specifics for why an address is bad, but even so I don't
> >>>>>>> think we need to start having multiple bit combinations.
> >>>>>>
> >>>>>> OK. What is the purpose of errors status? Is it for application
> >>>>>> printing it or Does the application need to take any action based on
> >>>>>> specific error requests?
> >>>>>
> >>>>> It's largely for information purposes, but in the case of SVA/SVM
> >>>>> errors could occur due to the memory not being pinned, i.e. a page
> >>>>> fault, in some cases. If that happens, then it's up the app to either
> >>>>> touch the memory and retry the copy, or to do a SW memcpy as a
> >>>>> fallback.
> >>>>>
> >>>>> In other error cases, I think it's good to tell the application if it's
> >>>>> passing around bad data, or data that is beyond the scope of hardware,
> >>>>> e.g.  a copy that is beyond what can be done in a single transaction
> >>>>> for a HW instance. Given that there are always things that can go
> >>>>> wrong, I think we need some error reporting mechanism.
> >>>>>
> >>>>>> If the former is scope, then we need to define the standard enum
> >>>>>> value for the error right?  ie. uint32_t *status needs to change to
> >>>>>> enum rte_dma_error or so.
> >>>>>>
> >>>>> Sure. Perhaps an error/status structure either is an option, where we
> >>>>> explicitly call out error info from status info.
> >>>>
> >>>> Agree. Better to have a structure with filed like,
> >>>>
> >>>> 1)  enum rte_dma_error_type 2)  memory to store, informative message on
> >>>> fine aspects of error.  LIke address caused issue etc.(Which will be
> >>>> driver-specific information).
> >>>>
> >>> The only issue I have with that is that once we have driver specific
> >>> information it is of little use to the application, since it can't know
> >>> anything about it excepy maybe log it.  I'd much rather have a set of error
> >>> codes telling user that "source address is wrong", "dest address is wrong",
> >>> and a generic "an address is wrong" in case driver/HW cannot distinguish
> >>> source of error. Can we see how far we get with just error codes before we
> >>> start into passing string messages around and all the memory management
> >>> headaches that implies.
> >>
> >> Works for me. It should be "enum rte_dma_error_type" then, which has a standard
> >> error type. Which is missing in the spec now.
> >>
> > +1
> > .
> >


More information about the dev mailing list