[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path

Wang, Zhihong zhihong.wang at intel.com
Fri Nov 4 14:09:17 CET 2016



> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
> Sent: Friday, November 4, 2016 8:54 PM
> To: Wang, Zhihong <zhihong.wang at intel.com>; Yuanhan Liu
> <yuanhan.liu at linux.intel.com>
> Cc: stephen at networkplumber.org; Pierre Pfister (ppfister)
> <ppfister at cisco.com>; Xie, Huawei <huawei.xie at intel.com>; dev at dpdk.org;
> vkaplans at redhat.com; mst at redhat.com
> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support
> to the TX path
> 
> 
> 
> On 11/04/2016 01:30 PM, Wang, Zhihong wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
> >> Sent: Friday, November 4, 2016 7:23 PM
> >> To: Wang, Zhihong <zhihong.wang at intel.com>; Yuanhan Liu
> >> <yuanhan.liu at linux.intel.com>
> >> Cc: stephen at networkplumber.org; Pierre Pfister (ppfister)
> >> <ppfister at cisco.com>; Xie, Huawei <huawei.xie at intel.com>;
> >> dev at dpdk.org; vkaplans at redhat.com; mst at redhat.com
> >> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
> >> support to the TX path
> >>
> >>
> >>
> >>>>>> Hi Maxime,
> >>>>>>
> >>>>>> I did a little more macswap test and found out more stuff here:
> >>>>> Thanks for doing more tests.
> >>>>>
> >>>>>>
> >>>>>>  1. I did a loopback test on another HSW machine with the same H/W,
> >>>>>>     and indirect_desc on and off seem to have close perf
> >>>>>>
> >>>>>>  2. So I checked the gcc version:
> >>>>>>
> >>>>>>      *  Previous: gcc version 6.2.1 20160916 (Fedora 24)
> >>>>>>
> >>>>>>      *  New: gcc version 5.4.0 20160609 (Ubuntu 16.04.1 LTS)
> >>>>>
> >>>>> On my side, I tested with RHEL7.3:
> >>>>>  - gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
> >>>>>
> >>>>> It certainly contains some backports from newer GCC versions.
> >>>>>
> >>>>>>
> >>>>>>     On previous one indirect_desc has 20% drop
> >>>>>>
> >>>>>>  3. Then I compiled binary on Ubuntu and scp to Fedora, and as
> >>>>>>     expected I got the same perf as on Ubuntu, and the perf gap
> >>>>>>     disappeared, so gcc is definitely one factor here
> >>>>>>
> >>>>>>  4. Then I used the Ubuntu binary on Fedora for the PVP test, and the
> >>>>>>     perf gap came back, the same as with the Fedora binary:
> >>>>>>     indirect_desc causes about a 20% drop
> >>>>>
> >>>>> Let me know if I understand correctly:
> >>>
> >>> Yes, and it's hard to break down further at this time.
> >>>
> >>> Also we may need to check whether it's caused by certain NIC
> >>> model. Unfortunately I don't have the right setup right now.
> >>>
> >>>>> Loopback test with macswap:
> >>>>>  - gcc version 6.2.1 : 20% perf drop
> >>>>>  - gcc version 5.4.0 : No drop
> >>>>>
> >>>>> PVP test with macswap:
> >>>>>  - gcc version 6.2.1 : 20% perf drop
> >>>>>  - gcc version 5.4.0 : 20% perf drop
> >>>>
> >>>> I forgot to ask, did you recompile only the host, or both the host and
> >>>> guest testpmd in your test?
> >>
> >>> Both.
> >>
> >> I recompiled testpmd on a Fedora 24 machine using GCC6:
> >> gcc (GCC) 6.1.1 20160621 (Red Hat 6.1.1-3)
> >> Testing loopback with macswap on my Haswell RHEL7.3 machine gives me
> >> the following results:
> >>   - indirect on: 7.75Mpps
> >>   - indirect off: 7.35Mpps
> >>
> >> Surprisingly, I get better results with indirect on my setup (I
> >> reproduced the tests multiple times).
> >>
> >> Do you have a document explaining the tuning/config you apply to both
> >> the host and the guest (isolation, HT, hugepage size, ...) in your
> >> setup?
> >
> >
> > The setup where it goes wrong:
> >  1. Xeon E5-2699, HT on, turbo off, 1GB hugepage for both host and guest
> On the Haswell machine (on which I don't have BIOS access), HT is on,
> but I unplug siblings at runtime.
> I also have 1G pages on both sides, and I isolate the cores used by both
> testpmd and vCPUS.
> 
> >  2. Fortville 40G
> >  3. Fedora 4.7.5-200.fc24.x86_64
> >  4. gcc version 6.2.1
> >  5. 16.11 RC2 for both host and guest
> >  6. PVP, testpmd macswap for both host and guest
> >
> > BTW, I do see indirect_desc gives slightly better performance for loopback
> > in tests on other platforms, but don't know how PVP performs yet.
> Interesting, other platforms are also Haswell/Broadwell?

Yes, but with different OS.

If you don't have the setup I can do more detailed profiling for the
root cause next week, since my platform is the only one right now that
is reporting the drop.
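
To put the loopback figures quoted above (7.75 Mpps with indirect on,
7.35 Mpps with indirect off) in the same relative terms as the "20% drop"
discussed in this thread, here is a minimal sketch of the arithmetic. The
function name relative_change is only illustrative and is not part of
DPDK or testpmd.

```python
def relative_change(baseline_mpps: float, candidate_mpps: float) -> float:
    """Return the change of candidate vs. baseline as a percentage.

    Positive means the candidate is faster, negative means a drop.
    (Hypothetical helper, written only for this illustration.)
    """
    if baseline_mpps <= 0:
        raise ValueError("baseline must be positive")
    return (candidate_mpps - baseline_mpps) / baseline_mpps * 100.0


# Loopback macswap figures quoted in this thread, in Mpps.
indirect_off = 7.35
indirect_on = 7.75

gain = relative_change(indirect_off, indirect_on)
print(f"indirect on vs off: {gain:+.1f}%")  # prints "indirect on vs off: +5.4%"
```

So on that setup indirect on is about 5.4% faster than indirect off,
whereas a 20% drop would correspond to relative_change() returning
about -20.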


> 
> For PVP benchmarks, are your figures with 0% pkt loss?

No, for testpmd perf analysis it's not necessary in my opinion.

I did try a low rate though, and the result is the same.

> 
> Thanks,
> Maxime
> 
> >
> >
> >>
> >> Regards,
> >> Maxime

