[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path

Maxime Coquelin maxime.coquelin at redhat.com
Fri Nov 4 13:54:00 CET 2016



On 11/04/2016 01:30 PM, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>> Sent: Friday, November 4, 2016 7:23 PM
>> To: Wang, Zhihong <zhihong.wang at intel.com>; Yuanhan Liu
>> <yuanhan.liu at linux.intel.com>
>> Cc: stephen at networkplumber.org; Pierre Pfister (ppfister)
>> <ppfister at cisco.com>; Xie, Huawei <huawei.xie at intel.com>; dev at dpdk.org;
>> vkaplans at redhat.com; mst at redhat.com
>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the
>> TX path
>>
>>
>>
>>>>>> Hi Maxime,
>>>>>>
>>>>>> I ran a few more macswap tests and found the following:
>>>>> Thanks for doing more tests.
>>>>>
>>>>>>
>>>>>>  1. I ran the loopback test on another HSW machine with the same H/W,
>>>>>>     and indirect_desc on and off seem to give similar perf
>>>>>>
>>>>>>  2. So I checked the gcc version:
>>>>>>
>>>>>>      *  Previous: gcc version 6.2.1 20160916 (Fedora 24)
>>>>>>
>>>>>>      *  New: gcc version 5.4.0 20160609 (Ubuntu 16.04.1 LTS)
>>>>>
>>>>> On my side, I tested with RHEL7.3:
>>>>>  - gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
>>>>>
>>>>> It certainly contains some backports from newer GCC versions.
>>>>>
>>>>>>
>>>>>>     With the previous one, indirect_desc causes a 20% drop
>>>>>>
>>>>>>  3. Then I compiled the binary on Ubuntu and copied it to Fedora with
>>>>>>     scp; as expected I got the same perf as on Ubuntu, and the perf
>>>>>>     gap disappeared, so gcc is definitely one factor here
>>>>>>
>>>>>>  4. Then I used the Ubuntu binary on Fedora for the PVP test; the
>>>>>>     perf gap came back, matching the Fedora binary results:
>>>>>>     indirect_desc causes about a 20% drop
>>>>>
>>>>> Let me know if I understand correctly:
>>>
>>> Yes, and it's hard to break it down further at this time.
>>>
>>> Also we may need to check whether it's caused by a certain NIC
>>> model. Unfortunately I don't have the right setup right now.
>>>
>>>>> Loopback test with macswap:
>>>>>  - gcc version 6.2.1 : 20% perf drop
>>>>>  - gcc version 5.4.0 : No drop
>>>>>
>>>>> PVP test with macswap:
>>>>>  - gcc version 6.2.1 : 20% perf drop
>>>>>  - gcc version 5.4.0 : 20% perf drop
>>>>
>>>> I forgot to ask: did you recompile only the host, or both the host and
>>>> guest testpmd binaries in your test?
>>
>>> Both.
>>
>> I recompiled testpmd on a Fedora 24 machine using GCC6:
>> gcc (GCC) 6.1.1 20160621 (Red Hat 6.1.1-3)
>> Testing loopback with macswap on my Haswell RHEL7.3 machine gives me the
>> following results:
>>   - indirect on: 7.75Mpps
>>   - indirect off: 7.35Mpps
>>
>> Surprisingly, I get better results with indirect enabled on my setup (I
>> reproduced the tests multiple times).
>>
>> Do you have a document explaining the tuning/config you apply to both
>> the host and the guest (isolation, HT, hugepage size, ...) in your
>> setup?
>
>
> The setup where it goes wrong:
>  1. Xeon E5-2699, HT on, turbo off, 1GB hugepage for both host and guest
On the Haswell machine (on which I don't have BIOS access), HT is on,
but I unplug siblings at runtime.
I also have 1G pages on both sides, and I isolate the cores used by both
testpmd and the vCPUs.
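
As a rough illustration of unplugging siblings at runtime, the sketch
below picks which logical CPUs to offline so that each core keeps only
its lowest-numbered thread. The helper names and the sample topology are
hypothetical; the input strings follow the format of the Linux sysfs
file /sys/devices/system/cpu/cpuN/topology/thread_siblings_list, and the
actual offlining is done as root by writing 0 to
/sys/devices/system/cpu/cpuN/online.

```python
def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,36" or "0-1,4" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def siblings_to_offline(siblings_by_cpu):
    """Return the CPUs to offline, keeping the lowest thread of each core."""
    offline = set()
    for text in siblings_by_cpu.values():
        threads = parse_cpu_list(text)
        offline.update(threads[1:])  # everything but the first sibling
    return sorted(offline)


# Hypothetical 2-core, 4-thread topology (not taken from the machine above).
topology = {0: "0,2", 1: "1,3", 2: "0,2", 3: "1,3"}

# Print the commands instead of running them, since they need root.
for cpu in siblings_to_offline(topology):
    print(f"echo 0 > /sys/devices/system/cpu/cpu{cpu}/online")
```

This only prints the commands; it does not check that the cores being
kept are the isolated ones.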

>  2. Fortville 40G
>  3. Fedora 4.7.5-200.fc24.x86_64
>  4. gcc version 6.2.1
>  5. 16.11 RC2 for both host and guest
>  6. PVP, testpmd macswap for both host and guest
>
> BTW, I do see indirect_desc gives slightly better performance for loopback
> in tests on other platforms, but don't know how PVP performs yet.
Interesting. Are the other platforms also Haswell/Broadwell?

For the PVP benchmarks, were your figures measured at 0% packet loss?
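
To keep the comparisons in this thread on the same footing, here is a
minimal sketch of the two quantities being discussed: the relative
throughput change between indirect on and off, and the packet loss
percentage. The function names are hypothetical and the packet counters
are made-up placeholders; only the 7.75 and 7.35 Mpps loopback figures
come from this thread.

```python
def relative_change_pct(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


def loss_pct(tx_packets, rx_packets):
    """Percentage of transmitted packets that were not received back."""
    if tx_packets == 0:
        return 0.0
    return (tx_packets - rx_packets) / tx_packets * 100.0


# Loopback figures quoted in this thread, in Mpps.
indirect_on, indirect_off = 7.75, 7.35
change = relative_change_pct(indirect_on, indirect_off)
print(f"indirect on vs off: {change:+.1f}%")

# Placeholder counters, for illustration only.
print(f"loss: {loss_pct(1_000_000, 999_000):.3f}%")
```

A throughput figure taken with non-zero loss is not directly comparable
to one taken at 0% loss, which is why both numbers matter here.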

Thanks,
Maxime

>
>
>>
>> Regards,
>> Maxime

