[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path

Maxime Coquelin maxime.coquelin at redhat.com
Thu Nov 3 09:11:05 CET 2016



On 11/02/2016 11:51 AM, Maxime Coquelin wrote:
>
>
> On 10/31/2016 11:01 AM, Wang, Zhihong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Maxime Coquelin [mailto:maxime.coquelin at redhat.com]
>>> Sent: Friday, October 28, 2016 3:42 PM
>>> To: Wang, Zhihong <zhihong.wang at intel.com>; Yuanhan Liu
>>> <yuanhan.liu at linux.intel.com>
>>> Cc: stephen at networkplumber.org; Pierre Pfister (ppfister)
>>> <ppfister at cisco.com>; Xie, Huawei <huawei.xie at intel.com>; dev at dpdk.org;
>>> vkaplans at redhat.com; mst at redhat.com
>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>> support
>>> to the TX path
>>>
>>>
>>>
>>> On 10/28/2016 02:49 AM, Wang, Zhihong wrote:
>>>>
>>>>>> -----Original Message-----
>>>>>> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
>>>>>> Sent: Thursday, October 27, 2016 6:46 PM
>>>>>> To: Maxime Coquelin <maxime.coquelin at redhat.com>
>>>>>> Cc: Wang, Zhihong <zhihong.wang at intel.com>;
>>>>>> stephen at networkplumber.org; Pierre Pfister (ppfister)
>>>>>> <ppfister at cisco.com>; Xie, Huawei <huawei.xie at intel.com>;
>>>>>> dev at dpdk.org; vkaplans at redhat.com; mst at redhat.com
>>>>>> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors
>>>>>> support to the TX path
>>>>>>
>>>>>> On Thu, Oct 27, 2016 at 12:35:11PM +0200, Maxime Coquelin wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/27/2016 12:33 PM, Yuanhan Liu wrote:
>>>>>>>>>> On Thu, Oct 27, 2016 at 11:10:34AM +0200, Maxime Coquelin wrote:
>>>>>>>>>>>> Hi Zhihong,
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/27/2016 11:00 AM, Wang, Zhihong wrote:
>>>>>>>>>>>>>> Hi Maxime,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Seems indirect desc feature is causing serious performance
>>>>>>>>>>>>>> degradation on Haswell platform, about 20% drop for both
>>>>>>>>>>>>>> mrg=on and mrg=off (--txqflags=0xf00, non-vector version),
>>>>>>>>>>>>>> both iofwd and macfwd.
>>>>>>>>>>>> I tested PVP (with macswap on guest) and Txonly/Rxonly on an
>>>>>>>>>>>> Ivy Bridge platform, and didn't face such a drop.
>>>>>>>>>>
>>>>>>>>>> I was actually wondering whether that may be the cause. I tested
>>>>>>>>>> it with my IvyBridge server as well, and saw no drop.
>>>>>>>>>>
>>>>>>>>>> Maybe you should find a similar platform (Haswell) and have a
>>>>>>>>>> try?
>>>>>>>> Yes, that's why I asked Zhihong whether he could test Txonly in
>>>>>>>> the guest to see if the issue is reproducible like this.
>>>>>>
>>>>>> I have no Haswell box, otherwise I could do a quick test for you.
>>>>>> IIRC,
>>>>>> he tried to disable the indirect_desc feature, then the performance
>>>>>> recovered. So, it's likely the indirect_desc is the culprit here.
>>>>>>
>>>>>>>> It will be easier for me to find a Haswell machine if it does not
>>>>>>>> have to be connected back-to-back to a HW/SW packet generator.
>>>> In fact a simple loopback test will also do, without a pktgen.
>>>>
>>>> Start testpmd in both host and guest, and do "start" in one
>>>> and "start tx_first 32" in another.
>>>>
>>>> Perf drop is about 24% in my test.
>>>>
>>>
>>> Thanks, I never tried this test.
>>> I managed to find a Haswell platform (Intel(R) Xeon(R) CPU E5-2699 v3
>>> @ 2.30GHz), and can reproduce the problem with the loopback test you
>>> mentioned. I see a performance drop of about 10% (8.94Mpps/8.08Mpps).
>>> Out of curiosity, what are the numbers you get with your setup?
>>
>> Hi Maxime,
>>
>> Let's align our test case to RC2, mrg=on, loopback, on Haswell.
>> My results below:
>>  1. indirect=1: 5.26 Mpps
>>  2. indirect=0: 6.54 Mpps
>>
>> It's about 24% drop.
> OK, so on my side, same setup on Haswell:
> 1. indirect=1: 7.44 Mpps
> 2. indirect=0: 8.18 Mpps
>
> Still 10% drop in my case with mrg=on.
>
> The strange thing with both of our figures is that they are below what
> I obtain with my SandyBridge machine. The SB cpu freq is 4% higher, but
> that doesn't explain the gap between the measurements.
>
> I'm continuing the investigations on my side.
> Maybe we should fix a deadline, and decide to disable indirect in the
> Virtio PMD if the root cause is not identified/fixed by then?
>
> Yuanhan, what do you think?

I have done some measurements using perf, and now understand better
what happens.

With indirect descriptors, I can see a cache miss when fetching the
descriptors from the indirect table. This is actually expected, which is
why we prefetch the first desc as soon as possible, but it is still not
soon enough to make the miss transparent.
In the direct descriptors case, the desc in the virtqueue seems to remain
in the cache from its previous use, so we get a hit.
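
To make this more concrete, here is a rough sketch of the access pattern
I am describing. The names are illustrative, not the actual vhost code,
and gpa_to_vva stands for whatever routine translates a guest-physical
address into a host virtual address:

#include <stdint.h>
#include <rte_prefetch.h>
#include <linux/virtio_ring.h>  /* struct vring_desc, VRING_DESC_F_INDIRECT */

static struct vring_desc *
get_desc_chain(struct vring_desc *desc, void *(*gpa_to_vva)(uint64_t gpa))
{
    struct vring_desc *descs = desc;

    if (desc->flags & VRING_DESC_F_INDIRECT) {
        /*
         * The chain lives in a separate indirect table in guest memory,
         * which the host has usually not touched recently: this is
         * where the cache miss shows up in the perf profile.
         */
        descs = gpa_to_vva(desc->addr);

        /*
         * Prefetch the first entry as early as possible to hide part
         * of the miss latency behind the work that follows.
         */
        rte_prefetch0(descs);
    }

    return descs;
}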

That said, in a realistic use-case, I think we should not have a hit,
even with direct descriptors.
Indeed, the test case uses testpmd on the guest side with the forwarding
set in IO mode, which means the packet content is never accessed by the
guest.

In my experiments, I usually set the "macswap" forwarding mode, which
swaps the source and destination MAC addresses in the packet. I find it
more realistic, because I don't see the point in sending packets to the
guest if they are never accessed (not even their headers).
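
For reference, macswap forwarding amounts to roughly the following per
packet (a sketch against the 16.11-era ether API, not the actual testpmd
code), so the guest really does read and write the header:

#include <rte_ether.h>
#include <rte_mbuf.h>

static void
macswap_one(struct rte_mbuf *m)
{
    struct ether_hdr *eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
    struct ether_addr tmp;

    /* Swap the source and destination MAC addresses in place. */
    ether_addr_copy(&eth->d_addr, &tmp);
    ether_addr_copy(&eth->s_addr, &eth->d_addr);
    ether_addr_copy(&tmp, &eth->s_addr);
}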

I tried the test case again, this time setting the forwarding mode to
macswap in the guest. Now I get the same performance with both direct
and indirect descriptors (indirect is even a little better with a small
optimization, consisting in systematically prefetching the first two
descs, as we know they are contiguous).
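
For clarity, that small optimization is nothing more than a tweak of
this kind on top of the sketch above (again illustrative, not the actual
patch):

#include <rte_prefetch.h>
#include <linux/virtio_ring.h>

/*
 * Once the indirect table address is known, unconditionally prefetch
 * its first two entries, which are contiguous in memory.
 */
static inline void
prefetch_indirect_descs(const struct vring_desc *descs)
{
    rte_prefetch0(&descs[0]);
    rte_prefetch0(&descs[1]);
}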

Do you agree we should assume that the packet (header and/or buf) will
always be accessed by the guest application?
If so, do you agree we should keep indirect descs enabled, and maybe
update the test cases?

Thanks,
Maxime

