[dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path
maxime.coquelin at redhat.com
Fri Nov 4 08:41:58 CET 2016
On 11/04/2016 07:18 AM, Xu, Qian Q wrote:
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maxime Coquelin
> Sent: Thursday, November 3, 2016 4:11 PM
> To: Wang, Zhihong <zhihong.wang at intel.com>; Yuanhan Liu <yuanhan.liu at linux.intel.com>
> Cc: mst at redhat.com; dev at dpdk.org; vkaplans at redhat.com
> Subject: Re: [dpdk-dev] [PATCH v4] vhost: Add indirect descriptors support to the TX path
>> The strange thing with both of our figures is that they are below
>> what I obtain with my SandyBridge machine. The SB CPU frequency is 4%
>> higher, but that doesn't explain the gap between the measurements.
>> I'm continuing the investigations on my side.
>> Maybe we should fix a deadline, and decide to disable indirect in
>> Virtio PMD if the root cause is not identified/fixed at some point?
>> Yuanhan, what do you think?
> I have done some measurements using perf, and now understand better what happens.
> With indirect descriptors, I can see a cache miss when fetching the descriptors in the indirect table. Actually, this is expected, so we prefetch the first desc as soon as possible, but still not soon enough to make it transparent.
> In the direct descriptors case, the desc in the virtqueue seems to remain in the cache from its previous use, so we have a hit.
> That said, in a realistic use case, I think we should not have a hit, even with direct descriptors.
> Indeed, the test case uses testpmd on the guest side with the forwarding set in IO mode. It means the packet content is never accessed by the guest.
> In my experiments, I usually set the "macswap" forwarding mode, which swaps the src and dest MAC addresses in the packet. I find it more realistic, because I don't see the point in sending packets to the guest if they are not accessed (not even their headers).
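For readers unfamiliar with the mode, here is a minimal sketch of what a macswap-style forwarder does to each frame. It is not testpmd's actual code; the function name `macswap` and the constant `MAC_LEN` are made up for illustration. The point is only that the guest has to load and store the first 12 bytes of every packet, which pulls the header into its cache, unlike IO forwarding.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6 /* bytes in an Ethernet MAC address */

/*
 * Illustrative only (hypothetical helper, not the testpmd implementation):
 * swap the destination and source MAC addresses of an Ethernet frame in
 * place. The destination MAC occupies bytes 0..5, the source bytes 6..11.
 */
static void macswap(uint8_t *frame)
{
    uint8_t tmp[MAC_LEN];

    memcpy(tmp, frame, MAC_LEN);             /* save destination MAC */
    memcpy(frame, frame + MAC_LEN, MAC_LEN); /* destination <- source */
    memcpy(frame + MAC_LEN, tmp, MAC_LEN);   /* source <- saved destination */
}
```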
> I tried the test case again, this time setting the forwarding mode to macswap in the guest. This time, I get the same performance with both direct and indirect (indirect is even a little better with a small optimization, which consists of systematically prefetching the first 2 descs, as we know they are contiguous).
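The prefetch optimization mentioned above could look roughly like the sketch below. It is not the code from the patch: `struct desc`, `DESC_F_NEXT`, `prefetch_first_descs` and `chain_len` are simplified, hypothetical stand-ins for the vhost structures, and `__builtin_prefetch` is the GCC/Clang builtin, assumed to be available.

```c
#include <stdint.h>

/* Simplified 16-byte descriptor, modelled on the virtio ring descriptor. */
struct desc {
    uint64_t addr;  /* guest address of the buffer */
    uint32_t len;   /* buffer length in bytes */
    uint16_t flags; /* DESC_F_NEXT if the chain continues */
    uint16_t next;  /* index of the next descriptor in the table */
};

#define DESC_F_NEXT 1

/*
 * Issue read prefetch hints for the first two entries of an indirect
 * table. Entries of an indirect table are contiguous in memory, so both
 * addresses are known before any descriptor has been read.
 */
static inline void prefetch_first_descs(const struct desc *table, uint32_t n)
{
    if (n > 0)
        __builtin_prefetch(&table[0], 0, 3);
    if (n > 1)
        __builtin_prefetch(&table[1], 0, 3);
}

/* Walk the chain starting at entry 0 and return the total buffer length. */
static uint32_t chain_len(const struct desc *table, uint32_t n)
{
    uint32_t total = 0, idx = 0, seen = 0;

    prefetch_first_descs(table, n);

    /* 'seen' bounds the walk so a looping chain cannot spin forever. */
    while (idx < n && seen++ < n) {
        total += table[idx].len;
        if (!(table[idx].flags & DESC_F_NEXT))
            break;
        idx = table[idx].next;
    }
    return total;
}
```

In a real data path the prefetch would be issued as early as possible, well before the table is walked, so that the memory latency overlaps with other work. Issuing it immediately before the loop, as done here for brevity, hides very little of the miss.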
> Do you agree we should assume that the packet (header and/or buf) will always be accessed by the guest application?
> ----Maybe it's true in many real use cases. But we also need to ensure there is no performance drop for "io fwd". As far as I know, the OVS-DPDK team will do the performance benchmark based on "IO fwd" for the virtio part, so they will also see some performance drop. We wondered whether it would be possible to make the feature off by default, so that anyone who wants to use it can turn it on. People could then choose whether to use the feature, just like vhost dequeue zero copy.
OVS adds an overhead compared to testpmd on host, and its cache
utilization might have the same effect as doing macswap.
Do you know who could test with OVS? I would be interested in the
And the difference today with zero-copy is that it can be enabled at
runtime, whereas we can only do it at build time on guest side for
It can be disabled in QEMU command line though.