[dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

Maxime Coquelin maxime.coquelin at redhat.com
Thu Sep 29 23:23:35 CEST 2016



On 09/29/2016 10:21 PM, Michael S. Tsirkin wrote:
> On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
>>
>>
>> On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
>>> On Thu, Sep 29, 2016 at 05:30:53PM +0200, Maxime Coquelin wrote:
>> ...
>>>>
>>>> Before enabling anything by default, we should first optimize the 1 slot
>>>> case. Indeed, micro-benchmark using testpmd in txonly[0] shows ~17%
>>>> perf regression for 64 bytes case:
>>>>  - 2 descs per packet: 11.6Mpps
>>>>  - 1 desc per packet: 9.6Mpps
>>>>
>>>> This is due to the virtio header clearing in virtqueue_enqueue_xmit().
>>>> Removing it, we get better results than with 2 descs (1.20Mpps).
>>>> Since the Virtio PMD doesn't support offloads, I wonder whether we can
>>>> just drop the memset?
>>>
>>> What will happen? Will the header be uninitialized?
>> Yes..
>> I didn't look closely at the spec, but just looked at DPDK's and Linux
>> vhost implementations. IIUC, the header is just skipped in the two
>> implementations.
>
> In linux guest skbs are initialized AFAIK. See virtio_net_hdr_from_skb
> first thing it does is
>         memset(hdr, 0, sizeof(*hdr));

I meant in vhost-net linux implementation, the header is just skipped:

static void handle_tx(struct vhost_net *net)
{
...
		/* Skip header. TODO: support TSO. */
		len = iov_length(vq->iov, out);
		iov_iter_init(&msg.msg_iter, WRITE, vq->iov, out, len);
		iov_iter_advance(&msg.msg_iter, hdr_size);

And the same is done is done in DPDK:

static inline int __attribute__((always_inline))
copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
		  uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx,
		  struct rte_mempool *mbuf_pool)
{
...
	/*
	 * A virtio driver normally uses at least 2 desc buffers
	 * for Tx: the first for storing the header, and others
	 * for storing the data.
	 */
	if (likely((desc->len == dev->vhost_hlen) &&
		   (desc->flags & VRING_DESC_F_NEXT) != 0)) {
		desc = &descs[desc->next];
		if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
			return -1;

		desc_addr = gpa_to_vva(dev, desc->addr);
		if (unlikely(!desc_addr))
			return -1;

		rte_prefetch0((void *)(uintptr_t)desc_addr);

		desc_offset = 0;
		desc_avail  = desc->len;
		nr_desc    += 1;

		PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
	} else {
		desc_avail  = desc->len - dev->vhost_hlen;
		desc_offset = dev->vhost_hlen;
	}
>
>
>
>>>
>>> The spec says:
>>> 	The driver can send a completely checksummed packet. In this case, flags
>>> 	will be zero, and gso_type
>>> 	will be VIRTIO_NET_HDR_GSO_NONE.
>>>
>>> and
>>> 	The driver MUST set num_buffers to zero.
>>> 	If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set flags to
>>> 	zero and SHOULD supply a fully
>>> 	checksummed packet to the device.
>>>
>>> and
>>> 	If none of the VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO options have been
>>> 	negotiated, the driver MUST
>>> 	set gso_type to VIRTIO_NET_HDR_GSO_NONE.
>>>
>>> so doing this unconditionally would be a spec violation, but if you see
>>> value in this, we can add a feature bit.
>> Right it would be a spec violation, so it should be done conditionally.
>> If a feature bit is to be added, what about VIRTIO_NET_F_NO_TX_HEADER?
>> It would imply VIRTIO_NET_F_CSUM not set, and no GSO features set.
>> If negotiated, we wouldn't need to prepend a header.
>
> Yes but two points.
>
> 1. why is this memset expensive? Is the test completely skipping looking
>    at the packet otherwise?
Yes.
>
> 2. As long as we are doing this, see
> 	Alignment vs. Networking
> 	========================
> in Documentation/unaligned-memory-access.txt
Thanks, I'll have a look tomorrow.

Maxime



More information about the dev mailing list