[dpdk-dev] [PATCH v2 12/12] virtio: add Tso support

Yuanhan Liu yuanhan.liu at linux.intel.com
Thu Oct 13 17:29:35 CEST 2016


On Thu, Oct 13, 2016 at 05:15:24PM +0200, Olivier MATZ wrote:
> 
> 
> On 10/13/2016 05:01 PM, Yuanhan Liu wrote:
> >On Thu, Oct 13, 2016 at 04:52:25PM +0200, Olivier MATZ wrote:
> >>
> >>
> >>On 10/13/2016 04:16 PM, Yuanhan Liu wrote:
> >>>On Thu, Oct 13, 2016 at 04:02:49PM +0200, Olivier MATZ wrote:
> >>>>
> >>>>
> >>>>On 10/13/2016 10:18 AM, Yuanhan Liu wrote:
> >>>>>On Mon, Oct 03, 2016 at 11:00:23AM +0200, Olivier Matz wrote:
> >>>>>>+/* When doing TSO, the IP length is not included in the pseudo header
> >>>>>>+ * checksum of the packet given to the PMD, but for virtio it is
> >>>>>>+ * expected.
> >>>>>>+ */
> >>>>>>+static void
> >>>>>>+virtio_tso_fix_cksum(struct rte_mbuf *m)
> >>>>>>+{
> >>>>>>+	/* common case: header is not fragmented */
> >>>>>>+	if (likely(rte_pktmbuf_data_len(m) >= m->l2_len + m->l3_len +
> >>>>>>+			m->l4_len)) {
> >>>>>...
> >>>>>>+		/* replace it in the packet */
> >>>>>>+		th->cksum = new_cksum;
> >>>>>>+	} else {
> >>>>>...
> >>>>>>+		/* replace it in the packet */
> >>>>>>+		*rte_pktmbuf_mtod_offset(m, uint8_t *,
> >>>>>>+			m->l2_len + m->l3_len + 16) = new_cksum.u8[0];
> >>>>>>+		*rte_pktmbuf_mtod_offset(m, uint8_t *,
> >>>>>>+			m->l2_len + m->l3_len + 17) = new_cksum.u8[1];
> >>>>>>+	}
> >>>>>
> >>>>>The tcp header will always be in the mbuf, right? Otherwise, you can't
> >>>>>update the cksum field here. What's the point of introducing the "else
> >>>>>clause" then?
> >>>>
> >>>>Sorry, I don't see the problem you're pointing out here.
> >>>>
> >>>>What I want to solve here is to support the cases where the mbuf is
> >>>>segmented in the middle of the network header (which is probably a rare
> >>>>case).
> >>>
> >>>How it's gonna segmented?
> >>
> >>The mbuf is given by the application. So if the application generates a
> >>segmented mbuf, it should work.
> >>
> >>This could happen for instance if the application uses mbuf clones to share
> >>the IP/TCP/data part of the mbuf and prepend a specific Ethernet/vlan for
> >>different destination.
> >>
> >>
> >>>>In the "else" part, I only access the mbuf byte by byte using the
> >>>>rte_pktmbuf_mtod_offset() accessor. An alternative would have been to copy
> >>>>the header in a linear buffer, fix the checksum, then copy it again in the
> >>>>packet, but there is no mbuf helpers to do these copies for now.
> >>>
> >>>In the "else" clause, the ip header is still in the mbuf, right?
> >>>Why do you have to access it the way like:
> >>>
> >>>	ip_version = *rte_pktmbuf_mtod_offset(m, uint8_t *,
> >>>		m->l2_len) >> 4;
> >>>
> >>>Why can't you just use
> >>>
> >>>	iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len);
> >>>	iph->version_ihl ....;
> >>
> >>AFAIK, there is no requirement that each network header has to be contiguous
> >>in a mbuf segment.
> >>
> >>Of course, a split in the middle of a network header probably never
> >>happens... but we never knows, as it is not forbidden. I think the code
> >>should be robust enough to avoid accesses to wrong addresses.
> >>
> >>Hope it's clear enough :)
> >
> >Thanks, but not really. Maybe let me ask this way: what wrong would
> >happen if we use
> >	iph = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, m->l2_len);
> >to access the IP header? Is it about the endian?
> 
> If you have a packet split like this:
> 
> mbuf segment 1                     mbuf segment 2
> ----------------------------       ------------------------------
> | Ethernet header |  IP hea|       |der | TCP header | data
> ----------------------------       ------------------------------
>                    ^
>                    iph

Thanks, that's clear. How could you be able to access the tcp header
from the first mbuf then? I mean, how is the following code supposed
to work?

    prev_cksum.u8[0] = *rte_pktmbuf_mtod_offset(m, uint8_t *,
			m->l2_len + m->l3_len + 16);

> The IP header is not contiguous. So accessing to the end of the structure
> will access to a wrong location.
> 
> >One more question is do you have any case to trigger the "else" clause?
> 
> No, but I think it may happen.

A piece of untest code is not trusted though ...

	--yliu


More information about the dev mailing list