[dpdk-dev] [PATCH v2 03/20] i40e: call i40e_txd_enable_checksum only for offloaded packets

Zhang, Helin helin.zhang at intel.com
Fri Feb 13 03:25:36 CET 2015



> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Thursday, February 12, 2015 1:14 AM
> To: Zhang, Helin; dev at dpdk.org
> Cc: Ananyev, Konstantin; Liu, Jijiang
> Subject: Re: [PATCH v2 03/20] i40e: call i40e_txd_enable_checksum only for
> offloaded packets
> 
> Hi Helin,
> 
> On 02/11/2015 06:32 AM, Zhang, Helin wrote:
> >> On 02/10/2015 07:03 AM, Zhang, Helin wrote:
> >>>>    		/* Enable checksum offloading */
> >>>>    		cd_tunneling_params = 0;
> >>>> -		i40e_txd_enable_checksum(ol_flags, &td_cmd, &td_offset,
> >>>> -						l2_len, l3_len, outer_l2_len,
> >>>> -						outer_l3_len,
> >>>> -						&cd_tunneling_params);
> >>>> +		if (ol_flags & I40E_TX_CKSUM_OFFLOAD_MASK) {
> >>> likely should be added.
> >>
> >> I would say unlikely() instead. I think the non-offload case should
> >> be the default one. What do you think?
> 
> Maybe you missed this comment. Any thoughts?
Ohh, sorry for the missing!
I'd prefer to have likely, as hardware offload is preferred if there is. If you
don't think so, how about to keep nothing (no likely/unlikely) as it is.

> 
> 
> >>>> +			i40e_txd_enable_checksum(ol_flags, &td_cmd,
> &td_offset,
> >>>> +				l2_len, l3_len, outer_l2_len,
> >>>> +				outer_l3_len,
> >>>> +				&cd_tunneling_params);
> >>>> +		}
> >>> As this code changes are in fast path, performance regression test
> >>> is needed. I would like to see the performance difference with or
> >>> without this patch set. Hopefully nothing different. If you need any
> >>> helps, just let me
> >> know.
> >>
> >> I'm sorry, I won't have the needed resources to bench this as I would
> >> have to setup a performance platform with i40e devices.
> >>
> >> But I'm pretty sure that the code in non-offload case would be faster
> >> with this patch as it will avoid many operations in
> i40e_txd_enable_checksum().
> >>
> >> For the offload case, as we also removed the if (l2_len == 0) and if
> >> (l3_len == 0), I think there are also less tests than before my patch series.
> >>
> >> So in my opinion, adding this test does not really justify to check
> >> the performance.
> > As 40G is quite sensitive on cpu cycles, we'd better to avoid any
> > performance drop during our modifying the code for fast path.
> > Performance is what we care about too much. Based on my experiences,
> > even minor code changes may result in big performance impact.
> > It seems that we may need to help you on performance measurement.
> 
> Thanks, indeed it's helpful if you can check performance non-regression.
I have asked our validation guys here to help you on that, but might not in
high priority. In addition, we all will take vocation for the coming Chinese new year.

Regards,
Helin

> 
> Regards,
> Olivier



More information about the dev mailing list