[dpdk-dev] rte_ether_addr_copy() strange comment

Van Haaren, Harry harry.van.haaren at intel.com
Fri Jun 26 14:41:29 CEST 2020


> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit at intel.com>
> Sent: Friday, June 26, 2020 1:08 PM
> To: Morten Brørup <mb at smartsharesystems.com>; dev at dpdk.org
> Cc: Olivier Matz <olivier.matz at 6wind.com>; Van Haaren, Harry
> <harry.van.haaren at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>
> Subject: Re: [dpdk-dev] rte_ether_addr_copy() strange comment
> 
> On 6/25/2020 4:45 PM, Morten Brørup wrote:
> > The function rte_ether_addr_copy() checks for __INTEL_COMPILER and has a
> comment about "a strange gcc warning". It says:
> >
> > static inline void rte_ether_addr_copy(const struct rte_ether_addr *ea_from,
> > 				   struct rte_ether_addr *ea_to)
> > {
> > #ifdef __INTEL_COMPILER
> > 	uint16_t *from_words = (uint16_t *)(ea_from->addr_bytes);
> > 	uint16_t *to_words   = (uint16_t *)(ea_to->addr_bytes);
> >
> > 	to_words[0] = from_words[0];
> > 	to_words[1] = from_words[1];
> > 	to_words[2] = from_words[2];
> > #else
> > 	/*
> > 	 * Use the common way, because of a strange gcc warning.
> > 	 */
> > 	*ea_to = *ea_from;
> > #endif
> > }
> >
> > I can see that from_words discards the const qualifier. Is that the "strange" gcc
> warning the comment is referring to?
> >
> > This goes back to before the first public release of DPDK in 2013, ref.
> https://git.dpdk.org/dpdk/log/lib/librte_ether/rte_ether.h
> >
> >
> > It should be fixed as follows:
> >
> > -	uint16_t *from_words = (uint16_t *)(ea_from->addr_bytes);
> > -	uint16_t *to_words   = (uint16_t *)(ea_to->addr_bytes);
> > +	const uint16_t *from_words = (const uint16_t *)ea_from;
> > +	uint16_t       *to_words   = (uint16_t *)ea_to;
> >
> > And the consequential question: Is copying the three shorts faster than
> copying the struct? In other words: Should we get rid of the #ifdef and use the
> first method only?
> 
> 
> I tried to investigate this in godbolt: https://godbolt.org/z/YSmvDn

There was a small hiccup in the struct mac definition in the above Godbolt link: it is aligned to 16 there, but the DPDK struct is aligned to 2.
With the aligned attribute changed to 2 (as per the DPDK header https://git.dpdk.org/dpdk/tree/lib/librte_net/rte_ether.h#n59 )
we get the required (but less performant) smaller stores:

WORD_COPY = 0, Aligned = 16
NOTE: The alignment here is incorrect, and the resulting ASM is invalid because the 16-byte store overwrites the 10 bytes after the eth addr.
This code is from a Godbolt test only, and this bug is NOT present in upstream DPDK.
        movdqa  (%rsi), %xmm0
        movaps  %xmm0, (%rdi)
        ret

WORD_COPY = 0, Aligned = 2 (correct, as per dpdk header)
        movl    (%rsi), %eax
        movl    %eax, (%rdi)
        movzwl  4(%rsi), %eax
        movw    %ax, 4(%rdi)
        ret

WORD_COPY = 1, Aligned = 2
        movzwl  (%rsi), %eax
        movw    %ax, (%rdi)
        movzwl  2(%rsi), %eax
        movw    %ax, 2(%rdi)
        movzwl  4(%rsi), %eax
        movw    %ax, 4(%rdi)
        ret

<snip previous output>

Ferruh said:
> Related to the struct copy vs word copy, struct copy seems with less instruction [1],[2],
> my vote to remove ifdef and keep struct copy.

+1 for struct copy here too; it gives the compiler space to optimize within its alignment bounds.
It does the best that it can (given a 6-byte area to store into), with one 4-byte store and one 2-byte store.
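
For anyone who wants to reproduce the comparison, here is a minimal sketch of the two variants side by side. The names struct mac_addr, u16_alias, mac_copy_struct() and mac_copy_words() are hypothetical stand-ins for illustration and are not DPDK APIs. The may_alias typedef is an assumption on my side (GCC/Clang specific) to keep the 16-bit accesses well defined.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_ether_addr: 6 bytes, 2-byte aligned. */
struct mac_addr {
	uint8_t addr_bytes[6];
} __attribute__((aligned(2)));

_Static_assert(sizeof(struct mac_addr) == 6, "expected a 6-byte address");

/* Hypothetical typedef: 16-bit word that may alias other types (GCC/Clang). */
typedef uint16_t __attribute__((__may_alias__)) u16_alias;

/* Struct copy: the compiler picks the stores allowed by the alignment. */
static inline void
mac_copy_struct(const struct mac_addr *from, struct mac_addr *to)
{
	*to = *from;
}

/* Word copy: three explicit 16-bit copies, keeping the const qualifier. */
static inline void
mac_copy_words(const struct mac_addr *from, struct mac_addr *to)
{
	const u16_alias *from_words = (const u16_alias *)from;
	u16_alias *to_words = (u16_alias *)to;

	to_words[0] = from_words[0];
	to_words[1] = from_words[1];
	to_words[2] = from_words[2];
}
```

Both functions should produce the same 6 bytes in the destination; the only difference is how much freedom the compiler has when choosing the store widths.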

Regards, -Harry

PS: For extra bonus points, here's a SIMD version that only uses one store
https://godbolt.org/z/VAR2La. Unless you intend on copying billions of
L1 resident eth addrs, this may or may not be a useful optimization.
Note that it requires the 10 bytes after the ether addr to be valid to read.
It loads 16B across both SRC and DST, blends 48 bits of SRC into DST and
writes the result back to DST.
        movdqu  (%rsi), %xmm0
        movdqu  (%rdi), %xmm1
        pblendw $7, %xmm1, %xmm0
        movups  %xmm0, (%rdi)
        ret

Actually, it's possible to do this using a uint64_t (8 byte scalar) load/store too,
with some masking and bitwise OR... left as an exercise to the reader? :)
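
One possible take on that exercise, as a rough sketch only. mac_copy_u64() is a hypothetical helper, not a DPDK API. It assumes a little-endian target and that the 2 bytes after each address are valid to read and write, similar to the caveat for the SIMD version.

```c
#include <stdint.h>
#include <string.h>

/* The mask below only selects the first 6 bytes on little-endian targets. */
_Static_assert(__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__,
	       "sketch assumes a little-endian target");

/*
 * Hypothetical helper: copy 6 bytes from src to dst using one 8-byte load
 * from each side and a single 8-byte store. The 2 bytes following dst are
 * read and written back unchanged, so they must be valid memory.
 */
static inline void
mac_copy_u64(uint8_t *dst, const uint8_t *src)
{
	const uint64_t mask = 0x0000FFFFFFFFFFFFULL; /* low 48 bits */
	uint64_t s, d;

	memcpy(&s, src, sizeof(s));	/* unaligned-safe 8-byte load */
	memcpy(&d, dst, sizeof(d));
	d = (d & ~mask) | (s & mask);	/* keep dst's trailing 2 bytes */
	memcpy(dst, &d, sizeof(d));	/* single 8-byte store */
}
```

Note that the read-modify-write of the 2 trailing bytes is not safe if another thread may update them concurrently.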

