[dpdk-dev] [PATCH v2] mbuf: optimize memory loads during mbuf freeing

Olivier Matz olivier.matz at 6wind.com
Fri Mar 27 09:13:14 CET 2020


Hi,

On Fri, Mar 20, 2020 at 03:55:15PM +0000, Alexander Kozyrev wrote:
> The introduction of pinned external buffers doubled the memory loads in
> the rte_pktmbuf_prefree_seg() function. Analysis of the generated assembly
> code shows an unnecessary load of the pool field of the rte_mbuf structure.
> Here is the assembly snippet for "if (!RTE_MBUF_DIRECT(m))".
> Before the change the code was:
> 	movq  0x18(%rbx), %rax // load the ol_flags field
> 	test %r13, %rax	       // check if ol_flags has any 0x60...0 bit set
> 	jz 0x9a8718 <Block 2>  // jump out to "if (m->next != NULL)"
> After the change the code became:
> 	movq  0x18(%rbx), %rax // load ol_flags
> 	test %r14, %rax	       // check if ol_flags has any 0x60...0 bit set
> 	jnz 0x9bea38 <Block 2> // jump in to "if (!RTE_MBUF_HAS_EXTBUF(m)"
> 	movq  0x48(%rbx), %rax // load the pool field
> 	jmp 0x9bea78 <Block 7> // jump out to "if (m->next != NULL)"
> It looks like this unneeded memory load of the pool field is a GCC
> (4.8.5) optimization for the external buffer case, since Clang generates
> the same assembly both before and after the change.
> In addition, GCC favors the external buffer case over the simple case.
> This assembly code layout degrades performance because the
> rte_pktmbuf_prefree_seg() function is part of a very hot path.
> Work around this code-generation issue by separating the check for a
> pinned buffer from the check for an external buffer, restoring the
> initial code flow that favors the direct mbuf case over the external one.
> 
> Fixes: 6ef1107ad4c6 ("mbuf: detach mbuf with pinned external buffer")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Alexander Kozyrev <akozyrev at mellanox.com>
> Acked-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>

Acked-by: Olivier Matz <olivier.matz at 6wind.com>

Thanks!
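
For readers following the commit message, the control-flow change it
describes can be sketched roughly as below. This is only an illustration
under stated assumptions, not the actual DPDK code: every toy_* name, the
struct layouts, and the TOY_POOL_PINNED bit are hypothetical stand-ins,
and the idea that the pinned state is read through the pool pointer is
inferred from the "load the pool field" line in the listing. The two flag
bits are chosen so that together they form the 0x60...0 mask mentioned
there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; together the two bits form the 0x60...0 mask. */
#define TOY_EXT_ATTACHED (1ULL << 61)
#define TOY_IND_ATTACHED (1ULL << 62)
#define TOY_POOL_PINNED  (1U << 0)

struct toy_pool { uint32_t flags; };
struct toy_mbuf { uint64_t ol_flags; struct toy_pool *pool; };

/* Direct mbuf: neither attached bit is set. Reads only ol_flags. */
static bool toy_is_direct(const struct toy_mbuf *m)
{
	return (m->ol_flags & (TOY_EXT_ATTACHED | TOY_IND_ATTACHED)) == 0;
}

static bool toy_has_extbuf(const struct toy_mbuf *m)
{
	return (m->ol_flags & TOY_EXT_ATTACHED) != 0;
}

/* Pinned check: the only helper here that dereferences m->pool. */
static bool toy_has_pinned(const struct toy_mbuf *m)
{
	assert(m->pool != NULL);
	return (m->pool->flags & TOY_POOL_PINNED) != 0;
}

/* Combined form: the pinned check shares one condition with the extbuf check. */
static bool toy_should_detach_combined(const struct toy_mbuf *m)
{
	if (!toy_is_direct(m)) {
		if (!toy_has_extbuf(m) || !toy_has_pinned(m))
			return true;
	}
	return false;
}

/* Split form: the direct case exits first, having read only ol_flags. */
static bool toy_should_detach_split(const struct toy_mbuf *m)
{
	if (toy_is_direct(m))
		return false;
	if (toy_has_extbuf(m) && toy_has_pinned(m))
		return false;
	return true;
}

/* Count inputs on which the two forms disagree (expected: none). */
static int toy_count_mismatches(void)
{
	const uint64_t flags[] = { 0, TOY_EXT_ATTACHED, TOY_IND_ATTACHED,
				   TOY_EXT_ATTACHED | TOY_IND_ATTACHED };
	int mismatches = 0;

	for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		for (uint32_t pinned = 0; pinned <= 1; pinned++) {
			struct toy_pool pool = { pinned ? TOY_POOL_PINNED : 0 };
			struct toy_mbuf m = { flags[i], &pool };

			if (toy_should_detach_combined(&m) !=
			    toy_should_detach_split(&m))
				mismatches++;
		}
	}
	return mismatches;
}
```

Both forms compute the same result for every flag combination, so the
difference is only in how the source is laid out for the compiler: the
split form makes it explicit that the direct mbuf path needs nothing
beyond ol_flags. Whether a given compiler then avoids the extra load
still has to be confirmed by inspecting the generated assembly.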

