|SUCCESS| [v3] net/ice: add NEON-optimised Rx/Tx vector paths
dpdklab at iol.unh.edu
dpdklab at iol.unh.edu
Thu Apr 9 03:37:51 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/162909
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-04-09
I'll review this DPDK patch for correctness bugs, coding style compliance, and guideline violations.
git sha: Not available in patch (patchwork format)
Author: Jay Wang <jay.wang2 at arm.com>
Subject: [PATCH v3] net/ice: add NEON-optimised Rx/Tx vector paths
This patch adds NEON-optimized receive and transmit paths to the Intel
ice driver for ARM64 platforms.
> diff --git a/.mailmap b/.mailmap
> index beccc84425..dfe92b0399 100644
> --- a/.mailmap
> +++ b/.mailmap
[ ... ]
> diff --git a/drivers/net/intel/ice/ice_ethdev.h b/drivers/net/intel/ice/ice_ethdev.h
> index 4b3718f715..f6fd3bf106 100644
> --- a/drivers/net/intel/ice/ice_ethdev.h
> +++ b/drivers/net/intel/ice/ice_ethdev.h
[ ... ]
> diff --git a/drivers/net/intel/ice/ice_rxtx.c b/drivers/net/intel/ice/ice_rxtx.c
> index 31b74be9ba..b34231c212 100644
> --- a/drivers/net/intel/ice/ice_rxtx.c
> +++ b/drivers/net/intel/ice/ice_rxtx.c
> @@ -2515,7 +2515,9 @@ ice_dev_supported_ptypes_get(struct rte_eth_dev *dev, size_t *no_of_elements)
> ad->rx_func_type == ICE_RX_AVX512 ||
> ad->rx_func_type == ICE_RX_AVX512_SCATTERED ||
> ad->rx_func_type == ICE_RX_AVX512_OFFLOAD ||
> - ad->rx_func_type == ICE_RX_AVX512_SCATTERED_OFFLOAD)
> + ad->rx_func_type == ICE_RX_AVX512_SCATTERED_OFFLOAD ||
> + ad->rx_func_type == ICE_RX_NEON ||
> + ad->rx_func_type == ICE_RX_NEON_SCATTERED)
> return ptypes;
Does this code need RTE_ARCH_ARM64 guards around the NEON checks? The
AVX512 checks are x86-specific, but are these enum values always defined
regardless of architecture?
> @@ -3356,6 +3358,26 @@ static const struct ci_rx_path_info ice_rx_path_infos[] = {
> }
> },
> #endif
> +#elif defined(RTE_ARCH_ARM64)
> + [ICE_RX_NEON] = {
> + .pkt_burst = ice_recv_pkts_vec,
> + .info = "Vector Neon",
> + .features = {
> + .rx_offloads = ICE_RX_VECTOR_OFFLOAD_OFFLOADS,
> + .simd_width = RTE_VECT_SIMD_128,
> + .bulk_alloc = true
> + }
> + },
> + [ICE_RX_NEON_SCATTERED] = {
> + .pkt_burst = ice_recv_scattered_pkts_vec,
> + .info = "Vector Neon Scattered",
> + .features = {
> + .rx_offloads = ICE_RX_VECTOR_OFFLOAD_OFFLOADS,
> + .simd_width = RTE_VECT_SIMD_128,
> + .scattered = true,
> + .bulk_alloc = true
> + }
> + },
> #endif
[ ... ]
> diff --git a/drivers/net/intel/ice/ice_rxtx.h b/drivers/net/intel/ice/ice_rxtx.h
> index 77ed41f9fd..999b6b30d6 100644
> --- a/drivers/net/intel/ice/ice_rxtx.h
> +++ b/drivers/net/intel/ice/ice_rxtx.h
> @@ -261,6 +261,12 @@ const uint32_t *ice_dev_supported_ptypes_get(struct rte_eth_dev *dev,
> void ice_select_rxd_to_pkt_fields_handler(struct ci_rx_queue *rxq,
> uint32_t rxdid);
>
> +uint16_t ice_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> + uint16_t nb_pkts);
> +uint16_t ice_recv_scattered_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> + uint16_t nb_pkts);
> +uint16_t ice_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
> + uint16_t nb_pkts);
Are these function declarations needed in the header when they're
architecture-specific? The x86 vector implementations don't expose these
in the header. Should these be guarded with RTE_ARCH_ARM64 or declared
in a private header?
> diff --git a/drivers/net/intel/ice/ice_rxtx_vec_neon.c b/drivers/net/intel/ice/ice_rxtx_vec_neon.c
> new file mode 100644
> index 0000000000..8ec5942541
> --- /dev/null
> +++ b/drivers/net/intel/ice/ice_rxtx_vec_neon.c
> @@ -0,0 +1,761 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2026 Intel Corporation
> + * Copyright(c) 2026 Arm Limited
> + */
> +
> +#include "ice_rxtx_vec_common.h"
> +
> +#include "../common/rx_vec_arm.h"
> +
> +#include <rte_vect.h>
Does this include order violate the coding style? Per the style guide,
system includes should precede local includes. The order should be:
rte_vect.h before ice_rxtx_vec_common.h and rx_vec_arm.h.
> +static __rte_always_inline uint32x4_t
> +ice_flex_rxd_to_fdir_flags_vec(const uint32x4_t fdir_id0_3)
> +{
> +#define FDID_MIS_MAGIC 0xFFFFFFFFu
Should this local macro be undefined at the end of the function to avoid
namespace pollution?
> +static __rte_always_inline void
> +ice_flex_rxd_to_olflags_v(struct ci_rx_queue *rxq, uint64x2_t descs[4],
> + struct rte_mbuf **rx_pkts)
Does this indentation follow the style guide? The continuation line for
rx_pkts should be aligned differently. Per DPDK style, function
parameters should be indented by one tab plus alignment.
> + /**
> + * mask everything except checksum, RSS and VLAN flags
> + * bit fields defined in enum ice_rx_flex_desc_status_error_0_bits
> + * bit7:4 for checksum.
> + * bit12 for RSS indication.
> + * bit13 for VLAN indication.
> + */
> + /**
> + * map the checksum, rss and vlan fields to the checksum, rss
> + * and vlan flag.
> + */
> + /**
> + * extract status_error0 field from 4 descriptors,
> + * and mask out everything else not in desc_mask
> + */
> + /**
> + * shift each 32-bit lane right by 4 so that we can use
> + * the checksum bit as an index into cksum_flags
> + */
> + /**
> + * shift left by 1 bit because we shift right by 1 bit
> + * in cksum_flags
> + */
> + /**
> + * we need to mask out the redundant bits introduced by RSS or
> + * VLAN fields.
> + */
> + /**
> + * At this point, we have the 4 sets of flags in the low 16-bits
> + * of each 32-bit value in flags.
> + * We want to extract these, and merge them with the mbuf init data
> + * so we can do a single 16-byte write to the mbuf to set the flags
> + * and all the other initialization fields. Extracting the
> + * appropriate flags means that we have to do a shift and blend for
> + * each mbuf before we do the write.
> + */
Are these multi-line comments properly formatted? Several violate the
requirement that multi-line comments should have opening /* and closing
*/ on separate lines.
> +static __rte_always_inline void
> +ice_rxq_rearm(struct ci_rx_queue *rxq)
> +{
> + ci_rxq_rearm(rxq);
> +}
Does this trivial wrapper function provide any value? It's a one-line
function that just calls another function with the same parameters.
> + const uint8x16_t cksum_flags = {
> + (RTE_MBUF_F_RX_OUTER_L4_CKSUM_GOOD >> 20 | RTE_MBUF_F_RX_L4_CKSUM_GOOD |
> + RTE_MBUF_F_RX_IP_CKSUM_GOOD) >> 1,
Does this code have excessive line length? This initialization spans
multiple lines but could be formatted more clearly with the expressions
broken at different points for readability.
> + uint16x8_t crc_adjust = {
> + 0, 0, /* ignore pkt_type field */
> + rxq->crc_len, /* sub crc on pkt_len */
> + 0, /* ignore high-16bits of pkt_len */
> + rxq->crc_len, /* sub crc on data_len */
> + 0, 0, 0 /* ignore non-length fields */
> + };
Does this comment alignment use tabs? Comments should be aligned with
tabs, not spaces.
> + /* mask to shuffle from flex descriptor to mbuf */
> + const uint8x16_t shuf_msk = {
> + 0xFF, 0xFF, /* pkt_type set as unknown */
> + 0xFF, 0xFF, /* pkt_type set as unknown */
> + 4, 5, /* octet 4~5, low bits pkt_len */
> + 0xFF, 0xFF, /* skip high 16 bits pkt_len, zero out */
> + 4, 5, /* octet 4~5, 16 bits data_len */
> + 10, 11, /* octet 10~11, 16 bits vlan_macip */
> + 0xFF, 0xFF, /* rss hash parsed separately */
> + 0xFF, 0xFF,
> + };
Does this trailing comma in the initializer comply with DPDK style?
> + /* 4 packets DD mask */
> + const uint16x8_t dd_check = {
> + 0x0001, 0x0001, 0x0001, 0x0001,
> + 0, 0, 0, 0
> + };
> +
> + /* 4 packets EOP mask */
> + const uint8x16_t eop_check = {
> + 0x2, 0, 0x2, 0, 0x2, 0, 0x2, 0,
> + 0, 0, 0, 0, 0, 0, 0, 0
> + };
> + /* check to see if there is actually a packet available */
> + if (!(rxdp->wb.status_error0 &
> + rte_cpu_to_le_32(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S)))
> + return 0;
Does this code use implicit zero comparison? Per the coding style,
explicit comparison should be used: "== 0".
> + uint16x8_t sterr_tmp1, sterr_tmp2;
> + uint64x2_t mbp1, mbp2;
> + uint16x8_t staterr;
> + uint16x8_t tmp;
> + uint64_t stat;
Does this variable declaration order follow reverse Christmas tree
ordering as required by DPDK style?
> +#ifndef RTE_NET_INTEL_USE_16BYTE_DESC
Does this preprocessor directive have unnecessary whitespace? There's a
tab character before RTE_NET_INTEL_USE_16BYTE_DESC.
> + /**
> + * needs to load 2nd 16B of each desc for RSS hash parsing,
> + * will cause performance drop to get into this context.
> + */
Is this multi-line comment properly formatted? The opening /** should be
on its own line.
> + /**
> + * to shift the 32b RSS hash value to the
> + * highest 32b of each 128b before mask
> + */
> + /**
> + * the staterr values are not in order, even though
> + * the count of DD bits doesn't care. However, for
> + * end of packet tracking, we do care, so shuffle.
> + * Previously: descs[3] descs[1] descs[2] descs[0]
> + * Shuffled: descs[0] descs[1] descs[2] descs[3]
> + */
Are these multi-line comments using the correct style?
> + /**
> + * move the statue bit (bit0) into the sign bit (bit15)
> + * of each 16-bit lane
> + */
Does "statue" need to be "status"? This appears to be a typo.
> + /**
> + * reinterpret staterr as a signed 16-bit and
> + * arithmetic-shift-right by 15
> + * each lane becomes 0xFFFF if original DD bit was 1, otherwise 0.
> + * then interpret back to unsigned u16 vector
> + */
> + /**
> + * reinterpret u16x8 vector as u64x2, and fetch the low u64
> + * which contains the first four 16-bit lanes, and invert
> + * all bits.
> + */
> + if (unlikely(stat == 0)) {
Does this code use explicit zero comparison correctly per the style
guide?
> + /* check no split flags in both previous and current bursts */
> + if (!rxq->pkt_first_seg &&
> + split_fl64[0] == 0 && split_fl64[1] == 0 &&
> + split_fl64[2] == 0 && split_fl64[3] == 0)
Does this condition use implicit pointer comparison? The style guide
requires "== NULL" for pointer comparisons.
> + if (!rxq->pkt_first_seg) {
Does this code use implicit pointer comparison? Should be
"== NULL".
> +static __rte_always_inline void
> +ice_vtx1(volatile struct ci_tx
More information about the test-report
mailing list