[dpdk-dev] [PATCH v2 5/7] net/mlx4: remove unnecessary variables in Tx burst
Adrien Mazarguil
adrien.mazarguil at 6wind.com
Wed Oct 25 18:49:52 CEST 2017
Hi Ophir/Matan,
On Mon, Oct 23, 2017 at 02:21:58PM +0000, Ophir Munk wrote:
> From: Matan Azrad <matan at mellanox.com>
>
> Remove usage of variables which doesn't add new information for
> performance improvement.
>
> Signed-off-by: Matan Azrad <matan at mellanox.com>
I'm almost 100% sure this commit wasn't validated for performance on its
own. Don't mention "performance improvement" in that case.
If you're removing a couple of local variables for readability, just say
so.
More below.
> ---
> drivers/net/mlx4/mlx4_rxtx.c | 8 ++------
> 1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
> index 014a6d3..e8d9a35 100644
> --- a/drivers/net/mlx4/mlx4_rxtx.c
> +++ b/drivers/net/mlx4/mlx4_rxtx.c
> @@ -285,8 +285,6 @@ mlx4_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
> struct txq *txq = (struct txq *)dpdk_txq;
> unsigned int elts_head = txq->elts_head;
> const unsigned int elts_n = txq->elts_n;
> - unsigned int elts_comp = 0;
> - unsigned int bytes_sent = 0;
> unsigned int i;
> unsigned int max;
> struct mlx4_sq *sq = &txq->msq;
> @@ -498,8 +496,7 @@ mlx4_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
> MLX4_BIT_WQE_OWN : 0));
> sq->head += nr_txbbs;
> elt->buf = buf;
> - bytes_sent += buf->pkt_len;
> - ++elts_comp;
> + txq->stats.obytes += buf->pkt_len;
> elts_head = elts_head_next;
> }
> /* Take a shortcut if nothing must be sent. */
> @@ -507,13 +504,12 @@ mlx4_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n)
> return 0;
> /* Increment send statistics counters. */
> txq->stats.opackets += i;
> - txq->stats.obytes += bytes_sent;
> /* Make sure that descriptors are written before doorbell record. */
> rte_wmb();
> /* Ring QP doorbell. */
> rte_write32(txq->msq.doorbell_qpn, txq->msq.db);
> txq->elts_head = elts_head;
> - txq->elts_comp += elts_comp;
> + txq->elts_comp += i;
> return i;
> }
For bytes_sent, reading these changes and assuming -O0 with the compiler
attempting to convert the code without reordering/improving things, this
replaces register variables used in a loop with memory operations on a large
structure through a pointer (txq->stats update for every iteration instead
of once at the end).
Are you really sure this is more optimized that way? Although the compiler
likely does it already with -O3, helping it avoid unnecessary memory writes
is good in my opinion.
OK for the removal of redundant elts_comp though. Although I'm pretty sure
once again the compiler didn't wait for this patch to optimize it away.
--
Adrien Mazarguil
6WIND
More information about the dev
mailing list