net/i40e: relax barrier in Tx for NEON
author Gavin Hu <gavin.hu@arm.com>
Mon, 13 Apr 2020 16:40:24 +0000 (00:40 +0800)
committer Ferruh Yigit <ferruh.yigit@intel.com>
Tue, 21 Apr 2020 11:57:08 +0000 (13:57 +0200)
To keep the ordering of mixed accesses (descriptor writes to normal
memory followed by the tail register write), 'DMB OSH' is sufficient.
The 'DSB' inside I40E_PCI_REG_WRITE is overkill.[1]

This patch fixes this by replacing the 'DSB' with just-sufficient
barriers in the normal PMD and vPMD.

The change showed a 7% performance uplift on ThunderX2 and 4% on Arm N1SDP.
The test case was the RFC2544 zero-loss test running testpmd.

[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9qf0Kpn89EMdGDajepKoZQ@mail.gmail.com

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
drivers/net/i40e/i40e_rxtx_vec_neon.c

index d7d6dec..8b99354 100644
@@ -72,8 +72,9 @@ i40e_rxq_rearm(struct i40e_rx_queue *rxq)
        rx_id = (uint16_t)((rxq->rxrearm_start == 0) ?
                             (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1));
 
+       rte_cio_wmb();
        /* Update the tail pointer on the NIC */
-       I40E_PCI_REG_WRITE(rxq->qrx_tail, rx_id);
+       I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rx_id);
 }
 
 static inline void
@@ -564,7 +565,8 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
 
        txq->tx_tail = tx_id;
 
-       I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
+       rte_cio_wmb();
+       I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
 
        return nb_pkts;
 }
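
For readers unfamiliar with the pattern, the following is a minimal standalone sketch of what both hunks do: fill descriptors in normal memory, issue one store barrier, then write the tail (doorbell) register without any further barrier. It does not use the DPDK headers. Every name prefixed with `sketch_` is hypothetical, the "register" is an ordinary variable rather than MMIO, and the mapping of the barrier to `dmb oshst` on AArch64 is an assumption about what the DPDK macro expands to on that platform, not something this patch states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, not DPDK definitions. */
#define SKETCH_RING_SIZE 8u

#if defined(__aarch64__)
/* Assumed AArch64 mapping: DMB OSHST orders earlier stores before later
 * stores within the outer shareable domain, without the completion
 * semantics of DSB. */
#define sketch_cio_wmb() __asm__ volatile("dmb oshst" ::: "memory")
#else
/* Fallback so the sketch builds on other targets: a release fence. */
#define sketch_cio_wmb() atomic_thread_fence(memory_order_release)
#endif

struct sketch_ring {
    uint64_t desc[SKETCH_RING_SIZE]; /* descriptors in normal memory */
    volatile uint32_t *tail_reg;     /* doorbell; MMIO on real hardware */
    uint16_t tail;                   /* software copy of the tail index */
};

/* Plain store with no barrier of its own, in the spirit of a
 * "relaxed" register write. */
static inline void
sketch_reg_write_relaxed(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Write n descriptors, then publish the new tail with a single barrier. */
static uint16_t
sketch_xmit(struct sketch_ring *r, const uint64_t *pkts, uint16_t n)
{
    uint16_t id = r->tail;
    uint16_t i;

    assert(n <= SKETCH_RING_SIZE);

    for (i = 0; i < n; i++) {
        r->desc[id] = pkts[i];
        id = (uint16_t)((id + 1u) % SKETCH_RING_SIZE);
    }
    r->tail = id;

    /* All descriptor stores must be ordered before the doorbell store. */
    sketch_cio_wmb();
    sketch_reg_write_relaxed(r->tail_reg, id);

    return n;
}
```

The design point is that the barrier is issued once, explicitly, immediately before the doorbell write, so the register write itself can be a plain store. On real hardware the device must never observe the new tail before the descriptors it covers, which is the ordering the single barrier is there to provide.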