net/i40e: remove memory barrier from NEON Rx
author Gavin Hu <gavin.hu@arm.com>
Tue, 13 Aug 2019 10:43:30 +0000 (18:43 +0800)
committer Ferruh Yigit <ferruh.yigit@intel.com>
Tue, 3 Sep 2019 15:12:37 +0000 (17:12 +0200)
commit 78b50591c8e7ae3d010e8f4005e0e95c17800941
tree 4354f71faf085e6faaff435f329c47f0f7c3b0d1
parent ceadf1a405d9951c6d15ed27e010f5b1a80dbdf5

For x86, the descriptors need to be loaded in order, so a compiler
barrier is placed between each pair of descriptor loads [1].
For aarch64, a patch [2] is already in place to tolerate discontinuous DD
bits, so the barriers can be removed to take full advantage of
out-of-order execution.
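
As a rough illustration of the point above, the sketch below loads a burst
of descriptor status words with no barrier between the loads and then counts
only the leading run of descriptors whose DD bit is set. It is a minimal
sketch under stated assumptions, not the driver code: `struct rx_desc`,
`BURST`, `DD_BIT` and `count_contiguous_done()` are hypothetical names, the
descriptor layout is simplified to a single status word, and plain scalar
loads stand in for the NEON vector loads.

```c
#include <stdint.h>

#define BURST  4      /* descriptors examined per iteration (assumed) */
#define DD_BIT 0x1u   /* assumed position of the "descriptor done" bit */

/* Simplified stand-in for an Rx descriptor, not the real i40e layout. */
struct rx_desc {
    uint64_t status;
};

/*
 * Load BURST status words with no barrier in between, so the loads may
 * complete in any order, then count only the leading run of descriptors
 * whose DD bit is set. A descriptor that appears done after a gap is
 * ignored here and is picked up by a later call.
 */
static unsigned int
count_contiguous_done(const volatile struct rx_desc *ring)
{
    uint64_t status[BURST];
    unsigned int i, n = 0;

    for (i = 0; i < BURST; i++)
        status[i] = ring[i].status;   /* no barrier between loads */

    while (n < BURST && (status[n] & DD_BIT))
        n++;

    return n;
}
```

With DD bits observed as 1, 1, 0, 1 the function reports two completed
descriptors, so a gap caused by reordered loads only delays the trailing
descriptor to the next burst.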

A 50% performance gain in the RFC2544 NDR test was measured on ThunderX2.
A 12.50% performance gain in the RFC2544 NDR test was measured on the
Ampere eMAG80 platform.

[1] http://inbox.dpdk.org/users/039ED4275CED7440929022BC67E7061153D71548@SHSMSX105.ccr.corp.intel.com/
[2] https://mails.dpdk.org/archives/stable/2017-October/003324.html

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
drivers/net/i40e/i40e_rxtx_vec_neon.c