net/i40e: reduce L1 cache misses in NEON Rx
For N1 platform, packet mbuf load and descs load are hot spots to limit
the performance for "desc_to_ptype_v" and "desc_to_olflags_v" functions
in i40e rx NEON path. This is because packet mbuf and descs are evicted
from l1d-cache to l2d-cache.
To reduce l1d-cache-misses and improve the performance, change the code
order and move "desc_to_ptype_v" and "desc_to_olflags_v" functions
forward to the location, where packet mbuf and descs are just loaded.
Test Result:
dpdk:21.08-rc1
gcc-9
For n1sdp, the patch improves the performance by 1.8%.
For thunderx2, no performance changes.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>