examples/l3fwd: prefetch the content of the next packet
author Feifei Wang <feifei.wang@arm.com>
Wed, 14 Aug 2019 08:54:30 +0000 (16:54 +0800)
committer Thomas Monjalon <thomas@monjalon.net>
Sun, 27 Oct 2019 17:21:16 +0000 (18:21 +0100)
commit 39d21077e5258cbbb17eed07111f16a799ea2fa8
tree 93d6d37adaf851790f859f1cff04427cf3697a02
parent da5350ef29afd35c1adabe76f60832f3092269ad

Cache misses are a serious problem when the function
lpm_cb_parse_ptype reads the contents of packets. That is because
packet contents previously stored in the cache are evicted by the
instructions and variables that follow.
A prefetch instruction can therefore be used to bring the next packet
into the cache in advance, so that the CPU does not spend too much
time waiting for it.
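
The idea can be illustrated with a minimal sketch. It is not the patch
itself: struct pkt, parse_ptype and parse_burst are hypothetical
stand-ins, and the GCC/Clang builtin __builtin_prefetch is used so the
example builds without DPDK (in DPDK code, rte_prefetch0() would be the
analogous call). While packet i is parsed, a prefetch is issued for the
contents of packet i + 1:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet buffer; this is not rte_mbuf. */
struct pkt {
	const uint8_t *data; /* start of the packet contents */
	uint32_t ptype;      /* filled in by the parse step */
};

/* Hypothetical parse step: it only reads the first byte of the packet. */
static void parse_ptype(struct pkt *p)
{
	p->ptype = p->data[0];
}

/* Parse a burst, prefetching packet i + 1 while packet i is parsed. */
static void parse_burst(struct pkt **pkts, size_t nb_pkts)
{
	for (size_t i = 0; i < nb_pkts; i++) {
		/* The last packet has no successor, so skip the prefetch. */
		if (i + 1 < nb_pkts)
			__builtin_prefetch(pkts[i + 1]->data, 0, 3);
		parse_ptype(pkts[i]);
	}
}
```

A prefetch is only a hint, so the result of the loop is the same with
or without it; only the time spent waiting on memory can change.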

On the Octeon TX platform with built-in NIC, a 12% performance gain
was measured by running the RFC2544 NDR test with l3fwd. Furthermore,
the cache-miss events of the function lpm_cb_parse_ptype were reduced
by 20%, and its CPU task-clock share dropped from 16.49% to 11.3%,
based on a one-minute forwarding test with 64B packets.
On the dpaa2 platform, neither a performance improvement nor a drop
was seen with this patch when running the RFC2544 NDR test with l3fwd.
On the x86 platform, a 15.7% performance gain was measured by running
the RFC2544 NDR test with l3fwd.

Signed-off-by: Feifei Wang <feifei.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
examples/l3fwd/l3fwd_lpm.c