mempool: remove memory wastage on non-x86
The existing optimize_object_size() function address the memory object
alignment constraint on x86 for better performance.
Different (micro) architecture may have different memory alignment
constraint for better performance and it not the same as the existing
optimize_object_size().
Some use, XOR(kind of CRC) scheme to enable DRAM channel distribution
based on the address and some may have a different formula.
Introducing arch_mem_object_align() function to abstract
the difference between different (micro) architectures to avoid
wasting memory for mempool object alignment for the architecture
that it is not required to do so.
Details on the amount of memory saving:
Currently, arm64 based architectures use the default (nchan=4,
nrank=1). The worst case is for an object whose size (including mempool
header) is 2 cache lines, where it is optimized to 3 cache lines (+50%).
Examples for cache lines size = 64:
orig optimized
64 -> 64 +0%
128 -> 192 +50%
192 -> 192 +0%
256 -> 320 +25%
320 -> 320 +0%
384 -> 448 +16%
...
2304 -> 2368 +2.7% (~mbuf size)
Additional details:
https://www.mail-archive.com/dev@dpdk.org/msg149157.html
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>