mempool/octeontx2: optimize for L1D cache architecture
OCTEON TX2 has 8 sets, 41 ways L1D cache, VA<9:7> bits dictate
the set selection.
Add additional padding to ensure that the element size always
occupies odd number of cachelines to ensure even distribution
of elements among L1D cache sets.