net/mlx5: add per-lcore cache to the list utility
When mlx5 list object is accessed by multiple cores, the list lock
counter is all the time written by all the cores what increases cache
misses in the memory caches.
In addition, when one thread accesses the list for add\remove\lookup
operation, all the other threads coming to do an operation in the list
are stuck in the lock.
Add per lcore cache to allow thread manipulations to be lockless when
the list objects are mostly reused.
Synchronization with atomic operations should be done in order to
allow threads to unregister an entry from other thread cache.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>