hash: add scalable multi-writer insertion with Intel TSX
This patch introduced scalable multi-writer Cuckoo Hash insertion
based on a split Cuckoo Search and Move operation using Intel
TSX. It can do scalable hash insertion with 22 cores with little
performance loss and negligible TSX abortion rate.
* Added an extra rte_hash flag definition to switch default single writer
Cuckoo Hash behavior to multiwriter.
- If HTM is available, it would use hardware feature for concurrency.
- If HTM is not available, it would fall back to spinlock.
* Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related
cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to
select x86 file or other platform-specific implementations. While HTM check
is still done at runtime (same idea with
RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
* Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow
rte_cuckoo_hash_x86.h or future platform dependent functions to include.
* Following new functions are created for consistent names when new platform
TM support are added.
- rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket movement.
- rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement.
* One extra multi-writer test case is added.
Signed-off-by: Wei Shen <wei1.shen@intel.com>
Signed-off-by: Sameh Gobriel <sameh.gobriel@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>