Each workslot is always bound to a specific lcore there is no multi-core
contention to cause cache trashing as a result it is safe to remove the
WFE. Also, in dual workslot dequeue work will mostlikely be available on
the pair workslot making WFE impractical.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
rte_prefetch_non_temporal(lookup_mem);
#ifdef RTE_ARCH_ARM64
asm volatile(
- " ldr %[tag], [%[tag_loc]] \n"
- " ldr %[wqp], [%[wqp_loc]] \n"
- " tbz %[tag], 63, done%= \n"
- " sevl \n"
- "rty%=: wfe \n"
+ "rty%=: \n"
" ldr %[tag], [%[tag_loc]] \n"
" ldr %[wqp], [%[wqp_loc]] \n"
" tbnz %[tag], 63, rty%= \n"