The RQ WQEs must be written to memory before the HW receives the RQ
doorbell, so a memory barrier is required after the WQEs are written
and before the doorbell is written.
The current code uses the rte_wmb barrier, which ensures that all memory
stores have completed. The lighter rte_cio_wmb barrier, which orders
stores to local memory, is sufficient because the WQEs reside in local
memory.
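The ordering described above can be sketched as follows. This is only an
illustration, not the driver code: the struct layout, the ring size and
every sketch_* name are hypothetical, and sketch_cio_wmb() is a stand-in
built on a C11 release fence so the example compiles without DPDK. In
the driver, rte_cio_wmb() would be called at that point.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RQ_SIZE 8u /* hypothetical ring size, power of two */

/* Hypothetical receive WQE: a single buffer address per entry. */
struct sketch_wqe {
	uint64_t addr;
};

/* Hypothetical RQ; db_rec stands in for the doorbell record. */
struct sketch_rq {
	struct sketch_wqe wqes[SKETCH_RQ_SIZE];
	volatile uint32_t db_rec;
	uint32_t rq_ci;
};

/*
 * Stand-in for rte_cio_wmb() (assumption): a release fence keeps the
 * earlier WQE stores ordered before the later doorbell store.
 */
static inline void
sketch_cio_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Post n buffers: write the WQEs, issue the barrier, then the doorbell. */
static void
sketch_rq_post(struct sketch_rq *rq, const uint64_t *bufs, uint32_t n)
{
	uint32_t i;

	assert(n <= SKETCH_RQ_SIZE);
	for (i = 0; i < n; i++)
		rq->wqes[(rq->rq_ci + i) & (SKETCH_RQ_SIZE - 1)].addr = bufs[i];
	rq->rq_ci += n;
	sketch_cio_wmb();       /* WQE stores must precede the doorbell. */
	rq->db_rec = rq->rq_ci; /* Doorbell write, last step. */
}
```

Moving the barrier after the doorbell store, or dropping it, would allow
the doorbell to become visible before the WQEs it announces.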