net/mlx5: fix Tx doorbell memory barrier
authorYongseok Koh <yskoh@mellanox.com>
Wed, 25 Oct 2017 00:27:25 +0000 (17:27 -0700)
committerFerruh Yigit <ferruh.yigit@intel.com>
Thu, 26 Oct 2017 00:33:01 +0000 (02:33 +0200)
Configuring UAR as IO-mapped makes maximum throughput decline by
noticeable amount. If UAR is configured as write-combining register,
a write memory barrier is needed on ringing a doorbell.

rte_wmb() is mostly effective when the size of a burst is comparatively
small. Revert the register back to write-combining and enforce a write
memory barrier instead, except for vectorized Tx burst routines.
Application can change it by setting MLX5_SHUT_UP_BF under its own
necessity.

Fixes: 9f9bebae5530 ("net/mlx5: don't map doorbell register to write combining")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

No differences found