net/mlx5: fix Tx doorbell memory barrier
authorYongseok Koh <yskoh@mellanox.com>
Wed, 25 Oct 2017 00:27:25 +0000 (17:27 -0700)
committerFerruh Yigit <ferruh.yigit@intel.com>
Thu, 26 Oct 2017 00:33:01 +0000 (02:33 +0200)
commitfb870be5a879c6617fecabf47873ae2b576e6e69
treeabd79339ea355b7eeed46287da4090dff6baffa5
parent2262eed7523b305e59e8172a141b75e385b43cf0
net/mlx5: fix Tx doorbell memory barrier

Configuring UAR as IO-mapped makes maximum throughput decline by
noticeable amount. If UAR is configured as write-combining register,
a write memory barrier is needed on ringing a doorbell.

rte_wmb() is mostly effective when the size of a burst is comparatively
small. Revert the register back to write-combining and enforce a write
memory barrier instead, except for vectorized Tx burst routines.
Application can change it by setting MLX5_SHUT_UP_BF under its own
necessity.

Fixes: 9f9bebae5530 ("net/mlx5: don't map doorbell register to write combining")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
doc/guides/nics/mlx5.rst
drivers/net/mlx5/mlx5.c
drivers/net/mlx5/mlx5_rxtx.h
drivers/net/mlx5/mlx5_rxtx_vec_neon.h
drivers/net/mlx5/mlx5_rxtx_vec_sse.h