The hairpin Tx/Rx queue depth and packet size were previously fixed,
so the PMD could not make full use of fixes or improvements in the
firmware. In addition, 32 packets per queue does not guarantee good
performance for hairpin flows. With so few packets the stride size
becomes larger, which wastes memory for small packets.
The recommended stride size is now 64B.
The hairpin queue setup parameters need to be adjusted:
1. The buffer should be large enough to hold a standard 9KB jumbo
frame, and more than one such frame for performance.
2. The number of packets in a single queue should be the maximum
supported value (total buffer size / stride size).
There is no need to use the maximum total buffer size that the
device supports, because memory consumption should also be taken
into consideration.
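As a rough illustration of the arithmetic only, the sizing rules above
can be sketched as below. The two macros carry the values used in this
patch; the helper names hairpin_log_data_sz() and
hairpin_log_num_packets() are hypothetical and are not part of the
PMD.

```c
#include <assert.h>
#include <stdint.h>

/* Values used by this patch: 2^6 = 64B stride, 2^17 = 128KB buffer. */
#define MLX5_HAIRPIN_QUEUE_STRIDE 6
#define MLX5_HAIRPIN_JUMBO_LOG_SIZE (15 + 2)

/* Hypothetical helper: clamp the log2 buffer size to the device limit. */
static uint32_t
hairpin_log_data_sz(uint32_t log_max_wq_data_sz)
{
    return log_max_wq_data_sz < MLX5_HAIRPIN_JUMBO_LOG_SIZE ?
           log_max_wq_data_sz : MLX5_HAIRPIN_JUMBO_LOG_SIZE;
}

/* Hypothetical helper: log2(buffer size / stride size). */
static uint32_t
hairpin_log_num_packets(uint32_t log_data_sz)
{
    /* Guard against unsigned underflow on a very small buffer. */
    assert(log_data_sz >= MLX5_HAIRPIN_QUEUE_STRIDE);
    return log_data_sz - MLX5_HAIRPIN_QUEUE_STRIDE;
}
```

When the device limit is 17 or more, the buffer is 2^17 bytes (128KB,
room for several 9KB jumbo frames) and the queue holds 2^11 = 2048
strides of 64B each. A smaller device limit lowers both values.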
Fixes: e79c9be91515 ("net/mlx5: support Rx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
#define MLX5_FLOW_MREG_HNAME "MARK_COPY_TABLE"
#define MLX5_DEFAULT_COPY_ID UINT32_MAX
+/* Hairpin TX/RX queue configuration parameters. */
+#define MLX5_HAIRPIN_QUEUE_STRIDE 6
+#define MLX5_HAIRPIN_JUMBO_LOG_SIZE (15 + 2)
+
/* Definition of static_assert found in /usr/include/assert.h */
#ifndef HAVE_STATIC_ASSERT
#define static_assert _Static_assert
struct mlx5_devx_create_rq_attr attr = { 0 };
struct mlx5_rxq_obj *tmpl = NULL;
int ret = 0;
+ uint32_t max_wq_data;
MLX5_ASSERT(rxq_data);
MLX5_ASSERT(!rxq_ctrl->obj);
tmpl->type = MLX5_RXQ_OBJ_TYPE_DEVX_HAIRPIN;
tmpl->rxq_ctrl = rxq_ctrl;
attr.hairpin = 1;
- /* Workaround for hairpin startup */
- attr.wq_attr.log_hairpin_num_packets = log2above(32);
- /* Workaround for packets larger than 1KB */
+ max_wq_data = priv->config.hca_attr.log_max_hairpin_wq_data_sz;
+ /* Jumbo frames > 9KB should be supported, and more packets. */
attr.wq_attr.log_hairpin_data_sz =
- priv->config.hca_attr.log_max_hairpin_wq_data_sz;
+ (max_wq_data < MLX5_HAIRPIN_JUMBO_LOG_SIZE) ?
+ max_wq_data : MLX5_HAIRPIN_JUMBO_LOG_SIZE;
+ /* Set the packets number to the maximum value for performance. */
+ attr.wq_attr.log_hairpin_num_packets =
+ attr.wq_attr.log_hairpin_data_sz -
+ MLX5_HAIRPIN_QUEUE_STRIDE;
tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->ctx, &attr,
rxq_ctrl->socket);
if (!tmpl->rq) {
struct mlx5_devx_create_sq_attr attr = { 0 };
struct mlx5_txq_obj *tmpl = NULL;
int ret = 0;
+ uint32_t max_wq_data;
MLX5_ASSERT(txq_data);
MLX5_ASSERT(!txq_ctrl->obj);
tmpl->txq_ctrl = txq_ctrl;
attr.hairpin = 1;
attr.tis_lst_sz = 1;
- /* Workaround for hairpin startup */
- attr.wq_attr.log_hairpin_num_packets = log2above(32);
- /* Workaround for packets larger than 1KB */
+ max_wq_data = priv->config.hca_attr.log_max_hairpin_wq_data_sz;
+ /* Jumbo frames > 9KB should be supported, and more packets. */
attr.wq_attr.log_hairpin_data_sz =
- priv->config.hca_attr.log_max_hairpin_wq_data_sz;
+ (max_wq_data < MLX5_HAIRPIN_JUMBO_LOG_SIZE) ?
+ max_wq_data : MLX5_HAIRPIN_JUMBO_LOG_SIZE;
+ /* Set the packets number to the maximum value for performance. */
+ attr.wq_attr.log_hairpin_num_packets =
+ attr.wq_attr.log_hairpin_data_sz -
+ MLX5_HAIRPIN_QUEUE_STRIDE;
attr.tis_num = priv->sh->tis->id;
tmpl->sq = mlx5_devx_cmd_create_sq(priv->sh->ctx, &attr);
if (!tmpl->sq) {