git.droids-corp.org - dpdk.git/log

net/pcap: fix Rx with small buffers

If the pkt pool contains only buffers smaller than the default headroom,
then the driver will compute an invalid buffer size (negative value cast
to an uint16_t).
Rely on the mbuf api to check how much space is available in the mbuf.

Fixes: 6eb0ae218a98 ("pcap: fix mbuf allocation")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/bnxt: reduce verbosity of a message

Change verbosity of a message to DEBUG from ERROR.
This is just debug message.

Fixes: 3e92fd4e4ec0 ("net/bnxt: use dynamic log type")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>

net/bnxt: fix endianness

Use rte_cpu_to_le_16/32 while parsing the hwrm command response.

Fixes: 11e5e19695c7 ("net/bnxt: support redirecting tunnel packets to VF")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>

net/bnxt: fix tunnel redirect commands

Modified to send the tunnel redirect commands to Chimp HWRM channel as
Kong does not support these commands.

Fixes: 11e5e19695c7 ("net/bnxt: support redirecting tunnel packets to VF")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>

net/bnxt: fix extended port counter statistics

We were trying to fill in more rx extended stats than the size allocated
for stats causing segfault. Fixed this by adding an explicit check.
Rearranged the code to return statistic values in xstats_get as per the
names returned in xstats_get_names.

Fixes: f55e12f33416 ("net/bnxt: support extended port counters")
Cc: stable@dpdk.org
Signed-off-by: Rahul Gupta <rahul.gupta@broadcom.com>
Signed-off-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>

net/bnxt: use dedicated CPR for async events

This commit enables the creation of a dedicated completion
ring for asynchronous event handling instead of handling these
events on a receive completion ring.

For the stingray platform and other platforms needing tighter
control of resource utilization, we retain the ability to
process async events on a receive completion ring.

For Thor-based adapters, we use a dedicated NQ (notification
queue) ring for async events (async events can't currently
be received on a completion ring due to a firmware limitation).

Rename "def_cp_ring" to "async_cp_ring" to better reflect its
purpose (async event notifications) and to avoid confusion with
VNIC default receive completion rings.

Allow rxq 0 to be stopped when not being used for async events.

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/sfc: unify power of 2 alignment check macro

Substitute driver-defined IS_P2ALIGNED() with EFX_IS_P2ALIGNED()
defined in libefx.

Add type argument and cast value and alignment to one specified type.

Fixes: e1b944598579 ("net/sfc: build libefx")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>

net/sfc: fix align to power of 2 when align has smaller type

Substitute driver-defined P2ALIGN() with EFX_P2ALIGN() defined in
libefx.

Cast value and alignment to one specified type to guarantee result
correctness.

Fixes: e1b944598579 ("net/sfc: build libefx")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>

net/sfc: fix power of 2 round up when align has smaller type

Substitute driver-defined P2ROUNDUP() h with EFX_P2ROUNDUP()
defined in libefx.

Cast value and alignment to one specified type to guarantee result
correctness.

Fixes: e1b944598579 ("net/sfc: build libefx")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>

net/i40e: replace license text with SPDX tag

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>

net/e1000: replace license text with SPDX tag

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

net/fm10k: replace license text with SPDX tag

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

net/iavf: replace license text with SPDX tag

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

net/mlx5: fix Rx queue release of resources

Function rxq_release_rq_resources() releases resources of RQ object
created by DevX API.

This patch updates this function to properly clear the released
resources, to avoid repeated release of the same resource.

Fixes: dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: fix doorbell release on Rx queue release

Function mlx5_rxq_release() calls mlx5_release_dbr() to release the
doorbell allocated for this Rx queue.
This call is relevant only for Rx queue objects created using
DevX API.

This patch adds the required check, to call mlx5_release_dbr()
only when relevant.
It also updates mlx5_release_dbr() to use the input offset correctly.

Fixes: dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/ice: fix VXLAN/NVGRE flow matching

For VXLAN/NVGRE packet, vni/tni should be included in the matching
keys. This patch fixes this issue.

Fixes: d76116a4678f ("net/ice: add generic flow API")
Signed-off-by: Ying A Wang <ying.a.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice: remove type cast in Rx/Tx ring setup

The memzone's start virtual address pointer (addr) is of type void *,
no need to add type cast.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>

net/i40e: fix RSS hash update for X722 VF

This patch fixes X722 VF problem when received packet don't have
HASH value.
1) Packet classifier types update should support X722 VF, not only
for X722 PF;
2) MAC type is invalid for X722 VF when set packet classifier type,
so move it after MAC type is set correctly;

Fixes: a286ebeb0714 ("net/i40e: add dynamic mapping of SW flow types to HW pctypes")
Cc: stable@dpdk.org
Signed-off-by: Peng Huang <peng.huang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/i40e: fix request queue in VF

When the VF configuration is larger than the number of queues reserved
by PF, VF sends the request queue command through admin queue. When PF
received this command, it may reset the VF and send a notification
before resetting. If this notification is read by the timed task alarm,
Task request queue will lost notification. This patch prevents two
tasks from running simultaneously.

Fixes: ee653bd80044 ("net/i40e: determine number of queues per VF at run time")
Cc: stable@dpdk.org
Signed-off-by: Tao Zhu <taox.zhu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice/base: fix bitmap and/or routines

There was an issue with ice_and_bitmap and ice_or_bitmap when
dealing with bit array sizes that are not even multiples of 32,
where some of relevant bits in the highest 32 bits were being
cleared. This patch fixes those problems.

Fixes: c9e37832c95f ("net/ice/base: rework on bit ops")
Cc: stable@dpdk.org
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: cleanup hardware register macros

Cleanup hardware registers macros in ice_auto_generator.h.

Fixes: 51c7f09f3f81 ("net/ice/base: add registers for Intel E800 Series NIC")
Cc: stable@dpdk.org
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: use macro instead of function name

use __func__ instead of function name in ice_debug calls.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: fix packet type size

Change ptype variable to correctly be 16-bits in ice_prof_map
structure.

Fixes: 51d04e4933e3 ("net/ice/base: add flexible pipeline module")
Cc: stable@dpdk.org
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: fix resource leak

We don't free s_rule if ice_aq_sw_rules() returns a non-zero status. If
it returned a zero status, s_rule would be freed right after, so this
implies it should be freed within the scope of the function regardless.

Fixes: c7dd15931183 ("net/ice/base: add virtual switch code")
Cc: stable@dpdk.org
Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: fix inner TCP and UDP support for GRE

The dummy packets for GRE were set up for IP, but not inner
TCP or UDP. There are some applications that want to be
able to parse on those inner L4 headers so add them to
the dummy packets.

Also, the GRE dummy packet was formatted differently from
the other dummy packets so change the formatting to match
all the other dummy packets.

Fixes: 839c0a4b77e6 ("net/ice/base: enable additional switch rules")
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/e1000: fix i219 hang on reset/close

Unit hang may occur if multiple descriptors are available in the rings
during reset or close. This state can be detected by configure status
by bit 8 in register. If the bit is set and there are pending
descriptors in one of the rings, we must flush them before reset or
close.

Fixes: 805803445a02 ("e1000: support EM devices (also known as e1000/e1000e)")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>

compressdev: clarify destination buffer size

Clarify the corner case with incompressible data
whereby the output can actually be greater than the
uncompressed data.

Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>

security: remove duplicated symbols from map file

Fixes: f63ffee26f9c ("security: restore experimental tag for unimplemented APIs")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>

cryptodev: fix typo in comment

Remove extra ';' which is probably added unintentionally, reported by
./devtools/check-includes.sh script.

Fixes: 26008aaed14c ("cryptodev: add asymmetric xform and op definitions")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>

doc: update release notes for IPsec

Update release notes for recently supported features in IPsec library and
IPsec Security Gateway application.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>

test/crypto: improve asymmetric setup

Improve logic:
  * to get list of valid devices based on driver id so that to
    eliminate unnecessary if check for driver id match in device loop
  * loop till 1st device supporting asymmetric feature is found unlike
    previous logic which breaks on 1st device

Signed-off-by: Kanaka Durga Kotamarthy <kkotamarthy@marvell.com>
Signed-off-by: Ayuj Verma <ayverma@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>

compress/zlib: fix error handling

Add missing return after setting the error status in case of
invalid flush_flag in the operation.
The issue was found by the coverity scan as the fin_flush variable,
not initialized in such case, was used later in the flow.

Coverity issue: 340859
Fixes: c7b436ec95fd ("compress/zlib: support burst enqueue/dequeue")
Cc: stable@dpdk.org
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>

app/compress-perf: prevent output buffer overflow

This patch fixes the issue of memory overwrite after the end of
the output buffer by calculating its size as the number of all
segments multipled by the output segment size.
Additionally buffer overflow errors returned by PMD driver are
detected and shown, ending the test.

Also the output buffer size multiplier was increased from 105%
to 110% to allow running the tests on noncompressible files that
expand to over 107% of original size during the compression.

The changes were made in the verification part of the flow and
they don't affect the benchmark results.

Fixes: 424dd6c8c1 ("app/compress-perf: add weak functions for multicore test")
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>

app/compress-perf: improve results report

This patch adds extra features to the compress performance
test. Some important parameters (memory allocation,
number of ops, number of segments) are calculated and
printed out.
Information about threads, cores, devices and queue-pairs
is also printed.

Signed-off-by: Artur Trybula <arturx.trybula@intel.com>
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>

app/crypto-perf: disable asymmetric crypto

Asymmetric crypto is not required for test-crypto-perf application.
Disabling the feature using 'ff_disable' field.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Shally Verma <shallyv@marvell.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>

test/compress: fix some checks

CID 340857: Null pointer dereferences (NULL_RETURNS)
CID 340856: (CONSTANT_EXPRESSION_RESULT)

Coverity issue: 340856, 340857
Fixes: 3be12ea52ad8 ("test/compress: improve debug trace setup")
Cc: stable@dpdk.org
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>

version: 19.08-rc2

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

raw/ioat: fix include quotes

Some builds with clang report an error because '<>' rather than '""' were
used for including the ioat spec header file.

  Target: x86_64-native-bsdapp-clang
  error: 'rte_ioat_spec.h' file not found with <angled> include; use "quotes" instead
  #include <rte_ioat_spec.h>
           ^~~~~~~~~~~~~~~~~
           "rte_ioat_spec.h"
  1 error generated.

Since this file should always be in the same directory as the main header,
we can safely change the include line to fix this error.

Fixes: abff4333ec20 ("raw/ioat: create device on probe and destroy on release")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

vfio: use contiguous mapping for IOVA as VA mode

When using IOVA as VA mode, there is no need to map segments
page by page. This normally isn't a problem, but it becomes one
when attempting to use DPDK in no-huge mode, where VFIO subsystem
simply runs out of space to store mappings.

Fix this for x86 by triggering different callbacks based on whether
IOVA as VA mode is enabled.

Fixes: 73a639085938 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Andrius Sirvys <andrius.sirvys@intel.com>

ethdev: avoid getting uninitialized info for bad port

rte_eth_dev_info_get() returns void and caller does know if the function
does its job or not. Changing of the return value to int would be
API/ABI breakage which requires deprecation process and cannot be
backported to stable branches. For now, make sure that device info is
initialized even in the case of invalid port ID.

Fixes: a30268e9a2d0 ("ethdev: reset whole dev info structure before filling")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>

net/mlx5: fix NVGRE matching

NVGRE has a GRE header with c_rsvd0_ver value 0x2000 and protocol
value 0x6558.
These should be matched when item_nvgre is provided.

This patch adds validation function of NVGRE item.
It also updates the translate function of NVGRE item, to add the
required values, if they were not specified.

Original work by Xiaoyu Min <jackmin@mellanox.com>

Fixes: fc2c498ccb94 ("net/mlx5: add Direct Verbs translate items")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>

net/mlx5: adjust maximum LRO message size

LRO message is contained in the MPRQ strides.
While the LRO message size cannot be bigger than 65280 according to the
PRM, the strides which contain it may be bigger than the maximum buffer
size allowed in dpdk mbuf - 0xFFFF.

Adjust the maximum LRO message size to avoid buffer length overflow.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: zero LRO mbuf headroom

LRO packet may consume all the stride memory, hence the PMD cannot
guaranty head-room for the LRO mbuf.

The issue is lack in HW support to write the packet in offset from the
stride start.

A new striding RQ feature may be added in CX6 DX to allow head-room and
tail-room for the LRO strides.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: handle LRO packets in Rx queue

When LRO offload is configured in Rx queue, the HW may coalesce TCP
packets from same TCP connection into single packet.

In this case the SW should fix the relevant packet headers because the
HW doesn't update them according to the new created packet
characteristics.

Add update header code to the mprq Rx burst function to support LRO
feature.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: update LRO fields in completion entry

Update the CQE structure to include LRO fields.

Some reserved values were changed, hence also data-path code used the
reserved values were updated accordingly.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: replace external mbuf shared memory

As an arrangement to the LRO support when a packet can consume all the
stride memory, the external mbuf shared information cannot be anymore
in the end of the stride, because the HW may write the packet data to
all the stride memory.

Move the shared information memory from the stride to the control
memory of the external mbuf.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: support LRO with single RxQ object

Implement LRO support using a single RQ object per DPDK RxQ.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: create advanced RxQ via DevX

Function mlx5_rxq_obj_new(), previously called mlx5_rxq_ibv_new(),
supports creating Rx queue objects using verbs.
This patch expands the relevant functions, to support creating
verbs or DevX Rx queue objects:
Function mlx5_rxq_obj_new() updated to create RQ object using DevX.
Function mlx5_ind_table_obj_new() updated to create RQT object using DevX.
Function mlx5_hrxq_new() updated to create TIR object using DevX.
New utility functions added to perform specific operations:
mlx5_devx_rq_new(), mlx5_devx_wq_attr_fill(),
mlx5_devx_create_rq_attr_fill().

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: add function for Rx verbs work queue

Verbs WQ for RxQ is created inside function mlx5_rxq_obj_new().
This patch moves the creation of verbs WQ to dedicated function
mlx5_ibv_wq_new().

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: add function for Rx verbs completion queue

Verbs CQ for RxQ is created inside function mlx5_rxq_obj_new().
This patch moves the creation of CQ to dedicated function
mlx5_ibv_cq_new().

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: store protection domain number on create

Function mlx5_alloc_shared_ibctx() allocates Protection Domain using
verbs API, as part of shared IB device context.
This patch adds reading and storing of pdn value from the created PD
object, using DV API.
The pdn value is required when creating WQ using DevX API.

This patch also updates function flow_dv_create_counter_stat_mem_mng()
which uses the pdn value as well.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: update queue state modify for DevX

Function mlx5_queue_state_modify_primary() was implemented to handle
state change for queues created using Verbs API.

This patch update function mlx5_queue_state_modify_primary() to
support state change of RQ object created using DevX API.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: rename hash RxQ verbs to general

Prepare for introducing use of DevX TIR object.
Hash Rx queue is currently created using verbs QP only.
The next patches will add the option to create it with a TIR object
using DevX.
This patch renames hrxq_ibv to hrxq wherever relevant, and adds
the DevX items to relevant structs.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: rename verbs indirection table to obj

Prepare for introducing of DevX RQT object.
Rx indirection table object is currently created using verbs only.
The next patches will add the option to create an RQT object using
DevX.
This patch renames ind_table_ibv to ind_table_obj wherever relevant,
and adds the DevX items to relevant structs.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: rename RxQ verbs to general RxQ object

Prepare for introducing of DevX RxQ object.
RxQ object is currently created using verbs only.
The next patches will add the option to create RxQ object using DevX.
This patch renames rxq_ibv to rxq_obj wherever relevant, and adds the
DevX items to relevant structs.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: allocate door-bells via DevX

When using DevX API, memory for door-bell records should be allocated
by PMD and registered using DevX API.

This patch implements the utility functions to support it:
- Add struct mlx5_devx_dbr_page, containing door-bells page data.
- Add list of struct mlx5_devx_dbr_page door-bell pages to device
  private data.
- Implement function mlx5_alloc_dbr_page() to allocate page for
  door-bell records, and register it using DevX API.
- Implement function mlx5_get_dbr(). to acquire a door-bell record
  from the door-bells page, allocating a new page if needed.
- Implement function mlx5_release_dbr() to release a door-bell
  record that is no longer needed, freeing the containing page if
  it becomes empty.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: create advanced RxQ table via DevX

Implement function mlx5_devx_cmd_create_rqt() to create RQT
object using DevX API.
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: create advanced Rx object via DevX

Implement function mlx5_devx_cmd_create_tir() to create TIR
object using DevX API..
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: modify advanced RxQ object via DevX

Implement function mlx5_devx_cmd_modify_rq() to modify RQ.
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: create advanced RxQ object via DevX

Implement function mlx5_devx_cmd_create_rq() to create RQ object using
DevX API.
Add related structs in mlx5.h and mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: update Tx queue create for LRO

Update function mlx5_txq_ibv_new(), query and store the TIS
transport domain value.
It is required later on Rx side when creating matching TIR.
Add field in mlx5 data structure to store Transport Domain ID.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: support Tx interface query via DevX

Implement function mlx5_devx_cmd_qp_tis_td_query(), to query
QP TIS Transport Domain value.

Add related structs in mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: check conditions to enable LRO

Use DevX API to read device LRO capabilities.
Check if LRO is supported and can be enabled.
Check if MPRQ is supported and can be used.
Enable MPRQ for LRO use if not enabled by user.
Added note for mlx5_mprq_enabled(), to emphasize that LRO
enables MPRQ.
Disable CQE compression and CRC stripping if LRO is enabled.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: add glue for create action via DevX

Add compile option HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR, and matching
dest_tir flag in device configuration structure.
Add glue function pointer dv_create_flow_action_dest_devx_tir, and
function mlx5_glue_dv_create_flow_action_dest_devx_tir(),
to invoke API mlx5dv_dr_action_create_dest_devx_tir();

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: add glue for queue query via DevX

Add function mlx5_glue_devx_qp_query().
Add glue function pointer devx_qp_query to run it.
Glue version updated to 19.08.0.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: query LRO capabilities via DevX

Update function mlx5_devx_cmd_query_hca_attr() to query HCA
capabilities related to LRO.

Add relevant structs in drivers/net/mlx5/mlx5_prm.h.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: introduce LRO

Add command-line argument to set LRO session timeout.
Add LRO settings struct in PMD configuration struct.
Add support of LRO offload in port configuration.
Add macros and function to check if LRO is supported and enabled.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: remove redundant item from union

A variable of type struct ibv_cq_ex is declared in 2 unions, but
isn't used.
This patch removes the 2 redundant declarations.

Fixes: 6218063b39a6 ("net/mlx5: refactor Rx data path")
Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

app/testpmd: fix MAC address parser for flow rule

MAC address parsing was causing failure [1],
this patch partially reverts the commit
commit b5ddce8959b2 ("app/testpmd: use new ethernet address parser")

[1]
testpmd> flow validate 0 priority 2 ingress group 0 pattern eth dst
is 98:03:9B:5C:D9:00 / end actions queue index 0 / end
Bad arguments

Fixes: b5ddce8959b2 ("app/testpmd: use new ethernet address parser")
Reported-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Raslan Darawsheh <rasland@mellanox.com>

net/i40e: fix flow director rule destroy

We should tear down the fdir when the last flow is destroyed, current
logic is opposite to expected behavior, this patch fixes this issue.

Fixes: 2e67a7fbf3ff ("net/i40e: config flow director automatically")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>

net/i40e: fix ethernet flow rule

i40e FDIR doesn't allow to create flow with empty spec and mask for
ethertype pattern. Without this patch, below flow would be created
successfully which is unexpected.

> flow create 0 ingress pattern eth / end actions drop / end

Fixes: 7d83c152a207 ("net/i40e: parse flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>

net/octeontx2: fix driver reconfiguration

When configure returns error, e.g. in case not supported offloads
(outer ip and sctp) driver released Rx,Tx queues. Then in case of
correct configuration the driver could not start due to queues already
released but the driver thought it was configured correctly.

Secondly if driver returns error from configuration librte_ethdev will
release, rx queues and tx queues, without chaining driver configured
state.

Fix that by 'releasing' configuration and changing driver state when
error is returned from otx2_nix_configure.

Fixes: 548b5839a32b ("net/octeontx2: add device configure operation")
Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>
Reviewed-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

net/ark: remove resources when port is close

Set RTE_ETH_DEV_CLOSE_REMOVE upon probe so all the resources
for the port can be freed by rte_eth_dev_close()

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>

net/ark: fix queue packet replacement

Queue index was incorrectly incremented with port, which
caused incorrect queue packet placement. This manifested
when port number was != 0.

Fixes: c33d45af3633 ("net/ark: add Tx initial version")
Cc: stable@dpdk.org
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>

net/dpaa2: fix multi-segment Tx

This patch resets frc and ctrl in sg tx fd to avoid corruption.

Fixes: 774e9ea91992 ("net/dpaa2: add support for multi seg buffers")
Cc: stable@dpdk.org
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

net/dpaa: check multi-segment external buffers

This patch add check to return error as the handling
for external buffer packets with SG is currently missing.

Fixes: 37f9b54bd3cf ("net/dpaa: support Tx and Rx queue setup")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>

net/mlx5: revert Netlink socket sharing

This reverts commit e28111ac9864af09e826241a915dfff87a9c00ad.
The netlink requests are replaced by ifindex caching and
not needed anymore.

Fixes: e28111ac9864 ("net/mlx5: fix master device Netlink socket sharing")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: cache associated network device index

The associated device index is retrieved via Netlink request to
underlying Infiniband device driver. This network device index
is permanent throughout the lifetime of device. We do not
spawn the rte_eth_dev ports without associated network device, and
if network device is being unbound we get the remove notification
message and rte_eth_dev port is also detached. So, we may store
the ifindex in mlx5_device_spawn() routine at rte_eth_dev port
creation and initialization time and use the cached value further
instead of doing actual Netlink request.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: report max number of mbuf segments

This patch fills the tx_desc_lim.nb_seg_max and
tx_desc_lim.nb_mtu_seg_max fields of rte_eth_dev_info
structure to report thee maximal number of packet
segments, requested inline data configuration is
taken into account in conservative way.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: implement Tx burst template

This patch adds the implementation of tx_burst routine template.
The template supports all Tx offloads and multiple optimized
tx_burst routines can be generated by compiler from this one.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: introduce Tx burst routine template

Mellanox NICs support the wide set of Tx offloads. The supported
offloads are reported by the mlx5 PMD in rte_eth_dev_info
tx_offload_capa field.
An application may choose any combination of supported offloads
and configure the device appropriately. Some of Tx offloads may be
not requested by application, or ever all of them may be omitted.
Most of the Tx offloads require some code branches in tx_burst routine
to support ones. If Tx offload is not requested the tx_burst routine
code may be significantly simplified and consume less CPU cycles.

For example, if application does not engage TSO offload this code
can be omitted, if multi-segment packet is not supposed the tx_burst
may assume single mbuf packets only, etc.

Currently, the mlx5 PMD implements multiple tx_burst subroutines
for most common combinations of requested Tx offloads, each branch
has its own dedicated implementation. It is not very easy to update,
support and develop such kind of code - multiple branches impose
the multiple points to process. Also many of frequently requested
offload combinations are not supported yet. That leads to selecting of
not completely matching tx_burst routine and harms the performance.

This patch introduces the new approach for tx_burst code. It is proposed
to develop the unified template for tx_burst routine, which supports
all the Tx offloads and takes the compile time defined parameter
describing the supposed set of supported offloads. On the base
of this template, the compiler is able to generate multiple tx_burst
routines highly optimized for the statically specified set of
Tx offloads.
Next, in runtime, at Tx queue configuration the best matching optimized
implementation of tx_burst is chosen.

This patch intentionally omits the template internal implementation,
but just introduces the template itself to emboss the approach of
the multiple specially tuned tx_burst routines.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: add Tx configuration and setup

This patch updates the Tx datapath control and configuration
structures and code for managing Tx datapath settings.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: extend NIC attributes query via DevX

This patch extends the NIC attributes query via DevX.
The appropriate interface structures are borrowed from
kernel driver headers and DevX calls are added to
mlx5_devx_cmd_query_hca_attr() routine.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: update Tx definitions

This patch updates Tx datapath definitions, mostly hardware related.
The Tx descriptor structures are redefined with required fields,
size definitions are renamed to reflect the meanings in more
appropriate way. This is a preparation step before introducing
the new Tx datapath implementation.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: add Tx devargs

This patch introduces new mlx5 PMD devarg options:

- txq_inline_min - specifies minimal amount of data to be inlined into
  WQE during Tx operations. NICs may require this minimal data amount
  to operate correctly. The exact value may depend on NIC operation
  mode, requested offloads, etc.

- txq_inline_max - specifies the maximal packet length to be completely
  inlined into WQE Ethernet Segment for ordinary SEND method. If packet
  is larger the specified value, the packet data won't be copied by the
  driver at all, data buffer is addressed with a pointer. If packet
  length is less or equal all packet data will be copied into WQE.

- txq_inline_mpw - specifies the maximal packet length to be completely
  inlined into WQE for Enhanced MPW method.

Driver documentation is also updated.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: remove Tx implementation

This patch removes the existing Tx datapath code
as preparation step before introducing the new
implementation. The following entities are being
removed:

- deprecated devargs support
- tx_burst() routines
- related PRM definitions
- SQ configuration code
- Tx routine selection code
- incompatible Tx completion code

The following devargs are deprecated and ignored:
- "txq_inline" is going to be converted to "txq_inline_max"
for compatibility issue
- "tx_vec_en"
- "txqs_max_vec"
- "txq_mpw_hdr_dseg_en"
- "txq_max_inline_len" is going to be converted
to "txq_inline_mpw" for compatibility issue

The deprecated devarg keys are recognized by PMD
and ignored/converted to the new ones in order not
to block device probing.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>

net/mlx5: fix typos in comments

Some spelling mistakes were found in comments.
This patch fixes them.

Fixes: d10b09db0a45 ("net/mlx5: fix allocation when no memory on device NUMA node")
Fixes: fc2c498ccb94 ("net/mlx5: add Direct Verbs translate items")
Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Fixes: f6d9ab4e769f ("net/mlx5: check Tx queue size overflow")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx4: fix typo in comment

A spelling mistake was found in comment.
This patch fixes it.

Fixes: 8e4937640022 ("net/mlx4: add external allocator for Verbs object")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>

net/mlx5: fix flow item flags bitmap

In functions flow_dv_translate() and flow_dv_validate(), the flow
items are scanned and each item is marked in item_flags bitmap.
The code handling some of the items was ported from another project,
where items are marked in a slightly different manner.

This patch fixes the setting of items in bitmap, adapting it to the
required manner.

Fixes: d53aa89aea91 ("net/mlx5: support matching on ICMP/ICMP6")
Fixes: 5865955ad994 ("net/mlx5: match GRE key and present bits")
Fixes: 2e4c987aad91 ("net/mlx5: validate Direct Rule E-Switch")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>

net/octeontx2: support IPv6 ext parsing

Adding support for ipv6_ext header parsing in the octeontx2 flow.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

net/octeontx2: add build check on fast path fields

Add build bug on on fast path used fields that are
dependent on their positions and values.

Fixes: f1eff76ab63e ("net/octeontx2: add Rx vector version")
Fixes: ddc1bc26e9ed ("net/octeontx2: add Tx vector version")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

net/octeontx: use driver log type

All log messages should use driver logtype. RTE_LOGTYPE_PMD is
planned to be deprecated in the future.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jerin Jacob <jerinj@marvell.com>

net/ice: fix unsafe tailq element removal

TAILQ_FOREACH macro is not safe to remove elements
during iterating tailq lists. Replace it with
TAILQ_FOREACH_SAFE.

Fixes: d76116a4678f ("net/ice: add generic flow API")
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/e1000: fix buffer overrun while i219 processing DMA

Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, causing in some high
performance cases a buffer overrun while the I219 LAN Connected
Device is processing the DMA transactions. I219LM and I219V devices
can fall into unrecovered Tx hang under very stressfully UDP traffic
and multiple reconnection of Ethernet cable. This Tx hang of the LAN
Controller is only recovered if the system is rebooted. Slightly slow
down DMA access by reducing the number of outstanding requests.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates performance loss for TCP
traffic without a noticeable impact on CPU performance.

Please, refer to I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/
ethernet-connection-i218-family-documentation.html

Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/bnxt: remove unnecessary interrupt disable

Remove an unnecessary rte_intr_disable() call to disable interrupt
during device init.

Fixes: c09f57b49c13 ("net/bnxt: add start/stop/link update operations")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: disable vector mode Tx with VLAN offload

The vector mode transmit path does not currently support VLAN tag
insertion, so we need to disable vector transmit when transmit
VLAN insertion offload is enabled.

Fixes: bc4a000f2f53 ("net/bnxt: implement SSE vector mode")
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: fix interrupt rearm logic

Rearm will intimate hardware that current interrupts are processed
and it can continue to send more.

Fixes: 1fe427fd08ee ("net/bnxt: support enable/disable interrupt")
Cc: stable@dpdk.org
Signed-off-by: Rahul Gupta <rahul.gupta@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: fix Rx interrupt vector

The receive interrupt vector should be offset by the constant
RTE_INTR_VEC_RXTX_OFFSET; otherwise setting up some queue interrupts
will fail.

Fixes: 1fe427fd08ee ("net/bnxt: support enable/disable interrupt")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Tested-by: Rahul Gupta <rahul.gupta@broadcom.com>