dpdk.git
3 years agonet/virtio: add callback for device closing
Maxime Coquelin [Tue, 26 Jan 2021 10:16:07 +0000 (11:16 +0100)]
net/virtio: add callback for device closing

This patch introduces a new callback for device closing,
making virtio_dev_close() bus-agnostic.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agonet/virtio: store PCI type in virtio device metadata
Maxime Coquelin [Tue, 26 Jan 2021 10:16:06 +0000 (11:16 +0100)]
net/virtio: store PCI type in virtio device metadata

Going further in making the Virtio ethdev layer bus agnostic,
this patch adds a boolean in the Virtio PCI device metadata.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agonet/virtio: force IOVA as VA mode for virtio-user
Maxime Coquelin [Tue, 26 Jan 2021 10:16:05 +0000 (11:16 +0100)]
net/virtio: force IOVA as VA mode for virtio-user

At least Vhost-user backend of Virtio-user PMD requires
IOVA as VA mode. Until now, it was implemented as a hack
by forcing to use mbuf's buf_addr field instead of buf_iova.

This patch removes all this logic and just fails probing
if IOVA as VA mode is not selected. It simplifies the
code overall, and removes some bus-specific logic from
generic virtio_ethdev.c.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: move MSI-X detection to PCI ethdev
Maxime Coquelin [Tue, 26 Jan 2021 10:16:04 +0000 (11:16 +0100)]
net/virtio: move MSI-X detection to PCI ethdev

This patch introduces a new callback to notify the bus
driver some interrupt related operation was done.

This is used by Virtio PCI driver to check msix status.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: move PCI specific dev init to PCI ethdev init
Maxime Coquelin [Tue, 26 Jan 2021 10:16:03 +0000 (11:16 +0100)]
net/virtio: move PCI specific dev init to PCI ethdev init

This patch moves the PCI specific initialization from
eth_virtio_dev_init() to eth_virtio_pci_init().

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agonet/virtio: move PCI device init in dedicated file
Maxime Coquelin [Tue, 26 Jan 2021 10:16:02 +0000 (11:16 +0100)]
net/virtio: move PCI device init in dedicated file

This patch moves the PCI Ethernet device registration bits
in a dedicated patch. In following patches, more code will
be moved there, with the goal of making virtio_ethdev.c
truly bus-agnostic.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: introduce PCI device metadata
Maxime Coquelin [Tue, 26 Jan 2021 10:16:01 +0000 (11:16 +0100)]
net/virtio: introduce PCI device metadata

This patch initiate refactoring of Virtio PCI, by introducing
a new device structure for PCI-specific metadata.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: refactor virtio-user device
Maxime Coquelin [Tue, 26 Jan 2021 10:16:00 +0000 (11:16 +0100)]
net/virtio: refactor virtio-user device

This patch moves the virtio_hw structure into the virtio_user_dev
structure, with the goal of making virtio_hw bus-agnostic.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: introduce virtio bus type
Maxime Coquelin [Tue, 26 Jan 2021 10:15:59 +0000 (11:15 +0100)]
net/virtio: introduce virtio bus type

This patch is preliminary work for introducing a bus layer
in Virtio PMD, in order to improve Virtio-user integration.

A new bus type is added to provide a unified way to distinguish
which bus type is used (PCI modern, PCI legacy or Virtio-user).

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: fix getting old status on reconnect
Maxime Coquelin [Tue, 26 Jan 2021 10:15:58 +0000 (11:15 +0100)]
net/virtio: fix getting old status on reconnect

This patch fixes getting reset status from the restarted
vhost-user backend in case of reconnection, instead of the
status at the time of the disconnection.

This issue was not spotted earlier because Vhost-user
protocol status feature was disabled in server mode.

Fixes: 47235f16505f ("net/virtio-user: set status on socket reconnect")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agobus/vdev: add driver IOVA VA mode requirement
Maxime Coquelin [Tue, 26 Jan 2021 10:15:57 +0000 (11:15 +0100)]
bus/vdev: add driver IOVA VA mode requirement

This patch adds driver flag in vdev bus driver so that
vdev drivers can require VA IOVA mode to be used, which
for example the case of Virtio-user PMD.

The patch implements the .get_iommu_class() callback, that
is called before devices probing to determine the IOVA mode
to be used, and adds a check right before the device is
probed to ensure compatible IOVA mode has been selected.

It also adds a ABI exception rule to accommodate with an
update on the driver registration API

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agobus/vdev: add helper to get vdev from ethdev
Maxime Coquelin [Tue, 26 Jan 2021 10:15:56 +0000 (11:15 +0100)]
bus/vdev: add helper to get vdev from ethdev

This patch adds an helper macro to get the rte_vdev_device
pointer from a rte_eth_dev pointer.

This is similar to RTE_ETH_DEV_TO_PCI().

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agonet/mlx5: use global default miss for E-Switch sampling
Jiawei Wang [Tue, 26 Jan 2021 12:53:33 +0000 (14:53 +0200)]
net/mlx5: use global default miss for E-Switch sampling

In E-Switch steering domain there was dedicated default miss action
created for every sampling flow.
The patch replaces this one with the global default miss action.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix E-Switch sample action creation
Jiawei Wang [Tue, 26 Jan 2021 12:53:32 +0000 (14:53 +0200)]
net/mlx5: fix E-Switch sample action creation

This patch fixes the incorrect checking for the return value
of default miss action creation.

Fixes: 14020ad53d4e ("net/mlx5: wrap default miss flow action per OS")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: support mbuf fast free offload
Viacheslav Ovsiienko [Fri, 22 Jan 2021 17:12:09 +0000 (17:12 +0000)]
net/mlx5: support mbuf fast free offload

This patch adds support of the mbuf fast free offload to the
transmit datapath. This offload allows freeing the mbufs on
transmit completion in the most efficient way. It requires
the all mbufs were allocated from the same pool, have
the reference counter value as 1, and have no any externally
attached buffers.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: optimize inline mbuf freeing
Viacheslav Ovsiienko [Fri, 22 Jan 2021 17:12:08 +0000 (17:12 +0000)]
net/mlx5: optimize inline mbuf freeing

The mlx5 PMD supports packet data inlining by pushing data
to the transmit descriptor. If packet is short enough and all
data are inline, the mbuf is not needed for data send anymore
and can be freed.

The mbuf free was performed in the most inner loop building
the transmit descriptors. This patch postpones the mbuf free
transaction to the tx_burst routine exit, optimizing the loop
and allowing the bulk freeing for the multiple mbufs in single
pool API call.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/dpaa2: support MPLS distribution
Apeksha Gupta [Wed, 20 Jan 2021 06:22:38 +0000 (11:52 +0530)]
net/dpaa2: support MPLS distribution

add support for MPLS based distribution is supported.

Signed-off-by: Apeksha Gupta <apeksha.gupta@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
3 years agoethdev: add MPLS RSS offload type
Hemant Agrawal [Wed, 20 Jan 2021 06:22:37 +0000 (11:52 +0530)]
ethdev: add MPLS RSS offload type

This patch defines new RSS offload types for MPLS. The distribution
will on the basis of MPLS tag.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agoethdev: add IPv6 DSCP option for modify field action
Alexander Kozyrev [Tue, 26 Jan 2021 15:13:35 +0000 (15:13 +0000)]
ethdev: add IPv6 DSCP option for modify field action

IPv6 DSCP field ID is missing from the original list of Field IDs
for MODIFY_FIELD action. Add it to support IPv6 header fully.
Add ipv6_dscp option for the corresponding header field in testpmd.

Fixes: 73b68f4c54a0 ("ethdev: introduce generic modify flow action")

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agoethdev: fix close failure handling
Thomas Monjalon [Fri, 22 Jan 2021 17:58:04 +0000 (18:58 +0100)]
ethdev: fix close failure handling

If a failure happens when closing a port,
it was unnecessarily failing again in the function eth_err(),
because of a check against HW removal cause.
Indeed there is a big chance the port is released at this point.
Given the port is in the middle (or at the end) of a close process,
checking the error cause by accessing the port is a non-sense.
The error check is replaced by a simple return in the close function.

Bugzilla ID: 624
Fixes: 8a5a0aad5d3e ("ethdev: allow close function to return an error")
Cc: stable@dpdk.org
Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
3 years agonet/sfc: fix generic byte statistics to exclude FCS bytes
Viacheslav Galaktionov [Wed, 20 Jan 2021 12:44:18 +0000 (15:44 +0300)]
net/sfc: fix generic byte statistics to exclude FCS bytes

Generic byte statistics should not include Ethernet FCS bytes.

Fixes: 1caab2f1e684 ("net/sfc: add basic statistics")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
3 years agoethdev: clarify what is included in generic byte statistics
Viacheslav Galaktionov [Wed, 20 Jan 2021 12:44:17 +0000 (15:44 +0300)]
ethdev: clarify what is included in generic byte statistics

Different hardware gathers statistics differently, so some general
rules need to be established.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/ixgbe: disable NFS filtering
Dapeng Yu [Tue, 26 Jan 2021 03:03:08 +0000 (11:03 +0800)]
net/ixgbe: disable NFS filtering

Disable NFS header filtering whether NFS packets coalescing are
required or not, in order to make RSS can work on NFS packets.

The code without the patch does follow datasheet, but not consistent
with the ixgbe kernel driver. It causes NFS packets to be filtered
and make them flow into queue 0, before RSS can work on them.

Fixes: b826efba6de4 ("net/ixgbe: align register setting when RSC is disabled")
Fixes: 8eecb3295aed ("ixgbe: add LRO support")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
3 years agonet/iavf: adjust VLAN initialize failure handling
Haiyue Wang [Mon, 25 Jan 2021 11:44:21 +0000 (19:44 +0800)]
net/iavf: adjust VLAN initialize failure handling

Instead of returning error on VLAN init failure, just log the error, so
that the VF device can continue to be configured.

Fixes: 1c301e8c3cff ("net/iavf: support new VLAN capabilities")

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/iavf: fix VLAN offload requests to PF
Haiyue Wang [Mon, 25 Jan 2021 04:30:54 +0000 (12:30 +0800)]
net/iavf: fix VLAN offload requests to PF

If the underlying PF doesn't support a specific ethertype or the ability
to toggle VLAN insertion and/or stripping, then prevent the VF sending
an invalid message to the PF.

Fixes: 1c301e8c3cff ("net/iavf: support new VLAN capabilities")

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/iavf: fix VLAN strip configuration
Junfeng Guo [Fri, 22 Jan 2021 13:35:41 +0000 (13:35 +0000)]
net/iavf: fix VLAN strip configuration

For AVF with single VLAN mode (SVM), port VLAN stripping config
has already been disabled by PF. In this scenario, the error of
-ENOTSUP can be ignored.

Fixes: 1c301e8c3cff ("net/iavf: support new VLAN capabilities")

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
3 years agonet/iavf: fix symmetric flow rule creation
Xuan Ding [Fri, 22 Jan 2021 03:19:22 +0000 (03:19 +0000)]
net/iavf: fix symmetric flow rule creation

Only allow to create symmetric rule for L3/L4.

Fixes: 91f27b2e39ab ("net/iavf: refactor RSS")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agonet/mlx4: fix port attach in secondary process
Suanming Mou [Sun, 24 Jan 2021 11:02:06 +0000 (19:02 +0800)]
net/mlx4: fix port attach in secondary process

Currently, the secondary process port UAR register mapping used by Tx
queue is done during port initializing.

Unluckily, in port hot-plug case, the secondary process will be
requested to initialize the port when primary process probe the port.
At that time, the port Tx queue number is still not configured, the
secondary process get Tx queue number as 0. This causes the UAR register
not be mapped as secondary process get Tx queue number 0.

This commit adds the check of Tx queue number in secondary process when
port starts is requested. Once the Tx queue number is not matching, do
UAR mapping with the latest Tx queue number.

Fixes: 0203d33a1059 ("net/mlx4: support secondary process")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix port attach in secondary process
Suanming Mou [Sun, 24 Jan 2021 11:02:05 +0000 (19:02 +0800)]
net/mlx5: fix port attach in secondary process

Currently, the secondary process port UAR register mapping used by Tx
queue is done during port initializing.

Unluckily, in port hot-plug case, the secondary process was requested
to initialize the port when primary process did not complete the
device configuration and the port Tx queue number is not configured
yet. Hence, the secondary process gets the zero Tx queue number during
probing, causing the UAR registers not be mapped in the correct
fashion.

This commit checks the configured number of Tx queues in secondary
process when the port start is requested. In case the Tx queue
number mismatch found the UAR mapping is reinitialized accordingly.

Fixes: 2aac5b5d119f ("net/mlx5: sync stop/start with secondary process")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix crash on secondary process port close
Suanming Mou [Sun, 24 Jan 2021 11:02:04 +0000 (19:02 +0800)]
net/mlx5: fix crash on secondary process port close

When secondary process starts, in rte_eth_dev_attach_secondary()
function, the secondary process port device data in struct rte_eth_dev
will be initialized to be shared with primary process port.

When failsafe sub-port hot-plug happens, both primary and secondary
process will release the sub-port, and primary process will clear the
sub-port device data in fs_dev_remove() deactivate stage first before
request secondary process to release the sub-port. In this case, the
secondary process will not be able to get the priv memory pointer from
the shared device data memory anymore, since the device data memory
has been cleared.

Since what secondary process needs in port detach is the UAR table size
to unmap the UAR addresses. It used Tx queue number as size of UAR table
in priv. In fact the uar_table_sz in struct mlx5_proc_priv means the
size of UAR register table - the number of UAR records. However, the
code set this field incorrectly to the size of mlx5_proc_priv structure.

This commit fixes UAR table size to match with relevant Tx queue number,
uses the UAR table size directly to avoid the secondary process to
access the priv pointer in the shared device data memory when unmapping
the UAR address.

Fixes: 120dc4a7dcd3 ("net/mlx5: remove device register remap")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix multi-process port ID
Suanming Mou [Sun, 24 Jan 2021 11:02:03 +0000 (19:02 +0800)]
net/mlx5: fix multi-process port ID

The device port_id is used for inter-process communication and must
be the same both for primary and secondary process

This IPC port_id was configured with the invalid temporary value in
port spawn routine. This temporary value was used by the function
rte_eth_dev_get_port_by_name() to check whether the port exists.

This commit corrects the mp port_id with rte_eth_dev port_id.

Fixes: 2eb4d0107acc ("net/mlx5: refactor PCI probing on Linux")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agodoc: update flow mark action in mlx5 guide
Viacheslav Ovsiienko [Fri, 11 Dec 2020 12:05:40 +0000 (12:05 +0000)]
doc: update flow mark action in mlx5 guide

There some limitations added for the MARK action value range.

Fixes: 2d241515ebaf ("net/mlx5: add devarg for extensive metadata support")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: refuse empty VLAN in flow pattern
Shiri Kuzin [Tue, 19 Jan 2021 17:07:00 +0000 (19:07 +0200)]
net/mlx5: refuse empty VLAN in flow pattern

In verbs, an empty VLAN is equivalent to a packet without VLAN layer,
hence, the VLAN item should not be empty and this case is rejected.

However, the case for ether type of VLAN without following VLAN item
was not validated, allowing the creation of a flow with empty
VLAN item.

To fix this issue a validation was added requiring ether type of VLAN
will be followed with VLAN item.

Fixes: 0b1edd21cd78 ("net/mlx5: refuse empty VLAN flow specification")
Cc: stable@dpdk.org
Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: fix flow tag decompression
Alexander Kozyrev [Thu, 14 Jan 2021 21:32:19 +0000 (21:32 +0000)]
net/mlx5: fix flow tag decompression

Packets can get a wrong Flow Tag on x86 architecture with the Flow Tag
compression format (rxq_cqe_comp_en=2) enabled inside the SSE Rx burst.
The shuffle mask that extracts a Flow Tag from the pair of compressed
CQEs is reversed. This leads to the wrong Flow Tag assignment.
Correct the shuffle mask to get proper bytes for a Flow Tag from
miniCQEs.

Fixes: 54c2d46b160f ("net/mlx5: support flow tag and packet header miniCQEs")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix drop action in tunnel offload mode
Gregory Etelson [Wed, 20 Jan 2021 19:17:10 +0000 (21:17 +0200)]
net/mlx5: fix drop action in tunnel offload mode

Tunnel offload mode allows application to restore partially offloaded
tunneled packets to its original state.
The mode was designed to optimize packet recovery. It must not
block flow actions that are allowed by MLX5 PMD.

The patch allows tunnel offload match rules to use drop flow action.

Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix mark action in active tunnel offload
Gregory Etelson [Wed, 20 Jan 2021 19:17:09 +0000 (21:17 +0200)]
net/mlx5: fix mark action in active tunnel offload

Tunnel offload mode allows application to restore partially offloaded
tunneled packets to its original state.
MLX5 PMD stores internal data required to restore partially offloaded
packet in packet mark section. Therefore MLX5 PMD will not allow
applications to use mark action if tunnel offload mode was activated.
The restriction is applied both to regular and tunnel offload rules.

The patch rejects application rules with mark action while tunnel
offload is active.

Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/octeontx2: fix PF flow action for Tx
Liron Himi [Mon, 18 Jan 2021 10:50:31 +0000 (12:50 +0200)]
net/octeontx2: fix PF flow action for Tx

pf-func is 16bit but the current reserved location
used in tx action is 8bits. Moved it to bits 63-48.

Fixes: 32e6aaa97c40 ("net/octeontx2: support flow parse actions")
Cc: stable@dpdk.org
Signed-off-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
3 years agonet/bnxt: fix null termination of Rx mbuf chain
Lance Richardson [Fri, 22 Jan 2021 21:49:00 +0000 (16:49 -0500)]
net/bnxt: fix null termination of Rx mbuf chain

The last mbuf in a multi-segment packet needs to be
NULL-terminated.

Fixes: 0958d8b6435d ("net/bnxt: support LRO")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agoapp/testpmd: fix key for RSS flow rule
Alvin Zhang [Thu, 21 Jan 2021 09:41:54 +0000 (17:41 +0800)]
app/testpmd: fix key for RSS flow rule

Since the patch '1848b117' has initialized the variable 'key' in
'struct rte_flow_action_rss' with 'NULL', the PMD cannot get the
RSS key now. Details as bellow:

testpmd> flow create 0 ingress pattern eth / ipv4 / end actions
         rss types ipv4-other end key
         1234567890123456789012345678901234567890FFFFFFFFFFFF123
         4567890123456789012345678901234567890FFFFFFFFFFFF
         queues end / end
Flow rule #1 created
testpmd> show port 0 rss-hash key
RSS functions:
         all ipv4-other ip
RSS key:
         4439796BB54C5023B675EA5B124F9F30B8A2C03DDFDC4D02A08C9B3
         34AF64A4C05C6FA343958D8557D99583AE138C92E81150366

This patch sets offset and size of the 'key' variable as the first
parameter of the token 'key'. Later, the address of the RSS key will
be copied to 'key' variable.

Fixes: 1848b117cca1 ("app/testpmd: fix RSS key for flow API RSS rule")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Tested-by: Jun W Zhou <junx.w.zhou@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/ice/base: add ethertype offset for QinQ dummy packet
Yuying Zhang [Tue, 19 Jan 2021 07:12:53 +0000 (07:12 +0000)]
net/ice/base: add ethertype offset for QinQ dummy packet

Add the ethertype offset for QinQ switch rule dummy packet to
allow matching the corresponding field.

Signed-off-by: Yuying Zhang <yuying.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix symmetric rule creation
Xuan Ding [Thu, 21 Jan 2021 07:26:57 +0000 (07:26 +0000)]
net/ice: fix symmetric rule creation

Only allow to create symmetric rule for L3/L4.

Fixes: 38d632cbdc88 ("net/ice: refactor PF RSS")

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: drain out DCF AdminQ command queue
Haiyue Wang [Thu, 21 Jan 2021 17:31:37 +0000 (01:31 +0800)]
net/ice: drain out DCF AdminQ command queue

The virtchnl message is handled one by one by checking opcode to match
the response for the request.

The DCF AdminQ command with buffer needs two virtchnl commands, one is
to handle the AdminQ descriptor, the other is to the handle AdminQ
buffer. If both of them are sent to PF successfully, it needs to wait
two responses from PF, even if the AdminQ descriptor command gets the
failure response. Since PF will handle them one by one, and send back
the response for each.

If not wait for the buffer message response until timeout to drain out
the virtchnl command queue, it will cause the next AdminQ command with
buffer to get the stall buffer response from previous.

Fixes: daa714d55c72 ("net/ice: handle AdminQ command by DCF")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/bnxt: fix lock handling in stop and close
Somnath Kotur [Thu, 14 Jan 2021 02:27:16 +0000 (18:27 -0800)]
net/bnxt: fix lock handling in stop and close

err_recovery_lock needs to be released before returning in
stop and close_op if FW_RESET flag is set.

Fixes: 6f5f3b99821e ("net/bnxt: check chip reset in stop and close")

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: refactor init/uninit
Somnath Kotur [Thu, 14 Jan 2021 02:19:14 +0000 (18:19 -0800)]
net/bnxt: refactor init/uninit

Move all the individual driver fields allocation routines to one
routine - bnxt_drv_init(). This houses all such routines where
memory needs to be allocated once during the driver's lifetime
and does not need to be torn down during error recovery.

Rename some function names in accordance with their functionality.
bnxt_init_board() is doing nothing more than mapping the PCI bars,
so rename it as such.

Given that there is a bnxt_shutdown_nic that is called in dev_stop_op,
rename it's counterpart - bnxt_init_chip() that is called in
dev_start_op, to bnxt_start_nic. Also helps avoid confusion with some of
the other bnxt_init_xxx routines.
Rename bnxt_init_fw() to bnxt_get_config() as that is what that routine
is doing mostly functionality wise.

Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix packet type index calculation
Lance Richardson [Mon, 18 Jan 2021 21:57:09 +0000 (16:57 -0500)]
net/bnxt: fix packet type index calculation

Fix mask to include all four bits of hardware packet type
field.

Fixes: 97b1db288dd0 ("net/bnxt: use table based packet type translation")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix FW version log
Kalesh AP [Mon, 18 Jan 2021 04:12:11 +0000 (09:42 +0530)]
net/bnxt: fix FW version log

Driver is not logging the complete FW version along with HSI version.
Fix it to indicate complete FW version string.

Fixes: 9a891c1764ea ("net/bnxt: update HWRM to version 1.9.2")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/mlx4: fix handling of probing failure
Michael Baum [Wed, 20 Jan 2021 08:14:51 +0000 (08:14 +0000)]
net/mlx4: fix handling of probing failure

In mlx4 PCI probing, there are some validations for the Ethernet device
configuration.
From each PCI device the function creates one or two Ethernet devices.

When one of validations fails during the creation of the second device,
the first device is not freed what caused a memory leak.

Free it.

Fixes: 7fae69eeff13 ("mlx4: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: David Marchand <david.marchand@redhat.com>
3 years agonet/mlx4: fix device detach
Michael Baum [Wed, 20 Jan 2021 08:14:50 +0000 (08:14 +0000)]
net/mlx4: fix device detach

When mlx4 device is probed, 2 different ethdev ports may be created for
the 2 physical ports of the device.

Wrongly, when the device is removed, the created ports are not released.

Close and release the ethdev ports in remove process.

Bugzilla ID: 488
Fixes: 7fae69eeff13 ("mlx4: new poll mode driver")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: David Marchand <david.marchand@redhat.com>
3 years agonet/mlx5: fix flow split combined with age action
Dekel Peled [Sun, 17 Jan 2021 09:40:45 +0000 (11:40 +0200)]
net/mlx5: fix flow split combined with age action

Currently, for a flow containing an age action, if flow is split to
sub-flows, a new age action will be created for each sub-flow.
However only the action created for the last sub-flow will be queried
on flow query and cleared on flow removal.

This behavior is wrong, causing a leak of resources.
Need to create just one action per flow, and use it for all sub-flows.

This patch adds the required check to make sure an age action is
created just once per flow, and used by all sub-flows.

Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: fix flow split combined with counter
Dekel Peled [Sun, 17 Jan 2021 09:40:46 +0000 (11:40 +0200)]
net/mlx5: fix flow split combined with counter

Currently, for a flow containing a count action, if flow is split to
sub-flows, a new counter will be created for each sub-flow.
However only the counter created for the last sub-flow will be queried
on flow query and cleared on flow removal.

This behavior is wrong, causing a leak of resources.
Need to create just one counter per flow, and use it for all sub-flows.

This patch adds the required check to make sure a counter is
created just once per flow, and used by all sub-flows.

Fixes: fa2d01c87d2b ("net/mlx5: support flow aging")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: enlarge maximal flow priority
Dong Zhou [Tue, 12 Jan 2021 06:40:00 +0000 (08:40 +0200)]
net/mlx5: enlarge maximal flow priority

Currently, the maximal flow priority in non-root table to user
is 4, it's not enough for user to do some flow match by priority,
such as LPM, for one IPV4 address, we need 32 priorities for each
bit of 32 mask length.

PMD will manage 3 sub-priorities per user priority according to L2,
L3 and L4. The internal priority is 16 bits, user can use priorities
from 0 - 21843.

Those enlarged flow priorities are only used for ingress or egress
flow groups greater than 0 and for any transfer flow group.

Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: support E-Switch mirroring with modify action
Jiawei Wang [Tue, 12 Jan 2021 10:29:18 +0000 (12:29 +0200)]
net/mlx5: support E-Switch mirroring with modify action

While there's the modify action and sample action with ratio=1
in the E-Switch flow, and modify action is after the sample
action, means that the modify should only impact on after sample.
MLX5 PMD will monitor the above case and split the E-Switch flow
into two sub flows, similar as sample flow did before:

 - the prefix sub flow with all actions preceding the sample and the
   sample action itself, also append the new jump action after sample
   in the prefix sub flow;
 - the suffix sub flow with the modify action and other actions
   following the sample action.

The flow split as below:

Original flow: items / actions pre / sample / modify / actions sfx
    prefix sub flow -
    items / actions pre / set_tag action / sample / jump
    suffix sub flow -
    tag_item / modify / actions sfx

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: extend skip scale flag
Jiawei Wang [Tue, 12 Jan 2021 10:29:17 +0000 (12:29 +0200)]
net/mlx5: extend skip scale flag

The sampling feature introduces the scale flow group with factor,
then the scaled table value can be used for the normal path table
due to this table be created implicitly.

But if the input group value already be scaled, for example the
group value of sampling suffix flow, then use 'skip_scale" flag
to skip the scale twice in the translation action.

Consider the flow with jump action and this jump action could be
created implicitly, PMD may only scale the original flow group
value or scale the jump group value or both, so extend the
'skip_scale' flag to two bits:
If bit0 of 'skip_scale' flag is set to 1, then skip the scale the
original flow group;
If bit1 of 'skip_scale' flag is set to 1, then skip the scale the
jump flow group.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: support E-Switch mirroring and jump in one flow
Jiawei Wang [Tue, 12 Jan 2021 10:29:16 +0000 (12:29 +0200)]
net/mlx5: support E-Switch mirroring and jump in one flow

mlx5 E-Switch mirroring is implemented as multiple destination array in
one steering table. The array currently supports only port ID as
destination actions.

This patch adds the jump action support to the array as one of
destination.
The packets can be mirrored to the port and jump to the next table in
the same destination array allowing to continue handling in the new
table.

For example:
    set sample_actions 0 port_id id 1 / end
    flow create 0 ingress transfer pattern eth / end actions
    sample ratio 1 index 0 / jump group 1 / end
    flow create 1 ingress transfer group 1 pattern eth / end actions
    set_mac_dst mac_addr 00:aa:bb:cc:dd:ee / port_id id 2 / end

The flow results all the matched ingress packets are mirrored
to port id 1 and go to group 1. In the group 1, packets are modified
with the destination mac and sent to port id 2.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocommon/mlx5: query preserve capability via DevX
Jiawei Wang [Tue, 12 Jan 2021 10:29:15 +0000 (12:29 +0200)]
common/mlx5: query preserve capability via DevX

Update function mlx5_devx_cmd_query_hca_attr() to add the
reg_c_preserve bit query.

The stored metadata in register C may be lost in NIC Tx and
FDB egress while doing one of the following operations:
 - packet encapsulation.
 - packet mirroring (multiple processing paths).
 - packet sampling (using Flow Sampler).

If the reg_c_preserve bit is set to 1, then the above
limitation is obsolete, the all metadata registers Cx
preserve their values even through the operations mentioned
above.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/ice: support E823C family devices
Qi Zhang [Thu, 21 Jan 2021 01:34:44 +0000 (09:34 +0800)]
net/ice: support E823C family devices

Add device ID support for family E823C, also update the base code
BSD version.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
3 years agonet/ice/base: support configuring device in double VLAN mode
Qi Zhang [Thu, 21 Jan 2021 01:32:26 +0000 (09:32 +0800)]
net/ice/base: support configuring device in double VLAN mode

In order to support configuring the device in Double VLAN Mode (DVM),
the DDP and FW have to support DVM. If both support DVM, the PF
that downloads the package needs to update the default recipes and set
the VLAN mode. This is done in ice_set_dvm().

In order to support updating the default recipes in DVM add support
for updating an existing switch recipe's lkup_idx and mask.
This is done by first calling the get recipe AQ (0x0292) with the
desired recipe ID. Then, if that is successful update one of the lookup
indices (lkup_idx) and its associated mask if the mask is valid
otherwise the already existing mask will be used.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
3 years agonet/ice/base: add VLAN TPID for VLAN filters
Qi Zhang [Thu, 21 Jan 2021 01:07:41 +0000 (09:07 +0800)]
net/ice/base: add VLAN TPID for VLAN filters

Currently VLAN filters via RID4 are only based on VLAN ID. However, with
incoming support for Double VLAN Mode (DVM), the driver needs to be able
to support filtering on VLAN ID + VLAN TPID (i.e. 0x8100, 0x88a8, etc.).
Add support for this by adding two fields to the ice_fltr_info
structure. First, add the tpid_valid field so the code can determine
whether or not to overwrite the default 0x8100 value for programming
packets or use the tpid field.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
3 years agonet/ice/base: fix outer VLAN related macro
Qi Zhang [Thu, 21 Jan 2021 00:58:21 +0000 (08:58 +0800)]
net/ice/base: fix outer VLAN related macro

Fix the wrong value of ICE_AQ_VSI_OUTER_VLAN_PORT_BASED_ACCEPT_HOST

Fixes: 9ea028123a0b ("net/ice/base: align add VSI and update VSI AQ command buffer")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
3 years agonet/e1000: fix flow control mode setting
Wenjun Wu [Wed, 20 Jan 2021 06:53:37 +0000 (14:53 +0800)]
net/e1000: fix flow control mode setting

E1000_CTRL register should be updated according to fc_conf->mode's
value.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
3 years agodoc: update release notes for iavf
Yuying Zhang [Wed, 20 Jan 2021 10:22:32 +0000 (18:22 +0800)]
doc: update release notes for iavf

Add iavf PMD new feature in release notes.

Fixes: 61abc5f611a0 ("net/iavf: support TCP/UDP flow item without input set")

Signed-off-by: Yuying Zhang <yuying.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/iavf: align to new VLAN offload name
Haiyue Wang [Wed, 20 Jan 2021 04:17:49 +0000 (12:17 +0800)]
net/iavf: align to new VLAN offload name

Since the VLAN offload virtchnl message name has been renamed to setting
style, the internal Ethernet type setting name needs be changed to avoid
confusing.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agocommon/iavf: support VLAN filtering
Haiyue Wang [Wed, 20 Jan 2021 04:17:48 +0000 (12:17 +0800)]
common/iavf: support VLAN filtering

In order to support enable/disable VLAN filtering the VF has to
negotiate the capability via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2. If
VIRTCHNL_VLAN_TOGGLE is allowed for the VLAN filtering capabilities,
then there needs to be a method to allow this. Make the necessary
changes to support this.

Also, since the virtchnl_vlan_offload message has the desired format,
change the structure name to virtchnl_vlan_setting so it can be used for
VIRTCHNL_OP_ENABLE_VLAN_FILTERING_V2 and
VIRTCHNL_OP_DISABLE_VLAN_FILTERING_V2.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agonet/ice: enable eCPRI tunnel port configure in DCF
Jeff Guo [Wed, 20 Jan 2021 10:07:04 +0000 (18:07 +0800)]
net/ice: enable eCPRI tunnel port configure in DCF

Add eCPRI tunnel port add and rm ops to configure eCPRI UDP tunnel port
in "Device Config Function" (DCF).

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agoapp/testpmd: fix packets dump overlapping
Jiawei Wang [Wed, 20 Jan 2021 06:50:17 +0000 (08:50 +0200)]
app/testpmd: fix packets dump overlapping

When testpmd enabled the verbosity for the received packets, if two
packets were received at the same time, for example, sampling packet and
normal packet, the dump output of these packets may be overlapping due
to multiple core handling the multiple queues simultaneously.

The patch uses one string buffer that collects all the packet dump
output into this buffer and then printouts it at last, that guarantees
to printout separately the dump output per packet.

Fixes: d862c45b5955 ("app/testpmd: move dumping packets to a separate function")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agodevtools: remove check-includes script
Bruce Richardson [Fri, 29 Jan 2021 16:48:23 +0000 (16:48 +0000)]
devtools: remove check-includes script

The check-includes script allowed checking header files in a given
directory to ensure that each header compiled alone without requiring
any other header inclusions.

With header checking now being done by the chkincs app in the build
system this script can be removed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoci: enable header includes check
Bruce Richardson [Fri, 29 Jan 2021 16:55:59 +0000 (16:55 +0000)]
ci: enable header includes check

For CI builds, turn on the checking of includes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
3 years agobuild: add header includes check
Bruce Richardson [Fri, 29 Jan 2021 16:48:21 +0000 (16:48 +0000)]
build: add header includes check

To verify that all DPDK headers are ok for inclusion directly in a C file,
and are not missing any other pre-requisite headers, we can auto-generate
for each header an empty C file that includes that header. Compiling these
files will throw errors if any header has unmet dependencies.

For some libraries, there may be some header files which are not for direct
inclusion, but rather are to be included via other header files. To allow
later checking of these files for missing includes, we separate out the
indirect include files from the direct ones.

To ensure ongoing compliance, we enable this build test as part of the
default x86 build in "test-meson-builds.sh".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoeventdev: make driver-only headers private
Bruce Richardson [Fri, 29 Jan 2021 16:48:20 +0000 (16:48 +0000)]
eventdev: make driver-only headers private

The rte_eventdev_pmd*.h files are for drivers only and should be private
to DPDK, and not installed for app use.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoethdev: make driver-only headers private
Bruce Richardson [Fri, 29 Jan 2021 16:48:19 +0000 (16:48 +0000)]
ethdev: make driver-only headers private

The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
3 years agorib: fix missing header include
Bruce Richardson [Fri, 29 Jan 2021 16:48:18 +0000 (16:48 +0000)]
rib: fix missing header include

The rte_rib6 header was using RTE_MIN macro from rte_common.h but not
including the header file.

Fixes: f7e861e21c46 ("rib: support IPv6")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
3 years agopower: fix missing header includes
Bruce Richardson [Fri, 29 Jan 2021 16:48:17 +0000 (16:48 +0000)]
power: fix missing header includes

The rte_power_guest_channel.h file did not include its dependent
headers, so add them.

Fixes: 5f443cc0f905 ("power: create guest channel public header file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoeal: fix internal ABI tag with clang
Bruce Richardson [Fri, 29 Jan 2021 16:48:16 +0000 (16:48 +0000)]
eal: fix internal ABI tag with clang

Clang does not have an "error" attribute for functions, so for marking
internal functions we need to check for the error attribute, and provide
a fallback if it is not present. For clang, we can use "diagnose_if"
attribute, similarly checking for its presence before use.

Fixes: fba5af82adc8 ("eal: add internal ABI tag definition")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoeal: fix MCS lock header include
Bruce Richardson [Fri, 29 Jan 2021 16:48:15 +0000 (16:48 +0000)]
eal: fix MCS lock header include

Include 'rte_branch_prediction.h' to get the likely/unlikely macro
definitions.

Fixes: 2173f3333b61 ("mcslock: add MCS queued lock implementation")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
3 years agocrypto/dpaa2_sec: fix memory allocation check
Gagandeep Singh [Mon, 25 Jan 2021 08:15:34 +0000 (16:15 +0800)]
crypto/dpaa2_sec: fix memory allocation check

When key length is 0, zmalloc will return NULL pointer
and in that case it should not return NOMEM.
So in this patch, adding a check on key length.

Fixes: 8d1f3a5d751b ("crypto/dpaa2_sec: support crypto operation")
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
3 years agotest/ipsec: fix result code for not supported
Gagandeep Singh [Mon, 25 Jan 2021 08:15:33 +0000 (16:15 +0800)]
test/ipsec: fix result code for not supported

During SA creation, if the required algorithm is not supported,
drivers can return ENOTSUP. But in most of the IPsec test cases,
if the SA creation does not success, it just returns
TEST_FAILED.

This patch fixes this issue by returning the actual return values
from the driver to the application, so that it can make decisions
whether the test case is passed, failed or unsupported.

Fixes: 05fe65eb66b2 ("test/ipsec: introduce functional test")
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
3 years agocompress/mlx5: add supported capabilities
Matan Azrad [Wed, 20 Jan 2021 11:29:35 +0000 (11:29 +0000)]
compress/mlx5: add supported capabilities

Add all the capabilities supported by the device.

Add the driver documentations.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: support 32-bit systems
Matan Azrad [Wed, 20 Jan 2021 11:29:34 +0000 (11:29 +0000)]
compress/mlx5: support 32-bit systems

In order to support 32-bit systems, the 8B doorbell write should be
done by 2 4B stores.

The order between the store is important, that's why memory barrier
should be used between them.

The doorbell address is shared between all the queues, that's why a lock
should wrap the 2 stores.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: add statistics operations
Matan Azrad [Wed, 20 Jan 2021 11:29:33 +0000 (11:29 +0000)]
compress/mlx5: add statistics operations

Add support for the next statistics operations:
- stats_get
- stats_reset

These statistics are counted by the SW data-path.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: add data-path functions
Matan Azrad [Wed, 20 Jan 2021 11:29:32 +0000 (11:29 +0000)]
compress/mlx5: add data-path functions

Add implementation for the next compress data-path functions:
- dequeue_burst
- enqueue_burst

Add the next operation for starting \ stopping data-path:
- dev_stop
- dev_close

Each compress API enqueued operation is translated to a WQE.
Once WQE is done, the HW sends CQE to the CQ, when SW see the CQE the
operation will be updated and dequeued.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: add memory region management
Matan Azrad [Wed, 20 Jan 2021 11:29:31 +0000 (11:29 +0000)]
compress/mlx5: add memory region management

Mellanox user space drivers don't deal with physical addresses, that's
why any mbuf virtual address moved directly to the HW descriptor(WQE).

The mapping between the virtual address to the physical address is saved
in MR configured by the kernel to the HW.

Each MR has a key that should also be moved to the WQE by the SW.

When the SW see address which is not mapped, it extends the address
range and creates a MR using a system call.

Add memory region cache management:
- 2 level cache per queue-pair - no locks.
- 1 shared cache between all the queues using a lock.

Using this way, the MR key search per data-path address is optimized.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: add transformation operations
Matan Azrad [Wed, 20 Jan 2021 11:29:30 +0000 (11:29 +0000)]
compress/mlx5: add transformation operations

Add support for the next operations:
- private_xform_create
- private_xform_free

The driver transformation structure includes preparations for the next
GGA WQE fields used by the enqueue function: opcode. compress specific
fields (window size, block size and dynamic size) checksum type and
compress type.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: support queue pair operations
Matan Azrad [Wed, 20 Jan 2021 11:29:29 +0000 (11:29 +0000)]
compress/mlx5: support queue pair operations

Add support for the next operations:
- queue_pair_setup
- queue_pair_release

Create and initialize DevX SQ and CQ for each compress API queue-pair.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocommon/mlx5: add compress primitives
Matan Azrad [Wed, 20 Jan 2021 11:29:28 +0000 (11:29 +0000)]
common/mlx5: add compress primitives

Add the GGA compress WQE related structures and definitions.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: support basic control operations
Matan Azrad [Wed, 20 Jan 2021 11:29:27 +0000 (11:29 +0000)]
compress/mlx5: support basic control operations

Add initial support for the next operations:
- dev_configure
- dev_close
- dev_infos_get

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocompress/mlx5: introduce PMD
Matan Azrad [Wed, 20 Jan 2021 11:29:26 +0000 (11:29 +0000)]
compress/mlx5: introduce PMD

Add a new compress PMD for Mellanox devices.

The MLX5 compress driver library provides support for Mellanox
BlueField 2 families of 25/50/100/200 Gb/s adapters.

GGAs (Generic Global Accelerators) are offload engines that can be used
to do memory to memory tasks on data.
These engines are part of the ARM complex of the BlueField 2 chip, and
as such they do not use NIC related resources (e.g. RX/TX bandwidth).
They do share the same PCI and memory bandwidth.

So, using the BlueField 2 device, the compress class operations can be
run in parallel to the net, vdpa, and regex class operations.

This driver is depending on rdma-core like the other mlx5 PMDs, also it
is going to use mlx5 DevX to create HW objects directly by the FW.

Add the probing functions, PCI bus connectivity, HW capabilities checks
and some basic objects preparations.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocommon/mlx5: add DevX attributes for compress
Matan Azrad [Wed, 20 Jan 2021 11:29:25 +0000 (11:29 +0000)]
common/mlx5: add DevX attributes for compress

Add the DevX attributes for compress related engines:
- compress
- decompress
- dma

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agocrypto/qat: fix digest in buffer
Fan Zhang [Wed, 20 Jan 2021 17:33:51 +0000 (17:33 +0000)]
crypto/qat: fix digest in buffer

This patch fixes the missed digest in buffer support to
QAT symmetric raw API. Originally digest in buffer is
supported only for wireless algorithms

Fixes: 728c76b0e50f ("crypto/qat: support raw datapath API")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
3 years agoapp/crypto-perf: add script to graph perf results
Ciara Power [Wed, 20 Jan 2021 17:29:30 +0000 (17:29 +0000)]
app/crypto-perf: add script to graph perf results

The python script introduced in this patch runs the crypto performance
test application for various test cases, and graphs the results.

Test cases are defined in config JSON files, this is where parameters
are specified for each test. Currently there are various test cases for
devices crypto_qat, crypto_aesni_mb and crypto_gcm. Tests for the
ptest types Throughput and Latency are supported for each.

The results of each test case are graphed and saved in PDFs (one PDF for
each test suite graph type, with all test cases).
The graphs output include various grouped barcharts for throughput
tests, and histogram and boxplot graphs are used for latency tests.

Documentation is added to outline the configuration and usage for the
script.

Usage:
A JSON config file must be specified when running the script,
"./dpdk-graph-crypto-perf <config_file>"

The script uses the installed app by default (from ninja install).
Alternatively we can pass path to app by
"-f <rel_path>/<build_dir>/app/dpdk-test-crypto-perf"

All device test suites are run by default.
Alternatively we can specify by adding arguments,
"-t latency" - to run latency test suite only
"-t throughput latency"
- to run both throughput and latency test suites

A directory can be specified for all output files,
or the script directory is used by default.
"-o <output_dir>"

To see the output from the dpdk-test-crypto-perf app,
use the verbose option "-v".

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
3 years agoapp/crypto-perf: fix CSV output format
Ciara Power [Wed, 20 Jan 2021 17:29:29 +0000 (17:29 +0000)]
app/crypto-perf: fix CSV output format

The csv output for each ptest type used ";" instead of ",".
This has now been fixed to use the comma format that is used in the csv
headers.

Fixes: f6cefe253cc8 ("app/crypto-perf: add range/list of sizes")
Fixes: 96dfeb609be1 ("app/crypto-perf: add new PMD benchmarking mode")
Fixes: da40ebd6d383 ("app/crypto-perf: display results in test runner")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
3 years agoapp/crypto-perf: fix latency CSV output
Ciara Power [Wed, 20 Jan 2021 17:29:28 +0000 (17:29 +0000)]
app/crypto-perf: fix latency CSV output

The csv output for the latency performance test had an extra header,
"Packet Size", which is a duplicate of "Buffer Size", and had no
corresponding value in the output. This is now removed.

Fixes: f6cefe253cc8 ("app/crypto-perf: add range/list of sizes")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
3 years agotest/event_crypto: set cipher operation in transform
Ankur Dwivedi [Mon, 18 Jan 2021 16:19:40 +0000 (21:49 +0530)]
test/event_crypto: set cipher operation in transform

The symmetric session configure callback function in OCTEON TX2 crypto
PMD returns error if the cipher operation is not set to either encrypt
or decrypt. This patch sets the cipher operation for the null cipher
to encrypt.

Fixes: 74449375237f ("test/event_crypto_adapter: fix configuration")
Cc: stable@dpdk.org
Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
3 years agoapp/eventdev: remove unnecessary barriers in order test
Feifei Wang [Thu, 14 Jan 2021 07:08:30 +0000 (15:08 +0800)]
app/eventdev: remove unnecessary barriers in order test

For the wmb in order_process_stage_1 and order_process_stage_invalid in
the order test, they can be removed. This is because when the test results
are wrong, the worker core writes 'true' to t->err. Then other worker
cores, producer cores and the main core will load the 'error' index and
stop testing. So, for the worker cores, no other storing operation needs
to be guaranteed after this when errors happen.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
3 years agoapp/eventdev: remove unnecessary barrier in pipeline test
Feifei Wang [Thu, 14 Jan 2021 07:08:29 +0000 (15:08 +0800)]
app/eventdev: remove unnecessary barrier in pipeline test

For "processed_pkts" function, no operations should keep the order that
being executed before loading "worker[i].processed_pkts".

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
3 years agoapp/eventdev: replace a barrier with thread fence
Feifei Wang [Thu, 14 Jan 2021 07:08:28 +0000 (15:08 +0800)]
app/eventdev: replace a barrier with thread fence

Simply replace rte_smp barrier with atomic threand fence.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
3 years agoapp/eventdev: remove unnecessary barriers in perf test
Feifei Wang [Thu, 14 Jan 2021 07:08:27 +0000 (15:08 +0800)]
app/eventdev: remove unnecessary barriers in perf test

For "processed_pkts" and "total_latency" functions, no operations should
keep the order that being executed before loading
"worker[i].processed_pkts". Thus rmb is unnecessary before loading.

For "perf_launch_lcores" function, wmb after that the main lcore
updates the variable "t->done", which represents the end of the test
signal, is unnecessary. Because after the main lcore updates this
siginal variable, it will jump out of the launch function loop, and wait
other lcores stop or return error in the main function(evt_main.c).
During this time, there is no important storing operation and thus no
need for wmb.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
3 years agoapp/eventdev: fix SMP barrier in performance test
Feifei Wang [Thu, 14 Jan 2021 07:08:26 +0000 (15:08 +0800)]
app/eventdev: fix SMP barrier in performance test

This patch fixes RTE SMP barrier bugs for the perf test of eventdev.

For the "perf_process_last_stage" function, wmb after storing
processed_pkts should be moved before it. This is because the worker
lcore should ensure it has really finished data processing, e.g. event
stored into buffers, before the shared variables "w->processed_pkts"are
stored.

For the "perf_process_last_stage_latency", on the one hand, the wmb
should be moved before storing into "w->processed_pkts". The reason is
the same as above. But on the other hand, for "w->latency", wmb is
unnecessary due to data dependency.

Fixes: 2369f73329f8 ("app/testeventdev: add perf queue worker functions")
Cc: stable@dpdk.org
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
3 years agoexamples/eventdev: move ethdev stop to the end
Feifei Wang [Thu, 14 Jan 2021 10:31:01 +0000 (18:31 +0800)]
examples/eventdev: move ethdev stop to the end

Move eth stop code from "signal_handler" function to the end of "main"
function. There are two reasons for this:

First, this improves code maintenance and makes code look simple and
clear. Based on this change, after receiving the interrupt signal,
"fdata->done" is set as 1. Then the main thread will wait all worker
lcores to jump out of the loop. Finally, the main thread will stop and
then close eth dev port.

Second, for older version, the main thread first stops eth dev port and
then waits the end of worker lcore. This may cause errors because it may
stop the eth dev port which worker lcores are using. This moving change
can fix this by waiting all worker threads to exit and then stop the
eth dev port.

In the meanwhile, remove wmb in signal_handler.

This is because when the main lcore receive the stop signal, it stores 1
into fdata->done. And then the worker lcores load "fdata->done" and jump
out of the loop to stop running. Nothing should be stored after updating
fdata->done, so the wmb is unnecessary.

Fixes: 085edac2ca38 ("examples/eventdev_pipeline: support Tx adapter")
Cc: stable@dpdk.org
Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
3 years agoexamples/eventdev: add info output for main core
Feifei Wang [Thu, 14 Jan 2021 10:31:00 +0000 (18:31 +0800)]
examples/eventdev: add info output for main core

When the main core is set as tx/rx/sched/worker core, it also needs to
print some information to show this. Thus, add info output for the main
core, and add a "dump" function to print core information for the sake
of code simplicity and easy maintenance.

In the meanwhile, fix the count error. For the variable "worker_idx", it
should be incremented when the core is set as worker core. However, when
the main core is set as rx/tx/sched core, the worker_idx is also
incremented. Though this error may not have a substantial impact due to
that the main core is the last launched core, but it should be corrected
from the perspective of code correctness.

Fixes: 1094ca96689c ("doc: add SW eventdev pipeline to sample app guide")
Cc: stable@dpdk.org
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
3 years agoexamples/eventdev: check CPU core enabling
Feifei Wang [Thu, 14 Jan 2021 10:30:59 +0000 (18:30 +0800)]
examples/eventdev: check CPU core enabling

In the case that the cores are isolated, if "-l" or "-c" parameter is not
added, the cores will not be enabled and can not launch worker function
correctly. In the meanwhile, no error information is reported.

For example:
totally CPUs:16
isolated CPUs:1-8
command: sudo gdb -args ./dpdk-eventdev_pipeline --vdev event_sw0 \
        -- -r1 -t1 -e4 -w F00 -s4 -n0 -c32 -W1000 -D

cores information:
rte_config->lcore_role = {ROLE_RTE, ROLE_OFF, ROLE_OFF, ROLE_OFF,
                          ROLE_OFF, ROLE_OFF, ROLE_OFF, ROLE_OFF,
                          ROLE_OFF, ROLE_RTE, ROLE_RTE, ROLE_RTE,
                          ROLE_RTE, ROLE_RTE, ROLE_RTE, ROLE_RTE}

output information:
...
[main()] lcore 9 executing worker, using eventdev port 0
[main()] lcore 10 executing worker, using eventdev port 1
[main()] lcore 11 executing worker, using eventdev port 2

This is because "RTE_LCORE_FOREACH_WORKER" chooses the enabled core. In
the case that the cores are isolated, "the lcore_role" flag of isolated
cores are set as "ROLE_OFF" by default(not enabled). So if we choose
these isolated cores as workers, "RTE_LCORE_FOREACH_WORKER" will ignore
these cores and not launch worker functions on them.

To fix this, add "-l" parameters to doc and add lcore enabled check.

Fixes: 1094ca96689c ("doc: add SW eventdev pipeline to sample app guide")
Cc: stable@dpdk.org
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>