dpdk.git
4 years agocommon/mlx5: add glue function for domain sync
Bing Zhao [Tue, 27 Oct 2020 14:46:53 +0000 (22:46 +0800)]
common/mlx5: add glue function for domain sync

rdma-core already provides the "mlx5dv_dr_domain_sync" function, which
is used to flush the rule submission queue. Add a wrapper for it in the
glue layer.
Like the domain creation and destruction functions, it only supports DR
flows for now.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: use C11 atomics for flow tables
Alexander Kozyrev [Tue, 27 Oct 2020 15:28:24 +0000 (15:28 +0000)]
net/mlx5: use C11 atomics for flow tables

The rte_atomic API is deprecated and needs to be replaced with
C11 atomic builtins. Use the relaxed ordering for RTE flow tables.
Enforce Acquire/Release model for managing DevX pools.
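
As an illustration of the two orderings named above, the sketch below
uses the GCC/Clang __atomic builtins. The structure and function names
(tbl_entry, pool_set and so on) are hypothetical and are not taken from
the mlx5 sources, and a single writer is assumed for the pool array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow table entry; not the actual mlx5 structure. */
struct tbl_entry {
	uint32_t refcnt;
};

/* Relaxed ordering: only the counter value matters, no other data is
 * published through it. */
static inline uint32_t
tbl_entry_ref(struct tbl_entry *e)
{
	return __atomic_add_fetch(&e->refcnt, 1, __ATOMIC_RELAXED);
}

static inline uint32_t
tbl_entry_unref(struct tbl_entry *e)
{
	return __atomic_sub_fetch(&e->refcnt, 1, __ATOMIC_RELAXED);
}

#define POOL_SET_MAX 8

/* Hypothetical pool container with a single writer (assumption). */
struct pool_set {
	void *pools[POOL_SET_MAX];
	uint32_t n_valid;
};

/* Writer: fill the slot first, then publish the new count with release
 * semantics so that readers never see a count covering an unset slot. */
static inline int
pool_publish(struct pool_set *s, void *pool)
{
	uint32_t n = __atomic_load_n(&s->n_valid, __ATOMIC_RELAXED);

	if (n >= POOL_SET_MAX)
		return -1;
	s->pools[n] = pool;
	__atomic_store_n(&s->n_valid, n + 1, __ATOMIC_RELEASE);
	return 0;
}

/* Reader: the acquire load pairs with the release store above. */
static inline void *
pool_get(struct pool_set *s, uint32_t idx)
{
	uint32_t n = __atomic_load_n(&s->n_valid, __ATOMIC_ACQUIRE);

	return idx < n ? s->pools[idx] : NULL;
}
```

The release/acquire pair is what makes the slot contents visible to a
reader that observes the updated count; the reference counter needs no
such guarantee because nothing else is read on the strength of its value.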

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: use C11 atomics for RxQ/TxQ refcounts
Alexander Kozyrev [Tue, 27 Oct 2020 15:28:23 +0000 (15:28 +0000)]
net/mlx5: use C11 atomics for RxQ/TxQ refcounts

The rte_atomic API is deprecated and needs to be replaced with
C11 atomic builtins. Use the relaxed ordering for RxQ/TxQ refcounts.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agocommon/mlx5: use C11 atomics for netlink sequence
Alexander Kozyrev [Tue, 27 Oct 2020 15:28:22 +0000 (15:28 +0000)]
common/mlx5: use C11 atomics for netlink sequence

The rte_atomic API is deprecated and needs to be replaced with
C11 atomic builtins. Use __atomic_add_fetch instead of
rte_atomic32_add_return to generate a Netlink sequence number.
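
A minimal sketch of such a sequence generator follows. The counter and
function names are illustrative only, and the relaxed memory ordering is
an assumption, since the commit text does not say which ordering is
used.

```c
#include <stdint.h>

/* Hypothetical process-wide counter; the name is illustrative only. */
static uint32_t nl_seq;

/* Return a new sequence number. The increment and the read-back are a
 * single atomic operation, so two concurrent callers do not obtain the
 * same value (until the 32-bit counter wraps). */
static inline uint32_t
nl_next_seq(void)
{
	return __atomic_add_fetch(&nl_seq, 1, __ATOMIC_RELAXED);
}
```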

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agocommon/mlx5: use C11 atomics for memory allocation
Alexander Kozyrev [Tue, 27 Oct 2020 15:28:21 +0000 (15:28 +0000)]
common/mlx5: use C11 atomics for memory allocation

The rte_atomic API is deprecated and needs to be replaced with
C11 atomic builtins. Use the relaxed ordering for mlx5 mallocs.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: fix Tx queue start
Matan Azrad [Tue, 27 Oct 2020 06:43:25 +0000 (06:43 +0000)]
net/mlx5: fix Tx queue start

The Tx queue stop/start operations update the HW state of the Tx queue
object. The stop API should update the state from ready to reset in
order to stop any queue traffic, and the start API should update the
state from reset to ready in order to open the traffic path.

The start API wrongly tried to change the state from ready to ready,
which caused a failure in the FW validation of the current state.

Replace the ready-to-ready command with a reset-to-ready command in the
Tx start API.
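
The state handling described here can be modelled with a small sketch.
The enum and function names are hypothetical and only mimic the FW
check of the current state; they are not the driver or FW interface.

```c
/* Hypothetical model of the queue object state. */
enum txq_state { TXQ_STATE_RESET, TXQ_STATE_READY };

/* Mimics the FW validation: the command must name the state the queue
 * is actually in, otherwise it is rejected. */
static int
txq_modify_state(enum txq_state *cur, enum txq_state from,
		 enum txq_state to)
{
	if (*cur != from)
		return -1;
	*cur = to;
	return 0;
}

/* Stop: ready -> reset, which stops the queue traffic. */
static int
txq_stop(enum txq_state *cur)
{
	return txq_modify_state(cur, TXQ_STATE_READY, TXQ_STATE_RESET);
}

/* Start: reset -> ready, which opens the traffic path again. */
static int
txq_start(enum txq_state *cur)
{
	return txq_modify_state(cur, TXQ_STATE_RESET, TXQ_STATE_READY);
}
```

In this model, a ready-to-ready command issued on a stopped queue is
rejected because the queue is in the reset state, which is the failure
the fix removes.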

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Asaf Penso <asafp@nvidia.com>
4 years agonet/mlx5: support item type error message in flow Verbs
Li Zhang [Mon, 28 Sep 2020 06:55:46 +0000 (09:55 +0300)]
net/mlx5: support item type error message in flow Verbs

Update the flow Verbs error message to "item type X not supported"
when an item is not supported, instead of the generic error message
"item not supported".
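
For illustration, a message of this form could be built as below. The
helper name and the use of a plain integer for the item type are
assumptions; the driver's actual error reporting path is not shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the message with the offending item type
 * so the user can tell which item was rejected. */
static int
format_item_error(char *buf, size_t len, int item_type)
{
	return snprintf(buf, len, "item type %d not supported", item_type);
}
```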

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agocommon/mlx5: add ConnectX-7 and Bluefield-3 device IDs
Raslan Darawsheh [Mon, 26 Oct 2020 11:41:47 +0000 (13:41 +0200)]
common/mlx5: add ConnectX-7 and Bluefield-3 device IDs

This adds the ConnectX-7 and Bluefield-3 device ids to the list of
supported Mellanox devices that run the MLX5 PMDs.
The devices are still in the development stage.

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: support VLAN matching fields
Matan Azrad [Sun, 25 Oct 2020 16:03:39 +0000 (16:03 +0000)]
net/mlx5: support VLAN matching fields

The fields ``has_vlan`` and ``has_more_vlan`` were added in rte_flow by
patch [1].

Using these fields, the application can match all the VLAN options with
a single flow: any, VLAN only and non-VLAN only.
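
The three options can be pictured with a one-bit field and its mask.
The sketch below assumes the usual rte_flow spec/mask convention (a
cleared mask bit means "do not care"); the structure is a hypothetical
stand-in, not the definition from rte_flow.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the one-bit VLAN indication. */
struct eth_match {
	uint32_t has_vlan:1;
};

/* Decide whether a packet matches:
 *   mask.has_vlan == 0                    -> any
 *   mask.has_vlan == 1, spec.has_vlan == 1 -> VLAN only
 *   mask.has_vlan == 1, spec.has_vlan == 0 -> non-VLAN only
 */
static bool
eth_vlan_match(struct eth_match spec, struct eth_match mask,
	       bool pkt_has_vlan)
{
	if (!mask.has_vlan)
		return true;
	return spec.has_vlan == (pkt_has_vlan ? 1u : 0u);
}
```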

Add support for these fields.
Also add support for matching QinQ packets.

VLAN/QinQ limitations are listed in the driver documentation.

[1] https://patches.dpdk.org/patch/80965/

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
4 years agodoc: update hairpin support in mlx5 guide
Bing Zhao [Mon, 26 Oct 2020 16:37:47 +0000 (00:37 +0800)]
doc: update hairpin support in mlx5 guide

Hairpin between two ports will be supported by mlx5 PMD.

The supported scenarios and limitations are listed in "mlx5.rst".

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: do not split hairpin flow in explicit mode
Bing Zhao [Mon, 26 Oct 2020 16:37:46 +0000 (00:37 +0800)]
net/mlx5: do not split hairpin flow in explicit mode

In the current implementation, the hairpin flow will be split into
two flows implicitly if there is some action that only belongs to the
Tx part. A Tx device flow will be inserted by the mlx5 PMD itself.

In hairpin between two ports, the explicit Tx flow mode will be the
only one to be supported. It is not the appropriate behavior to
insert a Tx flow into another device implicitly. The application
could create any flow as it likes and has full control of the user
flows. Hairpin flows will have no difference from standard flows and
the application can decide how to chain Rx and Tx flows together.

Even in the single port hairpin, this explicit Tx flow mode could
also be supported.

When checking whether a hairpin flow needs to be split, the check simply
returns if the hairpin queue has the "tx_explicit" attribute. The
following validation and translation steps then take the same code path
as for standard flows.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: change hairpin ingress flow validation
Bing Zhao [Mon, 26 Oct 2020 16:37:45 +0000 (00:37 +0800)]
net/mlx5: change hairpin ingress flow validation

In the current implementation of the single port hairpin, there is
an implicit splitting process for actions. When inserting a hairpin
flow, all the actions will be included with the ingress attribute.
The flow engine will check and decide which actions should be moved
into the TX flow part, e.g., encapsulation, VLAN push.

In some NICs, some actions can only be done in one direction. Since
the hairpin flow will be split into two parts, such validation will
be skipped.

With the hairpin explicit TX flow mode, no splitting is needed any
more. The hairpin flow may have no big difference from a standard
flow (except the queue). The application should take full charge of
the actions and the flow engine should validate the hairpin flow in
the same way as other flows.

Meanwhile, a new internal API is added to get the hairpin
configuration. It bypasses the unnecessary atomic operation to save
CPU cycles.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: add conditional hairpin auto bind
Bing Zhao [Mon, 26 Oct 2020 16:37:44 +0000 (00:37 +0800)]
net/mlx5: add conditional hairpin auto bind

In single port hairpin mode, after the queues are configured during
startup, the binding process is enabled automatically in the port
start phase and the default control flow for egress is created.

When switching to two ports hairpin mode, the auto binding process
should be skipped if there is no TX queue with the peer RX queue on
the same device, and it should be skipped also if the queues are
configured with manual bind attribute.

If the explicit TX flow rule mode is configured or hairpin is
between two ports, the default control flows for TX queues should
not be created.
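
The conditions above can be sketched as two predicates. The structure
and function names are hypothetical and the per-queue layout is an
assumption; the sketch only restates the rules in the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue hairpin configuration. */
struct hairpin_txq_conf {
	uint16_t peer_port;
	bool manual_bind;
	bool tx_explicit;
};

/* Auto bind only when some Tx queue has its peer Rx queue on the same
 * port and was not configured for manual binding. */
static bool
hairpin_need_auto_bind(uint16_t self_port,
		       const struct hairpin_txq_conf *txq, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (txq[i].peer_port == self_port && !txq[i].manual_bind)
			return true;
	return false;
}

/* The default Tx control flow is created only for a single port
 * hairpin queue that does not use the explicit Tx flow mode. */
static bool
hairpin_need_default_tx_flow(uint16_t self_port,
			     const struct hairpin_txq_conf *txq)
{
	return txq->peer_port == self_port && !txq->tx_explicit;
}
```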

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: support getting hairpin peer ports
Bing Zhao [Mon, 26 Oct 2020 16:37:43 +0000 (00:37 +0800)]
net/mlx5: support getting hairpin peer ports

In real-world deployments, a device can be attached and detached
dynamically. The hairpin configuration of this port to/from all the
other ports should be enabled and disabled accordingly.

The RTE ethdev lib and PMD should provide the ability to get the peer
ports list in case the application doesn't save it. It is recommended
that the array used to save the port IDs be as large as
"RTE_MAX_ETHPORTS" to have the maximal capacity.

The order of the peer port IDs may be different from that during
hairpin queues set in the initialization stage. The peer port ID
could be the same as the current device port ID when the hairpin
peer ports contain itself - the single port hairpin.

The application should check the ports' status and decide if the
peer port should be bound / unbound when starting / stopping the
current device.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: support two ports hairpin mode
Bing Zhao [Mon, 26 Oct 2020 16:37:42 +0000 (00:37 +0800)]
net/mlx5: support two ports hairpin mode

In order to support hairpin between two ports, mlx5 PMD needs to
implement the functions and provide them as the function pointers.

The bind and unbind functions are executed per port pairs. All the
hairpin queues between the two ports should have the same attributes
during queues setup. Different configurations among queue pairs from
the same ports are not supported. It is allowed that two ports only
have one direction hairpin.

In order to set up the connection between two queues, peer Rx queue
HW information must be fetched via the internal RTE API and the queue
information could be used to modify the SQ object. Then the RQ object
will be modified with the Tx queue HW information. The reverse
operation is not supported right now.

When disconnecting a queue pair, the SQ and RQ objects should be reset
without any peer HW information. The unbinding operation will try to
disconnect all Tx queues of the port from the Rx queues of the peer
port.

Tx explicit mode attribute will be saved and used when creating a
hairpin flow.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: change hairpin queue peer checking
Bing Zhao [Mon, 26 Oct 2020 16:37:41 +0000 (00:37 +0800)]
net/mlx5: change hairpin queue peer checking

In the current implementation of single port mode hairpin, the peer
queue should belong to the same port as the current queue. With the
introduction of the two ports hairpin mode, this check should be removed
so that the hairpin queue setup can succeed, since it is no longer
invalid for the Tx port and Rx port to differ.

Meanwhile, different devices could have different queue
configurations. The number of queues of the peer port is unknown to the
current device, so that check should be removed as well.

If the Tx and Rx port IDs of a hairpin peer are different, only
manual binding and explicit Tx flows are supported. Otherwise, all four
combinations of the modes can be supported. The consistency of the mode
attributes is checked when connecting the queue with its peer queue.
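
The mode rule in the last paragraph reduces to a short check. The
function name and parameters below are hypothetical and serve only to
restate that rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check of the supported mode combinations. */
static bool
hairpin_modes_supported(uint16_t tx_port, uint16_t rx_port,
			bool manual_bind, bool tx_explicit)
{
	/* Two ports: only manual binding with explicit Tx flows. */
	if (tx_port != rx_port)
		return manual_bind && tx_explicit;
	/* Single port: all four combinations are acceptable. */
	return true;
}
```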

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agocommon/mlx5: fix PCI driver name
Bing Zhao [Mon, 26 Oct 2020 08:37:42 +0000 (16:37 +0800)]
common/mlx5: fix PCI driver name

In the refactor of the mlx5 common layer, the PCI driver name given to
the RTE device was changed from "net_mlx5" to "mlx5_pci". The string
"mlx5_pci" is used directly in the structure rte_pci_driver.

Previously, the macro "MLX5_DRIVER_NAME" was used instead of a literal
string, and now it is missing. The functions that use
"MLX5_DRIVER_NAME" will get a mismatch, e.g. mlx5_eth_find_next.

Use this macro again throughout the code to keep everything aligned.

Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class drivers")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/bnxt: fix Rx performance by removing spinlock
Rahul Gupta [Mon, 26 Oct 2020 03:56:16 +0000 (20:56 -0700)]
net/bnxt: fix Rx performance by removing spinlock

The spinlock was trying to protect scenarios where rx_queue stop/start
could be initiated dynamically. Assigning bnxt_dummy_recv_pkts and
bnxt_dummy_xmit_pkts immediately to avoid concurrent access of mbuf in Rx
and cleanup path should help achieve the same result.
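
The idea of swapping in a dummy burst function can be sketched as
follows. All names are hypothetical stand-ins for the driver's
routines, and the sketch leaves out what a real driver must also
handle, such as waiting for a burst call that is already in progress
before cleaning up the queue.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*rx_burst_fn)(void *rxq, void **pkts, uint16_t nb_pkts);

/* Stand-in for the real receive routine: pretends every slot is filled. */
static uint16_t
real_recv_pkts(void *rxq, void **pkts, uint16_t nb_pkts)
{
	(void)rxq;
	(void)pkts;
	return nb_pkts;
}

/* Dummy routine: touches no ring or mbuf and reports no packets. */
static uint16_t
dummy_recv_pkts(void *rxq, void **pkts, uint16_t nb_pkts)
{
	(void)rxq;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}

/* Hypothetical device with a replaceable burst function pointer. */
struct fake_dev {
	rx_burst_fn rx_pkt_burst;
};

/* Stop: switch to the dummy routine first; queue cleanup would follow. */
static void
fake_dev_rx_stop(struct fake_dev *d)
{
	d->rx_pkt_burst = dummy_recv_pkts;
}

/* Start: restore the real routine once the queue is usable again. */
static void
fake_dev_rx_start(struct fake_dev *d)
{
	d->rx_pkt_burst = real_recv_pkts;
}
```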

Fixes: 14255b351537 ("net/bnxt: fix queue start/stop operations")

Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Rahul Gupta <rahul.gupta@broadcom.com>
4 years agonet/bnxt: set thread safe flow ops flag
Ajit Khaparde [Mon, 26 Oct 2020 03:56:15 +0000 (20:56 -0700)]
net/bnxt: set thread safe flow ops flag

The PMD supports thread-safe flow operations. Set the
RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE dev_flag to indicate this to the
application. The rte_flow API functions can then avoid using their
own mutex for safe multi-thread flow handling.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: fix resetting mbuf data offset
Ajit Khaparde [Mon, 26 Oct 2020 03:56:14 +0000 (20:56 -0700)]
net/bnxt: fix resetting mbuf data offset

Reset mbuf->data_off before handing the Rx packet to the application.
We were not doing this in the TPA path. It can cause applications
using this field for post processing to work incorrectly.

Fixes: 0958d8b6435d ("net/bnxt: support LRO")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
4 years agonet/bnxt: increase size of Rx CQ
Ajit Khaparde [Mon, 26 Oct 2020 03:56:13 +0000 (20:56 -0700)]
net/bnxt: increase size of Rx CQ

LRO (aka TPA) and jumbo frame support use the aggregation ring for
placing Rx buffers. These features can generate multiple Rx completions
for a single Rx packet. Increase the size of the Rx Completion Queue to
handle TPA and aggregation ring events.

Fixes: daef48efe5e5 ("net/bnxt: support set MTU")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Qingmin Liu <qingmin.liu@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
4 years agonet/bnxt: support VXLAN decap offload
Venkat Duvvuru [Mon, 26 Oct 2020 03:56:12 +0000 (20:56 -0700)]
net/bnxt: support VXLAN decap offload

VXLAN decap offload can happen in stages. The offload request may
not come as a single flow request but rather as two flow offload
requests, F1 & F2. This patch adds support for this two-stage
offload design. The match criteria for F1 is O_DMAC, O_SMAC,
O_DST_IP, O_UDP_DPORT and actions are COUNT, MARK, JUMP. The match
criteria for F2 is O_SRC_IP, O_DST_IP, VNI and inner header fields.
F1 and F2 flow offload requests can come in any order. If F2 flow
offload request comes first then F2 can’t be offloaded as there is
no O_DMAC information in F2. In this case, F2 will be deferred until
F1 flow offload request arrives. When F1 flow offload request is
received it will have O_DMAC information. Using F1’s O_DMAC, driver
creates an L2 context entry in the hardware as part of offloading F1.
F2 will now use F1’s O_DMAC to get the L2 context id associated with
this O_DMAC and other flow fields that are cached already at the time
of deferring F2 for offloading. F2s that arrive after F1 is offloaded
will be directly programmed and not cached.
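
The ordering logic can be pictured with the simplified sketch below.
It is not the driver's implementation: the structure, the fixed-size
cache and the counters are assumptions made for illustration, and the
lookup from F2 to the matching F1 is reduced to a single tunnel state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEFERRED 4

/* Hypothetical tunnel state; field names are illustrative only. */
struct tun_state {
	bool f1_offloaded;
	uint32_t l2_ctx_id;              /* valid once F1 is offloaded */
	uint32_t deferred[MAX_DEFERRED]; /* cached F2 flow ids */
	size_t n_deferred;
	size_t n_programmed;             /* F2 flows written to "hardware" */
};

/* F2 arrives: program it if F1 is already offloaded, otherwise cache it. */
static int
tun_add_f2(struct tun_state *t, uint32_t flow_id)
{
	if (t->f1_offloaded) {
		t->n_programmed++;
		return 0;
	}
	if (t->n_deferred == MAX_DEFERRED)
		return -1;
	t->deferred[t->n_deferred++] = flow_id;
	return 0;
}

/* F1 arrives: record the L2 context, then program the deferred F2 flows. */
static void
tun_add_f1(struct tun_state *t, uint32_t l2_ctx_id)
{
	t->l2_ctx_id = l2_ctx_id;
	t->f1_offloaded = true;
	t->n_programmed += t->n_deferred;
	t->n_deferred = 0;
}
```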

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add VXLAN decap templates
Venkat Duvvuru [Mon, 26 Oct 2020 03:56:11 +0000 (20:56 -0700)]
net/bnxt: add VXLAN decap templates

Templates for outer tunnel & inner tunnel flow are added in this patch.
This will be used by subsequent patches to implement support for
VXLAN decap rte_flow offload.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: refactor flow id allocation
Venkat Duvvuru [Mon, 26 Oct 2020 03:56:10 +0000 (20:56 -0700)]
net/bnxt: refactor flow id allocation

Currently, the flow id is allocated inside ulp_mapper_flow_create.
However, with the VXLAN decap feature, if the F2 flow comes before the
F1 flow, then F2 is cached and not actually installed in the hardware,
which means the code returns without calling ulp_mapper_flow_create.
ULP still has to return a valid flow id to the stack.
Hence, move the flow id allocation outside ulp_mapper_flow_create.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add mapper support for wildcard TCAM
Kishore Padmanabha [Mon, 26 Oct 2020 03:56:09 +0000 (20:56 -0700)]
net/bnxt: add mapper support for wildcard TCAM

Added support for the key and mask fields encoding for the
wildcard TCAM entry. Also add internal function to post process
the key/mask blobs for wildcard TCAM table. The size of the
wildcard TCAM slice is 80 bytes.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: modify HWRM command to create reps
Somnath Kotur [Mon, 26 Oct 2020 03:56:08 +0000 (20:56 -0700)]
net/bnxt: modify HWRM command to create reps

Use cfa pair alloc for configuring reps.
Instead of cfa_vfr_alloc for Wh+ and cfa_pair_alloc for Stingray,
converge to cfa_pair_alloc/free for both devices. Set the command
request structure bits accordingly.
As part of this, remove the old cfa_vfr_alloc cmd definitions as FW
has deprecated support for those commands.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Shahaji Bhosle <sbhosle@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add hierarchical flow counters
Kishore Padmanabha [Mon, 26 Oct 2020 03:56:07 +0000 (20:56 -0700)]
net/bnxt: add hierarchical flow counters

Add support for hierarchical flow counter accumulation.
In case of hierarchical flows, involving parent and child flows,
the child flow counters are aggregated to get the parent flow counter
information. This should help in cases where one or more flows
are related to a previously offloaded flow.
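
As a rough picture of the accumulation, the parent counter can be
computed as the sum of the child counters. The structure and function
below are hypothetical; how the driver tracks the parent/child relation
is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter pair for one flow. */
struct flow_count {
	uint64_t pkts;
	uint64_t bytes;
};

/* Aggregate the child flow counters into the parent flow counter. */
static struct flow_count
parent_flow_count(const struct flow_count *child, size_t n)
{
	struct flow_count sum = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		sum.pkts += child[i].pkts;
		sum.bytes += child[i].bytes;
	}
	return sum;
}
```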

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Shahaji Bhosle <sbhosle@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: fix flow query count
Somnath Kotur [Mon, 26 Oct 2020 03:56:06 +0000 (20:56 -0700)]
net/bnxt: fix flow query count

Fix infinite loop in flow query count.
`nxt_resource_idx` could be zero in some cases, which is invalid, and
this should be checked in the while loop condition. Also synchronize
access to the flow db using the fdb_lock.
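
A loop of this shape, with the zero index treated as the end of the
chain, might look like the sketch below. The table layout and names
other than nxt_resource_idx are assumptions, and the locking mentioned
above is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resource node; index 0 is reserved as "invalid". */
struct res_node {
	uint32_t nxt_resource_idx;
	bool is_count;
};

/* Walk the chain from 'start' and return the index of the first count
 * resource, or -1 if there is none. Testing idx != 0 in the loop
 * condition is what ends the walk on an invalid next index. */
static int
find_count_resource(const struct res_node *tbl, size_t tbl_len,
		    uint32_t start)
{
	uint32_t idx = start;

	while (idx != 0 && idx < tbl_len) {
		if (tbl[idx].is_count)
			return (int)idx;
		idx = tbl[idx].nxt_resource_idx;
	}
	return -1;
}
```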

Fixes: 306c2d28e247 ("net/bnxt: support count action in flow query")

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: update ULP resource counts
Peter Spreadborough [Mon, 26 Oct 2020 03:56:05 +0000 (20:56 -0700)]
net/bnxt: update ULP resource counts

Update ULP resource counts for Stingray device.
- FW needs some resources for normal operation. Account those
in the resource manager.
- Update the SR ULP requested resource counts to reflect
those available after AFM resources are accounted for.
- Add build option to select either 2 or 4 slot EM entries.
The default is 4 slot entries.

Signed-off-by: Peter Spreadborough <peter.spreadborough@broadcom.com>
Signed-off-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add table scope to PF mapping
Farah Smith [Mon, 26 Oct 2020 03:56:04 +0000 (20:56 -0700)]
net/bnxt: add table scope to PF mapping

Add table scope to PF Mapping for SR and Wh+ devices.
Legacy devices require PF set of base addresses for EEM operation.
A table scope id is a logical construct and is mapped to the PF
associated with the communications channel used.
In the case of a VF, the parent PF is used.

Signed-off-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: support two table scopes
Jay Ding [Mon, 26 Oct 2020 03:56:03 +0000 (20:56 -0700)]
net/bnxt: support two table scopes

Add support for two table scopes: one for Exact Match tables
and the other for External Exact Match tables.
Add a new API to map a PARIF to an EEM table scope (set of Rx and Tx EEM
base addresses). It uses the HWRM_TF_GLOBAL_CFG_SET HWRM command to
configure the mapping.
A PARIF is a handler to a partition of the physical port.
Adjust tf_global_cfg_set() to reduce overhead and clarify names.

Signed-off-by: Jay Ding <jay.ding@broadcom.com>
Signed-off-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add Stingray support to core layer
Peter Spreadborough [Mon, 26 Oct 2020 03:56:02 +0000 (20:56 -0700)]
net/bnxt: add Stingray support to core layer

- Moved P4 chip specific code under the P4 directory
- Added P45 skeleton code for SR to build on
- Add SR support in TRUFLOW core layer.
The TRUFLOW core or the tf-core is a shim layer which communicates with
the CFA block in the hardware.

Signed-off-by: Peter Spreadborough <peter.spreadborough@broadcom.com>
Signed-off-by: Jay Ding <jay.ding@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/ice/base: update version
Qi Zhang [Tue, 20 Oct 2020 22:39:38 +0000 (06:39 +0800)]
net/ice/base: update version

Update base code version in readme.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: specify global RSS LUT id in get/set RSS LUT
Qi Zhang [Tue, 20 Oct 2020 03:48:57 +0000 (11:48 +0800)]
net/ice/base: specify global RSS LUT id in get/set RSS LUT

There is no way to specify a global RSS lookup table (LUT) ID with the
current API and 0 is the only global LUT ID that can be supported since
it's hard coded.
Upcoming support to specify a global LUT ID will require this
flexibility. To fix this, update the API for ice_aq_get_rss_lut() and
ice_aq_set_rss_lut() to take the new structure
ice_aq_get_set_rss_params, which includes a global_lut_id member. A new
structure was introduced instead of adding another parameter to the
previously mentioned functions for 2 reasons:

1. Reduce the number of parameters passed to the functions.
2. Reduce the amount of change required if the arguments ever need to be
   updated in the future.

Also, reduce duplicate code that was checking for an invalid vsi_handle
and lut parameter by moving the checks to the lower level
__ice_aq_get_set_rss_lut().
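
The pattern described here (one parameter structure, with validation
done once in the lower-level helper) can be sketched as follows. This
is not the ice base code: apart from the global_lut_id, vsi_handle and
lut members named in the text, every name, size and the in-memory
"device" table are assumptions so that the example runs on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_NUM_VSI 8
#define FAKE_NUM_GLOBAL_LUTS 4
#define FAKE_LUT_SIZE 16

/* Stand-in for device state so the sketch runs without hardware. */
static uint8_t fake_global_lut[FAKE_NUM_GLOBAL_LUTS][FAKE_LUT_SIZE];

/* Hypothetical parameter structure passed to both get and set. */
struct rss_lut_params {
	uint16_t vsi_handle;
	uint8_t *lut;
	uint16_t lut_size;
	uint8_t global_lut_id;
};

/* Lower-level helper: the parameter checks live here only. */
static int
rss_lut_get_set(struct rss_lut_params *p, bool set)
{
	if (p == NULL || p->lut == NULL ||
	    p->vsi_handle >= FAKE_NUM_VSI ||
	    p->lut_size != FAKE_LUT_SIZE ||
	    p->global_lut_id >= FAKE_NUM_GLOBAL_LUTS)
		return -EINVAL;
	if (set)
		memcpy(fake_global_lut[p->global_lut_id], p->lut,
		       p->lut_size);
	else
		memcpy(p->lut, fake_global_lut[p->global_lut_id],
		       p->lut_size);
	return 0;
}

/* Thin wrappers: no duplicated validation. */
static int
rss_lut_get(struct rss_lut_params *p)
{
	return rss_lut_get_set(p, false);
}

static int
rss_lut_set(struct rss_lut_params *p)
{
	return rss_lut_get_set(p, true);
}
```

Adding a member to the structure later leaves both wrapper signatures
unchanged, which is the second reason given above.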

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: refactor RSS configure API
Qi Zhang [Tue, 20 Oct 2020 02:43:08 +0000 (10:43 +0800)]
net/ice/base: refactor RSS configure API

Use struct ice_rss_hash_cfg as parameter for
ice_add_rss_cfg, ice_add_rss_cfg_sync and
ice_rem_rss_cfg, ice_rem_rss_cfg_sync.

Introduce enum ice_rss_cfg_hdr_type to allow the user to specify a more
flexible RSS configuration.

ICE_RSS_OUTER_HEADERS - take outer layer as RSS inputset
ICE_RSS_INNER_HEADERS - take inner layer as RSS inputset
ICE_RSS_INNER_HEADERS_W_OUTER_IPV4
- take inner layer as RSS inputset for packet with outer IPV4
ICE_RSS_INNER_HEADERS_W_OUTER_IPV6
- take inner layer as RSS inputset for packet with outer IPV6
ICE_RSS_ANY_HEADERS - try with outer first then inner
(same as the behaviour without this change)

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: use macro to get variable size array length
Qi Zhang [Tue, 20 Oct 2020 01:33:53 +0000 (09:33 +0800)]
net/ice/base: use macro to get variable size array length

Use the FLEX_ARRAY_SIZE() helper with the recently added flexible array
members in structures.
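
For readers unfamiliar with the helper, the sketch below shows how such
a macro is typically used with a flexible array member. The macro body
given here is an assumption about its shape, not a copy of the ice base
code, and the structure is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of the helper: byte size of 'cnt' elements of the
 * flexible array member '_mem' of the structure '_ptr' points to. */
#define FLEX_ARRAY_SIZE(_ptr, _mem, cnt) \
	((cnt) * sizeof((_ptr)->_mem[0]))

/* Hypothetical buffer with a flexible array member. */
struct elem_buf {
	uint16_t count;
	uint32_t elems[];
};

/* Total size in bytes of a buffer holding 'cnt' elements. Only sizeof
 * is applied to 'b', so the pointer is never dereferenced. */
static size_t
elem_buf_bytes(const struct elem_buf *b, size_t cnt)
{
	return sizeof(*b) + FLEX_ARRAY_SIZE(b, elems, cnt);
}
```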

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: remove duplicated AQ command flag setting
Qi Zhang [Tue, 20 Oct 2020 01:26:08 +0000 (09:26 +0800)]
net/ice/base: remove duplicated AQ command flag setting

The flag setting is duplicated when sending the indirect Read/Write SFF
EEPROM AQ command: the flag is already added later in the code flow for
all indirect AQ commands, i.e. commands that provide an additional data
buffer.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: support extended GPIO access
Qi Zhang [Tue, 20 Oct 2020 01:16:59 +0000 (09:16 +0800)]
net/ice/base: support extended GPIO access

Add two new admin commands, SW Set GPIO and SW Get GPIO
(0x6EF and 0x6F0 respectively), which extend the GPIO handling
capabilities by the SW driver.

Signed-off-by: Shay Amir <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: fix parameter name in comment
Qi Zhang [Tue, 20 Oct 2020 01:06:01 +0000 (09:06 +0800)]
net/ice/base: fix parameter name in comment

Fix parameter name for cookie_high and cookie_low.

Fixes: a90fae1d0755 ("net/ice/base: add admin queue structures and commands")
Cc: stable@dpdk.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: recognize 860 as iSCSI port in CEE mode
Qi Zhang [Tue, 20 Oct 2020 01:00:40 +0000 (09:00 +0800)]
net/ice/base: recognize 860 as iSCSI port in CEE mode

iSCSI can use both TCP ports 860 and 3260. However, in our current
implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command
doesn't provide a way to communicate the protocol port number to the
AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the
AQ's caller layer.

In this patch, we rely on the dcbx-willing mode, desired QOS and
remote QOS configuration to determine which port number iSCSI will
use.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: implement shared rate limiter
Qi Zhang [Tue, 20 Oct 2020 00:52:03 +0000 (08:52 +0800)]
net/ice/base: implement shared rate limiter

Implement shared bandwidth rate limit functionality to account for
dedicated bandwidth and minimum bandwidth. It requires a non-default
profile to be programmed for CIR, EIR/PIR, and SRL.

Signed-off-by: Tarun Singh <tarun.k.singh@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: return error directly
Qi Zhang [Tue, 20 Oct 2020 00:47:31 +0000 (08:47 +0800)]
net/ice/base: return error directly

As there is nothing to unroll, return the error directly. Remove the
label as this is the only reference to that label.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: support class 5+ modules
Qi Zhang [Fri, 18 Sep 2020 05:24:34 +0000 (13:24 +0800)]
net/ice/base: support class 5+ modules

Currently QSFP/SFP modules up to power class 4 are supported.
100G modules require higher power in many cases.
Also, low power mode requires support of power classes 7 and even 8.

This change extends "Get Link Status" AQ command (0x0607) to
support class 5+ modules.

The patch also adds a couple of other missing bits for link status.

Signed-off-by: Shay Amir <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: use malloc instead of calloc
Qi Zhang [Fri, 18 Sep 2020 05:21:48 +0000 (13:21 +0800)]
net/ice/base: use malloc instead of calloc

Use *malloc() instead of *calloc() when allocating only a single object
as opposed to an array of objects.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: use package info from ice segment metadata
Qi Zhang [Fri, 18 Sep 2020 05:18:30 +0000 (13:18 +0800)]
net/ice/base: use package info from ice segment metadata

There are two package versions in the package binary. Today, these two
version numbers are the same. However, in the future that may change.

Update code to use the package info from the ice segment metadata
section, which is the package information that is actually downloaded to
the firmware during the download package process.

Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: add more capability to admin queue
Qi Zhang [Fri, 18 Sep 2020 05:13:38 +0000 (13:13 +0800)]
net/ice/base: add more capability to admin queue

Add the 3 new capabilities below to the "Get Capabilities" AQ commands
0x000A and 0x000B.

ICE_AQC_CAPS_IWARP
ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE
ICE_AQC_CAPS_NVM_MGMT

Signed-off-by: Shay Amir <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: allocate and free RSS global lookup table
Qi Zhang [Fri, 18 Sep 2020 05:10:31 +0000 (13:10 +0800)]
net/ice/base: allocate and free RSS global lookup table

Currently there is no API to allocate and free an RSS global LUT.
Incoming changes to support VFs having >16 queues will require using
RSS global LUT resources. The functions included will allow a PF to
configure an RSS global LUT for VFs that request >16 queues.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: read security revision
Qi Zhang [Fri, 18 Sep 2020 05:07:11 +0000 (13:07 +0800)]
net/ice/base: read security revision

The main NVM module and the Option ROM module contain a security
revision in their CSS header. This security revision is used to
determine whether or not the signed module should be loaded at bootup.
If the module security revision is lower than the associated minimum
security revision, it will not be loaded.

The CSS header does not have a module id associated with it, and thus
requires flat NVM reads in order to access it. To do this, take
advantage of the cached bank information. Introduce a new
"ice_read_flash_module" function that takes the module and bank to read.
Implement both ice_read_active_nvm_module and
ice_read_active_orom_module. These functions will use the cached values
to determine the active bank and calculate the appropriate offset.

Using these new access functions, extract the security revision for both
the main NVM bank and the Option ROM into the associated info structure.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: move sched function prototypes
Qi Zhang [Fri, 18 Sep 2020 05:03:57 +0000 (13:03 +0800)]
net/ice/base: move sched function prototypes

These functions reside in ice_sched.c but the function prototypes are
declared in ice_common.h. Move the function prototypes to ice_sched.h.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: rename ptype bitmap
Qi Zhang [Fri, 18 Sep 2020 05:02:00 +0000 (13:02 +0800)]
net/ice/base: rename ptype bitmap

Align all ptype bitmaps to follow the ice_ptypes_xxx prefix.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: modify ptype bitmap for outer MAC
Qi Zhang [Fri, 18 Sep 2020 04:59:44 +0000 (12:59 +0800)]
net/ice/base: modify ptype bitmap for outer MAC

Add the ptypes below to ice_ptypes_mac_ofos:

MAC_IPV4[6]_ESP
MAC_IPV4[6]_AH
MAC_IPV4[6]_NAT_T_ESP
MAC_IPV4[6]_NAT_T_IKE
MAC_IPV4[6]_NAT_T_KEEP
MAC_IPV4[6]_PFCP_NODE
MAC_IPV4[6]_PFCP_SESSION
MAC_IPV4[6]_L2TPV3

The ptypes above can then also be selected by a filter when an outer MAC
header is required.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: add NVM write response flags
Qi Zhang [Fri, 18 Sep 2020 04:57:36 +0000 (12:57 +0800)]
net/ice/base: add NVM write response flags

Add the NVM Write Admin Command (0x703) ARQ response flags, as
returned in the "Response flags" field.
Three flags are supported: POR, PERST and EMPR. Each indicates the
type of reset required for the NVM bank update to take effect.

Signed-off-by: Shay Amir <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: support tunnel for flow director
Qi Zhang [Fri, 18 Sep 2020 04:54:57 +0000 (12:54 +0800)]
net/ice/base: support tunnel for flow director

Add a struct to store the outer part of a tunnel rule.
Add the VXLAN ptype to the IPv4 MAC bitmap, so that the ptype group is
valid when a VXLAN rule is created.

Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice: show RSS hash configuration
Tao Zhu [Thu, 29 Oct 2020 06:37:57 +0000 (14:37 +0800)]
net/ice: show RSS hash configuration

Implement interface 'ice_rss_hash_conf_get' to support showing the RSS
hash configuration.

Note:
Only the rss_hf from the latest dev_configure or dev_rss_hash_update is
returned. Any configuration made through rte_flow is ignored.

Signed-off-by: Tao Zhu <taox.zhu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/iavf: enable AVX512 for Tx
Wenzhuo Lu [Thu, 29 Oct 2020 01:24:04 +0000 (09:24 +0800)]
net/iavf: enable AVX512 for Tx

To enhance the per-core performance, this patch adds some AVX512
instructions to the data path to handle the Tx descriptors.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/iavf: enable AVX512 for flexible Rx
Wenzhuo Lu [Thu, 29 Oct 2020 01:24:03 +0000 (09:24 +0800)]
net/iavf: enable AVX512 for flexible Rx

To enhance the per-core performance, this patch adds some AVX512
instructions to the data path to handle the flexible Rx descriptors.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/iavf: enable AVX512 for legacy Rx
Wenzhuo Lu [Thu, 29 Oct 2020 01:24:02 +0000 (09:24 +0800)]
net/iavf: enable AVX512 for legacy Rx

To enhance the per-core performance, this patch adds some AVX512
instructions to the data path to handle the legacy Rx descriptors.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/ice: fix DCF crash on Rx
Haiyue Wang [Thu, 29 Oct 2020 01:13:22 +0000 (09:13 +0800)]
net/ice: fix DCF crash on Rx

The initialization that selects the handler for extracting FlexiMD
fields into the mbuf on the scalar Rx path was missing, which causes a
segmentation fault (core dumped).

Also add the missing support for RXDID 16, which has the RSS hash value
in Qword 1.

Fixes: 7a340b0b4e03 ("net/ice: refactor Rx FlexiMD handling")
Cc: stable@dpdk.org
Reported-by: Alvin Zhang <alvinx.zhang@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/iavf: fix Rx offload flags in SSE path
Alvin Zhang [Tue, 27 Oct 2020 10:15:07 +0000 (18:15 +0800)]
net/iavf: fix Rx offload flags in SSE path

Update the reading of offload flags for the last two of the four packets.

Fixes: 1162f5a0ef31 ("net/iavf: support flexible Rx descriptor in SSE path")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agodoc: fix a typo in flow API guide
Ajit Khaparde [Wed, 28 Oct 2020 04:46:09 +0000 (21:46 -0700)]
doc: fix a typo in flow API guide

flow_type_rss_offloads was misspelt as flow_tpe_rss_offloads

Fixes: 6abee736abe6 ("doc: update RSS flow action with best effort")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
4 years agoethdev: move non-offload capabilities
Thomas Monjalon [Tue, 27 Oct 2020 13:20:22 +0000 (14:20 +0100)]
ethdev: move non-offload capabilities

The definitions of RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP
and RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP were inserted
before the last comment of Tx offloads.

They are moved to a better place,
with each comment moved to precede its definition.
A group comment is added to better describe device capabilities.

Fixes: cac923cfea47 ("ethdev: support runtime queue setup")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
4 years agonet/ice: fix Rx offload flags in SSE path
Alvin Zhang [Fri, 23 Oct 2020 02:13:22 +0000 (10:13 +0800)]
net/ice: fix Rx offload flags in SSE path

Update the reading of offload flags for the last two of the four packets.

Fixes: ece1f8a8f1c8 ("net/ice: switch to flexible descriptor in SSE path")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/i40e: fix flow director for eth + VLAN pattern
Beilei Xing [Tue, 27 Oct 2020 06:21:47 +0000 (14:21 +0800)]
net/i40e: fix flow director for eth + VLAN pattern

Currently, only one of the following flows can be created for the
ETH + VLAN pattern.

1. flow create 0 ingress pattern eth / vlan vid is 350 / end
   actions queue index 2 / end
2. flow create 0 ingress pattern eth / vlan vid is 351 / end
   actions queue index 3 / end

The root cause is that vlan_tci is not set correctly, so the keys
of the two flows are identical.
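
A minimal sketch of the failure mode, assuming a simplified lookup key. The
struct and helper names below are hypothetical and do not mirror the driver's
real FDIR structures; the point is only that two flows differing solely in
VLAN ID collapse to the same key when vlan_tci is never filled in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified lookup key; not the driver's real layout. */
struct fdir_key {
	uint16_t ether_type;
	uint16_t vlan_tci;
};

/*
 * Build a key. When copy_vlan is 0 the VLAN TCI stays at zero, which
 * mimics the bug; when it is non-zero the TCI is recorded in the key.
 */
static struct fdir_key make_key(uint16_t ether_type, uint16_t vlan_tci,
				int copy_vlan)
{
	struct fdir_key k;

	memset(&k, 0, sizeof(k));
	k.ether_type = ether_type;
	if (copy_vlan)
		k.vlan_tci = vlan_tci;
	return k;
}

/* Keys are compared byte-wise, as a hash-table lookup typically would. */
static int keys_equal(const struct fdir_key *a, const struct fdir_key *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

With copy_vlan set to 0, the keys for VLAN 350 and VLAN 351 compare equal, so
the second flow looks like a duplicate of the first; with copy_vlan set to 1
they differ.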

Fixes: 42044b69c67d ("net/i40e: support input set selection for FDIR")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
4 years agonet/txgbe: add Tx done cleanup
Jiawen Wu [Tue, 27 Oct 2020 06:23:15 +0000 (14:23 +0800)]
net/txgbe: add Tx done cleanup

Add support for API rte_eth_tx_done_cleanup().

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: add Rx and Tx descriptor status
Jiawen Wu [Tue, 27 Oct 2020 06:23:14 +0000 (14:23 +0800)]
net/txgbe: add Rx and Tx descriptor status

Support checking the status of Rx and Tx descriptors.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agodoc: add Rx buffer split limitation to mlx5 guide
Viacheslav Ovsiienko [Mon, 26 Oct 2020 11:55:05 +0000 (11:55 +0000)]
doc: add Rx buffer split limitation to mlx5 guide

The buffer split feature is mentioned in the mlx5 PMD
documentation, and a description of its limitations is added
as well.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/mlx5: report Rx buffer split capabilities
Viacheslav Ovsiienko [Mon, 26 Oct 2020 11:55:04 +0000 (11:55 +0000)]
net/mlx5: report Rx buffer split capabilities

Add rte_eth_dev_info->rx_seg_capa parameters:
  - receiving to multiple pools is supported
  - buffer offsets are supported
  - no offset alignment requirement
  - reports the maximal number of segments
  - reports the buffer split offload flag

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/mlx5: support Rx buffer split on datapath
Viacheslav Ovsiienko [Mon, 26 Oct 2020 11:55:03 +0000 (11:55 +0000)]
net/mlx5: support Rx buffer split on datapath

Only the regular rx_burst routine is updated to support split,
because the vectorized ones do not support scatter and MPRQ
does not support split at all.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/mlx5: register multiple pool for Rx queue
Viacheslav Ovsiienko [Mon, 26 Oct 2020 11:55:02 +0000 (11:55 +0000)]
net/mlx5: register multiple pool for Rx queue

The split feature for receiving packets was added to the mlx5
PMD. An Rx queue can now receive data into buffers belonging
to different pools, and the memory of all the involved pools
must be registered for DMA operations in order to allow hardware
to store the data.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/mlx5: configure Rx queue for buffer split
Viacheslav Ovsiienko [Mon, 26 Oct 2020 11:55:01 +0000 (11:55 +0000)]
net/mlx5: configure Rx queue for buffer split

The scatter-gather elements should be configured
accordingly to support the buffer split feature.
The application provides the desired settings for
the segments at the beginning of the packets and the
PMD pads the buffer chain (if needed) with the attributes
of the last specified segment to accommodate a packet
of maximal length.
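
The padding rule can be sketched as follows. This is an illustration under
stated assumptions, not the PMD's code: the function name, the use of plain
length arrays, and the "return 0 on error" convention are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand the application's segment lengths into a chain able to hold
 * max_pkt_len bytes. All application segments are emitted first; the
 * last specified length is then repeated as padding until the chain
 * covers max_pkt_len.
 * Returns the number of entries written to 'chain', or 0 if the input
 * is invalid or 'chain_cap' entries are not enough.
 */
static size_t pad_segment_chain(const uint16_t *seg_len, size_t n_seg,
				uint32_t max_pkt_len,
				uint16_t *chain, size_t chain_cap)
{
	uint64_t covered = 0;
	size_t n = 0;

	assert(seg_len != NULL && chain != NULL);
	if (n_seg == 0)
		return 0;
	for (size_t i = 0; i < n_seg; i++)
		if (seg_len[i] == 0)
			return 0; /* a zero-length segment never makes progress */

	while (n < n_seg || covered < max_pkt_len) {
		uint16_t len = seg_len[n < n_seg ? n : n_seg - 1];

		if (n == chain_cap)
			return 0;
		chain[n++] = len;
		covered += len;
	}
	return n;
}
```

For example, segments of 64 and 1024 bytes with a maximal packet length of
3000 bytes yield the chain 64, 1024, 1024, 1024.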

Some limitations are implied. The MPRQ
feature should be disengaged if split is requested,
because MPRQ neither supports pushing data to the
dedicated pools nor follows the flexible buffer sizes.
The vectorized rx_burst routines do not support
scattering (they are extremely simplified
and work over a single segment only) and can't
handle split either.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/mlx5: support Rx buffer split description
Viacheslav Ovsiienko [Mon, 26 Oct 2020 11:55:00 +0000 (11:55 +0000)]
net/mlx5: support Rx buffer split description

A routine is added that performs Rx queue setup with an
extended receiving buffer description.
It allows the application to specify the desired segment
lengths, data position offsets in the buffer
and a dedicated memory pool for each segment.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
4 years agonet/mlx5: implement tunnel offload
Gregory Etelson [Sun, 25 Oct 2020 14:08:09 +0000 (16:08 +0200)]
net/mlx5: implement tunnel offload

Tunnel Offload API provides hardware independent, unified model
to offload tunneled traffic. Key model elements are:
 - apply matches to both outer and inner packet headers
   during entire offload procedure;
 - restore outer header of partially offloaded packet;
 - model is implemented as a set of helper functions.

Implementation details:
* tunnel_offload PMD parameter must be set to 1 to enable the feature.
* application cannot use MARK and META flow actions with tunnel.
* offload JUMP action is restricted to steering tunnel rule only.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: support shared action for RSS
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:10 +0000 (13:24 +0300)]
net/mlx5: support shared action for RSS

Implement shared action create/destroy/update/query. The current
implementation support is limited to shared RSS action only. The shared
RSS action create operation prepares hash RX queue objects for all
supported permutations of the hash.  The shared RSS action update
operation relies on functionality to modify hash RX queue introduced in
one of the previous commits in this patch series.

Implement RSS shared action and handle shared RSS on flow apply and
release. When handling a shared RSS action, the lookup for the hash RX
queue object is limited to the set of objects stored in the shared
action itself. The lookup for a hash RX queue object inside a
shared action is performed by hash only.

The current implementation is limited to DV flow driver operations, i.e.
verbs flow driver operations don't support shared actions.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: translate shared action for RSS action
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:09 +0000 (13:24 +0300)]
net/mlx5: translate shared action for RSS action

Handle shared action on flow validation/creation/destruction.
mlx5 PMD translates a shared action into a regular one before handling
flow validation/creation. The translation is applied so that shared and
regular actions use the same execution path.
The current implementation supports shared action translation for the
shared RSS action only.

RSS action validation is split so that a shared RSS action is validated
on its creation, in addition to the action validation in the flow
validation/creation path.

Implement rte_flow shared action API for mlx5 PMD, mostly forwarding
calls to flow driver operations (see struct mlx5_flow_driver_ops).

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/mlx5: modify hash Rx queue objects
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:08 +0000 (13:24 +0300)]
net/mlx5: modify hash Rx queue objects

Implement modification for hashed table of Rx queue object (see
mlx5_hrxq_modify()). This implementation relies on the capability to
modify TIR object via DevX API, i.e. current implementation doesn't
support verbs HW object operations. The functionality to modify hashed
table of Rx queue object is a prerequisite to implement
rte_flow_shared_action_update() for shared RSS action in mlx5 PMD.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agocommon/mlx5: modify advanced Rx object via DevX
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:07 +0000 (13:24 +0300)]
common/mlx5: modify advanced Rx object via DevX

Implement TIR modification (see mlx5_devx_cmd_modify_tir()) using DevX
API. TIR is the object containing the hashed table of Rx queue. The
functionality to configure/modify this HW-related object is a prerequisite
to implement rte_flow_shared_action_update() for shared RSS action in
mlx5 PMD. HW-related structures for TIR modification are added to mlx5_prm.h.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
4 years agonet/bnxt: update PMD supported features
Lance Richardson [Thu, 22 Oct 2020 20:19:51 +0000 (16:19 -0400)]
net/bnxt: update PMD supported features

Mark "BSD nic_uio", "Usage doc", and "Perf doc" as supported
for the bnxt PMD.

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: use shorter SIMD initializers
Lance Richardson [Thu, 22 Oct 2020 18:50:51 +0000 (14:50 -0400)]
net/bnxt: use shorter SIMD initializers

Make SIMD initialization code less verbose by using appropriate
intrinsics when all lanes of a vector are initialized to the
same value.

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: fix boolean operator usage
Lance Richardson [Thu, 22 Oct 2020 18:45:10 +0000 (14:45 -0400)]
net/bnxt: fix boolean operator usage

Use boolean AND operator instead of bitwise operator.
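
A small sketch of why the operator matters. The flag names and values are
hypothetical and are not the driver's macros; they are chosen only so that the
two operands are non-zero but share no bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values with no bit in common. */
#define FLAG_A 0x2u
#define FLAG_B 0x4u

/* Buggy form: bitwise AND of two non-zero values can still be zero. */
static bool both_set_bitwise(uint32_t a, uint32_t b)
{
	return (a & b) != 0;
}

/* Intended form: logical AND is true whenever both operands are non-zero. */
static bool both_set_logical(uint32_t a, uint32_t b)
{
	return a && b;
}

/* Usage sketch: the two forms disagree for 0x2 and 0x4. */
static void boolean_and_demo(void)
{
	assert(!both_set_bitwise(FLAG_A, FLAG_B));
	assert(both_set_logical(FLAG_A, FLAG_B));
}
```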

Coverity issue: 323488
Fixes: b42c15c83e88 ("net/bnxt: support trusted VF")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/ice: update writeback policy to reduce latency
Jesse Brandeburg [Fri, 23 Oct 2020 20:22:00 +0000 (13:22 -0700)]
net/ice: update writeback policy to reduce latency

Just like iavf, setting the value to 2us allows for generally good
streaming packet performance while keeping latency down, and
generally keeps the performance of the PF and VF interfaces similar.

The previous value of 0x10 caused the latency of a single packet
receive to be as much as 16us.

Fixes: 65dfc889d86b ("net/ice: support Rx queue interruption")
Cc: stable@dpdk.org
Reported-by: Brian Johnson <brian.johnson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/iavf: fix performance with writeback policy
Jesse Brandeburg [Fri, 23 Oct 2020 20:21:59 +0000 (13:21 -0700)]
net/iavf: fix performance with writeback policy

The iavf driver was trying to use writeback on ITR, but was
never setting an ITR, so it didn't work. This caused performance
to be limited due to too much PCIe traffic and partial writes
during most benchmarking workloads.
Set the ITR during queue setup, which can be checked at runtime
by reading register 0x2800. Setting the value to 2us allows
for generally good streaming packet performance while keeping
latency down.

Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")
Cc: stable@dpdk.org
Reported-by: Brian Johnson <brian.johnson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agoraw/ifpga/base: enhance driver reliability in multi-process
Wei Huang [Fri, 23 Oct 2020 08:59:59 +0000 (04:59 -0400)]
raw/ifpga/base: enhance driver reliability in multi-process

Current hardware protection is based on a pthread mutex, which
works only for multiple threads within one process. In a
multi-process environment, the hardware state machine would be
corrupted by concurrent access, which means the original pthread
mutex mechanism needs to be enhanced.

The major modifications in this patch are listed below:
1. Create a mutex for the adapter in shared memory named
   "mutex.IFPGA:domain:bus:dev.func" when the device is probed.
2. Create a shared memory region named "IFPGA:domain:bus:dev.func" while
   the opae adapter is initializing. There is a reference count in shared
   memory. Shared memory will be destroyed once the reference count drops
   to zero.
3. Two mutexes are created in shared memory and initialized with the flag
   PTHREAD_PROCESS_SHARED. One is for SPI and the other for I2C. They will
   be passed to the SPI and I2C drivers subsequently.
4. DTB data in flash will be cached in shared memory. The MAX10 driver
   can then read the DTB from shared memory instead of flash. This avoids
   conflicts between hardware and software concurrently accessing flash.
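
The process-shared mutex setup in item 3 can be sketched with standard POSIX
calls. This is a minimal illustration, not the driver's code: the struct
layout and function names are hypothetical, and an anonymous shared mapping
(visible only to forked children) stands in for the named shared memory the
patch describes.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical layout of the shared region: one lock per bus and a refcount. */
struct shared_locks {
	pthread_mutex_t spi_lock;
	pthread_mutex_t i2c_lock;
	int ref_cnt;
};

/* Initialize a mutex that may be locked from several processes. */
static int init_shared_mutex(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;
	int ret;

	assert(m != NULL);
	ret = pthread_mutexattr_init(&attr);
	if (ret != 0)
		return ret;
	ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (ret == 0)
		ret = pthread_mutex_init(m, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Map shared memory and set up both locks; returns NULL on failure. */
static struct shared_locks *shared_locks_create(void)
{
	struct shared_locks *s = mmap(NULL, sizeof(*s), PROT_READ | PROT_WRITE,
				      MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (s == MAP_FAILED)
		return NULL;
	if (init_shared_mutex(&s->spi_lock) != 0) {
		munmap(s, sizeof(*s));
		return NULL;
	}
	if (init_shared_mutex(&s->i2c_lock) != 0) {
		pthread_mutex_destroy(&s->spi_lock);
		munmap(s, sizeof(*s));
		return NULL;
	}
	s->ref_cnt = 1;
	return s;
}

/* Destroy both locks and unmap the region. */
static void shared_locks_destroy(struct shared_locks *s)
{
	pthread_mutex_destroy(&s->spi_lock);
	pthread_mutex_destroy(&s->i2c_lock);
	munmap(s, sizeof(*s));
}
```

Without PTHREAD_PROCESS_SHARED, the behavior of a mutex operated on by more
than one process is undefined, even if the mutex itself lives in shared memory.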

Signed-off-by: Wei Huang <wei.huang@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
4 years agoraw/ifpga/base: free resources when destroying device
Wei Huang [Fri, 23 Oct 2020 08:59:58 +0000 (04:59 -0400)]
raw/ifpga/base: free resources when destroying device

Add two functions to complete the resource free work, one is
'ifpga_adapter_destroy()', the other is 'ifpga_bus_uinit()'.

Then call 'opae_adapter_destroy()' and 'opae_adapter_data_free()'
in 'ifpga_rawdev_close()' to free resources.

Also 'opae_adapter_free()' is removed from 'ifpga_rawdev_destroy()',
because the opae adapter is pointed to by the dev_private member of
raw_dev and will be freed in 'rte_rawdev_pmd_release()'.

Signed-off-by: Wei Huang <wei.huang@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
4 years agoraw/ifpga/base: fix return of IRQ unregister
Wei Huang [Fri, 23 Oct 2020 08:59:57 +0000 (04:59 -0400)]
raw/ifpga/base: fix return of IRQ unregister

'rte_intr_callback_unregister()' can return a positive
value on success, but 'ifpga_rawdev_destroy()' handles it as
an error.

Instead, only a negative return is treated as failure.
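
The difference between the two checks can be sketched with a stand-in
function. Everything below is hypothetical: fake_unregister() only imitates a
call that returns a positive count on success and a negative value on failure,
and the destroy helpers are not the driver's functions.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for an unregister call: returns the number of
 * callbacks removed on success (possibly > 0), negative on failure.
 */
static int fake_unregister(int registered, int fail)
{
	if (fail)
		return -1;
	return registered;
}

/* Buggy check: any non-zero return, including a positive count, is an error. */
static int destroy_buggy(int registered, int fail)
{
	int ret = fake_unregister(registered, fail);

	return ret ? -1 : 0;
}

/* Fixed check: only a negative return is treated as failure. */
static int destroy_fixed(int registered, int fail)
{
	int ret = fake_unregister(registered, fail);

	return ret < 0 ? -1 : 0;
}

/* Usage sketch: one callback removed successfully. */
static void unregister_check_demo(void)
{
	assert(destroy_buggy(1, 0) == -1); /* success misreported as error */
	assert(destroy_fixed(1, 0) == 0);
}
```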

Fixes: e0a1aafe2af9 ("raw/ifpga: introduce IRQ functions")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
4 years agoraw/ifpga/base: handle unsupported interrupt type
Wei Huang [Fri, 23 Oct 2020 08:59:56 +0000 (04:59 -0400)]
raw/ifpga/base: handle unsupported interrupt type

Handle unsupported interrupt type requests properly.
In the unsupported interrupt case,
'ifpga_unregister_msix_irq()' returns success and
'ifpga_register_msix_irq()' returns failure.

Fixes: e0a1aafe2af9 ("raw/ifpga: introduce IRQ functions")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
4 years agoraw/ifpga/base: fix interrupt handler instance usage
Wei Huang [Fri, 23 Oct 2020 08:59:55 +0000 (04:59 -0400)]
raw/ifpga/base: fix interrupt handler instance usage

The interrupt handler was copied by value to the local 'intr_handle'
variable before being passed to the IRQ functions.
This led the IRQ functions to update the local variable instead of
'ifpga_irq_handle'.

Instead, use the 'intr_handle' local variable as a pointer to
'ifpga_irq_handle', as intended.
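
A minimal sketch of the by-value versus by-pointer difference. The struct,
the global, and the helper names are hypothetical simplifications; the real
interrupt handle carries more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified handle. */
struct irq_handle {
	int fd;
	int nb_efd;
};

/* Plays the role of the long-lived 'ifpga_irq_handle'. */
static struct irq_handle global_handle;

/* Hypothetical IRQ helper that records state in the handle it is given. */
static void irq_setup(struct irq_handle *h, int fd)
{
	assert(h != NULL);
	h->fd = fd;
	h->nb_efd = 1;
}

/* Buggy pattern: the helper updates a local copy that is then discarded. */
static void register_by_value(int fd)
{
	struct irq_handle intr_handle = global_handle;

	irq_setup(&intr_handle, fd);
}

/* Fixed pattern: the local variable is a pointer to the real handle. */
static void register_by_pointer(int fd)
{
	struct irq_handle *intr_handle = &global_handle;

	irq_setup(intr_handle, fd);
}
```

After register_by_value(7) the global handle is unchanged, while
register_by_pointer(7) leaves fd set to 7 in the global handle.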

Fixes: e0a1aafe2af9 ("raw/ifpga: introduce IRQ functions")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
4 years agonet/txgbe: support DCB info get
Jiawen Wu [Mon, 19 Oct 2020 08:54:12 +0000 (16:54 +0800)]
net/txgbe: support DCB info get

Add DCB information get operation.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support PTP
Jiawen Wu [Mon, 19 Oct 2020 08:54:11 +0000 (16:54 +0800)]
net/txgbe: support PTP

Add PTP support.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support device LED on and off
Jiawen Wu [Mon, 19 Oct 2020 08:54:09 +0000 (16:54 +0800)]
net/txgbe: support device LED on and off

Support device LED on and off.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support register dump
Jiawen Wu [Mon, 19 Oct 2020 08:54:08 +0000 (16:54 +0800)]
net/txgbe: support register dump

Add register dump support.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support EEPROM info get
Jiawen Wu [Mon, 19 Oct 2020 08:54:07 +0000 (16:54 +0800)]
net/txgbe: support EEPROM info get

Add EEPROM information get related operations.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support getting FW version
Jiawen Wu [Mon, 19 Oct 2020 08:54:06 +0000 (16:54 +0800)]
net/txgbe: support getting FW version

Add firmware version get operation.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support MTU set
Jiawen Wu [Mon, 19 Oct 2020 08:54:05 +0000 (16:54 +0800)]
net/txgbe: support MTU set

Add MTU set operation.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: add device promiscuous and allmulticast mode
Jiawen Wu [Mon, 19 Oct 2020 08:54:04 +0000 (16:54 +0800)]
net/txgbe: add device promiscuous and allmulticast mode

Add device promiscuous and allmulticast mode.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support priority flow control
Jiawen Wu [Mon, 19 Oct 2020 08:54:03 +0000 (16:54 +0800)]
net/txgbe: support priority flow control

Add priority flow control support.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support FC auto negotiation
Jiawen Wu [Mon, 19 Oct 2020 08:54:02 +0000 (16:54 +0800)]
net/txgbe: support FC auto negotiation

Add flow control negotiation with link partner.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support flow control
Jiawen Wu [Mon, 19 Oct 2020 08:54:01 +0000 (16:54 +0800)]
net/txgbe: support flow control

Add flow control support.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support DCB
Jiawen Wu [Mon, 19 Oct 2020 08:54:00 +0000 (16:54 +0800)]
net/txgbe: support DCB

Add DCB transmit and receive mode configurations,
and allocate DCB packet buffer.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: support RSS
Jiawen Wu [Mon, 19 Oct 2020 08:53:59 +0000 (16:53 +0800)]
net/txgbe: support RSS

Add RSS configuration, and support RSS hash and RETA operations for the PF.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/txgbe: add VMDq configure
Jiawen Wu [Mon, 19 Oct 2020 08:53:58 +0000 (16:53 +0800)]
net/txgbe: add VMDq configure

Add multiple queue setting with VMDq.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>