dpdk.git
Bing Zhao [Thu, 15 Oct 2020 13:08:54 +0000 (21:08 +0800)]
ethdev: add hairpin queue operations

Every hairpin queue pair should be configured properly and the
connection between Tx and Rx queues should be established before
the hairpin function works. In single port hairpin mode, the queues of
each pair belong to the same device. It is easy to get the hardware
and software information of each queue and configure the hairpin
connection with such information. In two ports hairpin mode, it is
difficult or inappropriate to access one queue's information from
another device.

Since hairpin is configured per queue pair, three new APIs are
introduced. They are internal and intended for PMD use only.

The peer update API helps to pass one queue's information to the
peer queue and get the peer's information back for the next step.
The peer bind API configures the current queue with the peer's
information. For each hairpin queue pair, this API may need to be
called twice to configure the Tx, Rx queues separately.
The peer unbind API resets the current queue configuration and state
to disconnect it from the peer queue. Also, it may need to be called
twice to disconnect Tx, Rx queues from each other.

Some parameters of the above APIs might not be mandatory, depending
on the PMD implementation.

The structure of `rte_hairpin_peer_info` is only a declaration and
the actual members will be defined in each PMD when being used.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Bing Zhao [Thu, 15 Oct 2020 13:08:53 +0000 (21:08 +0800)]
ethdev: add function to get hairpin peer ports list

After hairpin queues are configured, in general, the application will
maintain the ports topology and even the queues configuration for
the hairpin. But sometimes it will not.

If there is no hot-plug, it is easy to bind and unbind hairpin among
all the ports. The application can just connect or disconnect the
hairpin egress ports to/from all the probed ingress ports. Then all
the connections could be handled properly.

But with hot-plug / hot-unplug, one port could be probed and removed
dynamically. With two ports hairpin, all the connections from and to
this port should be handled after start(bind) or before stop(unbind).
It is necessary to know the hairpin topology with this port.

This function returns the list of peer ports along with the actual
number of peer ports after configuration. Either the peer Rx or the
peer Tx ports are retrieved with a single function call.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Bing Zhao [Thu, 15 Oct 2020 13:08:52 +0000 (21:08 +0800)]
ethdev: add new attributes to hairpin config

To support two ports hairpin mode and keep the backward compatibility
for the application, two new attribute members of the hairpin queue
configuration structure will be added.

`tx_explicit` indicates whether the application itself inserts the Tx
part flow rules. If not set, the PMD inserts the rules implicitly.
`manual_bind` indicates whether the application binds the hairpin Tx
queue and the peer Rx queue itself. If not set, they are bound
automatically during the device start stage.

Different Tx and Rx queue pairs could have different values, but it
is highly recommended that all paired queues between one egress and
its peer ingress ports have the same values, in order not to bring
any chaos to the system. The actual support of these attribute
parameters will be checked and decided by the PMD drivers.

In the single port hairpin, if both are zero without any setting, the
behavior will remain the same as before. It means that no bind API
needs to be called and no Tx flow rules need to be inserted manually
by the application.
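
The interplay of the two attributes can be sketched as follows. This is a minimal model, not the ethdev definition: `struct hairpin_attrs` and both helper functions are hypothetical names introduced only for illustration, and only the two one-bit attributes described above are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two new hairpin attributes. */
struct hairpin_attrs {
	uint32_t tx_explicit:1; /* 1: application inserts the Tx flow rules */
	uint32_t manual_bind:1; /* 1: application binds the queues itself */
};

/* True when the application has to call the bind API after device start. */
static bool app_must_bind(const struct hairpin_attrs *a)
{
	assert(a != NULL);
	return a->manual_bind == 1;
}

/* True when the application has to insert the Tx flow rules itself. */
static bool app_must_insert_tx_rules(const struct hairpin_attrs *a)
{
	assert(a != NULL);
	return a->tx_explicit == 1;
}
```

A zero-initialized structure yields `false` from both helpers, which corresponds to the backward-compatible single port behavior.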

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Bing Zhao [Thu, 15 Oct 2020 13:08:51 +0000 (21:08 +0800)]
ethdev: add hairpin bind and unbind API

In single port hairpin mode, all the hairpin Tx and Rx queues belong
to the same device. After the queues are set up properly, there is
no other dependency between the Tx queue and its Rx peer queue. The
binding process that connects the Tx and Rx queues together at the
hardware level will be done automatically during the device start
procedure. Everything required is configured and initialized already
for the binding process.

But in two ports hairpin mode, there will be some cross-dependences
between two different ports. Usually, the ports will be initialized
serially by the main thread but not in parallel. The earlier port
will not be able to enable the bind if the following peer port is
not yet configured with HW resources. What's more, if one port is
detached / attached dynamically, it would introduce more trouble
for the hairpin binding.

To overcome these, new APIs for binding and unbinding are added.
During startup, only the hairpin Tx and Rx peer queues will be set
up. Nothing will be done when starting the device if the queues are
without the auto-bind attribute. Only after the required pair of ports
has started can the `rte_eth_hairpin_bind()` API be called to bind
all Tx queues of the egress port to the Rx queues of the peer port.
Then the connection between the egress and ingress ports pair will
be established.
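
The ordering constraint described above can be sketched with a small model. The types and functions below (`struct port_model`, `model_hairpin_bind()`, `model_hairpin_unbind()`) are hypothetical and do not touch the ethdev API; they only illustrate that a manual bind is expected to succeed once both ports of the pair are started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified port model; not the ethdev structures. */
struct port_model {
	bool started;
	bool hairpin_bound; /* Tx side bound to the peer's Rx queues */
};

/* Returns 0 on success, -1 if either port of the pair is not started. */
static int model_hairpin_bind(struct port_model *tx, const struct port_model *rx)
{
	assert(tx != NULL && rx != NULL);
	if (!tx->started || !rx->started)
		return -1;
	tx->hairpin_bound = true;
	return 0;
}

/* Disconnect the egress port from its peer. */
static void model_hairpin_unbind(struct port_model *tx)
{
	assert(tx != NULL);
	tx->hairpin_bound = false;
}
```

In this model, the port that is initialized first simply fails the bind until its peer has been started, which is why the bind step is deferred to the application.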

The `rte_eth_hairpin_unbind()` API could be used to disconnect the
egress and the peer ingress ports. This should only be called before
the device is closed if needed. When doing the clean up, all the
egress and ingress pairs related to a single port should be taken
into consideration, especially in the hot unplug case.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Marvin Liu [Thu, 15 Oct 2020 05:46:07 +0000 (13:46 +0800)]
net/virtio: fix indirect desc length

When transmitting indirect descriptors, the first desc stores net_hdr
and the following descs are mapped to mbuf segments. The total desc
number is seg_num plus one. The variable `needed` holds the number of
descs used in the packed ring, which is always two for an indirect
desc. Now use the mbuf segment count to calculate the correct desc
length.
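
The length calculation described above can be sketched as below. This is an illustration under assumptions: `struct packed_desc` mirrors the 16-byte packed-ring descriptor layout, and `indirect_table_len()` is a hypothetical helper, not the driver's function.

```c
#include <assert.h>
#include <stdint.h>

/* 16-byte packed-ring descriptor layout (field names are illustrative). */
struct packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/*
 * Byte length of the indirect table for an mbuf chain with nb_segs
 * segments: one descriptor for the net header plus one per segment.
 */
static uint32_t indirect_table_len(uint16_t nb_segs)
{
	assert(nb_segs > 0);
	return ((uint32_t)nb_segs + 1) * (uint32_t)sizeof(struct packed_desc);
}
```

Using the count of ring slots consumed (always two in the indirect case) instead of `nb_segs + 1` would describe a table that is too short for any chain with more than one segment.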

Fixes: b473061b0e1d ("net/virtio: fix indirect descriptors in packed datapaths")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Patrick Fu [Tue, 13 Oct 2020 01:45:46 +0000 (09:45 +0800)]
vhost: fix async unregister deadlock

When async unregister function is invoked in certain vhost event
callbacks (e.g. vring state change), deadlock may occur due to
recursive spinlock acquire. This patch uses trylock() primitive in
the unregister API to avoid deadlock.
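
The trylock approach can be sketched with C11 atomics. Everything below is a hypothetical model (`struct vq_model`, `vq_trylock()`, `async_unregister()`), not the vhost code: it only shows why a non-blocking acquire turns a self-deadlock into an error return.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue protected by a non-recursive spinlock. */
struct vq_model {
	atomic_flag lock;
	bool async_registered;
};

/* Try to take the lock once; true if it was acquired. */
static bool vq_trylock(struct vq_model *vq)
{
	return !atomic_flag_test_and_set_explicit(&vq->lock,
						  memory_order_acquire);
}

static void vq_unlock(struct vq_model *vq)
{
	atomic_flag_clear_explicit(&vq->lock, memory_order_release);
}

/* Returns 0 on success, -1 if the lock is already held. */
static int async_unregister(struct vq_model *vq)
{
	assert(vq != NULL);
	if (!vq_trylock(vq))
		return -1; /* a blocking lock would spin forever here */
	vq->async_registered = false;
	vq_unlock(vq);
	return 0;
}
```

When the caller already holds the lock, for example inside an event callback, the function returns an error instead of spinning on a lock that can never be released.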

Fixes: 78639d54563a ("vhost: introduce async enqueue registration API")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Patrick Fu [Tue, 13 Oct 2020 01:45:45 +0000 (09:45 +0800)]
vhost: fix async vector buffer overrun

Add a check on the async vector buffer usage to prevent buffer overrun.
If the unused vector buffer is not sufficient to prepare for next
packet's iov creation, an async transfer will be triggered immediately
to free the vector buffer.

Fixes: 78639d54563a ("vhost: introduce async enqueue registration API")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Patrick Fu [Tue, 13 Oct 2020 01:45:44 +0000 (09:45 +0800)]
vhost: allocate async memory dynamically

Allocate async internal memory buffer by rte_malloc(), replacing array
declaration inside vq structure. Dynamic allocation can help to save
memory footprint when async path is not registered.

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Patrick Fu [Tue, 13 Oct 2020 01:45:43 +0000 (09:45 +0800)]
vhost: simplify async copy completion

The current async ops allow the check_completed_copies() callback to
return an arbitrary number of async iov segments finished by backend
async devices. This design makes it complex for vhost to handle a
broken transfer of a single packet (i.e. a transfer that completes in
the middle of an async descriptor) and prevents application callbacks
from leveraging hardware capability to offload the work. Thus, this
patch requires the check_completed_copies() callback to return the
number of async memory descriptors, which is aligned with the async
transfer data ops callbacks. The vhost async data path is revised to
work with the new ops definition, which provides clean and simplified
processing.

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Kalesh AP [Mon, 12 Oct 2020 15:44:59 +0000 (21:14 +0530)]
net/bnxt: fix UDP tunnel port removal

The HWRM supports only one global destination port for a tunnel type.

When port is stopped, driver deletes the UDP tunnel port configured
in the HW, but it does not update the counter which causes the
tunnel port addition to fail after port is started again.

Fixed to update the counter when tunnel port is deleted.

Fixes: 10d074b2022d ("net/bnxt: support tunneling")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Chengwen Feng [Wed, 14 Oct 2020 10:01:14 +0000 (18:01 +0800)]
net/hns3: support SVE Tx

This patch adds SVE vector instructions to optimize Tx burst process.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Wei Hu (Xavier) [Wed, 14 Oct 2020 10:01:13 +0000 (18:01 +0800)]
net/hns3: support SVE Rx

This patch adds SVE vector instructions to optimize Rx burst process.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
John Daley [Wed, 14 Oct 2020 20:25:22 +0000 (13:25 -0700)]
net/enic: check in error path

Coverity issue: 363046
Fixes: bb66d562aefc ("net/enic: share flow actions with same signature")

Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Dekel Peled [Wed, 14 Oct 2020 16:35:51 +0000 (19:35 +0300)]
app/testpmd: support IPv6 fragment extension item

rte_flow update, following RFC [1], added to ethdev the rte_flow item
ipv6_frag_ext.
This patch updates testpmd CLI to support the new item and its fields.

To match on fragmented IPv6 packets, this item is added to pattern:
... ipv6 / ipv6_frag_ext ...

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Dekel Peled [Wed, 14 Oct 2020 16:35:50 +0000 (19:35 +0300)]
app/testpmd: support IPv6 fragments

rte_flow update, following RFC [1], introduced has_frag_ext field for
IPv6 header item, used to indicate match on fragmented/non-fragmented
packets.
This patch updates testpmd CLI to support the new field.

To match on non-fragmented IPv6 packets, need to use pattern:
... ipv6 has_frag_ext spec 0 has_frag_ext mask 1 ...
To match on fragmented IPv6 packets, need to use pattern:
... ipv6 has_frag_ext spec 1 has_frag_ext mask 1 ...
To match on any IPv6 packets, the has_frag_ext field should
not be specified for match.
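
The three patterns follow the usual spec/mask matching rule, which can be sketched as below. The helpers are hypothetical and model the rule only; they are not testpmd or PMD code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Generic spec/mask rule: only the bits selected by mask are compared. */
static bool field_matches(uint32_t value, uint32_t spec, uint32_t mask)
{
	return (value & mask) == (spec & mask);
}

/* has_frag_ext is a one-bit attribute: 1 if a fragment header exists. */
static bool match_non_fragmented(uint32_t has_frag_ext)
{
	return field_matches(has_frag_ext, 0, 1);
}

static bool match_fragmented(uint32_t has_frag_ext)
{
	return field_matches(has_frag_ext, 1, 1);
}

/* Leaving the field unspecified behaves like a zero mask. */
static bool match_any(uint32_t has_frag_ext)
{
	return field_matches(has_frag_ext, 0, 0);
}
```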

[1] https://mails.dpdk.org/archives/dev/2020-August/177257.html

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Dekel Peled [Wed, 14 Oct 2020 16:35:49 +0000 (19:35 +0300)]
app/testpmd: support IPv4 fragments

This patch updates testpmd CLI to support fragment_offset field of
IPv4 header item.

To match on non-fragmented IPv4 packets, need to use pattern:
... ipv4 fragment_offset spec 0 fragment_offset mask 0x3fff ...
To match on fragmented IPv4 packets, need to use pattern:
... ipv4 fragment_offset spec 1 fragment_offset last 0x3fff
fragment_offset mask 0x3fff ...
(Use the full available range 1 to 0x3fff to include all possible
values.)
To match on any IPv4 packets, fragmented and non-fragmented,
the fragment_offset field should not be specified for match.
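
The spec/last/mask range used for the fragmented case can be sketched as follows. The helpers are hypothetical and assume the 16-bit field is given in host byte order, with 0x4000 as the DF flag, 0x2000 as the MF flag and the low 13 bits as the offset; the mask 0x3fff therefore covers MF plus the offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_MASK 0x3fffu /* MF flag plus 13-bit fragment offset */

/* spec/last/mask range rule: the masked value must lie in [spec, last]. */
static bool range_matches(uint16_t value, uint16_t spec, uint16_t last,
			  uint16_t mask)
{
	uint16_t v = value & mask;

	return v >= (spec & mask) && v <= (last & mask);
}

/* Fragmented: MF set or a non-zero offset, i.e. range 1..0x3fff. */
static bool is_fragment(uint16_t fragment_offset)
{
	return range_matches(fragment_offset, 1, FRAG_MASK, FRAG_MASK);
}

/* Non-fragmented: MF clear and offset zero; the DF bit is ignored. */
static bool is_not_fragment(uint16_t fragment_offset)
{
	return range_matches(fragment_offset, 0, 0, FRAG_MASK);
}
```

A packet with only the DF bit set (0x4000) is classified as non-fragmented because that bit lies outside the mask.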

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Dekel Peled [Wed, 14 Oct 2020 16:35:48 +0000 (19:35 +0300)]
ethdev: add IPv6 fragment extension header item

Applications handling fragmented IPv6 packets need to match on IPv6
fragment extension header, in order to identify the fragments order
and location in the packet.
This patch introduces the IPv6 fragment extension header item,
proposed in [1].

Relevant definitions are moved from lib/librte_ip_frag/rte_ip_frag.h
to lib/librte_net/rte_ip.h, as they are needed for IPv6 header handling.
struct ipv6_extension_fragment renamed to rte_ipv6_fragment_ext to
adapt it to the common naming convention.

Default mask is not defined, since all fields are optional.

[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Dekel Peled [Wed, 14 Oct 2020 16:35:47 +0000 (19:35 +0300)]
ethdev: add extensions attributes to IPv6 item

Using the current implementation of DPDK, an application cannot match on
IPv6 packets, based on the existing extension headers, in a simple way.

Field 'Next Header' in IPv6 header indicates type of the first extension
header only. Following extension headers can't be identified by
inspecting the IPv6 header.
As a result, the existence or absence of specific extension headers
can't be used for packet matching.

For example, fragmented IPv6 packets contain a dedicated extension header
(which is implemented in a later patch of this series).
Non-fragmented packets don't contain the fragment extension header.
For an application to match on non-fragmented IPv6 packets, the current
implementation doesn't provide a suitable solution.
Matching on the Next Header field is not sufficient, since additional
extension headers might be present in the same packet.
To match on fragmented IPv6 packets, the same difficulty exists.

This patch implements the update as detailed in RFC [1].
A set of additional values will be added to IPv6 header struct.
These values will indicate the existence of every defined extension
header type, providing simple means for identification of existing
extensions in the packet header.
Continuing the above example, fragmented packets can be identified using
the specific value indicating existence of fragment extension header.
To match on non-fragmented IPv6 packets, need to use has_frag_ext 0.
To match on fragmented IPv6 packets, need to use has_frag_ext 1.
To match on any IPv6 packets, the has_frag_ext field should
not be specified for match.
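
To illustrate why the Next Header field alone is not sufficient, the sketch below walks the extension header chain in software. It is a simplified, hypothetical parser (`ipv6_has_frag_ext()` is not a DPDK function) that only understands hop-by-hop, routing and destination options headers before a possible fragment header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { NH_HOPOPTS = 0, NH_ROUTING = 43, NH_FRAGMENT = 44, NH_DSTOPTS = 60 };

/*
 * ext points just past the 40-byte fixed header; first_nh is the Next
 * Header value of the fixed header. Returns true if a fragment extension
 * header is present anywhere in the chain.
 */
static bool ipv6_has_frag_ext(uint8_t first_nh, const uint8_t *ext, size_t len)
{
	uint8_t nh = first_nh;
	size_t off = 0;

	for (;;) {
		if (nh == NH_FRAGMENT)
			return true;
		if (nh != NH_HOPOPTS && nh != NH_ROUTING && nh != NH_DSTOPTS)
			return false; /* upper-layer protocol or unhandled */
		if (off + 2 > len)
			return false; /* truncated chain */
		/* byte 0: next header, byte 1: length in 8-byte units - 1 */
		size_t hdr_len = ((size_t)ext[off + 1] + 1) * 8;
		if (hdr_len > len - off)
			return false; /* header runs past the buffer */
		nh = ext[off];
		off += hdr_len;
	}
}
```

In the second half of the test-style usage one would pass a chain that starts with a hop-by-hop header: the fixed header then carries Next Header 0, yet the packet is still a fragment, which is exactly the case a dedicated `has_frag_ext` attribute is meant to cover.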

[1] https://mails.dpdk.org/archives/dev/2020-August/177257.html

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Honnappa Nagarahalli [Tue, 13 Oct 2020 16:25:37 +0000 (11:25 -0500)]
ethdev: fix memory ordering for callback functions

Callback functions are registered on the control plane. They
are accessed from the data plane. Hence, correct memory orderings
should be used to avoid race conditions.
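
The required ordering can be sketched with C11 atomics. This is a simplified model under assumptions, not the ethdev implementation: the names (`struct rx_callback`, `cb_register()`, `cb_run()`, `count_calls()`) are hypothetical, and registration is assumed to be serialized on the control plane.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*rx_cb_fn)(int nb_pkts, void *param);

struct rx_callback {
	rx_cb_fn fn;
	void *param;
	struct rx_callback *next;
};

/* Head of the callback list, read by the data plane. */
static _Atomic(struct rx_callback *) cb_head;

/* Control plane: fill in the node first, then publish it with release. */
static void cb_register(struct rx_callback *cb, rx_cb_fn fn, void *param)
{
	cb->fn = fn;
	cb->param = param;
	cb->next = atomic_load_explicit(&cb_head, memory_order_relaxed);
	atomic_store_explicit(&cb_head, cb, memory_order_release);
}

/* Data plane: the acquire load pairs with the release store above. */
static int cb_run(int nb_pkts)
{
	struct rx_callback *cb =
		atomic_load_explicit(&cb_head, memory_order_acquire);

	for (; cb != NULL; cb = cb->next)
		nb_pkts = cb->fn(nb_pkts, cb->param);
	return nb_pkts;
}

/* Example callback: counts invocations and passes packets through. */
static int count_calls(int nb_pkts, void *param)
{
	(*(int *)param)++;
	return nb_pkts;
}
```

The release store guarantees that a data plane thread which observes the new head also observes the fully initialized `fn`, `param` and `next` fields.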

Fixes: 4dc294158cac ("ethdev: support optional Rx and Tx callbacks")
Fixes: c8231c63ddcb ("ethdev: insert Rx callback as head of list")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Phil Yang [Tue, 13 Oct 2020 16:25:36 +0000 (11:25 -0500)]
ethdev: replace full barrier with relaxed barrier

While registering the callback functions, the full write barrier
can be replaced with a one-way write barrier.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Andrey Vesnovaty [Wed, 14 Oct 2020 11:40:15 +0000 (14:40 +0300)]
app/testpmd: support shared flow action

This patch adds shared action support to testpmd CLI.

All shared actions created via the testpmd CLI are assigned an ID for
further reference in other CLI commands. A shared action ID supplied as
a CLI argument or assigned by testpmd is similar to a flow ID and is
limited to the scope of the testpmd CLI.

Create shared action syntax:
flow shared_action {port_id} create [action_id {shared_action_id}]
[ingress] [egress] action {action} / end

Create shared action examples:
flow shared_action 0 create action_id 100 \
ingress action rss queues 1 2 end / end
This creates shared rss action with id 100 on port 0.

flow shared_action 0 create \
ingress action rss queues 0 1 end / end
This creates shared rss action with id assigned by testpmd
on port 0.

Update shared action syntax:
flow shared_action {port_id} update {shared_action_id}
action {action} / end

Update shared action example:
flow shared_action 0 update 100 \
action rss queues 0 3 end / end
This updates shared rss action having id 100 on port 0
with rss to queues 0 3 (in create example rss queues were
1 & 2).

Destroy shared action syntax:
flow shared_action {port_id} destroy action_id {shared_action_id} [...]

Destroy shared action example:
flow shared_action 0 destroy action_id 100 action_id 101
This destroys shared actions having id 100 & 101

Query shared action syntax:
flow shared_action {port_id} query {shared_action_id}

Query shared action example:
flow shared_action 0 query 100
This queries the shared action having id 100

Use shared action as flow action syntax:
flow create {port_id} ... / end actions [action / [...]]
shared {action_id} / [action / [...]] end

Use shared action as flow action example:
flow create 0 ingress pattern ... / end \
actions shared 100 / end
This creates flow rule where rss action is shared rss action
having id 100.

All shared action CLIs report status of the command.
Shared action query CLI output depends on action type.

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Andrey Vesnovaty [Wed, 14 Oct 2020 11:40:14 +0000 (14:40 +0300)]
ethdev: add shared actions to flow API

Introduce an extension of the flow action API enabling sharing of a
single rte_flow_action in multiple flows. The API is intended for PMDs
where multiple HW offloaded flows can reuse the same HW essence/object
representing a flow action, and modification of such an essence/object
affects all the rules using it.

Motivation and example
===
Adding or removing one or more queues to RSS used by multiple flow rules
imposes a per-rule toll with the current DPDK flow API; for each flow
using a cloned RSS action, the scenario requires:
- call `rte_flow_destroy()`
- call `rte_flow_create()` with modified RSS action

API for sharing action and its in-place update benefits:
- reduce the overhead of multiple RSS flow rules reconfiguration
- optimize resource utilization by sharing action across multiple
  flows

Change description
===

Shared action
===
In order to represent a flow action shared by multiple flows, a new
action type RTE_FLOW_ACTION_TYPE_SHARED is introduced (see `enum
rte_flow_action_type`).
The introduced API decouples the action from any specific flow and
enables sharing of a single action by its handle across multiple flows.

Shared action create/use/destroy
===
A shared action may be used by some flow rules or by none at any given
moment, i.e. a shared action resides outside of the context of any flow.
A shared action represents HW resources/objects used for action
offloading implementation.
API for shared action create (see `rte_flow_shared_action_create()`):
- should allocate HW resources and make related initializations required
  for shared action implementation.
- make necessary preparations to maintain shared access to
  the action resources, configuration and state.
API for shared action destroy (see `rte_flow_shared_action_destroy()`)
should release HW resources and make related cleanups required for shared
action implementation.

In order to share a flow action, reuse the handle of type
`struct rte_flow_shared_action` returned by
rte_flow_shared_action_create() as the `conf` field of
`struct rte_flow_action` (see "example" section).

If a shared action is not used by any flow rule, all resources allocated
by the shared action can be released by rte_flow_shared_action_destroy()
(see "example" section). The shared action handle passed as an argument
to the destroy API should not be used any further, i.e. the result of
such usage is undefined.

Shared action re-configuration
===
Shared action behavior defined by its configuration can be updated via
rte_flow_shared_action_update() (see "example" section). The shared
action update operation modifies HW related resources/objects allocated
on the action creation. The number of operations performed by the update
operation should not depend on the number of flows sharing the related
action. On return of shared action update API action behavior should be
according to updated configuration for all flows sharing the action.
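
The property that an update does not depend on the number of flows can be sketched with a small model in which flows hold a handle instead of a private copy. All names below are hypothetical and are not the rte_flow API; the refusal to destroy an action that is still referenced is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one RSS object referenced by many flows. */
struct shared_rss {
	uint16_t queues[4];
	size_t nb_queues;
	unsigned int refcnt;
};

struct flow_model {
	struct shared_rss *rss; /* handle, not a private copy */
};

static void flow_attach(struct flow_model *f, struct shared_rss *rss)
{
	f->rss = rss;
	rss->refcnt++;
}

static void flow_detach(struct flow_model *f)
{
	f->rss->refcnt--;
	f->rss = NULL;
}

/* One write updates every flow holding the handle. */
static int shared_rss_update(struct shared_rss *rss, const uint16_t *q,
			     size_t n)
{
	if (n == 0 || n > 4)
		return -1;
	for (size_t i = 0; i < n; i++)
		rss->queues[i] = q[i];
	rss->nb_queues = n;
	return 0;
}

/* Assumption of this sketch: destroy is refused while flows use it. */
static int shared_rss_destroy(const struct shared_rss *rss)
{
	return rss->refcnt == 0 ? 0 : -1;
}
```

The cost of `shared_rss_update()` depends only on the number of queues, whereas the destroy-and-create approach scales with the number of flow rules.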

Shared action query
===
Provide a separate API to query shared action state (see
rte_flow_shared_action_query()). Taking a counter as an example: query
returns a value aggregating all counter increments across all flow rules
sharing the counter. This API doesn't query shared action configuration
since it is controlled by the rte_flow_shared_action_create() and
rte_flow_shared_action_update() APIs and is not supposed to change by
other means.

example
===

struct rte_flow_error error;
struct rte_flow_action actions[2];
struct rte_flow_shared_action_conf conf;
struct rte_flow_action action;
/* skipped: initialize conf and action */
struct rte_flow_shared_action *handle =
	rte_flow_shared_action_create(port_id, &conf, &action, &error);
actions[0].type = RTE_FLOW_ACTION_TYPE_SHARED;
actions[0].conf = handle;
actions[1].type = RTE_FLOW_ACTION_TYPE_END;
/* skipped: init attr0 & pattern0 args */
struct rte_flow *flow0 = rte_flow_create(port_id, &attr0, pattern0,
					 actions, &error);
/* create more rules reusing shared action */
struct rte_flow *flow1 = rte_flow_create(port_id, &attr1, pattern1,
					 actions, &error);
/* skipped: for flows 2 till N */
struct rte_flow *flowN = rte_flow_create(port_id, &attrN, patternN,
					 actions, &error);
/* update shared action */
struct rte_flow_action updated_action;
/*
 * skipped: initialize updated_action according to desired action
 * configuration change
 */
rte_flow_shared_action_update(port_id, handle, &updated_action, &error);
/*
 * from now on all flows 0 till N will act according to configuration of
 * updated_action
 */
/* skipped: destroy all flows 0 till N */
rte_flow_shared_action_destroy(port_id, handle, &error);

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Jiawei Wang [Tue, 13 Oct 2020 14:11:51 +0000 (17:11 +0300)]
doc: add sample flow limitation in mlx5 guide

Add a description of the sample flow limitation.
Sample flow is supported in the NIC-Rx and E-Switch domains.
Because metadata register c0 is deleted during the loopback,
the sampled packet can only be forwarded to the E-Switch manager
port, and no additional actions are supported in the sample flow.

Add the minimum offload versions for the new sampling feature.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:50 +0000 (17:11 +0300)]
net/mlx5: update translate function for mirroring

Translate the attributes of the sample action, which include the sample
ratio and the sub-actions list.
The PMD checks the number of destination actions in the current flow;
if multiple destination actions are found, it creates a new destination
array rdma action that groups the actions for each destination.
Currently only port or queue destination actions are supported, and
only an encap action can be attached to a port destination.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:49 +0000 (17:11 +0300)]
net/mlx5: update flow mirroring validation

A mirroring flow uses a sample action with a ratio of 1, and it does
not support a jump action in the same flow.

For mirroring, the sample action must have destination actions such as
port or queue, and, unlike a sampling flow, it does not need the split
function.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:48 +0000 (17:11 +0300)]
common/mlx5: add glue function for mirroring

The new DR destination array action is supported since the
rdma-core version v32.

The destination array action is used to group DR actions into a single
action. It can be used for mirroring a packet and forwarding it to every
destination (port or queue) in the array.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:47 +0000 (17:11 +0300)]
net/mlx5: update translate function for sample action

Translate the attributes of the sample action, which include the sample
ratio and the sub-actions list, then create the sample DR action.

The metadata register value will be lost in the default path after the
sampler in FDB due to a CX5 HW limitation.

Since the source vport is also stored in metadata register c0, the MLX5
PMD passes the source vport to rdma-core, and rdma-core restores the
regc0 value after the sampler.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:46 +0000 (17:11 +0300)]
net/mlx5: split sample flow into two sub-flows

A flow with a sample action will be split into two sub flows:
the prefix sub flow with all the actions preceding the sample
action and the sample action itself, and the suffix sub flow with
the actions following the sample action.

The original items remain in the prefix sub flow. An implicit tag
action with a unique id is added to set the metadata register,
and the suffix sub flow uses the tag item to match on that unique id.

The flow is split as below:

Original flow: items / actions pre / sample / actions sfx ->
    prefix sub flow -
            items / actions pre / set_tag action / sample
    suffix sub flow -
            tag_item / actions sfx
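
The split shown in the diagram above can be sketched on a plain action list. This is a hypothetical model (`enum act` and `split_at_sample()` are illustrative names, not mlx5 code) that only handles actions; the item side, where the suffix matches on the tag, is left out.

```c
#include <assert.h>
#include <stddef.h>

enum act { ACT_END, ACT_COUNT, ACT_SAMPLE, ACT_SET_TAG, ACT_QUEUE, ACT_MARK };

/*
 * Split an END-terminated action list at ACT_SAMPLE.
 *   prefix: actions before the sample, SET_TAG, SAMPLE, END
 *   suffix: actions after the sample, END
 * Each output needs room for n + 1 entries, where n is the input length
 * including END. Returns 0 on success, -1 if there is no sample action.
 */
static int split_at_sample(const enum act *in, enum act *prefix,
			   enum act *suffix)
{
	size_t i = 0, p = 0, s = 0;

	while (in[i] != ACT_END && in[i] != ACT_SAMPLE)
		prefix[p++] = in[i++];
	if (in[i] != ACT_SAMPLE)
		return -1;
	prefix[p++] = ACT_SET_TAG; /* implicit tag carrying the unique id */
	prefix[p++] = ACT_SAMPLE;
	prefix[p] = ACT_END;
	for (i++; in[i] != ACT_END; i++)
		suffix[s++] = in[i];
	suffix[s] = ACT_END;
	return 0;
}
```

For an input of `count / sample / mark / queue`, the prefix becomes `count / set_tag / sample` and the suffix becomes `mark / queue`.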

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:45 +0000 (17:11 +0300)]
net/mlx5: validate sample action

Add sample action validate function.

Sample Flow is supported in NIC-RX and FDB domains. For the NIC-RX
the Sample Flow action list must include the destination queue action.

Only the NIC-RX domain supports the optional actions list. FDB doesn't
support any optional actions; the sampled packets are always forwarded
to the E-Switch manager port.
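
The validation rules stated above can be sketched as a small decision function. The enum, parameters and return convention are hypothetical and greatly simplified compared with a real PMD validation routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum flow_domain { DOMAIN_NIC_RX, DOMAIN_FDB, DOMAIN_NIC_TX };

/*
 * nb_sub_actions: number of actions in the sample action list.
 * has_queue: whether that list contains a destination queue action.
 * Returns 0 if the sample action is acceptable, -1 otherwise.
 */
static int validate_sample(enum flow_domain domain, size_t nb_sub_actions,
			   bool has_queue)
{
	switch (domain) {
	case DOMAIN_NIC_RX:
		/* the action list must include the destination queue */
		return (nb_sub_actions > 0 && has_queue) ? 0 : -1;
	case DOMAIN_FDB:
		/* no optional actions: always goes to the manager port */
		return nb_sub_actions == 0 ? 0 : -1;
	default:
		return -1; /* sample flow is not supported in other domains */
	}
}
```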

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:44 +0000 (17:11 +0300)]
common/mlx5: query sampler object capability via DevX

Update function mlx5_devx_cmd_query_hca_attr() to add the NIC Flow
Table attributes query, then get the log_max_flow_sampler_num from
flow table properties.

Add the related struct definitions in mlx5_prm.h.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jiawei Wang [Tue, 13 Oct 2020 14:11:43 +0000 (17:11 +0300)]
common/mlx5: add glue for sample action

The new DR sample action is supported since OFED version
5.1.2 or rdma-core version v32.

MLX5 PMD adds the rdma-core command in glue to create this action.

Sample action is used for creating the sample object to implement
the sampling/mirroring function.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Rasesh Mody [Mon, 12 Oct 2020 22:48:14 +0000 (15:48 -0700)]
net/bnx2x: add QLogic vendor id for BCM57840

Add QLogic vendor id support for BCM57840 device ids.

Fixes: 9fb557035d90 ("bnx2x: enable PMD build")
Cc: stable@dpdk.org
Reported-by: Souvik Dey <sodey@rbbn.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
Sarosh Arif [Wed, 14 Oct 2020 12:23:41 +0000 (17:23 +0500)]
doc: fix typo in pcap guide

Changed "net_pcap1;" to "net_pcap1," in order to make the command
correct.

Fixes: 53bf48403409 ("net/pcap: capture only ingress packets from Rx iface")
Cc: stable@dpdk.org
Signed-off-by: Sarosh Arif <sarosh.arif@emumba.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Simei Su [Mon, 28 Sep 2020 02:31:56 +0000 (10:31 +0800)]
net/ice: support drop action for DCF switch

This patch adds drop action in DCF switch filter.

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:53 +0000 (14:45 +0100)]
doc: advertise Alveo SN1000 SmartNICs family support

The Alveo SN1000 family consists of SmartNICs based on the EF100
architecture.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:52 +0000 (14:45 +0100)]
net/sfc: support Rx interrupts for EF100

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Igor Romanov [Tue, 13 Oct 2020 13:45:51 +0000 (14:45 +0100)]
net/sfc: forward function control window offset to datapath

Store function control window offset to correctly set the offset
of prime EvQ in EF100.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:50 +0000 (14:45 +0100)]
net/sfc: support user mark and flag Rx for EF100

Flow rules may be used to mark packets. Support delivery of mark/flag
values to the user in mbuf fields.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support per-queue Rx RSS hash offload for EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:49 +0000 (14:45 +0100)]
net/sfc: support per-queue Rx RSS hash offload for EF100

Riverhead allows choosing the Rx prefix (which contains the RSS hash
value and valid flag) per queue.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support per-queue Rx prefix for EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:48 +0000 (14:45 +0100)]
net/sfc: support per-queue Rx prefix for EF100

Riverhead FW supports Rx prefix choice based on the fields required in
the Rx prefix. The feature is generalized in libefx to provide the Rx
prefix layout for other NICs and firmware variants. Now the driver can
get the prefix layout after Rx queue start and use the layout details to
check its expectations or simply use them at run time.

Rx prefix choice and query interface is defined in SF-119689-TC
EF100 host interface.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agonet/sfc: map Rx offload RSS hash to corresponding RxQ flag
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:47 +0000 (14:45 +0100)]
net/sfc: map Rx offload RSS hash to corresponding RxQ flag

If RSS hash offload is requested, Rx queue should be configured
to request RSS hash information delivery.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agocommon/sfc_efx/base: provide helper to check Rx prefix
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:46 +0000 (14:45 +0100)]
common/sfc_efx/base: provide helper to check Rx prefix

A new function checks whether the Rx prefix layout in use matches the
available Rx prefix layout. The length check is out of scope of the
function; the caller should ensure that the length is either checked,
or that a different length with everything required in place is handled
properly.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agocommon/sfc_efx/base: provide control to deliver RSS hash
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:45 +0000 (14:45 +0100)]
common/sfc_efx/base: provide control to deliver RSS hash

When an Rx queue is created, allow the driver to specify whether it
would like to get the RSS hash value calculated by the hardware.

Use the flag to choose Rx prefix on Riverhead.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agocommon/sfc_efx/base: simplify requesting Rx prefix fields
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:44 +0000 (14:45 +0100)]
common/sfc_efx/base: simplify requesting Rx prefix fields

Introduce an extra variable with required Rx prefix fields mask
to make it easier to request more fields.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agonet/sfc: support Rx checksum offload for EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:43 +0000 (14:45 +0100)]
net/sfc: support Rx checksum offload for EF100

Also support Rx packet type offload.

Checksumming is actually always enabled. Report it per-queue offload
to give applications maximum flexibility.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support Tx VLAN insertion offload for EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:42 +0000 (14:45 +0100)]
net/sfc: support Tx VLAN insertion offload for EF100

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support tunnel TSO for EF100 native Tx
Ivan Malov [Tue, 13 Oct 2020 13:45:41 +0000 (14:45 +0100)]
net/sfc: support tunnel TSO for EF100 native Tx

Handle VXLAN and Geneve TSO on EF100 native Tx datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support TSO for EF100 native datapath
Ivan Malov [Tue, 13 Oct 2020 13:45:40 +0000 (14:45 +0100)]
net/sfc: support TSO for EF100 native datapath

Riverhead boards support TSO version 3.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support tunnels for EF100 native Tx
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:39 +0000 (14:45 +0100)]
net/sfc: support tunnels for EF100 native Tx

Add support for outer IPv4/UDP and inner IPv4/UDP/TCP checksum offloads.
Use partial checksum offload for inner TCP/UDP offload.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: add header segments check for EF100 Tx
Ivan Malov [Tue, 13 Oct 2020 13:45:38 +0000 (14:45 +0100)]
net/sfc: add header segments check for EF100 Tx

EF100 native Tx datapath demands that packet header be contiguous
when partial checksum offloads are used since helper function is
used to calculate pseudo-header checksum (and the function requires
contiguous header).

Add an explicit check for this assumption and restructure the code
to avoid TSO header linearisation check since TSO header
linearisation is not done on EF100 native Tx datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support IPv4 header checksum offload for EF100 Tx
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:37 +0000 (14:45 +0100)]
net/sfc: support IPv4 header checksum offload for EF100 Tx

Use outer layer 3 full checksum offload which does not require any
assistance from driver.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support TCP and UDP checksum offloads for EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:36 +0000 (14:45 +0100)]
net/sfc: support TCP and UDP checksum offloads for EF100

Use outer layer 4 full checksum offload which does not require any
assistance from driver.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support multi-segment Tx for EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:35 +0000 (14:45 +0100)]
net/sfc: support multi-segment Tx for EF100

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: implement EF100 native Tx
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:34 +0000 (14:45 +0100)]
net/sfc: implement EF100 native Tx

No offloads are supported yet, including multi-segment (Tx gather).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: implement EF100 native Rx
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:33 +0000 (14:45 +0100)]
net/sfc: implement EF100 native Rx

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support datapath logs which may be compiled out
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:32 +0000 (14:45 +0100)]
net/sfc: support datapath logs which may be compiled out

Add a datapath log level which limits the logs included in the build,
since on the datapath it is too expensive to call into the rte_log()
function even if it does nothing.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
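
The compile-time log level described above can be sketched in plain C
as follows. This is only an illustration: DP_LOG_LEVEL, DP_LOG and
dp_log_demo() are hypothetical names, not the macros used by the sfc
driver, and fprintf() stands in for the real logging call.

```c
#include <stdio.h>

/* Hypothetical build-time threshold; a real build could set it with -D. */
#ifndef DP_LOG_LEVEL
#define DP_LOG_LEVEL 1
#endif

#define DP_LOG_ERR   1
#define DP_LOG_DEBUG 2

static int dp_log_calls; /* counts messages that were actually emitted */

/*
 * The level comparison involves only compile-time constants, so an
 * optimizing compiler can typically drop the whole branch, including
 * the call, for levels above the threshold.
 */
#define DP_LOG(level, ...)                        \
	do {                                      \
		if ((level) <= DP_LOG_LEVEL) {    \
			dp_log_calls++;           \
			fprintf(stderr, __VA_ARGS__); \
		}                                 \
	} while (0)

/* Emits one error and one debug message; returns how many got through. */
static int
dp_log_demo(void)
{
	DP_LOG(DP_LOG_ERR, "error: %s\n", "emitted");
	DP_LOG(DP_LOG_DEBUG, "debug: %s\n", "dropped at the default level");
	return dp_log_calls;
}
```

With the default threshold only the error message is emitted; the
debug statement costs nothing at run time once the branch is removed.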
3 years agonet/sfc: log DMA allocations addresses
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:31 +0000 (14:45 +0100)]
net/sfc: log DMA allocations addresses

The information about DMA allocations is very useful for debugging.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: implement libefx Tx descs complete event callbacks
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:30 +0000 (14:45 +0100)]
net/sfc: implement libefx Tx descs complete event callbacks

These callbacks are used when an event queue is polled via libefx.
libefx polling is used for the management event queue (where no Tx
completion events are expected) and for the datapath event queue at
flushing.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: implement libefx Rx packets event callbacks
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:29 +0000 (14:45 +0100)]
net/sfc: implement libefx Rx packets event callbacks

These callbacks are used when an event queue is polled via libefx.
libefx polling is used for the management event queue (where no Rx
events are expected) and for the datapath event queue at flushing
(when these events are typically ignored, since the queue is being
stopped).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: use BAR layout discovery to find control window
Igor Romanov [Tue, 13 Oct 2020 13:45:28 +0000 (14:45 +0100)]
net/sfc: use BAR layout discovery to find control window

Control window is required to talk to NIC.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: support EF100
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:27 +0000 (14:45 +0100)]
net/sfc: support EF100

Riverhead is the first NIC of the EF100 architecture.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: add capabilities for Rx/Tx support in libefx
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:26 +0000 (14:45 +0100)]
net/sfc: add capabilities for Rx/Tx support in libefx

libefx usage may be limited to the control path only: its datapath
implementation may not support the NIC family, or the PMD efx Rx/Tx
datapath implementations may not yet be ported to the updated libefx.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: log doorbell addresses useful for debugging
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:25 +0000 (14:45 +0100)]
net/sfc: log doorbell addresses useful for debugging

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet/sfc: check for maximum number of Rx scatter buffers
Igor Romanov [Tue, 13 Oct 2020 13:45:24 +0000 (14:45 +0100)]
net/sfc: check for maximum number of Rx scatter buffers

Update generic code to check that MTU and Rx buffer sizes
do not result in more Rx scatter segments than NIC can make.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
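
A minimal sketch of such a check, assuming the frame size and the
per-buffer data room are already known. The helper names are
hypothetical, and the sketch ignores details a real driver has to
account for (Rx prefix, headroom, alignment).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of Rx buffers needed to hold one frame
 * of frame_size bytes when each buffer provides buf_size data bytes.
 * The caller must pass a non-zero buf_size.
 */
static uint32_t
rx_scatter_segments(uint32_t frame_size, uint32_t buf_size)
{
	return (frame_size + buf_size - 1) / buf_size; /* round up */
}

/* True if the configuration stays within the NIC scatter limit. */
static bool
rx_scatter_config_ok(uint32_t frame_size, uint32_t buf_size,
		     uint32_t max_segs)
{
	if (buf_size == 0)
		return false;
	return rx_scatter_segments(frame_size, buf_size) <= max_segs;
}
```

The same predicate would be evaluated both when an Rx queue is
configured and when the MTU changes, since either can push the
segment count over the limit.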
3 years agocommon/sfc_efx/base: add max number of Rx scatter buffers
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:23 +0000 (14:45 +0100)]
common/sfc_efx/base: add max number of Rx scatter buffers

Riverhead QDMA limits the maximum number of Rx scatter buffers that
a packet may use. If the limit is violated, the datapath stops
working. FW should ensure that it is OK, but drivers need to know
the limit anyway to check parameters when Rx queues are configured
and the MTU is set.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agocommon/sfc_efx/base: fix PHY config failure on Riverhead
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:22 +0000 (14:45 +0100)]
common/sfc_efx/base: fix PHY config failure on Riverhead

Riverhead does not support LED control yet. It is perfectly
fine to ignore LED set failure because of no support if
configured LED mode is the default.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agocommon/sfc_efx/base: factor out MCDI wrapper to set LEDs
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:21 +0000 (14:45 +0100)]
common/sfc_efx/base: factor out MCDI wrapper to set LEDs

For consistency it is better to have separate MCDI wrappers.

Make efx_phy_led_mode_t visible even if EFSYS_OPT_PHY_LED_CONTROL
is disabled to be able to use it in the added wrapper arguments.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agocommon/sfc_efx/base: factor out wrapper to set PHY link
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:20 +0000 (14:45 +0100)]
common/sfc_efx/base: factor out wrapper to set PHY link

Make ef10_phy_reconfigure() simpler to read and less error-prone.
Avoid the confusing case where two MCDIs are called from one function.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agodoc: avoid references to removed config in sfc guide
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:19 +0000 (14:45 +0100)]
doc: avoid references to removed config in sfc guide

CONFIG_* variables were used by make-based build system which is
removed.

Fixes: 3cc6ecfdfe85 ("build: remove makefiles")

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agodoc: fix EF10 Rx mode name in sfc guide
Andrew Rybchenko [Tue, 13 Oct 2020 13:45:18 +0000 (14:45 +0100)]
doc: fix EF10 Rx mode name in sfc guide

Fixes: 390f9b8d82c9 ("net/sfc: support equal stride super-buffer Rx mode")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agonet: add function to calculate IPv4 header length
Michael Pfeiffer [Mon, 12 Oct 2020 14:55:46 +0000 (16:55 +0200)]
net: add function to calculate IPv4 header length

Add a function to calculate the length of an IPv4 header as suggested
on the mailing list [1]. Call where appropriate.

[1] https://mails.dpdk.org/archives/dev/2020-October/184471.html

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
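
For illustration, the calculation boils down to reading the IHL field
(the low four bits of the first header byte) and multiplying it by
four, because IHL counts 32-bit words. The sketch below is standalone;
the function and macro names are hypothetical and it takes the raw
version/IHL byte rather than a DPDK header structure.

```c
#include <stdint.h>

#define IPV4_IHL_MASK       0x0f /* low nibble of the first header byte */
#define IPV4_IHL_MULTIPLIER 4    /* IHL is expressed in 32-bit words */

/* Returns the IPv4 header length in bytes (20..60 for valid headers). */
static inline uint8_t
ipv4_hdr_len_bytes(uint8_t version_ihl)
{
	return (uint8_t)((version_ihl & IPV4_IHL_MASK) * IPV4_IHL_MULTIPLIER);
}
```

A header without options has version_ihl 0x45 and therefore a length
of 20 bytes.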
3 years agonet/ring: check internal arguments
Kevin Laatz [Tue, 13 Oct 2020 13:07:04 +0000 (14:07 +0100)]
net/ring: check internal arguments

Add a check for the return value of the sscanf call in
parse_internal_args(), returning an error if we don't get the expected
result.

Coverity issue: 362049
Fixes: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
Cc: stable@dpdk.org
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
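
The pattern of checking the sscanf() return value can be sketched as
below. parse_pointer_arg() is a hypothetical stand-in, not the
driver's parse_internal_args(), and the "%p" format is an assumption
made for the example.

```c
#include <stdio.h>

/*
 * Hypothetical parser: reads one pointer value from text.
 * Returns 0 on success and -1 if sscanf() did not convert exactly
 * one item (it returns 0 on a matching failure and EOF on empty input).
 */
static int
parse_pointer_arg(const char *value, void **out)
{
	void *p = NULL;

	if (value == NULL || out == NULL)
		return -1;
	if (sscanf(value, "%p", &p) != 1)
		return -1;
	*out = p;
	return 0;
}
```

Without the check, a failed conversion would leave the destination
untouched and the caller would go on to use an unset pointer.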
3 years agonet/af_xdp: forbid umem sharing for xsks with same context
Ciara Loftus [Tue, 13 Oct 2020 13:10:08 +0000 (13:10 +0000)]
net/af_xdp: forbid umem sharing for xsks with same context

AF_XDP PMDs that wish to share a UMEM must each have a unique context
(ctx), i.e. a netdev,qid tuple. For instance, the following will not
work since both PMDs' contexts are identical.

  --vdev net_af_xdp0,iface=ens786f1,start_queue=0,shared_umem=1
  --vdev net_af_xdp1,iface=ens786f1,start_queue=0,shared_umem=1

Supporting this scenario would require locks, which would impact
the performance of the more typical cases - xsks with different
netdev,qid tuples.

Fixes: 74b46340e2d4 ("net/af_xdp: support shared UMEM")

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
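
The uniqueness rule can be pictured as a comparison of (netdev, qid)
tuples. The sketch below is an assumption-laden illustration: struct
xsk_ctx and both helpers are hypothetical and do not mirror the
driver's internal data structures.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical context descriptor: the (netdev, qid) tuple. */
struct xsk_ctx {
	const char *ifname;
	int queue_id;
};

static bool
xsk_ctx_equal(const struct xsk_ctx *a, const struct xsk_ctx *b)
{
	return a->queue_id == b->queue_id &&
	       strcmp(a->ifname, b->ifname) == 0;
}

/* Sharing a UMEM is only allowed between sockets with different contexts. */
static bool
xsk_umem_share_allowed(const struct xsk_ctx *a, const struct xsk_ctx *b)
{
	return !xsk_ctx_equal(a, b);
}
```

Applied to the two --vdev lines above, both contexts are
(ens786f1, 0), so the check rejects the second device.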
3 years agonet/failsafe: fix state synchro cleanup
Gaetan Rivet [Mon, 12 Oct 2020 14:19:04 +0000 (16:19 +0200)]
net/failsafe: fix state synchro cleanup

During a hotplug attempt, failsafe will try to bring a subdevice that
just appeared to its internal state. On error, the subdevice is marked
for removal and will be cleaned up.

However, failsafe_dev_remove() only removes active devices. Devices that
failed during probe will be stuck in DEV_PARSED state repeatedly.

Consider all devices when doing a removal round, but limit burst control
and stats saving to active devices.

Fixes: 598fb8aec6f6 ("net/failsafe: support device removal")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <grive@u256.net>
3 years agoethdev: check queue id in Rx interrupt control
Wei Hu (Xavier) [Tue, 13 Oct 2020 11:50:55 +0000 (19:50 +0800)]
ethdev: check queue id in Rx interrupt control

This patch adds queue ID checks to the Rx interrupt control routines.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agoethdev: check if queue setup in queue-related APIs
Wei Hu (Xavier) [Tue, 13 Oct 2020 11:50:54 +0000 (19:50 +0800)]
ethdev: check if queue setup in queue-related APIs

This patch adds a check in the queue-related API functions that the
related Tx or Rx queue has been set up, to avoid illegal address
access.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
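
A sketch of the two checks combined, range first and set-up state
second. struct demo_dev and demo_validate_rx_queue() are hypothetical,
heavily reduced stand-ins for the ethdev data and helpers; the error
code is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced stand-in for per-device data. */
struct demo_dev {
	uint16_t nb_rx_queues;
	void **rx_queues; /* an entry is NULL until the queue is set up */
};

static int
demo_validate_rx_queue(const struct demo_dev *dev, uint16_t rx_queue_id)
{
	if (rx_queue_id >= dev->nb_rx_queues)
		return -EINVAL; /* queue ID out of range */
	if (dev->rx_queues == NULL || dev->rx_queues[rx_queue_id] == NULL)
		return -EINVAL; /* queue has not been set up yet */
	return 0;
}
```

Only after both checks pass is it safe for an API function to
dereference the queue pointer.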
3 years agoethdev: extract checking queue id into common functions
Wei Hu (Xavier) [Tue, 13 Oct 2020 11:50:53 +0000 (19:50 +0800)]
ethdev: extract checking queue id into common functions

This patch extracts the rx_queue_id and tx_queue_id checks into two
separate common functions named eth_dev_validate_rx_queue and
eth_dev_validate_tx_queue.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agoapp/testpmd: support query of age action
Dekel Peled [Wed, 7 Oct 2020 13:28:43 +0000 (16:28 +0300)]
app/testpmd: support query of age action

Following ethdev update in the previous patch of this series, this
patch adds CLI support to query information related to AGE action.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agoethdev: support query of age action
Dekel Peled [Wed, 7 Oct 2020 13:28:42 +0000 (16:28 +0300)]
ethdev: support query of age action

Existing API supports AGE action to monitor the aging of a flow.
This patch implements RFC [1], introducing the response format for query
of an AGE action.
Application will be able to query the AGE action state.
The response will be returned in the format implemented here.

[1] https://mails.dpdk.org/archives/dev/2020-September/180061.html

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
3 years agonet/ice: refactor RSS config wrap and fix potential bugs
Junfeng Guo [Tue, 13 Oct 2020 06:25:55 +0000 (14:25 +0800)]
net/ice: refactor RSS config wrap and fix potential bugs

The current implementation of the PF RSS config wrap function has some
potential bugs related to GTPU, e.g., the same input set for GTPU inner
and non-TUN produces different hash values, which should be the same.
Thus, extra pre- and post-processing is used to reconfigure GTPU rules.

Fixes: 185fe122f489 ("net/ice: fix GTPU down/uplink and extension conflict")
Cc: stable@dpdk.org
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agoapp/testpmd: support port and encap for sample action
Jiawei Wang [Fri, 9 Oct 2020 13:46:06 +0000 (16:46 +0300)]
app/testpmd: support port and encap for sample action

Use the sample action with a ratio of 1 to mirror a flow, and add
support for setting a different port or encap action for the mirrored
packets.

The example of test-pmd command:

1. set sample_actions 1 port_id id 1 / end
   flow create 0 ... pattern eth / end actions
sample ratio 1 index 1 / port_id id 2...
With this flow, all matched ingress packets are sent to port 2, and the
packets are also mirrored and sent to port 1.

2. set raw_encap 0 eth src.../ ipv4.../...
   set raw_encap 1 eth src.../ ipv4.../...
   set sample_actions 2 raw_encap index 0 / port_id id 0 / end
   flow create 0 ... pattern eth / end actions
sample ratio 1 index 2 / raw_encap index 1 / port_id id 0...
With this flow, all matched egress packets are encapsulated and sent to
the wire, and the packets are also mirrored, encapsulated with different
data, and sent to the wire.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agoapp/testpmd: add command for sample action
Jiawei Wang [Fri, 9 Oct 2020 13:46:05 +0000 (16:46 +0300)]
app/testpmd: add command for sample action

Add a new testpmd command 'set sample_actions' that supports
configuring multiple sample action lists by index:
set sample_actions <index> <actions list>

Examples of the sample flow use case and their results are shown below:

1. set sample_actions 0 mark id 0x8 / queue index 2 / end
.. pattern eth / end actions sample ratio 2 index 0 / jump group 2 ...

With this flow, all matched ingress packets jump to the next flow
table, and every second packet is marked and sent to queue 2 of the
control application.

2. ...pattern eth / end actions sample ratio 2 / port_id id 2 ...

With this flow, all matched ingress packets are duplicated and sent to
the representor peer (VF or wire) on DPDK port 2, and every second
packet is also sent to the E-Switch manager vport.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agoethdev: introduce flow sample action
Jiawei Wang [Fri, 9 Oct 2020 13:46:04 +0000 (16:46 +0300)]
ethdev: introduce flow sample action

When using full offload, all traffic will be handled by the HW, and
forwarded to the requested VF or wire and the control application does
not see this traffic anymore. So there is a need for an action that
gives the control application some visibility into the forwarded traffic.

The solution introduces a new action that will sample the incoming
traffic and send duplicated traffic with the specified ratio to the
application, while the original packet will continue to the target
destination.

The fraction of packets sampled equals '1/ratio'; a ratio value of 1
means that the packets will be completely mirrored. The sample packet
can be assigned a different set of actions from the original packet.

In order to support the sample packet in rte_flow, new rte_flow action
definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample
will be introduced.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
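
To make the '1/ratio' semantics concrete, here is a software sketch
that duplicates every ratio-th packet. It is purely illustrative:
the structure and function names are hypothetical, and the
deterministic counter is an assumption, since the commit message does
not say how hardware selects the sampled packets.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_sampler {
	uint32_t ratio;   /* 1 means every packet is duplicated */
	uint32_t counter; /* packets seen since the last sample */
};

/* Returns true when the current packet should also be copied out. */
static bool
demo_sample_packet(struct demo_sampler *s)
{
	if (s->ratio == 0)
		return false; /* this sketch treats 0 as "disabled" */
	if (++s->counter < s->ratio)
		return false;
	s->counter = 0;
	return true;
}

/* Counts how many of nb_packets packets would be sampled. */
static uint32_t
demo_count_samples(uint32_t ratio, uint32_t nb_packets)
{
	struct demo_sampler s = { ratio, 0 };
	uint32_t sampled = 0;
	uint32_t i;

	for (i = 0; i < nb_packets; i++)
		if (demo_sample_packet(&s))
			sampled++;
	return sampled;
}
```

In every case the original packet continues to its destination; only
the copy is subject to the ratio.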
3 years agonet/pcap: fix crash on exit for infinite Rx
Ferruh Yigit [Fri, 9 Oct 2020 15:50:39 +0000 (16:50 +0100)]
net/pcap: fix crash on exit for infinite Rx

If the infinite Rx argument ('infinite_rx') is provided, a ring is
allocated and filled in the '.rx_queue_setup' dev_ops.
Later this ring is freed in the '.dev_close' dev_ops.

If 'infinite_rx' is provided and '.dev_close' is called before
'.rx_queue_setup', the ring will be NULL and trying to empty/free it
will cause a crash.

This is fixed by adding a NULL check on the ring before trying to
empty/free it.

Bugzilla ID: 548
Fixes: a3f5252e5cbd ("net/pcap: enable infinitely Rx a pcap file")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
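
The shape of the fix can be sketched with a reduced model of the two
callbacks. All names here are hypothetical, and a plain heap
allocation stands in for the real ring.

```c
#include <stdlib.h>

struct demo_ring {
	unsigned int count; /* number of queued entries */
};

struct demo_queue {
	struct demo_ring *ring; /* NULL until queue setup has run */
};

/* Stand-in for '.rx_queue_setup': allocate and fill the ring. */
static int
demo_rx_queue_setup(struct demo_queue *q)
{
	q->ring = calloc(1, sizeof(*q->ring));
	if (q->ring == NULL)
		return -1;
	q->ring->count = 4; /* pretend the ring was filled */
	return 0;
}

/* Stand-in for '.dev_close': returns 1 if a ring was released. */
static int
demo_dev_close(struct demo_queue *q)
{
	if (q->ring == NULL)
		return 0; /* setup never ran, nothing to empty or free */
	q->ring->count = 0; /* emptying dereferences the ring */
	free(q->ring);
	q->ring = NULL;
	return 1;
}
```

Calling the close routine before setup is then harmless, because the
emptying step that dereferences the ring is skipped.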
3 years agonet/memif: use abstract socket address
Jakub Grajciar [Mon, 12 Oct 2020 08:28:29 +0000 (10:28 +0200)]
net/memif: use abstract socket address

An abstract socket address has no connection with
filesystem pathnames, and the socket disappears
once all open references are closed.

The memif PMD will use an abstract socket address by default.
For backwards compatibility, use the new argument
'socket-abstract=no'.

Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
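
On Linux, an abstract address is a sockaddr_un whose sun_path starts
with a null byte, followed by the name. The helper below is a sketch
of how such an address can be built; make_abstract_addr() is a
hypothetical name and is not taken from the memif driver.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Fills addr with an abstract AF_UNIX address for name and returns
 * the address length to pass to bind()/connect(), or 0 if the name
 * does not fit.
 */
static socklen_t
make_abstract_addr(struct sockaddr_un *addr, const char *name)
{
	size_t len = strlen(name);

	if (len > sizeof(addr->sun_path) - 1)
		return 0;
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	/* sun_path[0] stays '\0', which marks the address as abstract */
	memcpy(&addr->sun_path[1], name, len);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}
```

The returned length matters: for abstract addresses the name is
delimited by the length passed to the socket call, not by a
terminating null byte.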
3 years agonet/octeontx2: remove useless check before free
Yunjian Wang [Fri, 9 Oct 2020 12:39:05 +0000 (20:39 +0800)]
net/octeontx2: remove useless check before free

The glibc free() accepts free(NULL) as a no-op,
so remove the useless NULL check.

Coverity issue: 357719
Fixes: da138cd47e06 ("net/octeontx2: handle port reconfigure")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/bnxt: add parent child flow create and free
Kishore Padmanabha [Fri, 9 Oct 2020 11:11:29 +0000 (16:41 +0530)]
net/bnxt: add parent child flow create and free

Added support in the ULP mapper for parent-child flow
creation and destruction. This feature enables support for the vxlan
decap functionality.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: remove flow db table type from templates
Mike Baucom [Fri, 9 Oct 2020 11:11:28 +0000 (16:41 +0530)]
net/bnxt: remove flow db table type from templates

FDB type is now driven by the caller, not the template.
So remove it.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
3 years agonet/bnxt: consolidate template table processing
Mike Baucom [Fri, 9 Oct 2020 11:11:27 +0000 (16:41 +0530)]
net/bnxt: consolidate template table processing

Name changes due to consolidating the template table processing
and hence are not necessary.

- chip before type in name
- removal of class in key field info

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
3 years agonet/bnxt: support parent child flow database
Kishore Padmanabha [Fri, 9 Oct 2020 11:11:26 +0000 (16:41 +0530)]
net/bnxt: support parent child flow database

Added support for parent-child flow database APIs. This
feature enables vxlan decap support, where flows
need to maintain a parent-child relationship.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: support runtime EM selection
Mike Baucom [Fri, 9 Oct 2020 11:11:25 +0000 (16:41 +0530)]
net/bnxt: support runtime EM selection

This patch adds support to select internal Exact Match vs
External Exact Match support while loading the PMD.
- Added new mem type conditional opcode for internal/external
- Adapted the flowdb resource counts based on selected mode
- Template changes to use the new opcode
- The decision for internal/external EM support is based on the
  devargs parameter max_num_kflows.  If this is set, external EM
  is used.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: handle default VNIC change async event
Venkat Duvvuru [Fri, 9 Oct 2020 11:11:24 +0000 (16:41 +0530)]
net/bnxt: handle default VNIC change async event

Currently, we are only registering to this event if the function
is a trusted VF. This patch extends it for PFs as well.

Fixes: 322bd6e70272 ("net/bnxt: add port representor infrastructure")
Cc: stable@dpdk.org
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: combine default and regular flows
Kishore Padmanabha [Fri, 9 Oct 2020 11:11:23 +0000 (16:41 +0530)]
net/bnxt: combine default and regular flows

The default and regular flows are stored in the same flow table
instead of different flow tables. This should help code reuse
and reduce the number of allocations.
So combine default and regular flows in flow database.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: consolidate template table processing
Mike Baucom [Fri, 9 Oct 2020 11:11:22 +0000 (16:41 +0530)]
net/bnxt: consolidate template table processing

The table processing has been consolidated to be able to reuse the same
code for action and classification template processing.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix PF support in SR-IOV mode
Venkat Duvvuru [Fri, 9 Oct 2020 11:11:21 +0000 (16:41 +0530)]
net/bnxt: fix PF support in SR-IOV mode

1. Implement HWRM_FUNC_VF_RESOURCE_CFG command and use it to
   reserve resources for VFs when NEW RM is enabled.
2. Invoke PF’s FUNC_CFG before configuring VFs resources.
3. Don’t consider max_rx_em_flows in max_l2_ctx calculation
   when VFs are configured.
4. Issue HWRM_FUNC_QCFG instead of HWRM_FUNC_QCAPS to find
   out the actual allocated resources for VF.
5. Don’t add random mac to the VF.
6. Handle completion type CMPL_BASE_TYPE_HWRM_FWD_REQ instead
   of CMPL_BASE_TYPE_HWRM_FWD_RESP.
7. Don't enable HWRM_FUNC_DRV_RGTR_INPUT_FLAGS_FWD_NONE_MODE
   when the list of HWRM commands that needs to be forwarded
   to the PF is specified in HWRM_FUNC_DRV_RGTR.
8. Update the HWRM commands list that can be forwarded to the
   PF.

Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: add Stingray device support to ULP
Mike Baucom [Fri, 9 Oct 2020 11:11:20 +0000 (16:41 +0530)]
net/bnxt: add Stingray device support to ULP

- Add new template files for Stingray
- Add new TRUFLOW resources for Stingray

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: add multi-device infrastructure
Mike Baucom [Fri, 9 Oct 2020 11:11:19 +0000 (16:41 +0530)]
net/bnxt: add multi-device infrastructure

In order to support multiple devices this patch:
- Breaks the template into device specific files
- Changes template list retrieval to use device id
- Determines the software device id using the bp pointer
- Determines the TRUFLOW resources based on device id

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
3 years agonet/bnxt: fix corruption of session details
Kishore Padmanabha [Fri, 9 Oct 2020 11:11:18 +0000 (16:41 +0530)]
net/bnxt: fix corruption of session details

The session details that are shared among multiple ports
need to be outside the bnxt structure.

Fixes: 70e64b27af5b ("net/bnxt: support ULP session manager cleanup")
Cc: stable@dpdk.org
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix non-vector fast mbuf free offload
Lance Richardson [Fri, 9 Oct 2020 16:36:42 +0000 (12:36 -0400)]
net/bnxt: fix non-vector fast mbuf free offload

The fast mbuf free offload for non-vector mode requires
additional checks in order to handle long tx buffer
descriptors, so dedicated functions are needed for
vector- and non-vector-modes.

Fixes: 369f6077c515 ("net/bnxt: support fast mbuf free")

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/mlx5: support ICMP identifier matching
Li Zhang [Fri, 9 Oct 2020 06:11:42 +0000 (09:11 +0300)]
net/mlx5: support ICMP identifier matching

The PRM exposes the field "Icmp_header_data" for IPv4 ICMP.
Update the ICMP mask parameter with the ICMP identifier and sequence
number fields.
When an ICMP sequence number spec and mask are given, the low 16 bits
of Icmp_header_data are set.
When an ICMP identifier spec and mask are given, the high 16 bits of
Icmp_header_data are set.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
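
The bit layout described in this commit can be sketched as a simple
packing function. icmp_header_data() is a hypothetical helper; it
works on host-order values and leaves out the byte-order conversion
that a driver would need for fields taken from the wire.

```c
#include <stdint.h>

/*
 * Packs the ICMP identifier into the high 16 bits and the sequence
 * number into the low 16 bits of a 32-bit value.
 */
static uint32_t
icmp_header_data(uint16_t ident, uint16_t seq_nb)
{
	return ((uint32_t)ident << 16) | seq_nb;
}
```

The same function applies to the mask: a mask that covers only the
identifier leaves the low 16 bits zero.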