dpdk.git
3 years agoexamples/pipeline: add new example application
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:04 +0000 (11:20 +0100)]
examples/pipeline: add new example application

Add new example application to showcase the API of the newly
introduced SWX pipeline type.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agotable: add exact match SWX table
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:03 +0000 (11:20 +0100)]
table: add exact match SWX table

Add the exact match table type for the SWX pipeline. Used under the
hood by the SWX pipeline table instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agoport: add source and sink SWX ports
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:02 +0000 (11:20 +0100)]
port: add source and sink SWX ports

Add the PCAP file-based source (input) and sink (output) port types
for the SWX pipeline. The sink port is typically used to implement the
packet drop pipeline action. Used under the hood by the pipeline rx
and tx instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agoport: add ethernet device SWX port
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:01 +0000 (11:20 +0100)]
port: add ethernet device SWX port

Add the Ethernet device input/output port type for the SWX pipeline.
Used under the hood by the pipeline rx and tx instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline specification file
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:00 +0000 (11:20 +0100)]
pipeline: add SWX pipeline specification file

Add support for building the SWX pipeline based on a specification file
with syntax aligned to the P4 language. The specification file may be
generated by the P4C compiler in the future.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX table update high level API
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:59 +0000 (11:19 +0100)]
pipeline: add SWX table update high level API

High-level transaction-oriented API for SWX pipeline table updates. It
supports multi-table atomic updates, i.e. multiple tables can be
updated in a single step with only the before and after table set
visible to the packets. Uses the lower-level table update mechanisms.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline flush
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:58 +0000 (11:19 +0100)]
pipeline: add SWX pipeline flush

Flush the packets currently buffered by the SWX pipeline output ports.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline query API
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:57 +0000 (11:19 +0100)]
pipeline: add SWX pipeline query API

Query API to be used by the control plane to detect the configuration
and state of the SWX pipeline and its internal objects.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX instruction optimizer
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:56 +0000 (11:19 +0100)]
pipeline: add SWX instruction optimizer

Instruction optimizer. Detects frequent patterns and replaces them
with some more powerful vector-like pipeline instructions without any
user effort. Executes at instruction translation, not at run-time.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX instruction verifier
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:55 +0000 (11:19 +0100)]
pipeline: add SWX instruction verifier

Instruction verifier. Executes at instruction translation time during
SWX pipeline build, i.e. at initialization instead of run-time.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX instruction description
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:54 +0000 (11:19 +0100)]
pipeline: add SWX instruction description

Added SWX instruction set reference table.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX jump and return instructions
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:53 +0000 (11:19 +0100)]
pipeline: introduce SWX jump and return instructions

The jump instructions are either unconditional (jmp) or conditional on
positive/negative tests such as header validity (jmpv/jmpnv), table
lookup hit/miss (jmph/jmpnh), executed action (jmpa/jmpna), equality
(jmpeq/jmpneq), comparison result (jmplt/jmpgt). The return
instruction resumes the pipeline execution after the action subroutine.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX extern instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:52 +0000 (11:19 +0100)]
pipeline: introduce SWX extern instruction

The extern instruction calls one of the member functions of a given
extern object or it calls the given extern function. The function
arguments must be written in advance to the mailbox. The results
are available in the same place after execution.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX table instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:51 +0000 (11:19 +0100)]
pipeline: introduce SWX table instruction

The table instruction looks up the input key in the table and then
it triggers the execution of the action found in the table entry. On
lookup miss, the default table action is executed.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
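
The hit/miss behaviour described in the table instruction commit above can be sketched in plain C. This is only an illustration under assumptions: every name here (`struct entry`, `struct table`, `table_apply`, the two actions) is hypothetical and not part of the SWX API, and a linear search stands in for whatever lookup the real table type performs.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the SWX API. */
typedef void (*action_fn)(uint32_t *meta, uint32_t action_data);

struct entry {
	uint32_t key;
	action_fn action;
	uint32_t action_data;
};

struct table {
	const struct entry *entries;
	size_t n_entries;
	action_fn default_action;
	uint32_t default_action_data;
};

/* Example action: store an output port in the meta-data word. */
static void action_set_port(uint32_t *meta, uint32_t port)
{
	*meta = port;
}

/* Example action: mark the packet for drop with a sentinel value. */
static void action_drop(uint32_t *meta, uint32_t unused)
{
	(void)unused;
	*meta = UINT32_MAX;
}

/* Look up the key; run the entry action on hit, the default action on
 * miss. Returns 1 on hit and 0 on miss. */
static int table_apply(const struct table *t, uint32_t key, uint32_t *meta)
{
	size_t i;

	for (i = 0; i < t->n_entries; i++) {
		if (t->entries[i].key == key) {
			t->entries[i].action(meta, t->entries[i].action_data);
			return 1;
		}
	}

	t->default_action(meta, t->default_action_data);
	return 0;
}
```

The return value plays the role of the hit/miss state that a later conditional jump could test.
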
3 years agopipeline: introduce SWX SHR instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:50 +0000 (11:19 +0100)]
pipeline: introduce SWX SHR instruction

The shr (i.e. shift right) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX SHL instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:49 +0000 (11:19 +0100)]
pipeline: introduce SWX SHL instruction

The shl (i.e. shift left) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX XOR instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:48 +0000 (11:19 +0100)]
pipeline: introduce SWX XOR instruction

The xor (i.e. bitwise exclusive or) instruction source can be header
field (H), meta-data field (M), extern object (E) or function (F)
mailbox field, table entry action data field (T) or immediate value
(I). The destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX or instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:47 +0000 (11:19 +0100)]
pipeline: introduce SWX or instruction

The or (i.e. bitwise or) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX and instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:46 +0000 (11:19 +0100)]
pipeline: introduce SWX and instruction

The and (i.e. bitwise and) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX cksub instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:45 +0000 (11:19 +0100)]
pipeline: introduce SWX cksub instruction

The cksub (i.e. checksum subtract) instruction is used to update the
1's complement sum commonly used by protocols such as IPv4, TCP or
UDP.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX ckadd instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:44 +0000 (11:19 +0100)]
pipeline: introduce SWX ckadd instruction

The ckadd (i.e. checksum add) instruction is used to either compute,
verify or update the 1's complement sum commonly used by protocols
such as IPv4, TCP or UDP.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
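
To illustrate the 1's complement arithmetic that the ckadd and cksub commits above refer to, here is a minimal C sketch. The helper names `csum_add16` and `csum_sub16` are hypothetical and this is not the SWX implementation; it only shows that a 16-bit word can be added to, or removed from, a running sum using end-around carry.

```c
#include <stdint.h>

/* Hypothetical helper: add a 16-bit word to a 1's complement sum,
 * folding the carry back into the low 16 bits (end-around carry). */
static uint16_t csum_add16(uint16_t sum, uint16_t word)
{
	uint32_t r = (uint32_t)sum + word;

	/* At most one carry bit can be set, so a single fold is enough. */
	r = (r & 0xffff) + (r >> 16);
	return (uint16_t)r;
}

/* Hypothetical helper: subtracting a word is the same as adding its
 * 1's complement. */
static uint16_t csum_sub16(uint16_t sum, uint16_t word)
{
	return csum_add16(sum, (uint16_t)~word);
}
```

Removing the old value of a field and adding the new one is how an existing sum can be updated without re-reading the whole header.
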
3 years agopipeline: introduce SWX subtract instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:43 +0000 (11:19 +0100)]
pipeline: introduce SWX subtract instruction

The sub (i.e. subtract) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: introduce SWX add instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:42 +0000 (11:19 +0100)]
pipeline: introduce SWX add instruction

The add instruction source can be header field (H), meta-data field
(M), extern object (E) or function (F) mailbox field, table entry
action data field (T) or immediate value (I). The destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX DMA instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:41 +0000 (11:19 +0100)]
pipeline: add SWX DMA instruction

The DMA instruction handles the bulk read transfer of one header from
the table entry action data. Typically used to generate headers, i.e.
headers that are not extracted from the input packet.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX move instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:40 +0000 (11:19 +0100)]
pipeline: add SWX move instruction

The mov (i.e. move) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add header validate and invalidate SWX instructions
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:39 +0000 (11:19 +0100)]
pipeline: add header validate and invalidate SWX instructions

Add instructions to flag a header as valid or invalid. This flag can
be tested by the jmpv (jump if header valid) and jmpnv (jump if header
not valid) instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
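
One simple way to picture the header valid flag from the commit above is a bit mask with one bit per header. The sketch below assumes at most 64 headers and uses hypothetical names (`struct pkt_state`, `hdr_validate`, `hdr_invalidate`, `hdr_is_valid`); the actual SWX data layout may differ.

```c
#include <stdint.h>

/* Hypothetical per-packet state: one valid bit per header ID (0..63). */
struct pkt_state {
	uint64_t valid_headers;
};

/* Flag the header as valid. */
static void hdr_validate(struct pkt_state *p, unsigned int id)
{
	p->valid_headers |= UINT64_C(1) << id;
}

/* Flag the header as invalid. */
static void hdr_invalidate(struct pkt_state *p, unsigned int id)
{
	p->valid_headers &= ~(UINT64_C(1) << id);
}

/* Test the flag, as a conditional jump on header validity would. */
static int hdr_is_valid(const struct pkt_state *p, unsigned int id)
{
	return (int)((p->valid_headers >> id) & 1);
}
```
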
3 years agopipeline: add SWX Tx and emit instructions
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:38 +0000 (11:19 +0100)]
pipeline: add SWX Tx and emit instructions

Add header emit and packet transmission instructions. Emit adds to the
output packet a header that is either generated (e.g. read from table
entry by action) or extracted from the input packet. Tx ends the
pipeline processing; discard is implemented by tx to a special port.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX Rx and extract instructions
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:37 +0000 (11:19 +0100)]
pipeline: add SWX Rx and extract instructions

Add packet reception and header extraction instructions. The Rx must
be the first pipeline instruction. Each extracted header is logically
removed from the packet, then it can be read/written by instructions,
emitted into the outgoing packet or discarded.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline instructions
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:36 +0000 (11:19 +0100)]
pipeline: add SWX pipeline instructions

The SWX pipeline instructions represent the main program that defines
the life of the packet. As packets go through tables that trigger
action subroutines, the headers and meta-data get transformed along
the way.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline tables
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:35 +0000 (11:19 +0100)]
pipeline: add SWX pipeline tables

Add tables to the SWX pipeline. The match fields are flexibly selected
from the headers and meta-data. The set of table actions is flexibly
selected for each table from the set of pipeline actions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline action
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:34 +0000 (11:19 +0100)]
pipeline: add SWX pipeline action

Add SWX actions that are dynamically-defined through instructions as
opposed to pre-defined. The actions are subroutines of the pipeline
program that are triggered by table lookup. The input arguments are the
action data from the table entry (format defined by struct), the
headers and meta-data are in/out.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX extern objects and funcs
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:33 +0000 (11:19 +0100)]
pipeline: add SWX extern objects and funcs

Add extern objects and functions to plug into the SWX pipeline any
functionality that cannot be efficiently implemented with existing
instructions, e.g. special checksum/ECC, crypto, meters, stats arrays,
heuristics, etc. In/out arguments are passed through a mailbox with
format defined by struct.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX headers and meta-data
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:32 +0000 (11:19 +0100)]
pipeline: add SWX headers and meta-data

Add support for dynamically-defined packet headers and meta-data to
the SWX pipeline. The header and meta-data format are defined by the
struct type they instantiate.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline output port
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:31 +0000 (11:19 +0100)]
pipeline: add SWX pipeline output port

Add output ports to the newly introduced SWX pipeline type. Each port
instantiates a port type that defines the port operations, e.g. ethdev
port, PCAP port, etc. The TX interface is single packet, with packet
batching internally for performance.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add SWX pipeline input port
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:30 +0000 (11:19 +0100)]
pipeline: add SWX pipeline input port

Add input ports to the newly introduced SWX pipeline type. Each port
instantiates a port type that defines the port operations, e.g. ethdev
port, PCAP port, etc. The RX interface is single packet, with packet
batching internally for performance.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agopipeline: add new SWX pipeline type
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:29 +0000 (11:19 +0100)]
pipeline: add new SWX pipeline type

Add new improved Software Switch (SWX) pipeline type that supports
dynamically-defined packet headers, meta-data, actions and pipelines.
Actions and pipelines are defined through instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agodoc: remove references to make from prog guide
Ciara Power [Mon, 21 Sep 2020 13:59:17 +0000 (14:59 +0100)]
doc: remove references to make from prog guide

Make is no longer supported for compiling DPDK, so references to it are
removed from the documentation.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
3 years agodoc: remove references to make from howto guides
Ciara Power [Mon, 21 Sep 2020 13:59:16 +0000 (14:59 +0100)]
doc: remove references to make from howto guides

Make is no longer supported for compiling DPDK, so references to it are
removed from the documentation.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
3 years agodoc: remove references to make from FreeBSD guide
Ciara Power [Mon, 21 Sep 2020 13:59:15 +0000 (14:59 +0100)]
doc: remove references to make from FreeBSD guide

Make is no longer supported for compiling DPDK, so references to it are
removed from the documentation.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agodoc: remove references to make from Linux guide
Ciara Power [Mon, 21 Sep 2020 13:59:14 +0000 (14:59 +0100)]
doc: remove references to make from Linux guide

Make is no longer supported for compiling DPDK, so references to it are
removed from the documentation.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoapp: remove references to make-based config
Ciara Power [Mon, 21 Sep 2020 13:59:13 +0000 (14:59 +0100)]
app: remove references to make-based config

Make is no longer supported, so the RTE_SDK, RTE_TARGET and CONFIG options
are no longer in use.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>
3 years agodevtools: remove legacy flags from includes check
Ciara Power [Mon, 21 Sep 2020 13:59:12 +0000 (14:59 +0100)]
devtools: remove legacy flags from includes check

Make is no longer supported, so the test script for make builds is no
longer required. Uses of make in other tool scripts are replaced.

Signed-off-by: Ciara Power <ciara.power@intel.com>
3 years agodoc: fix references to removed guide
Thomas Monjalon [Wed, 30 Sep 2020 17:20:18 +0000 (19:20 +0200)]
doc: fix references to removed guide

The page "Development Kit Build System" was about make,
so it has been removed. A better help is in the Linux guide
(note: mlx4/mlx5 are supported on Linux only for now).

Fixes: 3cc6ecfdfe85 ("build: remove makefiles")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ciara Power <ciara.power@intel.com>
3 years agoci: add tests jobs in aarch64 vm
Juraj Linkeš [Fri, 28 Aug 2020 11:45:37 +0000 (13:45 +0200)]
ci: add tests jobs in aarch64 vm

Tests requiring hugepages do not work outside of a VM environment because
of security limitations. Add aarch64 jobs which run the tests in
a VM to avoid these limitations. Keep the non-hugepage jobs since
the tests may produce different results in hugepage and non-hugepage
environments.

Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Aaron Conole <aconole@redhat.com>
3 years agostack: fix uninitialized variable
Yunjian Wang [Fri, 25 Sep 2020 05:00:50 +0000 (13:00 +0800)]
stack: fix uninitialized variable

This patch fixes an issue where the uninitialized variable 'success'
is compared with '0'.

Coverity issue: 337676
Fixes: 3340202f5954 ("stack: add lock-free implementation")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agostack: relax pop CAS ordering
Steven Lariau [Fri, 25 Sep 2020 17:43:39 +0000 (18:43 +0100)]
stack: relax pop CAS ordering

Replace the store-release by relaxed for the CAS success at the end of
pop. Release isn't needed, because there is no write to data that needs
to be synchronized.
The only preceding write is when the length is decreased, but the length
CAS loop already ensures the right synchronization.
The situation to avoid is when a thread sees the old length but the new
list, which doesn't have enough items for pop to succeed.
But the CAS success on length before the pop loop ensures any core reads
and updates the latest length, preventing this situation.

The store-release is also used to make sure that the items are read
before the head is updated, in order to prevent a core in pop from reading
an incorrect value because another core rewrites it with push.
But this isn't needed, because items are read only when removed from the
used list. Right after this, they are pushed to the free list, and the
store-release in push makes sure the items are read before they are
visible in the free list.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agostack: reload head when pop fails
Steven Lariau [Fri, 25 Sep 2020 17:43:38 +0000 (18:43 +0100)]
stack: reload head when pop fails

The list head must be reloaded right before the continue statement (when
the new head cannot be found).
Without this, one thread might keep trying and failing to pop items
without ever loading the new correct head.

Fixes: 7e6e609939a8 ("stack: add C11 atomic implementation")
Cc: gage.eads@intel.com
Cc: stable@dpdk.org
Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agostack: remove redundant orderings on pop
Steven Lariau [Fri, 25 Sep 2020 17:43:37 +0000 (18:43 +0100)]
stack: remove redundant orderings on pop

The load-acquire of list->len in the pop function is redundant.
Only the CAS success needs to be load-acquire.
It synchronizes with the store release in push, to ensure that the
updated head is visible when the new length is visible.
Without this, one thread in pop could see the increased length but the
old list, which doesn't have enough items yet for pop to succeed.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agostack: remove acquire fence on push
Steven Lariau [Fri, 25 Sep 2020 17:43:36 +0000 (18:43 +0100)]
stack: remove acquire fence on push

An acquire fence is used to make sure loads after the fence can observe
all store operations before a specific store-release.
But push doesn't read any data, except for the head which is part of a
CAS operation (the items on the list are not read).
So there is no need for the acquire barrier.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agostack: fix inconsistent weak/strong CAS
Steven Lariau [Fri, 25 Sep 2020 17:43:35 +0000 (18:43 +0100)]
stack: fix inconsistent weak/strong CAS

Fix cmpexchange usage of weak / strong.
The generated code is the same on x86 and ARM (there is no weak
cmpexchange), but the old usage was inconsistent.
For push and pop update size, weak is used because cmpexchange is inside
a loop.
For pop update root, strong is used even though cmpexchange is inside a
loop, because there may be a lot of operations to do in a loop iteration
(locate the new head).

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
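
The length update discussed in the stack commits above (weak CAS inside a loop, with acquire ordering needed only on CAS success) can be sketched with C11 atomics. This is a simplified illustration, not the rte_stack code: `stack_len` and `reserve_for_pop` are hypothetical names, and only the length counter is modelled, not the list itself.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the stack length; not the rte_stack layout. */
static _Atomic uint64_t stack_len;

/* Try to reserve n items for a pop by decrementing the length.
 * Returns false if fewer than n items are available. */
static bool reserve_for_pop(uint64_t n)
{
	/* A relaxed initial load is enough: the CAS below re-validates it. */
	uint64_t len = atomic_load_explicit(&stack_len, memory_order_relaxed);

	do {
		if (len < n)
			return false;
		/* A weak CAS may fail spuriously, which is acceptable in a
		 * short loop. On failure, len is updated to the current
		 * value and the check above runs again. */
	} while (!atomic_compare_exchange_weak_explicit(&stack_len, &len,
			len - n,
			memory_order_acquire,   /* ordering on success */
			memory_order_relaxed)); /* ordering on failure */

	return true;
}
```

A strong CAS would be the alternative when each retry is expensive, which is the trade-off the weak/strong commit describes for the root update in pop.
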
3 years agotest/stack: remove thread synchronisation
Steven Lariau [Wed, 12 Aug 2020 19:18:47 +0000 (20:18 +0100)]
test/stack: remove thread synchronisation

Remove the part that checks if there is enough room in the stack. It's
always true as long as size of stack >= MAX_BULK*rte_lcore_count().
This check used an atomic cmpset, and read / write to a shared size
variable. These operations result in some form of synchronization
that might get in the way of the actual stack testing.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agotest/stack: check errors for multi-threads
Steven Lariau [Wed, 12 Aug 2020 19:18:46 +0000 (20:18 +0100)]
test/stack: check errors for multi-threads

Use rte_eal_wait_lcore to wait and get the return value for all cores.
This is used to propagate any error to the main core.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agotest/stack: remove unneeded memory allocations
Steven Lariau [Wed, 12 Aug 2020 19:18:44 +0000 (20:18 +0100)]
test/stack: remove unneeded memory allocations

Replace the arguments array by one argument.
All objects in the args array have the same values, so there is no need
to use an array, only one struct is enough.
The args object is a lot smaller, and the allocation can be replaced
with a global variable.
As a consequence of using a single argument, there is no need to use a
loop to launch the test on every core one by one. Replace it with
rte_eal_mp_remote_launch.

The allocation of obj_table isn't needed either, because MAX_BULK is
small. The allocation can instead be replaced with a static array.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
3 years agoethdev: fix link speed helper documentation
David Marchand [Tue, 29 Sep 2020 12:12:22 +0000 (14:12 +0200)]
ethdev: fix link speed helper documentation

When generating the documentation, a new warning can be seen:

.../dpdk/lib/librte_ethdev/rte_ethdev.h:2441:
  warning: argument 'link_speed' of command @param is not found in the
  argument list of rte_eth_link_speed_to_str(uint32_t speed_link)
.../dpdk/lib/librte_ethdev/rte_ethdev.h:2455: warning: The following
  parameters of rte_eth_link_speed_to_str(uint32_t speed_link) are not
  documented: parameter 'speed_link'

Align the function prototype to its doxygen description.

Fixes: fbf931c9c392 ("ethdev: format link status text")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agodoc: make doxygen comply with meson werror option
Bruce Richardson [Tue, 29 Sep 2020 16:55:02 +0000 (17:55 +0100)]
doc: make doxygen comply with meson werror option

When the --werror meson build option is set, we can set the WARN_AS_ERRORS
doxygen option in the doxygen config flag to get the same behaviour for API
doc building as for building the rest of DPDK. This can help catch
documentation errors sooner in the development process.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agodoc: hide sphinx standard output
Bruce Richardson [Tue, 29 Sep 2020 16:55:00 +0000 (17:55 +0100)]
doc: hide sphinx standard output

To see only errors and warnings from the doc builds, we can send the
standard output text to a logfile and have only the stderr messages
printed. This is similar to what is done for the API documentation.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agodoc: put doxygen log file in build directory
Bruce Richardson [Tue, 29 Sep 2020 16:54:59 +0000 (17:54 +0100)]
doc: put doxygen log file in build directory

The meson documentation states that projects should not rely upon the
custom_target build commands being run from any given directory. Therefore,
rather than writing the standard output from doxygen to the current
directory, which could be anywhere in future, put it into the api
directory, so that it is in a known location.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agodoc: align doxygen output folder with sphinx guides
Bruce Richardson [Tue, 29 Sep 2020 16:54:58 +0000 (17:54 +0100)]
doc: align doxygen output folder with sphinx guides

The API docs were output to "<build>/doc/api/api" folder, which was
ugly-looking with the repeated "api", and inconsistent with the sphinx
guides which were written to "<build>/doc/guides/html". Changing the
doxygen output folder to "html" fixes both these issues.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agodoc: hide verbose doxygen standard output
Bruce Richardson [Tue, 29 Sep 2020 16:54:57 +0000 (17:54 +0100)]
doc: hide verbose doxygen standard output

The standard output of doxygen is very verbose, and since ninja mixes
stdout and stderr together it makes it difficult to see any warnings from
the doxygen run. Therefore, we can just log the standard output to file,
and only output the stderr to make warnings clear.

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agokni: fix build with Linux 5.9
Ferruh Yigit [Mon, 17 Aug 2020 10:32:47 +0000 (11:32 +0100)]
kni: fix build with Linux 5.9

Starting from Linux 5.9, the 'get_user_pages_remote()' API no longer takes
a 'struct task_struct' parameter:
commit 64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")

The change is reflected in KNI with a version check.

Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agovhost/crypto: fix possible TOCTOU attack
Fan Zhang [Mon, 28 Sep 2020 10:59:18 +0000 (11:59 +0100)]
vhost/crypto: fix possible TOCTOU attack

This patch fixes the possible time-of-check to time-of-use (TOCTOU)
attack problem by copying request data and descriptor index to local
variables before processing.

Also the original sequential read of descriptors may lead to a TOCTOU
attack. This patch fixes the problem by loading all descriptors of a
request into a local buffer before processing.

CVE-2020-14375
Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
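
The copy-before-check idea from the TOCTOU commit above can be illustrated with a small C sketch. The request layout and the function name (`struct shared_req`, `handle_req`) are hypothetical and much simpler than the real vhost crypto structures; the point is only that validation and use both operate on a private snapshot.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout living in memory shared with the guest. */
struct shared_req {
	uint32_t len;
	uint8_t data[64];
};

/* Snapshot the request into private memory first, then validate and use
 * only the snapshot, so that a concurrent write to the shared copy cannot
 * change a value between the check and the use.
 * Returns the number of bytes copied to out, or -1 on a bad length. */
static int handle_req(const struct shared_req *shared, uint8_t *out,
		uint32_t out_size)
{
	struct shared_req local;

	memcpy(&local, shared, sizeof(local));

	/* Check the private copy... */
	if (local.len > sizeof(local.data) || local.len > out_size)
		return -1;

	/* ...and use the same private copy. */
	memcpy(out, local.data, local.len);
	return (int)local.len;
}
```

Had the code checked `shared->len` and then copied `shared->len` bytes, the two reads could observe different values.
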
3 years agovhost/crypto: fix data length check
Fan Zhang [Mon, 28 Sep 2020 10:59:17 +0000 (11:59 +0100)]
vhost/crypto: fix data length check

This patch fixes the incorrect data length check in vhost crypto.
Instead of blindly accepting the descriptor length as the data length, the
change compares the request-provided data length with the descriptor length
first. This patch alone does not fix the security issue CVE-2020-14374;
part of the fix is done through:
"vhost/crypto: fix missed request check for copy mode".

CVE-2020-14374
Fixes: 3c79609fda7c ("vhost/crypto: handle virtually non-contiguous buffers")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agovhost/crypto: fix write back source
Fan Zhang [Mon, 28 Sep 2020 10:59:16 +0000 (11:59 +0100)]
vhost/crypto: fix write back source

This patch fixes the incorrect source and destination buffer calculation
in the copy mode of the vhost crypto library.

Fixes: cd1e8f03abf0 ("vhost/crypto: fix packet copy in chaining mode")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agovhost/crypto: fix missed request check for copy mode
Fan Zhang [Mon, 28 Sep 2020 10:59:15 +0000 (11:59 +0100)]
vhost/crypto: fix missed request check for copy mode

This patch fixes the missing request check in vhost crypto
copy mode.

CVE-2020-14376
CVE-2020-14377
Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agovhost/crypto: fix descriptor deduction
Fan Zhang [Mon, 28 Sep 2020 10:59:14 +0000 (11:59 +0100)]
vhost/crypto: fix descriptor deduction

This patch fixes the incorrect descriptor deduction for vhost crypto.

CVE-2020-14378
Fixes: 16d2e718b8ce ("vhost/crypto: fix possible out of bound access")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agovhost/crypto: fix pool allocation
Fan Zhang [Mon, 28 Sep 2020 10:59:13 +0000 (11:59 +0100)]
vhost/crypto: fix pool allocation

This patch fixes the missing IV space allocation in the crypto
operation mempool.

Fixes: 709521f4c2cd ("examples/vhost_crypto: support multi-core")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agovhost: fix external backends readiness
Maxime Coquelin [Wed, 23 Sep 2020 09:49:02 +0000 (11:49 +0200)]
vhost: fix external backends readiness

Commit d0fcc38f5fa4 ("vhost: improve device readiness notifications")
makes the assumption that every Virtio device is considered
ready for processing as soon as the first queue pair is configured
and enabled.

While this is true for Virtio-net, it isn't for Virtio-scsi
and Virtio-blk.

This patch fixes this by only making this assumption for
the builtin Virtio-net backend, and restores the previous
behaviour for other backends.

Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")

Reported-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agobus/pci: fix mapping BAR containing MSI-X table
Hyong Youb Kim [Fri, 25 Sep 2020 02:14:35 +0000 (19:14 -0700)]
bus/pci: fix mapping BAR containing MSI-X table

When the BAR contains MSI-X table, pci_vfio_mmap_bar() tries to skip
the table and map the rest. "map around it" is the phrase used in the
source. The function splits the BAR into two regions: the region
before the table (first part or memreg[0]) and the region after the
table (second part or memreg[1]).

For hardware that has MSI-X vector table offset 0, the first part does
not exist (memreg[0].size == 0).

  Capabilities: [60] MSI-X: Enable- Count=48 Masked-
         Vector table: BAR=2 offset=00000000
         PBA: BAR=2 offset=00001000

The mapping part of the function maps the first part, if it
exists. Then, it maps the second part, if it exists and "if mapping the
first part succeeded".

The recent change that replaces MAP_FAILED with NULL breaks the "if
mapping the first part succeeded" condition (1) in the snippet below.

    void *map_addr = NULL;
    if (memreg[0].size) {
            /* actual map of first part */
            map_addr = pci_map_resource(...);
    }

    /* if there's a second part, try to map it */
    if (map_addr != NULL                              // -- (1)
        && memreg[1].offset && memreg[1].size) {
            [...]
    }

    if (map_addr == NULL) {
            RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
                    bar_index);
            return -1;
    }

When the first part does not exist, (1) sees map_addr is still NULL,
and the function fails. This behavior is a regression and fails
probing hardware with vector table offset 0.

Previously, (1) was "map_addr != MAP_FAILED", which meant
pci_map_resource() was actually attempted and failed. So, expand (1)
to check if the first part exists as well, to match the semantics of
MAP_FAILED.
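
The expanded condition can be sketched as a standalone predicate. This is
an illustration under assumptions, not the patch itself: struct bar_region
and should_map_second_part() are hypothetical names standing in for the
memreg[] entries and for condition (1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one mappable region of a BAR. */
struct bar_region {
	size_t offset;
	size_t size;
};

/*
 * Condition (1), expanded: the first part counts as "not failed" when it
 * was either mapped (map_addr != NULL) or never existed (size == 0).
 */
static bool
should_map_second_part(const void *map_addr, const struct bar_region memreg[2])
{
	assert(memreg != NULL);
	bool first_ok = map_addr != NULL || memreg[0].size == 0;

	return first_ok && memreg[1].offset != 0 && memreg[1].size != 0;
}
```

With a vector table at offset 0, memreg[0].size is 0 and map_addr is still
NULL, yet the predicate allows the second part to be mapped.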

Bugzilla ID: 539
Fixes: e200535c1ca3 ("mem: drop mapping API workaround")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
3 years agoethdev: use C11 atomics for link status
Phil Yang [Thu, 24 Sep 2020 05:39:28 +0000 (13:39 +0800)]
ethdev: use C11 atomics for link status

Since rte_atomicXX APIs are not allowed to be used, use C11 atomic
builtins for link status update.
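
A minimal sketch of the idea, assuming the GCC/Clang __atomic builtins and
a hypothetical 64-bit packing of the link record (link_word, link_set()
and link_is_up() are illustrative names, not the ethdev API):

```c
#include <stdint.h>

/* Hypothetical packed link record: speed in bits 0-31, "up" flag in bit 32. */
static uint64_t link_word;

/* Writer: publish speed and status together in one atomic 64-bit store. */
static void
link_set(uint32_t speed, int up)
{
	uint64_t v = (uint64_t)speed | ((uint64_t)(up != 0) << 32);

	__atomic_store_n(&link_word, v, __ATOMIC_SEQ_CST);
}

/* Reader: one atomic load, so speed and status are never torn. */
static int
link_is_up(void)
{
	uint64_t v = __atomic_load_n(&link_word, __ATOMIC_SEQ_CST);

	return (int)((v >> 32) & 1);
}
```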

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
3 years agopower: use C11 atomics for power state
Phil Yang [Thu, 24 Sep 2020 05:39:27 +0000 (13:39 +0800)]
power: use C11 atomics for power state

Since rte_atomicXX APIs are not allowed to be used, use C11 atomic
builtins for power in use state update.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Hunt <david.hunt@intel.com>
3 years agobbdev: use C11 atomics for device processing counter
Phil Yang [Thu, 24 Sep 2020 05:39:26 +0000 (13:39 +0800)]
bbdev: use C11 atomics for device processing counter

Since rte_atomicXX APIs are not allowed to be used, use C11 atomic builtins
for device processing counter.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>
3 years agoeal: use C11 atomics for initialization check
Phil Yang [Thu, 24 Sep 2020 05:39:25 +0000 (13:39 +0800)]
eal: use C11 atomics for initialization check

Since rte_atomicXX APIs are not allowed to be used, use C11 builtins to
check if EAL is already initialized.
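
One way to picture such a check is a compare-and-swap on a run-once flag.
The sketch below assumes the GCC/Clang __atomic builtins; the flag name,
the helper claim_init() and the relaxed memory ordering are assumptions
made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0 = not initialized yet, 1 = initialization already claimed. */
static uint32_t run_once;

/* Return true for the first caller only; every later caller gets false. */
static bool
claim_init(void)
{
	uint32_t expected = 0;

	/* Atomically: if (run_once == 0) run_once = 1; else fail. */
	return __atomic_compare_exchange_n(&run_once, &expected, 1, false,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
```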

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
3 years agobuild: remove deprecated cpuflag macros
Radu Nicolau [Thu, 24 Sep 2020 08:18:29 +0000 (08:18 +0000)]
build: remove deprecated cpuflag macros

Replace use of RTE_MACHINE_CPUFLAG macros with regular compiler
macros. The compiler macros are more complete than those provided by
DPDK, so new instruction sets can be leveraged without extra work to
set them up in DPDK.
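
For illustration, code that needs to know whether AVX2 is enabled can test
the macro the compiler defines itself (__AVX2__ on GCC and Clang, e.g. with
-mavx2). The helper name below is hypothetical:

```c
/* Report whether this translation unit was compiled with AVX2 enabled. */
static int
built_with_avx2(void)
{
#ifdef __AVX2__	/* predefined by the compiler, not by the DPDK build */
	return 1;
#else
	return 0;
#endif
}
```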

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
3 years agomaintainers: update NXP email
Sachin Saxena [Mon, 14 Sep 2020 14:06:03 +0000 (19:36 +0530)]
maintainers: update NXP email

Updated email of maintainer.

Signed-off-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
3 years agomaintainers: update Mellanox emails
Ori Kam [Wed, 12 Aug 2020 16:08:35 +0000 (16:08 +0000)]
maintainers: update Mellanox emails

This patch updates the Mellanox maintainers' emails from
the Mellanox domain to the Nvidia domain.

Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agomaintainers: remove documentation maintainers
John McNamara [Wed, 9 Sep 2020 17:14:49 +0000 (18:14 +0100)]
maintainers: remove documentation maintainers

Removed the documentation maintainers.
The documentation is currently unmaintained.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
3 years agoeal: remove deprecated coherent IO memory barriers
Phil Yang [Wed, 23 Sep 2020 09:16:37 +0000 (17:16 +0800)]
eal: remove deprecated coherent IO memory barriers

The 20.08 release deprecated the rte_cio_*mb APIs because they
provide the same functionality as the rte_io_*mb APIs on all platforms.
Remove them and use rte_io_*mb instead.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
3 years agotest/ring: enhance debug info in failure cases
Feifei Wang [Sun, 20 Sep 2020 11:48:56 +0000 (06:48 -0500)]
test/ring: enhance debug info in failure cases

Add more parameters to the macro TEST_RING_VERIFY and expand its scope
of application. Then replace all ring API checks with
TEST_RING_VERIFY to facilitate debugging.

Furthermore, correct a spelling mistake in the macro
TEST_RING_FULL_EMTPY_ITER.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agotest/ring: factorize object checks
Feifei Wang [Sun, 20 Sep 2020 11:48:55 +0000 (06:48 -0500)]
test/ring: factorize object checks

Clean up the code by moving repeated code into the 'test_ring_mem_cmp'
function, which validates data and prints the enqueued/dequeued
elements if validation fails.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agotest/ring: validate single element enqueue/dequeue
Feifei Wang [Sun, 20 Sep 2020 11:48:54 +0000 (06:48 -0500)]
test/ring: validate single element enqueue/dequeue

Validate the return value of single element enqueue/dequeue operation in
the test.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agotest/ring: check dequeued object for single element
Feifei Wang [Sun, 20 Sep 2020 11:48:53 +0000 (06:48 -0500)]
test/ring: check dequeued object for single element

Add check in test_ring_basic_ex and test_ring_with_exact_size for single
element enqueue and dequeue operations to validate the dequeued objects.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agotest/ring: fix dequeued object checks
Feifei Wang [Sun, 20 Sep 2020 11:48:52 +0000 (06:48 -0500)]
test/ring: fix dequeued object checks

When using memcmp function to check data, the third param should be the
size of all elements, rather than the number of the elements.
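
A small sketch of the corrected comparison (elems_cmp() is a hypothetical
helper, not the function in the test suite):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare num elements of esize bytes each; returns 0 when all are equal. */
static int
elems_cmp(const void *src, const void *dst, size_t num, size_t esize)
{
	assert(src != NULL && dst != NULL);
	/* memcmp() counts bytes, so pass the total size, not the element count. */
	return memcmp(src, dst, num * esize);
}
```

Passing only num would compare the first num bytes, so a mismatch in a
later element could go unnoticed.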

Fixes: a9fe152363e2 ("test/ring: add custom element size functional tests")
Cc: stable@dpdk.org
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agotest/ring: fix number of single element enqueue/dequeue
Feifei Wang [Sun, 20 Sep 2020 11:48:51 +0000 (06:48 -0500)]
test/ring: fix number of single element enqueue/dequeue

The ring capacity is (RING_SIZE - 1), thus only (RING_SIZE - 1) number of
elements can be enqueued into the ring.
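
The capacity rule can be illustrated with a toy ring that keeps one slot
unused. This is not the rte_ring implementation; struct toy_ring,
toy_enqueue() and the RING_SIZE value are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical size, must be a power of two */

struct toy_ring {
	unsigned int head; /* next slot to write */
	unsigned int tail; /* next slot to read */
	void *slots[RING_SIZE];
};

/* Enqueue one object; returns 0 on success, -1 when the ring is full. */
static int
toy_enqueue(struct toy_ring *r, void *obj)
{
	assert(r != NULL);
	unsigned int next = (r->head + 1) & (RING_SIZE - 1);

	if (next == r->tail)
		return -1; /* full: one slot always stays unused */
	r->slots[r->head] = obj;
	r->head = next;
	return 0;
}
```

Starting from an empty ring, exactly RING_SIZE - 1 enqueues succeed before
the ring reports full.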

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agotest/ring: fix object reference for single element enqueue
Feifei Wang [Sun, 20 Sep 2020 11:48:50 +0000 (06:48 -0500)]
test/ring: fix object reference for single element enqueue

When enqueuing one element to the ring in the performance test, a
pointer to the object should be passed to the rte_ring_[sp|mp]enqueue
APIs, not the pointer to a table of void * pointers.

Fixes: a9fe152363e2 ("test/ring: add custom element size functional tests")
Cc: stable@dpdk.org
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agonet/enic: support VXLAN decap action combined with VLAN pop
Hyong Youb Kim [Wed, 9 Sep 2020 14:00:06 +0000 (07:00 -0700)]
net/enic: support VXLAN decap action combined with VLAN pop

Flow Manager (flowman) provides DECAP_STRIP operation which
decapsulates VXLAN header and then removes VLAN header from the inner
packet. Use this operation to support vxlan_decap followed by
of_pop_vlan.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: generate VXLAN src port if it is zero in template
Hyong Youb Kim [Wed, 9 Sep 2020 14:00:05 +0000 (07:00 -0700)]
net/enic: generate VXLAN src port if it is zero in template

When VXLAN source port in the template is zero, the adapter is
expected to generate a value based on the inner packet flow, when it
performs encapsulation. Flow Manager in the VIC adapter currently
lacks such ability. So, generate a random port when creating a flow if
the port is zero, to avoid transmitting packets with source port 0.
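
A minimal sketch of the fallback, assuming a hypothetical helper
vxlan_src_port() and the C library rand(); the random source and port
range used by the driver are not stated here and may differ:

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: keep a non-zero template port as is, otherwise
 * pick a random port that is never 0.
 */
static uint16_t
vxlan_src_port(uint16_t template_port)
{
	if (template_port != 0)
		return template_port;
	/* rand() % 65535 is at most 65534, so adding 1 never yields 0 */
	return (uint16_t)(rand() % 65535 + 1);
}
```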

Fixes: ea7768b5bba8 ("net/enic: add flow implementation based on Flow Manager API")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: ignore VLAN inner type when it is zero
Hyong Youb Kim [Wed, 9 Sep 2020 14:00:04 +0000 (07:00 -0700)]
net/enic: ignore VLAN inner type when it is zero

When a VLAN pattern is present, the flow handler always copies its
inner_type to the match buffer regardless of its value (i.e. HW
matches inner_type against packet's inner ethertype). When inner_type
spec and mask are both 0, adding it to the match buffer is usually
harmless but breaks the following pattern used in some applications
like OVS-DPDK.

flow create 0 ingress ... pattern eth ... type is 0x0800 /
vlan tci spec 0x2 tci mask 0xefff / ipv4 / end actions count /
of_pop_vlan / ...

The VLAN pattern's inner_type is 0. And the outer eth pattern's type
actually specifies the inner ethertype. The outer ethertype (0x0800)
is first copied to the match buffer. Then, the driver copies
inner_type (0) to the match buffer, which overwrites the existing
0x0800 with 0 and breaks the app usage above.

Simply ignore inner_type when it is 0, which is the correct
behavior. As a byproduct, the driver can support the usage like the
above.
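
The overwrite and its fix can be sketched with a much-simplified match
buffer. All names are hypothetical, and spec and mask are collapsed into a
single value for brevity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical match buffer with a single ethertype slot. */
struct match_buf {
	uint16_t ether_type;
};

/* eth item: always records its type. */
static void
copy_eth_type(struct match_buf *m, uint16_t type)
{
	assert(m != NULL);
	m->ether_type = type;
}

/* vlan item: a zero inner_type means "unspecified" and is ignored. */
static void
copy_vlan_inner_type(struct match_buf *m, uint16_t inner_type)
{
	assert(m != NULL);
	if (inner_type == 0)
		return; /* keep what the eth item already wrote */
	m->ether_type = inner_type;
}
```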

Fixes: ea7768b5bba8 ("net/enic: add flow implementation based on Flow Manager API")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: support priorities for TCAM flows
Hyong Youb Kim [Wed, 9 Sep 2020 14:00:03 +0000 (07:00 -0700)]
net/enic: support priorities for TCAM flows

Group 0 corresponds to TCAM which supports priorities. Accept non-zero
priorities for group 0 flows.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: support egress port id action
Hyong Youb Kim [Wed, 9 Sep 2020 14:00:02 +0000 (07:00 -0700)]
net/enic: support egress port id action

Use Flow Manager (flowman) to support egress PORT_ID action. It can
steer egress packets from PFs and VFs to any uplink port as long as
they are all on the same VIC adapter. It can also steer packets
between ports on the same VIC adapter (i.e. loopback).

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: remove obsolete code
Hyong Youb Kim [Wed, 9 Sep 2020 14:00:01 +0000 (07:00 -0700)]
net/enic: remove obsolete code

The 'next' field in struct enic is unused. The comment in enic_cq_rq()
is out-of-date. Remove them.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: enable flow API for VF representor
Hyong Youb Kim [Wed, 9 Sep 2020 13:56:56 +0000 (06:56 -0700)]
net/enic: enable flow API for VF representor

Use Flow Manager (flowman) to support flow API for
representors. Representor's flow handlers simply invoke PF handlers
and pass the representor's flowman structure. The PF flowman handlers
are aware of representors and perform appropriate devcmds to create
flows on the NIC.

Also use flowman to create internal flows for implicit VF-representor
path. With that, representor Tx/Rx is now functional.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: extend flow handler to support VF representors
Hyong Youb Kim [Wed, 9 Sep 2020 13:56:55 +0000 (06:56 -0700)]
net/enic: extend flow handler to support VF representors

VF representor ports can create flows on VFs through the PF flowman
(Flow Manager) instance in the firmware. These flows match packets
egressing from VFs and apply flowman actions.

1. Make flow handler aware of VF representors
When a representor port invokes flow APIs, use the PF port's flowman
instance to perform flowman devcmd. If the port ID refers to a
representor, use VF handle instead of PF handle.

2. Serialize flow API calls
Multiple application threads may invoke flow APIs through PF and VF
representor ports simultaneously. This leads to races, as all ports
share the same PF flowman instance. Use a lock to serialize API
calls. The lock is used only when representors exist.

3. Add functions to create flows for implicit representor paths
There is an implicit path between VF and its representor. The
functions below create flow rules to implement that path.
- enic_fm_add_rep2vf_flow()
- enic_fm_add_vf2rep_flow()

The flows created for representor paths are marked as internal. These
are not visible to application, and the flush API does not destroy
them. They are automatically deleted when the representor port stops
(enic_fm_destroy).

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: add single queue Tx and Rx to VF representor
Hyong Youb Kim [Wed, 9 Sep 2020 13:56:54 +0000 (06:56 -0700)]
net/enic: add single queue Tx and Rx to VF representor

A VF representor allocates queues from the PF's pool of queues and uses
them for its Tx and Rx. It supports 1 Tx queue and 1 Rx queue.

Implicit packet forwarding between representor queues and VF does not
yet exist. It will be enabled in subsequent commits using flowman API.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: add minimal VF representor
Hyong Youb Kim [Wed, 9 Sep 2020 13:56:53 +0000 (06:56 -0700)]
net/enic: add minimal VF representor

Enable the minimal VF representor without Tx/Rx and flow API support.

1. Enable the standard devarg 'representor'
When the devarg is specified, create VF representor ports.

2. Initialize flowman early during PF probe
Representors require the flowman API from the firmware. Initialize it
before creating VF representors, so probe can detect the flowman
support and fail if not available.

3. Add enic_fm_allocate_switch_domain() to allocate switch domain ID
PFs and VFs on the same VIC adapter can forward packets to each other,
so the switch domain is the physical adapter.

4. Create a vnic_dev lock to serialize concurrent devcmd calls
PF and VF representor ports may invoke devcmd (e.g. dump stats)
simultaneously. As they all share a single PF devcmd instance in the
firmware, use a lock to serialize devcmd calls.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/enic: extend VNIC dev API for VF representors
Hyong Youb Kim [Wed, 9 Sep 2020 13:56:52 +0000 (06:56 -0700)]
net/enic: extend VNIC dev API for VF representors

VF representors need to proxy devcmd through the PF vnic_dev
instance. Extend vnic_dev to accommodate them as follows.

1. Add vnic_vf_rep_register()
A VF representor creates its own vnic_dev instance via this function
and saves VF ID. When performing devcmd, vnic_dev uses the saved VF ID
to proxy devcmd through the PF vnic_dev instance.

2. Add vnic_register_lock()
As PF and VF representors appear as independent ports to the
application, its threads may invoke APIs on them simultaneously,
leading to race conditions on the PF vnic_dev. For example, thread A
can query stats on PF port, while thread B queries stats on a VF
representor.

The PF port invokes this function to provide a lock to vnic_dev. This
lock is used to serialize devcmd calls from PF and VF representors.

3. Add utility functions to assist VF representor settings
vnic_dev_mtu() and vnic_dev_uif() retrieve vnic MTU and UIF number
(uplink index), respectively.
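
The lock registration in item 2 can be pictured as a pair of callbacks
stored in the device and invoked around each command. The sketch uses
hypothetical toy_* names and a counting "lock" so it stays self-contained;
it does not reproduce the vnic_dev structures or signatures.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dev {
	void *priv;			/* argument handed to the callbacks */
	void (*lock)(void *priv);
	void (*unlock)(void *priv);
	int cmds_issued;
};

/* The PF side provides the lock; without it, commands run unserialized. */
static void
toy_register_lock(struct toy_dev *d, void (*lock)(void *),
		void (*unlock)(void *))
{
	assert(d != NULL);
	d->lock = lock;
	d->unlock = unlock;
}

/* Every command takes the lock first, if one was registered. */
static void
toy_devcmd(struct toy_dev *d)
{
	assert(d != NULL);
	if (d->lock != NULL)
		d->lock(d->priv);
	d->cmds_issued++; /* stands in for the real device command */
	if (d->unlock != NULL)
		d->unlock(d->priv);
}

/* Counting stand-ins for a real lock, used only to observe the calls. */
static void count_lock(void *priv) { ++*(int *)priv; }
static void count_unlock(void *priv) { --*(int *)priv; }
```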

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
3 years agonet/hns3: add Rx buffer size to Rx queue info
Chengchang Tang [Mon, 21 Sep 2020 13:22:38 +0000 (21:22 +0800)]
net/hns3: add Rx buffer size to Rx queue info

Report hns3 PMD configured Rx buffer size in Rx queue information query.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
3 years agoethdev: support getting Rx buffer size in Rx queue info
Chengchang Tang [Mon, 21 Sep 2020 13:22:37 +0000 (21:22 +0800)]
ethdev: support getting Rx buffer size in Rx queue info

Add a field named rx_buf_size in rte_eth_rxq_info to indicate the buffer
size used in receiving packets for HW.

In this way, upper-layer users can get this information by calling
rte_eth_rx_queue_info_get.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/netvsc: fix rndis packet addresses
Long Li [Fri, 18 Sep 2020 18:53:47 +0000 (11:53 -0700)]
net/netvsc: fix rndis packet addresses

The address should be calculated before type cast, not after.
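
A generic illustration of why the order matters (struct pkt_hdr and
hdr_at() are hypothetical, not the netvsc definitions): pointer arithmetic
is scaled by the pointed-to type, so casting first multiplies the offset
by the structure size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte header. */
struct pkt_hdr {
	uint32_t type;
	uint32_t len;
};

/* Correct order: add the byte offset first, then cast the result. */
static struct pkt_hdr *
hdr_at(void *base, size_t byte_off)
{
	assert(base != NULL);
	return (struct pkt_hdr *)((uint8_t *)base + byte_off);
}
```

By contrast, (struct pkt_hdr *)base + byte_off would advance by
byte_off * sizeof(struct pkt_hdr) bytes.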

Fixes: cc0251813277 ("net/netvsc: split send buffers from Tx descriptors")
Cc: stable@dpdk.org
Reported-by: Souvik Dey <sodey@rbbn.com>
Signed-off-by: Long Li <longli@microsoft.com>
3 years agonet/iavf: fix iterator for RSS LUT
Qi Zhang [Mon, 21 Sep 2020 08:30:58 +0000 (16:30 +0800)]
net/iavf: fix iterator for RSS LUT

Change RSS LUT iterator from uint8_t to uint16_t since the
RSS LUT size could exceed 255.
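
A sketch of the loop with the wider counter (fill_lut() is a hypothetical
helper, not the driver function):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill an RSS lookup table round-robin across nb_queues queues. */
static void
fill_lut(uint8_t *lut, uint16_t lut_size, uint16_t nb_queues)
{
	/*
	 * A uint8_t counter would wrap from 255 back to 0 and could never
	 * reach a lut_size above 255.
	 */
	uint16_t i;

	assert(lut != NULL && nb_queues != 0);
	for (i = 0; i < lut_size; i++)
		lut[i] = (uint8_t)(i % nb_queues);
}
```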

Fixes: 69dd4c3d0898 ("net/avf: enable queue and device")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ting Xu <ting.xu@intel.com>
3 years agonet/memif: relax barrier for zero copy path
Phil Yang [Fri, 11 Sep 2020 05:38:19 +0000 (13:38 +0800)]
net/memif: relax barrier for zero copy path

Using 'rte_mb' to synchronize the shared ring head/tail between producer
and consumer stalls the pipeline and hurts performance on weak
memory model platforms, such as aarch64.

Relaxing the expensive barrier to C11 atomics with explicit memory
ordering improves throughput by 3.6%.
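
The general pattern is a release store paired with an acquire load in
place of a full barrier. The sketch assumes the GCC/Clang __atomic
builtins; struct toy_ring_idx and both helpers are hypothetical and do not
reproduce the memif ring layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index pair shared by one producer and one consumer. */
struct toy_ring_idx {
	uint16_t head;
	uint16_t tail;
};

/* Producer: writes made before this store are visible once head is seen. */
static void
publish_head(struct toy_ring_idx *r, uint16_t new_head)
{
	assert(r != NULL);
	__atomic_store_n(&r->head, new_head, __ATOMIC_RELEASE);
}

/* Consumer: the acquire load pairs with the producer's release store. */
static uint16_t
read_head(struct toy_ring_idx *r)
{
	assert(r != NULL);
	return __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);
}
```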

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jakub Grajciar <jgrajcia@cisco.com>