dpdk.git
3 years agocommon/mlx5: share common definitions
Michael Baum [Tue, 19 Oct 2021 20:55:47 +0000 (23:55 +0300)]
common/mlx5: share common definitions

Create MACRO definitions file in the common driver as preparation for MR
and basic probe sharing.
Move relevant definitions from the net driver to the above file.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agocommon/mlx5: share basic probing with internal drivers
Michael Baum [Tue, 19 Oct 2021 20:55:46 +0000 (23:55 +0300)]
common/mlx5: share basic probing with internal drivers

Create common probing structure that includes, for now, basic probing
information detected by the common driver and share it with all the
internal drivers.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: register memory event callback in Windows
Michael Baum [Tue, 19 Oct 2021 20:55:45 +0000 (23:55 +0300)]
net/mlx5: register memory event callback in Windows

In device initialization, the driver registers to free hugepages events.
When hugepage is released, this callback frees all its related MRs.

In Windows initialization, this callback is not registered what may
cause to use invalid memory.

This patch adds memory event callback registration in Windows
initialization.

Fixes: 980826dc6f0f ("net/mlx5: probe on Windows")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agotest: add devargs test cases
Xueming Li [Wed, 20 Oct 2021 15:47:39 +0000 (23:47 +0800)]
test: add devargs test cases

Initial version to test global devargs syntax.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
3 years agodevargs: make bus optional
Xueming Li [Wed, 20 Oct 2021 15:47:38 +0000 (23:47 +0800)]
devargs: make bus optional

Global devargs syntax is used as device iteration filter like
"class=vdpa", a devargs without bus args is valid from parsing
perspective.

This patch makes bus args optional.

Fixes: d2a66ad79480 ("bus: add device arguments name parsing")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
3 years agodevargs: support path value with global device syntax
Xueming Li [Wed, 20 Oct 2021 15:47:37 +0000 (23:47 +0800)]
devargs: support path value with global device syntax

Slash is used to split global device arguments.

To support path value which contains slash, this patch parses devargs by
locating both slash and layer name key:
  bus=a,name=/some/path/class=b,k1=v1/driver=c,k2=v2
"/class=" and "/driver" are valid start of a layer.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
3 years agombuf: fix reset on mbuf free
Olivier Matz [Wed, 29 Sep 2021 21:37:07 +0000 (23:37 +0200)]
mbuf: fix reset on mbuf free

m->nb_seg must be reset on mbuf free whatever the value of m->next,
because it can happen that m->nb_seg is != 1. For instance in this
case:

  m1 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_append(m1, 500);
  m2 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_append(m2, 500);
  rte_pktmbuf_chain(m1, m2);
  m0 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_append(m0, 500);
  rte_pktmbuf_chain(m0, m1);

As rte_pktmbuf_chain() does not reset nb_seg in the initial m1
segment (this is not required), after this code the mbuf chain
have 3 segments:
  - m0: next=m1, nb_seg=3
  - m1: next=m2, nb_seg=2
  - m2: next=NULL, nb_seg=1

Then split this chain between m1 and m2, it would result in 2 packets:
  - first packet
    - m0: next=m1, nb_seg=2
    - m1: next=NULL, nb_seg=2
  - second packet
    - m2: next=NULL, nb_seg=1

Freeing the first packet will not restore nb_seg=1 in the second
segment. This is an issue because it is expected that mbufs stored
in pool have their nb_seg field set to 1.

Fixes: 8f094a9ac5d7 ("mbuf: set mbuf fields while in pool")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
3 years agohash: promote some functions to stable
Honnappa Nagarahalli [Fri, 15 Oct 2021 02:27:10 +0000 (21:27 -0500)]
hash: promote some functions to stable

Promote APIs to stable.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
3 years agotest/hash: fix buffer overflow with jhash
Vladimir Medvedkin [Thu, 14 Oct 2021 17:48:19 +0000 (18:48 +0100)]
test/hash: fix buffer overflow with jhash

This patch fixes buffer overflow reported by ASAN,
please reference https://bugs.dpdk.org/show_bug.cgi?id=818

Some tests for the rte_hash table use the rte_jhash_32b() as
the hash function. This hash function interprets the length
argument in units of 4 bytes.

This patch adds a wrapper function around rte_jhash_32b()
to reflect API differences regarding the length argument,
effectively dividing it by 4.

For some tests rte_jhash() is used with keys of length not
a multiple of 4 bytes. From the rte_jhash() documentation:
If input key is not aligned to four byte boundaries or a
multiple of four bytes in length, the memory region just
after may be read (but not used in the computation).

This patch increases the size of the proto field of the
flow_key struct up to uint32_t.

Bugzilla ID: 818
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
3 years agoring: fix name size in ring structure
Honnappa Nagarahalli [Thu, 14 Oct 2021 20:55:50 +0000 (15:55 -0500)]
ring: fix name size in ring structure

Use correct define for the name array size. The change breaks ABI and
hence cannot be backported to stable branches.

Fixes: 38c9817ee1d8 ("mempool: adjust name size in related data types")

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agoethdev: replace bit shifts with macros
Thomas Monjalon [Wed, 20 Oct 2021 14:55:17 +0000 (15:55 +0100)]
ethdev: replace bit shifts with macros

The macros RTE_BIT32 and RTE_BIT64 are used to replace bit shifts.
The macro UINT64C is also used to replace remaining occurrences of ULL.

The bit shifts of ETH_RSS_LEVEL_* are kept for aesthetic reason.

The API of rte_mtr and rte_tm is using enums for 64-bit variables.
As they are enums, unsigned bit cannot be used.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
3 years agoethdev: forbid closing started device
Andrew Rybchenko [Wed, 20 Oct 2021 13:15:40 +0000 (16:15 +0300)]
ethdev: forbid closing started device

Ethernet device must be stopped first before close in accordance
with the documentation.

Fixes: 980995f8cc56 ("ethdev: improve API comments of close and detach functions")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agoapp/testpmd: add flex item commands
Gregory Etelson [Wed, 20 Oct 2021 15:14:57 +0000 (18:14 +0300)]
app/testpmd: add flex item commands

Network port hardware is shipped with fixed number of
supported network protocols. If application must work with a
protocol that is not included in the port hardware by default, it
can try to add the new protocol to port hardware.

Flex item or flex parser is port infrastructure that allows
application to add support for a custom network header and
offload flows to match the header elements.

Application must complete the following tasks to create a flow
rule that matches custom header:

1. Create flow item object in port hardware.
Application must provide custom header configuration to PMD.
PMD will use that configuration to create flex item object in
port hardware.

2. Create flex patterns to match. Flex pattern has a spec and a mask
components, like a regular flow item. Combined together, spec and mask
can target unique data sequence or a number of data sequences in the
custom header.
Flex patterns of the same flex item can have different lengths.
Flex pattern is identified by unique handler value.

3. Create a flow rule with a flex flow item that references
flow pattern.

Testpmd flex CLI commands are:

testpmd> flow flex_item create <port> <flex_id> <filename>

testpmd> set flex_pattern <pattern_id> \
         spec <spec data> mask <mask data>

testpmd> set flex_pattern <pattern_id> is <spec_data>

testpmd> flow create <port> ... \
/ flex item is <flex_id> pattern is <pattern_id> / ...

The patch works with the jansson library API.
A new optional dependency on jansson library is added for
testpmd. If jansson not detected the flex item functionality
is disabled.
Jansson development files must be present:
jansson.pc, jansson.h libjansson.[a,so]

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agoapp/testpmd: add flow command parsing routine
Gregory Etelson [Wed, 20 Oct 2021 15:14:56 +0000 (18:14 +0300)]
app/testpmd: add flow command parsing routine

testpmd flow creation is constructed from these procedures:
  1. receive string with flow rule description;
  2. parse input string and build flow parameters: port_id value,
     flow attributes, items array, actions array;
  3. create a flow rule from flow rule parameters.

Flow rule creation procedures are built as a pipeline. A new
procedure starts immediately after successful predecessor completion.
Due to this we have no dedicated routines providing intermediate
results for step 1-3 above.

The patch adds `flow_parse()` function call. It parses input string
and provides a caller with parsed data. This is a preparation step
for introducing flex item command processing.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agoethdev: introduce configurable flexible item
Viacheslav Ovsiienko [Wed, 20 Oct 2021 15:14:55 +0000 (18:14 +0300)]
ethdev: introduce configurable flexible item

1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As the result, implementing the support for tunnel protocols
like aforementioned GENEVE with variable extra protocol option
with flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length, (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with desired
    configuration according to the protocol specification, it
    creates the flex item object over specified ethernet device
    and prepares PMD and underlying hardware to handle flex
    item. On item creation call PMD backing the specified
    ethernet device returns the opaque handle identifying
    the object has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much suitable for
the application to maintain the single structure within the
single pattern buffer.

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data be sampled
from the packet and then matched with established flows.

There are several methods supposed to calculate field offset
in runtime depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated from as:

  field_base + (*offset_base & offset_mask) << offset_shift

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + bitcount(*offset_base & offset_mask) << offset_shift

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
Input link specified the preceding header pattern, for example
for GENEVE it can be UDP item specifying match on destination
port with value 6081. The flex item can follow multiple header
types and multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If there are multiple protocols supposed to be supported with
flex items in chained fashion - two or more flex items within
the same flow and these ones might be neighbors in the pattern,
it means the flex items are mutual referencing.  In this case,
the item that occurred first should be created with empty
output link list or with the list including existing items,
and then the second flex item should be created referencing
the first flex item as input arc, drivers should adjust
the item configuration.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage not overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32 header_length; /* length in dwords, including options */
    rte_be32 specific0;     /* some protocol data, no intention */
    rte_be32 specific1;     /* to match in flows on these fields */
    rte_be32 crucial;       /* data of interest, match is needed */
    rte_be32 options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = 0x123456789A
    .length = sizeiof(struct new_protocol_header),
    .pattern = &spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = &mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
3 years agoethdev: support flow elements with variable length
Gregory Etelson [Wed, 20 Oct 2021 15:14:54 +0000 (18:14 +0300)]
ethdev: support flow elements with variable length

Flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

There is the new Flow items with variable lengths coming - flex
item. In order to handle this item (and potentially other new ones
with variable pattern length) in flow copy and conversion routines
the helper function is introduced.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
3 years agonet/ice: fix sideband queue initialization
Dapeng Yu [Tue, 19 Oct 2021 09:44:49 +0000 (17:44 +0800)]
net/ice: fix sideband queue initialization

Sideband queue need to be initialized when device is initialized.
Otherwise the calling to function "ice_init_ctrlq" may fail.

This patch fixes it.

Fixes: 97f4f78bbd9f ("net/ice/base: add functions for device clock control")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agocommon/cnxk: improve MCAM entries management
Satheesh Paul [Tue, 5 Oct 2021 02:55:27 +0000 (08:25 +0530)]
common/cnxk: improve MCAM entries management

This patch removes the MCAM preallocation scheme. The free
entry cache is removed and for every flow created, an MCAM
allocation request is made to the kernel. Each priority level
has a list of MCAM entries. For every flow rule added, the
MCAM entry obtained from kernel is checked if it is at the
correct user specified priority. If not, the existing rules
are moved across MCAM entries so that the user specified
priority is maintained.

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
3 years agonet/cnxk: support telemetry
Gowrishankar Muthukrishnan [Tue, 19 Oct 2021 11:27:27 +0000 (16:57 +0530)]
net/cnxk: support telemetry

Add telemetry endpoints to ethdev.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agomempool/cnxk: support telemetry
Gowrishankar Muthukrishnan [Tue, 19 Oct 2021 11:27:26 +0000 (16:57 +0530)]
mempool/cnxk: support telemetry

Adding telemetry endpoints to cnxk mempool driver.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support telemetry for NIX
Gowrishankar Muthukrishnan [Tue, 19 Oct 2021 11:27:25 +0000 (16:57 +0530)]
common/cnxk: support telemetry for NIX

Add telemetry endpoints to NIX.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support telemetry for network pool allocator
Gowrishankar Muthukrishnan [Tue, 19 Oct 2021 11:27:24 +0000 (16:57 +0530)]
common/cnxk: support telemetry for network pool allocator

Add telemetry endpoints to NPA.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support meter action to flow destroy
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:12 +0000 (12:36 +0530)]
net/cnxk: support meter action to flow destroy

Meters are configured per flow using rte_flow_create API.
Patch adds support for destroy operation for meter action
applied on the flow.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support meter action to flow create
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:11 +0000 (12:36 +0530)]
net/cnxk: support meter action to flow create

Meters are configured per flow using rte_flow_create API.
Implement support for meter action applied on the flow.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to read/update meter stats
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:10 +0000 (12:36 +0530)]
net/cnxk: support to read/update meter stats

Implement API to read and update stats corresponding to
given meter instance for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to update precolor DSCP table
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:09 +0000 (12:36 +0530)]
net/cnxk: support to update precolor DSCP table

Implement API to update DSCP table for pre-coloring for
incoming packet per nixlf for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to enable/disable meter
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:08 +0000 (12:36 +0530)]
net/cnxk: support to enable/disable meter

Implement API to enable or disable meter instance for
CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to delete meter
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:07 +0000 (12:36 +0530)]
net/cnxk: support to delete meter

Implement API to delete meter instance for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to create meter
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:06 +0000 (12:36 +0530)]
net/cnxk: support to create meter

Implement API to create meter instance for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to delete meter policy
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:05 +0000 (12:36 +0530)]
net/cnxk: support to delete meter policy

Implement API to delete meter policy for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to create meter policy
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:04 +0000 (12:36 +0530)]
net/cnxk: support to create meter policy

Implement API to add meter policy for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to validate meter policy
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:03 +0000 (12:36 +0530)]
net/cnxk: support to validate meter policy

Implement API to validate meter policy for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to delete meter profile
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:02 +0000 (12:36 +0530)]
net/cnxk: support to delete meter profile

Implement API to delete meter profile for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to create meter profile
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:01 +0000 (12:36 +0530)]
net/cnxk: support to create meter profile

Implement API to add meter profile for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support to get meter capabilities
Sunil Kumar Kori [Tue, 12 Oct 2021 07:06:00 +0000 (12:36 +0530)]
net/cnxk: support to get meter capabilities

Implement ethdev operation to get meter capabilities for
CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/cnxk: support meter ops get
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:59 +0000 (12:35 +0530)]
net/cnxk: support meter ops get

To enable support for ingress meter, supported operations
are exposed for CNXK platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support meter in action list
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:58 +0000 (12:35 +0530)]
common/cnxk: support meter in action list

Meter action is added in supported action list.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support profile stats reset
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:57 +0000 (12:35 +0530)]
common/cnxk: support profile stats reset

Implement RoC API to reset stats per bandwidth profile
or per NIXLF.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support profile statistics
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:56 +0000 (12:35 +0530)]
common/cnxk: support profile statistics

CN10K platform provides statistics per bandwidth profile and
per nixlf. Implement RoC API to read stats for given bandwidth
profile.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support bandwidth profile stats to index
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:55 +0000 (12:35 +0530)]
common/cnxk: support bandwidth profile stats to index

CN10K platform supports different stats for HW bandwidth profiles.
Implement RoC API to get index for given stats type.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support bandwidth profiles connection
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:54 +0000 (12:35 +0530)]
common/cnxk: support bandwidth profiles connection

To maintain chain of bandwidth profiles, they needs to be
connected. Implement RoC API to connect two bandwidth profiles
at different levels.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support precolor table setup
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:53 +0000 (12:35 +0530)]
common/cnxk: support precolor table setup

For initial coloring of input packet, CN10K platform maintains
precolor table for VLAN, DSCP and Generic. Implement RoC
interface to setup pre color table.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support bandwidth profile dump
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:52 +0000 (12:35 +0530)]
common/cnxk: support bandwidth profile dump

Implement RoC API to dump bandwidth profile on CN10K
platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support profile state toggle
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:51 +0000 (12:35 +0530)]
common/cnxk: support profile state toggle

Implement RoC API to enable or disable HW bandwidth profiles
on CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support bandwidth profile configure
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:50 +0000 (12:35 +0530)]
common/cnxk: support bandwidth profile configure

Implement RoC API to configure HW bandwidth profile for
CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support bandwidth profiles free
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:49 +0000 (12:35 +0530)]
common/cnxk: support bandwidth profiles free

Implement RoC interface to free HW bandwidth profiles on
CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support bandwidth profiles allocation
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:48 +0000 (12:35 +0530)]
common/cnxk: support bandwidth profiles allocation

Implement RoC API to allocate HW resources i.e. bandwidth
profiles for policer processing on CN10K platform.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support to get profile count
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:47 +0000 (12:35 +0530)]
common/cnxk: support to get profile count

Implement interface to get available profile count for given
NIXLF.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: support to get policer level to index
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:46 +0000 (12:35 +0530)]
common/cnxk: support to get policer level to index

CN10K platform supports policer up to 3 level of hierarchy.
Implement RoC API to get corresponding index for given level.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: update policer mbox API and HW definitions
Sunil Kumar Kori [Tue, 12 Oct 2021 07:05:45 +0000 (12:35 +0530)]
common/cnxk: update policer mbox API and HW definitions

To support ingress policer on CN10K, MBOX interfaces and HW
definitions updated.

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/octeontx2: use fast udata and mdata flags
Tejasree Kondoj [Fri, 24 Sep 2021 17:11:45 +0000 (22:41 +0530)]
net/octeontx2: use fast udata and mdata flags

Using fast metadata and userdata flags instead of
driver callbacks for set_pkt_metadata and
get_userdata in inline IPsec.

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
3 years agonet/mlx5: fix RSS expansion for L2/L3 VXLAN
Lior Margalit [Thu, 14 Oct 2021 06:21:36 +0000 (09:21 +0300)]
net/mlx5: fix RSS expansion for L2/L3 VXLAN

The RSS expansion algorithm is using a graph to find the possible
expansion paths. The current implementation does not differentiate
between standard (L2) VXLAN and L3 VXLAN. As result the flow is expanded
with all possible paths.
For example:
testpmd> flow create... / vxlan / end actions rss level 2 / end
It is currently expanded to the following paths:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN ETH IPV4 END
ETH IPV4 UDP VXLAN ETH IPV6 END
ETH IPV4 UDP VXLAN IPV4 END
ETH IPV4 UDP VXLAN IPV6 END

The fix is to adjust the expansion according to the outer UDP destination
port. In case flow pattern defines a match on the standard udp port, 4789,
or does not define a match on the destination port, which also implies
setting the standard one, the expansion for the above example will be:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN ETH IPV4 END
ETH IPV4 UDP VXLAN ETH IPV6 END
Otherwise, the expansion will be:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN IPV4 END
ETH IPV4 UDP VXLAN IPV6 END

Fixes: f4f06e361516 ("net/mlx5: add flow VXLAN item")
Cc: stable@dpdk.org
Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/i40e: fix risk in descriptor read in NEON Rx
Ruifeng Wang [Wed, 15 Sep 2021 08:33:38 +0000 (16:33 +0800)]
net/i40e: fix risk in descriptor read in NEON Rx

Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.

In NEON vector PMD, vector load loads two contiguous 8B of
descriptor data into vector register. Given vector load ensures no
16B atomicity, read of the word that includes DD field could be
reordered after read of other words. In this case, some words could
contain invalid data.

Read barrier is added after read of qword1 that includes DD field.
And qword0 is reloaded to update vector register. This ensures
that the fetched data is correct.

Testpmd single core test on N1SDP/ThunderX2 showed no performance drop.

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
3 years agonet/i40e: fix IPv6 fragment RSS offload type in flow
Alvin Zhang [Tue, 19 Oct 2021 01:53:48 +0000 (09:53 +0800)]
net/i40e: fix IPv6 fragment RSS offload type in flow

To keep flow format uniform with ice, this patch adds support for
this RSS rule:
    flow create 0 ingress pattern eth / ipv6 / ipv6_frag_ext / end \
    actions rss types ipv6-frag end queues end queues end / end

Fixes: ef4c16fd9148 ("net/i40e: refactor RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix generic build on FreeBSD
Leyi Rong [Tue, 19 Oct 2021 03:02:08 +0000 (11:02 +0800)]
net/ice: fix generic build on FreeBSD

The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
splitting AVX-specific code into the new ice_rxtx_common_avx.h header
file to avoid such bugs.

Bugzilla ID: 788
Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
Fixes: 20daa1c978b7 ("net/ice: fix crash in AVX512")
Cc: stable@dpdk.org
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/i40e: fix generic build on FreeBSD
Leyi Rong [Tue, 19 Oct 2021 03:02:07 +0000 (11:02 +0800)]
net/i40e: fix generic build on FreeBSD

The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
splitting AVX-specific code into the new i40e_rxtx_common_avx.h header
file to avoid such bugs.

Bugzilla ID: 788
Fixes: 0604b1f2208f ("net/i40e: fix crash in AVX512")
Cc: stable@dpdk.org
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/mlx5: support more tunnel types
Eli Britstein [Thu, 23 Sep 2021 08:43:01 +0000 (11:43 +0300)]
net/mlx5: support more tunnel types

Accept RTE_FLOW_ITEM_TYPE_GRE, RTE_FLOW_ITEM_TYPE_NVGRE and
RTE_FLOW_ITEM_TYPE_GENEVE as valid tunnel types.

Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agoapp/testpmd: add tunnel types
Eli Britstein [Thu, 23 Sep 2021 08:43:00 +0000 (11:43 +0300)]
app/testpmd: add tunnel types

Current testpmd implementation supports VXLAN only for tunnel offload.
Add GRE, NVGRE and GENEVE for tunnel offload flow matches.

For example:
testpmd> flow tunnel create 0 type vxlan
port 0: flow tunnel #1 type vxlan
testpmd> flow tunnel create 0 type nvgre
port 0: flow tunnel #2 type nvgre
testpmd> flow tunnel create 0 type gre
port 0: flow tunnel #3 type gre
testpmd> flow tunnel create 0 type geneve
port 0: flow tunnel #4 type geneve

Fixes: 1b9f274623b8 ("app/testpmd: add commands for tunnel offload")
Cc: stable@dpdk.org
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gregory Etelson <getelson@nvidia.com>
3 years agonet/softnic: fix memory leak of meter policy
Dapeng Yu [Thu, 14 Oct 2021 02:12:07 +0000 (10:12 +0800)]
net/softnic: fix memory leak of meter policy

After the meter policies are created, they are not freed on device
close.

This patch fixes it.

Fixes: 5f0d54f372f0 ("ethdev: add pre-defined meter policy API")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
3 years agoapp/testpmd: fix access to DSCP table entries
Sunil Kumar Kori [Tue, 12 Oct 2021 08:33:17 +0000 (14:03 +0530)]
app/testpmd: fix access to DSCP table entries

During parsing of DSCP entries, memory is allocated and assigned
to *dscp_table. Later on, same memory is accessed using
*dscp_table[i++].

Due to higher precedence for array subscript, dscp_table[i++] will
be executed first which actually does not point to the same memory
which was allocated previously for DSCP table entries.

Fixes: 459463ae6c26 ("app/testpmd: fix memory allocation for DSCP table")
Cc: stable@dpdk.org
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/ena: update version to 2.5.0
Michal Krawczyk [Tue, 19 Oct 2021 10:56:29 +0000 (12:56 +0200)]
net/ena: update version to 2.5.0

This version update contains:
  * Fix for verification of the offload capabilities (especially for
    IPv6 packets).
  * Support for Tx and Rx free threshold values.
  * Fixes for per-queue offload capabilities.
  * Announce support of the scattered Rx offload.
  * NUMA aware allocations.
  * Check for the missing Tx completions.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
3 years agonet/ena: check missing Tx completions
Michal Krawczyk [Tue, 19 Oct 2021 10:56:28 +0000 (12:56 +0200)]
net/ena: check missing Tx completions

In some cases Tx descriptors may be uncompleted by the HW and as a
result they will never be released.

This patch adds checking for the missing Tx completions to the ENA timer
service, so in order to use this feature, the application must call the
function rte_timer_manage().

Missing Tx completion reset threshold is determined dynamically, by
taking into consideration ring size and the default value.

Tx cleanup is associated with the Tx burst function. As DPDK
applications can call Tx burst function dynamically, time when last
cleanup was called must be traced to avoid false detection of the
missing Tx completion.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
3 years agonet/ena: add NUMA-aware allocations
Michal Krawczyk [Tue, 19 Oct 2021 10:56:27 +0000 (12:56 +0200)]
net/ena: add NUMA-aware allocations

Only the IO rings memory was allocated with taking the socket ID into
the respect, while the other structures was allocated using the regular
rte_zmalloc() API.

Ring specific structures are now being allocated using the ring's
socket ID.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
3 years agonet/ena: advertise scattered Rx capability
Michal Krawczyk [Tue, 19 Oct 2021 10:56:26 +0000 (12:56 +0200)]
net/ena: advertise scattered Rx capability

ENA can't be forced to always pass single descriptor for the Rx packet.
Even if the passed buffer size is big enough to hold the data, we can't
make assumption that the HW won't use extra descriptor because of
internal optimizations. This assumption may be true, but only for some
of the FW revisions, which may differ depending on the used AWS instance
type.

As the scattered Rx support on the Rx path already exists, the driver
just needs to announce DEV_RX_OFFLOAD_SCATTER capability by turning on
the rte_eth_dev_data::scattered_rx option.

Fixes: 1173fca25af9 ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
3 years agonet/ena: fix per-queue offload capabilities
Michal Krawczyk [Tue, 19 Oct 2021 10:56:25 +0000 (12:56 +0200)]
net/ena: fix per-queue offload capabilities

As ENA currently doesn't support offloads which could be configured
per-queue, only per-port flags should be set.

In addition, to make the code cleaner, parsing appropriate offload
flags is encapsulated into helper functions, in a similar matter it's
done by the other PMDs.

[1] https://doc.dpdk.org/guides/prog_guide/
    poll_mode_drv.html?highlight=offloads#hardware-offload

Fixes: 7369f88f88c0 ("net/ena: convert to new Rx offloads API")
Fixes: 56b8b9b7e5d2 ("net/ena: convert to new Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
3 years agonet/ena: support Tx/Rx free thresholds
Michal Krawczyk [Tue, 19 Oct 2021 10:56:24 +0000 (12:56 +0200)]
net/ena: support Tx/Rx free thresholds

The caller can pass Tx or Rx free threshold value to the configuration
structure for each ring. It determines when the Tx/Rx function should
start cleaning up/refilling the descriptors. ENA was ignoring this value
and doing it's own calculations.

Now the user can configure ENA's behavior using this parameter and if
this variable won't be set, the ENA will continue with the old behavior
and will use it's own threshold value.

The default value is not provided by the ENA in the ena_infos_get(), as
it's being determined dynamically, depending on the requested ring size.

Note that NULL check for Tx conf was removed from the function
ena_tx_queue_setup(), as at this place the configuration will be
either provided by the user or the default config will be used and it's
handled by the upper (rte_ethdev) layer.

Tx threshold shouldn't be used for the Tx cleanup budget as it can be
inadequate to the used burst. Now the PMD tries to release mbufs for the
ring until it will be depleted.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
3 years agonet/ena: fix offload capabilities verification
Michal Krawczyk [Tue, 19 Oct 2021 10:56:23 +0000 (12:56 +0200)]
net/ena: fix offload capabilities verification

ENA PMD has multiple checksum offload flags, which are more discrete
than the DPDK offload capabilities flags.
As the driver wasn't storing it's internal checksum offload capabilities
and was relying only on the DPDK capabilities, not all scenarios could
be properly covered (like when to prepare pseudo header checksum and
when not).

Moreover, the user could request offload capability, which isn't
supported by the HW and the PMD would quietly ignore the issue.

This commit reworks eth_ena_prep_pkts() function to perform additional
checks and to properly reflect the HW requirements. With the
RTE_LIBRTE_ETHDEV_DEBUG enabled, the function will do even more
verifications, to help the user find any issues with the mbuf
configuration.

Fixes: b3fc5a1ae10d ("net/ena: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
3 years agonet/sfc: implement transfer proxy port callback
Viacheslav Galaktionov [Fri, 15 Oct 2021 06:49:03 +0000 (09:49 +0300)]
net/sfc: implement transfer proxy port callback

In sfc, MAE admin serves as a transfer proxy. In order to track which
ethdev is privileged, augment every independent switch port structure
with information about its MAE privilege.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agonet/sfc: allow ports without MAE privilege
Viacheslav Galaktionov [Fri, 15 Oct 2021 06:49:02 +0000 (09:49 +0300)]
net/sfc: allow ports without MAE privilege

Register unprivileged ports in the switch domain registry in order to
allow redirecting traffic to them.

Differentiate between different levels of MAE support, update all MAE
status checks.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agocommon/sfc_efx/base: support unprivileged MAE clients
Viacheslav Galaktionov [Fri, 15 Oct 2021 06:49:01 +0000 (09:49 +0300)]
common/sfc_efx/base: support unprivileged MAE clients

In order to differentiate between privileged and unprivileged MAE clients,
add a separate boolean flag to represent a NIC's MAE privilege level.

Allow initializing unprivileged MAE clients by avoiding calls to functions
that can only be called by the admin NIC.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
3 years agoexamples/ip_reassembly: remove unused option
Ferruh Yigit [Mon, 18 Oct 2021 13:48:53 +0000 (14:48 +0100)]
examples/ip_reassembly: remove unused option

Remove 'max-pkt-len' parameter.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agoethdev: unify MTU checks
Ferruh Yigit [Mon, 18 Oct 2021 13:48:52 +0000 (14:48 +0100)]
ethdev: unify MTU checks

Both 'rte_eth_dev_configure()' & 'rte_eth_dev_set_mtu()' sets MTU but
have slightly different checks. Like one checks min MTU against
RTE_ETHER_MIN_MTU and other RTE_ETHER_MIN_LEN.

Checks moved into common function to unify the checks. Also this has
benefit to have common error logs.

Default 'dev_info->min_mtu' (the one set by ethdev if driver doesn't
provide one), changed to ('RTE_ETHER_MIN_LEN' - overhead). Previously it
was 'RTE_ETHER_MIN_MTU' which is min MTU for IPv4 packets. Since the
intention is to provide min MTU corresponding minimum frame size, new
default value suits better.

Suggested-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
3 years agoethdev: remove jumbo offload flag
Ferruh Yigit [Mon, 18 Oct 2021 13:48:51 +0000 (14:48 +0100)]
ethdev: remove jumbo offload flag

Removing 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag.

Instead of drivers announce this capability, application can deduct the
capability by checking reported 'dev_info.max_mtu' or
'dev_info.max_rx_pktlen'.

And instead of application setting this flag explicitly to enable jumbo
frames, this can be deduced by driver by comparing requested 'mtu' to
'RTE_ETHER_MTU'.

Removing this additional configuration for simplification.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
3 years agoethdev: move MTU set check to library
Ferruh Yigit [Mon, 18 Oct 2021 13:48:50 +0000 (14:48 +0100)]
ethdev: move MTU set check to library

Move requested MTU value check to the API to prevent the duplicated
code.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agoethdev: move jumbo frame offload check to library
Ferruh Yigit [Mon, 18 Oct 2021 13:48:49 +0000 (14:48 +0100)]
ethdev: move jumbo frame offload check to library

Setting MTU bigger than RTE_ETHER_MTU requires the jumbo frame support,
and application should enable the jumbo frame offload support for it.

When jumbo frame offload is not enabled by application, but MTU bigger
than RTE_ETHER_MTU is requested there are two options, either fail or
enable jumbo frame offload implicitly.

Enabling jumbo frame offload implicitly is selected by many drivers
since setting a big MTU value already implies it, and this increases
usability.

This patch moves this logic from drivers to the library, both to reduce
the duplicated code in the drivers and to make behaviour more visible.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
3 years agoethdev: fix max Rx packet length
Ferruh Yigit [Mon, 18 Oct 2021 13:48:48 +0000 (14:48 +0100)]
ethdev: fix max Rx packet length

There is a confusion on setting max Rx packet length, this patch aims to
clarify it.

'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.

Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
stored into '(struct rte_eth_dev)->data->mtu'.

These two APIs are related but they work in a disconnected way, they
store the set values in different variables which makes hard to figure
out which one to use, also having two different method for a related
functionality is confusing for the users.

Other issues causing confusion is:
* maximum transmission unit (MTU) is payload of the Ethernet frame. And
  'max_rx_pkt_len' is the size of the Ethernet frame. Difference is
  Ethernet frame overhead, and this overhead may be different from
  device to device based on what device supports, like VLAN and QinQ.
* 'max_rx_pkt_len' is only valid when application requested jumbo frame,
  which adds additional confusion and some APIs and PMDs already
  discards this documented behavior.
* For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory
  field, this adds configuration complexity for application.

As solution, both APIs gets MTU as parameter, and both saves the result
in same variable '(struct rte_eth_dev)->data->mtu'. For this
'max_rx_pkt_len' updated as 'mtu', and it is always valid independent
from jumbo frame.

For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user
request and it should be used only within configure function and result
should be stored to '(struct rte_eth_dev)->data->mtu'. After that point
both application and PMD uses MTU from this variable.

When application doesn't provide an MTU during 'rte_eth_dev_configure()'
default 'RTE_ETHER_MTU' value is used.

Additional clarification done on scattered Rx configuration, in
relation to MTU and Rx buffer size.
MTU is used to configure the device for physical Rx/Tx size limitation,
Rx buffer is where to store Rx packets, many PMDs use mbuf data buffer
size as Rx buffer size.
PMDs compare MTU against Rx buffer size to decide enabling scattered Rx
or not. If scattered Rx is not supported by device, MTU bigger than Rx
buffer size should fail.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
3 years agonet: fix aliasing in checksum computation
Georg Sauthoff [Sun, 17 Oct 2021 20:37:18 +0000 (22:37 +0200)]
net: fix aliasing in checksum computation

That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that a unsigned char
pointer may alias while the C standard doesn't include such requirement
for uint8_t pointers.

Also simplified the loop since a modern C compiler can speed up (i.e.
auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.

Fixes: 6006818cfb26 ("net: new checksum functions")
Fixes: e079655c41fb ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org
Signed-off-by: Georg Sauthoff <mail@gms.tf>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
3 years agonet/enic: fix build with GCC 7.5
Ferruh Yigit [Fri, 15 Oct 2021 10:28:18 +0000 (11:28 +0100)]
net/enic: fix build with GCC 7.5

Build error:
../drivers/net/enic/enic_fm_flow.c: In function 'enic_fm_flow_parse':
../drivers/net/enic/enic_fm_flow.c:1467:24:
error: 'dev' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
    struct rte_eth_dev *dev;
                        ^~~
../drivers/net/enic/enic_fm_flow.c:1580:24:
error: 'dev' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
    struct rte_eth_dev *dev;
                        ^~~
../drivers/net/enic/enic_fm_flow.c:1599:24:
error: 'dev' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
    struct rte_eth_dev *dev;
                        ^~~

Build error looks like false positive, but to silence the compiler
initializing the pointer with NULL.

Bugzilla ID: 812
Fixes: 54bd4ebe8b05 ("net/enic: support meta flow actions to overrule destinations")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agodoc: fix emulated device names in e1000 guide
William Tu [Mon, 11 Oct 2021 18:55:40 +0000 (11:55 -0700)]
doc: fix emulated device names in e1000 guide

The device name should be 82574L Gigabit Ethernet Controller.
The patch also remove a redundant "*".

Fixes: fc1f2750a3ec ("doc: programmers guide")
Cc: stable@dpdk.org
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
3 years agocommon/octeontx2: enable build only on 64-bit Linux
Pavan Nikhilesh [Thu, 14 Oct 2021 19:56:53 +0000 (01:26 +0530)]
common/octeontx2: enable build only on 64-bit Linux

Since AARCH32 extension is not implemented on octeontx2 family, only
enable build for 64bit.
Due to Linux kernel AF(Admin Function) driver dependency, only enable
build for 64-bit Linux.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/octeontx: enable build only on 64-bit Linux
Pavan Nikhilesh [Thu, 14 Oct 2021 19:56:52 +0000 (01:26 +0530)]
common/octeontx: enable build only on 64-bit Linux

Since AARCH32 extension is not implemented on octeontx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for 64-bit Linux.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agonet/thunderx: enable build only on 64-bit Linux
Pavan Nikhilesh [Thu, 14 Oct 2021 19:56:51 +0000 (01:26 +0530)]
net/thunderx: enable build only on 64-bit Linux

Since AARCH32 extension is not implemented on thunderx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for Linux.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agoapp/testpmd: fix RSS hash offload display
Jie Wang [Thu, 14 Oct 2021 10:31:24 +0000 (18:31 +0800)]
app/testpmd: fix RSS hash offload display

The driver may change RSS hash offloads in dev->data->dev_conf
during dev_configure which may cause port->dev_conf and port->rx_conf
contain outdated values.
Since testpmd uses its configuration structures to display offloads
configuration, it doesn't display RSS hash offload.

This patch updates the testpmd offloads from device configuration
to fix this issue.

Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings")

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agoethdev: add API to get device configuration
Jie Wang [Thu, 14 Oct 2021 10:31:23 +0000 (18:31 +0800)]
ethdev: add API to get device configuration

The driver may change offloads info into dev->data->dev_conf
in dev_configure which may cause apps use outdated values.

Add a new API to get actual device configuration.

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/bonding: fix Tx queue release
Xueming Li [Fri, 15 Oct 2021 09:55:48 +0000 (17:55 +0800)]
net/bonding: fix Tx queue release

When release Tx queue, Rx queue data got freed because wrong Tx queue
data located.

This patch fixes the wrong Tx queue data location.

Fixes: 7483341ae553 ("ethdev: change queue release callback")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet/mlx5: fix domains selection for meter policy
Li Zhang [Mon, 11 Oct 2021 02:40:48 +0000 (05:40 +0300)]
net/mlx5: fix domains selection for meter policy

Fate actions are different per domain.
When all the domains, ingress, egress and FDB (transfer),
can support all the policy actions, i.e. [SET_TAG],
the policy prepares resources for all the domains and
failure happens if one of the domains misses its fate action
in the policy action list.

Remove the domains missing their fate action
from the meter policy preparation.

Now, the policy will prepare a domain only when the domain supports
all the actions and when one of the domain fate actions is on the list.

Fixes: afb4aa4f122b ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/ice: fix dereferenced null pointer
Simei Su [Mon, 11 Oct 2021 08:25:25 +0000 (16:25 +0800)]
net/ice: fix dereferenced null pointer

This patch fixes coverity issue by avoiding use of null pointer
in taking false branch.

Coverity issue: 373360
Fixes: 437dbd2fd428 ("net/ice: support 1PPS")

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix freeing queues on DCF device reset
Dapeng Yu [Mon, 11 Oct 2021 07:25:46 +0000 (15:25 +0800)]
net/ice: fix freeing queues on DCF device reset

In function ice_dcf_stop_queues(), RX queues and TX queues are actually
not freed, so their pointers shall not be set to NULL when queues are
stopped.

This patch adds function call to free queues on DCF device close,
which also set the RX and TX queues' pointers to NULL on freeing
queues, and avoids referring to the released resource when device is
started again.

Fixes: 1a86f4dbdf42 ("net/ice: support DCF device reset")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix deadlock on flow redirect
Dapeng Yu [Mon, 11 Oct 2021 07:38:49 +0000 (15:38 +0800)]
net/ice: fix deadlock on flow redirect

If flow redirect failed, the spinlock will not be unlocked.
This patch fixes it.

Fixes: bc9201388d56 ("net/ice: support flow redirect")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix build when Rx descriptor size is 16
Simei Su [Mon, 11 Oct 2021 08:35:52 +0000 (16:35 +0800)]
net/ice: fix build when Rx descriptor size is 16

The Timestamp Overlay feature is available only in 32B Flex Descriptors.
This patch adds compile option when in 16B Flex Descriptors.

Fixes: 953e74e6b73a ("net/ice: enable Rx timestamp on flex descriptor")
Fixes: 646dcbe6c701 ("net/ice: support IEEE 1588 PTP")

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice/base: fix null pointer dereferences for parser
Junfeng Guo [Wed, 13 Oct 2021 10:34:55 +0000 (10:34 +0000)]
net/ice/base: fix null pointer dereferences for parser

Null-checking "p" suggests that it may be null, but it has already
been dereferenced on all paths leading to the check. Thus correct
the code lines and remove the redundant line.

Fixes: c84f8aa2100c ("net/ice/base: add parser runtime skeleton")

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/i40e: upgrade AQ command of MAC/VLAN remove
Robin Zhang [Mon, 11 Oct 2021 08:12:48 +0000 (08:12 +0000)]
net/i40e: upgrade AQ command of MAC/VLAN remove

Firmware 8.4+ will return I40E_AQ_RC_ENOENT when try to delete
non-existent MAC/VLAN addresses from the HW filtering, this should
not be considered as an Admin Queue error. But in i40e_asq_send_command,
it will return I40E_ERR_ADMIN_QUEUE_ERROR if the return value of Admin
Queue command processed by Firmware is not I40E_AQ_RC_OK or
I40E_AQ_RC_EBUSY.

Use i40e_aq_remove_macvlan_v2 instead so that we can get the
corresponding Admin Queue status, and not report as an error in DPDK
when Firmware return I40E_AQ_RC_ENOENT, and this also not break with an
old firmware.

Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice/base: fix parser runtime reset
Junfeng Guo [Mon, 11 Oct 2021 13:30:13 +0000 (13:30 +0000)]
net/ice/base: fix parser runtime reset

Adjust the code line order of the parser runtime reset, since the
struct rt->psr is used in function _rt_flag_set before assignment.

Fixes: c84f8aa2100c ("net/ice/base: add parser runtime skeleton")
Cc: stable@dpdk.org
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agodrivers/net: remove queue xstats auto-fill flag
Andrew Rybchenko [Tue, 28 Sep 2021 16:48:54 +0000 (19:48 +0300)]
drivers/net: remove queue xstats auto-fill flag

Some drivers do not provide per-queue statistics. So, there is no point
to have these misleading zeros in xstats.

Fixes: f30e69b41f94 ("ethdev: add device flag to bypass auto-filled queue xstats")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agoethdev: add telemetry endpoint for device info
Gowrishankar Muthukrishnan [Wed, 29 Sep 2021 04:25:29 +0000 (09:55 +0530)]
ethdev: add telemetry endpoint for device info

Add telemetry endpoint /ethdev/info for device info.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
3 years agonet: introduce IPv4 IHL and version fields
Gregory Etelson [Thu, 14 Oct 2021 17:41:13 +0000 (20:41 +0300)]
net: introduce IPv4 IHL and version fields

RTE IPv4 header definition combines the `version' and `ihl'  fields
into a single structure member.
This patch introduces dedicated structure members for both `version'
and `ihl' IPv4 fields. Separated header fields definitions allow to
create simplified code to match on the IHL value in a flow rule.
The original `version_ihl' structure member is kept for backward
compatibility.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet: fix IPv4 change announce
Gregory Etelson [Thu, 14 Oct 2021 17:41:12 +0000 (20:41 +0300)]
net: fix IPv4 change announce

IPv4 header encodes fragment information into 16 bits field.
3 bits hold flags and remaining 13 bits are for fragment offset.
13 bits bit-field cannot be defined both for big and little endian
systems.

The patch removes IPv4 fragments union announce.

Fixes: f7383e7c7ec1 ("net: announce changes in IPv4 header access")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/txgbe: fix VXLAN-GPE packet checksum
Jiawen Wu [Wed, 13 Oct 2021 02:45:21 +0000 (10:45 +0800)]
net/txgbe: fix VXLAN-GPE packet checksum

Parse inner L2 length to set correct packet type, and ensure that
hardware can compute the checksum successfully.

Fixes: b950203be7f1 ("net/txgbe: support VXLAN-GPE")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
3 years agonet/txgbe: set fixed flag for exact link speed
Jiawen Wu [Wed, 13 Oct 2021 02:45:20 +0000 (10:45 +0800)]
net/txgbe: set fixed flag for exact link speed

Setting exact link speed makes sense if auto-negotiation is
disabled. Fixed flag is required to disable auto-negotiation.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
3 years agonet/txgbe: fix to get interrupt status
Jiawen Wu [Wed, 13 Oct 2021 02:45:19 +0000 (10:45 +0800)]
net/txgbe: fix to get interrupt status

It's necessary to set 1 on TXGBE_PX_INTA register to get interrupts
normally, when legacy interrupt mode is used.

Fixes: 2fc745e6b606 ("net/txgbe: add interrupt operation")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>