git.droids-corp.org - dpdk.git/log
Dong Zhou [Fri, 12 Jun 2020 08:57:46 +0000 (11:57 +0300)]
net/mlx5: fix LRO checksum

The TCP checksum covers the IPv4 pseudo-header and the L3 payload,
that is, the TCP header and the TCP payload. When mlx5 LRO is
enabled, the HW calculates the TCP payload checksum, and the PMD
needs to complete it with the IPv4 pseudo-header checksum and the
TCP header checksum.

The mlx5_lro_update_tcp_hdr function completes the TCP header
checksum, but it used the lower 4 bits of the data-offset field in
the TCP header to get the TCP header length, which results in a
wrong TCP header checksum.

Update the code to use the upper 4 bits of the data-offset field
instead of the lower 4 bits.
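
As a minimal sketch of the fix (the helper name demo_tcp_hdr_len is
hypothetical and is not the PMD's code), the header length in bytes
can be derived from the upper 4 bits of the data-offset byte:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the mlx5 PMD.
 * data_off is byte 12 of the TCP header: the upper 4 bits hold the
 * header length in 32-bit words, the lower 4 bits are reserved/flag
 * bits and must not be used for the length.
 */
static inline uint32_t
demo_tcp_hdr_len(uint8_t data_off)
{
	return (uint32_t)(data_off >> 4) * 4;
}
```

For a header without options the byte is 0x50, which gives 20 bytes.
Reading the lower nibble of the same byte would give 0.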

Fixes: e4c2a16eb1de ("net/mlx5: handle LRO packets in Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Alexander Kozyrev [Thu, 11 Jun 2020 17:43:27 +0000 (17:43 +0000)]
net/mlx5: fix descriptors number adjustment

The number of descriptors to configure in a Rx/Tx queue is passed to
the mlx5_tx/rx_queue_pre_setup() function by value. That means any
adjustments of this variable are local and cannot affect the actual
value that is used to allocate mbufs in the mlx5_txq/rxq_new()
functions. Pass the number as a reference to actually update it.
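
The difference can be illustrated with a small sketch. The function
names and the power-of-two rounding are assumptions made for the
example, not the actual mlx5 code:

```c
#include <stdint.h>

/*
 * Hypothetical adjustment: round the descriptor count up to the next
 * power of two. Here the count is passed by value, so only the local
 * copy changes.
 */
static void
demo_pre_setup_by_value(uint16_t desc)
{
	while (desc & (desc - 1))
		desc++;
}

/* Same adjustment through a pointer: the caller sees the new value. */
static void
demo_pre_setup_by_ref(uint16_t *desc)
{
	while (*desc & (*desc - 1))
		(*desc)++;
}
```

With the by-value variant a request for 1000 descriptors stays at 1000
in the caller, while the pointer variant updates it to 1024.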

Fixes: 6218063b39a6 ("net/mlx5: refactor Rx data path")
Fixes: 1d88ba171942 ("net/mlx5: refactor Tx data path")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Alexander Kozyrev [Thu, 11 Jun 2020 17:42:00 +0000 (17:42 +0000)]
net/mlx5: do not select legacy MPW implicitly

The Legacy MPW (multi-packet write) should not be engaged implicitly.
We should exclude this function from a Tx burst routine selection
process unless it is requested specifically by setting the txq_mpw_en
devarg.  Exclude this function from the selection process the same way
it is done for the Enhanced MPW in the mlx5_select_tx_function()
routine.

Fixes: eb8121ab9dac ("net/mlx5: introduce Tx burst routine template")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:33 +0000 (09:32 +0000)]
net/mlx5: refactor statistics

mlx5 statistics are calculated by several methods:
1. In software when packets go through datapath.
2. Calling ioctl with ETHTOOL command (Linux specific).
3. Reading counters from SYSFS device path (Linux specific).

The Linux related functions are moved to file linux/mlx5_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:31 +0000 (09:32 +0000)]
common/mlx5: flag Verbs dependency in a DevX command

Function mlx5_devx_cmd_qp_query_tis_td() receives as a parameter a
pointer to the Verbs QP returned by ibv_create_qp(). Therefore support
it only if HAVE_IBV_FLOW_DV_SUPPORT is defined. Otherwise return the
error ENOTSUP.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:30 +0000 (09:32 +0000)]
net/mlx5: refactor device operations for Linux

There are three types of eth_dev_ops: primary, secondary and isolate.
The assignments of their function pointers are moved from the common
file mlx5.c to the Linux-specific file linux/mlx5_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:29 +0000 (09:32 +0000)]
net/mlx5: move Linux-specific functions

The Linux-specific functions of mlx5_ethdev.c are moved to
linux/mlx5_ethdev_os.c. Functions which are OS-agnostic remain in
mlx5_ethdev.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:28 +0000 (09:32 +0000)]
net/mlx5: move socket files in Linux directory

The mlx5_socket.c file uses APIs which are Linux-specific. Therefore
move it (together with mlx5_socket.h) from the net/mlx5 directory to
the net/mlx5/linux directory. This commit also updates the Makefile
and the meson files.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:27 +0000 (09:32 +0000)]
net/mlx5: rename ib in names

Renames in this commit:
mlx5_ibv_list -> mlx5_dev_ctx_list
mlx5_alloc_shared_ibctx -> mlx5_alloc_shared_dev_ctx
mlx5_free_shared_ibctx -> mlx5_free_shared_dev_ctx
mlx5_ibv_shared_port -> mlx5_dev_shared_port
ibv_port -> dev_port

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:26 +0000 (09:32 +0000)]
net/mlx5: remove completion object dependency on DV

Replace 'struct mlx5dv_devx_cmd_comp *' with 'void *' in 'struct
mlx5_dev_ctx_shared'.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Gregory Etelson [Mon, 8 Jun 2020 16:01:56 +0000 (19:01 +0300)]
net/mlx5: fix flow memory allocation size

In a DV-enabled MLX5 PMD build, mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW].size
was initialized for the DV structure. If RTE initialization encountered
an MLX5 PCI function with DV support disabled,
mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW].size was reduced to match the
legacy Verbs flow size. Since mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW] is a
global variable, that change was reflected on DV-enabled MLX5 PCI
functions too.

Running a flow with an invalid ipool size crashes the PMD.

The patch adjusts the ipool flow size for each active PCI function.

Fixes: b88341ca35fc ("net/mlx5: convert flow dev handle to indexed")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Dekel Peled [Wed, 10 Jun 2020 13:25:19 +0000 (16:25 +0300)]
net/mlx5: fix GTP mask definition location

A recent patch added the definition of the mask MLX5_GTP_FLAGS_MASK
just above the function flow_dv_validate_item_gtp(), where it is used.

That patch was applied together with other patches which modified the
same file, so the mask ended up further away from the function it is
used in.

This patch moves the mask definition to the proper location.

Fixes: 563ac307a46b ("net/mlx5: support match on GTP flags")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Vivien Didelot [Tue, 9 Jun 2020 19:07:19 +0000 (15:07 -0400)]
net/pcap: support Tx nanosecond timestamps

When capturing packets into a PCAP file, DPDK currently uses
microseconds for the timestamps. But libpcap supports interpreting
tv_usec as nanoseconds depending on the file timestamp precision,
as of commit ba89e4a18e8b ("Make timestamps precision configurable").

To support this, use PCAP_TSTAMP_PRECISION_NANO when creating the
empty PCAP file as specified by PCAP_OPEN_DEAD(3PCAP) and implement
nanosecond timeval addition. This also ensures that the precision
reported by capinfos is nanoseconds (9).

Note that NSEC_PER_SEC is defined as 1000000000L instead of 1e9 since
the latter might be interpreted as floating point.
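
A minimal sketch of such an addition follows. The names
DEMO_NSEC_PER_SEC and demo_timeradd_ns are made up for the example and
are not the ones used by the PMD. It assumes both inputs are already
normalized, so at most one carry is needed:

```c
#include <sys/time.h>

/* Integer constant on purpose: 1e9 would be a double. */
#define DEMO_NSEC_PER_SEC 1000000000L

/*
 * Add two timevals whose tv_usec fields carry nanoseconds
 * (each below DEMO_NSEC_PER_SEC) and normalize the result.
 */
static void
demo_timeradd_ns(const struct timeval *a, const struct timeval *b,
		 struct timeval *res)
{
	res->tv_sec = a->tv_sec + b->tv_sec;
	res->tv_usec = a->tv_usec + b->tv_usec;
	if (res->tv_usec >= DEMO_NSEC_PER_SEC) {
		res->tv_sec++;
		res->tv_usec -= DEMO_NSEC_PER_SEC;
	}
}
```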

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ali Alnubani [Mon, 8 Jun 2020 14:02:57 +0000 (17:02 +0300)]
net/mlx5: fix typos in meter error messages

Fixes: 3bd26b23cefc ("net/mlx5: support meter profile operations")
Cc: stable@dpdk.org
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Hongbo Zheng [Tue, 9 Jun 2020 08:44:17 +0000 (16:44 +0800)]
net/hns3: fix unintended sign extension in dump operation

There are Coverity defects related to "Unintended sign extension" in
the internal static function hns3_get_regs_length, which is used by
the register dump operation.

This patch fixes them by changing the data type of cmdq_lines,
common_lines, ring_lines and tqp_intr_lines to uint32_t in
hns3_get_regs_length of the hns3 PMD.

Coverity issue: 349917, 349914
Fixes: 936eda25e8da ("net/hns3: support dump register")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 9 Jun 2020 08:44:16 +0000 (16:44 +0800)]
net/hns3: fix unintended sign extension in fd operation

Currently, Coverity reports the following defects:

CID 349937 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: port_number with
type uint16_t (16 bits, unsigned) is promoted in port_number << cur_pos
to type int (32 bits, signed), then sign-extended to type unsigned long
(64 bits, unsigned). If port_number << cur_pos is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.

CID 349893 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: vlan_tag with type
uint8_t (8 bits, unsigned) is promoted in vlan_tag << cur_pos to type
int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If vlan_tag << cur_pos is greater than 0x7FFFFFFF, the
upper bits of the result will all be 1.

This patch fixes them by changing the data type of port_number and
vlan_tag to uint32_t in the internal static function
hns3_fd_convert_meta_data of the hns3 PMD.
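
The effect can be sketched in isolation. The helpers below are
hypothetical and are not taken from the hns3 driver. The first one
shows what widening a negative 32-bit signed value does, the second
one shows the behaviour once the operand is a uint32_t (assuming a
32-bit int, as on the platforms DPDK targets):

```c
#include <stdint.h>

/* Widening a negative signed 32-bit value replicates the sign bit. */
static uint64_t
demo_widen_signed(int32_t v)
{
	return (uint64_t)v;
}

/*
 * With a uint32_t operand the shift is done in unsigned arithmetic and
 * the result is zero-extended. The caller keeps cur_pos below 32.
 */
static uint64_t
demo_place_field(uint32_t value, unsigned int cur_pos)
{
	return value << cur_pos;
}
```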

Coverity issue: 349937, 349893
Fixes: fcba820d9b9e ("net/hns3: support flow director")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Hongbo Zheng [Tue, 9 Jun 2020 08:44:15 +0000 (16:44 +0800)]
net/hns3: ignore function return on reset error path

There is a Coverity defect related to "Unchecked return value".

The internal static function hns3_reset_err_handle handles reset
errors in the hns3 PMD. A failure in the reset process does not mean
that the network port is completely unavailable, so the command
interface between the driver and the firmware still needs to be
initialized. Whether or not hns3_cmd_init succeeds, the subsequent
processing must continue, so there is no need to check its return
value. If hns3_cmd_init fails, it logs the failure itself.

This patch adds a '(void)' cast to silence the Coverity warning.

Coverity issue: 349934
Fixes: 2790c6464725 ("net/hns3: support device reset")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 9 Jun 2020 08:44:14 +0000 (16:44 +0800)]
net/hns3: fix flow director error message

There is a Coverity defect related to "Argument cannot be negative".

This patch fixes it by passing '-ret' to the function strerror() when
ret is negative.
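
A small sketch of the pattern, assuming the common DPDK convention of
returning 0 or a negative errno value. demo_op and demo_err_text are
hypothetical names used only for illustration:

```c
#include <errno.h>
#include <string.h>

/* Hypothetical operation: returns 0 on success, negative errno on error. */
static int
demo_op(int fail)
{
	return fail ? -EINVAL : 0;
}

/* strerror() expects a positive errno value, so negate the result. */
static const char *
demo_err_text(int ret)
{
	return ret < 0 ? strerror(-ret) : "success";
}
```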

Coverity issue: 349933
Fixes: fcba820d9b9e ("net/hns3: support flow director")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Shy Shyman [Mon, 8 Jun 2020 14:17:47 +0000 (17:17 +0300)]
app/testpmd: fix error detection in MTU command

MTU is used in testpmd to set the maximum payload size for packets.
According to testpmd, the setting influences Rx only.
In rte_ethdev there is no relation between the MTU setting and the
JUMBO offload or rx_max_pkt_len.

The previous fix, in the patch referenced below, was meant to update
the correlated variables max_pkt_len and JUMBO offload, but in doing so
it assumed that the MTU can only be set when the JUMBO offload is
supported by the device. For example, a fail-safe device does support
setting the MTU but does not support the JUMBO offload. In this case,
although setting the MTU succeeds, an error message is still printed
because the JUMBO packet offload is disabled.

The fix separates the two conditions so that the error is reported
only when the set_mtu action actually failed.

Fixes: 150c9ac2df13 ("app/testpmd: update Rx offload after setting MTU")
Cc: stable@dpdk.org
Signed-off-by: Shy Shyman <shys@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Ophir Munk [Wed, 3 Jun 2020 15:06:02 +0000 (15:06 +0000)]
net/mlx5: remove Verbs dependency in spawn struct

1. Replace 'struct ibv_device *' with 'void *' in 'struct
mlx5_dev_spawn_data'. Define a getter function to retrieve the
device name.
2. Rename ibv_dev and ibv_port as phys_dev and phys_port
respectively.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:06:01 +0000 (15:06 +0000)]
net/mlx5: add Linux-specific header file

File drivers/net/linux/mlx5_os.h is added. It includes Linux-specific
definitions such as PCI driver flags, link state change interrupts,
link removal interrupts, etc.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:06:00 +0000 (15:06 +0000)]
net/mlx5: refactor PCI probing on Linux

Refactor PCI probing related code. Move Linux specific functions (as
well as verbs and dv related code) from mlx5.c file to linux/mlx5_os.c
file.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:59 +0000 (15:05 +0000)]
net/mlx5: remove umem field dependency on Direct Verbs

The umem field is used in several structs. Its type
'struct mlx5dv_devx_umem *' is changed to 'void *'. This change allows
compilation on non-Linux OSes.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:58 +0000 (15:05 +0000)]
net/mlx5: remove attributes dependency on Verbs

Define 'struct mlx5_dev_attr' which is ibv and dv independent. It
contains attributes that were originally contained in 'struct
ibv_device_attr_ex' and 'struct mlx5dv_context dv_attr'. Add a new API
mlx5_os_get_dev_attr() which fills in the newly defined struct.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:57 +0000 (15:05 +0000)]
common/mlx5: remove protection domain dependency on Verbs

Replace 'struct ibv_pd *' with 'void *' in struct mlx5_ctx_shared and
all function calls in mlx5 PMD.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:56 +0000 (15:05 +0000)]
net/mlx5: add Linux-specific file with getter functions

'ctx' type (field in 'struct mlx5_ctx_shared') is changed from 'struct
ibv_context *' to 'void *'.  'ctx' members which are verbs dependent
(e.g. device_name) will be accessed through getter functions which are
added to a new file under Linux directory: linux/mlx5_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:55 +0000 (15:05 +0000)]
net/mlx5: rename Verbs shared object

Replace all 'mlx5_ibv_shared' appearances with 'mlx5_dev_ctx_shared'.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:25 +0000 (16:16 -0700)]
cfgfile: check flags on creation for future proofing

All APIs should check that they support the flag values
passed. If an application passes an invalid flag, it could
cause problems in a later ABI version.
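
The kind of check meant here can be sketched as follows. The flag
names, the mask and demo_create are hypothetical and are not the
cfgfile API:

```c
#include <errno.h>

/* Hypothetical flags understood by this version of the library. */
#define DEMO_F_GLOBAL 0x1u
#define DEMO_F_LOCKED 0x2u
#define DEMO_F_MASK   (DEMO_F_GLOBAL | DEMO_F_LOCKED)

/* Reject any bit outside the supported mask at creation time. */
static int
demo_create(unsigned int flags)
{
	if (flags & ~DEMO_F_MASK)
		return -EINVAL;
	/* ... actual creation would happen here ... */
	return 0;
}
```

Because unknown bits are rejected today, a later release can assign a
meaning to them without silently changing the behaviour of existing
applications.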

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:24 +0000 (16:16 -0700)]
stack: check flags on creation for future proofing

All APIs should check that they support the flag values
passed. If an application passes an invalid flag, it could
cause problems in a later ABI version.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:23 +0000 (16:16 -0700)]
hash: check flags on creation for future proofing

All APIs should check that they support the flag values
passed. If an application passes an invalid flag, it could
cause problems in a later ABI version.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:22 +0000 (16:16 -0700)]
ring: check flag settings for future proofing

All APIs should check that they support the flag values passed.
These checks ensure that the extra bits can safely be used
without risk of ABI breakage.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:56 +0000 (15:58 +0800)]
net/hinic: use common bit operations API

Remove the driver's own bit operation APIs and use the common ones,
which greatly reduces code duplication.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:55 +0000 (15:58 +0800)]
net/qede: use common bit operations API

Remove the driver's own bit operation APIs and use the common ones,
which greatly reduces code duplication.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:54 +0000 (15:58 +0800)]
net/bnx2x: use common bit operations API

Remove the driver's own bit operation APIs and use the common ones,
which greatly reduces code duplication.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:53 +0000 (15:58 +0800)]
net/axgbe: use common bit operations API

Remove the driver's own bit operation APIs and use the common ones,
which greatly reduces code duplication.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:52 +0000 (15:58 +0800)]
test/bitops: add bit operations test case

Add test cases for the set bit, clear bit, test-and-set bit
and test-and-clear bit operations.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:51 +0000 (15:58 +0800)]
eal: introduce bit operations API

Bitwise operation APIs are defined and used in many PMDs,
which causes significant code duplication. To reduce duplication,
this patch consolidates them into a common API family.
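
The kind of helper each PMD used to carry can be sketched as below.
The demo_ names are hypothetical stand-ins and are not the symbols
exported by EAL. The operations are plain (non-atomic) read-modify-write:

```c
#include <stdint.h>

/* Set bit nr (0..31) in the 32-bit word at addr. */
static inline void
demo_bit_set32(unsigned int nr, volatile uint32_t *addr)
{
	*addr |= UINT32_C(1) << nr;
}

/* Clear bit nr (0..31) in the 32-bit word at addr. */
static inline void
demo_bit_clear32(unsigned int nr, volatile uint32_t *addr)
{
	*addr &= ~(UINT32_C(1) << nr);
}

/* Set bit nr and return its previous value (0 or the bit mask). */
static inline uint32_t
demo_bit_test_and_set32(unsigned int nr, volatile uint32_t *addr)
{
	uint32_t mask = UINT32_C(1) << nr;
	uint32_t val = *addr;

	*addr = val | mask;
	return val & mask;
}
```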

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:54 +0000 (03:43 +0300)]
eal/windows: implement basic memory management

Basic memory management supports core libraries and PMDs operating in
IOVA as PA mode. It uses a kernel-mode driver, virt2phys, to obtain
IOVAs of hugepages allocated from user-mode. Multi-process mode is not
implemented and is forcefully disabled at startup. Assign myself as a
maintainer for Windows file and memory management implementation.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:53 +0000 (03:43 +0300)]
eal/windows: initialize hugepage info

Add hugepages discovery ("large pages" in Windows terminology)
and update documentation for required privilege setup. Only 2MB
hugepages are supported and their number is estimated roughly
due to the lack or unstable status of suitable OS APIs.
Assign myself as maintainer for the implementation file.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:52 +0000 (03:43 +0300)]
doc: split build and run instructions in Windows guide

With memory management implemented for Windows, the guide for running
sample applications is going to be extended with hugepages and driver
setup.  Move run instructions to a separate file to give space for
planned expansion.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:51 +0000 (03:43 +0300)]
eal/windows: improve CPU and NUMA node detection

1. Map CPU cores to their respective NUMA nodes as reported by system.
2. Support systems with more than 64 cores (multiple processor groups).
3. Fix magic constants, styling issues, and compiler warnings.
4. Add EAL private function to map DPDK socket ID to NUMA node number.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:50 +0000 (03:43 +0300)]
eal/windows: complete queue.h data structures

The limited version imported previously lacks at least the SLIST
macros. Import the complete file from FreeBSD, since its license
exception is already approved by the Technical Board.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:49 +0000 (03:43 +0300)]
eal/windows: add tracing stubs

EAL common code depends on tracepoint calls, but generic implementation
cannot be enabled on Windows due to missing standard library facilities.
Add stub functions to support tracepoint compilation, so that common
code does not have to conditionally include tracepoints until proper
support is added.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:48 +0000 (03:43 +0300)]
trace: add size_t field emitter

It is not guaranteed that sizeof(long) == sizeof(size_t). On Windows,
sizeof(long) == 4 and sizeof(size_t) == 8 for 64-bit programs.
Tracepoints using "long" field emitter are therefore invalid there.
Add dedicated field emitter for size_t and use it to store size_t values
in all existing tracepoints.
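
The mismatch can be made concrete with a short check. The helper name
demo_fits_in_long is hypothetical and only illustrates why a "long"
emitter cannot hold every size_t value:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the size_t value can be stored in a long without loss. */
static int
demo_fits_in_long(size_t v)
{
	return (uintmax_t)v <= (uintmax_t)LONG_MAX;
}
```

On 64-bit Windows, where long is 32 bits wide, any size above 2 GB
fails this check. On common platforms SIZE_MAX itself never fits,
because long is signed.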

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:47 +0000 (03:43 +0300)]
mem: extract common dynamic memory allocation

Code in Linux EAL that supports dynamic memory allocation (as opposed to
static allocation used by FreeBSD) is not OS-dependent and can be reused
by Windows EAL. Move such code to a file compiled only for the OSes that
require it. Keep Anatoly Burakov as maintainer of the extracted code.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:46 +0000 (03:43 +0300)]
mem: extract common memseg list initialization

All supported OSes create memory segment lists (MSL) and reserve VA space
for them in a nearly identical way. Move common code into EAL private
functions to reduce duplication.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:45 +0000 (03:43 +0300)]
eal: introduce memory management wrappers

Introduce OS-independent wrappers for memory management operations used
across DPDK and specifically in common code of EAL:

* rte_mem_map()
* rte_mem_unmap()
* rte_mem_page_size()
* rte_mem_lock()

Windows uses different APIs for memory mapping and reservation, while
Unices reserve memory by mapping it. Introduce EAL private functions to
support memory reservation in common code:

* eal_mem_reserve()
* eal_mem_free()
* eal_mem_set_dump()

Wrappers follow POSIX semantics limited to DPDK tasks, but their
signatures deliberately differ from POSIX ones to be more safe and
expressive. New symbols are internal. Being thin wrappers, they require
no special maintenance.
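
What such a thin wrapper can look like on a Unix system is sketched
below. demo_mem_page_size is a hypothetical name and the body is an
assumption about one possible implementation, not the EAL code:

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical Unix-side wrapper: query the system page size once
 * and cache it. Returns 0 if the size cannot be determined.
 */
static size_t
demo_mem_page_size(void)
{
	static size_t cached;

	if (cached == 0) {
		long v = sysconf(_SC_PAGESIZE);

		cached = v > 0 ? (size_t)v : 0;
	}
	return cached;
}
```

Common code calls only the wrapper, so a Windows build can provide the
same function on top of a different OS API.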

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:44 +0000 (03:43 +0300)]
eal: introduce internal wrappers for file operations

Introduce OS-independent wrappers in order to support common EAL code
on Unix and Windows:

* eal_file_open: open or create a file.
* eal_file_lock: lock or unlock an open file.
* eal_file_truncate: enforce a given size for an open file.

Implementation for Linux and FreeBSD is placed in "unix" subdirectory,
which is intended for common code between the two. These thin wrappers
require no special maintenance.

Common code supporting multi-process doesn't use the new wrappers,
because it is inherently Unix-specific and would impose excessive
requirements on the wrappers.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:43 +0000 (03:43 +0300)]
eal: replace page sizes enum with a set of constants

Clang on Windows follows MS ABI where enum values are limited to 2^31-1.
Enum rte_page_sizes has members valued above this limit, which get
wrapped to zero, resulting in compilation error (duplicate values in
enum). Using MS ABI is mandatory for Windows EAL to call Win32 APIs.

Remove rte_page_sizes and replace its values with #define's.
This enumeration is not used in public API, so there's no ABI breakage.
Announce API changes for 20.08 in documentation.
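
The replacement can be sketched with 64-bit constants. The DEMO_ names
and demo_exceeds_int are hypothetical and only illustrate the idea,
they are not the definitions added by the patch:

```c
#include <stdint.h>

/* 64-bit constants are not subject to the enum value range limit. */
#define DEMO_PGSIZE_4K  (UINT64_C(1) << 12)
#define DEMO_PGSIZE_2M  (UINT64_C(1) << 21)
#define DEMO_PGSIZE_1G  (UINT64_C(1) << 30)
#define DEMO_PGSIZE_16G (UINT64_C(1) << 34)

/* Return 1 if the value is above 2^31-1, the limit named above. */
static int
demo_exceeds_int(uint64_t v)
{
	return v > (uint64_t)INT32_MAX;
}
```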

Suggested-by: Jerin Jacob <jerinjacobk@gmail.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
David Marchand [Wed, 10 Jun 2020 14:30:49 +0000 (16:30 +0200)]
eal/windows: fix symbol export

rte_eal_get_configuration() has been made private in 19.11, remove
leftover in Windows export list.

Fixes: f58cef079b05 ("eal: make the global configuration private")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Pallavi Kadam [Thu, 11 Jun 2020 19:50:55 +0000 (12:50 -0700)]
eal/windows: fix warnings

Fix a number of warnings seen when compiling with clang on Windows,
such as the use of an unsafe string function (strerror),
[-Wunused-variable] and [-Wunused-function] in eal_common_options.c,
[-Wunused-const-variable] in getopt.c and [-Wunused-parameter]
in eal_common_thread.c.
Also fix warnings generated with MinGW:
[-Werror=old-style-definition], [-Werror=cast-function-type] and
[-Werror=attributes].

Signed-off-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tasnim Bashar [Thu, 21 May 2020 00:32:53 +0000 (17:32 -0700)]
eal/windows: support thread ID query

Add rte_sys_gettid function to use rte_gettid() on Windows.
rte_gettid() is required for recursive spin lock and recursive ticket lock.

Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tal Shnaiderman [Tue, 19 May 2020 18:41:11 +0000 (21:41 +0300)]
mbuf: align layout in Windows

Using uint32_t bit-fields on Windows pads the
'L2/L3/L4 and tunnel information' union with additional bits.

This padding causes rte_mbuf size misalignment and the total size
increases to 3 cache lines.

Change the packet_type bit-field types from uint32_t to uint8_t
to allow a unified 2-cache-line structure size.

Add the __extension__ keyword over the modified struct to avoid
the warning:

type of bit-field ... is a GCC extension [-pedantic]
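
A simplified sketch of the resulting layout follows. The union name
demo_packet_type is hypothetical and the nested inner union of the
real rte_mbuf is left out. With GCC or Clang the union is expected to
occupy 4 bytes, though bit-field layout is implementation-defined:

```c
#include <stdint.h>

union demo_packet_type {
	uint32_t packet_type;		/* whole 32-bit value */
	__extension__
	struct {
		uint8_t l2_type:4;	/* (Outer) L2 type */
		uint8_t l3_type:4;	/* (Outer) L3 type */
		uint8_t l4_type:4;	/* (Outer) L4 type */
		uint8_t tun_type:4;	/* Tunnel type */
		uint8_t inner_l2_type:4;
		uint8_t inner_l3_type:4;
		uint8_t inner_l4_type:4;
	};
};
```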

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Alexander Kozyrev [Mon, 1 Jun 2020 15:24:16 +0000 (15:24 +0000)]
mbuf: fix external buffer pool boundaries

Memzones are created in testpmd in order to test the external data
buffers functionality. Each memzone is 2 MB in size and is divided
among the pool of external memory buffers.

A memzone may not always be fully utilized, because the mbuf size can
vary and some space can be left unused at the tail of a memzone. This
is not handled properly: an mbuf can get the address of this leftover
space, since the address is still valid (part of the memzone), but
there is not enough space to fit the whole packet data. As a result,
packet data may overflow and cause memory corruption.

Take the mbuf size into account when distributing memory addresses from
a memzone to external mbufs. Skip the remaining tail in case there
is not enough room for a packet and move to the next memzone instead.
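
The address distribution described above can be sketched as follows.
The names and the fixed 2 MB zone size are assumptions made for the
example, this is not the mbuf library code:

```c
#include <stddef.h>

#define DEMO_MEMZONE_SIZE ((size_t)2 << 20)	/* 2 MB per memzone */

/* Number of whole elements that fit in one memzone. */
static size_t
demo_elts_per_zone(size_t elt_size)
{
	return elt_size ? DEMO_MEMZONE_SIZE / elt_size : 0;
}

/*
 * Map element index i to a memzone index and a byte offset inside it.
 * The unused tail of each memzone is skipped, so an element never
 * crosses the end of its memzone.
 * The caller ensures 0 < elt_size <= DEMO_MEMZONE_SIZE.
 */
static void
demo_locate(size_t i, size_t elt_size, size_t *zone, size_t *off)
{
	size_t per_zone = demo_elts_per_zone(elt_size);

	*zone = i / per_zone;
	*off = (i % per_zone) * elt_size;
}
```

With 3000-byte elements, 699 of them fit in one memzone and the last
152 bytes stay unused. Element 699 therefore starts at offset 0 of the
second memzone.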

Fixes: 6c8e50c2e5 ("mbuf: create pool with external memory buffers")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Xiaolong Ye [Tue, 9 Jun 2020 08:24:29 +0000 (16:24 +0800)]
test/mbuf: fix a dynamic flag log

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Xiaolong Ye [Tue, 9 Jun 2020 07:12:56 +0000 (15:12 +0800)]
mbuf: remove unused next member in dynamic flag/field

TAILQ_ENTRY next is not needed in struct mbuf_dynfield_elt and
mbuf_dynflag_elt, since they are actually chained by rte_tailq_entry's
next field when calling TAILQ_INSERT_TAIL(mbuf_dynfield/dynflag_list, te,
next).

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
5 years agombuf: document guideline for new fields and flags
Thomas Monjalon [Thu, 11 Jun 2020 06:32:29 +0000 (08:32 +0200)]
mbuf: document guideline for new fields and flags

Since dynamic fields and flags were added in 19.11,
the idea has been to use them for new features, not only PMD-specific ones.

The guideline is made more explicit in doxygen, in the mbuf guide,
and in the contribution design guidelines.

For more information about the original design, see the presentation
https://www.dpdk.org/wp-content/uploads/sites/35/2019/10/DynamicMbuf.pdf

This decision was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-June/169667.html

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
5 years agoapp/testpmd: fix stats error message
Wei Hu (Xavier) [Sat, 6 Jun 2020 03:46:37 +0000 (11:46 +0800)]
app/testpmd: fix stats error message

There are Coverity defects related to "Argument cannot be negative".

This patch fixes them by passing '-ret' to the function strerror() when
ret is negative.
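
A minimal sketch of the pattern, assuming the usual convention that a
failing call returns a negative errno value. The helper name
ret_to_msg() is hypothetical and is not part of testpmd.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: translate a negative-errno return code into a
 * message. strerror() expects a non-negative error number, so the
 * sign is flipped before the call.
 */
static const char *
ret_to_msg(int ret)
{
	assert(ret <= 0);
	return strerror(-ret);
}
```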

Coverity issue: 349913, 358437, 358449, 358450
Fixes: da328f7f115a ("ethdev: change xstats reset function to return int")
Fixes: 9eb974221f44 ("app/testpmd: fix statistics after reset")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
5 years agomaintainers: update for bonding
Wei Hu (Xavier) [Mon, 18 May 2020 11:20:09 +0000 (19:20 +0800)]
maintainers: update for bonding

Add Xavier as an additional maintainer for bonding.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Chas Williams <chas3@att.com>
5 years agonet/axgbe: support setting MTU
Girish Nandibasappa [Mon, 1 Jun 2020 14:48:41 +0000 (20:18 +0530)]
net/axgbe: support setting MTU

This patch adds support for the set_mtu API, which can be used to change
the Maximum Transmission Unit (MTU) from the application.

Signed-off-by: Girish Nandibasappa <girish.nandibasappa@amd.com>
Acked-by: Amaranath Somalapuram <asomalap@amd.com>
5 years agonet/axgbe: support RSS RETA/hash query and update
Chandu Babu N [Fri, 29 May 2020 11:49:20 +0000 (17:19 +0530)]
net/axgbe: support RSS RETA/hash query and update

Add support for the RSS RETA/hash query and update functions.

Signed-off-by: Chandu Babu N <chandu@amd.com>
Acked-by: Amaranath Somalapuram <asomalap@amd.com>
5 years agonet/i40e: enable NEON Rx/Tx in meson
Ruifeng Wang [Fri, 5 Jun 2020 05:20:55 +0000 (13:20 +0800)]
net/i40e: enable NEON Rx/Tx in meson

The i40e NEON vector implementation is not compiled with meson.
Add the file to the meson build for the Arm platform.

Fixes: e940646b20fa ("drivers/net: build Intel NIC PMDs with meson")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
5 years agonet/hns3: check TSO segment size during Tx
Hongbo Zheng [Wed, 3 Jun 2020 09:32:01 +0000 (17:32 +0800)]
net/hns3: check TSO segment size during Tx

On the hns3 network engine, when the rte_eth_tx_burst API is called by
the Upper Level Process (ULP) with the PKT_TX_TCP_SEG flag set and
tso_segsz equal to 0 in the input rte_mbuf structure, the hns3 PMD must
process the packet as a non-TSO packet; otherwise the hardware enters
an abnormal state.
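
The decision can be sketched as below. This is only an illustration:
struct sketch_mbuf stands in for the two rte_mbuf fields involved,
SKETCH_TX_TCP_SEG is a placeholder whose value does not match the real
PKT_TX_TCP_SEG flag, and sketch_pkt_is_tso() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for the real offload flag; the bit value is illustrative. */
#define SKETCH_TX_TCP_SEG (UINT64_C(1) << 0)

/* Stand-in for the two rte_mbuf fields that matter here. */
struct sketch_mbuf {
	uint64_t ol_flags;
	uint16_t tso_segsz;
};

/* Treat the packet as TSO only if the flag is set AND a segment size is given. */
static bool
sketch_pkt_is_tso(const struct sketch_mbuf *m)
{
	assert(m != NULL);
	return (m->ol_flags & SKETCH_TX_TCP_SEG) != 0 && m->tso_segsz != 0;
}
```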

Fixes: 6dca716c9e1d ("net/hns3: support TSO")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
5 years agonet/hns3: fix VLAN tags reported in Rx
Wei Hu (Xavier) [Wed, 3 Jun 2020 09:32:00 +0000 (17:32 +0800)]
net/hns3: fix VLAN tags reported in Rx

Currently, on the hns3 network engine, the driver always reports the
incoming packet's VLAN tags in the rte_mbuf structures that are the
output parameters of the '.rx_pkt_burst' ops implementation function,
but never sets the PKT_RX_VLAN_STRIPPED flag in rte_mbuf, even if the
Upper Level Process (ULP) configured hardware stripping by calling the
rte_eth_dev_configure or rte_eth_dev_set_vlan_offload API function. As a
result, the ULP cannot tell whether the VLAN tag was stripped.

The driver is supposed to set the stripped flag in the mbuf ol_flags and
report the right VLAN tag.

Due to hardware constraints, the stripped VLAN tag is always present in
the Rx descriptor. Even when a PVID is set on the function, the PVID is
reported in the Rx descriptor. So the driver needs to determine which
VLAN tag should be reported in the output rte_mbuf structure in the
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Fixes: 411d23b9eafb ("net/hns3: support VLAN")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
5 years agonet/hns3: fix VLAN strip configuration when setting PVID
Chengchang Tang [Wed, 3 Jun 2020 09:31:59 +0000 (17:31 +0800)]
net/hns3: fix VLAN strip configuration when setting PVID

Currently, on an hns3 PF device, the hardware strips two VLAN tags when
the upper level process (ULP) calls the rte_eth_dev_set_vlan_pvid API
function to set a PVID, regardless of whether the VLAN strip offload has
been turned on by calling the rte_eth_dev_configure or
rte_eth_dev_set_vlan_offload API function.

When receiving a QinQ packet with the PVID tag, if the ULP has not
configured VLAN stripping by the method mentioned above, one layer of
VLAN tag is lost to the ULP, which is not the expected result.

The VLAN strip is supposed to be configured according to the upper
level process's configuration.

Fixes: 411d23b9eafb ("net/hns3: support VLAN")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
5 years agonet/hns3: remove unsupported VLAN capabilities
Chengchang Tang [Wed, 3 Jun 2020 09:31:58 +0000 (17:31 +0800)]
net/hns3: remove unsupported VLAN capabilities

This patch removes unsupported VLAN capabilities to avoid misleading
users.

Fixes: a5475d61fa34 ("net/hns3: support VF")
Fixes: 1f5ca0b460cd ("net/hns3: support some device operations")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
5 years agonet/mlx5: fix vectorized Rx burst termination
Alexander Kozyrev [Tue, 2 Jun 2020 03:50:41 +0000 (03:50 +0000)]
net/mlx5: fix vectorized Rx burst termination

The maximum burst size of the vectorized Rx burst routine is set to
MLX5_VPMD_RX_MAX_BURST (64). This limits the performance of any
application that would like to gather more than 64 packets from
a single Rx burst for batch processing (e.g. VPP).

The situation gets worse with a mix of zipped and unzipped CQEs.
They are processed separately, and the Rx burst function returns
a small number of packets on every call.

Repeat the cycle of gathering packets from the vectorized Rx routine
until the requested number of packets is collected or there are no
more CQEs left to process.
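
The repeat-until-done loop can be sketched as follows. It is a
simplified model, not the mlx5 code: the inner routine is represented
by a hypothetical callback that returns at most SKETCH_MAX_BURST
packets and 0 when nothing is left, and sketch_counter_burst() is a
demo source used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BURST 64 /* cap of a single inner burst */

/* Hypothetical inner routine: returns up to n packets, 0 when drained. */
typedef uint16_t (*sketch_burst_fn)(void *ctx, uint16_t n);

/* Call the inner routine repeatedly until pkts_n packets are
 * collected or the inner routine has nothing more to deliver. */
static uint16_t
sketch_rx_burst(sketch_burst_fn one_burst, void *ctx, uint16_t pkts_n)
{
	uint16_t total = 0;

	assert(one_burst != NULL);
	while (total < pkts_n) {
		uint16_t want = (uint16_t)(pkts_n - total);
		uint16_t got;

		if (want > SKETCH_MAX_BURST)
			want = SKETCH_MAX_BURST;
		got = one_burst(ctx, want);
		if (got == 0)
			break;
		total = (uint16_t)(total + got);
	}
	return total;
}

/* Demo source: hands out packets from a counter until it reaches zero. */
static uint16_t
sketch_counter_burst(void *ctx, uint16_t n)
{
	uint16_t *left = ctx;
	uint16_t got = *left < n ? *left : n;

	*left = (uint16_t)(*left - got);
	return got;
}
```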

Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: add reclaim memory mode
Suanming Mou [Mon, 1 Jun 2020 06:09:43 +0000 (14:09 +0800)]
net/mlx5: add reclaim memory mode

Currently, when a flow is destroyed, some memory resources may still be
kept cached so that the next flow creation is more efficient.

Some systems may need the resources to be handled more flexibly across
flow creation and destruction. After peak time, with millions of flows
destroyed, the system would prefer the resources to be reclaimed
completely, with no cache kept. The resources can then be allocated and
used by other components. Such a system is not so sensitive to the flow
insertion rate, but cares more about the resources.

Both the DPDK mlx5 PMD and the low-level component rdma-core allow the
flow resources to be configured as cached or not, but no API or
parameter is exposed to the user to configure the flow resources cache
mode. Introducing a new PMD devarg that lets the user configure the
flow resources cache mode is therefore helpful.

This commit adds a new "reclaim_mem_mode" devarg to let the user
configure whether the cached resources of destroyed flows should be
kept or not.

Three modes can be chosen:
1. 0(none). The flow resources are cached as usual, which helps the
flow insertion rate.
2. 1(light). Only the DPDK PMD level resource reclaim is enabled.
3. 2(aggressive). Both the DPDK PMD level and the rdma-core low level
are configured in reclaim mode.

With these three modes, the user can configure the resources cache mode
at different levels.
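
A sketch of how the three values could be validated when parsing the
devarg string. The enum and function names are hypothetical and are not
taken from the PMD; only the numeric values 0, 1 and 2 come from the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical names for the three documented mode values. */
enum sketch_reclaim_mode {
	SKETCH_RECLAIM_NONE = 0,       /* keep caches */
	SKETCH_RECLAIM_LIGHT = 1,      /* reclaim at PMD level only */
	SKETCH_RECLAIM_AGGRESSIVE = 2, /* reclaim at PMD and rdma-core level */
};

/* Parse a devarg value string; returns 0 on success, -EINVAL otherwise. */
static int
sketch_parse_reclaim_mode(const char *val, enum sketch_reclaim_mode *mode)
{
	char *end;
	long v;

	assert(val != NULL && mode != NULL);
	errno = 0;
	v = strtol(val, &end, 0);
	if (errno != 0 || end == val || *end != '\0' ||
	    v < SKETCH_RECLAIM_NONE || v > SKETCH_RECLAIM_AGGRESSIVE)
		return -EINVAL;
	*mode = (enum sketch_reclaim_mode)v;
	return 0;
}
```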

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
5 years agocommon/mlx5: add memory reclaim glue function
Suanming Mou [Mon, 1 Jun 2020 06:09:42 +0000 (14:09 +0800)]
common/mlx5: add memory reclaim glue function

When a flow is destroyed, rdma-core may still cache some resources to
make flow re-creation more efficient. At peak time, when millions of
flows are created and destroyed, the cached resources can become very
large.

rdma-core now provides a new function to configure the flow resources
not to be cached. Add the memory reclaim glue function to avoid too
many resources being cached.

This is the first patch for memory reclaim. A new devarg will be
added to the PMD so that the reclaim can be configured.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
5 years agocommon/mlx5: split common file under Linux directory
Ophir Munk [Mon, 1 Jun 2020 05:50:47 +0000 (05:50 +0000)]
common/mlx5: split common file under Linux directory

File mlx5_common.c includes both Linux-specific and non-Linux-specific
APIs. Move the Linux-specific APIs into a new file named
linux/mlx5_common_os.c.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agocommon/mlx5: move netlink files under Linux directory
Ophir Munk [Mon, 1 Jun 2020 05:50:46 +0000 (05:50 +0000)]
common/mlx5: move netlink files under Linux directory

File mlx5_nl.c uses Netlink APIs, which are Linux-specific.
Move it (including file mlx5_nl.h) to the common/mlx5/linux directory.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agocommon/mlx5: move glue files under Linux directory
Ophir Munk [Mon, 1 Jun 2020 05:50:45 +0000 (05:50 +0000)]
common/mlx5: move glue files under Linux directory

The glue file mlx5_glue.c is based on Linux-specific APIs.
Move it (including file mlx5_glue.h) to the common/mlx5/linux directory.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/failsafe: fix RSS RETA size info
Ian Dolzhansky [Wed, 27 May 2020 14:34:33 +0000 (15:34 +0100)]
net/failsafe: fix RSS RETA size info

The failsafe driver has been reporting zero for the RSS redirection
table size since device info reporting was reworked. Report the proper
value.

Fixes: 4586be3743d4 ("net/failsafe: fix reported device info")
Cc: stable@dpdk.org
Signed-off-by: Ian Dolzhansky <ian.dolzhansky@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <grive@u256.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agodoc: remove duplicated line in memif guide
Muhammad Bilal [Fri, 29 May 2020 14:47:45 +0000 (19:47 +0500)]
doc: remove duplicated line in memif guide

There was a duplicated command in the memif documentation.
Remove one of the two copies.

Fixes: cbbbbd3365d2 ("net/memif: enable loopback")
Cc: stable@dpdk.org
Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
5 years agonet/mlx5: fix interrupt installation timing
Suanming Mou [Thu, 28 May 2020 09:22:09 +0000 (17:22 +0800)]
net/mlx5: fix interrupt installation timing

Currently, the DevX counter query works asynchronously, with the DevX
interrupt handler returning the query result. When the port closes, the
interrupt handler is uninstalled and the DevX comp object is also
destroyed, while the query has not been cancelled yet.

In this case, the counter query may use the invalid DevX comp object
that has been destroyed, and a query failure with an invalid FD is
reported.

Adjust the shared interrupt install and uninstall timing so that the
asynchronous counter query stops before the interrupt is uninstalled.

Fixes: f15db67df09c ("net/mlx5: accelerate DV flow counter query")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: fix secondary process resources release
Suanming Mou [Thu, 28 May 2020 06:59:49 +0000 (14:59 +0800)]
net/mlx5: fix secondary process resources release

When a secondary process starts, it allocates its own process-private
data and also remaps the UAR register of the Tx queue. Once the
secondary process exits, these resources should be released accordingly,
and the shared resources owned by the primary process should not be
touched.

Currently, once spawning of one port fails in the secondary process, all
the other spawned ports are also released when the process exits.
However, the mlx5_dev_close() function does not handle the secondary
process case, which means that calling mlx5_dev_close() directly in a
secondary process releases resources it should not touch.

Add a case to mlx5_dev_close() in which a secondary process releases
only its own resources, so that it quits gracefully.

Fixes: 942d13e6e7d1 ("net/mlx5: fix sharing context destroy order")
Fixes: 3a8207423a0f ("net/mlx5: close all ports on remove")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: fix unreachable MPLS error path
Michael Baum [Wed, 27 May 2020 08:37:57 +0000 (08:37 +0000)]
net/mlx5: fix unreachable MPLS error path

The mlx5_flow_validate_item_mpls function validates an MPLS item.
It first checks whether the device supports MPLS. This is done with an
ifdef condition: if the condition fails, the preprocessor skips to the
endif and the function returns the appropriate error.

When MPLS is supported, the preprocessor keeps the body of the
function, which ends with return 0, followed by the lines that report
the MPLS support error.
In fact, these lines are unreachable because the function returns 0
before them, and in any case they are unnecessary.

Replace the endif with an else and move the endif to the end of the
function.
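
The resulting layout can be sketched as below. The macro
SKETCH_HAVE_MPLS, struct sketch_item and sketch_validate_mpls() are
hypothetical stand-ins, not the names used in the driver; the point is
only that each branch now ends in its own return and neither contains
unreachable code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical item type standing in for the real flow item. */
struct sketch_item {
	int well_formed;
};

static int
sketch_validate_mpls(const struct sketch_item *item)
{
	assert(item != NULL);
#ifdef SKETCH_HAVE_MPLS
	/* Feature compiled in: do the real validation. */
	if (!item->well_formed)
		return -EINVAL;
	return 0;
#else
	/* Compiled only when the feature macro is not defined. */
	return -ENOTSUP;
#endif
}
```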

Fixes: 23c1d42c7138 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: remove needless Tx queue initialization check
Michael Baum [Wed, 27 May 2020 08:37:56 +0000 (08:37 +0000)]
net/mlx5: remove needless Tx queue initialization check

The mlx5_txq_obj_new function defines a pointer named txq_data and
assigns a value to it. After the assignment, the code assumes that the
variable does not point to NULL and even expresses this with an
assertion.

The function then dereferences the pointer several times and at no
point changes its value. However, at the error label at the end of the
function, when it wants to free one of the fields of the structure that
txq_data points to, it checks again whether txq_data is valid.
This check is unnecessary since txq_data is known to be valid.

Remove the needless check.

Fixes: 644906881881 ("net/mlx5: add free on completion queue")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: fix socket close
Michael Baum [Wed, 27 May 2020 08:37:55 +0000 (08:37 +0000)]
net/mlx5: fix socket close

The mlx5_pmd_socket_handle function calls the accept function, which
returns the socket descriptor into the conn_sock variable. The socket
descriptor value can be 0 (according to the accept API) or positive, so
immediately after the call the function checks whether conn_sock < 0.
Later in the function, when other things fail, it jumps to the error
label and releases previously allocated resources (such as the socket
or file).

During the resource release, it checks whether the conn_sock variable
containing the socket descriptor is positive and, if it is, releases
it. However, this check misses the case where conn_sock == 0: in that
case the socket is not released and there is a resource leak.

Extend the close condition to cover the 0 value too.
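
A minimal sketch of the corrected condition, using a hypothetical
helper sketch_close_fd() that is not part of the driver. Only a
negative value means "no descriptor"; 0 is a valid descriptor and must
be closed like any other.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Close *fd if it holds a descriptor and mark it as released.
 * Returns the result of close(), or 0 if there was nothing to close.
 */
static int
sketch_close_fd(int *fd)
{
	int ret = 0;

	assert(fd != NULL);
	if (*fd >= 0) { /* ">= 0", not "> 0": descriptor 0 is valid */
		ret = close(*fd);
		*fd = -1;
	}
	return ret;
}
```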

Fixes: e6cdc54cc0ef ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: remove unnecessary init in socket creation
Michael Baum [Wed, 27 May 2020 08:37:54 +0000 (08:37 +0000)]
net/mlx5: remove unnecessary init in socket creation

The mlx5_pmd_socket_handle function calls the recvmsg function, which
returns the number of bytes read. It assigns this return value to a ret
variable defined at the beginning of the function.
Similarly, the mlx5_pmd_socket_init function calls the socket function,
which returns a file descriptor for the new socket. It also assigns
this return value to a ret variable defined at the beginning of the
function.

Both functions initialize the variable when defining it. However,
neither uses ret before assigning the return value of the called
function, so the initialization is unnecessary.

Remove these unnecessary initializations.

Fixes: e6cdc54cc0ef ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: fix hairpin Rx queue creation error path
Michael Baum [Wed, 27 May 2020 08:37:53 +0000 (08:37 +0000)]
net/mlx5: fix hairpin Rx queue creation error path

The mlx5_rxq_obj_hairpin_new function defines a pointer named tmpl and
allocates memory for it using the rte_zmalloc_socket function.
Later, this function allocates memory for a variable inside tmpl using
the mlx5_devx_cmd_create_rq function.

In both cases, if the allocation fails, the code jumps to the error
label and frees allocated resources. However, at the first jump there
are no resources to free yet, so jumping only to reach the return NULL
line is unnecessary. Even worse, when the code jumps to the error label
with an invalid tmpl, it dereferences a null pointer.
In contrast, the second jump needs to free the tmpl variable, but
instead the function tries to free the variable that it just failed to
allocate.
In addition, for another error, the function returns NULL without
freeing the tmpl variable first, causing a memory leak.

Delete the error label, replace each jump with a local return NULL, and
free the tmpl variable where needed.
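
The local-return pattern can be sketched as follows. This is a
simplified model with hypothetical names: calloc() stands in for
rte_zmalloc_socket, and the inner resource is created through a
callback so that both the success and the failure path can be shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_obj {
	void *rq; /* inner resource created in a second step */
};

/* Hypothetical creator of the inner resource; returns NULL on failure. */
typedef void *(*sketch_create_fn)(void);

static struct sketch_obj *
sketch_obj_new(sketch_create_fn create_rq)
{
	struct sketch_obj *tmpl;

	assert(create_rq != NULL);
	tmpl = calloc(1, sizeof(*tmpl));
	if (tmpl == NULL)
		return NULL; /* nothing allocated yet, nothing to free */
	tmpl->rq = create_rq();
	if (tmpl->rq == NULL) {
		free(tmpl); /* free the outer object, not the failed inner one */
		return NULL;
	}
	return tmpl;
}

/* Demo creators: one that succeeds and one that always fails. */
static void *sketch_create_ok(void) { return malloc(16); }
static void *sketch_create_fail(void) { return NULL; }
```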

Fixes: e79c9be91515 ("net/mlx5: support Rx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/mlx5: fix hairpin Tx queue creation error path
Michael Baum [Wed, 27 May 2020 08:37:52 +0000 (08:37 +0000)]
net/mlx5: fix hairpin Tx queue creation error path

The mlx5_txq_obj_hairpin_new function defines a pointer named tmpl and
allocates memory for it using the rte_zmalloc_socket function.
Later, this function allocates memory for a variable inside tmpl using
the mlx5_devx_cmd_create_sq function.

In both cases, if the allocation fails, the code jumps to the error
label and frees allocated resources. However, at the first jump there
are no resources to free yet, so jumping only to reach the return NULL
line is unnecessary. Even worse, when the code jumps to the error label
with an invalid tmpl, it dereferences a null pointer.
In contrast, the second jump needs to free the tmpl variable, but
instead the function tries to free the variable that it just failed to
allocate, and another variable that has never been allocated.
In addition, for another error, the function returns NULL without
freeing the tmpl variable first, causing a memory leak.

Delete the error label, replace each jump with a local return NULL, and
free the tmpl variable where needed.

Fixes: ae18a1ae9692 ("net/mlx5: support Tx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
5 years agonet/ice: fix PCI DSN to lowercase
Haiyue Wang [Thu, 28 May 2020 05:39:13 +0000 (13:39 +0800)]
net/ice: fix PCI DSN to lowercase

The PCI DSN (device serial number) used to format the package file name
should use lowercase values.

Fixes: d1c91179e952 ("net/ice: check DSN package file firstly")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
5 years agonet/bnxt: fix missed unlock
Yunjian Wang [Wed, 27 May 2020 12:11:20 +0000 (20:11 +0800)]
net/bnxt: fix missed unlock

Coverity issue: 357741
Fixes: 02a95625fe9c ("net/bnxt: add flow stats in extended stats")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
5 years agonet/bnxt: fix mark action if rule is at index zero
Mike Baucom [Fri, 22 May 2020 23:55:01 +0000 (19:55 -0400)]
net/bnxt: fix mark action if rule is at index zero

In the ingress path, the cfa_code field in Rx completion identifies the
CFA action rule that was used for the incoming packet. It is possible
that the packet could hit the rule at index 0 in the table.
The mark action code was too restrictive by disallowing a cfa_code of
zero.
This code loosens the requirement and allows zero.

Fixes: b87abb2e55cb ("net/bnxt: support marking packet")
Cc: stable@dpdk.org
Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
5 years agonet/iavf: fix flow uninit
Jeff Guo [Wed, 27 May 2020 07:16:50 +0000 (15:16 +0800)]
net/iavf: fix flow uninit

When closing a VF device, the adminq should be shut down after the flow
is uninitialized, since the VF might still need the adminq to
uninitialize the flow.

Fixes: 9e03acd726cf ("net/iavf: fix flow access")
Fixes: ff2d0c345c3b ("net/iavf: support generic flow API")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
5 years agoapp/testpmd: fix memory leak on error path
Yunjian Wang [Mon, 25 May 2020 01:46:23 +0000 (09:46 +0800)]
app/testpmd: fix memory leak on error path

This patch fixes the resource leak issue.

Fixes: e63b50162aa3 ("app/testpmd: clean metering and policing commands")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
5 years agonet/netvsc: do not spin forever waiting for reply
Stephen Hemminger [Tue, 19 May 2020 16:52:30 +0000 (09:52 -0700)]
net/netvsc: do not spin forever waiting for reply

Because of bugs in the driver or host, a reply to a request might
never arrive. It is better to return an error than to spin forever.
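
A bounded wait can be sketched as below. This is an illustration under
assumptions: the limit here is a poll count rather than a time budget,
the names are hypothetical, and the error code -ETIMEDOUT is a choice
made for this sketch, not necessarily what the driver returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical poller: returns true once the reply has arrived. */
typedef bool (*sketch_poll_fn)(void *ctx);

/* Poll for a reply at most max_polls times instead of looping forever. */
static int
sketch_wait_reply(sketch_poll_fn poll_once, void *ctx, unsigned int max_polls)
{
	unsigned int i;

	assert(poll_once != NULL);
	for (i = 0; i < max_polls; i++) {
		if (poll_once(ctx))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Demo poller: reports a reply once the countdown in ctx reaches zero. */
static bool
sketch_countdown_poll(void *ctx)
{
	unsigned int *left = ctx;

	if (*left == 0)
		return true;
	(*left)--;
	return false;
}
```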

Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/netvsc: process link change messages in alarm
Stephen Hemminger [Tue, 19 May 2020 16:52:29 +0000 (09:52 -0700)]
net/netvsc: process link change messages in alarm

The original code would deadlock if a link change event happened with
the link state interrupt enabled. The problem is that the link state
changed message would be seen while reading the host-to-guest ring
(under lock), and then the driver would send a query to the host to
get the new link state. The response would never be seen because the
driver was stuck in a while loop waiting for it.

The solution is to use the link change indication to trigger
a DPDK alarm. The alarm fires in a different thread, and in that
context the driver can send the request for the new link state and
also run the interrupt callback. This is similar to how the bonding
driver handles the same thing.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/netvsc: do not query VF link state
Stephen Hemminger [Tue, 19 May 2020 16:52:28 +0000 (09:52 -0700)]
net/netvsc: do not query VF link state

When the primary device link state is queried, there is no
need to query the VF state as well. The application only sees
the state of the synthetic device.

Fixes: dc7680e8597c ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/netvsc: fix warning when VF is removed
Stephen Hemminger [Tue, 19 May 2020 16:52:27 +0000 (09:52 -0700)]
net/netvsc: fix warning when VF is removed

The code to unset owner of VF device was changing port to invalid
value before calling unset.

Fixes: 4a9efcddaddd ("net/netvsc: fix VF support with secondary process")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/netvsc: change datapath logging
Stephen Hemminger [Tue, 19 May 2020 16:52:26 +0000 (09:52 -0700)]
net/netvsc: change datapath logging

The PMD_TX_LOG and PMD_RX_LOG can hide errors since this
debug log is typically disabled. Change the code to use
PMD_DRV_LOG for errors.

Under load, the ring buffer to the host can fill.
Add some statistics to estimate the impact and see other errors.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/netvsc: implement descriptor status
Stephen Hemminger [Tue, 19 May 2020 16:52:25 +0000 (09:52 -0700)]
net/netvsc: implement descriptor status

These functions are useful for applications and debugging.
The netvsc PMD also transparently handles the rx/tx descriptor
functions for the underlying VF device.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/netvsc: support per-queue info requests
Stephen Hemminger [Tue, 19 May 2020 16:52:24 +0000 (09:52 -0700)]
net/netvsc: support per-queue info requests

This driver does not have much info to report, but these additional
info queries are still worth supporting.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
5 years agonet/bnxt: fix crash during close
Ajit Khaparde [Fri, 22 May 2020 21:27:31 +0000 (14:27 -0700)]
net/bnxt: fix crash during close

We are freeing flow_stats a little early. This results in a
segfault when the driver accesses the members during cleanup.
Move the call to bnxt_free_flow_stats_info() to prevent this.

Fixes: 02a95625fe9c ("net/bnxt: add flow stats in extended stats")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
5 years agonet/bnxt: fix performance for Arm
Rahul Gupta [Fri, 22 May 2020 17:42:09 +0000 (23:12 +0530)]
net/bnxt: fix performance for Arm

Eliminate unnecessary rte_smp_wmb() before writing to request/completion
doorbells. Use rte_cio_wmb() memory barrier instead of rte_io_wmb()
before writing to tx/rx request queue doorbells and use
rte_compiler_barrier() before writing to tx/rx completion queue
doorbells.

Fixes: 4af9d0c72941 ("net/bnxt: cleanup NQ doorbell")
Fixes: f8168ca0e690 ("net/bnxt: support thor controller")
Cc: stable@dpdk.org
Signed-off-by: Rahul Gupta <rahul.gupta@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
5 years agonet/bnxt: fix setting link speed
Kalesh AP [Fri, 22 May 2020 17:42:08 +0000 (23:12 +0530)]
net/bnxt: fix setting link speed

The bnxt PMD uses the macro BNXT_SUPPORTED_SPEEDS to validate
the user-requested speed. But this macro contains all the speed
values supported by the PMD and is not chip specific.

The check against this macro returns success when the user
tries to set the speed to 100G on a port even if the chip does
not support 100G speed.

Fix it by using bnxt_get_speed_capabilities() to check the
speeds supported by the chip.

Fixes: 1d0704f4d793 ("net/bnxt: add device configure operation")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
5 years agonet/hns3: fix key length when configuring RSS
Lijun Ou [Fri, 22 May 2020 09:21:18 +0000 (17:21 +0800)]
net/hns3: fix key length when configuring RSS

When users set an RSS hash key longer than the length supported by the
hardware, the driver should reject it and must not configure the wrong
key into the hardware.
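
The length check can be sketched as follows. All names are
hypothetical, and the key size of 40 bytes is an assumption made for
the example rather than a value taken from the driver. Shorter keys are
simply accepted here, which is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RSS_KEY_SIZE 40 /* assumed hardware key size, illustrative */

/* Stand-in for the device state that holds the programmed key. */
struct sketch_hw {
	uint8_t rss_key[SKETCH_RSS_KEY_SIZE];
};

/* Reject keys longer than the hardware supports before touching hw. */
static int
sketch_set_rss_key(struct sketch_hw *hw, const uint8_t *key, size_t key_len)
{
	assert(hw != NULL && key != NULL);
	if (key_len > SKETCH_RSS_KEY_SIZE)
		return -EINVAL;
	memcpy(hw->rss_key, key, key_len);
	return 0;
}
```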

Fixes: c37ca66f2b27 ("net/hns3: support RSS")
Cc: stable@dpdk.org
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
5 years agonet/hns3: add RSS hash offload to Rx configuration
Lijun Ou [Fri, 22 May 2020 09:21:17 +0000 (17:21 +0800)]
net/hns3: add RSS hash offload to Rx configuration

The Rx offload flag `DEV_RX_OFFLOAD_RSS_HASH` can be used to
enable/disable PMD writes to `rte_mbuf::hash::rss`. The hns3 PMD can
already notify the application of the validity of `rte_mbuf::hash::rss`
by setting the `PKT_RX_RSS_HASH` flag in `rte_mbuf::ol_flags`.

Fixes: 19a3ca4c99cf ("net/hns3: add start/stop and configure operations")
Fixes: c37ca66f2b27 ("net/hns3: support RSS")
Cc: stable@dpdk.org
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
5 years agonet/hns3: fix Tx less than 60 bytes
Wei Hu (Xavier) [Fri, 22 May 2020 09:21:16 +0000 (17:21 +0800)]
net/hns3: fix Tx less than 60 bytes

Currently, when running the testpmd application on the hns3 network
engine in csum fwd mode (set with the "set fwd csum" command at the
prompt) and sending 42-byte ARP packets to the network port with a
packet generator, the hardware cannot send the ARP packets and the
following log is printed:
"Preparing packet burst to failed: Invalid argument"

The hardware does not support transmitting packets shorter than 60
bytes, and an appending (padding) operation for packets shorter than 60
bytes has already been added in the '.tx_pkt_burst' ops implementation
function named hns3_xmit_pkts. So the check that rejects such packets
needs to be removed from the '.tx_pkt_prepare' ops implementation
function named hns3_prep_pkts.
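
Padding a short frame up to the minimum size can be sketched as below.
It works on a flat byte buffer rather than an mbuf, and the names are
hypothetical; zero is used as the padding byte as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MIN_PKT_SIZE 60 /* minimum length the hardware can send */

/*
 * Zero-pad the frame in buf (capacity cap) up to the minimum size.
 * Returns the new frame length, or 0 if the buffer is too small
 * to hold a padded frame.
 */
static size_t
sketch_pad_frame(uint8_t *buf, size_t len, size_t cap)
{
	assert(buf != NULL && len <= cap);
	if (len >= SKETCH_MIN_PKT_SIZE)
		return len;
	if (cap < SKETCH_MIN_PKT_SIZE)
		return 0;
	memset(buf + len, 0, SKETCH_MIN_PKT_SIZE - len);
	return SKETCH_MIN_PKT_SIZE;
}
```

A 42-byte ARP frame in a 64-byte buffer comes back with length 60 and
bytes 42 to 59 set to zero.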

Fixes: de620754a109 ("net/hns3: fix sending packets less than 60 bytes")
Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Hao Chen <chenhao164@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>