Qi Zhang [Mon, 15 Jun 2020 02:04:36 +0000 (10:04 +0800)]
net/ice/base: change IPv6 training packet
Add additional UDP payload to allow for additional headers such as ESP.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:35 +0000 (10:04 +0800)]
net/ice/base: refactor flow director filter swap
Move the swap of flow director addresses and ports into training packet
generation. This reduces the code written for ACL.
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:34 +0000 (10:04 +0800)]
net/ice/base: consolidate VF promiscuous mode
Consolidate the Promiscuous rule for SMBM on the chosen logical port.
Signed-off-by: Shibin Koikkara Reeny <shibin.koikkara.reeny@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:33 +0000 (10:04 +0800)]
net/ice/base: update maximum PHY type high index
As currently, we are supporting only 5 PHY_SPEEDs for phy_type_high.
Thus, we should adjust the value of ICE_PHY_TYPE_HIGH_MAX_INDEX to 5.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:32 +0000 (10:04 +0800)]
net/ice/base: fix ACL rules index
A u8 idx in ice_acl_add_entry causes the code to truncate the values
greater than 255 to 255 or less when calling ice_aq_program_acl_entry()
resulting in the wrong TCAM index being programmed for the specified
rule. The result is that the rule action doesn't work correctly
(packets don't get routed to the correct queue or dropped if that
is the action). Fix the issue by changing the variable to be a u16
again.
Fixes:
f3202a097f12 ("net/ice/base: add ACL module")
Cc: stable@dpdk.org
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:31 +0000 (10:04 +0800)]
net/ice/base: add AUI media type
Add and report AUI PHY types as an AUI media type
Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:30 +0000 (10:04 +0800)]
net/ice/base: improve VSI filters rebuild
This change improve VSI filter configuration rebuild for
multiport configuration, ie. where 1 PF includes more than
one logical port. For some functions, association between
port and corresponding switch_info or port_info structure
has been lost because by default the pointer to the first
element of array (switch, port etc.) is passed as function
argument. With this change, pointer to proper element is
added an extra argument in relevant functions.
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:29 +0000 (10:04 +0800)]
net/ice/base: gate devices from FW link override
Currently, the FW link override feature is only permitted for E810
devices. However, the ice_fw_supports_link_override() guards against FW
versions irrespective of the device. This assumes FW versions between
the families are aligned, which is not the case.
Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:28 +0000 (10:04 +0800)]
net/ice/base: report AOC PHY types as fiber
Report AOC types as fiber instead of unknown
Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:27 +0000 (10:04 +0800)]
net/ice/base: consolidate MAC config set
Consolidate implementation of ice_aq_set_mac_cfg for switch mode
and NIC mode. As per the specification, the driver needs to call
set_mac_cfg (opcode 0x0603) to be able to exercise jumbo frames.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:26 +0000 (10:04 +0800)]
net/ice/base: avoid undefined behavior
When writing the driver's struct ice_tlan_ctx structure, do not write
the 8-bit element int_q_state with the associated internal-to-hardware
field which is 122-bits, otherwise the helper function ice_write_byte()
will use undefined behavior when setting the mask used for that write.
This should not cause any functional change and will avoid use of
undefined behavior. Also, update a comment to highlight this structure
element is not written.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:25 +0000 (10:04 +0800)]
net/ice/base: disable profile merge for flow director
For Flow Director, we don't want to re-use an existed profile with the
same field vector and mask. Merging two different flow_type’s field
vector will also make them sharing trained rule and cause rule
interference.
For example:
issue rule A: IPV4_TCP matching tcp src&dst port 80 to queue 8
issue rule B: IPV6_TCP matching tcp src&dst port 200 to queue 20
Below behavior is found but not expected:
IPV4_TCP pkt with src&dst port 200 hits rule B and goes to queue 20
IPV6_TCP pkt with src&dst port 80 hits rule A and goes to queue 8
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:24 +0000 (10:04 +0800)]
net/ice/base: add macros to parse flow director Rx desc
Add descriptor field offset and mask definition. It is used to parse
FDIR Rx descriptor field value.
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Mon, 15 Jun 2020 02:04:23 +0000 (10:04 +0800)]
net/ice/base: support flow director for non-IP packets
FDIR can forward Ethernet packets with non-IP ethertype.
Signed-off-by: Yahui Cao <yahui.cao@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Dong Zhou [Fri, 12 Jun 2020 08:57:46 +0000 (11:57 +0300)]
net/mlx5: fix LRO checksum
The TCP checksum includes IPV4 pseudo-header checksum and L3
payload checksum which include TCP header and TCP payload.
When mlx5 LRO is enabled, HW will calculate the TCP payload
checksum, PMD need complete the IPV4 pseudo-header checksum
and the TCP header checksum.
The mlx5_lro_update_tcp_hdr function completes the TCP header
checksum, but this function using lower 4 bits of data-offset
field in TCP header to get the whole TCP header length, this
will cause TCP header checksum wrong calculation.
Update the code using higher 4 bits of data-offset field
instead of lower 4 bits.
Fixes:
e4c2a16eb1de ("net/mlx5: handle LRO packets in Rx queue")
Cc: stable@dpdk.org
Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Alexander Kozyrev [Thu, 11 Jun 2020 17:43:27 +0000 (17:43 +0000)]
net/mlx5: fix descriptors number adjustment
The number of descriptors to configure in a Rx/Tx queue is passed to
the mlx5_tx/rx_queue_pre_setup() function by value. That means any
adjustments of this variable are local and cannot affect the actual
value that is used to allocate mbufs in the mlx5_txq/rxq_new()
functions. Pass the number as a reference to actually update it.
Fixes:
6218063b39a6 ("net/mlx5: refactor Rx data path")
Fixes:
1d88ba171942 ("net/mlx5: refactor Tx data path")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Alexander Kozyrev [Thu, 11 Jun 2020 17:42:00 +0000 (17:42 +0000)]
net/mlx5: do not select legacy MPW implicitly
The Legacy MPW (multi-packet write) should not be engaged implicitly.
We should exclude this function from a Tx burst routine selection
process unless it is requested specifically by setting the txq_mpw_en
devarg. Exclude this function from the selection process the same way
it is done for the Enhanced MPW in the mlx5_select_tx_function()
routine.
Fixes:
eb8121ab9dac ("net/mlx5: introduce Tx burst routine template")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:33 +0000 (09:32 +0000)]
net/mlx5: refactor statistics
mlx5 statistics are calculated by several methods:
1. In software when packets go through datapath.
2. Calling ioctl with ETHTOOL command (Linux specific).
3. Reading counters from SYSFS device path (Linux specific).
The Linux related functions are moved to file linux/mlx5_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:31 +0000 (09:32 +0000)]
common/mlx5: flag Verbs dependency in a DevX command
Function mlx5_devx_cmd_qp_query_tis_td() receives as parameter a pointer
to verbs QP returned by ibv_create_qp. Therefore support it only if
HAVE_IBV_FLOW_DV_SUPPORT is defined. Otherwise return an error ENOTSUP.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:30 +0000 (09:32 +0000)]
net/mlx5: refactor device operations for Linux
There are three types of eth_dev_ops: primary, secondary and isolate.
Their function calls assignments are moved from common file
mlx5.c to the Linux specific file linux/mlx5_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:29 +0000 (09:32 +0000)]
net/mlx5: move Linux-specific functions
File mlx5_ethdev.c is partially moved to linux/mlx5_ethdev_os.c for
functions which are Linux specific. Functions which are Linux agnostics
remain in mlx5_ethdev.c file.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:28 +0000 (09:32 +0000)]
net/mlx5: move socket files in Linux directory
mlx5_socket.c file is using APIs which are Linux specifics. Therefore
move it (including mlx5_socket.h) from net/mlx5 directory to
net/mlx5/linux directory. This commit also updates the Makefile and
the meson files.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:27 +0000 (09:32 +0000)]
net/mlx5: rename ib in names
Renames in this commit:
mlx5_ibv_list -> mlx5_dev_ctx_list
mlx5_alloc_shared_ibctx -> mlx5_alloc_shared_dev_ctx
mlx5_free_shared_ibctx -> mlx5_free_shared_dev_ctx
mlx5_ibv_shared_port -> mlx5_dev_shared_port
ibv_port -> dev_port
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 10 Jun 2020 09:32:26 +0000 (09:32 +0000)]
net/mlx5: remove completion object dependency on DV
Replace 'struct mlx5dv_devx_cmd_comp *' with 'void *' in 'struct
mlx5_dev_ctx_shared'.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Gregory Etelson [Mon, 8 Jun 2020 16:01:56 +0000 (19:01 +0300)]
net/mlx5: fix flow memory allocation size
In DV enabled MLX5 PMD build mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW].size
was initiated for DV structure. If RTE initialization encountered MLX5
PCI function with disabled DV support
mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW].size was reduced to match legacy
verbs flow size. Since mlx5_ipool_cfg[MLX5_IPOOL_MLX5_FLOW] is a
global variable that change reflected on DV enabled MLX5 PCI functions
too.
Running flow with invalid ipool size crashes PMD.
The patch adjusts ipool flow size for each active PCI function.
Fixes:
b88341ca35fc ("net/mlx5: convert flow dev handle to indexed")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Dekel Peled [Wed, 10 Jun 2020 13:25:19 +0000 (16:25 +0300)]
net/mlx5: fix GTP mask definition location
Recent patch added definition of mask MLX5_GTP_FLAGS_MASK, just
above function flow_dv_validate_item_gtp(), where it is used.
Patch was applied together with other patches which modified the same
file, so the mask was located further away from the function it is
used in.
This patch moves the mask definition to the proper location.
Fixes:
563ac307a46b ("net/mlx5: support match on GTP flags")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Vivien Didelot [Tue, 9 Jun 2020 19:07:19 +0000 (15:07 -0400)]
net/pcap: support Tx nanosecond timestamps
When capturing packets into a PCAP file, DPDK currently uses
microseconds for the timestamps. But libpcap supports interpreting
tv_usec as nanoseconds depending on the file timestamp precision,
as of commit
ba89e4a18e8b ("Make timestamps precision configurable").
To support this, use PCAP_TSTAMP_PRECISION_NANO when creating the
empty PCAP file as specified by PCAP_OPEN_DEAD(3PCAP) and implement
nanosecond timeval addition. This also ensures that the precision
reported by capinfos is nanoseconds (9).
Note that NSEC_PER_SEC is defined as 1000000000L instead of 1e9 since
the latter might be interpreted as floating point.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ali Alnubani [Mon, 8 Jun 2020 14:02:57 +0000 (17:02 +0300)]
net/mlx5: fix typos in meter error messages
Fixes:
3bd26b23cefc ("net/mlx5: support meter profile operations")
Cc: stable@dpdk.org
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Hongbo Zheng [Tue, 9 Jun 2020 08:44:17 +0000 (16:44 +0800)]
net/hns3: fix unintended sign extension in dump operation
There are coverity defects related "Unintended sign extension" in the
internal static function named hns3_get_regs_length used for dumping reg
operation.
This patch fixes them by replacing the data type of cmdq_lines,
common_lines, ring_lines and tqp_intr_lines with uint32_t in the inner
static function named hns3_get_regs_length of hns3 PMD driver.
Coverity issue: 349917, 349914
Fixes:
936eda25e8da ("net/hns3: support dump register")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 9 Jun 2020 08:44:16 +0000 (16:44 +0800)]
net/hns3: fix unintended sign extension in fd operation
Currently, there are coverity defects warning as below:
CID 349937 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: port_number with
type uint16_t (16 bits, unsigned) is promoted in port_number << cur_pos
to type int (32 bits, signed), then sign-extended to type unsigned long
(64 bits, unsigned). If port_number << cur_pos is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.
CID 349893 (#1 of 1): Unintended sign extension (SIGN_EXTENSION)
sign_extension: Suspicious implicit sign extension: vlan_tag with type
uint8_t (8 bits, unsigned) is promoted in vlan_tag << cur_pos to type
int (32 bits, signed), then sign-extended to type unsigned long (64
bits, unsigned). If vlan_tag << cur_pos is greater than 0x7FFFFFFF, the
upper bits of the result will all be 1.
This patch fixes them by replacing the data type of port_number and
vlan_tag with uint32_t in the inner static function named
hns3_fd_convert_meta_data of hns3 PMD driver.
Coverity issue: 349937, 349893
Fixes:
fcba820d9b9e ("net/hns3: support flow director")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Hongbo Zheng [Tue, 9 Jun 2020 08:44:15 +0000 (16:44 +0800)]
net/hns3: ignore function return on reset error path
There is a coverity defect related "Unchecked return value".
The internal static hns3_reset_err_handle function is reset error
process of hns3 PMD driver. If failure in reset process, it does not
mean that the network port is completely unavailable, so the command
interface between driver and firmware still needs to be initialized.
Regardless of whether the execution of the function named hns3_cmd_init
is successful or not, the next process after execution must be
continued, so there is no need to check the return value. If
hns3_cmd_init fails to execute, there will be corresponding log
information inside hns3_cmd_init.
This patch adds '(void)' Type conversion to avoid coverity warning.
Coverity issue: 349934
Fixes:
2790c6464725 ("net/hns3: support device reset")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 9 Jun 2020 08:44:14 +0000 (16:44 +0800)]
net/hns3: fix flow director error message
There is a coverity defect related "Argument cannot be negative".
This patch fixes it by passing '-ret' to the function strerror() when
ret is negative.
Coverity issue: 349933
Fixes:
fcba820d9b9e ("net/hns3: support flow director")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Shy Shyman [Mon, 8 Jun 2020 14:17:47 +0000 (17:17 +0300)]
app/testpmd: fix error detection in MTU command
MTU is used in testpmd to set the maximum payload size for packets.
According to testpmd, the setting influence RX only.
In rte_ethdev there's no relation between MTU setting and JUMBO offload
or rx_max_pkt_len.
The previous fix in patch referenced below was meant to update the
correlated variables of max_pkt_len and JUMBO offload, but by doing so
it assumes that MTU setting can only exist when JUMBO offload supported
in the device. For example fail-safe device does supports set MTU and
doesn't support JUMBO offload, and in this case, though set MTU
succeeds, an error message is still printed since the JUMBO packet
offload is disabled.
The fix separates the two conditions to make sure the error
triggers only in case the set_mtu action actually failed.
Fixes:
150c9ac2df13 ("app/testpmd: update Rx offload after setting MTU")
Cc: stable@dpdk.org
Signed-off-by: Shy Shyman <shys@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Ophir Munk [Wed, 3 Jun 2020 15:06:02 +0000 (15:06 +0000)]
net/mlx5: remove Verbs dependency in spawn struct
1. Replace 'struct ibv_device *' with 'void *' in 'struct
mlx5_dev_spawn_data'. Define a getter function to retrieve the
device name.
2. Rename ibv_dev and ibv_port as phys_dev and phys_port
respectively.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:06:01 +0000 (15:06 +0000)]
net/mlx5: add Linux-specific header file
File drivers/net/linux/mlx5_os.h is added. It includes specific
Linux definitions such as PCI driver flags, link state changes
interrupts, link removal interrupts, etc.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:06:00 +0000 (15:06 +0000)]
net/mlx5: refactor PCI probing on Linux
Refactor PCI probing related code. Move Linux specific functions (as
well as verbs and dv related code) from mlx5.c file to linux/mlx5_os.c
file.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:59 +0000 (15:05 +0000)]
net/mlx5: remove umem field dependency on Direct Verbs
umem field is used in several structs. Its type 'struct mlx5dv_devx_umem
*' is changed to 'void *'. This change will allow non-Linux OS
compilations.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:58 +0000 (15:05 +0000)]
net/mlx5: remove attributes dependency on Verbs
Define 'struct mlx5_dev_attr' which is ibv and dv independent. It
contains attribute that were originally contained in 'struct
ibv_device_attr_ex' and 'struct mlx5dv_context dv_attr'. Add a new API
mlx5_os_get_dev_attr() which fills in the new defined struct.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:57 +0000 (15:05 +0000)]
common/mlx5: remove protection domain dependency on Verbs
Replace 'struct ibv_pd *' with 'void *' in struct mlx5_ctx_shared and
all function calls in mlx5 PMD.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:56 +0000 (15:05 +0000)]
net/mlx5: add Linux-specific file with getter functions
'ctx' type (field in 'struct mlx5_ctx_shared') is changed from 'struct
ibv_context *' to 'void *'. 'ctx' members which are verbs dependent
(e.g. device_name) will be accessed through getter functions which are
added to a new file under Linux directory: linux/mlx5_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Wed, 3 Jun 2020 15:05:55 +0000 (15:05 +0000)]
net/mlx5: rename Verbs shared object
Replace all 'mlx5_ibv_shared' appearances with 'mlx5_dev_ctx_shared'.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:25 +0000 (16:16 -0700)]
cfgfile: check flags on creation for future proofing
All API's should check that they support the flag values
passed. If an application passes an invalid flag it could
cause problems in later ABI.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:24 +0000 (16:16 -0700)]
stack: check flags on creation for future proofing
All API's should check that they support the flag values
passed. If an application passes an invalid flag it could
cause problems in later ABI.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:23 +0000 (16:16 -0700)]
hash: check flags on creation for future proofing
All API's should check that they support the flag values
passed. If an application passes an invalid flag it could
cause problems in later ABI.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Stephen Hemminger [Mon, 27 Apr 2020 23:16:22 +0000 (16:16 -0700)]
ring: check flag settings for future proofing
All API's should check that they support the flag values passed.
These checks ensure that the extra bits can safely be used
without risk of ABI breakage.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:56 +0000 (15:58 +0800)]
net/hinic: use common bit operations API
Remove its own bit operation APIs and use the common one,
this can reduce the code duplication largely.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:55 +0000 (15:58 +0800)]
net/qede: use common bit operations API
Remove its own bit operation APIs and use the common one,
this can reduce the code duplication largely.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:54 +0000 (15:58 +0800)]
net/bnx2x: use common bit operations API
Remove its own bit operation APIs and use the common one,
this can reduce the code duplication largely.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:53 +0000 (15:58 +0800)]
net/axgbe: use common bit operations API
Remove its own bit operation APIs and use the common one,
this can reduce the code duplication largely.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:52 +0000 (15:58 +0800)]
test/bitops: add bit operations test case
Add test cases for setting bit, clearing bit, testing
and setting bit, testing and clearing bit operation.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Joyce Kong [Mon, 27 Apr 2020 07:58:51 +0000 (15:58 +0800)]
eal: introduce bit operations API
Bitwise operation APIs are defined and used in a lot of PMDs,
which caused a huge code duplication. To reduce duplication,
this patch consolidates them into a common API family.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:54 +0000 (03:43 +0300)]
eal/windows: implement basic memory management
Basic memory management supports core libraries and PMDs operating in
IOVA as PA mode. It uses a kernel-mode driver, virt2phys, to obtain
IOVAs of hugepages allocated from user-mode. Multi-process mode is not
implemented and is forcefully disabled at startup. Assign myself as a
maintainer for Windows file and memory management implementation.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:53 +0000 (03:43 +0300)]
eal/windows: initialize hugepage info
Add hugepages discovery ("large pages" in Windows terminology)
and update documentation for required privilege setup. Only 2MB
hugepages are supported and their number is estimated roughly
due to the lack or unstable status of suitable OS APIs.
Assign myself as maintainer for the implementation file.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:52 +0000 (03:43 +0300)]
doc: split build and run instructions in Windows guide
With memory management implemented for Windows, the guide for running
sample applications is going to be extended with hugepages and driver
setup. Move run instructions to a separate file to give space for
planned expansion.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:51 +0000 (03:43 +0300)]
eal/windows: improve CPU and NUMA node detection
1. Map CPU cores to their respective NUMA nodes as reported by system.
2. Support systems with more than 64 cores (multiple processor groups).
3. Fix magic constants, styling issues, and compiler warnings.
4. Add EAL private function to map DPDK socket ID to NUMA node number.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:50 +0000 (03:43 +0300)]
eal/windows: complete queue.h data structures
Limited version imported previously lacks at least SLIST macros.
Import a complete file from FreeBSD, since its license exception is
already approved by Technical Board.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:49 +0000 (03:43 +0300)]
eal/windows: add tracing stubs
EAL common code depends on tracepoint calls, but generic implementation
cannot be enabled on Windows due to missing standard library facilities.
Add stub functions to support tracepoint compilation, so that common
code does not have to conditionally include tracepoints until proper
support is added.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:48 +0000 (03:43 +0300)]
trace: add size_t field emitter
It is not guaranteed that sizeof(long) == sizeof(size_t). On Windows,
sizeof(long) == 4 and sizeof(size_t) == 8 for 64-bit programs.
Tracepoints using "long" field emitter are therefore invalid there.
Add dedicated field emitter for size_t and use it to store size_t values
in all existing tracepoints.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:47 +0000 (03:43 +0300)]
mem: extract common dynamic memory allocation
Code in Linux EAL that supports dynamic memory allocation (as opposed to
static allocation used by FreeBSD) is not OS-dependent and can be reused
by Windows EAL. Move such code to a file compiled only for the OS that
require it. Keep Anatoly Burakov maintainer of extracted code.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:46 +0000 (03:43 +0300)]
mem: extract common memseg list initialization
All supported OS create memory segment lists (MSL) and reserve VA space
for them in a nearly identical way. Move common code into EAL private
functions to reduce duplication.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:45 +0000 (03:43 +0300)]
eal: introduce memory management wrappers
Introduce OS-independent wrappers for memory management operations used
across DPDK and specifically in common code of EAL:
* rte_mem_map()
* rte_mem_unmap()
* rte_mem_page_size()
* rte_mem_lock()
Windows uses different APIs for memory mapping and reservation, while
Unices reserve memory by mapping it. Introduce EAL private functions to
support memory reservation in common code:
* eal_mem_reserve()
* eal_mem_free()
* eal_mem_set_dump()
Wrappers follow POSIX semantics limited to DPDK tasks, but their
signatures deliberately differ from POSIX ones to be more safe and
expressive. New symbols are internal. Being thin wrappers, they require
no special maintenance.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:44 +0000 (03:43 +0300)]
eal: introduce internal wrappers for file operations
Introduce OS-independent wrappers in order to support common EAL code
on Unix and Windows:
* eal_file_open: open or create a file.
* eal_file_lock: lock or unlock an open file.
* eal_file_truncate: enforce a given size for an open file.
Implementation for Linux and FreeBSD is placed in "unix" subdirectory,
which is intended for common code between the two. These thin wrappers
require no special maintenance.
Common code supporting multi-process doesn't use the new wrappers,
because it is inherently Unix-specific and would impose excessive
requirements on the wrappers.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Mon, 15 Jun 2020 00:43:43 +0000 (03:43 +0300)]
eal: replace page sizes enum with a set of constants
Clang on Windows follows MS ABI where enum values are limited to 2^31-1.
Enum rte_page_sizes has members valued above this limit, which get
wrapped to zero, resulting in compilation error (duplicate values in
enum). Using MS ABI is mandatory for Windows EAL to call Win32 APIs.
Remove rte_page_sizes and replace its values with #define's.
This enumeration is not used in public API, so there's no ABI breakage.
Announce API changes for 20.08 in documentation.
Suggested-by: Jerin Jacob <jerinjacobk@gmail.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
David Marchand [Wed, 10 Jun 2020 14:30:49 +0000 (16:30 +0200)]
eal/windows: fix symbol export
rte_eal_get_configuration() has been made private in 19.11, remove
leftover in Windows export list.
Fixes:
f58cef079b05 ("eal: make the global configuration private")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Pallavi Kadam [Thu, 11 Jun 2020 19:50:55 +0000 (12:50 -0700)]
eal/windows: fix warnings
Fixed bunch of warnings when compiling using clang on Windows
such as the use of an unsafe string function (strerror),
[-Wunused-variable], [-Wunused-function] in eal_common_options.c
[-Wunused-const-variable] in getopt.c and [-Wunused-parameter]
in eal_common_thread.c.
Also fixed warnings generated using Mingw:
[-Werror=old-style-definition], [-Werror=cast-function-type] and
[-Werror=attributes]
Signed-off-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tasnim Bashar [Thu, 21 May 2020 00:32:53 +0000 (17:32 -0700)]
eal/windows: support thread ID query
Add rte_sys_gettid function to use rte_gettid() on Windows.
rte_gettid() is required for recursive spin lock and recursive ticket lock.
Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tal Shnaiderman [Tue, 19 May 2020 18:41:11 +0000 (21:41 +0300)]
mbuf: align layout in Windows
Using uint32_t type bit-fields in Windows will pads the
'L2/L3/L4 and tunnel information' union with additional bits.
This padding causes rte_mbuf size misalignment and the total size
increases to 3 cache-lines.
Changed packet_type bit-fields types from uint32_t to uint8_t
to allow unified 2 cache-line structure size.
Added the __extension__ attribute over the modified struct to avoid
the warning:
type of bit-field ... is a GCC extension [-pedantic]
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Alexander Kozyrev [Mon, 1 Jun 2020 15:24:16 +0000 (15:24 +0000)]
mbuf: fix external buffer pool boundaries
Memzones are created in testpmd in order to test external data
buffers functionality. Each memzone is 2Mb in size and divided among
the pool of external memory buffers.
Memzone may not always be fully utilized because mbufs size can vary
and some space can be left unused at the tail of a memzone. This is
not handled properly and mbuf can get the address of this leftover
space since this address is still valid (part of memzone), but there
is not enough space to fit the whole packet data. As a result packet
data may overflow and cause the memory corruption.
Take mbuf size into account when distributing memory addresses from
a memzone to external mbufs. Skip the remaining tail in case there
is not enough room for a packet and move to a next memzone instead.
Fixes:
6c8e50c2e5 ("mbuf: create pool with external memory buffers")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Xiaolong Ye [Tue, 9 Jun 2020 08:24:29 +0000 (16:24 +0800)]
test/mbuf: fix a dynamic flag log
Fixes:
4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Xiaolong Ye [Tue, 9 Jun 2020 07:12:56 +0000 (15:12 +0800)]
mbuf: remove unused next member in dynamic flag/field
TAILQ_ENTRY next is not needed in struct mbuf_dynfield_elt and
mbuf_dynflag_elt, since they are actually chained by rte_tailq_entry's
next field when calling TAILQ_INSERT_TAIL(mbuf_dynfield/dynflag_list, te,
next).
Fixes:
4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Thomas Monjalon [Thu, 11 Jun 2020 06:32:29 +0000 (08:32 +0200)]
mbuf: document guideline for new fields and flags
Since dynamic fields and flags were added in 19.11,
the idea was to use them for new features, not only PMD-specific.
The guideline is made more explicit in doxygen, in the mbuf guide,
and in the contribution design guidelines.
For more information about the original design, see the presentation
https://www.dpdk.org/wp-content/uploads/sites/35/2019/10/DynamicMbuf.pdf
This decision was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-June/169667.html
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Wei Hu (Xavier) [Sat, 6 Jun 2020 03:46:37 +0000 (11:46 +0800)]
app/testpmd: fix stats error message
There are coverity defects related "Argument cannot be negative"
This patch fixes them by passing '-ret' to the function strerror() when
ret is negative.
Coverity issue: 349913, 358437, 358449, 358450
Fixes:
da328f7f115a ("ethdev: change xstats reset function to return int")
Fixes:
9eb974221f44 ("app/testpmd: fix statistics after reset")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Wei Hu (Xavier) [Mon, 18 May 2020 11:20:09 +0000 (19:20 +0800)]
maintainers: update for bonding
Adding Xavier as additional maintainer to bonding.
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Chas Williams <chas3@att.com>
Girish Nandibasappa [Mon, 1 Jun 2020 14:48:41 +0000 (20:18 +0530)]
net/axgbe: support setting MTU
This patch adds support for set_mtu API which can be used to change
the Maximum Transmission unit (MTU) from application.
Signed-off-by: Girish Nandibasappa <girish.nandibasappa@amd.com>
Acked-by: Amaranath Somalapuram <asomalap@amd.com>
Chandu Babu N [Fri, 29 May 2020 11:49:20 +0000 (17:19 +0530)]
net/axgbe: support RSS RETA/hash query and update
add support for RSS reta/hash query and update function
Signed-off-by: Chandu Babu N <chandu@amd.com>
Acked-by: Amaranath Somalapuram <asomalap@amd.com>
Ruifeng Wang [Fri, 5 Jun 2020 05:20:55 +0000 (13:20 +0800)]
net/i40e: enable NEON Rx/Tx in meson
The i40e neon vector implementation is not compiled with meson.
Add the file to meson for Arm platform.
Fixes:
e940646b20fa ("drivers/net: build Intel NIC PMDs with meson")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Hongbo Zheng [Wed, 3 Jun 2020 09:32:01 +0000 (17:32 +0800)]
net/hns3: check TSO segment size during Tx
Base on hns3 network engine, when the rte_eth_tx_burst API is called
by Upper Level Process, if PKT_TX_TCP_SEG flag is set and tso_segsz
is 0 in the input parameter structure rte_mbuf, hns3 PMD driver will
process this packet as an non-TSO packet, otherwise hardware will enter
an abnormal state.
Fixes:
6dca716c9e1d ("net/hns3: support TSO")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Wed, 3 Jun 2020 09:32:00 +0000 (17:32 +0800)]
net/hns3: fix VLAN tags reported in Rx
Currently, based on hns3 network engine, driver always reports the
incoming packet's VLAN tags to the structure rte_mbuf those are the
output parameter pointers in '.rx_pkt_burst' ops implementation
function, and never reports PKT_RX_VLAN_STRIPPED flag to the structure
rte_mbuf even if Upper Level Process configured hardware strip by
calling rte_eth_dev_configure or rte_eth_dev_set_vlan_offload API
function. It makes the ULP unable to know the stripping of VLAN.
It is supposed to present the stripped flags to the mbuf ol_flags, and
report the right VLAN tag.
And as hardware constraints, the stripped VLAN tag will always in the Rx
descriptor. Even if setting a PVID based on the function, the PVID will
be reported to the Rx descriptor. So the driver need to determine which
VLAN tag should be reported to output the structure rte_mbuf in
'.rx_pkt_burst' ops implementation function named hns3_recv_pkts.
Fixes:
bba636698316 ("net/hns3: support Rx/Tx and related operations")
Fixes:
411d23b9eafb ("net/hns3: support VLAN")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Chengchang Tang [Wed, 3 Jun 2020 09:31:59 +0000 (17:31 +0800)]
net/hns3: fix VLAN strip configuration when setting PVID
Currently, based on hns3 PF device, hardware will strip 2 vlan tags when
ULP calls rte_eth_dev_set_vlan_pvid API function to set a PVID whether
vlan strip related offload is turned on by calling rte_eth_dev_configure
or rte_eth_dev_set_vlan_offload API function.
When receiving a QinQ packet with the pvid tag, if ULP does not
configure the vlan strip by the method mentioned above, a layer of vlan
tag will be lost to ULP, which is not the expected result.
It is supposed to configure the vlan strip according to the upper level
process's configuration.
Fixes:
411d23b9eafb ("net/hns3: support VLAN")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Chengchang Tang [Wed, 3 Jun 2020 09:31:58 +0000 (17:31 +0800)]
net/hns3: remove unsupported VLAN capabilities
This patch removes unsupported vlan capabilities to avoid misleading
users.
Fixes:
a5475d61fa34 ("net/hns3: support VF")
Fixes:
1f5ca0b460cd ("net/hns3: support some device operations")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Alexander Kozyrev [Tue, 2 Jun 2020 03:50:41 +0000 (03:50 +0000)]
net/mlx5: fix vectorized Rx burst termination
Maximum burst size of Vectorized Rx burst routine is set to
MLX5_VPMD_RX_MAX_BURST(64). This limits the performance of any
application that would like to gather more than 64 packets from
the single Rx burst for batch processing (i.e. VPP).
The situation gets worse with a mix of zipped and unzipped CQEs.
They are processed separately and the Rx burst function returns
small number of packets every call.
Repeat the cycle of gathering packets from the vectorized Rx routine
until a requested number of packets are collected or there are no
more CQEs left to process.
Fixes:
6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Suanming Mou [Mon, 1 Jun 2020 06:09:43 +0000 (14:09 +0800)]
net/mlx5: add reclaim memory mode
Currently, when flow destroyed, some memory resources may still be kept
as cached to help next time create flow more efficiently.
Some system may need the resources to be more flexible with flow create
and destroy. After peak time, with millions of flows destroyed, the
system would prefer the resources to be reclaimed completely, no cache
is needed. Then the resources can be allocated and used by other
components. The system is not so sensitive about the flow insertion
rate, but more care about the resources.
Both DPDK mlx5 PMD driver and the low level component rdma-core have
provided the flow resources to be configured cached or not, but there is
no APIs or parameters exposed to user to configure the flow resources
cache mode. In this case, introduce a new PMD devarg to let user
configure the flow resources cache mode will be helpful.
This commit is to add a new "reclaim_mem_mode" to help user configure if
the destroyed flows' cache resources should be kept or not.
Their will be three mode can be chosen:
1. 0(none). It means the flow resources will be cached as usual. The
resources will be cached, helpful with flow insertion rate.
2. 1(light). It will only enable the DPDK PMD level resources reclaim.
3. 2(aggressive). Both DPDK PMD level and rdma-core low level will be
configured as reclaimed mode.
With these three mode, user can configure the resources cache mode with
different levels.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Suanming Mou [Mon, 1 Jun 2020 06:09:42 +0000 (14:09 +0800)]
common/mlx5: add memory reclaim glue function
While flow destroyed, rdma-core may still cache some resources for more
efficiently flow recreate. In case the peak time that millions of flows
created and destroyed, the cached resources will be very huge.
Currently, rdma-core provides the new function to configure the flow
resources not to be cached. Add the memory reclaim function to avoid
too many resources be cached.
This is the first patch for the memory reclaim. A new devarg will be
added to PMD to support the reclaim can be configured.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Ophir Munk [Mon, 1 Jun 2020 05:50:47 +0000 (05:50 +0000)]
common/mlx5: split common file under Linux directory
File mlx5_common.c includes both specific and non-specific Linux APIs.
Move the Linux specific APIS into a new file named linux/mlx5_common_os.c.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Mon, 1 Jun 2020 05:50:46 +0000 (05:50 +0000)]
common/mlx5: move netlink files under Linux directory
File mlx5_nl.c is using Netlink APIs which are Linux specifics.
Move it (including file mlx5_nl.h) to common/mlx5/linux directory.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Mon, 1 Jun 2020 05:50:45 +0000 (05:50 +0000)]
common/mlx5: move glue files under Linux directory
The glue file mlx5_glue.c is based on Linux specifics APIs.
Move it (including file mlx5_glue.h) to common/mlx5/linux directory.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ian Dolzhansky [Wed, 27 May 2020 14:34:33 +0000 (15:34 +0100)]
net/failsafe: fix RSS RETA size info
Failsafe driver has been indicating zero for RSS redirection table size
after device info reporting had been reworked. Report proper value.
Fixes:
4586be3743d4 ("net/failsafe: fix reported device info")
Cc: stable@dpdk.org
Signed-off-by: Ian Dolzhansky <ian.dolzhansky@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <grive@u256.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Muhammad Bilal [Fri, 29 May 2020 14:47:45 +0000 (19:47 +0500)]
doc: remove duplicated line in memif guide
There was a duplicate command instruction in the documentation of memif
so I have removed the 1 command from it.
Fixes:
cbbbbd3365d2 ("net/memif: enable loopback")
Cc: stable@dpdk.org
Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Suanming Mou [Thu, 28 May 2020 09:22:09 +0000 (17:22 +0800)]
net/mlx5: fix interrupt installation timing
Currently, the DevX counter query works asynchronously with Devx
interrupt handler return the query result. When port closes, the
interrupt handler will be uninstalled and the Devx comp obj will
also be destroyed. Meanwhile the query is still not cancelled.
In this case, counter query may use the invalid Devx comp which
has been destroyed, and query failure with invalid FD will be
reported.
Adjust the shared interrupt install and uninstall timing to make
the counter asynchronous query stop before interrupt uninstall.
Fixes:
f15db67df09c ("net/mlx5: accelerate DV flow counter query")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Suanming Mou [Thu, 28 May 2020 06:59:49 +0000 (14:59 +0800)]
net/mlx5: fix secondary process resources release
When secondary process starts, it will allocate its own process private
data, and also does remap to UAR register of the Tx queue. Once the
secondary process exits, these resources should be released accordingly.
And the shared resources owned by primary should not be touched.
Currently, once one port in the secondary process spawn failed, all the
other spawned ports will also be released during process exits. However,
the mlx5_dev_close() function does not add the cases for secondary
process, it means call the mlx5_dev_close() function directly in
secondary process releases the resources it should not touch.
Add the case for secondary process release to its own resources in
mlx5_dev_close() function to help it quits gracefully.
Fixes:
942d13e6e7d1 ("net/mlx5: fix sharing context destroy order")
Fixes:
3a8207423a0f ("net/mlx5: close all ports on remove")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 27 May 2020 08:37:57 +0000 (08:37 +0000)]
net/mlx5: fix unreachable MPLS error path
The mlx5_flow_validate_item_mpls function checks MPLS item validation.
It first checks if the device supports MPLS, it is done using the ifdef
condition that if it fails to skip to endif and return the appropriate
error.
When MPLS is supported, the preprocessor will copy the body of the
function ending with return 0 followed by the lines that report MPLS
support.
In fact, these lines are unreachable because before them the function
returns 0 and in any case they are unnecessary.
Replace the endif by else and move endif to the end of the
function.
Fixes:
23c1d42c7138 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 27 May 2020 08:37:56 +0000 (08:37 +0000)]
net/mlx5: remove needless Tx queue initialization check
The mlx5_txq_obj_new function defines a pointer named txq_data and
assign value into it. After assigning, the code writer is sure that the
variable does not point to NULL and even express it using assertion.
During the function, the function does dereferencing to the pointer
several times and at no point change its value. However, at the end of
the function at the error label when it wants to free one of the fields
of the structure that txq_data points to, it checks again whether
txq_data is invalid.
This check is unnecessary since it knows for sure that txq_data is
valid.
Remove the aforementioned needless check.
Fixes:
644906881881 ("net/mlx5: add free on completion queue")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 27 May 2020 08:37:55 +0000 (08:37 +0000)]
net/mlx5: fix socket close
The mlx5_pmd_socket_handle function calls the accept function that
returns the socket descriptor into the conn_sock variable. The socket
descriptor value can be 0 (according to accept API) or positive and so
immediately after calling the function it checks whether conn_sock < 0.
Later in the function when other things fail it jumps to the error label
and release previously allocated resources (such as socket or file).
During the resource release, it checks whether the variable conn_sock
containing the socket descriptor is positive and if it is, it releases
it. However, in this check it misses the case where conn_sock == 0, in
this case the socket will not be released and there will be a Resource
leak.
Extend the close condition for 0 value too.
Fixes:
e6cdc54cc0ef ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 27 May 2020 08:37:54 +0000 (08:37 +0000)]
net/mlx5: remove unnecessary init in socket creation
In the mlx5_pmd_socket_handle function it calls the recvmsg function
which returns the number of bytes read. The function assigns this return
value into a ret variable defined at the beginning of the function.
Similarly in the mlx5_pmd_socket_init function the it calls the socket
function which returns a file descriptor for the new socket. The
function also assigns this return value into a ret variable defined at
the beginning of the function.
In both functions they initialize the variable when defining it,
however, in both cases they do not use any ret variable before assigning
the return value from the function, so the initialization is
unnecessary.
Clean the aforementioned unnecessary initializations.
Fixes:
e6cdc54cc0ef ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 27 May 2020 08:37:53 +0000 (08:37 +0000)]
net/mlx5: fix hairpin Rx queue creation error path
The mlx5_rxq_obj_hairpin_new function defines a pointer named tmpl and
allocates memory for it using the rte_zmalloc_socket function.
Later, this function allocates memory to a variable inside tmpl using
the mlx5_devx_cmd_create_rq function.
In both cases, if the allocation fails, the code jumps to the error
label and frees allocated resources. However, in the first jump there
are still no resources to free and the jump only for the line return
NULL is unnecessary. Even worse, when it jumps to error label with
invalid tmpl it actually does dereference to a null pointer.
In contrast, the second jump needs to free the tmpl variable but the
function instead of freeing, tries to free the variable that it just
failed to allocate.
In addition, for another error, the function returns NULL without
freeing the tmpl variable before, causing a memory leak.
Delete the error label and replace each jump with local return NULL and
free tmpl variable if needed.
Fixes:
e79c9be91515 ("net/mlx5: support Rx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 27 May 2020 08:37:52 +0000 (08:37 +0000)]
net/mlx5: fix hairpin Tx queue creation error path
The mlx5_txq_obj_hairpin_new function defines a pointer named tmpl and
allocates memory for it using the rte_zmalloc_socket function.
Later, this function allocates memory to a variable inside tmpl using
the mlx5_devx_cmd_create_sq function.
In both cases, if the allocation fails, the code jumps to the error
label and frees allocated resources. However, in the first jump there
are still no resources to free and the jump only for the line return
NULL is unnecessary. Even worse, when it jumps to error label with
invalid tmpl it actually does dereference to a null pointer.
In contrast, the second jump needs to free the tmpl variable but the
function instead of freeing, tries to free the variable that it just
failed to allocate, and another variable that has never been allocated.
In addition, for another error, the function returns NULL without
freeing the tmpl variable before, causing a memory leak.
Delete the error label and replace each jump with local return NULL and
free tmpl variable if needed.
Fixes:
ae18a1ae9692 ("net/mlx5: support Tx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Haiyue Wang [Thu, 28 May 2020 05:39:13 +0000 (13:39 +0800)]
net/ice: fix PCI DSN to lowercase
The PCI DSN (device serial number) to format package file name should be
lowercase values.
Fixes:
d1c91179e952 ("net/ice: check DSN package file firstly")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Yunjian Wang [Wed, 27 May 2020 12:11:20 +0000 (20:11 +0800)]
net/bnxt: fix missed unlock
Coverity issue: 357741
Fixes:
02a95625fe9c ("net/bnxt: add flow stats in extended stats")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Mike Baucom [Fri, 22 May 2020 23:55:01 +0000 (19:55 -0400)]
net/bnxt: fix mark action if rule is at index zero
In the ingress path, the cfa_code field in Rx completion identifies the
CFA action rule that was used for the incoming packet. It is possible
that the packet could hit the rule at index 0 in the table.
The mark action code was too restrictive by disallowing a cfa_code of
zero.
This code loosens the requirement and allows zero.
Fixes:
b87abb2e55cb ("net/bnxt: support marking packet")
Cc: stable@dpdk.org
Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Jeff Guo [Wed, 27 May 2020 07:16:50 +0000 (15:16 +0800)]
net/iavf: fix flow uninit
When closing VF device, the process of shutdown adminq should be after
the process of uninit the flow, since the VF might still need to use the
adminq to uninit flow.
Fixes:
9e03acd726cf ("net/iavf: fix flow access")
Fixes:
ff2d0c345c3b ("net/iavf: support generic flow API")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>