Adam Dybkowski [Tue, 23 Jul 2019 09:53:28 +0000 (11:53 +0200)]
app/compress-perf: prevent output buffer overflow
This patch fixes the issue of memory overwrite after the end of
the output buffer by calculating its size as the number of all
segments multiplied by the output segment size.
Additionally, buffer overflow errors returned by the PMD are
detected and reported, ending the test.
Also the output buffer size multiplier was increased from 105%
to 110% to allow running the tests on noncompressible files that
expand to over 107% of original size during the compression.
The changes were made in the verification part of the flow and
they don't affect the benchmark results.
Fixes: 424dd6c8c1 ("app/compress-perf: add weak functions for multicore test") Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
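The sizing rule described in this commit can be sketched as follows. This is a minimal illustration and not the code from the patch: the helper names out_seg_size() and out_buf_size() are hypothetical, and only the 110% figure comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Expansion allowance for the output, in percent (110% per the commit message). */
#define OUT_EXPANSION_PCT 110

/* Hypothetical helper: output segment size for a given input segment size. */
static size_t
out_seg_size(size_t in_seg_sz)
{
	/* Round up so the allowance is never truncated. */
	return (in_seg_sz * OUT_EXPANSION_PCT + 99) / 100;
}

/* Hypothetical helper: total output buffer size for the whole input. */
static size_t
out_buf_size(size_t input_sz, size_t in_seg_sz)
{
	assert(in_seg_sz > 0);
	/* Number of segments needed to hold the input, rounded up. */
	size_t nb_segs = (input_sz + in_seg_sz - 1) / in_seg_sz;
	/* The buffer is sized as all segments times the output segment size. */
	return nb_segs * out_seg_size(in_seg_sz);
}
```

Sizing by whole segments rather than by the raw input length means the last, partially filled segment still gets a full output segment.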
Artur Trybula [Wed, 24 Jul 2019 13:55:15 +0000 (15:55 +0200)]
app/compress-perf: improve results report
This patch adds extra features to the compress performance
test. Some important parameters (memory allocation,
number of ops, number of segments) are calculated and
printed out.
Information about threads, cores, devices and queue-pairs
is also printed.
Signed-off-by: Artur Trybula <arturx.trybula@intel.com> Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Bruce Richardson [Tue, 23 Jul 2019 16:26:48 +0000 (17:26 +0100)]
raw/ioat: fix include quotes
Some builds with clang report an error because '<>' rather than '""' were
used for including the ioat spec header file.
Target: x86_64-native-bsdapp-clang
error: 'rte_ioat_spec.h' file not found with <angled> include; use "quotes" instead
#include <rte_ioat_spec.h>
^~~~~~~~~~~~~~~~~
"rte_ioat_spec.h"
1 error generated.
Since this file should always be in the same directory as the main header,
we can safely change the include line to fix this error.
Fixes: abff4333ec20 ("raw/ioat: create device on probe and destroy on release") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
When using IOVA as VA mode, there is no need to map segments
page by page. This normally isn't a problem, but it becomes one
when attempting to use DPDK in no-huge mode, where VFIO subsystem
simply runs out of space to store mappings.
Fix this for x86 by triggering different callbacks based on whether
IOVA as VA mode is enabled.
Fixes: 73a639085938 ("vfio: allow to map other memory regions") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Andrius Sirvys <andrius.sirvys@intel.com>
Andrew Rybchenko [Tue, 23 Jul 2019 12:11:21 +0000 (13:11 +0100)]
ethdev: avoid getting uninitialized info for bad port
rte_eth_dev_info_get() returns void, so the caller does not know whether
the function did its job. Changing the return value to int would be an
API/ABI breakage, which requires a deprecation process and cannot be
backported to stable branches. For now, make sure that device info is
initialized even in the case of an invalid port ID.
Fixes: a30268e9a2d0 ("ethdev: reset whole dev info structure before filling") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
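The idea of initializing the output before validating the port can be sketched as below. This is an illustration only: struct dev_info, MAX_PORTS and dev_info_get() are hypothetical stand-ins, not the ethdev definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 4 /* hypothetical port limit for this sketch */

/* Hypothetical, heavily reduced stand-in for struct rte_eth_dev_info. */
struct dev_info {
	uint16_t max_rx_queues;
	uint16_t max_tx_queues;
};

/* Returns void, like the API discussed above, so the caller cannot see errors. */
static void
dev_info_get(uint16_t port_id, struct dev_info *info)
{
	assert(info != NULL);
	/* Zero the structure first, so it is initialized on every path. */
	memset(info, 0, sizeof(*info));
	if (port_id >= MAX_PORTS)
		return; /* invalid port: caller sees zeros, not stack garbage */
	info->max_rx_queues = 16; /* placeholder values for the sketch */
	info->max_tx_queues = 16;
}
```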
NVGRE has a GRE header with c_rsvd0_ver value 0x2000 and protocol
value 0x6558.
These should be matched when item_nvgre is provided.
This patch adds validation function of NVGRE item.
It also updates the translate function of NVGRE item, to add the
required values, if they were not specified.
Original work by Xiaoyu Min <jackmin@mellanox.com>
Fixes: fc2c498ccb94 ("net/mlx5: add Direct Verbs translate items") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Xiaoyu Min <jackmin@mellanox.com>
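Filling in the required GRE values when the rule does not specify them might look like the sketch below. The struct layout, the full 0xffff masks and the convention that a zero mask means "not specified" are assumptions made for illustration; only the values 0x2000 and 0x6558 come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVGRE_C_RSVD0_VER 0x2000 /* value named in the commit message */
#define NVGRE_PROTOCOL    0x6558 /* value named in the commit message */

/* Hypothetical match entry: a value and a mask per field, host byte order. */
struct gre_match {
	uint16_t c_rsvd0_ver;
	uint16_t c_rsvd0_ver_mask;
	uint16_t protocol;
	uint16_t protocol_mask;
};

/* Fill the NVGRE-specific GRE values when the rule left them unspecified. */
static void
nvgre_fill_defaults(struct gre_match *m)
{
	assert(m != NULL);
	if (m->c_rsvd0_ver_mask == 0) { /* assumption: zero mask = unspecified */
		m->c_rsvd0_ver = NVGRE_C_RSVD0_VER;
		m->c_rsvd0_ver_mask = 0xffff;
	}
	if (m->protocol_mask == 0) {
		m->protocol = NVGRE_PROTOCOL;
		m->protocol_mask = 0xffff;
	}
}
```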
LRO message is contained in the MPRQ strides.
While the LRO message size cannot be bigger than 65280 according to the
PRM, the strides which contain it may be bigger than the maximum buffer
size allowed in dpdk mbuf - 0xFFFF.
Adjust the maximum LRO message size to avoid buffer length overflow.
To support LRO, where a packet can consume all the stride memory, the
external mbuf shared information can no longer be placed at the end of
the stride, because the HW may write packet data to all the stride
memory.
Move the shared information memory from the stride to the control
memory of the external mbuf.
Function mlx5_rxq_obj_new(), previously called mlx5_rxq_ibv_new(),
supports creating Rx queue objects using verbs.
This patch expands the relevant functions, to support creating
verbs or DevX Rx queue objects:
Function mlx5_rxq_obj_new() updated to create RQ object using DevX.
Function mlx5_ind_table_obj_new() updated to create RQT object using DevX.
Function mlx5_hrxq_new() updated to create TIR object using DevX.
New utility functions added to perform specific operations:
mlx5_devx_rq_new(), mlx5_devx_wq_attr_fill(),
mlx5_devx_create_rq_attr_fill().
net/mlx5: store protection domain number on create
Function mlx5_alloc_shared_ibctx() allocates Protection Domain using
verbs API, as part of shared IB device context.
This patch adds reading and storing of pdn value from the created PD
object, using DV API.
The pdn value is required when creating WQ using DevX API.
This patch also updates function flow_dv_create_counter_stat_mem_mng()
which uses the pdn value as well.
Prepare for introducing use of DevX TIR object.
Hash Rx queue is currently created using verbs QP only.
The next patches will add the option to create it with a TIR object
using DevX.
This patch renames hrxq_ibv to hrxq wherever relevant, and adds
the DevX items to relevant structs.
Prepare for introducing the DevX RQT object.
Rx indirection table object is currently created using verbs only.
The next patches will add the option to create an RQT object using
DevX.
This patch renames ind_table_ibv to ind_table_obj wherever relevant,
and adds the DevX items to relevant structs.
Prepare for introducing the DevX RxQ object.
RxQ object is currently created using verbs only.
The next patches will add the option to create RxQ object using DevX.
This patch renames rxq_ibv to rxq_obj wherever relevant, and adds the
DevX items to relevant structs.
When using DevX API, memory for door-bell records should be allocated
by PMD and registered using DevX API.
This patch implements the utility functions to support it:
- Add struct mlx5_devx_dbr_page, containing door-bells page data.
- Add list of struct mlx5_devx_dbr_page door-bell pages to device
private data.
- Implement function mlx5_alloc_dbr_page() to allocate page for
door-bell records, and register it using DevX API.
- Implement function mlx5_get_dbr() to acquire a door-bell record
from the door-bells page, allocating a new page if needed.
- Implement function mlx5_release_dbr() to release a door-bell
record that is no longer needed, freeing the containing page if
it becomes empty.
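The acquire/release life cycle listed above can be sketched with a single page and a bitmap. This is a simplified illustration, not the PMD code: the names dbr_page, dbr_get() and dbr_release() are hypothetical, a single page replaces the list of pages, and the DevX registration step is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DBR_PER_PAGE 64 /* hypothetical number of records per page */

/* Hypothetical page descriptor: one bit per door-bell record. */
struct dbr_page {
	uint64_t bitmap;    /* bit i set means record i is in use */
	unsigned int count; /* number of records in use */
};

/* Take a free record from the page, allocating the page on first use.
 * Returns the record index, or -1 if allocation fails or the page is full. */
static int
dbr_get(struct dbr_page **page)
{
	assert(page != NULL);
	if (*page == NULL) {
		*page = calloc(1, sizeof(**page));
		if (*page == NULL)
			return -1;
	}
	for (int i = 0; i < DBR_PER_PAGE; i++) {
		if (((*page)->bitmap & (UINT64_C(1) << i)) == 0) {
			(*page)->bitmap |= UINT64_C(1) << i;
			(*page)->count++;
			return i;
		}
	}
	return -1;
}

/* Release a record; free the page once no record is in use. */
static void
dbr_release(struct dbr_page **page, int idx)
{
	assert(page != NULL && *page != NULL);
	assert(idx >= 0 && idx < DBR_PER_PAGE);
	assert((*page)->bitmap & (UINT64_C(1) << idx));
	(*page)->bitmap &= ~(UINT64_C(1) << idx);
	if (--(*page)->count == 0) {
		free(*page);
		*page = NULL;
	}
}
```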
Update function mlx5_txq_ibv_new(), query and store the TIS
transport domain value.
It is required later on Rx side when creating matching TIR.
Add field in mlx5 data structure to store Transport Domain ID.
Use DevX API to read device LRO capabilities.
Check if LRO is supported and can be enabled.
Check if MPRQ is supported and can be used.
Enable MPRQ for LRO use if not enabled by user.
Added note for mlx5_mprq_enabled(), to emphasize that LRO
enables MPRQ.
Disable CQE compression and CRC stripping if LRO is enabled.
Add compile option HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR, and matching
dest_tir flag in device configuration structure.
Add glue function pointer dv_create_flow_action_dest_devx_tir, and
function mlx5_glue_dv_create_flow_action_dest_devx_tir(),
to invoke API mlx5dv_dr_action_create_dest_devx_tir().
Add command-line argument to set LRO session timeout.
Add LRO settings struct in PMD configuration struct.
Add support of LRO offload in port configuration.
Add macros and function to check if LRO is supported and enabled.
MAC address parsing was causing a failure [1].
This patch partially reverts
commit b5ddce8959b2 ("app/testpmd: use new ethernet address parser").
[1]
testpmd> flow validate 0 priority 2 ingress group 0 pattern eth dst
is 98:03:9B:5C:D9:00 / end actions queue index 0 / end
Bad arguments
Fixes: b5ddce8959b2 ("app/testpmd: use new ethernet address parser") Reported-by: Raslan Darawsheh <rasland@mellanox.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Raslan Darawsheh <rasland@mellanox.com>
Xiaolong Ye [Mon, 22 Jul 2019 12:06:37 +0000 (20:06 +0800)]
net/i40e: fix ethernet flow rule
i40e FDIR does not allow creating a flow with an empty spec and mask for
the ethertype pattern. Without this patch, the flow below would be
created successfully, which is unexpected.
> flow create 0 ingress pattern eth / end actions drop / end
Fixes: 7d83c152a207 ("net/i40e: parse flow director filter") Cc: stable@dpdk.org Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Beilei Xing <beilei.xing@intel.com>
Krzysztof Kanas [Mon, 22 Jul 2019 14:58:51 +0000 (16:58 +0200)]
net/octeontx2: fix driver reconfiguration
When configure returned an error, e.g. because of unsupported offloads
(outer IP and SCTP), the driver released the Rx and Tx queues. A later
configure call with a correct configuration then left the driver unable
to start: the queues were already released, but the driver thought it
was configured correctly.
Secondly, if the driver returns an error from configuration,
librte_ethdev releases the Rx and Tx queues without changing the driver
configured state.
Fix that by 'releasing' the configuration and changing the driver state
when an error is returned from otx2_nix_configure.
Fixes: 548b5839a32b ("net/octeontx2: add device configure operation") Signed-off-by: Krzysztof Kanas <kkanas@marvell.com> Reviewed-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
This patch resets frc and ctrl in sg tx fd to avoid corruption.
Fixes: 774e9ea91992 ("net/dpaa2: add support for multi seg buffers") Cc: stable@dpdk.org Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The associated device index is retrieved via Netlink request to
underlying Infiniband device driver. This network device index
is permanent throughout the lifetime of device. We do not
spawn the rte_eth_dev ports without associated network device, and
if network device is being unbound we get the remove notification
message and rte_eth_dev port is also detached. So, we may store
the ifindex in mlx5_device_spawn() routine at rte_eth_dev port
creation and initialization time and use the cached value further
instead of doing actual Netlink request.
This patch fills the tx_desc_lim.nb_seg_max and
tx_desc_lim.nb_mtu_seg_max fields of the rte_eth_dev_info
structure to report the maximal number of packet
segments. The requested inline data configuration is
taken into account in a conservative way.
This patch adds the implementation of tx_burst routine template.
The template supports all Tx offloads and multiple optimized
tx_burst routines can be generated by compiler from this one.
Mellanox NICs support a wide set of Tx offloads. The supported
offloads are reported by the mlx5 PMD in the rte_eth_dev_info
tx_offload_capa field.
An application may choose any combination of supported offloads
and configure the device appropriately. Some of the Tx offloads may
not be requested by the application, or even all of them may be omitted.
Most of the Tx offloads require some code branches in the tx_burst
routine to support them. If a Tx offload is not requested, the tx_burst
routine code may be significantly simplified and consume fewer CPU cycles.
For example, if the application does not engage the TSO offload, this code
can be omitted; if multi-segment packets are not expected, tx_burst
may assume single-mbuf packets only, etc.
Currently, the mlx5 PMD implements multiple tx_burst subroutines
for the most common combinations of requested Tx offloads, and each
combination has its own dedicated implementation. Such code is not easy
to update, support and develop, because every change must be made in
multiple places. Also, many frequently requested offload combinations
are not supported yet. That leads to selecting a tx_burst routine that
does not completely match and harms the performance.
This patch introduces a new approach for the tx_burst code. It is proposed
to develop a unified template for the tx_burst routine, which supports
all the Tx offloads and takes a compile-time defined parameter
describing the expected set of supported offloads. On the basis
of this template, the compiler is able to generate multiple tx_burst
routines highly optimized for the statically specified set of
Tx offloads.
Next, at runtime, at Tx queue configuration the best matching optimized
implementation of tx_burst is chosen.
This patch intentionally omits the template internal implementation
and just introduces the template itself to highlight the approach of
multiple specially tuned tx_burst routines.
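The template approach can be sketched as below. This is an illustration of the general technique, not the mlx5 code: the flags, struct pkt and all function names are hypothetical, and the always_inline attribute is a GCC/Clang extension. Because the offload set is a constant at each call site, the compiler can discard the branches that a given specialization does not need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TXOFF_MULTI (1u << 0) /* hypothetical flag: multi-segment packets */
#define TXOFF_CSUM  (1u << 1) /* hypothetical flag: checksum offload */

/* Hypothetical packet descriptor for this sketch. */
struct pkt {
	uint16_t nb_segs;
	uint8_t csum_req;
};

/* Template: 'olx' is a compile-time constant at every call site, so after
 * inlining the compiler can drop the branches for offloads not in the set.
 * Returns the number of work units (segments plus checksum setups). */
static inline __attribute__((always_inline)) unsigned int
tx_burst_tmpl(const struct pkt *pkts, unsigned int n, const unsigned int olx)
{
	unsigned int units = 0;

	assert(pkts != NULL || n == 0);
	for (unsigned int i = 0; i < n; i++) {
		/* Without TXOFF_MULTI the routine assumes single-segment packets. */
		units += (olx & TXOFF_MULTI) ? pkts[i].nb_segs : 1;
		if ((olx & TXOFF_CSUM) && pkts[i].csum_req)
			units++;
	}
	return units;
}

/* Specialized routines generated from the template. */
static unsigned int
tx_burst_none(const struct pkt *p, unsigned int n)
{
	return tx_burst_tmpl(p, n, 0);
}

static unsigned int
tx_burst_full(const struct pkt *p, unsigned int n)
{
	return tx_burst_tmpl(p, n, TXOFF_MULTI | TXOFF_CSUM);
}

typedef unsigned int (*tx_burst_t)(const struct pkt *, unsigned int);

/* Chosen once, at Tx queue configuration time. */
static tx_burst_t
tx_burst_select(unsigned int requested)
{
	return requested == 0 ? tx_burst_none : tx_burst_full;
}
```

With only two specializations the selection falls back to the full routine for any non-empty offload set; a real table would hold more variants and pick the closest match.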
This patch extends the NIC attributes query via DevX.
The appropriate interface structures are borrowed from
kernel driver headers and DevX calls are added to
mlx5_devx_cmd_query_hca_attr() routine.
This patch updates Tx datapath definitions, mostly hardware related.
The Tx descriptor structures are redefined with required fields,
size definitions are renamed to reflect the meanings in more
appropriate way. This is a preparation step before introducing
the new Tx datapath implementation.
This patch introduces new mlx5 PMD devarg options:
- txq_inline_min - specifies minimal amount of data to be inlined into
WQE during Tx operations. NICs may require this minimal data amount
to operate correctly. The exact value may depend on NIC operation
mode, requested offloads, etc.
- txq_inline_max - specifies the maximal packet length to be completely
inlined into WQE Ethernet Segment for ordinary SEND method. If the packet
is larger than the specified value, the packet data won't be copied by
the driver at all, and the data buffer is addressed with a pointer. If
the packet length is less than or equal to the specified value, all
packet data will be copied into WQE.
- txq_inline_mpw - specifies the maximal packet length to be completely
inlined into WQE for Enhanced MPW method.
This patch removes the existing Tx datapath code
as preparation step before introducing the new
implementation. The following entities are being
removed:
The following devargs are deprecated and ignored:
- "txq_inline" is going to be converted to "txq_inline_max"
for compatibility issue
- "tx_vec_en"
- "txqs_max_vec"
- "txq_mpw_hdr_dseg_en"
- "txq_max_inline_len" is going to be converted
to "txq_inline_mpw" for compatibility issue
The deprecated devarg keys are recognized by PMD
and ignored/converted to the new ones in order not
to block device probing.
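The handling of deprecated keys described above could be sketched as a simple lookup. This is an illustration under assumptions: devarg_translate() is a hypothetical helper, and returning NULL for "recognized but ignored" is a convention chosen for this sketch, not the PMD's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map a devarg key to the key the PMD should act on.
 * Returns the new key for converted devargs, NULL for devargs that are
 * recognized but ignored, and the key itself otherwise. */
static const char *
devarg_translate(const char *key)
{
	assert(key != NULL);
	if (strcmp(key, "txq_inline") == 0)
		return "txq_inline_max";
	if (strcmp(key, "txq_max_inline_len") == 0)
		return "txq_inline_mpw";
	if (strcmp(key, "tx_vec_en") == 0 ||
	    strcmp(key, "txqs_max_vec") == 0 ||
	    strcmp(key, "txq_mpw_hdr_dseg_en") == 0)
		return NULL;
	return key;
}
```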
In functions flow_dv_translate() and flow_dv_validate(), the flow
items are scanned and each item is marked in item_flags bitmap.
The code handling some of the items was ported from another project,
where items are marked in a slightly different manner.
This patch fixes the setting of items in bitmap, adapting it to the
required manner.
Fixes: d53aa89aea91 ("net/mlx5: support matching on ICMP/ICMP6") Fixes: 5865955ad994 ("net/mlx5: match GRE key and present bits") Fixes: 2e4c987aad91 ("net/mlx5: validate Direct Rule E-Switch") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Xiaoyu Min <jackmin@mellanox.com>
net/e1000: fix buffer overrun while i219 processing DMA
Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, causing in some high
performance cases a buffer overrun while the I219 LAN Connected
Device is processing the DMA transactions. I219LM and I219V devices
can fall into an unrecovered Tx hang under very stressful UDP traffic
and multiple reconnections of the Ethernet cable. This Tx hang of the LAN
Controller is only recovered if the system is rebooted. Slightly slow
down DMA access by reducing the number of outstanding requests.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates performance loss for TCP
traffic without a noticeable impact on CPU performance.
Please refer to the I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/
ethernet-connection-i218-family-documentation.html
net/bnxt: disable vector mode Tx with VLAN offload
The vector mode transmit path does not currently support VLAN tag
insertion, so we need to disable vector transmit when transmit
VLAN insertion offload is enabled.
We were adding the VLAN filters to all the VNICs of the function.
Also, we were adding these VLANs to all the existing MAC only filters.
This was resulting in fewer VLANs getting added. By default we should
allocate MAC+VLAN filter only to the default VNIC of the function using
the default mac address.
Similar logic was followed in the VLAN deletion code. This patch fixes
it. Use inner VLAN fields instead of outer VLAN during filter deletion
to be in sync with VLAN addition code.
Fixes: 246c5cc5f05e ("net/bnxt: use correct flags during VLAN configuration") Cc: stable@dpdk.org Signed-off-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com> Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The current implementation erroneously passes the address of the
beginning of the RSS table for each 64-entry context instead of the
address of the part of the table appropriate for the context. This results
in only the first 64 receive queues being used. Fix by passing the
correct address for each context.
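The per-context address computation can be sketched as follows. This is an illustration only: rss_ctx_table() is a hypothetical helper and the 16-bit entry type is an assumption; only the 64 entries per context come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSS_ENTRIES_PER_CTX 64 /* entries per context, per the text above */

/* Address of the table slice belonging to context 'ctx'.
 * 'table' is the start of the whole RSS table; the entry type is an
 * assumption made for this sketch. */
static const uint16_t *
rss_ctx_table(const uint16_t *table, unsigned int ctx)
{
	assert(table != NULL);
	/* The bug described above amounts to returning 'table' for every ctx. */
	return table + (size_t)ctx * RSS_ENTRIES_PER_CTX;
}
```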
BCM57500-based adapters use a variable number of RSS contexts
depending upon the number of receive rings in use. The current
implementation is erroneously using the maximum possible number
of RSS contexts instead of the actual number allocated when
setting up RSS tables in the adapter. Fix by using the actual
number of allocated contexts.
In bnxt_hwrm_vnic_rss_cfg_thor, we were exiting if hash_type is 0.
This was preventing RSS from getting disabled. Fix it by removing the
check for hash_type while configuring RSS.
Kalesh AP [Thu, 18 Jul 2019 03:36:05 +0000 (09:06 +0530)]
net/bnxt: fix error checking of FW commands
HWRM_CHECK_RESULT() checks the return value of the HWRM command and
returns in case the command fails. There is no need for another return
value check after HWRM_CHECK_RESULT().
Fixes: 49947a13ba9e ("net/bnxt: support Tx loopback, set VF MAC and queues drop") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Lance Richardson <lance.richardson@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
rte_intr_callback_unregister() can fail if the handler happens to
be active at the time of the call. Add logic to retry a reasonable
number of times to help ensure that the callback is unregistered
on uninit.
Fixes: 7bc8e9a227cc ("net/bnxt: support async link notification") Cc: stable@dpdk.org Signed-off-by: Lance Richardson <lance.richardson@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
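A bounded retry loop of the kind described above might look like this sketch. The names, the retry limit and the use of -EAGAIN to signal an active handler are assumptions for illustration; the unregister call is abstracted behind a hypothetical function pointer rather than calling the EAL API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define UNREG_MAX_RETRIES 10 /* hypothetical retry limit */

/* Hypothetical unregister hook: returns -EAGAIN while the handler is active. */
typedef int (*unreg_fn)(void *arg);

/* Retry a bounded number of times while the callback is still active. */
static int
unregister_with_retry(unreg_fn fn, void *arg)
{
	int rc = -EAGAIN;

	assert(fn != NULL);
	for (int i = 0; i < UNREG_MAX_RETRIES && rc == -EAGAIN; i++) {
		rc = fn(arg);
		/* A real driver would sleep briefly here before retrying. */
	}
	return rc;
}

/* Test double: reports "active" until the counter reaches zero. */
static int
fake_unreg(void *arg)
{
	int *busy = arg;

	if (*busy > 0) {
		(*busy)--;
		return -EAGAIN;
	}
	return 0;
}
```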
Kalesh AP [Thu, 18 Jul 2019 03:36:01 +0000 (09:06 +0530)]
net/bnxt: reset filters before registering interrupts
If interrupt registration fails during device init, driver invokes
uninit which in turn causes error messages while trying to free
vnic filters. Fix this by moving filter initialization call before
interrupt registration.
Fixes: 1b533790f44e ("net/bnxt: avoid invalid vnic id in set L2 Rx mask") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Thu, 18 Jul 2019 03:36:00 +0000 (09:06 +0530)]
net/bnxt: fix device init error path
1. bnxt_dev_init() invokes bnxt_dev_uninit() on failure. So there is
no need to do individual function cleanups in failure path.
2. rearrange the check for primary process to remove an unwanted goto.
3. fix to invoke bnxt_hwrm_func_buf_unrgtr() in bnxt_dev_uninit() when
it is needed.
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Thu, 18 Jul 2019 03:35:59 +0000 (09:05 +0530)]
net/bnxt: fix setting primary MAC address
1. Default filter is tied to VNIC 0 at index 0. After finding the filter
with mac_index 0 and setting the new MAC address, looping through the
remaining filters is unnecessary.
2. Added a check for NULL MAC address.
3. bnxt_hwrm_set_l2_filter() clears the existing filter configuration
first before applying new filter settings. Hence there is no need to
invoke bnxt_hwrm_clear_l2_filter() explicitly in
bnxt_set_default_mac_addr_op().
Fixes: d69851df12b2 ("net/bnxt: support multicast filter and set MAC addr") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Thu, 18 Jul 2019 03:35:55 +0000 (09:05 +0530)]
net/bnxt: fix error handling in port start
1. during port start, if bnxt_init_chip() returns an error,
bnxt_dev_start_op() invokes bnxt_shutdown_nic() which in turn calls
bnxt_free_all_hwrm_resources() to free up resources. Hence remove the
bnxt_free_all_hwrm_resources() call from the bnxt_init_chip() failure path.
2. fix to check the return value of rte_intr_enable() as this call
can fail.
3. set bp->dev_stopped to 0 only when port start succeeds.
4. handle failure cases in bnxt_init_chip() routine to do proper
cleanup and return correct error value.
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
net: be more restrictive with ethernet address format
The current ether_unformat_addr code was based off of
BSD ether_aton. That version changed what was allowed
by the cmdline ether address parser.
For example, it allows dropping leading zeros.
Change the code to be more restrictive and only allow the fully
expanded standard formats.
Bugzilla ID: 324 Fixes: 596d31092d32 ("net: add function to convert string to ethernet address") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
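A restrictive parser of the kind described can be sketched as below. This is not the code from the patch: mac_parse_strict() is a hypothetical function that accepts only the six-group colon-separated form with two hex digits per group, so addresses with dropped leading zeros are rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int
hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Parse exactly "XX:XX:XX:XX:XX:XX"; returns 0 on success, -1 otherwise. */
static int
mac_parse_strict(const char *s, uint8_t *mac)
{
	assert(s != NULL && mac != NULL);
	for (int i = 0; i < 6; i++) {
		int hi = hex_val(s[0]);
		/* Only look at the second digit if the first one was valid. */
		int lo = hi < 0 ? -1 : hex_val(s[1]);

		if (hi < 0 || lo < 0)
			return -1; /* also rejects dropped leading zeros */
		mac[i] = (uint8_t)(hi << 4 | lo);
		s += 2;
		if (i < 5 && *s++ != ':')
			return -1;
	}
	/* Reject trailing characters after the sixth group. */
	return *s == '\0' ? 0 : -1;
}
```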
When the NVM API version is 1.7 or above, the adminq operation to set
TPID is marked as supported. This causes adminq to be used instead of
registers.
For SFP X722 FW4.16, the reported NVM API version is 1.8, which causes
the adminq operation to be marked as supported although it is not
supported on FW4.16.
An additional check is added for SFP X722 so that the adminq operation
is not enabled.
Fixes: 73cd7d6dc8e1 ("net/i40e: use set switch AQ instead of register setting") Cc: stable@dpdk.org Signed-off-by: Xiao Zhang <xiao.zhang@intel.com> Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Ying A Wang [Thu, 18 Jul 2019 01:38:42 +0000 (09:38 +0800)]
net/ice: fix flow action validation
The actions form a list. Each element of the list should be checked,
rather than only the first one.
This patch fixes this issue.
Fixes: d76116a4678f ("net/ice: add generic flow API") Signed-off-by: Ying A Wang <ying.a.wang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com> Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
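Checking every element of an END-terminated action list can be sketched as follows. The enum values and helper names are hypothetical and only illustrate the loop; they are not the rte_flow or ice definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action types; the list is terminated by ACT_END. */
enum act_type { ACT_END, ACT_QUEUE, ACT_DROP, ACT_UNSUPPORTED };

static bool
act_supported(enum act_type t)
{
	return t == ACT_QUEUE || t == ACT_DROP;
}

/* Validate every element of the list, not only the first one. */
static bool
actions_valid(const enum act_type *acts)
{
	assert(acts != NULL);
	for (; *acts != ACT_END; acts++) {
		if (!act_supported(*acts))
			return false;
	}
	return true;
}
```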
Some configuration options cannot be tested properly with testpmd
because it automatically starts all ports. This makes it harder
to test driver handling of configuration options
(for example rx_deferred_start).
Add new command line flag --disable-device-start which skips
the device start. The port can then be started manually later.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>