Wei Zhao [Wed, 11 Oct 2017 08:55:32 +0000 (16:55 +0800)]
net/i40e: support queue region set and flush
This feature enable queue regions configuration for RSS in PF,
so that different traffic classes or different packet
classification types can be separated to different queues in
different queue regions.This patch can set queue region range,
it include queue number in a region and the index of first queue.
This patch enable mapping between different priorities (UP) and
different traffic classes.It also enable mapping between a region
index and a sepcific flowtype(PCTYPE).It also provide the solution
of flush all configuration about queue region the above described.
Wei Dai [Thu, 28 Sep 2017 02:28:33 +0000 (10:28 +0800)]
net/ixgbe: fix VFIO interrupt mapping in VF
When a VF port is bound to VFIO-PIC, only miscellaneous interrupt
is mapped to VFIO vector 0 in eth_ixgbevf_dev_init( ).
In ixgbevf_dev_start(), if previous VFIO interrupt mapping set in
eth_ixgbevf_dev_init( ) is not cleard, it will fail when calling
rte_intr_enable( ) tries to map Rx queue interrupt to other VFIO
vectors. This patch clears the VFIO interrupt mappings before
setting both miscellaneous and Rx queue interrupt mappings again
to avoid failure.
Fixes: 77234603fba0 ("net/ixgbe: support VF mailbox interrupt for link up/down") Cc: stable@dpdk.org Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Tested-by: Jianwei Ma <jianwei.ma@intel.com>
Wei Dai [Wed, 20 Sep 2017 10:18:13 +0000 (18:18 +0800)]
net/ixgbe: fix Rx queue interrupt mapping in VF
When a VF port is bound to VFIO-PCI, miscellaneous interrupt is
mapped to MSI-X vector 0 and Rx queues interrupt are mapped to
other vectors in vfio_enable_msix( ). To simplify implementation,
all VFIO-PCI bound ixgbe VF Rx queue interrupts can be mapped in
vector 1. And as current igb_uio only support only one vector,
ixgbe VF PMD should use vector 0 for igb_uio and vector 1 for
VFIO-PCI. Without this patch, VF Rx queue interrupt is mapped
to vector 0 in register settings and mapped to VFIO vector 1
in vfio_enable_msix( ), and then all Rx queue interrupts will
be missed.
Fixes: b13bfab4cdbe ("eal: reserve VFIO vector zero for misc interrupt") Cc: stable@dpdk.org Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Tested-by: Jianwei Ma <jianwei.ma@intel.com>
Matan Azrad [Tue, 10 Oct 2017 20:20:18 +0000 (20:20 +0000)]
ethdev: add return value to stats get dev op
The stats_get dev op API doesn't include return value, so PMD cannot
return an error in case of failure at stats getting process time.
Since PCI devices can be removed and there is a time between the
physical removal to the RMV interrupt, the user may get invalid stats
without any indication.
This patch changes the stats_get API return value to be int instead of
void.
All the net PMDs stats_get dev ops are adjusted by this patch.
Ajit Khaparde [Tue, 10 Oct 2017 14:23:00 +0000 (09:23 -0500)]
net/bnxt: fix cleanup if a filter allocation fails
We are not checking if a filter allocation succeeded.
And we end up accessing a null pointer after that.
Also invalidate the fw_l2_filter_id to prevent unnecessary
HW access and hence HWRM command failures during exit.
Yongseok Koh [Tue, 10 Oct 2017 14:04:02 +0000 (07:04 -0700)]
net/mlx5: fix deadlock due to buffered slots in Rx SW ring
When replenishing Rx ring, there're always buffered slots reserved
between consumed entries and HW owned entries. These have to be filled
with fake mbufs to protect from possible overflow rather than
optimistically expecting successful replenishment which can cause
deadlock with small-sized queue.
Fixes: fc048bd52cb7 ("net/mlx5: fix overflow of Rx SW ring") Cc: stable@dpdk.org Reported-by: Martin Weiser <martin.weiser@allegro-packets.com> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Tested-by: Martin Weiser <martin.weiser@allegro-packets.com>
As it stands, the existing assignment to mbuf has no effect outside of
the function. Prior to this change, the mbuf argument would contain
an invalid address, but it would not be null. After this change, the
caller gets a null mbuf back.
This commit extends the testpmd application with new forwarding engine
that demonstrates the use of ethdev traffic management APIs and softnic
PMD for QoS traffic management.
In this mode, 5-level hierarchical tree of the QoS scheduler is built
with the help of ethdev TM APIs such as shaper profile add/delete,
shared shaper add/update, node add/delete, hierarchy commit, etc.
The hierarchical tree has following nodes; root node(x1, level 0),
subport node(x1, level 1), pipe node(x4096, level 2),
tc node(x16348, level 3), queue node(x65536, level 4).
During runtime, each received packet is first classified by mapping the
packet fields information to 5-tuples (HQoS subport, pipe, traffic class,
queue within traffic class, and color) and storing it in the packet mbuf
sched field. After classification, each packet is sent to softnic port
which prioritizes the transmission of the received packets, and
accordingly sends them on to the output interface.
To enable traffic management mode, following testpmd command is used;
Yongseok Koh [Mon, 9 Oct 2017 18:46:59 +0000 (11:46 -0700)]
net/mlx5: fix configuration of Rx CQE compression
With the upstream rdma-core, to enable Rx CQE compression,
mlx5dv_create_cq() in Direct Verbs has to be used instead of regular
Verbs call (ibv_create_cq()). And if the size of CQE is 128 bytes,
compression is supported only by certain devices. Thus, it has to be
decided by checking the capability bits.
Yongseok Koh [Mon, 9 Oct 2017 18:46:58 +0000 (11:46 -0700)]
net/mlx5: match Rx completion entry size to cacheline
The size of Rx completion entry should match the size of a cacheline.
This is already reflected in struct mlx5_cqe by adding 64bytes padding
if a cacheline is 128bytes. Some ARM CPUs have 128bytes cacheline.
Yongseok Koh [Mon, 9 Oct 2017 18:46:57 +0000 (11:46 -0700)]
net/mlx5: separate shareable vector functions
Considering more architecture (e.g. ARM and PowerPC) will be added for
vectorized Rx/Tx burst, all the shareable functions which don't use any
vector intrinsics need to be separated from architecture-dependent
functions. All the vector functions for x86 SSE are moved to a new
header file - mlx5_rxtx_vec_sse.h. And shareable common functions are
now in mlx5_rxtx_vec.c.
Yongseok Koh [Mon, 9 Oct 2017 18:46:54 +0000 (11:46 -0700)]
net/mlx5: cleanup memory barriers
Updating a consumer index to HW doesn't require a memory barrier in case
that there's no updated data to be posted to HW, but a compiler barrier
is sufficient. rte_wmb() is replaced with rte_io_wmb() when it makes
changes visible to HW, not other core.
net/mlx5: handle RSS hash configuration in RSS flow
Add RSS support according to the RSS configuration.
A special case is handled, when the pattern does not cover the RSS hash
configuration request such as:
flow create 0 ingress pattern eth / end actions rss queues 0 1 end / end
In such situation with the default configuration of testpmd RSS i.e. IP,
it should be converted to 3 Verbs flow to handle correctly the request:
1. IPv4 flow, an extra IPv4 wildcard specification needs to be added in
the conversion.
2. IPv6 flow, same as for IPv4.
3. Ethernet followed by any other protocol on which no RSS can be
performed and thus the traffic will be redirected to the first queue
of the user request.
The same kind of issue is handled if the RSS is performed only on UDPv4
or UDPv6 or TCPv*.
This does not handle a priority conflict which can occurs if the user
adds several colliding flow rules. Currently in the example above, the
request is already consuming 2 priorities (1 for IPv4/IPV6 matching
rule priority and one for Ethernet matching rule priority + 1).
Moves ibv_attr containing the specification of the flow from Verbs point
of view also with the verbs flow itself near the related verbs objects
making the flow.
This is also a preparation to handle correctly the RSS hash
configuration provided by the user, has multiple Verbs flows will be
necessary for a single generic flow.
struct mlx5_flow_parse was commonly used with the name "flow" confusing
sometimes the development. The variable name is replaced by parser to
reflect its use.
In case the pattern contains an RSS actions, the RSS configuration to
use is the one provided by the user. To make the correct conversion
from DPDK RSS hash fields to Verbs ones according to the users requests
the actions must be processed first.
net/mlx5: fully convert a flow to verbs in validate
Validation of flows is only making few verifications on the pattern, in
some situation the validate action could end by with success whereas the
pattern could not be converted correctly.
This brings this conversion verification part also to the validate.
drivers/net/mlx5/mlx5_rxq.c:606:6: error: comparison of constant 4
with expression of type 'enum hash_rxq_flow_type' is always true
[-Werror,-Wtautological-constant-out-of-range-compare]
i != (int)RTE_DIM((*priv->hash_rxqs)[0].special_flow);
~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Clang expects to have an index going upto special_flow size which is
defined by MLX5_MAX_SPECIAL_FLOWS and value is 4. Comparing to an
unrelated enum where index my be lower cause this compilation issue.
Hash Rx queue is an high level queue providing the RSS hash algorithm,
key and indirection table to spread the packets. Those objects can be
easily shared between several Verbs flows. This commit bring this
capability to the PMD.
Indirection table in verbs side resides in a list of final work queues
to spread the packets according to an higher level queue. This
indirection table can be shared among the hash Rx queues which points
to them.
Use the same design for DPDK queue as for Verbs queue for symmetry, this
also helps in fixing some issues like the DPDK release queue API which
is not expected to fail. With such design, the queue is released when
the reference counters reaches 0.
Use the same design for DPDK queue as for Verbs queue for symmetry, this
also helps in fixing some issues like the DPDK release queue API which
is not expected to fail. With such design, the queue is released when
the reference counters reaches 0.
net/mlx5: separate DPDK from verbs Tx queue objects
Move verbs object to their own functions to allocate/release them
independently from the DPDK queue. At the same time a reference counter
is added to help in issues detections when the queue is being release
but still in use somewhere else (flows for instance).
net/mlx5: separate DPDK from verbs Rx queue objects
Move verbs object to their own functions to allocate/release them
independently from the DPDK queue. At the same time a reference counter
is added to help in issues detections when the queue is being release
but still in use somewhere else (flows for instance).
This patch introduce the Memory region as a shared object where users
should get a reference to it by calling the priv_mr_get() or
priv_mr_new() to create the memory region. This last one will
register the memory pool in the kernel driver and retrieve the
associated memory region.
This should help to reduce the memory consumption cause by registering
multiple times the same memory pool.
The number of queues in DPDK does not means that the array of queue will be
totally filled, those information are uncorrelated. The number of queues
is provided in the port configuration whereas the array is filled by
calling tx/rx_queue_setup(). As this number of queue is not increased or
decrease according to tx/rx_queue_setup() or tx/rx_queue_release(), PMD
must consider a queue may not be initialised in some position of the array.
Reta update needs to stop/start the port but stopping the port does not
disable the polling functions which may end in a segfault if a core is
polling the queue while the control thread is modifying it.
This patch changes the sequences to an order where such situation cannot
happen.
Precompiler instructions #ifdef RTE_LIBRTE_I40E_PMD ... #endif
were not placed correctly, which caused number of
compilation errors if I40E PMD is disabled.
Mark Kavanagh [Mon, 9 Oct 2017 13:59:30 +0000 (14:59 +0100)]
net/bnxt: fix build
For gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7),
As of 5ef3b79fdfe6f, compilation of DPDK fails with the following
ERROR MESSAGE:
"bnxt_filter.c:960:117: error: ‘vnic’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]".
Resolve this by initializing 'vnic' to NULL;
Fixes: 5ef3b79fdfe6 ("net/bnxt: support flow filter ops") Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Phil Yang [Fri, 22 Sep 2017 06:21:32 +0000 (14:21 +0800)]
app/testpmd: fix quitting in container
In container, the process cannot be terminated by SIGINT/SIGTERM when
execute with 'stats-period' option.
Fixed by adding a flag to exit stats period loop after received either
SIGINT or SIGTERM.
Fixes: cfea1f3048d1 ("app/testpmd: print statistics periodically") Cc: stable@dpdk.org Signed-off-by: Phil Yang <phil.yang@arm.com> Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
The corrupted code didn't check the port availability when
it was trying to set the forward port IDs array.
However, when it was counting the number of ports, the availability
was checked by RTE_ETH_FOREACH_DEV iterator.
Hence, even when ETH devices ports were not in ATTACHED state,
the testpmd tried to forward traffic by them and got segmentation
fault at queue access time.
For example:
When EAL command line parameters include two devices, the first
is failsafe with two sub devices and the second is any device,
testpmd gets two devices by the iterator and sets for forwarding
both, the failsafe device and the failsafe first sub device
(instead of the second sub device).
After the first failsafe sub device state was changed to DEFERRED,
testpmd tries to forward traffic through the deferred device
because it didn't check the port availability in setting time.
The fix uses the RTE_ETH_FOREACH_DEV iterator for the forward
port IDs default setting.
Li Han [Tue, 22 Aug 2017 05:03:42 +0000 (13:03 +0800)]
app/testpmd: fix invalid port id parameters
in parse_ringnuma_config/parse_portnuma_config functions, port_id
should be less than RTE_MAX_ETHPORTS, but port_id_is_invalid check
assumes that port_id may be RTE_PORT_ALL (65535).
Also fix port_id storage size.
Fixes: 4468635fdd04 ("app/testpmd: forbid actions on invalid port") Cc: stable@dpdk.org Signed-off-by: Li Han <han.li1@zte.com.cn> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Daniel Mrzyglod [Fri, 18 Aug 2017 14:17:36 +0000 (16:17 +0200)]
app/testpmd: fix DDP package filesize detection
This issue was about passing unsigned argument where should be signed
number.
In reality this is about wrong usage of fseek and ftell to determine
filesize.
This patch is compliant to suggestions from FIO19-C:
"Do not use fseek() and ftell() to compute the size of a regular file"
Rasesh Mody [Sat, 7 Oct 2017 06:31:10 +0000 (23:31 -0700)]
net/qede/base: fix for VF malicious indication
IOV regression testing led to discovery of a minor issue + possibly race
in IOV flows:
a. Malicious indications in VF-database on PF-side get cleared during
FLR flows - but not when disabling SRIOV. At least in Linux if you
disable IOV while having a malicious VF you wouldn't be able to
clear the indication as driver would prevent from initializing it.
b. Possible race during PF response to VF - the channel is made ready
only after sending the rc via dmae to VF. It's possible due to
context switch at end of DMAE [when releasing Mutex] that VF would
start running and send another message prior to PF clearing the
channel, making the FW consider that VF to be malicious.
This patch fixes that by
- clearing the indication even if we're only going to disable VF
- resetting the channel to ready before PF copies the rc to the VF, PF
can then continue and send an additional message
Rasesh Mody [Sat, 7 Oct 2017 06:31:09 +0000 (23:31 -0700)]
net/qede/base: fix access to an uninitialized list
Fix an access to an uninitialized list when the management FW is not
initialized by simply doing the list initialization always, at a
previous step, before ecore_mcp_cmd_init() can stop in the middle and
return.
Rasesh Mody [Sat, 7 Oct 2017 06:31:06 +0000 (23:31 -0700)]
net/qede/base: code cleanup
- Remove some dead definitions, function declarations and unused
variables
- Remove an obsolete workaround from ecore_int_igu_enable()
- Remove set variables that are not used
- Remove needless check in ecore_init_wfq_param() when configuring
minimum vport BW. We already check whether total for all vports is
greater than the PF's, so no need to check independently the current
requested configuration as well.
Rasesh Mody [Sat, 7 Oct 2017 06:31:04 +0000 (23:31 -0700)]
net/qede/base: add/change/revise logs
Changes include:
- adding new log to print Dcbx version
- reformatting log in ecore_dmae_operation_wait() in case of DMA engine
failure
- changing verbosity of some log messages such as:
VFs incorrect behavior should be logged on PF with DP_VERBOSE(), not
DP_NOTICE(). In general keep IOV-related logs at low verbosity.
Log the critical issues from VF perspective, like message to PF
times-out or the PF rejects the VF configuration, as NOTICE than
VERBOSE
- Add a printout of some MCP CPU info in case of no MFW response
Rasesh Mody [Sat, 7 Oct 2017 06:31:02 +0000 (23:31 -0700)]
net/qede/base: add various OS abstraction macros
- Introduce OSAL_IOV_VF_VPORT_STOP to allow VF to carry out required
operations and prevent a potential assert before closing vport
- Add OSAL_DIV_S64() for 64-bit division on 32-bit platforms.
- Add OSAL for transceiver update OSAL_TRANSCEIVER_UPDATE()
- Add OSAL for MFW command preemption OSAL_MFW_CMD_PREEMPT() within the
spinning loops while sending a mailbox command to the MFW
- Implement OSAL_SPIN_LOCK_IRQSAVE macro
- Rename OSAL_NUM_ACTIVE_CPU() to OSAL_NUM_CPUS()