Yuanhan Liu [Fri, 14 Apr 2017 06:36:45 +0000 (14:36 +0800)]
net/virtio: fix link status always being up
The virtio port link status will always be UP, even the port is stopped:
testpmd> port stop 0
Stopping ports...
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Done
The link status is queried by link_update callback when LSC is disabled.
Which in turn queries the "status" field. However, the "status" is
read-only. I couldn't think of some proper ways to change the status
without doing device reset.
Instead of doing (the heavy) reset at stop, this patch introduced a flag,
which is set to 1 and 0 on start and stop, respectively. When it's set to
0, the link status is set to DOWN unconditionally.
Fixes: a85786dc816f ("virtio: fix states handling during initialization") Cc: stable@dpdk.org Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Mon, 17 Apr 2017 07:27:04 +0000 (15:27 +0800)]
vhost: fix use after free
A "return" is missing on error, which could lead to a "use after free"
issue (about var "conn").
Coverity issue: 143476 Fixes: 65388b43f592 ("vhost: fix fd leaks for vhost-user server mode") Reported-by: John McNamara <john.mcnamara@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jianfeng Tan [Thu, 13 Apr 2017 10:11:27 +0000 (10:11 +0000)]
net/virtio-user: fix feature negotiation
The feature negotiation in virtio-user is proven to be broken,
which results in device initialization failure.
Originally, we get features from vhost backend, and remove those
that are not supported. But when new feature is added, for example,
VIRTIO_NET_F_MTU, we fail to remove this new feature. Then, this
new feature will be negotiated, as both frontend and backend claim
to support this feature.
To fix it, we add a macro to record supported features, as a filter
to remove newly added features.
Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer") Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Xiao Wang [Tue, 11 Apr 2017 10:44:28 +0000 (03:44 -0700)]
net/virtio: fix queue notify
According to spec, we should write virtqueue index into the notify
address, rather than 1. Besides, some HW backend may rely on the data
written to identify which queue need to serve.
Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0") Cc: stable@dpdk.org Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
With the Enhanced multi packet send addition, the defaults were made
in order to get the maximum out of the box performance.
Features like tso, don't use the enhanced send, however the defaults
are still valid. This cause Tx queue creation to fail.
Currently the argument process is done without indication which
parameter was forced by the application and which one is on it
default value.
This becomes problematic when different features requires different
defaults. For example, Enhanced multi packet send and TSO.
This commit modifies the argument process, enabling to differ
which parameter was forced by the application.
Andrew Rybchenko [Tue, 18 Apr 2017 13:03:08 +0000 (14:03 +0100)]
net/sfc: correct RSS hash availability condition
RSS hash is computed by hardware if corresponding Rx filter (for
example, default Rx filters) has RSS flag set which is set if
the number of RSS channels is greater than zero.
Fixes: 4ec1fc3ba881 ("net/sfc: add basic stubs for RSS support on driver attach") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Andrew Rybchenko [Tue, 18 Apr 2017 13:03:06 +0000 (14:03 +0100)]
net/sfc: reset RSS channels back to 0 on close
Fixes: 4ec1fc3ba881 ("net/sfc: add basic stubs for RSS support on driver attach") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Andrew Rybchenko [Tue, 18 Apr 2017 12:51:01 +0000 (13:51 +0100)]
net/sfc: remove logically dead code
Coverity issue: 1419717 Fixes: a9825ccf5bb8 ("net/sfc: support flow API filters") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Current implementation is error-prone if the max inline size
(txq->max_inilne) is decoupled from txq->inline_en and becomes zero. If it
becomes zero, HW can crash due to WQ overflow.
Qiming Yang [Thu, 13 Apr 2017 03:08:33 +0000 (11:08 +0800)]
doc: add known igb_uio issue for i40e
When insmod "igb_uio" with "intr_mode=legacy and test link
status interrupt. Since INTx interrupt is not supported by
X710/XL710/XXV710, it will cause Input/Output error when
reading file descriptor.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by Jingjing Wu <jingjing.wu@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>
This patch corrects the description on Physical and Virtual Function
Infrastructure of Intel NICs. The RSS part description should belong
to ixgbe but not i40e.
This patch also add more notes to describe the queue number on Intel
X710/XL710 NICs.
Jerin Jacob [Fri, 14 Apr 2017 09:41:07 +0000 (15:11 +0530)]
net/thunderx: reduce writes to mbuf
With the mbuf rework, we now have 8 contiguous bytes to be
rearmed in the mbuf at 8B naturally aligned address.
Use single 8B write to avoid multiple 2B writes in Rx path.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Charles Myers [Thu, 13 Apr 2017 22:15:24 +0000 (12:15 -1000)]
net/mlx4: fix Rx after mbuf alloc failure
Fixes issue where mlx4 driver stops receiving packets when mbuf
allocation fails in mlx4_rx_burst().
This issue appears to be caused because the code doesn't recycle the
existing mbuf to the sges array when mbuf allocation fails as is done
in the code right above it which handles (wc.status != IBV_WC_SUCCESS).
Copying the code from the above case fixes the issue.
Fixes: acac55f16412 ("mlx4: use MOFED 3.0 fast verbs interface for Rx operations") Cc: stable@dpdk.org Signed-off-by: Charles Myers <charles.myers@spirent.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Result:
EAL: Initializing pmd_bond for eth_bond0
EAL: Create bonded device eth_bond0 on port 4 in mode 2 on socket 0.
rte_eth_dev_configure: ethdev port_id=4 \
max_rx_pkt_len 9018 > max valid value 2048
It's expected that slaves will be added to bonded device inside
'rte_eth_dev_configure' and proper 'max_rx_pktlen' configured
for all of them.
Failure happens because of hardcoded low value of 'max_rx_pktlen'.
Increasing of this value to ETHER_MAX_JUMBO_FRAME_LEN will allow
above scenario (attach + configure).
It is important because it is the way OVS wants to work with
all DPDK devices (including virtual).
Changing the default hardcoded value makes no harm because
all the slaves' related code uses only 'candidate_max_rx_pktlen'
variable.
Some applications like OVS knows nothing about the
device type and wants to use same API to work with
all of them. But bond_pmd, unlike other pmds, requires
additional step (removing of all the slaves) before
closing the device.
In fact that bond_pmd automatically adds all the
devices from kvargs to bonding on configuration it
also should remove all of them on close.
This change is intended to have the same API for physical
and virtual devices. It allows us to handle virtual
devices in OVS in a common way.
DPDK community has several emails discussion on this topic,
these mails link is bellow:
http://dpdk.org/ml/archives/dev/2017-March/060379.html,
http://dpdk.org/ml/archives/dev/2017-March/060295.html,
items like VLAN can already have several valid "types"
(0x88a8, 0x8100, 0x9100), and who knows what will come up
in the future.
And ixgbe_flow just ignores the types when do filter configuration.
So it may be reasonable to delete the related tpid check process.
Also add some more comment log on stack explanation.
Since the queue release API does not allow failures (return value is void)
and the flow API does not allow a queue to be released as long as a flow
rule depends on it, the only rational decision to avoid undefined behavior
is to panic in this situation.
VF link status rely on PF's notification, so when PF link status
be updated, it should notify VF to update link status also.
Current implementation only cover part of the situation when PF's link
status is updated, call i40e_notify_all_vfs_link_status in
i4e_dev_link_update will cover all situationa.
The patch adds 4 APIs to support configurable
PTYPE mapping for i40e device.
rte_pmd_i40e_ptype_mapping_get.
rte_pmd_i40e_ptype_mapping_replace.
rte_pmd_i40e_ptype_mapping_reset.
rte_pmd_i40e_ptype_mapping_update.
The mapping from hardware defined packet type to software defined packet
type can be updated/reset/read out with these APIs.
Also a software ptype with the most significent bit set will be regarded
as a user defined ptype (RTE_PMD_I40E_PTYPE_USER_DEFINE_MASK) so
application can use it to defined its own PTYPE naming system.
The mapping from hardware defined packet type to software defined
packet type is static for i40e device, the patch let each ethdev to
to have their own copy of mapping table, this give the possibility
that different ethdev can be set different PTYPE mapping rule which
is the requirement to support following hardware's dynamic PTYPE
feature.
Add a section in NIC drivers documentation to explain compiling and
testing of a PMD. It also mentions about host setup, which is required
before running testpmd.
Add label "testpmd_ug" to refer user guide.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: John McNamara <john.mcnamara@intel.com>
As the hardware determines which core will process which packet,
performance is boosted by direct cache warming/stashing as well
as by providing biasing for core-to-flow affinity, which ensures
that flow-specific data structures can remain in the core’s cache.
This patch enables the one cache line data stashing for packet
annotation data and packet context
DPAA2 Hardware Mempool handlers allow enqueue/dequeue from NXP's
QBMAN hardware block.
CONFIG_RTE_MBUF_DEFAULT_MEMPOOL_OPS is set to 'dpaa2', if the pool
is enabled.
This memory pool currently supports packet mbuf type blocks only.
This patch adds generic functions for allowing dq storage
for the frame queues.
As the frame queues are common resource for different drivers
this is helpful.
Before DPAA2 devices can communicate using hardware queues, this patch
adds queue definitions in the FSLMC bus which the DPAA2 devices would
instantiate.
The portal driver is bound to DPIO objects discovered on the fsl-mc bus and
provides services that:
- allow other drivers, such as the Ethernet driver, to enqueue and dequeue
frames for their respective objects
A system will typically allocate 1 DPIO object per CPU to allow queuing
operations to happen simultaneously across all CPUs.
This patch will add support in fslmc vfio process to
scan and parse the dpni and dpseci object for net and crypto
devices. It will add the scanned devices to the fslmc bus.
Add support for using VFIO for dpaa2 based fsl-mc bus.
There are some differences in the way vfio used for fsl-mc bus
from the eal vfio.
- The scanning of bus for individual objects on the basis of
the DPRC container.
- The use and mapping of MC portal for object access
With the evolution of bus model, they can be further aligned with
eal vfio code.
This patch introduces the DPAA2 MC(Management complex Driver).
This is a minimal set of low level functions to send and
receive commands to the fsl-mc. It includes support for basic
management commands and commands to manipulate MC objects.
This is common to be used by various DPAA2 PMDs. e.g.net, crypto
and other drivers.
QBMAN, is a hardware block which interfaces with the other
accelerating hardware blocks (For e.g., WRIOP) on NXP's DPAA2
SoC for queue, buffer and packet scheduling.
This patch introduces a userspace driver for interfacing with
the QBMAN hw block.
The qbman-portal component provides APIs to do the low level
hardware bit twiddling for operations such as:
-initializing Qman software portals
-building and sending portal commands
-portal interrupt configuration and processing
This same/similar code is used in kernel and compat file is used
to make it working in user space.
Michal Krawczyk [Mon, 10 Apr 2017 14:28:11 +0000 (16:28 +0200)]
net/ena: calculate partial checksum if DF bit is disabled
When TSO is disabled we still have to calculate partial checksum if DF bit
if turned off. This is caused by firmware bug.
First of all, we must make sure that we are dealing with IPV4 packet.
If not, we will just skip further checking of this packet and move to
the next one.
If application will not set m2_len field, we assume we that it was Ethernet
frame because we have to look inside the packet to check for the DF flag.
To make it work properly, PMD is assuming that before sending
packet application called function rte_eth_tx_prepare().
Signed-off-by: Michal Krawczyk <mk@semihalf.com> Reviewed-by: Jakub Palider <jpalider@gmail.com> Acked-by: Jan Medala <jan.medala@outlook.com>
Michal Krawczyk [Mon, 10 Apr 2017 14:28:10 +0000 (16:28 +0200)]
net/ena: cleanup if refilling of Rx descriptors fails
If wrong number of descriptors for refilling was passed to the Rx
repopulate function, there was memory leak which caused memory pool to
run out of resources in longer go.
In case of fail when refilling Rx descriptors, all additional mbufs
have to be released.
Fixes: 1173fca25af9 ("ena: add polling-mode driver") Cc: stable@dpdk.org Signed-off-by: Michal Krawczyk <mk@semihalf.com> Reviewed-by: Jakub Palider <jpalider@gmail.com> Acked-by: Jan Medala <jan.medala@outlook.com>
Michal Krawczyk [Mon, 10 Apr 2017 14:28:09 +0000 (16:28 +0200)]
net/ena: fix delayed cleanup of Rx descriptors
On RX path, after receiving bunch of packets, variable tracking
available descriptors in HW queue was not updated.
To fix this issue, variable tracking used descriptors must be updated
after receiving packets - it must be reduced by the amount of received
descriptors in current batch.
Additionally, variable next_to_clean in rx_ring must be updated before
entering ena_populate_rx_queue() to keep it up to date with the current
ring state.
Fixes: 1daff5260ff8 ("net/ena: use unmasked head and tail") Cc: stable@dpdk.org Signed-off-by: Michal Krawczyk <mk@semihalf.com> Reviewed-by: Jakub Palider <jpalider@gmail.com> Acked-by: Jan Medala <jan.medala@outlook.com>