Kumar Sanghvi [Sat, 10 Mar 2018 22:48:22 +0000 (04:18 +0530)]
net/cxgbe: add probe to initialize VF devices
Add probe to initialize VF devices. Separate init/de-init paths
for PF and VF. Do firmware state initialization wrt VF and retrieve
various operational parameters by querying firmware. Finally configure
and initialize ports.
Kumar Sanghvi [Sat, 10 Mar 2018 22:48:20 +0000 (04:18 +0530)]
net/cxgbe: add VF firmware mailbox
Add firmware mailbox communication support for VF. Add is_pf4()
to check if driver is attached to PF4. Use is_pf4() to determine
whether to use PF or VF mailbox communication.
Shahaf Shuler [Mon, 26 Mar 2018 10:12:19 +0000 (13:12 +0300)]
net/mlx5: fix RSS key length query
The RSS key length returned by rte_eth_dev_info_get() was taken from the
PMD private structure, which is initialized only after the port
configuration.
Since Mellanox devices support only a 40B RSS key, report this fixed
length instead.
Fixes: 29c1d8bb3e79 ("net/mlx5: handle a single RSS hash key for all protocols") Cc: stable@dpdk.org Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Dahir Osman [Wed, 21 Mar 2018 12:47:51 +0000 (13:47 +0100)]
net/mlx5: setup RSS regardless of queue count
In some environments it is desirable to have the NIC perform RSS
normally on the packet regardless of the number of queues configured.
The RSS hash result that is stored in the mbuf can then be used by
the application to make decisions about how to distribute workloads
to threads, secondary processes, or even virtual machines if the
application is a virtual switch. This change to the mlx5 driver
aligns with how other drivers in the Intel family work.
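
As an illustration of how an application might consume the RSS hash result, here is a minimal sketch that maps a hash value to a worker index. In DPDK the value would typically be read from the mbuf's hash.rss field; pick_worker() and its modulo policy are hypothetical and are not part of this patch.

```c
#include <stdint.h>

/*
 * Map a 32-bit RSS hash to one of n_workers software workers.
 * Hypothetical helper: a plain modulo keeps every packet of a flow
 * (same hash) on the same worker. Returns 0 when n_workers is 0 so
 * the caller never divides by zero.
 */
static inline unsigned int
pick_worker(uint32_t rss_hash, unsigned int n_workers)
{
	if (n_workers == 0)
		return 0;
	return rss_hash % n_workers;
}
```

Because the mapping depends only on the hash, it works the same way whether the port has one Rx queue or many.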
Yunjian Wang [Tue, 20 Mar 2018 07:01:24 +0000 (15:01 +0800)]
net/i40e: fix intr callback unregister by adding retry
The NIC's interrupt source may still have active callbacks when
the port is hot-unplugged. Add a retry to give the port more chances
to uninit before returning an error.
Fixes: d42aaf30008b ("i40e: support port hotplug") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
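
The retry idea can be sketched as a bounded loop. This is not the driver code: unregister_fn, unregister_with_retry() and busy_n_times() are hypothetical names, and the sketch assumes the unregister call reports a still-active callback with -EAGAIN.

```c
#include <errno.h>

/* Hypothetical stand-in for the real callback unregister call. */
typedef int (*unregister_fn)(void *ctx);

/*
 * Call fn once, then retry up to max_retries more times while it
 * keeps returning -EAGAIN. Returns the result of the last attempt.
 */
static int
unregister_with_retry(unregister_fn fn, void *ctx, unsigned int max_retries)
{
	unsigned int attempt = 0;
	int ret;

	do {
		ret = fn(ctx);
		if (ret != -EAGAIN)
			break;
		/* A real driver would pause briefly here before retrying. */
	} while (attempt++ < max_retries);

	return ret;
}

/* Example stub: reports -EAGAIN until the counter in ctx reaches zero. */
static int
busy_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

Bounding the number of attempts keeps the uninit path from blocking forever if the callback never becomes idle.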
Somnath Kotur [Mon, 26 Mar 2018 03:22:06 +0000 (08:52 +0530)]
net/bnxt: fix flow director with same cmd different queue
When the user reissues the same flow director command with a different
queue, update the existing filter to redirect the flow to the new
destination queue, as is already done for other filters such as
5-tuple and generic flow.
Roman Zhukov [Sat, 24 Mar 2018 06:42:23 +0000 (06:42 +0000)]
net/sfc: fix type of opaque pointer in perf profile handler
The 'opaque' pointer in the handler function is the last argument
of sfc_kvargs_process(), and it points to the adapter's
'evq_flags', which has type uint32_t. So 'value' must be a pointer
to uint32_t.
Fixes: c22d3c508e0c ("net/sfc: support parameter to choose performance profile") Cc: stable@dpdk.org Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Mon, 19 Mar 2018 07:50:11 +0000 (07:50 +0000)]
net/sfc: fix mbuf data alignment calculation
Unlike ffs(), rte_bsf32() counts bit positions from 0.
Fixes: 0c7a0c35f24c ("net/sfc: calculate Rx buffer size which may be used") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
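
The off-by-one can be seen in a small standalone sketch. It does not use DPDK: bsf32() below is a stand-in built on the GCC/Clang builtin __builtin_ctz(), assumed here to behave like a 0-based bit scan, and lowest_bit_value() is a hypothetical helper.

```c
#include <stdint.h>
#include <strings.h>	/* ffs() */

/*
 * 0-based index of the least significant set bit.
 * Stand-in for a 0-based bit scan; v must be non-zero.
 */
static inline uint32_t
bsf32(uint32_t v)
{
	return (uint32_t)__builtin_ctz(v);
}

/* Value of the lowest set bit of a non-zero word, e.g. an alignment. */
static inline uint32_t
lowest_bit_value(uint32_t v)
{
	return UINT32_C(1) << bsf32(v);
}
```

For the same non-zero input, ffs() returns one more than the 0-based scan, so code converted from one to the other must drop or add the "- 1" accordingly.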
Ajit Khaparde [Wed, 28 Feb 2018 22:12:36 +0000 (14:12 -0800)]
net/bnxt: fix LRO disable
When the vnic_tpa_cfg HWRM command is sent to the FW,
we are not passing the VNIC ID in case of disable.
This can cause the FW to return an error.
The correct VNIC ID needs to be passed for both enable and disable.
Ivan Malov [Wed, 21 Mar 2018 11:28:21 +0000 (11:28 +0000)]
net/sfc: add dynamic log level for MCDI messages
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Ivan Malov [Wed, 21 Mar 2018 11:28:20 +0000 (11:28 +0000)]
net/sfc: remove dedicated init log parameter
The previous patches in the set convert per-port
logging to use NOTICE level and make this level default.
This provides the possibility to remove the dedicated
toggle for init-related messages and merge init logging
with the main log type. In order to keep these logs silent
by default, INFO level should be used.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Ivan Malov [Wed, 21 Mar 2018 11:28:19 +0000 (11:28 +0000)]
net/sfc: prepare to merge init logs with main log type
Conversion to dynamic logging done in the previous patches
makes it possible to simplify internal controls for init
logging. This patch prepares for such a change.
It makes init-unrelated messages use NOTICE level so that
the following patch will be able to convert init logging
to use INFO level and remain silent by default.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Ivan Malov [Wed, 21 Mar 2018 11:28:18 +0000 (11:28 +0000)]
net/sfc: support per-port dynamic logging
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Ivan Malov [Wed, 21 Mar 2018 11:28:17 +0000 (11:28 +0000)]
net/sfc: support driver-wide dynamic logging
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Ivan Malov [Wed, 21 Mar 2018 11:28:16 +0000 (11:28 +0000)]
eal: register log type and pick level from args
Dynamic log types are registered on RTE_INIT() step.
This allows one to set log levels by EAL options on
application launch. However, this does not allow
managing log types that are created at runtime.
EAL does not store log levels and types passed from
the command line. Thus, they cannot be picked later.
This is an obvious flaw since it would be better to
be able to pick levels for dynamic types registered
for runtime-determined facilities such as NIC ports.
This patch provides a mechanism to store log levels
passed from EAL options and adds an API to register
log types and pick levels from the internal storage.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
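
The store-then-pick mechanism can be sketched in a few lines. This is not the EAL implementation: save_level(), pick_level(), the fixed-size table and the use of fnmatch() glob matching are all assumptions made to keep the example short.

```c
#include <fnmatch.h>
#include <stddef.h>

#define MAX_SAVED 16

/* One "pattern,level" pair remembered from the command line. */
struct saved_level {
	const char *pattern;
	unsigned int level;
};

static struct saved_level saved[MAX_SAVED];
static size_t n_saved;

/* Remember a pair at option-parsing time. Returns 0, or -1 if full. */
static int
save_level(const char *pattern, unsigned int level)
{
	if (n_saved == MAX_SAVED)
		return -1;
	saved[n_saved].pattern = pattern;
	saved[n_saved].level = level;
	n_saved++;
	return 0;
}

/*
 * Pick the level for a log type registered later at runtime:
 * the last saved pattern that matches wins, else the default.
 */
static unsigned int
pick_level(const char *name, unsigned int level_def)
{
	unsigned int level = level_def;
	size_t i;

	for (i = 0; i < n_saved; i++)
		if (fnmatch(saved[i].pattern, name, 0) == 0)
			level = saved[i].level;
	return level;
}
```

With this shape, a log type created for a port long after startup still honours a level given on the command line, because the option is looked up at registration time instead of being applied once and discarded.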
Nélio Laranjeiro [Tue, 13 Mar 2018 14:17:39 +0000 (15:17 +0100)]
net/mlx5: refuse empty VLAN flow specification
The Verbs specification cannot distinguish between packets that have a
VLAN and those that do not, so the resulting flow rules do not behave
as the user expects. For example,
  flow create 0 ingress pattern eth / vlan / end action queue index 0 / end
  flow create 0 ingress pattern eth / end action queue index 1 / end
collide in the Verbs definition, as both rules match packets with or
without a VLAN.
For this reason, the VLAN specification must not be empty; otherwise the
PMD has to refuse it.
Nélio Laranjeiro [Tue, 13 Mar 2018 14:17:36 +0000 (15:17 +0100)]
net/mlx5: change tunnel flow priority
Packets matching both inner and outer flow rules are caught by whichever
rule was added to the device first, as both flows are configured with the
same priority.
To avoid this, give the inner flow a higher priority than the outer
ones, since their pattern matching would otherwise collide.
Nélio Laranjeiro [Mon, 12 Mar 2018 13:43:19 +0000 (14:43 +0100)]
net/mlx5: fix link status to use wait to complete
Wait to complete is present to let the application get a correct status
when it requires it; it should not be ignored.
Fixes: e313ef4c2fe8 ("net/mlx5: fix link state on device start") Fixes: cb8faed7dde8 ("mlx5: support link status update") Cc: stable@dpdk.org Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Nélio Laranjeiro [Mon, 12 Mar 2018 13:43:18 +0000 (14:43 +0100)]
net/mlx5: fix link status behavior
The current behavior mixes what should be handled by the application
with what is the PMD's responsibility.
According to DPDK API:
- link_update() should only query the link status [1]
- link_set_{up,down}() should only set the link to the requested status [1]
- dev_{start,stop}() should enable/disable traffic reception/emission [2]
On this PMD, the link status is retrieved from the associated net device
owned by the Linux kernel. This does not mean that the PMD cannot
send/receive traffic when this interface is down: the two pieces of
information are unrelated. As long as the physical port is active and has
a link, the PMD can send/receive traffic on the wire.
According to the DPDK API, calling rte_eth_dev_start() while the Linux
interface link is down is therefore possible and allowed, as the traffic
will flow between the DPDK application and the physical port.
This also means that synchronization between the Linux interface and the
DPDK application remains the DPDK application's responsibility.
To handle such synchronization, the application should follow this
scheme when starting:
    rte_eth_link_get(port_id, &link);
    if (link.link_status == ETH_LINK_DOWN)
        rte_eth_dev_set_link_up(port_id);
    rte_eth_dev_start(port_id);
taking into account the possible return values of each function.
The application should also set the LSC interrupt callbacks to catch the
administrator setting the Linux device down/up and react accordingly.
The same callbacks are called when the link on the medium goes down/up.
Nélio Laranjeiro [Mon, 12 Mar 2018 13:43:17 +0000 (14:43 +0100)]
net/mlx5: remove kernel version check
Kernel version check was introduced in
commit 3a49ffe38a95 ("net/mlx5: fix link status query")
due to a bug fixed by
commit ef09a7fc7620 ("net/mlx5: fix inconsistent link status query")
This patch restores the previous behavior as described in the Linux API.
Yongseok Koh [Wed, 14 Mar 2018 17:51:48 +0000 (10:51 -0700)]
net/mlx5: fix ARM build
rdma-core v16 has a bug. The following compilation error occurs on ARM
hosts.
In file included
from drivers/net/mlx5/mlx5_glue.h:16:0,
from drivers/net/mlx5/mlx5_glue.c:11:
/usr/include/infiniband/mlx5dv.h:144:2: error: unknown type name 'off_t'
off_t uar_mmap_offset;
^
As a temporary fix, sys/types.h is included in the PMD. The bug has been
fixed in rdma-core v17, so this workaround can be removed once all Linux
distros ship rdma-core v17 or a back-ported fix. As of now, RedHat 7.5 is
known to have rdma-core v16.
Xueming Li [Fri, 16 Mar 2018 15:22:27 +0000 (23:22 +0800)]
net/mlx5: fix existing file removal
There is no guarantee that the file won't be removed by an external
user/application between the stat() and remove() syscalls; remove() will
fail if the file no longer exists.
Fixes: f8b9a3bad467 ("net/mlx5: install a socket to exchange a file descriptor") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
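
One common way to close this kind of check-then-remove race is to skip the existence check and treat "already gone" as success. The sketch below shows that pattern; remove_if_exists() is a hypothetical helper, and the actual change in the PMD may differ.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Remove path without a prior stat(). A file that has already
 * disappeared (ENOENT) counts as success.
 * Returns 0 on success or a negative errno value.
 */
static int
remove_if_exists(const char *path)
{
	if (remove(path) == 0 || errno == ENOENT)
		return 0;
	return -errno;
}
```

Since there is a single syscall, there is no window in which another process can change the outcome between a check and the removal.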
net/mlx5: change non failing function return values
These functions return int although they are not supposed to fail,
resulting in unnecessary checks in their callers.
Some return an error code where they should return a boolean.
Tomasz Duszynski [Thu, 15 Mar 2018 12:12:21 +0000 (13:12 +0100)]
net/mrvl: fix Rx descriptors number
Filling the hardware buffer pool (bpool) is Rx related, so the constant
describing the maximum number of Rx descriptors should be used
instead of the one for the maximum number of Tx descriptors.
Fixes: 0ddc9b815b11 ("net/mrvl: add net PMD skeleton") Cc: stable@dpdk.org Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
Vipin Varghese [Mon, 12 Mar 2018 21:53:52 +0000 (03:23 +0530)]
net/tap: allow user MAC to be passed as args
Allow the TAP PMD to take a user-specified MAC address as an argument.
The argument value is processed as a string delimited by ':'; it is
validated, parsed, and converted to a hex MAC address.
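
A strict parser for such a ':'-delimited string might look like the sketch below. It is not the TAP PMD code; hex_val() and parse_mac() are hypothetical names, and the sketch assumes exactly six two-digit hex octets.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse "xx:xx:xx:xx:xx:xx" into six bytes.
 * Returns 0 on success, -1 on malformed input (mac may then be
 * partially written and must not be used).
 */
static int
parse_mac(const char *s, uint8_t mac[6])
{
	int i;

	if (s == NULL)
		return -1;
	for (i = 0; i < 6; i++) {
		int hi = hex_val(s[0]);
		/* Only look at s[1] once s[0] is known to be a digit. */
		int lo = hi < 0 ? -1 : hex_val(s[1]);

		if (hi < 0 || lo < 0)
			return -1;
		mac[i] = (uint8_t)(hi << 4 | lo);
		s += 2;
		/* ':' between octets, end of string after the last one. */
		if (*s != (i < 5 ? ':' : '\0'))
			return -1;
		if (i < 5)
			s++;
	}
	return 0;
}
```

Validating digit by digit rejects inputs that a looser sscanf()-style parse might accept, such as single-digit octets or trailing characters.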
Use new rte_eth_linkstatus_get/set helper functions to handle link
status update.
This driver was not doing atomic update of link status information.
And the return value was different from that of other drivers.
The hardware also does not do autonegotiation (at least on Linux).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Many drivers copy/paste the same code to atomically
update the link status. Reduce duplication, and allow for future
changes, by having a common function for this.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
To handle atomic update of link status (64 bit), every driver
was doing its own version using cmpset.
Atomic exchange is a useful primitive in its own right;
therefore make it an EAL routine.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
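
To illustrate why an exchange primitive suits a 64-bit link status word, here is a sketch using C11 atomics as a stand-in for the EAL routine. pack_link(), link_update() and the bit layout are hypothetical and do not reflect the ethdev structures or return conventions.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Pack the link state into one 64-bit word so it can be replaced
 * in a single atomic operation. Illustrative layout only:
 * bit 0 is the up/down flag, the speed sits above it.
 */
static inline uint64_t
pack_link(uint32_t speed_mbps, int up)
{
	return ((uint64_t)speed_mbps << 1) | (up ? 1u : 0u);
}

/*
 * Publish a new link state. Returns 1 if the stored state changed,
 * 0 if it was already equal to new_state.
 */
static inline int
link_update(_Atomic uint64_t *link, uint64_t new_state)
{
	uint64_t old = atomic_exchange(link, new_state);

	return old != new_state;
}
```

The exchange both stores the new value and hands back the previous one, so the caller learns whether the status changed without a compare-and-set retry loop.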
Chas Williams [Wed, 17 Jan 2018 15:04:57 +0000 (10:04 -0500)]
net/vmxnet3: keep consistent link status
Bonding may examine the link properties to ensure that matching interfaces
are bound together. If the link is going to have fixed properties,
these need to remain consistent regardless of the link_status or the
state of the adapter.
Signed-off-by: Chas Williams <chas3@att.com> Acked-by: Shrikrishna Khare <skhare@vmware.com>
Chas Williams [Wed, 17 Jan 2018 15:04:56 +0000 (10:04 -0500)]
net/vmxnet3: set the queue shared buffer at start
If a reconfiguration happens, queuedesc is reallocated. Any queues that
are preserved point to the previous queuedesc since the queues are only
configured during queue setup. Delay configuration of the shared queue
pointers until device start when queuedesc is no longer changing.
Fixes: 8618d19b52b1 ("net/vmxnet3: reallocate shared memzone on re-config") Cc: stable@dpdk.org Signed-off-by: Chas Williams <chas3@att.com> Acked-by: Shrikrishna Khare <skhare@vmware.com>
Glue object files are looked up in RTE_EAL_PMD_PATH by default when set and
should be installed in this directory.
During startup, EAL attempts to load them automatically like other plug-ins
found there. While this is normally harmless, dlopen() fails when rdma-core
is not installed; EAL interprets this as a fatal error and terminates the
application.
This patch requests glue objects to be installed in a different directory
to prevent their automatic loading by EAL since they are PMD helpers, not
actual DPDK plug-ins.
Fan Zhang [Thu, 8 Mar 2018 12:17:52 +0000 (12:17 +0000)]
net/i40e: fix link update no wait
In i40e_dev_link_update() the driver obtains the link status
info via an admin queue command regardless of the "no_wait" flag.
This takes a relatively long time and may be a problem for some
applications such as ovs-dpdk.
(https://bugzilla.redhat.com/show_bug.cgi?id=1551761).
This patch aims to fix the problem by using a different
approach of obtaining link status for i40e NIC without waiting.
Instead of getting the link status via admin queue command,
this patch reads the link status registers to accelerate the
procedure.
Fixes: 263333bbb7a9 ("i40e: fix link status timeout") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Signed-off-by: Andrey Chilikin <andrey.chilikin@intel.com> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Tested-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Ilya Maximets [Mon, 26 Feb 2018 08:39:00 +0000 (09:39 +0100)]
vhost: add note about sockets in server mode
From time to time, someone sends patches about unlinking existing
sockets when registering a vhost user in server mode.
A recent example:
http://dpdk.org/ml/archives/dev/2018-February/090025.html
This problem has been discussed many times, and it was made clear that
the library should not unlink files given by the application in order
to avoid possible security problems, such as removing random files
used by other programs.
One of the first discussions:
http://dpdk.org/ml/archives/dev/2015-December/030326.html
To avoid such patches in the future, it was decided to add a comment
that explains what is happening and tries to describe the reasoning.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Hyong Youb Kim [Thu, 8 Mar 2018 02:46:59 +0000 (18:46 -0800)]
net/enic: support Rx queue interrupts
Enable rx queue interrupts if the app requests them, and vNIC has
enough interrupt resources. Use interrupt vector 0 for link status and
errors. Use vector 1 for rx queue 0, vector 2 for rx queue 1, and so
on. So, with n rx queues, vNIC needs to have at least n + 1 interrupts.
For VIC, enabling and disabling rx queue interrupts are simply
mask/unmask operations. VIC's credit based interrupt moderation is not
used, as the app wants to explicitly control when to enable/disable
interrupts.
This version requires MSI-X (vfio-pci). Sharing one interrupt for link
status and rx queues is possible, but is rather complex, and there is
no user demand for it.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
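
The vector layout described above reduces to simple arithmetic, sketched here. The helper names are hypothetical and are not taken from the enic driver.

```c
/* Vector 0 is reserved for link status and errors. */
enum { LSC_ERR_VECTOR = 0 };

/* Rx queue i uses vector i + 1. */
static inline unsigned int
rxq_to_vector(unsigned int rxq)
{
	return rxq + 1;
}

/* n Rx queues need n + 1 vectors in total. */
static inline unsigned int
vectors_needed(unsigned int n_rxq)
{
	return n_rxq + 1;
}

/* Non-zero if the vNIC has enough interrupts for Rx queue interrupts. */
static inline int
can_enable_rxq_intr(unsigned int n_rxq, unsigned int avail)
{
	return avail >= vectors_needed(n_rxq);
}
```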
Hyong Youb Kim [Thu, 8 Mar 2018 02:46:58 +0000 (18:46 -0800)]
net/enic: allocate stats DMA buffer upfront during probe
The driver provides a DMA buffer to the firmware when it requests port
stats. The NIC then fills that buffer with latest stats. Currently,
the driver allocates the DMA buffer the first time it requests stats
and saves it for later use. This can lead to crashes when
primary/secondary processes are involved. For example, the following
sequence crashes the secondary process.
1. Start a primary app that does not call rte_eth_stats_get()
2. dpdk-procinfo -- --stats
dpdk-procinfo crashes while trying to allocate the stats DMA buffer
because the alloc function pointer (vdev.alloc_consistent) is valid
only in the primary process, not in the secondary process.
Overwriting the alloc function pointer in the secondary process is not
an option, as it will simply make the pointer invalid in the primary
process. Instead, allocate the DMA buffer during probe so that only
the primary process both allocates and frees it. This allows the
secondary process to dump stats as well.
Fixes: 9913fbb91df0 ("enic/base: common code") Cc: stable@dpdk.org Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Hyong Youb Kim [Thu, 8 Mar 2018 02:46:56 +0000 (18:46 -0800)]
net/enic: remove VLAN filter handler
VIC does not support VLAN filtering at the moment. The firmware does
accept the filter add/del commands and returns success. But, they are
no-ops. To avoid confusion, remove the filter set handler so the app
sees an error instead of silent failure.
Also during the device configure time, enicpmd_vlan_offload_set would
not print a warning message about unsupported VLAN filtering, because
the caller specifies only ETH_VLAN_STRIP_MASK. This is wrong, as we
should attempt to apply all requested offloads at the configure
time. So, pass all VLAN offload masks, which triggers a warning
message about VLAN filtering, if requested.
Finally, enicpmd_vlan_offload_set should check both mask and
rxmode.offloads, not just mask.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Hyong Youb Kim [Thu, 8 Mar 2018 02:46:55 +0000 (18:46 -0800)]
net/enic: heed the requested max Rx packet size
Currently, enic completely ignores the requested max Rx packet size
(rxmode.max_rx_pkt_len). The desired behavior is that the NIC hardware
drops packets larger than the requested size, even though they are
still smaller than MTU.
Cisco VIC does not have such a feature. But, we can accomplish a
similar (not same) effect by reducing the size of posted receive
buffers. Packets larger than the posted size get truncated, and the
receive handler drops them. This is also how the kernel enic driver
enforces the Rx side MTU.
This workaround works only when scatter mode is *not* used. When
scatter is used, there is currently no way to support
rxmode.max_rx_pkt_len, as the NIC always receives packets up to MTU.
For posterity, add a copious amount of comments regarding the
hardware's drop/receive behavior with respect to max/current MTU.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
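
The truncate-and-drop workaround can be modelled in a few lines. This is a simulation under stated assumptions, not driver code: struct rx_completion, post_size(), sim_receive() and rx_should_drop() are hypothetical, and scatter mode is assumed to be off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of what the NIC reports for one received packet. */
struct rx_completion {
	uint32_t bytes_written;
	bool truncated;
};

/* Size of the buffer to post: never larger than the requested max. */
static inline uint32_t
post_size(uint32_t max_rx_pkt_len, uint32_t mbuf_room)
{
	return max_rx_pkt_len < mbuf_room ? max_rx_pkt_len : mbuf_room;
}

/* Model of the NIC: a packet larger than the posted buffer is cut. */
static inline struct rx_completion
sim_receive(uint32_t pkt_len, uint32_t posted)
{
	struct rx_completion c;

	c.truncated = pkt_len > posted;
	c.bytes_written = c.truncated ? posted : pkt_len;
	return c;
}

/* Receive handler policy: truncated packets are dropped. */
static inline bool
rx_should_drop(const struct rx_completion *c)
{
	return c->truncated;
}
```

The effect is similar, but not identical, to a hardware size filter: the oversized packet still reaches the host and is discarded in software.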