Qi Zhang [Wed, 8 Mar 2017 06:19:01 +0000 (01:19 -0500)]
net/fm10k/base: do not stop reset
Don't report FM10K_ERR_REQUESTS_PENDING when we fail to disable queues
within the timeout. This can occur due to a hardware Tx hang, or when
the switch ethernet fabric is resetting while we are transmitting
traffic. It can sometimes take up to 500ms before the Tx DMA engine
gives up. Instead, just skip the DMA engine check and perform
a data-path reset anyway. Add a statistics counter to keep track of the
number of resets occurring while we have pending DMA on the rings.
In order to prevent having to assign err = FM10K_SUCCESS, re-order the
last few items of the reset_hw_pf function so that we don't perform
"return err" at the end.
Qi Zhang [Wed, 8 Mar 2017 06:18:59 +0000 (01:18 -0500)]
net/fm10k/base: enable lport map request
If the fm10k interface is brought up, but the switch manager software is
not running, the driver will continuously request the lport map every
few seconds in the base driver watchdog routine. Eventually after
several minutes the switch mailbox Tx fifo will fill up and the mailbox
will timeout, resulting in a reset. This reset will appear as if for no
reason, and occurs regularly every few minutes until the switch manager
software is loaded.
The VF uses a multi-bit update request to clear unused VLANs whenever it
resets. However, an accident in a previous refactor broke multi-bit
updates for VFs, due to misreading a comment in fm10k_vf.c and
attempting to reduce code duplication. The problem occurs because
a multi-bit request has a non-zero length, and the PF would simply drop
any request with the upper 16 bits set. In addition, a multi-bit vlan
update does not have a concept for "VLAN 0" as the single bit update
does.
A previous revision of this patch resolved the issue by simply removing
the upper 16 bit check and the iov_select_vid checks. However, this would
remove the checks for default VID and for ensuring no other VLANs can be
enabled except pf_vid when it has been set. To resolve that issue, this
revision uses the iov_select_vid when we have a single-bit update, and
denies any multi-bit update when the VLAN was administratively set by
the PF. This should be ok since the PF properly updates VLAN_TABLE when
it assigns the PF vid. This ensures that requests to add or "remove" the
PF vid work as expected, but a rogue VF could not use the multi-bit
update as a loophole to attempt receiving traffic on other VLANs.
The original comment may be read incorrectly as referring to checking
the *entire* length is zero. However, it only checks the reserved
bits of both length and reserved, using a small amount of code. Update the
comment to indicate this is a clever trick and clearly spell out that it
only checks the reserved bits.
Qi Zhang [Wed, 8 Mar 2017 06:18:56 +0000 (01:18 -0500)]
net/fm10k/base: use different name for override bit
Use a new #define FM10K_VLAN_OVERRIDE even though we're using
the exact same bit. The reason for this is clarity in the code,
otherwise you can read FM10K_VLAN_CLEAR and think it should be
removed. Also add a comment explaining why the FM10K_VLAN_OVERRIDE
bit is set.
Qi Zhang [Wed, 8 Mar 2017 06:18:55 +0000 (01:18 -0500)]
net/fm10k/base: update comment to use 8 bit notation
The diagram represents bit layout of the multi-bit VLAN update
message format. Re-draw the numbers using base 8, and mark the
bit values every 8 bits at the top. This should make the table
easier to grasp quickly.
Qi Zhang [Wed, 8 Mar 2017 06:18:54 +0000 (01:18 -0500)]
net/fm10k/base: add new item to lport msg attr
Add FM10K_PF_ATTR_ID_ERR, since it is possible for the switch manager
to send out an error message indicating status of the LPORT_MAP due to
zero allocated bandwidth.
Qi Zhang [Wed, 8 Mar 2017 06:18:53 +0000 (01:18 -0500)]
net/fm10k/base: clean up the logic
Clean up the logic in fm10k_tlv_attr_parse. We
should not rely on FM10K_NOT_IMPLEMENTED being
greater than zero, as this can easily cause confusion.
The patch also corrects a minor documentation error.
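The control flow this cleanup aims for can be sketched as below. This is a
standalone illustration and not the driver code: the error values, the
handler type, sample_handler() and parse_attrs() are all hypothetical. Only
the idea of comparing against the named code explicitly, rather than testing
its sign, comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical error codes for illustration; the real driver values may differ. */
enum {
	SKETCH_SUCCESS = 0,
	SKETCH_ERR_PARAM = -2,
	SKETCH_NOT_IMPLEMENTED = 0x7FFFFFFF
};

typedef int (*attr_handler_fn)(int attr);

/* Hypothetical handler: attribute 99 is unknown, negative ones are invalid. */
int sample_handler(int attr)
{
	if (attr == 99)
		return SKETCH_NOT_IMPLEMENTED;
	if (attr < 0)
		return SKETCH_ERR_PARAM;
	return SKETCH_SUCCESS;
}

/*
 * Run a handler over each attribute. An attribute the handler does not
 * implement is skipped; any other non-zero result aborts the parse.
 */
int parse_attrs(const int *attrs, size_t count, attr_handler_fn handler)
{
	size_t i;

	for (i = 0; i < count; i++) {
		int err = handler(attrs[i]);

		/* Compare against the named code instead of testing "err > 0". */
		if (err == SKETCH_NOT_IMPLEMENTED)
			continue;
		if (err != SKETCH_SUCCESS)
			return err;
	}
	return SKETCH_SUCCESS;
}
```

With this shape, the result no longer depends on whether the "not
implemented" code happens to be positive or negative.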
Qi Zhang [Wed, 8 Mar 2017 06:18:50 +0000 (01:18 -0500)]
net/fm10k/base: reset multicast mode when deleting lport
Deleting lport when multicast mode is configured to
FM10K_XCAST_MODE_ALLMULTI or FM10K_XCAST_MODE_PROMISC will
result in generating orphaned multicast-group entries in the
switch manager.
Before deleting the lport, reset multicast mode to
FM10K_XCAST_MODE_NONE to flush out these multicast-group
entries.
net/sfc/base: separate limitations on Tx DMA descriptors
Siena has limitations on maximum byte count and 4k boundary crossing
(which is stricter than maximum byte count).
EF10 has a limitation on maximum byte count only.
Fixes: f7dc06bf35f2 ("net/sfc/base: import 5xxx/6xxx family support") Fixes: e7cd430c864f ("net/sfc/base: import SFN7xxx family support") Fixes: 94190e3543bf ("net/sfc/base: import SFN8xxx family support") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
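The two kinds of limitation can be sketched as follows. This is a minimal
illustration, not the sfc base driver code: crosses_4k_boundary() and
max_desc_len() are hypothetical helpers, and the maximum byte count is a
caller-supplied parameter because the real per-family values are not given
here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BOUNDARY 4096u /* the 4k boundary named in the commit message */

/* True if [addr, addr + len) spans more than one 4k-aligned window. */
bool crosses_4k_boundary(uint64_t addr, size_t len)
{
	if (len == 0)
		return false;
	return (addr / SKETCH_BOUNDARY) != ((addr + len - 1) / SKETCH_BOUNDARY);
}

/*
 * Largest prefix of the buffer that fits in one descriptor, given a
 * maximum byte count and, optionally, the boundary restriction.
 */
size_t max_desc_len(uint64_t addr, size_t len, size_t max_bytes,
		    bool no_4k_cross)
{
	size_t n = len < max_bytes ? len : max_bytes;

	if (no_4k_cross) {
		/* Bytes left before the next 4k boundary. */
		size_t to_boundary =
			(size_t)(SKETCH_BOUNDARY - (addr % SKETCH_BOUNDARY));

		if (n > to_boundary)
			n = to_boundary;
	}
	return n;
}
```

A caller would pass no_4k_cross = true for a family with the boundary
restriction and false for a family limited by byte count only.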
With all vmxnet3 version 3 changes incorporated in the vmxnet3 driver,
the driver can configure emulation to run at vmxnet3 version 3, provided
the emulation advertises support for version 3.
This patch also updates release notes.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Acked-by: Yong Wang <yongwang@vmware.com> Acked-by: Jin Heo <heoj@vmware.com>
In vmxnet3 version 3, the emulation added support for the vmxnet3 driver
to communicate information about the memory regions the driver will use
for rx/tx buffers. The driver can also indicate which rx/tx queue the
memory region is applicable for. If this information is communicated
to the emulation, the emulation will always keep these memory regions
mapped, thereby avoiding the mapping/unmapping overhead for every packet.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Guolin Yang <gyang@vmware.com> Acked-by: Yong Wang <yongwang@vmware.com> Acked-by: Jin Heo <heoj@vmware.com>
vmxnet3 driver preallocates buffers for receiving packets and posts the
buffers to the emulation. In order to deliver a received packet to the
guest, the emulation must map buffer(s) and copy the packet into it.
To avoid this memory mapping overhead, this patch introduces the receive
data ring - a set of small sized buffers that are always mapped by
the emulation. If a packet fits into the receive data ring buffer, the
emulation delivers the packet via the receive data ring (which must be
copied by the guest driver), or else the usual receive path is used.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Acked-by: Yong Wang <yongwang@vmware.com> Acked-by: Jin Heo <heoj@vmware.com>
vmxnet3 driver supports transmit data ring viz. a set of fixed size
buffers used by the driver to copy packet headers. Small packets that
fit these buffers are copied into these buffers entirely.
Currently this buffer size is fixed at 128 bytes. This patch extends
transmit data ring implementation to allow variable length transmit
data ring buffers. The length of the buffer is read from the emulation
during initialization.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Acked-by: Yong Wang <yongwang@vmware.com> Acked-by: Jin Heo <heoj@vmware.com>
Shared memory is used to exchange information between the vmxnet3 driver
and the emulation. In order to request emulation to perform a task, the
driver first populates specific fields in this shared memory and then
issues the corresponding command by writing to the command register (CMD). The
layout of the shared memory was defined by vmxnet3 version 1 and cannot
be extended for every new command without breaking backward compatibility.
To address this problem, in vmxnet3 version 3, the emulation repurposed
a reserved field in the shared memory to represent command information
instead. For new commands, the driver first populates the command
information field in the shared memory and then issues the command. The
emulation interprets the data written to the command information
depending on the type of the command. This patch exposes this capability
to the driver.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Acked-by: Yong Wang <yongwang@vmware.com> Acked-by: Jin Heo <heoj@vmware.com>
Jingjing Wu [Tue, 28 Feb 2017 06:26:30 +0000 (14:26 +0800)]
net/i40e: enable DCB on SRIOV VFs
Enable DCB on SRIOV VFs, including:
- UP and TC mapping according to dcb_tc in struct rte_eth_dcb_rx_conf.
- TC and queue mapping: queues are divided equally for each TC.
- UP insert when sending packet according to the TC the Tx queue
belongs to.
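The "queues are divided equally for each TC" rule can be sketched as a small
mapping function. This is an assumption-laden illustration, not the i40e
code: queue_to_tc() is a hypothetical helper, and rejecting queues left over
when the count does not divide evenly is a choice made for this sketch only.

```c
#include <stdint.h>

/*
 * Map a queue index to its traffic class when nb_queues are divided
 * equally among nb_tcs. Returns -1 if the queue has no TC in this scheme.
 */
int queue_to_tc(uint16_t queue, uint16_t nb_queues, uint8_t nb_tcs)
{
	uint16_t per_tc;

	if (nb_tcs == 0 || nb_queues < nb_tcs || queue >= nb_queues)
		return -1;

	per_tc = nb_queues / nb_tcs;

	/* Queues beyond per_tc * nb_tcs are left unassigned in this sketch. */
	if (queue >= per_tc * nb_tcs)
		return -1;

	return queue / per_tc;
}
```

For example, with 16 queues and 4 TCs, queues 0-3 fall in TC 0 and queues
12-15 fall in TC 3.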
Jingjing Wu [Tue, 28 Feb 2017 06:26:27 +0000 (14:26 +0800)]
net/ixgbe: fix multi-queue mode check in SRIOV mode
In the SRIOV case, ETH_MQ_RX_VMDQ_DCB and ETH_MQ_RX_DCB should be treated as
equivalent, because the multi-queue mapping is the same for SRIOV and VMDq
in ixgbe.
Fixes: 27b609cbd1c6 ("ethdev: move the multi-queue mode check to specific drivers") Cc: stable@dpdk.org Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Wenzhuo Lu [Mon, 27 Feb 2017 05:34:04 +0000 (13:34 +0800)]
net/ixgbe: fix all queues drop setting of DCB
DCB is split into RX and TX modes. All-queues-drop is set for TX mode,
which is not appropriate because all-queues-drop is an RX feature.
Move this setting from TX to RX.
Fixes: f3f9b17bb8a5 ("net/ixgbe: support multiqueue mode VMDq DCB with SRIOV") Cc: stable@dpdk.org Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Shahaf Shuler [Thu, 2 Mar 2017 09:05:44 +0000 (11:05 +0200)]
net/mlx5: add hardware checksum offload for tunnel packets
Prior to this commit Tx checksum offload was supported only for the
inner headers.
This commit adds support for the hardware to compute the checksum for the
outer headers as well.
The support is for tunneling protocols GRE and VXLAN.
Wenzhuo Lu [Wed, 22 Feb 2017 02:59:35 +0000 (10:59 +0800)]
net/ixgbe: fix Rx queue blocking issue
In the IOV scenario, multi Rx queues can be assigned to one VF.
If the dropping is not enabled, when no descriptors are available
for one queue, this queue can block others.
Fixes: 00e30184daa0 ("ixgbe: add PF support") Cc: stable@dpdk.org Suggested-by: Liang-Min Larry Wang <liang-min.wang@intel.com> Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Qi Zhang [Tue, 21 Feb 2017 22:45:29 +0000 (17:45 -0500)]
net/i40e: no more initial VF MAC address
During PF initialization, the PF generates an initial MAC address
for VFs; the purpose is to help a VF keep a constant MAC address between
its startup/shutdown cycles. This is no longer necessary, since we already
provide an API to set a VF's MAC address from the PF side
(rte_pmd_i40e_set_vf_mac_addr).
An application can use this API to lock down a VF's MAC address (this
should of course happen before VF init).
Without this patch, we can still use rte_pmd_i40e_set_vf_mac_addr
to overwrite the random one, but this patch aligns DPDK's default behavior
with the kernel PF driver's, which helps to give an identical experience
when working with the kernel VF driver.
Having a drop queue per drop flow consumes a lot of memory and reduces the
speed capabilities of the NIC to handle such cases.
To avoid this and reduce memory consumption, an RSS drop queue is created
for all drop flows.
In the mlx5 PMD, handling a single queue or several destination queues ends up
creating the same Verbs attribute; the main difference resides in the
indirection table and the RSS hash key.
This helps to prepare support for RSS queues by first handling the
queue action as being an RSS queue with a single queue. No RSS hash key
will be provided to the Verbs flow.
Creating a drop queue in mlx5 ends by creating a non-polled queue, but if
the associated work queue could not be created, the error was not handled,
ending in an undefined situation.
Fixes: 2097d0d1e2cc ("net/mlx5: support basic flow items and actions") Cc: stable@dpdk.org Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Wenzhuo Lu [Wed, 1 Mar 2017 06:04:51 +0000 (14:04 +0800)]
net/ixgbe/base: fix build error
Fix ICC build error by removing the EWARN third parameter.
Build error:
.../drivers/net/ixgbe/base/ixgbe_phy.c(1543):
error #268: the format string ends before this argument
EWARN(hw, "WARNING: Intel (R) Network "
^
.../drivers/net/ixgbe/base/ixgbe_phy.c(1805):
error #268: the format string ends before this argument
EWARN(hw, "WARNING: Intel (R) Network "
^
Fixes: aa4fc14d2cee ("ixgbe: update base driver") Fixes: b94a06c1b451 ("ixgbe/base: support qsfp and lco") Cc: stable@dpdk.org Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 1 Mar 2017 06:04:49 +0000 (14:04 +0800)]
net/ixgbe/base: disable FC for 15B0
Disable Ethernet Flow Control (FC) for device 15B0.
Make sure that ixgbe_device_supports_autoneg_fc()
returns false and hw->fc.disable_fc_autoneg is set
to true to avoid running the fc_autoneg function
for the device 15B0, as this device doesn't support
this function.
Wenzhuo Lu [Wed, 1 Mar 2017 06:04:48 +0000 (14:04 +0800)]
net/ixgbe/base: complete HW init when SFP not present
If the SFP module is not present, reset_hw doesn't return success.
SW should still complete the initialization; otherwise, with specific modules,
there was no link when the module was later inserted.
Yong Wang [Tue, 21 Feb 2017 09:33:23 +0000 (04:33 -0500)]
net/e1000/base: fix multicast setting in VF
In function e1000_update_mc_addr_list_vf(), "msgbuf[0]" is used prior
to initialization at "msgbuf[0] |= E1000_VF_SET_MULTICAST_OVERFLOW".
And "msgbuf[0]" is overwritten at "msgbuf[0] = E1000_VF_SET_MULTICAST".
Fix it by moving the second line before the first one mentioned
above.
Fixes: dffbaf7880a8 ("e1000: revert fix for multicast in VF") Cc: stable@dpdk.org Signed-off-by: Yong Wang <wang.yong19@zte.com.cn> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
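The ordering issue can be sketched in isolation as below. This is not the
e1000 base code: the two constants use made-up values and
build_mc_msg_header() is a hypothetical helper. It only shows the
assign-then-OR order the fix describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not taken from the driver headers. */
#define SKETCH_VF_SET_MULTICAST          0x03u
#define SKETCH_VF_SET_MULTICAST_OVERFLOW (0x80u << 16)

/* Build the first message word: assign the command first, then OR in flags. */
uint32_t build_mc_msg_header(bool overflow)
{
	uint32_t msgbuf0;

	/* Initialize the word before any read-modify-write on it. */
	msgbuf0 = SKETCH_VF_SET_MULTICAST;

	/* Only now is it safe to OR in the overflow flag. */
	if (overflow)
		msgbuf0 |= SKETCH_VF_SET_MULTICAST_OVERFLOW;

	return msgbuf0;
}
```

In the reverse order, the OR would read an uninitialized value and the later
plain assignment would discard the flag.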
Chas Williams [Fri, 10 Feb 2017 20:12:06 +0000 (15:12 -0500)]
net/bnx2x: fix transmit queue free threshold
The default tx_free_thresh is potentially larger than the allocated queue
which will result in TX queue cleanup never happening. To fix this,
lower the default free threshold and ensure that the free threshold is
never greater than the maximum outstanding transmit buffers.
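A minimal sketch of the clamping described above, under stated assumptions:
the default value of 64 and the helper name effective_tx_free_thresh() are
hypothetical, as is treating a requested value of 0 as "use the default".

```c
#include <stdint.h>

#define SKETCH_DEFAULT_TX_FREE_THRESH 64u /* hypothetical default */

/*
 * Pick the effective free threshold: fall back to the default when the
 * caller passes 0, and never exceed the number of transmit buffers that
 * can be outstanding on the queue.
 */
uint16_t effective_tx_free_thresh(uint16_t requested, uint16_t max_outstanding)
{
	uint16_t thresh = requested;

	if (thresh == 0)
		thresh = SKETCH_DEFAULT_TX_FREE_THRESH;
	if (thresh > max_outstanding)
		thresh = max_outstanding;

	return thresh;
}
```

Without the second check, a small queue could carry a threshold that is
never reached, so cleanup would never trigger.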
Shahaf Shuler [Tue, 21 Feb 2017 14:37:24 +0000 (16:37 +0200)]
net/mlx5: fix extended statistics
The number of extended statistics counters is queried through ETHTOOL.
ETHTOOL provides a different number when the link is up or down.
Since the extended statistics query occurs at device start,
a segmentation fault might happen when changing the link state before and
after the device start.
This commit addresses this issue by querying the number of statistics
before every call to ETHTOOL.
Qi Zhang [Mon, 20 Feb 2017 18:11:56 +0000 (13:11 -0500)]
net/i40e: fix compile error
Fix the compile error when RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC
is disabled.
Also fake_mbuf is required to be initialized and assigned to
additional sw_ring entries for vector PMD independent from
RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC config option.
Shahaf Shuler [Tue, 14 Feb 2017 14:31:06 +0000 (16:31 +0200)]
net/mlx5: add out of buffer counter to extended statistic
This commit adds RX out of buffer counter to xstats report.
The counter counts the number of packets dropped due to lack of buffers
on device RX queues.
Keith Wiles [Fri, 17 Feb 2017 15:43:04 +0000 (09:43 -0600)]
net/tap: fix possibly unterminated string
Calling strncpy with a maximum size argument of 16 bytes on destination
array "ifr.ifr_ifrn.ifrn_name" of size 16 bytes might leave the
destination string unterminated.
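One common way to avoid the problem is sketched below. It is not necessarily
the change made in the tap driver: copy_ifname() is a hypothetical helper
and SKETCH_IFNAMSIZ simply mirrors the 16-byte size mentioned above.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16 /* size of the destination array named above */

/* Copy an interface name, always leaving the destination nul-terminated. */
void copy_ifname(char dst[SKETCH_IFNAMSIZ], const char *src)
{
	/* Copy at most size - 1 bytes so one byte is left for the terminator. */
	strncpy(dst, src, SKETCH_IFNAMSIZ - 1);
	dst[SKETCH_IFNAMSIZ - 1] = '\0';
}
```

A source name of 16 or more characters is truncated to 15 characters plus
the terminator instead of filling the array with no nul byte.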
Allain Legacy [Fri, 31 Mar 2017 13:52:03 +0000 (09:52 -0400)]
cfgfile: support empty value
This commit adds support to the cfgfile library for parsing a key=value
line that has no value string specified (e.g., "key="). This can be used
to override a configuration attribute that has a default value or default
list of values to set it back to an undefined value to disable
functionality.
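The accepted input shape can be sketched with a small splitter. This is an
illustration only, not the cfgfile parser: split_key_value() is a
hypothetical helper and it does no whitespace trimming.

```c
#include <string.h>

/*
 * Split "key=value" in place. Returns 0 and sets *key and *value on
 * success, or -1 if there is no '=' or the key is empty. An empty value
 * ("key=") is accepted and yields an empty string.
 */
int split_key_value(char *line, char **key, char **value)
{
	char *eq = strchr(line, '=');

	if (eq == NULL || eq == line)
		return -1;

	*eq = '\0';      /* terminate the key */
	*key = line;
	*value = eq + 1; /* may point at the terminator: empty value */
	return 0;
}
```

A caller can then tell "key present with empty value" apart from "key
absent", which is what makes overriding a default possible.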
Joseph Richard [Fri, 31 Mar 2017 13:52:02 +0000 (09:52 -0400)]
cfgfile: fix parsing of long fields
Parsing an ini file with a "key = value" line that has both "key" and
"value" sized to the maximum allowed length causes a parsing failure. The
internal "buffer" variable should be sized at least as large as the maximum
for both fields. This commit updates the local array to be sized to hold
the max name, max value, " = ", and the nul terminator.
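The sizing arithmetic can be sketched as follows. The two length limits are
hypothetical values chosen for illustration (check the library header for
the real ones), and format_entry() is not part of the cfgfile API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limits, counted in characters without the terminator. */
#define SKETCH_NAME_LEN  64
#define SKETCH_VALUE_LEN 256

/* Longest name + " = " (3 bytes) + longest value + nul terminator (1 byte). */
#define SKETCH_LINE_LEN (SKETCH_NAME_LEN + SKETCH_VALUE_LEN + 4)

/* Format one entry; returns 0 if it fits in buf, -1 otherwise. */
int format_entry(char *buf, size_t buflen, const char *name, const char *value)
{
	if (strlen(name) + 3 + strlen(value) + 1 > buflen)
		return -1;

	snprintf(buf, buflen, "%s = %s", name, value);
	return 0;
}
```

A buffer of SKETCH_LINE_LEN bytes holds a line in which both fields are at
their maximum length; a buffer sized for the value alone does not.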
Allain Legacy [Fri, 31 Mar 2017 13:52:01 +0000 (09:52 -0400)]
cfgfile: constrain string search
The call to memchr() uses the absolute length of the string buffer instead
of the actual length of the string returned by fgets(). This causes the
search to go beyond the '\n' character and find ';' characters in random
garbage on the stack. This then causes the 'len' variable to be updated
and the subsequent search for the '=' character to potentially find one
beyond the first newline character.
Since this bug relies on ';' and '=' characters appearing in random places
in the 'buffer' variable it is intermittently reproducible at best.
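The constrained search can be sketched as below. strip_comment() is a
hypothetical helper and not the cfgfile code; it only shows memchr() being
bounded by the actual string length rather than the buffer size.

```c
#include <stddef.h>
#include <string.h>

/*
 * Strip a trailing comment from a line held in a larger buffer.
 * The search is limited to strlen(line), so bytes after the terminator
 * (stale data in the rest of the buffer) are never examined.
 * Returns the resulting length.
 */
size_t strip_comment(char *line, char comment_char)
{
	size_t len = strlen(line);
	char *pos = memchr(line, comment_char, len);

	if (pos != NULL) {
		*pos = '\0';
		len = (size_t)(pos - line);
	}
	return len;
}
```

Passing the full buffer size to memchr() instead of len would let a stray
';' left in the unused tail of the buffer shorten or corrupt the line.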
Allain Legacy [Fri, 31 Mar 2017 13:52:00 +0000 (09:52 -0400)]
cfgfile: support configurable comment character
The current cfgfile comment character is hardcoded to ';'. This commit adds a
new API to allow the user to specify which comment character to use while
parsing the file.
This is to ease adoption by applications that have an existing
configuration file which may use a different comment character. For
instance, an application may already have a configuration file that uses
the '#' as the comment character.
The approach of using a new API with an extensible parameters structure was
used rather than simply adding a new argument to the existing API to allow
for additional arguments to be introduced in the future.
Allain Legacy [Fri, 31 Mar 2017 13:51:59 +0000 (09:51 -0400)]
cfgfile: support global properties section
The current implementation of the cfgfile library requires that all
key=value pairs be within [SECTION] definitions. The ini file standard
allows for key=value pairs in an unnamed section.
This commit adds the capability of parsing key=value pairs from such an
unnamed section. The CFG_FLAG_GLOBAL_SECTION flag must be passed to the
rte_cfgfile_load() API to enable this functionality. Any key=value pairs
found before the first section can be accessed in the section named
"GLOBAL".
Allain Legacy [Fri, 31 Mar 2017 13:51:58 +0000 (09:51 -0400)]
test/cfgfile: add basic unit tests
This commit adds the basic infrastructure for the cfgfile library unit
tests. It includes success path tests for the most commonly used APIs.
More unit tests will be added later.
Yong Wang [Wed, 29 Mar 2017 08:27:35 +0000 (04:27 -0400)]
doc: fix a typo in howto guide
Fixes: 0ba3870e7559 ("doc: add guide to use virtio-user as exceptional path") Cc: stable@dpdk.org Signed-off-by: Yong Wang <wang.yong19@zte.com.cn> Acked-by: John McNamara <john.mcnamara@intel.com>
Jerin Jacob [Mon, 3 Apr 2017 08:35:14 +0000 (14:05 +0530)]
eal/linux: fix build with glibc 2.25
glibc 2.25 warns if applications depend on
sys/types.h for the makedev macro; it expects the macro to be included
from <sys/sysmacros.h>.
Found this error while testing with GCC 6.3.1 on archlinux.
lib/librte_eal/linuxapp/eal/eal_pci_uio.c: In function ‘pci_mknod_uio_dev’:
lib/librte_eal/linuxapp/eal/eal_pci_uio.c:134:13:
error: In the GNU C Library, "makedev" is defined
by <sys/sysmacros.h>. For historical compatibility, it is
currently defined by <sys/types.h> as well, but we plan to
remove this soon. To use "makedev", include <sys/sysmacros.h>
directly. If you did not intend to use a system-defined macro
"makedev", you should undefine it after including <sys/types.h>. [-Werror]
dev = makedev(major, minor);
^~~~~~~~~~~~~~~~~
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
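A minimal sketch of the include the warning asks for, assuming a Linux/glibc
system where <sys/sysmacros.h> is available. make_device_number() is a
hypothetical wrapper used only for illustration.

```c
#include <sys/types.h>     /* dev_t */
#include <sys/sysmacros.h> /* makedev(), major(), minor() on glibc */

/* Build a device number from its major and minor parts. */
dev_t make_device_number(unsigned int maj, unsigned int min)
{
	return makedev(maj, min);
}
```

Including <sys/sysmacros.h> directly keeps the code building once glibc
stops defining makedev through <sys/types.h>.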
Bruce Richardson [Fri, 24 Mar 2017 14:30:11 +0000 (14:30 +0000)]
nic_uio: fix device binding at boot
When loading nic_uio from /boot/loader.conf as specified in the Getting
Started Guide doc, the NIC devices were not bound at boot. Unloading the
nic_uio driver and reloading it would cause them to be bound, however.
The root cause appears to be the fact that when the module is loaded at
boot, the call to find the pci device when parsing the b:d:f parameter
fails to return the device. That means that later on when the device
is probed as part of a PCI scan, no action is taken as it's not recorded
as a device to be used.
We fix this by having the b:d:f string parsed again on probe if the
initial check to see if it's an already-known device fails. In my tests,
this causes the NIC devices to be successfully bound at boot time, as
well as leaving things working as before in the case the module is loaded
post-boot.
Jianfeng Tan [Thu, 16 Mar 2017 16:28:44 +0000 (16:28 +0000)]
vfio: fix secondary process start
When binding with vfio-pci, a secondary process cannot be started and fails
with an error message:
cannot find TAILQ entry for PCI device.
This is because struct rte_pci_addr is padded with 1 byte for alignment
by the compiler. The comparison in commit 2f4adfad0a69
("vfio: add multiprocess support") will then fail if the last byte is not
initialized.
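The padding hazard can be sketched with a stand-in structure. This is an
illustration and not the EAL code: struct sketch_pci_addr and both helpers
are hypothetical, with a layout (5 bytes of fields, typically padded to 6)
chosen to match the description above. Two ways of avoiding the problem are
shown: zeroing the whole object before filling it, and comparing field by
field.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative address layout: 5 bytes of fields, typically padded to 6. */
struct sketch_pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Zero the whole object, padding included, before setting the fields. */
void sketch_pci_addr_init(struct sketch_pci_addr *a, uint16_t domain,
			  uint8_t bus, uint8_t devid, uint8_t function)
{
	memset(a, 0, sizeof(*a));
	a->domain = domain;
	a->bus = bus;
	a->devid = devid;
	a->function = function;
}

/* Compare field by field so padding bytes never influence the result. */
int sketch_pci_addr_equal(const struct sketch_pci_addr *a,
			  const struct sketch_pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->devid == b->devid && a->function == b->function;
}
```

A byte-wise comparison such as memcmp() over sizeof(struct) also reads the
padding byte, which is why it can report a mismatch for two addresses whose
fields are identical.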
Some compilers require definition of vfio_iommu_spapr_tce_ddw_info
before its use in vfio_iommu_spapr_tce_info, so move tce_info
definition below tce_ddw_info.
Fixes: 468f42cc2645 ("vfio: fix build on old kernel") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
The recently added "dma_zalloc_coherent()" call is causing a build error
for Linux kernels < 3.2.
compile error:
lib/librte_eal/linuxapp/igb_uio/igb_uio.c:
In function ‘igbuio_pci_probe’:
lib/librte_eal/linuxapp/igb_uio/igb_uio.c:434:2:
error: implicit declaration of function ‘dma_zalloc_coherent’
[-Werror=implicit-function-declaration]
map_addr = dma_zalloc_coherent(&dev->dev, 1024,
^
dma_zalloc_coherent() was introduced in Linux kernel 3.2, with commit
Linux: 842fa69f3e0c ("include/linux/dma-mapping.h: add dma_zalloc_coherent()")
Since it does not exist in older kernels, it causes a build error.
Switch to the dma_alloc_coherent() API to prevent the build error.
Shreyansh Jain [Fri, 31 Mar 2017 05:35:37 +0000 (11:05 +0530)]
mempool: move stack handler as a driver
Moved from lib/librte_mempool, stack mempool handler is an independent
driver.
Shared builds would now require to link in librte_mempool_stack for
"stack" mempool handler.
Shreyansh Jain [Fri, 31 Mar 2017 05:35:36 +0000 (11:05 +0530)]
mempool: move ring handler as a driver
Moved from lib/librte_mempool, ring mempool is now an independent
driver.
Shared builds would now need to add librte_mempool_ring for:
* ring_mp_mc
* ring_sp_sc
* ring_sp_mc
* ring_mp_sc
Shreyansh Jain [Fri, 31 Mar 2017 05:35:35 +0000 (11:05 +0530)]
mempool: fix crash when handler not found
In case the stack or ring mempool handler is compiled as a shared
library and not linked in with the test binary, a segfault is reported.
This is because the return value of rte_mempool_set_ops_byname is not
being checked in rte_mempool_ops_alloc.
This patch handles the error returned from rte_mempool_set_ops_byname
when a mempool handler is not found.
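The pattern of checking the lookup result can be sketched in isolation. This
does not use the mempool library: the registry, struct sketch_pool and both
functions are hypothetical stand-ins, and returning -EINVAL for an unknown
name is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Handlers "registered" in this sketch; an unlinked driver would be absent. */
static const char *const registered_ops[] = { "ring_mp_mc", "ring_sp_sc" };

struct sketch_pool {
	int ops_index; /* -1 while no handler is attached */
};

/* Attach a handler by name; returns 0 on success, -EINVAL if not found. */
int sketch_set_ops_byname(struct sketch_pool *mp, const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(registered_ops) / sizeof(registered_ops[0]); i++) {
		if (strcmp(registered_ops[i], name) == 0) {
			mp->ops_index = (int)i;
			return 0;
		}
	}
	return -EINVAL;
}

/* Create a pool, propagating a failed handler lookup to the caller. */
int sketch_pool_create(struct sketch_pool *mp, const char *ops_name)
{
	int ret;

	mp->ops_index = -1;
	ret = sketch_set_ops_byname(mp, ops_name);
	if (ret != 0)
		return ret; /* stop here instead of using an unset handler */

	return 0;
}
```

Ignoring the return value would leave the pool without a valid handler, and
the first operation through it would be the point of failure.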
Gage Eads [Thu, 30 Mar 2017 23:02:00 +0000 (18:02 -0500)]
mempool: update non-EAL thread note
Commit 30e6399892276 ("mempool: support non-EAL thread") added the
capability for non-EAL threads to use the mempool library. This commit
removes the note indicating that the mempool library cannot be used safely
by non-EAL threads, and replaces it with a more up-to-date note.