Shahaf Shuler [Mon, 1 May 2017 06:58:12 +0000 (09:58 +0300)]
doc: announce ABI change for Tx offload
This is an ABI change notice for DPDK 17.08 in ethdev
about changes in rte_eth_txmode structure.
Currently Tx offloads are enabled by default, and can be disabled
using ETH_TXQ_FLAGS_NO* flags. This behaviour is not consistent with
the Rx side where the Rx offloads are disabled by default and enabled
according to bit field in rte_eth_rxmode structure.
The proposal is to disable the Tx offloads by default, and provide
a way for the application to enable them in rte_eth_txmode structure.
Besides making the Tx configuration API more consistent for
applications, this will let PMDs provide better out-of-the-box
performance.
Finally, as part of the work, the ETH_TXQ_FLAGS_NO* will
be superseded as well.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Gaetan Rivet [Wed, 10 May 2017 15:46:10 +0000 (17:46 +0200)]
doc: announce ABI change for device parameters
The PCI and virtual bus are planned to be moved to the generic
drivers/bus directory in v17.08. For this change to be possible, the EAL
must be made completely independent.
The rte_devargs structure currently holds device representation internal
to those two busses. It must be made generic before this work can be
completed.
Instead of using either a driver name for a vdev or a PCI address for a
PCI device, a devargs structure will have to be able to describe any
possible device on all busses, without introducing dependencies on
any bus-specific device representation. This will break the ABI for this
structure.
Additionally, an evolution will occur regarding the device parsing
from the command-line. A user must be able to set which bus will handle
which device, and this setting is integral to the definition of a
device.
The format has not yet been formally defined, but a proposition will
follow soon for a new command line parameter format for all devices.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: David Marchand <david.marchand@6wind.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Coverity reported that an argument for sizeof was used improperly.
We should allocate memory for value size that pointer points to,
instead of pointer size itself.
Coverity issue: 144523, 144521 Fixes: 7ac16a3660c0 ("app/proc-info: support xstats by ID and by name") Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
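As an illustration of the class of bug described above, here is a minimal sketch. The function name alloc_counters() is hypothetical and is not taken from the patched code; the point is only the difference between sizeof(*values) and sizeof(values).

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate an array of n counters.
 * sizeof(*values) is the size of one uint64_t element; sizeof(values)
 * would be the size of the pointer itself, which is the kind of mistake
 * the report describes.
 */
static uint64_t *
alloc_counters(size_t n)
{
	uint64_t *values;

	values = malloc(sizeof(*values) * n);   /* not sizeof(values) * n */
	if (values == NULL)
		return NULL;
	for (size_t i = 0; i < n; i++)
		values[i] = 0;
	return values;
}
```

On targets where a pointer and the pointed-to type happen to have the same size, the wrong form goes unnoticed, which is why static analysis tends to catch it first.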
Gregory Etelson [Wed, 10 May 2017 15:13:10 +0000 (17:13 +0200)]
mbuf: fix bulk allocation when debug enabled
The debug assertions when allocating a raw mbuf are not correct since
commit 8f094a9ac5d7 ("mbuf: set mbuf fields while in pool"),
which triggers a panic when using this function in debug mode.
Change the expected number of segments to 1 instead of 0, and
factorize these sanity checks.
Fixes: 8f094a9ac5d7 ("mbuf: set mbuf fields while in pool") Signed-off-by: Gregory Etelson <gregory@weka.io> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Rasesh Mody [Sun, 7 May 2017 22:53:12 +0000 (15:53 -0700)]
net/qede: fix RSS table entries for 100G adapter
With the change in base APIs, the logic for 100G handling needs to be
adjusted to pass cid values instead of queue ids. The current API
works assuming it is a queue id.
Fixes: 69d7ba88f1a1 ("net/qede/base: use L2-handles for RSS configuration") Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Yongseok Koh [Tue, 9 May 2017 20:49:31 +0000 (13:49 -0700)]
net/mlx5: change error-prone code on Tx path
In the main loop of mlx5_tx_burst(), pointers/indexes are advanced at the
beginning. Therefore, those should be rolled back if checking resource
availability fails and breaks the loop. And some of them are even
redundant.
Yongseok Koh [Tue, 9 May 2017 20:49:30 +0000 (13:49 -0700)]
net/mlx5: fix index handling for Tx ring
In case of resource deficiency on Tx, mlx5_tx_burst() breaks the loop
without rolling back consumed resources (txq->wqes[] and txq->elts[]). This
can make the application crash because unposted mbufs can be freed while
processing completions. Other Tx functions don't have this issue.
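One way to avoid having to roll anything back is to check resource availability before advancing any index. The sketch below shows that pattern only; struct ring and ring_post_burst() are hypothetical and do not reflect the mlx5 txq layout or the actual patch.

```c
#include <stdint.h>

/* Hypothetical ring state; not the mlx5 txq layout. */
struct ring {
	uint16_t elts_head;  /* next free element slot */
	uint16_t wqe_head;   /* next free work-queue entry */
	uint16_t elts_free;  /* free element slots */
	uint16_t wqe_free;   /* free work-queue entries */
};

/*
 * Post up to n packets; pkt_wqes[i] is the number of work-queue entries
 * packet i needs. Returns the number of packets actually posted.
 */
static uint16_t
ring_post_burst(struct ring *r, const uint16_t *pkt_wqes, uint16_t n)
{
	uint16_t i;

	for (i = 0; i < n; i++) {
		/* Check availability first: nothing is consumed on failure. */
		if (r->elts_free < 1 || r->wqe_free < pkt_wqes[i])
			break;
		/* Commit: advance indexes only for a packet that is posted. */
		r->elts_head++;
		r->wqe_head += pkt_wqes[i];
		r->elts_free--;
		r->wqe_free -= pkt_wqes[i];
	}
	return i;
}
```

With this ordering, breaking out of the loop leaves the indexes pointing at the last packet that was really posted.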
Wei Dai [Wed, 10 May 2017 07:00:02 +0000 (15:00 +0800)]
net/ixgbe: fix calling null function of VF
hw->mac.ops.get_media_type() of the ixgbe VF is NULL and must not
be called directly. Call ixgbe_get_media_type() instead to avoid
a crash.
Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is updated") Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Laurent Hardy <laurent.hardy@6wind.com>
Coverity reported that an argument for sizeof was used improperly.
We should allocate memory for value size that pointer points to,
instead of pointer size itself.
Coverity issue: 144522 Fixes: 79c913a42f0e ("ethdev: retrieve xstats by ID") Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Jerin Jacob [Tue, 9 May 2017 09:45:41 +0000 (15:15 +0530)]
hash: add switch fall-through comments for arm64
This fixes compiler warnings with GCC 7 for arm64 build. Fixes: da8dcc27f644 ("hash: use armv8-a CRC32 instructions") Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tiwei Bie [Sun, 7 May 2017 13:33:34 +0000 (13:33 +0000)]
eal/bsd: fix read on PCI configuration space
Some drivers (such as virtio) may need to read more than 4 bytes
of data from PCI configuration space via rte_eal_pci_read_config().
But it will return with an error on FreeBSD when the expected
data length is bigger than the size of pi.pi_data whose type is
u_int32_t. This patch removes this limitation.
Fixes: 632b2d1deeed ("eal: provide functions to access PCI config") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
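A read longer than 4 bytes can be served by splitting it into several accesses that each fit the 32-bit limit. The following is a sketch of that idea under stated assumptions: cfg_read_fn, cfg_read_chunked() and fake_cfg_read() are hypothetical names, and the in-memory reader stands in for the real per-access primitive.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical primitive standing in for one configuration-space access
 * that can return at most 4 bytes (width is 1, 2 or 4).
 */
typedef int (*cfg_read_fn)(void *ctx, uint32_t offset, int width,
			   uint8_t *dst);

/* Read len bytes starting at offset by issuing several small accesses. */
static int
cfg_read_chunked(cfg_read_fn rd, void *ctx, void *buf, size_t len,
		 uint32_t offset)
{
	uint8_t *out = buf;

	while (len > 0) {
		/* Largest width that fits in the remaining length. */
		int width = len >= 4 ? 4 : (len >= 2 ? 2 : 1);

		if (rd(ctx, offset, width, out) < 0)
			return -1;
		out += width;
		offset += (uint32_t)width;
		len -= (size_t)width;
	}
	return 0;
}

/* In-memory stand-in for the device: ctx points to a 256-byte array. */
static int
fake_cfg_read(void *ctx, uint32_t offset, int width, uint8_t *dst)
{
	if (width != 1 && width != 2 && width != 4)
		return -1;
	if (offset + (uint32_t)width > 256)
		return -1;
	memcpy(dst, (const uint8_t *)ctx + offset, (size_t)width);
	return 0;
}
```

A 7-byte read, for example, becomes one 4-byte, one 2-byte and one 1-byte access.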
Jianfeng Tan [Fri, 5 May 2017 16:10:13 +0000 (16:10 +0000)]
xen: fix physical address availability in dom0
When physical NICs are bound to igb_uio/uio-pci-generic, they cannot
be used by a DPDK app in Xen dom0.
This is because (1) a restriction that physical addresses must be
available was added by commit cdc242f260e7 ("eal/linux: support running
as unprivileged user"), and (2) the previous implementation of the test
that checks whether physical addresses are available (using a variable
on the stack) only works in a non-Xen environment. In fact, for Xen
dom0, the physical addresses are always available if the memory is
initialized successfully.
To fix it, we add a precheck to bypass the physical address availability
test.
Fixes: cdc242f260e7 ("eal/linux: support running as unprivileged user") Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Alejandro Lucero [Wed, 10 May 2017 08:54:23 +0000 (09:54 +0100)]
vfio: fix index for tracking devices in a group
Previous fix for properly handling devices from the same VFIO group
introduced another bug where the file descriptor of a kernel vfio
group is used as the index for tracking number of devices of a vfio
group struct handled by dpdk vfio code. Instead of the file
descriptor itself, the vfio group object that file descriptor is
registered with has to be used.
This patch introduces specific functions for incrementing or
decrementing the device counter for a specific vfio group using the
vfio file descriptor as a parameter. Note the code is not optimized
as the vfio group is found sequentially going through the vfio group
array but this should not be a problem as this is not related to
packet handling at all.
Fixes: a9c349e3a100 ("vfio: fix device unplug when several devices per group") Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com> Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
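The lookup described above can be sketched as follows. The table, struct group and the helper names are hypothetical and are not the DPDK vfio structures; the sketch only shows why the file descriptor is a search key rather than an array index.

```c
#include <stddef.h>

#define MAX_GROUPS 4   /* hypothetical table size */

/* Hypothetical per-group record, not the DPDK structure. */
struct group {
	int fd;        /* group file descriptor, -1 if the slot is unused */
	int devices;   /* number of devices tracked in this group */
};

/* Find the record registered with fd by walking the table. */
static struct group *
group_find(struct group *tbl, int fd)
{
	for (size_t i = 0; i < MAX_GROUPS; i++)
		if (tbl[i].fd == fd)
			return &tbl[i];
	return NULL;
}

/* Increment the device counter of the group that owns fd. */
static int
group_inc_devices(struct group *tbl, int fd)
{
	struct group *g = group_find(tbl, fd);

	if (g == NULL)
		return -1;
	return ++g->devices;
}

/* Decrement the device counter of the group that owns fd. */
static int
group_dec_devices(struct group *tbl, int fd)
{
	struct group *g = group_find(tbl, fd);

	if (g == NULL || g->devices == 0)
		return -1;
	return --g->devices;
}
```

A descriptor value such as 57 is far larger than the table, so using it directly as an index would read outside the array; the sequential search costs little because it runs only on device setup and teardown.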
Yulong Pei [Wed, 3 May 2017 10:29:47 +0000 (18:29 +0800)]
app/testpmd: fix NUMA structures initialization
Previously numa_support was 0 by default and --numa had to be added to
the testpmd command line to enable NUMA, so port_numa and ring_numa were
initialized in launch_args_parse(). Now that testpmd sets
numa_support = 1 by default, port_numa and ring_numa also need to be
initialized by default; otherwise port->socket_id will be probed to a
wrong value.
Fixes: 999b2ee0fe45 ("app/testpmd: enable NUMA support by default") Signed-off-by: Yulong Pei <yulong.pei@intel.com> Acked-by: Jingjing Wu <jingjing.wu@intel.com>
MAC addresses are implicitly handled in network order since they are
actually byte strings, however this is not properly enforced with MAC masks
provided as prefix lengths, which end up inverted on little endian
systems.
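Building the mask byte by byte keeps it in network order on any host. The sketch below assumes a hypothetical helper mac_mask_from_prefix(); it is not the code from the patch.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/*
 * Build a MAC mask from a prefix length (0..48) directly as a byte string,
 * most significant byte first, so the result is the same on any host.
 */
static int
mac_mask_from_prefix(unsigned int prefix, uint8_t mask[MAC_LEN])
{
	unsigned int i;

	if (prefix > MAC_LEN * 8)
		return -1;
	memset(mask, 0, MAC_LEN);
	/* Whole bytes covered by the prefix. */
	for (i = 0; i < prefix / 8; i++)
		mask[i] = 0xff;
	/* Remaining high-order bits of the next byte, if any. */
	if (prefix % 8 != 0)
		mask[i] = (uint8_t)(0xff << (8 - prefix % 8));
	return 0;
}
```

A prefix of 20, for instance, yields ff:ff:f0:00:00:00 regardless of endianness, whereas shifting a 64-bit integer and copying its bytes would reverse the order on little endian systems.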
Add a comment documenting explicitly that we are falling through the case
statements to the next one.
Fixes: f9072f8b90bb ("ixgbe: migrate flow director filtering to new API") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
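For reference, an intentional fall-through marked with a comment looks like the sketch below. sum_tail() is a made-up function used only to show the comment placement.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the last 1 to 3 bytes of a buffer; larger cases reuse smaller ones. */
static uint32_t
sum_tail(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;

	switch (len) {
	case 3:
		sum += p[2];
		/* fall through */
	case 2:
		sum += p[1];
		/* fall through */
	case 1:
		sum += p[0];
		break;
	default:
		break;
	}
	return sum;
}
```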
There are two new warnings in GCC 7 that cause problems in the DPDK
compile.
1. GCC now warns if you have a switch fall-through without a suitable
comment indicating that it was intentional. The compiler supports a number
of levels of warning which are triggered depending on the type of message
used, with level 3 being the default. To accept a wider range of possible
fall-through messages, we adjust this down to level 2.
2. GCC also warns about an snprintf where there may be truncation and the
return value is not checked. Given that we often use snprintf in DPDK in
place of strncpy, and in many cases where truncation is not a problem, we
can just disable this particular warning.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
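The patch disables the truncation warning rather than changing callers. Where truncation does matter, the return value of snprintf can be checked, as in this sketch; copy_name() is a hypothetical helper, not a DPDK function.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy src into dst; return 0 on success, -1 if the text was truncated. */
static int
copy_name(char *dst, size_t size, const char *src)
{
	/* snprintf returns the length the full string would have had. */
	int ret = snprintf(dst, size, "%s", src);

	if (ret < 0 || (size_t)ret >= size)
		return -1;
	return 0;
}
```

Even when truncation is reported, dst is still NUL-terminated as long as size is non-zero, which is why snprintf is often preferred over strncpy.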
Yongseok Koh [Mon, 1 May 2017 21:05:42 +0000 (14:05 -0700)]
net/mlx5: fix crash on deleting flow drop queue
If mlx5_dev_start() fails, it tries to rollback data structures related to
rte_flow including drop queue. The destruction code doesn't assume the
structures are created but priv_flow_delete_drop_queue() never does a
sanity check. This can cause a crash.
Fixes: 028761059aeb ("net/mlx5: use an RSS drop queue") Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
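A destructor that tolerates a structure that was never created can be sketched as below. struct priv, struct drop_queue and delete_drop_queue() are hypothetical stand-ins, not the mlx5 definitions.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures. */
struct drop_queue {
	int *wq;
};

struct priv {
	struct drop_queue *drop;   /* NULL until the drop queue is created */
};

/* Destroy the drop queue; returns 1 if something was freed, 0 otherwise. */
static int
delete_drop_queue(struct priv *p)
{
	if (p->drop == NULL)
		return 0;   /* never created: nothing to destroy */
	free(p->drop->wq);
	free(p->drop);
	p->drop = NULL;     /* makes a second call harmless */
	return 1;
}
```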
Shahaf Shuler [Wed, 3 May 2017 06:55:35 +0000 (09:55 +0300)]
net/mlx5: fix Tx max inline with TSO
When TSO is enabled, Verbs layer aggregates the TSO
inline size with the txq inline size for the Tx creation,
while the PMD takes the maximum among them.
Fix it by adjusting the max inline parameter before
passing it to Verbs.
There is a case where software sets the MTU (i.e. jumbo frame) of an
ixgbe device while it is stopped. Before the fix, scattered_rx
is cleared during device stop, so setting a jumbo frame MTU
after device stop always fails because scattered_rx is 0.
Signed-off-by: Jia Yu <jyu@vmware.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wei Dai [Thu, 4 May 2017 09:54:40 +0000 (17:54 +0800)]
net/ixgbe: fix VF Rx mode for allmulticast disabled
Some customers find that the 82599 NIC DPDK VF PMD can't receive any
broadcast packets when it is bound to igb_uio and a DPDK application
like testpmd is run for the first time. But when the application
is quit and run again, the DPDK VF PMD can receive broadcast
packets again. The associated PF is run by the kernel driver while
the VF is driven by the DPDK PMD.
Fixes: 260e2e22e26f ("net/ixgbe/base: move multicast mode update") Fixes: 72dec9e37a84 ("ixgbe: support multicast promiscuous mode on VF") Cc: stable@dpdk.org Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Tue, 2 May 2017 08:34:59 +0000 (16:34 +0800)]
net/ixgbe: fix default MAC setting
Pool 0 is not PF, it's VF 0. So the MAC is set for VF 0
but not PF.
The code introduced a weird issue. In the scenario PF + VF,
when only starting PF, the default PF MAC address is working.
But after starting a VF, the default PF MAC address becomes
the VF's address.
Use the pool which is not occupied by VFs for PF to fix it.
Fixes: 8164fe82846b ("ixgbe: add default mac address modifier") Cc: stable@dpdk.org Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wei Dai [Fri, 5 May 2017 00:40:00 +0000 (08:40 +0800)]
ethdev: fix adding invalid MAC address
Some customers find that adding a MAC address to a VF can sometimes
fail, yet the address is still stored in dev->data->mac_addrs[]. This
can lead to errors in code that assumes a non-zero entry in
dev->data->mac_addrs[] is valid.
The following acknowledgements are from specific NIC PMD
maintainers for the parts they manage.
This patch changes the ethdev internal API, so it should not be
backported to a stable/LTS release so far.
Fixes: af75078fece3 ("first public release") Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
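The general pattern is to record an address only after the driver reports success. The sketch below illustrates that with hypothetical types (struct dev, mac_add_fn) and two stand-in driver hooks; it is not the ethdev code.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6
#define MAX_MACS 4

/* Hypothetical driver hook: returns 0 on success, negative on failure. */
typedef int (*mac_add_fn)(const uint8_t addr[MAC_LEN]);

struct dev {
	uint8_t macs[MAX_MACS][MAC_LEN]; /* all-zero entry means unused */
	mac_add_fn mac_add;
};

/* Add addr at index; the table is updated only if the driver accepts it. */
static int
dev_mac_add(struct dev *d, unsigned int index, const uint8_t addr[MAC_LEN])
{
	int ret;

	if (index >= MAX_MACS)
		return -1;
	ret = d->mac_add(addr);
	if (ret == 0)
		memcpy(d->macs[index], addr, MAC_LEN);
	return ret;
}

/* Stand-in hooks used to exercise both outcomes. */
static int hook_ok(const uint8_t addr[MAC_LEN]) { (void)addr; return 0; }
static int hook_fail(const uint8_t addr[MAC_LEN]) { (void)addr; return -1; }
```

This requires the driver hook to return a status, which is the kind of internal API change the message refers to.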
Jerin Jacob [Tue, 2 May 2017 05:19:51 +0000 (10:49 +0530)]
eal: optimize TSC routines when HPET is disabled
Since DPDK has only two timer sources, avoid the
&eal_timer_source memory read and the switch case
statement that follows it when
RTE_LIBEAL_USE_HPET is not defined.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
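The shape of such an optimization can be sketched with a compile-time switch. USE_HPET, read_tsc(), read_hpet() and get_timer_cycles() are hypothetical stand-ins for RTE_LIBEAL_USE_HPET and the real clock routines.

```c
#include <stdint.h>

/* Hypothetical counter standing in for the real TSC read. */
static uint64_t read_tsc(void) { return 1000; }

#ifdef USE_HPET   /* hypothetical stand-in for RTE_LIBEAL_USE_HPET */
enum timer_source { TIMER_TSC = 0, TIMER_HPET };
static enum timer_source timer_source = TIMER_TSC;
static uint64_t read_hpet(void) { return 2000; }
#endif

static inline uint64_t
get_timer_cycles(void)
{
#ifdef USE_HPET
	/* Two sources built in: pick one at run time. */
	switch (timer_source) {
	case TIMER_HPET:
		return read_hpet();
	case TIMER_TSC:
	default:
		return read_tsc();
	}
#else
	/* Only the TSC is built in: no variable read, no switch. */
	return read_tsc();
#endif
}
```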
Wei Dai [Mon, 13 Mar 2017 08:59:27 +0000 (16:59 +0800)]
config: make backtrace optional
When building DPDK with musl, backtrace needs to be disabled
to remove some references to execinfo.h, which is
not supported by musl now.
This also applies to other libc implementations which
don't support backtrace() and backtrace_symbols().
musl is an implementation of the userspace portion
of the standard library functionality described in
the ISO C and POSIX standards, plus common extensions.
More details about musl are available at http://www.musl-libc.org .
Signed-off-by: Wei Dai <wei.dai@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Wei Dai [Mon, 13 Mar 2017 08:59:26 +0000 (16:59 +0800)]
examples/performance-thread: remove useless include
No function refers to any part of execinfo.h, so remove the
include.
This file also does not exist in musl, so it must be removed to support musl.
Thomas Monjalon [Sun, 30 Apr 2017 14:11:40 +0000 (16:11 +0200)]
usertools: fix CPU layout with python 3
These differences in Python 3 were causing errors:
- xrange is replaced by range
- dict values are a view (instead of list)
- has_key is removed
Fixes: deb87e6777c0 ("usertools: use sysfs for CPU layout") Fixes: 63985c5f104e ("usertools: fix CPU layout for more than 2 threads") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
The existing code used to search for module files via modinfo has
several corner cases which can result in it failing where it should be
successful.
The call to lower() would cause results returned by 'modinfo' to be
forced to lowercase, results which were subsequently passed to
exists() which is case sensitive. This was most likely done to capture
all variants of failure strings modinfo might return
(i.e. ERROR/Error/error/...) without thought for the negative effect on the
later call to exists(). For many this is a nonissue but if the module
path included non-lowercase alpha characters, something which is
easily possible with a non-lowercase kernel-extraversion string, this
would cause an issue.
We could move the call to lower() to the check for "error" but this
still leaves possible corner cases, for modules or module paths with
'error' in them.
Instead we will prevent modinfo's stderr from being used as a "good
value" for path, meaning we either get a valid path from modinfo, or
nothing at all. This removes all corner cases.
Ultimately these preliminary checks are unnecessary as exists() will
only return True if it is passed a valid path, passing it modinfo's
stderr would fail. In keeping with the original code, however, we do
some preliminary checks, but we are now free of corner cases.
Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Kuba Kozak [Thu, 27 Apr 2017 14:42:40 +0000 (16:42 +0200)]
net/ixgbe: support xstats by ID
To support retrieving only the specific statistics requested by the
application, two new functions are added:
ixgbe_dev_xstats_get_by_ids(), which retrieves the
values of the statistics specified by the ids array,
and ixgbe_dev_xstats_get_names_by_ids(), which retrieves the
names of the statistics specified by the ids array.
Kuba Kozak [Thu, 27 Apr 2017 14:42:39 +0000 (16:42 +0200)]
net/igb: support xstats by ID
To support retrieving only the specific statistics requested by the
application, two new functions are added:
eth_igb_xstats_get_by_id(), which retrieves the
values of the statistics specified by the ids array,
and eth_igb_xstats_get_names_by_id(), which retrieves the
names of the statistics specified by the ids array.
Kuba Kozak [Thu, 27 Apr 2017 14:42:38 +0000 (16:42 +0200)]
app/proc-info: support xstats by ID and by name
There are new arguments --xstats-ids and --xstats-name
in proc_info command line to retrieve statistics given by ids
and by name.
E.g. --xstats-ids="1,3,5,7,8"
E.g. --xstats-name rx_errors
Kuba Kozak [Thu, 27 Apr 2017 14:42:36 +0000 (16:42 +0200)]
ethdev: retrieve xstats by ID
Extended xstats API in ethdev library to allow grouping of stats
logically so they can be retrieved per logical grouping managed
by the application.
Added new functions rte_eth_xstats_get_names_by_id and
rte_eth_xstats_get_by_id, which take additional arguments (compared
to rte_eth_xstats_get_names and rte_eth_xstats_get): an array of ids
and an array of values.
doc: add description for modified xstats API
Documentation change for new extended statistics API functions.
The old API only allows retrieval of *all* of the NIC statistics
at once. Given this requires a MMIO read PCI transaction per statistic
it is an inefficient way of retrieving just a few key statistics.
Often a monitoring agent only has an interest in a few key statistics,
and the old API forces wasting CPU time and PCIe bandwidth in retrieving
*all* statistics; even those that the application didn't explicitly
show an interest in.
The new, more flexible API allows retrieval of statistics per ID.
If a PMD wishes, it can be implemented to read just the required
NIC registers. As a result, the monitoring application no longer wastes
PCIe bandwidth and CPU time.
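The benefit of per-ID retrieval can be sketched as follows. stat_regs[] and stats_get_by_id() are hypothetical and do not reproduce the rte_eth_xstats_get_by_id() signature; the array stands in for NIC registers.

```c
#include <stddef.h>
#include <stdint.h>

#define NB_STATS 4

/* Hypothetical device counters; a real driver would read NIC registers. */
static const uint64_t stat_regs[NB_STATS] = { 10, 20, 30, 40 };

/*
 * Fill values[i] with the counter identified by ids[i].
 * Returns the number of values written, or -1 on an unknown ID.
 */
static int
stats_get_by_id(const uint64_t *ids, uint64_t *values, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (ids[i] >= NB_STATS)
			return -1;
		values[i] = stat_regs[ids[i]];  /* one read per requested ID */
	}
	return (int)n;
}
```

A monitoring agent interested in two counters then triggers two reads instead of one per statistic the device exposes.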
Kuba Kozak [Thu, 27 Apr 2017 14:42:35 +0000 (16:42 +0200)]
ethdev: revert xstats by ID
Revert patches to provide clear view for
upcoming changes. Reverted patches are listed below:
commit ea85e7d711b6 ("ethdev: retrieve xstats by ID")
commit a954495245c4 ("ethdev: get xstats ID by name")
commit 1223608adb9b ("app/proc-info: support xstats by ID")
commit 25e38f09af9c ("net/e1000: support xstats by ID")
commit 923419333f5a ("net/ixgbe: support xstats by ID")
Stop port before enabling QinQ.
Add commands to set inner and outer TPIDs and start port.
Remove TPIDs from flow validate and flow create commands.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>
Rx queues configured with more than 1023 descriptors cause readv() calls to
fail due to more iovec entries than permitted by the kernel. As a result,
no packets can be received.
Quietly limit internal Rx queue size to the maximum number of iovec entries
to fix this issue.
Fixes: 0781f5762cfe ("net/tap: support segmented mbufs") Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
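Quietly limiting the queue size amounts to a clamp. In this sketch RX_MAX_IOVECS is a hypothetical constant taken from the 1023 figure above, and clamp_rx_desc() is not a function from the tap PMD.

```c
#include <stdint.h>

/* Hypothetical limit, taken from the 1023-descriptor figure above. */
#define RX_MAX_IOVECS 1023

/* Return the internal queue size to use for a requested descriptor count. */
static uint16_t
clamp_rx_desc(uint16_t requested)
{
	return requested > RX_MAX_IOVECS ? RX_MAX_IOVECS : requested;
}
```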
Jerin Jacob [Mon, 1 May 2017 18:41:55 +0000 (00:11 +0530)]
net/thunderx: fix deadlock in Rx path
RBDR buffers are refilled when SW consumes the buffers from CQ.
This creates a deadlock when the CQ buffers are exhausted due to lack
of RBDR buffers. The fix is to refill the RBDR when rx_free_thresh
is met, irrespective of the number of CQ buffers consumed.
Fixes: e2d7fc9f0a24 ("net/thunderx: add single and multi-segment Rx") Cc: stable@dpdk.org Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Markos Chandras [Thu, 16 Feb 2017 16:17:31 +0000 (16:17 +0000)]
examples/ethtool: fix link with ixgbe shared lib
When RTE_DEVEL_BUILD is unset, -rpath is unset.
So the ethtool app cannot link with ixgbe shared library
which is required by ethtool lib:
warning: librte_pmd_ixgbe.so.1, needed by
examples/ethtool/lib/x86_64-native-linuxapp-gcc/lib/librte_ethtool.so,
not found (try using -rpath or -rpath-link)
It is fixed by adding the library in the application link.
The library link is also improved to specify that this explicit link
to ixgbe is needed only in the shared lib mode.
Fixes: 077d223e25c3 ("examples/ethtool: use ixgbe public function") Signed-off-by: Markos Chandras <mchandras@suse.de> Acked-by: Remy Horton <remy.horton@intel.com> Acked-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Pablo de Lara [Wed, 26 Apr 2017 11:29:52 +0000 (12:29 +0100)]
examples/l3fwd-power: fix Rx descriptor size
L3fwd power app monitors the RX queues to see if the polling frequency
should be adjusted (the busier the queue, the higher the frequency).
The app uses several thresholds in the ring to determine the frequency,
96 being the highest one, at which the frequency should be highest.
The problem is that the difference between this value and the ring size
is not big enough (128 - 96 = 32 descriptors), which means that
if the descriptors are not replenished quickly enough, the queue might
not be busy, but the app would think that it is, because the 96th
descriptor is set.
Therefore, by increasing this gap (increasing the RX ring size),
we make sure that this false measurement will not happen.
Fixes: b451aa39db31 ("examples/l3fwd-power: use DD bit rather than RX queue count") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Tested-by: Lei Yao <lei.a.yao@intel.com>
examples/l2fwd-keepalive: clean up shared mem on exit
This patch adds the unlinking/unmapping of shared host memory
on termination of l2fwd-keepalive. Previously it was only
cleaned on re-running of the example application.
The l2fwd-keepalive example has infinite processing loops and as a
result the only way to exit it is via SIGINT/SIGTERM (e.g. Control-C).
The resulting shutdown is unclean, which is fixed by adding a signal
handler that causes the processing loops to break.
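A signal handler that makes a processing loop exit can be sketched as below. The names force_quit, handle_signal(), install_handlers() and main_loop() are illustrative assumptions, and the iteration bound exists only so the sketch terminates on its own.

```c
#include <signal.h>

/* Set by the handler, polled by the processing loops. */
static volatile sig_atomic_t force_quit;

static void
handle_signal(int signum)
{
	if (signum == SIGINT || signum == SIGTERM)
		force_quit = 1;
}

/* Install the handler for both signals; returns 0 on success. */
static int
install_handlers(void)
{
	if (signal(SIGINT, handle_signal) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, handle_signal) == SIG_ERR)
		return -1;
	return 0;
}

/* Processing loop: runs until a signal asks it to stop. */
static unsigned long
main_loop(unsigned long max_iterations)
{
	unsigned long n = 0;

	while (!force_quit && n < max_iterations)
		n++;   /* packet processing would go here */
	return n;
}
```

Once the loops return, the application can unlink and unmap its shared memory before exiting, instead of leaving it for the next run to clean up.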
Bruce Richardson [Fri, 28 Apr 2017 10:18:15 +0000 (11:18 +0100)]
examples/performance-thread: fix build on FreeBSD 10.0
While later releases in the FreeBSD 10 series have a CPU_COUNT macro
defined, FreeBSD 10.0 and 10.1 do not have this macro. Therefore we provide
a basic fallback implementation of the macro for platforms where it is not
defined.
Fixes: 433ba6228f9a ("examples/performance-thread: add pthread_shim app") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>