Zhirun Yan [Thu, 20 Dec 2018 11:01:29 +0000 (11:01 +0000)]
net/i40e: support PF respond VF request more queues
This patch responds to the VIRTCHNL_OP_REQUEST_QUEUES message from a VF
and allocates more queues for the requesting VF. If successful, the PF
notifies the VF to reset. If unsuccessful, the PF sends a message to
inform the VF.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
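The request handling described above can be modelled roughly as follows. This is a minimal sketch, not the i40e code: struct vf_model, pf_handle_queue_request(), the per-VF limit and the value written to reply are all hypothetical names and assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the PF's view of one VF; not the i40e structures. */
struct vf_model {
	uint16_t cur_queues;   /* queues the VF currently owns */
	uint16_t pend_queues;  /* count to apply on the next VF reset */
	bool reset_requested;  /* PF has notified the VF to reset */
};

/*
 * Handle a request for req queues. max_per_vf and free_queues are
 * assumed inputs describing what the PF is able to hand out.
 * Returns 0 and schedules a VF reset on success. Returns -1 on
 * failure and stores in *reply the count the PF could offer
 * (the content of the failure message is an assumption).
 */
static int
pf_handle_queue_request(struct vf_model *vf, uint16_t req,
			uint16_t max_per_vf, uint16_t free_queues,
			uint16_t *reply)
{
	uint32_t avail = (uint32_t)vf->cur_queues + free_queues;
	uint16_t limit = avail < max_per_vf ? (uint16_t)avail : max_per_vf;

	if (req == 0 || req > limit) {
		*reply = limit;        /* unsuccessful: only inform the VF */
		return -1;
	}

	vf->pend_queues = req;         /* applied when the VF resets */
	vf->reset_requested = true;    /* successful: notify VF to reset */
	*reply = req;
	return 0;
}
```

In this model the new queue count only takes effect after the reset, which matches the notify-then-reset flow in the commit message.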
Zhirun Yan [Thu, 20 Dec 2018 11:01:28 +0000 (11:01 +0000)]
net/i40e: support VF request more queues
Before this patch, the VF gets a default number of queues from the PF.
This patch enables the VF to request a different number. When the VF
configures more queues, it sends VIRTCHNL_OP_REQUEST_QUEUES to the PF to
request them. On success, the PF resets the VF.
Users can run "port stop all", "port config port_id rxq/txq queue_num"
and "port start all" to reconfigure the queue number.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Rahul Lakkireddy [Wed, 19 Dec 2018 16:28:26 +0000 (21:58 +0530)]
net/cxgbe: fix other misc build issues for Windows
Fix the following build errors reported by the Intel C++ compiler in the
Windows build.
C:\> t4_hw.c(5105): warning #147: declaration is incompatible with
"int t4_bar2_sge_qregs(struct adapter *, unsigned int, unsigned int,
u64={uint64_t={unsigned __int64}} *, unsigned int *)"
(declared at line 524 of "..\..\..\..\drivers\net\cxgbe\base\common.h")
int t4_bar2_sge_qregs(struct adapter *adapter, unsigned int qid,
^
C:\> sge.c(400): error : expression must be a pointer to a complete
object type
(uint16_t)(RTE_PTR_ALIGN((char *)mbuf->buf_addr +
^
Build Environment:
1. Target OS: Microsoft Windows Server 2016
2. Compiler: Intel C++ Compiler from Intel Parallel Studio XE 2019 [1]
3. Development Tools:
3.1 Microsoft Visual Studio 2017 Professional
3.2 Windows Software Development Kit (SDK) v10.0.17763
3.3 Windows Driver Kit (WDK) v10.0.17763
Rahul Lakkireddy [Wed, 19 Dec 2018 16:28:23 +0000 (21:58 +0530)]
net/cxgbe: use relative paths for includes
The Intel C++ compiler is not able to locate the header files without
relative paths in the Windows build. Following errors are seen for these
header files.
Fix by explicitly stating the header file location using relative paths.
Also, stop automatically including header files on Linux, to keep the
build consistent across both OSes.
Build Environment:
1. Target OS: Microsoft Windows Server 2016
2. Compiler: Intel C++ Compiler from Intel Parallel Studio XE 2019 [1]
3. Development Tools:
3.1 Microsoft Visual Studio 2017 Professional
3.2 Windows Software Development Kit (SDK) v10.0.17763
3.3 Windows Driver Kit (WDK) v10.0.17763
Rahul Lakkireddy [Fri, 14 Dec 2018 19:01:53 +0000 (00:31 +0530)]
net/cxgbe: fix overlapping regions in TID table
The filter TID table should be located after the active TID table
memory, not at the beginning of the TID table memory. This fixes memory
corruption due to overlapping regions.
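A minimal sketch of the layout fix, assuming a single allocation in which the active TID region comes first. The structure and function names are hypothetical and much simpler than the cxgbe TID table.

```c
#include <stdlib.h>

/* Hypothetical, simplified table; not the cxgbe structure. */
struct tid_table_sketch {
	void **tid_tab;        /* start of the single allocation */
	void **atid_tab;       /* active TID entries */
	void **ftid_tab;       /* filter TID entries */
	unsigned int natids;
	unsigned int nftids;
};

/* Allocates one block and carves non-overlapping regions out of it. */
static int
tid_table_init(struct tid_table_sketch *t, unsigned int natids,
	       unsigned int nftids)
{
	t->tid_tab = calloc((size_t)natids + nftids, sizeof(*t->tid_tab));
	if (t->tid_tab == NULL)
		return -1;

	t->natids = natids;
	t->nftids = nftids;
	t->atid_tab = t->tid_tab;
	/*
	 * An overlapping layout would be: t->ftid_tab = t->tid_tab;
	 * Filter entries would then overwrite active entries.
	 */
	t->ftid_tab = t->atid_tab + natids;
	return 0;
}
```

The only point of the sketch is the last assignment: the filter region starts where the active region ends.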
Michal Krawczyk [Fri, 14 Dec 2018 13:18:46 +0000 (14:18 +0100)]
net/ena: update version to 2.0.0
ENAv2 introduces many new features, mainly LLQ (Low Latency Queue),
which allows the device to process packets faster and, as a result,
noticeably lowers latency.
The second major feature is configurable depth of the HW queues: Rx
and Tx can be reconfigured independently, and the maximum depth of an
Rx queue is 8k.
The release also includes many bug fixes and minor new features, like
improved statistics counters and extended statistics.
Rafal Kozik [Fri, 14 Dec 2018 13:18:44 +0000 (14:18 +0100)]
net/ena: update completion queue after cleanup
After Rx or Tx cleanup update completion queue head by calling
ena_com_update_dev_comp_head().
Fixes: 1daff5260ff8 ("net/ena: use unmasked head and tail") Cc: stable@dpdk.org Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
Rafal Kozik [Mon, 17 Dec 2018 11:06:18 +0000 (12:06 +0100)]
net/ena: fix cleanup for out of order packets
When a wrong req_id is detected, some previous mbufs may have been used
for receiving different segments of received packets. In such cases the
chained mbufs would be returned to the pool twice.
To prevent this, the chained mbuf is now freed right after the error is
detected.
To simplify cleanup, pointers taken from the Rx ring are set to NULL.
Because the queues are not used after ena_rx_queue_release_bufs and
ena_tx_queue_release_bufs, updating the next_to_clean pointer is not
necessary.
Fixes: c2034976673d ("net/ena: add Rx out of order completion") Cc: stable@dpdk.org Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
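The idea of setting ring pointers to NULL so that a buffer cannot be returned twice can be sketched as below. This is an illustration only: release_bufs() is a hypothetical helper, and malloc()/free() stand in for the mbuf pool.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Frees every buffer still referenced by the ring and clears the slot,
 * so a later pass over the same ring cannot free it again.
 * Returns the number of buffers released by this call.
 */
static unsigned int
release_bufs(void **slots, size_t n)
{
	unsigned int freed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (slots[i] == NULL)
			continue;      /* already taken or already freed */
		free(slots[i]);
		slots[i] = NULL;       /* makes a second release a no-op */
		freed++;
	}
	return freed;
}
```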
Rafal Kozik [Fri, 14 Dec 2018 13:18:39 +0000 (14:18 +0100)]
net/ena: fix invalid reference to variable in union
Use empty_rx_reqs instead of empty_tx_reqs.
Because those two variables are members of the same union, this does not
cause any failure, but it should be changed for consistency.
Fixes: c2034976673d ("net/ena: add Rx out of order completion") Cc: stable@dpdk.org Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
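Why the wrong member name was harmless can be shown with a small sketch. Only the two member names come from the commit message; the surrounding structure and the element type are assumptions.

```c
#include <stdint.h>

/* Hypothetical ring; only the union mirrors the commit description. */
struct ring_sketch {
	union {
		uint16_t *empty_tx_reqs;
		uint16_t *empty_rx_reqs;
	};
};

/* Rx code should use the Rx name, even though both share storage. */
static uint16_t *
rx_free_req_ids(struct ring_sketch *r)
{
	return r->empty_rx_reqs;
}
```

Both members have the same type and occupy the same storage, so a value stored through empty_tx_reqs is read back unchanged through empty_rx_reqs.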
Rafal Kozik [Fri, 14 Dec 2018 13:18:38 +0000 (14:18 +0100)]
net/ena: add supported RSS offloads types
The PMD was not reporting its RSS offload values although it supported
RSS. To allow applications to probe the PMD for RSS support, the
missing information was added.
Rafal Kozik [Fri, 14 Dec 2018 13:18:36 +0000 (14:18 +0100)]
net/ena: do not reconfigure queues on reset
The reset function should return the port to its initial state, in which
no Tx and Rx queues are set up. The application should then reconfigure
the queues.
According to the DPDK documentation, rte_eth_dev_reset() itself is a
generic function which only does some hardware reset operations by
calling dev_uninit() and dev_init().
ena_com_dev_reset, which performs the NIC register reset, should be
called during stop.
Rafal Kozik [Fri, 14 Dec 2018 13:18:33 +0000 (14:18 +0100)]
net/ena: increase maximum Rx ring size
Some ENA devices support 8k Rx rings. The maximum supported size is
received upon device initialization.
As the ENA_DEFAULT_RING_SIZE_RX macro is the upper limit, it needs to be
adjusted.
Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
Michal Krawczyk [Fri, 14 Dec 2018 13:18:32 +0000 (14:18 +0100)]
net/ena: support LLQv2
LLQ (Low Latency Queue) is a feature that allows pushing the header
directly to the device through PCI before DMA is even triggered.
It reduces latency, because the device can start preparing the packet
before the payload is sent through DMA.
Rafal Kozik [Fri, 14 Dec 2018 13:18:31 +0000 (14:18 +0100)]
net/ena: skip packet with wrong request id
When an invalid req_id is received, the reset should be handled by the
application, as it indicates an invalid ring state, so further Rx
makes no sense.
Fixes: c2034976673d ("net/ena: add Rx out of order completion") Cc: stable@dpdk.org Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
Rafal Kozik [Fri, 14 Dec 2018 13:18:30 +0000 (14:18 +0100)]
net/ena: add HW queues depth setup
The device now allows the driver to reconfigure Tx and Rx queue depth
independently. Moreover, the maximum size for Tx and Rx can be different.
Those maximum values are received from the device.
After reset, the previous ring configuration is restored.
If the number of descriptors is set to RTE_ETH_DEV_FALLBACK_RX_RINGSIZE
or RTE_ETH_DEV_FALLBACK_TX_RINGSIZE, the maximum value is restored.
Remove the checks that the provided number is not too big, as this is
done in generic functions (rte_eth_rx_queue_setup and
rte_eth_tx_queue_setup).
The maximum number of segments is now set for Rx packets and provided to
ena_com_rx_pkt() for validation.
Unused definitions were removed.
Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
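The ring size selection described above can be sketched as follows. The macros are stand-ins for RTE_ETH_DEV_FALLBACK_RX_RINGSIZE and RTE_ETH_DEV_FALLBACK_TX_RINGSIZE with assumed values, and pick_ring_size() is a hypothetical helper, not a function of the ENA PMD.

```c
#include <stdint.h>

/* Stand-ins for the DPDK fallback ring sizes; values are assumed here. */
#define SKETCH_FALLBACK_RX_RINGSIZE 512
#define SKETCH_FALLBACK_TX_RINGSIZE 512

/*
 * Chooses the ring size for one queue. When the caller passes the
 * fallback value, the device maximum is used; otherwise the requested
 * size is kept. Range checks are left to the generic ethdev layer.
 */
static uint16_t
pick_ring_size(uint16_t nb_desc, uint16_t fallback, uint16_t dev_max)
{
	if (nb_desc == fallback)
		return dev_max;
	return nb_desc;
}
```

Rx and Tx call the same helper with their own fallback value and their own device maximum, which is what makes the two depths independent.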
Rafal Kozik [Fri, 14 Dec 2018 13:18:29 +0000 (14:18 +0100)]
net/ena: add reset reason in Rx error
Whenever the driver receives too many descriptors from the device,
it should trigger a device reset with the reset reason set to
ENA_REGS_RESET_TOO_MANY_RX_DESCS.
Fixes: 241da076b1f7 ("net/ena: adjust error checking and cleaning") Cc: stable@dpdk.org Signed-off-by: Rafal Kozik <rk@semihalf.com> Acked-by: Michal Krawczyk <mk@semihalf.com>
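A rough model of this check is shown below. The enum, the adapter structure and check_rx_descs() are hypothetical; only the idea of recording a "too many Rx descriptors" reason before triggering the reset comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reset reasons; the real one is ENA_REGS_RESET_TOO_MANY_RX_DESCS. */
enum reset_reason_sketch {
	SKETCH_RESET_NORMAL = 0,
	SKETCH_RESET_TOO_MANY_RX_DESCS
};

/* Hypothetical adapter state. */
struct adapter_sketch {
	enum reset_reason_sketch reset_reason;
	bool trigger_reset;
};

/* Returns 0 if the descriptor count is acceptable, -1 if a reset was requested. */
static int
check_rx_descs(struct adapter_sketch *a, uint16_t descs, uint16_t max_segs)
{
	if (descs > max_segs) {
		a->reset_reason = SKETCH_RESET_TOO_MANY_RX_DESCS;
		a->trigger_reset = true;
		return -1;
	}
	return 0;
}
```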
Xiao Wang [Tue, 18 Dec 2018 08:02:06 +0000 (16:02 +0800)]
net/ifc: support SW assisted VDPA live migration
In SW assisted live migration mode, the driver stops the device and
sets up a mediated virtio ring to relay the communication between the
virtio driver and the VDPA device.
This data path intervention allows SW to help with guest dirty page
logging for live migration.
This SW fallback is an event-driven relay thread, so when the network
throughput is low, it takes little CPU resource, but when the
throughput goes up, the relay thread's CPU usage goes up accordingly.
Users need to take all the factors, including CPU usage, guest perf
degradation, etc., into consideration when selecting the live migration
support mode.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xiao Wang [Tue, 18 Dec 2018 08:02:04 +0000 (16:02 +0800)]
net/ifc: add LM mode parameter
This patch series enables a new method for live migration, i.e. software
assisted live migration. This patch provides a device argument for the
user to choose the method.
When "sw-live-migration=1" is set, the driver/device does live migration
with a relay thread dealing with dirty page logging. Without this
parameter, the device does dirty page logging and there is no relay
thread consuming CPU resource.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xiao Wang [Tue, 18 Dec 2018 08:01:59 +0000 (16:01 +0800)]
vhost: provide helper for host notifier ctrl
A VDPA driver can decide whether it needs to enable or disable the host
notifier mapping, so exposing an API allows flexibility. A later patch
will build on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiago Lam [Mon, 17 Dec 2018 09:14:22 +0000 (09:14 +0000)]
doc: add af_packet PMD guide
As of commit 364e08f2bbc0, DPDK allows an application to send and
receive raw packets using AF_PACKET and PACKET_MMAP when running on a
Linux kernel. This complements it by adding a simple guide with the
following information:
- An introduction, where a brief explanation of this driver is given,
pointing out the dependency on PACKET_MMAP;
- Which options are supported at configuration time, while setting up an
interface, and its inherent limitations;
- What the prerequisites are;
- A command line example of how to set up a DPDK port using the
af_packet driver.
Since there's a dependency on PACKET_MMAP, the guide also points to the
original kernel documentation, so the reader can get more details.
It is possible that the VF device exists but DPDK doesn't know
about it. This could happen if device was blacklisted or more
likely the necessary device (Mellanox) was not part of the DPDK
configuration.
In either case, the right thing to do is just keep working
but only with the slower para-virtual device.
Fixes: dc7680e8597c ("net/netvsc: support integrated VF") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Xiaoyun Li [Thu, 6 Dec 2018 06:03:42 +0000 (14:03 +0800)]
net/i40e: fix statistics inconsistency
While calculating the input packet count per port, discarded packets
should be subtracted, but right now only PF VSI discarded packets are
subtracted.
While calculating the input byte count per port, however, the Rx byte
count is used, which should take all discarded packets into account,
including VF VSI ones.
This causes inconsistency in the stat counters in some cases.
This patch uses all VSI stats for the packet and byte counts to address
the issue.
Fixes: 763de290cbd1 ("net/i40e: fix packet count for PF") Cc: stable@dpdk.org Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Thu, 13 Dec 2018 15:46:45 +0000 (15:46 +0000)]
net/i40e: clear VF reset flags after reset
The reset flags vf->vf_reset and vf->pend_msg are set when the VF
receives VIRTCHNL_EVENT_RESET_IMPENDING, so these flags should be
cleared after the reset is done.
Anatoly Burakov [Fri, 21 Dec 2018 11:29:01 +0000 (11:29 +0000)]
test/mem: check external memory without IOVA table
Currently, only the scenario with a valid IOVA table is tested. Fix this
by also testing without an IOVA table. In these cases, EAL should
always return RTE_BAD_IOVA for all memsegs, and contiguous memzone
allocation should fail.
Anatoly Burakov [Fri, 21 Dec 2018 11:29:00 +0000 (11:29 +0000)]
test/mem: refactor and rename functions
We will be adding a new extmem test that behaves roughly like the
existing one, so clarify function names to distinguish between
these tests, and factor out the common parts.
Anatoly Burakov [Fri, 21 Dec 2018 12:26:05 +0000 (12:26 +0000)]
malloc: fix deadlock when reading stats
Currently, malloc statistics and external heap creation code
use memory hotplug lock as a way to synchronize accesses to
heaps (as in, locking the hotplug lock to prevent list of heaps
from changing under our feet). At the same time, malloc
statistics code will also lock the heap because it needs to
access heap data and does not want any other thread to allocate
anything from that heap.
In such a scheme, it is possible to enter a deadlock with the
following sequence of events:
thread 1                        thread 2
rte_malloc()
                                rte_malloc_dump_stats()
take heap lock
                                take hotplug lock
failed to allocate,
attempt to take
hotplug lock
                                attempt to take heap lock
Neither thread will be able to continue, as both of them are
waiting for the other one to drop the lock. Adding an
additional lock will require an ABI change, so instead of
that, make malloc statistics calls thread-unsafe with
respect to creating/destroying heaps.
Fixes: 72cf92b31855 ("malloc: index heaps using heap ID rather than NUMA node") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
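The interleaving above can be replayed with a small single-threaded model. This is only an illustration of the lock-order inversion, assuming non-recursive locks; lock_model, try_take() and replay_deadlock() are hypothetical and do not use the EAL locks.

```c
#include <stdbool.h>

enum { NO_OWNER = 0 };

/* Hypothetical lock: records which thread id holds it, if any. */
struct lock_model {
	int owner;
};

/* Returns true if thread tid acquired the lock, false if it would block. */
static bool
try_take(struct lock_model *l, int tid)
{
	if (l->owner != NO_OWNER)
		return false;
	l->owner = tid;
	return true;
}

/* Replays the sequence of events; returns true if both threads end up blocked. */
static bool
replay_deadlock(void)
{
	struct lock_model heap = { NO_OWNER };
	struct lock_model hotplug = { NO_OWNER };
	bool t1_blocked, t2_blocked;

	try_take(&heap, 1);     /* thread 1: rte_malloc() takes heap lock */
	try_take(&hotplug, 2);  /* thread 2: stats dump takes hotplug lock */

	/* thread 1 failed to allocate and now wants the hotplug lock */
	t1_blocked = !try_take(&hotplug, 1);
	/* thread 2 now wants the heap lock to read heap data */
	t2_blocked = !try_take(&heap, 2);

	return t1_blocked && t2_blocked;
}
```

Each thread holds the lock the other one needs, so neither request can succeed. Removing one of the two acquisitions from the statistics path, as the commit does with the hotplug lock, breaks the cycle.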
Jeff Shaw [Sat, 8 Dec 2018 00:01:26 +0000 (16:01 -0800)]
hash: fix return of bulk lookup
The __rte_hash_lookup_bulk() function returns void, and therefore
should not return with an expression. This commit fixes the following
compiler warning when attempting to compile with "-pedantic -std=c11".
warning: ISO C forbids ‘return’ with expression, in function
returning void [-Wpedantic]
Fixes: 9eca8bd7a61c ("hash: separate lock-free and r/w lock lookup") Cc: stable@dpdk.org Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
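The pattern behind this warning, reduced to a minimal sketch. The function names are hypothetical and the lookup body is a placeholder, not the rte_hash code.

```c
#include <stddef.h>

/* Placeholder worker that returns void, standing in for a bulk lookup variant. */
static void
lookup_bulk_impl(const int *keys, size_t n, int *out)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = keys[i] * 2;
}

/* Wrapper that also returns void. */
static void
lookup_bulk(const int *keys, size_t n, int *out)
{
	/*
	 * Flagged with "-pedantic -std=c11":
	 *     return lookup_bulk_impl(keys, n, out);
	 * Call the function as a plain statement instead.
	 */
	lookup_bulk_impl(keys, n, out);
}
```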
Liang Ma [Thu, 20 Dec 2018 14:43:42 +0000 (14:43 +0000)]
power: add p-state driver compatibility
Previously, in order to use the power library, it was necessary
for the user to disable the intel_pstate driver by adding
“intel_pstate=disable” to the kernel command line for the system,
which causes the acpi_cpufreq driver to be loaded in its place.
This patch adds the ability for the power library to use the
intel_pstate driver.
It adds a new suite of functions behind the current power library API,
and seamlessly sets up the user-facing API function pointers to
the relevant functions depending on whether the system is running with
the acpi_cpufreq kernel driver, the intel_pstate kernel driver, or in a
guest using kvm. The library API and ABI are unchanged.
Signed-off-by: Liang Ma <liang.j.ma@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com>
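The function pointer approach can be sketched as below. All names here are hypothetical (the environment enum, the three backends and the bind function), and the backends return a tag instead of touching any cpufreq interface.

```c
/* Hypothetical environment identifiers; not the library's enum. */
enum power_env_sketch {
	SKETCH_ENV_ACPI_CPUFREQ,
	SKETCH_ENV_PSTATE_CPUFREQ,
	SKETCH_ENV_KVM_VM
};

typedef int (*freq_up_fn)(unsigned int lcore_id);

/* Placeholder backends; each returns a tag identifying itself. */
static int acpi_freq_up(unsigned int lcore_id)   { (void)lcore_id; return 1; }
static int pstate_freq_up(unsigned int lcore_id) { (void)lcore_id; return 2; }
static int kvm_freq_up(unsigned int lcore_id)    { (void)lcore_id; return 3; }

/* User-facing entry point; its signature never changes. */
static freq_up_fn power_freq_up_sketch;

/* Binds the user-facing pointer to the backend for the detected environment. */
static int
power_bind_sketch(enum power_env_sketch env)
{
	switch (env) {
	case SKETCH_ENV_ACPI_CPUFREQ:
		power_freq_up_sketch = acpi_freq_up;
		return 0;
	case SKETCH_ENV_PSTATE_CPUFREQ:
		power_freq_up_sketch = pstate_freq_up;
		return 0;
	case SKETCH_ENV_KVM_VM:
		power_freq_up_sketch = kvm_freq_up;
		return 0;
	}
	return -1;
}
```

Because callers only ever see the pointer, a new backend can be added without changing the API or ABI, which is the property the commit message highlights.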
Qi Zhang [Thu, 20 Dec 2018 12:51:14 +0000 (20:51 +0800)]
eal: close multi-process socket during cleanup
When a secondary process quits, the mp_socket* file still exists, which
causes rte_mp_request_sync to fail when it tries to send a message on a
floating socket.
This patch fixes the issue by introducing a function,
rte_mp_channel_cleanup. This function is called by rte_eal_cleanup; it
closes the mp socket and deletes the mp_socket* file.
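The two cleanup steps, closing the descriptor and deleting the socket file, can be sketched with plain POSIX calls. This is a minimal sketch under assumptions: mp_channel_cleanup_sketch() and its parameters are hypothetical and do not reflect the internals of rte_mp_channel_cleanup.

```c
#include <unistd.h>

/*
 * Closes the socket descriptor if it is open and removes the socket
 * file, so that peers no longer find a socket nobody is listening on.
 */
static void
mp_channel_cleanup_sketch(int *fd, const char *path)
{
	if (*fd >= 0) {
		close(*fd);
		*fd = -1;      /* mark as closed to avoid a double close */
	}
	unlink(path);          /* delete the mp_socket* file */
}
```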