Andrew Rybchenko [Tue, 29 Nov 2016 16:19:00 +0000 (16:19 +0000)]
net/sfc/base: import bootrom configuration
Provide API to read/write bootrom configuration from/to NVRAM.
EFSYS_OPT_BOOTROM should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:59 +0000 (16:18 +0000)]
net/sfc/base: import VPD support
Provide API to read/write PCI Vital Product Data.
EFSYS_OPT_VPD should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:58 +0000 (16:18 +0000)]
net/sfc/base: import NVRAM support
Provide API to work with NIC non-volatile memory. It is used
to update firmware, configure NIC including bootrom parameters,
manage licenses, store PCI Vital Product Data etc.
EFSYS_OPT_NVRAM should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:57 +0000 (16:18 +0000)]
net/sfc/base: import Rx packed stream mode
In packed stream mode, large buffers are provided to the NIC
into which many packets can be delivered. This reduces the
number of queue refills needed compared to delivering every
packet into a separate buffer.
EFSYS_OPT_RX_PACKED_STREAM should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:56 +0000 (16:18 +0000)]
net/sfc/base: import monitors access via MCDI
EFSYS_OPT_MON_MCDI should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:55 +0000 (16:18 +0000)]
net/sfc/base: import monitors statistics
EFSYS_OPT_MON_STATS should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:54 +0000 (16:18 +0000)]
net/sfc/base: import loopback control
EFSYS_OPT_LOOPBACK should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:53 +0000 (16:18 +0000)]
net/sfc/base: import RSS support
EFSYS_OPT_RX_SCALE should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:52 +0000 (16:18 +0000)]
net/sfc/base: import Rx scatter support
EFSYS_OPT_RX_SCATTER should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:51 +0000 (16:18 +0000)]
net/sfc/base: import event prefetch
EFSYS_OPT_EV_PREFETCH enables event prefetching
when the event queue is polled.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:50 +0000 (16:18 +0000)]
net/sfc/base: import MAC statistics
MAC statistics are written to the provided DMA-mapped memory either
periodically (if supported/requested) or on demand.
If periodic update is not supported (e.g. for EF10 virtual
functions), it is the driver's responsibility to handle it.
EFSYS_OPT_MAC_STATS should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:49 +0000 (16:18 +0000)]
net/sfc/base: import PHY LEDs control
EFSYS_OPT_PHY_LED_CONTROL should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:48 +0000 (16:18 +0000)]
net/sfc/base: import PHY statistics
EFSYS_OPT_PHY_STATS should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:47 +0000 (16:18 +0000)]
net/sfc/base: import PHY flags control
EFSYS_OPT_PHY_FLAGS should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:46 +0000 (16:18 +0000)]
net/sfc/base: import software per-queue statistics
EFSYS_OPT_QSTATS should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:45 +0000 (16:18 +0000)]
net/sfc/base: import built-in selftest
EFSYS_OPT_BIST should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:44 +0000 (16:18 +0000)]
net/sfc/base: import diagnostics support
EFSYS_OPT_DIAG should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:43 +0000 (16:18 +0000)]
net/sfc/base: import SFN8xxx family support
SFN8xxx is the second family based on the EF10 architecture.
It has a few differences from the SFN7xxx adapter family.
EFSYS_OPT_MEDFORD should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:42 +0000 (16:18 +0000)]
net/sfc/base: import SFN7xxx family support
SFN7xxx is the first family based on EF10 architecture.
EFSYS_OPT_HUNTINGTON should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:41 +0000 (16:18 +0000)]
net/sfc/base: import 5xxx/6xxx family support
EFSYS_OPT_SIENA should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:40 +0000 (16:18 +0000)]
net/sfc/base: import MCDI proxy authorization
MCDI proxy authorization may be used if privileged PCI
function (physical function) would like to intercept and
authorize MCDI requests done by unprivileged (e.g. virtual)
PCI function. It may be used to control unprivileged
function Rx mode (e.g. promiscuous, all-multicast), MTU
and default MAC address change requests etc.
Current libefx support is limited to client-side which
is required to work when function requests need to be
authorized.
Server side support required to request and do the
authorization is not implemented yet.
EFSYS_OPT_MCDI_PROXY_AUTH should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:39 +0000 (16:18 +0000)]
net/sfc/base: import MCDI logging
Driver can provide a function to be called to log MCDI
requests and responses to help with debugging.
Solarflare netlogdecode cross-platform tool may be used
to decode these logs.
EFSYS_OPT_MCDI_LOGGING should be enabled to use it.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:38 +0000 (16:18 +0000)]
net/sfc/base: import MCDI implementation
Implement the interface to talk to the NIC management CPU. Provide
helpers to fill in MCDI requests, execute them and process the
received response.
An MCDI request is prepared in either PCI BAR mapped memory
(SFN5xxx/SFN6xxx) or DMA-mapped memory (SFN7xxx/SFN8xxx), and the
doorbell (a memory-mapped register) is rung to execute it.
MCDI completion events are delivered to the house-keeping
event queue, but using these events is optional: the MCDI
buffer may simply be polled until the completion indication
is set.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:37 +0000 (16:18 +0000)]
net/sfc/base: import MCDI definition
The header defines data interface between host CPU and NIC
management CPU.
The header is automatically generated from firmware sources.
MCDI is used on NIC control path (configuration,
event/transmit/receive queues setup and teardown etc), but
not used on data path.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:36 +0000 (16:18 +0000)]
net/sfc/base: import filters support
Filtering capabilities depend on the NIC family and the firmware
variant in use. The provided API allows the driver to get the supported
filter types (in priority order), add/delete individual filters and
restore the entire filter table after, for example, a NIC management
CPU reboot.
Rx filters redirect a matching flow to a specified Rx queue.
Tx filters control generated traffic (e.g. to implement
virtual function anti-spoofing control).
EFSYS_OPT_FILTER should be enabled to use it. It is required
for SFN7xxx and SFN8xxx adapter families support.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:35 +0000 (16:18 +0000)]
net/sfc/base: import register definitions
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:34 +0000 (16:18 +0000)]
net/sfc/base: import libefx base
libefx is a platform-independent library for implementing drivers
for Solarflare network adapters. It provides a unified,
adapter-family-independent interface (where possible).
The driver must provide an efsys.h header which defines the options
(EFSYS_OPT_*) to be used and macros/functions to allocate
memory, read/write DMA-mapped memory, read/write PCI BAR
space, and implement locks, barriers etc.
efx.h and efx_types.h provide external interfaces intended
to be used by drivers. Other header files are internal.
From Solarflare Communications Inc.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
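As a rough illustration of the option mechanism described in the libefx base commit, a driver's efsys.h might select features along the following lines. The option names are the ones mentioned in the commit messages of this series; the chosen values and the consistency check are assumptions made for illustration and are not taken from any real efsys.h.

```c
/* Hypothetical excerpt of a driver-provided efsys.h (illustrative only). */

/* Adapter families to build support for. */
#define EFSYS_OPT_SIENA      0 /* 5xxx/6xxx */
#define EFSYS_OPT_HUNTINGTON 1 /* SFN7xxx */
#define EFSYS_OPT_MEDFORD    1 /* SFN8xxx */

/* Optional features. */
#define EFSYS_OPT_FILTER     1
#define EFSYS_OPT_MAC_STATS  1
#define EFSYS_OPT_NVRAM      0

/*
 * The filters commit states that EFSYS_OPT_FILTER is required for the
 * SFN7xxx and SFN8xxx families; this check encodes that rule.
 */
#if (EFSYS_OPT_HUNTINGTON || EFSYS_OPT_MEDFORD) && !EFSYS_OPT_FILTER
#error "EFSYS_OPT_FILTER is required for SFN7xxx/SFN8xxx support"
#endif
```

Testing the options with `#if` rather than `#ifdef` lets a disabled feature be written explicitly as 0.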
Andrew Rybchenko [Tue, 29 Nov 2016 16:18:33 +0000 (16:18 +0000)]
net/sfc: libefx-based driver stub
Enable the PMD by default on supported configurations.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Jingjing Wu [Wed, 30 Nov 2016 02:02:25 +0000 (10:02 +0800)]
net/i40evf: fix casting between structs
Casting from structs which lay out data in typed members
to structs which have flat memory buffers will cause
problems if the alignment of the former isn't as expected.
This patch removes the casting between structs.
Fixes: ae19955e7c86 ("i40evf: support reporting PF reset")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
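The general hazard behind the i40evf struct-casting fix can be sketched as follows. This is a minimal example under assumptions: `typed_msg`, `flat_buf` and `typed_msg_from_flat` are hypothetical names, not the i40evf definitions, and the patch itself may avoid the cast in a different way. The sketch copies bytes into a properly aligned object instead of reinterpreting the buffer through a pointer cast.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical message with typed members (not the i40evf definition). */
struct typed_msg {
	uint32_t opcode;
	uint32_t retval;
};

/* Hypothetical flat buffer; its only alignment guarantee is 1 byte. */
struct flat_buf {
	uint8_t bytes[sizeof(struct typed_msg)];
};

/*
 * Copy the bytes into an aligned object rather than casting
 * buf->bytes to (struct typed_msg *), which may be misaligned.
 */
static struct typed_msg
typed_msg_from_flat(const struct flat_buf *buf)
{
	struct typed_msg msg;

	memcpy(&msg, buf->bytes, sizeof(msg));
	return msg;
}
```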
Wenzhuo Lu [Sun, 27 Nov 2016 18:11:44 +0000 (13:11 -0500)]
net/e1000/base: announce supported devices
Document all supported NICs.
Add Intel I219 NIC support to the release notes.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Qi Zhang [Tue, 29 Nov 2016 20:26:21 +0000 (15:26 -0500)]
net/i40e: enable auto link update for 25G
For 25G devices, auto link update was disabled because it caused
link issues when enabled.
The problem was caused by interface changes in the admin queue commands
"set_phy_config" and "get_phy_capabilities" for 25G.
This patch fixes the issue and enables auto link update for 25G devices.
Fixes: 75d133dd3296 ("net/i40e: enable 25G device")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Jingjing Wu [Sun, 27 Nov 2016 09:11:35 +0000 (17:11 +0800)]
net/i40e: fix logging for Tx free threshold check
Fixes: 4861cde46116 ("i40e: new poll mode driver")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:23:00 +0000 (12:23 -0500)]
net/e1000: enable new I219 devices
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:59 +0000 (12:22 -0500)]
net/e1000/base: update shared code version
Updated to 2016.11.22
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:58 +0000 (12:22 -0500)]
net/e1000/base: support more I219 devices
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:57 +0000 (12:22 -0500)]
net/e1000/base: disable force K1-off feature
MAC-PHY desync may occur causing misdetection of link up event.
Disabling K1-off feature can work around the problem.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:56 +0000 (12:22 -0500)]
net/e1000/base: add workaround for possible stalled packet
This works around a possible stalled packet issue, which may occur due to
clock recovery from the PCH being too slow, when the LAN is transitioning
from K1 at 1G link speed.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:55 +0000 (12:22 -0500)]
net/e1000/base: enable new I219 devices
Enable the support of new I219 devices.
Also define some registers for future usage.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:54 +0000 (12:22 -0500)]
net/e1000/base: add workaround for ULP entry flow
For I217 revision 6, when entering Ultra Low Power (ULP) we need to enable
Low Power Link Up (LPLU) and disable Gig speed to make it work.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:53 +0000 (12:22 -0500)]
net/e1000/base: increase LANPHYPC low duration
LANPHYPC low duration of 10 usec was too low for some corner cases
causing interface mismatches during Ultra Low Power (ULP) exit.
This patch increases the LANPHYPC low duration to 1 msec.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:52 +0000 (12:22 -0500)]
net/e1000/base: clear ULP configuration register on ULP exit
There are some client PHY Ultra Low Power (ULP) register bits that are
configured by the Manageability Engine (ME) FW.
The driver must ensure that these bits are cleared on exit from ULP.
Ordinarily the ME FW would do that, but there are cases in which the
FW is not present, and the driver must handle that.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:51 +0000 (12:22 -0500)]
net/e1000/base: restore link speed after ULP exit
When Ultra Low Power (ULP) is enabled, the client PHY needs to be set up
for link configuration after the cable is reconnected.
Previously, link configuration was only done in auto-negotiation mode.
Do link configuration in autoneg-disabled mode as well.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:50 +0000 (12:22 -0500)]
net/e1000/base: define max Rx jumbo frame size
Add definition MAX_RX_JUMBO_FRAME_SIZE for igb.
All igb parts (82575 and newer) have 9.5K max jumbo frame size.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:49 +0000 (12:22 -0500)]
net/e1000/base: expose I350 internal function
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:48 +0000 (12:22 -0500)]
net/e1000/base: get FW version for I354
I354 support was missing in the e1000_get_fw_version() which resulted in
the FW version not being reported. Support added.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:47 +0000 (12:22 -0500)]
net/e1000/base: retry to get HW mailbox lock
The driver shouldn't give up if it fails to get the hardware mailbox lock.
This can happen in a situation where the PF-VF communication channel is
heavily loaded and causes complete communications failure between the PF
and VF drivers.
Add a counter and a delay. The driver will now retry ten times,
waiting one millisecond between retries.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
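The retry policy described in the mailbox lock commit (a counter plus a delay) might look roughly like the sketch below. The `demo_*` names, the callback structure and the fake mailbox are hypothetical stand-ins and are not the e1000 base code; the sketch also assumes that "retry ten times" means ten attempts in total.

```c
#include <stdbool.h>

#define DEMO_MBX_LOCK_ATTEMPTS 10   /* "retry ten times" (assumed total) */
#define DEMO_MBX_LOCK_DELAY_US 1000 /* one millisecond between retries */

/* Hypothetical hooks standing in for hardware access and delay routines. */
struct demo_mbx_ops {
	bool (*try_lock)(void *ctx);
	void (*delay_us)(void *ctx, unsigned int usec);
};

/* Return 0 once the lock is taken, -1 if every attempt found it busy. */
static int
demo_mbx_obtain_lock(const struct demo_mbx_ops *ops, void *ctx)
{
	int attempt;

	for (attempt = 0; attempt < DEMO_MBX_LOCK_ATTEMPTS; attempt++) {
		if (ops->try_lock(ctx))
			return 0;
		ops->delay_us(ctx, DEMO_MBX_LOCK_DELAY_US);
	}
	return -1;
}

/* Fake mailbox used only to exercise the loop: busy for N polls. */
struct demo_fake_mbx {
	int busy_polls;
	unsigned int waited_us;
};

static bool
demo_fake_try_lock(void *ctx)
{
	struct demo_fake_mbx *mbx = ctx;

	if (mbx->busy_polls > 0) {
		mbx->busy_polls--;
		return false;
	}
	return true;
}

static void
demo_fake_delay_us(void *ctx, unsigned int usec)
{
	struct demo_fake_mbx *mbx = ctx;

	mbx->waited_us += usec; /* record the delay instead of sleeping */
}
```

Bounding the number of attempts keeps a heavily loaded PF-VF channel from blocking the caller indefinitely.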
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:46 +0000 (12:22 -0500)]
net/e1000/base: avoid packet loss for non-1G
To avoid packet loss, the Phase Lock Loop (PLL) clock gate time needs to be
increased for non-1G speeds.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Wed, 23 Nov 2016 17:22:45 +0000 (12:22 -0500)]
net/e1000/base: increase ULP timer
With new hardware (I219), Ultra Low Power (ULP) exit takes significantly
longer. Therefore, the driver must wait longer.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
David Marchand [Mon, 21 Nov 2016 18:06:14 +0000 (19:06 +0100)]
net: align ethdev and eal driver names
Some virtual pmds report a different name than the vdev driver name
registered in eal.
While it does not hurt, let's try to be consistent.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
David Marchand [Mon, 21 Nov 2016 18:06:13 +0000 (19:06 +0100)]
net: remove dead driver names
Since commit b1fb53a39d88 ("ethdev: remove some PCI specific handling"),
rte_eth_dev_info_get() relies on dev->data->drv_name to report the driver
name to the caller.
Having the pmds set driver_info->driver_name is useless,
since ethdev overwrites it right after.
The only thing the pmd must do is:
- for pci drivers, call rte_eth_copy_pci_info() which then sets
data->drv_name
- for vdev drivers, manually set data->drv_name
At this stage, virtio-user does not properly report a driver name (fixed in
next commit).
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Jan Blunck <jblunck@infradead.org>
Nélio Laranjeiro [Thu, 17 Nov 2016 09:49:56 +0000 (10:49 +0100)]
net/mlx5: do not invalidate title CQE
We can leave the title completion queue entry untouched since its contents
are not modified.
Reported-by: Liming Sun <lsun@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Nélio Laranjeiro [Thu, 17 Nov 2016 09:49:55 +0000 (10:49 +0100)]
net/mlx5: fix endianness in Tx completion queue
Completion queue entry data is stored in network byte order, so it must
be accessed through ntoh*().
Fixes: c305090bbaf8 ("net/mlx5: replace countdown with threshold for Tx completions")
Reported-by: Liming Sun <lsun@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
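The byte-order rule from the Tx completion queue fix can be illustrated with a small sketch. The `demo_cqe` structure is a hypothetical, simplified entry and does not reflect the real mlx5 CQE layout; only `ntohs()`/`htons()` from `<arpa/inet.h>` are real library calls.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical, simplified completion entry (not the mlx5 layout). */
struct demo_cqe {
	uint16_t wqe_counter; /* stored in network (big-endian) byte order */
};

/* Convert the field to host byte order before using it. */
static uint16_t
demo_cqe_wqe_counter(const struct demo_cqe *cqe)
{
	return ntohs(cqe->wqe_counter);
}
```

On a big-endian host the conversion is a no-op, which is why a missing ntoh*() can go unnoticed until the code runs on a little-endian machine.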
Nélio Laranjeiro [Thu, 17 Nov 2016 09:49:54 +0000 (10:49 +0100)]
net/mlx5: fix leak when starvation occurs
The list of segments to free was manipulated incorrectly, so only
the first segment was freed instead of all of them. The last one still
belongs to the NIC and thus should not be freed.
Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")
Reported-by: Liming Sun <lsun@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Remy Horton [Mon, 14 Nov 2016 06:14:49 +0000 (14:14 +0800)]
net/i40e: fix spelling
Fixes: da61cd084976 ("i40evf: add extended stats")
Fixes: 0eedec25ea36 ("i40e: clean log messages")
Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Remy Horton [Mon, 14 Nov 2016 06:14:48 +0000 (14:14 +0800)]
net/i40e: fix xstats value mapping
The offsets used in rte_i40evf_stats_strings for transmission
statistics were wrong, returning the total byte count rather than
the respective (unicast, multicast, broadcast, drop, & error)
packet counts.
Fixes: da61cd084976 ("i40evf: add extended stats")
Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
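The kind of name-to-offset table involved in the xstats mapping fix can be sketched as below. The structure and table here are hypothetical (`demo_stats`, `demo_xstats`) and only show the pattern: each statistic name must be paired with the offset of its own counter, otherwise every entry reports the same field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counters block (not the i40evf statistics structure). */
struct demo_stats {
	uint64_t tx_bytes;
	uint64_t tx_unicast;
	uint64_t tx_multicast;
};

struct demo_xstat_name_off {
	const char *name;
	size_t offset;
};

/* Each name is paired with the offset of its own counter. */
static const struct demo_xstat_name_off demo_xstats[] = {
	{ "tx_bytes", offsetof(struct demo_stats, tx_bytes) },
	{ "tx_unicast_packets", offsetof(struct demo_stats, tx_unicast) },
	{ "tx_multicast_packets", offsetof(struct demo_stats, tx_multicast) },
};

/* Read the counter selected by table index idx. */
static uint64_t
demo_xstat_value(const struct demo_stats *stats, size_t idx)
{
	uint64_t value;

	memcpy(&value, (const char *)stats + demo_xstats[idx].offset,
	       sizeof(value));
	return value;
}
```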
Thomas Monjalon [Tue, 17 Jan 2017 22:13:00 +0000 (23:13 +0100)]
net/virtio: fix build without virtio-user
When CONFIG_RTE_VIRTIO_USER is disabled (default on FreeBSD),
the virtio driver cannot be compiled:
librte_pmd_virtio.a(virtio_ethdev.o): In function `eth_virtio_dev_init':
(.text+0x1eba): undefined reference to `virtio_user_ops'
Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Qiming Yang [Mon, 16 Jan 2017 10:48:31 +0000 (18:48 +0800)]
examples/ethtool: display firmware version
This patch enhances the ethtool example to show the
firmware version, in the same way that the Linux kernel
ethtool does.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Qiming Yang [Mon, 16 Jan 2017 10:48:30 +0000 (18:48 +0800)]
net/i40e: add firmware version get
This patch adds a new function i40e_fw_version_get.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Qiming Yang [Mon, 16 Jan 2017 10:48:29 +0000 (18:48 +0800)]
net/ixgbe: add firmware version get
This patch adds a new function ixgbe_fw_version_get.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Qiming Yang [Mon, 16 Jan 2017 10:48:28 +0000 (18:48 +0800)]
net/e1000: add firmware version get
This patch adds a new function eth_igb_fw_version_get.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Qiming Yang [Mon, 16 Jan 2017 10:48:27 +0000 (18:48 +0800)]
ethdev: add firmware version get
This patch adds a new API 'rte_eth_dev_fw_version_get' for
fetching the firmware version of a given device.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Olivier Matz [Tue, 17 Jan 2017 10:35:53 +0000 (11:35 +0100)]
net/virtio: fix advertised Rx offload capabilities
When the virtio PMD is used on top of a vhost that does not support
offloads, Rx offload capabilities are still advertised by
virtio_dev_info_get(). But if an application tries to start the PMD with
Rx offloads enabled (rxmode.hw_ip_checksum = 1), the initialization of
the device will fail with -ENOTSUP and the following log:
rx ip checksum not available on this host
This patch fixes the Rx offload capabilities returned by
virtio_dev_info_get() to be consistent with features advertised by the
host.
Fixes: 96cb6711939e ("net/virtio: support Rx checksum offload")
Fixes: 86d59b21468a ("net/virtio: support LRO")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tomasz Kulasek [Fri, 16 Dec 2016 15:15:16 +0000 (16:15 +0100)]
examples/performance-thread: add packet type parsing
Recent changes in the Niantic and Fortville NIC drivers cause the
vector Rx path to be chosen by default in the l3fwd-thread application.
This path doesn't propagate hardware packet type recognition
to the packet_type field in the mbuf, so packets cannot be classified
properly.
The approach to solving this problem is similar to commit
71a7e2424e07 ("examples/l3fwd: fix using packet type blindly").
To use the software packet analyzer, a new command line option,
"--parse-ptype", is introduced.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Jasvinder Singh [Mon, 21 Nov 2016 13:37:37 +0000 (13:37 +0000)]
examples/ip_pipeline: fix parsing of pass-through pipeline
This patch fixes the configuration file parsing error that occurs when the
load balancing function is enabled in a pass-through pipeline.
Error log:
pipeline> [APP] Initializing PIPELINE1 ...
[PIPELINE1] Pass-through
Parse error in section "PIPELINE1": entry "lb" has invalid value ("hash")
Fixes: cbe82f6cfb0a ("examples/ip_pipeline: add swap action in pass-through")
Cc: stable@dpdk.org
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Sankar Chokkalingam [Wed, 28 Dec 2016 12:01:54 +0000 (05:01 -0700)]
examples/ip_pipeline: fix coremask limitation
Issue:
coremask used in IP Pipeline is limited to 64 cores.
Solution:
Changed the coremask to an array of uint64_t to support up to RTE_MAX_LCORE cores.
Fixes: 7f64b9c004aa ("examples/ip_pipeline: rework config file syntax")
Fixes: eb32fe7c5574 ("examples/ip_pipeline: rework initialization parameters")
Fixes: b4aee0fb9c6d ("examples/ip_pipeline: reconfigure thread binding dynamically")
Fixes: 4e14069328fc ("examples/ip_pipeline: measure CPU utilization")
Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
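The idea behind the coremask fix, replacing a single 64-bit mask with an array of 64-bit words, can be sketched as follows. `DEMO_MAX_LCORE` is a stand-in for RTE_MAX_LCORE and the helper names are hypothetical; the ip_pipeline code may organise this differently.

```c
#include <stdint.h>

#define DEMO_MAX_LCORE 128 /* stand-in for RTE_MAX_LCORE */
#define DEMO_COREMASK_WORDS ((DEMO_MAX_LCORE + 63) / 64)

/* Mark lcore as enabled; word index and bit index are derived from it. */
static void
demo_coremask_set(uint64_t *mask, unsigned int lcore)
{
	mask[lcore / 64] |= UINT64_C(1) << (lcore % 64);
}

/* Return 1 if lcore is enabled in the mask, 0 otherwise. */
static int
demo_coremask_test(const uint64_t *mask, unsigned int lcore)
{
	return (int)((mask[lcore / 64] >> (lcore % 64)) & 1);
}
```

With this layout, lcore 70 lands in bit 6 of the second word, which a single uint64_t could not represent.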
Anand B Jyoti [Sun, 8 Jan 2017 21:55:49 +0000 (03:25 +0530)]
examples/ip_pipeline: check VLAN and MPLS parameters
This commit adds checks to the CLI commands for the following errors:
1. SVLAN and CVLAN IDs greater than 12 bits
2. MPLS ID greater than 20 bits
3. number of MPLS labels above the supported maximum (to avoid array overflow)
This prevents CLI commands from running with invalid parameters.
Signed-off-by: Anand B Jyoti <anand.b.jyoti@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
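The three checks listed in the VLAN and MPLS parameter commit can be sketched as below. All names are hypothetical, and `DEMO_MPLS_LABELS_MAX` is an assumed array capacity chosen for illustration; the real limit in ip_pipeline is not stated in the commit message.

```c
#include <stdint.h>

#define DEMO_VLAN_ID_BITS 12
#define DEMO_MPLS_ID_BITS 20
#define DEMO_MPLS_LABELS_MAX 8 /* hypothetical array capacity */

/* Return 1 if value is representable in the given number of bits. */
static int
demo_fits_in_bits(uint32_t value, unsigned int bits)
{
	return (value >> bits) == 0;
}

/* Check 1: both QinQ VLAN IDs must fit in 12 bits. */
static int
demo_qinq_valid(uint32_t svlan, uint32_t cvlan)
{
	return demo_fits_in_bits(svlan, DEMO_VLAN_ID_BITS) &&
	       demo_fits_in_bits(cvlan, DEMO_VLAN_ID_BITS);
}

/* Checks 2 and 3: label count within capacity, each label within 20 bits. */
static int
demo_mpls_valid(const uint32_t *labels, uint32_t n_labels)
{
	uint32_t i;

	if (n_labels > DEMO_MPLS_LABELS_MAX)
		return 0;
	for (i = 0; i < n_labels; i++) {
		if (!demo_fits_in_bits(labels[i], DEMO_MPLS_ID_BITS))
			return 0;
	}
	return 1;
}
```

Checking the label count before touching the array is what prevents the overflow mentioned in the commit.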
Thomas Monjalon [Tue, 17 Jan 2017 17:16:19 +0000 (18:16 +0100)]
examples/ip_pipeline: remove useless makefile line
The line is missing a dollar sign, and it is not needed anyway because of VPATH.
Reported-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Olivier Matz [Tue, 22 Nov 2016 13:52:15 +0000 (14:52 +0100)]
examples/l3fwd: rework long options parsing
Avoid the use of several strncmp() calls, since getopt is able to
map a long option to an id, which can be matched in the
same switch/case as short options.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
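The getopt_long() mechanism that the l3fwd rework relies on can be sketched as follows: each long option carries a numeric id in the `val` field of `struct option`, and that id is returned to the same switch that handles short options. The option set, enum names and `demo_opts` structure are illustrative assumptions, not the l3fwd source.

```c
#include <getopt.h>
#include <stddef.h>

enum {
	/* Start above the char range so ids never clash with short options. */
	DEMO_OPT_MIN_NUM = 256,
	DEMO_OPT_CONFIG_NUM,
	DEMO_OPT_NO_NUMA_NUM,
};

struct demo_opts {
	const char *portmask;
	const char *config;
	int numa_on;
};

/* Parse argv; long and short options share one switch statement. */
static int
demo_parse_args(int argc, char **argv, struct demo_opts *out)
{
	static const struct option lgopts[] = {
		{ "config", required_argument, NULL, DEMO_OPT_CONFIG_NUM },
		{ "no-numa", no_argument, NULL, DEMO_OPT_NO_NUMA_NUM },
		{ NULL, 0, NULL, 0 },
	};
	int opt;

	optind = 1; /* restart scanning from the first argument */
	while ((opt = getopt_long(argc, argv, "p:", lgopts, NULL)) != -1) {
		switch (opt) {
		case 'p':
			out->portmask = optarg;
			break;
		case DEMO_OPT_CONFIG_NUM:
			out->config = optarg;
			break;
		case DEMO_OPT_NO_NUMA_NUM:
			out->numa_on = 0;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because the `flag` field is NULL, getopt_long() returns `val` directly, so no string comparison on the option name is needed.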
Olivier Matz [Tue, 22 Nov 2016 13:52:16 +0000 (14:52 +0100)]
examples/l2fwd: rework long options parsing
Do the same as in l3fwd to avoid strcmp() for long options.
For l2fwd, there is no long option that takes advantage of this new
mechanism, as --mac-updating and --no-mac-updating set a flag
directly without needing an entry in the switch/case.
So this patch just prepares the framework in case a new long option is
added in the future.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Yongseok Koh [Fri, 6 Jan 2017 22:40:10 +0000 (14:40 -0800)]
doc: fix links to Linux in contribution guide
A referenced document in the Linux kernel has been moved to a
sub-directory, and the kernel community has moved to RST/Sphinx. The links
are replaced with links to the rendered HTML.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Pablo de Lara [Mon, 19 Dec 2016 16:34:12 +0000 (16:34 +0000)]
doc: simplify l3fwd example guide
The L3 Forwarding sample app user guides have some inconsistencies
between the example command line and the configuration table.
They also showed an overly complicated configuration, using two
different NUMA nodes for two ports, which would probably lead
to a performance drop due to use of the cross-socket channel.
This patch simplifies the configuration of these examples
by using a single NUMA node and a single queue per port.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Yong Wang [Mon, 19 Dec 2016 12:14:38 +0000 (07:14 -0500)]
doc: fix a typo in prog guide
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: John McNamara <john.mcnamara@intel.com>
Rami Rosen [Thu, 5 Jan 2017 21:36:09 +0000 (23:36 +0200)]
doc: fix a typo in proc_info guide
This patch fixes a typo in proc_info guide (tools).
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Rami Rosen [Sat, 7 Jan 2017 14:08:04 +0000 (16:08 +0200)]
doc: fix a typo in testpmd guide
This patch fixes a trivial typo in testpmd application guide.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Wenzhuo Lu [Tue, 13 Dec 2016 07:11:08 +0000 (15:11 +0800)]
app/testpmd: fix check for invalid ports
Some CLIs don't check the input port ID, which
may cause a segmentation fault (core dumped).
Fixes: 425781ff5afe ("app/testpmd: add ixgbe VF management")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Zhihong Wang [Wed, 7 Dec 2016 01:31:06 +0000 (20:31 -0500)]
eal: optimize aligned memcpy on x86
This patch optimizes rte_memcpy for well-aligned cases, where both
the dst and src addresses are aligned to the maximum MOV width. It
introduces a dedicated function, rte_memcpy_aligned, to handle the aligned
cases with a simplified instruction stream. The existing rte_memcpy
is renamed to rte_memcpy_generic. The selection between the two is
done at the entry of rte_memcpy.
The existing rte_memcpy is for generic cases: it handles unaligned
copies and makes stores aligned, and it even makes loads aligned for
microarchitectures like Ivy Bridge. However, alignment handling comes at
a price: it adds extra load/store instructions, which can sometimes
cause complications.
Take DPDK Vhost memcpy with the Mergeable Rx Buffer feature as an example:
the copy is aligned and remote, and it is accompanied by a header write
which is also remote. In this case the memcpy instruction stream
should be simplified to reduce extra loads/stores, and thereby reduce
the probability of pipeline stalls caused by full load/store buffers, so
that the actual memcpy instructions are issued and the H/W prefetcher
goes to work as early as possible.
This patch has been tested on Ivy Bridge, Haswell and Skylake; it provides
up to 20% gain for Virtio Vhost PVP traffic, with packet sizes ranging
from 64 to 1500 bytes.
The test can also be conducted without a NIC, by setting up loopback
traffic between Virtio and Vhost. For example, modify the macro
TXONLY_DEF_PACKET_LEN to the requested packet size in testpmd.h,
rebuild and start testpmd in both host and guest, then "start" on
one side and "start tx_first 32" on the other.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
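The entry-point selection described in the aligned memcpy commit can be sketched as below. This is a simplified illustration: the two copy routines are plain memcpy() stand-ins rather than the vectorized DPDK implementations, and the 32-byte mask assumes the maximum MOV width is that of AVX2 registers. All `demo_*` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ALIGNMENT_MASK 0x1F /* assumed 32-byte maximum MOV width */

/* Stand-in for the simplified instruction stream of the aligned path. */
static void *
demo_memcpy_aligned(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Stand-in for the generic path that handles unaligned copies. */
static void *
demo_memcpy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* OR the two addresses so one mask test covers both pointers. */
static int
demo_both_aligned(const void *dst, const void *src)
{
	return (((uintptr_t)dst | (uintptr_t)src) & DEMO_ALIGNMENT_MASK) == 0;
}

/* Entry point: choose the path once, based on pointer alignment. */
static void *
demo_memcpy(void *dst, const void *src, size_t n)
{
	if (demo_both_aligned(dst, src))
		return demo_memcpy_aligned(dst, src, n);
	return demo_memcpy_generic(dst, src, n);
}
```

The alignment test costs one OR, one AND and one branch, which is small compared with the extra loads and stores the generic path spends on alignment handling.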
Jianfeng Tan [Tue, 17 Jan 2017 07:10:30 +0000 (07:10 +0000)]
examples/l3fwd-power: fix stop and close on signal
When the application is killed, the SIGINT signal handler does not stop
and close the device. In virtio's case, the vector assignment in KVM is
not deassigned.
This patch invokes dev_stop() and dev_close() in the signal handler.
Fixes: d7937e2e3d12 ("power: initial import")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:29 +0000 (07:10 +0000)]
examples/l3fwd-power: add --parse-ptype option
To support those devices that do not provide packet type info when
receiving packets, add a new option, --parse-ptype, to analyze
packet type in the Rx callback.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:28 +0000 (07:10 +0000)]
net/virtio: unmap queue/irq when closing
When closing virtio devices, close the eventfds and free the struct
that stores the queue/irq mapping.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:27 +0000 (07:10 +0000)]
net/virtio: unbind interrupt/eventfd when stopping
When virtio devices are stopped, tell the kernel to unbind the
mapping between interrupts and eventfds.
Note: this differs from other NICs, which also close the eventfds and
free the structure at this point. In virtio, those steps are done when
the device is closed, in the following patch.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 08:00:03 +0000 (08:00 +0000)]
net/virtio: setup Rx queue interrupts
This patch mainly allocates the structure that stores the queue/irq
mapping and configures that mapping through the PCI ops. It also creates
an eventfd for each Rx queue and tells the kernel about the
eventfd/interrupt binding.
Note: So far, the queue/irq mapping is hard-coded as 1:1 (each Rx queue
has one exclusive interrupt), like this:
    vec 0 -> config irq
    vec 1 -> rxq0
    vec 2 -> rxq1
    ...
which means the "vectors" option of QEMU should be configured with
a value >= N+1 (N is the number of queue pairs).
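The hard-coded mapping described above can be sketched as follows. This is an illustration only: the helper names rxq_to_vec() and min_qemu_vectors() are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Vector 0 is reserved for the config interrupt, as in the note above. */
#define CONFIG_IRQ_VEC 0u

/* Hypothetical helper: vector assigned to Rx queue `rxq` under the 1:1 mapping. */
static inline uint16_t
rxq_to_vec(uint16_t rxq)
{
	return (uint16_t)(rxq + 1);
}

/* Hypothetical helper: smallest usable value for QEMU's "vectors" option
 * when the device has `nb_queue_pairs` queue pairs (N + 1). */
static inline uint16_t
min_qemu_vectors(uint16_t nb_queue_pairs)
{
	assert(nb_queue_pairs > 0);
	return (uint16_t)(nb_queue_pairs + 1);
}
```

For example, a device with 4 queue pairs would need "vectors" set to at least 5 under this scheme.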
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:25 +0000 (07:10 +0000)]
net/virtio: add Rx interrupt enable/disable functions
This patch implements interrupt enable/disable functions for each
Rx queue. The flags of the avail ring are used as the hint that tells
the virtio device whether to interrupt the virtio driver.
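A minimal sketch of the idea, assuming a simplified avail ring header: struct avail_hdr and the two helper names are hypothetical, and the memory barriers a real driver needs are left out. Only VRING_AVAIL_F_NO_INTERRUPT and its value come from the virtio specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value defined by the virtio specification. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Simplified stand-in for the avail ring header (assumption: the real
 * ring also carries the array of descriptor indexes). */
struct avail_hdr {
	uint16_t flags;
	uint16_t idx;
};

/* Hint to the device that the driver does not want interrupts. */
static inline void
rxq_intr_disable(struct avail_hdr *avail)
{
	assert(avail != NULL);
	avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
}

/* Clear the hint so the device may interrupt the driver again. */
static inline void
rxq_intr_enable(struct avail_hdr *avail)
{
	assert(avail != NULL);
	avail->flags &= (uint16_t)~VRING_AVAIL_F_NO_INTERRUPT;
}
```

Because the flag is only a hint, a driver built this way should still tolerate an interrupt that arrives after it has been set.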
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:24 +0000 (07:10 +0000)]
net/virtio: add PCI operation for queue/irq binding
Add a handler in virtio_pci_ops to set the queue/irq binding.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:23 +0000 (07:10 +0000)]
net/virtio: add Rx descriptor check
Under interrupt mode, rx_descriptor_done is used as an indicator
for applications to check if some number of packets are ready to
be received.
This patch enables this by comparing the used ring's local consumed
index with the index shared with the backend.
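The comparison can be sketched as below. The struct and function names are hypothetical; the sketch only assumes that both indexes are free-running 16-bit counters, so that their difference gives the number of entries not yet consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of an Rx virtqueue for this sketch. */
struct rx_vq_sketch {
	uint16_t used_idx;      /* shared index, advanced by the backend */
	uint16_t used_cons_idx; /* local index, advanced by the driver   */
};

/* Number of used entries the driver has not consumed yet.
 * The 16-bit wraparound of the subtraction is intentional. */
static inline uint16_t
rx_nb_used(const struct rx_vq_sketch *vq)
{
	assert(vq != NULL);
	return (uint16_t)(vq->used_idx - vq->used_cons_idx);
}

/* Non-zero if the descriptor `offset` entries ahead is ready to be received. */
static inline int
rx_desc_done(const struct rx_vq_sketch *vq, uint16_t offset)
{
	return offset < rx_nb_used(vq);
}
```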
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:22 +0000 (07:10 +0000)]
net/virtio: invoke method directly for setting IRQ config
We would need to define a prototype for such a wrapper, which makes
things too complicated. Remove the wrapper and call set_config_irq
directly.
Suggested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 17 Jan 2017 07:10:21 +0000 (07:10 +0000)]
net/virtio: fix rewriting LSC flag
The LSC flag is set depending on whether the VIRTIO_NET_F_STATUS feature
is negotiated. Copying the PCI info after that decision overwrites
the correct result.
Fixes: 198ab33677c9 ("net/virtio: move device initialization in a function")
CC: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:40 +0000 (12:18 +0000)]
net/virtio-user: enable multiqueue with kernel vhost
With vhost kernel, enabling multiqueue requires a backend device in
the kernel that supports the multiqueue feature. Specifically, with tap
as the backend, as linux/Documentation/networking/tuntap.txt shows,
we check whether tap supports the IFF_MULTI_QUEUE feature.
For vhost kernel, each queue pair has its own vhost fd, with a tap
fd bound to that vhost fd. All tap fds are set with the same tap
interface name.
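A rough sketch of the check and of the per-queue-pair bookkeeping, under stated assumptions: tap_supports_mq(), struct qp_fds and qp_fds_init() are hypothetical names. The flag word is assumed to have been obtained from the tap device already (on Linux, the TUNGETFEATURES ioctl reports it); the ioctl call itself is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Flag value as defined in <linux/if_tun.h>; repeated here so that the
 * sketch builds without kernel headers. */
#ifndef IFF_MULTI_QUEUE
#define IFF_MULTI_QUEUE 0x0100
#endif

/* Non-zero if the tap feature word advertises multiqueue support. */
static inline int
tap_supports_mq(unsigned int features)
{
	return (features & IFF_MULTI_QUEUE) != 0;
}

/* Hypothetical bookkeeping: one vhost fd and one tap fd per queue pair. */
struct qp_fds {
	int vhostfd;
	int tapfd;
};

/* Mark every queue pair as "not opened yet". */
static inline void
qp_fds_init(struct qp_fds *fds, size_t nb_pairs)
{
	size_t i;

	assert(fds != NULL);
	for (i = 0; i < nb_pairs; i++) {
		fds[i].vhostfd = -1;
		fds[i].tapfd = -1;
	}
}
```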
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:39 +0000 (12:18 +0000)]
net/virtio-user: enable offloading
When used with the vhost kernel backend, we can offload in both directions.
- From vhost kernel to virtio_user, the offload is enabled so that
  the DPDK app can trust the flow is checksum-correct; and if the DPDK
  app sends it through another port, the checksum needs to be
  recalculated or offloaded. It also applies to TSO.
- From virtio_user to vhost_kernel, the offload is enabled so that
  the kernel can trust the flow is L4-checksum-correct, with no need to
  verify it; if the kernel will consume it, the DPDK app should make
  sure the L3 checksum is correctly set.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:38 +0000 (12:18 +0000)]
net/virtio-user: support kernel vhost
This patch adds support for vhost kernel as the backend for virtio_user.
Three main hook functions are added:
- vhost_kernel_setup() to open the char device; each vq pair needs one
  vhostfd;
- vhost_kernel_ioctl() to communicate control messages with the vhost
  kernel module;
- vhost_kernel_enable_queue_pair() to open the tap device and set it
  as the backend of the corresponding vhost fd (that is to say, vq pair).
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:37 +0000 (12:18 +0000)]
net/virtio-user: abstract backend operations
Add a struct virtio_user_backend_ops to abstract three kinds of backend
operations:
- setup, create the unix socket connection;
- send_request, sync messages with backend;
- enable_qp, enable some queue pair.
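The abstraction can be pictured as a table of function pointers. In the sketch below, only the struct name and the three member names come from the commit message; the argument lists, the fields of struct virtio_user_dev, the dummy backend and dev_start() are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct virtio_user_dev;

/* The three operations named above; the signatures are assumptions. */
struct virtio_user_backend_ops {
	int (*setup)(struct virtio_user_dev *dev);
	int (*send_request)(struct virtio_user_dev *dev, uint32_t req, void *arg);
	int (*enable_qp)(struct virtio_user_dev *dev, uint16_t pair_idx, int enable);
};

/* Hypothetical device: only the fields this sketch needs. */
struct virtio_user_dev {
	const struct virtio_user_backend_ops *ops;
	int setup_done;
	int qp_enabled;
};

/* A do-nothing backend, standing in for vhost-user or vhost-kernel. */
static int
dummy_setup(struct virtio_user_dev *dev)
{
	dev->setup_done = 1;
	return 0;
}

static int
dummy_send_request(struct virtio_user_dev *dev, uint32_t req, void *arg)
{
	(void)dev; (void)req; (void)arg;
	return 0;
}

static int
dummy_enable_qp(struct virtio_user_dev *dev, uint16_t pair_idx, int enable)
{
	(void)pair_idx;
	dev->qp_enabled = enable;
	return 0;
}

static const struct virtio_user_backend_ops dummy_ops = {
	.setup = dummy_setup,
	.send_request = dummy_send_request,
	.enable_qp = dummy_enable_qp,
};

/* Common code calls through the table and never names a backend. */
static inline int
dev_start(struct virtio_user_dev *dev)
{
	assert(dev != NULL && dev->ops != NULL);
	if (dev->ops->setup(dev) < 0)
		return -1;
	return dev->ops->enable_qp(dev, 0, 1);
}
```

With this shape, adding another backend means providing one more ops table rather than touching the common code.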
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:36 +0000 (12:18 +0000)]
net/virtio-user: move vhost-user specific code
To support vhost kernel as the backend of net_virtio_user in coming
patches, we move vhost_user specific structs and macros into
vhost_user.c, and only keep common definitions in vhost.h.
Besides, remove VHOST_USER_MQ feature check.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:35 +0000 (12:18 +0000)]
net/virtio-user: fix not properly reset device
virtio_user is not properly reset when users call vtpci_reset(),
as it ignores VIRTIO_CONFIG_STATUS_RESET status in
virtio_user_set_status().
This might lead to initialization failure, as it starts to re-init
the device before sending the RESET message to the backend. Besides,
the previous callfds and kickfds are not closed.
To fix it, we add support to disable virtqueues when it's set to
DRIVER OK status, and re-init fields in struct virtio_user_dev.
Fixes: e9efa4d93821 ("net/virtio-user: add new virtual PCI driver")
Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 13 Jan 2017 12:18:34 +0000 (12:18 +0000)]
net/virtio-user: fix wrongly get/set features
Before commit 86d59b21468a ("net/virtio: support LRO"), features
in the virtio PMD were decided and properly set at device initialization
and would not change. Afterward, features can be changed in
virtio_dev_configure() and are re-negotiated if they changed.
In virtio-user, the device features are obtained only once, at the
driver probe phase, but they were not stored. So the feature bits added
in re-negotiation fail.
To fix it, store the device features and use them for feature
negotiation at either the device initialization phase or the device
configure phase.
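A minimal sketch of the fix as described, assuming negotiation amounts to masking the driver's requested bits with the stored device bits. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device state for this sketch. */
struct vu_dev_sketch {
	uint64_t device_features; /* read from the backend once, then kept */
	uint64_t features;        /* result of the latest negotiation      */
};

/* Probe phase: remember what the backend offers. */
static inline void
vu_probe(struct vu_dev_sketch *dev, uint64_t backend_features)
{
	assert(dev != NULL);
	dev->device_features = backend_features;
	dev->features = 0;
}

/* Usable at init time and again at configure time, because the device
 * feature bits are still available. */
static inline uint64_t
vu_negotiate(struct vu_dev_sketch *dev, uint64_t driver_features)
{
	assert(dev != NULL);
	dev->features = driver_features & dev->device_features;
	return dev->features;
}
```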
Fixes: e9efa4d93821 ("net/virtio-user: add new virtual PCI driver")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 12 Jan 2017 05:37:00 +0000 (13:37 +0800)]
net/virtio: do not store PCI device pointer at shared memory
hw->dev, a pointer to pci_dev, was actually not used until the
refactoring that decoupled ethdev from the PCI device. This would break
multiple process support again, since "hw" is stored in shared memory
while "pci_dev" is not: the primary and secondary processes could
have different addresses for it, while just one value is allowed.
Thus we should not store it in "hw"; instead, we can retrieve
it from the "eth_dev->device" field.
Fixes: ae34410a8a8a ("ethdev: move info filling of PCI into drivers")
Fixes: eac901ce29be ("ethdev: decouple from PCI device")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 12 Jan 2017 05:31:57 +0000 (13:31 +0800)]
net/virtio: access interrupt handler directly
Since commit 0e1b45a284b4 ("ethdev: decouple interrupt handling from
PCI device"), intr_handle is stored in the eth_dev struct, so we can
use it directly. Thus there is no need to get it from hw.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Fri, 6 Jan 2017 10:16:19 +0000 (18:16 +0800)]
net/virtio: fix multiple process support
The introduction of virtio 1.0 support brought yet another set of ops.
Unfortunately, it was not handled correctly, which broke multiple
process support.
The issue is that the data/function pointers may differ between processes,
and the old code set them only once (for the primary process only). As a
result, the function pointer the secondary process saw was actually from
the primary process space. Accessing it would likely result in a crash.
Thanks to the previous patches, we are now able to maintain locally the
info that may vary among processes, meaning every process can have its
own copy of each item, with the correct value set. And this is what
this patch does:
- remap the PCI (IO port for legacy device and memory map for modern
  device)
- set vtpci_ops correctly
After that, multiple process would work like a charm. (At least, it
passed my fuzzy test)
Fixes: b8f04520ad71 ("virtio: use PCI ioport API")
Fixes: d5bbeefca826 ("virtio: introduce PCI implementation structure")
Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")
Cc: stable@dpdk.org
Reported-by: Juho Snellman <jsnell@iki.fi>
Reported-by: Yaron Illouz <yaroni@radcom.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Fri, 6 Jan 2017 10:16:18 +0000 (18:16 +0800)]
net/virtio: store IO port info locally
Like vtpci_ops, the rte_pci_ioport has to be stored in local memory. This
is because the rte_pci_device field is allocated from process-local
memory, not from shared memory.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Fri, 6 Jan 2017 10:16:17 +0000 (18:16 +0800)]
net/virtio: store PCI operators pointer locally
We used to store the vtpci_ops in the virtio_hw structure. The struct,
however, is stored in shared memory. That means only one value is
allowed. For the multiple process model, however, the address of
vtpci_ops should be different among different processes.
Take the virtio PMD as an example: the vtpci_ops is set by the primary
process, based on its own process space. If we access that address
from the secondary process, that would be an illegal memory access,
and a crash might then happen.
To make the multiple process model work, we need to store the vtpci_ops
in local memory and not in shared memory. This is what the patch
does: a local virtio_hw_internal array of size RTE_MAX_ETHPORTS is
allocated. This new structure is used to store all this kind of
info in non-shared memory. Currently, we have:
- vtpci_ops
- rte_pci_ioport
- virtio pci mapped memory, such as common_cfg.
The latter two will be done in coming patches. Later patches will also
set them correctly for the secondary process, so that the multiple
process model can work.
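The split between shared and process-local state can be sketched as follows. All names here are hypothetical stand-ins (MAX_PORTS_SKETCH for RTE_MAX_ETHPORTS, hw_internal_tbl for the virtio_hw_internal array); the sketch only assumes that the shared structure carries a port id that is valid in every process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS_SKETCH 32 /* stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical ops table; each process has its own copy at its own address. */
struct vtpci_ops_sketch {
	int (*get_status)(void);
};

/* Would live in shared memory: only data that is valid in every process. */
struct hw_shared_sketch {
	uint16_t port_id;
};

/* Process-local: pointers that are only meaningful in this process. */
struct hw_internal_sketch {
	const struct vtpci_ops_sketch *ops;
};

/* One slot per port, in ordinary (non-shared) process memory. */
static struct hw_internal_sketch hw_internal_tbl[MAX_PORTS_SKETCH];

/* Example ops instance belonging to the current process. */
static int
status_ok(void)
{
	return 0;
}

static const struct vtpci_ops_sketch local_ops = { status_ok };

/* Look up the ops for a port through the local table, never through
 * a pointer stored in shared memory. */
static inline const struct vtpci_ops_sketch *
hw_ops(const struct hw_shared_sketch *hw)
{
	assert(hw != NULL && hw->port_id < MAX_PORTS_SKETCH);
	return hw_internal_tbl[hw->port_id].ops;
}
```

Each process fills its own table slot at init time, so the shared structure never holds an address that is valid in only one process.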
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Fri, 6 Jan 2017 10:16:16 +0000 (18:16 +0800)]
net/virtio: fix wrong Rx/Tx method for secondary process
If the primary enables the vector Rx/Tx path, the current code would
let the secondary always choose the non-vector Rx/Tx path. This results
in an Rx/Tx method mismatch between the primary and secondary processes.
Weird errors may then happen, something like:
    PMD: virtio_xmit_pkts() tx: virtqueue_enqueue error: -14
Fix it by choosing the correct Rx/Tx callbacks for the secondary process.
That is, use the vector path if it's given.
Fixes: 8d8393fb1861 ("virtio: pick simple Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Mon, 9 Jan 2017 07:50:59 +0000 (15:50 +0800)]
ethdev: fix port data mismatched in multiple process model
Assume we have two virtio ports, 00:03.0 and 00:04.0. The first one is
managed by the kernel driver, while the latter one is managed by DPDK.
Now we start the primary process. 00:03.0 will be skipped by the DPDK
virtio PMD (since it's being used by the kernel). 00:04.0 will be
successfully initialized by the DPDK virtio PMD (if nothing abnormal
happens). After that, we get port id 0, and all the related info needed
by virtio (virtio_hw) is stored at rte_eth_dev_data[0].
Then we start the secondary process. As usual, 00:03.0 will be probed
first. It first tries to get a local eth_dev structure for it (by
rte_eth_dev_allocate):
    port_id = rte_eth_dev_find_free_port();
    ...
    eth_dev = &rte_eth_devices[port_id];
    eth_dev->data = &rte_eth_dev_data[port_id];
    ...
    return eth_dev;
Since it's the first PCI device, port_id will be 0. eth_dev->data would
then point to rte_eth_dev_data[0]. And here things start going wrong,
as rte_eth_dev_data[0] actually stores the virtio_hw for 00:04.0.
That said, in the secondary process, DPDK will continue to drive PCI
device 00:03.0 (despite the fact it's managed by the kernel), with
the info from PCI device 00:04.0. Which is wrong.
The fix is to attach the port already registered by the primary process.
That is, iterate over rte_eth_dev_data[] and get the port id whose PCI
ID matches the current PCI device.
This lets us maintain the same port ID for the same PCI device, keeping
the chance of referencing the wrong data minimal.
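The lookup described above can be sketched as a linear search over the shared per-port data. This is not the ethdev code: struct port_data_sketch and find_port_by_pci() are hypothetical, and the PCI address is kept as a plain string for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PORTS_SKETCH 4

/* Hypothetical shared per-port record; an empty string marks a free slot. */
struct port_data_sketch {
	char pci_addr[16];
};

/* Return the port id already registered for `pci_addr`, or -1 if the
 * primary process did not register that device. */
static inline int
find_port_by_pci(const struct port_data_sketch *ports, int nb_ports,
		 const char *pci_addr)
{
	int i;

	assert(ports != NULL && pci_addr != NULL);
	for (i = 0; i < nb_ports; i++) {
		if (strcmp(ports[i].pci_addr, pci_addr) == 0)
			return i;
	}
	return -1;
}
```

In the scenario above, a secondary process probing 00:03.0 would get -1 and skip the device, while 00:04.0 would resolve to port 0, the same id the primary process uses.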
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>