Ouyang Changchun [Mon, 13 Oct 2014 03:46:12 +0000 (11:46 +0800)]
virtio: increase max Rx packet length
Since commit 13ce5e7eb94 ("virtio: mergeable buffers"),
this driver has the capability of receiving and transmitting jumbo frame.
So update max Rx packet length.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com> Tested-by: Jingguo Fu <jingguox.fu@intel.com>
At least on kernels 3.15 or newer, wrong compiler flags are set when building
kernel modules.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Linux kernel build system requires V=1 to enable verbose output, but
current DPDK framework just check if V is defined.
Fix: force V=1 when building Linux kernel modules if verbose output is
enabled.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Ouyang Changchun [Wed, 15 Oct 2014 03:11:00 +0000 (11:11 +0800)]
virtio: fix needed vring entry number
Fix one issue in virtio TX: it needs one more vring descriptor to hold the virtio
header when transmitting packets, it is used later to determine whether to free
more entries from used vring.
It fixes failing to transmit any packet with 1 segment in the circumstance of only
1 descriptor in the vring free list.
Huawei Xie [Wed, 8 Oct 2014 18:54:58 +0000 (02:54 +0800)]
vhost: comment identified issues
1) FIXME: concurrent calls to vhost set mem table from different guests
could cause mem_temp to be overrided.
2) TODO: cmpset cost quite some cpu cyles. Allow app to disable this
feature if there is no contention in real workload.
3) FIXME: fix scatter gather mbuf copy to vhost vring chained buffers.
Huawei Xie [Wed, 8 Oct 2014 18:54:53 +0000 (02:54 +0800)]
vhost: supported features
VHOST_SUPPORTED_FEATURES is the feature mask that vhost lib supports.
VHOST_FEATURES is the feature mask vhost currently supports after some features are turned on/off.
Huawei Xie [Wed, 8 Oct 2014 18:54:50 +0000 (02:54 +0800)]
vhost: rename ops registering function
Rename init_virtio_net as rte_vhost_callback_register API.
rte_vhost_callback_register register the callbacks called when a
vhost device is created and ready to be added to data processing core
or is de-actived by guest.
Huawei Xie [Wed, 8 Oct 2014 18:54:48 +0000 (02:54 +0800)]
vhost: get internal ops when registering
vhost_net_device_ops is internal implementation in vhost lib.
register_cuse_device will be vhost driver register API.
There is no need for it to know the internal vhost ops.
Instead, that ops is retrieved in register_cuse_device
through get_virtio_net_callbacks.
Huawei Xie [Wed, 8 Oct 2014 18:54:45 +0000 (02:54 +0800)]
vhost: enqueue/dequeue burst
rte_vhost_enqueue_burst copies host packets to guest.
rte_vhost_enqueue_burst will call virtio_dev_rx and virtio_dev_merge_rx
respectively depending on whether merge-able feature is negotiated or not
in the vhost device.
virtio_dev_merge_tx is renamed to rte_vhost_dequeue_burst.
rte_vhost_dequeue_burst gets to-be-sent packets from guest.
Huawei Xie [Wed, 8 Oct 2014 18:54:41 +0000 (02:54 +0800)]
vhost: return packets to upper layer
This patch makes virtio_dev_merge_tx return the received packets to app layer.
Previously virtio_tx_route was called to route these packets and then free them.
Huawei Xie [Wed, 8 Oct 2014 18:54:46 +0000 (02:54 +0800)]
vhost: move internal structure
The structure virtio_net_config_ll is moved to virtio_net.c.
It is related to internal virtio device management,
so it should not be exposed to other files.
Huawei Xie [Wed, 8 Oct 2014 18:54:39 +0000 (02:54 +0800)]
vhost: remove zero copy memory region generation logic
Currently zero copy feature isn't generic as it couples closely with nic.
It isn't put in the vhost lib in this version.
gpa(guest physical address) to hpa(host physical address) mapping region
logic is removed.
Huawei Xie [Wed, 8 Oct 2014 18:54:38 +0000 (02:54 +0800)]
vhost: remove switching related logics
The following logics will be moved to vhost example:
1. mac learning, which is used to learn the mac address from the first
transmitted packet of guest and bind the vhost device to a queue in a
pool of VMDQ.
2. VMDQ mac/vlan filter: Each pool the vhost device is bind to is
assigned a mac/vlan filter.
3. num_devices is used to specify the maximum vhost devices the nic supports.
Huawei Xie [Wed, 8 Oct 2014 18:54:37 +0000 (02:54 +0800)]
vhost: remove useless code for Rx/Tx
Remove all other codes and only keep virtio_dev_rx, copy_from_mbuf_to_vring,
virtio_dev_merge_rx, virtio_dev_merge_tx.
Previous vhost merge-able feature introduces another version of tx function,
virtio_dev_merge_tx. Actually it is not related to merge-able feature but is
the fix for memcpy between mbuf and vring descriptors.
This lib will create the tx functions based on virtio_dev_merge_tx.
Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: do not remove code used or moved later]
Huawei Xie [Wed, 8 Oct 2014 18:54:35 +0000 (02:54 +0800)]
vhost: move from examples to dedicated library
Those files will be refactored in subsequent patches to form user space
vhost library.
Makefile and main.h are removed.
main.c is renamed to vhost_rxtx.c and will provide vring enqueue/dequeue API.
virtio-net.h is renamed to rte_virtio_net.h which is the API header file.
Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove from examples Makefile and merge file renaming]
Pablo de Lara [Wed, 1 Oct 2014 09:49:05 +0000 (10:49 +0100)]
examples: use factorized default Rx/Tx configuration
For apps that were using default rte_eth_rxconf and rte_eth_txconf
structures, these have been removed and now they are obtained by
calling rte_eth_dev_info_get, just before setting up RX/TX queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: David Marchand <david.marchand@6wind.com>
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
i40e: set default Rx/Tx configuration
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info get, and still any parameters may be tweaked as wished,
before setting up queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
ixgbe: set default Rx/Tx configuration
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info get, and still any parameters may be tweaked as wished,
before setting up queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
igb: set default Rx/Tx configuration
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info get, and still any parameters may be tweaked as wished,
before setting up queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
ethdev: get default Rx/Tx configuration from dev info
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info get, and still any parameters may be tweaked as wished,
before setting up queues.
Besides, if a NULL pointer is passed to rte_eth_rx_queue_setup or
rte_eth_tx_queue_setup, these functions get internally the default RX/TX
configuration for the user.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:03 +0000 (10:49 +0100)]
ethdev: reset whole dev info structure before filling
To guarantee that RX/TX configuration structures are reseted
before modifying them, plus the other dev info fields,
dev info structure is zeroed beforehand.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: David Marchand <david.marchand@6wind.com>
app/testpmd: print message if queue start/stop is not supported
Print an error message to the user when trying to start/stop a rx/tx queue and
this function is not supported by the PMD driver. The patch does not check if
the return value is -EINVAL because testpmd is already validating the port and
queue id.
Signed-off-by: Nicolás Pernas Maradei <nico@emutex.com> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
librte_pmd_pcap driver was opening the pcap/interfaces only at init time and
closing them only when the port was being stopped. This behaviour would cause
problems (leading to segfault) if the user closed the port 2 times. The first
time the pcap/interfaces would be normally closed but libpcap would throw an
error causing a segfault if the closed pcaps/interfaces were closed again.
This behaviour is solved by re-openning pcaps/interfaces when the port is
started (only if these weren't open already for example at init time).
Signed-off-by: Nicolás Pernas Maradei <nico@emutex.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>
Pablo de Lara [Wed, 1 Oct 2014 22:27:25 +0000 (23:27 +0100)]
ixgbe: fix build with bypass enabled
Since commit aae1047905621 ("use the right debug macro"),
DEBUGOUT was replaced by PMD_DRV_LOG which requires at least
2 arguments. But the level argument was missing.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch takes the existing TX flags defined for the mbuf and shifts
each uniquely defined one left so that additional RX flags can be
defined without having RX and TX flags mixed together. Under the new
scheme, RX flags start at bit 0 and work left, TX flags start at bit 55
and work right, and bits 56-63 are reserved for generic mbuf use, not
for offloads.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:15 +0000 (12:08 +0100)]
app/testpmd: change rxfreet default to 32
To improve performance by using bulk alloc or vectored RX routines, we
need to set rx free threshold (rxfreet) value to 32, so make this the
testpmd default.
Thirty-two is the minimum setting needed to enable either the
bulk alloc or vector RX routines inside the ixgbe driver, so it's
best made the default for that reason. Please see
"check_rx_burst_bulk_alloc_preconditions()" in ixgbe_rxtx.c, and
RX function assignment logic in "ixgbe_dev_rx_queue_setup()" in
the same file.
The difference in IO performance for testpmd when called without any
optional parameters, and using 10G NICs using the ixgbe driver, can be
significant - approx 25% or more.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:14 +0000 (12:08 +0100)]
ixgbe: add prefetch to improve slow-path tx perf
Make a small improvement to slow path TX performance by adding in a
prefetch for the second mbuf cache line.
Also move assignment of l2/l3 length values only when needed.
What I've done with the prefetches is two-fold:
1) changed it from prefetching the mbuf (first cache line) to prefetching
the mbuf pool pointer (second cache line) so that when we go to access
the pool pointer to free transmitted mbufs we don't get a cache miss. When
clearing the ring and freeing mbufs, the pool pointer is the only mbuf
field used, so we don't need that first cache line.
2) changed the code to prefetch earlier - in effect to prefetch one mbuf
ahead. The original code prefetched the mbuf to be freed as soon as it
started processing the mbuf to replace it. Instead now, every time we
calculate what the next mbuf position is going to be we prefetch the mbuf
in that position (i.e. the mbuf pool pointer we are going to free the mbuf
to), even while we are still updating the previous mbuf slot on the ring.
This gives the prefetch much more time to resolve and get the data we need
in the cache before we need it.
In terms of performance difference, a quick sanity test using testpmd
on a Xeon (Sandy Bridge uarch) platform showed performance increases
between approx 8-18%, depending on the particular RX path used in
conjuntion with this TX path code.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:17 +0000 (12:08 +0100)]
mbuf: switch vlan_tci and reserved2 fields
Move the vlan_tci field up by two bytes in the mbuf data structure. This
has two effects:
* Ensures the the ixgbe vector driver places the vlan tag in the correct
place in the mbuf.
* Allows a second vlan tag field, if one is added in the future, to be
placed after the existing vlan field, rather than before.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:16 +0000 (12:08 +0100)]
mbuf: add userdata pointer field
While some applications may store metadata about packets in the packet
mbuf headroom, this is not a workable solution for packet metadata which
is either:
* larger than the headroom (or headroom is needed for adding pkt headers)
* needs to be shared or copied among packets
To support these use cases in applications, we reserve a general
"userdata" pointer field inside the second cache-line of the mbuf. This
is better than having the application store the pointer to the external
metadata in the packet headroom, as it saves an additional cache-line
from being used.
Apart from storing metadata, this field also provides a general 8-byte
scratch space inside the mbuf for any other application uses that are
applicable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:13 +0000 (12:08 +0100)]
mbuf: ensure next pointer is set to null on free
The receive functions for packets do not modify the next pointer so
the next pointer should always be cleared on mbuf free, just in case.
The slow-path TX needs to clear it, and the standard mbuf free function
also needs to clear it. Fast path TX does not handle chained mbufs so
is unaffected
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Overloading the 'msg_size' field in the 'arq_event_info' struct
is a bad idea. It leads to bugs when the structure is used in a
loop, since the input value (buffer size) is overwritten by the
output value (actual message length). The fix introduces one
more field of 'buf_len' for the buffer size, and renames the
field of 'msg_size' to 'msg_len' for the real message size.
The firmware api request of writes to hardware registers should be
exposed to driver. The new API of 'i40e_aq_debug_write_register'
is introduced for that.
There are variables that represent values in little endian.
Adding prefix of '__Le' can remove warnings during sparse
checks. In addition, remove some unreachable 'break' statements,
and add 'UL' on a couple of constants.
The code wrapped in '#ifdef I40E_TPH_SUPPORT' was added
to check if 'TPH' (TLP Processing Hints) is supported, and enable it.
It is not used currently and can be removed.
The code wrapped in '#ifdef PREBOOT_SUPPORT' was added for
queue context initialization specifically for A0 silicon.
As A0 silicon has gone for a long time, the code should be
removed at all. In addition, the checks of 'QV_RELEASE'
and 'PREBOOT_SUPPORT' are also not needed anymore and can
be removed.
'nvmupdate' is intended to support the userland NVMUpdate tool for
Fortville eeprom. These code changes is to remove the conditional
compile macro, and support those by default. In addition, renaming
all 'errno' to avoid any compile warning or error.
In share code, 'tab' is used to align values rather than 'space'.
The changes in i40e_adminq_cmd.h is to make the indentation more
consistent in share code.
Add new file to support controller X550, therefore update the Makefile
and README file. It also updates the API functions, DCB related functions,
mailbox related functions, etc to support X550.
In addition, some new macros used by X550 are added.
- Store lan_id and physical semaphore mask into hardware physical information,
and use them to control read and write physical registers in IXGBE base code.
- Extend mask from 16 bits to 32 bits for releasing or acquiring SWFW semaphore
in IXGBE base code. It is used in reading and writing I2C byte.
Daniel Mrzyglod [Tue, 30 Sep 2014 12:10:25 +0000 (13:10 +0100)]
kni: fix build on Ubuntu 12.04.5
Recent Ubuntu 12.04.5 LTS is shipped with 3.13.0-36.63 as the only
supported kernel.
So skb_set_hash has been backported and is conflicting with kni kcompat one.
Commit a09b359daca ("fix build on Ubuntu 14.04") describes the initial problem.
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
[Thomas: reorder conditions to ease reading] Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>