82599 has two loopback operation modes, Tx->Rx and Rx->Tx.
For the time being only Tx->Rx is supported.
The new field lpbk_mode added in struct rte_eth_conf defines loopback
operation mode for certain ethernet controller. By default the value
of lpbk_mode is 0, meaning loopback mode disabled.
Since each ethernet controller has its own definition of loopback modes,
API user has to check both datasheet and implementation of certain driver
so as to understand what are valid values to be set, and what are the
expected behaviors.
Check IXGBE_LPBK_82599_XXX which are defined in ixgbe_ethdev.h
for valid values of 82599 loopback mode.
igb: restore workaround errata with wthresh on 82576
The 82576 has known issues which require the write threshold to be set to 1.
See:
http://download.intel.com/design/network/specupdt/82576_SPECUPDATE.pdf
If not then single packets will hang in transmit ring until more arrive.
Simple tests like ping will fail.
The workaround was in the wrong file (commit a30ebfbb8c3a).
Move it in igb one to restore original patch (7e9e49feea).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Maxime Leroy [Mon, 2 Sep 2013 15:14:10 +0000 (17:14 +0200)]
igb/ixgbe: allow RSS with only one Rx queue
It should be possible to enable RSS with one Rx queue.
RSS hash can be useful independently of the number of Rx queues.
Applications can use RSS hash to identify different IP flows.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com> Acked-by: Ivan Boule <ivan.boule@6wind.com>
Maxime Leroy [Mon, 2 Sep 2013 15:07:55 +0000 (17:07 +0200)]
igb/ixgbe: ETH_MQ_RX_NONE should disable RSS
As explained in rte_ethdev.h, ETH_MQ_RX_NONE allows to not choose RSS, DCB
or VMDQ mode.
But the igb/ixgbe code always silently select the RSS mode with ETH_MQ_RX_NONE.
This patch fixes this incoherence between the API and the implementation.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com> Acked-by: Ivan Boule <ivan.boule@6wind.com>
Bruce Richardson [Mon, 10 Feb 2014 15:27:26 +0000 (15:27 +0000)]
vmxnet3: import new vmxnet3 poll mode driver implementation
Poll Mode Driver for Paravirtual VMXNET3 NIC.
As a PMD, the VMXNET3 driver provides the packet reception and transmission
callbacks, vmxnet3_recv_pkts and vmxnet3_xmit_pkts. It does not support
scattered packet reception as part of vmxnet3_recv_pkts and
vmxnet3_xmit_pkts. Also, it does not support scattered packet reception as part of
the device operations supported.
The VMXNET3 PMD handles all the packet buffer memory allocation and resides in
guest address space and it is solely responsible to free that memory when not needed.
The packet buffers and features to be supported are made available to hypervisor via
VMXNET3 PCI configuration space BARs. During RX/TX, the packet buffers are
exchanged by their GPAs, and the hypervisor loads the buffers with packets in the RX
case and sends packets to vSwitch in the TX case.
The VMXNET3 PMD is compiled with vmxnet3 device headers. The interface is similar
to that of the other PMDs available in the Intel(R) DPDK API. The driver pre-allocates the
packet buffers and loads the command ring descriptors in advance. The hypervisor fills
those packet buffers on packet arrival and write completion ring descriptors, which are
eventually pulled by the PMD. After reception, the Intel(R) DPDK application frees the
descriptors and loads new packet buffers for the coming packets. The interrupts are
disabled and there is no notification required. This keeps performance up on the RX
side, even though the device provides a notification feature.
In the transmit routine, the Intel(R) DPDK application fills packet buffer pointers in the
descriptors of the command ring and notifies the hypervisor. In response the hypervisor
takes packets and passes them to the vSwitch. It writes into the completion descriptors
ring. The rings are read by the PMD in the next transmit routine call and the buffers
and descriptors are freed from memory.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Tue, 11 Feb 2014 12:25:54 +0000 (12:25 +0000)]
kni: update kernel driver ethtool baseline
Update the KNI kernel driver so it can compile on more modern kernels
Also, rebaseline the ethtool support off updated igb kernel drivers
so that we get the latest bug fixes and device support.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 12 Feb 2014 15:50:11 +0000 (15:50 +0000)]
xen: import xenvirt pmd and vhost_xen
This provides a para-virtualization packet switching solution, based on the
Xen hypervisor’s Grant Table, which provides simple and fast packet
switching capability between guest domains and host domain based on
MAC address or VLAN tag.
This solution is comprised of two components; a Poll Mode Driver (PMD)
as the front end in the guest domain and a switching back end in the
host domain. XenStore is used to exchange configure information
between the PMD front end and switching back end,
including grant reference IDs for shared Virtio RX/TX rings, MAC
address, device state, and so on.
The front end PMD can be found in the Intel DPDK directory lib/
librte_pmd_xenvirt and back end example in examples/vhost_xen.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 12 Feb 2014 15:32:25 +0000 (15:32 +0000)]
xen: core library changes
Core support for using the Intel DPDK with Xen Dom0 - including EAL
changes and mempool changes. These changes encompass how memory mapping
is done, including support for initializing a memory pool inside an
already-allocated block of memory.
KNI sample app updated to use KNI close function when used with Xen.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Tue, 11 Feb 2014 10:28:51 +0000 (10:28 +0000)]
ivshmem: library changes for mmaping using ivshmem
These library changes provide a new Intel DPDK feature for communicating
with virtual machines using QEMU's IVSHMEM mechanism.
The feature works by providing a command line for QEMU to map several hugepages
into a single IVSHMEM device. For the guest to know what is inside any given IVSHMEM
device (and to distinguish between Intel(R) DPDK and non-Intel(R) DPDK IVSHMEM
devices), a metadata file is also mapped into the IVSHMEM segment. No work needs to
be done by the guest application to map IVSHMEM devices into memory; they are
automatically recognized by the Intel(R) DPDK Environment Abstraction Layer (EAL).
Changes in this patch:
* Changes to EAL to allow mapping of all hugepages in a memseg into a single file
* Changes to EAL to allow ivshmem devices to be transparently mapped in
the process running on the guest.
* New ivshmem library to create and manage metadata exported to guest VM's
* New ivshmem compilation targets
* Mempool and ring changes to allow export of structures to a VM and allow
a VM to attach to those structures.
* New autotests to unit tests this functionality.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Tue, 11 Feb 2014 16:24:25 +0000 (16:24 +0000)]
mem: add bounded reserve function
For certain functionality, e.g. Xen Dom0 support, it is required that
we can guarantee that memzones for descriptor rings won't cross 2M
boundaries. So add new memzone reserve function where we can pass in a
boundary condition parameter.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 12 Feb 2014 13:38:45 +0000 (13:38 +0000)]
mem: allow virtual memory address hinting
For multi-process applications, it can sometimes occur that part of the
address ranges used for memory mapping in the primary process are not
free in the secondary process, which causes the secondary processes to
abort on startup.
This patch adds in a memory hinting mechanism, where you can hint a
starting base address to the primary process for where you would like
the hugepage memory to be mapped. It is just a hint, so the memory will
not always go exactly where requested, but it should allow the memory
addresses used by a primary process to be adjusted up or down a little,
thereby fixing issues with secondary process startup.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Mon, 17 Feb 2014 19:15:19 +0000 (20:15 +0100)]
mbuf: rework check on mbuf freeing
Allow poll-mode drivers to maintain their own caches of mbufs, by allowing them
to check if it's ok to free an mbuf (to their local cache) without actually
freeing it back to the memory pool itself.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@6wind.com>
Bruce Richardson [Tue, 11 Feb 2014 10:08:43 +0000 (10:08 +0000)]
eal: new common macros added
Added the following new macros/inline functions, which are both
generally useful and needed for later functionality:
* rte_align64pow2: aligns a 64bit parameter to next power of 2
* RTE_LEN2MASK: create mask of type <tp> with the first <ln> bits
* RTE_DIM: return the number of elements in an array.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Mon, 10 Feb 2014 14:43:44 +0000 (14:43 +0000)]
eal: add rte_compiler_barrier() macro
The rte_ring functions used a compiler barrier to stop the compiler
reordering certain expressions. This is generally useful so is moved
to the common header file with the other barriers.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Mon, 17 Feb 2014 12:46:06 +0000 (13:46 +0100)]
mk: rework cpu flags detection
For cases where the compilation microarchitecture is explicitly given, we
extract the cpu-flags to use from the compiler rather than hard-coding. This
means that we will only ever use instruction sets supported by the compiler,
rather than having a case where the uarch and the Intel DPDK both support a
given instruction-set, but the compiler does not.
In the case where 'native' uarch support is requested, the same mechanism is
also used to detect the instruction-sets supported
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@6wind.com>
Thomas Monjalon [Wed, 23 Oct 2013 09:40:56 +0000 (11:40 +0200)]
config: fix combined/shared lib
- Configuration for combined and shared library was only in the template
defconfig_x86_64-default-linuxapp-gcc.
- CONFIG_RTE_LIBNAME was in the wrong section
- RTE_LIBNAME had no quote in "C context" (include/rte_config.h)
- and then CONFIG_RTE_LIBNAME quotes were not properly removed in "make context"
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Bruce Richardson [Mon, 10 Feb 2014 11:49:10 +0000 (11:49 +0000)]
add FreeBSD support
Changes to allow compilation and use on FreeBSD. Includes:
* contigmem and nic_uio driver for FreeBSD
* new EAL instance
* new "bsdapp" compilation target
* various compilation fixes due to differences between linux and freebsd
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
A static list of 64 mbufs was being reused in Rx function.
This caused two errors:
1) If more than 64 buffers were requested in a single burst,
only the last 64 buffers are returned, the others are lost.
2) Application will free the mbuf being returned, but the receive
function will reuse the buffer anyway. If some other allocation
is done, there is suddenly multiple writers for the same mbuf.
It is fixed by allocating mbuf on demand.
In the same time, some length errors are fixed.
Reported-by: Mats Liljegren <mats.liljegren@enea.com> Reported-by: Robert Sanford <rsanford@prolexic.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Intel [Wed, 4 Dec 2013 09:00:00 +0000 (10:00 +0100)]
ixgbe: residual fix about resetting big Tx queues
Index overflow when resetting big queues was partially fixed in bcf457f8c0d64a5c (ixgbe: fix index overflow when resetting big Tx queues)
and better fixed in e8ae856140bce4e4 (igb/ixgbe: fix index overflow when resetting big queues)
But this version (1.5.2r0) has residues of the initial fix from 1.5.1r0.
Thomas Monjalon [Fri, 15 Nov 2013 11:08:10 +0000 (12:08 +0100)]
igb/ixgbe: fix index overflow when resetting big queues
Rings are resetted with a loop because memset cannot be used without
issuing a warning about volatile casting.
The index of the loop was a 16-bit variable which is sufficient for
ring entries number but not for the byte size of the whole ring.
The overflow happens when rings are configured for 4096 entries
(descriptor size is 16 bytes). The result is an endless loop.
It is fixed by indexing ring entries and resetting all bytes of the entry
with a simple assignment.
The descriptor initializer is zeroed thanks to its static declaration.
There already was a fix for ixgbe Tx only
(commit bcf457f8c0d64a5cb094fd55836b324bddb930b6).
It is reverted to use the same fix everywhere (Rx/Tx for igb/ixgbe).
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com> Acked-by: Ivan Boule <ivan.boule@6wind.com>
Intel [Fri, 8 Nov 2013 02:00:00 +0000 (03:00 +0100)]
app/cmdline_test: fix build without app/test
This application is built if LIBRTE_CMDLINE is enabled.
But there was no enabled source file if APP_TEST is disabled.
Let's consider that CONFIG_RTE_APP_TEST apply only on app/test.
Intel [Fri, 8 Nov 2013 02:00:00 +0000 (03:00 +0100)]
ixgbe: fix index overflow when resetting big Tx queues
The index of the loop was a 16-bit variable which is sufficient for
ring entries number but not for the byte size of the whole ring.
The overflow happens when queue rings are configured for 4096 entries
(descriptor size is 16 bytes). The result is an endless loop.
Intel [Fri, 8 Nov 2013 02:00:00 +0000 (03:00 +0100)]
ethdev: add bypass logic
An overview of bypass logic can be seen in this document:
http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/ethernet-server-bypass-adapter-x520-x540-family-brief.pdf