Jijiang Liu [Thu, 23 Oct 2014 13:18:57 +0000 (21:18 +0800)]
i40e: VXLAN filter
The filter types supported are listed below for VXLAN:
1. Inner MAC and Inner VLAN ID.
2. Inner MAC address, inner VLAN ID and tenant ID.
3. Inner MAC and tenant ID.
4. Inner MAC address.
5. Outer MAC address, tenant ID and inner MAC address.
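The five combinations above can be pictured as bit flags, one per matched field. This is only a minimal sketch: the flag names and the helper are hypothetical and are not the definitions added to ethdev or the i40e driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags, one per field a tunnel filter can match on. */
enum {
	FILTER_OUTER_MAC  = 1 << 0,
	FILTER_TENANT_ID  = 1 << 1,
	FILTER_INNER_MAC  = 1 << 2,
	FILTER_INNER_VLAN = 1 << 3
};

/* The five combinations listed in the commit message, in the same order. */
static const uint8_t supported_filters[] = {
	FILTER_INNER_MAC | FILTER_INNER_VLAN,
	FILTER_INNER_MAC | FILTER_INNER_VLAN | FILTER_TENANT_ID,
	FILTER_INNER_MAC | FILTER_TENANT_ID,
	FILTER_INNER_MAC,
	FILTER_OUTER_MAC | FILTER_TENANT_ID | FILTER_INNER_MAC
};

/* Return 1 if the requested combination is one of the supported ones. */
static int filter_type_supported(uint8_t type)
{
	size_t i;
	size_t n = sizeof(supported_filters) / sizeof(supported_filters[0]);

	for (i = 0; i < n; i++)
		if (supported_filters[i] == type)
			return 1;
	return 0;
}
```

A caller would reject any combination not in the table, for example an outer MAC match on its own.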
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Jijiang Liu [Thu, 23 Oct 2014 13:18:56 +0000 (21:18 +0800)]
ethdev: tunnel filter
Add definitions of the data structures of tunneling packet filter.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Jijiang Liu [Thu, 23 Oct 2014 13:18:55 +0000 (21:18 +0800)]
app/testpmd: VXLAN packet identification
Add two commands to test VXLAN packet identification.
The test steps are as follows:
1> use commands to add/delete VxLAN UDP port.
2> use rxonly mode to receive VxLAN packet.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Jijiang Liu [Thu, 23 Oct 2014 13:18:54 +0000 (21:18 +0800)]
i40e: VXLAN packet identification
Implement the configuration API of VXLAN destination UDP port,
and add new Rx offload flags for supporting VXLAN packet offload.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Jijiang Liu [Thu, 23 Oct 2014 13:18:53 +0000 (21:18 +0800)]
ethdev: UDP tunnels
Add two functions to support UDP tunneling port configuration.
There are "some" destination UDP port numbers that have unique meaning.
In terms of VxLAN, "IANA has assigned the value 4789 for the VXLAN UDP port,
and this value SHOULD be used by default as the destination UDP port.
Some early implementations of VXLAN have used other values for the destination
port. To enable interoperability with these implementations, the destination
port SHOULD be configurable."
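A configurable destination port amounts to a small table of UDP ports that the device treats as tunnel ports. The sketch below assumes such a table; the structure and function names are hypothetical and do not reproduce the two ethdev functions added by this commit.

```c
#include <stdint.h>

#define VXLAN_DEFAULT_PORT 4789 /* IANA-assigned VXLAN UDP port */
#define MAX_TUNNEL_PORTS   16   /* hypothetical table size */

struct tunnel_port_table {
	uint16_t ports[MAX_TUNNEL_PORTS];
	unsigned int count;
};

/* Return the index of the port in the table, or -1 if it is absent. */
static int tunnel_port_find(const struct tunnel_port_table *t, uint16_t port)
{
	unsigned int i;

	for (i = 0; i < t->count; i++)
		if (t->ports[i] == port)
			return (int)i;
	return -1;
}

/* Add a port; adding an existing port is a no-op. Returns -1 when full. */
static int tunnel_port_add(struct tunnel_port_table *t, uint16_t port)
{
	if (tunnel_port_find(t, port) >= 0)
		return 0;
	if (t->count == MAX_TUNNEL_PORTS)
		return -1;
	t->ports[t->count++] = port;
	return 0;
}

/* Delete a port by moving the last entry into its slot. */
static int tunnel_port_del(struct tunnel_port_table *t, uint16_t port)
{
	int i = tunnel_port_find(t, port);

	if (i < 0)
		return -1;
	t->ports[i] = t->ports[--t->count];
	return 0;
}
```

An application would typically add VXLAN_DEFAULT_PORT first and then any non-standard port needed for interoperability.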
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Jijiang Liu [Thu, 23 Oct 2014 13:18:52 +0000 (21:18 +0800)]
ether: add VXLAN header
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Jijiang Liu [Thu, 23 Oct 2014 13:18:51 +0000 (21:18 +0800)]
mbuf: add fields for tunnels
Replace the "reserved2" field with the "packet_type" field
and add the "inner_l2_l3_len" field in the rte_mbuf structure.
The "packet_type" field is used to indicate ordinary packet format and also
tunneling packet format such as IP in IP, IP in GRE, MAC in GRE and MAC in UDP.
The "inner_l2_len" and the "inner_l3_len" fields are added
in the second cache line; together they use 2 bytes for TX offloading of tunnels.
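Sharing 2 bytes between the two inner lengths can be sketched with shifts and masks. The 7-bit/9-bit split below is an assumption made for illustration, and the helper names are hypothetical; the commit message only states that the two fields together occupy 2 bytes.

```c
#include <stdint.h>

/* Assumed split: low 9 bits for the inner L3 length, high 7 bits for L2. */
#define INNER_L3_BITS 9
#define INNER_L3_MASK ((1u << INNER_L3_BITS) - 1)
#define INNER_L2_MASK ((1u << (16 - INNER_L3_BITS)) - 1)

/* Pack both lengths into one 16-bit value; out-of-range bits are dropped. */
static uint16_t pack_inner_l2_l3_len(uint16_t l2_len, uint16_t l3_len)
{
	return (uint16_t)(((l2_len & INNER_L2_MASK) << INNER_L3_BITS) |
			  (l3_len & INNER_L3_MASK));
}

static uint16_t unpack_inner_l2_len(uint16_t packed)
{
	return (uint16_t)(packed >> INNER_L3_BITS);
}

static uint16_t unpack_inner_l3_len(uint16_t packed)
{
	return (uint16_t)(packed & INNER_L3_MASK);
}
```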
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bernard Iremonger [Thu, 16 Oct 2014 13:27:42 +0000 (13:27 +0000)]
doc: getting started guide for linux
The 1.7 DPDK_Linux_GSG document in MSWord has been converted to rst format for
use with Sphinx. There is an rst file for each chapter and an index.rst file
which contains the table of contents.
This is the first document from a set of documents.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:26 +0000 (12:38 +0800)]
examples/vhost: add new example based on lib
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: clean makefile and add in examples/Makefile]
Huawei Xie [Mon, 20 Oct 2014 04:38:23 +0000 (12:38 +0800)]
examples/vhost: minor fixes
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:24 +0000 (12:38 +0800)]
examples/vhost: add branch hints
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Thu, 23 Oct 2014 09:24:26 +0000 (11:24 +0200)]
examples/vhost: disable guest notifications
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Thu, 23 Oct 2014 09:21:13 +0000 (11:21 +0200)]
examples/vhost: mergeable buffer option
Mergeable feature doesn't work with latest mbuf change.
Disabling IXGBE_INC_VECTOR is a temporary workaround.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:19 +0000 (12:38 +0800)]
examples/vhost: adapt Tx routing to lib
The packet passed to virtio_tx_route already has an mbuf allocated,
so there is no need to allocate another mbuf for it.
Use vlan offload to transmit vlan tagged packet.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove useless mbuf pool]
Huawei Xie [Mon, 20 Oct 2014 04:38:18 +0000 (12:38 +0800)]
examples/vhost: use burst enqueue and dequeue from lib
In switch_worker and virtio_tx_local, rte_vhost_enqueue_burst is called to
push host packets to guest VM.
Before enqueuing packets to the guest VM, the vhost example uses configurable
retry logic to wait for enough vring entries.
In switch_worker, rte_vhost_dequeue_burst is called to get packets from guest VM,
then virtio device will be bound to a queue in VMDQ for the first transmitted
packet.
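The retry step before enqueue can be sketched as a bounded polling loop. Everything here is a stand-in: the structure and function are hypothetical, and a real application would read the available entry count from the vhost library and insert a delay between polls.

```c
#include <stdint.h>

/* Hypothetical stand-in for a guest vring; it only tracks free entries. */
struct fake_vring {
	uint32_t free_entries;
};

/*
 * Poll until at least `needed` entries are free or the retries run out.
 * Returns 1 when enough entries are available, 0 otherwise.
 */
static int wait_for_entries(const volatile struct fake_vring *vr,
			    uint32_t needed, uint32_t max_retries)
{
	uint32_t retry;

	for (retry = 0; retry <= max_retries; retry++) {
		if (vr->free_entries >= needed)
			return 1;
		/* a real application would pause here before polling again */
	}
	return 0;
}
```

Only when the wait succeeds would the caller hand the burst to rte_vhost_enqueue_burst; otherwise it can drop or defer the packets.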
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Thu, 23 Oct 2014 09:24:56 +0000 (11:24 +0200)]
examples/vhost: register with lib
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:15 +0000 (12:38 +0800)]
examples/vhost: hpa regions for zero copy
check_hpa_regions, fill_hpa_memory_regions and hpa memory region
data structure are added back from old virtio-net.c.
Add hpa (host physical address) region generation/destroy logic.
gpa<->hpa memory translation regions are generated at new_device,
when a virtio device is ready for packet processing.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:16 +0000 (12:38 +0800)]
examples/vhost: add vhost dev struct
Define vhost_dev data structure.
Change reference to virtio_dev to vhost_dev.
The vhost example uses the vdev data structure for switching-related logic
and as a container for virtio_dev.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:14 +0000 (12:38 +0800)]
examples/vhost: remove functions implemented in lib
Those functions are integrated into the user space vhost library:
virtio_dev_rx, virtio_dev_merge_rx, virtio_dev_tx, virtio_dev_merge_tx,
copy_from_mbuf_to_ring, gpa_to_vva.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Mon, 20 Oct 2014 04:38:13 +0000 (12:38 +0800)]
examples/vhost: copy old vhost example
This patch copies two files main.c/main.h from most recent vhost example
(before transforming into a library) as the base for new vhost example.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Declan Doherty [Wed, 22 Oct 2014 12:59:24 +0000 (13:59 +0100)]
bond: disable broadcast mode if mbuf refcnt is disabled
Link bonding broadcast mode requires refcnt parameter in the mbuf struct to
allow efficient transmission of duplicated mbufs on slave ports.
This patch disables broadcast mode when the compilation option RTE_MBUF_REFCNT
is disabled to allow clean building of the bonding library.
A warning message notifies the user that broadcast mode is disabled.
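Compiling a mode out when an option is absent usually looks like the sketch below. The enum, function name and warning text are hypothetical; only the RTE_MBUF_REFCNT option name comes from the commit message.

```c
#include <stdio.h>

/* Hypothetical subset of bonding modes. */
enum bond_mode {
	BOND_MODE_ROUND_ROBIN,
	BOND_MODE_BROADCAST
};

/* Return 0 if the mode can be used, -1 if it is unavailable. */
static int bond_mode_set(enum bond_mode mode)
{
	switch (mode) {
	case BOND_MODE_ROUND_ROBIN:
		return 0;
	case BOND_MODE_BROADCAST:
#ifdef RTE_MBUF_REFCNT
		/* duplicated mbufs can be shared through the reference count */
		return 0;
#else
		fprintf(stderr,
			"broadcast mode is disabled: RTE_MBUF_REFCNT is off\n");
		return -1;
#endif
	}
	return -1;
}
```

Built without -DRTE_MBUF_REFCNT, a request for broadcast mode fails with the warning instead of breaking the build.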
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Marc Sune [Wed, 22 Oct 2014 10:23:11 +0000 (12:23 +0200)]
kni: fix build
Fix compilation warning 'missing-field-initializers' for some GCC and clang
versions, introduced in commit 0c6bc8e due to the use of C89/C90 initializers,
by using C99-style initializers.
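The difference between the two initializer styles can be shown on a small structure. The structure below is hypothetical and only resembles a table of callbacks. GCC documents that -Wmissing-field-initializers does not warn about designated initializers; other compilers may behave differently.

```c
#include <stddef.h>

/* Hypothetical callback table, for illustration only. */
struct example_ops {
	int port_id;
	int (*change_mtu)(int port_id, unsigned int new_mtu);
	int (*config_if)(int port_id, int if_up);
};

/*
 * C89/C90 positional form: fields are given in order and the trailing
 * callbacks are left implicit, which is what the warning points at.
 *
 *     static const struct example_ops ops = { 3 };
 *
 * C99 designated form: the field is named explicitly and every field
 * not mentioned is zero-initialized.
 */
static const struct example_ops ops = { .port_id = 3 };
```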
Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Marc Sune [Tue, 21 Oct 2014 10:46:55 +0000 (12:46 +0200)]
kni: memzone pool for alloc and release
The previous implementation of rte_kni_alloc() was allocating memzones with a
name composed of a fixed string and the interface name. When an application was
allocating and deallocating multiple interfaces with different names, memzones
were quickly exhausted, even though memzones from deallocated interfaces were
never used anymore (unless an interface with the same name was re-allocated).
As a result, the application was unable to allocate more KNI interfaces with
different names.
This patch implements the KNI memzone pool in order to prevent memzone
exhaustion when allocating/deallocating KNI interfaces. It adds a new API call,
rte_kni_init(max_kni_ifaces) that shall be called before any call to
rte_kni_alloc() if KNI is used. The memzones are pre-allocated with interface-
independent names so that they can be reused.
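The pool idea can be sketched as a fixed set of slots handed out and returned through a free list, so the number of underlying allocations never grows. All names below are hypothetical and the sketch uses calloc() instead of memzones; it is not the code behind rte_kni_init().

```c
#include <stdlib.h>

/* One pre-allocated slot; in the real patch this would wrap memzones. */
struct slot {
	int in_use;
	struct slot *next;
};

struct slot_pool {
	struct slot *slots;     /* backing array, allocated once */
	struct slot *free_head; /* singly linked list of free slots */
	unsigned int max;
};

/* Allocate all slots up front and link them into the free list. */
static int pool_init(struct slot_pool *p, unsigned int max)
{
	unsigned int i;

	p->slots = calloc(max, sizeof(*p->slots));
	if (p->slots == NULL)
		return -1;
	p->free_head = NULL;
	for (i = max; i > 0; i--) {
		p->slots[i - 1].next = p->free_head;
		p->free_head = &p->slots[i - 1];
	}
	p->max = max;
	return 0;
}

/* Take a slot from the pool, or return NULL when none is left. */
static struct slot *pool_get(struct slot_pool *p)
{
	struct slot *s = p->free_head;

	if (s == NULL)
		return NULL;
	p->free_head = s->next;
	s->next = NULL;
	s->in_use = 1;
	return s;
}

/* Return a slot so that a later allocation can reuse it. */
static void pool_put(struct slot_pool *p, struct slot *s)
{
	s->in_use = 0;
	s->next = p->free_head;
	p->free_head = s;
}
```

Because a released slot goes back on the free list, allocating and releasing interfaces under many different names reuses the same storage.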
Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Ouyang Changchun [Tue, 21 Oct 2014 06:59:16 +0000 (14:59 +0800)]
ixgbe: fix build with mbuf refcnt disabled
An error has been introduced by commit 1f22652ca8869
("fix perf regression due to moved pool ptr").
Fix the case where RTE_MBUF_REFCNT is disabled.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Jingjing Wu [Mon, 20 Oct 2014 05:40:33 +0000 (13:40 +0800)]
i40e: generic filter control
Only provide empty handler.
It can be completed to support filter features on fortville.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: remove unused empty functions]
Jingjing Wu [Mon, 20 Oct 2014 05:40:32 +0000 (13:40 +0800)]
ethdev: introduce generic filter control
Define a new API umbrella to configure any kind of Rx filtering.
New functions:
- rte_eth_dev_filter_supported
- rte_eth_dev_filter_ctrl
Filter types, operations, and structures are defined specifically in new
header file lib/librte_eth/rte_dev_ctrl.h.
As to the implementation discussion, please refer to
http://dpdk.org/ml/archives/dev/2014-September/005179.html
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
[Thomas: rename ops and remove unused types]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Helin Zhang [Mon, 20 Oct 2014 02:58:18 +0000 (10:58 +0800)]
i40evf: support RSS
i40e hardware supports RSS in VF.
It's now supported in this driver.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Helin Zhang [Mon, 20 Oct 2014 02:58:17 +0000 (10:58 +0800)]
i40e: expose RSS functions and relevant macros
To reuse code, 'i40e_config_hena()' and 'i40e_parse_hena()' and
their relevant macros need to be extern, and then can be used for
both PF and VF parts.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Helin Zhang [Mon, 20 Oct 2014 02:58:16 +0000 (10:58 +0800)]
ethdev: better typing of RSS constants
Forced type conversion is not needed to define a macro with
constant. The alternate is to let compiler use the default width,
or specify the width with suffix of 'U', 'UL', 'ULL', etc.
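The two ways of typing a constant can be compared directly. The macro names below are hypothetical and are not the RSS macros touched by this commit.

```c
#include <stdint.h>

/* Forced conversion: the cast adds nothing for a plain constant. */
#define EXAMPLE_FLAG_CAST   ((uint16_t)1 << 0)

/* Default width: the compiler treats the constant as int. */
#define EXAMPLE_FLAG_PLAIN  (1 << 0)

/*
 * Explicit width through a suffix: needed once the shift no longer
 * fits in int, for example bit 40 of a 64-bit mask.
 */
#define EXAMPLE_FLAG_LOW    (1ULL << 0)
#define EXAMPLE_FLAG_HIGH   (1ULL << 40)

/* Combine flags into a 64-bit mask without any cast. */
static uint64_t example_flags_all(void)
{
	return EXAMPLE_FLAG_LOW | EXAMPLE_FLAG_HIGH;
}
```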
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Bruce Richardson [Fri, 17 Oct 2014 13:18:12 +0000 (14:18 +0100)]
app/test: fix crash for fbk hashes with a lot of entries
The four-byte-key (fbk) autotest was allocating the keys to be used for
the test on the stack. When the number of entries in the table was
increased significantly, for example, to test larger hashes by increasing the
value of ENTRIES, this array of keys was greater than that
allowed on the stack, and so caused problems, i.e. crashes and core dumps.
The solution is to have the keys dynamically allocated on the heap using
malloc. Now if ENTRIES is increased and we run out of memory we get an
error message instead of a crash.
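Moving a large key array from the stack to the heap follows the pattern below. The function name and the way keys are filled are hypothetical; the point is only that a failed malloc() becomes a reported error rather than a stack overflow.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate `entries` four-byte keys on the heap and fill them with
 * distinct values. Returns NULL and prints a message on failure.
 */
static uint32_t *alloc_keys(size_t entries)
{
	size_t i;
	uint32_t *keys = malloc(entries * sizeof(*keys));

	if (keys == NULL) {
		fprintf(stderr, "cannot allocate %zu keys\n", entries);
		return NULL;
	}
	for (i = 0; i < entries; i++)
		keys[i] = (uint32_t)i;
	return keys;
}
```

The caller checks for NULL, runs the test, and releases the array with free().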
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Alan Carew [Tue, 14 Oct 2014 12:18:36 +0000 (13:18 +0100)]
contigmem: fix buffer overrun on unload
The maximum number of contiguous memory regions for FreeBSD is limited by
RTE_CONTIGMEM_MAX_NUM_BUFS; a pointer to each region is stored in
static void * contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS]
A user can specify a greater number via hw.contigmem.num_buffers.
While the allocation logic prevents this allocation from occurring, the logic
in contigmem_unload() will attempt to free hw.contigmem.num_buffers buffers,
and an overrun occurs.
This patch limits the freeing to a maximum of RTE_CONTIGMEM_MAX_NUM_BUFS.
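Clamping the loop bound to the array size is the whole fix in miniature. The sketch below is a user-space analogue with hypothetical names; the kernel module releases its regions with kernel APIs, not free().

```c
#include <stdlib.h>

#define MAX_NUM_BUFS 64 /* stands in for RTE_CONTIGMEM_MAX_NUM_BUFS */

static void *buffers[MAX_NUM_BUFS];

/*
 * Free up to `requested` buffers, but never walk past the end of the
 * array even if the user-supplied count is larger.
 * Returns the number of buffers actually freed.
 */
static int free_buffers(int requested)
{
	int limit = requested < MAX_NUM_BUFS ? requested : MAX_NUM_BUFS;
	int freed = 0;
	int i;

	for (i = 0; i < limit; i++) {
		if (buffers[i] != NULL) {
			free(buffers[i]);
			buffers[i] = NULL;
			freed++;
		}
	}
	return freed;
}
```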
Signed-off-by: Alan Carew <alan.carew@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Mon, 20 Oct 2014 17:26:35 +0000 (18:26 +0100)]
ethdev: fix memory corruption with default Rx/Tx configuration
Commit fbde27f1 (get default Rx/Tx configuration from dev info)
introduced a bug which caused memory corruption in dev_info.
To get the RX/TX configuration, both rx/tx queue setup functions were calling
dev_info_get from PMDs, so the dev_info structure was not being reset
before being populated.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Fri, 17 Oct 2014 15:00:53 +0000 (17:00 +0200)]
mk: fix doc cleaning
With make 3.x, guides-% is matched instead of guides-%-clean.
Move the less specific target pattern (guides-%) at the end
to allow matching guides-%-clean first.
Reported-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bernard Iremonger [Wed, 15 Oct 2014 20:17:02 +0000 (22:17 +0200)]
doc: add copyright and version
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Wed, 15 Oct 2014 20:12:20 +0000 (22:12 +0200)]
mk: generate html guides with sphinx
Add minimal configuration and index to validate new rules
inside "make doc" and "make doc-clean".
RTE_SPHINX_BUILD can be overridden.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Wed, 15 Oct 2014 19:43:41 +0000 (21:43 +0200)]
doc: move doxygen files in api subdirectory
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Wed, 15 Oct 2014 19:28:13 +0000 (21:28 +0200)]
mk: rename doxygen rules
This new naming will help to be consistent with coming rules.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bernard Iremonger [Wed, 15 Oct 2014 16:51:10 +0000 (18:51 +0200)]
mk: fix doxygen clean
RTE_OUTPUT variable is always defined, unlike $O.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Ouyang Changchun [Mon, 13 Oct 2014 03:46:12 +0000 (11:46 +0800)]
virtio: increase max Rx packet length
Since commit 13ce5e7eb94 ("virtio: mergeable buffers"),
this driver has the capability of receiving and transmitting jumbo frame.
So update max Rx packet length.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Tested-by: Jingguo Fu <jingguox.fu@intel.com>
Sergio Gonzalez Monroy [Thu, 9 Oct 2014 10:08:33 +0000 (11:08 +0100)]
mk: pass CC option for kernel modules
At least on kernels 3.15 or newer, wrong compiler flags are set when building
kernel modules.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Sergio Gonzalez Monroy [Mon, 6 Oct 2014 16:09:09 +0000 (17:09 +0100)]
mk: pass verbose flag for kernel modules
The Linux kernel build system requires V=1 to enable verbose output, but
the current DPDK framework only checks whether V is defined.
Fix: force V=1 when building Linux kernel modules if verbose output is
enabled.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Jijiang Liu [Wed, 15 Oct 2014 03:14:38 +0000 (11:14 +0800)]
i40e: add Rx error statistics
Add incoming packet error statistics for i40e.
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Helin Zhang [Mon, 13 Oct 2014 07:18:19 +0000 (15:18 +0800)]
i40e/base: fix build with gcc < 4.4
It fixes the compile error as below on gcc version 4.3.4.
cc1: error: unrecognized command line option "-Wno-unused-but-set-variable"
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Zhaochen Zhan <zhaochen.zhan@intel.com>
Ouyang Changchun [Wed, 15 Oct 2014 03:11:00 +0000 (11:11 +0800)]
virtio: fix needed vring entry number
Fix one issue in virtio TX: transmitting a packet needs one more vring
descriptor to hold the virtio header. This count is used later to determine
whether to free more entries from the used vring.
This fixes the failure to transmit any 1-segment packet when only
1 descriptor is left in the vring free list.
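The descriptor accounting can be written down in a few lines. The helper names are hypothetical; the only fact taken from the commit message is that each packet needs one descriptor per segment plus one for the virtio header.

```c
/* Descriptors needed to send a packet: one per segment plus the header. */
static unsigned int tx_descs_needed(unsigned int nb_segs)
{
	return nb_segs + 1;
}

/*
 * Return 1 if the packet fits in the free list as it is, 0 if the driver
 * must first reclaim entries from the used vring.
 */
static int tx_fits(unsigned int free_descs, unsigned int nb_segs)
{
	return free_descs >= tx_descs_needed(nb_segs);
}
```

With a single free descriptor and a 1-segment packet, tx_fits() reports 0, so the driver knows to reclaim used entries before transmitting.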
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Thomas Monjalon [Mon, 13 Oct 2014 17:22:45 +0000 (19:22 +0200)]
vhost: add in doc
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:59 +0000 (02:54 +0800)]
vhost: add makefile
vhost lib is turned off by default.
vhost lib is based on cuse, which requires fuse development package
to be installed.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: fix build dependencies]
Huawei Xie [Wed, 8 Oct 2014 18:54:58 +0000 (02:54 +0800)]
vhost: comment identified issues
1) FIXME: concurrent calls to vhost set mem table from different guests
could cause mem_temp to be overridden.
2) TODO: cmpset costs quite some cpu cycles. Allow app to disable this
feature if there is no contention in real workload.
3) FIXME: fix scatter gather mbuf copy to vhost vring chained buffers.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:57 +0000 (02:54 +0800)]
vhost: coding style fixes
Fix serious coding style issues reported by checkpatch.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:55 +0000 (02:54 +0800)]
vhost: static variable fixes
Add "static" for some variable definitions.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:54 +0000 (02:54 +0800)]
vhost: clean includes
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:52 +0000 (02:54 +0800)]
vhost: add debug print
Define PRINT_PACKET and LOG_DEBUG macros.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:56 +0000 (02:54 +0800)]
vhost: add private context field
priv field could be used to store application specific context.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:53 +0000 (02:54 +0800)]
vhost: supported features
VHOST_SUPPORTED_FEATURES is the feature mask that vhost lib supports.
VHOST_FEATURES is the feature mask vhost currently supports after some features are turned on/off.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: split patch]
Huawei Xie [Wed, 8 Oct 2014 18:54:51 +0000 (02:54 +0800)]
vhost: allow to enable or disable features
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: split patch]
Huawei Xie [Wed, 8 Oct 2014 18:54:51 +0000 (02:54 +0800)]
vhost: get available vring entries
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: split patch]
Huawei Xie [Wed, 8 Oct 2014 18:54:50 +0000 (02:54 +0800)]
vhost: rename ops registering function
Rename init_virtio_net as rte_vhost_callback_register API.
rte_vhost_callback_register registers the callbacks called when a
vhost device is created and ready to be added to data processing core
or is deactivated by guest.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:49 +0000 (02:54 +0800)]
vhost: expose register and start functions
Rename register_cuse_device as rte_vhost_driver_register API.
Rename start_session_loop as rte_vhost_driver_session_start API.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:48 +0000 (02:54 +0800)]
vhost: get internal ops when registering
vhost_net_device_ops is internal implementation in vhost lib.
register_cuse_device will be vhost driver register API.
There is no need for it to know the internal vhost ops.
Instead, that ops is retrieved in register_cuse_device
through get_virtio_net_callbacks.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:47 +0000 (02:54 +0800)]
vhost: remove index parameter
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:45 +0000 (02:54 +0800)]
vhost: enqueue/dequeue burst
rte_vhost_enqueue_burst copies host packets to guest.
rte_vhost_enqueue_burst will call virtio_dev_rx and virtio_dev_merge_rx
respectively depending on whether merge-able feature is negotiated or not
in the vhost device.
virtio_dev_merge_tx is renamed to rte_vhost_dequeue_burst.
rte_vhost_dequeue_burst gets to-be-sent packets from guest.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: merged patches]
Huawei Xie [Wed, 8 Oct 2014 18:54:43 +0000 (02:54 +0800)]
vhost: add queue id parameter
queue_id parameter is added to Rx/Tx functions for multiple queue support
in future.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:42 +0000 (02:54 +0800)]
vhost: calculate mbuf size
As a lib, we do not know the app-defined mbuf size.
This patch will calculate mbuf size dynamically.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:41 +0000 (02:54 +0800)]
vhost: return packets to upper layer
This patch makes virtio_dev_merge_tx return the received packets to app layer.
Previously virtio_tx_route was called to route these packets and then free them.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:51 +0000 (02:54 +0800)]
vhost: move address translation function
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: split from a previous patch]
Huawei Xie [Wed, 8 Oct 2014 18:54:46 +0000 (02:54 +0800)]
vhost: move internal structure
The structure virtio_net_config_ll is moved to virtio_net.c.
It is related to internal virtio device management,
so it should not be exposed to other files.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:40 +0000 (02:54 +0800)]
vhost: remove retry logic
It was used to wait some time and retry when there are not enough descriptors.
App could implement this policy easily if it needs.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:39 +0000 (02:54 +0800)]
vhost: remove zero copy memory region generation logic
Currently zero copy feature isn't generic as it couples closely with nic.
It isn't put in the vhost lib in this version.
gpa(guest physical address) to hpa(host physical address) mapping region
logic is removed.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:38 +0000 (02:54 +0800)]
vhost: remove switching related logics
The following logic will be moved to the vhost example:
1. mac learning, which is used to learn the mac address from the first
transmitted packet of guest and bind the vhost device to a queue in a
pool of VMDQ.
2. VMDQ mac/vlan filter: Each pool the vhost device is bound to is
assigned a mac/vlan filter.
3. num_devices is used to specify the maximum vhost devices the nic supports.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Huawei Xie [Wed, 8 Oct 2014 18:54:37 +0000 (02:54 +0800)]
vhost: remove useless code for Rx/Tx
Remove all other code and keep only virtio_dev_rx, copy_from_mbuf_to_vring,
virtio_dev_merge_rx, virtio_dev_merge_tx.
Previous vhost merge-able feature introduces another version of tx function,
virtio_dev_merge_tx. Actually it is not related to merge-able feature but is
the fix for memcpy between mbuf and vring descriptors.
This lib will create the tx functions based on virtio_dev_merge_tx.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: do not remove code used or moved later]
Huawei Xie [Wed, 8 Oct 2014 18:54:35 +0000 (02:54 +0800)]
vhost: move from examples to dedicated library
Those files will be refactored in subsequent patches to form user space
vhost library.
Makefile and main.h are removed.
main.c is renamed to vhost_rxtx.c and will provide vring enqueue/dequeue API.
virtio-net.h is renamed to rte_virtio_net.h which is the API header file.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove from examples Makefile and merge file renaming]
Daniel Mrzyglod [Fri, 10 Oct 2014 10:08:08 +0000 (11:08 +0100)]
tools: fix setup script for Fedora 21
The script was expecting /lib/modules/$(uname -r)/kernel/drivers/uio/uio.ko,
but Fedora 21 ships compressed kernel modules: xz (LZMA).
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Keith Wiles [Thu, 9 Oct 2014 20:02:28 +0000 (15:02 -0500)]
mempool: remove useless variable
Remove n_orig variable as it is not required.
Signed-off-by: Keith Wiles <keith.wiles@windriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Ouyang Changchun [Thu, 9 Oct 2014 07:27:59 +0000 (15:27 +0800)]
ixgbe/base: disable some gcc warnings
This patch disables compilation complaints from older GCC versions (less than 4.6).
Note: Only supported versions of GCC are 4.x.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Pablo de Lara [Wed, 1 Oct 2014 09:49:05 +0000 (10:49 +0100)]
examples: use factorized default Rx/Tx configuration
For apps that were using default rte_eth_rxconf and rte_eth_txconf
structures, these have been removed and now they are obtained by
calling rte_eth_dev_info_get, just before setting up RX/TX queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
i40e: set default Rx/Tx configuration
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info_get, and still any parameters may be tweaked as wished,
before setting up queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
ixgbe: set default Rx/Tx configuration
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info_get, and still any parameters may be tweaked as wished,
before setting up queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
igb: set default Rx/Tx configuration
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info_get, and still any parameters may be tweaked as wished,
before setting up queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:04 +0000 (10:49 +0100)]
ethdev: get default Rx/Tx configuration from dev info
Many sample apps use duplicated code to set rte_eth_txconf and rte_eth_rxconf
structures. This patch allows the user to get a default optimal RX/TX configuration
through rte_eth_dev_info_get, and still any parameters may be tweaked as wished,
before setting up queues.
Besides, if a NULL pointer is passed to rte_eth_rx_queue_setup or
rte_eth_tx_queue_setup, these functions get internally the default RX/TX
configuration for the user.
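The NULL fallback can be sketched with stand-in types. The structures, the threshold value and both functions below are hypothetical simplifications; they only mirror the described behaviour of rte_eth_dev_info_get and rte_eth_rx_queue_setup and do not reproduce their signatures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced configuration and device info. */
struct example_rxconf {
	unsigned int free_thresh;
};

struct example_dev_info {
	struct example_rxconf default_rxconf;
};

/* Zero the whole structure first, then let the "driver" fill defaults. */
static void example_dev_info_get(struct example_dev_info *info)
{
	memset(info, 0, sizeof(*info));
	info->default_rxconf.free_thresh = 32; /* made-up default */
}

/*
 * Queue setup: a NULL configuration means "use the driver defaults".
 * Returns the threshold that ends up being applied.
 */
static unsigned int example_rx_queue_setup(const struct example_rxconf *conf)
{
	struct example_dev_info info;

	if (conf == NULL) {
		example_dev_info_get(&info);
		conf = &info.default_rxconf;
	}
	return conf->free_thresh;
}
```

An application that wants to tweak one parameter can fetch the defaults, change that field, and pass the modified structure instead of NULL.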
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: split patch]
Pablo de Lara [Wed, 1 Oct 2014 09:49:03 +0000 (10:49 +0100)]
ethdev: reset whole dev info structure before filling
To guarantee that RX/TX configuration structures are reset
before modifying them, plus the other dev info fields,
dev info structure is zeroed beforehand.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Pablo de Lara [Wed, 1 Oct 2014 22:42:56 +0000 (23:42 +0100)]
examples/netmap_compat: add default build target
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Nicolás Pernas Maradei [Sat, 4 Oct 2014 19:19:51 +0000 (20:19 +0100)]
app/testpmd: print message if queue start/stop is not supported
Print an error message to the user when trying to start/stop a rx/tx queue and
this function is not supported by the PMD driver. The patch does not check if
the return value is -EINVAL because testpmd is already validating the port and
queue id.
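The check described above reduces to testing one return value. The function below is a hypothetical sketch, not the testpmd code, and the message wording is made up.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Report a queue start/stop result. Only -ENOTSUP is handled, since
 * the port and queue ids are assumed to be validated by the caller.
 * Returns 1 if a message was printed, 0 otherwise.
 */
static int report_queue_op(int ret, const char *op)
{
	if (ret == -ENOTSUP) {
		printf("%s is not supported by this driver\n", op);
		return 1;
	}
	return 0;
}
```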
Signed-off-by: Nicolás Pernas Maradei <nico@emutex.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Nicolás Pernas Maradei [Sat, 4 Oct 2014 22:24:17 +0000 (23:24 +0100)]
pcap: fix double stop error
The librte_pmd_pcap driver opened the pcaps/interfaces only at init time and
closed them only when the port was stopped. This caused problems
(leading to a segfault) if the user stopped the port twice. The first
time, the pcaps/interfaces were closed normally, but libpcap raised an
error causing a segfault when the already closed pcaps/interfaces were closed again.
This is solved by reopening the pcaps/interfaces when the port is
started (only if they are not already open, for example from init time).
Signed-off-by: Nicolás Pernas Maradei <nico@emutex.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
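The start/stop life cycle described in the pcap commit above can be sketched without libpcap. The structure and function names are hypothetical, and the guard in the stop routine is an assumption about how a double close is avoided; the commit text itself only states that handles are reopened on start.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical port state; "handle_open" stands in for a libpcap handle. */
struct demo_port {
    int handle_open;   /* 1 while the capture handle is open */
    int close_calls;   /* how many times the handle was really closed */
};

static void demo_port_start(struct demo_port *p)
{
    assert(p != NULL);
    /* Reopen only if the handle is not already open, e.g. from init. */
    if (!p->handle_open)
        p->handle_open = 1;
}

static void demo_port_stop(struct demo_port *p)
{
    assert(p != NULL);
    /* Assumed guard: never close a handle that is already closed. */
    if (p->handle_open) {
        p->handle_open = 0;
        p->close_calls++;
    }
}
```

With this shape, stopping twice in a row closes the handle only once, and a later start followed by a stop works as expected.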
Jim Harris [Wed, 1 Oct 2014 22:00:21 +0000 (15:00 -0700)]
i40e: fix Tx descriptors reset
Fix the descriptor initialization loop, so that it initializes
the i40e_tx_desc::cmd_type_offset_bsz for the correct index
into the tx_ring array.
Previously it would use the index once to initialize the txd
local variable, then again when setting cmd_type_offset_bsz.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
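The double-indexing mistake described in the i40e commit above is easy to reproduce in isolation. The sketch below uses a hypothetical descriptor layout and marker value rather than the real i40e definitions; the comment shows the buggy form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DESC_DONE 0xFULL /* illustrative "descriptor done" marker */

/* Hypothetical descriptor; only the field name mirrors the commit text. */
struct demo_tx_desc {
    uint64_t buffer_addr;
    uint64_t cmd_type_offset_bsz;
};

static void demo_reset_ring(struct demo_tx_desc *ring, uint16_t nb_desc)
{
    uint16_t i;

    assert(ring != NULL);
    for (i = 0; i < nb_desc; i++) {
        struct demo_tx_desc *txd = &ring[i];

        /*
         * The buggy form would be txd[i].cmd_type_offset_bsz, which
         * applies the index twice and touches ring[2 * i] instead.
         */
        txd->cmd_type_offset_bsz = DEMO_DESC_DONE;
    }
}
```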
Pablo de Lara [Wed, 1 Oct 2014 22:27:25 +0000 (23:27 +0100)]
ixgbe: fix build with bypass enabled
Since commit aae1047905621 ("use the right debug macro"),
DEBUGOUT was replaced by PMD_DRV_LOG which requires at least
2 arguments. But the level argument was missing.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Keith Wiles [Sun, 5 Oct 2014 06:16:22 +0000 (01:16 -0500)]
mempool: fix build with debug enabled and clang
When enabling RTE_LIBRTE_MEMPOOL_DEBUG and compiling with clang,
an error occurs because the ifdef'ed code includes push/pop pragmas.
Signed-off-by: Keith Wiles <keith.wiles@windriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
David Marchand [Wed, 8 Oct 2014 08:43:31 +0000 (10:43 +0200)]
eal/bsd: fix core detection
Following the "options parsing" patchset (commits d7cb626f and 489a9d6c),
core detection is not working correctly on bsd.
./x86_64-native-bsdapp-gcc/app/test -c f -n 4 -- -i
[...]
EAL: lcore 0 unavailable
EAL: invalid coremask
Align bsd to linux:
- commit f563a372 "eal: fix recording of detected/enabled logical cores"
- commit 4f04db8b "eal: check coremask against detected lcores"
Reported-by: Zhan, Zhaochen <zhaochen.zhan@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Zhaochen Zhan <zhaochen.zhan@intel.com>
Bruce Richardson [Fri, 3 Oct 2014 15:36:52 +0000 (16:36 +0100)]
mbuf: comment for ctrl mbuf flag
Add in a doxygen comment for the ctrl mbuf flag definition.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bruce Richardson [Fri, 3 Oct 2014 15:36:51 +0000 (16:36 +0100)]
mbuf: update Rx flag format
Update the format of the RX flags to match that of the TX flags. In
general the flags are now specified as "1ULL << X", with a few
exceptions.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bruce Richardson [Fri, 3 Oct 2014 15:36:50 +0000 (16:36 +0100)]
mbuf: group Tx flags near end of field
This patch takes the existing TX flags defined for the mbuf and shifts
each uniquely defined one left so that additional RX flags can be
defined without having RX and TX flags mixed together. Under the new
scheme, RX flags start at bit 0 and work left, TX flags start at bit 55
and work right, and bits 56-63 are reserved for generic mbuf use, not
for offloads.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
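The bit layout described in the two mbuf flag commits above can be written down as a small sketch. The flag names are hypothetical; only the positions (RX from bit 0 upward, TX from bit 55 downward, bits 56-63 reserved for generic use) follow the commit text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names; only the bit layout follows the description. */
#define DEMO_RX_FLAG_A    (1ULL << 0)    /* RX flags start at bit 0, grow left */
#define DEMO_RX_FLAG_B    (1ULL << 1)
#define DEMO_TX_FLAG_A    (1ULL << 55)   /* TX flags start at bit 55, grow right */
#define DEMO_TX_FLAG_B    (1ULL << 54)
#define DEMO_GENERIC_MASK (0xFFULL << 56) /* bits 56-63: generic mbuf use */

/* Returns 1 if the flag lies in the offload range (bits 0-55). */
static int demo_is_offload_flag(uint64_t flag)
{
    assert(flag != 0);
    return (flag & DEMO_GENERIC_MASK) == 0;
}
```

Growing the two groups toward each other keeps RX and TX flags contiguous within their own ranges for as long as free bits remain between them.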
Bruce Richardson [Tue, 23 Sep 2014 11:08:15 +0000 (12:08 +0100)]
app/testpmd: change rxfreet default to 32
To improve performance by using bulk alloc or vectored RX routines, we
need to set rx free threshold (rxfreet) value to 32, so make this the
testpmd default.
Thirty-two is the minimum setting needed to enable either the
bulk alloc or vector RX routines inside the ixgbe driver, so it's
best made the default for that reason. Please see
"check_rx_burst_bulk_alloc_preconditions()" in ixgbe_rxtx.c, and
RX function assignment logic in "ixgbe_dev_rx_queue_setup()" in
the same file.
The difference in IO performance for testpmd when called without any
optional parameters, and using 10G NICs using the ixgbe driver, can be
significant - approx 25% or more.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:14 +0000 (12:08 +0100)]
ixgbe: add prefetch to improve slow-path tx perf
Make a small improvement to slow path TX performance by adding in a
prefetch for the second mbuf cache line.
Also assign the l2/l3 length values only when they are needed.
What I've done with the prefetches is two-fold:
1) changed it from prefetching the mbuf (first cache line) to prefetching
the mbuf pool pointer (second cache line) so that when we go to access
the pool pointer to free transmitted mbufs we don't get a cache miss. When
clearing the ring and freeing mbufs, the pool pointer is the only mbuf
field used, so we don't need that first cache line.
2) changed the code to prefetch earlier - in effect to prefetch one mbuf
ahead. The original code prefetched the mbuf to be freed as soon as it
started processing the mbuf to replace it. Instead now, every time we
calculate what the next mbuf position is going to be we prefetch the mbuf
in that position (i.e. the mbuf pool pointer we are going to free the mbuf
to), even while we are still updating the previous mbuf slot on the ring.
This gives the prefetch much more time to resolve and get the data we need
in the cache before we need it.
In terms of performance difference, a quick sanity test using testpmd
on a Xeon (Sandy Bridge uarch) platform showed performance increases
between approx 8-18%, depending on the particular RX path used in
conjunction with this TX path code.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
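The "prefetch one mbuf ahead" idea from the ixgbe commit above can be sketched as follows. The structures are hypothetical stand-ins for the mbuf and its pool, the 64-byte padding merely mimics a first cache line, and __builtin_prefetch is the GCC/Clang builtin, used here in place of whatever prefetch helper the driver itself calls.

```c
#include <assert.h>
#include <stddef.h>

struct demo_pool {
    unsigned int freed;   /* counts buffers returned to the pool */
};

/* Hypothetical mbuf: "pool" stands in for a field in the second cache line. */
struct demo_mbuf {
    char first_line[64];
    struct demo_pool *pool;
};

static void demo_free_ring(struct demo_mbuf **ring, size_t n)
{
    size_t i;

    assert(ring != NULL);
    for (i = 0; i < n; i++) {
        /* Prefetch the pool pointer of the next slot, one mbuf ahead. */
        if (i + 1 < n && ring[i + 1] != NULL)
            __builtin_prefetch(&ring[i + 1]->pool);

        if (ring[i] != NULL) {
            /* Only the pool pointer is needed to free the mbuf. */
            ring[i]->pool->freed++;
            ring[i] = NULL;
        }
    }
}
```

Issuing the prefetch while the current slot is still being processed gives the memory system time to bring the line in before the next iteration reads it.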
Bruce Richardson [Tue, 23 Sep 2014 11:08:17 +0000 (12:08 +0100)]
mbuf: switch vlan_tci and reserved2 fields
Move the vlan_tci field up by two bytes in the mbuf data structure. This
has two effects:
* Ensures the ixgbe vector driver places the vlan tag in the correct
place in the mbuf.
* Allows a second vlan tag field, if one is added in the future, to be
placed after the existing vlan field, rather than before.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bruce Richardson [Tue, 23 Sep 2014 11:08:16 +0000 (12:08 +0100)]
mbuf: add userdata pointer field
While some applications may store metadata about packets in the packet
mbuf headroom, this is not a workable solution for packet metadata which
is either:
* larger than the headroom (or headroom is needed for adding pkt headers)
* needs to be shared or copied among packets
To support these use cases in applications, we reserve a general
"userdata" pointer field inside the second cache-line of the mbuf. This
is better than having the application store the pointer to the external
metadata in the packet headroom, as it saves an additional cache-line
from being used.
Apart from storing metadata, this field also provides a general 8-byte
scratch space inside the mbuf for any other application uses that are
applicable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
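One way an application might use the userdata pointer described in the commit above is to share a single metadata object among several packets. The sketch below is self-contained; the reduced mbuf structure and the metadata type are hypothetical, and only the field name userdata comes from the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application metadata, shared among packets. */
struct demo_flow_meta {
    uint32_t flow_id;
    uint32_t refcnt;
};

/* Reduced stand-in for the mbuf; only the userdata field is modelled. */
struct demo_mbuf {
    void *userdata;
};

static void demo_attach_meta(struct demo_mbuf *m, struct demo_flow_meta *meta)
{
    assert(m != NULL && meta != NULL);
    m->userdata = meta;   /* 8-byte pointer, no headroom consumed */
    meta->refcnt++;
}
```

Because the pointer lives in the mbuf itself, the metadata can be larger than the headroom and can be referenced by any number of packets.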
Bruce Richardson [Tue, 23 Sep 2014 11:08:13 +0000 (12:08 +0100)]
mbuf: ensure next pointer is set to null on free
The receive functions for packets do not modify the next pointer so
the next pointer should always be cleared on mbuf free, just in case.
The slow-path TX needs to clear it, and the standard mbuf free function
also needs to clear it. Fast path TX does not handle chained mbufs so
is unaffected.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
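The rule from the commit above (always clear the next pointer when a segment is freed) can be sketched on a hypothetical chained buffer. The structure and the in_pool marker are illustrative and not the real mbuf layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chained buffer segment. */
struct demo_mbuf {
    struct demo_mbuf *next;
    int in_pool;            /* 1 once the segment is back in its pool */
};

/* Free one segment, clearing the chain link so it cannot leak into reuse. */
static void demo_free_seg(struct demo_mbuf *m)
{
    assert(m != NULL);
    m->next = NULL;
    m->in_pool = 1;
}

/* Free a whole chain; returns the number of segments freed. */
static unsigned int demo_free_chain(struct demo_mbuf *m)
{
    unsigned int n = 0;

    while (m != NULL) {
        struct demo_mbuf *next = m->next; /* save before it is cleared */

        demo_free_seg(m);
        m = next;
        n++;
    }
    return n;
}
```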
Helin Zhang [Tue, 9 Sep 2014 07:21:38 +0000 (15:21 +0800)]
i40e/base: fix arq_event_info struct
Overloading the 'msg_size' field in the 'arq_event_info' struct
is a bad idea. It leads to bugs when the structure is used in a
loop, since the input value (buffer size) is overwritten by the
output value (actual message length). The fix introduces one
more field of 'buf_len' for the buffer size, and renames the
field of 'msg_size' to 'msg_len' for the real message size.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Reviewed-by: Chen Jing <jing.d.chen@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
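The problem with one overloaded size field, as described in the commit above, shows up as soon as the structure is reused in a loop. The sketch below keeps the input capacity and the output length apart; the structure and the receive routine are hypothetical, and only the field names buf_len and msg_len come from the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event structure with separate input and output sizes. */
struct demo_event_info {
    uint16_t buf_len;   /* input: capacity of msg_buf, never overwritten */
    uint16_t msg_len;   /* output: length of the message actually copied */
    uint8_t *msg_buf;
};

/* Hypothetical receive routine: copies at most buf_len bytes. */
static void demo_receive(struct demo_event_info *e,
                         const uint8_t *msg, uint16_t len)
{
    assert(e != NULL && e->msg_buf != NULL && msg != NULL);
    e->msg_len = (len < e->buf_len) ? len : e->buf_len;
    memcpy(e->msg_buf, msg, e->msg_len);
}
```

If a single field carried both values, a 4-byte message would shrink the recorded capacity to 4, and a following 12-byte message would be truncated even though the buffer is large enough.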
Helin Zhang [Tue, 9 Sep 2014 07:21:35 +0000 (15:21 +0800)]
i40e/base: debug write register request
The firmware API request for writing hardware registers should be
exposed to the driver. The new API 'i40e_aq_debug_write_register'
is introduced for that.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Reviewed-by: Chen Jing <jing.d.chen@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Helin Zhang [Tue, 9 Sep 2014 07:21:34 +0000 (15:21 +0800)]
i40e/base: support 10G base T
10G base T type support is added.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Reviewed-by: Chen Jing <jing.d.chen@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Helin Zhang [Tue, 9 Sep 2014 07:21:37 +0000 (15:21 +0800)]
i40e/base: get link status to report flow control settings
The fix is to use get_link_status rather than get_phy_capabilities
for reporting FC settings.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Reviewed-by: Chen Jing <jing.d.chen@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Helin Zhang [Tue, 9 Sep 2014 07:21:36 +0000 (15:21 +0800)]
i40e/base: workaround for firmware version
The workaround helps fix the API if the FW is 4.2 or later.
In addition, an unreachable 'break' statement has been removed.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Reviewed-by: Chen Jing <jing.d.chen@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Helin Zhang [Tue, 9 Sep 2014 07:21:31 +0000 (15:21 +0800)]
i40e/base: get rid of sparse warnings
Some variables represent values in little endian.
Adding the '__le' prefix removes warnings during sparse
checks. In addition, remove some unreachable 'break' statements,
and add 'UL' to a couple of constants.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Reviewed-by: Chen Jing <jing.d.chen@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>