dpdk.git
9 years agodoc: add reorder api to doxygen
Sergio Gonzalez Monroy [Fri, 20 Feb 2015 12:10:51 +0000 (12:10 +0000)]
doc: add reorder api to doxygen

Add missing reorder library directory to doxygen configuration.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agolib: fix C++11 compilation
Stefan Puiu [Fri, 20 Feb 2015 13:23:26 +0000 (15:23 +0200)]
lib: fix C++11 compilation

In C++11 concatenated string literals need to have a space in between.
Found with clang++-3.4, IIRC g++-4.8 also complains about this.

Sample error message:
error: invalid suffix on literal; C++11 requires a space between literal
and identifier [-Wreserved-user-defined-literal]

Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
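
The literal-spacing rule described in the commit above can be illustrated with a minimal sketch. It assumes the typical trigger is a format macro such as PRIu64 from <inttypes.h> placed directly against a string literal; the helper name format_counter is hypothetical and not taken from the patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Formats a 64-bit counter into buf and returns the snprintf() result.
 * Note the spaces around PRIu64. Written as "rx=%"PRIu64"\n", a C++11
 * compiler reads PRIu64 as a user-defined literal suffix and reports
 * "invalid suffix on literal". With the spaces, the same line is valid
 * in both C and C++11.
 */
int format_counter(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "rx=%" PRIu64 "\n", value);
}
```

The spaced form costs nothing in C, which is why it is a safe default for headers that may be included from C++ code.
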
9 years agokni: optimize Rx burst
Hemant Agrawal [Wed, 23 Jul 2014 06:45:12 +0000 (12:15 +0530)]
kni: optimize Rx burst

The current implementation of rte_kni_rx_burst polls the fifo for buffers.
Irrespective of success or failure, it allocates mbufs and tries to put them into the alloc_q.
If the buffers are not added to alloc_q, it frees them.
This wastes a lot of CPU cycles allocating and freeing buffers when alloc_q is full.

The logic has been changed to:
1. Initially allocate and add buffers (burst size) to alloc_q.
2. Add buffers to alloc_q only when buffers are being pulled out.

Signed-off-by: Hemant Agrawal <hemant@freescale.com>
Reviewed-by: Jay Rolette <rolette@infiniteio.com>
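
The effect of the KNI change above can be seen in a small counting model. This is a simplified sketch, not the KNI code: the queue capacity, the struct kni_sim type and all function names are hypothetical, and buffers are only counted, never actually allocated.

```c
#include <assert.h>

#define ALLOC_Q_CAP 8	/* hypothetical capacity of alloc_q */

struct kni_sim {
	unsigned alloc_q_used;	/* buffers currently sitting in alloc_q */
	unsigned allocs;	/* total buffer allocations performed */
	unsigned frees;		/* allocations discarded because alloc_q was full */
};

/* Allocate a burst and enqueue what fits; the rest is freed again. */
void kni_sim_refill(struct kni_sim *s, unsigned burst)
{
	unsigned space = ALLOC_Q_CAP - s->alloc_q_used;
	unsigned put = burst < space ? burst : space;

	s->allocs += burst;
	s->frees += burst - put;
	s->alloc_q_used += put;
}

/* Old behaviour: refill on every poll, whether or not packets arrived. */
void poll_old(struct kni_sim *s, unsigned received, unsigned burst)
{
	assert(received <= s->alloc_q_used);
	s->alloc_q_used -= received;	/* kernel used these to deliver packets */
	kni_sim_refill(s, burst);
}

/* New behaviour: refill only when buffers were actually pulled out. */
void poll_new(struct kni_sim *s, unsigned received, unsigned burst)
{
	assert(received <= s->alloc_q_used);
	s->alloc_q_used -= received;
	if (received > 0)
		kni_sim_refill(s, burst);
}
```

In this model, ten idle polls with a burst of 4 cost the old scheme 40 extra allocations, 36 of which are freed straight away, while the new scheme performs none.
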
9 years agolpm: fix overflow issue
Igor Ryzhov [Fri, 20 Feb 2015 13:16:46 +0000 (16:16 +0300)]
lpm: fix overflow issue

LPM table overflow may occur if the table is full and the added rule has
the biggest depth that already has some rules.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agopipeline: fix port meta for non-default entries
Ildar Mustafin [Sat, 21 Feb 2015 08:31:21 +0000 (11:31 +0300)]
pipeline: fix port meta for non-default entries

Signed-off-by: Ildar Mustafin <imustafin@bk.ru>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
9 years agovhost: support dynamically registering server
Huawei Xie [Mon, 23 Feb 2015 17:36:33 +0000 (17:36 +0000)]
vhost: support dynamically registering server

* support calling rte_vhost_driver_register after rte_vhost_driver_session_start
* add mutex to protect fdset from concurrent access
* add busy flag in fdentry; this flag is set before the cb runs and cleared after the cb has finished.

mutex lock scenario in vhost:

* event_dispatch(in rte_vhost_driver_session_start) runs in a separate thread, infinitely
processing vhost messages through cb(callback).
* event_dispatch acquires the lock, gets the cb and its context, marks the busy flag,
and releases the mutex.
* vserver_new_vq_conn cb calls fdset_add, which acquires the mutex and adds the new fd to fdset.
* vserver_message_handler cb frees data context, marks remove flag to request to delete
connfd(connection fd) from fdset.
* after cb returns, event_dispatch
  1. clears busy flag.
  2. if there is a remove request, calls fdset_del, which acquires the mutex, checks the busy flag, and
removes connfd from fdset.
* rte_vhost_driver_unregister(not implemented) runs in another thread, acquires the mutex,
calls fdset_del to remove fd(listenerfd) from fdset. Then it could free data context.

The above steps ensure that the fd data context isn't freed while the cb is using it.

VM(s) should have been shut down before rte_vhost_driver_unregister.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
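
The locking scenario in the commit above can be sketched with a fixed-size table and a pthread mutex. This is an illustrative model under stated assumptions, not the vhost source: the struct layouts, MAX_FDS, dispatch_one, fdset_has and both callbacks are hypothetical, readiness detection is left out, and fdset_del here simply refuses a busy entry.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_FDS 8	/* hypothetical table size */

struct fdset;
typedef void (*fd_cb)(struct fdset *set, int fd, void *ctx, int *remove);

struct fdentry {
	int fd;		/* -1 marks a free slot */
	fd_cb cb;
	void *ctx;
	int busy;	/* set while cb runs outside the lock */
};

struct fdset {
	struct fdentry e[MAX_FDS];
	pthread_mutex_t lock;
};

/* Caller must hold set->lock. */
static struct fdentry *find_locked(struct fdset *set, int fd)
{
	for (int i = 0; i < MAX_FDS; i++)
		if (set->e[i].fd == fd)
			return &set->e[i];
	return NULL;
}

void fdset_init(struct fdset *set)
{
	for (int i = 0; i < MAX_FDS; i++)
		set->e[i].fd = -1;
	pthread_mutex_init(&set->lock, NULL);
}

int fdset_add(struct fdset *set, int fd, fd_cb cb, void *ctx)
{
	int ret = -1;

	if (fd < 0)
		return -1;
	pthread_mutex_lock(&set->lock);
	struct fdentry *slot = find_locked(set, -1);
	if (slot != NULL) {
		slot->fd = fd;
		slot->cb = cb;
		slot->ctx = ctx;
		slot->busy = 0;
		ret = 0;
	}
	pthread_mutex_unlock(&set->lock);
	return ret;
}

/* Removes fd unless its callback is running; returns 0 on removal. */
int fdset_del(struct fdset *set, int fd)
{
	int ret = -1;

	if (fd < 0)
		return -1;
	pthread_mutex_lock(&set->lock);
	struct fdentry *ent = find_locked(set, fd);
	if (ent != NULL && !ent->busy) {
		ent->fd = -1;
		ret = 0;
	}
	pthread_mutex_unlock(&set->lock);
	return ret;
}

int fdset_has(struct fdset *set, int fd)
{
	pthread_mutex_lock(&set->lock);
	int found = fd >= 0 && find_locked(set, fd) != NULL;
	pthread_mutex_unlock(&set->lock);
	return found;
}

/* Handles one ready fd: mark busy, run cb unlocked, then clear busy. */
int dispatch_one(struct fdset *set, int fd)
{
	if (fd < 0)
		return -1;
	pthread_mutex_lock(&set->lock);
	struct fdentry *ent = find_locked(set, fd);
	if (ent == NULL) {
		pthread_mutex_unlock(&set->lock);
		return -1;
	}
	fd_cb cb = ent->cb;
	void *ctx = ent->ctx;
	ent->busy = 1;
	pthread_mutex_unlock(&set->lock);

	int remove = 0;
	cb(set, fd, ctx, &remove);	/* lock not held, so cb may call fdset_add */

	pthread_mutex_lock(&set->lock);
	ent->busy = 0;	/* a busy slot is never recycled, so ent is still valid */
	pthread_mutex_unlock(&set->lock);
	return remove ? fdset_del(set, fd) : 0;
}

/* Message handler stand-in: asks the dispatcher to drop the connection. */
void on_message(struct fdset *set, int fd, void *ctx, int *remove)
{
	(void)set; (void)fd; (void)ctx;
	*remove = 1;
}

/* Listener stand-in: registers a made-up "accepted" fd from inside the cb. */
void on_new_conn(struct fdset *set, int fd, void *ctx, int *remove)
{
	(void)ctx; (void)remove;
	fdset_add(set, fd + 100, on_message, NULL);
}
```

Running the callback without the mutex is what lets on_new_conn call fdset_add without deadlocking, and the busy flag is what keeps another thread from deleting the entry in the meantime.
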
9 years agovhost: support ifname for vhost-user
Huawei Xie [Mon, 23 Feb 2015 17:36:32 +0000 (17:36 +0000)]
vhost: support ifname for vhost-user

for vhost-cuse, ifname is the name of the tap device
for vhost-user, ifname is the name of the unix domain socket path

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
9 years agovhost: support vhost-user
Huawei Xie [Mon, 23 Feb 2015 17:36:31 +0000 (17:36 +0000)]
vhost: support vhost-user

In rte_vhost_driver_register(), vhost unix domain socket listener fd is created
and added to polled(based on select) fdset.

In rte_vhost_driver_session_start(), fds in the fdset are checked for
processing. If there is new connection from qemu, connection fd accepted is
added to polled fdset. The listener and connection fds in the fdset are
then both checked. When there is message on the connection fd, its
callback vserver_message_handler is called to process vhost-user messages.

To identify which virtio device belongs to which guest VM, we could call
rte_vhost_driver_register with a different socket path per VM. Virtio devices
from the same VM will connect to the VM-specific socket. The socket path
information is stored in the virtio_net structure.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
9 years agovhost: add select based event driven processing
Huawei Xie [Mon, 23 Feb 2015 17:36:30 +0000 (17:36 +0000)]
vhost: add select based event driven processing

for more generic event driven processing, refer to:
http://libevent.org/

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
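
As a rough illustration of select-based event-driven processing, the sketch below waits on a set of file descriptors and runs a callback for each readable one. The struct event type, event_dispatch_once and count_byte are hypothetical names chosen for this example; only select(), read() and the FD_* macros are real POSIX interfaces.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

typedef void (*event_cb)(int fd, void *ctx);

struct event {
	int fd;
	event_cb cb;
	void *ctx;
};

/*
 * Waits up to timeout_ms for any registered fd to become readable and
 * runs its callback. Returns the number of callbacks run, 0 on timeout,
 * or -1 if select() fails.
 */
int event_dispatch_once(struct event *evs, int n, int timeout_ms)
{
	fd_set rfds;
	int maxfd = -1;

	FD_ZERO(&rfds);
	for (int i = 0; i < n; i++) {
		FD_SET(evs[i].fd, &rfds);
		if (evs[i].fd > maxfd)
			maxfd = evs[i].fd;
	}

	struct timeval tv;
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	int ready = select(maxfd + 1, &rfds, NULL, NULL, &tv);
	if (ready <= 0)
		return ready;

	int handled = 0;
	for (int i = 0; i < n; i++) {
		if (FD_ISSET(evs[i].fd, &rfds)) {
			evs[i].cb(evs[i].fd, evs[i].ctx);
			handled++;
		}
	}
	return handled;
}

/* Example callback: consume one byte and count it in *ctx. */
void count_byte(int fd, void *ctx)
{
	char c;

	if (read(fd, &c, 1) == 1)
		(*(int *)ctx)++;
}
```

select() is limited to descriptors below FD_SETSIZE, which is one reason the commit points to libevent for more generic needs.
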
9 years agovhost: implement cuse memory table
Huawei Xie [Mon, 23 Feb 2015 17:36:29 +0000 (17:36 +0000)]
vhost: implement cuse memory table

remove set_memory_table ops

vhost-cuse and vhost-user will each implement their own set_memory_region handler.

In current vhost-cuse implementation, guest numa memory isn't supported.
Assume that guest memory is backed by only one file.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
9 years agovhost: make host memory mapping more generic
Huawei Xie [Mon, 23 Feb 2015 17:36:28 +0000 (17:36 +0000)]
vhost: make host memory mapping more generic

This function accepts a virtual address and pid (qemu), and maps it into
the current process (vhost)'s address space.

The memory behind the virtual address should be backed by a file,
and virtual address should be the starting address.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
9 years agovhost: copy host memory mapping to a new cuse file
Huawei Xie [Mon, 23 Feb 2015 17:36:27 +0000 (17:36 +0000)]
vhost: copy host memory mapping to a new cuse file

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
9 years agovhost: move fd copying into cuse subdirectory
Huawei Xie [Mon, 23 Feb 2015 17:36:26 +0000 (17:36 +0000)]
vhost: move fd copying into cuse subdirectory

File descriptor is copied from qemu process into vhost process.
vhost-user doesn't need eventfd kernel module to copy fds between processes.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
9 years agovhost: rename header file
Huawei Xie [Mon, 23 Feb 2015 17:36:25 +0000 (17:36 +0000)]
vhost: rename header file

Rename vhost-net-cdev.h to vhost-net.h.
This file defines common operations provided by virtio-net(.c).

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
9 years agovhost: move cuse related handling in a subdirectory
Huawei Xie [Mon, 23 Feb 2015 17:36:24 +0000 (17:36 +0000)]
vhost: move cuse related handling in a subdirectory

Create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse.

vhost-cuse driver will be divided into two parts: cuse driver specific message
handling(in cuse directory) and common message handling(in virtio-net.c).

vhost ioctl message is pre-processed in cuse and then sent to virtio-net
if it is not terminated.

virtio-net.c provides common message handling for both vhost-cuse and vhost-user.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
9 years agovhost: enable virtio control channel Rx mode
Huawei Xie [Mon, 23 Feb 2015 17:36:23 +0000 (17:36 +0000)]
vhost: enable virtio control channel Rx mode

VIRTIO_NET_F_CTRL_RX is dependent on VIRTIO_NET_F_CTRL_VQ.
Observed that virtio-net driver in guest would crash with only CTRL_RX enabled.

In virtnet_send_command:

/* Caller should know better */
BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ||
(out + in > VIRTNET_SEND_COMMAND_SG_MAX));

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
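
The feature dependency noted in the commit above can be expressed as a small guard over the feature bitmap. This is a sketch only: sanitize_features is a hypothetical helper and not part of the patch, while the bit positions 17 and 18 are the values the virtio specification assigns to these two features.

```c
#include <stdint.h>

#define VIRTIO_NET_F_CTRL_VQ 17	/* control channel available */
#define VIRTIO_NET_F_CTRL_RX 18	/* control channel RX mode support */

/*
 * Drops CTRL_RX from a feature bitmap when CTRL_VQ is absent, since the
 * guest driver cannot send RX mode commands without a control queue.
 */
uint64_t sanitize_features(uint64_t features)
{
	const uint64_t vq = 1ULL << VIRTIO_NET_F_CTRL_VQ;
	const uint64_t rx = 1ULL << VIRTIO_NET_F_CTRL_RX;

	if ((features & rx) && !(features & vq))
		features &= ~rx;
	return features;
}
```

Advertising both bits together, or neither, avoids the BUG_ON quoted in the commit message.
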
9 years agoexamples/rxtx_callbacks: show use of callbacks
Bruce Richardson [Mon, 23 Feb 2015 18:30:10 +0000 (18:30 +0000)]
examples/rxtx_callbacks: show use of callbacks

Example showing how callbacks can be used to insert a timestamp
into each packet on RX. On TX the timestamp is used to calculate
the packet latency through the app, in cycles.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agoethdev: support optional Rx and Tx callbacks
Bruce Richardson [Mon, 23 Feb 2015 18:30:09 +0000 (18:30 +0000)]
ethdev: support optional Rx and Tx callbacks

Add optional support for inline processing of packets inside the RX
or TX call. For an RX callback, what happens is that we get a set of
packets from the NIC and then pass them to a callback function, if
configured, to allow additional processing to be done on them, e.g.
filling in more mbuf fields, before passing back to the application.
On TX, the packets are similarly post-processed before being handed
to the NIC for transmission.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
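
The RX half of the callback mechanism above can be modelled without any DPDK headers. Everything in this sketch is a hypothetical stand-in: struct pkt plays the role of an mbuf, rx_burst the role of the ethdev RX call, fake_hw_rx a driver, and stamp_cb a callback that fills in a timestamp in the spirit of the rxtx_callbacks example.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt {
	uint64_t timestamp;	/* extra field filled in by the callback */
};

typedef uint16_t (*rx_cb)(struct pkt **pkts, uint16_t nb, void *arg);

struct rx_queue {
	uint16_t (*hw_rx)(struct pkt **pkts, uint16_t max);	/* driver burst */
	rx_cb cb;	/* optional post-RX callback, NULL when not configured */
	void *cb_arg;
};

/* Receive a burst, then let the callback post-process it if one is set. */
uint16_t rx_burst(struct rx_queue *q, struct pkt **pkts, uint16_t max)
{
	uint16_t nb = q->hw_rx(pkts, max);

	if (q->cb != NULL)
		nb = q->cb(pkts, nb, q->cb_arg);
	return nb;
}

/* Fake driver: hands out up to 4 packets from a static pool. */
static struct pkt pool[4];

uint16_t fake_hw_rx(struct pkt **pkts, uint16_t max)
{
	uint16_t n = max < 4 ? max : 4;

	for (uint16_t i = 0; i < n; i++) {
		pool[i].timestamp = 0;
		pkts[i] = &pool[i];
	}
	return n;
}

/* Callback: stamp every packet with the time passed in through arg. */
uint16_t stamp_cb(struct pkt **pkts, uint16_t nb, void *arg)
{
	uint64_t now = *(uint64_t *)arg;

	for (uint16_t i = 0; i < nb; i++)
		pkts[i]->timestamp = now;
	return nb;
}
```

Because the callback returns a packet count, the same hook shape could also drop or filter packets before the application sees them.
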
9 years agoethdev: rename interrupt callbacks field
Bruce Richardson [Mon, 23 Feb 2015 18:30:08 +0000 (18:30 +0000)]
ethdev: rename interrupt callbacks field

The 'callbacks' member of the rte_eth_dev structure has been renamed
to 'link_intr_cbs' to make it clear that it refers to callbacks from
NIC interrupts. This allows us to add other types of callbacks to
the structure without ambiguity.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoi40e: enable internal switch of PF
Jingjing Wu [Thu, 29 Jan 2015 01:41:55 +0000 (09:41 +0800)]
i40e: enable internal switch of PF

This patch enables PF's internal switch by setting ALLOWLOOPBACK
flag when VEB is created. With this patch, traffic from PF can be
switched on the VEB.

Test report: http://www.dpdk.org/ml/archives/dev/2015-February/013237.html

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
9 years agoi40e: fix vsi configuration
Jingjing Wu [Thu, 29 Jan 2015 01:41:54 +0000 (09:41 +0800)]
i40e: fix vsi configuration

In i40e_vsi_config_tc_queue_mapping, the flag indicating another valid
setting should be ORed into valid_sections rather than assigned to it;
otherwise the assignment overwrites the flags set before.

Test report: http://www.dpdk.org/ml/archives/dev/2015-February/013237.html

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
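
The difference between assigning and ORing a flag, as described in the commit above, fits in a few lines. The flag values and the struct vsi_info type below are hypothetical placeholders; the real definitions live in the i40e headers.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define SECTION_VLAN_VALID	0x0002
#define SECTION_QUEUE_MAP_VALID	0x0040

struct vsi_info {
	uint16_t valid_sections;
};

/* Buggy form: plain assignment discards every flag set earlier. */
void mark_queue_map_buggy(struct vsi_info *info)
{
	info->valid_sections = SECTION_QUEUE_MAP_VALID;
}

/* Fixed form: OR keeps the earlier flags and adds the new one. */
void mark_queue_map_fixed(struct vsi_info *info)
{
	info->valid_sections |= SECTION_QUEUE_MAP_VALID;
}
```
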
9 years agoi40e: workaround for XL710 performance
Helin Zhang [Mon, 29 Dec 2014 01:41:28 +0000 (09:41 +0800)]
i40e: workaround for XL710 performance

On XL710, performance is far below expectation on recent
firmware versions if promiscuous mode is disabled, or if promiscuous
mode is enabled and the port MAC address is equal to the packet
destination MAC address. The fix for this issue may not be
integrated in the next firmware version, so a workaround in the
software driver is needed. For XL710, the workaround modifies the initial
values of 3 internal-only registers, which are the same registers as on X710.
Note that the values for X710 and XL710 registers could be different,
and the workaround can be removed once the issue is fixed in firmware.

Test report: http://www.dpdk.org/ml/archives/dev/2015-February/012749.html

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
9 years agoeal/linux: allow to map BARs with MSI-X tables
Dan Aloni [Wed, 28 Jan 2015 22:04:53 +0000 (00:04 +0200)]
eal/linux: allow to map BARs with MSI-X tables

While VFIO doesn't allow us to map complete BARs with MSI-X tables,
it does allow us to map around them in PAGE_SIZE granularity. There
might be adapters that provide their registers in the same BAR
but on a different page. For example, Intel's NVME adapter, though
not a network adapter, provides only one MMIO BAR that contains
the MSI-X table.

Signed-off-by: Dan Aloni <dan@kernelim.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
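
Mapping around the MSI-X table at page granularity amounts to rounding the table's start down and its end up to page boundaries, then mapping what is left on either side. The function below is a sketch of that arithmetic only; struct region and map_around_msix are hypothetical names, and no VFIO calls are made.

```c
#include <assert.h>
#include <stdint.h>

struct region {
	uint64_t offset;
	uint64_t size;
};

/*
 * Splits a BAR of bar_size bytes into at most two mappable regions that
 * avoid every page touched by the MSI-X table at [tbl_off, tbl_off +
 * tbl_size). page_size must be a power of two. Returns the region count.
 */
int map_around_msix(uint64_t bar_size, uint64_t tbl_off, uint64_t tbl_size,
		    uint64_t page_size, struct region out[2])
{
	assert(tbl_size > 0);
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	uint64_t mask = page_size - 1;
	uint64_t hole_start = tbl_off & ~mask;			/* round down */
	uint64_t hole_end = (tbl_off + tbl_size + mask) & ~mask;	/* round up */
	int n = 0;

	if (hole_start > 0) {
		out[n].offset = 0;
		out[n].size = hole_start;
		n++;
	}
	if (hole_end < bar_size) {
		out[n].offset = hole_end;
		out[n].size = bar_size - hole_end;
		n++;
	}
	return n;
}
```

For a 16 KiB BAR with a 256-byte table at offset 0x2000 and 4 KiB pages, this yields two regions: 0x0000 to 0x1fff and 0x3000 to 0x3fff.
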
9 years agombuf: remove build option to disable refcnt
Sergio Gonzalez Monroy [Wed, 18 Feb 2015 11:03:03 +0000 (11:03 +0000)]
mbuf: remove build option to disable refcnt

This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
field in the mbuf struct permanently.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agombuf: introduce indirect attached flag
Sergio Gonzalez Monroy [Wed, 18 Feb 2015 11:03:02 +0000 (11:03 +0000)]
mbuf: introduce indirect attached flag

Currently for mbufs with refcnt, we cannot free mbufs with external memory
buffers (i.e. vhost zero copy), as they are recognized as indirect
attached mbufs and therefore we free the direct mbuf it points to,
resulting in an error in the case of external memory buffers.

We solve the issue by introducing the IND_ATTACHED_MBUF flag, which indicates
that the mbuf is an indirect attached mbuf pointing to another mbuf.
When we free an mbuf, we only free the direct mbuf if the flag is set.
Freeing an mbuf with external buffer is the same as freeing a non attached mbuf.
The flag is set during attach and cleared on detach.

So in the case of vhost zero copy where we have mbufs with external
buffers, by default we just free the mbuf and it is up to the user to deal with
the external buffer.

This patch would allow the removal of the RTE_MBUF_REFCNT config option,
setting refcnt for all mbufs permanently.

The patch also modifies the vhost example as it was using the
RTE_MBUF_INDIRECT macro to detect if it was an mbuf with external buffer.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
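
The free path described above can be sketched with a toy mbuf. This is a simplified model under assumptions: struct mbuf, the freed counter and the three functions are hypothetical, the flag's bit position is an assumption made for this sketch, and no mempool or atomic refcnt handling is shown.

```c
#include <stddef.h>
#include <stdint.h>

#define IND_ATTACHED_MBUF (1ULL << 62)	/* bit position assumed for this sketch */

struct mbuf {
	uint64_t ol_flags;
	int refcnt;
	struct mbuf *direct;	/* valid only while IND_ATTACHED_MBUF is set */
	int freed;		/* counts returns to the pool, for inspection */
};

/* Attach mi to md: mi now shares md's buffer, so md gains a reference. */
void mbuf_attach(struct mbuf *mi, struct mbuf *md)
{
	md->refcnt++;
	mi->direct = md;
	mi->ol_flags |= IND_ATTACHED_MBUF;
}

void mbuf_detach(struct mbuf *mi)
{
	mi->direct = NULL;
	mi->ol_flags &= ~IND_ATTACHED_MBUF;
}

/*
 * Free one reference. The direct mbuf is released only when the flag
 * says this mbuf is attached to one; an mbuf with an external buffer has
 * no flag set and is freed like a plain mbuf.
 */
void mbuf_free(struct mbuf *m)
{
	if (--m->refcnt > 0)
		return;
	if (m->ol_flags & IND_ATTACHED_MBUF) {
		struct mbuf *md = m->direct;

		mbuf_detach(m);
		if (--md->refcnt == 0)
			md->freed++;
	}
	m->freed++;
}
```

Keying the decision on an explicit flag rather than on where the buffer pointer lands is what lets external-buffer mbufs go through the ordinary free path.
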
9 years agokni: add build option to disable preempting
Marc Sune [Fri, 13 Feb 2015 14:25:25 +0000 (15:25 +0100)]
kni: add build option to disable preempting

This patch introduces CONFIG_RTE_KNI_PREEMPT_DEFAULT flag. When set to 'no',
KNI kernel thread(s) do not call schedule_timeout_interruptible(), which
improves overall KNI performance at the expense of CPU cycles (polling).

Default value is 'yes', maintaining the current behaviour.

Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoapp/test: remove redundant compile checks
Yerden Zhumabekov [Thu, 29 Jan 2015 08:50:47 +0000 (14:50 +0600)]
app/test: remove redundant compile checks

Since rte_hash_crc() can now be run regardless of SSE4.2 support,
we can safely remove compile checks for RTE_MACHINE_CPUFLAG_SSE4_2
in test utilities.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agohash: slice CRC data into 8-byte pieces
Yerden Zhumabekov [Thu, 29 Jan 2015 08:50:26 +0000 (14:50 +0600)]
hash: slice CRC data into 8-byte pieces

Calculating the hash for data of variable length is more efficient
when that data is sliced into 8-byte pieces. The remaining part of the data
is hashed using CRC32 functions with either 8-byte or 4-byte operands.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
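
The slicing idea above can be sketched in portable C. This is not the rte_hash_crc implementation: the function names are hypothetical, a slow bitwise CRC32C update stands in for the hardware or table-driven step, and the tail handling (one 4-byte step, then the leftover bytes) is an assumption about how a remainder might be processed.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78) over the
 * low nbytes bytes of value, least significant byte first. */
static uint32_t crc32c_update(uint64_t value, unsigned nbytes, uint32_t crc)
{
	for (unsigned i = 0; i < nbytes; i++) {
		crc ^= (uint8_t)(value >> (8 * i));
		for (int b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Little-endian load of n (at most 8) bytes, independent of host order. */
static uint64_t load_le(const uint8_t *p, unsigned n)
{
	uint64_t v = 0;

	for (unsigned i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Hash len bytes: 8-byte pieces first, then a 4-byte piece, then the rest. */
uint32_t crc32c_sliced(const void *data, size_t len, uint32_t init)
{
	const uint8_t *p = data;
	uint32_t crc = init;

	while (len >= 8) {
		crc = crc32c_update(load_le(p, 8), 8, crc);
		p += 8;
		len -= 8;
	}
	if (len >= 4) {
		crc = crc32c_update(load_le(p, 4), 4, crc);
		p += 4;
		len -= 4;
	}
	if (len > 0)
		crc = crc32c_update(load_le(p, (unsigned)len), (unsigned)len, crc);
	return crc;
}
```

With an initial value of 0xFFFFFFFF and a final inversion, this reproduces the standard CRC-32C check value 0xE3069283 for the ASCII string "123456789", which shows that slicing does not change the result.
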
9 years agohash: fallback to software CRC32 implementation
Yerden Zhumabekov [Thu, 29 Jan 2015 08:50:03 +0000 (14:50 +0600)]
hash: fallback to software CRC32 implementation

Initially, SSE4.2 support is detected via the constructor function.

Added rte_hash_crc_set_alg() function to detect and set CRC32
implementation if necessary. SSE4.2 is allowed by default.

rte_hash_crc_*byte() functions are reworked so that they choose the available
CRC32 implementation at run time.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agohash: add CRC function for 8 bytes
Yerden Zhumabekov [Thu, 29 Jan 2015 08:49:47 +0000 (14:49 +0600)]
hash: add CRC function for 8 bytes

SSE4.2 provides CRC32 intrinsic with 8-byte operand.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agohash: replace built-in functions implementing SSE4.2
Yerden Zhumabekov [Thu, 29 Jan 2015 08:49:17 +0000 (14:49 +0600)]
hash: replace built-in functions implementing SSE4.2

Give up using built-in intrinsics and use our own assembly
implementation. Remove #include entry as well.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agohash: add assembly implementation of CRC32 intrinsics
Yerden Zhumabekov [Thu, 29 Jan 2015 08:48:59 +0000 (14:48 +0600)]
hash: add assembly implementation of CRC32 intrinsics

Added:
- crc32c_sse42_u32() emits 'crc32l' asm instruction;
- crc32c_sse42_u64() emits 'crc32q' asm instruction;
- crc32c_sse42_u64_mimic(), wrapper in case of run on 32-bit platform.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agohash: add software CRC32 implementation
Yerden Zhumabekov [Thu, 29 Jan 2015 08:48:41 +0000 (14:48 +0600)]
hash: add software CRC32 implementation

Add lookup tables for CRC32 algorithm, crc32c_1word() and
crc32c_2words() functions returning hash of 32-bit and 64-bit
operand.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agoapp/testpmd: support NVGRE in Tx checksum offload
Jijiang Liu [Fri, 20 Feb 2015 17:01:47 +0000 (17:01 +0000)]
app/testpmd: support NVGRE in Tx checksum offload

Enhance csum fwd engine based on current TX checksum framework in order
to test TX Checksum offload for NVGRE packet.

It includes:
 - IPv4 and IPv6 packet
 - outer L3, inner L3 and L4 checksum offload for Tx side.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
9 years agoapp/testpmd: support NVGRE in Rx tunnel filtering
Jijiang Liu [Fri, 20 Feb 2015 17:01:46 +0000 (17:01 +0000)]
app/testpmd: support NVGRE in Rx tunnel filtering

Extend the "tunnel_filter" command in testpmd to test the RX tunnel filter API for NVGRE packet.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
9 years agoi40e: support NVGRE in Rx tunnel filtering
Jijiang Liu [Fri, 20 Feb 2015 17:01:45 +0000 (17:01 +0000)]
i40e: support NVGRE in Rx tunnel filtering

The filter types supported are listed below for NVGRE packet:
   1. Inner MAC and Inner VLAN ID.
   2. Inner MAC address, inner VLAN ID and tenant ID.
   3. Inner MAC and tenant ID.
   4. Inner MAC address.
   5. Outer MAC address, tenant ID and inner MAC address.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
9 years agoether: add transparent ethernet bridging type
Jijiang Liu [Fri, 20 Feb 2015 17:01:44 +0000 (17:01 +0000)]
ether: add transparent ethernet bridging type

Add an Ethernet type definition for Transparent Ethernet Bridging.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
9 years agoapp/testpmd: support new rss offloads
Helin Zhang [Wed, 4 Feb 2015 07:16:33 +0000 (15:16 +0800)]
app/testpmd: support new rss offloads

RSS offloads supported 'ip' and 'udp' only, which did not demonstrate
all of the hardware capabilities. This modification adds support for the
new RSS offloads 'tcp', 'sctp', 'ether' and 'all'.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoapp/testpmd: fix some indent
Helin Zhang [Wed, 4 Feb 2015 07:16:27 +0000 (15:16 +0800)]
app/testpmd: fix some indent

Added code style fixes.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoethdev: unification of RSS offload types
Helin Zhang [Wed, 4 Feb 2015 07:16:32 +0000 (15:16 +0800)]
ethdev: unification of RSS offload types

RSS offload types were defined separately for 1/10G and 40G NICs,
and had no relationship with flow types. This modification
unifies all RSS offload types for all PMDs. Unified RSS offload types
have new and common names which can be used by any PMD or
application, and are decoupled from specific hardware.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: merge with fm10k]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoethdev: unification of flow types
Helin Zhang [Wed, 4 Feb 2015 07:16:31 +0000 (15:16 +0800)]
ethdev: unification of flow types

Flow types were defined specifically for i40e hardware,
and could not be used for defining RSS offload types of all
PMDs. This patch removes the flow type enum and uses macros instead,
with new names. The new macros can be used for defining RSS
offload types later. Modifications are also made in i40e and
testpmd accordingly.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: merge with new flow director API]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoethdev: fix size of flow type mask array
Helin Zhang [Wed, 4 Feb 2015 07:16:30 +0000 (15:16 +0800)]
ethdev: fix size of flow type mask array

The size of the flow type mask array was calculated wrongly. The fix
is to align the flow type maximum index ID to the element bit
width, and then divide by the element bit width.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
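
The align-then-divide calculation described above is shown below for 32-bit mask elements. The function names are hypothetical, and the truncating variant is included only for contrast; it is not necessarily the calculation the patch replaced.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define ELEM_BITS (sizeof(uint32_t) * CHAR_BIT)	/* bits per mask element */

/* Plain division truncates, so 33 flow types would get a single element. */
size_t mask_array_size_truncating(size_t max_index)
{
	return max_index / ELEM_BITS;
}

/* Align the maximum index up to the element width, then divide by it. */
size_t mask_array_size(size_t max_index)
{
	size_t aligned = (max_index + ELEM_BITS - 1) / ELEM_BITS * ELEM_BITS;

	return aligned / ELEM_BITS;
}
```

With 32-bit elements, 33 flow types need two elements; the aligned form returns 2 where the truncating form returns 1.
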
9 years agoethdev: minor comment changes
Helin Zhang [Wed, 4 Feb 2015 07:16:28 +0000 (15:16 +0800)]
ethdev: minor comment changes

Added code style fixes.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoi40e: remove some useless line breaks
Helin Zhang [Wed, 4 Feb 2015 07:16:29 +0000 (15:16 +0800)]
i40e: remove some useless line breaks

Added code style fixes.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoethdev: remove old ethertype filter ABI
Thomas Monjalon [Sun, 22 Feb 2015 02:04:44 +0000 (03:04 +0100)]
ethdev: remove old ethertype filter ABI

The old ethertype filter API was removed in commit 75db20648,
but was still in (newly integrated) version map for ABI.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoethdev: remove old ntuple filter API
Jingjing Wu [Tue, 10 Feb 2015 04:48:32 +0000 (12:48 +0800)]
ethdev: remove old ntuple filter API

Following structures are removed:
 - rte_2tuple_filter
 - rte_5tuple_filter
Following APIs are removed:
 - rte_eth_dev_add_2tuple_filter
 - rte_eth_dev_remove_2tuple_filter
 - rte_eth_dev_get_2tuple_filter
 - rte_eth_dev_add_5tuple_filter
 - rte_eth_dev_remove_5tuple_filter
 - rte_eth_dev_get_5tuple_filter
It also moves macros TCP_*_FLAG to rte_eth_ctrl.h, and removes the macro
TCP_UGR_FLAG which duplicates TCP_URG_FLAG.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
[Thomas: remove also from version map]

9 years agoapp/testpmd: new commands for ntuple filter
Jingjing Wu [Tue, 10 Feb 2015 04:48:31 +0000 (12:48 +0800)]
app/testpmd: new commands for ntuple filter

Following commands of 5tuple and 2tuple filter are removed:
 - add_2tuple_filter (port_id) protocol (pro_value) (pro_mask)
   dst_port (port_value) (port_mask) flags (flg_value) priority (prio_value)
   queue (queue_id) index (idx)
 - remove_2tuple_filter (port_id) index (idx)
 - get_2tuple_filter (port_id) index (idx)
 - add_5tuple_filter (port_id) dst_ip (dst_address) src_ip (src_address)
   dst_port (dst_port_value) src_port (src_port_value) protocol (protocol_value)
   mask (mask_value) flags (flags_value) priority (prio_value)
   queue (queue_id) index (idx)
 - remove_5tuple_filter (port_id) index (idx)
 - get_5tuple_filter (port_id) index (idx)

New commands are added for 5tuple and 2tuple filter by using filter_ctrl API
and new ntuple filter structure:
 - 2tuple_filter (port_id) (add|del)
   dst_port (dst_port_value) protocol (protocol_value)
   mask (mask_value) tcp_flags (tcp_flags_value)
   priority (prio_value) queue (queue_id)
 - 5tuple_filter (port_id) (add|del)
   dst_ip (dst_address) src_ip (src_address)
   dst_port (dst_port_value) src_port (src_port_value)
   protocol (protocol_value)
   mask (mask_value) tcp_flags (tcp_flags_value)
   priority (prio_value) queue (queue_id)

Test report: http://dpdk.org/ml/archives/dev/2015-February/013049.html

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Huilong Xu <huilongx.xu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agoixgbe: migrate ntuple filter to new API
Jingjing Wu [Tue, 10 Feb 2015 04:48:29 +0000 (12:48 +0800)]
ixgbe: migrate ntuple filter to new API

This patch defines new functions dealing with ntuple filters, which
correspond to 5tuple filters in HW.
It removes old functions which deal with 5tuple filters.
Ntuple filters are handled through the ixgbe_dev_filter_ctrl entry point.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agoigb: migrate ntuple filter to new API
Jingjing Wu [Tue, 10 Feb 2015 04:48:30 +0000 (12:48 +0800)]
igb: migrate ntuple filter to new API

This patch defines new functions dealing with ntuple filters, which
correspond to 2tuple filters for 82580 and i350 in HW, and to 5tuple
filters for 82576 in HW.
It removes old functions which deal with 2tuple and 5tuple filters in igb driver.
Ntuple filters are handled through the eth_igb_filter_ctrl entry point.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agoethdev: new ntuple filter API
Jingjing Wu [Tue, 10 Feb 2015 04:48:28 +0000 (12:48 +0800)]
ethdev: new ntuple filter API

This patch defines ntuple filter type RTE_ETH_FILTER_NTUPLE and its structure rte_eth_ntuple_filter.
It also corrects the typo TCP_UGR_FLAG to TCP_URG_FLAG.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agoethdev: remove old syn filter API
Jingjing Wu [Wed, 11 Feb 2015 07:51:49 +0000 (15:51 +0800)]
ethdev: remove old syn filter API

Structure rte_syn_filter is removed.
Following APIs are removed:
  - rte_eth_dev_add_syn_filter
  - rte_eth_dev_remove_syn_filter
  - rte_eth_dev_get_syn_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: remove also from version map]

9 years agoapp/testpmd: new commands for syn filter
Jingjing Wu [Wed, 11 Feb 2015 07:51:48 +0000 (15:51 +0800)]
app/testpmd: new commands for syn filter

Following commands of syn filter are removed:
  - add_syn_filter (port_id) priority (high|low) queue (queue_id)
  - remove_syn_filter (port_id)
  - get_syn_filter (port_id)
New command is added for syn filter by using filter_ctrl API and new
syn filter structure:
  - syn_filter (port_id) (add|del) priority (high|low) queue (queue_id)

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoixgbe: migrate syn filter to new API
Jingjing Wu [Wed, 11 Feb 2015 07:51:46 +0000 (15:51 +0800)]
ixgbe: migrate syn filter to new API

This patch defines new functions dealing with syn filter.
It removes old functions which deal with syn filter.
Syn filter is handled through the ixgbe_dev_filter_ctrl entry point.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoigb: migrate syn filter to new API
Jingjing Wu [Wed, 11 Feb 2015 07:51:47 +0000 (15:51 +0800)]
igb: migrate syn filter to new API

This patch defines new functions dealing with syn filter.
It removes old functions of syn filter in igb driver.
Syn filter is handled through the eth_igb_filter_ctrl entry point.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoethdev: new syn filter API
Jingjing Wu [Wed, 11 Feb 2015 07:51:45 +0000 (15:51 +0800)]
ethdev: new syn filter API

This patch defines syn filter type RTE_ETH_FILTER_SYN and its structure rte_eth_syn_filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoethdev: remove old flex filter API
Jingjing Wu [Sat, 21 Feb 2015 01:53:10 +0000 (01:53 +0000)]
ethdev: remove old flex filter API

Structure rte_flex_filter is removed.
Following APIs are removed:
  - rte_eth_dev_add_flex_filter
  - rte_eth_dev_remove_flex_filter
  - rte_eth_dev_get_flex_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: remove also from version map]

9 years agoapp/testpmd: new commands for flex filter
Jingjing Wu [Sat, 21 Feb 2015 01:53:09 +0000 (01:53 +0000)]
app/testpmd: new commands for flex filter

Following commands of flex filter are removed:
  - add_flex_filter (port_id) len (len_value) bytes (bytes_string) mask (mask_value)
    priority (prio_value) queue (queue_id)
  - remove_flex_filter (port_id) index (idx)
  - get_flex_filter (port_id) index (idx)
New command is added for flex filter by using filter_ctrl API and new flex filter structure:
  - flex_filter (port_id) (add|del) len (len_value) bytes (bytes_value) mask (mask_value)
    priority (prio_value) queue (queue_id)

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agoigb: migrate flex filter to new API
Jingjing Wu [Sat, 21 Feb 2015 01:53:08 +0000 (01:53 +0000)]
igb: migrate flex filter to new API

This patch defines new functions dealing with flex filter.
It removes old functions of flex filter in igb driver.
Flex filter is handled through the eth_igb_filter_ctrl entry point.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoethdev: new flex filter API
Jingjing Wu [Sat, 21 Feb 2015 01:53:07 +0000 (01:53 +0000)]
ethdev: new flex filter API

This patch defines flex filter type RTE_ETH_FILTER_FLEXIBLE and its structure rte_eth_flex_filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
9 years agoapp/testpmd: set default flow director mask
Jingjing Wu [Thu, 29 Jan 2015 05:29:23 +0000 (13:29 +0800)]
app/testpmd: set default flow director mask

This patch sets the default value of flow director's mask.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoapp/testpmd: update display of flow director information
Jingjing Wu [Thu, 29 Jan 2015 05:29:22 +0000 (13:29 +0800)]
app/testpmd: update display of flow director information

Update the function to print information including:
 - capability
 - mask
 - flex configuration

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoapp/testpmd: update flow director commands
Jingjing Wu [Thu, 29 Jan 2015 05:29:21 +0000 (13:29 +0800)]
app/testpmd: update flow director commands

Add new command to set flow director's mask:
  - flow_director_mask
Update arguments of commands:
  - flow_director_filter
  - flow_director_flex_mask
  - flow_director_flex_payload
Following commands of flow director filter are removed:
  - add_signature_filter
  - upd_signature_filter
  - rm_signature_filter
  - add_perfect_filter
  - upd_perfect_filter
  - rm_perfect_filter
  - set_masks_filter
  - set_ipv6_masks_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoixgbe: support flow director flush
Jingjing Wu [Thu, 29 Jan 2015 05:29:20 +0000 (13:29 +0800)]
ixgbe: support flow director flush

This patch implements the RTE_ETH_FILTER_FLUSH operation to delete all
flow director filters in the ixgbe driver.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoixgbe: migrate flow director info and statistic to new API
Jingjing Wu [Thu, 29 Jan 2015 05:29:19 +0000 (13:29 +0800)]
ixgbe: migrate flow director info and statistic to new API

This patch changes the get info operation to be implemented through
filter_ctrl API and RTE_ETH_FILTER_INFO/RTE_ETH_FILTER_STATS ops.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoixgbe: support new flow director masks
Jingjing Wu [Thu, 29 Jan 2015 05:29:18 +0000 (13:29 +0800)]
ixgbe: support new flow director masks

This patch implements the mask configuration of the flow director filter
by using the mask defined in rte_fdir_conf instead of the callback function
fdir_set_masks.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoethdev: add new flow director masks
Jingjing Wu [Thu, 29 Jan 2015 05:29:17 +0000 (13:29 +0800)]
ethdev: add new flow director masks

This patch defines structure rte_eth_fdir_masks.
It extends rte_fdir_conf and rte_eth_fdir_info to contain mask's configuration.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoethdev: remove flexbytes offset from flow director
Jingjing Wu [Thu, 29 Jan 2015 05:29:16 +0000 (13:29 +0800)]
ethdev: remove flexbytes offset from flow director

This patch removes the flexbytes_offset from rte_fdir_conf, because
the flexible payload setting is done by flex_conf instead of flexbytes_offset.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoixgbe: support flexpayload configuration of flow director
Jingjing Wu [Thu, 29 Jan 2015 05:29:13 +0000 (13:29 +0800)]
ixgbe: support flexpayload configuration of flow director

This patch implements the flexpayload configuration of the flow director filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoethdev: extend flow type and flexible payload type for flow director
Jingjing Wu [Thu, 29 Jan 2015 05:29:12 +0000 (13:29 +0800)]
ethdev: extend flow type and flexible payload type for flow director

This patch adds RTE_ETH_FLOW_TYPE_RAW and RTE_ETH_RAW_PAYLOAD to support
a flexible payload that starts from the beginning of the packet.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agoixgbe: migrate flow director filtering to new API
Jingjing Wu [Thu, 29 Jan 2015 05:29:11 +0000 (13:29 +0800)]
ixgbe: migrate flow director filtering to new API

This patch changes the add/delete/update operations to be implemented through
filter_ctrl API and RTE_ETH_FILTER_ADD/RTE_ETH_FILTER_DELETE/RTE_ETH_FILTER_UPDATE ops.
It also removes the callback functions:
 - ixgbe_eth_dev_ops.fdir_add_signature_filter
 - ixgbe_eth_dev_ops.fdir_update_signature_filter
 - ixgbe_eth_dev_ops.fdir_remove_signature_filter
 - ixgbe_eth_dev_ops.fdir_add_perfect_filter
 - ixgbe_eth_dev_ops.fdir_update_perfect_filter
 - ixgbe_eth_dev_ops.fdir_remove_perfect_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years agotools: enable binding device to uio_pci_generic
Danny Zhou [Fri, 20 Feb 2015 16:59:17 +0000 (16:59 +0000)]
tools: enable binding device to uio_pci_generic

Add uio_pci_generic to the list of supported kernel drivers.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agoeal/linux: toggle interrupt for uio_pci_generic
Danny Zhou [Fri, 20 Feb 2015 16:59:16 +0000 (16:59 +0000)]
eal/linux: toggle interrupt for uio_pci_generic

Enable/disable interrupts by manipulating a control bit of the command
register in the NIC's PCIe configuration space.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
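
The interrupt toggle described in the commit above can be sketched as a pure bit operation on the command register value. This is a minimal sketch, not the EAL code: the `sketch_` names are hypothetical, and it assumes the control bit in question is the standard PCI Interrupt Disable bit (bit 10 of the 16-bit command register at offset 0x04).

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI configuration space layout; the macro names are hypothetical. */
#define SKETCH_PCI_COMMAND       0x04u    /* offset of the 16-bit command register */
#define SKETCH_PCI_INTX_DISABLE  0x0400u  /* bit 10: Interrupt Disable */

/*
 * Return the command register value with INTx enabled (enable = 1)
 * or disabled (enable = 0). All other bits are preserved.
 */
static uint16_t
sketch_pci_cmd_set_intx(uint16_t cmd, int enable)
{
	assert(enable == 0 || enable == 1);
	if (enable)
		return (uint16_t)(cmd & ~SKETCH_PCI_INTX_DISABLE);
	return (uint16_t)(cmd | SKETCH_PCI_INTX_DISABLE);
}
```

A caller would presumably read the two bytes at SKETCH_PCI_COMMAND from the device's configuration space, pass them through this helper, and write the result back; how the configuration space is accessed is left out of the sketch.
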
9 years agoeal/linux: enable uio_pci_generic support
Danny Zhou [Fri, 20 Feb 2015 16:59:15 +0000 (16:59 +0000)]
eal/linux: enable uio_pci_generic support

Change the EAL PCI code so that it can work with both the
uio_pci_generic in-tree driver, as well as the igb_uio
DPDK-specific driver.

This involves the following changes:
1) Modify the method of retrieving BAR resource mapping information
2) Map using resource files in /sys rather than /dev/uio*
3) Set up the bus master bit in the NIC's PCIe configuration space for
uio_pci_generic.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agoapp/test: add unit tests for link bonding mode 6
Maciej Gajdzica [Fri, 20 Feb 2015 16:09:23 +0000 (17:09 +0100)]
app/test: add unit tests for link bonding mode 6

Added 4 unit tests checking link bonding mode 6 behavior.

Also modified virtual_pmd so it is possible to provide packets,
that should be received with rx_burst and to inspect packets
transmitted by tx_burst.

In packet_burst_generator.c function creating eth_header is
modified, so it accepts ether_type as a parameter and function
creating arp_header is added. Updated other unit tests to get
rid of compilation errors.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agobond: rename mode 5
Daniel Mrzyglod [Fri, 20 Feb 2015 16:09:22 +0000 (17:09 +0100)]
bond: rename mode 5

This patch renames the mode from
BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING to BONDING_MODE_TLB.
It also changes the order of the TEST_ASSERT macro in
test_tlb_verify_slave_link_status_change_failover.

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agoexamples/bond: add example application for link bonding mode 6
Michal Jastrzebski [Fri, 20 Feb 2015 16:09:21 +0000 (17:09 +0100)]
examples/bond: add example application for link bonding mode 6

This patch contains an example for link bonding mode 6.
It interacts with the user through a command prompt. Available commands are:
  - Start: starts ARP_thread, which responds to ARP_requests and sends
    ARP_updates (this is enabled by default after startup)
  - Stop: stops ARP_thread
  - Send count ip: sends count ARP requests for IP
  - Show: prints basic bond information, like IPv4 statistics from clients
  - Help
  - Quit
The best way to test mode 6 is to use this example together with
previous patch:
[PATCH 3/4] bond: add debug info for mode 6 link bonding.
Connect clients through a switch to the bonding machine and send:
arping -c 1 bond_ip or
generate IPv4 traffic to bond_ip (IPv4 traffic from different clients
should be then balanced on slaves in round robin manner).

Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agobond: add debug info for mode 6
Michal Jastrzebski [Fri, 20 Feb 2015 16:09:20 +0000 (17:09 +0100)]
bond: add debug info for mode 6

This patch adds some debug information when using link bonding mode 6.
If CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB == y, it prints basic information
about ARP packets on RX and TX (MAC, IP, packet number, ARP packet type).
If CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB_L1 is enabled instead of the previous
one, use the show command to see IPv4 balancing from clients.

Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agobond: add mode 6
Maciej Gajdzica [Fri, 20 Feb 2015 16:09:19 +0000 (17:09 +0100)]
bond: add mode 6

This mode includes adaptive TLB and receive load balancing (RLB). In RLB
the bonding driver intercepts ARP replies sent by the local system and
overwrites their source MAC address, so that different peers send data to
the server on different slave interfaces. When the local system sends an ARP
request, the driver saves the IP information from it. When the ARP reply from
that peer is received, its MAC is stored, one of the slave MACs is assigned,
and an ARP reply is sent to that peer.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
9 years agonet: change arp header struct declaration
Maciej Gajdzica [Fri, 20 Feb 2015 16:09:18 +0000 (17:09 +0100)]
net: change arp header struct declaration

Changed MAC address type from uint8_t[6] to struct ether_addr and IP
address type from uint8_t[4] to uint32_t to make it consistent with
other DPDK code using MAC and IP addresses. It allows us to use
is_same_ether_addr and ether_addr_copy functions on MAC addresses in ARP header.  Also
removed union from arp_hdr struct to make calls to arp_data items
shorter. Updated test-pmd to match new arp_hdr version.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
[Thomas: doxygenize comments]
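
The layout change described in the commit above can be sketched as follows. The `sketch_` type names and the field names are illustrative and should not be read as the exact declarations in the DPDK headers; the packed attribute assumes GCC or clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct ether_addr: six address bytes. */
struct sketch_ether_addr {
	uint8_t addr_bytes[6];
} __attribute__((__packed__));

/* ARP payload for IPv4 over Ethernet, using typed MAC and IP fields. */
struct sketch_arp_ipv4 {
	struct sketch_ether_addr arp_sha;  /* sender hardware address */
	uint32_t arp_sip;                  /* sender IP address */
	struct sketch_ether_addr arp_tha;  /* target hardware address */
	uint32_t arp_tip;                  /* target IP address */
} __attribute__((__packed__));

/* ARP header with the payload as a direct member (no union). */
struct sketch_arp_hdr {
	uint16_t arp_hrd;  /* hardware address format */
	uint16_t arp_pro;  /* protocol address format */
	uint8_t arp_hln;   /* hardware address length */
	uint8_t arp_pln;   /* protocol address length */
	uint16_t arp_op;   /* opcode */
	struct sketch_arp_ipv4 arp_data;
} __attribute__((__packed__));

/* Byte-wise MAC comparison, standing in for is_same_ether_addr. */
static int
sketch_same_ether_addr(const struct sketch_ether_addr *a,
		       const struct sketch_ether_addr *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

With typed fields, a comparison such as sketch_same_ether_addr(&hdr->arp_data.arp_sha, &hdr->arp_data.arp_tha) works directly on the header members, which is the convenience the commit message refers to.
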

9 years agovirtio: remove Rx hotspots from early return
Ouyang Changchun [Mon, 9 Feb 2015 01:14:13 +0000 (09:14 +0800)]
virtio: remove Rx hotspots from early return

Remove hotspots that are unnecessary when an early return occurs.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: use port IO to get PCI resource
Ouyang Changchun [Mon, 9 Feb 2015 01:14:06 +0000 (09:14 +0800)]
virtio: use port IO to get PCI resource

Make virtio not require UIO, for security reasons. This matches
6WIND's virtio-net-pmd.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: free mbuf's with threshold
Stephen Hemminger [Mon, 9 Feb 2015 01:14:05 +0000 (09:14 +0800)]
virtio: free mbuf's with threshold

This makes virtio driver work like ixgbe. Transmit buffers are
held until a transmit threshold is reached. The previous behavior
was to hold mbuf's until the ring entry was reused which caused
more memory usage than needed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: set MAC address
Stephen Hemminger [Mon, 9 Feb 2015 01:14:04 +0000 (09:14 +0800)]
virtio: set MAC address

Setting the default MAC address requires special handling.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: support multiple MAC addresses
Stephen Hemminger [Mon, 9 Feb 2015 01:14:03 +0000 (09:14 +0800)]
virtio: support multiple MAC addresses

Virtio supports multiple MAC addresses.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: support vlan filtering
Stephen Hemminger [Mon, 9 Feb 2015 01:14:02 +0000 (09:14 +0800)]
virtio: support vlan filtering

Virtio supports vlan filtering.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: fix states handling during initialization
Stephen Hemminger [Mon, 9 Feb 2015 01:13:58 +0000 (09:13 +0800)]
virtio: fix states handling during initialization

Change order of initialization to match Linux kernel.
Don't blow away control queue by doing reset when stopped.

Calling dev_stop then dev_start would not work.
Dev_stop was calling virtio reset and that would clear all queues
and clear all feature negotiation.
Resolved by only doing reset on device removal.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: move allocation before initialization
Stephen Hemminger [Mon, 9 Feb 2015 01:14:01 +0000 (09:14 +0800)]
virtio: move allocation before initialization

If allocation fails, don't want to leave virtio device stuck
in middle of initialization sequence.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agoexamples/vhost: add vlan strip command line option
Ouyang Changchun [Mon, 9 Feb 2015 01:14:10 +0000 (09:14 +0800)]
examples/vhost: add vlan strip command line option

Support turning RX VLAN strip on/off on the host. This gives the guest the chance to
use its software VLAN strip functionality.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agoexamples/vhost: avoid inserting vlan twice
Ouyang Changchun [Mon, 9 Feb 2015 01:14:09 +0000 (09:14 +0800)]
examples/vhost: avoid inserting vlan twice

Check whether the packet is already vlan-tagged; if so, avoid inserting a
duplicate vlan tag into it.

This can happen when the guest is capable of inserting the vlan tag itself.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: use soft vlan strip with mergeable Rx packets
Ouyang Changchun [Mon, 9 Feb 2015 01:14:11 +0000 (09:14 +0800)]
virtio: use soft vlan strip with mergeable Rx packets

To keep the logic consistent with the normal Rx path, the mergeable
Rx path also needs software vlan strip/decap if it is enabled.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: use soft vlan encap/decap
Stephen Hemminger [Mon, 9 Feb 2015 01:13:55 +0000 (09:13 +0800)]
virtio: use soft vlan encap/decap

Implement VLAN stripping in software. This allows application
to be device independent.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agoether: add soft vlan encap/decap
Stephen Hemminger [Mon, 9 Feb 2015 01:13:54 +0000 (09:13 +0800)]
ether: add soft vlan encap/decap

It is helpful to allow device drivers that don't support hardware
VLAN stripping to emulate this in software. This allows application
to be device independent.

Avoid discarding shared mbufs. Make a copy in rte_vlan_insert() of any
packet to be tagged that has a reference count > 1.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
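
The software VLAN encap/decap described in the commit above can be sketched on a flat frame buffer. This is a simplified illustration with hypothetical `sketch_` names: it shifts bytes inside a plain array instead of operating on an mbuf, and it does not model the copy that the commit describes for packets with a reference count > 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_HDR_LEN 14u  /* dst MAC + src MAC + ethertype */
#define SKETCH_VLAN_TAG_LEN   4u  /* TPID + TCI */

/*
 * Insert an 802.1Q tag (TPID 0x8100) after the two MAC addresses.
 * Returns the new frame length, or 0 if the frame is too short or
 * the buffer has no room for the tag.
 */
static size_t
sketch_vlan_insert(uint8_t *frame, size_t len, size_t cap, uint16_t tci)
{
	assert(frame != NULL);
	if (len < SKETCH_ETHER_HDR_LEN || len + SKETCH_VLAN_TAG_LEN > cap)
		return 0;
	/* Shift ethertype and payload right to open a 4-byte gap at offset 12. */
	memmove(frame + 16, frame + 12, len - 12);
	frame[12] = 0x81;
	frame[13] = 0x00;
	frame[14] = (uint8_t)(tci >> 8);
	frame[15] = (uint8_t)(tci & 0xff);
	return len + SKETCH_VLAN_TAG_LEN;
}

/*
 * Remove an 802.1Q tag if one is present and store its TCI in *tci.
 * Returns the new frame length; untagged frames are returned unchanged.
 */
static size_t
sketch_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
	assert(frame != NULL && tci != NULL);
	if (len < SKETCH_ETHER_HDR_LEN + SKETCH_VLAN_TAG_LEN ||
	    frame[12] != 0x81 || frame[13] != 0x00)
		return len;
	*tci = (uint16_t)((frame[14] << 8) | frame[15]);
	/* Close the gap: move the inner ethertype and payload back to offset 12. */
	memmove(frame + 12, frame + 16, len - 16);
	return len - SKETCH_VLAN_TAG_LEN;
}
```

A driver without hardware VLAN offload could apply the strip step on receive and the insert step on transmit, so the application sees the same behaviour on every device.
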
9 years agovirtio: remove redundant alignment field
Stephen Hemminger [Mon, 9 Feb 2015 01:13:57 +0000 (09:13 +0800)]
virtio: remove redundant alignment field

Since vq_alignment is constant (always 4K), it does not
need to be part of the vring struct.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: remove unnecessary adapter structure
Stephen Hemminger [Mon, 9 Feb 2015 01:13:56 +0000 (09:13 +0800)]
virtio: remove unnecessary adapter structure

Cleanup virtio code by eliminating unnecessary nesting of
virtio hardware structure inside adapter structure.
Also allows removing unneeded macro, making code clearer.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: support link state interrupt
Stephen Hemminger [Mon, 9 Feb 2015 01:13:53 +0000 (09:13 +0800)]
virtio: support link state interrupt

Virtio has link state interrupt which can be used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: allow starting with link down
Stephen Hemminger [Mon, 9 Feb 2015 01:13:52 +0000 (09:13 +0800)]
virtio: allow starting with link down

Starting driver with link down should be ok, it is with every
other driver. So just allow it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: fix update of vring descriptor index
Ouyang Changchun [Mon, 9 Feb 2015 01:14:15 +0000 (09:14 +0800)]
virtio: fix update of vring descriptor index

  - Update the vring descriptor index before notifying the host.
  - Remove 2 duplicated store memory barriers in both the Rx and Tx paths, because
    vq_update_avail_idx already contains a store memory barrier.
  - Notify the host only if packets were actually transmitted (nb_tx > 0).

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: use weaker barriers
Stephen Hemminger [Mon, 9 Feb 2015 01:13:51 +0000 (09:13 +0800)]
virtio: use weaker barriers

The DPDK driver only has to deal with the case of running on PCI
and with SMP. In this case, the code can use the weaker barriers
instead of using hard (fence) barriers. This will help performance.
The rationale is explained in Linux kernel virtio_ring.h.

To make it clearer that this is a virtio thing and not some generic
barrier, prefix the barrier calls with virtio_.

Add missing (and needed) barrier between updating ring data
structure and notifying host.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
9 years agovirtio: check packet headroom at compile time
Stephen Hemminger [Mon, 9 Feb 2015 01:14:00 +0000 (09:14 +0800)]
virtio: check packet headroom at compile time

Better to check at compile time than fail at runtime.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
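
A compile-time check of the kind described in the commit above might look like the sketch below. It assumes the check compares the mbuf headroom against the size of the virtio net header, which the commit message does not spell out. The names are hypothetical, and the headroom value of 128 is an assumed configuration value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed headroom configured for each mbuf (hypothetical value). */
#define SKETCH_PKTMBUF_HEADROOM 128

/* Legacy virtio net header that the driver places in front of each packet. */
struct sketch_virtio_net_hdr {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* The build fails here, instead of the driver failing at runtime. */
static_assert(SKETCH_PKTMBUF_HEADROOM >= sizeof(struct sketch_virtio_net_hdr),
	      "mbuf headroom too small for the virtio net header");
```

If the headroom were configured below the header size, the build would stop at the static_assert with the message above.
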
9 years agovirtio: make a function local
Stephen Hemminger [Mon, 9 Feb 2015 01:13:59 +0000 (09:13 +0800)]
virtio: make a function local

Make vtpci_get_status a local function as it is used in one file.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>