dpdk.git
7 years ago net/virtio: unbind interrupt/eventfd when stopping
Jianfeng Tan [Tue, 17 Jan 2017 07:10:27 +0000 (07:10 +0000)]
net/virtio: unbind interrupt/eventfd when stopping

When virtio devices get stopped, tell the kernel to unbind the
mapping between interrupts and eventfds.

Note: this differs from other NICs, which close the eventfds and
free the struct on stop. In virtio, we do those things when closing
the device, in the following patch.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: setup Rx queue interrupts
Jianfeng Tan [Tue, 17 Jan 2017 08:00:03 +0000 (08:00 +0000)]
net/virtio: setup Rx queue interrupts

This patch mainly allocates a structure to store the queue/irq mapping
and configures that mapping through PCI ops. It also creates an
eventfd for each Rx queue and tells the kernel about the eventfd/intr
binding.

Note: So far, we hard-code 1:1 queue/irq mapping (each rx queue has
one exclusive interrupt), like this:
  vec 0 -> config irq
  vec 1 -> rxq0
  vec 2 -> rxq1
  ...

which means, the "vectors" option of QEMU should be configured with
a value >= N+1 (N is the number of the queue pairs).
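
The 1:1 layout above can be sketched as follows. This is only an
illustration of the arithmetic; the helper names are hypothetical and
are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Vector 0 is reserved for the config interrupt in this layout. */
#define CONFIG_IRQ_VEC 0u

/* Hypothetical helper: vector used by Rx queue `rxq` under the
 * hard-coded 1:1 mapping (rxq0 -> vec 1, rxq1 -> vec 2, ...). */
uint16_t rxq_to_vector(uint16_t rxq)
{
	return (uint16_t)(rxq + 1);
}

/* Hypothetical helper: smallest "vectors" value that covers the
 * config irq plus one vector per queue pair. */
uint16_t min_vectors(uint16_t nb_queue_pairs)
{
	return (uint16_t)(nb_queue_pairs + 1);
}
```

For example, a guest with 4 queue pairs would need "vectors" to be at
least min_vectors(4), that is 5.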

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: add Rx interrupt enable/disable functions
Jianfeng Tan [Tue, 17 Jan 2017 07:10:25 +0000 (07:10 +0000)]
net/virtio: add Rx interrupt enable/disable functions

This patch implements interrupt enable/disable functions for each
Rx queue. We rely on the flags of the avail ring as the hint that tells
the virtio device whether or not to interrupt the virtio driver.
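
A minimal sketch of that hint, assuming the split-ring layout from the
virtio specification, where VRING_AVAIL_F_NO_INTERRUPT is bit 0 of the
avail ring flags. The struct and function names are illustrative only
and are not the ones used by the driver.

```c
#include <assert.h>
#include <stdint.h>

/* From the virtio spec: the driver sets this bit in avail->flags to
 * hint that the device need not interrupt it. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Illustrative header of an avail ring (the ring entries are omitted). */
struct avail_ring_hdr {
	uint16_t flags;
	uint16_t idx;
};

/* Hypothetical helper: ask the device to interrupt on used buffers. */
void rxq_intr_enable(struct avail_ring_hdr *avail)
{
	avail->flags &= (uint16_t)~VRING_AVAIL_F_NO_INTERRUPT;
}

/* Hypothetical helper: hint that interrupts are not wanted. */
void rxq_intr_disable(struct avail_ring_hdr *avail)
{
	avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
}
```

Since the flag is only a hint, a device may still interrupt after it is
set, so a driver should tolerate spurious interrupts.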

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: add PCI operation for queue/irq binding
Jianfeng Tan [Tue, 17 Jan 2017 07:10:24 +0000 (07:10 +0000)]
net/virtio: add PCI operation for queue/irq binding

Add handler in virtio_pci_ops to set queue/irq bind.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: add Rx descriptor check
Jianfeng Tan [Tue, 17 Jan 2017 07:10:23 +0000 (07:10 +0000)]
net/virtio: add Rx descriptor check

Under interrupt mode, rx_descriptor_done is used as an indicator
for applications to check if some number of packets are ready to
be received.

This patch enables this by comparing the used ring's local consumed idx
with the idx shared with the backend.
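
The comparison can be sketched like this, assuming both indexes are
free-running 16-bit counters as in a split virtqueue. The function and
parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of used entries the backend has published that the driver has
 * not consumed yet. The uint16_t cast keeps the result correct when the
 * shared index wraps around. */
uint16_t used_pending(uint16_t shared_used_idx, uint16_t local_cons_idx)
{
	return (uint16_t)(shared_used_idx - local_cons_idx);
}

/* Hypothetical check: is the descriptor `offset` entries ahead of the
 * consumer position already filled by the backend? */
int rx_descriptor_done(uint16_t shared_used_idx, uint16_t local_cons_idx,
		       uint16_t offset)
{
	return offset < used_pending(shared_used_idx, local_cons_idx);
}
```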

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: invoke method directly for setting IRQ config
Jianfeng Tan [Tue, 17 Jan 2017 07:10:22 +0000 (07:10 +0000)]
net/virtio: invoke method directly for setting IRQ config

We would need to define a prototype for such a wrapper, which makes
things too complicated. Remove the wrapper and call set_config_irq directly.

Suggested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: fix rewriting LSC flag
Jianfeng Tan [Tue, 17 Jan 2017 07:10:21 +0000 (07:10 +0000)]
net/virtio: fix rewriting LSC flag

The LSC flag is set according to whether the VIRTIO_NET_F_STATUS
feature is negotiated. Copying the PCI info after that decision
overwrites the correct result.

Fixes: 198ab33677c9 ("net/virtio: move device initialization in a function")
CC: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: enable multiqueue with kernel vhost
Jianfeng Tan [Fri, 13 Jan 2017 12:18:40 +0000 (12:18 +0000)]
net/virtio-user: enable multiqueue with kernel vhost

With vhost kernel, enabling multiqueue requires the backend device
in the kernel to support the multiqueue feature. Specifically, with tap
as the backend, as linux/Documentation/networking/tuntap.txt shows,
we check whether tap supports the IFF_MULTI_QUEUE feature.

For vhost kernel, each queue pair has its own vhost fd, with a tap
fd bound to that vhost fd. All tap fds are set with the same tap
interface name.
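
A rough sketch of such a check on Linux, using the TUNGETFEATURES ioctl
and the IFF_MULTI_QUEUE flag from <linux/if_tun.h>. The function names
are hypothetical, and this is not necessarily how the patch performs
the check.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/if_tun.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Pure helper: does a TUNGETFEATURES bitmask advertise multiqueue? */
int tap_features_have_mq(unsigned int features)
{
	return (features & IFF_MULTI_QUEUE) != 0;
}

/* Hypothetical probe: returns 1 if tap supports multiqueue, 0 if it
 * does not, and -1 if /dev/net/tun cannot be opened or queried. */
int tap_supports_mq(void)
{
	unsigned int features = 0;
	int fd = open("/dev/net/tun", O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, TUNGETFEATURES, &features) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	return tap_features_have_mq(features);
}
```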

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: enable offloading
Jianfeng Tan [Fri, 13 Jan 2017 12:18:39 +0000 (12:18 +0000)]
net/virtio-user: enable offloading

When used with the vhost kernel backend, we can offload in both directions.
  - From vhost kernel to virtio_user, the offload is enabled so that
    the DPDK app can trust the flow is checksum-correct; if the DPDK app
    sends it through another port, the checksum needs to be
    recalculated or offloaded. The same applies to TSO.
  - From virtio_user to vhost_kernel, the offload is enabled so that
    the kernel can trust the flow is L4-checksum-correct, with no need to
    verify it; if the kernel will consume it, the DPDK app should make
    sure the L3 checksum is correctly set.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: support kernel vhost
Jianfeng Tan [Fri, 13 Jan 2017 12:18:38 +0000 (12:18 +0000)]
net/virtio-user: support kernel vhost

This patch adds support for vhost kernel as the backend for virtio_user.
Three main hook functions are added:
  - vhost_kernel_setup() to open the char device; each vq pair needs one
    vhostfd;
  - vhost_kernel_ioctl() to communicate control messages with the vhost
    kernel module;
  - vhost_kernel_enable_queue_pair() to open the tap device and set it
    as the backend of the corresponding vhost fd (that is to say, vq pair).

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: abstract backend operations
Jianfeng Tan [Fri, 13 Jan 2017 12:18:37 +0000 (12:18 +0000)]
net/virtio-user: abstract backend operations

Add a struct virtio_user_backend_ops to abstract three kinds of backend
operations:
  - setup, create the unix socket connection;
  - send_request, sync messages with backend;
  - enable_qp, enable some queue pair.
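
The shape of such an ops table can be sketched as below. Only the three
member names come from the description above; the struct layouts, the
signatures, the dummy backend and backend_start() are assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative device state; the real structure differs. */
struct backend_dev {
	int fd;
	uint32_t enabled_qps;	/* bitmask of enabled queue pairs */
};

/* One table per backend kind; signatures are assumed. */
struct backend_ops {
	int (*setup)(struct backend_dev *dev);
	int (*send_request)(struct backend_dev *dev, uint32_t req, void *arg);
	int (*enable_qp)(struct backend_dev *dev, uint16_t pair_idx, int enable);
};

/* A dummy backend that only records what it was asked to do. */
static int dummy_setup(struct backend_dev *dev)
{
	dev->fd = 3;	/* pretend a connection was opened */
	return 0;
}

static int dummy_send_request(struct backend_dev *dev, uint32_t req, void *arg)
{
	(void)dev; (void)req; (void)arg;
	return 0;
}

static int dummy_enable_qp(struct backend_dev *dev, uint16_t pair_idx, int enable)
{
	if (enable)
		dev->enabled_qps |= 1u << pair_idx;
	else
		dev->enabled_qps &= ~(1u << pair_idx);
	return 0;
}

const struct backend_ops dummy_ops = {
	.setup = dummy_setup,
	.send_request = dummy_send_request,
	.enable_qp = dummy_enable_qp,
};

/* Callers go through the table and never name a concrete backend. */
int backend_start(struct backend_dev *dev, const struct backend_ops *ops,
		  uint16_t nb_qp)
{
	if (ops->setup(dev) < 0)
		return -1;
	for (uint16_t i = 0; i < nb_qp; i++)
		if (ops->enable_qp(dev, i, 1) < 0)
			return -1;
	return 0;
}
```

With this indirection, a second backend only has to supply its own
table.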

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: move vhost-user specific code
Jianfeng Tan [Fri, 13 Jan 2017 12:18:36 +0000 (12:18 +0000)]
net/virtio-user: move vhost-user specific code

To support vhost kernel as the backend of net_virtio_user in coming
patches, we move vhost_user specific structs and macros into
vhost_user.c, and only keep common definitions in vhost.h.

Besides, remove VHOST_USER_MQ feature check.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: fix not properly reset device
Jianfeng Tan [Fri, 13 Jan 2017 12:18:35 +0000 (12:18 +0000)]
net/virtio-user: fix not properly reset device

virtio_user is not properly reset when users call vtpci_reset(),
as it ignores VIRTIO_CONFIG_STATUS_RESET status in
virtio_user_set_status().

This might lead to initialization failure, as it starts to re-init
the device before sending the RESET message to the backend. Besides,
previous callfds and kickfds are not closed.

To fix it, we add support to disable virtqueues when it's set to
DRIVER OK status, and re-init fields in struct virtio_user_dev.

Fixes: e9efa4d93821 ("net/virtio-user: add new virtual PCI driver")
Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio-user: fix wrongly get/set features
Jianfeng Tan [Fri, 13 Jan 2017 12:18:34 +0000 (12:18 +0000)]
net/virtio-user: fix wrongly get/set features

Before the commit 86d59b21468a ("net/virtio: support LRO"), features
in the virtio PMD were decided and properly set at device initialization
and would not be changed. But afterward, features could be changed in
virtio_dev_configure(), and are re-negotiated if they change.

In virtio-user, device features are obtained only once, at driver probe
phase, but we did not store them. So negotiating the added feature bits
during re-negotiation will fail.

To fix it, we store them and use them for feature negotiation
either at device initialization phase or device configure phase.

Fixes: e9efa4d93821 ("net/virtio-user: add new virtual PCI driver")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: do not store PCI device pointer at shared memory
Yuanhan Liu [Thu, 12 Jan 2017 05:37:00 +0000 (13:37 +0800)]
net/virtio: do not store PCI device pointer at shared memory

hw->dev, a pointer to pci_dev, was actually not used until the
refactoring that decoupled ethdev from the PCI device. This would break
multiple process support again, since "hw" is stored in shared memory,
while "pci_dev" is not: the primary and secondary processes could
have different addresses for it, while just one value is allowed.

Thus we should not store it to "hw", instead, we could retrieve
it from the "eth_dev->device" field.

Fixes: ae34410a8a8a ("ethdev: move info filling of PCI into drivers")
Fixes: eac901ce29be ("ethdev: decouple from PCI device")

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: access interrupt handler directly
Yuanhan Liu [Thu, 12 Jan 2017 05:31:57 +0000 (13:31 +0800)]
net/virtio: access interrupt handler directly

Since commit 0e1b45a284b4 ("ethdev: decouple interrupt handling from
PCI device"), intr_handle is stored in the eth_dev struct, so we can
use it directly. Thus there is no need to get it from hw.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: fix multiple process support
Yuanhan Liu [Fri, 6 Jan 2017 10:16:19 +0000 (18:16 +0800)]
net/virtio: fix multiple process support

The introduction of virtio 1.0 support brought yet another set of ops.
Unfortunately, it was not handled correctly, which broke multiple process
support.

The issue is that the data/function pointers may vary between processes,
while the old code set them only once (for the primary process only). That
said, the function pointer the secondary process saw was actually from the
primary process space. Accessing it would likely result in a crash.

Thanks to the last patches, we are now able to maintain the info that may
vary among processes locally, meaning every process can have its
own copy of each of them, with the correct value set. And this is what
this patch does:

- remap the PCI (IO port for legacy device and memory map for modern
  device)

- set vtpci_ops correctly

After that, multiple process would work like a charm. (At least, it
passed my fuzzy test)

Fixes: b8f04520ad71 ("virtio: use PCI ioport API")
Fixes: d5bbeefca826 ("virtio: introduce PCI implementation structure")
Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")
Cc: stable@dpdk.org
Reported-by: Juho Snellman <jsnell@iki.fi>
Reported-by: Yaron Illouz <yaroni@radcom.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: store IO port info locally
Yuanhan Liu [Fri, 6 Jan 2017 10:16:18 +0000 (18:16 +0800)]
net/virtio: store IO port info locally

Like vtpci_ops, the rte_pci_ioport has to be stored in local memory. This
is basically because the rte_pci_device field is allocated from process
local memory, not from shared memory.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: store PCI operators pointer locally
Yuanhan Liu [Fri, 6 Jan 2017 10:16:17 +0000 (18:16 +0800)]
net/virtio: store PCI operators pointer locally

We used to store the vtpci_ops at virtio_hw structure. The struct,
however, is stored in shared memory. That means only one value is
allowed. For the multiple process model, however, the address of
vtpci_ops should be different among different processes.

Take the virtio PMD as an example: vtpci_ops is set by the primary
process, based on its own process space. If we access that address
from the secondary process, that would be an illegal memory access,
and a crash might then happen.

To make the multiple process model work, we need to store vtpci_ops
in local memory rather than in shared memory. This is what the patch
does: a local virtio_hw_internal array of size RTE_MAX_ETHPORTS is
allocated. This new structure is used to store all this kind of
info in non-shared memory. Currently, we have:

- vtpci_ops

- rte_pci_ioport

- virtio pci mapped memory, such as common_cfg.

The latter two will be done in coming patches. Later patches will also
set them correctly for the secondary process, so that the multiple process
model can work.
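
The idea can be sketched as follows. The shared structure keeps only a
port id, and anything holding process-specific addresses lives in a
per-process array indexed by that id. All names, the array size and the
accessor macro here are illustrative assumptions, not the driver's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32	/* stands in for RTE_MAX_ETHPORTS */

struct pci_ops {
	int (*get_status)(void);
};

/* Lives in shared memory: must not hold process-specific pointers. */
struct hw_shared {
	uint16_t port_id;
};

/* Lives in ordinary (per-process) memory. */
struct hw_internal {
	const struct pci_ops *ops;
};

static struct hw_internal hw_internal[SKETCH_MAX_PORTS];

/* Accessor keyed by the port id stored in the shared structure. */
#define HW_OPS(hw) (hw_internal[(hw)->port_id].ops)

static int modern_get_status(void)
{
	return 0;
}

const struct pci_ops local_modern_ops = {
	.get_status = modern_get_status,
};
```

Each process, primary or secondary, fills its own hw_internal[] entry
during probe, so the function pointers it calls always belong to its own
address space.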

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: fix wrong Rx/Tx method for secondary process
Yuanhan Liu [Fri, 6 Jan 2017 10:16:16 +0000 (18:16 +0800)]
net/virtio: fix wrong Rx/Tx method for secondary process

If the primary enables the vector Rx/Tx path, the current code would
let the secondary always choose the non-vector Rx/Tx path. This results
in an Rx/Tx method mismatch between the primary and secondary processes.
Weird errors may then happen, something like:

    PMD: virtio_xmit_pkts() tx: virtqueue_enqueue error: -14

Fix it by choosing the correct Rx/Tx callbacks for the secondary process.
That is, use vector path if it's given.

Fixes: 8d8393fb1861 ("virtio: pick simple Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago ethdev: fix port data mismatched in multiple process model
Yuanhan Liu [Mon, 9 Jan 2017 07:50:59 +0000 (15:50 +0800)]
ethdev: fix port data mismatched in multiple process model

Assume we have two virtio ports, 00:03.0 and 00:04.0. The first one is
managed by the kernel driver, while the latter one is managed by DPDK.

Now we start the primary process. 00:03.0 will be skipped by DPDK virtio
PMD driver (since it's being used by the kernel). 00:04.0 would be
successfully initiated by DPDK virtio PMD (if nothing abnormal happens).
After that, we would get a port id 0, and all the related info needed
by virtio (virtio_hw) is stored at rte_eth_dev_data[0].

Then we start the secondary process. As usual, 00:03.0 will be probed
first. It first tries to get a local eth_dev structure for it (by
rte_eth_dev_allocate):

        port_id = rte_eth_dev_find_free_port();
        ...

        eth_dev = &rte_eth_devices[port_id];
        eth_dev->data = &rte_eth_dev_data[port_id];
        ...

        return eth_dev;

Since it's the first PCI device, port_id will be 0. eth_dev->data would
then point to rte_eth_dev_data[0]. And here things start going wrong,
as rte_eth_dev_data[0] actually stores the virtio_hw for 00:04.0.

That said, in the secondary process, DPDK will continue to drive PCI
device 00:03.0 (despite the fact it's being managed by the kernel), with
the info from PCI device 00:04.0. Which is wrong.

The fix is to attach the port already registered by the primary process.
That is, iterate over rte_eth_dev_data[], and get the port id whose PCI
ID matches the current PCI device.

This lets us maintain the same port ID for the same PCI device, keeping
the chance of referencing wrong data minimal.
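
A simplified sketch of that lookup, assuming each shared entry records a
name derived from the PCI address. The struct, the constants and the
function name are hypothetical and much smaller than the real ethdev
data.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NB_PORTS 4
#define SKETCH_NAME_LEN 32

/* Stand-in for one shared rte_eth_dev_data[] entry. */
struct port_data {
	char name[SKETCH_NAME_LEN];	/* empty string means unused */
};

/* In a secondary process: return the port id the primary registered for
 * `name`, or -1 if the primary never registered that device. */
int port_attach_secondary(const struct port_data *shared, const char *name)
{
	for (int i = 0; i < SKETCH_NB_PORTS; i++)
		if (strcmp(shared[i].name, name) == 0)
			return i;
	return -1;
}
```

In the scenario above, 00:04.0 resolves to port 0 in both processes,
while 00:03.0 finds no match and is never given port 0's data.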

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
7 years ago vhost: allow many vhost-user ports
Jan Wickbom [Wed, 21 Dec 2016 09:45:13 +0000 (17:45 +0800)]
vhost: allow many vhost-user ports

Currently select() is used to monitor file descriptors for vhost-user
ports. This limits the number of ports that can be created, since the
fd number is used as an index into the fd_set and we have seen fds > 1023.
This patch changes select() to poll(). This way we can keep a
packed (pollfd) array for the fds, i.e. as many fds as the size of
the array.
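
A small sketch of the poll() pattern with a packed array, using only
POSIX calls. The wrapper name is hypothetical and this is not the
fdset code from the patch.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait for input on a packed array of descriptors. Unlike select(), the
 * fd values themselves are unconstrained; only the array length matters.
 * Returns the number of entries with events, 0 on timeout, -1 on error. */
int wait_for_input(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
	for (nfds_t i = 0; i < nfds; i++) {
		fds[i].events = POLLIN;
		fds[i].revents = 0;
	}
	return poll(fds, nfds, timeout_ms);
}
```

After the call, the caller scans fds[i].revents to find which entries
are ready.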

Also see:
http://dpdk.org/ml/archives/dev/2016-April/037024.html

Reported-by: Patrik Andersson <patrik.r.andersson@ericsson.com>
Signed-off-by: Jan Wickbom <jan.wickbom@ericsson.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago vhost: introduce reply ack feature
Maxime Coquelin [Mon, 12 Dec 2016 17:54:00 +0000 (18:54 +0100)]
vhost: introduce reply ack feature

The REPLY_ACK feature provides a generic way for QEMU to ensure both
completion and success of a request.

As described in the vhost-user spec in the QEMU repository, QEMU sets the
VHOST_USER_NEED_REPLY flag (bit 3) when expecting a reply_ack from
the backend. The backend must reply with 0 for success or non-zero
otherwise when the flag is set.

Currently, only VHOST_USER_SET_MEM_TABLE request implements reply_ack,
in order to synchronize mapping updates.

This patch enables REPLY_ACK feature generally, but only checks error
code for VHOST_USER_SET_MEM_TABLE.
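
The flag test and the reply value can be sketched as below. The bit
position comes from the description above; the macro and function names
are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 3 of the message flags, as stated above. */
#define SKETCH_NEED_REPLY (1u << 3)

/* Does the sender expect a reply_ack for this message? */
int msg_needs_reply(uint32_t flags)
{
	return (flags & SKETCH_NEED_REPLY) != 0;
}

/* Payload to send back: 0 for success, non-zero otherwise. */
uint64_t reply_ack_payload(int handler_ret)
{
	return handler_ret == 0 ? 0 : 1;
}
```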

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago examples/vhost: fix lcore initialization
Yong Wang [Wed, 11 Jan 2017 08:59:46 +0000 (03:59 -0500)]
examples/vhost: fix lcore initialization

When "TAILQ_INIT()" was added to the "for (lcore_id = 0; ...)" loop,
the assignment to "lcore_ids" was moved out of the loop body.
This changed the original initialization of "lcore_ids".

Fix it by adding a pair of braces.

Fixes: 45657a5c6861 ("examples/vhost: use tailq to link vhost devices")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago examples/vhost: fix calculation of mbuf count
Yong Wang [Thu, 12 Jan 2017 03:52:17 +0000 (22:52 -0500)]
examples/vhost: fix calculation of mbuf count

When calculating 'nr_mbufs_per_core', 'MAX_PKT_BURST' was multiplied
twice. Fix it by removing one of them.

Fixes: bdb19b771e6f ("examples/vhost: fix mbuf allocation failure")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/vhost: emulate device start/stop behavior
Chas Williams [Tue, 3 Jan 2017 16:22:43 +0000 (11:22 -0500)]
net/vhost: emulate device start/stop behavior

.dev_start()/.dev_stop() roughly corresponds to the local device's port
being ready.  This is different from the remote client being connected
which is roughly link up or down.  Emulate the device start/stop behavior
by separately tracking the start/stop state to determine if we should
allow packets to be queued to/from the remote client.

Signed-off-by: Chas Williams <ciwillia@brocade.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/vhost: fix socket file deleted on stop
Chas Williams [Tue, 3 Jan 2017 16:22:42 +0000 (11:22 -0500)]
net/vhost: fix socket file deleted on stop

If you create a vhost server device, it doesn't create the actual datagram
socket until you call .dev_start().  If you call .dev_stop(), it also
deletes those sockets.  For QEMU clients, this is a problem since QEMU
doesn't know how to re-attach to datagram sockets that have gone away.

To fix this, register and unregister the datagram sockets during device
creation and removal.

Fixes: ee584e9710b9 ("vhost: add driver on top of the library")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago vhost: fix memory leak
Yong Wang [Wed, 4 Jan 2017 03:57:55 +0000 (22:57 -0500)]
vhost: fix memory leak

In function vhost_new_device(), the current code does not free 'dev'
in the "i == MAX_VHOST_DEVICE" condition branch. This leads to a
memory leak.

Fixes: 45ca9c6f7bc6 ("vhost: get rid of linked list for devices")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago net/virtio: use any layout for version 1.0
Pierre Pfister [Wed, 30 Nov 2016 09:18:42 +0000 (09:18 +0000)]
net/virtio: use any layout for version 1.0

Current virtio driver advertises VERSION_1 support,
but does not handle device's VERSION_1 support when
sending packets (it looks for ANY_LAYOUT feature,
which is absent).

This patch enables 'can_push' in tx path when VERSION_1
is advertised by the device.

This significantly improves small packets forwarding rate
towards devices advertising VERSION_1 feature.
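
The feature test can be sketched as follows. The bit numbers are the
ones defined by the virtio specification (ANY_LAYOUT is bit 27,
VERSION_1 is bit 32); the helper names are hypothetical, and the real
Tx path may require further conditions before pushing the header into
the mbuf headroom.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_ANY_LAYOUT 27
#define VIRTIO_F_VERSION_1  32

/* Is feature bit `bit` set in the negotiated feature mask? */
int has_feature(uint64_t features, unsigned int bit)
{
	return (int)((features >> bit) & 1);
}

/* Hypothetical predicate: either feature lets the driver place the
 * virtio-net header and the packet data in one descriptor. */
int tx_can_push(uint64_t features)
{
	return has_feature(features, VIRTIO_F_ANY_LAYOUT) ||
	       has_feature(features, VIRTIO_F_VERSION_1);
}
```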

Signed-off-by: Pierre Pfister <ppfister@cisco.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago vhost: fix guest/host physical address mapping
Haifeng Lin [Thu, 1 Dec 2016 11:42:02 +0000 (19:42 +0800)]
vhost: fix guest/host physical address mapping

When reg_size < page_size the function read in
rte_mem_virt2phy would not return, because
host_user_addr is invalid.

Fixes: e246896178e6 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org
Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years ago mbuf: add a function to linearize a packet
Tomasz Kulasek [Thu, 12 Jan 2017 09:40:44 +0000 (10:40 +0100)]
mbuf: add a function to linearize a packet

This patch adds function rte_pktmbuf_linearize to let crypto PMD coalesce
chained mbuf before crypto operation and extend their capabilities to
support segmented mbufs when device cannot handle them natively.

Included unit tests for rte_pktmbuf_linearize functionality:

 1) Creates a bunch of segmented mbufs with different sizes and numbers of
    segments.
 2) Fills the noncontiguous mbuf with sequential values.
 3) Uses rte_pktmbuf_linearize to coalesce the segmented buffer into one
    contiguous buffer.
 4) Verifies data in linearized buffer.
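
The coalescing step can be illustrated on a simplified segment chain.
This is not the rte_pktmbuf_linearize implementation; the struct and
function below are hypothetical stand-ins, and segment ownership is left
to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a chained buffer segment. */
struct seg {
	unsigned char *data;	/* start of this segment's bytes */
	size_t len;		/* bytes currently used */
	size_t cap;		/* bytes available at data */
	struct seg *next;
};

/* Copy every later segment into the spare room of the first one.
 * Returns 0 on success, -1 if the first segment is too small (the
 * chain is then left untouched). */
int chain_linearize(struct seg *head)
{
	size_t extra = 0;

	for (struct seg *s = head->next; s != NULL; s = s->next)
		extra += s->len;
	if (head->len + extra > head->cap)
		return -1;

	for (struct seg *s = head->next; s != NULL; s = s->next) {
		memcpy(head->data + head->len, s->data, s->len);
		head->len += s->len;
	}
	head->next = NULL;	/* the data is now contiguous */
	return 0;
}
```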

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
7 years ago app/testpmd: add MACsec commands
Tiwei Bie [Fri, 13 Jan 2017 11:21:40 +0000 (19:21 +0800)]
app/testpmd: add MACsec commands

The following MACsec offload commands are added:

- set macsec offload <port_id> on encrypt on|off replay-protect on|off
- set macsec offload <port_id> off
- set macsec sc tx|rx <port_id> <mac> <pi>
- set macsec sa tx|rx <port_id> <idx> <an> <pn> <key>

Also update the testpmd user guide.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
7 years ago net/ixgbe: add MACsec offload
Tiwei Bie [Fri, 13 Jan 2017 11:21:39 +0000 (19:21 +0800)]
net/ixgbe: add MACsec offload

MACsec (or LinkSec, 802.1AE) is a MAC level encryption/authentication
scheme defined in IEEE 802.1AE that uses symmetric cryptography.
This commit adds the MACsec offload support for ixgbe.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
7 years ago ethdev: add MACsec capability flags
Tiwei Bie [Fri, 13 Jan 2017 11:21:38 +0000 (19:21 +0800)]
ethdev: add MACsec capability flags

If these flags are advertised by a PMD, the NIC supports the MACsec
offload. The incoming MACsec traffic can be offloaded transparently
after the MACsec offload is configured correctly by the application.
And the application can set the PKT_TX_MACSEC flag in mbufs to enable
the MACsec offload for the packets to be transmitted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
7 years ago ethdev: add MACsec event type
Tiwei Bie [Fri, 13 Jan 2017 11:21:37 +0000 (19:21 +0800)]
ethdev: add MACsec event type

This commit adds the following event type:

- RTE_ETH_EVENT_MACSEC

This event will occur when the PN counter in a MACsec connection
reaches the exhaustion threshold.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
7 years ago mbuf: add MACsec flag
Tiwei Bie [Fri, 13 Jan 2017 11:21:36 +0000 (19:21 +0800)]
mbuf: add MACsec flag

Add a new Tx flag in mbuf, that can be set by applications to
enable the MACsec offload for a packet to be transmitted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
7 years ago kvargs: make pointers in string arrays const
Bruce Richardson [Thu, 12 Jan 2017 16:18:27 +0000 (16:18 +0000)]
kvargs: make pointers in string arrays const

Change the parameters of functions from const char *valid[] to
const char * const valid[]. This additional const is needed to
allow us to fix some checkpatch warnings, as well as being good
programming practice.

For the checkpatch warnings, if we have a set of command line
args that we want to check defined as:
static const char *args[] = { "arg1", "arg2", NULL };
kvlist = rte_kvargs_parse(params, args);

checkpatch will complain:
WARNING:STATIC_CONST_CHAR_ARRAY: static const char *
array should probably be static const char * const

Adding the additional const to the definition of the args
will then trigger a compiler error in the absence of this
change to the kvargs library, as we lose the const in the
call to kvargs_parse.
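
The const mismatch can be seen in a small standalone sketch. The helper
below is hypothetical and is not part of the kvargs library; it only
shows that a parameter declared as const char * const [] accepts the
fully const array that checkpatch asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if `key` appears in the NULL-terminated
 * list `valid`. Neither the strings nor the pointers to them can be
 * modified through this parameter. */
int key_is_valid(const char *key, const char * const valid[])
{
	for (size_t i = 0; valid[i] != NULL; i++)
		if (strcmp(key, valid[i]) == 0)
			return 1;
	return 0;
}

/* The form checkpatch prefers; passing it to a parameter declared as
 * plain "const char *valid[]" would discard the inner const. */
static const char * const args[] = { "arg1", "arg2", NULL };
```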

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
7 years ago app/testpmd: fix static build link ordering
Jerin Jacob [Thu, 12 Jan 2017 07:46:54 +0000 (13:16 +0530)]
app/testpmd: fix static build link ordering

By introducing an explicit -lrte_pmd_ixgbe link request in the
testpmd Makefile, "-Wl,-lrte_pmd_ixgbe" is provided twice, and the linker
removes the duplication by keeping only the first occurrence.
This moves "-Wl,-lrte_pmd_ixgbe" out of the "-Wl,--whole-archive" flag
and makes symbol generation totally different from the previous version
in the case of a static build.
This patch fixes the static build linking order by introducing
-lrte_pmd_ixgbe under the shared library config
(CONFIG_RTE_BUILD_SHARED_LIB).

Fixes: 425781ff5afe ("app/testpmd: add ixgbe VF management")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
7 years ago devtools: skip capitalization check for commit prefixes
Bruce Richardson [Fri, 13 Jan 2017 13:02:17 +0000 (13:02 +0000)]
devtools: skip capitalization check for commit prefixes

The prefix in the commit title must be a valid component name and is
checked in separate checks. For capitalization, just check the part after
the colon. This is already done for most capitalization checks, just make
the remainder consistent with this.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
7 years ago mempool: use cache in single producer or consumer mode
Wenfeng Liu [Wed, 11 Jan 2017 02:25:28 +0000 (02:25 +0000)]
mempool: use cache in single producer or consumer mode

Currently we check the mempool flags when we put/get objects from a
mempool. However, this makes the cache useless when the mempool is in the
SC|SP, SC|MP, or MC|SP cases.
This patch makes the cache available in the above cases and improves
performance.

Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
7 years ago pci: use address struct in function arguments
Ben Walker [Wed, 11 Jan 2017 17:10:12 +0000 (10:10 -0700)]
pci: use address struct in function arguments

Instead of passing domain, bus, devid, func, just pass
an rte_pci_addr.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years ago pci: separate detaching ethernet ports from PCI devices
Ben Walker [Wed, 11 Jan 2017 17:10:11 +0000 (10:10 -0700)]
pci: separate detaching ethernet ports from PCI devices

Attaching and detaching ethernet ports from an application
is not the same thing as physically removing a PCI device,
so clarify the flags indicating support. All PCI devices
are assumed to be physically removable, so no flag is
necessary in the PCI layer.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
7 years ago pci: unmap resources if probe fails
Ben Walker [Wed, 11 Jan 2017 17:10:10 +0000 (10:10 -0700)]
pci: unmap resources if probe fails

If resources were mapped prior to probe, unmap them
if probe fails.

This does not handle the case where the kernel driver was
forcibly unbound prior to probe.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years ago ethdev: define default item masks in flow API
Adrien Mazarguil [Tue, 10 Jan 2017 13:08:30 +0000 (14:08 +0100)]
ethdev: define default item masks in flow API

Leaving default pattern item mask values up for interpretation by PMDs is
an undefined behavior that applications might find difficult to use in the
wild. It also needlessly complicates PMD implementation.

This commit addresses this by defining consistent default masks for each
item type.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
7 years ago ethdev: clarify RSS action in flow API
Adrien Mazarguil [Tue, 10 Jan 2017 13:08:29 +0000 (14:08 +0100)]
ethdev: clarify RSS action in flow API

Contrary to the current description, mbuf RSS hash result storage does not
overlap with the returned MARK value (hash.fdir.lo vs. hash.fdir.hi), and
both may be combined.

Reflect this change by allowing testpmd to display both values
simultaneously.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
7 years ago ethdev: clarify MARK and FLAG actions in flow API
Adrien Mazarguil [Tue, 10 Jan 2017 13:08:28 +0000 (14:08 +0100)]
ethdev: clarify MARK and FLAG actions in flow API

Both actions share the PKT_RX_FDIR mbuf flag, as a result there is no way
to tell them apart. Moreover, the maximum allowed value for the MARK action
may not necessarily cover the entire 32-bit space.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
7 years ago ethdev: modify flow API error function
Adrien Mazarguil [Tue, 10 Jan 2017 13:08:27 +0000 (14:08 +0100)]
ethdev: modify flow API error function

Based on initial PMD implementations of the flow API, returning the error
structure which may be NULL is useless and always discarded.

Returning the error code instead appears to be much more convenient.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
7 years ago app/testpmd: fix array bounds checks
Adrien Mazarguil [Tue, 10 Jan 2017 13:08:26 +0000 (14:08 +0100)]
app/testpmd: fix array bounds checks

This commit addresses several obvious issues reported by Coverity
with array bounds checks in functions related to the flow API.

Coverity issue: 139596, 139597, 139598, 139599
Fixes: 938a184a1870 ("app/testpmd: implement basic support for flow API")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
7 years ago pmdinfogen: fix null dereference
Neil Horman [Thu, 5 Jan 2017 19:22:41 +0000 (14:22 -0500)]
pmdinfogen: fix null dereference

Coverity reports a forward null dereference from a for loop
that works with a variable previously tested for null that had no error
handling or condition to prevent it.  Pretty obvious fix below.

Coverity issue: 139593
Fixes: 98b0fdb0ffc6 ("pmdinfogen: add buildtools and pmdinfogen utility")

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
7 years ago eal: define generic vector types
Nelio Laranjeiro [Wed, 16 Nov 2016 15:20:38 +0000 (16:20 +0100)]
eal: define generic vector types

Add common vector type definitions to all CPU architectures.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
7 years ago tools: move to usertools
Thomas Monjalon [Thu, 15 Dec 2016 21:25:36 +0000 (22:25 +0100)]
tools: move to usertools

Rename tools/ into usertools/ to differentiate from buildtools/
and devtools/ while making clear these scripts are part of
DPDK runtime.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
7 years ago scripts: move to devtools
Thomas Monjalon [Thu, 15 Dec 2016 21:47:44 +0000 (22:47 +0100)]
scripts: move to devtools

The remaining scripts in the scripts/ directory are only useful
to developers. That's why devtools/ is a better name.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
7 years ago scripts: move to buildtools
Thomas Monjalon [Thu, 15 Dec 2016 21:46:47 +0000 (22:46 +0100)]
scripts: move to buildtools

There is already a directory buildtools for pmdinfogen used by
the build system. The scripts used in makefiles are moved here.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
7 years agodoc: add required python versions
John McNamara [Wed, 21 Dec 2016 15:03:49 +0000 (15:03 +0000)]
doc: add required python versions

Add a requirement to support both Python 2 and 3 to the
DPDK Python Coding Standards and Getting started Guide.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
7 years agomake python scripts python2/3 compliant
John McNamara [Wed, 21 Dec 2016 15:03:48 +0000 (15:03 +0000)]
make python scripts python2/3 compliant

Make all the DPDK Python apps work with Python 2 or 3 to
allow them to work with whatever is the system default.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
7 years agomake python scripts PEP8 compliant
John McNamara [Wed, 21 Dec 2016 15:03:47 +0000 (15:03 +0000)]
make python scripts PEP8 compliant

Make all DPDK Python applications compliant with the PEP8 standard
to allow for consistency checking of patches and to allow further
refactoring.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
7 years agomk: disable icc warning 188
Ferruh Yigit [Tue, 3 Jan 2017 16:15:42 +0000 (16:15 +0000)]
mk: disable icc warning 188

error #188: enumerated type mixed with another type

This is raised when an integer is assigned to an enum variable.

This usage is common, causes many ICC compilation errors, and is
accepted by other compilers, so disable the warning.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
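
A minimal sketch of the pattern behind warning #188, for readers who have not seen it. The enum and function names are hypothetical and are not taken from the DPDK tree. Both functions are valid C; the first one contains the implicit integer-to-enum assignment that the commit message describes.

```c
/* Hypothetical enum, used only for illustration. */
enum link_state {
    LINK_DOWN = 0,
    LINK_UP = 1
};

/*
 * Implicit int -> enum conversion. This is valid C and accepted by
 * other compilers, but it is the kind of assignment that ICC reports
 * as "enumerated type mixed with another type".
 */
static enum link_state
state_from_int_implicit(int raw)
{
    enum link_state s = raw;

    return s;
}

/* Same conversion with an explicit cast, which states the intent. */
static enum link_state
state_from_int_cast(int raw)
{
    return (enum link_state)raw;
}
```

Because the implicit form is so common in the code base, the commit disables the diagnostic instead of rewriting every such assignment.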
7 years agoapp/testpmd: use Tx preparation in checksum engine
Tomasz Kulasek [Fri, 23 Dec 2016 18:40:54 +0000 (19:40 +0100)]
app/testpmd: use Tx preparation in checksum engine

Since all current drivers support the Tx preparation API, it is used
in the csum forwarding engine by default for all drivers.

Adding this step to the csum engine costs about a 3-4% performance
drop on my setup with the ixgbe driver. It is caused mostly by the need
to re-access and modify packet data.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agonet/ena: add Tx preparation
Konstantin Ananyev [Fri, 23 Dec 2016 18:40:53 +0000 (19:40 +0100)]
net/ena: add Tx preparation

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agonet/vmxnet3: add Tx preparation
Konstantin Ananyev [Fri, 23 Dec 2016 18:40:52 +0000 (19:40 +0100)]
net/vmxnet3: add Tx preparation

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Yong Wang <yongwang@vmware.com>
7 years agonet/fm10k: add Tx preparation
Tomasz Kulasek [Fri, 23 Dec 2016 18:40:49 +0000 (19:40 +0100)]
net/fm10k: add Tx preparation

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agonet/i40e: add Tx preparation
Tomasz Kulasek [Fri, 23 Dec 2016 18:40:50 +0000 (19:40 +0100)]
net/i40e: add Tx preparation

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agonet/ixgbe: add Tx preparation
Tomasz Kulasek [Fri, 23 Dec 2016 18:40:51 +0000 (19:40 +0100)]
net/ixgbe: add Tx preparation

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agonet/e1000: add Tx preparation
Tomasz Kulasek [Fri, 23 Dec 2016 18:40:48 +0000 (19:40 +0100)]
net/e1000: add Tx preparation

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agoethdev: add Tx preparation
Tomasz Kulasek [Fri, 23 Dec 2016 18:40:47 +0000 (19:40 +0100)]
ethdev: add Tx preparation

Added API for `rte_eth_tx_prepare`

uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

Added fields to the `struct rte_eth_desc_lim`:

uint16_t nb_seg_max;
/**< Max number of segments per whole packet. */

uint16_t nb_mtu_seg_max;
/**< Max number of segments per one MTU */

These fields can be used to create valid packets according to the
following rules:

 * For a non-TSO packet, a single transmit packet may span up to
   "nb_mtu_seg_max" buffers.

 * For a TSO packet, the total number of data descriptors is "nb_seg_max",
   and each segment within the TSO may span up to "nb_mtu_seg_max".
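
The two rules above can be sketched as a small check. This is only one reading of the rules, not the validation code of any driver: `struct seg_limits`, `tx_segs_ok()` and the `max_bufs_per_mtu_seg` argument are hypothetical, and the caller is assumed to have already counted the buffers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two new rte_eth_desc_lim fields. */
struct seg_limits {
    uint16_t nb_seg_max;     /* max number of segments per whole packet */
    uint16_t nb_mtu_seg_max; /* max number of segments per one MTU */
};

/*
 * nb_segs:              total number of buffers in the packet.
 * max_bufs_per_mtu_seg: for TSO, the largest number of buffers that any
 *                       single MTU-sized segment spans (assumed to be
 *                       computed by the caller); ignored for non-TSO.
 */
static bool
tx_segs_ok(const struct seg_limits *lim, bool is_tso,
           uint16_t nb_segs, uint16_t max_bufs_per_mtu_seg)
{
    if (!is_tso)
        return nb_segs <= lim->nb_mtu_seg_max;

    return nb_segs <= lim->nb_seg_max &&
           max_bufs_per_mtu_seg <= lim->nb_mtu_seg_max;
}
```

With limits of 40 and 8, for example, a non-TSO packet of 9 buffers would be rejected, while a TSO packet of 40 buffers passes as long as no MTU-sized segment spans more than 8 of them.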

Added functions:

int
rte_validate_tx_offload(struct rte_mbuf *m)

  to validate general requirements for the Tx offloads set in the
  packet's mbuf, such as flag completeness. In the current implementation
  this function is called optionally when RTE_LIBRTE_ETHDEV_DEBUG is enabled.

int rte_net_intel_cksum_prepare(struct rte_mbuf *m)

  to prepare the pseudo-header checksum for TSO and non-TSO TCP/UDP
  packets before hardware Tx checksum offload.
   - for non-TSO TCP/UDP packets the full pseudo-header checksum is
     calculated and set.
   - for TSO the IP payload length is not included.

int
rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)

  this function uses same logic as rte_net_intel_cksum_prepare, but
  allows application to choose which offloads should be taken into
  account, if full preparation is not required.
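
To illustrate the difference between the TSO and non-TSO cases described for rte_net_intel_cksum_prepare, here is a standalone sketch of an IPv4 pseudo-header sum. It is not the DPDK implementation: `fold16()` and `ipv4_phdr_sum()` are hypothetical helpers, addresses are taken as plain host-order integers, and writing the result into the TCP/UDP header (with the right byte order) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t
fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * One's-complement sum over the IPv4 pseudo-header:
 * source address, destination address, zero byte + protocol, L4 length.
 * For TSO the length term is left out, since each generated segment
 * has a different length that only the hardware knows.
 * The returned sum is not inverted.
 */
static uint16_t
ipv4_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto,
              uint16_t l4_len, bool tso)
{
    uint32_t sum = 0;

    sum += src >> 16;
    sum += src & 0xffff;
    sum += dst >> 16;
    sum += dst & 0xffff;
    sum += proto;
    if (!tso)
        sum += l4_len;

    return fold16(sum);
}
```

For 192.168.0.1 to 192.168.0.2, TCP (protocol 6) and an L4 length of 20 bytes, the non-TSO sum is 0x816e and the TSO sum is 0x815a; the two differ exactly by the omitted length.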

PERFORMANCE TESTS
-----------------

This feature was tested with modified csum engine from test-pmd.

The packet checksum preparation was moved from application to Tx
preparation step placed before burst.

We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)

We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.

IMPACT:

1) For an unimplemented Tx preparation callback the performance impact is
   negligible,
2) For the packet condition check without checksum modifications (nb_segs,
   available offloads, etc.) the result is 14626628/14252168 (~2.62% drop),
3) For full support in the ixgbe driver (point 2 + packet checksum
   initialization) the result is 14060924/13588094 (~3.48% drop)

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
7 years agoethdev: clarify extended statistics documentation
Olivier Matz [Fri, 23 Dec 2016 20:35:48 +0000 (21:35 +0100)]
ethdev: clarify extended statistics documentation

Reword the API documentation of ethdev xstats.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
7 years agoethdev: fix extended statistics name index
Olivier Matz [Fri, 16 Dec 2016 09:44:13 +0000 (10:44 +0100)]
ethdev: fix extended statistics name index

The function rte_eth_xstats_get() returns an array of tuples (id,
value). The value is the statistic counter, while the id references a
name in the array returned by rte_eth_xstats_get_names().

Today, each 'id' returned by rte_eth_xstats_get() is equal to the index
in the returned array, making this value useless. It also prevents a
driver from having different indexes for names and values, like in the
example below:

  rte_eth_xstats_get_names() returns:
    0: "rx0_stat"
    1: "rx1_stat"
    2: ...
    7: "rx7_stat"
    8: "tx0_stat"
    9: "tx1_stat"
    ...
    15: "tx7_stat"

  rte_eth_xstats_get() returns:
    0: id=0, val=<stat>    ("rx0_stat")
    1: id=1, val=<stat>    ("rx1_stat")
    2: id=8, val=<stat>    ("tx0_stat")
    3: id=9, val=<stat>    ("tx1_stat")

This patch fixes the drivers to set the 'id' in their ethdev->xstats_get()
(except e1000 which was already doing it), and fixes ethdev by not setting
the 'id' field to the index of the table for pmd-specific stats: instead,
they should just be shifted by the max number of generic statistics.

Fixes: bd6aa172cf35 ("ethdev: fetch extended statistics with integer ids")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
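
A minimal sketch of how a caller would use the 'id' once it is meaningful, based on the example in the message above. The struct and helper are hypothetical mirrors of the (id, value) tuples and are not the ethdev definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one (id, value) tuple. */
struct xstat {
    uint64_t id;    /* index into the names array */
    uint64_t value; /* statistic counter */
};

/*
 * Resolve the name of a statistic through its id rather than through
 * its position in the values array. Returns NULL for an id that is
 * outside the names array.
 */
static const char *
xstat_name(const char *const *names, size_t nb_names,
           const struct xstat *xs)
{
    return xs->id < nb_names ? names[xs->id] : NULL;
}
```

In the example from the message, the entry at position 2 of the values array carries id=8, so the lookup yields "tx0_stat" and not the name stored at index 2.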
7 years agoethdev: decouple from PCI device
Jan Blunck [Fri, 23 Dec 2016 15:58:11 +0000 (16:58 +0100)]
ethdev: decouple from PCI device

This makes struct rte_eth_dev independent of struct rte_pci_device by
replacing it with a pointer to the generic struct rte_device.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years agoethdev: move info filling of PCI into drivers
Jan Blunck [Fri, 23 Dec 2016 15:58:10 +0000 (16:58 +0100)]
ethdev: move info filling of PCI into drivers

Only the driver itself can decide whether it can fill the PCI information
fields of dev_info.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years agoethdev: decouple interrupt handling from PCI device
Jan Blunck [Fri, 23 Dec 2016 15:58:09 +0000 (16:58 +0100)]
ethdev: decouple interrupt handling from PCI device

The struct rte_intr_handle is an abstraction layer for different types of
interrupt mechanisms. It is embedded in the low-level device (e.g. PCI).
On allocation of a struct rte_eth_dev a reference to the intr_handle
should be stored for devices supporting interrupts.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years agonet/vmxnet3: use driver name from ethdev
Jan Blunck [Fri, 23 Dec 2016 15:58:08 +0000 (16:58 +0100)]
net/vmxnet3: use driver name from ethdev

Signed-off-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/szedata2: localize handling of PCI resources
Jan Blunck [Fri, 23 Dec 2016 15:58:06 +0000 (16:58 +0100)]
net/szedata2: localize handling of PCI resources

This changes the driver to handle the PCI resource directly instead
of repeatedly going through eth_dev.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/nfp: localize mapping of ethdev to PCI device
Jan Blunck [Fri, 23 Dec 2016 15:58:07 +0000 (16:58 +0100)]
net/nfp: localize mapping of ethdev to PCI device

This simplifies later changes to ethdev.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/qede: localize mapping of ethdev to PCI device
Jan Blunck [Fri, 23 Dec 2016 15:58:05 +0000 (16:58 +0100)]
net/qede: localize mapping of ethdev to PCI device

This simplifies later changes to ethdev.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Harish Patil <harish.patil@qlogic.com>
7 years agonet/bnx2x: localize mapping of ethdev to PCI device
Jan Blunck [Fri, 23 Dec 2016 15:58:03 +0000 (16:58 +0100)]
net/bnx2x: localize mapping of ethdev to PCI device

Use device private information to minimize the places that assume eth_dev
contains pci_dev.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Harish Patil <harish.patil@qlogic.com>
7 years agonet/fm10k: localize mapping of ethdev to PCI device
Jan Blunck [Fri, 23 Dec 2016 15:58:04 +0000 (16:58 +0100)]
net/fm10k: localize mapping of ethdev to PCI device

This simplifies later changes to ethdev.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/virtio: do not depend on PCI device of ethdev
Jan Blunck [Fri, 23 Dec 2016 15:58:02 +0000 (16:58 +0100)]
net/virtio: do not depend on PCI device of ethdev

We don't need to depend on rte_eth_dev->pci_dev to differentiate between
the virtio_user and the virtio_pci case. Instead we can use the private
virtio_hw struct to get that information.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years agonet/virtio: add helper to get interrupt handle
Jan Blunck [Fri, 23 Dec 2016 15:58:01 +0000 (16:58 +0100)]
net/virtio: add helper to get interrupt handle

This adds a helper to get the rte_intr_handle from the virtio_hw. This is
safe to do since the usage of the helper is guarded by RTE_ETH_DEV_INTR_LSC
which is only set if we found a PCI device during initialization.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years agonet/virtio: remove useless driver name copy
Jan Blunck [Fri, 23 Dec 2016 15:58:00 +0000 (16:58 +0100)]
net/virtio: remove useless driver name copy

This is overwritten in rte_eth_dev_info_get().

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Reviewed-by: David Marchand <david.marchand@6wind.com>
7 years agonet/bnxt: localize mapping of ethdev to PCI device
Stephen Hemminger [Fri, 23 Dec 2016 15:57:59 +0000 (16:57 +0100)]
net/bnxt: localize mapping of ethdev to PCI device

Use existing information about pci and interrupt handle to minimize
the number of places that assume eth_dev contains pci_device
information.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/i40e: localize mapping of ethdev to PCI device
Stephen Hemminger [Fri, 23 Dec 2016 15:57:58 +0000 (16:57 +0100)]
net/i40e: localize mapping of ethdev to PCI device

Simplify later changes to eth_dev.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/ixgbe: localize mapping of ethdev to PCI device
Stephen Hemminger [Fri, 23 Dec 2016 15:57:57 +0000 (16:57 +0100)]
net/ixgbe: localize mapping of ethdev to PCI device

Since later changes will change where PCI information is,
localize mapping in one macro.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
7 years agonet/e1000: localize mapping of ethdev to PCI device
Stephen Hemminger [Fri, 23 Dec 2016 15:57:56 +0000 (16:57 +0100)]
net/e1000: localize mapping of ethdev to PCI device

Create one macro for the places where PCI device information is extracted
from the Ethernet device. This makes later changes easier to review and test.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
7 years agodrivers: remove useless reset of PCI device pointer
Stephen Hemminger [Fri, 23 Dec 2016 15:57:55 +0000 (16:57 +0100)]
drivers: remove useless reset of PCI device pointer

Since rte_eth_dev_info_get does memset() on dev_info before
calling device specific code, the explicit assignment of NULL
in all these virtual drivers has no effect.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
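
The reasoning in the commit above can be sketched as follows. All names here are hypothetical and simplified; this is not the ethdev code. It only shows why a driver callback has no need to reset a pointer field when the generic layer has already zeroed the structure.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device info structure. */
struct dev_info {
    void *pci_dev;
    unsigned int max_rx_queues;
};

typedef void (*infos_get_t)(struct dev_info *info);

/* Generic layer: clear the structure, then call the driver. */
static void
generic_info_get(infos_get_t driver_cb, struct dev_info *info)
{
    /* All-zero bytes read back as NULL on the usual platforms. */
    memset(info, 0, sizeof(*info));
    driver_cb(info);
}

/* Virtual driver: an explicit "info->pci_dev = NULL" would be redundant. */
static void
virtual_driver_infos_get(struct dev_info *info)
{
    info->max_rx_queues = 1;
}
```

Even if the caller passes in a structure with a stale pci_dev value, that field reads back as NULL after generic_info_get(), because the memset() runs before the driver callback.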
7 years agoeal: make driver pointer const in device struct
Stephen Hemminger [Fri, 23 Dec 2016 15:57:54 +0000 (16:57 +0100)]
eal: make driver pointer const in device struct

The info in rte_device about driver is immutable and
shouldn't change.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
7 years agoeal: allow passing const interrupt handle
Jan Blunck [Fri, 23 Dec 2016 15:57:53 +0000 (16:57 +0100)]
eal: allow passing const interrupt handle

Both register/unregister and enable/disable don't necessarily require the
rte_intr_handle to be modifiable. Therefore let's constify it.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
7 years agoeal: define container_of macro
Jan Blunck [Fri, 23 Dec 2016 15:57:52 +0000 (16:57 +0100)]
eal: define container_of macro

This macro is based on Jan Viktorin's original patch but also checks the
type of the passed pointer against the type of the member.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
[jblunck@infradead.org: add type checking and __extension__]
Signed-off-by: Jan Blunck <jblunck@infradead.org>
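
For readers unfamiliar with the idiom, here is a sketch of a type-checked container_of in the spirit of the commit above. It is an approximation written for this note, not a copy of the EAL macro; it relies on the GNU C extensions `__typeof__` and statement expressions, and the two structures are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a pointer to the structure of type 'type' that contains
 * 'member', given a pointer to that member. The temporary _mptr is
 * declared with the member's type, so passing a pointer of the wrong
 * type triggers a compiler diagnostic.
 */
#define container_of(ptr, type, member) __extension__ ({           \
    const __typeof__(((type *)0)->member) *_mptr = (ptr);           \
    (type *)((uintptr_t)_mptr - offsetof(type, member));            \
})

/* Hypothetical generic device embedded in a bus-specific device. */
struct generic_device {
    int numa_node;
};

struct pci_like_device {
    uint16_t vendor_id;
    struct generic_device device;
};

/* Map the embedded generic device back to its enclosing structure. */
static struct pci_like_device *
to_pci_like(struct generic_device *dev)
{
    return container_of(dev, struct pci_like_device, device);
}
```

This is the kind of mapping that lets a layer hold only a pointer to the generic device while a bus-specific driver still recovers its own structure from it.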
7 years agoapp/testpmd: fix flow command build on FreeBSD
Adrien Mazarguil [Fri, 23 Dec 2016 15:52:56 +0000 (16:52 +0100)]
app/testpmd: fix flow command build on FreeBSD

A missing include causes the following compilation errors:

 error: use of undeclared identifier 'AF_INET'
 error: use of undeclared identifier 'AF_INET6'

Fixes: ef6e38550f07 ("app/testpmd: add items ipv4/ipv6 to flow command")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
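
As background for the errors quoted above: POSIX specifies that the AF_INET and AF_INET6 constants are provided by <sys/socket.h>. On some systems other headers pull them in indirectly, which can hide a missing include. The message does not say which include the fix adds, so the sketch below only shows the portable way to obtain the constants; `family_is_ip()` is a hypothetical helper.

```c
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Return non-zero when the address family is IPv4 or IPv6. */
static int
family_is_ip(int af)
{
    return af == AF_INET || af == AF_INET6;
}
```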
7 years agodoc: update release notes for flow API
Adrien Mazarguil [Fri, 23 Dec 2016 14:00:05 +0000 (15:00 +0100)]
doc: update release notes for flow API

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
7 years agoapp/testpmd: add protocol fields to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:42 +0000 (15:51 +0100)]
app/testpmd: add protocol fields to flow command

This commit exposes the following item fields through the flow command:

- VLAN priority code point, drop eligible indicator and VLAN identifier
  (all part of TCI).
- IPv4 type of service, time to live and protocol.
- IPv6 traffic class, flow label, next header and hop limit.
- SCTP tag and checksum.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
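
The three VLAN fields listed above share the 16-bit TCI defined by IEEE 802.1Q: 3 bits of priority code point, 1 bit of drop eligible indicator and 12 bits of VLAN identifier. The helpers below are a hypothetical sketch of that layout on a host-order value; they are not part of testpmd or rte_flow.

```c
#include <stdint.h>

/* Priority code point: bits 15..13. */
static uint8_t
tci_pcp(uint16_t tci)
{
    return (uint8_t)(tci >> 13);
}

/* Drop eligible indicator: bit 12. */
static uint8_t
tci_dei(uint16_t tci)
{
    return (uint8_t)((tci >> 12) & 0x1);
}

/* VLAN identifier: bits 11..0. */
static uint16_t
tci_vid(uint16_t tci)
{
    return tci & 0x0fff;
}

/* Build a host-order TCI from its three fields. */
static uint16_t
tci_make(uint8_t pcp, uint8_t dei, uint16_t vid)
{
    return (uint16_t)(((pcp & 0x7u) << 13) |
                      ((dei & 0x1u) << 12) |
                      (vid & 0x0fffu));
}
```

For example, priority 5 with the drop eligible bit set and VLAN 100 gives a TCI of 0xb064.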
7 years agodoc: describe testpmd flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:41 +0000 (15:51 +0100)]
doc: describe testpmd flow command

Document syntax, interaction with rte_flow and provide usage examples.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agoapp/testpmd: add queue actions to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:40 +0000 (15:51 +0100)]
app/testpmd: add queue actions to flow command

- QUEUE: assign packets to a given queue index.
- DUP: duplicate packets to a given queue index.
- RSS: spread packets among several queues.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add various actions to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:39 +0000 (15:51 +0100)]
app/testpmd: add various actions to flow command

- MARK: attach 32 bit value to packets.
- FLAG: flag packets.
- DROP: drop packets.
- COUNT: enable counters for a rule.
- PF: redirect packets to physical device function.
- VF: redirect packets to virtual device function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add L4 items to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:38 +0000 (15:51 +0100)]
app/testpmd: add L4 items to flow command

Add the ability to match a few properties of common L4[.5] protocol
headers:

- ICMP: type and code.
- UDP: source and destination ports.
- TCP: source and destination ports.
- SCTP: source and destination ports.
- VXLAN: network identifier.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add items ipv4/ipv6 to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:37 +0000 (15:51 +0100)]
app/testpmd: add items ipv4/ipv6 to flow command

Add the ability to match basic fields from IPv4 and IPv6 headers (source
and destination addresses only).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add items eth/vlan to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:36 +0000 (15:51 +0100)]
app/testpmd: add items eth/vlan to flow command

These pattern items match basic Ethernet headers (source, destination and
type) and related 802.1Q/ad VLAN headers.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add item raw to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:35 +0000 (15:51 +0100)]
app/testpmd: add item raw to flow command

Matches arbitrary byte strings with properties:

- relative: look for pattern after the previous item.
- search: search pattern from offset (see also limit).
- offset: absolute or relative offset for pattern.
- limit: search area limit for start of pattern.
- length: pattern length.
- pattern: byte string to look for.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add various items to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:34 +0000 (15:51 +0100)]
app/testpmd: add various items to flow command

- PF: match packets addressed to the physical function.
- VF: match packets addressed to a virtual function ID.
- PORT: device-specific physical port index to use.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: add item any to flow command
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:33 +0000 (15:51 +0100)]
app/testpmd: add item any to flow command

This pattern item matches any protocol in place of the current layer and
has two properties:

- min: minimum number of layers covered (0 or more).
- max: maximum number of layers covered (0 means infinity).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
7 years agoapp/testpmd: support flow bit-field
Adrien Mazarguil [Wed, 21 Dec 2016 14:51:32 +0000 (15:51 +0100)]
app/testpmd: support flow bit-field

Several rte_flow structures expose bit-fields that cannot be set in a
generic fashion at byte level. Add bit-mask support to handle them.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>