dpdk.git
10 years ago pci: add i40e devices
Helin Zhang [Thu, 5 Jun 2014 05:08:47 +0000 (13:08 +0800)]
pci: add i40e devices

The PCI device IDs of i40e have been added.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago mbuf: add new packet flags for i40e
Helin Zhang [Thu, 5 Jun 2014 05:08:49 +0000 (13:08 +0800)]
mbuf: add new packet flags for i40e

New packet flags of both RX and TX have been added to support i40e.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago ethdev: set port based vlan
Helin Zhang [Thu, 5 Jun 2014 05:08:50 +0000 (13:08 +0800)]
ethdev: set port based vlan

To support i40e, a new op has been added to set
port-based VLAN insertion.

New command 'tx_vlan set pvid port_id vlan_id (on|off)'
has been added in testpmd to configure port based vlan insertion.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago ethdev: function to compare ethernet addresses
Helin Zhang [Thu, 5 Jun 2014 05:08:50 +0000 (13:08 +0800)]
ethdev: function to compare ethernet addresses

To support i40e, the function is_same_ether_addr() has been added to
compare two Ethernet addresses.
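
Illustration (not part of the patch): a minimal sketch of such a
comparison, assuming a 6-byte address structure. The struct and function
names below are stand-ins chosen for this example, not the DPDK
definitions.

```c
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN_SKETCH 6 /* bytes in an Ethernet MAC address */

/* Hypothetical stand-in for the DPDK Ethernet address type. */
struct ether_addr_sketch {
    uint8_t addr_bytes[ETHER_ADDR_LEN_SKETCH];
};

/* Returns 1 if both addresses hold the same 6 bytes, 0 otherwise. */
int same_ether_addr_sketch(const struct ether_addr_sketch *a,
                           const struct ether_addr_sketch *b)
{
    return memcmp(a->addr_bytes, b->addr_bytes,
                  ETHER_ADDR_LEN_SKETCH) == 0;
}
```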

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago ethdev: new link speeds
Helin Zhang [Thu, 5 Jun 2014 05:08:50 +0000 (13:08 +0800)]
ethdev: new link speeds

For i40e support, new link speeds are needed.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago ethdev: more RSS flags
Helin Zhang [Thu, 5 Jun 2014 05:08:50 +0000 (13:08 +0800)]
ethdev: more RSS flags

- i40e RSS flags have been added (and enlarged to 64-bit)
- A new configuration of 'uint8_t rss_key_len' has been added in
  'struct rte_eth_rss_conf' to support different length of RSS keys.
- In each PMD, only the supported flags are masked.
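
Illustration (not part of the patch): a rough sketch of how a driver
might keep only the RSS flags it supports once the flag word is 64 bits
wide. The flag values and all names below are hypothetical; only the
'rss_key_len' field and the 64-bit width come from the description above.

```c
#include <stdint.h>

/* Hypothetical flag bits; the real values live in the ethdev header. */
#define RSS_SKETCH_IPV4     (UINT64_C(1) << 0)
#define RSS_SKETCH_IPV6     (UINT64_C(1) << 1)
#define RSS_SKETCH_NEW_FLAG (UINT64_C(1) << 40) /* needs more than 32 bits */

/* Modeled on the description: a key, its length, and 64-bit flags. */
struct rss_conf_sketch {
    uint8_t *rss_key;
    uint8_t rss_key_len; /* key length in bytes */
    uint64_t rss_hf;     /* requested hash flags */
};

/* A driver keeps the flags it supports and silently drops the rest. */
uint64_t rss_mask_supported(const struct rss_conf_sketch *conf,
                            uint64_t supported)
{
    return conf->rss_hf & supported;
}
```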

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago ethdev: allow maximum packet length to less than 1518
Helin Zhang [Thu, 5 Jun 2014 05:08:51 +0000 (13:08 +0800)]
ethdev: allow maximum packet length to less than 1518

ethdev ignores attempts to set the maximum packet length to less than
1518. This change removes that limitation so that values below 1518
can be set for 'maximum packet length'.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago pci: access to specific bits via sysfs
Helin Zhang [Thu, 5 Jun 2014 05:09:09 +0000 (13:09 +0800)]
pci: access to specific bits via sysfs

Enabling 'Extended Tag' and resetting 'Max Read Request Size' in PCI
config space have a big impact on i40e performance. Some BIOS
implementations allow changing them and others do not. Two sysfs
files, 'extended_tag' and 'max_read_request_size', are added so that
they can be changed with 'echo' from user space.
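
Illustration (not part of the patch): a sketch of the bit manipulation
involved, assuming the standard PCI Express Device Control register
layout (Extended Tag Field Enable at bit 8, Max_Read_Request_Size in
bits 14:12, encoded as 128 << code bytes). The function names are
hypothetical, and how the sysfs handlers parse their input is not shown
here.

```c
#include <stdint.h>

#define PCIE_DEVCTL_EXT_TAG_EN (1u << 8) /* Extended Tag Field Enable */
#define PCIE_DEVCTL_MRRS_SHIFT 12
#define PCIE_DEVCTL_MRRS_MASK  (0x7u << PCIE_DEVCTL_MRRS_SHIFT)

/* Returns the register value with the Extended Tag bit set or cleared. */
uint16_t devctl_set_extended_tag(uint16_t devctl, int on)
{
    if (on)
        return (uint16_t)(devctl | PCIE_DEVCTL_EXT_TAG_EN);
    return (uint16_t)(devctl & ~PCIE_DEVCTL_EXT_TAG_EN);
}

/*
 * Encodes a read request size (a power of two from 128 to 4096 bytes)
 * into bits 14:12. Returns 0 on success, or -1 (register untouched)
 * when the size is not one of the allowed values.
 */
int devctl_set_max_read_request(uint16_t *devctl, unsigned bytes)
{
    unsigned code = 0, size = 128;

    while (size < bytes && code < 5) {
        size <<= 1;
        code++;
    }
    if (size != bytes)
        return -1;
    *devctl = (uint16_t)((*devctl & ~PCIE_DEVCTL_MRRS_MASK) |
                         (code << PCIE_DEVCTL_MRRS_SHIFT));
    return 0;
}
```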

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years ago examples/vmdq: fix Tx queue id
Ouyang Changchun [Thu, 12 Jun 2014 07:10:05 +0000 (15:10 +0800)]
examples/vmdq: fix Tx queue id

This patch fixes a core id issue in the vmdq sample: when the core mask
does not start with lcore_id 0 (but with 20, for instance), the
queue id should use core_id instead of lcore_id.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ixgbe: fix link status interrupt of bypass device
Pablo de Lara [Tue, 10 Jun 2014 21:33:27 +0000 (22:33 +0100)]
ixgbe: fix link status interrupt of bypass device

The function ixgbe_get_media_type_82599 returns media_type =
ixgbe_media_type_unknown when using an 82599 Bypass NIC,
which prevents the link status interrupt from working properly.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ethdev: add Rx error counters for missed, badcrc and badlen packets
Ivan Boule [Thu, 12 Jun 2014 21:55:41 +0000 (23:55 +0200)]
ethdev: add Rx error counters for missed, badcrc and badlen packets

Split input error stats to have a better understanding of why packets
have been dropped.
Keep ierrors field untouched for backward compatibility.

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
10 years ago app/test: packet framework unit tests
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:39 +0000 (19:08 +0100)]
app/test: packet framework unit tests

Unit tests for Packet Framework libraries.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago examples/pipeline: packet framework sample
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:38 +0000 (19:08 +0100)]
examples/pipeline: packet framework sample

This Packet Framework sample application illustrates the capabilities
of the Intel DPDK Packet Framework toolbox.

It creates different functional blocks used by a typical IPv4 framework like:
flow classification, firewall, routing, etc.

CPU cores are connected together through standard interfaces built on SW rings,
with each CPU core running a separate pipeline instance.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago app/pipeline: packet framework benchmark
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:37 +0000 (19:08 +0100)]
app/pipeline: packet framework benchmark

This application is purposefully built to benchmark the performance
of the Intel DPDK Packet Framework toolbox.

It uses 3 CPU cores connected in a chain through SW rings
(NICs --> Core A --> Core B --> Core C --> NICs)
1. Core A: reads packets from NIC ports and writes them to SW queues;
2. Core B: instantiates a Packet Framework pipeline that uses ring reader
   input ports, a table whose type is selected through command line arguments
   (--none, --stub, --lpm, --acl, --hash[-spec]-KEYSZ-TYPE, with KEYSZ as
   8, 16 or 32 bytes and TYPE as ext (Extendible bucket) or lru (LRU))
   and ring writer output ports;
3. Core C: reads packets from SW rings and writes them to NIC ports.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
[Thomas: remove dedicated build option]

10 years ago cfgfile: library to interpret config files
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:36 +0000 (19:08 +0100)]
cfgfile: library to interpret config files

This library provides a tool to interpret config files that have
standard structure.

It is used by the Packet Framework examples/ip_pipeline sample application.

It originates from examples/qos_sched sample application and now it makes
this code available as a library for other sample applications to use.
The code duplication with the qos_sched sample app is to be addressed later.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago pipeline: new packet framework logic
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:35 +0000 (19:08 +0100)]
pipeline: new packet framework logic

The Packet Framework pipeline library provides a standard methodology
(logically similar to OpenFlow) for rapid development of complex packet
processing pipelines out of ports, tables and actions.

A pipeline is constructed by connecting its input ports to its output ports
through a chain of lookup tables. As a result of the lookup operation into the
current table, one of the table entries (or the default table entry, in case
of lookup miss) is identified to provide the actions to be executed on the
current packet and the associated action meta-data.

The behavior of user actions is defined through the configurable table action
handler, while the reserved actions define the next hop for the current packet
(either another table, an output port or packet drop) and are handled
transparently by the framework.
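
Illustration (not part of the patch): a toy model of the lookup chain
described above, assuming a linear search in place of a real table type
and leaving out user action handlers. All type and function names are
hypothetical; only the hit/miss/default-entry logic and the three
reserved next hops (table, output port, drop) follow the description.

```c
#include <stddef.h>
#include <stdint.h>

/* Reserved actions: where the packet goes next. */
enum next_hop_kind { NH_TABLE, NH_PORT, NH_DROP };

struct entry_sketch {
    uint32_t key;
    enum next_hop_kind kind;
    uint32_t id; /* table index or output port index */
};

struct table_sketch {
    const struct entry_sketch *entries;
    size_t n_entries;
    struct entry_sketch default_entry; /* used on lookup miss */
};

/* Linear search stands in for a real lookup (hash, LPM, ACL, ...). */
const struct entry_sketch *table_lookup(const struct table_sketch *t,
                                        uint32_t key)
{
    for (size_t i = 0; i < t->n_entries; i++)
        if (t->entries[i].key == key)
            return &t->entries[i];
    return &t->default_entry;
}

/* Returns the output port index, or -1 if the packet is dropped. */
int pipeline_run(const struct table_sketch *tables, size_t n_tables,
                 uint32_t first_table, uint32_t key)
{
    uint32_t cur = first_table;

    /* Bounding the hop count guards against table loops. */
    for (size_t hops = 0; hops < n_tables; hops++) {
        if (cur >= n_tables)
            return -1;
        const struct entry_sketch *e = table_lookup(&tables[cur], key);
        if (e->kind == NH_PORT)
            return (int)e->id;
        if (e->kind == NH_DROP)
            return -1;
        cur = e->id; /* NH_TABLE: continue in the next table */
    }
    return -1;
}
```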

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: stub
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:33 +0000 (19:08 +0100)]
table: stub

The stub table is a simple implementation of the Packet Framework table
API that produces lookup miss for all input packets.

It is used as simple cable-type forwarder by the Packet Framework
pipeline library.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: array
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:32 +0000 (19:08 +0100)]
table: array

Packet Framework array tables.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: hash
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:31 +0000 (19:08 +0100)]
table: hash

Various types of hash tables presented under the Packet Framework toolbox.

Hash table types:
1. Extendible bucket (ext): when bucket is full, bucket is extended with
   more keys
2. Least Recently Used (LRU): when bucket is full, the LRU entry is discarded
3. Pre-computed key signature: RX core extracts the key n-tuple from the
   packet, computes the key signature and saves the key and key signature
   within the packet meta-data; flow classification core performs the actual
   lookup (the bucket search stage) after reading the key and key signature
   from packet meta-data
4. Signature computed on-the-fly (do-sig version): the same CPU core extracts
   the key n-tuple from pkt, computes key signature and performs the table
   lookup
5. Configurable key size or optimized for single key size (8-byte, 16-byte
   and 32-byte key sizes)
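
Illustration (not part of the patch): a small sketch of the LRU policy
from item 2, assuming a 4-key bucket and a per-bucket access counter.
The layout and names are invented for this example and are not the
library's data structures; the extendible-bucket variant would chain a
new bucket instead of evicting.

```c
#include <stdint.h>

#define BUCKET_KEYS 4

struct lru_bucket_sketch {
    uint64_t key[BUCKET_KEYS];
    uint64_t last_used[BUCKET_KEYS]; /* 0 means the slot is empty */
    uint64_t clock;                  /* bumped on every hit or insert */
};

/* Returns the slot index on a hit (and refreshes it), or -1 on a miss. */
int lru_bucket_lookup(struct lru_bucket_sketch *b, uint64_t key)
{
    for (int i = 0; i < BUCKET_KEYS; i++) {
        if (b->last_used[i] != 0 && b->key[i] == key) {
            b->last_used[i] = ++b->clock;
            return i;
        }
    }
    return -1;
}

/* Adds a key, evicting the least recently used one if the bucket is full. */
int lru_bucket_add(struct lru_bucket_sketch *b, uint64_t key)
{
    int slot = lru_bucket_lookup(b, key);

    if (slot < 0) {
        slot = 0;
        for (int i = 1; i < BUCKET_KEYS; i++)
            if (b->last_used[i] < b->last_used[slot])
                slot = i; /* empty slots win first, then the oldest */
        b->key[slot] = key;
        b->last_used[slot] = ++b->clock;
    }
    return slot;
}
```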

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: ACL
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:30 +0000 (19:08 +0100)]
table: ACL

Packet Framework ACL table for ACL rule database.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: LPM IPv6
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:29 +0000 (19:08 +0100)]
table: LPM IPv6

Routing table for IPv6.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: LPM IPv4
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:28 +0000 (19:08 +0100)]
table: LPM IPv4

Routing table for IPv4.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago table: new packet framework API
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:27 +0000 (19:08 +0100)]
table: new packet framework API

This file defines the operations to be implemented by
any Packet Framework table.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago port: source and sink
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:25 +0000 (19:08 +0100)]
port: source and sink

Source port is a packet generator, similar to /dev/zero Linux device.

Sink port is a packet terminator (drops all input packets), similar
to /dev/null Linux device.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago port: hierarchical scheduler
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:24 +0000 (19:08 +0100)]
port: hierarchical scheduler

The QoS hierarchical scheduler presented as Packet Framework port.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago port: IPv4 reassembly
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:23 +0000 (19:08 +0100)]
port: IPv4 reassembly

The IPv4 reassembly operation is presented as a Packet Framework port.

The code duplication with the examples/ip_reassembly sample application
is to be addressed soon by linking the relevant library once upstreamed.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
[Thomas: update to new ip_frag library]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago port: IPv4 fragmentation
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:22 +0000 (19:08 +0100)]
port: IPv4 fragmentation

This port presents the IPv4 fragmentation operation as a Packet Framework port.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
[Thomas: update to new ip_frag library]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago port: ring
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:21 +0000 (19:08 +0100)]
port: ring

ring_reader input port (on top of a single-consumer rte_ring)
ring_writer output port (on top of a single-producer rte_ring)

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago port: ethdev
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:20 +0000 (19:08 +0100)]
port: ethdev

The input port ethdev_reader implements the Packet Framework port API
on top of the Intel DPDK poll mode driver for a NIC RX queue.

The output port ethdev_writer implements the Packet Framework port API
on top of the Intel DPDK poll mode driver for a NIC TX queue.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago port: new packet framework API
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:19 +0000 (19:08 +0100)]
port: new packet framework API

This file defines the port operations that have to be implemented
by Packet Framework ports.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago lpm: check rule existence
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:17 +0000 (19:08 +0100)]
lpm: check rule existence

Added API function for LPM IPv4 and IPv6 to query for the existence
of a rule/route and return the next hop ID associated with the route
if route is present.
This is used by the Packet Framework LPM table for implementing a
routing table.
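
Illustration (not part of the patch): a sketch of what such an existence
query could look like for IPv4, assuming a flat array of rules instead
of the LPM data structure. The names and the return convention (1 if
present, 0 if absent, -1 on bad arguments) are assumptions made for this
example. Note that this checks for one exact prefix/depth rule, not for
the longest matching prefix.

```c
#include <stddef.h>
#include <stdint.h>

struct rule_sketch {
    uint32_t ip;      /* prefix, host byte order */
    uint8_t depth;    /* prefix length, 1..32 */
    uint8_t next_hop; /* next hop ID for this route */
};

/* Mask that keeps the top `depth` bits of an IPv4 address. */
uint32_t depth_to_mask(uint8_t depth)
{
    return depth >= 32 ? 0xFFFFFFFFu : ~(0xFFFFFFFFu >> depth);
}

/* Looks for the exact rule ip/depth and reports its next hop if found. */
int rule_is_present(const struct rule_sketch *rules, size_t n,
                    uint32_t ip, uint8_t depth, uint8_t *next_hop)
{
    if (rules == NULL || next_hop == NULL || depth < 1 || depth > 32)
        return -1;

    uint32_t mask = depth_to_mask(depth);
    for (size_t i = 0; i < n; i++) {
        if (rules[i].depth == depth &&
            (rules[i].ip & mask) == (ip & mask)) {
            *next_hop = rules[i].next_hop;
            return 1;
        }
    }
    return 0;
}
```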

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago mbuf: meta-data offset
Cristian Dumitrescu [Wed, 4 Jun 2014 18:08:18 +0000 (19:08 +0100)]
mbuf: meta-data offset

Added zero-size field (offset in data structure) to specify the beginning
of packet meta-data in the packet buffer just after the mbuf.

The size of the packet meta-data is application specific and the packet
meta-data is managed by the application.

The packet meta-data should always be accessed through the provided macros.

This is used by the Packet Framework libraries (port, table, pipeline).

There is absolutely no performance impact due to this mbuf field, as it
does not take any space in the mbuf structure (zero-size field).
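
Illustration (not part of the patch): a sketch of the idea using a C99
flexible array member as the zero-size marker. The structure, macro and
helper names are hypothetical and much simpler than the real mbuf; the
point is that the marker adds no bytes and that the application decides
how much room to reserve behind it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified packet buffer header. */
struct mbuf_sketch {
    void *buf_addr;
    uint32_t data_len;
    uint32_t pkt_len;
    uint8_t metadata[]; /* marks where meta-data starts, occupies no space */
};

/* Accessor macro: address of the meta-data byte at a given offset. */
#define MBUF_SKETCH_META_PTR(m, off) (&(m)->metadata[(off)])

/* The application chooses how many meta-data bytes follow the header. */
struct mbuf_sketch *mbuf_sketch_alloc(size_t meta_size)
{
    return calloc(1, sizeof(struct mbuf_sketch) + meta_size);
}

void meta_write_u32(struct mbuf_sketch *m, size_t off, uint32_t v)
{
    memcpy(MBUF_SKETCH_META_PTR(m, off), &v, sizeof(v));
}

uint32_t meta_read_u32(const struct mbuf_sketch *m, size_t off)
{
    uint32_t v;

    memcpy(&v, MBUF_SKETCH_META_PTR(m, off), sizeof(v));
    return v;
}
```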

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years ago ip_frag: clean includes
Thomas Monjalon [Tue, 17 Jun 2014 00:31:30 +0000 (02:31 +0200)]
ip_frag: clean includes

Add required rte_byteorder in rte_ip_frag.h.
Remove useless includes in *.c files.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago examples/vhost: restrict log type namespace
Thomas Monjalon [Mon, 16 Jun 2014 21:10:08 +0000 (23:10 +0200)]
examples/vhost: restrict log type namespace

RTE_LOGTYPE_CONFIG, RTE_LOGTYPE_DATA and RTE_LOGTYPE_PORT are renamed
by adding a VHOST prefix.
This prevents a conflict with the new RTE_LOGTYPE_PORT of the packet framework.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago app/testpmd: add commands for filters
Jingjing Wu [Mon, 16 Jun 2014 07:31:46 +0000 (15:31 +0800)]
app/testpmd: add commands for filters

Add commands in testpmd for NIC filters:
add_ethertype_filter
remove_ethertype_filter
get_ethertype_filter
add_2tuple_filter
remove_2tuple_filter
get_2tuple_filter
add_5tuple_filter
remove_5tuple_filter
get_5tuple_filter
add_syn_filter
remove_syn_filter
get_syn_filter
add_flex_filter
remove_flex_filter
get_flex_filter

Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Reviewed-by: Vladimir Medvedkin <medvedkinv@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ixgbe: add filters
Jingjing Wu [Mon, 16 Jun 2014 07:31:45 +0000 (15:31 +0800)]
ixgbe: add filters

This patch implements the following ixgbe NIC filters:
  syn filter, ethertype filter, 5tuple filter for Intel NIC 82599

Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Reviewed-by: Vladimir Medvedkin <medvedkinv@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago igb: add filters
Jingjing Wu [Mon, 16 Jun 2014 07:31:44 +0000 (15:31 +0800)]
igb: add filters

This patch implements the following igb NIC filters:
  syn filter, ethertype filter, 2tuple filter, flex filter for Intel NIC 82580 and i350
  syn filter, ethertype filter, 5tuple filter for Intel NIC 82576

Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Reviewed-by: Vladimir Medvedkin <medvedkinv@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ethdev: add filters
Jingjing Wu [Mon, 16 Jun 2014 07:31:43 +0000 (15:31 +0800)]
ethdev: add filters

This patch adds APIs for the NIC filters listed below:
ethertype filter, syn filter, 2tuple filter, flex filter, 5tuple filter

Signed-off-by: jingjing.wu <jingjing.wu@intel.com>
Reviewed-by: Vladimir Medvedkin <medvedkinv@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago examples/ip_reassembly: overhaul
Anatoly Burakov [Wed, 28 May 2014 17:32:47 +0000 (18:32 +0100)]
examples/ip_reassembly: overhaul

New stuff:
* Support for regular traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: add IPv6 reassembly
Anatoly Burakov [Wed, 28 May 2014 17:32:46 +0000 (18:32 +0100)]
ip_frag: add IPv6 reassembly

Mostly a copy-paste of IPv4, with a few caveats.

The only supported packets are those in which the fragment extension
header comes immediately after the IPv6 header.
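
Illustration (not part of the patch): a sketch of that restriction as a
check on a raw packet, assuming the buffer starts at the IPv6 header.
The fixed IPv6 header is 40 bytes, its Next Header field is the byte at
offset 6, and value 44 identifies the Fragment header; the function and
macro names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN_SKETCH     40 /* fixed IPv6 header size */
#define IPV6_NEXT_HDR_OFFSET    6  /* Next Header byte in the IPv6 header */
#define IPV6_FRAG_HDR_LEN       8  /* Fragment extension header size */
#define IPV6_NEXT_HDR_FRAGMENT  44 /* Next Header value for Fragment */

/*
 * Returns a pointer to the fragment header when it directly follows
 * the IPv6 header, or NULL for anything else (not IPv6, too short,
 * not fragmented, or other extension headers come first).
 */
const uint8_t *ipv6_frag_hdr_if_first(const uint8_t *pkt, size_t len)
{
    if (len < IPV6_HDR_LEN_SKETCH + IPV6_FRAG_HDR_LEN)
        return NULL;
    if ((pkt[0] >> 4) != 6)
        return NULL;
    if (pkt[IPV6_NEXT_HDR_OFFSET] != IPV6_NEXT_HDR_FRAGMENT)
        return NULL;
    return pkt + IPV6_HDR_LEN_SKETCH;
}
```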

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago examples/ip_fragmentation: overhaul
Anatoly Burakov [Wed, 28 May 2014 17:32:45 +0000 (18:32 +0100)]
examples/ip_fragmentation: overhaul

New stuff:
* Support for regular traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago examples: rename ipv4_frag example to ip_fragmentation
Anatoly Burakov [Wed, 28 May 2014 17:32:44 +0000 (18:32 +0100)]
examples: rename ipv4_frag example to ip_fragmentation

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: add IPv6 fragmentation support
Anatoly Burakov [Wed, 28 May 2014 17:32:43 +0000 (18:32 +0100)]
ip_frag: add IPv6 fragmentation support

Mostly a copy-paste of IPv4.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: rename ipv4_fragmentation function
Anatoly Burakov [Wed, 28 May 2014 17:32:42 +0000 (18:32 +0100)]
ip_frag: rename ipv4_fragmentation function

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: refactor reassembly code into a proper library
Anatoly Burakov [Wed, 28 May 2014 17:32:41 +0000 (18:32 +0100)]
ip_frag: refactor reassembly code into a proper library

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: rename structures in fragmentation table
Anatoly Burakov [Wed, 28 May 2014 17:32:40 +0000 (18:32 +0100)]
ip_frag: rename structures in fragmentation table

Technically, fragmentation table can work for both IPv4 and IPv6
packets, so we're renaming everything to be generic enough to make sense
in IPv6 context.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: remove unneeded check and macro
Anatoly Burakov [Wed, 28 May 2014 17:32:39 +0000 (18:32 +0100)]
ip_frag: remove unneeded check and macro

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: new internal common header
Anatoly Burakov [Wed, 28 May 2014 17:32:38 +0000 (18:32 +0100)]
ip_frag: new internal common header

Moved out debug log macros into common, as reassembly code will later
need them as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: fix code style
Anatoly Burakov [Wed, 28 May 2014 17:32:37 +0000 (18:32 +0100)]
ip_frag: fix code style

Issues were reported by checkpatch.pl.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: refactor IPv4 fragmentation into a proper library
Anatoly Burakov [Wed, 28 May 2014 17:32:36 +0000 (18:32 +0100)]
ip_frag: refactor IPv4 fragmentation into a proper library

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
[Thomas: add in doxygen]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago ip_frag: move fragmentation/reassembly headers into a library
Anatoly Burakov [Wed, 28 May 2014 17:32:35 +0000 (18:32 +0100)]
ip_frag: move fragmentation/reassembly headers into a library

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago tools: add vfio support to setup script
Anatoly Burakov [Fri, 13 Jun 2014 14:52:54 +0000 (15:52 +0100)]
tools: add vfio support to setup script

Support for loading/unloading VFIO drivers, binding/unbinding devices
to/from VFIO, also setting up correct userspace permissions.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago tools: support vfio in dpdk_nic_bind
Anatoly Burakov [Fri, 13 Jun 2014 14:52:53 +0000 (15:52 +0100)]
tools: support vfio in dpdk_nic_bind

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago tools: rename igb_uio_bind to dpdk_nic_bind
Anatoly Burakov [Mon, 16 Jun 2014 12:05:28 +0000 (14:05 +0200)]
tools: rename igb_uio_bind to dpdk_nic_bind

Renaming the igb_uio_bind script to dpdk_nic_bind to have a generic name
before supporting two drivers.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago igb_uio: remove PCI id table
Anatoly Burakov [Fri, 13 Jun 2014 14:52:52 +0000 (15:52 +0100)]
igb_uio: remove PCI id table

Removing the PCI ID list to make igb_uio more similar to a generic driver
like vfio-pci or uio_pci_generic. This is done to make it easier for
the binding script to support multiple drivers.

Note that since igb_uio no longer has a PCI ID list, it can now be
bound to any device, not just those explicitly supported by DPDK. In
other words, it now behaves similarly to PCI stub, VFIO and other generic
PCI drivers.

Therefore to bind a new device to igb_uio, the user will now have to
first write its PCI ID to "new_id" file inside the igb_uio driver
directory, and only then write the PCI ID to "bind". This is reflected
in changes to PCI binding script as well.

There's a weird behaviour of sysfs when a new device ID is added to
new_id. Subsequent writing to "bind" will result in IOError on
closing the file. This error is harmless but it triggers the
exception anyway, so in order to work around that, we check if the
device was actually bound to the driver before raising an error.
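
Illustration (not part of the patch): a sketch of the two-step sequence
in C rather than in the binding script. It assumes the usual Linux sysfs
convention that "new_id" takes a "vendor device" pair in hex and "bind"
takes the device's PCI address; the helper names and the driver
directory parameter are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats a vendor/device pair the way "new_id" expects it. */
int format_new_id(char *buf, size_t n, unsigned vendor, unsigned device)
{
    int len = snprintf(buf, n, "%04x %04x", vendor, device);

    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Writes one string to a file; returns 0 on success, -1 on failure. */
int write_str(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (f == NULL)
        return -1;
    ok = fputs(val, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Registers the ID with the driver, then binds the device.
 * drv_dir would be something like /sys/bus/pci/drivers/igb_uio.
 * A failure reported on the "bind" write may be the harmless error
 * described above, so a caller should check whether the device ended
 * up bound before treating -1 as fatal.
 */
int bind_to_driver(const char *drv_dir, unsigned vendor, unsigned device,
                   const char *pci_addr)
{
    char path[256], id[32];

    if (format_new_id(id, sizeof(id), vendor, device) != 0)
        return -1;
    snprintf(path, sizeof(path), "%s/new_id", drv_dir);
    if (write_str(path, id) != 0)
        return -1;
    snprintf(path, sizeof(path), "%s/bind", drv_dir);
    return write_str(path, pci_addr);
}
```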

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago eal: add command line option to select vfio interrupt type
Anatoly Burakov [Fri, 13 Jun 2014 14:52:49 +0000 (15:52 +0100)]
eal: add command line option to select vfio interrupt type

Unlike igb_uio, VFIO interrupt type is not set by kernel module
parameters but is set up via ioctl() calls at runtime. This warrants
a new EAL command-line parameter. It will have no effect if VFIO is
not compiled, but will set VFIO interrupt type to either "legacy", "msi"
or "msix" if VFIO support is compiled. Note that VFIO initialization
will fail if the interrupt type selected is not supported by the system.

If the interrupt type parameter wasn't specified, VFIO will try all
interrupt types (starting with MSI-X).

In unit tests, we don't know if VFIO is compiled (eal_vfio.h header is
internal to Linuxapp EAL), so we check this flag regardless.
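
Illustration (not part of the patch): a sketch of parsing the three
values and of the fallback when nothing is requested. The enum and
function names are hypothetical, and the order after MSI-X in the
fallback list is an assumption; the description above only says that
MSI-X is tried first.

```c
#include <stddef.h>
#include <string.h>

enum vfio_intr_sketch { INTR_NONE, INTR_LEGACY, INTR_MSI, INTR_MSIX };

/* Maps an option value to a mode; returns 0 on success, -1 if unknown. */
int parse_vfio_intr(const char *arg, enum vfio_intr_sketch *mode)
{
    static const struct {
        const char *name;
        enum vfio_intr_sketch value;
    } names[] = {
        { "legacy", INTR_LEGACY },
        { "msi", INTR_MSI },
        { "msix", INTR_MSIX },
    };

    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcmp(arg, names[i].name) == 0) {
            *mode = names[i].value;
            return 0;
        }
    }
    return -1;
}

/* Fills `out` with the modes to try, in order; returns how many. */
size_t intr_candidates(enum vfio_intr_sketch requested,
                       enum vfio_intr_sketch out[3])
{
    if (requested != INTR_NONE) {
        out[0] = requested; /* explicit choice: no fallback */
        return 1;
    }
    out[0] = INTR_MSIX;   /* from the description: MSI-X first */
    out[1] = INTR_MSI;    /* assumed order for the remaining types */
    out[2] = INTR_LEGACY;
    return 3;
}
```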

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago pci: enable vfio device binding
Anatoly Burakov [Fri, 13 Jun 2014 14:52:48 +0000 (15:52 +0100)]
pci: enable vfio device binding

Add support for binding VFIO devices if RTE_PCI_DRV_NEED_MAPPING is set
for this driver. Try VFIO first, if not mapped then try IGB_UIO too.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago vfio: add multiprocess support
Anatoly Burakov [Fri, 13 Jun 2014 14:52:47 +0000 (15:52 +0100)]
vfio: add multiprocess support

Since VFIO cannot be used to map the same device twice, secondary
processes receive the device/group fd's by means of communicating over a
local socket. Only group and container fd's should be sent, as device
fd's can be obtained via ioctl() calls on the group fd.

For multiprocess, VFIO distinguishes between existing but unused groups
(e.g. groups that aren't bound to VFIO driver) and non-existing groups in
order to know if the secondary process requests a valid group, or if
secondary process requests something that doesn't exist.

VFIO multiprocess sync communicates over a simple protocol. It defines
two requests - request for group fd, and request for container fd.
Possible replies are: SOCKET_OK (an OK signal), SOCKET_ERR (error
signal) and SOCKET_NO_FD (a signal that indicates that the requested
VFIO group is valid, but no fd is present for that group - indicating
that the respective group is simply not bound to VFIO driver).

Here is the logic in a nutshell:

1. secondary process sends SOCKET_REQ_CONTAINER or SOCKET_REQ_GROUP
1a. in case of SOCKET_REQ_GROUP, client also then sends group number
2. primary process receives message
2a. in case of invalid group, SOCKET_ERR is sent back to secondary
2b. in case of unbound group, SOCKET_NO_FD is sent back to secondary
2c. in case of valid group, SOCKET_OK is sent and followed by fd
3. socket is closed

In case of any error, SOCKET_ERR is sent and the socket is closed.
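
Illustration (not part of the patch): a sketch of the primary's reply
decision for steps 2a to 2c. The message names come from the description
above, but their numeric values, the structures and the function are
hypothetical. Actually handing an fd to another process over a local
socket needs SCM_RIGHTS ancillary data, which is not shown here.

```c
#include <stddef.h>

/* Message codes: names from the commit message, values invented. */
enum {
    SOCKET_REQ_CONTAINER = 0x100,
    SOCKET_REQ_GROUP,
    SOCKET_OK,
    SOCKET_NO_FD,
    SOCKET_ERR
};

/* One known IOMMU group; fd is -1 when it is not bound to VFIO. */
struct group_sketch {
    int group_no;
    int fd;
};

struct reply_sketch {
    int code;
    int fd; /* only meaningful together with SOCKET_OK */
};

/* Decides what the primary process answers to one request. */
struct reply_sketch primary_handle(int request, int group_no,
                                   int container_fd,
                                   const struct group_sketch *groups,
                                   size_t n_groups)
{
    struct reply_sketch r = { SOCKET_ERR, -1 };

    if (request == SOCKET_REQ_CONTAINER) {
        if (container_fd >= 0) {
            r.code = SOCKET_OK;
            r.fd = container_fd;
        }
        return r;
    }
    if (request != SOCKET_REQ_GROUP)
        return r; /* unknown request */

    for (size_t i = 0; i < n_groups; i++) {
        if (groups[i].group_no != group_no)
            continue;
        if (groups[i].fd < 0) {
            r.code = SOCKET_NO_FD; /* 2b: group exists but is unbound */
        } else {
            r.code = SOCKET_OK;    /* 2c: fd follows the OK */
            r.fd = groups[i].fd;
        }
        return r;
    }
    return r; /* 2a: group does not exist */
}
```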

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years ago vfio: DMA mapping
Anatoly Burakov [Fri, 13 Jun 2014 14:52:46 +0000 (15:52 +0100)]
vfio: DMA mapping

Adding code to support VFIO mapping (primary processes only). Most of
the things are done via ioctl() calls on either /dev/vfio/vfio (the
container) or a /dev/vfio/$GROUP_NR (IOMMU group).

In a nutshell, the code does the following:
1. creates a VFIO container (an entity that allows sharing IOMMU DMA
   mappings between devices)
2. checks if a given PCI device is a member of an IOMMU group (if it's
   not, this indicates that the device isn't bound to VFIO)
3. calls open() on the group file to obtain a group fd
4. checks if the group is viable (that is, if all the devices in the
   same IOMMU group are either bound to VFIO or not bound to anything)
5. adds the group to a container
6. sets up DMA mappings (only done once, mapping whole DPDK hugepage
   memory for DMA, with a 1:1 correspondence of IOVA to PA)
7. gets the actual PCI device fd from the group fd (can fail, which
   simply means that this particular device is not bound to VFIO)
8. maps BARs (MSI-X BAR cannot be mmapped, so it is skipped)
9. sets up interrupt structures (but does not enable them!)
10. enables PCI bus mastering
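
A minimal sketch of steps 3 to 7 using the VFIO ioctls from
<linux/vfio.h>. The function names are hypothetical and this is not the
code from the patch; container_fd is assumed to come from opening
/dev/vfio/vfio, and error handling is reduced to returning -1.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/*
 * Steps 3-5: open the group file, check viability, add the group to the
 * container. Returns the group fd, or -1 on any failure.
 */
int vfio_sketch_attach_group(int container_fd, int group_nr)
{
	char path[64];
	struct vfio_group_status status = { .argsz = sizeof(status) };
	int group_fd;

	snprintf(path, sizeof(path), "/dev/vfio/%d", group_nr);
	group_fd = open(path, O_RDWR);
	if (group_fd < 0)
		return -1;	/* no group file: nothing bound to VFIO */

	if (ioctl(group_fd, VFIO_GROUP_GET_STATUS, &status) < 0 ||
	    !(status.flags & VFIO_GROUP_FLAGS_VIABLE) ||
	    ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container_fd) < 0) {
		close(group_fd);
		return -1;
	}
	return group_fd;
}

/* Step 6: select the type 1 IOMMU and map one region with IOVA == PA. */
int vfio_sketch_map_dma(int container_fd, void *va, uint64_t pa,
			uint64_t len)
{
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (uint64_t)(uintptr_t)va,
		.iova = pa,
		.size = len,
	};

	if (ioctl(container_fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0)
		return -1;
	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}

/* Step 7: get the device fd; pci_addr is a "domain:bus:dev.fn" string. */
int vfio_sketch_get_device_fd(int group_fd, const char *pci_addr)
{
	return ioctl(group_fd, VFIO_GROUP_GET_DEVICE_FD, pci_addr);
}
```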

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agovfio: interrupts
Anatoly Burakov [Fri, 13 Jun 2014 14:52:44 +0000 (15:52 +0100)]
vfio: interrupts

Creating code to handle VFIO interrupts in EAL interrupts (supports all
types of interrupts).

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agovfio: header for build support
Anatoly Burakov [Fri, 13 Jun 2014 14:52:42 +0000 (15:52 +0100)]
vfio: header for build support

Add VFIO compilation option to linuxapp config.

Adding a header that will determine if VFIO support should be compiled
in. If VFIO is enabled in config (and it's enabled by default), then the
header will also check for kernel version. If VFIO is enabled in config
and if the kernel version is 3.6+, then VFIO_PRESENT will be defined.
This is the macro that should be used to determine if VFIO support is
being compiled in.
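
A rough sketch of what such a header could look like. The config macro
name RTE_EAL_VFIO and the helper vfio_compiled_in() are assumptions made
for illustration; only VFIO_PRESENT and the 3.6 threshold come from the
description above.

```c
#include <linux/version.h>

/* Assumption: normally provided by the generated build configuration. */
#define RTE_EAL_VFIO 1

/* VFIO support needs both the config switch and a 3.6+ kernel. */
#if defined(RTE_EAL_VFIO) && LINUX_VERSION_CODE >= KERNEL_VERSION(3, 6, 0)
#define VFIO_PRESENT
#endif

/* Hypothetical helper: code elsewhere tests VFIO_PRESENT the same way. */
static inline int vfio_compiled_in(void)
{
#ifdef VFIO_PRESENT
	return 1;
#else
	return 0;
#endif
}
```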

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoeal: move interrupt type out of igb_uio
Anatoly Burakov [Fri, 13 Jun 2014 14:52:41 +0000 (15:52 +0100)]
eal: move interrupt type out of igb_uio

Moving interrupt type enum out of igb_uio and renaming it to be more
generic. Such a strange header naming and separation is done mostly to
make coming virtio patches easier to port to dpdk.org tree.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoigb_uio: make compilation optional
Anatoly Burakov [Fri, 13 Jun 2014 14:52:40 +0000 (15:52 +0100)]
igb_uio: make compilation optional

Currently, igb_uio is always compiled. Some Linux distributions may not
want to include igb_uio with DPDK, so we need to make sure that igb_uio
compilation for Linuxapp targets can be optional.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agopci: rename RTE_PCI_DRV_NEED_IGB_UIO to RTE_PCI_DRV_NEED_MAPPING
Anatoly Burakov [Fri, 13 Jun 2014 14:52:39 +0000 (15:52 +0100)]
pci: rename RTE_PCI_DRV_NEED_IGB_UIO to RTE_PCI_DRV_NEED_MAPPING

Rename the RTE_PCI_DRV_NEED_IGB_UIO to be more generic.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agopci: distinguish between legitimate failures and non-fatal errors
Anatoly Burakov [Fri, 13 Jun 2014 14:52:38 +0000 (15:52 +0100)]
pci: distinguish between legitimate failures and non-fatal errors

Currently, EAL does not distinguish between actual failures and expected
initialization errors. E.g. sometimes the driver fails to initialize
because it was not supposed to be initialized in the first place, such
as device not being managed by said driver.

This patch makes EAL fail on actual initialization errors while still
skipping over expected initialization errors.
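
One way to express this distinction is a three-way return convention,
sketched below. The convention (0 = initialized, positive = device not
managed by this driver, negative = real failure) and all names are
assumptions made for illustration, not necessarily what the patch
implements.

```c
#include <stddef.h>

typedef int (*probe_fn)(int dev_id);

/* Hypothetical driver probe: device 13 is broken, odd ids are not ours. */
int demo_probe(int dev_id)
{
	if (dev_id == 13)
		return -1;	/* actual initialization failure */
	return dev_id % 2;	/* 1: not managed by this driver, 0: ok */
}

/* Probe every device; abort only on a real failure. */
int probe_all(probe_fn probe, const int *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = probe(devs[i]);

		if (ret < 0)
			return -1;	/* fatal: propagate to the caller */
		/* ret > 0 is an expected "skip"; keep scanning */
	}
	return 0;
}
```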

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agopci: fix code style
Anatoly Burakov [Fri, 13 Jun 2014 14:52:37 +0000 (15:52 +0100)]
pci: fix code style

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agopci: move uio mapping in a dedicated file
Anatoly Burakov [Fri, 13 Jun 2014 14:52:36 +0000 (15:52 +0100)]
pci: move uio mapping in a dedicated file

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agopci: rework uio mapping to prepare for vfio
Anatoly Burakov [Fri, 13 Jun 2014 14:52:35 +0000 (15:52 +0100)]
pci: rework uio mapping to prepare for vfio

Separating mapping code and calls to open. This is preparatory work
for VFIO patch since it'll need to map BARs too but it doesn't use path
in mapped_pci_resource. Also, renaming structs to be more generic.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agomem: make --no-huge use mmap instead of malloc
Anatoly Burakov [Fri, 13 Jun 2014 14:52:50 +0000 (15:52 +0100)]
mem: make --no-huge use mmap instead of malloc

This makes it possible to run DPDK without hugepage memory when VFIO
is used, as VFIO uses virtual addresses to set up DMA mappings.

Technically, malloc is just fine, but we want to guarantee that
memory will be page-aligned, so using mmap to be safe.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoeal: remove useless compilation flag
Anatoly Burakov [Fri, 13 Jun 2014 14:52:45 +0000 (15:52 +0100)]
eal: remove useless compilation flag

eal_hpet.c was renamed to eal_timer.c and, thanks to code changes, does
not need the -Wno-return-type any more.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoixgbe: new vectorized functions for Rx/Tx
Bruce Richardson [Fri, 13 Jun 2014 22:52:24 +0000 (23:52 +0100)]
ixgbe: new vectorized functions for Rx/Tx

New file containing optimized receive and transmit functions which
use 128bit vector instructions to improve performance. When conditions
permit, these functions will be enabled at runtime by the device
initialization routines already in the PMD.

The compilation of the vectorized RX and TX code paths is controlled by
a new setting in the build time configuration for the IXGBE driver. Also
added is a setting which allows an optional further performance increase
by disabling the use of the olflags field on packet RX.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: XiaonanX Zhang <xiaonanx.zhang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
[Thomas: code-style adjustments]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoacl: new sample l3fwd-acl
Konstantin Ananyev [Fri, 13 Jun 2014 11:26:53 +0000 (12:26 +0100)]
acl: new sample l3fwd-acl

Demonstrates the use of the ACL library in the DPDK application to
implement packet classification and L3 forwarding.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]

10 years agoacl: new test-acl application
Konstantin Ananyev [Fri, 13 Jun 2014 11:26:52 +0000 (12:26 +0100)]
acl: new test-acl application

Usage example and main test application for the ACL library.
Provides IPv4/IPv6 5-tuple classification.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]

10 years agoacl: update unit tests
Konstantin Ananyev [Fri, 13 Jun 2014 11:26:51 +0000 (12:26 +0100)]
acl: update unit tests

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
10 years agoacl: new library
Konstantin Ananyev [Fri, 13 Jun 2014 11:26:50 +0000 (12:26 +0100)]
acl: new library

The ACL library is used to perform an N-tuple search over a set of rules with
multiple categories and find the best match for each category.
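
To illustrate what "best match for each category" means, here is a naive
linear-scan sketch over a 2-tuple (source address, destination port). It
does not use the ACL library API and says nothing about how the library
searches internally; the struct layout, the names and the convention that
result 0 means "no match" are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_CATEGORIES 2

struct rule_sketch {
	uint32_t src_addr, src_mask;		/* masked address match */
	uint16_t dst_port_lo, dst_port_hi;	/* inclusive port range */
	uint32_t category_mask;			/* bit c set: applies to c */
	int32_t priority;			/* higher wins */
	uint32_t userdata;			/* non-zero result value */
};

/* Fill results[c] with the userdata of the best matching rule per category. */
void classify_sketch(const struct rule_sketch *rules, size_t n,
		     uint32_t src, uint16_t dport,
		     uint32_t results[NUM_CATEGORIES])
{
	int32_t best[NUM_CATEGORIES];

	for (int c = 0; c < NUM_CATEGORIES; c++) {
		results[c] = 0;
		best[c] = INT32_MIN;
	}
	for (size_t i = 0; i < n; i++) {
		const struct rule_sketch *r = &rules[i];

		if ((src & r->src_mask) != (r->src_addr & r->src_mask))
			continue;
		if (dport < r->dst_port_lo || dport > r->dst_port_hi)
			continue;
		for (int c = 0; c < NUM_CATEGORIES; c++) {
			if (((r->category_mask >> c) & 1) &&
			    r->priority > best[c]) {
				best[c] = r->priority;
				results[c] = r->userdata;
			}
		}
	}
}
```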

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]

10 years agovirtio: fix build with debug enabled
Stephen Hemminger [Fri, 13 Jun 2014 01:32:50 +0000 (18:32 -0700)]
virtio: fix build with debug enabled

Remove useless message that breaks if VIRTIO_DEBUG_DRIVER is defined.
virtio_ethdev.c:224:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
10 years agovirtio: checkpatch cleanups
Stephen Hemminger [Fri, 13 Jun 2014 01:32:40 +0000 (18:32 -0700)]
virtio: checkpatch cleanups

This fixes style problems reported by checkpatch including:
  * extra whitespace
  * spaces before tabs
  * strings broken across lines
  * excessively long lines
  * missing spaces after keywords
  * unnecessary paren's in return statements

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
10 years agoconfig: minor cleanup
Thomas Monjalon [Thu, 12 Jun 2014 12:57:24 +0000 (14:57 +0200)]
config: minor cleanup

Move things to their right location and add a missing comment.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agodistributor: add unit tests
Bruce Richardson [Thu, 29 May 2014 10:12:17 +0000 (11:12 +0100)]
distributor: add unit tests

Add a set of unit tests and a basic performance test for the
distributor library. These tests cover all the major functionality of
the library on both distributor and worker sides.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
10 years agodistributor: new packet distributor library
Bruce Richardson [Thu, 29 May 2014 10:12:14 +0000 (11:12 +0100)]
distributor: new packet distributor library

This adds the code for a new Intel DPDK library for packet distribution.
The distributor is a component which is designed to pass packets
one-at-a-time to workers, with dynamic load balancing. Using the RSS
field in the mbuf as a tag, the distributor tracks what packet tag is
being processed by what worker and then ensures that no two packets with
the same tag are in-flight simultaneously. Once a tag is not in-flight,
then the next packet with that tag will be sent to the next available
core.
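
The tag-tracking rule can be sketched as follows. This is a simplified
model, not the library's data structures or API: the struct, the function
names and the choice to report "hold the packet back" as -1 are
assumptions.

```c
#include <stdint.h>

#define NUM_WORKERS 4

struct distributor_sketch {
	int busy[NUM_WORKERS];		/* worker has a packet in flight */
	uint32_t tag[NUM_WORKERS];	/* tag of that packet */
};

/*
 * Try to hand a packet with the given tag to a worker.
 * Returns the worker index, or -1 if the packet must wait (same tag is
 * already in flight, or no worker is idle).
 */
int distribute(struct distributor_sketch *d, uint32_t tag)
{
	int idle = -1;

	for (int w = 0; w < NUM_WORKERS; w++) {
		if (d->busy[w]) {
			if (d->tag[w] == tag)
				return -1;	/* never two of one tag in flight */
		} else if (idle < 0) {
			idle = w;		/* first available worker */
		}
	}
	if (idle >= 0) {
		d->busy[idle] = 1;
		d->tag[idle] = tag;
	}
	return idle;
}

/* Worker w finished its packet; its tag is no longer in flight. */
void worker_done(struct distributor_sketch *d, int w)
{
	d->busy[w] = 0;
}
```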

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
[Thomas: add doxygen @file comment]

10 years agoexamples/l3fwd: reorganise and optimize LPM code path
Konstantin Ananyev [Wed, 11 Jun 2014 13:38:46 +0000 (14:38 +0100)]
examples/l3fwd: reorganise and optimize LPM code path

With latest HW and optimised RX/TX path there is a huge gap between
testpmd iofwd and l3fwd performance results.
So there is an attempt to optimise l3fwd LPM code path and reduce the gap:
 - Instead of processing each input packet up to completion -
 divide packet processing into several stages and perform
 stage by stage for the whole burst.
 - Unroll things by the factor of 4 whenever possible.
 - Use SSE intrinsics for some operations (bswap, replace MAC addresses, etc).
 - Avoid TX packet buffering whenever possible.
 - Move some checks from RX/TX into setup phase.

Note that the new (optimized) code path can be switched on/off by setting
ENABLE_MULTI_BUFFER_OPTIMIZE macro to 1/0.

Some performance data:
SUT: dual-socket board IVB 2.8GHz, 2x1GB pages.
4 ports on 4 NICs (all at socket 0) connected to the traffic generator.
kernel: 3.11.3-201.fc19.x86_64, gcc: 4.8.2.
64B packets, using the packet flooding method.
All 4 ports are managed by one logical core:
Optimised scalar PMD RX/TX was used.

                          DIFF % (NEW-OLD)
IPV4-CONT-BURST:               +23%
IPV6-CONT-BURST :              +13%
IPV4/IPV6-CONT-BURST:          +8%
IPV4-4STREAMSX8:               +7%
IPV4-4STREAMSX1:               -2%

Test cases description:
IPV4-CONT-BURST - IPV4 packets; all packets from one input port
are destined for the same output port.
IPV6-CONT-BURST - IPV6 packets; all packets from one input port
are destined for the same output port.
IPV4/IPV6-CONT-BURST - mix of the first 2 with interleave=1
(e.g: IPV4,IPV6,IPV4,IPV6, ...)
IPV4-4STREAMSX1 - 4 streams of IPV4 packets, where all packets
from same stream are destined for the same output port
(e.g: IPV4_DST_P0, IPV4_DST_P1,  IPV4_DST_P2, IPV4_DST_P3, IPV4_DST_P0, ...)
IPV4-4STREAMSX8 - same as above but packets for each stream
are coming in groups of 8
(e.g: IPV4_DST_P0 X 8, IPV4_DST_P1 X 8, IPV4_DST_P2 X 8, IPV4_DST_P3 X 8,
IPV4_DST_P0 X 8, ...)

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
10 years agolpm: introduce rte_lpm_lookupx4
Konstantin Ananyev [Wed, 11 Jun 2014 13:38:45 +0000 (14:38 +0100)]
lpm: introduce rte_lpm_lookupx4

Allows looking up four IP addresses in an LPM table in one call.
Uses SSE intrinsics.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
10 years agopci: remove conditions on device definitions
Pawel Wodkowski [Wed, 11 Jun 2014 07:20:37 +0000 (08:20 +0100)]
pci: remove conditions on device definitions

This patch removes obsolete code that prevents defining
NICs 82575EB, I218 and I350.

Signed-off-by: Pawel Wodkowski <pawelx.wdkowski@intel.com>
[Thomas: remove conditions for I218]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoapp/testpmd: Tx rate limitation for queue and VF
Ouyang Changchun [Mon, 26 May 2014 07:45:31 +0000 (15:45 +0800)]
app/testpmd: Tx rate limitation for queue and VF

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
10 years agoixgbe: Tx rate limitation for queue and VF
Ouyang Changchun [Mon, 26 May 2014 07:45:30 +0000 (15:45 +0800)]
ixgbe: Tx rate limitation for queue and VF

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
10 years agoethdev: Tx rate limitation for queue and VF
Ouyang Changchun [Mon, 26 May 2014 07:45:29 +0000 (15:45 +0800)]
ethdev: Tx rate limitation for queue and VF

Add API to support setting TX rate for a queue and a VF.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
10 years agoapp/testpmd: add commands for link up and down
Ouyang Changchun [Wed, 28 May 2014 07:15:02 +0000 (15:15 +0800)]
app/testpmd: add commands for link up and down

This patch adds commands to test the functionality of setting link up and down.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years agoixgbe: link up and down
Ouyang Changchun [Wed, 28 May 2014 07:15:01 +0000 (15:15 +0800)]
ixgbe: link up and down

It is implemented by enabling or disabling TX laser.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years agoethdev: API for link up and down
Ouyang Changchun [Wed, 28 May 2014 07:15:00 +0000 (15:15 +0800)]
ethdev: API for link up and down

This patch adds API to support the functionality of setting link up and down.
It can be used to repeatedly stop and restart RX/TX of a port without
re-allocating resources for the port and re-configuring the port.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>

10 years agoethdev: fix compiler warning on PMD_DEBUG_TRACE formats
Konstantin Ananyev [Mon, 9 Jun 2014 17:26:17 +0000 (18:26 +0100)]
ethdev: fix compiler warning on PMD_DEBUG_TRACE formats

icc 12.1 complains about RTE_LOG() format:
"argument is incompatible with corresponding format string conversion"

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoethdev: prevent from starting/stopping already started/stopped device
Konstantin Ananyev [Mon, 9 Jun 2014 17:26:16 +0000 (18:26 +0100)]
ethdev: prevent from starting/stopping already started/stopped device

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoigb/ixgbe: reset queue pointers after releasing
Konstantin Ananyev [Mon, 9 Jun 2014 17:26:15 +0000 (18:26 +0100)]
igb/ixgbe: reset queue pointers after releasing

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoe1000: do not release queue on alloc error
Konstantin Ananyev [Mon, 9 Jun 2014 17:26:14 +0000 (18:26 +0100)]
e1000: do not release queue on alloc error

If igb_alloc_rx_queue_mbufs() fails to allocate an mbuf for an RX queue,
it calls igb_rx_queue_release(rxq).
That causes rxq to be silently freed, without updating
dev->data->rx_queues[].
So any further reference to it will trigger a SIGSEGV.
The same thing happens in the em PMD.

To fix: igb_alloc_rx_queue_mbufs() should just return an error to the
caller and let the upper layer deal with the problem.
That's what ixgbe PMD is doing right now.
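
The ownership rule behind the fix can be sketched like this. The struct
and function below are simplified stand-ins, not the driver code:
pool_avail simulates how many buffers the mempool can still provide, and
the point is only that the failing function reports the error and leaves
the queue object alone.

```c
#include <errno.h>

#define NB_DESC 4

/* Hypothetical, heavily simplified RX queue. */
struct rx_queue_sketch {
	int filled[NB_DESC];	/* 1 if the descriptor has a buffer */
};

/*
 * Fill the queue with buffers. On shortage, return -ENOMEM and do NOT
 * release q: the caller still references it and decides what to do.
 */
int alloc_rx_bufs(struct rx_queue_sketch *q, int pool_avail)
{
	for (int i = 0; i < NB_DESC; i++) {
		if (pool_avail-- <= 0)
			return -ENOMEM;
		q->filled[i] = 1;
	}
	return 0;
}
```

Because the queue is never freed behind the caller's back, a later retry
or an orderly release by the upper layer stays safe.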

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agoremove trailing whitespaces
Bruce Richardson [Tue, 3 Jun 2014 23:42:50 +0000 (00:42 +0100)]
remove trailing whitespaces

This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
10 years agopci: fix build for FreeBSD
Alan Carew [Thu, 5 Jun 2014 16:12:08 +0000 (17:12 +0100)]
pci: fix build for FreeBSD

Add __rte_unused to
pci_unbind_kernel_driver(struct rte_pci_device *dev)

Signed-off-by: Alan Carew <alan.carew@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
10 years agoeal: fix build for FreeBSD
Alan Carew [Thu, 5 Jun 2014 16:12:07 +0000 (17:12 +0100)]
eal: fix build for FreeBSD

Recent change to rte_dump_tailq (commit 591a9d7985c1230652),
which now uses a FILE parameter, causes compilation to fail under FreeBSD.
The failure was traced to a missing include of stdio.h.

Errors:
rte_tailq.h:  unknown type name 'FILE' void rte_dump_tailq(FILE *f);
rte_memory.h: unknown type name 'FILE' void rte_dump_physmem_layout(FILE *f);

Signed-off-by: Alan Carew <alan.carew@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
10 years agomk: factorize config rules
Thomas Monjalon [Tue, 10 Jun 2014 13:42:57 +0000 (15:42 +0200)]
mk: factorize config rules

Error message for missing template is factorized in notemplate rule.

RTE_OUTPUT directory is marked as order-only prerequisite.

RTE_OUTPUT is always created after having been cleaned for rte_config.h.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>

10 years agomk: allow updates to build config on make install
Bruce Richardson [Wed, 14 May 2014 15:55:10 +0000 (16:55 +0100)]
mk: allow updates to build config on make install

When running "make config", an additional config.orig file is also
generated, which is intended to hold the original, clean configuration
from the template.
When running make install, we first check whether a .config file
exists, and run make config if it does not. If there is a file, we then
check if it's unmodified, in which case we regenerate a new .config to
take account of any possible updates to the template. Finally, in the
case where there is an existing .config file, and it HAS been modified,
we then do a check to see if the template has had further updates, and
throw an error if so. If no updates, we continue with the build using
the existing, user-modified config.
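
The decision procedure above, written out as a small function for
clarity. The real logic lives in the makefiles; the enum and function
names here are made up for illustration.

```c
enum cfg_action {
	CFG_GENERATE,	/* no .config yet: run make config */
	CFG_REGENERATE,	/* unmodified .config: refresh from the template */
	CFG_KEEP,	/* user-modified, template unchanged: build as is */
	CFG_ERROR	/* user-modified and template changed: stop */
};

enum cfg_action install_config_action(int has_config, int user_modified,
				      int template_changed)
{
	if (!has_config)
		return CFG_GENERATE;
	if (!user_modified)
		return CFG_REGENERATE;
	if (template_changed)
		return CFG_ERROR;
	return CFG_KEEP;
}
```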

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
10 years agomk: fix 32-bit link with gcc
Thomas Monjalon [Mon, 19 May 2014 21:45:03 +0000 (23:45 +0200)]
mk: fix 32-bit link with gcc

Some linker options were not prefixed by -Wl, when using CC:
-z muldefs
-melf_i386 (CPU_LDFLAGS in 32-bit config)

I didn't see any error with -z muldefs but it isn't documented in gcc
manual. So it's safer to explicitly pass it to the linker.
Also building 32-bit shared library raises this error:
gcc: error: unrecognized command line option ‘-melf_i386’

Using macro linkerprefix fixes it.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
10 years agopcap: fix Tx mbuf corruption
Konstantin Ananyev [Wed, 28 May 2014 14:47:02 +0000 (15:47 +0100)]
pcap: fix Tx mbuf corruption

If pcap_sendpacket() fails, then eth_pcap_tx shouldn't silently free that
mbuf and continue.
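
A sketch of the intended behaviour, assuming the usual burst contract
where the return value tells the caller how many leading packets were
consumed. The types and send_pkt() are hypothetical stand-ins for mbufs
and pcap_sendpacket(); this is not the driver code.

```c
/* Hypothetical packet: 'sendable' drives the fake send result. */
struct pkt_sketch {
	int sendable;
	int freed;
};

/* Stand-in for pcap_sendpacket(): 0 on success, -1 on failure. */
static int send_pkt(const struct pkt_sketch *p)
{
	return p->sendable ? 0 : -1;
}

/* Transmit a burst; stop at the first failure. Returns packets sent. */
unsigned tx_burst_sketch(struct pkt_sketch *pkts[], unsigned nb)
{
	unsigned i;

	for (i = 0; i < nb; i++) {
		if (send_pkt(pkts[i]) != 0)
			break;		/* unsent packets stay with the caller */
		pkts[i]->freed = 1;	/* release only what was transmitted */
	}
	return i;
}
```

If the driver freed a packet it then reported as unsent, the caller would
presumably retry or free that buffer again, which is the kind of
corruption the commit title refers to.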

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>