dpdk.git
9 years ago mbuf: generic support for TCP segmentation offload
Olivier Matz [Wed, 26 Nov 2014 15:04:52 +0000 (16:04 +0100)]
mbuf: generic support for TCP segmentation offload

Some of the NICs supported by DPDK can accelerate TCP
traffic by using segmentation offload. The application prepares a packet
with a valid TCP header and a size of up to 64K, and delegates the
segmentation to the NIC.

Implement the generic part of TCP segmentation offload in rte_mbuf. It
introduces 2 new fields in rte_mbuf: l4_len (length of L4 header in bytes)
and tso_segsz (MSS of packets).

To delegate the TCP segmentation to the hardware, the user has to:

- set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
  PKT_TX_TCP_CKSUM)
- set the flag PKT_TX_IPV4 or PKT_TX_IPV6
- set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
  the packet
- fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
- calculate the pseudo header checksum without taking ip_len into account,
  and set it in the TCP header, for instance by using
  rte_ipv4_phdr_cksum(ip_hdr, ol_flags)
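
As an illustration of the last step, the sketch below sums an IPv4 pseudo
header with the length word left at zero, which is one reading of "without
taking ip_len into account". It is standalone C and not the DPDK
implementation: the name phdr_sum_no_len and its parameters are hypothetical,
and whether the value is stored complemented or not is left to the driver
documentation.

```c
#include <stdint.h>

/* Hypothetical helper, not a DPDK function.
 * src and dst are IPv4 addresses in host byte order, proto is the IP
 * protocol number (6 for TCP). The 16-bit length word of the pseudo
 * header is deliberately left out, i.e. treated as zero. */
static uint16_t phdr_sum_no_len(uint32_t src, uint32_t dst, uint8_t proto)
{
    uint32_t sum = 0;

    sum += (src >> 16) & 0xffff;
    sum += src & 0xffff;
    sum += (dst >> 16) & 0xffff;
    sum += dst & 0xffff;
    sum += proto;               /* zero byte followed by the protocol */
    /* length word omitted, as required by the step above */

    /* fold the carries back into the low 16 bits */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)sum;
}
```

For 192.168.0.1 to 192.168.0.2 with TCP, the words 0xc0a8, 0x0001, 0xc0a8,
0x0002 and 0x0006 add up to 0x18159, which folds to 0x815a.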

The API is inspired by ixgbe hardware (the next commit adds the
support for ixgbe), but it seems generic enough to be used for other
hw/drivers in the future.

This commit also reworks the way l2_len and l3_len are used in igb
and ixgbe drivers as the l2_l3_len is not available anymore in mbuf.

Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago net: new checksum functions
Olivier Matz [Wed, 26 Nov 2014 15:04:51 +0000 (16:04 +0100)]
net: new checksum functions

Introduce new functions to calculate checksums. These new functions
are derived from the ones provided in csumonly.c but slightly reworked.
There is still some room for future optimization of these functions
(maybe SSE/AVX, ...).

This API will be modified in the next commits by the introduction of
TSO, which requires a different pseudo header checksum to be set in the
packet.
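
For readers unfamiliar with the arithmetic behind these functions, here is a
minimal standalone sketch of the 16-bit ones'-complement Internet checksum.
It does not reproduce the functions added by this commit; the names
raw_sum16 and inet_cksum are hypothetical, and the byte-at-a-time loop
favours clarity over speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum a buffer as big-endian 16-bit words. A 32-bit accumulator is
 * enough for packet-sized buffers (up to roughly 128 KB). */
static uint32_t raw_sum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;   /* pad the odd byte with zero */
    return sum;
}

/* Fold the carries and complement the result. */
static uint16_t inet_cksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = raw_sum16(buf, len);

    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Running inet_cksum() over an IPv4 header whose checksum field is zero gives
the value to store in that field; running it over a header that already
carries a correct checksum gives 0.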

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/testpmd: rework checksum forward engine
Olivier Matz [Wed, 26 Nov 2014 15:04:50 +0000 (16:04 +0100)]
app/testpmd: rework checksum forward engine

The csum forward engine was becoming too complex to be used and
extended (the next commits want to add the support of TSO):

- no explanation about what the code does
- code is not factorized, lots of code duplicated, especially between
  ipv4/ipv6
- user command line api: use of bitmasks that need to be calculated by
  the user
- the user flags don't have the same semantics:
  - for legacy IP/UDP/TCP/SCTP, it selects software or hardware checksum
  - for other (vxlan), it selects between hardware checksum or no
    checksum
- the code relies too much on flags set by the driver without software
  alternative (ex: PKT_RX_TUNNEL_IPV4_HDR). It is nice to be able to
  compare a software implementation with the hardware offload.

This commit tries to fix these issues, and provide a simple definition
of what is done by the forward engine:

 * Receive a burst of packets, and for supported packet types:
 *  - modify the IPs
 *  - reprocess the checksum in SW or HW, depending on testpmd command line
 *    configuration
 * Then packets are transmitted on the output port.
 *
 * Supported packets are:
 *   Ether / (vlan) / IP|IP6 / UDP|TCP|SCTP .
 *   Ether / (vlan) / IP|IP6 / UDP / VxLAN / Ether / IP|IP6 / UDP|TCP|SCTP
 *
 * The network parser supposes that the packet is contiguous, which may
 * not be the case in real life.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
9 years ago app/testpmd: fix use of offload flags
Olivier Matz [Wed, 26 Nov 2014 15:04:49 +0000 (16:04 +0100)]
app/testpmd: fix use of offload flags

In testpmd the rte_port->tx_ol_flags flag was used in 2 incompatible
manners:
- sometimes used with testpmd specific flags (0xff for checksums, and
  bit 11 for vlan)
- sometimes assigned to m->ol_flags directly, which is wrong in case
  of checksum flags

This commit replaces the hardcoded values by named definitions, which
are not compatible with mbuf flags. The testpmd forward engines are
fixed to use the flags properly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago mbuf: get the name of offload flags
Olivier Matz [Wed, 26 Nov 2014 15:04:48 +0000 (16:04 +0100)]
mbuf: get the name of offload flags

In test-pmd (rxonly.c), the code is able to dump the list of ol_flags.
The issue is that the list of flags in the application has to be
synchronized with the flags defined in rte_mbuf.h.

This patch introduces 2 new functions, rte_get_rx_ol_flag_name()
and rte_get_tx_ol_flag_name(), that return the name of a flag from
its mask. It also fixes rxonly.c to use these new functions and to
display the proper flags.
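
The idea of a mask-to-name lookup can be sketched in standalone C as follows.
The EX_RX_* flag bits and the ex_* function names are hypothetical and exist
only for this illustration; the real flag values live in rte_mbuf.h and the
real lookups are the two functions named above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bits, for illustration only. */
#define EX_RX_VLAN (1ULL << 0)
#define EX_RX_RSS  (1ULL << 1)
#define EX_RX_FDIR (1ULL << 2)

/* Return the name of a single flag from its mask, or NULL if unknown. */
static const char *ex_rx_flag_name(uint64_t mask)
{
    switch (mask) {
    case EX_RX_VLAN: return "EX_RX_VLAN";
    case EX_RX_RSS:  return "EX_RX_RSS";
    case EX_RX_FDIR: return "EX_RX_FDIR";
    default:         return NULL;
    }
}

/* Write the names of all set bits into out, separated by spaces.
 * Unknown bits are shown as "?". Returns the number of characters written. */
static size_t ex_format_flags(uint64_t flags, char *out, size_t outlen)
{
    size_t used = 0;
    unsigned int bit;

    if (outlen == 0)
        return 0;
    out[0] = '\0';
    for (bit = 0; bit < 64; bit++) {
        uint64_t mask = 1ULL << bit;
        const char *name;
        int n;

        if ((flags & mask) == 0)
            continue;
        name = ex_rx_flag_name(mask);
        n = snprintf(out + used, outlen - used, "%s%s",
                     used ? " " : "", name ? name : "?");
        if (n < 0 || (size_t)n >= outlen - used)
            break;              /* buffer full: stop with truncated output */
        used += (size_t)n;
    }
    return used;
}
```

Keeping the table next to the flag definitions means an application that
dumps flags bit by bit does not need its own copy of the list.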

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
9 years ago mbuf: remove too specific flags mask
Olivier Matz [Wed, 26 Nov 2014 15:04:47 +0000 (16:04 +0100)]
mbuf: remove too specific flags mask

This definition is specific to Intel PMD drivers and its definition
"indicate what bits required for building TX context" shows that it
should not be in the generic rte_mbuf.h but in the PMD driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago mbuf: add help about Tx checksum flags
Olivier Matz [Wed, 26 Nov 2014 15:04:46 +0000 (16:04 +0100)]
mbuf: add help about Tx checksum flags

Describe how to use hardware checksum API.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago mbuf: reorder Tx flags
Olivier Matz [Wed, 26 Nov 2014 15:04:45 +0000 (16:04 +0100)]
mbuf: reorder Tx flags

The tx mbuf flags are now ordered from the lowest value to the
highest. Add comments to explain where to add new flags.

By the way, move the PKT_TX_VXLAN_CKSUM to the right place.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years ago ixgbe: fix flags variable size to 64 bits
Olivier Matz [Wed, 26 Nov 2014 15:04:44 +0000 (16:04 +0100)]
ixgbe: fix flags variable size to 64 bits

Since commit 4332beee9 "mbuf: expand ol_flags field to 64-bits", the
packet flags are now 64 bits wide. Some occurrences were forgotten in
the ixgbe driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago igb/ixgbe: fix IP checksum calculation
Olivier Matz [Wed, 26 Nov 2014 15:04:43 +0000 (16:04 +0100)]
igb/ixgbe: fix IP checksum calculation

According to Intel® 82599 10 GbE Controller Datasheet (Table 7-38), both
L2 and L3 lengths are needed to offload the IP checksum.

Note that the e1000 driver does not need to be patched as it already
contains the fix.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/test: vm power management
Alan Carew [Tue, 25 Nov 2014 16:18:11 +0000 (16:18 +0000)]
app/test: vm power management

Updated the unit tests to cover both librte_power implementations as well as
the external API.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago power: integration of vm power management
Alan Carew [Tue, 25 Nov 2014 16:18:10 +0000 (16:18 +0000)]
power: integration of vm power management

librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago power: packet format for vm power management
Alan Carew [Tue, 25 Nov 2014 16:18:09 +0000 (16:18 +0000)]
power: packet format for vm power management

Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago power: common interface for guest and host
Alan Carew [Tue, 25 Nov 2014 16:18:08 +0000 (16:18 +0000)]
power: common interface for guest and host

Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call rte_power_set_env() can explicitly set the environment; if it is
not called, auto-detection takes place.

rte_power_kvm_vm is a subset of the librte_power APIs; the following are supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other unsupported APIs return -ENOTSUP
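
One common way to hide two implementations behind a single interface is a
table of function pointers selected by environment. The sketch below shows
that pattern only; it is not the librte_power source. Every ex_* name, the
stub return values and the get_freq slot (standing in for an API the VM
backend lacks) are hypothetical, and auto-detection is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the librte_power source. */
enum ex_power_env { EX_ENV_UNSET, EX_ENV_ACPI_CPUFREQ, EX_ENV_KVM_VM };

struct ex_power_ops {
    int (*freq_up)(unsigned lcore_id);
    int (*get_freq)(unsigned lcore_id);  /* stands in for an unsupported API */
};

/* Stub backends: a real one would touch cpufreq or a guest channel. */
static int acpi_freq_up(unsigned lcore_id)  { (void)lcore_id; return 1; }
static int acpi_get_freq(unsigned lcore_id) { (void)lcore_id; return 0; }
static int kvm_freq_up(unsigned lcore_id)   { (void)lcore_id; return 1; }
static int unsupported(unsigned lcore_id)   { (void)lcore_id; return -ENOTSUP; }

static const struct ex_power_ops acpi_ops = { acpi_freq_up, acpi_get_freq };
static const struct ex_power_ops kvm_ops  = { kvm_freq_up, unsupported };

static const struct ex_power_ops *cur_ops;   /* NULL until an env is set */

/* Select the backend explicitly; returns 0 on success, -1 otherwise. */
static int ex_power_set_env(enum ex_power_env env)
{
    switch (env) {
    case EX_ENV_ACPI_CPUFREQ: cur_ops = &acpi_ops; return 0;
    case EX_ENV_KVM_VM:       cur_ops = &kvm_ops;  return 0;
    default:                  return -1;
    }
}

/* Public wrappers forward to whichever backend is active. */
static int ex_power_freq_up(unsigned lcore_id)
{
    return cur_ops ? cur_ops->freq_up(lcore_id) : -1;
}

static int ex_power_get_freq(unsigned lcore_id)
{
    return cur_ops ? cur_ops->get_freq(lcore_id) : -1;
}
```

With the VM backend selected, ex_power_freq_up() reaches the backend while
ex_power_get_freq() returns -ENOTSUP, matching the behaviour described above.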

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago power: vm communication channels in guest
Alan Carew [Tue, 25 Nov 2014 16:18:07 +0000 (16:18 +0000)]
power: vm communication channels in guest

Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host-based monitor. The packet format is
specified in channel_commands.h.
Each device appears as a serial device in the path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num>, where each lcore
in a DPDK application has exclusive access to a device/channel.
Each channel is opened in non-blocking mode; after a successful open, a test
packet is sent to the host to ensure the host side is monitoring.
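
As a rough sketch of the naming scheme, the helper below builds the
per-lcore device path and opens it in non-blocking mode. The function names,
the O_RDWR access mode and the "poweragent" agent type used in the note are
assumptions made for illustration; the test packet exchange is not shown.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

#define EX_CHANNEL_PREFIX "/dev/virtio-ports/virtio.serial.port"

/* Build "<prefix>.<agent_type>.<lcore_num>" into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
static int ex_channel_path(char *buf, size_t len,
                           const char *agent_type, unsigned lcore_num)
{
    int n = snprintf(buf, len, "%s.%s.%u",
                     EX_CHANNEL_PREFIX, agent_type, lcore_num);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Open the channel for one lcore in non-blocking mode.
 * Returns a file descriptor, or -1 on error. O_RDWR is an assumption. */
static int ex_channel_open(const char *agent_type, unsigned lcore_num)
{
    char path[256];

    if (ex_channel_path(path, sizeof(path), agent_type, lcore_num) != 0)
        return -1;
    return open(path, O_RDWR | O_NONBLOCK);
}
```

With a hypothetical agent type of "poweragent", lcore 3 would map to
/dev/virtio-ports/virtio.serial.port.poweragent.3.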

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago examples/vm_power: cli in guest
Alan Carew [Tue, 25 Nov 2014 16:18:06 +0000 (16:18 +0000)]
examples/vm_power: cli in guest

Provides a small sample application(guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask(-c) and number of memory
channels(-n). The core mask corresponds to the number of lcore channels to
attempt to open. A maximum of 64 channels per VM is allowed. The channels must
be monitored by the host.
After successful initialisation a CPU frequency command can be sent to the host
using:
set_cpu_freq <lcore_num> <up|down|min|max>.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago examples/vm_power: vm power management application
Alan Carew [Tue, 25 Nov 2014 16:18:05 +0000 (16:18 +0000)]
examples/vm_power: vm power management application

Launches the CLI thread and the Monitor thread and initialises
resources.
Requires a minimum of two lcores to run; additional cores specified by the
EAL core mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago examples/vm_power: cpu frequency in host
Alan Carew [Tue, 25 Nov 2014 16:18:04 +0000 (16:18 +0000)]
examples/vm_power: cpu frequency in host

A wrapper around librte_power(using ACPI cpufreq), providing locking around the
non-threadsafe library, allowing for frequency changes based on core masks and
core numbers from both the CLI thread and epoll monitor thread.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago examples/vm_power: cli in host
Alan Carew [Tue, 25 Nov 2014 16:18:03 +0000 (16:18 +0000)]
examples/vm_power: cli in host

The CLI is used for administrating the channel monitor and manager and
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled,  enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUS, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq_mask <core_mask> <up|down|min|max>,
  Set the current frequency for the cores specified in <core_mask> by scaling
  each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago examples/vm_power: channel manager and monitor in host
Alan Carew [Tue, 25 Nov 2014 16:18:02 +0000 (16:18 +0000)]
examples/vm_power: channel manager and monitor in host

The manager is responsible for adding communications channels to the Monitor
thread, tracking and reporting VM state and employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUS(vCPUs) to the host
physical CPUS(pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list, each VM is identified by a "name" which must
   correspond to a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels. Channels, once added to the
   Monitor thread, remain in that thread's control; however, acting on channel
   requests can be disabled and re-enabled via the CLI.

The monitor is an epoll based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD). This allows for obtaining the channel's
file descriptor for reading EPOLLIN events and mapping the vCPU to the pCPU(s)
associated with a request from a particular VM.
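
The opaque-pointer registration can be sketched with the Linux epoll API as
below. struct ex_channel and the ex_monitor_* helpers are hypothetical; the
actual channel structure is not shown in this message, and a real monitor
would loop over many events instead of waiting for one.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-channel record kept by the manager. */
struct ex_channel {
    int fd;               /* descriptor to read requests from */
    unsigned vcpu;        /* vCPU this channel belongs to */
    uint64_t pcpu_mask;   /* physical CPUs the vCPU is pinned to */
};

/* Register a channel; the record travels as the epoll opaque pointer. */
static int ex_monitor_add(int epfd, struct ex_channel *ch)
{
    struct epoll_event ev = { 0 };

    ev.events = EPOLLIN;
    ev.data.ptr = ch;     /* handed back unchanged by epoll_wait() */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, ch->fd, &ev);
}

/* Wait for one readable channel; return its record, or NULL if none. */
static struct ex_channel *ex_monitor_wait_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    if (n != 1 || !(ev.events & EPOLLIN))
        return NULL;
    return ev.data.ptr;
}
```

Because the event carries the record itself, the monitor gets the file
descriptor and the vCPU to pCPU mapping without a separate lookup table.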

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago examples/skeleton: very simple code for packet forwarding
Bruce Richardson [Fri, 14 Nov 2014 14:31:50 +0000 (14:31 +0000)]
examples/skeleton: very simple code for packet forwarding

This is a very simple example app for doing packet forwarding with the
Intel DPDK. It's designed to serve as a start point for people new to
the Intel DPDK and who want to develop a new app.

Therefore it's meant to:
* have as good a performance out-of-the-box as possible, using the
  best-known settings for configuring the PMDs, so that any new apps can
  be based off it.
* be kept as short as possible to make it easy to understand it and get
  started with it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal/linux: map pci memory resources after hugepages
Anatoly Burakov [Tue, 11 Nov 2014 10:09:25 +0000 (10:09 +0000)]
eal/linux: map pci memory resources after hugepages

Multi-process DPDK application must mmap hugepages and PCI resources
into the same virtual address space. By default the virtual addresses
are chosen by the primary process automatically when calling the mmap.
But sometimes the chosen virtual addresses aren't usable in secondary
process - for example, secondary process is linked with more libraries
than primary process, and the library occupies the same address space
that the primary process has requested for PCI mappings.

This patch makes EAL try and map PCI BARs right after the hugepages
(instead of location chosen by mmap) in virtual memory, so that PCI BARs
have less chance of ending up in random places in virtual memory.

Signed-off-by: Liang Xu <liang.xu@cinfotech.cn>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago config: support 128 cores
Didier Pallard [Tue, 22 Apr 2014 10:20:25 +0000 (10:20 +0000)]
config: support 128 cores

New platforms have more than 64 cores.
Set default max cores number to 128.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: add option --master-lcore
Simon Kuenzer [Tue, 8 Jul 2014 08:28:30 +0000 (10:28 +0200)]
eal: add option --master-lcore

Enable users to specify the lcore id that is used as master lcore.

Signed-off-by: Simon Kuenzer <simon.kuenzer@neclab.eu>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years ago eal: get relative core index
Patrick Lu [Wed, 11 Jun 2014 20:45:09 +0000 (13:45 -0700)]
eal: get relative core index

The EAL -c option allows the user to enable any lcore in the system.
Often the user app wants to know the 1st enabled core, 2nd
enabled core, etc., rather than the physical core ID (rte_lcore_id()).

The new API rte_lcore_index() will return an index from enabled lcores
starting from zero.
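
The described behaviour can be sketched for a 64-bit core mask as follows.
ex_lcore_index is a hypothetical standalone helper that takes the mask as an
argument; it is not the rte_lcore_index() implementation, which reads the
EAL configuration instead.

```c
#include <stdint.h>

/* Return the position of lcore_id among the enabled cores in coremask,
 * counting from zero, or -1 if that lcore is not enabled. */
static int ex_lcore_index(uint64_t coremask, unsigned lcore_id)
{
    int index = 0;
    unsigned i;

    if (lcore_id >= 64 || !(coremask & (1ULL << lcore_id)))
        return -1;
    for (i = 0; i < lcore_id; i++)
        if (coremask & (1ULL << i))
            index++;
    return index;
}
```

With -c 0x2c (lcores 2, 3 and 5 enabled), lcore 2 has index 0, lcore 3 has
index 1 and lcore 5 has index 2, while lcore 4 is not enabled and yields -1.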

Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: add core list input format
Didier Pallard [Fri, 27 Jun 2014 14:09:32 +0000 (16:09 +0200)]
eal: add core list input format

In current version, used cores can only be specified using a bitmask.
It will now be possible to specify cores in 2 different ways:
- Using a bitmask (-c [0x]nnn): bitmask must be in hex format
- Using a list in following format: -l <c1>[-c2][,c3[-c4],...]

The letter -l can stand for lcore or list.

-l 0-7,16-23,31 being equivalent to -c 0x80FF00FF
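
The equivalence can be checked with a small parser for the list format. This
is a simplified sketch and not the EAL option parser: ex_parse_corelist is a
hypothetical name, whitespace is not accepted, descending ranges are
rejected, and only cores 0 to 63 fit in the resulting mask.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "<c1>[-c2][,c3[-c4],...]" into a 64-bit core mask.
 * Returns 0 on success and -1 on malformed input. */
static int ex_parse_corelist(const char *s, uint64_t *mask)
{
    uint64_t m = 0;

    for (;;) {
        char *end;
        unsigned long lo, hi, c;

        if (!isdigit((unsigned char)*s))
            return -1;
        lo = hi = strtoul(s, &end, 10);
        if (*end == '-') {                  /* range "lo-hi" */
            s = end + 1;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtoul(s, &end, 10);
        }
        if (lo > hi || hi > 63)
            return -1;
        for (c = lo; c <= hi; c++)
            m |= 1ULL << c;
        if (*end == '\0')
            break;                          /* end of list */
        if (*end != ',')
            return -1;
        s = end + 1;                        /* next element */
    }
    *mask = m;
    return 0;
}
```

Parsing "0-7,16-23,31" sets bits 0 to 7 (0xff), 16 to 23 (0xff0000) and 31
(0x80000000), giving 0x80ff00ff.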

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: factorize configuration adjustment
Thomas Monjalon [Thu, 20 Nov 2014 21:57:22 +0000 (22:57 +0100)]
eal: factorize configuration adjustment

Some adjustments are done after options parsing and are common
to Linux and BSD.

Remove process_type adjustment in rte_config_init() because
it is already done in eal_parse_args().
eal_proc_type_detect() is kept duplicated because it opens a
file descriptor which is used later in each eal.c.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: factorize options sanity check
Thomas Monjalon [Mon, 17 Nov 2014 09:14:10 +0000 (10:14 +0100)]
eal: factorize options sanity check

No need to have duplicated check for common options.

Some flags are set for options -c and -m in order to simplify the
checks.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: factorize internal config reset
Thomas Monjalon [Mon, 17 Nov 2014 09:08:39 +0000 (10:08 +0100)]
eal: factorize internal config reset

Now that internal config structure is common to Linux and BSD,
we can have a common function to initialize it.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: fix header guards
Thomas Monjalon [Fri, 21 Nov 2014 14:26:17 +0000 (15:26 +0100)]
eal: fix header guards

Some guards are missing or have a wrong name.
Others have LINUXAPP in their name but are now common.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: factorize common headers
Thomas Monjalon [Mon, 17 Nov 2014 08:07:59 +0000 (09:07 +0100)]
eal: factorize common headers

No need to have different headers for Linux and BSD.
These files are identical, with the exception of the internal config, which
has uio and vfio fields only useful for Linux.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago eal: move internal headers in source directory
Thomas Monjalon [Mon, 17 Nov 2014 07:46:24 +0000 (08:46 +0100)]
eal: move internal headers in source directory

The directory include/ should be reserved to public headers.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago ethdev: fix doxygen comments about RSS
Thomas Monjalon [Mon, 24 Nov 2014 22:06:25 +0000 (23:06 +0100)]
ethdev: fix doxygen comments about RSS

The port_id parameters didn't match the comments, which referred to port.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years ago bond: fix doxygen
Thomas Monjalon [Mon, 24 Nov 2014 21:16:44 +0000 (22:16 +0100)]
bond: fix doxygen

There is no parameter delay_ms in *_delay_get functions.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years ago app/testpmd: set socket id when adding new port
Declan Doherty [Mon, 24 Nov 2014 16:33:40 +0000 (16:33 +0000)]
app/testpmd: set socket id when adding new port

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years ago pci: new ixgbe devices
Ouyang Changchun [Tue, 25 Nov 2014 05:02:42 +0000 (13:02 +0800)]
pci: new ixgbe devices

EAL is missing 4 device IDs that the base code supports, so add them to EAL.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
9 years ago app/testpmd: configure flow director flexible payload
Jingjing Wu [Fri, 21 Nov 2014 00:46:56 +0000 (08:46 +0800)]
app/testpmd: configure flow director flexible payload

Test command is added to configure flexible payload

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/testpmd: configure flow director flexible mask
Jingjing Wu [Fri, 21 Nov 2014 00:46:55 +0000 (08:46 +0800)]
app/testpmd: configure flow director flexible mask

test command added to configure flexible mask

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: take flow director flexible payload configuration
Jingjing Wu [Fri, 21 Nov 2014 00:46:54 +0000 (08:46 +0800)]
i40e: take flow director flexible payload configuration

configure flexible payload and flex mask in i40e driver
It includes arguments verification and HW setting.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago ethdev: add flow director flexible payload setting in port config
Jingjing Wu [Fri, 21 Nov 2014 00:46:53 +0000 (08:46 +0800)]
ethdev: add flow director flexible payload setting in port config

add flexible payload setting in eth_conf

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/testpmd: display flow director information
Jingjing Wu [Fri, 21 Nov 2014 00:46:52 +0000 (08:46 +0800)]
app/testpmd: display flow director information

display flow director's information, includes
 - statistics
 - configuration
 - capability

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: get flow director statistics
Jingjing Wu [Fri, 21 Nov 2014 00:46:51 +0000 (08:46 +0800)]
i40e: get flow director statistics

implement operation to get flow director statistics in i40e pmd driver

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago ethdev: get flow director statistics
Jingjing Wu [Fri, 21 Nov 2014 00:46:50 +0000 (08:46 +0800)]
ethdev: get flow director statistics

define structures for getting flow director statistics

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: get flow director information
Jingjing Wu [Fri, 21 Nov 2014 00:46:49 +0000 (08:46 +0800)]
i40e: get flow director information

implement operation to get flow director information in i40e pmd driver, includes
 - mode
 - supported flow types
 - table space
 - flexible payload size and granularity
 - configured flexible payload and mask information

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago ethdev: get flow director information
Jingjing Wu [Fri, 21 Nov 2014 00:46:48 +0000 (08:46 +0800)]
ethdev: get flow director information

define structures for getting flow director information includes:
 - mode
 - supported flow types
 - table space
 - flexible payload size and granularity
 - configured flexible payload and mask information

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/testpmd: flush flow director table
Jingjing Wu [Fri, 21 Nov 2014 00:46:47 +0000 (08:46 +0800)]
app/testpmd: flush flow director table

Test command is added to flush flow director table

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: flush flow director table
Jingjing Wu [Fri, 21 Nov 2014 00:46:46 +0000 (08:46 +0800)]
i40e: flush flow director table

implement operation to flush flow director table

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/testpmd: print extended flow director info
Jingjing Wu [Fri, 21 Nov 2014 00:46:45 +0000 (08:46 +0800)]
app/testpmd: print extended flow director info

Extended fdir info is printed in rxonly fwd engine when fdir match.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: report flow director matching
Jingjing Wu [Fri, 21 Nov 2014 00:46:44 +0000 (08:46 +0800)]
i40e: report flow director matching

Set the FDIR flag and report FD_ID plus flex bytes in the mbuf on a match.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago mbuf: extend flow director field
Jingjing Wu [Fri, 21 Nov 2014 00:46:43 +0000 (08:46 +0800)]
mbuf: extend flow director field

The fdir field in rte_mbuf is extended to support flex bytes reported on an fdir match.
At most 8 flex bytes can be reported.
The reported flex bytes are part of the flexible payload.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: flow director matching counter
Jingjing Wu [Fri, 21 Nov 2014 00:46:42 +0000 (08:46 +0800)]
i40e: flow director matching counter

support to get the fdir_match counter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago app/testpmd: add or delete flow director filter
Jingjing Wu [Fri, 21 Nov 2014 00:46:41 +0000 (08:46 +0800)]
app/testpmd: add or delete flow director filter

Commands are added to test adding or deleting flow director filters.
10 flow types in flow director are supported: ipv4, ipv4-frag, tcpv4, udpv4, sctpv4, ipv6, ipv6-frag, tcpv6, udpv6, sctpv6

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: add or delete flow director
Jingjing Wu [Fri, 21 Nov 2014 00:46:40 +0000 (08:46 +0800)]
i40e: add or delete flow director

deal with two operations for flow director
 - RTE_ETH_FILTER_ADD
 - RTE_ETH_FILTER_DELETE
encode the flow inputs into a programming packet
send the packet to the filter programming queue and check status on the status report queue

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: transition between flow type and pctype
Jingjing Wu [Fri, 21 Nov 2014 00:46:39 +0000 (08:46 +0800)]
i40e: transition between flow type and pctype

- macros to validate flow_type and pctype
- functions for transition between flow_type and pctype:
  - i40e_flowtype_to_pctype
  - i40e_pctype_to_flowtype

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago ethdev: structures to add or delete flow director
Jingjing Wu [Fri, 21 Nov 2014 00:46:38 +0000 (08:46 +0800)]
ethdev: structures to add or delete flow director

define structures to add or delete flow director filter
  - struct rte_eth_fdir_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: initialize flow director flexible payload setting
Jingjing Wu [Fri, 21 Nov 2014 00:46:37 +0000 (08:46 +0800)]
i40e: initialize flow director flexible payload setting

set flexible payload related registers to default value at initialization time.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: tear down flow director
Jingjing Wu [Fri, 21 Nov 2014 00:46:36 +0000 (08:46 +0800)]
i40e: tear down flow director

release fortville resources on flow director, includes
 - queue 0 pair release
 - release vsi

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40e: set up and initialize flow director
Jingjing Wu [Fri, 21 Nov 2014 00:46:35 +0000 (08:46 +0800)]
i40e: set up and initialize flow director

set up fortville resources to support flow director, includes
 - queue 0 pair allocated and set up for flow director
 - create vsi
 - reserve memzone for flow director programming packet

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago i40evf: support querying and updating redirection table
Helin Zhang [Sat, 15 Nov 2014 16:03:44 +0000 (00:03 +0800)]
i40evf: support querying and updating redirection table

Support of updating/querying redirection table has been added for VF.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years ago ethdev: support multiple sizes of redirection table
Helin Zhang [Sat, 15 Nov 2014 16:03:43 +0000 (00:03 +0800)]
ethdev: support multiple sizes of redirection table

The 40G NIC supports redirection table sizes (128/512/64 entries) that
differ from the single size (128 entries) of 1G and 10G NICs, so
support for multiple sizes of redirection table is needed.
It includes:
* Redefine 'struct rte_eth_rss_reta' in ethdev.
  - To 'struct rte_eth_rss_reta_entry64' which contains 64
    entries and 64 bits mask.
  - Array of above new structure can be used for any number of
    redirection table entries, as long as the number is multiple
    of 64. This is quite flexible for the future expanding of
    redirection table.
* Redefinition of relevant interfaces in ethdev.
  - Interface of reta update has been redefined with new parameters.
  - Interface of reta query has been redefined with new parameters.
* Rework of 1G PMD in igb.
  - reta update has been reworked.
  - reta query has been reworked.
* Rework of 10G PMD in ixgbe.
  - reta update has been reworked.
  - reta query has been reworked.
* Rework of 40G PMD (PF only) in i40e.
  - reta update has been reworked.
  - reta query has been reworked.
* Implement relevant commands in testpmd.
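
A minimal sketch of how a table whose size is a multiple of 64 could map
onto an array of 64-entry groups, each with a 64-bit mask. The struct and
helper below (reta_group64, reta_set) are hypothetical stand-ins for
illustration, not the definitions added to ethdev by this commit.

```c
#include <assert.h>
#include <stdint.h>

#define GROUP_SIZE 64 /* entries per group, as described above */

/* Hypothetical stand-in for one 64-entry group: one mask bit per entry. */
struct reta_group64 {
	uint64_t mask;             /* bit i set => reta[i] is selected */
	uint8_t  reta[GROUP_SIZE]; /* queue index for each entry */
};

/* Select entry `idx` of the whole table and assign it to `queue`. */
static void
reta_set(struct reta_group64 *groups, uint16_t reta_size,
	 uint16_t idx, uint8_t queue)
{
	/* The table must be a whole number of groups. */
	assert(reta_size % GROUP_SIZE == 0 && idx < reta_size);

	groups[idx / GROUP_SIZE].mask |= UINT64_C(1) << (idx % GROUP_SIZE);
	groups[idx / GROUP_SIZE].reta[idx % GROUP_SIZE] = queue;
}
```

With a 128-entry table, entry 70 lands in group 1 at position 6; a
512-entry table would simply use an array of 8 groups.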

Test report: http://dpdk.org/ml/archives/dev/2014-November/008362.html

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Erlu Chen <erlu.chen@intel.com>
9 years agoi40e: add redirection table size in device info
Helin Zhang [Sat, 15 Nov 2014 16:03:42 +0000 (00:03 +0800)]
i40e: add redirection table size in device info

The 'dev_infos_get' ops now return the redirection table size for
both PF and VF. The VF 'dev_infos_get' op also returns the default
RX/TX configurations, which were missing before.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agoixgbe: add redirection table size in device info
Helin Zhang [Sat, 15 Nov 2014 16:03:41 +0000 (00:03 +0800)]
ixgbe: add redirection table size in device info

As more and more information differs between PF and VF, a separate
'dev_infos_get' op has been implemented for each. In addition,
both now return the redirection table size.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agoigb: add redirection table size in device info
Helin Zhang [Sat, 15 Nov 2014 16:03:40 +0000 (00:03 +0800)]
igb: add redirection table size in device info

As more and more information differs between PF and VF, a separate
'dev_infos_get' op has been implemented for each. In addition,
a new field 'reta_size' has been added to
'struct rte_eth_dev_info' for returning the redirection table size.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agoi40e: support setting hash lookup table size
Helin Zhang [Sat, 15 Nov 2014 16:03:39 +0000 (00:03 +0800)]
i40e: support setting hash lookup table size

Add support for setting the hash lookup table size according
to the hardware capability.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agoi40evf: fix code style
Helin Zhang [Sat, 15 Nov 2014 16:03:38 +0000 (00:03 +0800)]
i40evf: fix code style

Fix of several code style issues.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agoapp/testpmd: fix code style for redirection table
Helin Zhang [Sat, 15 Nov 2014 16:03:37 +0000 (00:03 +0800)]
app/testpmd: fix code style for redirection table

Fix of several code style issues.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
9 years agoapp/test: refactor bonding checks with macros
Declan Doherty [Mon, 24 Nov 2014 16:33:42 +0000 (16:33 +0000)]
app/test: refactor bonding checks with macros

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agobond: support link status polling
Declan Doherty [Mon, 24 Nov 2014 16:33:41 +0000 (16:33 +0000)]
bond: support link status polling

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
9 years agobond: free mbufs on Tx burst failure
Declan Doherty [Mon, 24 Nov 2014 16:33:39 +0000 (16:33 +0000)]
bond: free mbufs on Tx burst failure

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
9 years agobond: fix naming inconsistency
Declan Doherty [Mon, 24 Nov 2014 16:33:38 +0000 (16:33 +0000)]
bond: fix naming inconsistency

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agobond: remove switch statement from Rx burst
Declan Doherty [Mon, 24 Nov 2014 16:33:37 +0000 (16:33 +0000)]
bond: remove switch statement from Rx burst

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agobond: support link status interrupt
Declan Doherty [Mon, 24 Nov 2014 16:33:36 +0000 (16:33 +0000)]
bond: support link status interrupt

Adding support for lsc interrupt from bonded device to link
bonding library with supporting unit tests in the test application.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
9 years agoaf_packet: add PMD for AF_PACKET-based virtual devices
John W. Linville [Mon, 17 Nov 2014 15:57:58 +0000 (10:57 -0500)]
af_packet: add PMD for AF_PACKET-based virtual devices

This is a Linux-specific virtual PMD driver backed by an AF_PACKET
socket.  This implementation uses mmap'ed ring buffers to limit copying
and user/kernel transitions.  The PACKET_FANOUT_HASH behavior of
AF_PACKET is used for frame reception.  In the current implementation,
Tx and Rx queues are always paired, and therefore are always equal
in number -- changing this would be a Simple Matter Of Programming.

Interfaces of this type are created with a command line option like
"--vdev=eth_af_packet0,iface=...".  There are a number of options available
as arguments:

 - Interface is chosen by "iface" (required)
 - Number of queue pairs set by "qpairs" (optional, default: 1)
 - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
 - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
 - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
[Thomas: disable because of incompatibility with some kernels]

9 years agoapp/testpmd: add some missing commands in help
Pablo de Lara [Sat, 15 Nov 2014 19:01:36 +0000 (19:01 +0000)]
app/testpmd: add some missing commands in help

set link-up and set link-down were not included
in the help command.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoapp/test: fix misplaced braces in devargs check
Bruce Richardson [Wed, 19 Nov 2014 09:06:13 +0000 (09:06 +0000)]
app/test: fix misplaced braces in devargs check

This patch fixes two occurrences where a call to strncmp had the closing
parenthesis in the wrong place. Changing this form:
if (strncmp(X,Y,sizeof(X) != 0))
which does a comparison of length 1, to
if (strncmp(X,Y,sizeof(X)) != 0)
which does the correct length comparison and then compares the result to
zero in the "if" part, as the author presumably originally intended.
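
The difference between the two forms can be sketched as follows. The
function names and the "eth0" string are hypothetical; the buggy length
is computed in a separate variable only to make its value explicit.

```c
#include <stddef.h>
#include <string.h>

/* Buggy form: the length argument is `sizeof(x) != 0`, which is 1,
 * so only the first byte is compared. */
static int
differs_buggy(const char *y)
{
	const char x[] = "eth0";
	size_t n = (sizeof(x) != 0); /* evaluates to 1 */

	return strncmp(x, y, n) != 0;
}

/* Fixed form: compare up to sizeof(x) bytes, then test the result. */
static int
differs_fixed(const char *y)
{
	const char x[] = "eth0";

	return strncmp(x, y, sizeof(x)) != 0;
}
```

For the input "eth1", the buggy form reports no difference because both
strings start with 'e', while the fixed form reports a mismatch.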

Reported-by: Larry Wang <liang-min.wang@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
9 years agoapp/test: remove file prefix option for bsd
Pablo de Lara [Sat, 15 Nov 2014 20:34:09 +0000 (20:34 +0000)]
app/test: remove file prefix option for bsd

eal_flags and multiprocess unit tests use --file-prefix option
which is not supported in FreeBSD, so it has been removed
if compiled for this OS.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agocmdline: fix for bsd
Sergio Gonzalez Monroy [Thu, 20 Nov 2014 14:17:13 +0000 (14:17 +0000)]
cmdline: fix for bsd

Some features of the cmdline were broken in FreeBSD as a result of
termios not being compiled.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agoexamples/dpdk_qat: fix reference to old mbuf field
Pablo de Lara [Thu, 20 Nov 2014 10:50:07 +0000 (10:50 +0000)]
examples/dpdk_qat: fix reference to old mbuf field

Since commit 08b563ffb19 ("mbuf: replace data pointer by an offset"),
data is not an mbuf field anymore.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoxenvirt: fix reference to old mbuf field
Sergio Gonzalez Monroy [Wed, 19 Nov 2014 12:26:25 +0000 (12:26 +0000)]
xenvirt: fix reference to old mbuf field

Since commit 08b563ffb19 ("mbuf: replace data pointer by an offset"),
data is not an mbuf field anymore.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoalarm: make cancellation thread-safe
Pawel Wodkowski [Wed, 1 Oct 2014 14:20:22 +0000 (15:20 +0100)]
alarm: make cancellation thread-safe

It eliminates a race between threads using rte_alarm_cancel() and
rte_alarm_set().

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
9 years agotable: fix pointer calculations at initialization
Balazs Nemeth [Fri, 26 Sep 2014 09:37:40 +0000 (09:37 +0000)]
table: fix pointer calculations at initialization

During initialization of rte_table_hash_ext and rte_table_hash_lru, a
contiguous region of memory is allocated to store meta data, buckets,
extended buckets, keys, stack of keys, stack of extended buckets and
data entries. The size of each region depends on the hash table
configuration.

The address of each region is calculated using offsets relative to the
beginning of the memory region. Without this patch, the offsets
contain the size of the table meta data (sizeof(struct
rte_table_hash)). These addresses are stored in pointers which are
used when entries are added or deleted and lookups are performed.

Instead of adding these offsets to the address of the beginning of the
memory region, they are added to the address of the end of the meta
data (= address of the beginning of the memory region + sizeof(struct
rte_table_hash)). The resulting addresses are off by sizeof(struct
rte_table_hash) bytes. As a consequence, memory past the allocated
region can be accessed by the add, delete and lookup operations.

This patch corrects the address calculation by not including the size
of the meta data in the offsets.
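
A simplified sketch of the corrected calculation, assuming a
hypothetical two-region table (struct table and table_create are
illustrative names, not the rte_table_hash code). The offsets are
relative to the first byte after the meta data, so they must not
include the meta data size again.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified table: meta data followed by the data regions. */
struct table {
	uint8_t *buckets;
	uint8_t *keys;
	uint8_t  memory[]; /* first byte after the meta data */
};

static struct table *
table_create(size_t bucket_sz, size_t key_sz)
{
	struct table *t = calloc(1, sizeof(*t) + bucket_sz + key_sz);

	if (t == NULL)
		return NULL;

	/* Offsets are relative to t->memory, so they start at 0 and do
	 * not add sizeof(struct table) a second time. */
	size_t bucket_offset = 0;
	size_t key_offset = bucket_offset + bucket_sz;

	t->buckets = &t->memory[bucket_offset];
	t->keys = &t->memory[key_offset];
	return t;
}
```

Had bucket_offset started at sizeof(struct table), every region would
be shifted by that amount and the last one would run past the
allocation, which is the failure described above.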

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
9 years agotable: fix incorrect initialization
Balazs Nemeth [Fri, 26 Sep 2014 09:37:39 +0000 (09:37 +0000)]
table: fix incorrect initialization

During initialization of rte_table_hash_ext and rte_table_hash_lru,
t->data_size_shl is calculated.  This member contains the number of
bits to shift left during calculation of the location of entries in
the hash table.  To determine the number of bits to shift left, the
size of the entry (as provided to the rte_table_hash_ext_create and
rte_table_hash_lru_create) has to be used instead of the size of the
key.
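
To illustrate why the entry size matters, here is a minimal sketch of a
shift-based offset calculation. It assumes power-of-two sizes, and the
helper names (size_to_shl, entry_offset) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits to shift left so that (index << shl) == index * size.
 * Assumes `size` is a non-zero power of two. */
static uint32_t
size_to_shl(uint32_t size)
{
	uint32_t shl = 0;

	assert(size != 0 && (size & (size - 1)) == 0);
	while ((UINT32_C(1) << shl) < size)
		shl++;
	return shl;
}

/* Byte offset of entry `index` in an array of `entry_size`-byte entries. */
static uint32_t
entry_offset(uint32_t index, uint32_t entry_size)
{
	return index << size_to_shl(entry_size);
}
```

With 64-byte entries and 16-byte keys, deriving the shift from the key
size would place entry 3 at byte 48 instead of byte 192.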

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
9 years agotable: fix checking extended buckets in unoptimized case
Balazs Nemeth [Fri, 26 Sep 2014 09:37:38 +0000 (09:37 +0000)]
table: fix checking extended buckets in unoptimized case

If a key is not found in a bucket and the bucket has been extended,
the extended buckets also have to be checked for potentially matching
keys. The extended buckets are checked at the end of the lookup. In
most cases, this logic is skipped as it is uncommon to have buckets in
an extended state.

In case the lookup is performed with less than 5 packets, an
unoptimized version is run instead (the optimized version requires at
least 5 packets). The extended buckets should also be checked in this
case instead of simply ignoring the extended buckets.

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
9 years agotable: fix empty bucket removal during entry deletion
Balazs Nemeth [Fri, 26 Sep 2014 09:37:37 +0000 (09:37 +0000)]
table: fix empty bucket removal during entry deletion

When an entry is deleted from an extensible rte_table_hash, the bucket
that stored the entry can become empty. If this is the case, the
bucket needs to be removed from the chain of buckets.

During removal of the bucket, the chain should be updated first. If
the bucket that will be removed is cleared first, the chain is broken
and the information to update the chain is lost.
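
The ordering constraint can be sketched with a plain singly linked
chain. The struct bucket layout and bucket_remove helper are
hypothetical simplifications, not the rte_table_hash code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified bucket in a singly linked chain. */
struct bucket {
	struct bucket *next;
	int n_entries;
};

/* Unlink `b`, which directly follows `prev`, then clear it. */
static void
bucket_remove(struct bucket *prev, struct bucket *b)
{
	/* Update the chain first, while b->next is still intact. */
	prev->next = b->next;

	/* Only now is it safe to wipe the removed bucket. */
	memset(b, 0, sizeof(*b));
}
```

If the memset ran first, b->next would already be NULL when the chain
is updated and every bucket after b would be lost.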

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
9 years agodoc: programmers guide
Bernard Iremonger [Fri, 14 Nov 2014 11:53:41 +0000 (11:53 +0000)]
doc: programmers guide

The 1.7 DPDK_Prog_Guide document in MSWord has been converted to rst format for
use with Sphinx. There is an rst file for each chapter and an index.rst file
which contains the table of contents.
The top level index file has been modified to include this guide.

This document contains some png image files. If any of these png files are modified
they should be replaced with an svg file.

This is the sixth document from a set of 6 documents.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
9 years agodoc: fix eal paths
Thomas Monjalon [Mon, 17 Nov 2014 08:17:14 +0000 (09:17 +0100)]
doc: fix eal paths

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoexamples/distributor: new sample app
Reshma Pattan [Mon, 3 Nov 2014 15:49:44 +0000 (15:49 +0000)]
examples/distributor: new sample app

A new sample app that shows the usage of the distributor library. This
app works as follows:

* An RX thread runs which pulls packets from each ethernet port in turn
  and passes those packets to workers using a distributor component.
* The workers take the packets in turn, and determine the output port
  for those packets using basic l2forwarding doing an xor on the source
  port id.
* The RX thread takes the returned packets from the workers and enqueues
  those packets into an rte_ring structure.
* A TX thread pulls the packets off the rte_ring structure and then
  sends each packet out the output port specified previously by the worker.
* Command-line option support provided only for portmask.
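
The worker's port selection can be sketched as below. The commit only
says that an xor is applied to the source port id; the operand 1, which
pairs ports 0<->1, 2<->3 and so on, is an assumption, and output_port
is a hypothetical helper name.

```c
#include <stdint.h>

/* Choose the output port by flipping the lowest bit of the input port.
 * The xor operand (1) is assumed for illustration. */
static uint8_t
output_port(uint8_t in_port)
{
	return (uint8_t)(in_port ^ 1);
}
```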

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
9 years agovmxnet3: leverage data ring on Tx path
Yong Wang [Wed, 5 Nov 2014 01:49:43 +0000 (17:49 -0800)]
vmxnet3: leverage data ring on Tx path

Data_ring is a pre-mapped guest ring buffer that vmxnet3
backend has access to directly without a need for buffer
address mapping and unmapping during packet transmission.
It is useful in reducing device emulation cost on the tx
path.  There is some additional cost on the guest
driver for the packet copy, but overall it's a win.

This patch leverages the data_ring for packets with a
length less than or equal to the data_ring entry size
(128B).  For larger packets, we won't use the data_ring
as that requires one extra tx descriptor and it's not
clear if doing this will be beneficial.

Performance results show that this patch significantly
boosts vmxnet3 64B tx performance (pkt rate) for l2fwd
application on an Ivy Bridge server by >20% at which
point we start to hit some bottleneck on the rx side.

Signed-off-by: Yong Wang <yongwang@vmware.com>
9 years agovmxnet3: improve Rx performance
Yong Wang [Wed, 5 Nov 2014 01:49:42 +0000 (17:49 -0800)]
vmxnet3: improve Rx performance

This patch includes two small performance optimizations
on the rx path:

(1) It adds unlikely hints on various infrequent error
paths to the compiler to make branch prediction more
efficient.

(2) It also moves a constant assignment out of the pkt
polling loop.  This saves one branching per packet.
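
Both ideas can be sketched in a generic receive loop. This is not the
vmxnet3 code: struct pkt, rx_filter and the 14-byte header constant are
hypothetical, and the unlikely() macro is defined here on top of the
GCC/Clang builtin __builtin_expect.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch hint; __builtin_expect is a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

struct pkt {
	uint16_t len;
};

/* Compact valid packets to the front of the array; return how many remain. */
static size_t
rx_filter(struct pkt *pkts, size_t n, uint16_t mtu)
{
	/* Loop-invariant value computed once, outside the polling loop. */
	const uint16_t max_len = (uint16_t)(mtu + 14); /* 14 = Ethernet header */
	size_t good = 0;

	for (size_t i = 0; i < n; i++) {
		/* Error paths are rare, so mark them unlikely. */
		if (unlikely(pkts[i].len == 0 || pkts[i].len > max_len))
			continue;
		pkts[good++] = pkts[i];
	}
	return good;
}
```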

Performance evaluation configs:
- On the DPDK-side, it's running some l3 forwarding app
inside a VM on ESXi with one core assigned for polling.
- On the client side, pktgen/dpdk is used to generate
64B tcp packets at line rate (14.8M PPS).

Performance results on a Nehalem box (4cores@2.8GHzx2)
shown below.  CPU usage is collected factoring out the
idle loop cost.
- Before the patch, ~900K PPS with 65% CPU of a core
used for DPDK.
- After the patch, only 45% of a core used, while
maintaining the same packet rate.

Signed-off-by: Yong Wang <yongwang@vmware.com>
9 years agovmxnet3: add Rx check offloads
Yong Wang [Wed, 5 Nov 2014 01:49:41 +0000 (17:49 -0800)]
vmxnet3: add Rx check offloads

Only supports IPv4 so far.

Signed-off-by: Yong Wang <yongwang@vmware.com>
9 years agovmxnet3: fix stop/restart
Yong Wang [Wed, 5 Nov 2014 01:49:40 +0000 (17:49 -0800)]
vmxnet3: fix stop/restart

This change makes vmxnet3 consistent with other pmds in
terms of dev_stop behavior: rather than releasing tx/rx
rings, it only resets the ring structure and releases the
pending mbufs.

Verified with various tests (test-pmd and pktgen) over
vmxnet3 that dev stop/restart works fine.

Signed-off-by: Yong Wang <yongwang@vmware.com>
9 years agovmxnet3: add vlan Tx offload
Yong Wang [Wed, 5 Nov 2014 01:49:39 +0000 (17:49 -0800)]
vmxnet3: add vlan Tx offload

Signed-off-by: Yong Wang <yongwang@vmware.com>
9 years agovmxnet3: fix vlan Rx stripping
Yong Wang [Wed, 5 Nov 2014 01:49:38 +0000 (17:49 -0800)]
vmxnet3: fix vlan Rx stripping

Shouldn't reset vlan_tci to 0 if a valid VLAN tag is stripped.

Signed-off-by: Yong Wang <yongwang@vmware.com>
9 years agoacl: fix code typos
Thomas Monjalon [Fri, 14 Nov 2014 15:22:31 +0000 (16:22 +0100)]
acl: fix code typos

Replace indicies by indices.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agoacl: fix comments typos
Thomas Monjalon [Fri, 14 Nov 2014 14:59:31 +0000 (15:59 +0100)]
acl: fix comments typos

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agodistributor: enhance and fix tag matching
Qinglai Xiao [Mon, 10 Nov 2014 14:44:02 +0000 (16:44 +0200)]
distributor: enhance and fix tag matching

With the introduction of in_flight_bitmask, all 32 bits of the tag can be
used. Furthermore, this patch fixes the integer overflow when finding
the matched tags.
The maximum number of workers is now defined as 64, which is the length
of a double-word. The link between the number of workers and
RTE_MAX_LCORE is now removed. A compile-time check is added to ensure that
RTE_DISTRIB_MAX_WORKERS is less than or equal to the size of a double-word.
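
A rough sketch of tag matching with a 64-bit in-flight bitmask and a
compile-time bound on the worker count. MAX_WORKERS and match_mask are
hypothetical names, C11 _Static_assert stands in for whatever check the
patch uses, and shifting a 64-bit constant is shown as one way to avoid
overflow for worker indices of 31 and above; whether that is the exact
overflow fixed here is an assumption.

```c
#include <limits.h>
#include <stdint.h>

#define MAX_WORKERS 64 /* hypothetical stand-in for RTE_DISTRIB_MAX_WORKERS */

/* Compile-time check: every worker needs its own bit in a uint64_t. */
_Static_assert(MAX_WORKERS <= sizeof(uint64_t) * CHAR_BIT,
	       "worker bitmask does not fit in 64 bits");

/* Return a bitmask of workers that are in flight and hold `tag`. */
static uint64_t
match_mask(const uint32_t *in_flight_tags, uint64_t in_flight_bitmask,
	   uint32_t tag)
{
	uint64_t match = 0;

	for (unsigned int i = 0; i < MAX_WORKERS; i++) {
		/* The bitmask marks valid slots, so a tag value of 0 is
		 * usable and all 32 tag bits carry information. */
		if (((in_flight_bitmask >> i) & 1) && in_flight_tags[i] == tag)
			match |= UINT64_C(1) << i; /* 64-bit shift */
	}
	return match;
}
```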

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agombuf: add usr alias for hash
Qinglai Xiao [Mon, 10 Nov 2014 12:52:46 +0000 (14:52 +0200)]
mbuf: add usr alias for hash

This field is added for librte_distributor. Users of librte_distributor
are advised to set the value of mbuf->hash.usr before calling
rte_distributor_process. The value of usr is the tag that identifies
the flow.

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
9 years agoeal: update i40e supported devices
Helin Zhang [Thu, 13 Nov 2014 08:29:52 +0000 (16:29 +0800)]
eal: update i40e supported devices

According to the changes of the i40e base driver, two device
IDs (0x1573, 0x1582) are not supported anymore, and one new
device ID (0x1586) is supported. The list of i40e device IDs
supported by DPDK is updated accordingly.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
9 years agodoc: fix file attributes of guides
Bernard Iremonger [Tue, 11 Nov 2014 13:39:38 +0000 (13:39 +0000)]
doc: fix file attributes of guides

The file attributes of the rst files have been changed to 644

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
9 years agodoc: sample application user guide
Bernard Iremonger [Tue, 11 Nov 2014 12:27:01 +0000 (12:27 +0000)]
doc: sample application user guide

The 1.7 DPDK_SampleApp_UG document in MSWord has been converted to rst format for
use with Sphinx. There is an rst file for each chapter and an index.rst file
which contains the table of contents.
The top level index file has been modified to include this guide.

This document contains some png image files. If any of these png files are modified
they should be replaced with an svg file.

This is the fifth document from a set of 6 documents.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>