Harry van Haaren [Tue, 11 Jul 2017 14:19:28 +0000 (15:19 +0100)]
service: initialize with EAL
This commit shows the changes required in rte_eal_init()
to transparently launch the service threads. The threads
are launched into the service worker functions here because
after rte_eal_init() the application is not gauranteed to
call any other DPDK API.
As the registration of services happens at initialization
time, the services that require CPU time are already available
when we reach the end of rte_eal_init().
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Harry van Haaren [Tue, 11 Jul 2017 14:19:27 +0000 (15:19 +0100)]
service: introduce service cores concept
Add header files, update .map files with new service
functions, and add the service header to the doxygen
for building.
This service header API allows DPDK to use services as
a concept of something that requires CPU cycles. An example
is a PMD that runs in software to schedule events, where a
hardware version exists that does not require a CPU.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Qiming Yang [Tue, 20 Jun 2017 03:24:11 +0000 (11:24 +0800)]
test/alarm: add delay tolerance
Because accuracy of timing to the microsecond is not guaranteed
in rte_eal_alarm_set, this function will not be called before
the requested time, but may be called a period of time
afterwards which can not be calculated. In order to ensure
test alarm running success, this patch added the delay time
before check the flag.
Signed-off-by: Qiming Yang <qiming.yang@intel.com> Acked-by: Jing Chen <jing.d.chen@intel.com>
Providing this parameter requests flow API isolated mode on all ports at
initialization time. It ensures all traffic is received through the
configured flow rules only (see flow command).
Ports that do not support this mode are automatically discarded.
Thomas Monjalon [Thu, 6 Jul 2017 21:45:32 +0000 (23:45 +0200)]
ethdev: fix build with gcc 5.4.0
Seen on Ubuntu 16.04 with GCC 5.4.0:
lib/librte_ether/rte_ethdev.c: In function 'get_mac_addr_index':
lib/librte_ether/rte_ethdev.c:2369:26: error:
'dev_info.max_mac_addrs' may be used uninitialized in this function
Indeed, rte_eth_dev_info_get() do not write into dev_info
if the port_id is not valid.
So we need to check the port_id and return in case of error.
This extra check should not be needed because the port_id is always
checked before calling get_mac_addr_index().
However it does not hurt.
This patch introduces the generic ethdev API for the traffic manager
capability, which includes: hierarchical scheduling, traffic shaping,
congestion management, packet marking.
Main features:
- Exposed as ethdev plugin capability (similar to rte_flow)
- Capability query API per port, per level and per node
- Scheduling algorithms: Strict Priority (SP), Weighed Fair Queuing (WFQ)
- Traffic shaping: single/dual rate, private (per node) and shared (by
multiple nodes) shapers
- Congestion management for hierarchy leaf nodes: algorithms of tail drop,
head drop, WRED; private (per node) and shared (by multiple nodes) WRED
contexts
- Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)
The rte_flow feature breaks the monolithic approach for ethdev by
introducing the new rte_flow API to ethdev using a plugin-like approach.
Basically, the rte_flow API is still logically part of ethdev:
- It extends the ethdev functionality: rte_flow is a new feature/
capability of ethdev;
- all its functions work on an Ethernet device: the first parameter of the
rte_flow functions is Ethernet device port ID.
Also, the rte_flow API is a sort of capability plugin for ethdev:
- the rte_flow API functions have their own name space: they are called
rte_flow_operationXYZ() as opposed to rte_eth_dev_flow_operationXYZ());
- the rte_flow API functions are placed in separate files in the same
librte_ether folder as opposed to rte_ethdev.[hc].
The way it works is by using the existing ethdev API function
rte_eth_dev_filter_ctrl() to query the current Ethernet device port ID for
the support of the rte_flow capability and return the pointer to the
rte_flow operations when supported and NULL otherwise:
struct rte_flow_ops *eth_flow_ops;
int rte = rte_eth_dev_filter_ctrl(eth_port_id,
RTE_ETH_FILTER_GENERIC, RTE_ETH_FILTER_GET, ð_flow_ops);
This patch reuses the same approach for ethdev Traffic Management API.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Keith Wiles <keith.wiles@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Pablo de Lara [Mon, 10 Jul 2017 02:59:23 +0000 (03:59 +0100)]
cryptodev: fix build with icc
Removed unnecessary macro RTE_STD_C11, which is used
for unnamed structs.
Since there is no longer an unnamed structure in
rte_cryptodev_sym_session, this is not needed and
it is actually breaking compilation on icc:
lib/librte_cryptodev/rte_cryptodev.h(887): error: expected a declaration
__extension__ void *sess_private_data[0];
^ Fixes: 7c110ce7aa4e ("cryptodev: remove mempool from session") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com>
John McNamara [Fri, 23 Jun 2017 13:05:29 +0000 (14:05 +0100)]
doc: import sphinx rtd theme when available
The ReadTheDocs theme is no longer available by default in
Sphinx versions >= 1.3.1 and if it isn't available then an
exception is raised. This patch imports the ReadTheDocs
theme when it is available and warns but proceeds when it
isn't.
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Updated note to make users aware that the packet capture framework
is initialized by default only in testpmd. Other primary applications
need to explicitly modify the code to do this initialization.
Tom Barbette [Wed, 5 Jul 2017 13:59:44 +0000 (15:59 +0200)]
ethdev: document VMDq Rx configuration
From documentation it is very unclear how VMDq configuration can be
tweaked, and online search offer very poor results.
This patch will ultimately spawn an online documentation page
for the rte_eth_vmdq_rx_conf struct which will eventually add a bit of
documentation about the rx_mode tag and how to allow e.g. VMDq pools
to receive packets without VLAN tags.
Signed-off-by: Tom Barbette <tom.barbette@ulg.ac.be> Acked-by: John McNamara <john.mcnamara@intel.com>
In order to be able to replicate a configuration onto a second port,
device configuration should be fully described and available.
Other configuration items (i.e. MAC addresses) are stored within
rte_eth_dev_data, but not this one.
This trivial patch removes wrong comments about
the return value of the rte_bus_dump(), as
this method does not return any value
(it's return type is void)
Fixes: a97725791eec ("bus: introduce bus abstraction") Signed-off-by: Rami Rosen <rami.rosen@intel.com>
The current virtio supports Transmit Segmentation Offload, but
does not really support Large Receive Offload. The driver was confusing
the two offloads.
Fixes: 86d59b21468a ("net/virtio: support LRO") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
net/virtio: do not falsely claim to do IP checksum
The virtio driver is confused about the meaning of the ip_checksum
flag. In DPDK, ip_checksum means the hardware is capable of checking
the Layer 3 IP checksum. But KVM/QEMU does not do that. The flag
VIRTIO_NET_F_GUEST_CSUM controls whether the receive side does
Layer 4 (TCP/UDP) checksum offload.
Fix by erroring out any requests to do IP checksum.
Fixes: 96cb6711939e ("net/virtio: support Rx checksum offload") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jiayu Hu [Sun, 9 Jul 2017 05:46:46 +0000 (13:46 +0800)]
app/testpmd: enable TCP/IPv4 GRO
This patch enables TCP/IPv4 GRO library in csum forwarding engine.
By default, GRO is turned off. Users can use command "gro (on|off)
(port_id)" to enable or disable GRO for a given port. If a port is
enabled GRO, all TCP/IPv4 packets received from the port are performed
GRO. Besides, users can set max flow number and packets number per-flow
by command "gro set (max_flow_num) (max_item_num_per_flow) (port_id)".
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Jingjing Wu <jingjing.wu@intel.com> Tested-by: Lei Yao <lei.a.yao@intel.com>
Jiayu Hu [Sun, 9 Jul 2017 05:46:45 +0000 (13:46 +0800)]
lib/gro: support TCP/IPv4
In this patch, we introduce five APIs to support TCP/IPv4 GRO.
- gro_tcp4_reassemble: reassemble an inputted TCP/IPv4 packet.
- gro_tcp4_tbl_create: create a TCP/IPv4 reassembly table, which is used
to merge packets.
- gro_tcp4_tbl_destroy: free memory space of a TCP/IPv4 reassembly table.
- gro_tcp4_tbl_pkt_count: return the number of packets in a TCP/IPv4
reassembly table.
- gro_tcp4_tbl_timeout_flush: flush timeout packets from a TCP/IPv4
reassembly table.
TCP/IPv4 GRO API assumes all inputted packets are with correct IPv4
and TCP checksums. And TCP/IPv4 GRO API doesn't update IPv4 and TCP
checksums for merged packets. If inputted packets are IP fragmented,
TCP/IPv4 GRO API assumes they are complete packets (i.e. with L4
headers).
In TCP/IPv4 GRO, we use a table structure, called TCP/IPv4 reassembly
table, to reassemble packets. A TCP/IPv4 reassembly table includes a key
array and a item array, where the key array keeps the criteria to merge
packets and the item array keeps packet information.
One key in the key array points to an item group, which consists of
packets which have the same criteria value. If two packets are able to
merge, they must be in the same item group. Each key in the key array
includes two parts:
- criteria: the criteria of merging packets. If two packets can be
merged, they must have the same criteria value.
- start_index: the index of the first incoming packet of the item group.
Each element in the item array keeps the information of one packet. It
mainly includes three parts:
- firstseg: the address of the first segment of the packet
- lastseg: the address of the last segment of the packet
- next_pkt_index: the index of the next packet in the same item group.
All packets in the same item group are chained by next_pkt_index.
With next_pkt_index, we can locate all packets in the same item
group one by one.
To process an incoming packet needs three steps:
a. check if the packet should be processed. Packets with one of the
following properties won't be processed:
- FIN, SYN, RST, URG, PSH, ECE or CWR bit is set;
- packet payload length is 0.
b. traverse the key array to find a key which has the same criteria
value with the incoming packet. If find, goto step c. Otherwise,
insert a new key and insert the packet into the item array.
c. locate the first packet in the item group via the start_index in the
key. Then traverse all packets in the item group via next_pkt_index.
If find one packet which can merge with the incoming one, merge them
together. If can't find, insert the packet into this item group.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Jiayu Hu [Sun, 9 Jul 2017 05:46:44 +0000 (13:46 +0800)]
lib/gro: add Generic Receive Offload API framework
Generic Receive Offload (GRO) is a widely used SW-based offloading
technique to reduce per-packet processing overhead. It gains
performance by reassembling small packets into large ones. This
patchset is to support GRO in DPDK. To support GRO, this patch
implements a GRO API framework.
To enable more flexibility to applications, DPDK GRO is implemented as
a user library. Applications explicitly use the GRO library to merge
small packets into large ones. DPDK GRO provides two reassembly modes.
One is called lightweight mode, the other is called heavyweight mode.
If applications want to merge packets in a simple way and the number
of packets is relatively small, they can use the lightweight mode.
If applications need more fine-grained controls, they can choose the
heavyweight mode.
rte_gro_reassemble_burst is the main reassembly API which is used in
lightweight mode and processes N packets at a time. For applications,
performing GRO in lightweight mode is simple. They just need to invoke
rte_gro_reassemble_burst. Applications can get GROed packets as soon as
rte_gro_reassemble_burst returns.
rte_gro_reassemble is the main reassembly API which is used in
heavyweight mode and tries to merge N inputted packets with the packets
in GRO reassembly tables. For applications, performing GRO in heavyweight
mode is relatively complicated. Before performing GRO, applications need
to create a GRO context object, which keeps reassembly tables of
desired GRO types, by rte_gro_ctx_create. Then applications can use
rte_gro_reassemble to merge packets. The GROed packets are in the
reassembly tables of the GRO context object. If applications want to get
them, applications need to manually flush them by flush API.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Jan Blunck [Sun, 9 Jul 2017 09:44:16 +0000 (05:44 -0400)]
crypto/scheduler: fix build with old gcc
Seen with gcc 4.9.2:
drivers/crypto/scheduler/scheduler_multicore.c:286:2: error:
'for' loop initial declarations are only allowed in C99 or C11 mode
for (uint16_t i = 0; i < sched_ctx->nb_wc; i++)
^
Introduce a more versatile helper to parse device strings. This
helper expects a generic rte_devargs structure as storage in order not
to require API changes in the future, should this structure be
updated.
The old equivalent function is thus being deprecated, as its API does
not allow to accompany rte_devargs evolutions.
A deprecation notice is issued.
This new helper will parse bus information as well as device name and
device parameters. It does not allocate an rte_devargs structure and
expects one to be given as input.
rte_devargs now represents any device from any bus.
The related devtypes do not identify a bus anymore, only which scan
policy the device subscribes to.
The bus itself is identified by a bus handle previously introduced.
Scan policies describe the way a bus should scan the system to search
for possible devices.
Three flags are introduced:
RTE_BUS_SCAN_UNDEFINED: Configuration is irrelevant for this bus
RTE_BUS_SCAN_WHITELIST: Scanning should be limited to declared devices
RTE_BUS_SCAN_BLACKLIST: Scanning should exclude only declared devices
Thomas Monjalon [Fri, 7 Jul 2017 00:04:26 +0000 (02:04 +0200)]
examples/ethtool: include PCI header directly
In devargs rework, rte_pci.h won't be included by rte_ethdev.h
(via rte_devargs.h) anymore.
rte_ethtool_get_drvinfo() could use rte_devargs.name instead of
creating equivalent bus specific name.
For now, it is workarounded by just including rte_pci.h.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
This operation can be used either to validate that a device
representation can be understood by a bus, as well as store the resulting
specialized device representation in any format determined by the bus.
Implementing this function allows EAL initialization routines to infer
which bus should handle a device. This is used as a way to respect
backward compatibility.
This API will disappear once this compatibility is not enforced anymore.
Thomas Monjalon [Fri, 7 Jul 2017 00:03:07 +0000 (02:03 +0200)]
bus: fix driver registration
The bus name was stored with embedded double quotes.
Indeed the bus name is given with a string in a macro,
which is not used elsewhere.
These macros are useless because the buses are drivers,
so they must not have any API for the application writer.
The registration can be done with a hardcoded value without quotes.
There is another (small) benefit of not using macros for driver names:
it is to have a meaningful constructor function name.
For instance, it was businitfn_PCI_BUS_NAME instead of businitfn_pci.
The bus registration macro is also changed to use
the new RTE_INIT_PRIO macro, similar to RTE_INIT used for other drivers.
The priority is the highest (101) in order to be sure that the bus driver
is registered before its device drivers.
Fixes: 0fd1a0eaae19 ("pci: add bus driver") Fixes: fea892e35f21 ("bus/vdev: use standard bus registration") Fixes: 7e7df6d0a41d ("bus/fslmc: introduce fsl-mc bus driver") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Mike Stolarchuk [Fri, 7 Jul 2017 05:54:25 +0000 (06:54 +0100)]
hash: fix lock release on add
When adding items to a hash table with multiple threads,
there is an spinlock used to prevent data corruption
(unless Transactional Memory is supported).
If there is a failure, the spinlock should be released,
but there were cases where that was not happening.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX") Cc: stable@dpdk.org Signed-off-by: Mike Stolarchuk <mike.stolarchuk@bigswitch.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This allows PMDs and applications to save flow rules in their generic
format for later processing. This is useful when rules cannot be applied
immediately, such as when the device is not properly initialized.
Jerin Jacob [Tue, 4 Jul 2017 04:53:27 +0000 (10:23 +0530)]
doc: add perf all types queue test in eventdev test guide
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Jerin Jacob [Tue, 4 Jul 2017 04:53:26 +0000 (10:23 +0530)]
doc: add perf queue test in eventdev test guide
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com> Acked-by: John McNamara <john.mcnamara@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>