Bernard Iremonger [Tue, 8 Mar 2016 17:10:25 +0000 (17:10 +0000)]
ixgbe: fix releasing queues twice when detaching VF
Releasing the rx and tx queues is already done in ixgbe_dev_close()
so it does not need to be done in eth_ixgbevf_dev_uninit().
Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Zhe Tao [Thu, 10 Mar 2016 15:26:22 +0000 (15:26 +0000)]
ixgbe: fix VF Rx/Tx function assignment
When a DPDK secondary process initializes ixgbevf, it always uses the
simple RX function or the LRO RX function, which does not follow the
RX/TX function selection logic used for the primary process.
Use ixgbe_set_tx_function and ixgbe_set_rx_function to select the
RX/TX functions when a secondary process calls the eth dev init function.
Fixes: 9d8a92628f21 ("ixgbe: remove simple scalar scattered Rx method")
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Wenzhuo Lu [Fri, 26 Feb 2016 03:05:29 +0000 (11:05 +0800)]
ixgbe: support link speed auto-negotiation on X550em_x
Normally auto-negotiation is supported by FW, and SW need not care about
it. But on x550em_x, FW doesn't support auto-neg. As the x550em_x ports
are 10G, if we connect the port with a peer which is 1G, the link will
always be down.
We need to support auto-neg in SW to avoid this link down issue. As we already
have the code to handle the link speed setting, what we need is a trigger.
When the advertised link speed changes, a PHY interrupt will be
triggered. So, we should handle this interrupt and call ixgbe_handle_lasi
to set the link speed correctly.
Please be aware this only works when auto-neg is on. If the auto-neg of the
peer port is turned off and its speed is set manually, we should also
set the speed of our own port manually.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 06:24:47 +0000 (14:24 +0800)]
ixgbe: support multicast promiscuous mode on VF
Add multicast promiscuous mode support on ixgbe VF driver.
Please note that to use this promiscuous mode, both the PF and the VF
driver need to support it, because this VF feature is configured on
the PF.
When using a kernel PF driver with the DPDK VF driver, make sure the kernel
PF driver supports VF multicast promiscuous mode. When using DPDK PF with
DPDK VF, make sure the PF driver is the same version as the VF driver.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:06 +0000 (16:55 +0800)]
ixgbe: support new devices and MAC types
Add the support for new devices and mac types, as supported by the base
code update.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:05 +0000 (16:55 +0800)]
ixgbe/base: update readme
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:04 +0000 (16:55 +0800)]
ixgbe/base: abstract out link read/write
It's more valuable to abstract the link read/write interface. As such,
change the following method names, and add them to a new link info
structure:
read_i2c_combined => read_link
read_i2c_combined_unlocked => read_link_unlocked
write_i2c_combined => write_link
write_i2c_combined_unlocked => write_link_unlocked
This will allow X550EM_a to override these methods for MDIO access
while X550EM_x provides methods to use I2C combined access.
Initially the structure is just method pointers and a bus
address.
Two functions involved in combined I2C accesses were moved from
ixgbe_phy.c to ixgbe_x550.c. The underlying functions that carry
out the combined I2C accesses were left in ixgbe_phy.c because
they share some functions with other I2C methods.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:03 +0000 (16:55 +0800)]
ixgbe/base: set MDIO speed after MAC reset
The MDIO clock speed must be reconfigured after the MAC reset.
Otherwise the MDIO clock speed becomes invalid and the driver reads
invalid PHY register values. The driver now sets the MDIO clock
speed prior to initializing PHY ops and again after the MAC reset.
As the MDIO speed now gets set in more than one place, make a
function for it so it will always be done correctly.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:02 +0000 (16:55 +0800)]
ixgbe/base: fix setting flow director flag twice
Do not set FDIRCTRL.DROP_NO_MATCH in ixgbe_init_fdir_perfect_82599(),
this bit is already set in ixgbe_set_fdir_drop_queue_82599() which
makes more sense for drivers that call that function.
This resolves an issue where packets were being dropped when switching
to perfect filters mode.
Setting this bit makes no sense in perfect filters mode for the
driver as we do not want to route all packets that don't match an FDIR
rule to a single queue and instead fall back to RSS.
Drivers that need this bit set can call ixgbe_set_fdir_drop_queue_82599()
and the ones that don't, can preserve the old behavior.
Fixes: 2241ce281646 ("ixgbe/base: add flow director drop queue")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:01 +0000 (16:55 +0800)]
ixgbe/base: add register definition for SGMII busy
The X550EM_a device provides the MAC_SGMII_BUSY register to
indicate when slow SGMII register writes complete. Add
definitions for the register. No definitions are provided for
the individual bits under the theory that it is better to wait
for everything to complete when needed rather than try to map
out which reads need to wait for which writes. So we should wait
when anything is marked as "busy".
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:00 +0000 (16:55 +0800)]
ixgbe/base: ignore manageability for PHY power on
Instead of not defining the callback for set_phy_power when
manageability is enabled, put the check in the set_phy_power
function so that only turning the power off is conditional on
management, but not turning the PHY on.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:59 +0000 (16:54 +0800)]
ixgbe/base: set VF MAC address only when acked by PF
This patch resolves an issue where the VF MAC address is zeroed out
in cases where the VF driver is loaded while the PF interface
is down.
The solution is to only set it when we get an ACK from the PF.
Fixes: 6202266e5680 ("ixgbe/base: vf changes")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:58 +0000 (16:54 +0800)]
ixgbe/base: add sw-firmware sync for resource sharing on X550em_a
Use a PHY token, shared between sw-fw for PHY access on X550EM_a.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:57 +0000 (16:54 +0800)]
ixgbe/base: support X550em_x V2 device
Only x550em_x V1 was supported before. Now V2 is supported.
A mask for V1 and V2 is defined and used to support both.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:56 +0000 (16:54 +0800)]
ixgbe/base: support X550em_a device
Add new X550EM_a devices and their mac types, X550EM_a
and X550EM_a_vf.
Update the code to use the new devices and mac types.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Michael Qiu [Fri, 29 Jan 2016 05:58:10 +0000 (13:58 +0800)]
ixgbe: fix disable interrupt twice
Currently, the ixgbe VF and PF disable interrupts twice, once in the
stop stage and once in the uninit stage. This causes an error:
testpmd> quit
Shutting down port 0...
Stopping ports...
Done
Closing ports...
EAL: Error disabling MSI-X interrupts for fd 26
Done
because the interrupt has already been disabled in the stop stage.
Since it is enabled in the init stage, it is better to remove the
disabling from the stop stage.
Fixes: 0eb609239efd ("ixgbe: enable Rx queue interrupts for PF and VF")
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Stephen Hemminger [Wed, 13 Jan 2016 04:54:10 +0000 (20:54 -0800)]
ixgbe: fix whitespace
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Stephen Hemminger [Fri, 13 Nov 2015 16:10:13 +0000 (08:10 -0800)]
ixgbe: speed up non-vector Tx
The freeing of mbufs in ixgbe is one of the observable hot spots
under load. Optimize it by doing bulk free of mbufs using code similar
to i40e and fm10k.
Drop the no longer needed micro-optimization for the no refcount flag.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Stephen Hemminger [Fri, 22 Jan 2016 01:38:37 +0000 (17:38 -0800)]
igb: set default thresholds based on MAC type
This brings the DPDK igb driver in line with the behavior used by
the current Linux driver. The IGB hardware has several different
MAC types and the threshold values that work vary based on the hardware.
Since DPDK 1.8 it has been up to devices to provide the correct default
configuration parameter. But the igb driver gives values that are broken
on some devices, and always causes a warning message at startup.
Please test this on real hardware, I don't have the luxury of a
hardware lab full of variations of this chip.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Bernard Iremonger [Wed, 2 Mar 2016 16:09:06 +0000 (16:09 +0000)]
e1000: fix VF MAC address on close
Allow reprogramming of the RAR with a zero mac address,
to ensure that the VF traffic goes to the PF after
stop, close and detach of the VF.
Fixes: be2d648a2dd3 ("igb: add PF support")
Fixes: d82170d27918 ("igb: add VF support")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Yury Kylulin [Tue, 9 Feb 2016 09:09:43 +0000 (12:09 +0300)]
e1000: support VF promiscuous and allmulticast
Enable promiscuous and allmulticast mode control from the VF using
rte_eth_promiscuous_enable()/rte_eth_promiscuous_disable() and
rte_eth_allmulticast_enable()/rte_eth_allmulticast_disable().
For promiscuous mode, the host/PF igb driver should be built with
IGB_ENABLE_VF_PROMISC.
For allmulticast mode, the "allmulti" flag should be set on the appropriate PF:
    ifconfig eth0 allmulti
Signed-off-by: Yury Kylulin <yury.kylulin@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Ravi Kerur [Wed, 2 Mar 2016 13:59:34 +0000 (05:59 -0800)]
e1000: support I217 and I218 devices
Modified driver and eal code to support I217 and I218 Intel NICs.
Compiled and tested (via testpmd) on Ubuntu 14.04 for target
x86_64-native-linuxapp-gcc
Compiled for target x86_64-native-linuxapp-clang
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
John Daley [Tue, 8 Mar 2016 18:49:07 +0000 (10:49 -0800)]
enic: fix last packet not being sent
The last packet of the tx burst function array was not being
emitted until the subsequent call. The nic descriptor index
was being set to the current tx descriptor instead of one past
the descriptor as required by the nic.
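A minimal sketch of the index rule described above, assuming a ring whose indices wrap at the ring size. The function name and the wrap-around behaviour are illustrative assumptions, not the enic driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the index of the last descriptor filled in
 * a burst, return the value to post to the NIC, which is one past that
 * descriptor. Wrapping at ring_size is an assumption of this sketch.
 */
static uint32_t posted_index_after(uint32_t last_desc_idx, uint32_t ring_size)
{
	assert(ring_size > 0 && last_desc_idx < ring_size);
	return (last_desc_idx + 1) % ring_size;
}
```

Posting last_desc_idx itself would leave the final packet of the burst unsent until the next call, which is the symptom the commit describes.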
Fixes: d739ba4c6abf ("enic: improve Tx packet rate")
Signed-off-by: John Daley <johndale@cisco.com>
John Daley [Fri, 4 Mar 2016 21:09:00 +0000 (13:09 -0800)]
enic: improve Rx performance
This is a wholesale replacement of the Enic PMD receive path in order
to improve performance and code clarity. The changes are:
- Simplify and reduce code path length of receive function.
- Put most of the fast-path receive functions in one file.
- Reduce the number of posted_index updates (pay attention to
rx_free_thresh)
- Remove the unneeded container structure around the RQ mbuf ring
- Prefetch next Mbuf and descriptors while processing the current one
- Use a lookup table for converting CQ flags to mbuf flags.
Signed-off-by: John Daley <johndale@cisco.com>
Yoann Desmouceaux [Wed, 24 Feb 2016 23:06:15 +0000 (00:06 +0100)]
enic: fix DMA address of outgoing packets
The enic PMD driver send function uses a constant offset instead
of relying on the data_off in the mbuf to find the start of the packet.
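The difference can be sketched as follows. The struct below is a simplified stand-in for the two mbuf fields involved, and the helper name is hypothetical; it is not the enic send function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the two rte_mbuf fields that matter here. */
struct mbuf_sketch {
	uint64_t buf_physaddr; /* physical address of the buffer start */
	uint16_t data_off;     /* offset of the packet data in the buffer */
};

/* Start of packet data for DMA: buffer base plus the per-mbuf offset. */
static uint64_t tx_dma_addr(const struct mbuf_sketch *m)
{
	assert(m != NULL);
	return m->buf_physaddr + m->data_off;
}
```

If an application has moved the data start (for example data_off is 142 instead of the default headroom of 128), a constant offset of 128 would point 14 bytes before the real packet start.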
Fixes: fefed3d1e62c ("enic: new driver")
Signed-off-by: Yoann Desmouceaux <ydesmouc@cisco.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Rahul Lakkireddy [Thu, 25 Feb 2016 09:37:53 +0000 (15:07 +0530)]
cxgbe: fix PCI info copy to ports under same PF
Chelsio NIC ports share a single PF. Move rte_eth_copy_pci_info()
to copy the pci device information to the remaining ports as well.
Also update license year to 2016.
Fixes: eeefe73f0af1 ("drivers: copy PCI device info to ethdev data")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Rahul Lakkireddy [Fri, 12 Feb 2016 11:45:30 +0000 (17:15 +0530)]
cxgbe: fix memory leak after initialization failure
Add missing code to free adapter when the device initialization fails.
Fixes: 8318984927ff ("cxgbe: add pmd skeleton")
Reported-by: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Rahul Lakkireddy [Tue, 19 Jan 2016 10:17:08 +0000 (15:47 +0530)]
cxgbe: fix setting wrong MTU
max_rx_pkt_len already includes ETHER_HDR_LEN and ETHER_CRC_LEN for the
mtu. But, the firmware also adds ETHER_HDR_LEN and ETHER_CRC_LEN to the
mtu specified. Fix by subtracting these values from the mtu before
passing it to firmware.
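The arithmetic can be sketched as below. The two length constants are the standard Ethernet values; the function name is hypothetical and the firmware call itself is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14 /* destination MAC + source MAC + ethertype */
#define ETHER_CRC_LEN 4  /* frame check sequence */

/*
 * max_rx_pkt_len already counts header and CRC, and the firmware adds
 * them again, so strip them once before handing the value over.
 */
static uint32_t fw_mtu_from_max_rx_pkt_len(uint32_t max_rx_pkt_len)
{
	assert(max_rx_pkt_len >= ETHER_HDR_LEN + ETHER_CRC_LEN);
	return max_rx_pkt_len - (ETHER_HDR_LEN + ETHER_CRC_LEN);
}
```

For a standard 1518-byte frame this yields an MTU of 1500; without the subtraction the firmware would end up with 1518 + 18.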
Fixes: 4b2eff452d2e ("cxgbe: enable jumbo frames")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Rahul Lakkireddy [Tue, 19 Jan 2016 10:17:07 +0000 (15:47 +0530)]
cxgbe: fix allocated size for RSS table
The size of each entry in the port's rss table is actually 2 bytes
and not 1 byte. A segfault occurs when accessing part of port 0's rss
table because it gets overwritten by subsequent port 1's part of the
rss table. Fix by setting the size of each entry appropriately.
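A small sketch of why the entry size matters, assuming one shared table that is split into equal per-port slices. The helper names and the layout are assumptions for illustration, not the cxgbe data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a table of nentries 2-byte RSS entries. */
static size_t rss_table_bytes(size_t nentries)
{
	return nentries * sizeof(uint16_t);
}

/* Start of the slice belonging to 'port' in the shared table. */
static uint16_t *port_rss_slice(uint16_t *base, size_t per_port, unsigned int port)
{
	assert(base != NULL);
	return base + (size_t)port * per_port;
}
```

Sizing the allocation with 1 byte per entry would give only half of rss_table_bytes(), so port 1's slice would overlap the second half of port 0's slice.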
Fixes: 92c8a63223e5 ("cxgbe: add device configuration and Rx support")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Charles (Chas) Williams [Thu, 31 Dec 2015 00:37:51 +0000 (19:37 -0500)]
bnx2x: determine queue sizes sooner
The VF needs to determine the queue sizes before .dev_infos_get
so that it can hint the proper sizes to the upper layer. Move
bnx2x_vf_get_resources() to .eth_dev_init and probe with the guesses
from bnx2x_init_rte().
Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Charles (Chas) Williams [Thu, 31 Dec 2015 00:37:50 +0000 (19:37 -0500)]
bnx2x: fix resource allocation error handling
bnx2x_loop_obtain_resources() returns a struct containing the status and
the error message. If bnx2x_do_req4pf() fails, it shouldn't return both
of these fields set to 0 indicating failure and no error.
Further, bnx2x_do_req4pf() needs to be able to fail and return NO_RESOURCES
so that bnx2x_loop_obtain_resources() can negotiate reduced resource
requirements. This requires additional checking around bnx2x_do_req4pf().
Fixes: 540a211084a7 ("bnx2x: driver core")
Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Stephen Hemminger [Tue, 5 Jan 2016 16:32:00 +0000 (08:32 -0800)]
bnx2x: remove unused variable
The mbuf_alloc_size is leftover from BSD or some other code base.
It is set but never used in DPDK driver. After that the related defines
can also be eliminated.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Liming Sun [Wed, 10 Feb 2016 05:15:21 +0000 (00:15 -0500)]
mpipe: fix crash when testpmd is quit under load
Fix the hang/crash issue when quitting testpmd under a high
traffic rate. The following issues were found and fixed:
1. edesc->size is not initialized properly in mpipe_do_xmit() and could
cause buffer leak or corruption when HW buffer return is used.
2. Check the 'idesc.be' error bit in mpipe_recv_flush() to make sure
buffer is valid before releasing it. This is to avoid issues when
running out of buffers.
3. priv->rx_buffers counter is not accurate when HW buffer return is
used. Remove this counter to simplify the code.
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Liming Sun [Fri, 8 Jan 2016 14:30:38 +0000 (09:30 -0500)]
mpipe: fix link initialization ordering
The mpipe link structure is initialized in function mpipe_link_init().
Currently it is only called from eth_dev_ops.dev_start, which
caused crashes when link mgmt APIs (like promiscuous_enable)
were called before eth_dev_ops.dev_start(). This commit fixes it
by calling mpipe_link_init() in rte_pmd_mpipe_devinit().
Fixes: a8dd50513dea ("mpipe: add TILE-Gx mPIPE poll mode driver")
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Liming Sun [Fri, 8 Jan 2016 14:30:37 +0000 (09:30 -0500)]
mpipe: optimize buffer return mechanism
This commit optimizes the mpipe buffer return. When
a packet is received, instead of allocating and refilling the
buffer stack right away, it tracks the number of pending buffers,
and uses HW buffer return as an optimization when the pending
number is below a certain threshold, thus saving two MMIO writes and
improving performance, especially for bidirectional traffic.
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Liming Sun [Fri, 8 Jan 2016 14:30:36 +0000 (09:30 -0500)]
mk: support native build on TILE-Gx
The CROSS variable has empty default value (for native) and
must be set when using a cross-toolchain.
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Thomas Monjalon [Tue, 15 Mar 2016 18:43:55 +0000 (19:43 +0100)]
doc: fix IPsec entry in the release notes
It was inserted in the "Resolved Issues" section.
Move the entry with the new features.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tetsuya Mukawa [Mon, 14 Mar 2016 08:53:32 +0000 (17:53 +0900)]
vhost: fix default value of kickfd and callfd
Currently, default values of kickfd and callfd are -1.
If the values are -1, current code guesses kickfd and callfd haven't
been initialized yet. Then vhost library will guess the virtqueue isn't
ready for processing.
But callfd and kickfd will be set as -1 when "--enable-kvm"
isn't specified in QEMU command line. It means we cannot treat -1 as
uninitialized state.
The patch defines -1 and -2 as VIRTIO_INVALID_EVENTFD and
VIRTIO_UNINITIALIZED_EVENTFD, and uses VIRTIO_UNINITIALIZED_EVENTFD for
the default values of kickfd and callfd.
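The distinction between the two values can be sketched as follows. The two macro names and values come from the commit message; the struct and helper functions are hypothetical simplifications, not the vhost library code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* -1 is a legitimate "no eventfd" value sent by QEMU; -2 marks a
 * descriptor that has not been set yet. */
#define VIRTIO_INVALID_EVENTFD        (-1)
#define VIRTIO_UNINITIALIZED_EVENTFD  (-2)

/* Hypothetical, simplified stand-in for the per-virtqueue fd state. */
struct vq_fds {
	int kickfd;
	int callfd;
};

static void vq_fds_init(struct vq_fds *vq)
{
	assert(vq != NULL);
	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
}

/* A queue counts as set up once both fds were assigned, even to -1. */
static bool vq_fds_ready(const struct vq_fds *vq)
{
	return vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
}
```

With a single -1 sentinel, a guest started without "--enable-kvm" would be indistinguishable from a virtqueue that was never configured.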
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:46 +0000 (12:32 +0800)]
vhost: avoid dead loop chain
If a malicious guest forges a dead loop chain, it could lead to a dead
loop of copying the desc buf to mbuf, which results in all mbufs being
exhausted.
Add a var nr_desc to avoid such a case.
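A minimal sketch of the counting idea, assuming a simplified descriptor without the address field. Only the name nr_desc comes from the commit; the function, the struct and the return convention are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT 1 /* "chain continues" flag from the virtio spec */

/* Simplified descriptor; a real vring descriptor also has an address. */
struct desc_sketch {
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/*
 * Return the total length of the chain starting at 'head', or -1 if the
 * chain visits more descriptors than the ring holds (a forged loop) or
 * points outside the ring.
 */
static int64_t chain_len(const struct desc_sketch *ring, uint16_t ring_size,
			 uint16_t head)
{
	uint32_t nr_desc = 0;
	int64_t total = 0;
	uint16_t idx = head;

	assert(ring != NULL && head < ring_size);
	for (;;) {
		if (++nr_desc > ring_size)
			return -1; /* more steps than descriptors: a loop */
		total += ring[idx].len;
		if (!(ring[idx].flags & VRING_DESC_F_NEXT))
			return total;
		idx = ring[idx].next;
		if (idx >= ring_size)
			return -1; /* next index outside the ring */
	}
}
```

A valid chain can never be longer than the ring, so the counter bounds the walk without needing a visited set.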
Suggested-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:45 +0000 (12:32 +0800)]
vhost: check for ring descriptors overflow
A malicious guest may easily forge an illegal vring desc buf.
To make our vhost robust, we need to make sure desc->next will not
go beyond the vq->desc[] array.
Suggested-by: Rich Lane <rich.lane@bigswitch.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:44 +0000 (12:32 +0800)]
vhost: do sanity check for ring descriptor length
We need to make sure that desc->len is bigger than the size of the virtio net
header; otherwise, unexpected behaviour might happen because "desc_avail"
would become a huge number in the following code:
    desc_avail = desc->len - vq->vhost_hlen;
For the dequeue code path, it will try to allocate enough mbufs to hold a
desc buf of that size, which ends up consuming all mbufs, so that no
free mbuf is available. Therefore, you might see an error message:
    Failed to allocate memory for mbuf.
Also, for both the dequeue and enqueue code paths, while copying data from/to
the desc buf, the big "desc_avail" would result in accessing memory that does
not belong to the desc buf, which could lead to memory access errors.
A malicious guest could easily forge such a malformed vring desc buf. Restarting
an interrupted DPDK application inside the guest would also trigger this
issue every time, as all huge pages are reset to 0 during DPDK re-init,
leading to desc->len being 0.
Therefore, this patch adds a sanity check for desc->len, to make vhost
robust.
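The wrap-around can be reproduced in isolation. The sketch below assumes 32-bit unsigned lengths and a hypothetical helper name; it rejects descriptors shorter than the header, which is the case where the subtraction wraps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the payload bytes left after the virtio net header.
 * Returns false when the descriptor is shorter than the header, in
 * which case the unsigned subtraction would wrap to a huge value.
 */
static bool payload_len(uint32_t desc_len, uint32_t hdr_len,
			uint32_t *desc_avail)
{
	assert(desc_avail != NULL);
	if (desc_len < hdr_len)
		return false;
	*desc_avail = desc_len - hdr_len;
	return true;
}
```

With desc_len equal to 0 and a 12-byte header, the unchecked subtraction gives 4294967284, which is the "huge number" the message refers to.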
Reported-by: Rich Lane <rich.lane@bigswitch.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:43 +0000 (12:32 +0800)]
vhost: remove wrong unlikely prediction in Rx
VIRTIO_NET_F_MRG_RXBUF is a default feature supported by vhost.
Adding unlikely for VIRTIO_NET_F_MRG_RXBUF detection doesn't
make sense to me at all.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:42 +0000 (12:32 +0800)]
vhost: remove rte_memcpy from header copy
First of all, rte_memcpy() is mostly useful for copying big packets
by leveraging hardware advanced instructions like AVX. But for virtio
net hdr, which is 12 bytes at most, invoking rte_memcpy() will not
introduce any performance boost.
And, to my surprise, rte_memcpy() is VERY huge. Since rte_memcpy()
is inlined, it increases the binary code size linearly every time
we call it at a different place. Replacing the two rte_memcpy() calls
with direct copies saves nearly 12K bytes of code size!
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Mon, 14 Mar 2016 07:35:22 +0000 (15:35 +0800)]
vhost: refactor mergeable Rx
The current virtio_dev_merge_rx() implementation looks just like the
old rte_vhost_dequeue_burst(): full of twisted logic, with the
same code block repeated in quite a few different places.
However, the logic of virtio_dev_merge_rx() is quite similar to
virtio_dev_rx(). The big difference is that the mergeable one
could allocate more than one available entry to hold the data.
Fetching all available entries to vec_buf at once makes the
difference a bit bigger then.
The refactored code looks like below:
    while (mbuf_has_not_drained_totally || mbuf_has_next) {
        if (this_desc_has_no_room) {
            this_desc = fetch_next_from_vec_buf();
            if (it is the last of a desc chain)
                update_used_ring();
        }
        if (this_mbuf_has_drained_totally)
            mbuf = fetch_next_mbuf();
        COPY(this_desc, this_mbuf);
    }
This patch removes quite a few lines of code, and therefore makes it much
more readable.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:40 +0000 (12:32 +0800)]
vhost: refactor Rx
This is a simple refactor, as there isn't any twisted logic in old
code. Here I just broke the code and introduced two helper functions,
reserve_avail_buf() and copy_mbuf_to_desc() to make the code more
readable.
Also, it saves nearly 1K bytes of binary code size.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:39 +0000 (12:32 +0800)]
vhost: refactor dequeueing
The current rte_vhost_dequeue_burst() implementation is a bit messy
and its logic is twisted. You can also see repeated code here and there.
However, rte_vhost_dequeue_burst() actually does a simple job: copy
the packet data from vring desc to mbuf. What's tricky here is:
- desc buff could be chained (by desc->next field), so that you need
  to fetch the next one if the current one is wholly drained.
- One mbuf might not be big enough to hold all desc buff, hence you
  need to chain the mbuf as well, by the mbuf->next field.
The simplified code looks like the following:
    while (this_desc_is_not_drained_totally || has_next_desc) {
        if (this_desc_has_drained_totally) {
            this_desc = next_desc();
        }
        if (mbuf_has_no_room) {
            mbuf = allocate_a_new_mbuf();
        }
        COPY(mbuf, desc);
    }
Note that the old code does special handling to skip the virtio
header. However, that can be done simply by adjusting the desc_avail
and desc_offset vars:
    desc_avail = desc->len - vq->vhost_hlen;
    desc_offset = vq->vhost_hlen;
This refactor makes the code much more readable (IMO), and it also reduces
the binary code size.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 07:01:20 +0000 (15:01 +0800)]
virtio: fix query of legacy features
Declare dst as type uint32_t instead of uint64_t; otherwise, the upper
32 feature bits are random, as the following io port read only reads the
lower 32 bits. This could yield feature bits that include VIRTIO_F_VERSION_1
(bit 32) for legacy virtio, which is obviously wrong.
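The effect can be modelled without real I/O ports. In the sketch below the port read is replaced by a hypothetical function that writes exactly 4 bytes, and the "stale" argument models whatever the uninitialized 64-bit variable happened to contain; none of these names come from the virtio PMD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VIRTIO_F_VERSION_1 32 /* feature bit number from the virtio spec */

/* Hypothetical stand-in for a 32-bit I/O port read: fills 4 bytes. */
static void fake_ioport_read32(void *dst, uint32_t reg_value)
{
	assert(dst != NULL);
	memcpy(dst, &reg_value, sizeof(reg_value));
}

/* Buggy variant: 8-byte destination, only 4 bytes get written. */
static uint64_t legacy_features_buggy(uint32_t reg_value, uint64_t stale)
{
	uint64_t dst = stale; /* models uninitialized stack contents */

	fake_ioport_read32(&dst, reg_value);
	return dst;
}

/* Fixed variant: 4-byte destination, zero-extended on return. */
static uint64_t legacy_features_fixed(uint32_t reg_value)
{
	uint32_t dst;

	fake_ioport_read32(&dst, reg_value);
	return dst;
}
```

On a little-endian machine the buggy variant keeps the stale upper 32 bits, so bit 32 can appear set even though the device never offered it.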
Fixes: b8f04520ad71 ("virtio: use PCI ioport API")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Keith Wiles [Thu, 10 Sep 2015 19:40:50 +0000 (14:40 -0500)]
eal: decrease log level of some debug messages
When log level is set to 7 (INFO) these messages are still displayed
and should be set to DEBUG.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Stephen Hemminger [Sun, 29 Nov 2015 18:46:49 +0000 (10:46 -0800)]
sched: eliminate floating point in calculating byte clock
The old code was doing a floating point divide for each rte_dequeue()
which is very expensive. Change to using fixed point scaled inverse
multiply. To maintain equivalent precision, scaled math is used.
The application ABI is the same.
This improved performance from 5 Gbit/sec to 10 Gbit/sec when configured
for a 10 Gbit/sec rate.
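One way to picture a fixed-point scaled inverse multiply is sketched below. This is an illustrative variant, not the code of the patch: the shift width, the function names and the choice of storing bytes per cycle are all assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define BYTE_CLOCK_SHIFT 32 /* fraction bits, chosen for this sketch */

/* Precompute bytes-per-cycle as a fixed-point number (done once). */
static uint64_t bytes_per_cycle_fp(uint64_t tsc_hz, uint32_t bytes_per_sec)
{
	assert(tsc_hz > 0);
	return ((uint64_t)bytes_per_sec << BYTE_CLOCK_SHIFT) / tsc_hz;
}

/*
 * Per-dequeue conversion: one multiply and one shift, no division.
 * The caller must keep cycles * inv below 2^64.
 */
static uint64_t cycles_to_bytes(uint64_t cycles, uint64_t inv)
{
	return (cycles * inv) >> BYTE_CLOCK_SHIFT;
}
```

For a 2.5 GHz timestamp counter and a rate of 1.25 Gbyte/sec (10 Gbit/sec), the precomputed factor is exactly one half in fixed point, so 1000 cycles map to 500 bytes.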
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Stephen Hemminger [Sun, 29 Nov 2015 18:46:48 +0000 (10:46 -0800)]
sched: introduce reciprocal divide
This adds (with permission of the original author)
reciprocal divide based on algorithm in Linux.
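For reference, the 32-bit reciprocal divide scheme used in Linux can be sketched as follows. The names carry a _sketch suffix to make clear this is a standalone illustration written from the published algorithm, not a copy of the code added by this commit.

```c
#include <assert.h>
#include <stdint.h>

struct reciprocal_sketch {
	uint32_t m;   /* magic multiplier */
	uint8_t sh1;  /* first shift */
	uint8_t sh2;  /* second shift */
};

/* Position of the highest set bit, counting from 1; 0 for x == 0. */
static int fls32_sketch(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Precompute the multiplier and shifts for divisor d (done once). */
static struct reciprocal_sketch reciprocal_value_sketch(uint32_t d)
{
	struct reciprocal_sketch r;
	uint64_t m;
	int l;

	assert(d != 0);
	l = fls32_sketch(d - 1);
	m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;
	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l < 1 ? l : 1);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* Compute a / d using one multiply, one add and two shifts. */
static uint32_t reciprocal_divide_sketch(uint32_t a, struct reciprocal_sketch r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The expensive division happens only in reciprocal_value_sketch(), so the approach pays off when the same divisor is reused for many dividends.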
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Stephen Hemminger [Sun, 29 Nov 2015 18:46:47 +0000 (10:46 -0800)]
sched: keep track of RED drops
Add new statistic to keep track of drops due to RED.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Panu Matilainen [Thu, 10 Mar 2016 13:16:01 +0000 (15:16 +0200)]
mk: fix eal shared library dependencies
Add DT_NEEDED entries for librte_eal external dependencies.
Details between the platforms differ somewhat, and for static
builds they need to be handled from mk/exec-env still.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Panu Matilainen [Thu, 10 Mar 2016 13:16:00 +0000 (15:16 +0200)]
mk: fix vhost shared library dependencies
Add DT_NEEDED entries for external library dependencies which
are the most critical ones for sane operation.
Clean up vhost_cuse CFLAGS/LDFLAGS confusion while at it.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Panu Matilainen [Thu, 10 Mar 2016 13:15:59 +0000 (15:15 +0200)]
mk: fix shared library dependencies on libm and librt
There are two places that need -lm (test app and librte_sched) and
exactly one that needs -lrt (librte_sched). Add the relevant
DT_NEEDED entries to both, and eliminate the bogus discrepancy
between Linux and BSD EXECENV_LDLIBS wrt these libs.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Reshma Pattan [Fri, 11 Mar 2016 15:16:55 +0000 (15:16 +0000)]
app/testpmd: support unidirectional configuration
Added testpmd support to validate zero nb_rxq/nb_txq
changes of ethdev (d505ba8).
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Panu Matilainen [Thu, 10 Mar 2016 13:49:55 +0000 (15:49 +0200)]
examples/ip_pipeline: use unsigned constants for left shift operations
Tell the compiler to use unsigned constants for left shift ops,
otherwise building with gcc >= 6.0 fails due to multiple warnings like:
warning: left shift of negative value [-Wshift-negative-value]
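A minimal illustration of the kind of change involved, using a hypothetical mask helper rather than the ip_pipeline code. It assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with the low n bits cleared, for 0 <= n < 32. */
static uint32_t high_mask(unsigned int n)
{
	assert(n < 32);
	/*
	 * '~0 << n' would shift the negative int value -1, which is
	 * undefined behaviour in C99 and later and triggers
	 * -Wshift-negative-value. The unsigned constant is well defined.
	 */
	return ~0u << n;
}
```

The result is the same bit pattern the signed version usually produced in practice, but the compiler no longer has grounds to warn.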
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Thu, 10 Mar 2016 15:29:02 +0000 (15:29 +0000)]
examples/ip_pipeline: add load balancing to pass-through
The pass-through pipeline implementation is extended with a load balancing
function. This function allows uniform distribution of the packets among
its output ports. Any application level logic can be applied to
distribute the packets. For instance, in this implementation, a hash value
computed over specific header fields of the incoming packets is
used to spread traffic uniformly among the output ports.
The following pass-through configuration can be used for implementing
the load balancing function over ipv4 traffic:
[PIPELINE0]
type = PASS-THROUGH
core = 0
pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
dma_src_offset = 278; mbuf (128) + headroom (128) + 1st ethertype offset (14) + ttl offset within ip header (8) = 278 (ipv4)
dma_dst_offset = 128; mbuf (128)
dma_size = 16
dma_src_mask =
00FF0000FFFFFFFFFFFFFFFFFFFFFFFF
dma_hash_offset = 144; (dma_dst_offset+dma_size)
lb = hash
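The offsets in the configuration above can be re-derived as follows. The sizes are the ones given in the configuration comments; the struct is a plain IPv4 header layout used only for offsetof(), and all names are specific to this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	MBUF_META_SIZE = 128, /* "mbuf (128)" in the config comment */
	HEADROOM_SIZE  = 128, /* "headroom (128)" */
	ETHER_HDR_SIZE = 14,  /* Ethernet header length */
};

/* IPv4 header layout without options, only used for offsetof(). */
struct ipv4_hdr_sketch {
	uint8_t  version_ihl;
	uint8_t  type_of_service;
	uint16_t total_length;
	uint16_t packet_id;
	uint16_t fragment_offset;
	uint8_t  time_to_live;
	uint8_t  next_proto_id;
	uint16_t hdr_checksum;
	uint32_t src_addr;
	uint32_t dst_addr;
};

/* Offset of the IPv4 TTL field from the start of the mbuf. */
static size_t dma_src_offset_ipv4(void)
{
	size_t ttl_off = offsetof(struct ipv4_hdr_sketch, time_to_live);

	assert(ttl_off == 8);
	return MBUF_META_SIZE + HEADROOM_SIZE + ETHER_HDR_SIZE + ttl_off;
}

/* The hash is stored right after the copied key. */
static size_t dma_hash_offset(size_t dma_dst_offset, size_t dma_size)
{
	return dma_dst_offset + dma_size;
}
```

Assuming an IPv4 header without options, the 16 bytes starting at the TTL cover TTL, protocol, header checksum, both addresses and the first 4 bytes after the header; the mask zeroes the TTL and checksum bytes and keeps the rest.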
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Fan Zhang [Fri, 11 Mar 2016 17:08:10 +0000 (17:08 +0000)]
examples/ip_pipeline: add pcap file dump
This patch adds a packet dumping feature to ip_pipeline. Output port type
SINK now supports dumping packets to a PCAP file before releasing the mbuf back
to the mempool. This feature can be applied by specifying parameters in
the configuration file as shown below:
[PIPELINE1]
type = PASS-THROUGH
core = 1
pktq_in = SOURCE0 SOURCE1
pktq_out = SINK0 SINK1
pcap_file_wr = /path/to/eth1.pcap /path/to/eth2.pcap
pcap_n_pkt_wr = 80 0
The configuration section "pcap_file_wr" contains the full path and name of
the PCAP file which the packets will be dumped to. If multiple SINKs
exist, each shall have its own PCAP file path listed in this section,
separated by spaces. Multiple SINK ports shall NOT share the same PCAP
file.
The configuration section "pcap_n_pkt_wr" contains integer value(s)
and indicates the maximum number of packets to be dumped to the PCAP file.
If this value is "0", the "infinite" dumping mode will be used. If this
value is N (N > 0), the dumping will be finished when the number of
packets dumped to the file reaches N.
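The "0 means infinite, N means stop after N" rule can be sketched as a small counter. The struct and function below are hypothetical and do not touch any PCAP API; they only model the decision of whether the next packet is dumped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-SINK dump state. */
struct sink_dump_state {
	uint32_t max_pkts; /* value of pcap_n_pkt_wr; 0 = no limit */
	uint32_t dumped;   /* packets written so far */
};

/* Decide whether the next packet is dumped, and account for it. */
static bool sink_should_dump(struct sink_dump_state *s)
{
	assert(s != NULL);
	if (s->max_pkts != 0 && s->dumped >= s->max_pkts)
		return false; /* limit reached: dumping is finished */
	s->dumped++;
	return true;
}
```

With the example configuration "pcap_n_pkt_wr = 80 0", the first SINK would stop after 80 packets while the second keeps dumping.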
To enable PCAP dumping support to IP pipeline, the compiler option
CONFIG_RTE_PORT_PCAP must be set to 'y'. It is possible to disable this
feature by removing "pcap_file_wr" and "pcap_n_pkt_wr" lines from the
configuration file.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Fan Zhang [Fri, 11 Mar 2016 17:08:09 +0000 (17:08 +0000)]
port: add pcap file dump
Originally, sink ports in librte_port release received mbufs back to
the mempool. This patch adds an optional PCAP dumping feature to the sink
port: the packets will be dumped to a user defined PCAP file for storage or
debugging. The user may also choose the sink port's activity: either it
continuously dumps the packets to the file, or it stops after a certain
number of packets have been dumped.
This feature shares same CONFIG_RTE_PORT_PCAP compiler option as source
port PCAP file support feature. Users can enable or disable this feature
by setting CONFIG_RTE_PORT_PCAP compiler option "y" or "n".
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Fan Zhang [Fri, 11 Mar 2016 17:08:08 +0000 (17:08 +0000)]
examples/ip_pipeline: add pcap file source
This patch adds PCAP file support to ip_pipeline. Input port type SOURCE
now supports loading a specific PCAP file and sending the packets in it to
the pipeline instance. The packets are then released by the SINK output port.
This feature can be applied by specifying parameters in the configuration
file as shown below:
[PIPELINE1]
type = PASS-THROUGH
core = 1
pktq_in = SOURCE0 SOURCE1
pktq_out = SINK0 SINK1
pcap_file_rd = /path/to/eth1.PCAP /path/to/eth2.PCAP
pcap_bytes_rd_per_pkt = 0 64
The configuration section "pcap_file_rd" contains the full path and name
of the PCAP file to be loaded. If multiple SOURCEs exist, each shall have
its own PCAP file path listed in this section, separated by spaces.
Multiple SOURCE ports may share the same PCAP file to be copied.
The configuration section "pcap_bytes_rd_per_pkt" contains integer value
and indicates the maximum number of bytes to be copied from each packet
in the PCAP file. If this value is "0", all packets in the file will be
copied fully; if the packet size is smaller than the assigned value, the
entire packet is copied. Same as "pcap_file_rd", every SOURCE shall have
its own maximum copy byte number.
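The per-packet copy length implied by "pcap_bytes_rd_per_pkt" can be
sketched as below. The function name is hypothetical; only the rule itself
(0 copies the whole packet, short packets are copied whole, longer ones are
cut at the limit) comes from the description above.

```c
#include <stdint.h>

/* Number of bytes to copy from one PCAP packet.
 * max_bytes is the pcap_bytes_rd_per_pkt value for this SOURCE port. */
static uint32_t source_copy_len(uint32_t pkt_len, uint32_t max_bytes)
{
    if (max_bytes == 0 || pkt_len < max_bytes)
        return pkt_len; /* copy the entire packet */
    return max_bytes;   /* truncate to the configured limit */
}
```

For the example configuration "0 64", SOURCE0 copies every packet fully
and SOURCE1 copies at most 64 bytes per packet.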
To enable PCAP support in IP pipeline, the compiler option
CONFIG_RTE_PORT_PCAP must be set to 'y'. It is possible to disable PCAP
support by removing "pcap_file_rd" and "pcap_bytes_rd_per_pkt" lines
from the configuration file.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Fan Zhang [Fri, 11 Mar 2016 17:08:07 +0000 (17:08 +0000)]
port: add pcap file source
Originally, a source port in librte_port is an input port used as a packet
generator. Similar to the Linux kernel /dev/zero character device, it
generates null packets. This patch adds optional PCAP file support to
source port: instead of sending NULL packets, the source port generates
packets copied from a PCAP file. To increase the performance, the packets
in the file are loaded to memory initially, and copied to mbufs in circular
manner. Users can enable or disable this feature by setting
CONFIG_RTE_PORT_PCAP compiler option "y" or "n".
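The circular replay described above might look roughly like this. It is a
minimal sketch that assumes the PCAP packets are already loaded into
memory; the types and the function name are hypothetical, and a plain byte
buffer stands in for the mbuf data area.

```c
#include <stdint.h>
#include <string.h>

/* One preloaded packet (hypothetical view type). */
struct pkt_view {
    const uint8_t *data;
    uint32_t len;
};

/* Replay state: preloaded packets plus the index of the next one. */
struct pcap_replay {
    const struct pkt_view *pkts;
    uint32_t n_pkts;
    uint32_t next;
};

/* Copy the next packet into buf (at most buf_len bytes) and advance,
 * wrapping to the first packet after the last one. Returns bytes copied. */
static uint32_t replay_next(struct pcap_replay *r, uint8_t *buf,
                            uint32_t buf_len)
{
    const struct pkt_view *p;
    uint32_t n;

    if (r->n_pkts == 0)
        return 0;
    p = &r->pkts[r->next];
    n = p->len < buf_len ? p->len : buf_len;
    memcpy(buf, p->data, n);
    r->next = (r->next + 1) % r->n_pkts;
    return n;
}
```

Since the file is read only once, the per-packet cost in the fast path is
a memory copy rather than file I/O.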
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Xutao Sun [Thu, 10 Mar 2016 03:06:01 +0000 (11:06 +0800)]
i40e: add tunnel filter for IP in GRE
Signed-off-by: Xutao Sun <xutao.sun@intel.com>
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Xutao Sun [Thu, 10 Mar 2016 03:06:00 +0000 (11:06 +0800)]
ethdev: add IP in GRE tunnel
Signed-off-by: Xutao Sun <xutao.sun@intel.com>
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Xutao Sun [Thu, 10 Mar 2016 03:05:59 +0000 (11:05 +0800)]
ethdev: rework tunnel filtering structure
Change the fields of outer_mac and inner_mac in struct
rte_eth_tunnel_filter_conf from pointer to struct in order to
keep the code's readability.
Signed-off-by: Xutao Sun <xutao.sun@intel.com>
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Wenzhuo Lu [Thu, 10 Mar 2016 02:42:14 +0000 (10:42 +0800)]
ixgbe: offload VxLAN and NVGRE Tx checksum on X550
This patch adds VxLAN & NVGRE TX checksum off-load. When the flag of
outer IP header checksum offload is set, we'll set the context
descriptor to enable this checksum off-load.
Also update release notes for VxLAN & NVGRE checksum off-load support.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Wenzhuo Lu [Thu, 10 Mar 2016 02:42:13 +0000 (10:42 +0800)]
ixgbe: offload VxLAN and NVGRE Rx checksum on X550
X550 will do VxLAN & NVGRE RX checksum off-load automatically.
This patch exposes the result of the checksum off-load.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Wenzhuo Lu [Thu, 10 Mar 2016 02:42:12 +0000 (10:42 +0800)]
ixgbe: configure UDP tunnel port
Add UDP tunnel port add/del support on ixgbe. Only VxLAN port
configuration is supported for now.
Although according to the specification the VxLAN port has
a default value 4789, it can be changed. We support VxLAN
port configuration to meet the change.
Note, the default value of VxLAN port in ixgbe NICs is 0. So
please set it when using VxLAN off-load.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Wenzhuo Lu [Thu, 10 Mar 2016 02:42:10 +0000 (10:42 +0800)]
ethdev: rename UDP tunnel port functions
The names of the functions for tunnel port configuration are not
accurate. They are tunnel_add/del; it is better to change them to
tunnel_port_add/del.
The old functions are directly replaced because the API and ABI
compatibility of ethdev are already broken in 16.04.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Wenzhuo Lu [Fri, 11 Mar 2016 01:10:11 +0000 (09:10 +0800)]
app/testpmd: add commands for E-tag operation
Add the CLIs to support the E-tag operation.
1, Offloading of E-tag insertion and stripping.
2, Forwarding the E-tag packets to pools based on the GRP and E-CID_base.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Wenzhuo Lu [Fri, 11 Mar 2016 01:10:10 +0000 (09:10 +0800)]
app/testpmd: add commands for L2 tunnel config
Add CLIs to config ether type of l2 tunnel, and to enable/disable
a type of l2 tunnel.
Now only e-tag tunnel is supported.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Wenzhuo Lu [Fri, 11 Mar 2016 01:10:09 +0000 (09:10 +0800)]
ixgbe: support L2 tunnel operations
Add support of l2 tunnel configuration and operations.
1, Support modifying ether type of a type of l2 tunnel.
2, Support enabling and disabling the support of a type of l2 tunnel.
3, Support enabling/disabling l2 tunnel tag insertion/stripping.
4, Support enabling/disabling l2 tunnel packets forwarding.
5, Support adding/deleting forwarding rules for l2 tunnel packets.
Only support E-tag now.
Also update the release note.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Wenzhuo Lu [Fri, 11 Mar 2016 01:10:08 +0000 (09:10 +0800)]
ethdev: support L2 tunnel operations
Add functions to support l2 tunnel configuration and operations.
1, L2 tunnel ether type modification.
It means modifying the ether type of a specific type of tunnel.
So the packet with this ether type will be parsed as this type
of tunnel.
2, Enabling/disabling l2 tunnel support.
It means enabling/disabling the ability of parsing the specific
type of tunnel. This ability should be enabled before we enable
filtering, forwarding, offloading for this specific type of
tunnel.
3, Insertion and stripping for l2 tunnel tag.
4, Forwarding the packets to a pool based on l2 tunnel tag.
Only support e-tag tunnel now.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Wenzhuo Lu [Fri, 11 Mar 2016 01:10:07 +0000 (09:10 +0800)]
ixgbe: select pool by MAC when using double VLAN
On X550, as required by the datasheet, E-tag packets are not expected
when double VLAN is used. So modify the register PFVTCTL after
enabling double VLAN to select the pool by MAC only, rather than by MAC
or E-tag.
An introduction of E-tag:
It's defined in IEEE 802.1BR. Please refer to this website,
http://www.ieee802.org/1/pages/802.1br.html.
A brief description.
E-tag means external tag, and it's a kind of l2 tunnel. It means a
tag will be inserted in the l2 header. Like below,
   |31           24|23           16|15            8|7             0|
  0| Destination MAC address                                       |
  4| Dest MAC address(cont.)       | Src MAC address               |
  8| Source MAC address(cont.)                                     |
 12| E-tag Ethernet type (0x893f)  | E-tag header                  |
 16| E-tag header(cont.)                                           |
 20| VLAN Ethertype(optional)      | VLAN header(optional)         |
 24| Original type                 | ......                        |
...| ......                                                        |
The E-tag format is like below,
|0                 15|16   18|19 |20                31|
| Ethertype - 0x893f | E-PCP |DEI| Ingress E-CID_base |
|32 33|34 35|36      47|48             55|56       63|
| RSV | GRP |E-CID_base|Ingress_E-CID_ext| E-CID_ext |
The Ingress_E-CID_ext and E-CID_ext are always zero for endpoints
and are effectively reserved.
More details of E-tag are in IEEE 802.1BR. 802.1BR replaces
802.1Qbh. 802.1BR is a standard for Bridge Port Extension.
It specifies the operation of Bridge Port Extenders, including
management, protocols, and algorithms. Bridge Port Extenders
operate in support of the MAC Service by Extended Bridges.
The E-tag is added to l2 header to identify the VM channel and
the virtual port.
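The E-tag field layout above can be decoded as sketched below. This is an
illustration only: it assumes that bit 0 in the format table is the most
significant bit of the first byte on the wire (network order), and the
struct and function names are hypothetical, not part of ixgbe or ethdev.

```c
#include <stdint.h>

/* Decoded E-tag fields (hypothetical structure). */
struct etag_fields {
    uint16_t ethertype;         /* bits 0-15, expected 0x893f */
    uint8_t  pcp;               /* bits 16-18, E-PCP */
    uint8_t  dei;               /* bit 19 */
    uint16_t ingress_ecid_base; /* bits 20-31 */
    uint8_t  grp;               /* bits 34-35 (bits 32-33 are reserved) */
    uint16_t ecid_base;         /* bits 36-47 */
    uint8_t  ingress_ecid_ext;  /* bits 48-55 */
    uint8_t  ecid_ext;          /* bits 56-63 */
};

/* Parse the 8 bytes starting at the E-tag Ethertype. */
static void etag_parse(const uint8_t b[8], struct etag_fields *t)
{
    t->ethertype = (uint16_t)((b[0] << 8) | b[1]);
    t->pcp = (uint8_t)(b[2] >> 5);
    t->dei = (uint8_t)((b[2] >> 4) & 0x1);
    t->ingress_ecid_base = (uint16_t)(((b[2] & 0x0f) << 8) | b[3]);
    t->grp = (uint8_t)((b[4] >> 4) & 0x3);
    t->ecid_base = (uint16_t)(((b[4] & 0x0f) << 8) | b[5]);
    t->ingress_ecid_ext = b[6];
    t->ecid_ext = b[7];
}
```

The GRP and E-CID_base values extracted this way are the ones the pool
forwarding rules are keyed on.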
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Helin Zhang [Fri, 11 Mar 2016 16:50:58 +0000 (00:50 +0800)]
i40e: fix overflow
The array 'ptype_table' was defined with a size of 'UINT8_MAX', which
is 255, while the lookup index can range from 0 to 255. The issue
is fixed by expanding the array by one element.
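The off-by-one can be shown with a short sketch. The table contents and
the lookup helper here are placeholders, not the i40e code; only the
sizing rule is the point.

```c
#include <stdint.h>

/* An 8-bit index has 256 possible values (0..255), so the table needs
 * UINT8_MAX + 1 entries; sizing it UINT8_MAX leaves index 255 out of
 * bounds. */
#define PTYPE_TABLE_SIZE (UINT8_MAX + 1)

static uint32_t ptype_table[PTYPE_TABLE_SIZE];

/* Hypothetical lookup: any uint8_t value is now a valid index. */
static uint32_t ptype_lookup(uint8_t ptype)
{
    return ptype_table[ptype];
}
```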
Fixes:
9571ea028489 ("i40e: replace some offload flags with unified packet type")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Helin Zhang [Fri, 11 Mar 2016 16:50:57 +0000 (00:50 +0800)]
ethdev: add vlan type when setting ether type
In order to set ether type of VLAN for single VLAN, inner
and outer VLAN, the VLAN type as an input parameter is added
to 'rte_eth_dev_set_vlan_ether_type()'.
In addition, corresponding changes in e1000, ixgbe and i40e
are also added.
It is an ABI break but ethdev library is already bumped for 16.04.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Tomasz Kulasek [Thu, 10 Mar 2016 17:19:35 +0000 (18:19 +0100)]
examples: use buffered Tx
The internal buffering of packets for TX in sample apps is no longer
needed, so this patchset also replaces this code with calls to the new
rte_eth_tx_buffer* APIs in:
* l2fwd-jobstats
* l2fwd-keepalive
* l2fwd
* l3fwd-acl
* l3fwd-power
* link_status_interrupt
* client_server_mp
* l2fwd_fork
* packet_ordering
* qos_meter
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tomasz Kulasek [Thu, 10 Mar 2016 17:19:34 +0000 (18:19 +0100)]
ethdev: add buffered Tx
Many sample apps include internal buffering for single-packet-at-a-time
operation. Since this is such a common paradigm, this functionality is
better suited to being implemented in the ethdev API.
The new APIs in the ethdev library are:
* rte_eth_tx_buffer_init - initialize buffer
* rte_eth_tx_buffer - buffer up a single packet for future transmission
* rte_eth_tx_buffer_flush - flush any unsent buffered packets
* rte_eth_tx_buffer_set_err_callback - set up a callback to be called in
case transmitting a buffered burst fails. By default, we just free the
unsent packets.
As well as these, two additional reference callbacks are provided, both
of which free the packets:
* rte_eth_tx_buffer_drop_callback - silently drop packets (default
behavior)
* rte_eth_tx_buffer_count_callback - drop and update user-provided counter
to track the number of dropped packets
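The buffering scheme can be sketched in plain C as below. This is not the
ethdev implementation or its signatures: all txb_* names are hypothetical,
the send function is a stand-in for the burst transmit call, and the
counting callback only counts (a real one would also free the packets).

```c
#include <stddef.h>
#include <stdint.h>

#define TXB_CAPACITY 4 /* illustrative burst size */

typedef uint16_t (*txb_send_fn)(void **pkts, uint16_t n, void *ctx);
typedef void (*txb_err_fn)(void **unsent, uint16_t n, void *userdata);

struct txb {
    txb_send_fn send;    /* stand-in for the burst transmit call */
    void *send_ctx;
    txb_err_fn on_error; /* called with the packets that were not sent */
    void *userdata;
    uint16_t length;     /* packets currently buffered */
    void *pkts[TXB_CAPACITY];
};

/* Error callback in the spirit of the "count" callback: add the number
 * of unsent packets to a user-provided counter. */
static void txb_count_cb(void **unsent, uint16_t n, void *userdata)
{
    (void)unsent;
    *(uint64_t *)userdata += n;
}

/* Stand-in transmit: accepts at most *ctx packets per burst. */
static uint16_t txb_fake_send(void **pkts, uint16_t n, void *ctx)
{
    uint16_t limit = *(uint16_t *)ctx;

    (void)pkts;
    return n < limit ? n : limit;
}

/* Send everything buffered; hand any unsent packets to the callback. */
static uint16_t txb_flush(struct txb *b)
{
    uint16_t to_send = b->length;
    uint16_t sent;

    if (to_send == 0)
        return 0;
    b->length = 0;
    sent = b->send(b->pkts, to_send, b->send_ctx);
    if (sent < to_send && b->on_error != NULL)
        b->on_error(&b->pkts[sent], (uint16_t)(to_send - sent),
                    b->userdata);
    return sent;
}

/* Buffer one packet; flush automatically once the buffer is full. */
static uint16_t txb_buffer(struct txb *b, void *pkt)
{
    b->pkts[b->length++] = pkt;
    if (b->length < TXB_CAPACITY)
        return 0;
    return txb_flush(b);
}
```

An application would call the buffer function once per packet in its
forwarding loop and the flush function periodically, so that packets do
not sit in a partially filled buffer.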
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:20:01 +0000 (12:20 +0800)]
vhost: fix queue pair reallocation
vq is allocated in pairs, hence we should do pair reallocation
at numa_realloc() as well, otherwise an error like the following
occurs while doing NUMA reallocation:
VHOST_CONFIG: reallocate vq from 0 to 1 node
PANIC in rte_free():
Fatal error: Invalid memory
The reason we did not catch it is that numa_realloc() does
not take effect when RTE_LIBRTE_VHOST_NUMA is not enabled,
which is the default case.
Fixes:
e049ca6d10e0 ("vhost-user: prepare multiple queue setup")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:20:00 +0000 (12:20 +0800)]
vhost: simplify numa reallocation
We could first check whether we need to realloc the vq or not; if so,
reallocate it. We then do the same for the vhost dev realloc.
This gets rid of the tons of repeated "if (realloc_dev)"
and "if (realloc_vq)" statements, and therefore makes the code
a bit more readable.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:19:59 +0000 (12:19 +0800)]
vhost: get rid of linked list for devices
While we use a singly linked list to maintain all devices, we could
use a static array to achieve the same goal, just like what we did
to maintain the eth devices with the rte_eth_devices array. This
simplifies the code a bit.
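A static device table of this kind could be sketched as follows. The
names and the table size are hypothetical and do not come from the vhost
library; the sketch only shows why lookup and removal become simple index
operations once the list is gone.

```c
#include <stddef.h>

#define MAX_DEVICES 4 /* illustrative table size */

struct device {
    int id; /* index of the slot holding this device */
};

static struct device *device_table[MAX_DEVICES];

/* Put dev in the first free slot; return its id, or -1 if full. */
static int device_add(struct device *dev)
{
    int i;

    for (i = 0; i < MAX_DEVICES; i++) {
        if (device_table[i] == NULL) {
            device_table[i] = dev;
            dev->id = i;
            return i;
        }
    }
    return -1;
}

/* Constant-time lookup by id, with no list traversal. */
static struct device *device_get(int id)
{
    if (id < 0 || id >= MAX_DEVICES)
        return NULL;
    return device_table[id];
}

/* Free the slot so that it can be reused by a later device_add(). */
static void device_remove(int id)
{
    if (id >= 0 && id < MAX_DEVICES)
        device_table[id] = NULL;
}
```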
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Yuanhan Liu [Tue, 8 Mar 2016 08:51:21 +0000 (16:51 +0800)]
vhost: fix build with kernel < 3.5
VIRTIO_NET_F_GUEST_ANNOUNCE is a new feature introduced since kernel
v3.5. For older kernels (or more precisely, old distributions), we
could simply define it manually, to fix the "macro not defined" error.
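A manual definition of this kind is typically a guarded fallback, as in
the sketch below. The value 21 is the feature bit number used by the Linux
virtio_net.h header; the has_feature() helper is hypothetical and only
shows how the bit would be tested.

```c
#include <stdint.h>

/* Fallback for kernel headers that predate the feature. */
#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
#define VIRTIO_NET_F_GUEST_ANNOUNCE 21
#endif

/* Hypothetical helper: test one feature bit in a negotiated mask. */
static int has_feature(uint64_t features, unsigned int bit)
{
    return (int)((features >> bit) & 1);
}
```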
Fixes:
d293dac8f30e ("vhost: claim support of guest announce")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Sergio Gonzalez Monroy [Fri, 11 Mar 2016 15:32:30 +0000 (15:32 +0000)]
examples: fix build dependencies
Building examples fails with CONFIG_RTE_LIBRTE_LPM=n.
The error is caused by the new app ipsec-secgw that gets built
without checking for configuration dependencies.
Fixes:
d299106e8e31 ("examples/ipsec-secgw: add IPsec sample application")
The patch also reorders a couple entries to maintain alphabetic order.
Reported-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Maciej Czekaj [Thu, 10 Mar 2016 16:06:22 +0000 (17:06 +0100)]
examples/l3fwd: fix ARM build
Enable NEON support in exact match mode.
The l3fwd example did not compile on ARM due to SSE2 intrinsics used
in the generic part.
Some intrinsics were used to initialize data structures and those were
replaced by ordinary structure initialization.
All SSE2 intrinsics used in forwarding, i.e. masking the IP/TCP header,
are moved to a single inline function and made arch-specific.
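The idea of a single arch-specific inline function can be sketched as
below. This is not the l3fwd code: mask16() is a hypothetical helper that
ANDs 16 bytes with a mask, using SSE2 or NEON intrinsics where the
compiler advertises them and a scalar loop otherwise.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#elif defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* out = in & mask over 16 bytes; callers stay architecture neutral. */
static inline void mask16(uint8_t out[16], const uint8_t in[16],
                          const uint8_t mask[16])
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);

    _mm_storeu_si128((__m128i *)out, _mm_and_si128(v, m));
#elif defined(__ARM_NEON)
    vst1q_u8(out, vandq_u8(vld1q_u8(in), vld1q_u8(mask)));
#else
    /* Portable fallback for other architectures. */
    for (int i = 0; i < 16; i++)
        out[i] = in[i] & mask[i];
#endif
}
```

The mask itself is plain data, so it can be set up with an ordinary array
or structure initializer instead of an intrinsic.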
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Jerin Jacob [Fri, 11 Mar 2016 03:52:59 +0000 (09:22 +0530)]
maintainers: claim responsibility for arm64 files of hash
Fixes:
f123e3d2ca92 ("hash: replace libc memcmp with optimized functions for arm64")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Jerin Jacob [Fri, 11 Mar 2016 03:52:58 +0000 (09:22 +0530)]
lpm/arm: support NEON
Enabled CONFIG_RTE_LIBRTE_LPM, CONFIG_RTE_LIBRTE_TABLE,
CONFIG_RTE_LIBRTE_PIPELINE libraries for arm and arm64
TABLE, PIPELINE libraries were disabled due to LPM library dependency.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Jerin Jacob [Fri, 11 Mar 2016 03:52:57 +0000 (09:22 +0530)]
lpm/x86: move SSE implementation to be architecture agnostic
-Used architecture agnostic xmm_t to represent 128 bit SIMD variable
-Introduced vect_* API abstraction in app/test to test rte_lpm_lookupx4
API in architecture agnostic way
-Moved rte_lpm_lookupx4 SSE implementation to architecture specific
rte_lpm_sse.h file to accommodate new rte_lpm_lookupx4 implementation
for a different architecture.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Olivier Matz [Fri, 11 Mar 2016 13:29:40 +0000 (14:29 +0100)]
mk: fix static build without crypto
If the experimental CONFIG_RTE_LIBRTE_CRYPTODEV is disabled, build of
any crypto pmds will fail because of the missing dependency. The commit
94288d645 fixes the issue when compiled with shared libraries but there
is still an issue at link time with static libs:
LD test
/usr/bin/ld: cannot find -lrte_pmd_null_crypto
collect2: error: ld returned 1 exit status
Only add the -l linker flags related to crypto PMDs if CRYPTODEV is
enabled.
Fixes:
94288d645 ("mk: fix build without crypto")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Pablo de Lara [Fri, 11 Mar 2016 00:02:51 +0000 (00:02 +0000)]
examples/l2fwd-crypto: discover capabilities
Crypto devices now have information about
which crypto operations they are capable of providing.
This patch makes the app use this information,
removing all hardcoded values.
The user now needs to create the virtual crypto devices
or bind the HW crypto devices, and the app will use
the ones capable of performing the crypto op specified
(the user can select between HW/SW through the command line).
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Pablo de Lara [Fri, 11 Mar 2016 00:02:50 +0000 (00:02 +0000)]
examples/l2fwd-crypto: add cipher/hash only cases
Added cipher-only, hash-only operation cases,
which will be supported in the future.
Also, only sets authentication and ciphering parameters
when needed.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Pablo de Lara [Fri, 11 Mar 2016 00:02:49 +0000 (00:02 +0000)]
examples/l2fwd-crypto: parse AAD parameter
So far, L2fwd crypto app could parse cipher, auth keys
and IV, but not AAD (additional authentication data).
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Pablo de Lara [Fri, 11 Mar 2016 00:02:48 +0000 (00:02 +0000)]
examples/l2fwd-crypto: parse key parameters
Implement key parsing functionality, so user can provide
auth and cipher keys, plus IV, from the command line.
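Key parsing of this kind could look roughly like the sketch below. The
colon-separated hex byte format ("00:01:ff") and the function name are
assumptions made for illustration; the commit message does not state the
exact command-line syntax.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "aa:bb:cc" into key[]; return the byte count, or -1 on error. */
static int parse_hex_key(const char *s, uint8_t *key, size_t max_len)
{
    size_t n = 0;

    while (*s != '\0') {
        char *end;
        unsigned long v;

        /* Reject overflow, and anything strtoul() would silently skip. */
        if (n == max_len || !isxdigit((unsigned char)*s))
            return -1;
        v = strtoul(s, &end, 16);
        if (end == s || v > 0xff)
            return -1;
        key[n++] = (uint8_t)v;
        if (*end == ':') {
            s = end + 1;
            if (*s == '\0')
                return -1; /* trailing separator */
        } else if (*end == '\0') {
            s = end;
        } else {
            return -1;
        }
    }
    return (int)n;
}
```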
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Pablo de Lara [Fri, 11 Mar 2016 00:02:47 +0000 (00:02 +0000)]
examples/l2fwd-crypto: update auth algo list
Updated authentication algorithm list:
- Added MD5_HMAC and SHA384_HMAC
- Removed SHA1, SHA224, SHA256
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Pablo de Lara [Fri, 11 Mar 2016 00:02:46 +0000 (00:02 +0000)]
examples/l2fwd-crypto: clean up
- Removed unnecessary blank lines
- Changed some variable types (longer)
- Removed commented code
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Sergio Gonzalez Monroy [Fri, 11 Mar 2016 02:12:40 +0000 (02:12 +0000)]
examples/ipsec-secgw: add IPsec sample application
Sample app implementing an IPsec Security Gateway.
The main goal of this app is to show the use of cryptodev framework
in a "real world" application.
Currently only static IPv4 ESP IPsec tunnels are supported, for the
following algorithms:
- Cipher: AES-CBC, NULL
- Authentication: HMAC-SHA1, NULL
Not supported:
- SA auto negotiation (No IKE implementation)
- chained mbufs
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Thu, 10 Mar 2016 17:14:09 +0000 (17:14 +0000)]
aesni_mb: remove parameters from config file
Parse the device parameters from rte_eal_vdev_init,
instead of the config file, so user can change the parameters
at runtime.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Declan Doherty [Fri, 11 Mar 2016 01:36:54 +0000 (01:36 +0000)]
cryptodev: add capabilities discovery
This patch adds a mechanism for discovery of crypto device features and supported
crypto operations and algorithms. It also provides a method for a crypto PMD to
publish any data range limitations it may have for the operations and algorithms
it supports.
The parameter feature_flags added to rte_cryptodev struct is used to capture
features such as operations supported (symmetric crypto, operation chaining etc)
as well as parameters such as whether the device is hardware accelerated or uses
SIMD instructions.
The capabilities parameter allows a PMD to define an array of supported operations
with any limitation which that implementation may have.
Finally the rte_cryptodev_info struct has been extended to allow retrieval of
these parameters using the existing rte_cryptodev_info_get() API.
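How an application might consume such information is sketched below. All
types, flag names and the helper are hypothetical stand-ins, not the
cryptodev definitions; the sketch assumes a capability array terminated by
a sentinel entry and a min/max/increment style key size range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature flags, one bit per device-wide feature. */
#define FF_SYMMETRIC_CRYPTO   (1ULL << 0)
#define FF_OPERATION_CHAINING (1ULL << 1)
#define FF_HW_ACCELERATED     (1ULL << 2)

enum algo { ALGO_LIST_END = 0, ALGO_AES_CBC, ALGO_SHA1_HMAC };

/* One supported algorithm with its key size limits, in bytes. */
struct capability {
    enum algo algo;
    uint16_t key_min, key_max, key_inc;
};

struct dev_info {
    uint64_t feature_flags;
    const struct capability *capabilities; /* ends with ALGO_LIST_END */
};

/* Return true if the device supports algo with the given key length. */
static bool dev_supports(const struct dev_info *info, enum algo algo,
                         uint16_t key_len)
{
    const struct capability *c;

    if (!(info->feature_flags & FF_SYMMETRIC_CRYPTO))
        return false;
    for (c = info->capabilities; c->algo != ALGO_LIST_END; c++) {
        if (c->algo != algo)
            continue;
        if (key_len < c->key_min || key_len > c->key_max)
            return false;
        if (c->key_inc == 0)
            return key_len == c->key_min;
        return (key_len - c->key_min) % c->key_inc == 0;
    }
    return false;
}
```

An application can run such a check against every probed device and pick
the first one that matches, instead of hardcoding a device type.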
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Panu Matilainen [Fri, 11 Mar 2016 09:13:48 +0000 (11:13 +0200)]
mk: fix build without crypto
If the experimental CONFIG_RTE_LIBRTE_CRYPTODEV is disabled,
build of any crypto pmds will fail because of the missing dependency.
This has been present for a while now but hidden until the addition
of null_crypto since all the other crypto pmds have been disabled
by default.
Conditionalize the entire drivers/crypto directory on
CONFIG_RTE_LIBRTE_CRYPTODEV to fix.
Fixes:
1703e94ac5ce ("qat: add driver for QuickAssist devices")
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Declan Doherty [Fri, 11 Mar 2016 01:04:10 +0000 (01:04 +0000)]
null_crypto: add driver for null crypto operations
This patch provides the implementation of a NULL crypto PMD, which supports
NULL cipher and NULL authentication operations, which can be chained together
as follows:
- Authentication Only
- Cipher Only
- Authentication then Cipher
- Cipher then Authentication
As this is a NULL operation device, the crypto operations which are submitted for
processing are not actually modified and are stored in a queue pair's processed
packets ring, ready for collection when rte_cryptodev_burst_dequeue() is called.
The patch also contains the related unit test functions to test the PMD's
supported operations.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Thomas Monjalon [Thu, 10 Mar 2016 22:35:37 +0000 (23:35 +0100)]
maintainers: add doc for crypto devices
Fixes:
1703e94ac5ce ("qat: add driver for QuickAssist devices")
Fixes:
924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Fiona Trahe [Fri, 5 Feb 2016 16:36:01 +0000 (16:36 +0000)]
maintainers: claim responsibility for Intel QuickAssist PMD
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: John Griffin <john.griffin@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>