Interrupt Throttling interval.
-Driver Compilation
-~~~~~~~~~~~~~~~~~~
-
-To compile the I40E PMD see :ref:`Getting Started Guide for Linux <linux_gsg>` or
-:ref:`Getting Started Guide for FreeBSD <freebsd_gsg>` depending on your platform.
-
-
-Linux
------
-
-
-Running testpmd
-~~~~~~~~~~~~~~~
-
-This section demonstrates how to launch ``testpmd`` with Intel XL710/X710
-devices managed by ``librte_pmd_i40e`` in the Linux operating system.
-
-#. Load ``igb_uio`` or ``vfio-pci`` driver:
-
- .. code-block:: console
-
- modprobe uio
- insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
-
- or
-
- .. code-block:: console
-
- modprobe vfio-pci
-
-#. Bind the XL710/X710 adapters to ``igb_uio`` or ``vfio-pci`` loaded in the previous step:
-
- .. code-block:: console
-
- ./usertools/dpdk-devbind.py --bind igb_uio 0000:83:00.0
-
- Or setup VFIO permissions for regular users and then bind to ``vfio-pci``:
-
- .. code-block:: console
-
- ./usertools/dpdk-devbind.py --bind vfio-pci 0000:83:00.0
-
-#. Start ``testpmd`` with basic parameters:
-
- .. code-block:: console
-
- ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -w 83:00.0 -- -i
-
- Example output:
-
- .. code-block:: console
-
- ...
- EAL: PCI device 0000:83:00.0 on NUMA socket 1
- EAL: probe driver: 8086:1572 rte_i40e_pmd
- EAL: PCI memory mapped at 0x7f7f80000000
- EAL: PCI memory mapped at 0x7f7f80800000
- PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a
- Interactive-mode selected
- Configuring Port 0 (socket 0)
- ...
-
- PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are
- satisfied.Rx Burst Bulk Alloc function will be used on port=0, queue=0.
-
- ...
- Port 0: 68:05:CA:26:85:84
- Checking link statuses...
- Port 0 Link Up - speed 10000 Mbps - full-duplex
- Done
+Driver compilation and testing
+------------------------------
- testpmd>
+Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
+for details.
SR-IOV: Prerequisites and sample Application Notes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+--------------------------------------------------
#. Load the kernel module:
#. Assign VF to VM, and bring up the VM.
Please see the documentation for the *I40E/IXGBE/IGB Virtual Function Driver*.
+#. Running testpmd:
+
+ Follow instructions available in the document
+ :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
+ to run testpmd.
+
+ Example output:
+
+ .. code-block:: console
+
+ ...
+ EAL: PCI device 0000:83:00.0 on NUMA socket 1
+ EAL: probe driver: 8086:1572 rte_i40e_pmd
+ EAL: PCI memory mapped at 0x7f7f80000000
+ EAL: PCI memory mapped at 0x7f7f80800000
+ PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a
+ Interactive-mode selected
+ Configuring Port 0 (socket 0)
+ ...
+
+ PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are
+ satisfied.Rx Burst Bulk Alloc function will be used on port=0, queue=0.
+
+ ...
+ Port 0: 68:05:CA:26:85:84
+ Checking link statuses...
+ Port 0 Link Up - speed 10000 Mbps - full-duplex
+ Done
+
+ testpmd>
+
Sample Application Notes
------------------------
is to say, user should keep ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n`` in
config file.
-Link down with i40e kernel driver after DPDK application exit
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-After DPDK application quit, and the device is bound back to Linux i40e
-kernel driver, the link cannot be up after ``ifconfig <dev> up``.
-To work around this issue, ``ethtool -s <dev> autoneg on`` should be
-set first and then the link can be brought up through ``ifconfig <dev> up``.
-
-NOTE: requires Linux kernel i40e driver version >= 1.4.X
-
Receive packets with Ethertype 0x88A8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
effect.
It's suggested to set the strict priority mode for a TC that is latency
sensitive but no consuming much bandwidth.
+
+VF performance is impacted by PCI extended tag setting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To reach maximum NIC performance in the VF the PCI extended tag must be
+enabled. The DPDK I40E PF driver will set this feature during initialization,
+but the kernel PF driver does not. So when running traffic on a VF which is
+managed by the kernel PF driver, a significant NIC performance downgrade has
+been observed (for 64 byte packets, there is about 25% linerate downgrade for
+a 25G device and about 35% for a 40G device).
+
+For kernel version >= 4.11, the kernel's PCI driver will enable the extended
+tag if it detects that the device supports it. So by default, this is not an
+issue. For kernels <= 4.11 or when the PCI extended tag is disabled it can be
+enabled using the steps below.
+
+#. Get the current value of the PCI configure register::
+
+ setpci -s <XX:XX.X> a8.w
+
+#. Set bit 8::
+
+ value = value | 0x100
+
+#. Set the PCI configure register with new value::
+
+ setpci -s <XX:XX.X> a8.w=<value>
+
+High Performance of Small Packets on 40G NIC
+--------------------------------------------
+
+As there might be firmware fixes for performance enhancement in latest version
+of firmware image, the firmware update might be needed for getting high performance.
+Check with the local Intel's Network Division application engineers for firmware updates.
+Users should consult the release notes specific to a DPDK release to identify
+the validated firmware version for a NIC using the i40e driver.
+
+Use 16 Bytes RX Descriptor Size
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As i40e PMD supports both 16 and 32 bytes RX descriptor sizes, and 16 bytes size can provide helps to high performance of small packets.
+Configuration of ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` in config files can be changed to use 16 bytes size RX descriptors.
+
+High Performance and per Packet Latency Tradeoff
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Due to the hardware design, the interrupt signal inside NIC is needed for per
+packet descriptor write-back. The minimum interval of interrupts could be set
+at compile time by ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` in configuration files.
+Though there is a default configuration, the interval could be tuned by the
+users with that configuration item depends on what the user cares about more,
+performance or per packet latency.
+
+Example of getting best performance with l3fwd example
+------------------------------------------------------
+
+The following is an example of running the DPDK ``l3fwd`` sample application to get high performance with an
+Intel server platform and Intel XL710 NICs.
+
+The example scenario is to get best performance with two Intel XL710 40GbE ports.
+See :numref:`figure_intel_perf_test_setup` for the performance test setup.
+
+.. _figure_intel_perf_test_setup:
+
+.. figure:: img/intel_perf_test_setup.*
+
+ Performance Test Setup
+
+
+1. Add two Intel XL710 NICs to the platform, and use one port per card to get best performance.
+ The reason for using two NICs is to overcome a PCIe Gen3's limitation since it cannot provide 80G bandwidth
+ for two 40G ports, but two different PCIe Gen3 x8 slot can.
+ Refer to the sample NICs output above, then we can select ``82:00.0`` and ``85:00.0`` as test ports::
+
+ 82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
+ 85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
+
+2. Connect the ports to the traffic generator. For high speed testing, it's best to use a hardware traffic generator.
+
+3. Check the PCI devices numa node (socket id) and get the cores number on the exact socket id.
+ In this case, ``82:00.0`` and ``85:00.0`` are both in socket 1, and the cores on socket 1 in the referenced platform
+ are 18-35 and 54-71.
+ Note: Don't use 2 logical cores on the same core (e.g core18 has 2 logical cores, core18 and core54), instead, use 2 logical
+ cores from different cores (e.g core18 and core19).
+
+4. Bind these two ports to igb_uio.
+
+5. As to XL710 40G port, we need at least two queue pairs to achieve best performance, then two queues per port
+ will be required, and each queue pair will need a dedicated CPU core for receiving/transmitting packets.
+
+6. The DPDK sample application ``l3fwd`` will be used for performance testing, with using two ports for bi-directional forwarding.
+ Compile the ``l3fwd sample`` with the default lpm mode.
+
+7. The command line of running l3fwd would be something like the following::
+
+ ./l3fwd -l 18-21 -n 4 -w 82:00.0 -w 85:00.0 \
+ -- -p 0x3 --config '(0,0,18),(0,1,19),(1,0,20),(1,1,21)'
+
+ This means that the application uses core 18 for port 0, queue pair 0 forwarding, core 19 for port 0, queue pair 1 forwarding,
+ core 20 for port 1, queue pair 0 forwarding, and core 21 for port 1, queue pair 1 forwarding.
+
+8. Configure the traffic at a traffic generator.
+
+ * Start creating a stream on packet generator.
+
+ * Set the Ethernet II type to 0x0800.