+ testpmd> ddp add 0 ./gtp.pkgo,./backup.pkgo
+
+Delete a GTP profile and restore backup profile:
+
+.. code-block:: console
+
+ testpmd> ddp del 0 ./backup.pkgo
+
+Get loaded DDP package info list:
+
+.. code-block:: console
+
+ testpmd> ddp get list 0
+
+Display information about a GTP profile:
+
+.. code-block:: console
+
+ testpmd> ddp get info ./gtp.pkgo
+
+Input set configuration
+~~~~~~~~~~~~~~~~~~~~~~~
+The input set for any PCTYPE can be configured with a user defined configuration.
+For example, to use only the 48-bit prefix of the IPv6 source address for IPv6 TCP RSS:
+
+.. code-block:: console
+
+ testpmd> port config 0 pctype 43 hash_inset clear all
+ testpmd> port config 0 pctype 43 hash_inset set field 13
+ testpmd> port config 0 pctype 43 hash_inset set field 14
+ testpmd> port config 0 pctype 43 hash_inset set field 15
+
+Queue region configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The Intel® Ethernet 700 Series supports configuring queue regions for RSS in
+the PF, so that different traffic classes or different packet classification
+types can be separated into different queues in different queue regions.
+A command-line API is provided to configure queue regions for RSS. It parses
+parameters such as the region index, queue number, queue start index, user
+priority and traffic class. Depending on the commands entered, it calls i40e
+private APIs to set or flush the queue region configuration. As this feature
+is specific to i40e, only private APIs are used. The ``testpmd`` commands are
+shown below; for details please refer to :doc:`../testpmd_app_ug/index`.
+
+.. code-block:: console
+
+ testpmd> set port (port_id) queue-region region_id (value) \
+ queue_start_index (value) queue_num (value)
+ testpmd> set port (port_id) queue-region region_id (value) flowtype (value)
+ testpmd> set port (port_id) queue-region UP (value) region_id (value)
+ testpmd> set port (port_id) queue-region flush (on|off)
+ testpmd> show port (port_id) queue-region
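+
+For example, the following sequence (with illustrative values) creates one
+queue region of four queues, maps flow type 31 and user priority 0 to it,
+and then commits and displays the configuration:
+
+.. code-block:: console
+
+ testpmd> set port 0 queue-region region_id 0 queue_start_index 0 queue_num 4
+ testpmd> set port 0 queue-region region_id 0 flowtype 31
+ testpmd> set port 0 queue-region UP 0 region_id 0
+ testpmd> set port 0 queue-region flush on
+ testpmd> show port 0 queue-region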
+
+Generic flow API
+~~~~~~~~~~~~~~~~~~~
+
+- ``RSS Flow``
+
+ RSS Flow supports setting the hash input set and hash function, enabling
+ hashing, and configuring queue regions.
+ For example, to configure a queue region consisting of queues 0, 1, 2 and 3:
+
+ .. code-block:: console
+
+ testpmd> flow create 0 ingress pattern end actions rss types end \
+ queues 0 1 2 3 end / end
+
+ Enable hashing and set the input set for ipv4-tcp.
+
+ .. code-block:: console
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / tcp / end \
+ actions rss types ipv4-tcp l3-src-only end queues end / end
+
+ Enable symmetric hashing for flow type ipv4-tcp.
+
+ .. code-block:: console
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 / tcp / end \
+ actions rss types ipv4-tcp end queues end func symmetric_toeplitz / end
+
+ Set the hash function to simple XOR.
+
+ .. code-block:: console
+
+ testpmd> flow create 0 ingress pattern end actions rss types end \
+ queues end func simple_xor / end
+
+Limitations or Known issues
+---------------------------
+
+MPLS packet classification
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For firmware versions prior to 5.0, MPLS packets are not recognized by the NIC.
+The L2 Payload flow type in flow director can be used to classify MPLS packets
+by using a command in testpmd like::
+
+ testpmd> flow_director_filter 0 mode IP add flow l2_payload ether \
+ 0x8847 flexbytes () fwd pf queue <N> fd_id <M>
+
+With NIC firmware version 5.0 or greater, limited MPLS support is added:
+native MPLS (MPLS in Ethernet) skip is implemented, but no new packet type,
+classification or offload is possible. With this change, the L2 Payload flow
+type in flow director can no longer be used to classify MPLS packets as with
+previous firmware versions. Instead, the Ethertype filter can be used to
+classify MPLS packets with a testpmd command like::
+
+ testpmd> flow create 0 ingress pattern eth type is 0x8847 / end \
+ actions queue index <M> / end
+
+16 Byte RX Descriptor setting on DPDK VF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Currently the VF's RX descriptor mode is decided by the PF. There is no PF-VF
+interface for the VF to request an RX descriptor mode, nor an interface to
+notify the VF of its own RX descriptor mode.
+No available version of the Linux i40e kernel driver supports the 16 byte RX
+descriptor. Therefore, if the Linux i40e kernel driver is used as the host
+driver while the DPDK i40e PMD is used as the VF driver, DPDK cannot choose
+the 16 byte receive descriptor, because the RX descriptor is already set to
+32 bytes by the i40e kernel driver.
+If the Linux i40e driver supports the 16 byte RX descriptor in the future,
+users should make sure the DPDK VF uses the same RX descriptor mode, 16 byte
+or 32 byte, as the PF driver.
+
+The same rule applies to DPDK PF + DPDK VF: the PF and VF should use the same
+RX descriptor mode, otherwise VF RX will not work.
+
+Receive packets with Ethertype 0x88A8
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Due to a firmware limitation, the PF can receive packets with Ethertype 0x88A8
+only when floating VEB is disabled.
+
+Incorrect Rx statistics when packet is oversize
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When a packet exceeds the maximum frame size, it is dropped.
+However, the Rx statistics returned by ``rte_eth_stats_get`` incorrectly
+show it as received.
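+
+The behavior can be observed, for example, by sending an oversized frame and
+then checking the port statistics in testpmd:
+
+.. code-block:: console
+
+ testpmd> show port stats 0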
+
+RX/TX statistics may be incorrect when register overflowed
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The rx_bytes/tx_bytes statistics registers are 48 bits wide.
+Although the counters are extended to 64 bits on the software side,
+there is no way to detect whether an overflow occurred more than once.
+Therefore the rx_bytes/tx_bytes statistics are only correct when the
+statistics are updated at least once between two overflows.
+
+VF & TC max bandwidth setting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The per VF max bandwidth and per TC max bandwidth cannot be enabled in parallel.
+The behavior differs when handling per VF and per TC max bandwidth settings.
+When enabling per VF max bandwidth, the software checks whether per TC max
+bandwidth is enabled; if so, the setting fails.
+When enabling per TC max bandwidth, the software checks whether per VF max
+bandwidth is enabled; if so, it disables per VF max bandwidth and continues
+with the per TC max bandwidth setting.
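+
+For example, per VF max bandwidth can be configured from testpmd with the
+``set vf tx max-bandwidth`` command (the port ID, VF ID and 1000 Mbps value
+below are illustrative); the command fails if per TC max bandwidth is already
+enabled:
+
+.. code-block:: console
+
+ testpmd> set vf tx max-bandwidth 0 0 1000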
+
+TC TX scheduling mode setting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are two TX scheduling modes for TCs: round robin and strict priority.
+If a TC is set to strict priority mode, it can consume unlimited bandwidth.
+This means that if the application has set a max bandwidth for that TC, it
+has no effect.
+It is suggested to use strict priority mode only for a TC that is latency
+sensitive but does not consume much bandwidth.
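+
+For example, the i40e-specific testpmd command ``set tx strict-link-priority``
+can be used to select strict priority mode for a set of TCs (the TC bitmap
+below, selecting TC 0 only, is illustrative):
+
+.. code-block:: console
+
+ testpmd> set tx strict-link-priority 0 0x1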
+
+VF performance is impacted by PCI extended tag setting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To reach maximum NIC performance in the VF the PCI extended tag must be
+enabled. The DPDK i40e PF driver will set this feature during initialization,
+but the kernel PF driver does not. So when running traffic on a VF which is
+managed by the kernel PF driver, a significant NIC performance downgrade has
+been observed (for 64 byte packets, there is about 25% line-rate downgrade for
+a 25GbE device and about 35% for a 40GbE device).
+
+For kernel versions >= 4.11, the kernel's PCI driver will enable the extended
+tag if it detects that the device supports it, so by default this is not an
+issue. For older kernels, or when the PCI extended tag is disabled, it can be
+enabled using the steps below, followed by a worked example.
+
+#. Get the current value of the PCI configure register::
+
+ setpci -s <XX:XX.X> a8.w
+
+#. Set bit 8::
+
+ value = value | 0x100
+
+#. Set the PCI configure register with new value::
+
+ setpci -s <XX:XX.X> a8.w=<value>
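+
+For example, with an illustrative device address and register value: if
+``setpci -s 82:00.0 a8.w`` returns ``2810``, setting bit 8 gives ``2910``
+(``0x2810 | 0x100``), which is then written back::
+
+ setpci -s 82:00.0 a8.w=2910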
+
+Vlan strip of VF
+~~~~~~~~~~~~~~~~
+
+The VF VLAN strip function is only supported with i40e kernel driver version 2.1.26 or later.
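+
+When supported, VLAN stripping can then be enabled on the VF port from
+testpmd (the port number is illustrative):
+
+.. code-block:: console
+
+ testpmd> vlan set strip on 0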
+
+DCB function
+~~~~~~~~~~~~
+
+DCB works only when RSS is enabled.
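+
+For example, DCB with four TCs can be enabled on a stopped port from testpmd
+(the port number and TC count are illustrative):
+
+.. code-block:: console
+
+ testpmd> port stop 0
+ testpmd> port config 0 dcb vt off 4 pfc off
+ testpmd> port start 0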
+
+Global configuration warning
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The i40e PMD sets some global registers to enable certain functions or apply
+certain configuration. When different ports of the same NIC are used by the
+Linux kernel and by DPDK, the port driven by the Linux kernel will be impacted
+by the port driven by DPDK.
+For example, register I40E_GL_SWT_L2TAGCTRL is used to control the L2 tag, and
+the i40e PMD uses it to set the VLAN TPID. If the TPID is set on port A with
+DPDK, the configuration also impacts port B on the same NIC driven by the
+kernel driver, which may not want that TPID.
+Therefore the PMD reports a warning to clarify what is changed by writing a
+global register.
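+
+For example, setting the outer VLAN TPID from testpmd writes this global
+register and therefore also affects the other ports of the same NIC (the TPID
+value and port number are illustrative):
+
+.. code-block:: console
+
+ testpmd> vlan set outer tpid 0x88A8 0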
+
+Cloud Filter
+~~~~~~~~~~~~
+
+When programming cloud filters for IPv4/6_UDP/TCP/SCTP with SRC port only or
+DST port only, any cloud filter using inner_vlan or tunnel key becomes invalid.
+The default configuration can only be recovered by a NIC core reset.
+
+High Performance of Small Packets on 40GbE NIC
+----------------------------------------------
+
+As the latest firmware image may contain fixes that enhance performance,
+a firmware update may be needed to achieve high performance.
+Check the Intel support website for the latest firmware updates.
+Users should consult the release notes specific to a DPDK release to identify
+the validated firmware version for a NIC using the i40e driver.
+
+Use 16 Bytes RX Descriptor Size
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The i40e PMD supports both 16 and 32 byte RX descriptor sizes, and the 16 byte
+size can help achieve higher performance with small packets.
+In ``config/rte_config.h`` set the following to use 16 bytes size RX descriptors::
+
+ #define RTE_LIBRTE_I40E_16BYTE_RX_DESC 1
+
+Input set requirement of each pctype for FDIR
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Each PCTYPE can only have one specific FDIR input set at a time.
+For example, if two rte_flow rules are created with different input sets for
+one PCTYPE, the second rule fails with the message "Conflict with the first
+rule's input set", which means the current rule's input set conflicts with
+the first rule's.
+Remove the first rule if you want to change the input set of the PCTYPE.
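+
+For example, the following two rules (addresses are illustrative) use
+different input sets for the same ipv4-tcp PCTYPE, so the second rule is
+rejected:
+
+.. code-block:: console
+
+ testpmd> flow create 0 ingress pattern eth / ipv4 src is 192.168.0.1 / tcp / end \
+ actions queue index 1 / end
+ testpmd> flow create 0 ingress pattern eth / ipv4 dst is 192.168.0.2 / tcp / end \
+ actions queue index 2 / end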
+
+Example of getting best performance with l3fwd example
+------------------------------------------------------
+
+The following is an example of running the DPDK ``l3fwd`` sample application to get high performance with a
+server with Intel Xeon processors and Intel Ethernet CNA XL710.
+
+The example scenario is to get best performance with two Intel Ethernet CNA XL710 40GbE ports.
+See :numref:`figure_intel_perf_test_setup` for the performance test setup.
+
+.. _figure_intel_perf_test_setup:
+
+.. figure:: img/intel_perf_test_setup.*
+
+ Performance Test Setup
+
+
+1. Add two Intel Ethernet CNA XL710 cards to the platform, and use one port per card to get the best performance.
+ The reason for using two NICs is to overcome a PCIe v3.0 limitation: a single slot cannot provide 80GbE bandwidth
+ for two 40GbE ports, but two different PCIe v3.0 x8 slots can.
+ Referring to the sample NICs output above, we can select ``82:00.0`` and ``85:00.0`` as test ports::
+
+ 82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
+ 85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
+
+2. Connect the ports to the traffic generator. For high speed testing, it's best to use a hardware traffic generator.
+
+3. Check the NUMA node (socket ID) of the PCI devices and get the core numbers on that socket.
+ In this case, ``82:00.0`` and ``85:00.0`` are both on socket 1, and the cores on socket 1 in the referenced platform
+ are 18-35 and 54-71.
+ Note: Don't use two logical cores on the same physical core (e.g. core 18 has two logical cores, core 18 and core 54);
+ instead, use two logical cores from different physical cores (e.g. core 18 and core 19).
+
+4. Bind these two ports to igb_uio.
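+
+ For example, using the ``dpdk-devbind.py`` script with the ports selected above::
+
+ dpdk-devbind.py --bind=igb_uio 82:00.0 85:00.0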
+
+5. For an Intel Ethernet CNA XL710 40GbE port, at least two queue pairs are needed to achieve the best performance, so two queues per port
+ are required, and each queue pair needs a dedicated CPU core for receiving/transmitting packets.
+
+6. The DPDK sample application ``l3fwd`` will be used for performance testing, using two ports for bi-directional forwarding.
+ Compile the ``l3fwd`` sample with the default LPM mode.
+
+7. The command line of running l3fwd would be something like the following::
+
+ ./dpdk-l3fwd -l 18-21 -n 4 -w 82:00.0 -w 85:00.0 \
+ -- -p 0x3 --config '(0,0,18),(0,1,19),(1,0,20),(1,1,21)'
+
+ This means that the application uses core 18 for port 0, queue pair 0 forwarding, core 19 for port 0, queue pair 1 forwarding,
+ core 20 for port 1, queue pair 0 forwarding, and core 21 for port 1, queue pair 1 forwarding.
+
+8. Configure the traffic at a traffic generator.
+
+ * Start creating a stream on packet generator.
+
+ * Set the Ethernet II type to 0x0800.
+
+Tx bytes affected by the link status change
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~