+2. Use the CPU near local NUMA node to which the PCIe adapter is connected,
+ for better performance. For VMs, verify that the right CPU
+ and NUMA node are pinned according to the above. Run:
+
+ .. code-block:: console
+
+ lstopo-no-graphics
+
+ to identify the NUMA node to which the PCIe adapter is connected.
+
+3. If more than one adapter is used, and root complex capabilities allow
+ to put both adapters on the same NUMA node without PCI bandwidth degradation,
+ it is recommended to locate both adapters on the same NUMA node.
+ This in order to forward packets from one to the other without
+ NUMA performance penalty.
+
+4. Disable pause frames:
+
+ .. code-block:: console
+
+ ethtool -A <netdev> rx off tx off
+
+5. Verify IO non-posted prefetch is disabled by default. This can be checked
+ via the BIOS configuration. Please contact you server provider for more
+ information about the settings.
+
+.. note::
+
+ On some machines, depends on the machine integrator, it is beneficial
+ to set the PCI max read request parameter to 1K. This can be
+ done in the following way:
+
+ To query the read request size use:
+
+ .. code-block:: console
+
+ setpci -s <NIC PCI address> 68.w
+
+ If the output is different than 3XXX, set it by:
+
+ .. code-block:: console
+
+ setpci -s <NIC PCI address> 68.w=3XXX
+
+ The XXX can be different on different systems. Make sure to configure
+ according to the setpci output.
+
+6. To minimize overhead of searching Memory Regions:
+
+ - '--socket-mem' is recommended to pin memory by predictable amount.
+ - Configure per-lcore cache when creating Mempools for packet buffer.
+ - Refrain from dynamically allocating/freeing memory in run-time.