**Tables**
-:ref:`Table 1. Packet Processing Pipeline Implementing QoS <pg_table_1>`
+:numref:`table_qos_1` :ref:`table_qos_1`
-:ref:`Table 2. Infrastructure Blocks Used by the Packet Processing Pipeline <pg_table_2>`
+:numref:`table_qos_2` :ref:`table_qos_2`
-:ref:`Table 3. Port Scheduling Hierarchy <pg_table_3>`
+:numref:`table_qos_3` :ref:`table_qos_3`
-:ref:`Table 4. Scheduler Internal Data Structures per Port <pg_table_4>`
+:numref:`table_qos_4` :ref:`table_qos_4`
-:ref:`Table 5. Ethernet Frame Overhead Fields <pg_table_5>`
+:numref:`table_qos_5` :ref:`table_qos_5`
-:ref:`Table 6. Token Bucket Generic Operations <pg_table_6>`
+:numref:`table_qos_6` :ref:`table_qos_6`
-:ref:`Table 7. Token Bucket Generic Parameters <pg_table_7>`
+:numref:`table_qos_7` :ref:`table_qos_7`
-:ref:`Table 8. Token Bucket Persistent Data Structure <pg_table_8>`
+:numref:`table_qos_8` :ref:`table_qos_8`
-:ref:`Table 9. Token Bucket Operations <pg_table_9>`
+:numref:`table_qos_9` :ref:`table_qos_9`
-:ref:`Table 10. Subport/Pipe Traffic Class Upper Limit Enforcement Persistent Data Structure <pg_table_10>`
+:numref:`table_qos_10` :ref:`table_qos_10`
-:ref:`Table 11. Subport/Pipe Traffic Class Upper Limit Enforcement Operations <pg_table_11>`
+:numref:`table_qos_11` :ref:`table_qos_11`
-:ref:`Table 12. Weighted Round Robin (WRR) <pg_table_12>`
+:numref:`table_qos_12` :ref:`table_qos_12`
-:ref:`Table 13. Subport Traffic Class Oversubscription <pg_table_13>`
+:numref:`table_qos_13` :ref:`table_qos_13`
-:ref:`Table 14. Watermark Propagation from Subport Level to Member Pipes at the Beginning of Each Traffic Class Upper Limit Enforcement Period <pg_table_14>`
+:numref:`table_qos_14` :ref:`table_qos_14`
-:ref:`Table 15. Watermark Calculation <pg_table_15>`
+:numref:`table_qos_15` :ref:`table_qos_15`
-:ref:`Table 16. RED Configuration Parameters <pg_table_16>`
+:numref:`table_qos_16` :ref:`table_qos_16`
-:ref:`Table 17. Relative Performance of Alternative Approaches <pg_table_17>`
+:numref:`table_qos_17` :ref:`table_qos_17`
-:ref:`Table 18. RED Configuration Corresponding to RED Configuration File <pg_table_18>`
+:numref:`table_qos_18` :ref:`table_qos_18`
-:ref:`Table 19. Port types <pg_table_19>`
+:numref:`table_qos_19` :ref:`table_qos_19`
-:ref:`Table 20. Port abstract interface <pg_table_20>`
+:numref:`table_qos_20` :ref:`table_qos_20`
-:ref:`Table 21. Table types <pg_table_21>`
+:numref:`table_qos_21` :ref:`table_qos_21`
-:ref:`Table 29. Table Abstract Interface <pg_table_29_1>`
+:numref:`table_qos_22` :ref:`table_qos_22`
-:ref:`Table 22. Configuration parameters common for all hash table types <pg_table_22>`
+:numref:`table_qos_23` :ref:`table_qos_23`
-:ref:`Table 23. Configuration parameters specific to extendable bucket hash table <pg_table_23>`
+:numref:`table_qos_24` :ref:`table_qos_24`
-:ref:`Table 24. Configuration parameters specific to pre-computed key signature hash table <pg_table_24>`
+:numref:`table_qos_25` :ref:`table_qos_25`
-:ref:`Table 25. The main large data structures (arrays) used for configurable key size hash tables <pg_table_25>`
+:numref:`table_qos_26` :ref:`table_qos_26`
-:ref:`Table 26. Field description for bucket array entry (configurable key size hash tables) <pg_table_26>`
+:numref:`table_qos_27` :ref:`table_qos_27`
-:ref:`Table 27. Description of the bucket search pipeline stages (configurable key size hash tables) <pg_table_27>`
+:numref:`table_qos_28` :ref:`table_qos_28`
-:ref:`Table 28. Lookup tables for match, match_many, match_pos <pg_table_28>`
+:numref:`table_qos_29` :ref:`table_qos_29`
-:ref:`Table 29. Collapsed lookup tables for match, match_many and match_pos <pg_table_29>`
+:numref:`table_qos_30` :ref:`table_qos_30`
-:ref:`Table 30. The main large data structures (arrays) used for 8-byte and 16-byte key size hash tables <pg_table_30>`
+:numref:`table_qos_31` :ref:`table_qos_31`
-:ref:`Table 31. Field description for bucket array entry (8-byte and 16-byte key hash tables) <pg_table_31>`
+:numref:`table_qos_32` :ref:`table_qos_32`
-:ref:`Table 32. Description of the bucket search pipeline stages (8-byte and 16-byte key hash tables) <pg_table_32>`
+:numref:`table_qos_33` :ref:`table_qos_33`
-:ref:`Table 33. Next hop actions (reserved) <pg_table_33>`
-
-:ref:`Table 34. User action examples <pg_table_34>`
+:numref:`table_qos_34` :ref:`table_qos_34`
Port Types
~~~~~~~~~~
-Table 19 is a non-exhaustive list of ports that can be implemented with the Packet Framework.
-
-.. _pg_table_19:
-
-**Table 19 Port Types**
-
-+---+------------------+---------------------------------------------------------------------------------------+
-| # | Port type | Description |
-| | | |
-+===+==================+=======================================================================================+
-| 1 | SW ring | SW circular buffer used for message passing between the application threads. Uses |
-| | | the DPDK rte_ring primitive. Expected to be the most commonly used type of |
-| | | port. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 2 | HW ring | Queue of buffer descriptors used to interact with NIC, switch or accelerator ports. |
-| | | For NIC ports, it uses the DPDK rte_eth_rx_queue or rte_eth_tx_queue |
-| | | primitives. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 3 | IP reassembly | Input packets are either IP fragments or complete IP datagrams. Output packets are |
-| | | complete IP datagrams. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 4 | IP fragmentation | Input packets are jumbo (IP datagrams with length bigger than MTU) or non-jumbo |
-| | | packets. Output packets are non-jumbo packets. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 5 | Traffic manager | Traffic manager attached to a specific NIC output port, performing congestion |
-| | | management and hierarchical scheduling according to pre-defined SLAs. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 6 | KNI | Send/receive packets to/from Linux kernel space. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 7 | Source | Input port used as packet generator. Similar to Linux kernel /dev/zero character |
-| | | device. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
-| 8 | Sink | Output port used to drop all input packets. Similar to Linux kernel /dev/null |
-| | | character device. |
-| | | |
-+---+------------------+---------------------------------------------------------------------------------------+
+:numref:`table_qos_19` is a non-exhaustive list of ports that can be implemented with the Packet Framework.
+
+.. _table_qos_19:
+
+.. table:: Port Types
+
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | # | Port type | Description |
+ | | | |
+ +===+==================+=======================================================================================+
+ | 1 | SW ring | SW circular buffer used for message passing between the application threads. Uses |
+ | | | the DPDK rte_ring primitive. Expected to be the most commonly used type of |
+ | | | port. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 2 | HW ring | Queue of buffer descriptors used to interact with NIC, switch or accelerator ports. |
+ | | | For NIC ports, it uses the DPDK rte_eth_rx_queue or rte_eth_tx_queue |
+ | | | primitives. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 3 | IP reassembly | Input packets are either IP fragments or complete IP datagrams. Output packets are |
+ | | | complete IP datagrams. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 4 | IP fragmentation | Input packets are jumbo (IP datagrams with length bigger than MTU) or non-jumbo |
+ | | | packets. Output packets are non-jumbo packets. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 5 | Traffic manager | Traffic manager attached to a specific NIC output port, performing congestion |
+ | | | management and hierarchical scheduling according to pre-defined SLAs. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 6 | KNI | Send/receive packets to/from Linux kernel space. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 7 | Source | Input port used as packet generator. Similar to Linux kernel /dev/zero character |
+ | | | device. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
+ | 8 | Sink | Output port used to drop all input packets. Similar to Linux kernel /dev/null |
+ | | | character device. |
+ | | | |
+ +---+------------------+---------------------------------------------------------------------------------------+
Port Interface
~~~~~~~~~~~~~~
defines the initialization and run-time operation of the port.
The port abstract interface is described in :numref:`table_qos_20`.
-.. _pg_table_20:
-
-**Table 20 Port Abstract Interface**
-
-+---+----------------+-----------------------------------------------------------------------------------------+
-| # | Port Operation | Description |
-| | | |
-+===+================+=========================================================================================+
-| 1 | Create | Create the low-level port object (e.g. queue). Can internally allocate memory. |
-| | | |
-+---+----------------+-----------------------------------------------------------------------------------------+
-| 2 | Free | Free the resources (e.g. memory) used by the low-level port object. |
-| | | |
-+---+----------------+-----------------------------------------------------------------------------------------+
-| 3 | RX | Read a burst of input packets. Non-blocking operation. Only defined for input ports. |
-| | | |
-+---+----------------+-----------------------------------------------------------------------------------------+
-| 4 | TX | Write a burst of input packets. Non-blocking operation. Only defined for output ports. |
-| | | |
-+---+----------------+-----------------------------------------------------------------------------------------+
-| 5 | Flush | Flush the output buffer. Only defined for output ports. |
-| | | |
-+---+----------------+-----------------------------------------------------------------------------------------+
+.. _table_qos_20:
+
+.. table:: Port Abstract Interface
+
+ +---+----------------+-----------------------------------------------------------------------------------------+
+ | # | Port Operation | Description |
+ | | | |
+ +===+================+=========================================================================================+
+ | 1 | Create | Create the low-level port object (e.g. queue). Can internally allocate memory. |
+ | | | |
+ +---+----------------+-----------------------------------------------------------------------------------------+
+ | 2 | Free | Free the resources (e.g. memory) used by the low-level port object. |
+ | | | |
+ +---+----------------+-----------------------------------------------------------------------------------------+
+ | 3 | RX | Read a burst of input packets. Non-blocking operation. Only defined for input ports. |
+ | | | |
+ +---+----------------+-----------------------------------------------------------------------------------------+
+ | 4 | TX | Write a burst of input packets. Non-blocking operation. Only defined for output ports. |
+ | | | |
+ +---+----------------+-----------------------------------------------------------------------------------------+
+ | 5 | Flush | Flush the output buffer. Only defined for output ports. |
+ | | | |
+ +---+----------------+-----------------------------------------------------------------------------------------+
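The five operations above map naturally onto a table of function pointers. The following is a minimal sketch in C, not the Packet Framework API: ``struct pkt``, ``struct port_ops``, the ``f_*`` member names and the toy sink port are hypothetical names chosen for illustration. It shows how an output-only port, such as the sink port type, could leave RX undefined while still providing TX and flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical packet type used only by this sketch. */
struct pkt {
    uint32_t id;
};

/* Hypothetical operations table mirroring the five abstract operations. */
struct port_ops {
    void *(*f_create)(uint32_t n_slots);
    void (*f_free)(void *port);
    /* Non-blocking burst read; only defined for input ports. */
    uint32_t (*f_rx)(void *port, struct pkt **pkts, uint32_t n_pkts);
    /* Non-blocking burst write; only defined for output ports. */
    uint32_t (*f_tx)(void *port, struct pkt **pkts, uint32_t n_pkts);
    /* Flush the output buffer; only defined for output ports. */
    void (*f_flush)(void *port);
};

/* Toy sink port: an output port that drops and counts every packet. */
struct sink_port {
    uint64_t n_dropped;
};

static void *sink_create(uint32_t n_slots)
{
    (void)n_slots; /* a sink needs no queue storage */
    return calloc(1, sizeof(struct sink_port));
}

static void sink_free(void *port)
{
    free(port);
}

static uint32_t sink_tx(void *port, struct pkt **pkts, uint32_t n_pkts)
{
    struct sink_port *p = port;

    assert(p != NULL);
    (void)pkts; /* packets are dropped, so they are never inspected */
    p->n_dropped += n_pkts;
    return n_pkts; /* every packet is "accepted" */
}

static void sink_flush(void *port)
{
    (void)port; /* nothing is buffered, so there is nothing to flush */
}

static const struct port_ops sink_ops = {
    .f_create = sink_create,
    .f_free = sink_free,
    .f_rx = NULL, /* RX is not defined for an output port */
    .f_tx = sink_tx,
    .f_flush = sink_flush,
};
```

A caller would check that ``f_rx`` or ``f_tx`` is non-NULL before using a port as an input or an output, which keeps the "only defined for" rules from the table explicit in code.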
Table Library Design
--------------------
Table Types
~~~~~~~~~~~
-.. _pg_table_21:
-
-Table 21 is a non-exhaustive list of types of tables that can be implemented with the Packet Framework.
-
-**Table 21 Table Types**
-
-+---+----------------------------+-----------------------------------------------------------------------------+
-| # | Table Type | Description |
-| | | |
-+===+============================+=============================================================================+
-| 1 | Hash table | Lookup key is n-tuple based. |
-| | | |
-| | | Typically, the lookup key is hashed to produce a signature that is used to |
-| | | identify a bucket of entries where the lookup key is searched next. |
-| | | |
-| | | The signature associated with the lookup key of each input packet is either |
-| | | read from the packet descriptor (pre-computed signature) or computed at |
-| | | table lookup time. |
-| | | |
-| | | The table lookup, add entry and delete entry operations, as well as any |
-| | | other pipeline block that pre-computes the signature all have to use the |
-| | | same hashing algorithm to generate the signature. |
-| | | |
-| | | Typically used to implement flow classification tables, ARP caches, routing |
-| | | table for tunnelling protocols, etc. |
-| | | |
-+---+----------------------------+-----------------------------------------------------------------------------+
-| 2 | Longest Prefix Match (LPM) | Lookup key is the IP address. |
-| | | |
-| | | Each table entries has an associated IP prefix (IP and depth). |
-| | | |
-| | | The table lookup operation selects the IP prefix that is matched by the |
-| | | lookup key; in case of multiple matches, the entry with the longest prefix |
-| | | depth wins. |
-| | | |
-| | | Typically used to implement IP routing tables. |
-| | | |
-+---+----------------------------+-----------------------------------------------------------------------------+
-| 3 | Access Control List (ACLs) | Lookup key is 7-tuple of two VLAN/MPLS labels, IP destination address, |
-| | | IP source addresses, L4 protocol, L4 destination port, L4 source port. |
-| | | |
-| | | Each table entry has an associated ACL and priority. The ACL contains bit |
-| | | masks for the VLAN/MPLS labels, IP prefix for IP destination address, IP |
-| | | prefix for IP source addresses, L4 protocol and bitmask, L4 destination |
-| | | port and bit mask, L4 source port and bit mask. |
-| | | |
-| | | The table lookup operation selects the ACL that is matched by the lookup |
-| | | key; in case of multiple matches, the entry with the highest priority wins. |
-| | | |
-| | | Typically used to implement rule databases for firewalls, etc. |
-| | | |
-+---+----------------------------+-----------------------------------------------------------------------------+
-| 4 | Pattern matching search | Lookup key is the packet payload. |
-| | | |
-| | | Table is a database of patterns, with each pattern having a priority |
-| | | assigned. |
-| | | |
-| | | The table lookup operation selects the patterns that is matched by the |
-| | | input packet; in case of multiple matches, the matching pattern with the |
-| | | highest priority wins. |
-| | | |
-+---+----------------------------+-----------------------------------------------------------------------------+
-| 5 | Array | Lookup key is the table entry index itself. |
-| | | |
-+---+----------------------------+-----------------------------------------------------------------------------+
+:numref:`table_qos_21` is a non-exhaustive list of types of tables that can be implemented with the Packet Framework.
+
+.. _table_qos_21:
+
+.. table:: Table Types
+
+ +---+----------------------------+-----------------------------------------------------------------------------+
+ | # | Table Type | Description |
+ | | | |
+ +===+============================+=============================================================================+
+ | 1 | Hash table | Lookup key is n-tuple based. |
+ | | | |
+ | | | Typically, the lookup key is hashed to produce a signature that is used to |
+ | | | identify a bucket of entries where the lookup key is searched next. |
+ | | | |
+ | | | The signature associated with the lookup key of each input packet is either |
+ | | | read from the packet descriptor (pre-computed signature) or computed at |
+ | | | table lookup time. |
+ | | | |
+ | | | The table lookup, add entry and delete entry operations, as well as any |
+ | | | other pipeline block that pre-computes the signature all have to use the |
+ | | | same hashing algorithm to generate the signature. |
+ | | | |
+ | | | Typically used to implement flow classification tables, ARP caches, routing |
+ | | | table for tunnelling protocols, etc. |
+ | | | |
+ +---+----------------------------+-----------------------------------------------------------------------------+
+ | 2 | Longest Prefix Match (LPM) | Lookup key is the IP address. |
+ | | | |
+ | | | Each table entry has an associated IP prefix (IP and depth). |
+ | | | |
+ | | | The table lookup operation selects the IP prefix that is matched by the |
+ | | | lookup key; in case of multiple matches, the entry with the longest prefix |
+ | | | depth wins. |
+ | | | |
+ | | | Typically used to implement IP routing tables. |
+ | | | |
+ +---+----------------------------+-----------------------------------------------------------------------------+
+ | 3 | Access Control List (ACL) | Lookup key is a 7-tuple of two VLAN/MPLS labels, IP destination address, |
+ | | | IP source address, L4 protocol, L4 destination port, L4 source port. |
+ | | | |
+ | | | Each table entry has an associated ACL and priority. The ACL contains bit |
+ | | | masks for the VLAN/MPLS labels, IP prefix for IP destination address, IP |
+ | | | prefix for IP source address, L4 protocol and bit mask, L4 destination |
+ | | | port and bit mask, L4 source port and bit mask. |
+ | | | |
+ | | | The table lookup operation selects the ACL that is matched by the lookup |
+ | | | key; in case of multiple matches, the entry with the highest priority wins. |
+ | | | |
+ | | | Typically used to implement rule databases for firewalls, etc. |
+ | | | |
+ +---+----------------------------+-----------------------------------------------------------------------------+
+ | 4 | Pattern matching search | Lookup key is the packet payload. |
+ | | | |
+ | | | Table is a database of patterns, with each pattern having a priority |
+ | | | assigned. |
+ | | | |
+ | | | The table lookup operation selects the pattern that is matched by the |
+ | | | input packet; in case of multiple matches, the matching pattern with the |
+ | | | highest priority wins. |
+ | | | |
+ +---+----------------------------+-----------------------------------------------------------------------------+
+ | 5 | Array | Lookup key is the table entry index itself. |
+ | | | |
+ +---+----------------------------+-----------------------------------------------------------------------------+
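To make the LPM row above concrete, the following C sketch selects, among all prefixes matched by an IPv4 lookup key, the entry with the longest prefix depth. It is a deliberately naive linear scan written for illustration and makes no claim about how the Packet Framework implements LPM tables. ``struct lpm_entry``, ``prefix_mask`` and ``lpm_lookup`` are hypothetical names, and addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: an IP prefix (IP and depth) plus some user data. */
struct lpm_entry {
    uint32_t ip;       /* prefix address, host byte order (assumption) */
    uint8_t depth;     /* prefix length in bits, 0 .. 32 */
    uint32_t next_hop; /* illustrative payload */
};

/* Build the network mask for a given prefix depth. */
static uint32_t prefix_mask(uint8_t depth)
{
    assert(depth <= 32);
    /* Depth 0 is handled separately because shifting by 32 is undefined. */
    return depth == 0 ? 0 : 0xFFFFFFFFu << (32 - depth);
}

/*
 * Return the index of the matching entry with the longest prefix depth,
 * or -1 when no entry matches the key.
 */
static int lpm_lookup(const struct lpm_entry *table, size_t n_entries,
                      uint32_t key)
{
    int best = -1;

    for (size_t i = 0; i < n_entries; i++) {
        uint32_t mask = prefix_mask(table[i].depth);

        if ((key & mask) != (table[i].ip & mask))
            continue; /* prefix not matched by the lookup key */
        if (best < 0 || table[i].depth > table[(size_t)best].depth)
            best = (int)i; /* longer prefix wins */
    }
    return best;
}
```

With entries for 10.0.0.0/8 and 10.1.0.0/16, a lookup of 10.1.2.3 matches both prefixes and the /16 entry is returned, since it has the longer depth.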
Table Interface
~~~~~~~~~~~~~~~
Each table is required to implement an abstract interface that defines the initialization
and run-time operation of the table.
-The table abstract interface is described in Table 29.
-
-.. _pg_table_29_1:
-
-**Table 29 Table Abstract Interface**
-
-+---+-----------------+----------------------------------------------------------------------------------------+
-| # | Table operation | Description |
-| | | |
-+===+=================+========================================================================================+
-| 1 | Create | Create the low-level data structures of the lookup table. Can internally allocate |
-| | | memory. |
-| | | |
-+---+-----------------+----------------------------------------------------------------------------------------+
-| 2 | Free | Free up all the resources used by the lookup table. |
-| | | |
-+---+-----------------+----------------------------------------------------------------------------------------+
-| 3 | Add entry | Add new entry to the lookup table. |
-| | | |
-+---+-----------------+----------------------------------------------------------------------------------------+
-| 4 | Delete entry | Delete specific entry from the lookup table. |
-| | | |
-+---+-----------------+----------------------------------------------------------------------------------------+
-| 5 | Lookup | Look up a burst of input packets and return a bit mask specifying the result of the |
-| | | lookup operation for each packet: a set bit signifies lookup hit for the corresponding |
-| | | packet, while a cleared bit a lookup miss. |
-| | | |
-| | | For each lookup hit packet, the lookup operation also returns a pointer to the table |
-| | | entry that was hit, which contains the actions to be applied on the packet and any |
-| | | associated metadata. |
-| | | |
-| | | For each lookup miss packet, the actions to be applied on the packet and any |
-| | | associated metadata are specified by the default table entry preconfigured for lookup |
-| | | miss. |
-| | | |
-+---+-----------------+----------------------------------------------------------------------------------------+
+The table abstract interface is described in :numref:`table_qos_29_1`.
+
+.. _table_qos_29_1:
+
+.. table:: Table Abstract Interface
+
+ +---+-----------------+----------------------------------------------------------------------------------------+
+ | # | Table operation | Description |
+ | | | |
+ +===+=================+========================================================================================+
+ | 1 | Create | Create the low-level data structures of the lookup table. Can internally allocate |
+ | | | memory. |
+ | | | |
+ +---+-----------------+----------------------------------------------------------------------------------------+
+ | 2 | Free | Free up all the resources used by the lookup table. |
+ | | | |
+ +---+-----------------+----------------------------------------------------------------------------------------+
+ | 3 | Add entry | Add new entry to the lookup table. |
+ | | | |
+ +---+-----------------+----------------------------------------------------------------------------------------+
+ | 4 | Delete entry | Delete specific entry from the lookup table. |
+ | | | |
+ +---+-----------------+----------------------------------------------------------------------------------------+
+ | 5 | Lookup | Look up a burst of input packets and return a bit mask specifying the result of the |
+ | | | lookup operation for each packet: a set bit signifies lookup hit for the corresponding |
+ | | | packet, while a cleared bit signifies a lookup miss. |
+ | | | |
+ | | | For each lookup hit packet, the lookup operation also returns a pointer to the table |
+ | | | entry that was hit, which contains the actions to be applied on the packet and any |
+ | | | associated metadata. |
+ | | | |
+ | | | For each lookup miss packet, the actions to be applied on the packet and any |
+ | | | associated metadata are specified by the default table entry preconfigured for lookup |
+ | | | miss. |
+ | | | |
+ +---+-----------------+----------------------------------------------------------------------------------------+
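The burst lookup contract described above (one result bit per packet, an entry pointer for each hit, and a default entry for misses) can be sketched in C as follows. This is an illustrative sketch, not the Packet Framework API. It uses an array-style table in which the lookup key is the entry index, and ``struct entry``, ``struct array_table`` and ``table_lookup`` are hypothetical names. Returning the default entry pointer for misses in the same output array is a simplification chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BURST_MAX 64 /* one bit per packet in a 64-bit result mask */

/* Hypothetical table entry holding the action to apply to the packet. */
struct entry {
    uint32_t action;
};

/* Hypothetical array table: the lookup key is the entry index itself. */
struct array_table {
    const struct entry *entries;
    const uint8_t *valid;       /* non-zero when the entry is populated */
    uint32_t n_entries;
    struct entry default_entry; /* preconfigured for lookup miss */
};

/*
 * Look up a burst of keys. Bit i of the returned mask is set on a hit for
 * packet i and cleared on a miss. out[i] points to the entry that was hit,
 * or to the default entry on a miss.
 */
static uint64_t table_lookup(const struct array_table *t,
                             const uint32_t *keys, uint32_t n_pkts,
                             const struct entry **out)
{
    uint64_t hit_mask = 0;

    assert(n_pkts <= BURST_MAX);
    for (uint32_t i = 0; i < n_pkts; i++) {
        uint32_t k = keys[i];

        if (k < t->n_entries && t->valid[k]) {
            hit_mask |= UINT64_C(1) << i;
            out[i] = &t->entries[k];
        } else {
            out[i] = &t->default_entry;
        }
    }
    return hit_mask;
}
```

Returning a bit mask lets the caller split the burst into hit and miss packets with a few bitwise operations instead of a per-packet status array.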
Hash Table Design
Hash Table Types
^^^^^^^^^^^^^^^^
-.. _pg_table_22:
-
-Table 22 lists the hash table configuration parameters shared by all different hash table types.
-
-**Table 22 Configuration Parameters Common for All Hash Table Types**
-
-+---+---------------------------+------------------------------------------------------------------------------+
-| # | Parameter | Details |
-| | | |
-+===+===========================+==============================================================================+
-| 1 | Key size | Measured as number of bytes. All keys have the same size. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 2 | Key value (key data) size | Measured as number of bytes. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 3 | Number of buckets | Needs to be a power of two. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 4 | Maximum number of keys | Needs to be a power of two. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 5 | Hash function | Examples: jhash, CRC hash, etc. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 6 | Hash function seed | Parameter to be passed to the hash function. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 7 | Key offset | Offset of the lookup key byte array within the packet meta-data stored in |
-| | | the packet buffer. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
+:numref:`table_qos_22` lists the hash table configuration parameters shared by all hash table types.
+
+.. _table_qos_22:
+
+.. table:: Configuration Parameters Common for All Hash Table Types
+
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | # | Parameter | Details |
+ | | | |
+ +===+===========================+==============================================================================+
+ | 1 | Key size | Measured as number of bytes. All keys have the same size. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | 2 | Key value (key data) size | Measured as number of bytes. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | 3 | Number of buckets | Needs to be a power of two. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | 4 | Maximum number of keys | Needs to be a power of two. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | 5 | Hash function | Examples: jhash, CRC hash, etc. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | 6 | Hash function seed | Parameter to be passed to the hash function. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
+ | 7 | Key offset | Offset of the lookup key byte array within the packet meta-data stored in |
+ | | | the packet buffer. |
+ | | | |
+ +---+---------------------------+------------------------------------------------------------------------------+
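One practical consequence of requiring a power-of-two number of buckets is that a key signature can be reduced to a bucket index with a single bitwise AND rather than a modulo. The C sketch below illustrates this together with a basic check of the common parameters. ``struct hash_params``, ``hash_params_check`` and ``bucket_index`` are hypothetical names, and mapping the signature to a bucket by masking its low bits is an assumption made for this example, not a statement about the library's internals.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the common hash table parameters. */
struct hash_params {
    uint32_t key_size;   /* bytes; all keys have the same size */
    uint32_t entry_size; /* key value (key data) size in bytes */
    uint32_t n_buckets;  /* must be a power of two */
    uint32_t n_keys;     /* maximum number of keys; must be a power of two */
    uint64_t seed;       /* passed to the hash function */
    uint32_t key_offset; /* offset of the key within the packet meta-data */
};

static int is_power_of_two(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Return non-zero when the parameters satisfy the constraints listed above. */
static int hash_params_check(const struct hash_params *p)
{
    return p->key_size > 0 &&
           is_power_of_two(p->n_buckets) &&
           is_power_of_two(p->n_keys);
}

/*
 * Map a key signature to a bucket index. Because n_buckets is a power of
 * two, masking with (n_buckets - 1) is equivalent to signature % n_buckets.
 */
static uint32_t bucket_index(const struct hash_params *p, uint64_t signature)
{
    assert(is_power_of_two(p->n_buckets));
    return (uint32_t)(signature & (p->n_buckets - 1));
}
```

For example, with 1024 buckets the signature ``0x12345678`` maps to bucket ``0x278``, which is simply its low 10 bits.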
Bucket Full Problem
"""""""""""""""""""
the search continues beyond the first group of 4 keys, potentially until all keys in this bucket are examined.
The extendable bucket logic requires maintaining specific data structures per table and per each bucket.
-.. _pg_table_23:
+.. _table_qos_23:
-**Table 23 Configuration Parameters Specific to Extendable Bucket Hash Table**
+.. table:: Configuration Parameters Specific to Extendable Bucket Hash Table
-+---+---------------------------+--------------------------------------------------+
-| # | Parameter | Details |
-| | | |
-+===+===========================+==================================================+
-| 1 | Number of additional keys | Needs to be a power of two, at least equal to 4. |
-| | | |
-+---+---------------------------+--------------------------------------------------+
+ +---+---------------------------+--------------------------------------------------+
+ | # | Parameter | Details |
+ | | | |
+ +===+===========================+==================================================+
+ | 1 | Number of additional keys | Needs to be a power of two, at least equal to 4. |
+ | | | |
+ +---+---------------------------+--------------------------------------------------+
Signature Computation
The same CPU core reads the key from the packet meta-data, uses it to compute the key signature
and also performs the bucket search step of the key lookup operation.
-.. _pg_table_24:
+.. _table_qos_24:
-**Table 24 Configuration Parameters Specific to Pre-computed Key Signature Hash Table**
+.. table:: Configuration Parameters Specific to Pre-computed Key Signature Hash Table
-+---+------------------+-----------------------------------------------------------------------+
-| # | Parameter | Details |
-| | | |
-+===+==================+=======================================================================+
-| 1 | Signature offset | Offset of the pre-computed key signature within the packet meta-data. |
-| | | |
-+---+------------------+-----------------------------------------------------------------------+
+ +---+------------------+-----------------------------------------------------------------------+
+ | # | Parameter | Details |
+ | | | |
+ +===+==================+=======================================================================+
+ | 1 | Signature offset | Offset of the pre-computed key signature within the packet meta-data. |
+ | | | |
+ +---+------------------+-----------------------------------------------------------------------+
Key Size Optimized Hash Tables
""""""""""""""""""""""""""""""
Configurable Key Size Hash Table
""""""""""""""""""""""""""""""""
-:numref:`figure_figure34`, Table 25 and Table 26 detail the main data structures used to implement configurable key size hash tables (either LRU or extendable bucket,
+:numref:`figure_figure34`, :numref:`table_qos_25` and :numref:`table_qos_26` detail the main data structures used to implement configurable key size hash tables (either LRU or extendable bucket,
either with pre-computed signature or "do-sig").
.. _figure_figure34:
Data Structures for Configurable Key Size Hash Tables
-.. _pg_table_25:
-
-**Table 25 Main Large Data Structures (Arrays) used for Configurable Key Size Hash Tables**
-
-+---+-------------------------+------------------------------+---------------------------+-------------------------------+
-| # | Array name | Number of entries | Entry size (bytes) | Description |
-| | | | | |
-+===+=========================+==============================+===========================+===============================+
-| 1 | Bucket array | n_buckets (configurable) | 32 | Buckets of the hash table. |
-| | | | | |
-+---+-------------------------+------------------------------+---------------------------+-------------------------------+
-| 2 | Bucket extensions array | n_buckets_ext (configurable) | 32 | This array is only created |
-| | | | | for extendable bucket tables. |
-| | | | | |
-+---+-------------------------+------------------------------+---------------------------+-------------------------------+
-| 3 | Key array | n_keys | key_size (configurable) | Keys added to the hash table. |
-| | | | | |
-+---+-------------------------+------------------------------+---------------------------+-------------------------------+
-| 4 | Data array | n_keys | entry_size (configurable) | Key values (key data) |
-| | | | | associated with the hash |
-| | | | | table keys. |
-| | | | | |
-+---+-------------------------+------------------------------+---------------------------+-------------------------------+
-
-.. _pg_table_26:
-
-**Table 26 Field Description for Bucket Array Entry (Configurable Key Size Hash Tables)**
-
-+---+------------------+--------------------+------------------------------------------------------------------+
-| # | Field name | Field size (bytes) | Description |
-| | | | |
-+===+==================+====================+==================================================================+
-| 1 | Next Ptr/LRU | 8 | For LRU tables, this fields represents the LRU list for the |
-| | | | current bucket stored as array of 4 entries of 2 bytes each. |
-| | | | Entry 0 stores the index (0 .. 3) of the MRU key, while entry 3 |
-| | | | stores the index of the LRU key. |
-| | | | |
-| | | | For extendable bucket tables, this field represents the next |
-| | | | pointer (i.e. the pointer to the next group of 4 keys linked to |
-| | | | the current bucket). The next pointer is not NULL if the bucket |
-| | | | is currently extended or NULL otherwise. |
-| | | | To help the branchless implementation, bit 0 (least significant |
-| | | | bit) of this field is set to 1 if the next pointer is not NULL |
-| | | | and to 0 otherwise. |
-| | | | |
-+---+------------------+--------------------+------------------------------------------------------------------+
-| 2 | Sig[0 .. 3] | 4 x 2 | If key X (X = 0 .. 3) is valid, then sig X bits 15 .. 1 store |
-| | | | the most significant 15 bits of key X signature and sig X bit 0 |
-| | | | is set to 1. |
-| | | | |
-| | | | If key X is not valid, then sig X is set to zero. |
-| | | | |
-+---+------------------+--------------------+------------------------------------------------------------------+
-| 3 | Key Pos [0 .. 3] | 4 x 4 | If key X is valid (X = 0 .. 3), then Key Pos X represents the |
-| | | | index into the key array where key X is stored, as well as the |
-| | | | index into the data array where the value associated with key X |
-| | | | is stored. |
-| | | | |
-| | | | If key X is not valid, then the value of Key Pos X is undefined. |
-| | | | |
-+---+------------------+--------------------+------------------------------------------------------------------+
-
-
-:numref:`figure_figure35` and Table 27 detail the bucket search pipeline stages (either LRU or extendable bucket,
+.. _table_qos_25:
+
+.. table:: Main Large Data Structures (Arrays) used for Configurable Key Size Hash Tables
+
+   +---+-------------------------+------------------------------+---------------------------+-------------------------------+
+   | # | Array name              | Number of entries            | Entry size (bytes)        | Description                   |
+   |   |                         |                              |                           |                               |
+   +===+=========================+==============================+===========================+===============================+
+   | 1 | Bucket array            | n_buckets (configurable)     | 32                        | Buckets of the hash table.    |
+   |   |                         |                              |                           |                               |
+   +---+-------------------------+------------------------------+---------------------------+-------------------------------+
+   | 2 | Bucket extensions array | n_buckets_ext (configurable) | 32                        | This array is only created    |
+   |   |                         |                              |                           | for extendible bucket tables. |
+   |   |                         |                              |                           |                               |
+   +---+-------------------------+------------------------------+---------------------------+-------------------------------+
+   | 3 | Key array               | n_keys                       | key_size (configurable)   | Keys added to the hash table. |
+   |   |                         |                              |                           |                               |
+   +---+-------------------------+------------------------------+---------------------------+-------------------------------+
+   | 4 | Data array              | n_keys                       | entry_size (configurable) | Key values (key data)         |
+   |   |                         |                              |                           | associated with the hash      |
+   |   |                         |                              |                           | table keys.                   |
+   |   |                         |                              |                           |                               |
+   +---+-------------------------+------------------------------+---------------------------+-------------------------------+
+
+.. _table_qos_26:
+
+.. table:: Field Description for Bucket Array Entry (Configurable Key Size Hash Tables)
+
+   +---+------------------+--------------------+------------------------------------------------------------------+
+   | # | Field name       | Field size (bytes) | Description                                                      |
+   |   |                  |                    |                                                                  |
+   +===+==================+====================+==================================================================+
+   | 1 | Next Ptr/LRU     | 8                  | For LRU tables, this field represents the LRU list for the       |
+   |   |                  |                    | current bucket stored as an array of 4 entries of 2 bytes each.  |
+   |   |                  |                    | Entry 0 stores the index (0 .. 3) of the MRU key, while entry 3  |
+   |   |                  |                    | stores the index of the LRU key.                                 |
+   |   |                  |                    |                                                                  |
+   |   |                  |                    | For extendible bucket tables, this field represents the next     |
+   |   |                  |                    | pointer (i.e. the pointer to the next group of 4 keys linked to  |
+   |   |                  |                    | the current bucket). The next pointer is not NULL if the bucket  |
+   |   |                  |                    | is currently extended or NULL otherwise.                         |
+   |   |                  |                    | To help the branchless implementation, bit 0 (least significant  |
+   |   |                  |                    | bit) of this field is set to 1 if the next pointer is not NULL   |
+   |   |                  |                    | and to 0 otherwise.                                              |
+   |   |                  |                    |                                                                  |
+   +---+------------------+--------------------+------------------------------------------------------------------+
+   | 2 | Sig[0 .. 3]      | 4 x 2              | If key X (X = 0 .. 3) is valid, then sig X bits 15 .. 1 store    |
+   |   |                  |                    | the most significant 15 bits of key X signature and sig X bit 0  |
+   |   |                  |                    | is set to 1.                                                     |
+   |   |                  |                    |                                                                  |
+   |   |                  |                    | If key X is not valid, then sig X is set to zero.                |
+   |   |                  |                    |                                                                  |
+   +---+------------------+--------------------+------------------------------------------------------------------+
+   | 3 | Key Pos [0 .. 3] | 4 x 4              | If key X is valid (X = 0 .. 3), then Key Pos X represents the    |
+   |   |                  |                    | index into the key array where key X is stored, as well as the   |
+   |   |                  |                    | index into the data array where the value associated with key X  |
+   |   |                  |                    | is stored.                                                       |
+   |   |                  |                    |                                                                  |
+   |   |                  |                    | If key X is not valid, then the value of Key Pos X is undefined. |
+   |   |                  |                    |                                                                  |
+   +---+------------------+--------------------+------------------------------------------------------------------+
+
+
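The 32-byte bucket entry described above can be pictured as a C structure. The following is only a sketch under stated assumptions: the type name ``bucket_entry``, the member names and the helper ``sig_encode`` are hypothetical and are not taken from the library, and the key signature is assumed to be 16 bits wide before the valid flag is applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout mirroring the three fields of one bucket array entry. */
struct bucket_entry {
    union {
        uint64_t next_tagged; /* extendible bucket: next pointer, bit 0 = "not NULL" flag */
        uint16_t lru[4];      /* LRU: lru[0] = index of MRU key, lru[3] = index of LRU key */
    } link;                   /* "Next Ptr/LRU", 8 bytes */
    uint16_t sig[4];          /* bits 15..1 = signature bits, bit 0 = valid; 0 = empty slot */
    uint32_t key_pos[4];      /* index into both the key array and the data array */
};

/* The field sizes add up to the 32-byte entry size listed for the bucket array. */
static_assert(sizeof(struct bucket_entry) == 32, "bucket entry must be 32 bytes");
static_assert(offsetof(struct bucket_entry, sig) == 8, "sig follows Next Ptr/LRU");
static_assert(offsetof(struct bucket_entry, key_pos) == 16, "key_pos follows sig");

/* Store a (assumed 16-bit) signature: keep bits 15..1 and force bit 0 to 1. */
static inline uint16_t sig_encode(uint16_t signature)
{
    return (uint16_t)(signature | 1u);
}
```

Because a stored signature always has bit 0 set, a slot whose ``sig`` value is zero can never compare equal to an encoded input signature, which is what allows empty slots to be skipped without a separate validity test.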
+:numref:`figure_figure35` and :numref:`table_qos_27` detail the bucket search pipeline stages (either LRU or extendable bucket,
either with pre-computed signature or "do-sig").
For each pipeline stage, the described operations are applied to each of the two packets handled by that stage.
Tables)
-.. _pg_table_27:
-
-**Table 27 Description of the Bucket Search Pipeline Stages (Configurable Key Size Hash Tables)**
-
-+---+---------------------------+------------------------------------------------------------------------------+
-| # | Stage name | Description |
-| | | |
-+===+===========================+==============================================================================+
-| 0 | Prefetch packet meta-data | Select next two packets from the burst of input packets. |
-| | | |
-| | | Prefetch packet meta-data containing the key and key signature. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 1 | Prefetch table bucket | Read the key signature from the packet meta-data (for extendable bucket hash |
-| | | tables) or read the key from the packet meta-data and compute key signature |
-| | | (for LRU tables). |
-| | | |
-| | | Identify the bucket ID using the key signature. |
-| | | |
-| | | Set bit 0 of the signature to 1 (to match only signatures of valid keys from |
-| | | the table). |
-| | | |
-| | | Prefetch the bucket. |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 2 | Prefetch table key | Read the key signatures from the bucket. |
-| | | |
-| | | Compare the signature of the input key against the 4 key signatures from the |
-| | | packet. As result, the following is obtained: |
-| | | |
-| | | *match* |
-| | | = equal to TRUE if there was at least one signature match and to FALSE in |
-| | | the case of no signature match; |
-| | | |
-| | | *match_many* |
-| | | = equal to TRUE is there were more than one signature matches (can be up to |
-| | | 4 signature matches in the worst case scenario) and to FALSE otherwise; |
-| | | |
-| | | *match_pos* |
-| | | = the index of the first key that produced signature match (only valid if |
-| | | match is true). |
-| | | |
-| | | For extendable bucket hash tables only, set |
-| | | *match_many* |
-| | | to TRUE if next pointer is valid. |
-| | | |
-| | | Prefetch the bucket key indicated by |
-| | | *match_pos* |
-| | | (even if |
-| | | *match_pos* |
-| | | does not point to valid key valid). |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
-| 3 | Prefetch table data | Read the bucket key indicated by |
-| | | *match_pos*. |
-| | | |
-| | | Compare the bucket key against the input key. As result, the following is |
-| | | obtained: |
-| | | *match_key* |
-| | | = equal to TRUE if the two keys match and to FALSE otherwise. |
-| | | |
-| | | Report input key as lookup hit only when both |
-| | | *match* |
-| | | and |
-| | | *match_key* |
-| | | are equal to TRUE and as lookup miss otherwise. |
-| | | |
-| | | For LRU tables only, use branchless logic to update the bucket LRU list |
-| | | (the current key becomes the new MRU) only on lookup hit. |
-| | | |
-| | | Prefetch the key value (key data) associated with the current key (to avoid |
-| | | branches, this is done on both lookup hit and miss). |
-| | | |
-+---+---------------------------+------------------------------------------------------------------------------+
+.. _table_qos_27:
+
+.. table:: Description of the Bucket Search Pipeline Stages (Configurable Key Size Hash Tables)
+
+   +---+---------------------------+------------------------------------------------------------------------------+
+   | # | Stage name                | Description                                                                  |
+   |   |                           |                                                                              |
+   +===+===========================+==============================================================================+
+   | 0 | Prefetch packet meta-data | Select next two packets from the burst of input packets.                     |
+   |   |                           |                                                                              |
+   |   |                           | Prefetch packet meta-data containing the key and key signature.              |
+   |   |                           |                                                                              |
+   +---+---------------------------+------------------------------------------------------------------------------+
+   | 1 | Prefetch table bucket     | Read the key signature from the packet meta-data (for extendable bucket hash |
+   |   |                           | tables) or read the key from the packet meta-data and compute key signature  |
+   |   |                           | (for LRU tables).                                                            |
+   |   |                           |                                                                              |
+   |   |                           | Identify the bucket ID using the key signature.                              |
+   |   |                           |                                                                              |
+   |   |                           | Set bit 0 of the signature to 1 (to match only signatures of valid keys from |
+   |   |                           | the table).                                                                  |
+   |   |                           |                                                                              |
+   |   |                           | Prefetch the bucket.                                                         |
+   |   |                           |                                                                              |
+   +---+---------------------------+------------------------------------------------------------------------------+
+   | 2 | Prefetch table key        | Read the key signatures from the bucket.                                     |
+   |   |                           |                                                                              |
+   |   |                           | Compare the signature of the input key against the 4 key signatures from the |
+   |   |                           | bucket. As a result, the following is obtained:                              |
+   |   |                           |                                                                              |
+   |   |                           | *match*                                                                      |
+   |   |                           | = equal to TRUE if there was at least one signature match and to FALSE in    |
+   |   |                           | the case of no signature match;                                              |
+   |   |                           |                                                                              |
+   |   |                           | *match_many*                                                                 |
+   |   |                           | = equal to TRUE if there was more than one signature match (can be up to     |
+   |   |                           | 4 signature matches in the worst case scenario) and to FALSE otherwise;      |
+   |   |                           |                                                                              |
+   |   |                           | *match_pos*                                                                  |
+   |   |                           | = the index of the first key that produced a signature match (only valid if  |
+   |   |                           | match is true).                                                              |
+   |   |                           |                                                                              |
+   |   |                           | For extendable bucket hash tables only, set                                  |
+   |   |                           | *match_many*                                                                 |
+   |   |                           | to TRUE if next pointer is valid.                                            |
+   |   |                           |                                                                              |
+   |   |                           | Prefetch the bucket key indicated by                                         |
+   |   |                           | *match_pos*                                                                  |
+   |   |                           | (even if                                                                     |
+   |   |                           | *match_pos*                                                                  |
+   |   |                           | does not point to a valid key).                                              |
+   |   |                           |                                                                              |
+   +---+---------------------------+------------------------------------------------------------------------------+
+   | 3 | Prefetch table data       | Read the bucket key indicated by                                             |
+   |   |                           | *match_pos*.                                                                 |
+   |   |                           |                                                                              |
+   |   |                           | Compare the bucket key against the input key. As a result, the following is  |
+   |   |                           | obtained:                                                                    |
+   |   |                           | *match_key*                                                                  |
+   |   |                           | = equal to TRUE if the two keys match and to FALSE otherwise.                |
+   |   |                           |                                                                              |
+   |   |                           | Report input key as lookup hit only when both                                |
+   |   |                           | *match*                                                                      |
+   |   |                           | and                                                                          |
+   |   |                           | *match_key*                                                                  |
+   |   |                           | are equal to TRUE and as lookup miss otherwise.                              |
+   |   |                           |                                                                              |
+   |   |                           | For LRU tables only, use branchless logic to update the bucket LRU list      |
+   |   |                           | (the current key becomes the new MRU) only on lookup hit.                    |
+   |   |                           |                                                                              |
+   |   |                           | Prefetch the key value (key data) associated with the current key (to avoid  |
+   |   |                           | branches, this is done on both lookup hit and miss).                         |
+   |   |                           |                                                                              |
+   +---+---------------------------+------------------------------------------------------------------------------+
Additional notes:
**Key Signature Comparison Logic**
-The key signature comparison logic is described in Table 28.
-
-.. _pg_table_28:
-
-**Table 28 Lookup Tables for Match, Match_Many and Match_Pos**
-
-+----+------+---------------+--------------------+--------------------+
-| # | mask | match (1 bit) | match_many (1 bit) | match_pos (2 bits) |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 0 | 0000 | 0 | 0 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 1 | 0001 | 1 | 0 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 2 | 0010 | 1 | 0 | 01 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 3 | 0011 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 4 | 0100 | 1 | 0 | 10 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 5 | 0101 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 6 | 0110 | 1 | 1 | 01 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 7 | 0111 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 8 | 1000 | 1 | 0 | 11 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 9 | 1001 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 10 | 1010 | 1 | 1 | 01 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 11 | 1011 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 12 | 1100 | 1 | 1 | 10 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 13 | 1101 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 14 | 1110 | 1 | 1 | 01 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
-| 15 | 1111 | 1 | 1 | 00 |
-| | | | | |
-+----+------+---------------+--------------------+--------------------+
+The key signature comparison logic is described in :numref:`table_qos_28`.
+
+.. _table_qos_28:
+
+.. table:: Lookup Tables for Match, Match_Many and Match_Pos
+
+   +----+------+---------------+--------------------+--------------------+
+   | #  | mask | match (1 bit) | match_many (1 bit) | match_pos (2 bits) |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 0  | 0000 | 0             | 0                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 1  | 0001 | 1             | 0                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 2  | 0010 | 1             | 0                  | 01                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 3  | 0011 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 4  | 0100 | 1             | 0                  | 10                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 5  | 0101 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 6  | 0110 | 1             | 1                  | 01                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 7  | 0111 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 8  | 1000 | 1             | 0                  | 11                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 9  | 1001 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 10 | 1010 | 1             | 1                  | 01                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 11 | 1011 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 12 | 1100 | 1             | 1                  | 10                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 13 | 1101 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 14 | 1110 | 1             | 1                  | 01                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
+   | 15 | 1111 | 1             | 1                  | 00                 |
+   |    |      |               |                    |                    |
+   +----+------+---------------+--------------------+--------------------+
The input *mask* has bit X (X = 0 .. 3) set to 1 if the input signature is equal to bucket signature X and set to 0 otherwise.
The outputs *match*, *match_many* and *match_pos* are 1 bit, 1 bit and 2 bits in size respectively and their meaning has been explained above.
-As displayed in Table 29, the lookup tables for *match* and *match_many* can be collapsed into a single 32-bit value and the lookup table for
+As displayed in :numref:`table_qos_29`, the lookup tables for *match* and *match_many* can be collapsed into a single 32-bit value and the lookup table for
*match_pos* can be collapsed into a 64-bit value.
Given the input *mask*, the values for *match*, *match_many* and *match_pos* can be obtained by indexing their respective bit array to extract 1 bit,
1 bit and 2 bits respectively with branchless logic.
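As an illustration of this indexing, the sketch below builds the 4-bit *mask* from the four bucket signatures and then extracts *match*, *match_many* and *match_pos* by shifting the collapsed constants. The constants are the ones listed for the collapsed lookup tables; the function names (``sig_mask``, ``lut_match``, ``lut_match_many``, ``lut_match_pos``) and the 16-bit signature type are assumptions made for this example only.

```c
#include <stdint.h>

/* Build the mask: bit X is 1 when bucket signature X equals the input signature. */
static inline uint32_t sig_mask(const uint16_t bucket_sig[4], uint16_t input_sig)
{
    /* Force bit 0 to 1 so that empty slots (signature == 0) can never match. */
    input_sig = (uint16_t)(input_sig | 1u);

    return  (uint32_t)(bucket_sig[0] == input_sig)
         | ((uint32_t)(bucket_sig[1] == input_sig) << 1)
         | ((uint32_t)(bucket_sig[2] == input_sig) << 2)
         | ((uint32_t)(bucket_sig[3] == input_sig) << 3);
}

/* One bit per mask value: bit N of the constant is the table row for mask N. */
static inline uint32_t lut_match(uint32_t mask)
{
    return (0xFFFEu >> mask) & 1u;
}

static inline uint32_t lut_match_many(uint32_t mask)
{
    return (0xFEE8u >> mask) & 1u;
}

/* Two bits per mask value, hence the shift by 2 * mask. */
static inline uint32_t lut_match_pos(uint32_t mask)
{
    return (0x12131210u >> (mask << 1)) & 3u;
}
```

For example, a mask of 0110 (keys 1 and 2 match) yields *match* = 1, *match_many* = 1 and *match_pos* = 1, with no conditional branch on the lookup path.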
-.. _pg_table_29:
+.. _table_qos_29:
-**Table 29 Collapsed Lookup Tables for Match, Match_Many and Match_Pos**
+.. table:: Collapsed Lookup Tables for Match, Match_Many and Match_Pos
-+------------+------------------------------------------+-------------------+
-| | Bit array | Hexadecimal value |
-| | | |
-+------------+------------------------------------------+-------------------+
-| match | 1111_1111_1111_1110 | 0xFFFELLU |
-| | | |
-+------------+------------------------------------------+-------------------+
-| match_many | 1111_1110_1110_1000 | 0xFEE8LLU |
-| | | |
-+------------+------------------------------------------+-------------------+
-| match_pos | 0001_0010_0001_0011__0001_0010_0001_0000 | 0x12131210LLU |
-| | | |
-+------------+------------------------------------------+-------------------+
+   +------------+------------------------------------------+-------------------+
+   |            | Bit array                                | Hexadecimal value |
+   |            |                                          |                   |
+   +------------+------------------------------------------+-------------------+
+   | match      | 1111_1111_1111_1110                      | 0xFFFELLU         |
+   |            |                                          |                   |
+   +------------+------------------------------------------+-------------------+
+   | match_many | 1111_1110_1110_1000                      | 0xFEE8LLU         |
+   |            |                                          |                   |
+   +------------+------------------------------------------+-------------------+
+   | match_pos  | 0001_0010_0001_0011__0001_0010_0001_0000 | 0x12131210LLU     |
+   |            |                                          |                   |
+   +------------+------------------------------------------+-------------------+
The pseudo-code for match, match_many and match_pos is::
Single Key Size Hash Tables
"""""""""""""""""""""""""""
-:numref:`figure_figure37`, :numref:`figure_figure38`, Table 30 and 31 detail the main data structures used to implement 8-byte and 16-byte key hash tables
+:numref:`figure_figure37`, :numref:`figure_figure38`, :numref:`table_qos_30` and :numref:`table_qos_31` detail the main data structures used to implement 8-byte and 16-byte key hash tables
(either LRU or extendable bucket, either with pre-computed signature or "do-sig").
.. _figure_figure37:
Data Structures for 16-byte Key Hash Tables
-.. _pg_table_30:
-
-**Table 30 Main Large Data Structures (Arrays) used for 8-byte and 16-byte Key Size Hash Tables**
-
-+---+-------------------------+------------------------------+----------------------+------------------------------------+
-| # | Array name | Number of entries | Entry size (bytes) | Description |
-| | | | | |
-+===+=========================+==============================+======================+====================================+
-| 1 | Bucket array | n_buckets (configurable) | *8-byte key size:* | Buckets of the hash table. |
-| | | | | |
-| | | | 64 + 4 x entry_size | |
-| | | | | |
-| | | | | |
-| | | | *16-byte key size:* | |
-| | | | | |
-| | | | 128 + 4 x entry_size | |
-| | | | | |
-+---+-------------------------+------------------------------+----------------------+------------------------------------+
-| 2 | Bucket extensions array | n_buckets_ext (configurable) | *8-byte key size:* | This array is only created for |
-| | | | | extendable bucket tables. |
-| | | | | |
-| | | | 64 + 4 x entry_size | |
-| | | | | |
-| | | | | |
-| | | | *16-byte key size:* | |
-| | | | | |
-| | | | 128 + 4 x entry_size | |
-| | | | | |
-+---+-------------------------+------------------------------+----------------------+------------------------------------+
-
-.. _pg_table_31:
-
-**Table 31 Field Description for Bucket Array Entry (8-byte and 16-byte Key Hash Tables)**
-
-+---+---------------+--------------------+-------------------------------------------------------------------------------+
-| # | Field name | Field size (bytes) | Description |
-| | | | |
-+===+===============+====================+===============================================================================+
-| 1 | Valid | 8 | Bit X (X = 0 .. 3) is set to 1 if key X is valid or to 0 otherwise. |
-| | | | |
-| | | | Bit 4 is only used for extendable bucket tables to help with the |
-| | | | implementation of the branchless logic. In this case, bit 4 is set to 1 if |
-| | | | next pointer is valid (not NULL) or to 0 otherwise. |
-| | | | |
-+---+---------------+--------------------+-------------------------------------------------------------------------------+
-| 2 | Next Ptr/LRU | 8 | For LRU tables, this fields represents the LRU list for the current bucket |
-| | | | stored as array of 4 entries of 2 bytes each. Entry 0 stores the index |
-| | | | (0 .. 3) of the MRU key, while entry 3 stores the index of the LRU key. |
-| | | | |
-| | | | For extendable bucket tables, this field represents the next pointer (i.e. |
-| | | | the pointer to the next group of 4 keys linked to the current bucket). The |
-| | | | next pointer is not NULL if the bucket is currently extended or NULL |
-| | | | otherwise. |
-| | | | |
-+---+---------------+--------------------+-------------------------------------------------------------------------------+
-| 3 | Key [0 .. 3] | 4 x key_size | Full keys. |
-| | | | |
-+---+---------------+--------------------+-------------------------------------------------------------------------------+
-| 4 | Data [0 .. 3] | 4 x entry_size | Full key values (key data) associated with keys 0 .. 3. |
-| | | | |
-+---+---------------+--------------------+-------------------------------------------------------------------------------+
+.. _table_qos_30:
+
+.. table:: Main Large Data Structures (Arrays) used for 8-byte and 16-byte Key Size Hash Tables
+
+   +---+-------------------------+------------------------------+----------------------+--------------------------------+
+   | # | Array name              | Number of entries            | Entry size (bytes)   | Description                    |
+   |   |                         |                              |                      |                                |
+   +===+=========================+==============================+======================+================================+
+   | 1 | Bucket array            | n_buckets (configurable)     | *8-byte key size:*   | Buckets of the hash table.     |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              | 64 + 4 x entry_size  |                                |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              | *16-byte key size:*  |                                |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              | 128 + 4 x entry_size |                                |
+   |   |                         |                              |                      |                                |
+   +---+-------------------------+------------------------------+----------------------+--------------------------------+
+   | 2 | Bucket extensions array | n_buckets_ext (configurable) | *8-byte key size:*   | This array is only created for |
+   |   |                         |                              |                      | extendible bucket tables.      |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              | 64 + 4 x entry_size  |                                |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              | *16-byte key size:*  |                                |
+   |   |                         |                              |                      |                                |
+   |   |                         |                              | 128 + 4 x entry_size |                                |
+   |   |                         |                              |                      |                                |
+   +---+-------------------------+------------------------------+----------------------+--------------------------------+
+
+.. _table_qos_31:
+
+.. table:: Field Description for Bucket Array Entry (8-byte and 16-byte Key Hash Tables)
+
+   +---+---------------+--------------------+----------------------------------------------------------------------------+
+   | # | Field name    | Field size (bytes) | Description                                                                |
+   |   |               |                    |                                                                            |
+   +===+===============+====================+============================================================================+
+   | 1 | Valid         | 8                  | Bit X (X = 0 .. 3) is set to 1 if key X is valid or to 0 otherwise.        |
+   |   |               |                    |                                                                            |
+   |   |               |                    | Bit 4 is only used for extendible bucket tables to help with the           |
+   |   |               |                    | implementation of the branchless logic. In this case, bit 4 is set to 1 if |
+   |   |               |                    | next pointer is valid (not NULL) or to 0 otherwise.                        |
+   |   |               |                    |                                                                            |
+   +---+---------------+--------------------+----------------------------------------------------------------------------+
+   | 2 | Next Ptr/LRU  | 8                  | For LRU tables, this field represents the LRU list for the current bucket  |
+   |   |               |                    | stored as an array of 4 entries of 2 bytes each. Entry 0 stores the index  |
+   |   |               |                    | (0 .. 3) of the MRU key, while entry 3 stores the index of the LRU key.    |
+   |   |               |                    |                                                                            |
+   |   |               |                    | For extendible bucket tables, this field represents the next pointer (i.e. |
+   |   |               |                    | the pointer to the next group of 4 keys linked to the current bucket). The |
+   |   |               |                    | next pointer is not NULL if the bucket is currently extended or NULL       |
+   |   |               |                    | otherwise.                                                                 |
+   |   |               |                    |                                                                            |
+   +---+---------------+--------------------+----------------------------------------------------------------------------+
+   | 3 | Key [0 .. 3]  | 4 x key_size       | Full keys.                                                                 |
+   |   |               |                    |                                                                            |
+   +---+---------------+--------------------+----------------------------------------------------------------------------+
+   | 4 | Data [0 .. 3] | 4 x entry_size     | Full key values (key data) associated with keys 0 .. 3.                    |
+   |   |               |                    |                                                                            |
+   +---+---------------+--------------------+----------------------------------------------------------------------------+
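
The LRU list held in the Next Ptr/LRU field (entry 0 is the MRU key index, entry 3 the LRU key index) can be illustrated with a small update routine. This is a plain, loop-based sketch for readability only: the function name ``lru_touch`` is hypothetical, and the library itself is described as using branchless logic for this step, which this example does not attempt to reproduce.

```c
#include <stdint.h>

/*
 * Move key index `pos` (0 .. 3) to the MRU position of a 4-entry LRU list.
 * lru[0] holds the MRU key index and lru[3] the LRU key index.
 * The list is assumed to be a permutation of {0, 1, 2, 3}.
 */
static inline void lru_touch(uint16_t lru[4], uint16_t pos)
{
    int i = 0;

    /* Find the current rank of `pos` in the list. */
    while (i < 3 && lru[i] != pos)
        i++;

    /* Shift the more recently used entries down by one rank. */
    for (; i > 0; i--)
        lru[i] = lru[i - 1];

    /* The touched key becomes the new MRU. */
    lru[0] = pos;
}
```

On a lookup hit for key 2 with the list {0, 1, 2, 3}, the list becomes {2, 0, 1, 3}; the key at entry 3 remains the candidate to be replaced when the bucket is full.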
and detail the bucket search pipeline used to implement 8-byte and 16-byte key hash tables (either LRU or extendable bucket,
either with pre-computed signature or "do-sig").
Tables)
-.. _pg_table_32:
-
-**Table 32 Description of the Bucket Search Pipeline Stages (8-byte and 16-byte Key Hash Tables)**
-
-+---+---------------------------+-----------------------------------------------------------------------------+
-| # | Stage name | Description |
-| | | |
-+===+===========================+=============================================================================+
-| 0 | Prefetch packet meta-data | #. Select next two packets from the burst of input packets. |
-| | | |
-| | | #. Prefetch packet meta-data containing the key and key signature. |
-| | | |
-+---+---------------------------+-----------------------------------------------------------------------------+
-| 1 | Prefetch table bucket | #. Read the key signature from the packet meta-data (for extendable bucket |
-| | | hash tables) or read the key from the packet meta-data and compute key |
-| | | signature (for LRU tables). |
-| | | |
-| | | #. Identify the bucket ID using the key signature. |
-| | | |
-| | | #. Prefetch the bucket. |
-| | | |
-+---+---------------------------+-----------------------------------------------------------------------------+
-| 2 | Prefetch table data | #. Read the bucket. |
-| | | |
-| | | #. Compare all 4 bucket keys against the input key. |
-| | | |
-| | | #. Report input key as lookup hit only when a match is identified (more |
-| | | than one key match is not possible) |
-| | | |
-| | | #. For LRU tables only, use branchless logic to update the bucket LRU list |
-| | | (the current key becomes the new MRU) only on lookup hit. |
-| | | |
-| | | #. Prefetch the key value (key data) associated with the matched key (to |
-| | | avoid branches, this is done on both lookup hit and miss). |
-| | | |
-+---+---------------------------+-----------------------------------------------------------------------------+
+.. _table_qos_32:
+
+.. table:: Description of the Bucket Search Pipeline Stages (8-byte and 16-byte Key Hash Tables)
+
+   +---+---------------------------+----------------------------------------------------------------------------+
+   | # | Stage name                | Description                                                                |
+   |   |                           |                                                                            |
+   +===+===========================+============================================================================+
+   | 0 | Prefetch packet meta-data | #. Select next two packets from the burst of input packets.                |
+   |   |                           |                                                                            |
+   |   |                           | #. Prefetch packet meta-data containing the key and key signature.         |
+   |   |                           |                                                                            |
+   +---+---------------------------+----------------------------------------------------------------------------+
+   | 1 | Prefetch table bucket     | #. Read the key signature from the packet meta-data (for extendable bucket |
+   |   |                           |    hash tables) or read the key from the packet meta-data and compute key  |
+   |   |                           |    signature (for LRU tables).                                             |
+   |   |                           |                                                                            |
+   |   |                           | #. Identify the bucket ID using the key signature.                         |
+   |   |                           |                                                                            |
+   |   |                           | #. Prefetch the bucket.                                                    |
+   |   |                           |                                                                            |
+   +---+---------------------------+----------------------------------------------------------------------------+
+   | 2 | Prefetch table data       | #. Read the bucket.                                                        |
+   |   |                           |                                                                            |
+   |   |                           | #. Compare all 4 bucket keys against the input key.                        |
+   |   |                           |                                                                            |
+   |   |                           | #. Report input key as lookup hit only when a match is identified (more    |
+   |   |                           |    than one key match is not possible)                                     |
+   |   |                           |                                                                            |
+   |   |                           | #. For LRU tables only, use branchless logic to update the bucket LRU list |
+   |   |                           |    (the current key becomes the new MRU) only on lookup hit.               |
+   |   |                           |                                                                            |
+   |   |                           | #. Prefetch the key value (key data) associated with the matched key (to   |
+   |   |                           |    avoid branches, this is done on both lookup hit and miss).              |
+   |   |                           |                                                                            |
+   +---+---------------------------+----------------------------------------------------------------------------+
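
The key comparison of stage 2 can be sketched for the 8-byte key case as follows. The structure ``bucket8`` and the function ``bucket8_lookup`` are hypothetical names that keep only the Valid and Key fields of the bucket entry; the position is recovered with the same collapsed *match_pos* constant used for configurable key size tables, which is an illustrative choice rather than a statement about the library code.

```c
#include <stdint.h>

/* Hypothetical 8-byte key bucket; only the fields used by the comparison are shown. */
struct bucket8 {
    uint64_t valid;  /* bit X (X = 0 .. 3) is 1 when key X is valid */
    uint64_t key[4]; /* full keys */
};

/*
 * Compare all 4 bucket keys against the input key.
 * Returns 1 on lookup hit and 0 on lookup miss. *pos receives the index of
 * the matching key; on a miss it is 0, so the caller can still prefetch the
 * data at *pos without branching.
 */
static inline int bucket8_lookup(const struct bucket8 *b, uint64_t key, uint32_t *pos)
{
    uint32_t mask = 0;

    for (uint32_t i = 0; i < 4; i++) {
        /* A slot matches only if its key is equal and its valid bit is set. */
        uint32_t hit = (uint32_t)((b->key[i] == key) & ((b->valid >> i) & 1u));
        mask |= hit << i;
    }

    /* At most one bit of mask is set, since a key is stored at most once. */
    *pos = (0x12131210u >> (mask << 1)) & 3u;
    return mask != 0;
}
```

Since full keys are compared directly, no separate signature check is needed and more than one match cannot occur.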
Additional notes:
through the table action handler configuration.
A special category of the reserved actions is represented by the next hop actions, which regulate the packet flow between input ports,
tables and output ports through the pipeline.
-Table 33 lists the next hop actions.
-
-.. _pg_table_33:
-
-**Table 33 Next Hop Actions (Reserved)**
-
-+---+---------------------+-----------------------------------------------------------------------------------+
-| # | Next hop action | Description |
-| | | |
-+===+=====================+===================================================================================+
-| 1 | Drop | Drop the current packet. |
-| | | |
-+---+---------------------+-----------------------------------------------------------------------------------+
-| 2 | Send to output port | Send the current packet to specified output port. The output port ID is metadata |
-| | | stored in the same table entry. |
-| | | |
-+---+---------------------+-----------------------------------------------------------------------------------+
-| 3 | Send to table | Send the current packet to specified table. The table ID is metadata stored in |
-| | | the same table entry. |
-| | | |
-+---+---------------------+-----------------------------------------------------------------------------------+
+:numref:`table_qos_33` lists the next hop actions.
+
+.. _table_qos_33:
+
+.. table:: Next Hop Actions (Reserved)
+
+ +---+---------------------+-----------------------------------------------------------------------------------+
+ | # | Next hop action | Description |
+ | | | |
+ +===+=====================+===================================================================================+
+ | 1 | Drop | Drop the current packet. |
+ | | | |
+ +---+---------------------+-----------------------------------------------------------------------------------+
+ | 2 | Send to output port | Send the current packet to the specified output port. The output port ID is metadata |
+ | | | stored in the same table entry. |
+ | | | |
+ +---+---------------------+-----------------------------------------------------------------------------------+
+ | 3 | Send to table | Send the current packet to the specified table. The table ID is metadata stored in |
+ | | | the same table entry. |
+ | | | |
+ +---+---------------------+-----------------------------------------------------------------------------------+
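
To make the role of the per-entry metadata concrete, the three reserved next hop actions can be modeled as below. This is only a sketch: every identifier (``next_hop_action``, ``entry_head``, ``resolve_next_hop`` and so on) is hypothetical and chosen for the example, not taken from the library API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names mirroring the three reserved next hop actions. */
enum next_hop_action {
    NEXT_HOP_DROP,
    NEXT_HOP_PORT,
    NEXT_HOP_TABLE
};

/* Head of a table entry: the action and the ID it needs are stored together. */
struct entry_head {
    enum next_hop_action action;
    uint32_t id;    /* output port ID or table ID; unused for drop */
};

enum dest_kind { DEST_NONE, DEST_PORT, DEST_TABLE };

struct dest {
    enum dest_kind kind;
    uint32_t id;
};

/* Translate the entry's next hop action into the packet's next destination. */
static struct dest
resolve_next_hop(const struct entry_head *e)
{
    struct dest d = { DEST_NONE, 0 };

    switch (e->action) {
    case NEXT_HOP_DROP:
        break;                      /* packet leaves the pipeline */
    case NEXT_HOP_PORT:
        d.kind = DEST_PORT;
        d.id = e->id;
        break;
    case NEXT_HOP_TABLE:
        d.kind = DEST_TABLE;
        d.id = e->id;
        break;
    default:
        assert(0 && "unknown next hop action");
    }
    return d;
}
```

Keeping the port or table ID in the same entry as the action means a single table lookup yields everything needed to route the packet.
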
User Actions
^^^^^^^^^^^^
Within the same table, all the table entries (including the table default entry) share the same definition
for the user actions and their associated metadata,
with each table entry having its own set of enabled user actions and its own copy of the action metadata.
-Table 34 contains a non-exhaustive list of user action examples.
-
-.. _pg_table_34:
-
-**Table 34 User Action Examples**
-
-+---+-----------------------------------+---------------------------------------------------------------------+
-| # | User action | Description |
-| | | |
-+===+===================================+=====================================================================+
-| 1 | Metering | Per flow traffic metering using the srTCM and trTCM algorithms. |
-| | | |
-+---+-----------------------------------+---------------------------------------------------------------------+
-| 2 | Statistics | Update the statistics counters maintained per flow. |
-| | | |
-+---+-----------------------------------+---------------------------------------------------------------------+
-| 3 | App ID | Per flow state machine fed by variable length sequence of packets |
-| | | at the flow initialization with the purpose of identifying the |
-| | | traffic type and application. |
-| | | |
-+---+-----------------------------------+---------------------------------------------------------------------+
-| 4 | Push/pop labels | Push/pop VLAN/MPLS labels to/from the current packet. |
-| | | |
-+---+-----------------------------------+---------------------------------------------------------------------+
-| 5 | Network Address Translation (NAT) | Translate between the internal (LAN) and external (WAN) IP |
-| | | destination/source address and/or L4 protocol destination/source |
-| | | port. |
-| | | |
-+---+-----------------------------------+---------------------------------------------------------------------+
-| 6 | TTL update | Decrement IP TTL and, in case of IPv4 packets, update the IP |
-| | | checksum. |
-| | | |
-+---+-----------------------------------+---------------------------------------------------------------------+
+:numref:`table_qos_34` contains a non-exhaustive list of user action examples.
+
+.. _table_qos_34:
+
+.. table:: User Action Examples
+
+ +---+-----------------------------------+---------------------------------------------------------------------+
+ | # | User action | Description |
+ | | | |
+ +===+===================================+=====================================================================+
+ | 1 | Metering | Per flow traffic metering using the srTCM and trTCM algorithms. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
+ | 2 | Statistics | Update the statistics counters maintained per flow. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
+ | 3 | App ID | Per flow state machine fed by variable length sequence of packets |
+ | | | at the flow initialization with the purpose of identifying the |
+ | | | traffic type and application. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
+ | 4 | Push/pop labels | Push/pop VLAN/MPLS labels to/from the current packet. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
+ | 5 | Network Address Translation (NAT) | Translate between the internal (LAN) and external (WAN) IP |
+ | | | destination/source address and/or L4 protocol destination/source |
+ | | | port. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
+ | 6 | TTL update | Decrement IP TTL and, in case of IPv4 packets, update the IP |
+ | | | checksum. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
Multicore Scaling
-----------------
The main blocks implementing QoS in this pipeline are: the policer, the dropper and the scheduler.
A functional description of each block is provided in the following table.
-.. _pg_table_1:
-
-**Table 1. Packet Processing Pipeline Implementing QoS**
-
-+---+------------------------+--------------------------------------------------------------------------------+
-| # | Block | Functional Description |
-| | | |
-+===+========================+================================================================================+
-| 1 | Packet I/O RX & TX | Packet reception/ transmission from/to multiple NIC ports. Poll mode drivers |
-| | | (PMDs) for Intel 1 GbE/10 GbE NICs. |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 2 | Packet parser | Identify the protocol stack of the input packet. Check the integrity of the |
-| | | packet headers. |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 3 | Flow classification | Map the input packet to one of the known traffic flows. Exact match table |
-| | | lookup using configurable hash function (jhash, CRC and so on) and bucket |
-| | | logic to handle collisions. |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 4 | Policer | Packet metering using srTCM (RFC 2697) or trTCM (RFC2698) algorithms. |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 5 | Load Balancer | Distribute the input packets to the application workers. Provide uniform load |
-| | | to each worker. Preserve the affinity of traffic flows to workers and the |
-| | | packet order within each flow. |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 6 | Worker threads | Placeholders for the customer specific application workload (for example, IP |
-| | | stack and so on). |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 7 | Dropper | Congestion management using the Random Early Detection (RED) algorithm |
-| | | (specified by the Sally Floyd - Van Jacobson paper) or Weighted RED (WRED). |
-| | | Drop packets based on the current scheduler queue load level and packet |
-| | | priority. When congestion is experienced, lower priority packets are dropped |
-| | | first. |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
-| 8 | Hierarchical Scheduler | 5-level hierarchical scheduler (levels are: output port, subport, pipe, |
-| | | traffic class and queue) with thousands (typically 64K) leaf nodes (queues). |
-| | | Implements traffic shaping (for subport and pipe levels), strict priority |
-| | | (for traffic class level) and Weighted Round Robin (WRR) (for queues within |
-| | | each pipe traffic class). |
-| | | |
-+---+------------------------+--------------------------------------------------------------------------------+
+.. _table_qos_1:
+
+.. table:: Packet Processing Pipeline Implementing QoS
+
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | # | Block | Functional Description |
+ | | | |
+ +===+========================+================================================================================+
+ | 1 | Packet I/O RX & TX | Packet reception/transmission from/to multiple NIC ports. Poll mode drivers |
+ | | | (PMDs) for Intel 1 GbE/10 GbE NICs. |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 2 | Packet parser | Identify the protocol stack of the input packet. Check the integrity of the |
+ | | | packet headers. |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 3 | Flow classification | Map the input packet to one of the known traffic flows. Exact match table |
+ | | | lookup using configurable hash function (jhash, CRC and so on) and bucket |
+ | | | logic to handle collisions. |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 4 | Policer | Packet metering using srTCM (RFC 2697) or trTCM (RFC 2698) algorithms. |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 5 | Load Balancer | Distribute the input packets to the application workers. Provide uniform load |
+ | | | to each worker. Preserve the affinity of traffic flows to workers and the |
+ | | | packet order within each flow. |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 6 | Worker threads | Placeholders for the customer specific application workload (for example, IP |
+ | | | stack and so on). |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 7 | Dropper | Congestion management using the Random Early Detection (RED) algorithm |
+ | | | (specified by the Sally Floyd - Van Jacobson paper) or Weighted RED (WRED). |
+ | | | Drop packets based on the current scheduler queue load level and packet |
+ | | | priority. When congestion is experienced, lower priority packets are dropped |
+ | | | first. |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
+ | 8 | Hierarchical Scheduler | 5-level hierarchical scheduler (levels are: output port, subport, pipe, |
+ | | | traffic class and queue) with thousands (typically 64K) leaf nodes (queues). |
+ | | | Implements traffic shaping (for subport and pipe levels), strict priority |
+ | | | (for traffic class level) and Weighted Round Robin (WRR) (for queues within |
+ | | | each pipe traffic class). |
+ | | | |
+ +---+------------------------+--------------------------------------------------------------------------------+
The infrastructure blocks used throughout the packet processing pipeline are listed in the following table.
-.. _pg_table_2:
+.. _table_qos_2:
-**Table 2. Infrastructure Blocks Used by the Packet Processing Pipeline**
+.. table:: Infrastructure Blocks Used by the Packet Processing Pipeline
-+---+-----------------------+-----------------------------------------------------------------------+
-| # | Block | Functional Description |
-| | | |
-+===+=======================+=======================================================================+
-| 1 | Buffer manager | Support for global buffer pools and private per-thread buffer caches. |
-| | | |
-+---+-----------------------+-----------------------------------------------------------------------+
-| 2 | Queue manager | Support for message passing between pipeline blocks. |
-| | | |
-+---+-----------------------+-----------------------------------------------------------------------+
-| 3 | Power saving | Support for power saving during low activity periods. |
-| | | |
-+---+-----------------------+-----------------------------------------------------------------------+
+ +---+-----------------------+-----------------------------------------------------------------------+
+ | # | Block | Functional Description |
+ | | | |
+ +===+=======================+=======================================================================+
+ | 1 | Buffer manager | Support for global buffer pools and private per-thread buffer caches. |
+ | | | |
+ +---+-----------------------+-----------------------------------------------------------------------+
+ | 2 | Queue manager | Support for message passing between pipeline blocks. |
+ | | | |
+ +---+-----------------------+-----------------------------------------------------------------------+
+ | 3 | Power saving | Support for power saving during low activity periods. |
+ | | | |
+ +---+-----------------------+-----------------------------------------------------------------------+
The mapping of pipeline blocks to CPU cores is configurable based on the performance level required by each specific application
and the set of features enabled for each block.
The functionality of each hierarchical level is detailed in the following table.
-.. _pg_table_3:
-
-**Table 3. Port Scheduling Hierarchy**
-
-+---+--------------------+----------------------------+---------------------------------------------------------------+
-| # | Level | Siblings per Parent | Functional Description |
-| | | | |
-+===+====================+============================+===============================================================+
-| 1 | Port | - | #. Output Ethernet port 1/10/40 GbE. |
-| | | | |
-| | | | #. Multiple ports are scheduled in round robin order with |
-| | | | all ports having equal priority. |
-| | | | |
-+---+--------------------+----------------------------+---------------------------------------------------------------+
-| 2 | Subport | Configurable (default: 8) | #. Traffic shaping using token bucket algorithm (one token |
-| | | | bucket per subport). |
-| | | | |
-| | | | #. Upper limit enforced per Traffic Class (TC) at the |
-| | | | subport level. |
-| | | | |
-| | | | #. Lower priority TCs able to reuse subport bandwidth |
-| | | | currently unused by higher priority TCs. |
-| | | | |
-+---+--------------------+----------------------------+---------------------------------------------------------------+
-| 3 | Pipe | Configurable (default: 4K) | #. Traffic shaping using the token bucket algorithm (one |
-| | | | token bucket per pipe. |
-| | | | |
-+---+--------------------+----------------------------+---------------------------------------------------------------+
-| 4 | Traffic Class (TC) | 4 | #. TCs of the same pipe handled in strict priority order. |
-| | | | |
-| | | | #. Upper limit enforced per TC at the pipe level. |
-| | | | |
-| | | | #. Lower priority TCs able to reuse pipe bandwidth currently |
-| | | | unused by higher priority TCs. |
-| | | | |
-| | | | #. When subport TC is oversubscribed (configuration time |
-| | | | event), pipe TC upper limit is capped to a dynamically |
-| | | | adjusted value that is shared by all the subport pipes. |
-| | | | |
-+---+--------------------+----------------------------+---------------------------------------------------------------+
-| 5 | Queue | 4 | #. Queues of the same TC are serviced using Weighted Round |
-| | | | Robin (WRR) according to predefined weights. |
-| | | | |
-+---+--------------------+----------------------------+---------------------------------------------------------------+
+.. _table_qos_3:
+
+.. table:: Port Scheduling Hierarchy
+
+ +---+--------------------+----------------------------+---------------------------------------------------------------+
+ | # | Level | Siblings per Parent | Functional Description |
+ | | | | |
+ +===+====================+============================+===============================================================+
+ | 1 | Port | - | #. Output Ethernet port 1/10/40 GbE. |
+ | | | | |
+ | | | | #. Multiple ports are scheduled in round robin order with |
+ | | | | all ports having equal priority. |
+ | | | | |
+ +---+--------------------+----------------------------+---------------------------------------------------------------+
+ | 2 | Subport | Configurable (default: 8) | #. Traffic shaping using token bucket algorithm (one token |
+ | | | | bucket per subport). |
+ | | | | |
+ | | | | #. Upper limit enforced per Traffic Class (TC) at the |
+ | | | | subport level. |
+ | | | | |
+ | | | | #. Lower priority TCs able to reuse subport bandwidth |
+ | | | | currently unused by higher priority TCs. |
+ | | | | |
+ +---+--------------------+----------------------------+---------------------------------------------------------------+
+ | 3 | Pipe | Configurable (default: 4K) | #. Traffic shaping using the token bucket algorithm (one |
+ | | | | token bucket per pipe). |
+ | | | | |
+ +---+--------------------+----------------------------+---------------------------------------------------------------+
+ | 4 | Traffic Class (TC) | 4 | #. TCs of the same pipe handled in strict priority order. |
+ | | | | |
+ | | | | #. Upper limit enforced per TC at the pipe level. |
+ | | | | |
+ | | | | #. Lower priority TCs able to reuse pipe bandwidth currently |
+ | | | | unused by higher priority TCs. |
+ | | | | |
+ | | | | #. When subport TC is oversubscribed (configuration time |
+ | | | | event), pipe TC upper limit is capped to a dynamically |
+ | | | | adjusted value that is shared by all the subport pipes. |
+ | | | | |
+ +---+--------------------+----------------------------+---------------------------------------------------------------+
+ | 5 | Queue | 4 | #. Queues of the same TC are serviced using Weighted Round |
+ | | | | Robin (WRR) according to predefined weights. |
+ | | | | |
+ +---+--------------------+----------------------------+---------------------------------------------------------------+
Application Programming Interface (API)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Internal Data Structures per Port
-.. _pg_table_4:
-
-**Table 4. Scheduler Internal Data Structures per Port**
-
-+---+----------------------+-------------------------+---------------------+------------------------------+---------------------------------------------------+
-| # | Data structure | Size (bytes) | # per port | Access type | Description |
-| | | | | | |
-| | | | +-------------+----------------+---------------------------------------------------+
-| | | | | Enq | Deq | |
-| | | | | | | |
-+===+======================+=========================+=====================+=============+================+===================================================+
-| 1 | Subport table entry | 64 | # subports per port | - | Rd, Wr | Persistent subport data (credits, etc). |
-| | | | | | | |
-+---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
-| 2 | Pipe table entry | 64 | # pipes per port | - | Rd, Wr | Persistent data for pipe, its TCs and its queues |
-| | | | | | | (credits, etc) that is updated during run-time. |
-| | | | | | | |
-| | | | | | | The pipe configuration parameters do not change |
-| | | | | | | during run-time. The same pipe configuration |
-| | | | | | | parameters are shared by multiple pipes, |
-| | | | | | | therefore they are not part of pipe table entry. |
-| | | | | | | |
-+---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
-| 3 | Queue table entry | 4 | #queues per port | Rd, Wr | Rd, Wr | Persistent queue data (read and write pointers). |
-| | | | | | | The queue size is the same per TC for all queues, |
-| | | | | | | allowing the queue base address to be computed |
-| | | | | | | using a fast formula, so these two parameters are |
-| | | | | | | not part of queue table entry. |
-| | | | | | | |
-| | | | | | | The queue table entries for any given pipe are |
-| | | | | | | stored in the same cache line. |
-| | | | | | | |
-+---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
-| 4 | Queue storage area | Config (default: 64 x8) | # queues per port | Wr | Rd | Array of elements per queue; each element is 8 |
-| | | | | | | byte in size (mbuf pointer). |
-| | | | | | | |
-+---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
-| 5 | Active queues bitmap | 1 bit per queue | 1 | Wr (Set) | Rd, Wr (Clear) | The bitmap maintains one status bit per queue: |
-| | | | | | | queue not active (queue is empty) or queue active |
-| | | | | | | (queue is not empty). |
-| | | | | | | |
-| | | | | | | Queue bit is set by the scheduler enqueue and |
-| | | | | | | cleared by the scheduler dequeue when queue |
-| | | | | | | becomes empty. |
-| | | | | | | |
-| | | | | | | Bitmap scan operation returns the next non-empty |
-| | | | | | | pipe and its status (16-bit mask of active queue |
-| | | | | | | in the pipe). |
-| | | | | | | |
-+---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
-| 6 | Grinder | ~128 | Config (default: 8) | - | Rd, Wr | Short list of active pipes currently under |
-| | | | | | | processing. The grinder contains temporary data |
-| | | | | | | during pipe processing. |
-| | | | | | | |
-| | | | | | | Once the current pipe exhausts packets or |
-| | | | | | | credits, it is replaced with another active pipe |
-| | | | | | | from the bitmap. |
-| | | | | | | |
-+---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
+.. _table_qos_4:
+
+.. table:: Scheduler Internal Data Structures per Port
+
+ +---+----------------------+-------------------------+---------------------+------------------------------+---------------------------------------------------+
+ | # | Data structure | Size (bytes) | # per port | Access type | Description |
+ | | | | | | |
+ | | | | +-------------+----------------+---------------------------------------------------+
+ | | | | | Enq | Deq | |
+ | | | | | | | |
+ +===+======================+=========================+=====================+=============+================+===================================================+
+ | 1 | Subport table entry | 64 | # subports per port | - | Rd, Wr | Persistent subport data (credits, etc.). |
+ | | | | | | | |
+ +---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
+ | 2 | Pipe table entry | 64 | # pipes per port | - | Rd, Wr | Persistent data for pipe, its TCs and its queues |
+ | | | | | | | (credits, etc.) that is updated during run-time. |
+ | | | | | | | |
+ | | | | | | | The pipe configuration parameters do not change |
+ | | | | | | | during run-time. The same pipe configuration |
+ | | | | | | | parameters are shared by multiple pipes, |
+ | | | | | | | therefore they are not part of pipe table entry. |
+ | | | | | | | |
+ +---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
+ | 3 | Queue table entry | 4 | # queues per port | Rd, Wr | Rd, Wr | Persistent queue data (read and write pointers). |
+ | | | | | | | The queue size is the same per TC for all queues, |
+ | | | | | | | allowing the queue base address to be computed |
+ | | | | | | | using a fast formula, so these two parameters are |
+ | | | | | | | not part of queue table entry. |
+ | | | | | | | |
+ | | | | | | | The queue table entries for any given pipe are |
+ | | | | | | | stored in the same cache line. |
+ | | | | | | | |
+ +---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
+ | 4 | Queue storage area | Config (default: 64 x8) | # queues per port | Wr | Rd | Array of elements per queue; each element is 8 |
+ | | | | | | | bytes in size (mbuf pointer). |
+ | | | | | | | |
+ +---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
+ | 5 | Active queues bitmap | 1 bit per queue | 1 | Wr (Set) | Rd, Wr (Clear) | The bitmap maintains one status bit per queue: |
+ | | | | | | | queue not active (queue is empty) or queue active |
+ | | | | | | | (queue is not empty). |
+ | | | | | | | |
+ | | | | | | | Queue bit is set by the scheduler enqueue and |
+ | | | | | | | cleared by the scheduler dequeue when queue |
+ | | | | | | | becomes empty. |
+ | | | | | | | |
+ | | | | | | | Bitmap scan operation returns the next non-empty |
+ | | | | | | | pipe and its status (16-bit mask of active queues |
+ | | | | | | | in the pipe). |
+ | | | | | | | |
+ +---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
+ | 6 | Grinder | ~128 | Config (default: 8) | - | Rd, Wr | Short list of active pipes currently under |
+ | | | | | | | processing. The grinder contains temporary data |
+ | | | | | | | during pipe processing. |
+ | | | | | | | |
+ | | | | | | | Once the current pipe exhausts packets or |
+ | | | | | | | credits, it is replaced with another active pipe |
+ | | | | | | | from the bitmap. |
+ | | | | | | | |
+ +---+----------------------+-------------------------+---------------------+-------------+----------------+---------------------------------------------------+
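
The set, clear and scan operations on the active queues bitmap can be illustrated with a deliberately simplified, flat layout. This sketch assumes 16 queues per pipe (4 traffic classes with 4 queues each) and a tiny pipe count; the type and function names are hypothetical, and a production bitmap would normally use a faster scan than the linear loop shown here.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUES_PER_PIPE 16   /* 4 traffic classes x 4 queues */
#define N_PIPES         8    /* small value chosen for illustration */

/* One 16-bit word per pipe: bit q set => queue q of that pipe is not empty. */
struct active_bitmap {
    uint16_t pipe_mask[N_PIPES];
};

/* Called by enqueue: mark the queue as active. */
static void
bitmap_set(struct active_bitmap *bm, uint32_t qindex)
{
    assert(qindex < N_PIPES * QUEUES_PER_PIPE);
    bm->pipe_mask[qindex / QUEUES_PER_PIPE] |=
        (uint16_t)(1u << (qindex % QUEUES_PER_PIPE));
}

/* Called by dequeue when the queue becomes empty. */
static void
bitmap_clear(struct active_bitmap *bm, uint32_t qindex)
{
    assert(qindex < N_PIPES * QUEUES_PER_PIPE);
    bm->pipe_mask[qindex / QUEUES_PER_PIPE] &=
        (uint16_t)~(1u << (qindex % QUEUES_PER_PIPE));
}

/*
 * Scan for the next non-empty pipe, starting at pipe index `start` and
 * wrapping around. Returns 1 and fills *pipe and *mask (the 16-bit mask of
 * active queues in that pipe), or returns 0 when every queue is empty.
 */
static int
bitmap_scan(const struct active_bitmap *bm, uint32_t start,
            uint32_t *pipe, uint16_t *mask)
{
    for (uint32_t i = 0; i < N_PIPES; i++) {
        uint32_t p = (start + i) % N_PIPES;

        if (bm->pipe_mask[p] != 0) {
            *pipe = p;
            *mask = bm->pipe_mask[p];
            return 1;
        }
    }
    return 0;
}
```

A grinder that has exhausted its current pipe would call the scan to pick up the next pipe with pending packets, together with the mask telling it which of that pipe's queues to visit.
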
Multicore Scaling Strategy
^^^^^^^^^^^^^^^^^^^^^^^^^^
The number of credits required for the transmission of a packet of n bytes is equal to (n+h),
where h is equal to the number of framing overhead bytes per packet.
-.. _pg_table_5:
-
-**Table 5. Ethernet Frame Overhead Fields**
-
-+---+--------------------------------+----------------+---------------------------------------------------------------------------+
-| # | Packet field | Length (bytes) | Comments |
-| | | | |
-+===+================================+================+===========================================================================+
-| 1 | Preamble | 7 | |
-| | | | |
-+---+--------------------------------+----------------+---------------------------------------------------------------------------+
-| 2 | Start of Frame Delimiter (SFD) | 1 | |
-| | | | |
-+---+--------------------------------+----------------+---------------------------------------------------------------------------+
-| 3 | Frame Check Sequence (FCS) | 4 | Considered overhead only if not included in the mbuf packet length field. |
-| | | | |
-+---+--------------------------------+----------------+---------------------------------------------------------------------------+
-| 4 | Inter Frame Gap (IFG) | 12 | |
-| | | | |
-+---+--------------------------------+----------------+---------------------------------------------------------------------------+
-| 5 | Total | 24 | |
-| | | | |
-+---+--------------------------------+----------------+---------------------------------------------------------------------------+
+.. _table_qos_5:
+
+.. table:: Ethernet Frame Overhead Fields
+
+ +---+--------------------------------+----------------+---------------------------------------------------------------------------+
+ | # | Packet field | Length (bytes) | Comments |
+ | | | | |
+ +===+================================+================+===========================================================================+
+ | 1 | Preamble | 7 | |
+ | | | | |
+ +---+--------------------------------+----------------+---------------------------------------------------------------------------+
+ | 2 | Start of Frame Delimiter (SFD) | 1 | |
+ | | | | |
+ +---+--------------------------------+----------------+---------------------------------------------------------------------------+
+ | 3 | Frame Check Sequence (FCS) | 4 | Considered overhead only if not included in the mbuf packet length field. |
+ | | | | |
+ +---+--------------------------------+----------------+---------------------------------------------------------------------------+
+ | 4 | Inter Frame Gap (IFG) | 12 | |
+ | | | | |
+ +---+--------------------------------+----------------+---------------------------------------------------------------------------+
+ | 5 | Total | 24 | |
+ | | | | |
+ +---+--------------------------------+----------------+---------------------------------------------------------------------------+
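
As a worked example of the (n+h) rule, the following sketch computes the credits needed for one packet from the overhead fields listed above. The macro and function names are hypothetical; the only inputs taken from the text are the field lengths and the condition under which the FCS counts as overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Field lengths in bytes, as listed in the overhead table. */
#define PREAMBLE_LEN 7
#define SFD_LEN      1
#define FCS_LEN      4
#define IFG_LEN      12

/*
 * h: framing overhead per packet. The FCS is counted only when it is
 * not already included in the mbuf packet length.
 */
static uint32_t
frame_overhead(int fcs_in_pkt_len)
{
    uint32_t h = PREAMBLE_LEN + SFD_LEN + IFG_LEN;

    if (!fcs_in_pkt_len)
        h += FCS_LEN;
    return h;
}

/* Credits required to transmit a packet of pkt_len bytes: n + h. */
static uint32_t
credits_needed(uint32_t pkt_len, int fcs_in_pkt_len)
{
    assert(pkt_len > 0);
    return pkt_len + frame_overhead(fcs_in_pkt_len);
}
```

For instance, a 60-byte packet whose length excludes the FCS and a 64-byte packet whose length includes it both occupy the same time on the wire, and both come out at 84 credits.
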
Traffic Shaping
"""""""""""""""
The traffic shaping for subport and pipe is implemented using a token bucket per subport/per pipe.
Each token bucket is implemented using one saturated counter that keeps track of the number of available credits.
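
A saturated counter of this kind can be sketched in a few lines of C. The sketch below is an assumption-laden illustration rather than the scheduler's code: the structure, the function names, the microsecond time base and the choice to start the bucket half full are all hypothetical, and only the two parameters ``bucket_rate`` and ``bucket_size`` come from the text.

```c
#include <assert.h>
#include <stdint.h>

struct token_bucket {
    uint64_t rate;      /* bucket_rate: credits added per second */
    uint64_t size;      /* bucket_size: maximum credits stored */
    uint64_t credits;   /* saturated counter of available credits */
};

/* Initialization: here the bucket starts half full (one possible choice). */
static void
tb_init(struct token_bucket *tb, uint64_t rate, uint64_t size)
{
    assert(size > 0);
    tb->rate = rate;
    tb->size = size;
    tb->credits = size / 2;
}

/*
 * Credit update on demand: add the credits earned over elapsed_us
 * microseconds, saturating at bucket_size. Credits that do not fit are
 * dropped. The multiplication assumes rate * elapsed_us fits in 64 bits.
 */
static void
tb_update(struct token_bucket *tb, uint64_t elapsed_us)
{
    uint64_t add = tb->rate * elapsed_us / 1000000;
    uint64_t room = tb->size - tb->credits;

    tb->credits += (add < room) ? add : room;
}

/*
 * Credit consumption: the packet is sent only if the bucket holds enough
 * credits for the full packet (bytes plus framing overhead).
 * Returns 1 if the credits were consumed, 0 otherwise.
 */
static int
tb_consume(struct token_bucket *tb, uint64_t pkt_credits)
{
    if (tb->credits < pkt_credits)
        return 0;
    tb->credits -= pkt_credits;
    return 1;
}
```

Updating on demand, at the moment a packet is considered for scheduling, avoids a periodic timer per bucket, which matters when there are thousands of pipes.
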
-The token bucket generic parameters and operations are presented in Table 6 and Table 7.
-
-.. _pg_table_6:
-
-**Table 6. Token Bucket Generic Operations**
-
-+---+------------------------+--------------------+---------------------------------------------------------+
-| # | Token Bucket Parameter | Unit | Description |
-| | | | |
-+===+========================+====================+=========================================================+
-| 1 | bucket_rate | Credits per second | Rate of adding credits to the bucket. |
-| | | | |
-+---+------------------------+--------------------+---------------------------------------------------------+
-| 2 | bucket_size | Credits | Max number of credits that can be stored in the bucket. |
-| | | | |
-+---+------------------------+--------------------+---------------------------------------------------------+
-
-.. _pg_table_7:
-
-**Table 7. Token Bucket Generic Parameters**
-
-+---+------------------------+------------------------------------------------------------------------------+
-| # | Token Bucket Operation | Description |
-| | | |
-+===+========================+==============================================================================+
-| 1 | Initialization | Bucket set to a predefined value, e.g. zero or half of the bucket size. |
-| | | |
-+---+------------------------+------------------------------------------------------------------------------+
-| 2 | Credit update | Credits are added to the bucket on top of existing ones, either periodically |
-| | | or on demand, based on the bucket_rate. Credits cannot exceed the upper |
-| | | limit defined by the bucket_size, so any credits to be added to the bucket |
-| | | while the bucket is full are dropped. |
-| | | |
-+---+------------------------+------------------------------------------------------------------------------+
-| 3 | Credit consumption | As result of packet scheduling, the necessary number of credits is removed |
-| | | from the bucket. The packet can only be sent if enough credits are in the |
-| | | bucket to send the full packet (packet bytes and framing overhead for the |
-| | | packet). |
-| | | |
-+---+------------------------+------------------------------------------------------------------------------+
+The token bucket generic parameters and operations are presented in :numref:`table_qos_6` and :numref:`table_qos_7`.
+
+.. _table_qos_6:
+
+.. table:: Token Bucket Generic Parameters
+
+ +---+------------------------+--------------------+---------------------------------------------------------+
+ | # | Token Bucket Parameter | Unit | Description |
+ | | | | |
+ +===+========================+====================+=========================================================+
+ | 1 | bucket_rate | Credits per second | Rate of adding credits to the bucket. |
+ | | | | |
+ +---+------------------------+--------------------+---------------------------------------------------------+
+ | 2 | bucket_size | Credits | Max number of credits that can be stored in the bucket. |
+ | | | | |
+ +---+------------------------+--------------------+---------------------------------------------------------+
+
+.. _table_qos_7:
+
+.. table:: Token Bucket Generic Operations
+
+ +---+------------------------+------------------------------------------------------------------------------+
+ | # | Token Bucket Operation | Description |
+ | | | |
+ +===+========================+==============================================================================+
+ | 1 | Initialization | Bucket set to a predefined value, e.g. zero or half of the bucket size. |
+ | | | |
+ +---+------------------------+------------------------------------------------------------------------------+
+ | 2 | Credit update | Credits are added to the bucket on top of existing ones, either periodically |
+ | | | or on demand, based on the bucket_rate. Credits cannot exceed the upper |
+ | | | limit defined by the bucket_size, so any credits to be added to the bucket |
+ | | | while the bucket is full are dropped. |
+ | | | |
+ +---+------------------------+------------------------------------------------------------------------------+
+ | 3 | Credit consumption | As a result of packet scheduling, the necessary number of credits is removed |
+ | | | from the bucket. The packet can only be sent if enough credits are in the |
+ | | | bucket to send the full packet (packet bytes and framing overhead for the |
+ | | | packet). |
+ | | | |
+ +---+------------------------+------------------------------------------------------------------------------+
To implement the token bucket generic operations described above,
-the current design uses the persistent data structure presented in,
-while the implementation of the token bucket operations is described in Table 9.
-
-.. _pg_table_8:
-
-**Table 8. Token Bucket Persistent Data Structure**
-
-+---+------------------------+-------+----------------------------------------------------------------------+
-| # | Token bucket field | Unit | Description |
-| | | | |
-+===+========================+=======+======================================================================+
-| 1 | tb_time | Bytes | Time of the last credit update. Measured in bytes instead of seconds |
-| | | | or CPU cycles for ease of credit consumption operation |
-| | | | (as the current time is also maintained in bytes). |
-| | | | |
-| | | | See Section 26.2.4.5.1 "Internal Time Reference" for an |
-| | | | explanation of why the time is maintained in byte units. |
-| | | | |
-+---+------------------------+-------+----------------------------------------------------------------------+
-| 2 | tb_period | Bytes | Time period that should elapse since the last credit update in order |
-| | | | for the bucket to be awarded tb_credits_per_period worth or credits. |
-| | | | |
-+---+------------------------+-------+----------------------------------------------------------------------+
-| 3 | tb_credits_per_period | Bytes | Credit allowance per tb_period. |
-| | | | |
-+---+------------------------+-------+----------------------------------------------------------------------+
-| 4 | tb_size | Bytes | Bucket size, i.e. upper limit for the tb_credits. |
-| | | | |
-+---+------------------------+-------+----------------------------------------------------------------------+
-| 5 | tb_credits | Bytes | Number of credits currently in the bucket. |
-| | | | |
-+---+------------------------+-------+----------------------------------------------------------------------+
+the current design uses the persistent data structure presented in :numref:`table_qos_8`,
+while the implementation of the token bucket operations is described in :numref:`table_qos_9`.
+
+.. _table_qos_8:
+
+.. table:: Token Bucket Persistent Data Structure
+
+ +---+------------------------+-------+----------------------------------------------------------------------+
+ | # | Token bucket field | Unit | Description |
+ | | | | |
+ +===+========================+=======+======================================================================+
+ | 1 | tb_time | Bytes | Time of the last credit update. Measured in bytes instead of seconds |
+ | | | | or CPU cycles for ease of credit consumption operation |
+ | | | | (as the current time is also maintained in bytes). |
+ | | | | |
+ | | | | See Section 26.2.4.5.1 "Internal Time Reference" for an |
+ | | | | explanation of why the time is maintained in byte units. |
+ | | | | |
+ +---+------------------------+-------+----------------------------------------------------------------------+
+ | 2 | tb_period | Bytes | Time period that should elapse since the last credit update in order |
+ | | | | for the bucket to be awarded tb_credits_per_period worth of credits. |
+ | | | | |
+ +---+------------------------+-------+----------------------------------------------------------------------+
+ | 3 | tb_credits_per_period | Bytes | Credit allowance per tb_period. |
+ | | | | |
+ +---+------------------------+-------+----------------------------------------------------------------------+
+ | 4 | tb_size | Bytes | Bucket size, i.e. upper limit for the tb_credits. |
+ | | | | |
+ +---+------------------------+-------+----------------------------------------------------------------------+
+ | 5 | tb_credits | Bytes | Number of credits currently in the bucket. |
+ | | | | |
+ +---+------------------------+-------+----------------------------------------------------------------------+
The bucket rate (in bytes per second) can be computed with the following formula:

*bucket_rate = (tb_credits_per_period / tb_period) * r*

where r = port line rate (in bytes per second).
-.. _pg_table_9:
-
-**Table 9. Token Bucket Operations**
-
-+---+-------------------------+-----------------------------------------------------------------------------+
-| # | Token bucket operation | Description |
-| | | |
-+===+=========================+=============================================================================+
-| 1 | Initialization | *tb_credits = 0; or tb_credits = tb_size / 2;* |
-| | | |
-+---+-------------------------+-----------------------------------------------------------------------------+
-| 2 | Credit update | Credit update options: |
-| | | |
-| | | * Every time a packet is sent for a port, update the credits of all the |
-| | | the subports and pipes of that port. Not feasible. |
-| | | |
-| | | * Every time a packet is sent, update the credits for the pipe and |
-| | | subport. Very accurate, but not needed (a lot of calculations). |
-| | | |
-| | | * Every time a pipe is selected (that is, picked by one |
-| | | of the grinders), update the credits for the pipe and its subport. |
-| | | |
-| | | The current implementation is using option 3. According to Section |
-| | | 26.2.4.4 "Dequeue State Machine", the pipe and subport credits are |
-| | | updated every time a pipe is selected by the dequeue process before the |
-| | | pipe and subport credits are actually used. |
-| | | |
-| | | The implementation uses a tradeoff between accuracy and speed by updating |
-| | | the bucket credits only when at least a full *tb_period* has elapsed since |
-| | | the last update. |
-| | | |
-| | | * Full accuracy can be achieved by selecting the value for *tb_period* |
-| | | for which *tb_credits_per_period = 1*. |
-| | | |
-| | | * When full accuracy is not required, better performance is achieved by |
-| | | setting *tb_credits* to a larger value. |
-| | | |
-| | | Update operations: |
-| | | |
-| | | * n_periods = (time - tb_time) / tb_period; |
-| | | |
-| | | * tb_credits += n_periods * tb_credits_per_period; |
-| | | |
-| | | * tb_credits = min(tb_credits, tb_size); |
-| | | |
-| | | * tb_time += n_periods * tb_period; |
-| | | |
-+---+-------------------------+-----------------------------------------------------------------------------+
-| 3 | Credit consumption | As result of packet scheduling, the necessary number of credits is removed |
-| | (on packet scheduling) | from the bucket. The packet can only be sent if enough credits are in the |
-| | | bucket to send the full packet (packet bytes and framing overhead for the |
-| | | packet). |
-| | | |
-| | | Scheduling operations: |
-| | | |
-| | | pkt_credits = pkt_len + frame_overhead; |
-| | | if (tb_credits >= pkt_credits){tb_credits -= pkt_credits;} |
-| | | |
-+---+-------------------------+-----------------------------------------------------------------------------+
+.. _table_qos_9:
+
+.. table:: Token Bucket Operations
+
+ +---+-------------------------+-----------------------------------------------------------------------------+
+ | # | Token bucket operation | Description |
+ | | | |
+ +===+=========================+=============================================================================+
+ | 1 | Initialization | *tb_credits = 0; or tb_credits = tb_size / 2;* |
+ | | | |
+ +---+-------------------------+-----------------------------------------------------------------------------+
+ | 2 | Credit update | Credit update options: |
+ | | | |
+ | | | * Every time a packet is sent for a port, update the credits of all the |
+ | | | subports and pipes of that port. Not feasible. |
+ | | | |
+ | | | * Every time a packet is sent, update the credits for the pipe and |
+ | | | subport. Very accurate, but not needed (a lot of calculations). |
+ | | | |
+ | | | * Every time a pipe is selected (that is, picked by one |
+ | | | of the grinders), update the credits for the pipe and its subport. |
+ | | | |
+ | | | The current implementation is using option 3. According to Section |
+ | | | 26.2.4.4 "Dequeue State Machine", the pipe and subport credits are |
+ | | | updated every time a pipe is selected by the dequeue process before the |
+ | | | pipe and subport credits are actually used. |
+ | | | |
+ | | | The implementation uses a tradeoff between accuracy and speed by updating |
+ | | | the bucket credits only when at least a full *tb_period* has elapsed since |
+ | | | the last update. |
+ | | | |
+ | | | * Full accuracy can be achieved by selecting the value for *tb_period* |
+ | | | for which *tb_credits_per_period = 1*. |
+ | | | |
+ | | | * When full accuracy is not required, better performance is achieved by |
+ | | | setting *tb_credits_per_period* to a larger value. |
+ | | | |
+ | | | Update operations: |
+ | | | |
+ | | | * n_periods = (time - tb_time) / tb_period; |
+ | | | |
+ | | | * tb_credits += n_periods * tb_credits_per_period; |
+ | | | |
+ | | | * tb_credits = min(tb_credits, tb_size); |
+ | | | |
+ | | | * tb_time += n_periods * tb_period; |
+ | | | |
+ +---+-------------------------+-----------------------------------------------------------------------------+
+ | 3 | Credit consumption | As a result of packet scheduling, the necessary number of credits is removed |
+ | | (on packet scheduling) | from the bucket. The packet can only be sent if enough credits are in the |
+ | | | bucket to send the full packet (packet bytes and framing overhead for the |
+ | | | packet). |
+ | | | |
+ | | | Scheduling operations: |
+ | | | |
+ | | | pkt_credits = pkt_len + frame_overhead; |
+ | | | if (tb_credits >= pkt_credits){tb_credits -= pkt_credits;} |
+ | | | |
+ +---+-------------------------+-----------------------------------------------------------------------------+
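
The update and consumption operations listed above can be sketched in C as follows.
This is a minimal illustration, not the library code: the ``tb_sketch`` structure and the
``tb_update`` / ``tb_consume`` function names are hypothetical, and only the field names
and the arithmetic are taken from the tables above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical structure mirroring the persistent fields above; all values in bytes. */
struct tb_sketch {
    uint64_t tb_time;               /* time of the last credit update */
    uint64_t tb_period;             /* time that must elapse per credit award */
    uint64_t tb_credits_per_period; /* credits awarded per tb_period */
    uint64_t tb_size;               /* bucket size, upper limit for tb_credits */
    uint64_t tb_credits;            /* credits currently in the bucket */
};

/* Credit update: award credits only for whole periods elapsed since tb_time. */
void tb_update(struct tb_sketch *tb, uint64_t time)
{
    assert(tb->tb_period > 0 && time >= tb->tb_time);

    uint64_t n_periods = (time - tb->tb_time) / tb->tb_period;

    tb->tb_credits += n_periods * tb->tb_credits_per_period;
    if (tb->tb_credits > tb->tb_size)
        tb->tb_credits = tb->tb_size;         /* credits above tb_size are dropped */
    tb->tb_time += n_periods * tb->tb_period; /* keep the remainder for the next update */
}

/* Credit consumption: returns 1 and debits the bucket if the full packet fits, else 0. */
int tb_consume(struct tb_sketch *tb, uint32_t pkt_len, uint32_t frame_overhead)
{
    uint64_t pkt_credits = (uint64_t)pkt_len + frame_overhead;

    if (tb->tb_credits < pkt_credits)
        return 0;
    tb->tb_credits -= pkt_credits;
    return 1;
}
```

Note that ``tb_time`` advances by whole periods only, so time that has elapsed past the
last full period is not lost and counts towards the next update.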
Traffic Classes
"""""""""""""""
The upper limit for the traffic classes at the subport and
pipe levels is enforced by periodically refilling the subport / pipe traffic class credit counter,
out of which credits are consumed every time a packet is scheduled for that subport / pipe,
-as described in Table 10 and Table 11.
-
-.. _pg_table_10:
-
-**Table 10. Subport/Pipe Traffic Class Upper Limit Enforcement Persistent Data Structure**
-
-+---+-----------------------+-------+-----------------------------------------------------------------------+
-| # | Subport or pipe field | Unit | Description |
-| | | | |
-+===+=======================+=======+=======================================================================+
-| 1 | tc_time | Bytes | Time of the next update (upper limit refill) for the 4 TCs of the |
-| | | | current subport / pipe. |
-| | | | |
-| | | | See Section 26.2.4.5.1, "Internal Time Reference" for the |
-| | | | explanation of why the time is maintained in byte units. |
-| | | | |
-+---+-----------------------+-------+-----------------------------------------------------------------------+
-| 2 | tc_period | Bytes | Time between two consecutive updates for the 4 TCs of the current |
-| | | | subport / pipe. This is expected to be many times bigger than the |
-| | | | typical value of the token bucket tb_period. |
-| | | | |
-+---+-----------------------+-------+-----------------------------------------------------------------------+
-| 3 | tc_credits_per_period | Bytes | Upper limit for the number of credits allowed to be consumed by the |
-| | | | current TC during each enforcement period tc_period. |
-| | | | |
-+---+-----------------------+-------+-----------------------------------------------------------------------+
-| 4 | tc_credits | Bytes | Current upper limit for the number of credits that can be consumed by |
-| | | | the current traffic class for the remainder of the current |
-| | | | enforcement period. |
-| | | | |
-+---+-----------------------+-------+-----------------------------------------------------------------------+
-
-.. _pg_table_11:
-
-**Table 11. Subport/Pipe Traffic Class Upper Limit Enforcement Operations**
-
-+---+--------------------------+----------------------------------------------------------------------------+
-| # | Traffic Class Operation | Description |
-| | | |
-+===+==========================+============================================================================+
-| 1 | Initialization | tc_credits = tc_credits_per_period; |
-| | | |
-| | | tc_time = tc_period; |
-| | | |
-+---+--------------------------+----------------------------------------------------------------------------+
-| 2 | Credit update | Update operations: |
-| | | |
-| | | if (time >= tc_time) { |
-| | | |
-| | | tc_credits = tc_credits_per_period; |
-| | | |
-| | | tc_time = time + tc_period; |
-| | | |
-| | | } |
-| | | |
-+---+--------------------------+----------------------------------------------------------------------------+
-| 3 | Credit consumption | As result of packet scheduling, the TC limit is decreased with the |
-| | (on packet scheduling) | necessary number of credits. The packet can only be sent if enough credits |
-| | | are currently available in the TC limit to send the full packet |
-| | | (packet bytes and framing overhead for the packet). |
-| | | |
-| | | Scheduling operations: |
-| | | |
-| | | pkt_credits = pk_len + frame_overhead; |
-| | | |
-| | | if (tc_credits >= pkt_credits) {tc_credits -= pkt_credits;} |
-| | | |
-+---+--------------------------+----------------------------------------------------------------------------+
+as described in :numref:`table_qos_10` and :numref:`table_qos_11`.
+
+.. _table_qos_10:
+
+.. table:: Subport/Pipe Traffic Class Upper Limit Enforcement Persistent Data Structure
+
+ +---+-----------------------+-------+-----------------------------------------------------------------------+
+ | # | Subport or pipe field | Unit | Description |
+ | | | | |
+ +===+=======================+=======+=======================================================================+
+ | 1 | tc_time | Bytes | Time of the next update (upper limit refill) for the 4 TCs of the |
+ | | | | current subport / pipe. |
+ | | | | |
+ | | | | See Section 26.2.4.5.1, "Internal Time Reference" for the |
+ | | | | explanation of why the time is maintained in byte units. |
+ | | | | |
+ +---+-----------------------+-------+-----------------------------------------------------------------------+
+ | 2 | tc_period | Bytes | Time between two consecutive updates for the 4 TCs of the current |
+ | | | | subport / pipe. This is expected to be many times bigger than the |
+ | | | | typical value of the token bucket tb_period. |
+ | | | | |
+ +---+-----------------------+-------+-----------------------------------------------------------------------+
+ | 3 | tc_credits_per_period | Bytes | Upper limit for the number of credits allowed to be consumed by the |
+ | | | | current TC during each enforcement period tc_period. |
+ | | | | |
+ +---+-----------------------+-------+-----------------------------------------------------------------------+
+ | 4 | tc_credits | Bytes | Current upper limit for the number of credits that can be consumed by |
+ | | | | the current traffic class for the remainder of the current |
+ | | | | enforcement period. |
+ | | | | |
+ +---+-----------------------+-------+-----------------------------------------------------------------------+
+
+.. _table_qos_11:
+
+.. table:: Subport/Pipe Traffic Class Upper Limit Enforcement Operations
+
+ +---+--------------------------+----------------------------------------------------------------------------+
+ | # | Traffic Class Operation | Description |
+ | | | |
+ +===+==========================+============================================================================+
+ | 1 | Initialization | tc_credits = tc_credits_per_period; |
+ | | | |
+ | | | tc_time = tc_period; |
+ | | | |
+ +---+--------------------------+----------------------------------------------------------------------------+
+ | 2 | Credit update | Update operations: |
+ | | | |
+ | | | if (time >= tc_time) { |
+ | | | |
+ | | | tc_credits = tc_credits_per_period; |
+ | | | |
+ | | | tc_time = time + tc_period; |
+ | | | |
+ | | | } |
+ | | | |
+ +---+--------------------------+----------------------------------------------------------------------------+
+ | 3 | Credit consumption | As a result of packet scheduling, the TC limit is decreased with the |
+ | | (on packet scheduling) | necessary number of credits. The packet can only be sent if enough credits |
+ | | | are currently available in the TC limit to send the full packet |
+ | | | (packet bytes and framing overhead for the packet). |
+ | | | |
+ | | | Scheduling operations: |
+ | | | |
+ | | | pkt_credits = pkt_len + frame_overhead; |
+ | | | |
+ | | | if (tc_credits >= pkt_credits) {tc_credits -= pkt_credits;} |
+ | | | |
+ +---+--------------------------+----------------------------------------------------------------------------+
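
The enforcement operations above can be sketched in C for a single traffic class.
This is an illustrative sketch under stated assumptions: the ``tc_sketch`` structure and
the ``tc_init`` / ``tc_update`` / ``tc_consume`` names are hypothetical, and the real data
structure tracks credits for all 4 TCs of a subport or pipe rather than one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-TC state; field names follow the table above, values in bytes. */
struct tc_sketch {
    uint64_t tc_time;               /* time of the next refill */
    uint64_t tc_period;             /* time between two consecutive refills */
    uint64_t tc_credits_per_period; /* upper limit per enforcement period */
    uint64_t tc_credits;            /* credits left in the current period */
};

/* Initialization: start with a full allowance; first refill is due after one period. */
void tc_init(struct tc_sketch *tc, uint64_t period, uint64_t credits_per_period)
{
    assert(period > 0);
    tc->tc_period = period;
    tc->tc_credits_per_period = credits_per_period;
    tc->tc_credits = credits_per_period;
    tc->tc_time = period;
}

/* Credit update: once the refill time is reached, reset the allowance (no accumulation). */
void tc_update(struct tc_sketch *tc, uint64_t time)
{
    if (time >= tc->tc_time) {
        tc->tc_credits = tc->tc_credits_per_period;
        tc->tc_time = time + tc->tc_period;
    }
}

/* Credit consumption: returns 1 and debits the TC limit if the full packet fits, else 0. */
int tc_consume(struct tc_sketch *tc, uint32_t pkt_len, uint32_t frame_overhead)
{
    uint64_t pkt_credits = (uint64_t)pkt_len + frame_overhead;

    if (tc->tc_credits < pkt_credits)
        return 0;
    tc->tc_credits -= pkt_credits;
    return 1;
}
```

Unlike the token bucket, unused credits are not carried over: each refill overwrites
``tc_credits`` instead of adding to it, which is what makes this an upper limit per period.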
Weighted Round Robin (WRR)
""""""""""""""""""""""""""
-The evolution of the WRR design solution from simple to complex is shown in Table 12.
-
-.. _pg_table_12:
-
-**Table 12. Weighted Round Robin (WRR)**
-
-+---+------------+-----------------+-------------+----------------------------------------------------------+
-| # | All Queues | Equal Weights | All Packets | Strategy |
-| | Active? | for All Queues? | Equal? | |
-+===+============+=================+=============+==========================================================+
-| 1 | Yes | Yes | Yes | **Byte level round robin** |
-| | | | | |
-| | | | | *Next queue* queue #i, i = *(i + 1) % n* |
-| | | | | |
-+---+------------+-----------------+-------------+----------------------------------------------------------+
-| 2 | Yes | Yes | No | **Packet level round robin** |
-| | | | | |
-| | | | | Consuming one byte from queue #i requires consuming |
-| | | | | exactly one token for queue #i. |
-| | | | | |
-| | | | | T(i) = Accumulated number of tokens previously consumed |
-| | | | | from queue #i. Every time a packet is consumed from |
-| | | | | queue #i, T(i) is updated as: T(i) += *pkt_len*. |
-| | | | | |
-| | | | | *Next queue* : queue with the smallest T. |
-| | | | | |
-| | | | | |
-+---+------------+-----------------+-------------+----------------------------------------------------------+
-| 3 | Yes | No | No | **Packet level weighted round robin** |
-| | | | | |
-| | | | | This case can be reduced to the previous case by |
-| | | | | introducing a cost per byte that is different for each |
-| | | | | queue. Queues with lower weights have a higher cost per |
-| | | | | byte. This way, it is still meaningful to compare the |
-| | | | | consumption among different queues in order to select |
-| | | | | the next queue. |
-| | | | | |
-| | | | | w(i) = Weight of queue #i |
-| | | | | |
-| | | | | t(i) = Tokens per byte for queue #i, defined as the |
-| | | | | inverse weight of queue #i. |
-| | | | | For example, if w[0..3] = [1:2:4:8], |
-| | | | | then t[0..3] = [8:4:2:1]; if w[0..3] = [1:4:15:20], |
-| | | | | then t[0..3] = [60:15:4:3]. |
-| | | | | Consuming one byte from queue #i requires consuming t(i) |
-| | | | | tokens for queue #i. |
-| | | | | |
-| | | | | T(i) = Accumulated number of tokens previously consumed |
-| | | | | from queue #i. Every time a packet is consumed from |
-| | | | | queue #i, T(i) is updated as: *T(i) += pkt_len * t(i)*. |
-| | | | | *Next queue* : queue with the smallest T. |
-| | | | | |
-+---+------------+-----------------+-------------+----------------------------------------------------------+
-| 4 | No | No | No | **Packet level weighted round robin with variable queue |
-| | | | | status** |
-| | | | | |
-| | | | | Reduce this case to the previous case by setting the |
-| | | | | consumption of inactive queues to a high number, so that |
-| | | | | the inactive queues will never be selected by the |
-| | | | | smallest T logic. |
-| | | | | |
-| | | | | To prevent T from overflowing as result of successive |
-| | | | | accumulations, T(i) is truncated after each packet |
-| | | | | consumption for all queues. |
-| | | | | For example, T[0..3] = [1000, 1100, 1200, 1300] |
-| | | | | is truncated to T[0..3] = [0, 100, 200, 300] |
-| | | | | by subtracting the min T from T(i), i = 0..n. |
-| | | | | |
-| | | | | This requires having at least one active queue in the |
-| | | | | set of input queues, which is guaranteed by the dequeue |
-| | | | | state machine never selecting an inactive traffic class. |
-| | | | | |
-| | | | | *mask(i) = Saturation mask for queue #i, defined as:* |
-| | | | | |
-| | | | | mask(i) = (queue #i is active)? 0 : 0xFFFFFFFF; |
-| | | | | |
-| | | | | w(i) = Weight of queue #i |
-| | | | | |
-| | | | | t(i) = Tokens per byte for queue #i, defined as the |
-| | | | | inverse weight of queue #i. |
-| | | | | |
-| | | | | T(i) = Accumulated numbers of tokens previously consumed |
-| | | | | from queue #i. |
-| | | | | |
-| | | | | *Next queue* : queue with smallest T. |
-| | | | | |
-| | | | | Before packet consumption from queue #i: |
-| | | | | |
-| | | | | *T(i) |= mask(i)* |
-| | | | | |
-| | | | | After packet consumption from queue #i: |
-| | | | | |
-| | | | | T(j) -= T(i), j != i |
-| | | | | |
-| | | | | T(i) = pkt_len * t(i) |
-| | | | | |
-| | | | | Note: T(j) uses the T(i) value before T(i) is updated. |
-| | | | | |
-+---+------------+-----------------+-------------+----------------------------------------------------------+
+The evolution of the WRR design solution from simple to complex is shown in :numref:`table_qos_12`.
+
+.. _table_qos_12:
+
+.. table:: Weighted Round Robin (WRR)
+
+ +---+------------+-----------------+-------------+----------------------------------------------------------+
+ | # | All Queues | Equal Weights | All Packets | Strategy |
+ | | Active? | for All Queues? | Equal? | |
+ +===+============+=================+=============+==========================================================+
+ | 1 | Yes | Yes | Yes | **Byte level round robin** |
+ | | | | | |
+ | | | | | *Next queue* : queue #i, i = *(i + 1) % n* |
+ | | | | | |
+ +---+------------+-----------------+-------------+----------------------------------------------------------+
+ | 2 | Yes | Yes | No | **Packet level round robin** |
+ | | | | | |
+ | | | | | Consuming one byte from queue #i requires consuming |
+ | | | | | exactly one token for queue #i. |
+ | | | | | |
+ | | | | | T(i) = Accumulated number of tokens previously consumed |
+ | | | | | from queue #i. Every time a packet is consumed from |
+ | | | | | queue #i, T(i) is updated as: T(i) += *pkt_len*. |
+ | | | | | |
+ | | | | | *Next queue* : queue with the smallest T. |
+ | | | | | |
+ | | | | | |
+ +---+------------+-----------------+-------------+----------------------------------------------------------+
+ | 3 | Yes | No | No | **Packet level weighted round robin** |
+ | | | | | |
+ | | | | | This case can be reduced to the previous case by |
+ | | | | | introducing a cost per byte that is different for each |
+ | | | | | queue. Queues with lower weights have a higher cost per |
+ | | | | | byte. This way, it is still meaningful to compare the |
+ | | | | | consumption amongst different queues in order to select |
+ | | | | | the next queue. |
+ | | | | | |
+ | | | | | w(i) = Weight of queue #i |
+ | | | | | |
+ | | | | | t(i) = Tokens per byte for queue #i, defined as the |
+ | | | | | inverse weight of queue #i. |
+ | | | | | For example, if w[0..3] = [1:2:4:8], |
+ | | | | | then t[0..3] = [8:4:2:1]; if w[0..3] = [1:4:15:20], |
+ | | | | | then t[0..3] = [60:15:4:3]. |
+ | | | | | Consuming one byte from queue #i requires consuming t(i) |
+ | | | | | tokens for queue #i. |
+ | | | | | |
+ | | | | | T(i) = Accumulated number of tokens previously consumed |
+ | | | | | from queue #i. Every time a packet is consumed from |
+ | | | | | queue #i, T(i) is updated as: *T(i) += pkt_len * t(i)*. |
+ | | | | | *Next queue* : queue with the smallest T. |
+ | | | | | |
+ +---+------------+-----------------+-------------+----------------------------------------------------------+
+ | 4 | No | No | No | **Packet level weighted round robin with variable queue |
+ | | | | | status** |
+ | | | | | |
+ | | | | | Reduce this case to the previous case by setting the |
+ | | | | | consumption of inactive queues to a high number, so that |
+ | | | | | the inactive queues will never be selected by the |
+ | | | | | smallest T logic. |
+ | | | | | |
+ | | | | | To prevent T from overflowing as a result of successive |
+ | | | | | accumulations, T(i) is truncated after each packet |
+ | | | | | consumption for all queues. |
+ | | | | | For example, T[0..3] = [1000, 1100, 1200, 1300] |
+ | | | | | is truncated to T[0..3] = [0, 100, 200, 300] |
+ | | | | | by subtracting the min T from T(i), i = 0..n-1. |
+ | | | | | |
+ | | | | | This requires having at least one active queue in the |
+ | | | | | set of input queues, which is guaranteed by the dequeue |
+ | | | | | state machine never selecting an inactive traffic class. |
+ | | | | | |
+ | | | | | *mask(i) = Saturation mask for queue #i, defined as:* |
+ | | | | | |
+ | | | | | mask(i) = (queue #i is active)? 0 : 0xFFFFFFFF; |
+ | | | | | |
+ | | | | | w(i) = Weight of queue #i |
+ | | | | | |
+ | | | | | t(i) = Tokens per byte for queue #i, defined as the |
+ | | | | | inverse weight of queue #i. |
+ | | | | | |
+ | | | | | T(i) = Accumulated number of tokens previously consumed |
+ | | | | | from queue #i. |
+ | | | | | |
+ | | | | | *Next queue* : queue with the smallest T. |
+ | | | | | |
+ | | | | | Before packet consumption from queue #i: |
+ | | | | | |
+ | | | | | *T(i) |= mask(i)* |
+ | | | | | |
+ | | | | | After packet consumption from queue #i: |
+ | | | | | |
+ | | | | | T(j) -= T(i), j != i |
+ | | | | | |
+ | | | | | T(i) = pkt_len * t(i) |
+ | | | | | |
+ | | | | | Note: T(j) uses the T(i) value before T(i) is updated. |
+ | | | | | |
+ +---+------------+-----------------+-------------+----------------------------------------------------------+
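
Strategy 4 above can be sketched in C as follows. The ``wrr_sketch`` structure, the
``wrr_next`` / ``wrr_consume`` names and the fixed queue count ``WRR_N`` are assumptions
made for illustration. As a simplification, the saturation mask is applied to a temporary
value during selection, and the stored T of inactive queues is left untouched.

```c
#include <assert.h>
#include <stdint.h>

#define WRR_N 4 /* number of queues in this sketch (assumption) */

struct wrr_sketch {
    uint32_t T[WRR_N];      /* accumulated tokens consumed per queue */
    uint32_t t[WRR_N];      /* tokens per byte, i.e. inverse weight */
    int      active[WRR_N]; /* non-zero if the queue has packets */
};

/*
 * Select the next queue: the one with the smallest T after saturating the
 * inactive queues. Returns -1 if no queue is active. Assumes T of active
 * queues stays below UINT32_MAX, which the truncation step is meant to ensure.
 */
int wrr_next(const struct wrr_sketch *w)
{
    int best = -1;
    uint32_t best_T = UINT32_MAX;

    for (int i = 0; i < WRR_N; i++) {
        uint32_t mask = w->active[i] ? 0u : UINT32_MAX; /* saturation mask */
        uint32_t eff = w->T[i] | mask;

        if (eff < best_T) { /* saturated (inactive) queues never pass this test */
            best_T = eff;
            best = i;
        }
    }
    return best;
}

/* Account for a packet taken from queue i, where i was returned by wrr_next(). */
void wrr_consume(struct wrr_sketch *w, int i, uint32_t pkt_len)
{
    assert(i >= 0 && i < WRR_N && w->active[i]);

    uint32_t min_T = w->T[i]; /* value of T(i) before it is updated */

    for (int j = 0; j < WRR_N; j++)
        if (j != i && w->active[j])
            w->T[j] -= min_T; /* truncation: T(i) is the minimum among active queues */
    w->T[i] = pkt_len * w->t[i];
}
```

With ``t = {8, 4, 2, 1}`` (weights 1:2:4:8), all queues active and equal 100-byte packets,
the first six selections are queues 0, 1, 2, 3, 3, 2: lower-cost queues are revisited sooner.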
Subport Traffic Class Oversubscription
""""""""""""""""""""""""""""""""""""""
:numref:`table_qos_13` summarizes some of the possible approaches for handling this problem,
with the third approach selected for implementation.
-.. _pg_table_13:
-
-**Table 13. Subport Traffic Class Oversubscription**
-
-+-----+---------------------------+-------------------------------------------------------------------------+
-| No. | Approach | Description |
-| | | |
-+=====+===========================+=========================================================================+
-| 1 | Don't care | First come, first served. |
-| | | |
-| | | This approach is not fair among subport member pipes, as pipes that |
-| | | are served first will use up as much bandwidth for TC X as they need, |
-| | | while pipes that are served later will receive poor service due to |
-| | | bandwidth for TC X at the subport level being scarce. |
-| | | |
-+-----+---------------------------+-------------------------------------------------------------------------+
-| 2 | Scale down all pipes | All pipes within the subport have their bandwidth limit for TC X scaled |
-| | | down by the same factor. |
-| | | |
-| | | This approach is not fair among subport member pipes, as the low end |
-| | | pipes (that is, pipes configured with low bandwidth) can potentially |
-| | | experience severe service degradation that might render their service |
-| | | unusable (if available bandwidth for these pipes drops below the |
-| | | minimum requirements for a workable service), while the service |
-| | | degradation for high end pipes might not be noticeable at all. |
-| | | |
-+-----+---------------------------+-------------------------------------------------------------------------+
-| 3 | Cap the high demand pipes | Each subport member pipe receives an equal share of the bandwidth |
-| | | available at run-time for TC X at the subport level. Any bandwidth left |
-| | | unused by the low-demand pipes is redistributed in equal portions to |
-| | | the high-demand pipes. This way, the high-demand pipes are truncated |
-| | | while the low-demand pipes are not impacted. |
-| | | |
-+-----+---------------------------+-------------------------------------------------------------------------+
+.. _table_qos_13:
+
+.. table:: Subport Traffic Class Oversubscription
+
+ +-----+---------------------------+-------------------------------------------------------------------------+
+ | No. | Approach | Description |
+ | | | |
+ +=====+===========================+=========================================================================+
+ | 1 | Don't care | First come, first served. |
+ | | | |
+ | | | This approach is not fair amongst subport member pipes, as pipes that |
+ | | | are served first will use up as much bandwidth for TC X as they need, |
+ | | | while pipes that are served later will receive poor service due to |
+ | | | bandwidth for TC X at the subport level being scarce. |
+ | | | |
+ +-----+---------------------------+-------------------------------------------------------------------------+
+ | 2 | Scale down all pipes | All pipes within the subport have their bandwidth limit for TC X scaled |
+ | | | down by the same factor. |
+ | | | |
+ | | | This approach is not fair among subport member pipes, as the low end |
+ | | | pipes (that is, pipes configured with low bandwidth) can potentially |
+ | | | experience severe service degradation that might render their service |
+ | | | unusable (if available bandwidth for these pipes drops below the |
+ | | | minimum requirements for a workable service), while the service |
+ | | | degradation for high end pipes might not be noticeable at all. |
+ | | | |
+ +-----+---------------------------+-------------------------------------------------------------------------+
+ | 3 | Cap the high demand pipes | Each subport member pipe receives an equal share of the bandwidth |
+ | | | available at run-time for TC X at the subport level. Any bandwidth left |
+ | | | unused by the low-demand pipes is redistributed in equal portions to |
+ | | | the high-demand pipes. This way, the high-demand pipes are truncated |
+ | | | while the low-demand pipes are not impacted. |
+ | | | |
+ +-----+---------------------------+-------------------------------------------------------------------------+
Typically, the subport TC oversubscription feature is enabled only for the lowest priority traffic class (TC 3),
which is typically used for best effort traffic,
When demand is low, the watermark is set high to prevent it from impeding the subport member pipes from consuming more bandwidth.
The highest value for the watermark is picked as the highest rate configured for a subport member pipe.
-Table 15 illustrates the watermark operation.
-
-.. _pg_table_14:
-
-**Table 14. Watermark Propagation from Subport Level to Member Pipes at the Beginning of Each Traffic Class Upper Limit Enforcement Period**
-
-+-----+---------------------------------+----------------------------------------------------+
-| No. | Subport Traffic Class Operation | Description |
-| | | |
-+=====+=================================+====================================================+
-| 1 | Initialization | **Subport level**: subport_period_id= 0 |
-| | | |
-| | | **Pipe level**: pipe_period_id = 0 |
-| | | |
-+-----+---------------------------------+----------------------------------------------------+
-| 2 | Credit update | **Subport Level**: |
-| | | |
-| | | if (time>=subport_tc_time) |
-| | | |
-| | | { |
-| | | subport_wm = water_mark_update(); |
-| | | |
-| | | subport_tc_time = time + subport_tc_period; |
-| | | |
-| | | subport_period_id++; |
-| | | |
-| | | } |
-| | | |
-| | | **Pipelevel:** |
-| | | |
-| | | if(pipe_period_id != subport_period_id) |
-| | | |
-| | | { |
-| | | |
-| | | pipe_ov_credits = subport_wm \* pipe_weight; |
-| | | |
-| | | pipe_period_id = subport_period_id; |
-| | | |
-| | | } |
-| | | |
-+-----+---------------------------------+----------------------------------------------------+
-| 3 | Credit consumption | **Pipe level:** |
-| | (on packet scheduling) | |
-| | | pkt_credits = pk_len + frame_overhead; |
-| | | |
-| | | if(pipe_ov_credits >= pkt_credits{ |
-| | | |
-| | | pipe_ov_credits -= pkt_credits; |
-| | | |
-| | | } |
-| | | |
-+-----+---------------------------------+----------------------------------------------------+
-
-.. _pg_table_15:
-
-**Table 15. Watermark Calculation**
-
-+-----+------------------+----------------------------------------------------------------------------------+
-| No. | Subport Traffic | Description |
-| | Class Operation | |
-+=====+==================+==================================================================================+
-| 1 | Initialization | **Subport level:** |
-| | | |
-| | | wm = WM_MAX |
-| | | |
-+-----+------------------+----------------------------------------------------------------------------------+
-| 2 | Credit update | **Subport level (water_mark_update):** |
-| | | |
-| | | tc0_cons = subport_tc0_credits_per_period - subport_tc0_credits; |
-| | | |
-| | | tc1_cons = subport_tc1_credits_per_period - subport_tc1_credits; |
-| | | |
-| | | tc2_cons = subport_tc2_credits_per_period - subport_tc2_credits; |
-| | | |
-| | | tc3_cons = subport_tc3_credits_per_period - subport_tc3_credits; |
-| | | |
-| | | tc3_cons_max = subport_tc3_credits_per_period - (tc0_cons + tc1_cons + |
-| | | tc2_cons); |
-| | | |
-| | | if(tc3_consumption > (tc3_consumption_max - MTU)){ |
-| | | |
-| | | wm -= wm >> 7; |
-| | | |
-| | | if(wm < WM_MIN) wm = WM_MIN; |
-| | | |
-| | | } else { |
-| | | |
-| | | wm += (wm >> 7) + 1; |
-| | | |
-| | | if(wm > WM_MAX) wm = WM_MAX; |
-| | | |
-| | | } |
-| | | |
-+-----+------------------+----------------------------------------------------------------------------------+
+:numref:`table_qos_14` and :numref:`table_qos_15` illustrate the watermark operation.
+
+.. _table_qos_14:
+
+.. table:: Watermark Propagation from Subport Level to Member Pipes at the Beginning of Each Traffic Class Upper Limit Enforcement Period
+
+ +-----+---------------------------------+----------------------------------------------------+
+ | No. | Subport Traffic Class Operation | Description                                        |
+ |     |                                 |                                                    |
+ +=====+=================================+====================================================+
+ | 1   | Initialization                  | **Subport level**: subport_period_id = 0           |
+ |     |                                 |                                                    |
+ |     |                                 | **Pipe level**: pipe_period_id = 0                 |
+ |     |                                 |                                                    |
+ +-----+---------------------------------+----------------------------------------------------+
+ | 2   | Credit update                   | **Subport level**:                                 |
+ |     |                                 |                                                    |
+ |     |                                 | if (time >= subport_tc_time)                       |
+ |     |                                 |                                                    |
+ |     |                                 | {                                                  |
+ |     |                                 |     subport_wm = water_mark_update();              |
+ |     |                                 |                                                    |
+ |     |                                 |     subport_tc_time = time + subport_tc_period;    |
+ |     |                                 |                                                    |
+ |     |                                 |     subport_period_id++;                           |
+ |     |                                 |                                                    |
+ |     |                                 | }                                                  |
+ |     |                                 |                                                    |
+ |     |                                 | **Pipe level:**                                    |
+ |     |                                 |                                                    |
+ |     |                                 | if (pipe_period_id != subport_period_id)           |
+ |     |                                 |                                                    |
+ |     |                                 | {                                                  |
+ |     |                                 |                                                    |
+ |     |                                 |     pipe_ov_credits = subport_wm \* pipe_weight;   |
+ |     |                                 |                                                    |
+ |     |                                 |     pipe_period_id = subport_period_id;            |
+ |     |                                 |                                                    |
+ |     |                                 | }                                                  |
+ |     |                                 |                                                    |
+ +-----+---------------------------------+----------------------------------------------------+
+ | 3   | Credit consumption              | **Pipe level:**                                    |
+ |     | (on packet scheduling)          |                                                    |
+ |     |                                 | pkt_credits = pk_len + frame_overhead;             |
+ |     |                                 |                                                    |
+ |     |                                 | if (pipe_ov_credits >= pkt_credits) {              |
+ |     |                                 |                                                    |
+ |     |                                 |     pipe_ov_credits -= pkt_credits;                |
+ |     |                                 |                                                    |
+ |     |                                 | }                                                  |
+ |     |                                 |                                                    |
+ +-----+---------------------------------+----------------------------------------------------+
+
+.. _table_qos_15:
+
+.. table:: Watermark Calculation
+
+ +-----+------------------+----------------------------------------------------------------------------------+
+ | No. | Subport Traffic  | Description                                                                      |
+ |     | Class Operation  |                                                                                  |
+ +=====+==================+==================================================================================+
+ | 1   | Initialization   | **Subport level:**                                                               |
+ |     |                  |                                                                                  |
+ |     |                  | wm = WM_MAX                                                                      |
+ |     |                  |                                                                                  |
+ +-----+------------------+----------------------------------------------------------------------------------+
+ | 2   | Credit update    | **Subport level (water_mark_update):**                                           |
+ |     |                  |                                                                                  |
+ |     |                  | tc0_cons = subport_tc0_credits_per_period - subport_tc0_credits;                 |
+ |     |                  |                                                                                  |
+ |     |                  | tc1_cons = subport_tc1_credits_per_period - subport_tc1_credits;                 |
+ |     |                  |                                                                                  |
+ |     |                  | tc2_cons = subport_tc2_credits_per_period - subport_tc2_credits;                 |
+ |     |                  |                                                                                  |
+ |     |                  | tc3_cons = subport_tc3_credits_per_period - subport_tc3_credits;                 |
+ |     |                  |                                                                                  |
+ |     |                  | tc3_cons_max = subport_tc3_credits_per_period - (tc0_cons + tc1_cons +           |
+ |     |                  | tc2_cons);                                                                       |
+ |     |                  |                                                                                  |
+ |     |                  | if (tc3_cons > (tc3_cons_max - MTU)) {                                           |
+ |     |                  |                                                                                  |
+ |     |                  |     wm -= wm >> 7;                                                               |
+ |     |                  |                                                                                  |
+ |     |                  |     if (wm < WM_MIN) wm = WM_MIN;                                                |
+ |     |                  |                                                                                  |
+ |     |                  | } else {                                                                         |
+ |     |                  |                                                                                  |
+ |     |                  |     wm += (wm >> 7) + 1;                                                         |
+ |     |                  |                                                                                  |
+ |     |                  |     if (wm > WM_MAX) wm = WM_MAX;                                                |
+ |     |                  |                                                                                  |
+ |     |                  | }                                                                                |
+ |     |                  |                                                                                  |
+ +-----+------------------+----------------------------------------------------------------------------------+
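
The watermark mechanism described in the two tables above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the scheduler's actual code: the ``subport_state`` and ``pipe_state`` structures, the helper names other than ``water_mark_update``, and the 0/1 return value of ``pipe_credit_consume`` are hypothetical, and the per-period refill of the traffic class credits is left out. Signed 64-bit intermediates are used so that the subtractions cannot wrap around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subport state; field names mirror the pseudocode only. */
struct subport_state {
    uint32_t tc_credits_per_period[4]; /* credits granted to TC 0..3 per period */
    uint32_t tc_credits[4];            /* credits still unused in this period */
    uint32_t wm, wm_min, wm_max;       /* watermark and its bounds */
    uint64_t tc_time, tc_period;       /* next update time, period length */
    uint32_t period_id;
};

/* Hypothetical pipe state. */
struct pipe_state {
    uint32_t period_id;
    uint32_t ov_credits;
    uint32_t weight;
};

/* Watermark calculation: lower wm when TC 3 consumed nearly all it could,
 * raise it otherwise, and clamp the result to [wm_min, wm_max]. */
static uint32_t water_mark_update(struct subport_state *s, uint32_t mtu)
{
    int64_t cons[4];
    int64_t tc3_cons_max;
    int i;

    assert(s->wm_min <= s->wm_max);
    for (i = 0; i < 4; i++)
        cons[i] = (int64_t)s->tc_credits_per_period[i] - (int64_t)s->tc_credits[i];
    tc3_cons_max = (int64_t)s->tc_credits_per_period[3] - (cons[0] + cons[1] + cons[2]);

    if (cons[3] > tc3_cons_max - (int64_t)mtu) {
        s->wm -= s->wm >> 7;
        if (s->wm < s->wm_min)
            s->wm = s->wm_min;
    } else {
        s->wm += (s->wm >> 7) + 1;
        if (s->wm > s->wm_max)
            s->wm = s->wm_max;
    }
    return s->wm;
}

/* Subport-level credit update: runs once per enforcement period. */
static void subport_credit_update(struct subport_state *s, uint64_t time, uint32_t mtu)
{
    if (time >= s->tc_time) {
        water_mark_update(s, mtu);
        s->tc_time = time + s->tc_period;
        s->period_id++;
    }
}

/* Pipe-level credit update: picks up the new watermark once per period. */
static void pipe_credit_update(struct pipe_state *p, const struct subport_state *s)
{
    if (p->period_id != s->period_id) {
        p->ov_credits = s->wm * p->weight;
        p->period_id = s->period_id;
    }
}

/* Pipe-level credit consumption; returns 1 if the packet fits (assumption). */
static int pipe_credit_consume(struct pipe_state *p, uint32_t pkt_len, uint32_t frame_overhead)
{
    uint32_t pkt_credits = pkt_len + frame_overhead;

    if (p->ov_credits >= pkt_credits) {
        p->ov_credits -= pkt_credits;
        return 1;
    }
    return 0;
}
```

Comparing ``pipe_period_id`` with ``subport_period_id`` lets each pipe refresh its oversubscription credits lazily, the first time it is visited in a new period, so the subport does not have to walk all of its member pipes when the watermark changes.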
Worst Case Scenarios for Performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Configuration
~~~~~~~~~~~~~
-A RED configuration contains the parameters given in Table 16.
-
-.. _pg_table_16:
-
-**Table 16. RED Configuration Parameters**
-
-+--------------------------+---------+---------+------------------+
-| Parameter | Minimum | Maximum | Typical |
-| | | | |
-+==========================+=========+=========+==================+
-| Minimum Threshold | 0 | 1022 | 1/4 x queue size |
-| | | | |
-+--------------------------+---------+---------+------------------+
-| Maximum Threshold | 1 | 1023 | 1/2 x queue size |
-| | | | |
-+--------------------------+---------+---------+------------------+
-| Inverse Mark Probability | 1 | 255 | 10 |
-| | | | |
-+--------------------------+---------+---------+------------------+
-| EWMA Filter Weight | 1 | 12 | 9 |
-| | | | |
-+--------------------------+---------+---------+------------------+
+A RED configuration contains the parameters given in :numref:`table_qos_16`.
+
+.. _table_qos_16:
+
+.. table:: RED Configuration Parameters
+
+ +--------------------------+---------+---------+------------------+
+ | Parameter | Minimum | Maximum | Typical |
+ | | | | |
+ +==========================+=========+=========+==================+
+ | Minimum Threshold | 0 | 1022 | 1/4 x queue size |
+ | | | | |
+ +--------------------------+---------+---------+------------------+
+ | Maximum Threshold | 1 | 1023 | 1/2 x queue size |
+ | | | | |
+ +--------------------------+---------+---------+------------------+
+ | Inverse Mark Probability | 1 | 255 | 10 |
+ | | | | |
+ +--------------------------+---------+---------+------------------+
+ | EWMA Filter Weight | 1 | 12 | 9 |
+ | | | | |
+ +--------------------------+---------+---------+------------------+
The meaning of these parameters is explained in more detail in the following sections.
The format of these parameters as specified to the dropper module API
The method that was finally selected (described above in Section 26.3.2.2.1) outperforms all of these approaches
in terms of run-time performance and memory requirements and
also achieves accuracy comparable to floating-point evaluation.
-Table 17 lists the performance of each of these alternative approaches relative to the method that is used in the dropper.
+:numref:`table_qos_17` lists the performance of each of these alternative approaches relative to the method that is used in the dropper.
As can be seen, the floating-point implementation achieved the worst performance.
-.. _pg_table_17:
-
-**Table 17. Relative Performance of Alternative Approaches**
-
-+------------------------------------------------------------------------------------+----------------------+
-| Method | Relative Performance |
-| | |
-+====================================================================================+======================+
-| Current dropper method (see :ref:`Section 23.3.2.1.3 <Dropper>`) | 100% |
-| | |
-+------------------------------------------------------------------------------------+----------------------+
-| Fixed-point method with small (512B) look-up table | 148% |
-| | |
-+------------------------------------------------------------------------------------+----------------------+
-| SSE method with small (512B) look-up table | 114% |
-| | |
-+------------------------------------------------------------------------------------+----------------------+
-| Large (76KB) look-up table | 118% |
-| | |
-+------------------------------------------------------------------------------------+----------------------+
-| Floating-point | 595% |
-| | |
-+------------------------------------------------------------------------------------+----------------------+
-| **Note**: In this case, since performance is expressed as time spent executing the operation in a |
-| specific condition, any relative performance value above 100% runs slower than the reference method. |
-| |
-+-----------------------------------------------------------------------------------------------------------+
+.. _table_qos_17:
+
+.. table:: Relative Performance of Alternative Approaches
+
+ +------------------------------------------------------------------------------------+----------------------+
+ | Method | Relative Performance |
+ | | |
+ +====================================================================================+======================+
+ | Current dropper method (see :ref:`Section 23.3.2.1.3 <Dropper>`) | 100% |
+ | | |
+ +------------------------------------------------------------------------------------+----------------------+
+ | Fixed-point method with small (512B) look-up table | 148% |
+ | | |
+ +------------------------------------------------------------------------------------+----------------------+
+ | SSE method with small (512B) look-up table | 114% |
+ | | |
+ +------------------------------------------------------------------------------------+----------------------+
+ | Large (76KB) look-up table | 118% |
+ | | |
+ +------------------------------------------------------------------------------------+----------------------+
+ | Floating-point | 595% |
+ | | |
+ +------------------------------------------------------------------------------------+----------------------+
+ | **Note**: In this case, since performance is expressed as time spent executing the operation in a |
+ | specific condition, any relative performance value above 100% runs slower than the reference method. |
+ | |
+ +-----------------------------------------------------------------------------------------------------------+
Drop Decision Block
^^^^^^^^^^^^^^^^^^^
tc 3 wred weight = 9 9 9
With this configuration file, the RED configuration that applies to green,
-yellow and red packets in traffic class 0 is shown in Table 18.
-
-.. _pg_table_18:
-
-**Table 18. RED Configuration Corresponding to RED Configuration File**
-
-+--------------------+--------------------+-------+--------+-----+
-| RED Parameter | Configuration Name | Green | Yellow | Red |
-| | | | | |
-+====================+====================+=======+========+=====+
-| Minimum Threshold | tc 0 wred min | 28 | 22 | 16 |
-| | | | | |
-+--------------------+--------------------+-------+--------+-----+
-| Maximum Threshold | tc 0 wred max | 32 | 32 | 32 |
-| | | | | |
-+--------------------+--------------------+-------+--------+-----+
-| Mark Probability | tc 0 wred inv prob | 10 | 10 | 10 |
-| | | | | |
-+--------------------+--------------------+-------+--------+-----+
-| EWMA Filter Weight | tc 0 wred weight | 9 | 9 | 9 |
-| | | | | |
-+--------------------+--------------------+-------+--------+-----+
+yellow and red packets in traffic class 0 is shown in :numref:`table_qos_18`.
+
+.. _table_qos_18:
+
+.. table:: RED Configuration Corresponding to RED Configuration File
+
+ +--------------------+--------------------+-------+--------+-----+
+ | RED Parameter | Configuration Name | Green | Yellow | Red |
+ | | | | | |
+ +====================+====================+=======+========+=====+
+ | Minimum Threshold | tc 0 wred min | 28 | 22 | 16 |
+ | | | | | |
+ +--------------------+--------------------+-------+--------+-----+
+ | Maximum Threshold | tc 0 wred max | 32 | 32 | 32 |
+ | | | | | |
+ +--------------------+--------------------+-------+--------+-----+
+ | Mark Probability | tc 0 wred inv prob | 10 | 10 | 10 |
+ | | | | | |
+ +--------------------+--------------------+-------+--------+-----+
+ | EWMA Filter Weight | tc 0 wred weight | 9 | 9 | 9 |
+ | | | | | |
+ +--------------------+--------------------+-------+--------+-----+
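
As an illustration of how the ranges in the RED configuration parameters table constrain a configuration such as the one above, the following C sketch checks a parameter set against those ranges. The ``red_params`` structure and the ``red_params_valid`` function are hypothetical names introduced for this example and are not part of the dropper API; the additional requirement that the minimum threshold be strictly below the maximum threshold is an assumption inferred from the two ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one RED configuration (one color of one TC). */
struct red_params {
    uint16_t min_th;        /* Minimum Threshold: 0..1022 */
    uint16_t max_th;        /* Maximum Threshold: 1..1023 */
    uint16_t inv_mark_prob; /* Inverse Mark Probability: 1..255 */
    uint16_t ewma_weight;   /* EWMA Filter Weight: 1..12 */
};

/* Returns true when every field lies inside the documented range and
 * min_th < max_th (the ordering check is an assumption of this sketch). */
static bool red_params_valid(const struct red_params *p)
{
    assert(p != NULL);
    if (p->min_th > 1022)
        return false;
    if (p->max_th < 1 || p->max_th > 1023)
        return false;
    if (p->min_th >= p->max_th)
        return false;
    if (p->inv_mark_prob < 1 || p->inv_mark_prob > 255)
        return false;
    if (p->ewma_weight < 1 || p->ewma_weight > 12)
        return false;
    return true;
}
```

With the traffic class 0 values listed above, the green (28, 32, 10, 9), yellow (22, 32, 10, 9) and red (16, 32, 10, 9) configurations all pass this check.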
Application Programming Interface (API)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Tables**
-:ref:`Table 1.Output Traffic Marking <table_1>`
+:numref:`table_qos_metering_1` :ref:`table_qos_metering_1`
-:ref:`Table 2.Entity Types <table_2>`
+:numref:`table_qos_scheduler_1` :ref:`table_qos_scheduler_1`
-:ref:`Table 3.Table Types <table_3>`
+:numref:`table_test_pipeline_1` :ref:`table_test_pipeline_1`
Assuming the input traffic is generated at line rate and all packets are 64-byte Ethernet frames (IPv4 packet size of 46 bytes)
and green, the expected output traffic should be marked as shown in the following table:
-.. _table_1:
-
-**Table 1. Output Traffic Marking**
-
-+-------------+------------------+-------------------+----------------+
-| **Mode** | **Green (Mpps)** | **Yellow (Mpps)** | **Red (Mpps)** |
-| | | | |
-+=============+==================+===================+================+
-| srTCM blind | 1 | 1 | 12.88 |
-| | | | |
-+-------------+------------------+-------------------+----------------+
-| srTCM color | 1 | 1 | 12.88 |
-| | | | |
-+-------------+------------------+-------------------+----------------+
-| trTCM blind | 1 | 0.5 | 13.38 |
-| | | | |
-+-------------+------------------+-------------------+----------------+
-| trTCM color | 1 | 0.5 | 13.38 |
-| | | | |
-+-------------+------------------+-------------------+----------------+
-| FWD | 14.88 | 0 | 0 |
-| | | | |
-+-------------+------------------+-------------------+----------------+
+.. _table_qos_metering_1:
+
+.. table:: Output Traffic Marking
+
+ +-------------+------------------+-------------------+----------------+
+ | **Mode** | **Green (Mpps)** | **Yellow (Mpps)** | **Red (Mpps)** |
+ | | | | |
+ +=============+==================+===================+================+
+ | srTCM blind | 1 | 1 | 12.88 |
+ | | | | |
+ +-------------+------------------+-------------------+----------------+
+ | srTCM color | 1 | 1 | 12.88 |
+ | | | | |
+ +-------------+------------------+-------------------+----------------+
+ | trTCM blind | 1 | 0.5 | 13.38 |
+ | | | | |
+ +-------------+------------------+-------------------+----------------+
+ | trTCM color | 1 | 0.5 | 13.38 |
+ | | | | |
+ +-------------+------------------+-------------------+----------------+
+ | FWD | 14.88 | 0 | 0 |
+ | | | | |
+ +-------------+------------------+-------------------+----------------+
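
The figures in the table above can be cross-checked with a short calculation. The sketch below assumes a 10 Gb/s link, which the text does not state but which matches the 14.88 Mpps total: each 64-byte frame occupies 84 bytes on the wire once the 8 bytes of preamble and start-of-frame delimiter and the 12-byte inter-frame gap are added. The function names are hypothetical, and the red rate is simply whatever remains after the green and yellow rates are subtracted.

```c
#include <assert.h>
#include <stdint.h>

/* Packets per second at line rate for a given frame size.
 * 20 bytes of per-frame overhead: preamble + SFD (8) and inter-frame gap (12). */
static uint64_t line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
    const uint32_t overhead_bytes = 20;

    return link_bps / ((uint64_t)(frame_bytes + overhead_bytes) * 8);
}

/* Packets marked red: the part of the input not marked green or yellow. */
static uint64_t red_pps(uint64_t total_pps, uint64_t green_pps, uint64_t yellow_pps)
{
    assert(green_pps + yellow_pps <= total_pps);
    return total_pps - green_pps - yellow_pps;
}
```

For example, ``line_rate_pps(10000000000ULL, 64)`` gives 14880952 packets per second (14.88 Mpps). Subtracting 1 Mpps green and 1 Mpps yellow leaves 12.88 Mpps red for srTCM, and subtracting 1 Mpps green and 0.5 Mpps yellow leaves 13.38 Mpps red for trTCM.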
To set up the policing scheme as desired, it is necessary to modify the main.h source file,
where this policy is implemented as a static structure, as follows:
The traffic flows that need to be configured are application dependent.
This application classifies based on the QinQ double VLAN tags and the IP destination address as indicated in the following table.
-.. _table_2:
-
-**Table 2. Entity Types**
-
-+----------------+-------------------------+--------------------------------------------------+----------------------------------+
-| **Level Name** | **Siblings per Parent** | **QoS Functional Description** | **Selected By** |
-| | | | |
-+================+=========================+==================================================+==================================+
-| Port | - | Ethernet port | Physical port |
-| | | | |
-+----------------+-------------------------+--------------------------------------------------+----------------------------------+
-| Subport | Config (8) | Traffic shaped (token bucket) | Outer VLAN tag |
-| | | | |
-+----------------+-------------------------+--------------------------------------------------+----------------------------------+
-| Pipe | Config (4k) | Traffic shaped (token bucket) | Inner VLAN tag |
-| | | | |
-+----------------+-------------------------+--------------------------------------------------+----------------------------------+
-| Traffic Class | 4 | TCs of the same pipe services in strict priority | Destination IP address (0.0.X.0) |
-| | | | |
-+----------------+-------------------------+--------------------------------------------------+----------------------------------+
-| Queue | 4 | Queue of the same TC serviced in WRR | Destination IP address (0.0.0.X) |
-| | | | |
-+----------------+-------------------------+--------------------------------------------------+----------------------------------+
+.. _table_qos_scheduler_1:
+
+.. table:: Entity Types
+
+ +----------------+-------------------------+--------------------------------------------------+----------------------------------+
+ | **Level Name** | **Siblings per Parent** | **QoS Functional Description** | **Selected By** |
+ | | | | |
+ +================+=========================+==================================================+==================================+
+ | Port | - | Ethernet port | Physical port |
+ | | | | |
+ +----------------+-------------------------+--------------------------------------------------+----------------------------------+
+ | Subport | Config (8) | Traffic shaped (token bucket) | Outer VLAN tag |
+ | | | | |
+ +----------------+-------------------------+--------------------------------------------------+----------------------------------+
+ | Pipe | Config (4k) | Traffic shaped (token bucket) | Inner VLAN tag |
+ | | | | |
+ +----------------+-------------------------+--------------------------------------------------+----------------------------------+
+ | Traffic Class | 4 | TCs of the same pipe serviced in strict priority | Destination IP address (0.0.X.0) |
+ | | | | |
+ +----------------+-------------------------+--------------------------------------------------+----------------------------------+
+ | Queue | 4 | Queues of the same TC serviced in WRR | Destination IP address (0.0.0.X) |
+ | | | | |
+ +----------------+-------------------------+--------------------------------------------------+----------------------------------+
Please refer to the "QoS Scheduler" chapter in the *DPDK Programmer's Guide* for more information about these parameters.
Table Types and Behavior
~~~~~~~~~~~~~~~~~~~~~~~~
-Table 3 describes the table types used and how they are populated.
+:numref:`table_test_pipeline_1` describes the table types used and how they are populated.
The hash tables are pre-populated with 16 million keys.
For hash tables, the following parameters can be selected:
* **Table type (e.g. hash-spec-16-ext or hash-spec-16-lru).**
The available options are ext (extendable bucket) or lru (least recently used).
-.. _table_3:
-
-**Table 3. Table Types**
-
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| **#** | **TABLE_TYPE** | **Description of Core B Table** | **Pre-added Table Entries** |
-| | | | |
-+=======+========================+==========================================================+=======================================================+
-| 1 | none | Core B is not implementing a DPDK pipeline. | N/A |
-| | | Core B is implementing a pass-through from its input set | |
-| | | of software queues to its output set of software queues. | |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 2 | stub | Stub table. Core B is implementing the same pass-through | N/A |
-| | | functionality as described for the "none" option by | |
-| | | using the DPDK Packet Framework by using one | |
-| | | stub table for each input NIC port. | |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 3 | hash-[spec]-8-lru | LRU hash table with 8-byte key size and 16 million | 16 million entries are successfully added to the |
-| | | entries. | hash table with the following key format: |
-| | | | |
-| | | | [4-byte index, 4 bytes of 0] |
-| | | | |
-| | | | The action configured for all table entries is |
-| | | | "Sendto output port", with the output port index |
-| | | | uniformly distributed for the range of output ports. |
-| | | | |
-| | | | The default table rule (used in the case of a lookup |
-| | | | miss) is to drop the packet. |
-| | | | |
-| | | | At run time, core A is creating the following lookup |
-| | | | key and storing it into the packet meta data for |
-| | | | core B to use for table lookup: |
-| | | | |
-| | | | [destination IPv4 address, 4 bytes of 0] |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 4 | hash-[spec]-8-ext | Extendable bucket hash table with 8-byte key size | Same as hash-[spec]-8-lru table entries, above. |
-| | | and 16 million entries. | |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 5 | hash-[spec]-16-lru | LRU hash table with 16-byte key size and 16 million | 16 million entries are successfully added to the hash |
-| | | entries. | table with the following key format: |
-| | | | |
-| | | | [4-byte index, 12 bytes of 0] |
-| | | | |
-| | | | The action configured for all table entries is |
-| | | | "Send to output port", with the output port index |
-| | | | uniformly distributed for the range of output ports. |
-| | | | |
-| | | | The default table rule (used in the case of a lookup |
-| | | | miss) is to drop the packet. |
-| | | | |
-| | | | At run time, core A is creating the following lookup |
-| | | | key and storing it into the packet meta data for core |
-| | | | B to use for table lookup: |
-| | | | |
-| | | | [destination IPv4 address, 12 bytes of 0] |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 6 | hash-[spec]-16-ext | Extendable bucket hash table with 16-byte key size | Same as hash-[spec]-16-lru table entries, above. |
-| | | and 16 million entries. | |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 7 | hash-[spec]-32-lru | LRU hash table with 32-byte key size and 16 million | 16 million entries are successfully added to the hash |
-| | | entries. | table with the following key format: |
-| | | | |
-| | | | [4-byte index, 28 bytes of 0]. |
-| | | | |
-| | | | The action configured for all table entries is |
-| | | | "Send to output port", with the output port index |
-| | | | uniformly distributed for the range of output ports. |
-| | | | |
-| | | | The default table rule (used in the case of a lookup |
-| | | | miss) is to drop the packet. |
-| | | | |
-| | | | At run time, core A is creating the following lookup |
-| | | | key and storing it into the packet meta data for |
-| | | | Lpmcore B to use for table lookup: |
-| | | | |
-| | | | [destination IPv4 address, 28 bytes of 0] |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 8 | hash-[spec]-32-ext | Extendable bucket hash table with 32-byte key size | Same as hash-[spec]-32-lru table entries, above. |
-| | | and 16 million entries. | |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 9 | lpm | Longest Prefix Match (LPM) IPv4 table. | In the case of two ports, two routes |
-| | | | are added to the table: |
-| | | | |
-| | | | [0.0.0.0/9 => send to output port 0] |
-| | | | |
-| | | | [0.128.0.0/9 => send to output port 1] |
-| | | | |
-| | | | In case of four ports, four entries are added to the |
-| | | | table: |
-| | | | |
-| | | | [0.0.0.0/10 => send to output port 0] |
-| | | | |
-| | | | [0.64.0.0/10 => send to output port 1] |
-| | | | |
-| | | | [0.128.0.0/10 => send to output port 2] |
-| | | | |
-| | | | [0.192.0.0/10 => send to output port 3] |
-| | | | |
-| | | | The default table rule (used in the case of a lookup |
-| | | | miss) is to drop the packet. |
-| | | | |
-| | | | At run time, core A is storing the IPv4 destination |
-| | | | within the packet meta data to be later used by core |
-| | | | B as the lookup key. |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
-| 10 | acl | Access Control List (ACL) table | In the case of two ports, two ACL rules are added to |
-| | | | the table: |
-| | | | |
-| | | | [priority = 0 (highest), |
-| | | | |
-| | | | IPv4 source = ANY, |
-| | | | |
-| | | | IPv4 destination = 0.0.0.0/9, |
-| | | | |
-| | | | L4 protocol = ANY, |
-| | | | |
-| | | | TCP source port = ANY, |
-| | | | |
-| | | | TCP destination port = ANY |
-| | | | |
-| | | | => send to output port 0] |
-| | | | |
-| | | | |
-| | | | [priority = 0 (highest), |
-| | | | |
-| | | | IPv4 source = ANY, |
-| | | | |
-| | | | IPv4 destination = 0.128.0.0/9, |
-| | | | |
-| | | | L4 protocol = ANY, |
-| | | | |
-| | | | TCP source port = ANY, |
-| | | | |
-| | | | TCP destination port = ANY |
-| | | | |
-| | | | => send to output port 0]. |
-| | | | |
-| | | | |
-| | | | The default table rule (used in the case of a lookup |
-| | | | miss) is to drop the packet. |
-| | | | |
-+-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+.. _table_test_pipeline_1:
+
+.. table:: Table Types
+
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | **#** | **TABLE_TYPE** | **Description of Core B Table** | **Pre-added Table Entries** |
+ | | | | |
+ +=======+========================+==========================================================+=======================================================+
+ | 1 | none | Core B is not implementing a DPDK pipeline. | N/A |
+ | | | Core B is implementing a pass-through from its input set | |
+ | | | of software queues to its output set of software queues. | |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 2 | stub | Stub table. Core B is implementing the same pass-through | N/A |
+ | | | functionality as described for the "none" option by | |
+ | | | using the DPDK Packet Framework by using one | |
+ | | | stub table for each input NIC port. | |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 3 | hash-[spec]-8-lru | LRU hash table with 8-byte key size and 16 million | 16 million entries are successfully added to the |
+ | | | entries. | hash table with the following key format: |
+ | | | | |
+ | | | | [4-byte index, 4 bytes of 0] |
+ | | | | |
+ | | | | The action configured for all table entries is |
+ | | | | "Send to output port", with the output port index |
+ | | | | uniformly distributed for the range of output ports. |
+ | | | | |
+ | | | | The default table rule (used in the case of a lookup |
+ | | | | miss) is to drop the packet. |
+ | | | | |
+ | | | | At run time, core A is creating the following lookup |
+ | | | | key and storing it into the packet meta data for |
+ | | | | core B to use for table lookup: |
+ | | | | |
+ | | | | [destination IPv4 address, 4 bytes of 0] |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 4 | hash-[spec]-8-ext | Extendible bucket hash table with 8-byte key size | Same as hash-[spec]-8-lru table entries, above. |
+ | | | and 16 million entries. | |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 5 | hash-[spec]-16-lru | LRU hash table with 16-byte key size and 16 million | 16 million entries are successfully added to the hash |
+ | | | entries. | table with the following key format: |
+ | | | | |
+ | | | | [4-byte index, 12 bytes of 0] |
+ | | | | |
+ | | | | The action configured for all table entries is |
+ | | | | "Send to output port", with the output port index |
+ | | | | uniformly distributed for the range of output ports. |
+ | | | | |
+ | | | | The default table rule (used in the case of a lookup |
+ | | | | miss) is to drop the packet. |
+ | | | | |
+ | | | | At run time, core A is creating the following lookup |
+ | | | | key and storing it into the packet meta data for core |
+ | | | | B to use for table lookup: |
+ | | | | |
+ | | | | [destination IPv4 address, 12 bytes of 0] |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 6 | hash-[spec]-16-ext | Extendible bucket hash table with 16-byte key size | Same as hash-[spec]-16-lru table entries, above. |
+ | | | and 16 million entries. | |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 7 | hash-[spec]-32-lru | LRU hash table with 32-byte key size and 16 million | 16 million entries are successfully added to the hash |
+ | | | entries. | table with the following key format: |
+ | | | | |
+ | | | | [4-byte index, 28 bytes of 0] |
+ | | | | |
+ | | | | The action configured for all table entries is |
+ | | | | "Send to output port", with the output port index |
+ | | | | uniformly distributed for the range of output ports. |
+ | | | | |
+ | | | | The default table rule (used in the case of a lookup |
+ | | | | miss) is to drop the packet. |
+ | | | | |
+ | | | | At run time, core A creates the following lookup |
+ | | | | key and stores it in the packet metadata for |
+ | | | | core B to use for table lookup: |
+ | | | | |
+ | | | | [destination IPv4 address, 28 bytes of 0] |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 8 | hash-[spec]-32-ext | Extendible bucket hash table with 32-byte key size | Same as hash-[spec]-32-lru table entries, above. |
+ | | | and 16 million entries. | |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 9 | lpm | Longest Prefix Match (LPM) IPv4 table. | In the case of two ports, two routes |
+ | | | | are added to the table: |
+ | | | | |
+ | | | | [0.0.0.0/9 => send to output port 0] |
+ | | | | |
+ | | | | [0.128.0.0/9 => send to output port 1] |
+ | | | | |
+ | | | | In the case of four ports, four routes are added to |
+ | | | | the table: |
+ | | | | |
+ | | | | [0.0.0.0/10 => send to output port 0] |
+ | | | | |
+ | | | | [0.64.0.0/10 => send to output port 1] |
+ | | | | |
+ | | | | [0.128.0.0/10 => send to output port 2] |
+ | | | | |
+ | | | | [0.192.0.0/10 => send to output port 3] |
+ | | | | |
+ | | | | The default table rule (used in the case of a lookup |
+ | | | | miss) is to drop the packet. |
+ | | | | |
+ | | | | At run time, core A stores the IPv4 destination |
+ | | | | address in the packet metadata for core B to use |
+ | | | | later as the lookup key. |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
+ | 10 | acl | Access Control List (ACL) table. | In the case of two ports, two ACL rules are added to |
+ | | | | the table: |
+ | | | | |
+ | | | | [priority = 0 (highest), |
+ | | | | |
+ | | | | IPv4 source = ANY, |
+ | | | | |
+ | | | | IPv4 destination = 0.0.0.0/9, |
+ | | | | |
+ | | | | L4 protocol = ANY, |
+ | | | | |
+ | | | | TCP source port = ANY, |
+ | | | | |
+ | | | | TCP destination port = ANY |
+ | | | | |
+ | | | | => send to output port 0] |
+ | | | | |
+ | | | | |
+ | | | | [priority = 0 (highest), |
+ | | | | |
+ | | | | IPv4 source = ANY, |
+ | | | | |
+ | | | | IPv4 destination = 0.128.0.0/9, |
+ | | | | |
+ | | | | L4 protocol = ANY, |
+ | | | | |
+ | | | | TCP source port = ANY, |
+ | | | | |
+ | | | | TCP destination port = ANY |
+ | | | | |
+ | | | | => send to output port 1] |
+ | | | | |
+ | | | | |
+ | | | | The default table rule (used in the case of a lookup |
+ | | | | miss) is to drop the packet. |
+ | | | | |
+ +-------+------------------------+----------------------------------------------------------+-------------------------------------------------------+
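
The key layouts and route prefixes listed in the table can be sketched in plain C.
This is a minimal illustration rather than the test application's code: ``build_lookup_key()`` and ``lpm_expected_port()`` are hypothetical helper names, not DPDK APIs.
The sketch assumes that the destination address is written to the key in network byte order and that the port count is a power of two; the table itself only spells out the two-port and four-port cases.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK API): fill a hash lookup key laid out as
 * [destination IPv4 address, key_size - 4 bytes of 0].
 * dst_ip is given in host byte order and written in network byte order
 * (an assumption of this sketch).
 * Returns 0 on success, -1 if the key is too small to hold the address.
 */
int build_lookup_key(uint8_t *key, size_t key_size, uint32_t dst_ip)
{
    assert(key != NULL);
    if (key_size < 4)
        return -1;

    memset(key, 0, key_size);           /* zero padding after the address */
    key[0] = (uint8_t)(dst_ip >> 24);
    key[1] = (uint8_t)(dst_ip >> 16);
    key[2] = (uint8_t)(dst_ip >> 8);
    key[3] = (uint8_t)dst_ip;
    return 0;
}

/*
 * Hypothetical helper (not a DPDK API): output port selected by the routes
 * in the "lpm" row. With 2 ports the routes are 0.0.0.0/9 and 0.128.0.0/9;
 * with 4 ports they are 0.0.0.0/10, 0.64.0.0/10, 0.128.0.0/10 and
 * 0.192.0.0/10. Other power-of-two port counts are an extrapolation.
 * Returns the port index, or -1 for a lookup miss (default rule: drop).
 */
int lpm_expected_port(uint32_t dst_ip, unsigned int n_ports)
{
    unsigned int bits = 0;

    /* Only power-of-two port counts up to 2^24 fit this prefix scheme. */
    if (n_ports == 0 || (n_ports & (n_ports - 1)) != 0 ||
        n_ports > (1u << 24))
        return -1;

    while ((1u << bits) < n_ports)      /* bits = log2(n_ports) */
        bits++;

    if ((dst_ip >> 24) != 0)            /* first octet must be 0 to match */
        return -1;

    /* The next log2(n_ports) address bits select the output port. */
    return (int)((dst_ip >> (24 - bits)) & (n_ports - 1));
}
```

For example, under these assumptions a packet addressed to 0.128.0.1 produces a 16-byte key starting with the bytes 00 80 00 01 followed by 12 zero bytes, and with two ports it matches the 0.128.0.0/9 route, which sends it to output port 1.
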
Input Traffic
~~~~~~~~~~~~~