From: Jingjing Wu Date: Thu, 25 Feb 2016 07:33:35 +0000 (+0800) Subject: examples/vmdq_dcb: support X710 X-Git-Tag: spdx-start~7357 X-Git-Url: http://git.droids-corp.org/?a=commitdiff_plain;h=8cc72f2814dd;p=dpdk.git examples/vmdq_dcb: support X710 Currently, the example vmdq_dcb only works on Intel(R) 82599 NICs. This patch extends this sample to make it work both on Intel(R) 82599 and X710/XL710 NICs by making the following changes: 1. add VMDQ base queue checking to avoid forwarding on PF queues. 2. assign each VMDQ pool to a MAC address. 3. add more arguments (nb-tcs, enable-rss) to change the default setting 4. extend the max number of queues from 128 to 1024. This patch also reworks the user guide for the vmdq_dcb sample. Signed-off-by: Jingjing Wu Acked-by: Helin Zhang --- diff --git a/doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst b/doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst index e9ced9f872..bf55fdad5c 100644 --- a/doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst +++ b/doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst @@ -32,8 +32,8 @@ VMDQ and DCB Forwarding Sample Application ========================================== The VMDQ and DCB Forwarding sample application is a simple example of packet processing using the DPDK. -The application performs L2 forwarding using VMDQ and DCB to divide the incoming traffic into 128 queues. -The traffic splitting is performed in hardware by the VMDQ and DCB features of the Intel® 82599 10 Gigabit Ethernet Controller. +The application performs L2 forwarding using VMDQ and DCB to divide the incoming traffic into queues. +The traffic splitting is performed in hardware by the VMDQ and DCB features of the Intel® 82599 and X710/XL710 Ethernet Controllers. Overview -------- @@ -41,28 +41,27 @@ Overview This sample application can be used as a starting point for developing a new application that is based on the DPDK and uses VMDQ and DCB for traffic partitioning. 
-The VMDQ and DCB filters work on VLAN traffic to divide the traffic into 128 input queues on the basis of the VLAN ID field and -VLAN user priority field. -VMDQ filters split the traffic into 16 or 32 groups based on the VLAN ID. -Then, DCB places each packet into one of either 4 or 8 queues within that group, based upon the VLAN user priority field. - -In either case, 16 groups of 8 queues, or 32 groups of 4 queues, the traffic can be split into 128 hardware queues on the NIC, -each of which can be polled individually by a DPDK application. +The VMDQ and DCB filters work on MAC and VLAN traffic to divide the traffic into input queues on the basis of the Destination MAC +address, VLAN ID and VLAN user priority fields. +VMDQ filters split the traffic into 16 or 32 groups based on the Destination MAC and VLAN ID. +Then, DCB places each packet into one of queues within that group, based upon the VLAN user priority field. All traffic is read from a single incoming port (port 0) and output on port 1, without any processing being performed. -The traffic is split into 128 queues on input, where each thread of the application reads from multiple queues. -For example, when run with 8 threads, that is, with the -c FF option, each thread receives and forwards packets from 16 queues. +With Intel® 82599 NIC, for example, the traffic is split into 128 queues on input, where each thread of the application reads from +multiple queues. When run with 8 threads, that is, with the -c FF option, each thread receives and forwards packets from 16 queues. -As supplied, the sample application configures the VMDQ feature to have 16 pools with 8 queues each as indicated in :numref:`figure_vmdq_dcb_example`. -The Intel® 82599 10 Gigabit Ethernet Controller NIC also supports the splitting of traffic into 32 pools of 4 queues each and -this can be used by changing the NUM_POOLS parameter in the supplied code. 
-The NUM_POOLS parameter can be passed on the command line, after the EAL parameters: +As supplied, the sample application configures the VMDQ feature to have 32 pools with 4 queues each as indicated in :numref:`figure_vmdq_dcb_example`. +The Intel® 82599 10 Gigabit Ethernet Controller NIC also supports the splitting of traffic into 16 pools of 8 queues. While the +Intel® X710 or XL710 Ethernet Controller NICs support many configurations of VMDQ pools of 4 or 8 queues each. For simplicity, only 16 +or 32 pools is supported in this sample. And queues numbers for each VMDQ pool can be changed by setting CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM +in config/common_* file. +The nb-pools, nb-tcs and enable-rss parameters can be passed on the command line, after the EAL parameters: .. code-block:: console - ./build/vmdq_dcb [EAL options] -- -p PORTMASK --nb-pools NP + ./build/vmdq_dcb [EAL options] -- -p PORTMASK --nb-pools NP --nb-tcs TC --enable-rss -where, NP can be 16 or 32. +where, NP can be 16 or 32, TC can be 4 or 8, rss is disabled by default. .. _figure_vmdq_dcb_example: @@ -72,9 +71,7 @@ where, NP can be 16 or 32. In Linux* user space, the application can display statistics with the number of packets received on each queue. -To have the application display the statistics, send a SIGHUP signal to the running application process, as follows: - -where, is the process id of the application process. +To have the application display the statistics, send a SIGHUP signal to the running application process. The VMDQ and DCB Forwarding sample application is in many ways simpler than the L2 Forwarding application (see :doc:`l2_forward_real_virtual`) @@ -117,7 +114,7 @@ To run the example in a linuxapp environment: .. 
code-block:: console - user@target:~$ ./build/vmdq_dcb -c f -n 4 -- -p 0x3 --nb-pools 16 + user@target:~$ ./build/vmdq_dcb -c f -n 4 -- -p 0x3 --nb-pools 32 --nb-tcs 4 Refer to the *DPDK Getting Started Guide* for general information on running applications and the Environment Abstraction Layer (EAL) options. @@ -143,34 +140,48 @@ a default structure is provided for VMDQ and DCB configuration to be filled in l .. code-block:: c /* empty vmdq+dcb configuration structure. Filled in programmatically */ - static const struct rte_eth_conf vmdq_dcb_conf_default = { .rxmode = { - .mq_mode = ETH_VMDQ_DCB, + .mq_mode = ETH_MQ_RX_VMDQ_DCB, .split_hdr_size = 0, - .header_split = 0, /**< Header Split disabled */ + .header_split = 0, /**< Header Split disabled */ .hw_ip_checksum = 0, /**< IP checksum offload disabled */ .hw_vlan_filter = 0, /**< VLAN filtering disabled */ - .jumbo_frame = 0, /**< Jumbo Frame Support disabled */ + .jumbo_frame = 0, /**< Jumbo Frame Support disabled */ }, - .txmode = { - .mq_mode = ETH_DCB_NONE, + .mq_mode = ETH_MQ_TX_VMDQ_DCB, }, - + /* + * should be overridden separately in code with + * appropriate values + */ .rx_adv_conf = { - /* - * should be overridden separately in code with - * appropriate values - */ - .vmdq_dcb_conf = { - .nb_queue_pools = ETH_16_POOLS, + .nb_queue_pools = ETH_32_POOLS, + .enable_default_pool = 0, + .default_pool = 0, + .nb_pool_maps = 0, + .pool_map = {{0, 0},}, + .dcb_tc = {0}, + }, + .dcb_rx_conf = { + .nb_tcs = ETH_4_TCS, + /** Traffic class each UP mapped to. 
*/ + .dcb_tc = {0}, + }, + .vmdq_rx_conf = { + .nb_queue_pools = ETH_32_POOLS, .enable_default_pool = 0, .default_pool = 0, .nb_pool_maps = 0, .pool_map = {{0, 0},}, - .dcb_queue = {0}, + }, + }, + .tx_adv_conf = { + .vmdq_dcb_tx_conf = { + .nb_queue_pools = ETH_32_POOLS, + .dcb_tc = {0}, }, }, }; @@ -178,11 +189,17 @@ a default structure is provided for VMDQ and DCB configuration to be filled in l The get_eth_conf() function fills in an rte_eth_conf structure with the appropriate values, based on the global vlan_tags array, and dividing up the possible user priority values equally among the individual queues -(also referred to as traffic classes) within each pool, that is, -if the number of pools is 32, then the user priority fields are allocated two to a queue. +(also referred to as traffic classes) within each pool. With Intel® 82599 NIC, +if the number of pools is 32, then the user priority fields are allocated 2 to a queue. If 16 pools are used, then each of the 8 user priority fields is allocated to its own queue within the pool. +With Intel® X710/XL710 NICs, if number of tcs is 4, and number of queues in pool is 8, +then the user priority fields are allocated 2 to one tc, and a tc has 2 queues mapping to it, then +RSS will determine the destination queue in 2. For the VLAN IDs, each one can be allocated to possibly multiple pools of queues, so the pools parameter in the rte_eth_vmdq_dcb_conf structure is specified as a bitmask value. +For destination MAC, each VMDQ pool will be assigned with a MAC address. In this sample, each VMDQ pool +is assigned to the MAC like 52:54:00:12::, that is, +the MAC of VMDQ pool 2 on port 1 is 52:54:00:12:01:02. .. 
code-block:: c @@ -193,38 +210,84 @@ so the pools parameter in the rte_eth_vmdq_dcb_conf structure is specified as a 24, 25, 26, 27, 28, 29, 30, 31 }; + /* pool mac addr template, pool mac addr is like: 52 54 00 12 port# pool# */ + static struct ether_addr pool_addr_template = { + .addr_bytes = {0x52, 0x54, 0x00, 0x12, 0x00, 0x00} + }; /* Builds up the correct configuration for vmdq+dcb based on the vlan tags array * given above, and the number of traffic classes available for use. */ - static inline int - get_eth_conf(struct rte_eth_conf *eth_conf, enum rte_eth_nb_pools num_pools) + get_eth_conf(struct rte_eth_conf *eth_conf) { struct rte_eth_vmdq_dcb_conf conf; - unsigned i; - - if (num_pools != ETH_16_POOLS && num_pools != ETH_32_POOLS ) return -1; - - conf.nb_queue_pools = num_pools; + struct rte_eth_vmdq_rx_conf vmdq_conf; + struct rte_eth_dcb_rx_conf dcb_conf; + struct rte_eth_vmdq_dcb_tx_conf tx_conf; + uint8_t i; + + conf.nb_queue_pools = (enum rte_eth_nb_pools)num_pools; + vmdq_conf.nb_queue_pools = (enum rte_eth_nb_pools)num_pools; + tx_conf.nb_queue_pools = (enum rte_eth_nb_pools)num_pools; + conf.nb_pool_maps = num_pools; + vmdq_conf.nb_pool_maps = num_pools; conf.enable_default_pool = 0; + vmdq_conf.enable_default_pool = 0; conf.default_pool = 0; /* set explicit value, even if not used */ - conf.nb_pool_maps = sizeof( vlan_tags )/sizeof( vlan_tags[ 0 ]); + vmdq_conf.default_pool = 0; - for (i = 0; i < conf.nb_pool_maps; i++){ - conf.pool_map[i].vlan_id = vlan_tags[ i ]; - conf.pool_map[i].pools = 1 << (i % num_pools); + for (i = 0; i < conf.nb_pool_maps; i++) { + conf.pool_map[i].vlan_id = vlan_tags[i]; + vmdq_conf.pool_map[i].vlan_id = vlan_tags[i]; + conf.pool_map[i].pools = 1UL << i ; + vmdq_conf.pool_map[i].pools = 1UL << i; } - for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++){ - conf.dcb_queue[i] = (uint8_t)(i % (NUM_QUEUES/num_pools)); + conf.dcb_tc[i] = i % num_tcs; + dcb_conf.dcb_tc[i] = i % num_tcs; + tx_conf.dcb_tc[i] = i % num_tcs; + } + 
dcb_conf.nb_tcs = (enum rte_eth_nb_tcs)num_tcs; + (void)(rte_memcpy(eth_conf, &vmdq_dcb_conf_default, sizeof(*eth_conf))); + (void)(rte_memcpy(&eth_conf->rx_adv_conf.vmdq_dcb_conf, &conf, + sizeof(conf))); + (void)(rte_memcpy(&eth_conf->rx_adv_conf.dcb_rx_conf, &dcb_conf, + sizeof(dcb_conf))); + (void)(rte_memcpy(&eth_conf->rx_adv_conf.vmdq_rx_conf, &vmdq_conf, + sizeof(vmdq_conf))); + (void)(rte_memcpy(&eth_conf->tx_adv_conf.vmdq_dcb_tx_conf, &tx_conf, + sizeof(tx_conf))); + if (rss_enable) { + eth_conf->rxmode.mq_mode= ETH_MQ_RX_VMDQ_DCB_RSS; + eth_conf->rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP | + ETH_RSS_UDP | + ETH_RSS_TCP | + ETH_RSS_SCTP; } - - (void) rte_memcpy(eth_conf, &vmdq_dcb_conf_default, sizeof(\*eth_conf)); - (void) rte_memcpy(&eth_conf->rx_adv_conf.vmdq_dcb_conf, &conf, sizeof(eth_conf->rx_adv_conf.vmdq_dcb_conf)); - return 0; } + ...... + + /* Set mac for each pool.*/ + for (q = 0; q < num_pools; q++) { + struct ether_addr mac; + mac = pool_addr_template; + mac.addr_bytes[4] = port; + mac.addr_bytes[5] = q; + printf("Port %u vmdq pool %u set mac %02x:%02x:%02x:%02x:%02x:%02x\n", + port, q, + mac.addr_bytes[0], mac.addr_bytes[1], + mac.addr_bytes[2], mac.addr_bytes[3], + mac.addr_bytes[4], mac.addr_bytes[5]); + retval = rte_eth_dev_mac_addr_add(port, &mac, + q + vmdq_pool_base); + if (retval) { + printf("mac addr add failed at pool %d\n", q); + return retval; + } + } + Once the network port has been initialized using the correct VMDQ and DCB values, the initialization of the port's RX and TX hardware rings is performed similarly to that in the L2 Forwarding sample application. diff --git a/examples/vmdq_dcb/main.c b/examples/vmdq_dcb/main.c index b90ac2817c..62e1422a85 100644 --- a/examples/vmdq_dcb/main.c +++ b/examples/vmdq_dcb/main.c @@ -70,18 +70,36 @@ #include /* basic constants used in application */ -#define NUM_QUEUES 128 - -#define NUM_MBUFS 64*1024 +#define MAX_QUEUES 1024 +/* + * 1024 queues require to meet the needs of a large number of vmdq_pools.
+ * (RX/TX_queue_nb * RX/TX_ring_descriptors_nb) per port. + */ +#define NUM_MBUFS_PER_PORT (MAX_QUEUES * RTE_MAX(RTE_TEST_RX_DESC_DEFAULT, \ + RTE_TEST_TX_DESC_DEFAULT)) #define MBUF_CACHE_SIZE 64 +#define MAX_PKT_BURST 32 + +/* + * Configurable number of RX/TX ring descriptors + */ +#define RTE_TEST_RX_DESC_DEFAULT 128 +#define RTE_TEST_TX_DESC_DEFAULT 512 + #define INVALID_PORT_ID 0xFF /* mask of enabled ports */ -static uint32_t enabled_port_mask = 0; +static uint32_t enabled_port_mask; +static uint8_t ports[RTE_MAX_ETHPORTS]; +static unsigned num_ports; -/* number of pools (if user does not specify any, 16 by default */ -static enum rte_eth_nb_pools num_pools = ETH_16_POOLS; +/* number of pools (if user does not specify any, 32 by default */ +static enum rte_eth_nb_pools num_pools = ETH_32_POOLS; +static enum rte_eth_nb_tcs num_tcs = ETH_4_TCS; +static uint16_t num_queues, num_vmdq_queues; +static uint16_t vmdq_pool_base, vmdq_queue_base; +static uint8_t rss_enable; /* empty vmdq+dcb configuration structure. Filled in programatically */ static const struct rte_eth_conf vmdq_dcb_conf_default = { @@ -94,29 +112,44 @@ static const struct rte_eth_conf vmdq_dcb_conf_default = { .jumbo_frame = 0, /**< Jumbo Frame Support disabled */ }, .txmode = { - .mq_mode = ETH_MQ_TX_NONE, + .mq_mode = ETH_MQ_TX_VMDQ_DCB, }, + /* + * should be overridden separately in code with + * appropriate values + */ .rx_adv_conf = { - /* - * should be overridden separately in code with - * appropriate values - */ .vmdq_dcb_conf = { - .nb_queue_pools = ETH_16_POOLS, + .nb_queue_pools = ETH_32_POOLS, .enable_default_pool = 0, .default_pool = 0, .nb_pool_maps = 0, .pool_map = {{0, 0},}, .dcb_tc = {0}, }, + .dcb_rx_conf = { + .nb_tcs = ETH_4_TCS, + /** Traffic class each UP mapped to. 
*/ + .dcb_tc = {0}, + }, + .vmdq_rx_conf = { + .nb_queue_pools = ETH_32_POOLS, + .enable_default_pool = 0, + .default_pool = 0, + .nb_pool_maps = 0, + .pool_map = {{0, 0},}, + }, + }, + .tx_adv_conf = { + .vmdq_dcb_tx_conf = { + .nb_queue_pools = ETH_32_POOLS, + .dcb_tc = {0}, + }, }, }; -static uint8_t ports[RTE_MAX_ETHPORTS]; -static unsigned num_ports = 0; - /* array used for printing out statistics */ -volatile unsigned long rxPackets[ NUM_QUEUES ] = {0}; +volatile unsigned long rxPackets[MAX_QUEUES] = {0}; const uint16_t vlan_tags[] = { 0, 1, 2, 3, 4, 5, 6, 7, @@ -125,30 +158,64 @@ const uint16_t vlan_tags[] = { 24, 25, 26, 27, 28, 29, 30, 31 }; +const uint16_t num_vlans = RTE_DIM(vlan_tags); +/* pool mac addr template, pool mac addr is like: 52 54 00 12 port# pool# */ +static struct ether_addr pool_addr_template = { + .addr_bytes = {0x52, 0x54, 0x00, 0x12, 0x00, 0x00} +}; + +/* ethernet addresses of ports */ +static struct ether_addr vmdq_ports_eth_addr[RTE_MAX_ETHPORTS]; + /* Builds up the correct configuration for vmdq+dcb based on the vlan tags array * given above, and the number of traffic classes available for use. 
*/ static inline int -get_eth_conf(struct rte_eth_conf *eth_conf, enum rte_eth_nb_pools num_pools) +get_eth_conf(struct rte_eth_conf *eth_conf) { struct rte_eth_vmdq_dcb_conf conf; - unsigned i; - - if (num_pools != ETH_16_POOLS && num_pools != ETH_32_POOLS ) return -1; - - conf.nb_queue_pools = num_pools; + struct rte_eth_vmdq_rx_conf vmdq_conf; + struct rte_eth_dcb_rx_conf dcb_conf; + struct rte_eth_vmdq_dcb_tx_conf tx_conf; + uint8_t i; + + conf.nb_queue_pools = (enum rte_eth_nb_pools)num_pools; + vmdq_conf.nb_queue_pools = (enum rte_eth_nb_pools)num_pools; + tx_conf.nb_queue_pools = (enum rte_eth_nb_pools)num_pools; + conf.nb_pool_maps = num_pools; + vmdq_conf.nb_pool_maps = num_pools; conf.enable_default_pool = 0; + vmdq_conf.enable_default_pool = 0; conf.default_pool = 0; /* set explicit value, even if not used */ - conf.nb_pool_maps = sizeof( vlan_tags )/sizeof( vlan_tags[ 0 ]); - for (i = 0; i < conf.nb_pool_maps; i++){ - conf.pool_map[i].vlan_id = vlan_tags[ i ]; - conf.pool_map[i].pools = 1 << (i % num_pools); + vmdq_conf.default_pool = 0; + + for (i = 0; i < conf.nb_pool_maps; i++) { + conf.pool_map[i].vlan_id = vlan_tags[i]; + vmdq_conf.pool_map[i].vlan_id = vlan_tags[i]; + conf.pool_map[i].pools = 1UL << i; + vmdq_conf.pool_map[i].pools = 1UL << i; } for (i = 0; i < ETH_DCB_NUM_USER_PRIORITIES; i++){ - conf.dcb_tc[i] = (uint8_t)(i % (NUM_QUEUES/num_pools)); + conf.dcb_tc[i] = i % num_tcs; + dcb_conf.dcb_tc[i] = i % num_tcs; + tx_conf.dcb_tc[i] = i % num_tcs; } + dcb_conf.nb_tcs = (enum rte_eth_nb_tcs)num_tcs; (void)(rte_memcpy(eth_conf, &vmdq_dcb_conf_default, sizeof(*eth_conf))); (void)(rte_memcpy(&eth_conf->rx_adv_conf.vmdq_dcb_conf, &conf, - sizeof(eth_conf->rx_adv_conf.vmdq_dcb_conf))); + sizeof(conf))); + (void)(rte_memcpy(&eth_conf->rx_adv_conf.dcb_rx_conf, &dcb_conf, + sizeof(dcb_conf))); + (void)(rte_memcpy(&eth_conf->rx_adv_conf.vmdq_rx_conf, &vmdq_conf, + sizeof(vmdq_conf))); + (void)(rte_memcpy(&eth_conf->tx_adv_conf.vmdq_dcb_tx_conf, &tx_conf, +
sizeof(tx_conf))); + if (rss_enable) { + eth_conf->rxmode.mq_mode = ETH_MQ_RX_VMDQ_DCB_RSS; + eth_conf->rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP | + ETH_RSS_UDP | + ETH_RSS_TCP | + ETH_RSS_SCTP; + } return 0; } @@ -159,51 +226,137 @@ get_eth_conf(struct rte_eth_conf *eth_conf, enum rte_eth_nb_pools num_pools) static inline int port_init(uint8_t port, struct rte_mempool *mbuf_pool) { - struct rte_eth_conf port_conf; - const uint16_t rxRings = ETH_VMDQ_DCB_NUM_QUEUES, - txRings = (uint16_t)rte_lcore_count(); - const uint16_t rxRingSize = 128, txRingSize = 512; + struct rte_eth_dev_info dev_info; + struct rte_eth_conf port_conf = {0}; + const uint16_t rxRingSize = RTE_TEST_RX_DESC_DEFAULT; + const uint16_t txRingSize = RTE_TEST_TX_DESC_DEFAULT; int retval; uint16_t q; + uint16_t queues_per_pool; + uint32_t max_nb_pools; + + /* + * The max pool number from dev_info will be used to validate the pool + * number specified in cmd line + */ + rte_eth_dev_info_get(port, &dev_info); + max_nb_pools = (uint32_t)dev_info.max_vmdq_pools; + /* + * We allow to process part of VMDQ pools specified by num_pools in + * command line. + */ + if (num_pools > max_nb_pools) { + printf("num_pools %d >max_nb_pools %d\n", + num_pools, max_nb_pools); + return -1; + } - retval = get_eth_conf(&port_conf, num_pools); + /* + * NIC queues are divided into pf queues and vmdq queues. + * There is assumption here all ports have the same configuration! 
+ */ + vmdq_queue_base = dev_info.vmdq_queue_base; + vmdq_pool_base = dev_info.vmdq_pool_base; + printf("vmdq queue base: %d pool base %d\n", + vmdq_queue_base, vmdq_pool_base); + if (vmdq_pool_base == 0) { + num_vmdq_queues = dev_info.max_rx_queues; + num_queues = dev_info.max_rx_queues; + if (num_tcs != num_vmdq_queues / num_pools) { + printf("nb_tcs %d is invalid considering with" + " nb_pools %d, nb_tcs * nb_pools should = %d\n", + num_tcs, num_pools, num_vmdq_queues); + return -1; + } + } else { + queues_per_pool = dev_info.vmdq_queue_num / + dev_info.max_vmdq_pools; + if (num_tcs > queues_per_pool) { + printf("num_tcs %d > num of queues per pool %d\n", + num_tcs, queues_per_pool); + return -1; + } + num_vmdq_queues = num_pools * queues_per_pool; + num_queues = vmdq_queue_base + num_vmdq_queues; + printf("Configured vmdq pool num: %u," + " each vmdq pool has %u queues\n", + num_pools, queues_per_pool); + } + + if (port >= rte_eth_dev_count()) + return -1; + + retval = get_eth_conf(&port_conf); if (retval < 0) return retval; - if (port >= rte_eth_dev_count()) return -1; - - retval = rte_eth_dev_configure(port, rxRings, txRings, &port_conf); + /* + * Though in this example, all queues including pf queues are setup. + * This is because VMDQ queues doesn't always start from zero, and the + * PMD layer doesn't support selectively initialising part of rx/tx + * queues. 
+ */ + retval = rte_eth_dev_configure(port, num_queues, num_queues, &port_conf); if (retval != 0) return retval; - for (q = 0; q < rxRings; q ++) { + for (q = 0; q < num_queues; q++) { retval = rte_eth_rx_queue_setup(port, q, rxRingSize, - rte_eth_dev_socket_id(port), - NULL, - mbuf_pool); - if (retval < 0) + rte_eth_dev_socket_id(port), + NULL, + mbuf_pool); + if (retval < 0) { + printf("initialize rx queue %d failed\n", q); return retval; + } } - for (q = 0; q < txRings; q ++) { + for (q = 0; q < num_queues; q++) { retval = rte_eth_tx_queue_setup(port, q, txRingSize, - rte_eth_dev_socket_id(port), - NULL); - if (retval < 0) + rte_eth_dev_socket_id(port), + NULL); + if (retval < 0) { + printf("initialize tx queue %d failed\n", q); return retval; + } } retval = rte_eth_dev_start(port); - if (retval < 0) + if (retval < 0) { + printf("port %d start failed\n", port); return retval; + } - struct ether_addr addr; - rte_eth_macaddr_get(port, &addr); + rte_eth_macaddr_get(port, &vmdq_ports_eth_addr[port]); printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8 " %02"PRIx8" %02"PRIx8" %02"PRIx8"\n", (unsigned)port, - addr.addr_bytes[0], addr.addr_bytes[1], addr.addr_bytes[2], - addr.addr_bytes[3], addr.addr_bytes[4], addr.addr_bytes[5]); + vmdq_ports_eth_addr[port].addr_bytes[0], + vmdq_ports_eth_addr[port].addr_bytes[1], + vmdq_ports_eth_addr[port].addr_bytes[2], + vmdq_ports_eth_addr[port].addr_bytes[3], + vmdq_ports_eth_addr[port].addr_bytes[4], + vmdq_ports_eth_addr[port].addr_bytes[5]); + + /* Set mac for each pool.*/ + for (q = 0; q < num_pools; q++) { + struct ether_addr mac; + + mac = pool_addr_template; + mac.addr_bytes[4] = port; + mac.addr_bytes[5] = q; + printf("Port %u vmdq pool %u set mac %02x:%02x:%02x:%02x:%02x:%02x\n", + port, q, + mac.addr_bytes[0], mac.addr_bytes[1], + mac.addr_bytes[2], mac.addr_bytes[3], + mac.addr_bytes[4], mac.addr_bytes[5]); + retval = rte_eth_dev_mac_addr_add(port, &mac, + q + vmdq_pool_base); + if (retval) { + printf("mac addr add 
failed at pool %d\n", q); + return retval; + } + } return 0; } @@ -229,6 +382,28 @@ vmdq_parse_num_pools(const char *q_arg) return 0; } +/* Check num_tcs parameter and set it if OK*/ +static int +vmdq_parse_num_tcs(const char *q_arg) +{ + char *end = NULL; + int n; + + /* parse number string */ + n = strtol(q_arg, &end, 10); + if ((q_arg[0] == '\0') || (end == NULL) || (*end != '\0')) + return -1; + + if (n != 4 && n != 8) + return -1; + if (n == 4) + num_tcs = ETH_4_TCS; + else + num_tcs = ETH_8_TCS; + + return 0; +} + static int parse_portmask(const char *portmask) { @@ -251,7 +426,9 @@ static void vmdq_usage(const char *prgname) { printf("%s [EAL options] -- -p PORTMASK]\n" - " --nb-pools NP: number of pools (16 default, 32)\n", + " --nb-pools NP: number of pools (32 default, 16)\n" + " --nb-tcs NP: number of TCs (4 default, 8)\n" + " --enable-rss: enable RSS (disabled by default)\n", prgname); } @@ -265,11 +442,14 @@ vmdq_parse_args(int argc, char **argv) const char *prgname = argv[0]; static struct option long_option[] = { {"nb-pools", required_argument, NULL, 0}, + {"nb-tcs", required_argument, NULL, 0}, + {"enable-rss", 0, NULL, 0}, {NULL, 0, 0, 0} }; /* Parse command line */ - while ((opt = getopt_long(argc, argv, "p:",long_option,&option_index)) != EOF) { + while ((opt = getopt_long(argc, argv, "p:", long_option, + &option_index)) != EOF) { switch (opt) { /* portmask */ case 'p': @@ -281,43 +461,71 @@ vmdq_parse_args(int argc, char **argv) } break; case 0: - if (vmdq_parse_num_pools(optarg) == -1){ - printf("invalid number of pools\n"); - vmdq_usage(prgname); - return -1; + if (!strcmp(long_option[option_index].name, "nb-pools")) { + if (vmdq_parse_num_pools(optarg) == -1) { + printf("invalid number of pools\n"); + return -1; + } } + + if (!strcmp(long_option[option_index].name, "nb-tcs")) { + if (vmdq_parse_num_tcs(optarg) == -1) { + printf("invalid number of tcs\n"); + return -1; + } + } + + if (!strcmp(long_option[option_index].name, "enable-rss")) + 
rss_enable = 1; break; + default: vmdq_usage(prgname); return -1; } } - for(i = 0; i < RTE_MAX_ETHPORTS; i++) - { + for (i = 0; i < RTE_MAX_ETHPORTS; i++) { if (enabled_port_mask & (1 << i)) ports[num_ports++] = (uint8_t)i; } if (num_ports < 2 || num_ports % 2) { printf("Current enabled port number is %u," - "but it should be even and at least 2\n",num_ports); + " but it should be even and at least 2\n", num_ports); return -1; } return 0; } +static void +update_mac_address(struct rte_mbuf *m, unsigned dst_port) +{ + struct ether_hdr *eth; + void *tmp; + + eth = rte_pktmbuf_mtod(m, struct ether_hdr *); + + /* 02:00:00:00:00:xx */ + tmp = &eth->d_addr.addr_bytes[0]; + *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t)dst_port << 40); + + /* src addr */ + ether_addr_copy(&vmdq_ports_eth_addr[dst_port], &eth->s_addr); +} /* When we receive a HUP signal, print out our stats */ static void sighup_handler(int signum) { - unsigned q; - for (q = 0; q < NUM_QUEUES; q ++) { - if (q % (NUM_QUEUES/num_pools) == 0) - printf("\nPool %u: ", q/(NUM_QUEUES/num_pools)); - printf("%lu ", rxPackets[ q ]); + unsigned q = vmdq_queue_base; + + for (; q < num_queues; q++) { + if (q % (num_vmdq_queues / num_pools) == 0) + printf("\nPool %u: ", (q - vmdq_queue_base) / + (num_vmdq_queues / num_pools)); + printf("%lu ", rxPackets[q]); } printf("\nFinished handling signal %d\n", signum); } @@ -326,20 +534,43 @@ sighup_handler(int signum) * Main thread that does the work, reading from INPUT_PORT * and writing to OUTPUT_PORT */ -static __attribute__((noreturn)) int +static int lcore_main(void *arg) { const uintptr_t core_num = (uintptr_t)arg; const unsigned num_cores = rte_lcore_count(); - uint16_t startQueue = (uint16_t)(core_num * (NUM_QUEUES/num_cores)); - uint16_t endQueue = (uint16_t)(startQueue + (NUM_QUEUES/num_cores)); + uint16_t startQueue, endQueue; uint16_t q, i, p; + const uint16_t quot = (uint16_t)(num_vmdq_queues / num_cores); + const uint16_t remainder = (uint16_t)(num_vmdq_queues % 
num_cores); + + + if (remainder) { + if (core_num < remainder) { + startQueue = (uint16_t)(core_num * (quot + 1)); + endQueue = (uint16_t)(startQueue + quot + 1); + } else { + startQueue = (uint16_t)(core_num * quot + remainder); + endQueue = (uint16_t)(startQueue + quot); + } + } else { + startQueue = (uint16_t)(core_num * quot); + endQueue = (uint16_t)(startQueue + quot); + } + /* vmdq queue idx doesn't always start from zero.*/ + startQueue += vmdq_queue_base; + endQueue += vmdq_queue_base; printf("Core %u(lcore %u) reading queues %i-%i\n", (unsigned)core_num, rte_lcore_id(), startQueue, endQueue - 1); + if (startQueue == endQueue) { + printf("lcore %u has nothing to do\n", (unsigned)core_num); + return 0; + } + for (;;) { - struct rte_mbuf *buf[32]; + struct rte_mbuf *buf[MAX_PKT_BURST]; const uint16_t buf_size = sizeof(buf) / sizeof(buf[0]); for (p = 0; p < num_ports; p++) { const uint8_t src = ports[p]; @@ -351,12 +582,17 @@ lcore_main(void *arg) for (q = startQueue; q < endQueue; q++) { const uint16_t rxCount = rte_eth_rx_burst(src, q, buf, buf_size); - if (rxCount == 0) + + if (unlikely(rxCount == 0)) continue; + rxPackets[q] += rxCount; + for (i = 0; i < rxCount; i++) + update_mac_address(buf[i], dst); + const uint16_t txCount = rte_eth_tx_burst(dst, - (uint16_t)core_num, buf, rxCount); + q, buf, rxCount); if (txCount != rxCount) { for (i = txCount; i < rxCount; i++) rte_pktmbuf_free(buf[i]); @@ -381,12 +617,12 @@ static unsigned check_ports_num(unsigned nb_ports) num_ports = nb_ports; } - for (portid = 0; portid < num_ports; portid ++) { + for (portid = 0; portid < num_ports; portid++) { if (ports[portid] >= nb_ports) { printf("\nSpecified port ID(%u) exceeds max system port ID(%u)\n", ports[portid], (nb_ports - 1)); ports[portid] = INVALID_PORT_ID; - valid_num_ports --; + valid_num_ports--; } } return valid_num_ports; @@ -420,16 +656,16 @@ main(int argc, char *argv[]) rte_exit(EXIT_FAILURE, "Invalid VMDQ argument\n"); cores = rte_lcore_count(); - if 
((cores & (cores - 1)) != 0 || cores > 128) { + if ((cores & (cores - 1)) != 0 || cores > RTE_MAX_LCORE) { rte_exit(EXIT_FAILURE,"This program can only run on an even" - "number of cores(1-128)\n\n"); + " number of cores(1-%d)\n\n", RTE_MAX_LCORE); } nb_ports = rte_eth_dev_count(); if (nb_ports > RTE_MAX_ETHPORTS) nb_ports = RTE_MAX_ETHPORTS; - /* + /* * Update the global var NUM_PORTS and global array PORTS * and get value of var VALID_NUM_PORTS according to system ports number */ @@ -440,8 +676,9 @@ main(int argc, char *argv[]) rte_exit(EXIT_FAILURE, "Error with valid ports number is not even or less than 2\n"); } - mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports, - MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id()); + mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", + NUM_MBUFS_PER_PORT * nb_ports, MBUF_CACHE_SIZE, + 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id()); if (mbuf_pool == NULL) rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
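
Note on the queue split in lcore_main(): the patch replaces the fixed NUM_QUEUES/num_cores division with a quotient/remainder split that is then shifted by vmdq_queue_base. The standalone sketch below restates that arithmetic outside of DPDK so it can be checked in isolation. The helper lcore_queue_range() and struct queue_range are hypothetical names introduced only for this illustration; they are not part of the patch, and the sketch assumes nothing beyond the C standard library.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open range [start, end) of RX queue indexes polled by one core.
 * Hypothetical type, used only for this illustration. */
struct queue_range {
	uint16_t start;
	uint16_t end;
};

/*
 * Split num_vmdq_queues across num_cores following the arithmetic of the
 * patched lcore_main(): the first (num_vmdq_queues % num_cores) cores take
 * one extra queue, then every range is shifted by vmdq_queue_base so that
 * the PF queues below the base are skipped.
 */
static struct queue_range
lcore_queue_range(uint16_t core_num, uint16_t num_cores,
		  uint16_t num_vmdq_queues, uint16_t vmdq_queue_base)
{
	assert(num_cores > 0 && core_num < num_cores);

	const uint16_t quot = (uint16_t)(num_vmdq_queues / num_cores);
	const uint16_t remainder = (uint16_t)(num_vmdq_queues % num_cores);
	struct queue_range r;

	if (core_num < remainder) {
		/* cores below the remainder each poll quot + 1 queues */
		r.start = (uint16_t)(core_num * (quot + 1));
		r.end = (uint16_t)(r.start + quot + 1);
	} else {
		/* remaining cores poll quot queues (also the remainder == 0 case) */
		r.start = (uint16_t)(core_num * quot + remainder);
		r.end = (uint16_t)(r.start + quot);
	}

	/* vmdq queue idx doesn't always start from zero */
	r.start = (uint16_t)(r.start + vmdq_queue_base);
	r.end = (uint16_t)(r.end + vmdq_queue_base);
	return r;
}
```

With 128 VMDQ queues, 8 cores and a queue base of 0, core 0 gets the range [0, 16), which agrees with the "-c FF" case described in the reworked user guide. A non-zero base only shifts every range by the same amount; any particular base value used to exercise the helper is an assumption, since the real one comes from dev_info.vmdq_queue_base at run time.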