..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2018 6WIND S.A.

.. _switch_representation:

Switch Representation within DPDK Applications
==============================================

Introduction
------------

Network adapters with multiple physical ports and/or SR-IOV capabilities
usually support the offload of traffic steering rules between their virtual
functions (VFs), sub-functions (SFs), physical functions (PFs) and ports.

Like standard Ethernet switches, this involves a combination of automatic
MAC learning and manual configuration. For most purposes it is managed by
the host system and fully transparent to users and applications.

On the other hand, applications typically found on hypervisors that process
layer 2 (L2) traffic (such as OVS) need to steer traffic themselves
according to their own criteria.

Without a standard software interface to manage traffic steering rules
between VFs, SFs, PFs and the various physical ports of a given device,
applications cannot take advantage of these offloads; software processing is
mandatory even for traffic which ends up re-injected into the device it
originates from.

This document describes how such steering rules can be configured through
the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
(PF/VF steering) using a single physical port for clarity. The same logic
applies to any number of ports without necessarily involving SR-IOV.

Besides SR-IOV, a sub-function (SF) is a portion of the PCI device. An SF
netdev has its own dedicated queues (txq, rxq) and supports E-Switch
representation offload similar to existing PF and VF representors. An SF
shares PCI-level resources with other SFs and/or with its parent PCI
function.

Sub-functions are created on demand and coexist with VFs. The number of SFs
is limited by hardware resources.

Port Representors
-----------------

In many cases, traffic steering rules cannot be determined in advance;
applications usually have to process a bit of traffic in software before
thinking about offloading specific flows to hardware.

Applications therefore need the ability to receive and inject traffic to
various device endpoints (other VFs, SFs, PFs or physical ports) before
connecting them together. Device drivers must provide means to hook the
"other end" of these endpoints and to refer to them when configuring flow
rules.

This role is left to so-called "port representors" (also known as "VF
representors" in the specific context of VFs, "SF representors" in the
specific context of SFs), which are to DPDK what the Ethernet switch
device driver model (**switchdev**) [1]_ is to Linux, and which can be
thought of as a software "patch panel" front-end for applications.

- DPDK port representors are implemented as additional virtual Ethernet
  device (**ethdev**) instances, spawned on an as-needed basis through
  configuration parameters passed to the driver of the underlying
  device::

   -a pci:dbdf,representor=vf0
   -a pci:dbdf,representor=vf[0-3]
   -a pci:dbdf,representor=vf[0,5-11]
   -a pci:dbdf,representor=sf1
   -a pci:dbdf,representor=sf[0-1023]
   -a pci:dbdf,representor=sf[0,2-1023]

- As virtual devices, they may be more limited than their physical
  counterparts, for instance by exposing only a subset of device
  configuration callbacks and/or by not necessarily having Rx/Tx capability.

- Among other things, they can be used to assign MAC addresses to the
  resource they represent.

- Applications can tell port representors apart from other physical or
  virtual ports by checking the ``dev_flags`` field within their device
  information structure for the ``RTE_ETH_DEV_REPRESENTOR`` bit-field.

  .. code-block:: c

     struct rte_eth_dev_info {
         ...
         uint32_t dev_flags; /**< Device flags */
         ...
     };

- The device or group relationship of ports can be discovered using the
  switch ``domain_id`` field within the device's switch information
  structure. By default the switch ``domain_id`` of a port will be
  ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID`` to indicate that the port doesn't
  support the concept of a switch domain. Ports which do support the concept
  will be allocated a unique switch ``domain_id``, and ports within the same
  switch domain will share the same ``domain_id``. The switch ``port_id`` is
  used to specify the port ID in terms of the switch, so in the case of
  SR-IOV devices the switch ``port_id`` would represent the virtual function
  identifier of the port.

  .. code-block:: c

     /**
      * Ethernet device associated switch information
      */
     struct rte_eth_switch_info {
         const char *name; /**< switch name */
         uint16_t domain_id; /**< switch domain id */
         uint16_t port_id; /**< switch port id */
     };

.. [1] `Ethernet switch device driver model (switchdev)
   <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_

- For some PMDs, the memory usage of representors is huge when the number of
  representors grows, because mbufs are allocated for each descriptor of
  every Rx queue. Polling a large number of ports also brings more CPU load,
  cache misses and latency. A shared Rx queue can be used to share one Rx
  queue between the PF and representors in the same Rx domain.
  ``RTE_ETH_DEV_CAPA_RXQ_SHARE`` in the device info indicates this
  capability. Setting a non-zero share group in the Rx queue configuration
  enables sharing, and ``share_qid`` identifies the shared Rx queue within
  the group. Polling any member port can then receive packets of all member
  ports in the group, with the originating port ID saved in the mbuf.

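  As an illustration (a hedged sketch, not part of the original text),
  testpmd exposes this capability through its ``--rxq-share`` parameter;
  an invocation with VF representors and shared Rx queues might look like::

     dpdk-testpmd -a pci:dbdf,representor=vf[0-3] -- -i --rxq-share=2

  Here ``--rxq-share=2`` requests Rx queue sharing grouped per two ports;
  whether sharing actually takes effect depends on the PMD reporting
  ``RTE_ETH_DEV_CAPA_RXQ_SHARE``.
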
Basic SR-IOV
------------

"Basic" in the sense that it is not managed by applications, which
nonetheless expect traffic to flow between the various endpoints and the
outside as if everything was linked by an Ethernet hub.

The following diagram pictures a setup involving a device with one PF, two
VFs and one shared physical port

::

   .-------------.                 .-------------. .-------------.
   | hypervisor  |                 |    VM 1     | |    VM 2     |
   | application |                 | application | | application |
   `--+----------'                 `----------+--' `--+----------'
      |                                      |       |
      |                                      |       |
   .--+--.                               .---+--. .--+---.
   |  PF |                               | VF 1 | | VF 2 |
   `--+--'                               `---+--' `--+---'
      |                                      |       |
      `-----------.     .--------------------'       |
                  |     |     .----------------------'
                  |     |     |
               .--+-----+-----+--.
               | interconnection |
               `--------+--------'
                        |
                   .---------.
                   | physical|
                   |  port 0 |
                   `---------'

- A DPDK application running on the hypervisor owns the PF device, which is
  arbitrarily assigned port index 3.

- Both VFs are assigned to VMs and used by unknown applications; they may be
  DPDK-based or anything else.

- Interconnection is not necessarily done through a true Ethernet switch and
  may not even exist as a separate entity. The role of this block is to show
  that something brings PF, VFs and physical ports together and enables
  communication between them, with a number of built-in restrictions.

Subsequent sections in this document describe means for DPDK applications
running on the hypervisor to freely assign specific flows between PF, VFs
and physical ports based on traffic properties, by managing this
interconnection.

Controlled SR-IOV
-----------------

Initialization
~~~~~~~~~~~~~~

When a DPDK application gets assigned a PF device and is deliberately not
started in `basic SR-IOV`_ mode, any traffic coming from physical ports is
received by the PF according to default rules, while VFs remain isolated.

::

   .-------------.                 .-------------. .-------------.
   | hypervisor  |                 |    VM 1     | |    VM 2     |
   | application |                 | application | | application |
   `--+----------'                 `----------+--' `--+----------'
      |                                      |       |
      |                                      |       |
   .--+--.                               .---+--. .--+---.
   |  PF |                               | VF 1 | | VF 2 |
   `--+--'                               `------' `------'
      |
   .--+----------------------.
   | managed interconnection |
   `------------+------------'
                |
           .---------.
           | physical|
           |  port 0 |
           `---------'

In this mode, interconnection must be configured by the application to
enable VF communication, for instance by explicitly directing traffic with a
given destination MAC address to VF 1 and allowing traffic with the same
source MAC address to come out of it.

For this to work, hypervisor applications need a way to refer to either VF 1
or VF 2 in addition to the PF. This is addressed by `VF representors`_.

VF Representors
~~~~~~~~~~~~~~~

VF representors are virtual but standard DPDK network devices (albeit with
limited capabilities) created by PMDs when managing a PF device.

Since they represent VF instances used by other applications, configuring
them (e.g. assigning a MAC address or setting up promiscuous mode) affects
interconnection accordingly. If supported, they may also be used as two-way
communication ports with VFs (assuming **switchdev** topology).

::

      .-------------.                    .-------------. .-------------.
      | hypervisor  |                    |    VM 1     | |    VM 2     |
      | application |                    | application | | application |
      `--+---+---+--'                    `----------+--' `--+----------'
         |   |   |                                  |       |
         |   |   `-------------------.              |       |
         |   `---------.             |              |       |
         |             |             |              |       |
   .-----+-----. .-----+-----. .-----+-----.        |       |
   | port_id 3 | | port_id 4 | | port_id 5 |        |       |
   `-----+-----' `-----+-----' `-----+-----'        |       |
         |             |             |              |       |
      .--+--.    .-----+-----. .-----+-----.    .---+--. .--+---.
      |  PF |    | VF 1 rep. | | VF 2 rep. |    | VF 1 | | VF 2 |
      `--+--'    `-----+-----' `-----+-----'    `---+--' `--+---'
         |             |             |              |       |
         `-----.       |             |              |       |
               |       |             |              |       |
            .--+-------+-------------+--------------+-------+--.
            |             managed interconnection              |
            `------------------------+-------------------------'
                                     |
                                .----+-----.
                                | physical |
                                |  port 0  |
                                `----------'

- VF representors are assigned arbitrary port indices 4 and 5 in the
  hypervisor application and are respectively associated with VF 1 and VF 2.

- They can't be dissociated; even if VF 1 and VF 2 were not connected,
  representors could still be used for configuration.

- In this context, port index 3 can be thought of as a representor for
  physical port 0.

As previously described, the "interconnection" block represents a logical
concept. Interconnection occurs when hardware configuration enables traffic
flows from one place to another (e.g. physical port 0 to VF 1) according to
configured flow rules.

This is discussed in more detail in `traffic steering`_.

Traffic Steering
~~~~~~~~~~~~~~~~

In the following diagram, each meaningful traffic origin or endpoint as seen
by the hypervisor application is tagged with a unique letter from A to F.

::

      .-------------.                    .-------------. .-------------.
      | hypervisor  |                    |    VM 1     | |    VM 2     |
      | application |                    | application | | application |
      `--+---+---+--'                    `----------+--' `--+----------'
         |   |   |                                  |       |
         |   |   `-------------------.              |       |
         |   `---------.             |              |       |
         |             |             |              |       |
   .----(A)----. .----(B)----. .----(C)----.        |       |
   | port_id 3 | | port_id 4 | | port_id 5 |        |       |
   `-----+-----' `-----+-----' `-----+-----'        |       |
         |             |             |              |       |
      .--+--.    .-----+-----. .-----+-----.    .---+--. .--+---.
      |  PF |    | VF 1 rep. | | VF 2 rep. |    | VF 1 | | VF 2 |
      `--+--'    `-----+-----' `-----+-----'    `--(D)-' `-(E)--'
         |             |             |              |       |
         `-----.       |             |              |       |
               |       |             |              |       |
            .--+-------+-------------+--------------+-------+--.
            |             managed interconnection              |
            `------------------------+-------------------------'
                                     |
                                .---(F)----.
                                | physical |
                                |  port 0  |
                                `----------'

- **A**: PF.
- **B**: port representor for VF 1.
- **C**: port representor for VF 2.
- **D**: VF 1 proper.
- **E**: VF 2 proper.
- **F**: physical port.

Although uncommon, some devices do not enforce a one to one mapping between
PF and physical ports. For instance, by default all ports of **mlx4**
adapters are available to all their PF/VF instances, in which case
additional ports appear next to **F** in the above diagram.

Assuming no interconnection is provided by default in this mode, setting up
a `basic SR-IOV`_ configuration involving physical port 0 could be broken
down into the following rules:

- **A to F**: let everything through.
- **F to A**: PF MAC as destination.

- **A to D**, **E to D** and **F to D**: VF 1 MAC as destination.
- **D to A**: VF 1 MAC as source and PF MAC as destination.
- **D to E**: VF 1 MAC as source and VF 2 MAC as destination.
- **D to F**: VF 1 MAC as source.

- **A to E**, **D to E** and **F to E**: VF 2 MAC as destination.
- **E to A**: VF 2 MAC as source and PF MAC as destination.
- **E to D**: VF 2 MAC as source and VF 1 MAC as destination.
- **E to F**: VF 2 MAC as source.

Devices may additionally support advanced matching criteria such as
IPv4/IPv6 addresses or TCP/UDP ports.

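For instance (a hedged sketch using the testpmd flow syntax introduced
later in this document; actual pattern support varies by device), HTTP
traffic addressed to VF 1 could be singled out before broader MAC rules::

   flow create 3 ingress
      pattern eth dst is {VF 1 MAC} / ipv4 / tcp dst is 80 / end
      actions port_id id 4 / end
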
The combination of matching criteria with target endpoints fits well with
**rte_flow** [6]_, which expresses flow rules as combinations of patterns
and actions.

Enhancing **rte_flow** with the ability to make flow rules match and target
these endpoints provides a standard interface to manage their
interconnection without introducing new concepts and a whole new API to
implement them. This is described in `flow API (rte_flow)`_.

.. [6] :doc:`Generic flow API (rte_flow) <rte_flow>`

Flow API (rte_flow)
-------------------

Extensions
~~~~~~~~~~

Compared to creating a brand new dedicated interface, **rte_flow** was
deemed flexible enough to manage representor traffic only with minor
extensions:

- Using physical ports, PF, SF, VF or port representors as targets.

- Affecting traffic that is not necessarily addressed to the DPDK port ID a
  flow rule is associated with (e.g. forcing VF traffic redirection to PF).

- Rule-based packet counters.

- The ability to combine several identical actions for traffic duplication
  (e.g. VF representor in addition to a physical port).

- Dedicated actions for traffic encapsulation / decapsulation before
  reaching an endpoint.

Traffic Direction
~~~~~~~~~~~~~~~~~

From an application standpoint, "ingress" and "egress" flow rule attributes
apply to the DPDK port ID they are associated with. They select a traffic
direction for matching patterns, but have no impact on actions.

When matching traffic coming from or going to a different place than the
immediate port ID a flow rule is associated with, these attributes keep
their meaning while applying to the chosen origin, as highlighted by the
following diagram

::

      .-------------.                    .-------------. .-------------.
      | hypervisor  |                    |    VM 1     | |    VM 2     |
      | application |                    | application | | application |
      `--+---+---+--'                    `----------+--' `--+----------'
         |   |   |                                  |       |
         |   |   `-------------------.              |       |
         |   `---------.             |              |       |
         | ingress     | ingress     | ingress      |       |
         | egress      | egress      | egress       |       |
   .----(A)----. .----(B)----. .----(C)----.        |       |
   | port_id 3 | | port_id 4 | | port_id 5 |        |       |
   `-----+-----' `-----+-----' `-----+-----'        |       |
         |             |             |              |       |
      .--+--.    .-----+-----. .-----+-----.    .---+--. .--+---.
      |  PF |    | VF 1 rep. | | VF 2 rep. |    | VF 1 | | VF 2 |
      `--+--'    `-----+-----' `-----+-----'    `--(D)-' `-(E)--'
         |             |             |              |       |
         |             |             |      egress  |       | egress
         |             |             |      ingress |       | ingress
         `-----.       |             |              |       |
               |       |             |              |       |
            .--+-------+-------------+--------------+-------+--.
            |             managed interconnection              |
            `------------------------+-------------------------'
                                     |
                                .---(F)----.
                                | physical |
                                |  port 0  |
                                `----------'

Ingress and egress are defined as relative to the application creating the
flow rule.

For instance, matching traffic sent by VM 2 would be done through an ingress
flow rule on VF 2 (**E**). Likewise for incoming traffic on the physical
port (**F**). This also applies to **C** and **A** respectively.

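As a hedged sketch in testpmd flow syntax (port and VF numbers taken from
the diagram above, PMD support assumed), matching traffic sent by VM 2
combines the ingress direction with a VF 2 origin, anticipating the
"transfer" attribute described below::

   testpmd> flow create 3 ingress transfer pattern vf id is 2 / end
      actions queue index 6 / end
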
Transferring Traffic
~~~~~~~~~~~~~~~~~~~~

Without Port Representors
^^^^^^^^^^^^^^^^^^^^^^^^^

`Traffic direction`_ describes how an application could match traffic coming
from or going to a specific place reachable from a DPDK port ID. This makes
sense when the traffic in question is normally seen (i.e. sent or received)
by the application creating the flow rule (e.g. as in "redirect all traffic
coming from VF 1 to local queue 6").

However this does not force such traffic to take a specific route. Creating
a flow rule on **A** matching traffic coming from **D** is only meaningful
if it can be received by **A** in the first place, otherwise doing so simply
has no effect.

A new flow rule attribute named "transfer" is necessary for that. Combining
it with "ingress" or "egress" and a specific origin requests a flow rule to
be applied at the lowest level

::

           ingress only               :      ingress + transfer
                                      :
   .-------------. .-------------.    :  .-------------. .-------------.
   | hypervisor  | |    VM 1     |    :  | hypervisor  | |    VM 1     |
   | application | | application |    :  | application | | application |
   `------+------' `--+----------'    :  `------+------' `--+----------'
          |           |  | traffic    :         |           |  | traffic
    .----(A)----.     |  v            :   .----(A)----.     |  v
    | port_id 3 |     |               :   | port_id 3 |     |
    `-----+-----'     |               :   `-----+-----'     |
          |           |               :         |           |
       .--+--.    .---+--.            :      .--+--.    .---+--.
       |  PF |    | VF 1 |            :      |  PF |    | VF 1 |
       `--+--'    `--(D)-'            :      `--+--'    `--(D)-'
          |           |  | traffic    :         | ^         |  | traffic
          |           |  v            :         | | traffic |  v
       .--+-----------+--.            :      .--+-----------+--.
       | interconnection |            :      | interconnection |
       `--------+--------'            :      `--------+--------'
                |                     :               |
           .---(F)----.               :          .---(F)----.
           | physical |               :          | physical |
           |  port 0  |               :          |  port 0  |
           `----------'               :          `----------'

With "ingress" only, traffic is matched on **A** and thus still goes to
physical port **F** by default

.. code-block:: console

   testpmd> flow create 3 ingress pattern vf id is 1 / end
      actions queue index 6 / end

With "ingress + transfer", traffic is matched on **D** and is therefore
successfully assigned to queue 6 on **A**

.. code-block:: console

   testpmd> flow create 3 ingress transfer pattern vf id is 1 / end
      actions queue index 6 / end

With Port Representors
^^^^^^^^^^^^^^^^^^^^^^

When port representors exist, implicit flow rules with the "transfer"
attribute (described in `without port representors`_) are assumed to
exist between them and their represented resources. These may be immutable.

In this case, traffic is received by default through the representor and
neither the "transfer" attribute nor traffic origin in flow rule patterns
are necessary. They simply have to be created on the representor port
directly and may target a different representor as described in `PORT_ID
action`_.

Implicit traffic flow with port representor

::

      .-------------.         .-------------.
      | hypervisor  |         |    VM 1     |
      | application |         | application |
      `--+-------+--'         `----------+--'
         |       |                       |
         |       `-----.                 |
         |             |                 |
   .----(A)----. .----(B)----.           |
   | port_id 3 | | port_id 4 |           |
   `-----+-----' `-----+-----'           |
         |             |                 |
      .--+--.    .-----+-----.       .---+--.
      |  PF |    | VF 1 rep. |       | VF 1 |
      `--+--'    `-----+-----'       `--(D)-'
         |             |                 |
   .-----|-------------|-----------------|----.
   |     |             |                 |    |
   |     |             `-----------------'    |
   |     |                                    |
   `-----|------------------------------------'
         |
    .---(F)----.
    | physical |
    |  port 0  |
    `----------'

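As a hedged sketch (testpmd syntax, numbers from the diagram above):
redirecting what VF 1 sends to local queue 6 then only takes a plain rule
on its representor (port 4), with no "transfer" attribute or origin item::

   testpmd> flow create 4 ingress pattern eth / end
      actions queue index 6 / end
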
Pattern Items And Actions
~~~~~~~~~~~~~~~~~~~~~~~~~

PORT Pattern Item
^^^^^^^^^^^^^^^^^

Matches traffic originating from (ingress) or going to (egress) a physical
port of the underlying device.

Using this pattern item without specifying a port index matches the physical
port associated with the current DPDK port ID by default. As described in
`traffic steering`_, specifying it should be rarely needed.

- Matches **F** in `traffic steering`_.

PORT Action
^^^^^^^^^^^

Directs matching traffic to a given physical port index.

- Targets **F** in `traffic steering`_.

PORT_ID Pattern Item
^^^^^^^^^^^^^^^^^^^^

Matches traffic originating from (ingress) or going to (egress) a given DPDK
port ID.

Normally only supported if the port ID in question is known by the
underlying PMD and related to the device the flow rule is created against.

This must not be confused with the `PORT pattern item`_ which refers to the
physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev``
object on the application side (also known as "port representor" depending
on the kind of underlying device).

- Matches **A**, **B** or **C** in `traffic steering`_.

PORT_ID Action
^^^^^^^^^^^^^^

Directs matching traffic to a given DPDK port ID.

Same restrictions as `PORT_ID pattern item`_.

- Targets **A**, **B** or **C** in `traffic steering`_.

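A hedged sketch combining the two (testpmd syntax; exact token support
varies by version and PMD): traffic seen on the VF 1 representor (**B**)
redirected to the VF 2 representor (**C**)::

   flow create 3 ingress transfer pattern port_id id is 4 / end
      actions port_id id 5 / end
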
PF Pattern Item
^^^^^^^^^^^^^^^

Matches traffic originating from (ingress) or going to (egress) the physical
function of the current device.

If supported, should work even if the physical function is not managed by
the application and thus not associated with a DPDK port ID. Its behavior is
otherwise similar to `PORT_ID pattern item`_ using PF port ID.

- Matches **A** in `traffic steering`_.

PF Action
^^^^^^^^^

Directs matching traffic to the physical function of the current device.

Same restrictions as `PF pattern item`_.

- Targets **A** in `traffic steering`_.

VF Pattern Item
^^^^^^^^^^^^^^^

Matches traffic originating from (ingress) or going to (egress) a given
virtual function of the current device.

If supported, should work even if the virtual function is not managed by
the application and thus not associated with a DPDK port ID. Its behavior is
otherwise similar to `PORT_ID pattern item`_ using VF port ID.

Note this pattern item does not match VF representor traffic which, as a
separate entity, should be addressed through its own port ID.

- Matches **D** or **E** in `traffic steering`_.

VF Action
^^^^^^^^^

Directs matching traffic to a given virtual function of the current device.

Same restrictions as `VF pattern item`_.

- Targets **D** or **E** in `traffic steering`_.

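A hedged sketch using both (testpmd syntax, PMD support assumed): traffic
coming from VF 1 (**D**) steered directly to VF 2 (**E**)::

   flow create 3 ingress transfer pattern vf id is 1 / end
      actions vf id 2 / end
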
\*_ENCAP actions
^^^^^^^^^^^^^^^^

These actions are named according to the protocol they encapsulate traffic
with (e.g. ``VXLAN_ENCAP``) and using specific parameters (e.g. VNI for
VXLAN).

While they modify traffic and can be used multiple times (order matters),
unlike `PORT_ID action`_ and friends, they have no impact on steering.

As described in `actions order and repetition`_ this means they are useless
if used alone in an action list; the resulting traffic gets dropped unless
combined with either ``PASSTHRU`` or other endpoint-targeting actions.

\*_DECAP actions
^^^^^^^^^^^^^^^^

They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers
from traffic instead of pushing them. They can be used multiple times as
well.

Note that using these actions on non-matching traffic results in undefined
behavior. It is recommended to match the protocol headers to decapsulate on
the pattern side of a flow rule in order to use these actions, or otherwise
make sure only matching traffic goes through.

Actions Order and Repetition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Flow rules are currently restricted to at most a single action of each
supported type, performed in an unpredictable order (or all at once). To
repeat actions in a predictable fashion, applications have to make rules
pass-through and use priority levels.

It's now clear that PMD support for chaining multiple non-terminating flow
rules of varying priority levels is prohibitively difficult to implement
compared to simply allowing multiple identical actions performed in a
defined order by a single flow rule.

- This change is required to support protocol encapsulation offloads and the
  ability to perform them multiple times (e.g. VLAN then VXLAN).

- It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can
  be combined for duplication.

- The (non-)terminating property of actions must be discarded. Instead, flow
  rules themselves must be considered terminating by default (i.e. dropping
  traffic if there is no specific target) unless a ``PASSTHRU`` action is
  present.

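Under this model, duplication is simply repetition; a hedged sketch in
testpmd syntax (``{PF MAC}`` standing for an actual address)::

   flow create 3 ingress
      pattern eth dst is {PF MAC} / end
      actions queue index 1 / queue index 2 / end

Both ``QUEUE`` actions receive a copy of matching traffic.
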
Switching Examples
------------------

This section provides practical examples based on the established testpmd
flow command syntax [2]_, in the context described in `traffic steering`_.

::

      .-------------.                    .-------------. .-------------.
      | hypervisor  |                    |    VM 1     | |    VM 2     |
      | application |                    | application | | application |
      `--+---+---+--'                    `----------+--' `--+----------'
         |   |   |                                  |       |
         |   |   `-------------------.              |       |
         |   `---------.             |              |       |
         |             |             |              |       |
   .----(A)----. .----(B)----. .----(C)----.        |       |
   | port_id 3 | | port_id 4 | | port_id 5 |        |       |
   `-----+-----' `-----+-----' `-----+-----'        |       |
         |             |             |              |       |
      .--+--.    .-----+-----. .-----+-----.    .---+--. .--+---.
      |  PF |    | VF 1 rep. | | VF 2 rep. |    | VF 1 | | VF 2 |
      `--+--'    `-----+-----' `-----+-----'    `--(D)-' `-(E)--'
         |             |             |              |       |
         `-----.       |             |              |       |
               |       |             |              |       |
            .--+-------+-------------+--------------+-------+--.
            |             managed interconnection              |
            `------------------------+-------------------------'
                                     |
                                .---(F)----.
                                | physical |
                                |  port 0  |
                                `----------'

By default, the PF (**A**) can communicate with the physical port it is
associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated
and restricted to communicate with the hypervisor application through their
respective representors (**B** and **C**) if supported.

Examples in subsequent sections apply to hypervisor applications only and
are based on port representors **A**, **B** and **C**.

.. [2] :ref:`Flow syntax <testpmd_rte_flow>`

Associating VF 1 with Physical Port 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
their representors

::

   flow create 3 ingress pattern / end actions port_id id 4 / end
   flow create 4 ingress pattern / end actions port_id id 3 / end

More practical example with MAC address restrictions

::

   flow create 3 ingress
      pattern eth dst is {VF 1 MAC} / end
      actions port_id id 4 / end

   flow create 4 ingress
      pattern eth src is {VF 1 MAC} / end
      actions port_id id 3 / end

Sharing Broadcasts
~~~~~~~~~~~~~~~~~~

From outside to PF and VFs

::

   flow create 3 ingress
      pattern eth dst is ff:ff:ff:ff:ff:ff / end
      actions port_id id 3 / port_id id 4 / port_id id 5 / end

Note ``port_id id 3`` is necessary, otherwise only VFs would receive
matching traffic.

From PF to outside and VFs

::

   flow create 3 egress
      pattern eth dst is ff:ff:ff:ff:ff:ff / end
      actions port / port_id id 4 / port_id id 5 / end

From VFs to outside and PF

::

   flow create 4 ingress
      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
      actions port_id id 3 / port_id id 5 / end

   flow create 5 ingress
      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
      actions port_id id 3 / port_id id 4 / end

Similar ``33:33:*`` rules based on known MAC addresses should be added for
IPv6 traffic.

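For example, a hedged sketch of such an IPv6 multicast rule for VF 1
(testpmd syntax; spec/mask matching support varies by device)::

   flow create 4 ingress
      pattern eth dst spec 33:33:00:00:00:00 dst mask ff:ff:00:00:00:00
         src is {VF 1 MAC} / end
      actions port_id id 3 / port_id id 5 / end
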
Encapsulating VF 2 Traffic in VXLAN
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assuming pass-through flow rules are supported

::

   flow create 5 ingress
      pattern eth src is {VF 2 MAC} / end
      actions vxlan_encap vni 42 / passthru / end

   flow create 5 egress
      pattern vxlan vni is 42 / end
      actions vxlan_decap / passthru / end

Here ``passthru`` is needed since, as described in `actions order and
repetition`_, flow rules are otherwise terminating; if supported, a rule
without a target endpoint will drop traffic.

Without pass-through support, ingress encapsulation on the destination
endpoint might not be supported and the action list must provide one

::

   flow create 5 ingress
      pattern eth src is {VF 2 MAC} / end
      actions vxlan_encap vni 42 / port_id id 3 / end

   flow create 3 ingress
      pattern vxlan vni is 42 / end
      actions vxlan_decap / port_id id 5 / end