    net: implement mechanism for HW based QOS · 4f57c087
    John Fastabend authored
    This patch provides a mechanism for lower layer devices to
    steer traffic using skb->priority to tx queues. This allows
    for hardware based QOS schemes to use the default qdisc without
    incurring the penalties related to global state and the qdisc
    lock. At the same time, skbs are reliably delivered to the correct
    tx ring, which avoids the head of line blocking that results from
    shuffling in the LLD. Finally, all the goodness from txq caching
    and xps/rps can still be leveraged.
    
    Many drivers and hardware exist with the ability to implement
    QOS schemes in the hardware, but currently these drivers tend
    to rely on firmware to reroute specific traffic, a driver-specific
    select_queue, or the queue_mapping action in the
    qdisc.
    
    If select_queue is used for this, drivers need to be updated for
    each and every traffic type and we lose the goodness of much
    of the upstream work. Firmware solutions are inherently
    inflexible. And finally, if admins are expected to build a
    qdisc and filter rules to steer traffic, this requires knowledge
    of how the hardware is currently configured. The number of tx
    queues and the queue offsets may change depending on resources.
    Also this approach incurs all the overhead of a qdisc with filters.
    
    With the mechanism in this patch, users can set the skb priority
    using expected methods, e.g. setsockopt(), or the stack can set the
    priority directly. The skb will then be steered to the correct tx
    queues aligned with hardware QOS traffic classes. In the normal case,
    with a single traffic class containing all queues, everything
    works as is until the LLD enables multiple tcs.
    
    To steer the skb we mask the priority down to its lower 4 bits
    and allow the hardware to configure up to 15 distinct classes
    of traffic. This is expected to be sufficient for most applications;
    at any rate it is more than the 802.1Q spec designates and is
    equal to the number of prio bands currently implemented in
    the default qdisc.
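
    The steering described above can be sketched as a small standalone
    model. This is not the code in this patch: the struct and function
    names (qos_dev, tc_txq_range, pick_tx_queue) are hypothetical, and
    a plain modulo stands in for however the stack spreads flows within
    a class. It only shows the idea: the lower 4 bits of the priority
    select a traffic class, and the class selects a contiguous range of
    tx queues.

```c
#include <stdint.h>

#define PRIO_MASK 0xF             /* keep the lower 4 bits of the priority */
#define NUM_PRIO  (PRIO_MASK + 1) /* 4 bits index 16 priority values */

/* Hypothetical: one traffic class owns a contiguous range of tx queues. */
struct tc_txq_range {
    uint16_t count;  /* number of tx queues in this class */
    uint16_t offset; /* index of the first tx queue in this class */
};

/* Hypothetical device model, filled in by the lower layer driver. */
struct qos_dev {
    uint8_t num_tc;                          /* 0: one class, all queues */
    uint8_t prio_to_tc[NUM_PRIO];            /* priority -> traffic class */
    struct tc_txq_range tc_to_txq[NUM_PRIO]; /* traffic class -> queues */
    uint16_t real_num_tx_queues;
};

/* Pick a tx queue for a packet with the given priority and flow hash. */
uint16_t pick_tx_queue(const struct qos_dev *dev, uint32_t priority,
                       uint32_t flow_hash)
{
    uint16_t offset = 0;
    uint16_t count = dev->real_num_tx_queues;

    if (dev->num_tc) {
        /* Multiple classes enabled: restrict to the class's range. */
        uint8_t tc = dev->prio_to_tc[priority & PRIO_MASK];

        offset = dev->tc_to_txq[tc].offset;
        count = dev->tc_to_txq[tc].count;
    }

    if (count == 0)
        return offset;

    /* Spread flows across the queues of the selected range. */
    return (uint16_t)(offset + flow_hash % count);
}
```

    With num_tc left at 0 the whole queue range is used, which matches
    the single traffic class case; once the driver fills in the maps,
    packets stay within the queue range of their class.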
    
    This, in conjunction with a userspace application such as
    lldpad, can be used to implement 802.1Q transmission selection
    algorithms, one of which is the extended transmission
    selection algorithm currently used for DCB.
    Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
dev.c 154 KB