    net: sched: Introduce act_ctinfo action · 24ec483c
    Kevin 'ldir' Darbyshire-Bryant authored
    ctinfo is a new tc filter action module.  It is designed to restore
    information contained in firewall conntrack marks to other packet fields
    and is typically used on packet ingress paths.  At present it has two
    independent sub-functions or operating modes, DSCP restoration mode &
    skb mark restoration mode.
    
    The DSCP restore mode:
    
    This mode copies DSCP values that have been placed in the firewall
    conntrack mark back into the IPv4/v6 diffserv fields of relevant
    packets.
    
    The DSCP restore mode has been found useful for restoring ingress
    classifications based on egress classifications across links that
    bleach or otherwise change DSCP, typically home ISP Internet links.
    Restoring DSCP on ingress on the WAN link allows qdiscs such as CAKE
    (but by no means limited to it) to shape inbound packets according to
    policies that are easier to set & mark on egress.
    
    Ingress classification is traditionally a challenging task since
    iptables rules haven't yet run and tc filter/eBPF programs are pre-NAT
    lookups, hence are unable to see internal IPv4 addresses as used on the
    typical home masquerading gateway.  Thus marking the connection in some
    manner on egress for later restoration of classification on ingress is
    easier to implement.
    
    Parameters related to DSCP restore mode:
    
    dscpmask - a 32 bit mask of 6 contiguous bits indicating which bits of
    the conntrack mark field contain the DSCP value to be restored.
    
    statemask - a 32 bit mask of (usually) 1 bit length, outside the area
    specified by dscpmask.  This represents a conditional operation flag
    whereby the DSCP is only restored if the flag is set.  This is useful to
    implement a 'one shot' iptables based classification where the
    'complicated' iptables rules are only run once to classify the
    connection on initial (egress) packet and subsequent packets are all
    marked/restored with the same DSCP.  A mask of zero disables the
    conditional behaviour, i.e. the conntrack mark DSCP bits are always
    restored to the IP diffserv field (assuming the conntrack entry is found
    & the skb is an IPv4/IPv6 type).
    
    e.g. dscpmask 0xfc000000 statemask 0x01000000
    
    |----0xFC----conntrack mark----000000---|
    | Bits 31-26 | bit 25 | bit 24|~~~ Bit 0|
    | DSCP       | unused | flag  |unused   |
    |-----------------------0x01---000000---|
          |                   |
          |                   |
          ---|             Conditional flag
             v             only restore if set
    |-ip diffserv-|
    | 6 bits      |
    |-------------|
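
The DSCP restore logic described above can be sketched in plain C as
follows.  This is a minimal userspace illustration, not the module's
actual code: the helper names ctinfo_dscp_from_mark() and set_dscp() are
hypothetical, and keeping the low two (ECN) bits of the diffserv byte
untouched is an assumption based on the standard DS field layout.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract the DSCP stored in a conntrack mark.
 * Returns 1 and writes the 6-bit value to *dscp when restoration
 * applies, 0 when it is skipped.
 */
static int ctinfo_dscp_from_mark(uint32_t ctmark, uint32_t dscpmask,
                                 uint32_t statemask, uint8_t *dscp)
{
    unsigned int shift = 0;

    if (!dscpmask)
        return 0;               /* DSCP mode not configured */

    /* A non-zero statemask makes the restore conditional on the flag. */
    if (statemask && !(ctmark & statemask))
        return 0;

    /* Shift count is the position of the lowest set bit in dscpmask. */
    while (!((dscpmask >> shift) & 1u))
        shift++;

    *dscp = (uint8_t)((ctmark & dscpmask) >> shift);
    return 1;
}

/*
 * Hypothetical helper: write a 6-bit DSCP into a diffserv byte.
 * Assumption: the low 2 bits (ECN) are preserved.
 */
static uint8_t set_dscp(uint8_t dsfield, uint8_t dscp)
{
    return (uint8_t)((dsfield & 0x03) | (dscp << 2));
}
```

With dscpmask 0xfc000000 and statemask 0x01000000, a conntrack mark of
0xb9000000 (DSCP 46 in bits 31-26, flag bit 24 set) yields DSCP 46, while
0xb8000000 (flag clear) is skipped unless statemask is zero.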
    
    The skb mark restore mode (cpmark):
    
    This mode copies the firewall conntrack mark to the skb's mark field.
    It is functionally equivalent to the existing act_connmark action, with
    the additional feature of being able to apply a mask to the restored
    value.
    
    Parameters related to skb mark restore mode:
    
    mask - a 32 bit mask applied to the firewall conntrack mark to mask out
    bits unwanted for restoration.  This can be useful where the conntrack
    mark is being used for different purposes by different applications.  If
    not specified, the whole mark field is copied (i.e. the default mask is
    0xffffffff).
    
    e.g. mask 0x00ffffff to mask out the top 8 bits being used by the
    aforementioned DSCP restore mode.
    
    |----0x00----conntrack mark----ffffff---|
    | Bits 31-24 |                          |
    | DSCP & flag|      some value here     |
    |---------------------------------------|
    			|
    			|
    			v
    |------------skb mark-------------------|
    |            |                          |
    |  zeroed    |                          |
    |---------------------------------------|
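
The masking step for skb mark restore mode amounts to a single bitwise
AND.  The sketch below is illustrative only; the helper name
ctinfo_cpmark() and the constant CTINFO_CPMARK_DEFAULT_MASK are
hypothetical and do not come from the module.

```c
#include <stdint.h>

/* Hypothetical constant: mask used when none is specified. */
#define CTINFO_CPMARK_DEFAULT_MASK 0xffffffffu

/*
 * Hypothetical helper: value written to the skb mark for a given
 * conntrack mark and mask.  Bits cleared in the mask come out as zero.
 */
static uint32_t ctinfo_cpmark(uint32_t ctmark, uint32_t mask)
{
    return ctmark & mask;
}
```

For a conntrack mark of 0xb9abcdef, a mask of 0x00ffffff gives an skb
mark of 0x00abcdef, while the default mask copies the mark unchanged.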
    
    Overall parameters:
    
    zone - conntrack zone
    
    control - action related control (reclassify | pipe | drop | continue |
    ok | goto chain <CHAIN_INDEX>)
    
    Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
    Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>