    ice: Add support for XDP multi-buffer on Rx side · 2fba7dc5
    Maciej Fijalkowski authored
    The ice driver's Rx data path needs some rework in order to support
    multi-buffer XDP. On the skb path, the Rx ring currently carries a
    pointer to the skb, so if the driver does not manage to combine a
    fragmented frame within the current NAPI instance, it can restore the
    state on the next instance and keep looking for the last fragment
    (i.e. the descriptor with the EOP bit set). What needs to be achieved
    is that the xdp_buff is combined in this way (linear + frags part) in
    the first place. The skb is then ready to be built in case of XDP_PASS
    or when no BPF program is present on the interface. If a BPF program
    is present, it works on the multi-buffer xdp_buff. At this point the
    xdp_buff resides directly on the Rx ring, so given that the skb will
    be built straight from the xdp_buff, there is no further need to carry
    an skb pointer on the Rx ring.
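
    The assembly described above can be sketched as a small userspace
    model. This is an illustration only, not the driver code: the
    model_* types, the MODEL_MAX_FRAGS cap and the return convention are
    all hypothetical. A NULL data pointer plays the role of "no frame in
    progress", which is what lets the state survive between NAPI polls.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FRAGS 8 /* arbitrary cap chosen for this model */

/* Hypothetical, simplified stand-in for struct xdp_buff. */
struct model_asm_xdp {
	const void *data;                 /* NULL: no frame in progress */
	size_t linear_len;                /* length of the linear part */
	size_t nr_frags;                  /* fragments attached so far */
	size_t frag_len[MODEL_MAX_FRAGS]; /* length of each fragment */
};

/* Hypothetical view of one completed Rx descriptor. */
struct model_asm_desc {
	const void *buf; /* Rx buffer backing this descriptor */
	size_t len;      /* bytes written by hardware */
	bool eop;        /* End Of Packet bit */
};

/*
 * Feed one descriptor into the frame being assembled.
 * Returns 1 when the frame is complete (EOP seen), 0 when more
 * descriptors are needed, -1 when the fragment cap is exceeded.
 */
static int model_asm_add_desc(struct model_asm_xdp *xdp,
			      const struct model_asm_desc *desc)
{
	if (!xdp->data) {
		/* First descriptor of a frame becomes the linear part. */
		xdp->data = desc->buf;
		xdp->linear_len = desc->len;
		xdp->nr_frags = 0;
	} else {
		/* Any later descriptor is attached as a fragment. */
		if (xdp->nr_frags == MODEL_MAX_FRAGS)
			return -1;
		xdp->frag_len[xdp->nr_frags++] = desc->len;
	}

	return desc->eop ? 1 : 0;
}
```

    A caller would reset data to NULL once a completed frame has been
    handed to the XDP program or turned into an skb.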
    
    Besides removing the skb pointer from the Rx ring, many members have
    been moved around within ice_rx_ring. The first and foremost reason
    was to place rx_buf and xdp_buff on the same cacheline. This means
    that once we touch rx_buf (which always precedes touching xdp_buff),
    xdp_buff will already be hot in cache. The second reason is that
    xdp_rxq is used rather rarely and occupies a separate cacheline, so it
    is probably better to have it at the end of ice_rx_ring.
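
    The layout argument can be checked mechanically. The sketch below
    uses a hypothetical struct that is not the real ice_rx_ring layout,
    assumes a 64-byte cacheline and a cacheline-aligned ring, and only
    checks where each member starts.

```c
#include <stddef.h>

#define MODEL_CACHELINE 64 /* assumed cacheline size in bytes */

/* Hypothetical stand-in for struct xdp_buff (not the real layout). */
struct model_layout_xdp {
	void *data;
	void *data_end;
};

/* Hypothetical ring layout, not the real struct ice_rx_ring. */
struct model_layout_ring {
	void *desc;
	void *rx_buf;                /* touched first for every descriptor */
	struct model_layout_xdp xdp; /* touched right after rx_buf */
	unsigned short next_to_clean;
	char rarely_used[64];        /* plays the role of xdp_rxq, kept last */
};

/* Index of the cacheline that a given struct offset falls into. */
static size_t model_cacheline_of(size_t offset)
{
	return offset / MODEL_CACHELINE;
}

/* Fail the build if the two hot members start on different cachelines. */
_Static_assert(offsetof(struct model_layout_ring, rx_buf) / MODEL_CACHELINE ==
	       offsetof(struct model_layout_ring, xdp) / MODEL_CACHELINE,
	       "rx_buf and xdp should start on the same cacheline");
```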
    
    Another change that affects ice_rx_ring is the introduction of
    ice_rx_ring::first_desc. Its purpose is twofold. First, it is used to
    propagate rx_buf->act to all the parts of the current xdp_buff after
    running the XDP program, so that ice_put_rx_buf(), which got moved out
    of the main Rx processing loop, is able to take an appropriate action
    on each buffer. Second, it is needed by ice_construct_skb().
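
    A minimal sketch of the first purpose, assuming the frame occupies
    nr_bufs consecutive ring entries starting at first_desc (one linear
    buffer plus its frags). The model_* names are hypothetical and the
    ring is reduced to the fields needed here.

```c
#include <stdint.h>

/* Hypothetical, reduced Rx buffer: only the XDP verdict is modeled. */
struct model_act_buf {
	unsigned int act;
};

/* Hypothetical, reduced Rx ring. */
struct model_act_ring {
	struct model_act_buf *rx_buf; /* array of 'count' buffers */
	uint16_t count;               /* ring size */
	uint16_t first_desc;          /* index of the frame's first buffer */
};

/*
 * Stamp the XDP verdict on every buffer that belongs to the current
 * frame, wrapping around the end of the ring when needed.
 */
static void model_act_set(struct model_act_ring *ring, unsigned int nr_bufs,
			  unsigned int act)
{
	uint16_t idx = ring->first_desc;
	unsigned int i;

	for (i = 0; i < nr_bufs; i++) {
		ring->rx_buf[idx].act = act;
		if (++idx == ring->count)
			idx = 0;
	}
}
```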
    
    ice_construct_skb() has a copybreak mechanism which has a direct
    impact on the xdp_buff->skb conversion in the new approach when the
    legacy Rx flag is toggled. It works in such a way that the linear part
    is 256 bytes long; if the frame is bigger than that, the remaining
    bytes go to skb_shared_info as a frag.
    
    This means that while memcpying frags from the xdp_buff to the newly
    allocated skb, care needs to be taken when picking the destination
    frag array entry. By the time ice_construct_skb() is called for a
    fragmented frame, the current rx_buf points to the *last* fragment,
    but copybreak needs to be done against the first one. That's where
    ice_rx_ring::first_desc helps.
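
    The copybreak split and its effect on the destination frag slot can
    be sketched as below. This is a simplified model of the description
    above, not the driver code: it assumes the linear part is capped at
    exactly 256 bytes, and the model_* names are hypothetical.

```c
#include <stddef.h>

#define MODEL_RX_HDR_SIZE 256 /* linear part size quoted above */

/* Hypothetical plan for converting an xdp_buff into an skb. */
struct model_skb_plan {
	size_t linear_len;      /* bytes copied into the skb linear part */
	size_t leftover_len;    /* rest of the first buffer, attached as a frag */
	unsigned int first_dst; /* skb frag slot for the first xdp_buff frag */
};

/*
 * Split the *first* buffer of the frame. If part of it is left over,
 * that part takes skb frag slot 0, so the frags copied from the
 * xdp_buff have to start at slot 1 instead of slot 0.
 */
static struct model_skb_plan model_plan_skb(size_t first_buf_len)
{
	struct model_skb_plan p;

	p.linear_len = first_buf_len < MODEL_RX_HDR_SIZE ?
		       first_buf_len : MODEL_RX_HDR_SIZE;
	p.leftover_len = first_buf_len - p.linear_len;
	p.first_dst = p.leftover_len ? 1 : 0;

	return p;
}
```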
    
    When frame building spans across NAPI polls (the DD bit is not set on
    the current descriptor and xdp->data is not NULL), the current Rx
    buffer handling can run into problems. Calls to ice_put_rx_buf() were
    pulled out of the main Rx processing loop and scoped from cached_ntc
    to the current ntc, and that function now relies on rx_buf->act, which
    is set within ice_run_xdp(). ice_run_xdp() is only called once the EOP
    bit has been found, so currently we could put an Rx buffer with
    rx_buf->act being *uninitialized*. To address this, change the scoping
    to rely on first_desc on both boundaries instead.
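
    The scoping change can be illustrated with a small model of one
    poll. It is a sketch under stated assumptions: the model_* names are
    hypothetical, first_desc only advances when EOP is seen, and fewer
    descriptors than the ring size are processed per call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced Rx ring: indices only. */
struct model_poll_ring {
	uint16_t count;         /* ring size */
	uint16_t next_to_clean; /* next descriptor to look at */
	uint16_t first_desc;    /* first descriptor of the frame in progress */
};

/*
 * Process n completed descriptors; eop[i] says whether the i-th one
 * carries the EOP bit. Returns how many buffers may be put afterwards.
 * Only buffers in [old first_desc, new first_desc) are released, so
 * buffers of a frame that has not seen EOP yet are left untouched.
 */
static unsigned int model_poll(struct model_poll_ring *r, const bool *eop,
			       unsigned int n)
{
	uint16_t cached_first = r->first_desc;
	uint16_t ntc = r->next_to_clean;
	unsigned int released = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (++ntc == r->count)
			ntc = 0;
		/* Frame complete: the next frame starts at the new ntc. */
		if (eop[i])
			r->first_desc = ntc;
	}
	r->next_to_clean = ntc;

	while (cached_first != r->first_desc) {
		released++;
		if (++cached_first == r->count)
			cached_first = 0;
	}

	return released;
}
```

    With three descriptors where only the second carries EOP, two
    buffers are released and the third stays held until a later poll
    completes its frame.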
    
    This also implies that cleaned_count, which is used as an input to
    ice_alloc_rx_buffers() and tells how many new buffers should be
    refilled, has to be adjusted. If it stayed as is, ntc could go over
    ntu.
    
    Therefore, remove cleaned_count altogether and feed the allocation
    routine with the newly introduced ICE_RX_DESC_UNUSED() macro, which is
    an equivalent of ICE_DESC_UNUSED() dedicated to the Rx side and based
    on struct ice_rx_ring::first_desc instead of next_to_clean.
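
    The difference between the two counts can be sketched as plain
    functions. This is a model, not the macros themselves: it assumes
    both follow the usual "free slots minus one" ring arithmetic and
    differ only in the index they are anchored on. The model_* names
    are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, reduced Rx ring: indices only. */
struct model_unused_ring {
	uint16_t count;         /* ring size */
	uint16_t next_to_use;   /* next descriptor to refill */
	uint16_t next_to_clean; /* next descriptor to process */
	uint16_t first_desc;    /* first descriptor of the frame in progress */
};

/* Free slots between next_to_use and 'anchor', keeping one slot spare. */
static uint16_t model_unused_from(const struct model_unused_ring *r,
				  uint16_t anchor)
{
	return (uint16_t)((anchor > r->next_to_use ? 0 : r->count) +
			  anchor - r->next_to_use - 1);
}

/* Count anchored on next_to_clean (models the generic variant). */
static uint16_t model_desc_unused(const struct model_unused_ring *r)
{
	return model_unused_from(r, r->next_to_clean);
}

/* Count anchored on first_desc (models the Rx-specific variant). */
static uint16_t model_rx_desc_unused(const struct model_unused_ring *r)
{
	return model_unused_from(r, r->first_desc);
}
```

    For a ring of 8 with next_to_use = 0, first_desc = 1 and
    next_to_clean = 3, the next_to_clean-based count is 2, which would
    allow refilling index 1 even though it still backs the frame in
    progress. The first_desc-based count is 0.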

    Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
    Link: https://lore.kernel.org/bpf/20230131204506.219292-11-maciej.fijalkowski@intel.com