1. 02 Jun, 2020 1 commit
    • Daniel Borkmann's avatar
      bpf: Fix up bpf_skb_adjust_room helper's skb csum setting · 836e66c2
      Daniel Borkmann authored
      Lorenz recently reported:
      
        In our TC classifier cls_redirect [0], we use the following sequence of
        helper calls to decapsulate a GUE (basically IP + UDP + custom header)
        encapsulated packet:
      
          bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC, BPF_F_ADJ_ROOM_FIXED_GSO)
          bpf_redirect(skb->ifindex, BPF_F_INGRESS)
      
        It seems like some checksums of the inner headers are not validated in
        this case. For example, a TCP SYN packet with invalid TCP checksum is
        still accepted by the network stack and elicits a SYN ACK. [...]
      
        That is, we receive the following packet from the driver:
      
          | ETH | IP | UDP | GUE | IP | TCP |
          skb->ip_summed == CHECKSUM_UNNECESSARY
      
        ip_summed is CHECKSUM_UNNECESSARY because our NICs do rx checksum offloading.
        On this packet we run skb_adjust_room_mac(-encap_len), and get the following:
      
          | ETH | IP | TCP |
          skb->ip_summed == CHECKSUM_UNNECESSARY
      
        Note that ip_summed is still CHECKSUM_UNNECESSARY. After bpf_redirect()'ing
        into the ingress, we end up in tcp_v4_rcv(). There, skb_checksum_init() is
        turned into a no-op due to CHECKSUM_UNNECESSARY.
      
      The bpf_skb_adjust_room() helper is not aware of protocol specifics. Internally,
      it handles the CHECKSUM_COMPLETE case via skb_postpull_rcsum(), but that does
      not cover CHECKSUM_UNNECESSARY. In this case skb->csum_level of the original
      skb prior to bpf_skb_adjust_room() call was 0, that is, covering UDP. Right now
      there is no way to adjust the skb->csum_level. NICs that have checksum offload
      disabled (CHECKSUM_NONE) or that support CHECKSUM_COMPLETE are not affected.
      
      Use a safe default for CHECKSUM_UNNECESSARY by resetting to CHECKSUM_NONE and
      add a flag to the helper called BPF_F_ADJ_ROOM_NO_CSUM_RESET that allows users
      from opting out. Opting out is useful for the case where we don't remove/add
      full protocol headers, or for the case where a user wants to adjust the csum
      level manually e.g. through bpf_csum_level() helper that is added in subsequent
      patch.
      
      The bpf_skb_proto_{4_to_6,6_to_4}() for NAT64/46 translation from the BPF
      bpf_skb_change_proto() helper uses bpf_skb_net_hdr_{push,pop}() pair internally
      as well but doesn't change layers, only transitions between v4 to v6 and vice
      versa, therefore no adoption is required there.
      
        [0] https://lore.kernel.org/bpf/20200424185556.7358-1-lmb@cloudflare.com/
      
      Fixes: 2be7e212 ("bpf: add bpf_skb_adjust_room helper")
      Reported-by: default avatarLorenz Bauer <lmb@cloudflare.com>
      Reported-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarLorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Reviewed-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Link: https://lore.kernel.org/bpf/CACAyw9-uU_52esMd1JjuA80fRPHJv5vsSg8GnfW3t_qDU4aVKQ@mail.gmail.com/
      Link: https://lore.kernel.org/bpf/11a90472e7cce83e76ddbfce81fdfce7bfc68808.1591108731.git.daniel@iogearbox.net
      836e66c2
  2. 01 Jun, 2020 39 commits