1. 16 Mar, 2021 3 commits
  2. 15 Mar, 2021 1 commit
  3. 10 Mar, 2021 22 commits
    • Andrii Nakryiko's avatar
      Merge branch 'libbpf/xsk cleanups' · 1211f4e9
      Andrii Nakryiko authored
      Björn Töpel says:
      
      ====================
      
      This series removes a header dependency from xsk.h, and moves
      libbpf_util.h into xsk.h.
      
      More details in each commit!
      
      Thank you,
      Björn
      ====================
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      1211f4e9
    • Björn Töpel's avatar
      libbpf: xsk: Move barriers from libbpf_util.h to xsk.h · 7e8bbe24
      Björn Töpel authored
      The only user of libbpf_util.h is xsk.h. Move the barriers to xsk.h,
      and remove libbpf_util.h. The barriers are used as an implementation
      detail, and should not be considered part of the stable API.
      Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20210310080929.641212-3-bjorn.topel@gmail.com
      7e8bbe24
    • Björn Töpel's avatar
      libbpf: xsk: Remove linux/compiler.h header · 2882c48b
      Björn Töpel authored
      In commit 291471dd ("libbpf, xsk: Add libbpf_smp_store_release
      libbpf_smp_load_acquire") linux/compiler.h was added as a dependency
      to xsk.h, which is the user-facing API. This makes it harder for
      userspace application to consume the library. Here the header
      inclusion is removed, and instead {READ,WRITE}_ONCE() is added
      explicitly.
      Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20210310080929.641212-2-bjorn.topel@gmail.com
      2882c48b
    • Jiapeng Chong's avatar
      bpf: Fix warning comparing pointer to 0 · a9c80b03
      Jiapeng Chong authored
      Fix the following coccicheck warning:
      
      ./tools/testing/selftests/bpf/progs/fentry_test.c:67:12-13: WARNING
      comparing pointer to 0.
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: default avatarJiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/1615360714-30381-1-git-send-email-jiapeng.chong@linux.alibaba.com
      a9c80b03
    • Jiapeng Chong's avatar
      selftests/bpf: Fix warning comparing pointer to 0 · 04ea63e3
      Jiapeng Chong authored
      Fix the following coccicheck warning:
      
      ./tools/testing/selftests/bpf/progs/test_global_func10.c:17:12-13:
      WARNING comparing pointer to 0.
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: default avatarJiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/1615357366-97612-1-git-send-email-jiapeng.chong@linux.alibaba.com
      04ea63e3
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · c1acda98
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf-next 2021-03-09
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      We've added 90 non-merge commits during the last 17 day(s) which contain
      a total of 114 files changed, 5158 insertions(+), 1288 deletions(-).
      
      The main changes are:
      
      1) Faster bpf_redirect_map(), from Björn.
      
      2) skmsg cleanup, from Cong.
      
      3) Support for floating point types in BTF, from Ilya.
      
      4) Documentation for sys_bpf commands, from Joe.
      
      5) Support for sk_lookup in bpf_prog_test_run, form Lorenz.
      
      6) Enable task local storage for tracing programs, from Song.
      
      7) bpf_for_each_map_elem() helper, from Yonghong.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1acda98
    • Linus Torvalds's avatar
      Merge git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net · 05a59d79
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau.
      
       2) TX skb error handling fix in mt76 driver, also from Felix.
      
       3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman.
      
       4) Avoid double free of percpu pointers when freeing a cloned bpf prog.
          From Cong Wang.
      
       5) Use correct printf format for dma_addr_t in ath11k, from Geert
          Uytterhoeven.
      
       6) Fix resolve_btfids build with older toolchains, from Kun-Chuan
          Hsieh.
      
       7) Don't report truncated frames to mac80211 in mt76 driver, from
          Lorenzop Bianconi.
      
       8) Fix watcdog timeout on suspend/resume of stmmac, from Joakim Zhang.
      
       9) mscc ocelot needs NET_DEVLINK selct in Kconfig, from Arnd Bergmann.
      
      10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from
          Arjun Roy.
      
      11) Ignore routes with deleted nexthop object in mlxsw, from Ido
          Schimmel.
      
      12) Need to undo tcp early demux lookup sometimes in nf_nat, from
          Florian Westphal.
      
      13) Fix gro aggregation for udp encaps with zero csum, from Daniel
          Borkmann.
      
      14) Make sure to always use imp*_ndo_send when necessaey, from Jason A.
          Donenfeld.
      
      15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov.
      
      16) prevent overly huge skb allocationsd in qrtr, from Pavel Skripkin.
      
      17) Prevent rx ring copnsumer index loss of sync in enetc, from Vladimir
          Oltean.
      
      18) Make sure textsearch copntrol block is large enough, from Wilem de
          Bruijn.
      
      19) Revert MAC changes to r8152 leading to instability, from Hates Wang.
      
      20) Advance iov in 9p even for empty reads, from Jissheng Zhang.
      
      21) Double hook unregister in nftables, from PabloNeira Ayuso.
      
      22) Fix memleak in ixgbe, fropm Dinghao Liu.
      
      23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne.
      
      24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang
          Tang.
      
      25) Fix DOI refcount bugs in cipso, from Paul Moore.
      
      26) One too many irqsave in ibmvnic, from Junlin Yang.
      
      27) Fix infinite loop with MPLS gso segmenting via virtio_net, from
          Balazs Nemeth.
      
      * git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits)
        s390/qeth: fix notification for pending buffers during teardown
        s390/qeth: schedule TX NAPI on QAOB completion
        s390/qeth: improve completion of pending TX buffers
        s390/qeth: fix memory leak after failed TX Buffer allocation
        net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
        net: check if protocol extracted by virtio_net_hdr_set_proto is correct
        net: dsa: xrs700x: check if partner is same as port in hsr join
        net: lapbether: Remove netif_start_queue / netif_stop_queue
        atm: idt77252: fix null-ptr-dereference
        atm: uPD98402: fix incorrect allocation
        atm: fix a typo in the struct description
        net: qrtr: fix error return code of qrtr_sendmsg()
        mptcp: fix length of ADD_ADDR with port sub-option
        net: bonding: fix error return code of bond_neigh_init()
        net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
        net: enetc: set MAC RX FIFO to recommended value
        net: davicom: Use platform_get_irq_optional()
        net: davicom: Fix regulator not turned off on driver removal
        net: davicom: Fix regulator not turned off on failed probe
        net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports
        ...
      05a59d79
    • Linus Torvalds's avatar
      Merge git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc · 6a30bedf
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "Fix opcode filtering for exceptions, and clean up defconfig"
      
      * git://git.kernel.org:/pub/scm/linux/kernel/git/davem/sparc:
        sparc: sparc64_defconfig: remove duplicate CONFIGs
        sparc64: Fix opcode filtering in handling of no fault loads
      6a30bedf
    • Corentin Labbe's avatar
      sparc: sparc64_defconfig: remove duplicate CONFIGs · 69264b4a
      Corentin Labbe authored
      After my patch there is CONFIG_ATA defined twice.
      Remove the duplicate one.
      Same problem for CONFIG_HAPPYMEAL, except I added as builtin for boot
      test with NFS.
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Fixes: a57cdeb3 ("sparc: sparc64_defconfig: add necessary configs for qemu")
      Signed-off-by: default avatarCorentin Labbe <clabbe@baylibre.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69264b4a
    • Rob Gardner's avatar
      sparc64: Fix opcode filtering in handling of no fault loads · e5e8b80d
      Rob Gardner authored
      is_no_fault_exception() has two bugs which were discovered via random
      opcode testing with stress-ng. Both are caused by improper filtering
      of opcodes.
      
      The first bug can be triggered by a floating point store with a no-fault
      ASI, for instance "sta %f0, [%g0] #ASI_PNF", opcode C1A01040.
      
      The code first tests op3[5] (0x1000000), which denotes a floating
      point instruction, and then tests op3[2] (0x200000), which denotes a
      store instruction. But these bits are not mutually exclusive, and the
      above mentioned opcode has both bits set. The intent is to filter out
      stores, so the test for stores must be done first in order to have
      any effect.
      
      The second bug can be triggered by a floating point load with one of
      the invalid ASI values 0x8e or 0x8f, which pass this check in
      is_no_fault_exception():
           if ((asi & 0xf2) == ASI_PNF)
      
      An example instruction is "ldqa [%l7 + %o7] #ASI 0x8f, %f38",
      opcode CF95D1EF. Asi values greater than 0x8b (ASI_SNFL) are fatal
      in handle_ldf_stq(), and is_no_fault_exception() must not allow these
      invalid asi values to make it that far.
      
      In both of these cases, handle_ldf_stq() reacts by calling
      sun4v_data_access_exception() or spitfire_data_access_exception(),
      which call is_no_fault_exception() and results in an infinite
      recursion.
      Signed-off-by: default avatarRob Gardner <rob.gardner@oracle.com>
      Tested-by: default avatarAnatoly Pugachev <matorola@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5e8b80d
    • David S. Miller's avatar
      Merge branch 's390-qeth-fixes' · 85154557
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2021-03-09
      
      please apply the following patch series to netdev's net tree.
      
      This brings one fix for a memleak in an error path of the setup code.
      Also several fixes for dealing with pending TX buffers - two for old
      bugs in their completion handling, and one recent regression in a
      teardown path.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      85154557
    • Julian Wiedmann's avatar
      s390/qeth: fix notification for pending buffers during teardown · 7eefda7f
      Julian Wiedmann authored
      The cited commit reworked the state machine for pending TX buffers.
      In qeth_iqd_tx_complete() it turned PENDING into a transient state, and
      uses NEED_QAOB for buffers that get parked while waiting for their QAOB
      completion.
      
      But it missed to adjust the check in qeth_tx_complete_buf(). So if
      qeth_tx_complete_pending_bufs() is called during teardown to drain
      the parked TX buffers, we no longer raise a notification for af_iucv.
      
      Instead of updating the checked state, just move this code into
      qeth_tx_complete_pending_bufs() itself. This also gets rid of the
      special-case in the common TX completion path.
      
      Fixes: 8908f36d ("s390/qeth: fix af_iucv notification race")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7eefda7f
    • Julian Wiedmann's avatar
      s390/qeth: schedule TX NAPI on QAOB completion · 3e83d467
      Julian Wiedmann authored
      When a QAOB notifies us that a pending TX buffer has been delivered, the
      actual TX completion processing by qeth_tx_complete_pending_bufs()
      is done within the context of a TX NAPI instance. We shouldn't rely on
      this instance being scheduled by some other TX event, but just do it
      ourselves.
      
      qeth_qdio_handle_aob() is called from qeth_poll(), ie. our main NAPI
      instance. To avoid touching the TX queue's NAPI instance
      before/after it is (un-)registered, reorder the code in qeth_open()
      and qeth_stop() accordingly.
      
      Fixes: 0da9581d ("qeth: exploit asynchronous delivery of storage blocks")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3e83d467
    • Julian Wiedmann's avatar
      s390/qeth: improve completion of pending TX buffers · c20383ad
      Julian Wiedmann authored
      The current design attaches a pending TX buffer to a custom
      single-linked list, which is anchored at the buffer's slot on the
      TX ring. The buffer is then checked for final completion whenever
      this slot is processed during a subsequent TX NAPI poll cycle.
      
      But if there's insufficient traffic on the ring, we might never make
      enough progress to get back to this ring slot and discover the pending
      buffer's final TX completion. In particular if this missing TX
      completion blocks the application from sending further traffic.
      
      So convert the custom single-linked list code to a per-queue list_head,
      and scan this list on every TX NAPI cycle.
      
      Fixes: 0da9581d ("qeth: exploit asynchronous delivery of storage blocks")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c20383ad
    • Julian Wiedmann's avatar
      s390/qeth: fix memory leak after failed TX Buffer allocation · e7a36d27
      Julian Wiedmann authored
      When qeth_alloc_qdio_queues() fails to allocate one of the buffers that
      back an Output Queue, the 'out_freeoutqbufs' path will free all
      previously allocated buffers for this queue. But it misses to free the
      half-finished queue struct itself.
      
      Move the buffer allocation into qeth_alloc_output_queue(), and deal with
      such errors internally.
      
      Fixes: 0da9581d ("qeth: exploit asynchronous delivery of storage blocks")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: default avatarAlexandra Winter <wintera@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e7a36d27
    • David S. Miller's avatar
      Merge branch 'virtio_net-infinite-loop' · b005c9ef
      David S. Miller authored
      Balazs Nemeth says:
      
      ====================
      net: prevent infinite loop caused by incorrect proto from virtio_net_hdr_set_proto
      
      These patches prevent an infinite loop for gso packets with a protocol
      from virtio net hdr that doesn't match the protocol in the packet.
      Note that packets coming from a device without
      header_ops->parse_protocol being implemented will not be caught by
      the check in virtio_net_hdr_to_skb, but the infinite loop will still
      be prevented by the check in the gso layer.
      
      Changes from v2 to v3:
        - Remove unused *eth.
        - Use MPLS_HLEN to also check if the MPLS header length is a multiple
          of four.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b005c9ef
    • Balazs Nemeth's avatar
      net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 · d348ede3
      Balazs Nemeth authored
      A packet with skb_inner_network_header(skb) == skb_network_header(skb)
      and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers
      from the packet. Subsequently, the call to skb_mac_gso_segment will
      again call mpls_gso_segment with the same packet leading to an infinite
      loop. In addition, ensure that the header length is a multiple of four,
      which should hold irrespective of the number of stacked labels.
      Signed-off-by: default avatarBalazs Nemeth <bnemeth@redhat.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d348ede3
    • Balazs Nemeth's avatar
      net: check if protocol extracted by virtio_net_hdr_set_proto is correct · 924a9bc3
      Balazs Nemeth authored
      For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
      set) based on the type in the virtio net hdr, but the skb could contain
      anything since it could come from packet_snd through a raw socket. If
      there is a mismatch between what virtio_net_hdr_set_proto sets and
      the actual protocol, then the skb could be handled incorrectly later
      on.
      
      An example where this poses an issue is with the subsequent call to
      skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
      correctly. A specially crafted packet could fool
      skb_flow_dissect_flow_keys_basic preventing EINVAL to be returned.
      
      Avoid blindly trusting the information provided by the virtio net header
      by checking that the protocol in the packet actually matches the
      protocol set by virtio_net_hdr_set_proto. Note that since the protocol
      is only checked if skb->dev implements header_ops->parse_protocol,
      packets from devices without the implementation are not checked at this
      stage.
      
      Fixes: 9274124f ("net: stricter validation of untrusted gso packets")
      Signed-off-by: default avatarBalazs Nemeth <bnemeth@redhat.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      924a9bc3
    • Daniel Borkmann's avatar
      Merge branch 'bpf-xdp-redirect' · 32f91529
      Daniel Borkmann authored
      Björn Töpel says:
      
      ====================
      This two patch series contain two optimizations for the
      bpf_redirect_map() helper and the xdp_do_redirect() function.
      
      The bpf_redirect_map() optimization is about avoiding the map lookup
      dispatching. Instead of having a switch-statement and selecting the
      correct lookup function, we let bpf_redirect_map() be a map operation,
      where each map has its own bpf_redirect_map() implementation. This way
      the run-time lookup is avoided.
      
      The xdp_do_redirect() patch restructures the code, so that the map
      pointer indirection can be avoided.
      
      Performance-wise I got 4% improvement for XSKMAP
      (sample:xdpsock/rx-drop), and 8% (sample:xdp_redirect_map) on my
      machine.
      
      v5->v6:  Removed REDIR enum, and instead use map_id and map_type. (Daniel)
               Applied Daniel's fixups on patch 1. (Daniel)
      v4->v5:  Renamed map operation to map_redirect. (Daniel)
      v3->v4:  Made bpf_redirect_map() a map operation. (Daniel)
      v2->v3:  Fix build when CONFIG_NET is not set. (lkp)
      v1->v2:  Removed warning when CONFIG_BPF_SYSCALL was not set. (lkp)
               Cleaned up case-clause in xdp_do_generic_redirect_map(). (Toke)
               Re-added comment. (Toke)
      rfc->v1: Use map_id, and remove bpf_clear_redirect_map(). (Toke)
               Get rid of the macro and use __always_inline. (Jesper)
      ====================
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      32f91529
    • Björn Töpel's avatar
      bpf, xdp: Restructure redirect actions · ee75aef2
      Björn Töpel authored
      The XDP_REDIRECT implementations for maps and non-maps are fairly
      similar, but obviously need to take different code paths depending on
      if the target is using a map or not. Today, the redirect targets for
      XDP either uses a map, or is based on ifindex.
      
      Here, the map type and id are added to bpf_redirect_info, instead of
      the actual map. Map type, map item/ifindex, and the map_id (if any) is
      passed to xdp_do_redirect().
      
      For ifindex-based redirect, used by the bpf_redirect() XDP BFP helper,
      a special map type/id are used. Map type of UNSPEC together with map id
      equal to INT_MAX has the special meaning of an ifindex based
      redirect. Note that valid map ids are 1 inclusive, INT_MAX exclusive
      ([1,INT_MAX[).
      
      In addition to making the code easier to follow, using explicit type
      and id in bpf_redirect_info has a slight positive performance impact
      by avoiding a pointer indirection for the map type lookup, and instead
      use the cacheline for bpf_redirect_info.
      
      Since the actual map is not passed via bpf_redirect_info anymore, the
      map lookup is only done in the BPF helper. This means that the
      bpf_clear_redirect_map() function can be removed. The actual map item
      is RCU protected.
      
      The bpf_redirect_info flags member is not used by XDP, and not
      read/written any more. The map member is only written to when
      required/used, and not unconditionally.
      Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Acked-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20210308112907.559576-3-bjorn.topel@gmail.com
      ee75aef2
    • Björn Töpel's avatar
      bpf, xdp: Make bpf_redirect_map() a map operation · e6a4750f
      Björn Töpel authored
      Currently the bpf_redirect_map() implementation dispatches to the
      correct map-lookup function via a switch-statement. To avoid the
      dispatching, this change adds bpf_redirect_map() as a map
      operation. Each map provides its bpf_redirect_map() version, and
      correct function is automatically selected by the BPF verifier.
      
      A nice side-effect of the code movement is that the map lookup
      functions are now local to the map implementation files, which removes
      one additional function call.
      Signed-off-by: default avatarBjörn Töpel <bjorn.topel@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/bpf/20210308112907.559576-2-bjorn.topel@gmail.com
      e6a4750f
    • George McCollister's avatar
      net: dsa: xrs700x: check if partner is same as port in hsr join · 286a8624
      George McCollister authored
      Don't assign dp to partner if it's the same port that xrs700x_hsr_join
      was called with. The partner port is supposed to be the other port in
      the HSR/PRP redundant pair not the same port. This fixes an issue
      observed in testing where forwarding between redundant HSR ports on this
      switch didn't work depending on the order the ports were added to the
      hsr device.
      
      Fixes: bd62e6f5 ("net: dsa: xrs700x: add HSR offloading support")
      Signed-off-by: default avatarGeorge McCollister <george.mccollister@gmail.com>
      Reviewed-by: default avatarVladimir Oltean <olteanv@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      286a8624
  4. 09 Mar, 2021 9 commits
  5. 08 Mar, 2021 5 commits