1. 06 Apr, 2021 8 commits
  2. 02 Mar, 2021 1 commit
  3. 24 Feb, 2021 5 commits
  4. 23 Feb, 2021 26 commits
    • Merge branch 'wireguard-fixes-for-5-12-rc1' · fcb30073
      Jakub Kicinski authored
      Jason Donenfeld says:
      
      ====================
      wireguard fixes for 5.12-rc1
      
      This series has a collection of fixes that have piled up for a little
      while now, that I unfortunately didn't get a chance to send out earlier.
      
      1) Remove unlikely() from IS_ERR(), since it's already implied.

      2) Remove a bogus sparse annotation that hasn't been needed for years.

      3) Additional test in the test suite for stressing parallel ndo_start_xmit.
      
      4) Slight struct reordering in preparation for subsequent fix.
      
      5) If skb->protocol is bogus, we no longer attempt to send icmp messages.
      
      6) Massive memory usage fix, hit by larger deployments.
      
      7) Fix typo in kconfig dependency logic.
      
      (1) and (2) are tiny cleanups, and (3) is just a test, so if you're
      trying to reduce churn, you could not backport these. But (4), (5), (6),
      and (7) fix problems and should be applied to stable. IMO, it's probably
      easiest to just apply them all to stable.
      ====================
      
      Link: https://lore.kernel.org/r/20210222162549.3252778-1-Jason@zx2c4.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: kconfig: use arm chacha even with no neon · bce24739
      Jason A. Donenfeld authored
      The condition here was incorrect: a non-neon fallback implementation is
      available on arm32 when NEON is not supported.
      Reported-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: queueing: get rid of per-peer ring buffers · 8b5553ac
      Jason A. Donenfeld authored
      Having two ring buffers per-peer means that every peer results in two
      massive ring allocations. On an 8-core x86_64 machine, this commit
      reduces the per-peer allocation from 18,688 bytes to 1,856 bytes, which
      is a 90% reduction. Ninety percent! With some single-machine
      deployments approaching 500,000 peers, we're talking about a reduction
      from 7 gigs of memory down to 700 megs of memory.
      
      In order to get rid of these per-peer allocations, this commit switches
      to using a list-based queueing approach. Currently GSO fragments are
      chained together using the skb->next pointer (the skb_list_* singly
      linked list approach), so we form the per-peer queue around the unused
      skb->prev pointer (which sort of makes sense because the links are
      pointing backwards). Use of skb_queue_* is not possible here, because
      that is based on doubly linked lists and spinlocks. Multiple cores can
      write into the queue at any given time, because its writes occur in the
      start_xmit path or in the udp_recv path. But reads happen in a single
      workqueue item per-peer, amounting to a multi-producer, single-consumer
      paradigm.
      
      The MPSC queue is implemented locklessly and never blocks. However, it
      is not linearizable (though it is serializable), with a very tight and
      unlikely race on writes, which, when hit (some tiny fraction of the
      0.15% of partial adds on a fully loaded 16-core x86_64 system), causes
      the queue reader to terminate early. However, because every packet sent
      queues up the same workqueue item after it is fully added, the worker
      resumes again, and stopping early isn't actually a problem, since at
      that point the packet wouldn't have yet been added to the encryption
      queue. These properties allow us to avoid disabling interrupts or
      spinning. The design is based on Dmitry Vyukov's algorithm [1].
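
The queueing scheme described above can be sketched in user-space C. This is only an illustration of the intrusive MPSC algorithm that [1] describes, written with C11 atomics; the names struct node, struct mpsc_queue, mpsc_push and mpsc_pop are hypothetical and are not WireGuard's actual code, which links packets through skb->prev instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node and queue types; not WireGuard's actual structures. */
struct node {
    _Atomic(struct node *) next;
    int value;
};

struct mpsc_queue {
    _Atomic(struct node *) head; /* producers swap themselves in here */
    struct node *tail;           /* touched only by the single consumer */
    struct node stub;            /* dummy node so the list is never empty */
};

void mpsc_init(struct mpsc_queue *q)
{
    atomic_init(&q->stub.next, NULL);
    atomic_init(&q->head, &q->stub);
    q->tail = &q->stub;
}

/* May be called from many threads at once; never blocks or spins. */
void mpsc_push(struct mpsc_queue *q, struct node *n)
{
    struct node *prev;

    assert(n != NULL);
    atomic_store_explicit(&n->next, NULL, memory_order_relaxed);
    prev = atomic_exchange_explicit(&q->head, n, memory_order_acq_rel);
    /* Between the exchange above and the store below the list is briefly
     * disconnected: this is the window in which the reader stops early. */
    atomic_store_explicit(&prev->next, n, memory_order_release);
}

/* Single consumer only. Returns NULL when empty or when a push is half done. */
struct node *mpsc_pop(struct mpsc_queue *q)
{
    struct node *tail = q->tail;
    struct node *next = atomic_load_explicit(&tail->next, memory_order_acquire);

    if (tail == &q->stub) {
        if (!next)
            return NULL;
        q->tail = next;
        tail = next;
        next = atomic_load_explicit(&tail->next, memory_order_acquire);
    }
    if (next) {
        q->tail = next;
        return tail;
    }
    if (tail != atomic_load_explicit(&q->head, memory_order_acquire))
        return NULL;        /* a producer is mid-push; the caller retries later */
    mpsc_push(q, &q->stub); /* re-insert the stub so the last node can be handed out */
    next = atomic_load_explicit(&tail->next, memory_order_acquire);
    if (next) {
        q->tail = next;
        return tail;
    }
    return NULL;
}
```

In this sketch an early NULL from mpsc_pop() corresponds to the reader terminating early; as the message explains, that is harmless as long as each producer schedules the consumer again once its push has completed.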
      
      Performance-wise, ordinarily list-based queues aren't preferable to
      ringbuffers, because of cache misses when following pointers around.
      However, we *already* have to follow the adjacent pointers when working
      through fragments, so there shouldn't actually be any change there. A
      potential downside is that dequeueing is a bit more complicated, but the
      ptr_ring structure used prior had a spinlock when dequeueing, so all in
      all the difference appears to be a wash.
      
      Actually, from profiling, the biggest performance hit, by far, of this
      commit winds up being atomic_add_unless(count, 1, max) and
      atomic_dec(count), which account for the majority of CPU time, according to
      perf. In that sense, the previous ring buffer was superior in that it
      could check if it was full by head==tail, which the list-based approach
      cannot do.
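
To make the cost mentioned above concrete, here is a rough user-space sketch of a bounded counter guarding a list-based queue. add_unless() is a stand-in written with C11 atomics that mimics the semantics of the kernel's atomic_add_unless(); queue_reserve() and queue_release() are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Add a to *v unless *v == u; returns true if the add happened. */
bool add_unless(atomic_int *v, int a, int u)
{
    int cur = atomic_load(v);

    while (cur != u) {
        if (atomic_compare_exchange_weak(v, &cur, cur + a))
            return true;
        /* cur was refreshed by the failed compare-exchange; try again */
    }
    return false;
}

/* Hypothetical guard: claim a slot before linking a packet into the list. */
bool queue_reserve(atomic_int *count, int max)
{
    assert(max > 0);
    return add_unless(count, 1, max);
}

/* Hypothetical guard: give the slot back after dequeueing. */
void queue_release(atomic_int *count)
{
    atomic_fetch_sub(count, 1);
}
```

Every enqueue and dequeue touches the same shared counter, which is consistent with these operations showing up prominently in a profile; a ring buffer can instead detect fullness by comparing head and tail.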
      
      But all in all, this enables us to get massive memory savings, allowing
      WireGuard to scale for real world deployments, without taking much of a
      performance hit.
      
      [1] http://www.1024cores.net/home/lock-free-algorithms/queues/intrusive-mpsc-node-based-queue

      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: device: do not generate ICMP for non-IP packets · 99fff526
      Jason A. Donenfeld authored
      If skb->protocol doesn't match the actual skb->data header, it's
      probably not a good idea to pass it off to icmp{,v6}_ndo_send, which is
      expecting to reply to a valid IP packet. So this commit has that early
      mismatch case jump to a later error label.
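
The kind of check involved might look like the following user-space sketch. struct packet is a hypothetical stand-in for struct sk_buff, the protocol field is kept in host byte order for simplicity, and the function names are illustrative rather than the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_IPV4 0x0800u /* same numeric value as ETH_P_IP */
#define PROTO_IPV6 0x86DDu /* same numeric value as ETH_P_IPV6 */

struct packet {            /* hypothetical stand-in for struct sk_buff */
    uint16_t protocol;     /* what the stack claims the payload is */
    const uint8_t *data;   /* start of the network header */
    size_t len;
};

/* Derive the protocol from the header itself; 0 means "not IP". */
uint16_t parse_protocol(const struct packet *p)
{
    assert(p != NULL);
    if (p->data == NULL || p->len == 0)
        return 0;
    if (p->len >= 20 && (p->data[0] >> 4) == 4)
        return PROTO_IPV4;
    if (p->len >= 40 && (p->data[0] >> 4) == 6)
        return PROTO_IPV6;
    return 0;
}

/* Only when this returns true is it reasonable to answer with ICMP. */
bool protocol_matches_header(const struct packet *p)
{
    uint16_t real = parse_protocol(p);

    return real != 0 && p->protocol == real;
}
```

A transmit path built on this sketch would skip the ICMP reply and go straight to its cleanup code whenever protocol_matches_header() returns false.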
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: peer: put frequently used members above cache lines · 5a059869
      Jason A. Donenfeld authored
      The is_dead boolean is checked for every single packet, while the
      internal_id member is used basically only for pr_debug messages. So it
      makes sense to hoist up is_dead into some space formerly unused by a
      struct hole, while demoting internal_id to below the lowest struct
      cache line.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: selftests: test multiple parallel streams · d5a49aa6
      Jason A. Donenfeld authored
      In order to test ndo_start_xmit being called in parallel, explicitly add
      separate tests, which should all run on different cores. This should
      help tease out bugs associated with queueing up packets from different
      cores in parallel. Currently, it hasn't found those types of bugs, but
      given future planned work, this is a useful regression to avoid.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: socket: remove bogus __be32 annotation · 7f57bd8d
      Jann Horn authored
      The endpoint->src_if4 has nothing to do with fixed-endian numbers; remove
      the bogus annotation.
      
      This was introduced in
      https://git.zx2c4.com/wireguard-monolithic-historical/commit?id=14e7d0a499a676ec55176c0de2f9fcbd34074a82
      in the historical WireGuard repo because the old code used to
      zero-initialize multiple members as follows:
      
          endpoint->src4.s_addr = endpoint->src_if4 = fl.saddr = 0;
      
      Because fl.saddr is fixed-endian and an assignment returns a value with the
      type of its left operand, this meant that sparse detected an assignment
      between values of different endianness.
      
      Since then, this assignment was already split up into separate statements;
      just the cast survived.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wireguard: avoid double unlikely() notation when using IS_ERR() · 30ac4e2f
      Antonio Quartulli authored
      The definition of IS_ERR() already applies the unlikely() notation
      when checking the error status of the passed pointer. For this
      reason there is no need to have the same notation outside of
      IS_ERR() itself.
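
A user-space approximation of the macros shows why the outer hint is redundant. The definitions below imitate the shape of the kernel's err.h helpers but are a sketch, not a copy; they use intptr_t instead of long and rely on the GCC/Clang builtin __builtin_expect. use_result() is a hypothetical caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(intptr_t) == sizeof(void *), "error codes must fit in a pointer");

#define MAX_ERRNO 4095
#define unlikely(x) __builtin_expect(!!(x), 0)
/* The branch hint already lives here, inside the error test itself. */
#define IS_ERR_VALUE(x) unlikely((uintptr_t)(void *)(x) >= (uintptr_t)-MAX_ERRNO)

static inline void *ERR_PTR(intptr_t error) { return (void *)error; }
static inline intptr_t PTR_ERR(const void *ptr) { return (intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr) { return IS_ERR_VALUE((uintptr_t)ptr); }

/* Hypothetical caller: a plain IS_ERR() check, with no extra unlikely(). */
intptr_t use_result(const void *p)
{
    if (IS_ERR(p))
        return PTR_ERR(p);
    return 0;
}
```

Writing unlikely(IS_ERR(p)) on top of this would only wrap an expression that is already marked as unlikely.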
      
      Clean up code by removing redundant notation.
      Signed-off-by: Antonio Quartulli <a@unstable.cc>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: qrtr: Fix memory leak in qrtr_tun_open · fc0494ea
      Takeshi Misawa authored
      If qrtr_endpoint_register() fails, tun is leaked.
      Fix this by freeing tun in the error path.
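
The pattern behind this fix can be sketched in user-space C as follows. struct tun, endpoint_register() and tun_open() are simplified, hypothetical stand-ins (with calloc/free in place of kzalloc/kfree), not the driver's real code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct tun { int registered; }; /* hypothetical, heavily simplified */

/* Stand-in for qrtr_endpoint_register(); fails on demand for illustration. */
int endpoint_register(struct tun *t, int should_fail)
{
    if (should_fail)
        return -EINVAL;
    t->registered = 1;
    return 0;
}

int tun_open(struct tun **out, int should_fail)
{
    struct tun *t;
    int ret;

    assert(out != NULL);
    t = calloc(1, sizeof(*t));
    if (!t)
        return -ENOMEM;

    ret = endpoint_register(t, should_fail);
    if (ret)
        goto out_free; /* without this jump the allocation would be leaked */

    *out = t;
    return 0;

out_free:
    free(t);
    return ret;
}
```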
      
      syzbot report:
      BUG: memory leak
      unreferenced object 0xffff88811848d680 (size 64):
        comm "syz-executor684", pid 10171, jiffies 4294951561 (age 26.070s)
        hex dump (first 32 bytes):
          80 dd 0a 84 ff ff ff ff 00 00 00 00 00 00 00 00  ................
          90 d6 48 18 81 88 ff ff 90 d6 48 18 81 88 ff ff  ..H.......H.....
        backtrace:
          [<0000000018992a50>] kmalloc include/linux/slab.h:552 [inline]
          [<0000000018992a50>] kzalloc include/linux/slab.h:682 [inline]
          [<0000000018992a50>] qrtr_tun_open+0x22/0x90 net/qrtr/tun.c:35
          [<0000000003a453ef>] misc_open+0x19c/0x1e0 drivers/char/misc.c:141
          [<00000000dec38ac8>] chrdev_open+0x10d/0x340 fs/char_dev.c:414
          [<0000000079094996>] do_dentry_open+0x1e6/0x620 fs/open.c:817
          [<000000004096d290>] do_open fs/namei.c:3252 [inline]
          [<000000004096d290>] path_openat+0x74a/0x1b00 fs/namei.c:3369
          [<00000000b8e64241>] do_filp_open+0xa0/0x190 fs/namei.c:3396
          [<00000000a3299422>] do_sys_openat2+0xed/0x230 fs/open.c:1172
          [<000000002c1bdcef>] do_sys_open fs/open.c:1188 [inline]
          [<000000002c1bdcef>] __do_sys_openat fs/open.c:1204 [inline]
          [<000000002c1bdcef>] __se_sys_openat fs/open.c:1199 [inline]
          [<000000002c1bdcef>] __x64_sys_openat+0x7f/0xe0 fs/open.c:1199
          [<00000000f3a5728f>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
          [<000000004b38b7ec>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 28fb4e59 ("net: qrtr: Expose tunneling endpoint to user space")
      Reported-by: syzbot+5d6e4af21385f5cfc56a@syzkaller.appspotmail.com
      Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com>
      Link: https://lore.kernel.org/r/20210221234427.GA2140@DESKTOP
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • vxlan: move debug check after netdev unregister · 92584ddf
      Taehee Yoo authored
      The debug check must be done after unregister_netdevice_many() call --
      the hlist_del_rcu() for this is done inside .ndo_stop.
      
      This is the same as commit 0fda7600 ("geneve: move debug check after
      netdev unregister")
      
      Test commands:
          ip netns del A
          ip netns add A
          ip netns add B
      
          ip netns exec B ip link add vxlan0 type vxlan vni 100 local 10.0.0.1 \
      	    remote 10.0.0.2 dstport 4789 srcport 4789 4789
          ip netns exec B ip link set vxlan0 netns A
          ip netns exec A ip link set vxlan0 up
          ip netns del B
      
      Splat looks like:
      [   73.176249][    T7] ------------[ cut here ]------------
      [   73.178662][    T7] WARNING: CPU: 4 PID: 7 at drivers/net/vxlan.c:4743 vxlan_exit_batch_net+0x52e/0x720 [vxlan]
      [   73.182597][    T7] Modules linked in: vxlan openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_core nfp mlxfw ixgbevf tls sch_fq_codel nf_tables nfnetlink ip_tables x_tables unix
      [   73.190113][    T7] CPU: 4 PID: 7 Comm: kworker/u16:0 Not tainted 5.11.0-rc7+ #838
      [   73.193037][    T7] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [   73.196986][    T7] Workqueue: netns cleanup_net
      [   73.198946][    T7] RIP: 0010:vxlan_exit_batch_net+0x52e/0x720 [vxlan]
      [   73.201509][    T7] Code: 00 01 00 00 0f 84 39 fd ff ff 48 89 ca 48 c1 ea 03 80 3c 1a 00 0f 85 a6 00 00 00 89 c2 48 83 c2 02 49 8b 14 d4 48 85 d2 74 ce <0f> 0b eb ca e8 b9 51 db dd 84 c0 0f 85 4a fe ff ff 48 c7 c2 80 bc
      [   73.208813][    T7] RSP: 0018:ffff888100907c10 EFLAGS: 00010286
      [   73.211027][    T7] RAX: 000000000000003c RBX: dffffc0000000000 RCX: ffff88800ec411f0
      [   73.213702][    T7] RDX: ffff88800a278000 RSI: ffff88800fc78c70 RDI: ffff88800fc78070
      [   73.216169][    T7] RBP: ffff88800b5cbdc0 R08: fffffbfff424de61 R09: fffffbfff424de61
      [   73.218463][    T7] R10: ffffffffa126f307 R11: fffffbfff424de60 R12: ffff88800ec41000
      [   73.220794][    T7] R13: ffff888100907d08 R14: ffff888100907c50 R15: ffff88800fc78c40
      [   73.223337][    T7] FS:  0000000000000000(0000) GS:ffff888114800000(0000) knlGS:0000000000000000
      [   73.225814][    T7] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   73.227616][    T7] CR2: 0000562b5cb4f4d0 CR3: 0000000105fbe001 CR4: 00000000003706e0
      [   73.229700][    T7] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   73.231820][    T7] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   73.233844][    T7] Call Trace:
      [   73.234698][    T7]  ? vxlan_err_lookup+0x3c0/0x3c0 [vxlan]
      [   73.235962][    T7]  ? ops_exit_list.isra.11+0x93/0x140
      [   73.237134][    T7]  cleanup_net+0x45e/0x8a0
      [ ... ]
      
      Fixes: 57b61127 ("vxlan: speedup vxlan tunnels dismantle")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Link: https://lore.kernel.org/r/20210221154552.11749-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'r8152-minor-adjustments' · 2c8396de
      Jakub Kicinski authored
      Hayes Wang says:
      
      ====================
      r8152: minor adjustments
      
      These patches are used to adjust the code.
      ====================
      
      Link: https://lore.kernel.org/r/1394712342-15778-341-Taiwan-albertk@realtek.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8152: spilt rtl_set_eee_plus and r8153b_green_en · 40fa7568
      Hayes Wang authored
      Add rtl_eee_plus_en() and rtl_green_en().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8152: replace netif_err with dev_err · 156c3207
      Hayes Wang authored
      Some messages are before calling register_netdev(), so replace
      netif_err() with dev_err().
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8152: check if the pointer of the function exists · c79515e4
      Hayes Wang authored
      Return error code if autosuspend_en, eee_get, or eee_set don't exist.
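
A guard of this kind might look like the sketch below. The struct layout, the helper names and the choice of -EOPNOTSUPP as the error code are assumptions made for illustration; the message above only says that an error code is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dev;

struct dev_ops { /* hypothetical per-chip operations table */
    int (*eee_get)(struct dev *d, int *enabled);
};

struct dev {
    struct dev_ops ops;
    int eee;
};

/* Example implementation that a chip supporting EEE would install. */
int sample_eee_get(struct dev *d, int *enabled)
{
    *enabled = d->eee;
    return 0;
}

/* Check that the operation exists before calling through the pointer. */
int dev_get_eee(struct dev *d, int *enabled)
{
    assert(d != NULL && enabled != NULL);
    if (!d->ops.eee_get)
        return -EOPNOTSUPP; /* assumed error code, see the note above */
    return d->ops.eee_get(d, enabled);
}
```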
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8152: enable U1/U2 for USB_SPEED_SUPER · 7a0ae61a
      Hayes Wang authored
      U1/U2 should be enabled for USB 3.0 or later. USB 2.0 doesn't
      support them.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: cls_flower: validate ct_state for invalid and reply flags · 3aed8b63
      wenxu authored
      Add validation of the invalid and reply flags to fl_validate_ct_state().
      This makes the checking complete when compared to OVS's
      validate_ct_state().
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Link: https://lore.kernel.org/r/1614064315-364-1-git-send-email-wenxu@ucloud.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'net-dsa-learning-fixes-for-b53-bcm_sf2' · f3f9be9c
      Jakub Kicinski authored
      Florian Fainelli says:
      
      ====================
      net: dsa: Learning fixes for b53/bcm_sf2
      
      This patch series contains a couple of fixes for the b53/bcm_sf2 drivers
      with respect to configuring learning.
      
      The first patch is wiring-up the necessary dsa_switch_ops operations in
      order to support the offloading of bridge flags.
      
      The second patch corrects the switch driver's default learning behavior
      which was unfortunately wrong from day one.
      
      This is submitted against "net" because this is technically a bug fix
      since ports should not have had learning enabled by default but given
      this is dependent upon Vladimir's recent br_flags series, there is no
      Fixes tag provided.
      
      I will be providing targeted stable backports that look a bit
      different.
      
      Changes in v2:
      
      - added first patch
      - updated second patch to include BR_LEARNING check in br_flags_pre as
        a supported bridge flag to offload
      ====================
      
      Link: https://lore.kernel.org/r/20210222223010.2907234-1-f.fainelli@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: b53: Support setting learning on port · f9b3827e
      Florian Fainelli authored
      Add support for being able to set the learning attribute on port, and
      make sure that the standalone ports start up with learning disabled.
      
      We can remove the code in bcm_sf2 that configured the ports learning
      attribute because we want the standalone ports to have learning disabled
      by default and port 7 cannot be bridged, so its learning attribute will
      not change past its initial configuration.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: bcm_sf2: Wire-up br_flags_pre, br_flags and set_mrouter · e6dd86ed
      Florian Fainelli authored
      Because bcm_sf2 implements its own dsa_switch_ops we need to export the
      b53_br_flags_pre(), b53_br_flags() and b53_set_mrouter() so we can wire
      them up like they used to be with the former b53_br_egress_floods().
      
      Fixes: a8b659e7 ("net: dsa: act as passthrough for bridge port flags")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Marvell Sky2 Ethernet adapter: fix warning messages. · 18755e27
      Krzysztof Halasa authored
      sky2.c driver uses netdev_warn() before the net device is initialized.
      Fix it by using dev_warn() instead.
      Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
      Link: https://lore.kernel.org/r/m3a6s1r1ul.fsf@t19.piap.pl
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bcm63xx_enet: fix sporadic kernel panic · 9bc1ef64
      Sieng Piaw Liew authored
      In ndo_stop functions, netdev_completed_queue() is called during forced
      tx reclaim, after netdev_reset_queue(). This may trigger kernel panic if
      there is any tx skb left.
      
      This patch moves netdev_reset_queue() to after tx reclaim, so BQL can
      complete successfully then reset.
      Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
      Fixes: 4c59b0f5 ("bcm63xx_enet: add BQL support")
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/20210222013530.1356-1-liew.s.piaw@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending · ee576c47
      Jason A. Donenfeld authored
      The icmp{,v6}_send functions make all sorts of use of skb->cb, casting
      it with IPCB or IP6CB, assuming the skb to have come directly from the
      inet layer. But when the packet comes from the ndo layer, especially
      when forwarded, there's no telling what might be in skb->cb at that
      point. As a result, the icmp sending code risks reading bogus memory
      contents, which can result in nasty stack overflows such as this one
      reported by a user:
      
          panic+0x108/0x2ea
          __stack_chk_fail+0x14/0x20
          __icmp_send+0x5bd/0x5c0
          icmp_ndo_send+0x148/0x160
      
      In icmp_send, skb->cb is cast with IPCB and an ip_options struct is read
      from it. The optlen parameter there is of particular note, as it can
      induce writes beyond bounds. There are quite a few ways that can happen
      in __ip_options_echo. For example:
      
          // sptr/skb are attacker-controlled skb bytes
          sptr = skb_network_header(skb);
          // dptr/dopt points to stack memory allocated by __icmp_send
          dptr = dopt->__data;
          // sopt is the corrupt skb->cb in question
          if (sopt->rr) {
              optlen  = sptr[sopt->rr+1]; // corrupt skb->cb + skb->data
              soffset = sptr[sopt->rr+2]; // corrupt skb->cb + skb->data
              // this now writes potentially attacker-controlled data,
              // overflowing the stack:
              memcpy(dptr, sptr+sopt->rr, optlen);
          }
      
      In the icmpv6_send case, the story is similar, but not as dire, as only
      IP6CB(skb)->iif and IP6CB(skb)->dsthao are used. The dsthao case is
      worse than the iif case, but it is passed to ipv6_find_tlv, which does
      a bit of bounds checking on the value.
      
      This is easy to simulate by doing a `memset(skb->cb, 0x41,
      sizeof(skb->cb));` before calling icmp{,v6}_ndo_send, and it's only by
      good fortune and the rarity of icmp sending from that context that we've
      avoided reports like this until now. For example, in KASAN:
      
          BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0xa0e/0x12b0
          Write of size 38 at addr ffff888006f1f80e by task ping/89
          CPU: 2 PID: 89 Comm: ping Not tainted 5.10.0-rc7-debug+ #5
          Call Trace:
           dump_stack+0x9a/0xcc
           print_address_description.constprop.0+0x1a/0x160
           __kasan_report.cold+0x20/0x38
           kasan_report+0x32/0x40
           check_memory_region+0x145/0x1a0
           memcpy+0x39/0x60
           __ip_options_echo+0xa0e/0x12b0
           __icmp_send+0x744/0x1700
      
      Actually, out of the 4 drivers that do this, only gtp zeroed the cb for
      the v4 case, while the rest did not. So this commit actually removes the
      gtp-specific zeroing, while putting the code where it belongs in the
      shared infrastructure of icmp{,v6}_ndo_send.
      
      This commit fixes the issue by passing an empty IPCB or IP6CB along to
      the functions that actually do the work. For the icmp_send, this was
      already trivial, thanks to __icmp_send providing the plumbing function.
      For icmpv6_send, this required a tiny bit of refactoring to make it
      behave like the v4 case, after which it was straightforward.
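
The shape of this fix can be illustrated with a small user-space model. struct ip_opts, struct pkt and the three functions are hypothetical simplifications: echo_len() plays the role of the worker that trusts the options it is given, and the two entry points differ only in where those options come from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct ip_opts { uint8_t optlen; uint8_t rr; }; /* hypothetical, simplified */
struct pkt { unsigned char cb[48]; };           /* scratch area, like skb->cb */

static_assert(sizeof(struct ip_opts) <= sizeof(((struct pkt *)0)->cb),
              "options must fit in the scratch area");

/* Worker that trusts whatever options it is handed. */
unsigned echo_len(const struct pkt *p, const struct ip_opts *opt)
{
    (void)p;
    return opt->rr ? opt->optlen : 0;
}

/* Inet-layer entry point: cb is known to hold valid options. */
unsigned send_from_inet(const struct pkt *p)
{
    struct ip_opts opt;

    memcpy(&opt, p->cb, sizeof(opt));
    return echo_len(p, &opt);
}

/* Driver (ndo) entry point: cb may hold anything, so pass zeroed options. */
unsigned send_from_ndo(const struct pkt *p)
{
    struct ip_opts opt = { 0 };

    return echo_len(p, &opt);
}
```

Filling cb with 0x41, as in the memset experiment mentioned above, makes the inet-style path report a bogus option length in this model, while the ndo-style path ignores cb entirely.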
      
      Fixes: a2b78e9b ("sunvnet: generate ICMP PTMUD messages for smaller port MTUs")
      Reported-by: SinYu <liuxyon@gmail.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/netdev/CAF=yD-LOF116aHub6RMe8vB8ZpnrrnoTdqhobEx+bvoA8AsP0w@mail.gmail.com/T/
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Link: https://lore.kernel.org/r/20210223131858.72082-1-Jason@zx2c4.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 42870a1a
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-02-19
      
      This series contains updates to i40e driver only.
      
      Slawomir resolves an issue with the IPv6 extension headers being
      processed incorrectly.
      
      Keita Suzuki fixes a memory leak on probe failure.
      
      Mateusz initializes AQ command structures to zero to comply with
      spec, fixes FW flow control settings being overwritten and resolves an
      issue with adding VLAN filters after enabling FW LLDP. He also adds
      an additional check when adding TC filter as the current check doesn't
      properly distinguish between IPv4 and IPv6.
      
      Sylwester removes setting disabled bit when syncing filters as this
      prevents VFs from completing setup.
      
      Norbert cleans up sparse warnings.
      
      v2:
      - Fix fixes tag on patch 7
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        i40e: Fix endianness conversions
        i40e: Fix add TC filter for IPv6
        i40e: Fix VFs not created
        i40e: Fix addition of RX filters after enabling FW LLDP agent
        i40e: Fix overwriting flow control settings during driver loading
        i40e: Add zero-initialization of AQ command structures
        i40e: Fix memory leak in i40e_probe
        i40e: Fix flow for IPv6 next header (extension header)
      ====================
      
      Link: https://lore.kernel.org/r/20210219213606.2567536-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/mlx4_core: Add missed mlx4_free_cmd_mailbox() · 8eb65fda
      Chuhong Yuan authored
      mlx4_do_mirror_rule() forgets to call mlx4_free_cmd_mailbox() to
      free the memory region allocated by mlx4_alloc_cmd_mailbox() before
      an exit.
      Add the missed call to fix it.
      
      Fixes: 78efed27 ("net/mlx4_core: Support mirroring VF DMFS rules on both ports")
      Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20210221143559.390277-1-hslester96@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: stmmac: fix CBS idleslope and sendslope calculation · 24877687
      Song, Yoong Siang authored
      When link speed is not 100 Mbps, port transmit rate and speed divider
      are set to 8 and 1000000 respectively. These values are incorrect for
      CBS idleslope and sendslope HW values calculation if the link speed is
      not 1 Gbps.
      
      This patch adds a switch statement to set the values of port transmit rate
      and speed divider for 10 Gbps, 5 Gbps, 2.5 Gbps, 1 Gbps, and 100 Mbps.
      Note that CBS is not supported at 10 Mbps.
      
      Fixes: bc41a668 ("net: stmmac: tc: Remove the speed dependency")
      Fixes: 1f705bc6 ("net: stmmac: Add support for CBS QDISC")
      Signed-off-by: Song, Yoong Siang <yoong.siang.song@intel.com>
      Link: https://lore.kernel.org/r/1613655653-11755-1-git-send-email-yoong.siang.song@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'mptcp-a-bunch-of-fixes' · e5bcf0e8
      Jakub Kicinski authored
      Paolo Abeni says:
      
      ====================
      mptcp: a bunch of fixes
      
      This series bundles a few MPTCP fixes for the current net tree.
      They have been detected via syzkaller and packetdrill.
      
      Patch 1 fixes a slow close for orphaned sockets
      
      Patch 2 fixes another hangup at close time, when no
      data was actually transmitted before close
      
      Patch 3 fixes a memory leak with unusual sockopts
      
      Patch 4 fixes stray wake-ups on listener sockets
      ====================
      
      Link: https://lore.kernel.org/r/cover.1613755058.git.pabeni@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>