1. 27 Nov, 2020 2 commits
    • Pablo Neira Ayuso's avatar
      netfilter: nftables_offload: set address type in control dissector · 3c78e9e0
      Pablo Neira Ayuso authored
      This patch adds nft_flow_rule_set_addr_type() to set the address type
      from the nft_payload expression accordingly.
      
      If the address type is not set in the control dissector then a rule that
      matches either on source or destination IP address does not work.
      
      After this patch, nft hardware offload generates the flow dissector
      configuration as tc-flower does to match on an IP address.
      
      This patch has been also tested functionally to make sure packets are
      filtered out by the NIC.
      
      This is also getting the code aligned with the existing netfilter flow
      offload infrastructure which is also setting the control dissector.
      
      Fixes: c9626a2c ("netfilter: nf_tables: add hardware offload support")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      3c78e9e0
    • Wang Hai's avatar
      ipvs: fix possible memory leak in ip_vs_control_net_init · 4bc3c8dc
      Wang Hai authored
      kmemleak report a memory leak as follows:
      
      BUG: memory leak
      unreferenced object 0xffff8880759ea000 (size 256):
      backtrace:
      [<00000000c0bf2deb>] kmem_cache_zalloc include/linux/slab.h:656 [inline]
      [<00000000c0bf2deb>] __proc_create+0x23d/0x7d0 fs/proc/generic.c:421
      [<000000009d718d02>] proc_create_reg+0x8e/0x140 fs/proc/generic.c:535
      [<0000000097bbfc4f>] proc_create_net_data+0x8c/0x1b0 fs/proc/proc_net.c:126
      [<00000000652480fc>] ip_vs_control_net_init+0x308/0x13a0 net/netfilter/ipvs/ip_vs_ctl.c:4169
      [<000000004c927ebe>] __ip_vs_init+0x211/0x400 net/netfilter/ipvs/ip_vs_core.c:2429
      [<00000000aa6b72d9>] ops_init+0xa8/0x3c0 net/core/net_namespace.c:151
      [<00000000153fd114>] setup_net+0x2de/0x7e0 net/core/net_namespace.c:341
      [<00000000be4e4f07>] copy_net_ns+0x27d/0x530 net/core/net_namespace.c:482
      [<00000000f1c23ec9>] create_new_namespaces+0x382/0xa30 kernel/nsproxy.c:110
      [<00000000098a5757>] copy_namespaces+0x2e6/0x3b0 kernel/nsproxy.c:179
      [<0000000026ce39e9>] copy_process+0x220a/0x5f00 kernel/fork.c:2072
      [<00000000b71f4efe>] _do_fork+0xc7/0xda0 kernel/fork.c:2428
      [<000000002974ee96>] __do_sys_clone3+0x18a/0x280 kernel/fork.c:2703
      [<0000000062ac0a4d>] do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
      [<0000000093f1ce2c>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      In the error path of ip_vs_control_net_init(), remove_proc_entry() needs
      to be called to remove the added proc entry, otherwise a memory leak
      will occur.
      
      Also, add some '#ifdef CONFIG_PROC_FS' because proc_create_net* return NULL
      when PROC is not used.
      
      Fixes: b17fc996 ("IPVS: netns, ip_vs_stats and its procfs")
      Fixes: 61b1ab45 ("IPVS: netns, add basic init per netns.")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarWang Hai <wanghai38@huawei.com>
      Acked-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      4bc3c8dc
  2. 25 Nov, 2020 16 commits
    • Florian Westphal's avatar
      netfilter: nf_tables: avoid false-postive lockdep splat · c0700dfa
      Florian Westphal authored
      There are reports wrt lockdep splat in nftables, e.g.:
      ------------[ cut here ]------------
      WARNING: CPU: 2 PID: 31416 at net/netfilter/nf_tables_api.c:622
      lockdep_nfnl_nft_mutex_not_held+0x28/0x38 [nf_tables]
      ...
      
      These are caused by an earlier, unrelated bug such as a n ABBA deadlock
      in a different subsystem.
      In such an event, lockdep is disabled and lockdep_is_held returns true
      unconditionally.  This then causes the WARN() in nf_tables.
      
      Make the WARN conditional on lockdep still active to avoid this.
      
      Fixes: f102d66b ("netfilter: nf_tables: use dedicated mutex to guard transactions")
      Reported-by: default avatarNaresh Kamboju <naresh.kamboju@linaro.org>
      Link: https://lore.kernel.org/linux-kselftest/CA+G9fYvFUpODs+NkSYcnwKnXm62tmP=ksLeBPmB+KFrB2rvCtQ@mail.gmail.com/Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      c0700dfa
    • Eric Dumazet's avatar
      netfilter: ipset: prevent uninit-value in hash_ip6_add · 68ad89de
      Eric Dumazet authored
      syzbot found that we are not validating user input properly
      before copying 16 bytes [1].
      
      Using NLA_BINARY in ipaddr_policy[] for IPv6 address is not correct,
      since it ensures at most 16 bytes were provided.
      
      We should instead make sure user provided exactly 16 bytes.
      
      In old kernels (before v4.20), fix would be to remove the NLA_BINARY,
      since NLA_POLICY_EXACT_LEN() was not yet available.
      
      [1]
      BUG: KMSAN: uninit-value in hash_ip6_add+0x1cba/0x3a50 net/netfilter/ipset/ip_set_hash_gen.h:892
      CPU: 1 PID: 11611 Comm: syz-executor.0 Not tainted 5.10.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x21c/0x280 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
       hash_ip6_add+0x1cba/0x3a50 net/netfilter/ipset/ip_set_hash_gen.h:892
       hash_ip6_uadt+0x976/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:267
       call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720
       ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808
       ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833
       nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252
       netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494
       nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
       ___sys_sendmsg net/socket.c:2407 [inline]
       __sys_sendmsg+0x6d5/0x830 net/socket.c:2440
       __do_sys_sendmsg net/socket.c:2449 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
       do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45deb9
      Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fe2e503fc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 0000000000029ec0 RCX: 000000000045deb9
      RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
      RBP: 000000000118bf60 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c
      R13: 000000000169fb7f R14: 00007fe2e50409c0 R15: 000000000118bf2c
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
       kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289
       __msan_chain_origin+0x57/0xa0 mm/kmsan/kmsan_instr.c:147
       ip6_netmask include/linux/netfilter/ipset/pfxlen.h:49 [inline]
       hash_ip6_netmask net/netfilter/ipset/ip_set_hash_ip.c:185 [inline]
       hash_ip6_uadt+0xb1c/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:263
       call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720
       ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808
       ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833
       nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252
       netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494
       nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
       ___sys_sendmsg net/socket.c:2407 [inline]
       __sys_sendmsg+0x6d5/0x830 net/socket.c:2440
       __do_sys_sendmsg net/socket.c:2449 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
       do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
       kmsan_internal_chain_origin+0xad/0x130 mm/kmsan/kmsan.c:289
       kmsan_memcpy_memmove_metadata+0x25e/0x2d0 mm/kmsan/kmsan.c:226
       kmsan_memcpy_metadata+0xb/0x10 mm/kmsan/kmsan.c:246
       __msan_memcpy+0x46/0x60 mm/kmsan/kmsan_instr.c:110
       ip_set_get_ipaddr6+0x2cb/0x370 net/netfilter/ipset/ip_set_core.c:310
       hash_ip6_uadt+0x439/0xbd0 net/netfilter/ipset/ip_set_hash_ip.c:255
       call_ad+0x329/0xd00 net/netfilter/ipset/ip_set_core.c:1720
       ip_set_ad+0x111f/0x1440 net/netfilter/ipset/ip_set_core.c:1808
       ip_set_uadd+0xf6/0x110 net/netfilter/ipset/ip_set_core.c:1833
       nfnetlink_rcv_msg+0xc7d/0xdf0 net/netfilter/nfnetlink.c:252
       netlink_rcv_skb+0x70a/0x820 net/netlink/af_netlink.c:2494
       nfnetlink_rcv+0x4f0/0x4380 net/netfilter/nfnetlink.c:600
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x11da/0x14b0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x173c/0x1840 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
       ___sys_sendmsg net/socket.c:2407 [inline]
       __sys_sendmsg+0x6d5/0x830 net/socket.c:2440
       __do_sys_sendmsg net/socket.c:2449 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
       do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
       kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104
       kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76
       slab_alloc_node mm/slub.c:2906 [inline]
       __kmalloc_node_track_caller+0xc61/0x15f0 mm/slub.c:4512
       __kmalloc_reserve net/core/skbuff.c:142 [inline]
       __alloc_skb+0x309/0xae0 net/core/skbuff.c:210
       alloc_skb include/linux/skbuff.h:1094 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
       netlink_sendmsg+0xdb8/0x1840 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg net/socket.c:671 [inline]
       ____sys_sendmsg+0xc7a/0x1240 net/socket.c:2353
       ___sys_sendmsg net/socket.c:2407 [inline]
       __sys_sendmsg+0x6d5/0x830 net/socket.c:2440
       __do_sys_sendmsg net/socket.c:2449 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
       do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a7b4f989 ("netfilter: ipset: IP set core support")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      68ad89de
    • Vladimir Oltean's avatar
      enetc: Let the hardware auto-advance the taprio base-time of 0 · 90cf87d1
      Vladimir Oltean authored
      The tc-taprio base time indicates the beginning of the tc-taprio
      schedule, which is cyclic by definition (where the length of the cycle
      in nanoseconds is called the cycle time). The base time is a 64-bit PTP
      time in the TAI domain.
      
      Logically, the base-time should be a future time. But that imposes some
      restrictions to user space, which has to retrieve the current PTP time
      from the NIC first, then calculate a base time that will still be larger
      than the base time by the time the kernel driver programs this value
      into the hardware. Actually ensuring that the programmed base time is in
      the future is still a problem even if the kernel alone deals with this.
      
      Luckily, the enetc hardware already advances a base-time that is in the
      past into a congruent time in the immediate future, according to the
      same formula that can be found in the software implementation of taprio
      (in taprio_get_start_time):
      
      	/* Schedule the start time for the beginning of the next
      	 * cycle.
      	 */
      	n = div64_s64(ktime_sub_ns(now, base), cycle);
      	*start = ktime_add_ns(base, (n + 1) * cycle);
      
      There's only one problem: the driver doesn't let the hardware do that.
      It interferes with the base-time passed from user space, by special-casing
      the situation when the base-time is zero, and replaces that with the
      current PTP time. This changes the intended effective base-time of the
      schedule, which will in the end have a different phase offset than if
      the base-time of 0.000000000 was to be advanced by an integer multiple
      of the cycle-time.
      
      Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20201124220259.3027991-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      90cf87d1
    • Eric Dumazet's avatar
      gro_cells: reduce number of synchronize_net() calls · 2543a600
      Eric Dumazet authored
      After cited commit, gro_cells_destroy() became damn slow
      on hosts with a lot of cores.
      
      This is because we have one additional synchronize_net() per cpu as
      stated in the changelog.
      
      gro_cells_init() is setting NAPI_STATE_NO_BUSY_POLL, and this was enough
      to not have one synchronize_net() call per netif_napi_del()
      
      We can factorize all the synchronize_net() to a single one,
      right before freeing per-cpu memory.
      
      Fixes: 5198d545 ("net: remove napi_hash_del() from driver-facing API")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20201124203822.1360107-1-eric.dumazet@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2543a600
    • Antonio Borneo's avatar
      net: stmmac: fix incorrect merge of patch upstream · 12a8fe56
      Antonio Borneo authored
      Commit 75792624 ("net: stmmac: add flexible PPS to dwmac
      4.10a") was intended to modify the struct dwmac410_ops, but it got
      somehow badly merged and modified the struct dwmac4_ops.
      
      Revert the modification in struct dwmac4_ops and re-apply it
      properly in struct dwmac410_ops.
      
      Fixes: 75792624 ("net: stmmac: add flexible PPS to dwmac 4.10a")
      Signed-off-by: default avatarAntonio Borneo <antonio.borneo@st.com>
      Reported-by: default avatarAhmad Fatoum <a.fatoum@pengutronix.de>
      Link: https://lore.kernel.org/r/20201124223729.886992-1-antonio.borneo@st.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      12a8fe56
    • Wang Hai's avatar
      ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init · e255e11e
      Wang Hai authored
      kmemleak report a memory leak as follows:
      
      unreferenced object 0xffff8880059c6a00 (size 64):
        comm "ip", pid 23696, jiffies 4296590183 (age 1755.384s)
        hex dump (first 32 bytes):
          20 01 00 10 00 00 00 00 00 00 00 00 00 00 00 00   ...............
          1c 00 00 00 00 00 00 00 00 00 00 00 07 00 00 00  ................
        backtrace:
          [<00000000aa4e7a87>] ip6addrlbl_add+0x90/0xbb0
          [<0000000070b8d7f1>] ip6addrlbl_net_init+0x109/0x170
          [<000000006a9ca9d4>] ops_init+0xa8/0x3c0
          [<000000002da57bf2>] setup_net+0x2de/0x7e0
          [<000000004e52d573>] copy_net_ns+0x27d/0x530
          [<00000000b07ae2b4>] create_new_namespaces+0x382/0xa30
          [<000000003b76d36f>] unshare_nsproxy_namespaces+0xa1/0x1d0
          [<0000000030653721>] ksys_unshare+0x3a4/0x780
          [<0000000007e82e40>] __x64_sys_unshare+0x2d/0x40
          [<0000000031a10c08>] do_syscall_64+0x33/0x40
          [<0000000099df30e7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      We should free all rules when we catch an error in ip6addrlbl_net_init().
      otherwise a memory leak will occur.
      
      Fixes: 2a8cc6c8 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarWang Hai <wanghai38@huawei.com>
      Link: https://lore.kernel.org/r/20201124071728.8385-1-wanghai38@huawei.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e255e11e
    • Jakub Kicinski's avatar
      Documentation: netdev-FAQ: suggest how to post co-dependent series · 6f7a1f9c
      Jakub Kicinski authored
      Make an explicit suggestion how to post user space side of kernel
      patches to avoid reposts when patchwork groups the wrong patches.
      
      v2: mention the cases unlike iproute2 explicitly
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6f7a1f9c
    • Jakub Kicinski's avatar
      Merge tag 'batadv-net-pullrequest-20201124' of git://git.open-mesh.org/linux-merge · 26c89965
      Jakub Kicinski authored
      Simon Wunderlich says:
      
      ====================
      Here is a batman-adv bugfix:
      
       - set module owner to THIS_MODULE, by Taehee Yoo
      
      * tag 'batadv-net-pullrequest-20201124' of git://git.open-mesh.org/linux-merge:
        batman-adv: set .owner to THIS_MODULE
      ====================
      
      Link: https://lore.kernel.org/r/20201124134417.17269-1-sw@simonwunderlich.deSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      26c89965
    • Jakub Kicinski's avatar
      Merge branch 'ibmvnic-null-pointer-dereference' · 49d66ed8
      Jakub Kicinski authored
      Lijun Pan says:
      
      ====================
      ibmvnic: null pointer dereference
      
      Fix two NULL pointer dereference crash issues.
      Improve module removal procedure.
      ====================
      
      Link: https://lore.kernel.org/r/20201123193547.57225-1-ljp@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      49d66ed8
    • Lijun Pan's avatar
      ibmvnic: enhance resetting status check during module exit · 3ada2881
      Lijun Pan authored
      Based on the discussion with Sukadev Bhattiprolu and Dany Madden,
      we believe that checking adapter->resetting bit is preferred
      since RESETTING state flag is not as strict as resetting bit.
      RESETTING state flag is removed since it is verbose now.
      
      Fixes: 7d7195a0 ("ibmvnic: Do not process device remove during device reset")
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      3ada2881
    • Lijun Pan's avatar
      ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq · 0e435bef
      Lijun Pan authored
      crq->msgs could be NULL if the previous reset did not complete after
      freeing crq->msgs. Check for NULL before dereferencing them.
      
      Snippet of call trace:
      ...
      ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ
      ibmvnic 30000003 env3 (unregistering): Releasing CRQ
      BUG: Kernel NULL pointer dereference on read at 0x00000000
      Faulting instruction address: 0xc0000000000c1a30
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
      Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic]
      CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G            E     5.10.0-rc1+ #12
      Workqueue: events __ibmvnic_reset [ibmvnic]
      NIP:  c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400
      REGS: c00000000d05b7a0 TRAP: 0380   Tainted: G            E      (5.10.0-rc1+)
      MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002480  XER: 20040000
      CFAR: c0000000000c19ec IRQMASK: 0
      GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000
      GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2
      GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280
      GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002
      GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550
      GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800
      NIP [c0000000000c1a30] memset+0x68/0x104
      LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic]
      Call Trace:
      [c00000000d05ba30] [0000000000000800] 0x800 (unreliable)
      [c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic]
      [c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic]
      [c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800
      [c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520
      [c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0
      [c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c
      
      Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol")
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0e435bef
    • Lijun Pan's avatar
      ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues · a0faaa27
      Lijun Pan authored
      adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset
      did not complete after freeing sub crqs. Check for NULL before
      dereferencing them.
      
      Snippet of call trace:
      ibmvnic 30000006 env6: Releasing sub-CRQ
      ibmvnic 30000006 env6: Releasing CRQ
      ...
      ibmvnic 30000006 env6: Got Control IP offload Response
      ibmvnic 30000006 env6: Re-setting tx_scrq[0]
      BUG: Kernel NULL pointer dereference on read at 0x00000000
      Faulting instruction address: 0xc008000003dea7cc
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
      Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
      CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G        W         5.8.0+ #4
      Workqueue: events __ibmvnic_reset [ibmvnic]
      NIP:  c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000
      REGS: c0000007ef7db860 TRAP: 0380   Tainted: G        W          (5.8.0+)
      MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28002422  XER: 0000000d
      CFAR: c000000000bd9520 IRQMASK: 0
      GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00
      GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010
      GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000
      GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7
      GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048
      GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00
      NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic]
      LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic]
      Call Trace:
      [c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable)
      [c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic]
      [c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800
      [c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520
      [c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0
      [c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74
      
      Fixes: 57a49436 ("ibmvnic: Reset sub-crqs during driver reset")
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      a0faaa27
    • Jakub Kicinski's avatar
      Merge branch 'fixes-for-ena-driver' · 5fc145f1
      Jakub Kicinski authored
      Shay Agroskin says:
      
      ====================
      Fixes for ENA driver
      
      - fix wrong data offset on machines that support rx offset
      - work-around Intel iommu issue
      - fix out of bound access when request id is wrong
      ====================
      
      Link: https://lore.kernel.org/r/20201123190859.21298-1-shayagr@amazon.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5fc145f1
    • Shay Agroskin's avatar
      net: ena: fix packet's addresses for rx_offset feature · 1396d314
      Shay Agroskin authored
      This patch fixes two lines in which the rx_offset received by the device
      wasn't taken into account:
      
      - prefetch function:
      	In our driver the copied data would reside in
      	rx_info->page + rx_headroom + rx_offset
      
      	so the prefetch function is changed accordingly.
      
      - setting page_offset to zero for descriptors > 1:
      	for every descriptor but the first, the rx_offset is zero. Hence
      	the page_offset value should be set to rx_headroom.
      
      	The previous implementation changed the value of rx_info after
      	the descriptor was added to the SKB (essentially providing wrong
      	page offset).
      
      Fixes: 68f236df ("net: ena: add support for the rx offset feature")
      Signed-off-by: default avatarShay Agroskin <shayagr@amazon.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      1396d314
    • Shay Agroskin's avatar
      net: ena: set initial DMA width to avoid intel iommu issue · 09323b3b
      Shay Agroskin authored
      The ENA driver uses the readless mechanism, which uses DMA, to find
      out what the DMA mask is supposed to be.
      
      If DMA is used without setting the dma_mask first, it causes the
      Intel IOMMU driver to think that ENA is a 32-bit device and therefore
      disables IOMMU passthrough permanently.
      
      This patch sets the dma_mask to be ENA_MAX_PHYS_ADDR_SIZE_BITS=48
      before readless initialization in
      ena_device_init()->ena_com_mmio_reg_read_request_init(),
      which is large enough to workaround the intel_iommu issue.
      
      DMA mask is set again to the correct value after it's received from the
      device after readless is initialized.
      
      The patch also changes the driver to use dma_set_mask_and_coherent()
      function instead of the two pci_set_dma_mask() and
      pci_set_consistent_dma_mask() ones. Both methods achieve the same
      effect.
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: default avatarMike Cui <mikecui@amazon.com>
      Signed-off-by: default avatarArthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: default avatarShay Agroskin <shayagr@amazon.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      09323b3b
    • Shay Agroskin's avatar
      net: ena: handle bad request id in ena_netdev · 5b7022cf
      Shay Agroskin authored
      After request id is checked in validate_rx_req_id() its value is still
      used in the line
      	rx_ring->free_ids[next_to_clean] =
      					rx_ring->ena_bufs[i].req_id;
      even if it was found to be out-of-bound for the array free_ids.
      
      The patch moves the request id to an earlier stage in the napi routine and
      makes sure its value isn't used if it's found out-of-bounds.
      
      Fixes: 30623e1e ("net: ena: avoid memory access violation by validating req_id properly")
      Signed-off-by: default avatarIdo Segev <idose@amazon.com>
      Signed-off-by: default avatarShay Agroskin <shayagr@amazon.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5b7022cf
  3. 24 Nov, 2020 10 commits
  4. 23 Nov, 2020 1 commit
  5. 21 Nov, 2020 11 commits
    • Jakub Kicinski's avatar
      Merge branch 'ibmvnic-fixes-in-reset-path' · f9b03653
      Jakub Kicinski authored
      Lijun Pan says:
      
      ====================
      ibmvnic: fixes in reset path
      
      Patch 1/3 and 2/3 notify peers in failover and migration reset.
      Patch 3/3 skips timeout reset if it is already resetting.
      ====================
      
      Link: https://lore.kernel.org/r/20201120224013.46891-1-ljp@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f9b03653
    • Lijun Pan's avatar
      ibmvnic: skip tx timeout reset while in resetting · 855a631a
      Lijun Pan authored
      Sometimes it takes longer than 5 seconds (watchdog timeout) to complete
      failover, migration, and other resets. In stead of scheduling another
      timeout reset, we wait for the current one to complete.
      Suggested-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Reviewed-by: default avatarDany Madden <drt@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      855a631a
    • Lijun Pan's avatar
      ibmvnic: notify peers when failover and migration happen · 98025bce
      Lijun Pan authored
      Commit 61d3e1d9 ("ibmvnic: Remove netdev notify for failover resets")
      excluded the failover case for notify call because it said
      netdev_notify_peers() can cause network traffic to stall or halt.
      Current testing does not show network traffic stall
      or halt because of the notify call for failover event.
      netdev_notify_peers may be used when a device wants to inform the
      rest of the network about some sort of a reconfiguration
      such as failover or migration.
      
      It is unnecessary to call that in other events like
      FATAL, NON_FATAL, CHANGE_PARAM, and TIMEOUT resets
      since in those scenarios the hardware does not change.
      If the driver must do a hard reset, it is necessary to notify peers.
      
      Fixes: 61d3e1d9 ("ibmvnic: Remove netdev notify for failover resets")
      Suggested-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Suggested-by: default avatarPradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
      Signed-off-by: default avatarDany Madden <drt@linux.ibm.com>
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      98025bce
    • Lijun Pan's avatar
      ibmvnic: fix call_netdevice_notifiers in do_reset · 83935975
      Lijun Pan authored
      When netdev_notify_peers was substituted in
      commit 986103e7 ("net/ibmvnic: Fix RTNL deadlock during device reset"),
      call_netdevice_notifiers(NETDEV_RESEND_IGMP, dev) was missed.
      Fix it now.
      
      Fixes: 986103e7 ("net/ibmvnic: Fix RTNL deadlock during device reset")
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Reviewed-by: default avatarDany Madden <drt@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      83935975
    • Jens Axboe's avatar
      tun: honor IOCB_NOWAIT flag · 5aac0390
      Jens Axboe authored
      tun only checks the file O_NONBLOCK flag, but it should also be checking
      the iocb IOCB_NOWAIT flag. Any fops using ->read/write_iter() should check
      both, otherwise it breaks users that correctly expect O_NONBLOCK semantics
      if IOCB_NOWAIT is set.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Link: https://lore.kernel.org/r/e9451860-96cc-c7c7-47b8-fe42cadd5f4c@kernel.dkSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5aac0390
    • Julian Wiedmann's avatar
      net/af_iucv: set correct sk_protocol for child sockets · c5dab094
      Julian Wiedmann authored
      Child sockets erroneously inherit their parent's sk_type (ie. SOCK_*),
      instead of the PF_IUCV protocol that the parent was created with in
      iucv_sock_create().
      
      We're currently not using sk->sk_protocol ourselves, so this shouldn't
      have much impact (except eg. getting the output in skb_dump() right).
      
      Fixes: eac3731b ("[S390]: Add AF_IUCV socket support")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Link: https://lore.kernel.org/r/20201120100657.34407-1-jwi@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c5dab094
    • Yves-Alexis Perez's avatar
      usbnet: ipheth: fix connectivity with iOS 14 · f33d9e2b
      Yves-Alexis Perez authored
      Starting with iOS 14 released in September 2020, connectivity using the
      personal hotspot USB tethering function of iOS devices is broken.
      
      Communication between the host and the device (for example ICMP traffic
      or DNS resolution using the DNS service running in the device itself)
      works fine, but communication to endpoints further away doesn't work.
      
      Investigation on the matter shows that no UDP and ICMP traffic from the
      tethered host is reaching the Internet at all. For TCP traffic there are
      exchanges between tethered host and server but packets are modified in
      transit leading to impossible communication.
      
      After some trials Matti Vuorela discovered that reducing the URB buffer
      size by two bytes restored the previous behavior. While a better
      solution might exist to fix the issue, since the protocol is not
      publicly documented and considering the small size of the fix, let's do
      that.
      Tested-by: default avatarMatti Vuorela <matti.vuorela@bitfactor.fi>
      Signed-off-by: default avatarYves-Alexis Perez <corsac@corsac.net>
      Link: https://lore.kernel.org/linux-usb/CAAn0qaXmysJ9vx3ZEMkViv_B19ju-_ExN8Yn_uSefxpjS6g4Lw@mail.gmail.com/
      Link: https://github.com/libimobiledevice/libimobiledevice/issues/1038
      Link: https://lore.kernel.org/r/20201119172439.94988-1-corsac@corsac.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f33d9e2b
    • Tom Seewald's avatar
      cxgb4: Fix build failure when CONFIG_TLS=m · 659fbdcf
      Tom Seewald authored
      After commit 9d2e5e9e ("cxgb4/ch_ktls: decrypted bit is not enough")
      whenever CONFIG_TLS=m and CONFIG_CHELSIO_T4=y, the following build
      failure occurs:
      
      ld: drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.o: in function
      `cxgb_select_queue':
      cxgb4_main.c:(.text+0x2dac): undefined reference to `tls_validate_xmit_skb'
      
      Fix this by ensuring that if TLS is set to be a module, CHELSIO_T4 will
      also be compiled as a module. As otherwise the cxgb4 driver will not be
      able to access TLS' symbols.
      
      Fixes: 9d2e5e9e ("cxgb4/ch_ktls: decrypted bit is not enough")
      Signed-off-by: default avatarTom Seewald <tseewald@gmail.com>
      Link: https://lore.kernel.org/r/20201120192528.615-1-tseewald@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      659fbdcf
    • Jamie Iles's avatar
      bonding: wait for sysfs kobject destruction before freeing struct slave · b9ad3e9f
      Jamie Iles authored
      syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a
      struct slave device could result in the following splat:
      
        kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000)
        bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
        ------------[ cut here ]------------
        ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline]
        ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600
        WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S                5.9.0-rc8+ #96
        Hardware name: linux,dummy-virt (DT)
        Workqueue: netns cleanup_net
        Call trace:
         dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239
         show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x174/0x1f8 lib/dump_stack.c:118
         panic+0x360/0x7a0 kernel/panic.c:231
         __warn+0x244/0x2ec kernel/panic.c:600
         report_bug+0x240/0x398 lib/bug.c:198
         bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974
         call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322
         brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329
         do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864
         el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65
         el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93
         el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594
         debug_print_object+0x180/0x240 lib/debugobjects.c:485
         __debug_check_no_obj_freed lib/debugobjects.c:967 [inline]
         debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998
         slab_free_hook mm/slub.c:1536 [inline]
         slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577
         slab_free mm/slub.c:3138 [inline]
         kfree+0x13c/0x460 mm/slub.c:4119
         bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492
         __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190
         bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline]
         bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420
         notifier_call_chain+0xf0/0x200 kernel/notifier.c:83
         __raw_notifier_call_chain kernel/notifier.c:361 [inline]
         raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368
         call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033
         call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
         call_netdevice_notifiers net/core/dev.c:2059 [inline]
         rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347
         unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509
         unregister_netdevice_many net/core/dev.c:10508 [inline]
         default_device_exit_batch+0x294/0x338 net/core/dev.c:10992
         ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189
         cleanup_net+0x44c/0x888 net/core/net_namespace.c:603
         process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269
         worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415
         kthread+0x390/0x498 kernel/kthread.c:292
         ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925
      
      This is a potential use-after-free if the sysfs nodes are being accessed
      whilst removing the struct slave, so wait for the object destruction to
      complete before freeing the struct slave itself.
      
      Fixes: 07699f9a ("bonding: add sysfs /slave dir for bond slave devices.")
      Fixes: a068aab4 ("bonding: Fix reference count leak in bond_sysfs_slave_add.")
      Cc: Qiushi Wu <wu000273@umn.edu>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarJamie Iles <jamie@nuviainc.com>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20201120142827.879226-1-jamie@nuviainc.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      b9ad3e9f
    • Jakub Kicinski's avatar
      Merge branch 's390-qeth-fixes-2020-11-20' · 207d0bfc
      Jakub Kicinski authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: fixes 2020-11-20
      
      This brings several fixes for qeth's af_iucv-specific code paths.
      
      Also one fix by Alexandra for the recently added BR_LEARNING_SYNC
      support. We want to trust the feature indication bit, so that HW can
      mask it out if there's any issues on their end.
      ====================
      
      Link: https://lore.kernel.org/r/20201120090939.101406-1-jwi@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      207d0bfc
    • Julian Wiedmann's avatar
      s390/qeth: fix tear down of async TX buffers · 7ed10e16
      Julian Wiedmann authored
      When qeth_iqd_tx_complete() detects that a TX buffer requires additional
      async completion via QAOB, it might fail to replace the queue entry's
      metadata (and ends up triggering recovery).
      
      Assume now that the device gets torn down, overruling the recovery.
      If the QAOB notification then arrives before the tear down has
      sufficiently progressed, the buffer state is changed to
      QETH_QDIO_BUF_HANDLED_DELAYED by qeth_qdio_handle_aob().
      
      The tear down code calls qeth_drain_output_queue(), where
      qeth_cleanup_handled_pending() will then attempt to replace such a
      buffer _again_. If it succeeds this time, the buffer ends up dangling in
      its replacement's ->next_pending list ... where it will never be freed,
      since there's no further call to qeth_cleanup_handled_pending().
      
      But the second attempt isn't actually needed, we can simply leave the
      buffer on the queue and re-use it after a potential recovery has
      completed. The qeth_clear_output_buffer() in qeth_drain_output_queue()
      will ensure that it's in a clean state again.
      
      Fixes: 72861ae7 ("qeth: recovery through asynchronous delivery")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      7ed10e16