1. 03 Jun, 2020 40 commits
    • bonding: Fix reference count leak in bond_sysfs_slave_add. · 8a37da13
      Qiushi Wu authored
      commit a068aab4 upstream.
      
      kobject_init_and_add() takes a reference even when it fails.
      If this function returns an error, kobject_put() must be called to
      properly clean up the memory associated with the object. Previous
      commit "b8eb7183" fixed a similar problem.
      
      Fixes: 07699f9a ("bonding: add sysfs /slave dir for bond slave devices.")
      Signed-off-by: Qiushi Wu <wu000273@umn.edu>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: chelsio/chtls: properly set tp->lsndtime · 3219344f
      Eric Dumazet authored
      commit a4976a3e upstream.
      
      TCP tp->lsndtime unit/base is tcp_jiffies32, not tcp_time_stamp()
      
      Fixes: 36bedb3f ("crypto: chtls - Inline TLS record Tx")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ayush Sawal <ayush.sawal@chelsio.com>
      Cc: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • qlcnic: fix missing release in qlcnic_83xx_interrupt_test. · 79ed4c83
      Qiushi Wu authored
      commit 15c97385 upstream.
      
      In qlcnic_83xx_interrupt_test(), the resources allocated by
      qlcnic_83xx_diag_alloc_res() are not released by
      qlcnic_83xx_diag_free_res() when the subsequent call to
      qlcnic_alloc_mbx_args() fails. Fix this issue by adding
      a jump target "fail_mbx_args" and jumping to this new target
      when qlcnic_alloc_mbx_args() fails.
      
      Fixes: b6b4316c ("qlcnic: Handle qlcnic_alloc_mbx_args() failure")
      Signed-off-by: Qiushi Wu <wu000273@umn.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xsk: Add overflow check for u64 division, stored into u32 · 03cfd4e0
      Björn Töpel authored
      commit b16a87d0 upstream.
      
      The npgs member of struct xdp_umem is a u32 entity, and stores the
      number of pages the UMEM consumes. The calculation of npgs
      
        npgs = size / PAGE_SIZE
      
      can overflow.
      
      To avoid overflow scenarios, the division is now first stored in a
      u64, and the result is verified to fit into 32 bits.

      An alternative would be storing npgs as a u64; however, this
      wastes memory and allows an unrealistically large packet area.
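
The check described above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `umem_page_count()` and `SKETCH_PAGE_SIZE` are hypothetical names, and returning `-EINVAL` on overflow is an assumption rather than something this commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size for this sketch */

/* Hypothetical helper: do the division in 64 bits first, then reject
 * any page count that does not fit into a 32-bit field. */
int umem_page_count(uint64_t size, uint32_t *npgs)
{
	uint64_t npgs64 = size / SKETCH_PAGE_SIZE;

	assert(npgs != NULL);

	if (npgs64 > UINT32_MAX)
		return -EINVAL; /* assumed error code */

	*npgs = (uint32_t)npgs64;
	return 0;
}
```

Assigning `size / SKETCH_PAGE_SIZE` straight into a `uint32_t` would silently drop the upper bits; keeping the intermediate value in 64 bits is what makes the range check possible.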
      
      Fixes: c0c77d8f ("xsk: add user memory registration support sockopt")
      Reported-by: "Minh Bùi Quang" <minhquangbui99@gmail.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Link: https://lore.kernel.org/bpf/CACtPs=GGvV-_Yj6rbpzTVnopgi5nhMoCcTkSkYrJHGQHJWFZMQ@mail.gmail.com/
      Link: https://lore.kernel.org/bpf/20200525080400.13195-1-bjorn.topel@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bnxt_en: Fix accumulation of bp->net_stats_prev. · a4c9756a
      Michael Chan authored
      commit b8056e84 upstream.
      
      We have logic to maintain network counters across resets by storing
      the counters in bp->net_stats_prev before reset.  But not all resets
      will clear the counters.  Certain resets that don't need to change
      the number of rings do not clear the counters.  The current logic
      accumulates the counters before all resets, causing big jumps in
      the counters after some resets, such as ethtool -G.
      
      Fix it by only accumulating the counters during reset if the irq_re_init
      parameter is set.  The parameter signifies that all rings and interrupts
      will be reset and that means that the counters will also be reset.

      Reported-by: Vijayendra Suman <vijayendra.suman@oracle.com>
      Fixes: b8875ca3 ("bnxt_en: Save ring statistics before reset.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • esp6: get the right proto for transport mode in esp6_gso_encap · e8f7bd7b
      Xin Long authored
      commit 3c96ec56 upstream.
      
      For transport mode, when ipv6 nexthdr is set, the packet format might
      be like:
      
          ----------------------------------------------------
          |        | dest |     |     |      |  ESP    | ESP |
          | IP6 hdr| opts.| ESP | TCP | Data | Trailer | ICV |
          ----------------------------------------------------
      
      What esp6_gso_encap() needs for x-proto is the proto that
      will be set in ESP nexthdr. So it should skip all ipv6 nexthdrs and
      get the real transport protocol. Otherwise, the wrong proto number
      will be set into ESP nexthdr.
      
      This patch is to skip all ipv6 nexthdrs by calling ipv6_skip_exthdr()
      in esp6_gso_encap().
      
      Fixes: 7862b405 ("esp: Add gso handlers for esp4 and esp6")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code · 9fb6b81e
      Pablo Neira Ayuso authored
      commit 4c559f15 upstream.
      
      Dan Carpenter says: "Smatch complains that the value for "cmd" comes
      from the network and can't be trusted."
      
      Add pptp_msg_name() helper function that checks for the array boundary.
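
A bounds-checked name lookup of this kind might look like the sketch below. The table contents and the `sketch_` names are hypothetical stand-ins and are not taken from the kernel source. The sketch also assumes that out-of-range ids fall back to the first table entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message table; entry 0 doubles as the fallback name. */
static const char *const sketch_msg_names[] = {
	"UNKNOWN_MESSAGE",
	"START_SESSION_REQUEST",
	"START_SESSION_REPLY",
	"STOP_SESSION_REQUEST",
};

/* Highest valid index into the table. */
#define SKETCH_MSG_MAX \
	(sizeof(sketch_msg_names) / sizeof(sketch_msg_names[0]) - 1)

static_assert(SKETCH_MSG_MAX <= UINT16_MAX,
	      "table must be indexable by a 16-bit message id");

/* Map an untrusted message id to a printable name without ever
 * indexing past the end of the table. */
const char *sketch_msg_name(uint16_t msg)
{
	if (msg > SKETCH_MSG_MAX)
		return sketch_msg_names[0];
	return sketch_msg_names[msg];
}
```

Debug statements can then print `sketch_msg_name(cmd)` instead of indexing the array with a value that came from the network.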
      
      Fixes: f09943fe ("[NETFILTER]: nf_conntrack/nf_nat: add PPTP helper port")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nfnetlink_cthelper: unbreak userspace helper support · e70fb3ef
      Pablo Neira Ayuso authored
      commit 703acd70 upstream.
      
      Restore helper data size initialization and fix memcopy of the helper
      data size.
      
      Fixes: 157ffffe ("netfilter: nfnetlink_cthelper: reject too large userspace allocation requests")
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ipset: Fix subcounter update skip · 37bc21bb
      Phil Sutter authored
      commit a164b95a upstream.
      
      If IPSET_FLAG_SKIP_SUBCOUNTER_UPDATE is set, the user requested that
      counters in sub sets not be updated. Therefore
      IPSET_FLAG_SKIP_COUNTER_UPDATE must be set, not unset.
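
The flag handling described above comes down to setting a bit rather than clearing it. The sketch below uses hypothetical flag values and a hypothetical helper name; the real definitions live in the ipset headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_SKIP_COUNTER_UPDATE    (1u << 0)
#define SKETCH_SKIP_SUBCOUNTER_UPDATE (1u << 1)

static_assert((SKETCH_SKIP_COUNTER_UPDATE & SKETCH_SKIP_SUBCOUNTER_UPDATE) == 0,
	      "the two flags must be distinct bits");

/* Compute the flags handed down to member sets: when sub-set counter
 * updates are to be skipped, the plain skip flag has to be set (|=).
 * Clearing it (&= ~) would do the opposite of what was requested. */
uint32_t sketch_subset_flags(uint32_t cmdflags)
{
	if (cmdflags & SKETCH_SKIP_SUBCOUNTER_UPDATE)
		cmdflags |= SKETCH_SKIP_COUNTER_UPDATE;
	return cmdflags;
}
```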
      
      Fixes: 6e01781d ("netfilter: ipset: set match: add support to match the counters")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nft_reject_bridge: enable reject with bridge vlan · f7d80955
      Michael Braun authored
      commit e9c284ec upstream.
      
      Currently, using the bridge reject target with tagged packets
      results in untagged packets being sent back.
      
      Fix this by mirroring the vlan id as well.
      
      Fixes: 85f5b308 ("netfilter: bridge: add reject support")
      Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ip_vti: receive ipip packet by calling ip_tunnel_rcv · 60efd2f8
      Xin Long authored
      commit 976eba8a upstream.
      
      In Commit dd9ee344 ("vti4: Fix a ipip packet processing bug in
      'IPCOMP' virtual tunnel"), it tries to receive IPIP packets in vti
      by calling xfrm_input(). This case happens when a small packet or
      frag sent by peer is too small to get compressed.
      
      However, xfrm_input() will still take the IPCOMP path, where the skb
      sec_path is set but never dropped, although that should have been done
      in vti_ipcomp4_protocol.cb_handler(vti_rcv_cb), as it's not an
      ipcomp4 packet. As a result, the packet can never pass
      xfrm4_policy_check() in the upper protocol rcv functions.
      
      So this patch is to call ip_tunnel_rcv() to process IPIP packets
      instead.
      
      Fixes: dd9ee344 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vti4: eliminated some duplicate code. · 0b7d0ff2
      Jeremy Sowden authored
      commit f981c57f upstream.
      
      The ipip tunnel introduced in commit dd9ee344 ("vti4: Fix a ipip
      packet processing bug in 'IPCOMP' virtual tunnel") largely duplicated
      the existing vti_input and vti_recv functions.  Refactored to
      deduplicate the common code.

      Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: fix error in comment · e6194d4a
      Antony Antony authored
      commit 29e42766 upstream.
      
      s/xfrm_state_offload/xfrm_user_offload/
      
      Fixes: d77e38e6 ("xfrm: Add an IPsec hardware offloading API")
      Signed-off-by: Antony Antony <antony@phenome.org>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: fix a NULL-ptr deref in xfrm_local_error · ef22ddba
      Xin Long authored
      commit f6a23d85 upstream.
      
      This patch is to fix a crash:
      
        [ ] kasan: GPF could be caused by NULL-ptr deref or user memory access
        [ ] general protection fault: 0000 [#1] SMP KASAN PTI
        [ ] RIP: 0010:ipv6_local_error+0xac/0x7a0
        [ ] Call Trace:
        [ ]  xfrm6_local_error+0x1eb/0x300
        [ ]  xfrm_local_error+0x95/0x130
        [ ]  __xfrm6_output+0x65f/0xb50
        [ ]  xfrm6_output+0x106/0x46f
        [ ]  udp_tunnel6_xmit_skb+0x618/0xbf0 [ip6_udp_tunnel]
        [ ]  vxlan_xmit_one+0xbc6/0x2c60 [vxlan]
        [ ]  vxlan_xmit+0x6a0/0x4276 [vxlan]
        [ ]  dev_hard_start_xmit+0x165/0x820
        [ ]  __dev_queue_xmit+0x1ff0/0x2b90
        [ ]  ip_finish_output2+0xd3e/0x1480
        [ ]  ip_do_fragment+0x182d/0x2210
        [ ]  ip_output+0x1d0/0x510
        [ ]  ip_send_skb+0x37/0xa0
        [ ]  raw_sendmsg+0x1b4c/0x2b80
        [ ]  sock_sendmsg+0xc0/0x110
      
      This occurred when sending a v4 skb over vxlan6 over ipsec, in which case
      skb->protocol == htons(ETH_P_IPV6) while skb->sk->sk_family == AF_INET in
      xfrm_local_error(). Then it will go to xfrm6_local_error() where it tries
      to get ipv6 info from an ipv4 sk.
      
      This issue was actually fixed by Commit 628e341f ("xfrm: make local
      error reporting more robust"), but brought back by Commit 844d4874
      ("xfrm: choose protocol family by skb protocol").
      
      So to fix it, we should call xfrm6_local_error() only when skb->protocol
      is htons(ETH_P_IPV6) and skb->sk->sk_family is AF_INET6.
      
      Fixes: 844d4874 ("xfrm: choose protocol family by skb protocol")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: fix a warning in xfrm_policy_insert_list · 3aa98483
      Xin Long authored
      commit ed17b8d3 upstream.
      
      This warning can be triggered simply by:
      
        # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \
          priority 1 mark 0 mask 0x10  #[1]
        # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \
          priority 2 mark 0 mask 0x1   #[2]
        # ip xfrm policy update src 192.168.1.1/24 dst 192.168.1.2/24 dir in \
          priority 2 mark 0 mask 0x10  #[3]
      
      Then dmesg shows:
      
        [ ] WARNING: CPU: 1 PID: 7265 at net/xfrm/xfrm_policy.c:1548
        [ ] RIP: 0010:xfrm_policy_insert_list+0x2f2/0x1030
        [ ] Call Trace:
        [ ]  xfrm_policy_inexact_insert+0x85/0xe50
        [ ]  xfrm_policy_insert+0x4ba/0x680
        [ ]  xfrm_add_policy+0x246/0x4d0
        [ ]  xfrm_user_rcv_msg+0x331/0x5c0
        [ ]  netlink_rcv_skb+0x121/0x350
        [ ]  xfrm_netlink_rcv+0x66/0x80
        [ ]  netlink_unicast+0x439/0x630
        [ ]  netlink_sendmsg+0x714/0xbf0
        [ ]  sock_sendmsg+0xe2/0x110
      
      The issue was introduced by Commit 7cb8a939 ("xfrm: Allow inserting
      policies with matching mark and different priorities"). After that, the
      policies [1] and [2] would be able to be added with different priorities.
      
      However, policy [3] will actually match both [1] and [2]. Policy [1]
      was matched due to the 1st 'return true' in xfrm_policy_mark_match(),
      and policy [2] was matched due to the 2nd 'return true' in there. It
      caused WARN_ON() in xfrm_policy_insert_list().
      
      This patch fixes it by treating only policies with the same mark value
      and priority as the same policy in xfrm_policy_mark_match().
      
      Thanks to Yuehaibing, we could make this fix better.
      
      v1->v2:
        - check policy->mark.v == pol->mark.v only without mask.
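
The effect on the three policies from the reproducer can be traced with a small user-space model. The struct and function names below are hypothetical. The conditions in the "old" check are an assumption reconstructed for illustration (the message only says it had two `return true` branches). The "new" check follows the v2 note: compare the mark value without the mask, plus the priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the fields that matter here. */
struct sketch_policy {
	uint32_t priority;
	uint32_t mark_v; /* mark value */
	uint32_t mark_m; /* mark mask  */
};

/* Assumed shape of the check before the fix. */
bool sketch_mark_match_old(const struct sketch_policy *policy,
			   const struct sketch_policy *pol)
{
	uint32_t mark;

	assert(policy != NULL && pol != NULL);
	mark = policy->mark_v & policy->mark_m;

	if (policy->mark_v == pol->mark_v && policy->mark_m == pol->mark_m)
		return true; /* 1st 'return true' */
	if ((mark & pol->mark_m) == pol->mark_v &&
	    policy->priority == pol->priority)
		return true; /* 2nd 'return true' */
	return false;
}

/* Check after the fix: same mark value and same priority only. */
bool sketch_mark_match_new(const struct sketch_policy *policy,
			   const struct sketch_policy *pol)
{
	assert(policy != NULL && pol != NULL);
	return policy->mark_v == pol->mark_v &&
	       policy->priority == pol->priority;
}

/* Policies [1], [2] and [3] from the reproducer. */
const struct sketch_policy sketch_p1 = { .priority = 1, .mark_v = 0, .mark_m = 0x10 };
const struct sketch_policy sketch_p2 = { .priority = 2, .mark_v = 0, .mark_m = 0x1 };
const struct sketch_policy sketch_p3 = { .priority = 2, .mark_v = 0, .mark_m = 0x10 };
```

Under the assumed old check, policy [3] matches both [1] and [2]; under the new check it matches only [2], so an update replaces a single existing entry.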
      
      Fixes: 7cb8a939 ("xfrm: Allow inserting policies with matching mark and different priorities")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm interface: fix oops when deleting a x-netns interface · a1b98e3b
      Nicolas Dichtel authored
      commit c95c5f58 upstream.
      
      Here are the steps to reproduce the problem:
      ip netns add foo
      ip netns add bar
      ip -n foo link add xfrmi0 type xfrm dev lo if_id 42
      ip -n foo link set xfrmi0 netns bar
      ip netns del foo
      ip netns del bar
      
      Which results in:
      [  186.686395] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bd3: 0000 [#1] SMP PTI
      [  186.687665] CPU: 7 PID: 232 Comm: kworker/u16:2 Not tainted 5.6.0+ #1
      [  186.688430] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      [  186.689420] Workqueue: netns cleanup_net
      [  186.689903] RIP: 0010:xfrmi_dev_uninit+0x1b/0x4b [xfrm_interface]
      [  186.690657] Code: 44 f6 ff ff 31 c0 5b 5d 41 5c 41 5d 41 5e c3 48 8d 8f c0 08 00 00 8b 05 ce 14 00 00 48 8b 97 d0 08 00 00 48 8b 92 c0 0e 00 00 <48> 8b 14 c2 48 8b 02 48 85 c0 74 19 48 39 c1 75 0c 48 8b 87 c0 08
      [  186.692838] RSP: 0018:ffffc900003b7d68 EFLAGS: 00010286
      [  186.693435] RAX: 000000000000000d RBX: ffff8881b0f31000 RCX: ffff8881b0f318c0
      [  186.694334] RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000246 RDI: ffff8881b0f31000
      [  186.695190] RBP: ffffc900003b7df0 R08: ffff888236c07740 R09: 0000000000000040
      [  186.696024] R10: ffffffff81fce1b8 R11: 0000000000000002 R12: ffffc900003b7d80
      [  186.696859] R13: ffff8881edcc6a40 R14: ffff8881a1b6e780 R15: ffffffff81ed47c8
      [  186.697738] FS:  0000000000000000(0000) GS:ffff888237dc0000(0000) knlGS:0000000000000000
      [  186.698705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  186.699408] CR2: 00007f2129e93148 CR3: 0000000001e0a000 CR4: 00000000000006e0
      [  186.700221] Call Trace:
      [  186.700508]  rollback_registered_many+0x32b/0x3fd
      [  186.701058]  ? __rtnl_unlock+0x20/0x3d
      [  186.701494]  ? arch_local_irq_save+0x11/0x17
      [  186.702012]  unregister_netdevice_many+0x12/0x55
      [  186.702594]  default_device_exit_batch+0x12b/0x150
      [  186.703160]  ? prepare_to_wait_exclusive+0x60/0x60
      [  186.703719]  cleanup_net+0x17d/0x234
      [  186.704138]  process_one_work+0x196/0x2e8
      [  186.704652]  worker_thread+0x1a4/0x249
      [  186.705087]  ? cancel_delayed_work+0x92/0x92
      [  186.705620]  kthread+0x105/0x10f
      [  186.706000]  ? __kthread_bind_mask+0x57/0x57
      [  186.706501]  ret_from_fork+0x35/0x40
      [  186.706978] Modules linked in: xfrm_interface nfsv3 nfs_acl auth_rpcgss nfsv4 nfs lockd grace fscache sunrpc button parport_pc parport serio_raw evdev pcspkr loop ext4 crc16 mbcache jbd2 crc32c_generic 8139too ide_cd_mod cdrom ide_gd_mod ata_generic ata_piix libata scsi_mod piix psmouse i2c_piix4 ide_core 8139cp i2c_core mii floppy
      [  186.710423] ---[ end trace 463bba18105537e5 ]---
      
      The problem is that x-netns xfrm interfaces are not removed when the link
      netns is removed. This later causes the oops when those interfaces are
      removed.
      
      Let's add a handler to remove all interfaces related to a netns when this
      netns is removed.
      
      Fixes: f203b76d ("xfrm: Add virtual xfrm interfaces")
      Reported-by: Christophe Gouault <christophe.gouault@6wind.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: call xfrm_output_gso when inner_protocol is set in xfrm_output · e41e9c53
      Xin Long authored
      commit a204aef9 upstream.
      
      A use-after-free crash can be triggered when sending big packets over
      vxlan over esp with esp offload enabled:
      
        [] BUG: KASAN: use-after-free in ipv6_gso_pull_exthdrs.part.8+0x32c/0x4e0
        [] Call Trace:
        []  dump_stack+0x75/0xa0
        []  kasan_report+0x37/0x50
        []  ipv6_gso_pull_exthdrs.part.8+0x32c/0x4e0
        []  ipv6_gso_segment+0x2c8/0x13c0
        []  skb_mac_gso_segment+0x1cb/0x420
        []  skb_udp_tunnel_segment+0x6b5/0x1c90
        []  inet_gso_segment+0x440/0x1380
        []  skb_mac_gso_segment+0x1cb/0x420
        []  esp4_gso_segment+0xae8/0x1709 [esp4_offload]
        []  inet_gso_segment+0x440/0x1380
        []  skb_mac_gso_segment+0x1cb/0x420
        []  __skb_gso_segment+0x2d7/0x5f0
        []  validate_xmit_skb+0x527/0xb10
        []  __dev_queue_xmit+0x10f8/0x2320 <---
        []  ip_finish_output2+0xa2e/0x1b50
        []  ip_output+0x1a8/0x2f0
        []  xfrm_output_resume+0x110e/0x15f0
        []  __xfrm4_output+0xe1/0x1b0
        []  xfrm4_output+0xa0/0x200
        []  iptunnel_xmit+0x5a7/0x920
        []  vxlan_xmit_one+0x1658/0x37a0 [vxlan]
        []  vxlan_xmit+0x5e4/0x3ec8 [vxlan]
        []  dev_hard_start_xmit+0x125/0x540
        []  __dev_queue_xmit+0x17bd/0x2320  <---
        []  ip6_finish_output2+0xb20/0x1b80
        []  ip6_output+0x1b3/0x390
        []  ip6_xmit+0xb82/0x17e0
        []  inet6_csk_xmit+0x225/0x3d0
        []  __tcp_transmit_skb+0x1763/0x3520
        []  tcp_write_xmit+0xd64/0x5fe0
        []  __tcp_push_pending_frames+0x8c/0x320
        []  tcp_sendmsg_locked+0x2245/0x3500
        []  tcp_sendmsg+0x27/0x40
      
      On the tx path of vxlan over esp, skb->inner_network_header is
      set in both vxlan_xmit() and xfrm4_tunnel_encap_add(), and the latter
      can overwrite the former. This causes skb_udp_tunnel_segment() to use
      a wrong skb->inner_network_header, and then the issue occurs.
      
      This patch is to fix it by calling xfrm_output_gso() instead when the
      inner_protocol is set, in which gso_segment of inner_protocol will be
      done first.
      
      While at it, also clean up some of the surrounding code.
      
      Fixes: 7862b405 ("esp: Add gso handlers for esp4 and esp6")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: allow to accept packets with ipv6 NEXTHDR_HOP in xfrm_input · 477ae702
      Xin Long authored
      commit afcaf61b upstream.
      
      For beet mode, when it's ipv6 inner address with nexthdrs set,
      the packet format might be:
      
          ----------------------------------------------------
          | outer  |     | dest |     |      |  ESP    | ESP |
          | IP hdr | ESP | opts.| TCP | Data | Trailer | ICV |
          ----------------------------------------------------
      
      The nexthdr from ESP could be NEXTHDR_HOP(0), so it should
      continue processing the packet when nexthdr returns 0 in
      xfrm_input(). Otherwise, when ipv6 nexthdr is set, the
      packet will be dropped.
      
      I don't see any error cases that nexthdr may return 0. So
      fix it by removing the check for nexthdr == 0.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • copy_xstate_to_kernel(): don't leave parts of destination uninitialized · 51c01770
      Al Viro authored
      commit 9e463654 upstream.
      
      copy the corresponding pieces of init_fpstate into the gaps instead.
      
      Cc: stable@kernel.org
      Tested-by: Alexander Potapenko <glider@google.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/dma: Fix max PFN arithmetic overflow on 32 bit systems · cfe8d761
      Alexander Dahl authored
      commit 88743470 upstream.
      
      The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
      4 294 967 296 or 0x100000000 which is no problem on 64 bit systems.
      The patch does not change the later overall result of 0x100000 for
      MAX_DMA32_PFN (after it has been shifted by PAGE_SHIFT). The new
      calculation yields the same result, but does not require 64 bit
      arithmetic.
      
      On 32 bit systems the old calculation suffers from an arithmetic
      overflow in that intermediate term in parentheses: 4UL, aka unsigned
      long int, is 4 bytes wide, so 0x100000000 does not fit and the
      parenthesized result is truncated to zero. The following right shift
      does not alter that, so MAX_DMA32_PFN evaluates to 0 on 32 bit systems.
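
The wrap-around can be reproduced on any host by doing the arithmetic in `uint32_t`, which stands in for a 4-byte `unsigned long`. In this sketch the `sketch_` names and the 12-bit page shift are assumptions. The shift-based replacement is one overflow-free form that yields the stated 0x100000, not necessarily the exact expression used in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

static_assert(SKETCH_PAGE_SHIFT < 32, "shift must be below the word width");

/* Old form, evaluated in 32-bit unsigned arithmetic: the product
 * 4 * 2^30 wraps to 0 before the shift is applied. */
uint32_t sketch_max_dma32_pfn_old(void)
{
	uint32_t bytes = (uint32_t)4 * 1024 * 1024 * 1024;

	return bytes >> SKETCH_PAGE_SHIFT;
}

/* Overflow-free form: never materialise 2^32, shift 1 by the
 * remaining number of bits instead. */
uint32_t sketch_max_dma32_pfn_new(void)
{
	return (uint32_t)1 << (32 - SKETCH_PAGE_SHIFT);
}
```

With 32-bit arithmetic the old form evaluates to 0, while the shift-based form gives 0x100000, the value the 64-bit calculation always produced.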
      
      That wrong value is a problem in a comparison against MAX_DMA32_PFN in
      the init code for swiotlb in pci_swiotlb_detect_4gb() to decide if
      swiotlb should be active.  On 32 bit builds that comparison yields
      the opposite result.
      
      This was not possible before
      
        1b7e03ef ("x86, NUMA: Enable emulation on 32bit too")
      
      when that MAX_DMA32_PFN was first made visible to x86_32 (and which
      landed in v3.0).
      
      In practice this wasn't a problem, unless CONFIG_SWIOTLB is active on
      x86-32.
      
      However if one has set CONFIG_IOMMU_INTEL, since
      
        c5a5dc4c ("iommu/vt-d: Don't switch off swiotlb if bounce page is used")
      
      there's a dependency on CONFIG_SWIOTLB, which was not necessarily
      active before. That landed in v5.4, where we noticed it in the fli4l
      Linux distribution. We have CONFIG_IOMMU_INTEL active on both 32 and 64
      bit kernel configs there (I could not find out why, so let's just say
      historical reasons).
      
      The effect is that at boot time 64 MiB (the default size) are now
      allocated for bounce buffers, which is a noticeable amount of memory on
      small systems like pcengines ALIX 2D3 with 256 MiB memory, which are
      still frequently used as home routers.
      
      We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
      (LTS) in fli4l and got kernel messages such as:
      
        Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
        …
        Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
        …
        PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
        software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)
      
      The initial analysis and the suggested fix was done by user 'sourcejedi'
      at stackoverflow and explicitly marked as GPLv2 for inclusion in the
      Linux kernel:
      
        https://unix.stackexchange.com/a/520525/50007
      
      The new calculation, which does not suffer from that overflow, is the
      same as for arch/mips now as suggested by Robin Murphy.
      
      The fix was tested by fli4l users on roughly two dozen different
      systems, including both 32 and 64 bit archs, bare metal and virtualized
      machines.
      
       [ bp: Massage commit message. ]
      
      Fixes: 1b7e03ef ("x86, NUMA: Enable emulation on 32bit too")
      Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Alexander Dahl <post@lespocky.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: stable@vger.kernel.org
      Link: https://unix.stackexchange.com/q/520065/50007
      Link: https://web.nettworks.org/bugs/browse/FFL-2560
      Link: https://lkml.kernel.org/r/20200526175749.20742-1-post@lespocky.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: mesh: fix discovery timer re-arming issue / crash · e57ed07d
      Linus Lüssing authored
      commit e2d4a80f upstream.
      
      On a non-forwarding 802.11s link between two fairly busy
      neighboring nodes (iperf with -P 16 at ~850MBit/s TCP;
      1733.3 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 4), so with
      frequent PREQ retries, usually after around 30-40 seconds the
      following crash would occur:
      
      [ 1110.822428] Unable to handle kernel read from unreadable memory at virtual address 00000000
      [ 1110.830786] Mem abort info:
      [ 1110.833573]   Exception class = IABT (current EL), IL = 32 bits
      [ 1110.839494]   SET = 0, FnV = 0
      [ 1110.842546]   EA = 0, S1PTW = 0
      [ 1110.845678] user pgtable: 4k pages, 48-bit VAs, pgd = ffff800076386000
      [ 1110.852204] [0000000000000000] *pgd=00000000f6322003, *pud=00000000f62de003, *pmd=0000000000000000
      [ 1110.861167] Internal error: Oops: 86000004 [#1] PREEMPT SMP
      [ 1110.866730] Modules linked in: pppoe ppp_async batman_adv ath10k_pci ath10k_core ath pppox ppp_generic nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ip_tables crc_ccitt compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 usb_storage xhci_plat_hcd xhci_pci xhci_hcd dwc3 usbcore usb_common
      [ 1110.932190] Process swapper/3 (pid: 0, stack limit = 0xffff0000090c8000)
      [ 1110.938884] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.14.162 #0
      [ 1110.944965] Hardware name: LS1043A RGW Board (DT)
      [ 1110.949658] task: ffff8000787a81c0 task.stack: ffff0000090c8000
      [ 1110.955568] PC is at 0x0
      [ 1110.958097] LR is at call_timer_fn.isra.27+0x24/0x78
      [ 1110.963055] pc : [<0000000000000000>] lr : [<ffff0000080ff29c>] pstate: 00400145
      [ 1110.970440] sp : ffff00000801be10
      [ 1110.973744] x29: ffff00000801be10 x28: ffff000008bf7018
      [ 1110.979047] x27: ffff000008bf87c8 x26: ffff000008c160c0
      [ 1110.984352] x25: 0000000000000000 x24: 0000000000000000
      [ 1110.989657] x23: dead000000000200 x22: 0000000000000000
      [ 1110.994959] x21: 0000000000000000 x20: 0000000000000101
      [ 1111.000262] x19: ffff8000787a81c0 x18: 0000000000000000
      [ 1111.005565] x17: ffff0000089167b0 x16: 0000000000000058
      [ 1111.010868] x15: ffff0000089167b0 x14: 0000000000000000
      [ 1111.016172] x13: ffff000008916788 x12: 0000000000000040
      [ 1111.021475] x11: ffff80007fda9af0 x10: 0000000000000001
      [ 1111.026777] x9 : ffff00000801bea0 x8 : 0000000000000004
      [ 1111.032080] x7 : 0000000000000000 x6 : ffff80007fda9aa8
      [ 1111.037383] x5 : ffff00000801bea0 x4 : 0000000000000010
      [ 1111.042685] x3 : ffff00000801be98 x2 : 0000000000000614
      [ 1111.047988] x1 : 0000000000000000 x0 : 0000000000000000
      [ 1111.053290] Call trace:
      [ 1111.055728] Exception stack(0xffff00000801bcd0 to 0xffff00000801be10)
      [ 1111.062158] bcc0:                                   0000000000000000 0000000000000000
      [ 1111.069978] bce0: 0000000000000614 ffff00000801be98 0000000000000010 ffff00000801bea0
      [ 1111.077798] bd00: ffff80007fda9aa8 0000000000000000 0000000000000004 ffff00000801bea0
      [ 1111.085618] bd20: 0000000000000001 ffff80007fda9af0 0000000000000040 ffff000008916788
      [ 1111.093437] bd40: 0000000000000000 ffff0000089167b0 0000000000000058 ffff0000089167b0
      [ 1111.101256] bd60: 0000000000000000 ffff8000787a81c0 0000000000000101 0000000000000000
      [ 1111.109075] bd80: 0000000000000000 dead000000000200 0000000000000000 0000000000000000
      [ 1111.116895] bda0: ffff000008c160c0 ffff000008bf87c8 ffff000008bf7018 ffff00000801be10
      [ 1111.124715] bdc0: ffff0000080ff29c ffff00000801be10 0000000000000000 0000000000400145
      [ 1111.132534] bde0: ffff8000787a81c0 ffff00000801bde8 0000ffffffffffff 000001029eb19be8
      [ 1111.140353] be00: ffff00000801be10 0000000000000000
      [ 1111.145220] [<          (null)>]           (null)
      [ 1111.149917] [<ffff0000080ff77c>] run_timer_softirq+0x184/0x398
      [ 1111.155741] [<ffff000008081938>] __do_softirq+0x100/0x1fc
      [ 1111.161130] [<ffff0000080a2e28>] irq_exit+0x80/0xd8
      [ 1111.166002] [<ffff0000080ea708>] __handle_domain_irq+0x88/0xb0
      [ 1111.171825] [<ffff000008081678>] gic_handle_irq+0x68/0xb0
      [ 1111.177213] Exception stack(0xffff0000090cbe30 to 0xffff0000090cbf70)
      [ 1111.183642] be20:                                   0000000000000020 0000000000000000
      [ 1111.191461] be40: 0000000000000001 0000000000000000 00008000771af000 0000000000000000
      [ 1111.199281] be60: ffff000008c95180 0000000000000000 ffff000008c19360 ffff0000090cbef0
      [ 1111.207101] be80: 0000000000000810 0000000000000400 0000000000000098 ffff000000000000
      [ 1111.214920] bea0: 0000000000000001 ffff0000089167b0 0000000000000000 ffff0000089167b0
      [ 1111.222740] bec0: 0000000000000000 ffff000008c198e8 ffff000008bf7018 ffff000008c19000
      [ 1111.230559] bee0: 0000000000000000 0000000000000000 ffff8000787a81c0 ffff000008018000
      [ 1111.238380] bf00: ffff00000801c000 ffff00000913ba34 ffff8000787a81c0 ffff0000090cbf70
      [ 1111.246199] bf20: ffff0000080857cc ffff0000090cbf70 ffff0000080857d0 0000000000400145
      [ 1111.254020] bf40: ffff000008018000 ffff00000801c000 ffffffffffffffff ffff0000080fa574
      [ 1111.261838] bf60: ffff0000090cbf70 ffff0000080857d0
      [ 1111.266706] [<ffff0000080832e8>] el1_irq+0xe8/0x18c
      [ 1111.271576] [<ffff0000080857d0>] arch_cpu_idle+0x10/0x18
      [ 1111.276880] [<ffff0000080d7de4>] do_idle+0xec/0x1b8
      [ 1111.281748] [<ffff0000080d8020>] cpu_startup_entry+0x20/0x28
      [ 1111.287399] [<ffff00000808f81c>] secondary_start_kernel+0x104/0x110
      [ 1111.293662] Code: bad PC value
      [ 1111.296710] ---[ end trace 555b6ca4363c3edd ]---
      [ 1111.301318] Kernel panic - not syncing: Fatal exception in interrupt
      [ 1111.307661] SMP: stopping secondary CPUs
      [ 1111.311574] Kernel Offset: disabled
      [ 1111.315053] CPU features: 0x0002000
      [ 1111.318530] Memory Limit: none
      [ 1111.321575] Rebooting in 3 seconds..
      
      With some added debug output and delays we were able to push the crash from
      the timer callback runner into the callback function, which sheds some
      light on which object holding the timer gets corrupted:
      
      [  401.720899] Unable to handle kernel read from unreadable memory at virtual address 00000868
      [...]
      [  402.335836] [<ffff0000088fafa4>] _raw_spin_lock_bh+0x14/0x48
      [  402.341548] [<ffff000000dbe684>] mesh_path_timer+0x10c/0x248 [mac80211]
      [  402.348154] [<ffff0000080ff29c>] call_timer_fn.isra.27+0x24/0x78
      [  402.354150] [<ffff0000080ff77c>] run_timer_softirq+0x184/0x398
      [  402.359974] [<ffff000008081938>] __do_softirq+0x100/0x1fc
      [  402.365362] [<ffff0000080a2e28>] irq_exit+0x80/0xd8
      [  402.370231] [<ffff0000080ea708>] __handle_domain_irq+0x88/0xb0
      [  402.376053] [<ffff000008081678>] gic_handle_irq+0x68/0xb0
      
      The issue happens due to the following sequence of events:
      
      1) mesh_path_start_discovery():
      -> spin_unlock_bh(&mpath->state_lock) before mesh_path_sel_frame_tx()
      
      2) mesh_path_free_rcu()
      -> del_timer_sync(&mpath->timer)
         [...]
      -> kfree_rcu(mpath)
      
      3) mesh_path_start_discovery():
      -> mod_timer(&mpath->timer, ...)
         [...]
      -> rcu_read_unlock()
      
      4) mesh_path_free_rcu()'s kfree_rcu():
      -> kfree(mpath)
      
      5) mesh_path_timer() starts after timeout, using freed mpath object
      
      So this is a use-after-free caused by a timer re-arming bug, which in turn
      is caused by unlocking the spinlock too early.
      
      This patch fixes the issue by re-checking whether mpath is about to be
      freed and, if so, bailing out of re-arming the timer.
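
      The re-check can be modelled in user space. This is a minimal sketch
      under stated assumptions, not the mac80211 code: struct path_entry, its
      deleted flag and all helper names are hypothetical, and a C11 atomic_flag
      stands in for the state spinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object that owns a timer. */
struct path_entry {
    atomic_flag state_lock;  /* models the per-object state spinlock */
    bool deleted;            /* set by teardown before the object is freed */
    bool timer_armed;        /* models a pending timer */
};

void path_init(struct path_entry *p)
{
    atomic_flag_clear(&p->state_lock);
    p->deleted = false;
    p->timer_armed = false;
}

static void path_lock(struct path_entry *p)
{
    while (atomic_flag_test_and_set(&p->state_lock))
        ; /* spin until the lock is free */
}

static void path_unlock(struct path_entry *p)
{
    atomic_flag_clear(&p->state_lock);
}

/* Teardown path: mark the object as going away and cancel its timer. */
void path_teardown(struct path_entry *p)
{
    path_lock(p);
    p->deleted = true;
    p->timer_armed = false;
    path_unlock(p);
}

/* Re-arm only if teardown has not started; returns true if armed. */
bool path_rearm_timer(struct path_entry *p)
{
    bool armed = false;

    path_lock(p);
    if (!p->deleted) {
        p->timer_armed = true;
        armed = true;
    }
    path_unlock(p);
    return armed;
}
```

      The point of the sketch is that the deleted check and the re-arm happen
      under the same lock hold, so teardown cannot slip in between them.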
      
      Cc: stable@vger.kernel.org
      Fixes: 050ac52c ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol")
      Cc: Simon Wunderlich <sw@simonwunderlich.de>
      Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
      Link: https://lore.kernel.org/r/20200522170413.14973-1-linus.luessing@c0d3.blue
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e57ed07d
    • Jason Gunthorpe's avatar
      RDMA/core: Fix double destruction of uobject · cde9a4f6
      Jason Gunthorpe authored
      commit c85f4abe upstream.
      
      Fix a use-after-free when user space issues concurrent requests for the
      same uobject within the RCU grace period.
      
      In that case, remove_handle_idr_uobject() is called twice and we will have
      an extra put on the uobject, which causes a use-after-free.  Fix it by leaving
      the uobject write locked after it was removed from the idr.
      
      Calling rdma_lookup_put_uobject() with UVERBS_LOOKUP_DESTROY instead of
      UVERBS_LOOKUP_WRITE will do the work.
      
        refcount_t: underflow; use-after-free.
        WARNING: CPU: 0 PID: 1381 at lib/refcount.c:28 refcount_warn_saturate+0xfe/0x1a0
        Kernel panic - not syncing: panic_on_warn set ...
        CPU: 0 PID: 1381 Comm: syz-executor.0 Not tainted 5.5.0-rc3 #8
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
        Call Trace:
         dump_stack+0x94/0xce
         panic+0x234/0x56f
         __warn+0x1cc/0x1e1
         report_bug+0x200/0x310
         fixup_bug.part.11+0x32/0x80
         do_error_trap+0xd3/0x100
         do_invalid_op+0x31/0x40
         invalid_op+0x1e/0x30
        RIP: 0010:refcount_warn_saturate+0xfe/0x1a0
        Code: 0f 0b eb 9b e8 23 f6 6d ff 80 3d 6c d4 19 03 00 75 8d e8 15 f6 6d ff 48 c7 c7 c0 02 55 bd c6 05 57 d4 19 03 01 e8 a2 58 49 ff <0f> 0b e9 6e ff ff ff e8 f6 f5 6d ff 80 3d 42 d4 19 03 00 0f 85 5c
        RSP: 0018:ffffc90002df7b98 EFLAGS: 00010282
        RAX: 0000000000000000 RBX: ffff88810f6a193c RCX: ffffffffba649009
        RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88811b0283cc
        RBP: 0000000000000003 R08: ffffed10236060e3 R09: ffffed10236060e3
        R10: 0000000000000001 R11: ffffed10236060e2 R12: ffff88810f6a193c
        R13: ffffc90002df7d60 R14: 0000000000000000 R15: ffff888116ae6a08
         uverbs_uobject_put+0xfd/0x140
         __uobj_perform_destroy+0x3d/0x60
         ib_uverbs_close_xrcd+0x148/0x170
         ib_uverbs_write+0xaa5/0xdf0
         __vfs_write+0x7c/0x100
         vfs_write+0x168/0x4a0
         ksys_write+0xc8/0x200
         do_syscall_64+0x9c/0x390
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x465b49
        Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
        RSP: 002b:00007f759d122c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
        RAX: ffffffffffffffda RBX: 000000000073bfa8 RCX: 0000000000465b49
        RDX: 000000000000000c RSI: 0000000020000080 RDI: 0000000000000003
        RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000246 R12: 00007f759d1236bc
        R13: 00000000004ca27c R14: 000000000070de40 R15: 00000000ffffffff
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Kernel Offset: 0x39400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      
      Fixes: 7452a3c7 ("IB/uverbs: Allow RDMA_REMOVE_DESTROY to work concurrently with disassociate")
      Link: https://lore.kernel.org/r/20200527135534.482279-1-leon@kernel.org
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cde9a4f6
    • Sarthak Garg's avatar
      mmc: core: Fix recursive locking issue in CQE recovery path · 34141cb8
      Sarthak Garg authored
      commit 39a22f73 upstream.
      
      Consider the following stack trace:
      
      -001|raw_spin_lock_irqsave
      -002|mmc_blk_cqe_complete_rq
      -003|__blk_mq_complete_request(inline)
      -003|blk_mq_complete_request(rq)
      -004|mmc_cqe_timed_out(inline)
      -004|mmc_mq_timed_out
      
      mmc_mq_timed_out() acquires the queue_lock first. The
      mmc_blk_cqe_complete_rq() function then tries to acquire the same
      queue lock, resulting in recursive locking: the task spins on a lock
      it already holds, leading to a watchdog bark.
      
      Fix this issue by holding the lock only for the required critical section.
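
      Narrowing the critical section can be illustrated with a small
      user-space model. This is a sketch under stated assumptions, not the
      mmc code: struct req_queue and all function names are hypothetical, and
      the non-recursive lock is modelled as a flag whose lock function
      reports failure where a real spinlock would spin forever.

```c
#include <stdbool.h>

/* Hypothetical queue with a non-recursive lock, modelled as a flag. */
struct req_queue {
    bool lock_held;
    bool recovery_needed;
    int completed;
};

/* Returns false where a real spinlock would spin forever. */
bool queue_lock(struct req_queue *q)
{
    if (q->lock_held)
        return false;
    q->lock_held = true;
    return true;
}

void queue_unlock(struct req_queue *q)
{
    q->lock_held = false;
}

/* Completion path: takes the queue lock itself. */
bool complete_request(struct req_queue *q)
{
    if (!queue_lock(q))
        return false;
    q->completed++;
    queue_unlock(q);
    return true;
}

/*
 * Timeout handler: the lock covers only the read of the queue state and
 * is dropped before the completion path is entered.
 * Returns true if the request was completed.
 */
bool handle_timeout(struct req_queue *q)
{
    bool ignore;

    if (!queue_lock(q))
        return false;
    ignore = q->recovery_needed;
    queue_unlock(q);

    if (ignore)
        return false;
    return complete_request(q);
}
```

      If handle_timeout() still held the lock while calling
      complete_request(), the inner queue_lock() would fail in this model,
      which corresponds to the recursive spin in the trace.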
      
      Cc: <stable@vger.kernel.org>
      Fixes: 1e8e55b6 ("mmc: block: Add CQE support")
      Suggested-by: Sahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: Sarthak Garg <sartgarg@codeaurora.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Link: https://lore.kernel.org/r/1588868135-31783-1-git-send-email-vbadigan@codeaurora.org
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      34141cb8
    • Helge Deller's avatar
      parisc: Fix kernel panic in mem_init() · 52234e55
      Helge Deller authored
      commit bf71bc16 upstream.
      
      The Debian kernel v5.6 triggers this kernel panic:
      
       Kernel panic - not syncing: Bad Address (null pointer deref?)
       Bad Address (null pointer deref?): Code=26 (Data memory access rights trap) at addr 0000000000000000
       CPU: 0 PID: 0 Comm: swapper Not tainted 5.6.0-2-parisc64 #1 Debian 5.6.14-1
        IAOQ[0]: mem_init+0xb0/0x150
        IAOQ[1]: mem_init+0xb4/0x150
        RP(r2): start_kernel+0x6c8/0x1190
       Backtrace:
        [<0000000040101ab4>] start_kernel+0x6c8/0x1190
        [<0000000040108574>] start_parisc+0x158/0x1b8
      
      on a HP-PARISC rp3440 machine with this memory layout:
       Memory Ranges:
        0) Start 0x0000000000000000 End 0x000000003fffffff Size   1024 MB
        1) Start 0x0000004040000000 End 0x00000040ffdfffff Size   3070 MB
      
      Fix the crash by avoiding virt_to_page() and similar functions in
      mem_init() until the memory zones have been fully set up.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # v5.0+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      52234e55
    • Qiushi Wu's avatar
      iommu: Fix reference count leak in iommu_group_alloc. · 0dc3cd09
      Qiushi Wu authored
      [ Upstream commit 7cc31613 ]
      
      kobject_init_and_add() takes reference even when it fails.
      Thus, when kobject_init_and_add() returns an error,
      kobject_put() must be called to properly clean up the kobject.
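
      The reference-counting rule can be modelled in user space. This is a
      minimal sketch, not the kobject API: struct obj, obj_init_and_add(),
      obj_put() and obj_create() are hypothetical stand-ins, and the released
      pointer exists only so the cleanup can be observed.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct obj {
    int refcount;
    int *released;   /* set to 1 when the last reference is dropped */
};

/* Drop one reference; release the object when the count reaches zero. */
void obj_put(struct obj *o)
{
    if (--o->refcount == 0) {
        *o->released = 1;
        free(o);
    }
}

/* Takes the initial reference even when the "add" step fails. */
int obj_init_and_add(struct obj *o, int fail)
{
    o->refcount = 1;
    return fail ? -1 : 0;
}

/* Caller pattern: on failure, drop the reference instead of leaking it. */
struct obj *obj_create(int fail, int *released)
{
    struct obj *o = malloc(sizeof(*o));

    if (!o)
        return NULL;
    o->released = released;
    if (obj_init_and_add(o, fail)) {
        obj_put(o);   /* the only thing that cleans up after a failed add */
        return NULL;
    }
    return o;
}
```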
      
      Fixes: d72e31c9 ("iommu: IOMMU Groups")
      Signed-off-by: Qiushi Wu <wu000273@umn.edu>
      Link: https://lore.kernel.org/r/20200527210020.6522-1-wu000273@umn.edu
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0dc3cd09
    • Arnd Bergmann's avatar
      include/asm-generic/topology.h: guard cpumask_of_node() macro argument · 51b77959
      Arnd Bergmann authored
      [ Upstream commit 4377748c ]
      
      drivers/hwmon/amd_energy.c:195:15: error: invalid operands to binary expression ('void' and 'int')
                                              (channel - data->nr_cpus));
                                              ~~~~~~~~~^~~~~~~~~~~~~~~~~
      include/asm-generic/topology.h:51:42: note: expanded from macro 'cpumask_of_node'
          #define cpumask_of_node(node)       ((void)node, cpu_online_mask)
                                                     ^~~~
      include/linux/cpumask.h:618:72: note: expanded from macro 'cpumask_first_and'
       #define cpumask_first_and(src1p, src2p) cpumask_next_and(-1, (src1p), (src2p))
                                                                             ^~~~~
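
      The diagnostic above follows from the macro argument not being
      parenthesized: with an argument such as "a - b", the cast binds only to
      "a", leaving a 'void' minus 'int' expression. A minimal sketch with
      hypothetical user-space names (online_mask, mask_of_node, mask_for),
      assuming the guard amounts to wrapping the argument in parentheses:

```c
/* Stand-in for the mask the macro evaluates to. */
static const int online_mask = 0xff;

/*
 * Unguarded form, kept as a comment because it does not compile when the
 * argument is an expression:
 *
 *   #define mask_of_node_bad(node) ((void)node, online_mask)
 *   mask_of_node_bad(a - b)  expands to  ((void)a - b, online_mask)
 */

/* Guarded form: the whole argument is cast to void before being dropped. */
#define mask_of_node(node) ((void)(node), online_mask)

int mask_for(int channel, int nr_cpus)
{
    /* An expression argument is now safe to pass. */
    return mask_of_node(channel - nr_cpus);
}
```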
      
      Fixes: f0b848ce ("cpumask: Introduce cpumask_of_{node,pcibus} to replace {node,pcibus}_to_cpumask")
      Fixes: 8abee956 ("hwmon: Add amd_energy driver to report energy counters")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Guenter Roeck <linux@roeck-us.net>
      Link: http://lkml.kernel.org/r/20200527134623.930247-1-arnd@arndb.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      51b77959
    • Alexander Potapenko's avatar
      fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info() · d16b0abe
      Alexander Potapenko authored
      [ Upstream commit 1d605416 ]
      
      KMSAN reported uninitialized data being written to disk when dumping
      core.  As a result, several kilobytes of kmalloc memory may be written
      to the core file and then read by a non-privileged user.
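
      Why initialized memory matters here can be shown with a user-space
      analogue. This is a sketch, assuming the fix amounts to a zeroing
      allocation; calloc() stands in for the kernel allocator and all
      function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Zero-initialized buffer: analogue of a zeroing kernel allocation. */
unsigned char *alloc_note_buf(size_t size)
{
    return calloc(1, size);
}

/* Models a producer that writes fewer bytes than the buffer holds. */
size_t fill_partial(unsigned char *buf, size_t size, size_t produced)
{
    if (produced > size)
        produced = size;
    memset(buf, 0xAA, produced);
    return produced;
}

/* True if every byte past `produced` is zero, so nothing stale can leak. */
bool tail_is_zero(const unsigned char *buf, size_t size, size_t produced)
{
    for (size_t i = produced; i < size; i++) {
        if (buf[i] != 0)
            return false;
    }
    return true;
}
```

      With an uninitialized allocation, the bytes past the produced length
      would hold whatever the allocator last stored there, and writing the
      whole buffer out would expose them.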
      Reported-by: sam <sunhaoyl@outlook.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200419100848.63472-1-glider@google.com
      Link: https://github.com/google/kmsan/issues/76
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d16b0abe
    • Konstantin Khlebnikov's avatar
      mm: remove VM_BUG_ON(PageSlab()) from page_mapcount() · 0985f471
      Konstantin Khlebnikov authored
      [ Upstream commit 6988f31d ]
      
      Replace superfluous VM_BUG_ON() with comment about correct usage.
      
      Technically reverts commit 1d148e21 ("mm: add VM_BUG_ON_PAGE() to
      page_mapcount()"), but context lines have changed.
      
      Function isolate_migratepages_block() runs some checks outside of lru_lock
      when choosing pages for migration.  After checking PageLRU() it checks
      extra page references by comparing page_count() and page_mapcount().
      Between these two checks the page could be removed from the lru, freed and
      taken by slab.
      
      As a result this race triggers VM_BUG_ON(PageSlab()) in page_mapcount().
      The race window is tiny.  For certain workloads this happens around once a
      year.
      
          page:ffffea0105ca9380 count:1 mapcount:0 mapping:ffff88ff7712c180 index:0x0 compound_mapcount: 0
          flags: 0x500000000008100(slab|head)
          raw: 0500000000008100 dead000000000100 dead000000000200 ffff88ff7712c180
          raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
          page dumped because: VM_BUG_ON_PAGE(PageSlab(page))
          ------------[ cut here ]------------
          kernel BUG at ./include/linux/mm.h:628!
          invalid opcode: 0000 [#1] SMP NOPTI
          CPU: 77 PID: 504 Comm: kcompactd1 Tainted: G        W         4.19.109-27 #1
          Hardware name: Yandex T175-N41-Y3N/MY81-EX0-Y3N, BIOS R05 06/20/2019
          RIP: 0010:isolate_migratepages_block+0x986/0x9b0
      
      The code in isolate_migratepages_block() was added in commit
      119d6d59 ("mm, compaction: avoid isolating pinned pages") before
      adding VM_BUG_ON into page_mapcount().
      
      This race was predicted in 2015 by Vlastimil Babka (see link
      below).
      
      [akpm@linux-foundation.org: comment tweaks, per Hugh]
      Fixes: 1d148e21 ("mm: add VM_BUG_ON_PAGE() to page_mapcount()")
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Hugh Dickins <hughd@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/159032779896.957378.7852761411265662220.stgit@buzz
      Link: https://lore.kernel.org/lkml/557710E1.6060103@suse.cz/
      Link: https://lore.kernel.org/linux-mm/158937872515.474360.5066096871639561424.stgit@buzz/T/ (v1)
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0985f471
    • Valentine Fatiev's avatar
      IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode · 977436cf
      Valentine Fatiev authored
      [ Upstream commit 1acba6a8 ]
      
      When connected mode is set, and we have connected and datagram traffic in
      parallel, ipoib might crash with a double free of a datagram skb.
      
      The current mechanism assumes that the order in the completion queue is
      the same as the order of sent packets for all QPs. Order is kept only
      within a specific QP; in case of mixed UD and CM traffic we have several
      QPs (one UD and several CMs) in parallel.
      
      The problem:
      ----------------------------------------------------------
      
      Transmit queue:
      -----------------
      The UD skb pointer is kept in the queue itself; the CM skb is kept in a
      separate queue and uses the transmit queue as a placeholder to count the
      total number of transmitted packets.
      
      0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
      ------------------------------------------------------------
      NL ud1 UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
      ------------------------------------------------------------
          ^                                  ^
         tail                               head
      
      Completion queue (problematic scenario) - the order is not the same as in
      the transmit queue:
      
        1  2  3  4  5  6  7  8  9
      ------------------------------------
       ud1 CM1 UD2 ud3 cm2 cm3 ud4 cm4 ud5
      ------------------------------------
      
      1. CM1 'wc' processing
         - skb freed in the separate CM ring.
         - tx_tail of the transmit queue is increased although UD2 is not freed.
           Now the driver assumes the UD2 index is already freed and that it
           can be used for a new transmitted skb.
      
      0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
      ------------------------------------------------------------
      NL NL  UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
      ------------------------------------------------------------
              ^   ^                       ^
            (Bad)tail                    head
      (Bad - Could be used for new SKB)
      
      In this case (due to heavy load) the UD2 skb pointer could be replaced by a
      new transmitted packet UD_NEW, as the driver assumes it is free.  At this
      point we will have to process two 'wc' with the same index but we have only
      one pointer to free.
      
      During the second attempt to free the same skb we will hit a NULL pointer
      exception.
      
      2. UD2 'wc' processing
         - skb freed according to the index we got from 'wc', but it was already
           overwritten by mistake. So the skb that was actually released is the
           skb of the new transmitted packet and not the original one.
      
      3. UD_NEW 'wc' processing
         - attempt to free an already freed skb. NULL pointer exception.
      
      The fix:
      -----------------------------------------------------------------------
      
      The fix is to stop using the UD ring as a placeholder for CM packets. The
      cyclic ring variables tx_head and tx_tail will manage the UD tx_ring, and
      new cyclic variables global_tx_head and global_tx_tail are introduced for
      managing and counting the overall outstanding sent packets. The send
      queue will then be stopped and woken based on these variables only.
      
      Note that no locking is needed since global_tx_head is updated in the xmit
      flow and global_tx_tail is updated in the NAPI flow only.  A previous
      attempt tried to use one variable to count the outstanding sent packets,
      but it did not work since xmit and NAPI flows can run at the same time and
      the counter will be updated wrongly. Thus, we use the same simple cyclic
      head and tail scheme that we have today for the UD tx_ring.
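
      The head and tail scheme can be sketched as two free-running unsigned
      counters, each written by only one flow. This is a minimal model, not
      the ipoib code: struct tx_counters, the helper names and SENDQ_SIZE
      (taken as 128 to match the diagram) are assumptions; only the names
      global_tx_head and global_tx_tail come from the description above.

```c
#include <stdbool.h>

#define SENDQ_SIZE 128u   /* assumed queue depth, matching the diagram */

struct tx_counters {
    unsigned int global_tx_head;   /* advanced by the xmit flow only */
    unsigned int global_tx_tail;   /* advanced by the completion flow only */
};

/* xmit flow: one more packet (UD or CM) has been posted. */
void tx_posted(struct tx_counters *c)
{
    c->global_tx_head++;
}

/* Completion flow: one more packet has completed. */
void tx_completed(struct tx_counters *c)
{
    c->global_tx_tail++;
}

/* Unsigned subtraction stays correct across counter wraparound. */
unsigned int tx_outstanding(const struct tx_counters *c)
{
    return c->global_tx_head - c->global_tx_tail;
}

/* The send queue would be stopped while this is true. */
bool tx_queue_full(const struct tx_counters *c)
{
    return tx_outstanding(c) >= SENDQ_SIZE;
}
```

      Because each counter has a single writer, neither flow overwrites the
      other's update, which is the property a single shared counter lacks.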
      
      Fixes: 2c104ea6 ("IB/ipoib: Get rid of the tx_outstanding variable in all modes")
      Link: https://lore.kernel.org/r/20200527134705.480068-1-leon@kernel.org
      Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
      Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      977436cf
    • Jerry Lee's avatar
      libceph: ignore pool overlay and cache logic on redirects · 49998bbe
      Jerry Lee authored
      [ Upstream commit 890bd0f8 ]
      
      The OSD client should ignore the cache/overlay flag if it gets a redirect
      reply.  Otherwise, the client hangs when the cache tier is in forward mode.
      
      [ idryomov: Redirects are effectively deprecated and no longer
        used or tested.  The original tiering modes based on redirects
        are inherently flawed because redirects can race and reorder,
        potentially resulting in data corruption.  The new proxy and
        readproxy tiering modes should be used instead of forward and
        readforward.  Still marking for stable as obviously correct,
        though. ]
      
      Cc: stable@vger.kernel.org
      URL: https://tracker.ceph.com/issues/23296
      URL: https://tracker.ceph.com/issues/36406
      Signed-off-by: Jerry Lee <leisurelysw24@gmail.com>
      Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      49998bbe
    • Kailang Yang's avatar
      ALSA: hda/realtek - Add new codec supported for ALC287 · ccc9da36
      Kailang Yang authored
      [ Upstream commit 630e3612 ]
      
      Enable support for the new ALC287 codec.
      Signed-off-by: Kailang Yang <kailang@realtek.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/dcf5ce5507104d0589a917cbb71dc3c6@realtek.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ccc9da36
    • Takashi Iwai's avatar
      ALSA: usb-audio: Quirks for Gigabyte TRX40 Aorus Master onboard audio · 59edcbe0
      Takashi Iwai authored
      [ Upstream commit 7f5ad9c9 ]
      
      Gigabyte TRX40 Aorus Master is equipped with two USB-audio devices,
      a Realtek ALC1220-VB codec (USB ID 0414:a001) and an ESS SABRE9218 DAC
      (USB ID 0414:a000).  The latter serves solely for the headphone output
      on the front panel while the former serves the rest of the I/Os (mostly
      the I/Os on the rear panel but also including the front mic).
      
      Both chips do work more or less with the unmodified USB-audio driver,
      but there are a few glitches.  The ALC1220-VB returns an error for an
      inquiry to some jacks, as already seen on other TRX40-based mobos.
      However this machine has a slightly incompatible configuration, hence
      the existing mapping cannot be used as is.
      
      Meanwhile the ESS chip seems to work without any quirk.  But since
      neither audio device provides a specific name, both cards appear
      as "USB-Audio", which is quite confusing for users.
      
      This patch is an attempt to overcome those issues:
      
      - The specific mapping table for ALC1220-VB is provided, reducing the
        non-working nodes and renaming the badly chosen controls.
        The connector map isn't needed here unlike other TRX40 quirks.
      
      - For both USB IDs (0414:a000 and 0414:a001), provide specific card
        name strings, so that user-space can identify more easily; and more
        importantly, UCM profile can be applied to each.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200526082810.29506-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      59edcbe0
    • Eric W. Biederman's avatar
      exec: Always set cap_ambient in cap_bprm_set_creds · 6c45ea17
      Eric W. Biederman authored
      [ Upstream commit a4ae32c7 ]
      
      An invariant of cap_bprm_set_creds is that every field in the new cred
      structure that cap_bprm_set_creds might set needs to be set every
      time, to ensure the fields do not get a stale value.
      
      The field cap_ambient is not set every time cap_bprm_set_creds is
      called.  This matters because if there is a suid or sgid script with an
      interpreter that has neither the suid nor the sgid bits set, the
      interpreter should be able to accept ambient credentials.
      Unfortunately, because cap_ambient is not reset to its original value,
      the interpreter cannot accept ambient credentials.
      
      Given that the ambient capability set is expected to be controlled by
      the caller, I don't think this is particularly serious.  But it is
      definitely worth fixing so the code works correctly.
      
      I have tested to verify my reading of the code is correct and the
      interpreter of a sgid can receive ambient capabilities with this
      change and cannot receive ambient capabilities without this change.
      
      Cc: stable@vger.kernel.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Fixes: 58319057 ("capabilities: ambient capabilities")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6c45ea17
    • Chris Chiu's avatar
      ALSA: usb-audio: mixer: volume quirk for ESS Technology Asus USB DAC · 5870873c
      Chris Chiu authored
      [ Upstream commit 4020d1cc ]
      
      The Asus USB DAC is a USB Type-C audio dongle for connecting a
      headset or headphones. The minimum volume value of -23040, which
      is 0xa600 in hexadecimal, together with the resolution value 1
      indicates an endianness issue caused by a firmware bug. Add
      a volume quirk to fix the volume control problem.
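
      The arithmetic behind the endianness suspicion can be checked with a
      short sketch. It only illustrates how the reported value reads in both
      byte orders; the helper names are hypothetical, and the value the quirk
      actually applies is not stated here.

```c
#include <stdint.h>

/* Interpret a raw 16-bit word as a signed two's-complement value. */
int as_signed16(uint16_t raw)
{
    return raw >= 0x8000u ? (int)raw - 0x10000 : (int)raw;
}

/* Swap the two bytes of a 16-bit word. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

      Read as reported, 0xa600 is -23040; with its bytes swapped it becomes
      0x00a6, a small value that fits a resolution of 1 far better.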
      
      Also fixes this warning:
        Warning! Unlikely big volume range (=23040), cval->res is probably wrong.
        [5] FU [Headset Capture Volume] ch = 1, val = -23040/0/1
        Warning! Unlikely big volume range (=23040), cval->res is probably wrong.
        [7] FU [Headset Playback Volume] ch = 1, val = -23040/0/1
      Signed-off-by: Chris Chiu <chiu@endlessm.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200526062613.55401-1-chiu@endlessm.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5870873c
    • Takashi Iwai's avatar
      ALSA: hda/realtek - Add a model for Thinkpad T570 without DAC workaround · 5151c8e3
      Takashi Iwai authored
      [ Upstream commit 399c01aa ]
      
      We fixed the regression of the speaker volume for some Thinkpad models
      (e.g. T570) by the commit 54947cd6 ("ALSA: hda/realtek - Fix
      speaker output regression on Thinkpad T570").  Essentially it fixes
      the DAC / pin pairing by a static table.  It was confirmed and merged
      to stable kernel later.
      
      Now, interestingly, we got another regression report for the very same
      model (T570) about a similar problem, and the commit above was the
      culprit.  That is, for some reason, there are devices that prefer
      DAC1, and other devices that prefer DAC2!
      
      Unfortunately those have the same ID and we have no idea what can
      differentiate them.  In this patch, a new fixup model "tpt470-dock-fix" is
      provided, so that users with such a machine can apply it manually.
      When model=tpt470-dock-fix option is passed to snd-hda-intel module,
      it avoids the fixed DAC pairing and the DAC1 is assigned to the
      speaker like the earlier versions.
      
      Fixes: 54947cd6 ("ALSA: hda/realtek - Fix speaker output regression on Thinkpad T570")
      BugLink: https://apibugzilla.suse.com/show_bug.cgi?id=1172017
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200526062406.9799-1-tiwai@suse.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5151c8e3
    • Changming Liu's avatar
      ALSA: hwdep: fix a left shifting 1 by 31 UB bug · f9ee8f97
      Changming Liu authored
      [ Upstream commit fb8cd648 ]
      
      The "info.index" variable can be 31 in "1 << info.index".
      This might trigger undefined behavior since 1 is signed.
      
      Fix this by casting 1 to 1u just to be sure "1u << 31" is defined.
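
      A minimal sketch of the unsigned shift, assuming a 32-bit bitmap of
      used indices; index_in_use() and its parameters are hypothetical names,
      not the hwdep code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Test bit `index` of a 32-bit bitmap.
 * With a signed 1, "1 << 31" shifts into the sign bit of a 32-bit int,
 * which is undefined behavior; "1u << 31" is well defined.
 */
bool index_in_use(uint32_t used, unsigned int index)
{
    return index < 32 && (used & (1u << index)) != 0;
}
```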
      Signed-off-by: Changming Liu <liu.changm@northeastern.edu>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/BL0PR06MB4548170B842CB055C9AF695DE5B00@BL0PR06MB4548.namprd06.prod.outlook.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f9ee8f97
    • Qiushi Wu's avatar
      RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe() · e8ed2ff7
      Qiushi Wu authored
      [ Upstream commit db857e6a ]
      
      In function pvrdma_pci_probe(), pdev was not disabled in one error
      path. Thus replace the jump target "err_free_device" by
      "err_disable_pdev".
      
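      The choice of jump target in a goto-based unwind can be sketched as
      follows. This is a user-space model under stated assumptions, not the
      pvrdma driver: struct fake_dev and fake_probe() are hypothetical, and
      only the two label names are taken from the description above.

```c
/* Hypothetical device state tracked by the model. */
struct fake_dev {
    int allocated;
    int enabled;
};

/* fail_step: 0 = succeed, 1 = enabling fails, 2 = a later check fails. */
int fake_probe(struct fake_dev *d, int fail_step)
{
    int ret = 0;

    d->allocated = 1;            /* models allocating the device */

    if (fail_step == 1) {        /* enabling failed: nothing to disable */
        ret = -1;
        goto err_free_device;
    }
    d->enabled = 1;              /* models enabling the PCI device */

    if (fail_step == 2) {        /* failed after enabling: disable as well */
        ret = -1;
        goto err_disable_pdev;
    }
    return 0;

err_disable_pdev:
    d->enabled = 0;              /* undo the enable, then fall through */
err_free_device:
    d->allocated = 0;
    return ret;
}
```

      Jumping to err_free_device from the second failure point would leave
      enabled set, which is the kind of leak the patch describes.
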
      Fixes: 29c8d9eb ("IB: Add vmw_pvrdma driver")
      Link: https://lore.kernel.org/r/20200523030457.16160-1-wu000273@umn.edu
      Signed-off-by: Qiushi Wu <wu000273@umn.edu>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e8ed2ff7
    • Peng Hao's avatar
      mmc: block: Fix use-after-free issue for rpmb · 9f5562d7
      Peng Hao authored
      [ Upstream commit 202500d2 ]
      
      The data structure member “rpmb->md” was passed to a call of the function
      “mmc_blk_put” after a call of the function “put_device”. Reorder these
      function calls to keep the data accesses consistent.
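
      The ordering issue can be modelled with a small reference-counted
      object. This is a sketch, not the mmc block driver: struct blk_data,
      struct rpmb_dev and all function names are hypothetical stand-ins.

```c
#include <stdlib.h>

/* Hypothetical parent data that the device points to. */
struct blk_data {
    int usage;
};

/* Hypothetical reference-counted device holding a pointer to its parent. */
struct rpmb_dev {
    int refs;
    struct blk_data *md;
};

struct rpmb_dev *rpmb_new(struct blk_data *md)
{
    struct rpmb_dev *r = malloc(sizeof(*r));

    if (!r)
        return NULL;
    r->refs = 1;
    r->md = md;
    md->usage++;
    return r;
}

void blk_data_put(struct blk_data *md)
{
    md->usage--;
}

/* Drops one reference; frees the device when the last one goes away. */
void rpmb_dev_put(struct rpmb_dev *r)
{
    if (--r->refs == 0)
        free(r);
}

/* Release path: use r->md first, drop the device reference last. */
void rpmb_release(struct rpmb_dev *r)
{
    blk_data_put(r->md);   /* r is still valid here */
    rpmb_dev_put(r);       /* may free r; r must not be touched afterwards */
}
```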
      
      Fixes: 1c87f735 ("mmc: block: Fix bug when removing RPMB chardev ")
      Signed-off-by: Peng Hao <richard.peng@oppo.com>
      Cc: stable@vger.kernel.org
      [Uffe: Fixed up mangled patch and updated commit message]
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9f5562d7
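
The ordering problem can be sketched with two reference-counted objects in user space. All names here (`struct parent`, `struct child`, `parent_put`, `child_put`, `release_child`) are hypothetical and only mirror the shape of the fix; they are not the mmc block driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical objects: a child that points at its parent. */
struct parent { int refs; };
struct child  { int refs; struct parent *owner; };

/* Drop one reference on the parent (the parent is not freed here). */
static void parent_put(struct parent *p)
{
    p->refs--;
}

/* Drop one reference on the child and free it when the count hits zero. */
static void child_put(struct child *c)
{
    if (--c->refs == 0)
        free(c);
}

/* Release path in the corrected order. */
static void release_child(struct child *c)
{
    assert(c != NULL);

    parent_put(c->owner);   /* read c->owner while c is still valid */
    child_put(c);           /* may free c; c must not be touched afterwards */
}
```

Swapping the two calls in `release_child()` would read `c->owner` after `c` may have been freed, which is the use-after-free pattern the commit removes.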
    • Hamish Martin's avatar
      ARM: dts: bcm: HR2: Fix PPI interrupt types · 78b83e79
      Hamish Martin authored
      [ Upstream commit be0ec060 ]
      
      These error messages are output when booting on a BCM HR2 system:
          GIC: PPI11 is secure or misconfigured
          GIC: PPI13 is secure or misconfigured
      
      Per ARM documentation these interrupts are triggered on a rising edge.
      See ARM Cortex-A9 MPCore Technical Reference Manual, Revision r4p1,
      Section 3.3.8 Interrupt Configuration Registers.
      
      The same issue was resolved for NSP systems in commit 5f1aa51c
      ("ARM: dts: NSP: Fix PPI interrupt types").
      
      Fixes: b9099ec7 ("ARM: dts: Add Broadcom Hurricane 2 DTS include file")
      Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      78b83e79
    • Vincent Stehlé's avatar
      ARM: dts: bcm2835-rpi-zero-w: Fix led polarity · 3d657b5c
      Vincent Stehlé authored
      [ Upstream commit 58bb90ab ]
      
      The status "ACT" LED on the Raspberry Pi Zero W is on when GPIO 47 is low.
      
      This has been verified on a board and somewhat confirmed by both the GPIO
      name ("STATUS_LED_N") and the reduced schematics [1].
      
      [1]: https://www.raspberrypi.org/documentation/hardware/raspberrypi/schematics/rpi_SCH_ZeroW_1p1_reduced.pdf
      
      Fixes: 2c7c040c ("ARM: dts: bcm2835: Add Raspberry Pi Zero W")
      Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
      Cc: Stefan Wahren <stefan.wahren@i2se.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3d657b5c