1. 22 Jul, 2020 27 commits
    • Andy Shevchenko's avatar
      i2c: eg20t: Load module automatically if ID matches · 206c9c97
      Andy Shevchenko authored
      [ Upstream commit 5f90786b ]
      
      The driver can't be loaded automatically because it misses
      module alias to be provided. Add corresponding MODULE_DEVICE_TABLE()
      call to the driver.
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarWolfram Sang <wsa@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      206c9c97
    • Bob Peterson's avatar
      gfs2: read-only mounts should grab the sd_freeze_gl glock · f8bc4ff1
      Bob Peterson authored
      [ Upstream commit b780cc61 ]
      
      Before this patch, only read-write mounts would grab the freeze
      glock in read-only mode, as part of gfs2_make_fs_rw. So the freeze
      glock was never initialized. That meant requests to freeze, which
      request the glock in EX, were granted without any state transition.
      That meant you could mount a gfs2 file system, which is currently
      frozen on a different cluster node, in read-only mode.
      
      This patch makes read-only mounts lock the freeze glock in SH mode,
      which will block for file systems that are frozen on another node.
      Signed-off-by: default avatarBob Peterson <rpeterso@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f8bc4ff1
    • Vasily Averin's avatar
      tpm_tis: extra chip->ops check on error path in tpm_tis_core_init · 01729f8d
      Vasily Averin authored
      [ Upstream commit ccf6fb85 ]
      
      Found by smatch:
      drivers/char/tpm/tpm_tis_core.c:1088 tpm_tis_core_init() warn:
       variable dereferenced before check 'chip->ops' (see line 979)
      
      'chip->ops' is assigned in the beginning of function
      in tpmm_chip_alloc->tpm_chip_alloc
      and is used before first possible goto to error path.
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Reviewed-by: default avatarJerry Snitselaar <jsnitsel@redhat.com>
      Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      01729f8d
    • Ard Biesheuvel's avatar
      arm64/alternatives: use subsections for replacement sequences · d6d91458
      Ard Biesheuvel authored
      [ Upstream commit f7b93d42 ]
      
      When building very large kernels, the logic that emits replacement
      sequences for alternatives fails when relative branches are present
      in the code that is emitted into the .altinstr_replacement section
      and patched in at the original site and fixed up. The reason is that
      the linker will insert veneers if relative branches go out of range,
      and due to the relative distance of the .altinstr_replacement from
      the .text section where its branch targets usually live, veneers
      may be emitted at the end of the .altinstr_replacement section, with
      the relative branches in the sequence pointed at the veneers instead
      of the actual target.
      
      The alternatives patching logic will attempt to fix up the branch to
      point to its original target, which will be the veneer in this case,
      but given that the patch site is likely to be far away as well, it
      will be out of range and so patching will fail. There are other cases
      where these veneers are problematic, e.g., when the target of the
      branch is in .text while the patch site is in .init.text, in which
      case putting the replacement sequence inside .text may not help either.
      
      So let's use subsections to emit the replacement code as closely as
      possible to the patch site, to ensure that veneers are only likely to
      be emitted if they are required at the patch site as well, in which
      case they will be in range for the replacement sequence both before
      and after it is transported to the patch site.
      
      This will prevent alternative sequences in non-init code from being
      released from memory after boot, but this is tolerable given that the
      entire section is only 512 KB on an allyesconfig build (which weighs in
      at 500+ MB for the entire Image). Also, note that modules today carry
      the replacement sequences in non-init sections as well, and any of
      those that target init code will be emitted into init sections after
      this change.
      
      This fixes an early crash when booting an allyesconfig kernel on a
      system where any of the alternatives sequences containing relative
      branches are activated at boot (e.g., ARM64_HAS_PAN on TX2)
      Signed-off-by: default avatarArd Biesheuvel <ardb@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Dave P Martin <dave.martin@arm.com>
      Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.orgSigned-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d6d91458
    • Angelo Dureghello's avatar
      m68k: mm: fix node memblock init · c24fb4f5
      Angelo Dureghello authored
      [ Upstream commit c43e5579 ]
      
      After pulling 5.7.0 (linux-next merge), mcf5441x mmu boot was
      hanging silently.
      
      memblock_add() seems not appropriate, since using MAX_NUMNODES
      as node id, while memblock_add_node() sets up memory for node id 0.
      Signed-off-by: default avatarAngelo Dureghello <angelo.dureghello@timesys.com>
      Signed-off-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: default avatarGreg Ungerer <gerg@linux-m68k.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c24fb4f5
    • Mike Rapoport's avatar
      m68k: nommu: register start of the memory with memblock · 0ea1f064
      Mike Rapoport authored
      [ Upstream commit d63bd8c8 ]
      
      The m68k nommu setup code didn't register the beginning of the physical
      memory with memblock because it was anyway occupied by the kernel. However,
      commit fa3354e4 ("mm: free_area_init: use maximal zone PFNs rather than
      zone sizes") changed zones initialization to use memblock.memory to detect
      the zone extents and this caused inconsistency between zone PFNs and the
      actual PFNs:
      
      BUG: Bad page state in process swapper  pfn:20165
      page:41fe0ca0 refcount:0 mapcount:1 mapping:00000000 index:0x0 flags: 0x0()
      raw: 00000000 00000100 00000122 00000000 00000000 00000000 00000000 00000000
      page dumped because: nonzero mapcount
      CPU: 0 PID: 1 Comm: swapper Not tainted 5.8.0-rc1-00001-g3a38f8a60c65-dirty #1
      Stack from 404c9ebc:
              404c9ebc 4029ab28 4029ab28 40088470 41fe0ca0 40299e21 40299df1 404ba2a4
              00020165 00000000 41fd2c10 402c7ba0 41fd2c04 40088504 41fe0ca0 40299e21
              00000000 40088a12 41fe0ca0 41fe0ca4 0000020a 00000000 00000001 402ca000
              00000000 41fe0ca0 41fd2c10 41fd2c10 00000000 00000000 402b2388 00000001
              400a0934 40091056 404c9f44 404c9f44 40088db4 402c7ba0 00000001 41fd2c04
              41fe0ca0 41fd2000 41fe0ca0 40089e02 4026ecf4 40089e4e 41fe0ca0 ffffffff
      Call Trace:
              [<40088470>] 0x40088470
       [<40088504>] 0x40088504
       [<40088a12>] 0x40088a12
       [<402ca000>] 0x402ca000
       [<400a0934>] 0x400a0934
      
      Adjust the memory registration with memblock to include the beginning of
      the physical memory and make sure that the area occupied by the kernel is
      marked as reserved.
      Signed-off-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: default avatarGreg Ungerer <gerg@linux-m68k.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0ea1f064
    • Navid Emamdoost's avatar
      drm/exynos: fix ref count leak in mic_pre_enable · b2cde935
      Navid Emamdoost authored
      [ Upstream commit d4f5a095 ]
      
      in mic_pre_enable, pm_runtime_get_sync is called which
      increments the counter even in case of failure, leading to incorrect
      ref count. In case of failure, decrement the ref count before returning.
      Signed-off-by: default avatarNavid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: default avatarInki Dae <inki.dae@samsung.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b2cde935
    • Bernard Zhao's avatar
      drm/msm: fix potential memleak in error branch · bf59aa00
      Bernard Zhao authored
      [ Upstream commit 177d3819 ]
      
      In function msm_submitqueue_create, the queue is a local
      variable, in return -EINVAL branch, queue didn`t add to ctx`s
      list yet, and also didn`t kfree, this maybe bring in potential
      memleak.
      Signed-off-by: default avatarBernard Zhao <bernard@vivo.com>
      [trivial commit msg fixup]
      Signed-off-by: default avatarRob Clark <robdclark@chromium.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bf59aa00
    • Toke Høiland-Jørgensen's avatar
      vlan: consolidate VLAN parsing code and limit max parsing depth · d4d0e6c0
      Toke Høiland-Jørgensen authored
      [ Upstream commit 469acedd ]
      
      Toshiaki pointed out that we now have two very similar functions to extract
      the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
      that the unbounded parsing loop makes it possible for maliciously crafted
      packets to loop through potentially hundreds of tags.
      
      Fix both of these issues by consolidating the two parsing functions and
      limiting the VLAN tag parsing to a max depth of 8 tags. As part of this,
      switch over __vlan_get_protocol() to use skb_header_pointer() instead of
      pskb_may_pull(), to avoid the possible side effects of the latter and keep
      the skb pointer 'const' through all the parsing functions.
      
      v2:
      - Use limit of 8 tags instead of 32 (matching XMIT_RECURSION_LIMIT)
      Reported-by: default avatarToshiaki Makita <toshiaki.makita1@gmail.com>
      Reported-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Fixes: d7bf2ebe ("sched: consistently handle layer3 header accesses in the presence of VLANs")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d4d0e6c0
    • Toke Høiland-Jørgensen's avatar
      sched: consistently handle layer3 header accesses in the presence of VLANs · 9fd235ff
      Toke Høiland-Jørgensen authored
      [ Upstream commit d7bf2ebe ]
      
      There are a couple of places in net/sched/ that check skb->protocol and act
      on the value there. However, in the presence of VLAN tags, the value stored
      in skb->protocol can be inconsistent based on whether VLAN acceleration is
      enabled. The commit quoted in the Fixes tag below fixed the users of
      skb->protocol to use a helper that will always see the VLAN ethertype.
      
      However, most of the callers don't actually handle the VLAN ethertype, but
      expect to find the IP header type in the protocol field. This means that
      things like changing the ECN field, or parsing diffserv values, stops
      working if there's a VLAN tag, or if there are multiple nested VLAN
      tags (QinQ).
      
      To fix this, change the helper to take an argument that indicates whether
      the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
      make sure to skip all of them, so behaviour is consistent even in QinQ
      mode.
      
      To make the helper usable from the ECN code, move it to if_vlan.h instead
      of pkt_sched.h.
      
      v3:
      - Remove empty lines
      - Move vlan variable definitions inside loop in skb_protocol()
      - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
        bpf_skb_ecn_set_ce()
      
      v2:
      - Use eth_type_vlan() helper in skb_protocol()
      - Also fix code that reads skb->protocol directly
      - Change a couple of 'if/else if' statements to switch constructs to avoid
        calling the helper twice
      Reported-by: default avatarIlya Ponetayev <i.ponetaev@ndmsystems.com>
      Fixes: d8b9605d ("net: sched: fix skb->protocol use in case of accelerated vlan path")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fd235ff
    • Cong Wang's avatar
      cgroup: Fix sock_cgroup_data on big-endian. · 6d584c6e
      Cong Wang authored
      [ Upstream commit 14b032b8 ]
      
      In order for no_refcnt and is_data to be the lowest order two
      bits in the 'val' we have to pad out the bitfield of the u8.
      
      Fixes: ad0f75e5 ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()")
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d584c6e
    • Cong Wang's avatar
      cgroup: fix cgroup_sk_alloc() for sk_clone_lock() · 0505cc4c
      Cong Wang authored
      [ Upstream commit ad0f75e5 ]
      
      When we clone a socket in sk_clone_lock(), its sk_cgrp_data is
      copied, so the cgroup refcnt must be taken too. And, unlike the
      sk_alloc() path, sock_update_netprioidx() is not called here.
      Therefore, it is safe and necessary to grab the cgroup refcnt
      even when cgroup_sk_alloc is disabled.
      
      sk_clone_lock() is in BH context anyway, the in_interrupt()
      would terminate this function if called there. And for sk_alloc()
      skcd->val is always zero. So it's safe to factor out the code
      to make it more readable.
      
      The global variable 'cgroup_sk_alloc_disabled' is used to determine
      whether to take these reference counts. It is impossible to make
      the reference counting correct unless we save this bit of information
      in skcd->val. So, add a new bit there to record whether the socket
      has already taken the reference counts. This obviously relies on
      kmalloc() to align cgroup pointers to at least 4 bytes,
      ARCH_KMALLOC_MINALIGN is certainly larger than that.
      
      This bug seems to be introduced since the beginning, commit
      d979a39d ("cgroup: duplicate cgroup reference when cloning sockets")
      tried to fix it but not compeletely. It seems not easy to trigger until
      the recent commit 090e28b2
      ("netprio_cgroup: Fix unlimited memory leak of v2 cgroups") was merged.
      
      Fixes: bd1060a1 ("sock, cgroup: add sock->sk_cgroup")
      Reported-by: default avatarCameron Berkenpas <cam@neo-zeon.de>
      Reported-by: default avatarPeter Geis <pgwipeout@gmail.com>
      Reported-by: default avatarLu Fengqi <lufq.fnst@cn.fujitsu.com>
      Reported-by: default avatarDaniël Sonck <dsonck92@gmail.com>
      Reported-by: default avatarZhang Qiang <qiang.zhang@windriver.com>
      Tested-by: default avatarCameron Berkenpas <cam@neo-zeon.de>
      Tested-by: default avatarPeter Geis <pgwipeout@gmail.com>
      Tested-by: default avatarThomas Lamprecht <t.lamprecht@proxmox.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Roman Gushchin <guro@fb.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0505cc4c
    • Eric Dumazet's avatar
      tcp: md5: allow changing MD5 keys in all socket states · 81b65501
      Eric Dumazet authored
      [ Upstream commit 1ca0fafd ]
      
      This essentially reverts commit 72123032 ("tcp: md5: reject TCP_MD5SIG
      or TCP_MD5SIG_EXT on established sockets")
      
      Mathieu reported that many vendors BGP implementations can
      actually switch TCP MD5 on established flows.
      
      Quoting Mathieu :
         Here is a list of a few network vendors along with their behavior
         with respect to TCP MD5:
      
         - Cisco: Allows for password to be changed, but within the hold-down
           timer (~180 seconds).
         - Juniper: When password is initially set on active connection it will
           reset, but after that any subsequent password changes no network
           resets.
         - Nokia: No notes on if they flap the tcp connection or not.
         - Ericsson/RedBack: Allows for 2 password (old/new) to co-exist until
           both sides are ok with new passwords.
         - Meta-Switch: Expects the password to be set before a connection is
           attempted, but no further info on whether they reset the TCP
           connection on a change.
         - Avaya: Disable the neighbor, then set password, then re-enable.
         - Zebos: Would normally allow the change when socket connected.
      
      We can revert my prior change because commit 9424e2e7 ("tcp: md5: fix potential
      overestimation of TCP option space") removed the leak of 4 kernel bytes to
      the wire that was the main reason for my patch.
      
      While doing my investigations, I found a bug when a MD5 key is changed, leading
      to these commits that stable teams want to consider before backporting this revert :
      
       Commit 6a2febec ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
       Commit e6ced831 ("tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers")
      
      Fixes: 72123032 "tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets"
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81b65501
    • Eric Dumazet's avatar
      tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers · 797a053e
      Eric Dumazet authored
      [ Upstream commit e6ced831 ]
      
      My prior fix went a bit too far, according to Herbert and Mathieu.
      
      Since we accept that concurrent TCP MD5 lookups might see inconsistent
      keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
      
      Clearing all key->key[] is needed to avoid possible KMSAN reports,
      if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
      using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
      
      data_race() was added in linux-5.8 and will prevent KCSAN reports,
      this can safely be removed in stable backports, if data_race() is
      not yet backported.
      
      v2: use data_race() both in tcp_md5_hash_key() and tcp_md5_do_add()
      
      Fixes: 6a2febec ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Marco Elver <elver@google.com>
      Reviewed-by: default avatarMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      797a053e
    • Eric Dumazet's avatar
      tcp: md5: do not send silly options in SYNCOOKIES · c83943a7
      Eric Dumazet authored
      [ Upstream commit e114e1e8 ]
      
      Whenever cookie_init_timestamp() has been used to encode
      ECN,SACK,WSCALE options, we can not remove the TS option in the SYNACK.
      
      Otherwise, tcp_synack_options() will still advertize options like WSCALE
      that we can not deduce later when receiving the packet from the client
      to complete 3WHS.
      
      Note that modern linux TCP stacks wont use MD5+TS+SACK in a SYN packet,
      but we can not know for sure that all TCP stacks have the same logic.
      
      Before the fix a tcpdump would exhibit this wrong exchange :
      
      10:12:15.464591 IP C > S: Flags [S], seq 4202415601, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 456965269 ecr 0,nop,wscale 8], length 0
      10:12:15.464602 IP S > C: Flags [S.], seq 253516766, ack 4202415602, win 65535, options [nop,nop,md5 valid,mss 1400,nop,nop,sackOK,nop,wscale 8], length 0
      10:12:15.464611 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid], length 0
      10:12:15.464678 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid], length 12
      10:12:15.464685 IP S > C: Flags [.], ack 13, win 65535, options [nop,nop,md5 valid], length 0
      
      After this patch the exchange looks saner :
      
      11:59:59.882990 IP C > S: Flags [S], seq 517075944, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508483 ecr 0,nop,wscale 8], length 0
      11:59:59.883002 IP S > C: Flags [S.], seq 1902939253, ack 517075945, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508479 ecr 1751508483,nop,wscale 8], length 0
      11:59:59.883012 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 0
      11:59:59.883114 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 12
      11:59:59.883122 IP S > C: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508483], length 0
      11:59:59.883152 IP S > C: Flags [P.], seq 1:13, ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508483], length 12
      11:59:59.883170 IP C > S: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508484], length 0
      
      Of course, no SACK block will ever be added later, but nothing should break.
      Technically, we could remove the 4 nops included in MD5+TS options,
      but again some stacks could break seeing not conventional alignment.
      
      Fixes: 4957faad ("TCPCT part 1g: Responder Cookie => Initiator")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c83943a7
    • Eric Dumazet's avatar
      tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key() · ee66c2d1
      Eric Dumazet authored
      [ Upstream commit 6a2febec ]
      
      MD5 keys are read with RCU protection, and tcp_md5_do_add()
      might update in-place a prior key.
      
      Normally, typical RCU updates would allocate a new piece
      of memory. In this case only key->key and key->keylen might
      be updated, and we do not care if an incoming packet could
      see the old key, the new one, or some intermediate value,
      since changing the key on a live flow is known to be problematic
      anyway.
      
      We only want to make sure that in the case key->keylen
      is changed, cpus in tcp_md5_hash_key() wont try to use
      uninitialized data, or crash because key->keylen was
      read twice to feed sg_init_one() and ahash_request_set_crypt()
      
      Fixes: 9ea88a15 ("tcp: md5: check md5 signature without socket lock")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee66c2d1
    • Christoph Paasch's avatar
      tcp: make sure listeners don't initialize congestion-control state · 030a998a
      Christoph Paasch authored
      [ Upstream commit ce69e563 ]
      
      syzkaller found its way into setsockopt with TCP_CONGESTION "cdg".
      tcp_cdg_init() does a kcalloc to store the gradients. As sk_clone_lock
      just copies all the memory, the allocated pointer will be copied as
      well, if the app called setsockopt(..., TCP_CONGESTION) on the listener.
      If now the socket will be destroyed before the congestion-control
      has properly been initialized (through a call to tcp_init_transfer), we
      will end up freeing memory that does not belong to that particular
      socket, opening the door to a double-free:
      
      [   11.413102] ==================================================================
      [   11.414181] BUG: KASAN: double-free or invalid-free in tcp_cleanup_congestion_control+0x58/0xd0
      [   11.415329]
      [   11.415560] CPU: 3 PID: 4884 Comm: syz-executor.5 Not tainted 5.8.0-rc2 #80
      [   11.416544] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [   11.418148] Call Trace:
      [   11.418534]  <IRQ>
      [   11.418834]  dump_stack+0x7d/0xb0
      [   11.419297]  print_address_description.constprop.0+0x1a/0x210
      [   11.422079]  kasan_report_invalid_free+0x51/0x80
      [   11.423433]  __kasan_slab_free+0x15e/0x170
      [   11.424761]  kfree+0x8c/0x230
      [   11.425157]  tcp_cleanup_congestion_control+0x58/0xd0
      [   11.425872]  tcp_v4_destroy_sock+0x57/0x5a0
      [   11.426493]  inet_csk_destroy_sock+0x153/0x2c0
      [   11.427093]  tcp_v4_syn_recv_sock+0xb29/0x1100
      [   11.427731]  tcp_get_cookie_sock+0xc3/0x4a0
      [   11.429457]  cookie_v4_check+0x13d0/0x2500
      [   11.433189]  tcp_v4_do_rcv+0x60e/0x780
      [   11.433727]  tcp_v4_rcv+0x2869/0x2e10
      [   11.437143]  ip_protocol_deliver_rcu+0x23/0x190
      [   11.437810]  ip_local_deliver+0x294/0x350
      [   11.439566]  __netif_receive_skb_one_core+0x15d/0x1a0
      [   11.441995]  process_backlog+0x1b1/0x6b0
      [   11.443148]  net_rx_action+0x37e/0xc40
      [   11.445361]  __do_softirq+0x18c/0x61a
      [   11.445881]  asm_call_on_stack+0x12/0x20
      [   11.446409]  </IRQ>
      [   11.446716]  do_softirq_own_stack+0x34/0x40
      [   11.447259]  do_softirq.part.0+0x26/0x30
      [   11.447827]  __local_bh_enable_ip+0x46/0x50
      [   11.448406]  ip_finish_output2+0x60f/0x1bc0
      [   11.450109]  __ip_queue_xmit+0x71c/0x1b60
      [   11.451861]  __tcp_transmit_skb+0x1727/0x3bb0
      [   11.453789]  tcp_rcv_state_process+0x3070/0x4d3a
      [   11.456810]  tcp_v4_do_rcv+0x2ad/0x780
      [   11.457995]  __release_sock+0x14b/0x2c0
      [   11.458529]  release_sock+0x4a/0x170
      [   11.459005]  __inet_stream_connect+0x467/0xc80
      [   11.461435]  inet_stream_connect+0x4e/0xa0
      [   11.462043]  __sys_connect+0x204/0x270
      [   11.465515]  __x64_sys_connect+0x6a/0xb0
      [   11.466088]  do_syscall_64+0x3e/0x70
      [   11.466617]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   11.467341] RIP: 0033:0x7f56046dc469
      [   11.467844] Code: Bad RIP value.
      [   11.468282] RSP: 002b:00007f5604dccdd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      [   11.469326] RAX: ffffffffffffffda RBX: 000000000068bf00 RCX: 00007f56046dc469
      [   11.470379] RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000004
      [   11.471311] RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
      [   11.472286] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      [   11.473341] R13: 000000000041427c R14: 00007f5604dcd5c0 R15: 0000000000000003
      [   11.474321]
      [   11.474527] Allocated by task 4884:
      [   11.475031]  save_stack+0x1b/0x40
      [   11.475548]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [   11.476182]  tcp_cdg_init+0xf0/0x150
      [   11.476744]  tcp_init_congestion_control+0x9b/0x3a0
      [   11.477435]  tcp_set_congestion_control+0x270/0x32f
      [   11.478088]  do_tcp_setsockopt.isra.0+0x521/0x1a00
      [   11.478744]  __sys_setsockopt+0xff/0x1e0
      [   11.479259]  __x64_sys_setsockopt+0xb5/0x150
      [   11.479895]  do_syscall_64+0x3e/0x70
      [   11.480395]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [   11.481097]
      [   11.481321] Freed by task 4872:
      [   11.481783]  save_stack+0x1b/0x40
      [   11.482230]  __kasan_slab_free+0x12c/0x170
      [   11.482839]  kfree+0x8c/0x230
      [   11.483240]  tcp_cleanup_congestion_control+0x58/0xd0
      [   11.483948]  tcp_v4_destroy_sock+0x57/0x5a0
      [   11.484502]  inet_csk_destroy_sock+0x153/0x2c0
      [   11.485144]  tcp_close+0x932/0xfe0
      [   11.485642]  inet_release+0xc1/0x1c0
      [   11.486131]  __sock_release+0xc0/0x270
      [   11.486697]  sock_close+0xc/0x10
      [   11.487145]  __fput+0x277/0x780
      [   11.487632]  task_work_run+0xeb/0x180
      [   11.488118]  __prepare_exit_to_usermode+0x15a/0x160
      [   11.488834]  do_syscall_64+0x4a/0x70
      [   11.489326]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Wei Wang fixed a part of these CDG-malloc issues with commit c1201444
      ("tcp: memset ca_priv data to 0 properly").
      
      This patch here fixes the listener-scenario: We make sure that listeners
      setting the congestion-control through setsockopt won't initialize it
      (thus CDG never allocates on listeners). For those who use AF_UNSPEC to
      reuse a socket, tcp_disconnect() is changed to cleanup afterwards.
      
      (The issue can be reproduced at least down to v4.4.x.)
      
      Cc: Wei Wang <weiwan@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Fixes: 2b0a8c9e ("tcp: add CDG congestion control")
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      030a998a
    • Eric Dumazet's avatar
      tcp: fix SO_RCVLOWAT possible hangs under high mem pressure · 721e5f54
      Eric Dumazet authored
      [ Upstream commit ba3bb0e7 ]
      
      Whenever tcp_try_rmem_schedule() returns an error, we are under
      trouble and should make sure to wakeup readers so that they
      can drain socket queues and eventually make room.
      
      Fixes: 03f45c88 ("tcp: avoid extra wakeups for SO_RCVLOWAT users")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      721e5f54
    • AceLan Kao's avatar
      net: usb: qmi_wwan: add support for Quectel EG95 LTE modem · aedd1023
      AceLan Kao authored
      [ Upstream commit f815dd5c ]
      
      Add support for Quectel Wireless Solutions Co., Ltd. EG95 LTE modem
      
      T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  5 Spd=480 MxCh= 0
      D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=2c7c ProdID=0195 Rev=03.18
      S:  Manufacturer=Android
      S:  Product=Android
      C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
      I:  If#=0x0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      Signed-off-by: default avatarAceLan Kao <acelan.kao@canonical.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      aedd1023
    • Cong Wang's avatar
      net_sched: fix a memory leak in atm_tc_init() · 1343c539
      Cong Wang authored
      [ Upstream commit 306381ae ]
      
      When tcf_block_get() fails inside atm_tc_init(),
      atm_tc_put() is called to release the qdisc p->link.q.
      But the flow->ref prevents it to do so, as the flow->ref
      is still zero.
      
      Fix this by moving the p->link.ref initialization before
      tcf_block_get().
      
      Fixes: 6529eaba ("net: sched: introduce tcf block infractructure")
      Reported-and-tested-by: syzbot+d411cff6ab29cc2c311b@syzkaller.appspotmail.com
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1343c539
    • Martin Varghese's avatar
      net: Added pointer check for dst->ops->neigh_lookup in dst_neigh_lookup_skb · 5c6e5496
      Martin Varghese authored
      [ Upstream commit 394de110 ]
      
      The packets from tunnel devices (eg bareudp) may have only
      metadata in the dst pointer of skb. Hence a pointer check of
      neigh_lookup is needed in dst_neigh_lookup_skb
      
      Kernel crashes when packets from bareudp device is processed in
      the kernel neighbour subsytem.
      
      [  133.384484] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [  133.385240] #PF: supervisor instruction fetch in kernel mode
      [  133.385828] #PF: error_code(0x0010) - not-present page
      [  133.386603] PGD 0 P4D 0
      [  133.386875] Oops: 0010 [#1] SMP PTI
      [  133.387275] CPU: 0 PID: 5045 Comm: ping Tainted: G        W         5.8.0-rc2+ #15
      [  133.388052] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  133.391076] RIP: 0010:0x0
      [  133.392401] Code: Bad RIP value.
      [  133.394029] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246
      [  133.396656] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00
      [  133.399018] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400
      [  133.399685] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000
      [  133.400350] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400
      [  133.401010] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003
      [  133.401667] FS:  00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000
      [  133.402412] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  133.402948] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0
      [  133.403611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  133.404270] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  133.404933] Call Trace:
      [  133.405169]  <IRQ>
      [  133.405367]  __neigh_update+0x5a4/0x8f0
      [  133.405734]  arp_process+0x294/0x820
      [  133.406076]  ? __netif_receive_skb_core+0x866/0xe70
      [  133.406557]  arp_rcv+0x129/0x1c0
      [  133.406882]  __netif_receive_skb_one_core+0x95/0xb0
      [  133.407340]  process_backlog+0xa7/0x150
      [  133.407705]  net_rx_action+0x2af/0x420
      [  133.408457]  __do_softirq+0xda/0x2a8
      [  133.408813]  asm_call_on_stack+0x12/0x20
      [  133.409290]  </IRQ>
      [  133.409519]  do_softirq_own_stack+0x39/0x50
      [  133.410036]  do_softirq+0x50/0x60
      [  133.410401]  __local_bh_enable_ip+0x50/0x60
      [  133.410871]  ip_finish_output2+0x195/0x530
      [  133.411288]  ip_output+0x72/0xf0
      [  133.411673]  ? __ip_finish_output+0x1f0/0x1f0
      [  133.412122]  ip_send_skb+0x15/0x40
      [  133.412471]  raw_sendmsg+0x853/0xab0
      [  133.412855]  ? insert_pfn+0xfe/0x270
      [  133.413827]  ? vvar_fault+0xec/0x190
      [  133.414772]  sock_sendmsg+0x57/0x80
      [  133.415685]  __sys_sendto+0xdc/0x160
      [  133.416605]  ? syscall_trace_enter+0x1d4/0x2b0
      [  133.417679]  ? __audit_syscall_exit+0x1d9/0x280
      [  133.418753]  ? __prepare_exit_to_usermode+0x5d/0x1a0
      [  133.419819]  __x64_sys_sendto+0x24/0x30
      [  133.420848]  do_syscall_64+0x4d/0x90
      [  133.421768]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  133.422833] RIP: 0033:0x7fe013689c03
      [  133.423749] Code: Bad RIP value.
      [  133.424624] RSP: 002b:00007ffc7288f418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [  133.425940] RAX: ffffffffffffffda RBX: 000056151fc63720 RCX: 00007fe013689c03
      [  133.427225] RDX: 0000000000000040 RSI: 000056151fc63720 RDI: 0000000000000003
      [  133.428481] RBP: 00007ffc72890b30 R08: 000056151fc60500 R09: 0000000000000010
      [  133.429757] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
      [  133.431041] R13: 000056151fc636e0 R14: 000056151fc616bc R15: 0000000000000080
      [  133.432481] Modules linked in: mpls_iptunnel act_mirred act_tunnel_key cls_flower sch_ingress veth mpls_router ip_tunnel bareudp ip6_udp_tunnel udp_tunnel macsec udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xt_MASQUERADE iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc ebtable_filter ebtables overlay ip6table_filter ip6_tables iptable_filter sunrpc ext4 mbcache jbd2 pcspkr i2c_piix4 virtio_balloon joydev ip_tables xfs libcrc32c ata_generic qxl pata_acpi drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ata_piix libata virtio_net net_failover virtio_console failover virtio_blk i2c_core virtio_pci virtio_ring serio_raw floppy virtio dm_mirror dm_region_hash dm_log dm_mod
      [  133.444045] CR2: 0000000000000000
      [  133.445082] ---[ end trace f4aeee1958fd1638 ]---
      [  133.446236] RIP: 0010:0x0
      [  133.447180] Code: Bad RIP value.
      [  133.448152] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246
      [  133.449363] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00
      [  133.450835] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400
      [  133.452237] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000
      [  133.453722] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400
      [  133.455149] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003
      [  133.456520] FS:  00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000
      [  133.458046] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  133.459342] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0
      [  133.460782] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  133.462240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  133.463697] Kernel panic - not syncing: Fatal exception in interrupt
      [  133.465226] Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [  133.467025] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      Fixes: aaa0c23c ("Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug")
      Signed-off-by: default avatarMartin Varghese <martin.varghese@nokia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c6e5496
    • Eric Dumazet's avatar
      llc: make sure applications use ARPHRD_ETHER · f58328d7
      Eric Dumazet authored
      [ Upstream commit a9b11101 ]
      
      syzbot was to trigger a bug by tricking AF_LLC with
      non sensible addr->sllc_arphrd
      
      It seems clear LLC requires an Ethernet device.
      
      Back in commit abf9d537 ("llc: add support for SO_BINDTODEVICE")
      Octavian Purdila added possibility for application to use a zero
      value for sllc_arphrd, convert it to ARPHRD_ETHER to not cause
      regressions on existing applications.
      
      BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:199 [inline]
      BUG: KASAN: use-after-free in list_empty include/linux/list.h:268 [inline]
      BUG: KASAN: use-after-free in waitqueue_active include/linux/wait.h:126 [inline]
      BUG: KASAN: use-after-free in wq_has_sleeper include/linux/wait.h:160 [inline]
      BUG: KASAN: use-after-free in skwq_has_sleeper include/net/sock.h:2092 [inline]
      BUG: KASAN: use-after-free in sock_def_write_space+0x642/0x670 net/core/sock.c:2813
      Read of size 8 at addr ffff88801e0b4078 by task ksoftirqd/3/27
      
      CPU: 3 PID: 27 Comm: ksoftirqd/3 Not tainted 5.5.0-rc1-syzkaller #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:639
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
       __read_once_size include/linux/compiler.h:199 [inline]
       list_empty include/linux/list.h:268 [inline]
       waitqueue_active include/linux/wait.h:126 [inline]
       wq_has_sleeper include/linux/wait.h:160 [inline]
       skwq_has_sleeper include/net/sock.h:2092 [inline]
       sock_def_write_space+0x642/0x670 net/core/sock.c:2813
       sock_wfree+0x1e1/0x260 net/core/sock.c:1958
       skb_release_head_state+0xeb/0x260 net/core/skbuff.c:652
       skb_release_all+0x16/0x60 net/core/skbuff.c:663
       __kfree_skb net/core/skbuff.c:679 [inline]
       consume_skb net/core/skbuff.c:838 [inline]
       consume_skb+0xfb/0x410 net/core/skbuff.c:832
       __dev_kfree_skb_any+0xa4/0xd0 net/core/dev.c:2967
       dev_kfree_skb_any include/linux/netdevice.h:3650 [inline]
       e1000_unmap_and_free_tx_resource.isra.0+0x21b/0x3a0 drivers/net/ethernet/intel/e1000/e1000_main.c:1963
       e1000_clean_tx_irq drivers/net/ethernet/intel/e1000/e1000_main.c:3854 [inline]
       e1000_clean+0x4cc/0x1d10 drivers/net/ethernet/intel/e1000/e1000_main.c:3796
       napi_poll net/core/dev.c:6532 [inline]
       net_rx_action+0x508/0x1120 net/core/dev.c:6600
       __do_softirq+0x262/0x98c kernel/softirq.c:292
       run_ksoftirqd kernel/softirq.c:603 [inline]
       run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595
       smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165
       kthread+0x361/0x430 kernel/kthread.c:255
       ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
      
      Allocated by task 8247:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       __kasan_kmalloc mm/kasan/common.c:513 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
       kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:521
       slab_post_alloc_hook mm/slab.h:584 [inline]
       slab_alloc mm/slab.c:3320 [inline]
       kmem_cache_alloc+0x121/0x710 mm/slab.c:3484
       sock_alloc_inode+0x1c/0x1d0 net/socket.c:240
       alloc_inode+0x68/0x1e0 fs/inode.c:230
       new_inode_pseudo+0x19/0xf0 fs/inode.c:919
       sock_alloc+0x41/0x270 net/socket.c:560
       __sock_create+0xc2/0x730 net/socket.c:1384
       sock_create net/socket.c:1471 [inline]
       __sys_socket+0x103/0x220 net/socket.c:1513
       __do_sys_socket net/socket.c:1522 [inline]
       __se_sys_socket net/socket.c:1520 [inline]
       __ia32_sys_socket+0x73/0xb0 net/socket.c:1520
       do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
       do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
       entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
      
      Freed by task 17:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       kasan_set_free_info mm/kasan/common.c:335 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
       __cache_free mm/slab.c:3426 [inline]
       kmem_cache_free+0x86/0x320 mm/slab.c:3694
       sock_free_inode+0x20/0x30 net/socket.c:261
       i_callback+0x44/0x80 fs/inode.c:219
       __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
       rcu_do_batch kernel/rcu/tree.c:2183 [inline]
       rcu_core+0x570/0x1540 kernel/rcu/tree.c:2408
       rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2417
       __do_softirq+0x262/0x98c kernel/softirq.c:292
      
      The buggy address belongs to the object at ffff88801e0b4000
       which belongs to the cache sock_inode_cache of size 1152
      The buggy address is located 120 bytes inside of
       1152-byte region [ffff88801e0b4000, ffff88801e0b4480)
      The buggy address belongs to the page:
      page:ffffea0000782d00 refcount:1 mapcount:0 mapping:ffff88807aa59c40 index:0xffff88801e0b4ffd
      raw: 00fffe0000000200 ffffea00008e6c88 ffffea0000782d48 ffff88807aa59c40
      raw: ffff88801e0b4ffd ffff88801e0b4000 0000000100000003 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88801e0b3f00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff88801e0b3f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff88801e0b4000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                      ^
       ffff88801e0b4080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff88801e0b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: abf9d537 ("llc: add support for SO_BINDTODEVICE")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f58328d7
    • Xin Long's avatar
      l2tp: remove skb_dst_set() from l2tp_xmit_skb() · a0400bbe
      Xin Long authored
      [ Upstream commit 27d53323 ]
      
      In the tx path of l2tp, l2tp_xmit_skb() calls skb_dst_set() to set
      skb's dst. However, it will eventually call inet6_csk_xmit() or
      ip_queue_xmit() where skb's dst will be overwritten by:
      
         skb_dst_set_noref(skb, dst);
      
      without releasing the old dst in skb. Then it causes dst/dev refcnt leak:
      
        unregister_netdevice: waiting for eth0 to become free. Usage count = 1
      
      This can be reproduced by simply running:
      
        # modprobe l2tp_eth && modprobe l2tp_ip
        # sh ./tools/testing/selftests/net/l2tp.sh
      
      So before going to inet6_csk_xmit() or ip_queue_xmit(), skb's dst
      should be dropped. This patch is to fix it by removing skb_dst_set()
      from l2tp_xmit_skb() and moving skb_dst_drop() into l2tp_xmit_core().
      
      Fixes: 3557baab ("[L2TP]: PPP over L2TP driver core")
      Reported-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarJames Chapman <jchapman@katalix.com>
      Tested-by: default avatarJames Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a0400bbe
    • Sabrina Dubroca's avatar
      ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg · 5e8f49ed
      Sabrina Dubroca authored
      [ Upstream commit 5eff0690 ]
      
      IPv4 ping sockets don't set fl4.fl4_icmp_{type,code}, which leads to
      incomplete IPsec ACQUIRE messages being sent to userspace. Currently,
      both raw sockets and IPv6 ping sockets set those fields.
      
      Expected output of "ip xfrm monitor":
          acquire proto esp
            sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 8 code 0 dev ens4
            policy src 10.0.2.15/32 dst 8.8.8.8/32
              <snip>
      
      Currently with ping sockets:
          acquire proto esp
            sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 0 code 0 dev ens4
            policy src 10.0.2.15/32 dst 8.8.8.8/32
              <snip>
      
      The Libreswan test suite found this problem after Fedora changed the
      value for the sysctl net.ipv4.ping_group_range.
      
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Reported-by: default avatarPaul Wouters <pwouters@redhat.com>
      Tested-by: default avatarPaul Wouters <pwouters@redhat.com>
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e8f49ed
    • Sean Tranchetti's avatar
      genetlink: remove genl_bind · 0845447d
      Sean Tranchetti authored
      [ Upstream commit 1e82a62f ]
      
      A potential deadlock can occur during registering or unregistering a
      new generic netlink family between the main nl_table_lock and the
      cb_lock where each thread wants the lock held by the other, as
      demonstrated below.
      
      1) Thread 1 is performing a netlink_bind() operation on a socket. As part
         of this call, it will call netlink_lock_table(), incrementing the
         nl_table_users count to 1.
      2) Thread 2 is registering (or unregistering) a genl_family via the
         genl_(un)register_family() API. The cb_lock semaphore will be taken for
         writing.
      3) Thread 1 will call genl_bind() as part of the bind operation to handle
         subscribing to GENL multicast groups at the request of the user. It will
         attempt to take the cb_lock semaphore for reading, but it will fail and
         be scheduled away, waiting for Thread 2 to finish the write.
      4) Thread 2 will call netlink_table_grab() during the (un)registration
         call. However, as Thread 1 has incremented nl_table_users, it will not
         be able to proceed, and both threads will be stuck waiting for the
         other.
      
      genl_bind() is a noop, unless a genl_family implements the mcast_bind()
      function to handle setting up family-specific multicast operations. Since
      no one in-tree uses this functionality as Cong pointed out, simply removing
      the genl_bind() function will remove the possibility for deadlock, as there
      is no attempt by Thread 1 above to take the cb_lock semaphore.
      
      Fixes: c380d9a7 ("genetlink: pass multicast bind/unbind to families")
      Suggested-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Signed-off-by: default avatarSean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0845447d
    • Taehee Yoo's avatar
      net: rmnet: fix lower interface leak · cf7318b7
      Taehee Yoo authored
      commit 2a762e9e upstream.
      
      There are two types of the lower interface of rmnet that are VND
      and BRIDGE.
      Each lower interface can have only one type either VND or BRIDGE.
      But, there is a case, which uses both lower interface types.
      Due to this unexpected behavior, lower interface leak occurs.
      
      Test commands:
          ip link add dummy0 type dummy
          ip link add dummy1 type dummy
          ip link add rmnet0 link dummy0 type rmnet mux_id 1
          ip link set dummy1 master rmnet0
          ip link add rmnet1 link dummy1 type rmnet mux_id 2
          ip link del rmnet0
      
      The dummy1 was attached as BRIDGE interface of rmnet0.
      Then, it also was attached as VND interface of rmnet1.
      This is unexpected behavior and there is no code for handling this case.
      So that below splat occurs when the rmnet0 interface is deleted.
      
      Splat looks like:
      [   53.254112][    C1] WARNING: CPU: 1 PID: 1192 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
      [   53.254117][    C1] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nfx
      [   53.254182][    C1] CPU: 1 PID: 1192 Comm: ip Not tainted 5.8.0-rc1+ #620
      [   53.254188][    C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   53.254192][    C1] RIP: 0010:rollback_registered_many+0x986/0xcf0
      [   53.254200][    C1] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 0f 0b e5
      [   53.254205][    C1] RSP: 0018:ffff888050a5f2e0 EFLAGS: 00010287
      [   53.254214][    C1] RAX: ffff88805756d658 RBX: ffff88804d99c000 RCX: ffffffff8329d323
      [   53.254219][    C1] RDX: 1ffffffff0be6410 RSI: 0000000000000008 RDI: ffffffff85f32080
      [   53.254223][    C1] RBP: dffffc0000000000 R08: fffffbfff0be6411 R09: fffffbfff0be6411
      [   53.254228][    C1] R10: ffffffff85f32087 R11: 0000000000000001 R12: ffff888050a5f480
      [   53.254233][    C1] R13: ffff88804d99c0b8 R14: ffff888050a5f400 R15: ffff8880548ebe40
      [   53.254238][    C1] FS:  00007f6b86b370c0(0000) GS:ffff88806c200000(0000) knlGS:0000000000000000
      [   53.254243][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   53.254248][    C1] CR2: 0000562c62438758 CR3: 000000003f600005 CR4: 00000000000606e0
      [   53.254253][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   53.254257][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   53.254261][    C1] Call Trace:
      [   53.254266][    C1]  ? lockdep_hardirqs_on_prepare+0x379/0x540
      [   53.254270][    C1]  ? netif_set_real_num_tx_queues+0x780/0x780
      [   53.254275][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
      [   53.254279][    C1]  ? __kasan_slab_free+0x126/0x150
      [   53.254283][    C1]  ? kfree+0xdc/0x320
      [   53.254288][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
      [   53.254293][    C1]  unregister_netdevice_many.part.135+0x13/0x1b0
      [   53.254297][    C1]  rtnl_delete_link+0xbc/0x100
      [   53.254301][    C1]  ? rtnl_af_register+0xc0/0xc0
      [   53.254305][    C1]  rtnl_dellink+0x2dc/0x840
      [   53.254309][    C1]  ? find_held_lock+0x39/0x1d0
      [   53.254314][    C1]  ? valid_fdb_dump_strict+0x620/0x620
      [   53.254318][    C1]  ? rtnetlink_rcv_msg+0x457/0x890
      [   53.254322][    C1]  ? lock_contended+0xd20/0xd20
      [   53.254326][    C1]  rtnetlink_rcv_msg+0x4a8/0x890
      [ ... ]
      [   73.813696][ T1192] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1
      
      Fixes: 037f9cdf ("net: rmnet: use upper/lower device infrastructure")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf7318b7
    • Changbin Du's avatar
      perf: Make perf able to build with latest libbfd · 0753b05c
      Changbin Du authored
      commit 0ada120c upstream.
      
      libbfd has changed the bfd_section_* macros to inline functions
      bfd_section_<field> since 2019-09-18. See below two commits:
        o http://www.sourceware.org/ml/gdb-cvs/2019-09/msg00064.html
        o https://www.sourceware.org/ml/gdb-cvs/2019-09/msg00072.html
      
      This fix make perf able to build with both old and new libbfd.
      Signed-off-by: default avatarChangbin Du <changbin.du@gmail.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200128152938.31413-1-changbin.du@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jianmin Wang <jianmin@iscas.ac.cn>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0753b05c
  2. 16 Jul, 2020 13 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.19.133 · 17a87580
      Greg Kroah-Hartman authored
      17a87580
    • Janosch Frank's avatar
      s390/mm: fix huge pte soft dirty copying · 1a575164
      Janosch Frank authored
      commit 528a9539 upstream.
      
      If the pmd is soft dirty we must mark the pte as soft dirty (and not dirty).
      This fixes some cases for guest migration with huge page backings.
      
      Cc: <stable@vger.kernel.org> # 4.8
      Fixes: bc29b7ac ("s390/mm: clean up pte/pmd encoding")
      Reviewed-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: default avatarGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: default avatarJanosch Frank <frankja@linux.ibm.com>
      Signed-off-by: default avatarHeiko Carstens <hca@linux.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a575164
    • Vineet Gupta's avatar
      ARC: elf: use right ELF_ARCH · 391a502e
      Vineet Gupta authored
      commit b7faf971 upstream.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      391a502e
    • Vineet Gupta's avatar
      ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE · 01b636cb
      Vineet Gupta authored
      commit 00fdec98 upstream.
      
      Trap handler for syscall tracing reads EFA (Exception Fault Address),
      in case strace wants PC of trap instruction (EFA is not part of pt_regs
      as of current code).
      
      However this EFA read is racy as it happens after dropping to pure
      kernel mode (re-enabling interrupts). A taken interrupt could
      context-switch, trigger a different task's trap, clobbering EFA for this
      execution context.
      
      Fix this by reading EFA early, before re-enabling interrupts. A slight
      side benefit is de-duplication of FAKE_RET_FROM_EXCPN in trap handler.
      The trap handler is common to both ARCompact and ARCv2 builds too.
      
      This just came out of code rework/review and no real problem was reported
      but is clearly a potential problem specially for strace.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01b636cb
    • Mikulas Patocka's avatar
      dm: use noio when sending kobject event · 35a9af8d
      Mikulas Patocka authored
      commit 6958c1c6 upstream.
      
      kobject_uevent may allocate memory and it may be called while there are dm
      devices suspended. The allocation may recurse into a suspended device,
      causing a deadlock. We must set the noio flag when sending a uevent.
      
      The observed deadlock was reported here:
      https://www.redhat.com/archives/dm-devel/2020-March/msg00025.htmlReported-by: default avatarKhazhismel Kumykov <khazhy@google.com>
      Reported-by: default avatarTahsin Erdogan <tahsin@google.com>
      Reported-by: default avatarGabriel Krisman Bertazi <krisman@collabora.com>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35a9af8d
    • Tom Rix's avatar
      drm/radeon: fix double free · 8e5d3786
      Tom Rix authored
      commit 41855a89 upstream.
      
      clang static analysis flags this error
      
      drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
                      kfree(rdev->pm.dpm.ps[i].ps_priv);
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
              kfree(rdev->pm.dpm.ps);
              ^~~~~~~~~~~~~~~~~~~~~~
      
      problem is reported in ci_dpm_fini, with these code blocks.
      
      	for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
      		kfree(rdev->pm.dpm.ps[i].ps_priv);
      	}
      	kfree(rdev->pm.dpm.ps);
      
      The first free happens in ci_parse_power_table where it cleans up locally
      on a failure.  ci_dpm_fini also does a cleanup.
      
      	ret = ci_parse_power_table(rdev);
      	if (ret) {
      		ci_dpm_fini(rdev);
      		return ret;
      	}
      
      So remove the cleanup in ci_parse_power_table and
      move the num_ps calculation to inside the loop so ci_dpm_fini
      will know how many array elements to free.
      
      Fixes: cc8dbbb4 ("drm/radeon: add dpm support for CI dGPUs (v2)")
      Signed-off-by: default avatarTom Rix <trix@redhat.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8e5d3786
    • Boris Burkov's avatar
      btrfs: fix fatal extent_buffer readahead vs releasepage race · 8ee03580
      Boris Burkov authored
      commit 6bf9cd2e upstream.
      
      Under somewhat convoluted conditions, it is possible to attempt to
      release an extent_buffer that is under io, which triggers a BUG_ON in
      btrfs_release_extent_buffer_pages.
      
      This relies on a few different factors. First, extent_buffer reads done
      as readahead for searching use WAIT_NONE, so they free the local extent
      buffer reference while the io is outstanding. However, they should still
      be protected by TREE_REF. However, if the system is doing signficant
      reclaim, and simultaneously heavily accessing the extent_buffers, it is
      possible for releasepage to race with two concurrent readahead attempts
      in a way that leaves TREE_REF unset when the readahead extent buffer is
      released.
      
      Essentially, if two tasks race to allocate a new extent_buffer, but the
      winner who attempts the first io is rebuffed by a page being locked
      (likely by the reclaim itself) then the loser will still go ahead with
      issuing the readahead. The loser's call to find_extent_buffer must also
      race with the reclaim task reading the extent_buffer's refcount as 1 in
      a way that allows the reclaim to re-clear the TREE_REF checked by
      find_extent_buffer.
      
      The following represents an example execution demonstrating the race:
      
                  CPU0                                                         CPU1                                           CPU2
      reada_for_search                                            reada_for_search
        readahead_tree_block                                        readahead_tree_block
          find_create_tree_block                                      find_create_tree_block
            alloc_extent_buffer                                         alloc_extent_buffer
                                                                        find_extent_buffer // not found
                                                                        allocates eb
                                                                        lock pages
                                                                        associate pages to eb
                                                                        insert eb into radix tree
                                                                        set TREE_REF, refs == 2
                                                                        unlock pages
                                                                    read_extent_buffer_pages // WAIT_NONE
                                                                      not uptodate (brand new eb)
                                                                                                                  lock_page
                                                                      if !trylock_page
                                                                        goto unlock_exit // not an error
                                                                    free_extent_buffer
                                                                      release_extent_buffer
                                                                        atomic_dec_and_test refs to 1
              find_extent_buffer // found
                                                                                                                  try_release_extent_buffer
                                                                                                                    take refs_lock
                                                                                                                    reads refs == 1; no io
                atomic_inc_not_zero refs to 2
                mark_buffer_accessed
                  check_buffer_tree_ref
                    // not STALE, won't take refs_lock
                    refs == 2; TREE_REF set // no action
          read_extent_buffer_pages // WAIT_NONE
                                                                                                                    clear TREE_REF
                                                                                                                    release_extent_buffer
                                                                                                                      atomic_dec_and_test refs to 1
                                                                                                                      unlock_page
            still not uptodate (CPU1 read failed on trylock_page)
            locks pages
            set io_pages > 0
            submit io
            return
          free_extent_buffer
            release_extent_buffer
              dec refs to 0
              delete from radix tree
              btrfs_release_extent_buffer_pages
                BUG_ON(io_pages > 0)!!!
      
      We observe this at a very low rate in production and were also able to
      reproduce it in a test environment by introducing some spurious delays
      and by introducing probabilistic trylock_page failures.
      
      To fix it, we apply check_tree_ref at a point where it could not
      possibly be unset by a competing task: after io_pages has been
      incremented. All the codepaths that clear TREE_REF check for io, so they
      would not be able to clear it after this point until the io is done.
      
      Stack trace, for reference:
      [1417839.424739] ------------[ cut here ]------------
      [1417839.435328] kernel BUG at fs/btrfs/extent_io.c:4841!
      [1417839.447024] invalid opcode: 0000 [#1] SMP
      [1417839.502972] RIP: 0010:btrfs_release_extent_buffer_pages+0x20/0x1f0
      [1417839.517008] Code: ed e9 ...
      [1417839.558895] RSP: 0018:ffffc90020bcf798 EFLAGS: 00010202
      [1417839.570816] RAX: 0000000000000002 RBX: ffff888102d6def0 RCX: 0000000000000028
      [1417839.586962] RDX: 0000000000000002 RSI: ffff8887f0296482 RDI: ffff888102d6def0
      [1417839.603108] RBP: ffff88885664a000 R08: 0000000000000046 R09: 0000000000000238
      [1417839.619255] R10: 0000000000000028 R11: ffff88885664af68 R12: 0000000000000000
      [1417839.635402] R13: 0000000000000000 R14: ffff88875f573ad0 R15: ffff888797aafd90
      [1417839.651549] FS:  00007f5a844fa700(0000) GS:ffff88885f680000(0000) knlGS:0000000000000000
      [1417839.669810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [1417839.682887] CR2: 00007f7884541fe0 CR3: 000000049f609002 CR4: 00000000003606e0
      [1417839.699037] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [1417839.715187] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [1417839.731320] Call Trace:
      [1417839.737103]  release_extent_buffer+0x39/0x90
      [1417839.746913]  read_block_for_search.isra.38+0x2a3/0x370
      [1417839.758645]  btrfs_search_slot+0x260/0x9b0
      [1417839.768054]  btrfs_lookup_file_extent+0x4a/0x70
      [1417839.778427]  btrfs_get_extent+0x15f/0x830
      [1417839.787665]  ? submit_extent_page+0xc4/0x1c0
      [1417839.797474]  ? __do_readpage+0x299/0x7a0
      [1417839.806515]  __do_readpage+0x33b/0x7a0
      [1417839.815171]  ? btrfs_releasepage+0x70/0x70
      [1417839.824597]  extent_readpages+0x28f/0x400
      [1417839.833836]  read_pages+0x6a/0x1c0
      [1417839.841729]  ? startup_64+0x2/0x30
      [1417839.849624]  __do_page_cache_readahead+0x13c/0x1a0
      [1417839.860590]  filemap_fault+0x6c7/0x990
      [1417839.869252]  ? xas_load+0x8/0x80
      [1417839.876756]  ? xas_find+0x150/0x190
      [1417839.884839]  ? filemap_map_pages+0x295/0x3b0
      [1417839.894652]  __do_fault+0x32/0x110
      [1417839.902540]  __handle_mm_fault+0xacd/0x1000
      [1417839.912156]  handle_mm_fault+0xaa/0x1c0
      [1417839.921004]  __do_page_fault+0x242/0x4b0
      [1417839.930044]  ? page_fault+0x8/0x30
      [1417839.937933]  page_fault+0x1e/0x30
      [1417839.945631] RIP: 0033:0x33c4bae
      [1417839.952927] Code: Bad RIP value.
      [1417839.960411] RSP: 002b:00007f5a844f7350 EFLAGS: 00010206
      [1417839.972331] RAX: 000000000000006e RBX: 1614b3ff6a50398a RCX: 0000000000000000
      [1417839.988477] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002
      [1417840.004626] RBP: 00007f5a844f7420 R08: 000000000000006e R09: 00007f5a94aeccb8
      [1417840.020784] R10: 00007f5a844f7350 R11: 0000000000000000 R12: 00007f5a94aecc79
      [1417840.036932] R13: 00007f5a94aecc78 R14: 00007f5a94aecc90 R15: 00007f5a94aecc40
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarBoris Burkov <boris@bur.io>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8ee03580
    • Greg Kroah-Hartman's avatar
      Revert "ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb" · 54d2495b
      Greg Kroah-Hartman authored
      This reverts commit bdf4d37b which is
      commit 2bbcaaee upstream.
      
      It is being reverted upstream, just hasn't made it there yet and is
      causing lots of problems.
      Reported-by: default avatarHans de Goede <hdegoede@redhat.com>
      Cc: Qiujun Huang <hqjagain@gmail.com>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54d2495b
    • Kees Cook's avatar
      bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok() · b80e052c
      Kees Cook authored
      commit 63960260 upstream.
      
      When evaluating access control over kallsyms visibility, credentials at
      open() time need to be used, not the "current" creds (though in BPF's
      case, this has likely always been the same). Plumb access to associated
      file->f_cred down through bpf_dump_raw_ok() and its callers now that
      kallsysm_show_value() has been refactored to take struct cred.
      
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: bpf@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: 7105e828 ("bpf: allow for correlation of maps and helpers in dump")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b80e052c
    • Kees Cook's avatar
      kprobes: Do not expose probe addresses to non-CAP_SYSLOG · bff4fad8
      Kees Cook authored
      commit 60f7bb66 upstream.
      
      The kprobe show() functions were using "current"'s creds instead
      of the file opener's creds for kallsyms visibility. Fix to use
      seq_file->file->f_cred.
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: 81365a94 ("kprobes: Show address of kprobes if kallsyms does")
      Fixes: ffb9bd68 ("kprobes: Show blacklist addresses as same as kallsyms does")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bff4fad8
    • Kees Cook's avatar
      module: Do not expose section addresses to non-CAP_SYSLOG · c744568a
      Kees Cook authored
      commit b25a7c5a upstream.
      
      The printing of section addresses in /sys/module/*/sections/* was not
      using the correct credentials to evaluate visibility.
      
      Before:
      
       # cat /sys/module/*/sections/.*text
       0xffffffffc0458000
       ...
       # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
       0xffffffffc0458000
       ...
      
      After:
      
       # cat /sys/module/*/sections/*.text
       0xffffffffc0458000
       ...
       # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
       0x0000000000000000
       ...
      
      Additionally replaces the existing (safe) /proc/modules check with
      file->f_cred for consistency.
      Reported-by: default avatarDominik Czarnota <dominik.czarnota@trailofbits.com>
      Fixes: be71eda5 ("module: Fix display of wrong module .text address")
      Cc: stable@vger.kernel.org
      Tested-by: default avatarJessica Yu <jeyu@kernel.org>
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c744568a
    • Kees Cook's avatar
      module: Refactor section attr into bin attribute · bbd10824
      Kees Cook authored
      commit ed66f991 upstream.
      
      In order to gain access to the open file's f_cred for kallsym visibility
      permission checks, refactor the module section attributes to use the
      bin_attribute instead of attribute interface. Additionally removes the
      redundant "name" struct member.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Tested-by: default avatarJessica Yu <jeyu@kernel.org>
      Acked-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbd10824
    • Gustavo A. R. Silva's avatar
      kernel: module: Use struct_size() helper · 2bcda838
      Gustavo A. R. Silva authored
      commit 8d1b73dd upstream.
      
      One of the more common cases of allocation size calculations is finding
      the size of a structure that has a zero-sized array at the end, along
      with memory for some number of elements for that array. For example:
      
      struct module_sect_attrs {
      	...
              struct module_sect_attr attrs[0];
      };
      
      Make use of the struct_size() helper instead of an open-coded version
      in order to avoid any potential type mistakes.
      
      So, replace the following form:
      
      sizeof(*sect_attrs) + nloaded * sizeof(sect_attrs->attrs[0]
      
      with:
      
      struct_size(sect_attrs, attrs, nloaded)
      
      This code was detected with the help of Coccinelle.
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarJessica Yu <jeyu@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bcda838