1. 17 Apr, 2020 40 commits
    • Paul Cercueil's avatar
      clk: ingenic/jz4770: Exit with error if CGU init failed · ae6baa8c
      Paul Cercueil authored
      commit c067b46d upstream.
      
      Exit jz4770_cgu_init() if the 'cgu' pointer we get is NULL, since the
      pointer is passed as argument to functions later on.
      
      Fixes: 7a01c190 ("clk: Add Ingenic jz4770 CGU driver")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaul Cercueil <paul@crapouillou.net>
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Link: https://lkml.kernel.org/r/20200213161952.37460-1-paul@crapouillou.netSigned-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae6baa8c
    • Hans de Goede's avatar
      Input: i8042 - add Acer Aspire 5738z to nomux list · cdfa83e1
      Hans de Goede authored
      commit ebc68ced upstream.
      
      The Acer Aspire 5738z has a button to disable (and re-enable) the
      touchpad next to the touchpad.
      
      When this button is pressed a LED underneath indicates that the touchpad
      is disabled (and an event is send to userspace and GNOME shows its
      touchpad enabled / disable OSD thingie).
      
      So far so good, but after re-enabling the touchpad it no longer works.
      
      The laptop does not have an external ps2 port, so mux mode is not needed
      and disabling mux mode fixes the touchpad no longer working after toggling
      it off and back on again, so lets add this laptop model to the nomux list.
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Link: https://lore.kernel.org/r/20200331123947.318908-1-hdegoede@redhat.com
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cdfa83e1
    • Michael Mueller's avatar
      s390/diag: fix display of diagnose call statistics · e68129e6
      Michael Mueller authored
      commit 6c7c851f upstream.
      
      Show the full diag statistic table and not just parts of it.
      
      The issue surfaced in a KVM guest with a number of vcpus
      defined smaller than NR_DIAG_STAT.
      
      Fixes: 1ec2772e ("s390/diag: add a statistic for diagnose calls")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMichael Mueller <mimu@linux.ibm.com>
      Reviewed-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e68129e6
    • Sam Lunt's avatar
      perf tools: Support Python 3.8+ in Makefile · d1b6feb4
      Sam Lunt authored
      commit b9c9ce4e upstream.
      
      Python 3.8 changed the output of 'python-config --ldflags' to no longer
      include the '-lpythonX.Y' flag (this apparently fixed an issue loading
      modules with a statically linked Python executable).  The libpython
      feature check in linux/build/feature fails if the Python library is not
      included in FEATURE_CHECK_LDFLAGS-libpython variable.
      
      This adds a check in the Makefile to determine if PYTHON_CONFIG accepts
      the '--embed' flag and passes that flag alongside '--ldflags' if so.
      
      tools/perf is the only place the libpython feature check is used.
      Signed-off-by: default avatarSam Lunt <samuel.j.lunt@gmail.com>
      Tested-by: default avatarHe Zhe <zhe.he@windriver.com>
      Link: http://lore.kernel.org/lkml/c56be2e1-8111-9dfe-8298-f7d0f9ab7431@windriver.comAcked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: trivial@kernel.org
      Cc: stable@kernel.org
      Link: http://lore.kernel.org/lkml/20200131181123.tmamivhq4b7uqasr@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1b6feb4
    • Changwei Ge's avatar
      ocfs2: no need try to truncate file beyond i_size · 13380f2b
      Changwei Ge authored
      commit 783fda85 upstream.
      
      Linux fallocate(2) with FALLOC_FL_PUNCH_HOLE mode set, its offset can
      exceed the inode size.  Ocfs2 now doesn't allow that offset beyond inode
      size.  This restriction is not necessary and violates fallocate(2)
      semantics.
      
      If fallocate(2) offset is beyond inode size, just return success and do
      nothing further.
      
      Otherwise, ocfs2 will crash the kernel.
      
        kernel BUG at fs/ocfs2//alloc.c:7264!
         ocfs2_truncate_inline+0x20f/0x360 [ocfs2]
         ocfs2_remove_inode_range+0x23c/0xcb0 [ocfs2]
         __ocfs2_change_file_space+0x4a5/0x650 [ocfs2]
         ocfs2_fallocate+0x83/0xa0 [ocfs2]
         vfs_fallocate+0x148/0x230
         SyS_fallocate+0x48/0x80
         do_syscall_64+0x79/0x170
      Signed-off-by: default avatarChangwei Ge <chge@linux.alibaba.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200407082754.17565-1-chge@linux.alibaba.comSigned-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13380f2b
    • Eric Biggers's avatar
      fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() · d4b3709c
      Eric Biggers authored
      commit 26c5d78c upstream.
      
      After request_module(), nothing is stopping the module from being
      unloaded until someone takes a reference to it via try_get_module().
      
      The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
      running 'rmmod' concurrently.
      
      Since WARN_ONCE() is for kernel bugs only, not for user-reachable
      situations, downgrade this warning to pr_warn_once().
      
      Keep it printed once only, since the intent of this warning is to detect
      a bug in modprobe at boot time.  Printing the warning more than once
      wouldn't really provide any useful extra information.
      
      Fixes: 41124db8 ("fs: warn in case userspace lied about modprobe return")
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: default avatarJessica Yu <jeyu@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: NeilBrown <neilb@suse.com>
      Cc: <stable@vger.kernel.org>		[4.13+]
      Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.orgSigned-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d4b3709c
    • Qian Cai's avatar
      ext4: fix a data race at inode->i_blocks · 803ef6fa
      Qian Cai authored
      commit 28936b62 upstream.
      
      inode->i_blocks could be accessed concurrently as noticed by KCSAN,
      
       BUG: KCSAN: data-race in ext4_do_update_inode [ext4] / inode_add_bytes
      
       write to 0xffff9a00d4b982d0 of 8 bytes by task 22100 on cpu 118:
        inode_add_bytes+0x65/0xf0
        __inode_add_bytes at fs/stat.c:689
        (inlined by) inode_add_bytes at fs/stat.c:702
        ext4_mb_new_blocks+0x418/0xca0 [ext4]
        ext4_ext_map_blocks+0x1a6b/0x27b0 [ext4]
        ext4_map_blocks+0x1a9/0x950 [ext4]
        _ext4_get_block+0xfc/0x270 [ext4]
        ext4_get_block_unwritten+0x33/0x50 [ext4]
        __block_write_begin_int+0x22e/0xae0
        __block_write_begin+0x39/0x50
        ext4_write_begin+0x388/0xb50 [ext4]
        ext4_da_write_begin+0x35f/0x8f0 [ext4]
        generic_perform_write+0x15d/0x290
        ext4_buffered_write_iter+0x11f/0x210 [ext4]
        ext4_file_write_iter+0xce/0x9e0 [ext4]
        new_sync_write+0x29c/0x3b0
        __vfs_write+0x92/0xa0
        vfs_write+0x103/0x260
        ksys_write+0x9d/0x130
        __x64_sys_write+0x4c/0x60
        do_syscall_64+0x91/0xb05
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       read to 0xffff9a00d4b982d0 of 8 bytes by task 8 on cpu 65:
        ext4_do_update_inode+0x4a0/0xf60 [ext4]
        ext4_inode_blocks_set at fs/ext4/inode.c:4815
        ext4_mark_iloc_dirty+0xaf/0x160 [ext4]
        ext4_mark_inode_dirty+0x129/0x3e0 [ext4]
        ext4_convert_unwritten_extents+0x253/0x2d0 [ext4]
        ext4_convert_unwritten_io_end_vec+0xc5/0x150 [ext4]
        ext4_end_io_rsv_work+0x22c/0x350 [ext4]
        process_one_work+0x54f/0xb90
        worker_thread+0x80/0x5f0
        kthread+0x1cd/0x1f0
        ret_from_fork+0x27/0x50
      
       4 locks held by kworker/u256:0/8:
        #0: ffff9a025abc4328 ((wq_completion)ext4-rsv-conversion){+.+.}, at: process_one_work+0x443/0xb90
        #1: ffffab5a862dbe20 ((work_completion)(&ei->i_rsv_conversion_work)){+.+.}, at: process_one_work+0x443/0xb90
        #2: ffff9a025a9d0f58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
        #3: ffff9a00d4b985d8 (&(&ei->i_raw_lock)->rlock){+.+.}, at: ext4_do_update_inode+0xaa/0xf60 [ext4]
       irq event stamp: 3009267
       hardirqs last  enabled at (3009267): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
       hardirqs last disabled at (3009266): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
       softirqs last  enabled at (3009230): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
       softirqs last disabled at (3009223): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0
      
       Reported by Kernel Concurrency Sanitizer on:
       CPU: 65 PID: 8 Comm: kworker/u256:0 Tainted: G L 5.6.0-rc2-next-20200221+ #7
       Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
       Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work [ext4]
      
      The plain read is outside of inode->i_lock critical section which
      results in a data race. Fix it by adding READ_ONCE() there.
      
      Link: https://lore.kernel.org/r/20200222043258.2279-1-cai@lca.pwSigned-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      803ef6fa
    • Trond Myklebust's avatar
      NFS: Fix a page leak in nfs_destroy_unlinked_subrequests() · 194a805d
      Trond Myklebust authored
      commit add42de3 upstream.
      
      When we detach a subrequest from the list, we must also release the
      reference it holds to the parent.
      
      Fixes: 5b2b5187 ("NFS: Fix nfs_page_group_destroy() and nfs_lock_and_join_requests() race cases")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      194a805d
    • Libor Pechacek's avatar
      powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable · 83dc8f0a
      Libor Pechacek authored
      commit a83836db upstream.
      
      In guests without hotplugagble memory drmem structure is only zero
      initialized. Trying to manipulate DLPAR parameters results in a crash.
      
        $ echo "memory add count 1" > /sys/kernel/dlpar
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
        ...
        NIP:  c0000000000ff294 LR: c0000000000ff248 CTR: 0000000000000000
        REGS: c0000000fb9d3880 TRAP: 0300   Tainted: G            E      (5.5.0-rc6-2-default)
        MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28242428  XER: 20000000
        CFAR: c0000000009a6c10 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
        ...
        NIP dlpar_memory+0x6e4/0xd00
        LR  dlpar_memory+0x698/0xd00
        Call Trace:
          dlpar_memory+0x698/0xd00 (unreliable)
          handle_dlpar_errorlog+0xc0/0x190
          dlpar_store+0x198/0x4a0
          kobj_attr_store+0x30/0x50
          sysfs_kf_write+0x64/0x90
          kernfs_fop_write+0x1b0/0x290
          __vfs_write+0x3c/0x70
          vfs_write+0xd0/0x260
          ksys_write+0xdc/0x130
          system_call+0x5c/0x68
      
      Taking closer look at the code, I can see that for_each_drmem_lmb is a
      macro expanding into `for (lmb = &drmem_info->lmbs[0]; lmb <=
      &drmem_info->lmbs[drmem_info->n_lmbs - 1]; lmb++)`. When drmem_info->lmbs
      is NULL, the loop would iterate through the whole address range if it
      weren't stopped by the NULL pointer dereference on the next line.
      
      This patch aligns for_each_drmem_lmb and for_each_drmem_lmb_in_range
      macro behavior with the common C semantics, where the end marker does
      not belong to the scanned range, and alters get_lmb_range() semantics.
      As a side effect, the wraparound observed in the crash is prevented.
      
      Fixes: 6c6ea537 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: default avatarLibor Pechacek <lpechacek@suse.cz>
      Signed-off-by: default avatarMichal Suchanek <msuchanek@suse.de>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200131132829.10281-1-msuchanek@suse.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      83dc8f0a
    • Christian Gmeiner's avatar
      drm/etnaviv: rework perfmon query infrastructure · 9c6c4593
      Christian Gmeiner authored
      commit ed1dd899 upstream.
      
      Report the correct perfmon domains and signals depending
      on the supported feature flags.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Fixes: 9e2c2e27 ("drm/etnaviv: add infrastructure to query perf counter")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarChristian Gmeiner <christian.gmeiner@gmail.com>
      Signed-off-by: default avatarLucas Stach <l.stach@pengutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c6c4593
    • Nathan Chancellor's avatar
      rtc: omap: Use define directive for PIN_CONFIG_ACTIVE_HIGH · 23599f81
      Nathan Chancellor authored
      commit c5015652 upstream.
      
      Clang warns when one enumerated type is implicitly converted to another:
      
      drivers/rtc/rtc-omap.c:574:21: warning: implicit conversion from
      enumeration type 'enum rtc_pin_config_param' to different enumeration
      type 'enum pin_config_param' [-Wenum-conversion]
              {"ti,active-high", PIN_CONFIG_ACTIVE_HIGH, 0},
              ~                  ^~~~~~~~~~~~~~~~~~~~~~
      drivers/rtc/rtc-omap.c:579:12: warning: implicit conversion from
      enumeration type 'enum rtc_pin_config_param' to different enumeration
      type 'enum pin_config_param' [-Wenum-conversion]
              PCONFDUMP(PIN_CONFIG_ACTIVE_HIGH, "input active high", NULL, false),
              ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ./include/linux/pinctrl/pinconf-generic.h:163:11: note: expanded from
      macro 'PCONFDUMP'
              .param = a, .display = b, .format = c, .has_arg = d     \
                       ^
      2 warnings generated.
      
      It is expected that pinctrl drivers can extend pin_config_param because
      of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
      isn't an issue. Most drivers that take advantage of this define the
      PIN_CONFIG variables as constants, rather than enumerated values. Do the
      same thing here so that Clang no longer warns.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/144Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      23599f81
    • Michal Hocko's avatar
      selftests: vm: drop dependencies on page flags from mlock2 tests · 01522e4d
      Michal Hocko authored
      commit eea274d6 upstream.
      
      It was noticed that mlock2 tests are failing after 9c4e6b1a ("mm,
      mlock, vmscan: no more skipping pagevecs") because the patch has changed
      the timing on when the page is added to the unevictable LRU list and thus
      gains the unevictable page flag.
      
      The test was just too dependent on the implementation details which were
      true at the time when it was introduced.  Page flags and the timing when
      they are set is something no userspace should ever depend on.  The test
      should be testing only for the user observable contract of the tested
      syscalls.  Those are defined pretty well for the mlock and there are other
      means for testing them.  In fact this is already done and testing for page
      flags can be safely dropped to achieve the aimed purpose.  Present bits
      can be checked by /proc/<pid>/smaps RSS field and the locking state by
      VmFlags although I would argue that Locked: field would be more
      appropriate.
      
      Drop all the page flag machinery and considerably simplify the test.  This
      should be more robust for future kernel changes while checking the
      promised contract is still valid.
      
      Fixes: 9c4e6b1a ("mm, mlock, vmscan: no more skipping pagevecs")
      Reported-by: default avatarRafael Aquini <aquini@redhat.com>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Acked-by: default avatarRafael Aquini <aquini@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Eric B Munson <emunson@akamai.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200324154218.GS19542@dhcp22.suse.czSigned-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01522e4d
    • Fredrik Strupe's avatar
      arm64: armv8_deprecated: Fix undef_hook mask for thumb setend · fb3e9f47
      Fredrik Strupe authored
      commit fc226601 upstream.
      
      For thumb instructions, call_undef_hook() in traps.c first reads a u16,
      and if the u16 indicates a T32 instruction (u16 >= 0xe800), a second
      u16 is read, which then makes up the the lower half-word of a T32
      instruction. For T16 instructions, the second u16 is not read,
      which makes the resulting u32 opcode always have the upper half set to
      0.
      
      However, having the upper half of instr_mask in the undef_hook set to 0
      masks out the upper half of all thumb instructions - both T16 and T32.
      This results in trapped T32 instructions with the lower half-word equal
      to the T16 encoding of setend (b650) being matched, even though the upper
      half-word is not 0000 and thus indicates a T32 opcode.
      
      An example of such a T32 instruction is eaa0b650, which should raise a
      SIGILL since T32 instructions with an eaa prefix are unallocated as per
      Arm ARM, but instead works as a SETEND because the second half-word is set
      to b650.
      
      This patch fixes the issue by extending instr_mask to include the
      upper u32 half, which will still match T16 instructions where the upper
      half is 0, but not T32 instructions.
      
      Fixes: 2d888f48 ("arm64: Emulate SETEND for AArch32 tasks")
      Cc: <stable@vger.kernel.org> # 4.0.x-
      Reviewed-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarFredrik Strupe <fredrik@strupe.net>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb3e9f47
    • Steffen Maier's avatar
      scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point · af77e3e4
      Steffen Maier authored
      commit 819732be upstream.
      
      v2.6.27 commit cc8c2829 ("[SCSI] zfcp: Automatically attach remote
      ports") introduced zfcp automatic port scan.
      
      Before that, the user had to use the sysfs attribute "port_add" of an FCP
      device (adapter) to add and open remote (target) ports, even for the remote
      peer port in point-to-point topology. That code path did a proper port open
      recovery trigger taking the erp_lock.
      
      Since above commit, a new helper function zfcp_erp_open_ptp_port()
      performed an UNlocked port open recovery trigger. This can race with other
      parallel recovery triggers. In zfcp_erp_action_enqueue() this could corrupt
      e.g. adapter->erp_total_count or adapter->erp_ready_head.
      
      As already found for fabric topology in v4.17 commit fa89adba ("scsi:
      zfcp: fix infinite iteration on ERP ready list"), there was an endless loop
      during tracing of rport (un)block.  A subsequent v4.18 commit 9e156c54
      ("scsi: zfcp: assert that the ERP lock is held when tracing a recovery
      trigger") introduced a lockdep assertion for that case.
      
      As a side effect, that lockdep assertion now uncovered the unlocked code
      path for PtP. It is from within an adapter ERP action:
      
      zfcp_erp_strategy[1479]  intentionally DROPs erp lock around
                               zfcp_erp_strategy_do_action()
      zfcp_erp_strategy_do_action[1441]      NO erp lock
      zfcp_erp_adapter_strategy[876]         NO erp lock
      zfcp_erp_adapter_strategy_open[855]    NO erp lock
      zfcp_erp_adapter_strategy_open_fsf[806]NO erp lock
      zfcp_erp_adapter_strat_fsf_xconf[772]  erp lock only around
                                             zfcp_erp_action_to_running(),
                                             BUT *_not_* around
                                             zfcp_erp_enqueue_ptp_port()
      zfcp_erp_enqueue_ptp_port[728]         BUG: *_not_* taking erp lock
      _zfcp_erp_port_reopen[432]             assumes to be called with erp lock
      zfcp_erp_action_enqueue[314]           assumes to be called with erp lock
      zfcp_dbf_rec_trig[288]                 _checks_ to be called with erp lock:
      	lockdep_assert_held(&adapter->erp_lock);
      
      It causes the following lockdep warning:
      
      WARNING: CPU: 2 PID: 775 at drivers/s390/scsi/zfcp_dbf.c:288
                                  zfcp_dbf_rec_trig+0x16a/0x188
      no locks held by zfcperp0.0.17c0/775.
      
      Fix this by using the proper locked recovery trigger helper function.
      
      Link: https://lore.kernel.org/r/20200312174505.51294-2-maier@linux.ibm.com
      Fixes: cc8c2829 ("[SCSI] zfcp: Automatically attach remote ports")
      Cc: <stable@vger.kernel.org> #v2.6.27+
      Reviewed-by: default avatarJens Remus <jremus@linux.ibm.com>
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af77e3e4
    • Shetty, Harshini X (EXT-Sony Mobile)'s avatar
      dm verity fec: fix memory leak in verity_fec_dtr · 92760667
      Shetty, Harshini X (EXT-Sony Mobile) authored
      commit 75fa6019 upstream.
      
      Fix below kmemleak detected in verity_fec_ctr. output_pool is
      allocated for each dm-verity-fec device. But it is not freed when
      dm-table for the verity target is removed. Hence free the output
      mempool in destructor function verity_fec_dtr.
      
      unreferenced object 0xffffffffa574d000 (size 4096):
        comm "init", pid 1667, jiffies 4294894890 (age 307.168s)
        hex dump (first 32 bytes):
          8e 36 00 98 66 a8 0b 9b 00 00 00 00 00 00 00 00  .6..f...........
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<0000000060e82407>] __kmalloc+0x2b4/0x340
          [<00000000dd99488f>] mempool_kmalloc+0x18/0x20
          [<000000002560172b>] mempool_init_node+0x98/0x118
          [<000000006c3574d2>] mempool_init+0x14/0x20
          [<0000000008cb266e>] verity_fec_ctr+0x388/0x3b0
          [<000000000887261b>] verity_ctr+0x87c/0x8d0
          [<000000002b1e1c62>] dm_table_add_target+0x174/0x348
          [<000000002ad89eda>] table_load+0xe4/0x328
          [<000000001f06f5e9>] dm_ctl_ioctl+0x3b4/0x5a0
          [<00000000bee5fbb7>] do_vfs_ioctl+0x5dc/0x928
          [<00000000b475b8f5>] __arm64_sys_ioctl+0x70/0x98
          [<000000005361e2e8>] el0_svc_common+0xa0/0x158
          [<000000001374818f>] el0_svc_handler+0x6c/0x88
          [<000000003364e9f4>] el0_svc+0x8/0xc
          [<000000009d84cec9>] 0xffffffffffffffff
      
      Fixes: a739ff3f ("dm verity: add support for forward error correction")
      Depends-on: 6f1c819c ("dm: convert to bioset_init()/mempool_init()")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHarshini Shetty <harshini.x.shetty@sony.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92760667
    • Mikulas Patocka's avatar
      dm writecache: add cond_resched to avoid CPU hangs · 6f3a303a
      Mikulas Patocka authored
      commit 1edaa447 upstream.
      
      Initializing a dm-writecache device can take a long time when the
      persistent memory device is large.  Add cond_resched() to a few loops
      to avoid warnings that the CPU is stuck.
      
      Cc: stable@vger.kernel.org # v4.18+
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f3a303a
    • Maxime Ripard's avatar
      arm64: dts: allwinner: h6: Fix PMU compatible · a6d77a5c
      Maxime Ripard authored
      commit 4c7eeb9a upstream.
      
      The commit 7aa9b9eb ("arm64: dts: allwinner: H6: Add PMU mode")
      introduced support for the PMU found on the Allwinner H6. However, the
      binding only allows for a single compatible, while the patch was adding
      two.
      
      Make sure we follow the binding.
      
      Fixes: 7aa9b9eb ("arm64: dts: allwinner: H6: Add PMU mode")
      Signed-off-by: default avatarMaxime Ripard <maxime@cerno.tech>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a6d77a5c
    • Subash Abhinov Kasiviswanathan's avatar
      net: qualcomm: rmnet: Allow configuration updates to existing devices · 0389387e
      Subash Abhinov Kasiviswanathan authored
      commit 2abb5792 upstream.
      
      This allows the changelink operation to succeed if the mux_id was
      specified as an argument. Note that the mux_id must match the
      existing mux_id of the rmnet device or should be an unused mux_id.
      
      Fixes: 1dc49e9d ("net: rmnet: do not allow to change mux id if mux id is duplicated")
      Reported-and-tested-by: default avatarAlex Elder <elder@linaro.org>
      Signed-off-by: default avatarSean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: default avatarSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0389387e
    • Alexander Duyck's avatar
      mm: Use fixed constant in page_frag_alloc instead of size + 1 · 69598616
      Alexander Duyck authored
      commit 86447726 upstream.
      
      This patch replaces the size + 1 value introduced with the recent fix for 1
      byte allocs with a constant value.
      
      The idea here is to reduce code overhead as the previous logic would have
      to read size into a register, then increment it, and write it back to
      whatever field was being used. By using a constant we can avoid those
      memory reads and arithmetic operations in favor of just encoding the
      maximum value into the operation itself.
      
      Fixes: 2c2ade81 ("mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs")
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69598616
    • Anssi Hannula's avatar
      tools: gpio: Fix out-of-tree build regression · 2e22edcd
      Anssi Hannula authored
      commit 82f04bfe upstream.
      
      Commit 0161a94e ("tools: gpio: Correctly add make dependencies for
      gpio_utils") added a make rule for gpio-utils-in.o but used $(output)
      instead of the correct $(OUTPUT) for the output directory, breaking
      out-of-tree build (O=xx) with the following error:
      
        No rule to make target 'out/tools/gpio/gpio-utils-in.o', needed by 'out/tools/gpio/lsgpio-in.o'.  Stop.
      
      Fix that.
      
      Fixes: 0161a94e ("tools: gpio: Correctly add make dependencies for gpio_utils")
      Cc: <stable@vger.kernel.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Link: https://lore.kernel.org/r/20200325103154.32235-1-anssi.hannula@bitwise.fiReviewed-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e22edcd
    • Zhenzhong Duan's avatar
      x86/speculation: Remove redundant arch_smt_update() invocation · 6209e098
      Zhenzhong Duan authored
      commit 34d66caf upstream.
      
      With commit a74cfffb ("x86/speculation: Rework SMT state change"),
      arch_smt_update() is invoked from each individual CPU hotplug function.
      
      Therefore the extra arch_smt_update() call in the sysfs SMT control is
      redundant.
      
      Fixes: a74cfffb ("x86/speculation: Rework SMT state change")
      Signed-off-by: default avatarZhenzhong Duan <zhenzhong.duan@oracle.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: <konrad.wilk@oracle.com>
      Cc: <dwmw@amazon.co.uk>
      Cc: <bp@suse.de>
      Cc: <srinivas.eeda@oracle.com>
      Cc: <peterz@infradead.org>
      Cc: <hpa@zytor.com>
      Link: https://lkml.kernel.org/r/e2e064f2-e8ef-42ca-bf4f-76b612964752@default
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6209e098
    • YueHaibing's avatar
      powerpc/pseries: Drop pointless static qualifier in vpa_debugfs_init() · f5e2eef0
      YueHaibing authored
      commit 11dd34f3 upstream.
      
      There is no need to have the 'struct dentry *vpa_dir' variable static
      since new value always be assigned before use it.
      
      Fixes: c6c26fb5 ("powerpc/pseries: Export raw per-CPU VPA data via debugfs")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: default avatarDaniel Axtens <dja@axtens.net>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190218125644.87448-1-yuehaibing@huawei.com
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5e2eef0
    • Gao Xiang's avatar
      erofs: correct the remaining shrink objects · d8bd8bca
      Gao Xiang authored
      commit 9d5a09c6 upstream.
      
      The remaining count should not include successful
      shrink attempts.
      
      Fixes: e7e9a307 ("staging: erofs: introduce workstation for decompression")
      Cc: <stable@vger.kernel.org> # 4.19+
      Link: https://lore.kernel.org/r/20200226081008.86348-1-gaoxiang25@huawei.comReviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarGao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      d8bd8bca
    • Rosioru Dragos's avatar
      crypto: mxs-dcp - fix scatterlist linearization for hash · c127f180
      Rosioru Dragos authored
      commit fa03481b upstream.
      
      The incorrect traversal of the scatterlist, during the linearization phase
      lead to computing the hash value of the wrong input buffer.
      New implementation uses scatterwalk_map_and_copy()
      to address this issue.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 15b59e7c ("crypto: mxs - Add Freescale MXS DCP driver")
      Signed-off-by: default avatarRosioru Dragos <dragos.rosioru@nxp.com>
      Reviewed-by: default avatarHoria Geantă <horia.geanta@nxp.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c127f180
    • Robbie Ko's avatar
      btrfs: fix missing semaphore unlock in btrfs_sync_file · ed870340
      Robbie Ko authored
      commit 6ff06729 upstream.
      
      Ordered ops are started twice in sync file, once outside of inode mutex
      and once inside, taking the dio semaphore. There was one error path
      missing the semaphore unlock.
      
      Fixes: aab15e8e ("Btrfs: fix rare chances for data loss when doing a fast fsync")
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: default avatarRobbie Ko <robbieko@synology.com>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      [ add changelog ]
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed870340
    • Filipe Manana's avatar
      btrfs: fix missing file extent item for hole after ranged fsync · 867ae5eb
      Filipe Manana authored
      commit 95418ed1 upstream.
      
      When doing a fast fsync for a range that starts at an offset greater than
      zero, we can end up with a log that when replayed causes the respective
      inode miss a file extent item representing a hole if we are not using the
      NO_HOLES feature. This is because for fast fsyncs we don't log any extents
      that cover a range different from the one requested in the fsync.
      
      Example scenario to trigger it:
      
        $ mkfs.btrfs -O ^no-holes -f /dev/sdd
        $ mount /dev/sdd /mnt
      
        # Create a file with a single 256K and fsync it to clear to full sync
        # bit in the inode - we want the msync below to trigger a fast fsync.
        $ xfs_io -f -c "pwrite -S 0xab 0 256K" -c "fsync" /mnt/foo
      
        # Force a transaction commit and wipe out the log tree.
        $ sync
      
        # Dirty 768K of data, increasing the file size to 1Mb, and flush only
        # the range from 256K to 512K without updating the log tree
        # (sync_file_range() does not trigger fsync, it only starts writeback
        # and waits for it to finish).
      
        $ xfs_io -c "pwrite -S 0xcd 256K 768K" /mnt/foo
        $ xfs_io -c "sync_range -abw 256K 256K" /mnt/foo
      
        # Now dirty the range from 768K to 1M again and sync that range.
        $ xfs_io -c "mmap -w 768K 256K"        \
                 -c "mwrite -S 0xef 768K 256K" \
                 -c "msync -s 768K 256K"       \
                 -c "munmap"                   \
                 /mnt/foo
      
        <power fail>
      
        # Mount to replay the log.
        $ mount /dev/sdd /mnt
        $ umount /mnt
      
        $ btrfs check /dev/sdd
        Opening filesystem to check...
        Checking filesystem on /dev/sdd
        UUID: 482fb574-b288-478e-a190-a9c44a78fca6
        [1/7] checking root items
        [2/7] checking extents
        [3/7] checking free space cache
        [4/7] checking fs roots
        root 5 inode 257 errors 100, file extent discount
        Found file extent holes:
             start: 262144, len: 524288
        ERROR: errors found in fs roots
        found 720896 bytes used, error(s) found
        total csum bytes: 512
        total tree bytes: 131072
        total fs tree bytes: 32768
        total extent tree bytes: 16384
        btree space waste bytes: 123514
        file data blocks allocated: 589824
          referenced 589824
      
      Fix this issue by setting the range to full (0 to LLONG_MAX) when the
      NO_HOLES feature is not enabled. This results in extra work being done
      but it gives the guarantee we don't end up with missing holes after
      replaying the log.
      
      CC: stable@vger.kernel.org # 4.19+
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      867ae5eb
    • Josef Bacik's avatar
      btrfs: drop block from cache on error in relocation · d8ecdce1
      Josef Bacik authored
      commit 8e19c973 upstream.
      
      If we have an error while building the backref tree in relocation we'll
      process all the pending edges and then free the node.  However if we
      integrated some edges into the cache we'll lose our link to those edges
      by simply freeing this node, which means we'll leak memory and
      references to any roots that we've found.
      
      Instead we need to use remove_backref_node(), which walks through all of
      the edges that are still linked to this node and free's them up and
      drops any root references we may be holding.
      
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d8ecdce1
    • Josef Bacik's avatar
      btrfs: set update the uuid generation as soon as possible · d3a7c4b8
      Josef Bacik authored
      commit 75ec1db8 upstream.
      
      In my EIO stress testing I noticed I was getting forced to rescan the
      uuid tree pretty often, which was weird.  This is because my error
      injection stuff would sometimes inject an error after log replay but
      before we loaded the UUID tree.  If log replay committed the transaction
      it wouldn't have updated the uuid tree generation, but the tree was
      valid and didn't change, so there's no reason to not update the
      generation here.
      
      Fix this by setting the BTRFS_FS_UPDATE_UUID_TREE_GEN bit immediately
      after reading all the fs roots if the uuid tree generation matches the
      fs generation.  Then any transaction commits that happen during mount
      won't screw up our uuid tree state, forcing us to do needless uuid
      rescans.
      
      Fixes: 70f80175 ("Btrfs: check UUID tree during mount if required")
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3a7c4b8
    • Filipe Manana's avatar
      Btrfs: fix crash during unmount due to race with delayed inode workers · 7ed0c4db
      Filipe Manana authored
      commit f0cc2cd7 upstream.
      
      During unmount we can have a job from the delayed inode items work queue
      still running, that can lead to at least two bad things:
      
      1) A crash, because the worker can try to create a transaction just
         after the fs roots were freed;
      
      2) A transaction leak, because the worker can create a transaction
         before the fs roots are freed and just after we committed the last
         transaction and after we stopped the transaction kthread.
      
      A stack trace example of the crash:
      
       [79011.691214] kernel BUG at lib/radix-tree.c:982!
       [79011.692056] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
       [79011.693180] CPU: 3 PID: 1394 Comm: kworker/u8:2 Tainted: G        W         5.6.0-rc2-btrfs-next-54 #2
       (...)
       [79011.696789] Workqueue: btrfs-delayed-meta btrfs_work_helper [btrfs]
       [79011.697904] RIP: 0010:radix_tree_tag_set+0xe7/0x170
       (...)
       [79011.702014] RSP: 0018:ffffb3c84a317ca0 EFLAGS: 00010293
       [79011.702949] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
       [79011.704202] RDX: ffffb3c84a317cb0 RSI: ffffb3c84a317ca8 RDI: ffff8db3931340a0
       [79011.705463] RBP: 0000000000000005 R08: 0000000000000005 R09: ffffffff974629d0
       [79011.706756] R10: ffffb3c84a317bc0 R11: 0000000000000001 R12: ffff8db393134000
       [79011.708010] R13: ffff8db3931340a0 R14: ffff8db393134068 R15: 0000000000000001
       [79011.709270] FS:  0000000000000000(0000) GS:ffff8db3b6a00000(0000) knlGS:0000000000000000
       [79011.710699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [79011.711710] CR2: 00007f22c2a0a000 CR3: 0000000232ad4005 CR4: 00000000003606e0
       [79011.712958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       [79011.714205] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       [79011.715448] Call Trace:
       [79011.715925]  record_root_in_trans+0x72/0xf0 [btrfs]
       [79011.716819]  btrfs_record_root_in_trans+0x4b/0x70 [btrfs]
       [79011.717925]  start_transaction+0xdd/0x5c0 [btrfs]
       [79011.718829]  btrfs_async_run_delayed_root+0x17e/0x2b0 [btrfs]
       [79011.719915]  btrfs_work_helper+0xaa/0x720 [btrfs]
       [79011.720773]  process_one_work+0x26d/0x6a0
       [79011.721497]  worker_thread+0x4f/0x3e0
       [79011.722153]  ? process_one_work+0x6a0/0x6a0
       [79011.722901]  kthread+0x103/0x140
       [79011.723481]  ? kthread_create_worker_on_cpu+0x70/0x70
       [79011.724379]  ret_from_fork+0x3a/0x50
       (...)
      
      The following diagram shows a sequence of steps that lead to the crash
      during ummount of the filesystem:
      
              CPU 1                                             CPU 2                                CPU 3
      
       btrfs_punch_hole()
         btrfs_btree_balance_dirty()
           btrfs_balance_delayed_items()
             --> sees
                 fs_info->delayed_root->items
                 with value 200, which is greater
                 than
                 BTRFS_DELAYED_BACKGROUND (128)
                 and smaller than
                 BTRFS_DELAYED_WRITEBACK (512)
             btrfs_wq_run_delayed_node()
               --> queues a job for
                   fs_info->delayed_workers to run
                   btrfs_async_run_delayed_root()
      
                                                                                                  btrfs_async_run_delayed_root()
                                                                                                    --> job queued by CPU 1
      
                                                                                                    --> starts picking and running
                                                                                                        delayed nodes from the
                                                                                                        prepare_list list
      
                                                       close_ctree()
      
                                                         btrfs_delete_unused_bgs()
      
                                                         btrfs_commit_super()
      
                                                           btrfs_join_transaction()
                                                             --> gets transaction N
      
                                                           btrfs_commit_transaction(N)
                                                             --> set transaction state
                                                              to TRANTS_STATE_COMMIT_START
      
                                                                                                   btrfs_first_prepared_delayed_node()
                                                                                                     --> picks delayed node X through
                                                                                                         the prepared_list list
      
                                                             btrfs_run_delayed_items()
      
                                                               btrfs_first_delayed_node()
                                                                 --> also picks delayed node X
                                                                     but through the node_list
                                                                     list
      
                                                               __btrfs_commit_inode_delayed_items()
                                                                  --> runs all delayed items from
                                                                      this node and drops the
                                                                      node's item count to 0
                                                                      through call to
                                                                      btrfs_release_delayed_inode()
      
                                                               --> finishes running any remaining
                                                                   delayed nodes
      
                                                             --> finishes transaction commit
      
                                                         --> stops cleaner and transaction threads
      
                                                         btrfs_free_fs_roots()
                                                           --> frees all roots and removes them
                                                               from the radix tree
                                                               fs_info->fs_roots_radix
      
                                                                                                   btrfs_join_transaction()
                                                                                                     start_transaction()
                                                                                                       btrfs_record_root_in_trans()
                                                                                                         record_root_in_trans()
                                                                                                           radix_tree_tag_set()
                                                                                                             --> crashes because
                                                                                                                 the root is not in
                                                                                                                 the radix tree
                                                                                                                 anymore
      
      If the worker is able to call btrfs_join_transaction() before the unmount
      task frees the fs roots, we end up leaking a transaction and all its
      resources, since after the call to btrfs_commit_super() and stopping the
      transaction kthread, we don't expect to have any transaction open anymore.
      
      When this situation happens the worker has a delayed node that has no
      more items to run, since the task calling btrfs_run_delayed_items(),
      which is doing a transaction commit, picks the same node and runs all
      its items first.
      
      We can not wait for the worker to complete when running delayed items
      through btrfs_run_delayed_items(), because we call that function in
      several phases of a transaction commit, and that could cause a deadlock
      because the worker calls btrfs_join_transaction() and the task doing the
      transaction commit may have already set the transaction state to
      TRANS_STATE_COMMIT_DOING.
      
      Also it's not possible to get into a situation where only some of the
      items of a delayed node are added to the fs/subvolume tree in the current
      transaction and the remaining ones in the next transaction, because when
      running the items of a delayed inode we lock its mutex, effectively
      waiting for the worker if the worker is running the items of the delayed
      node already.
      
      Since this can only cause issues when unmounting a filesystem, fix it in
      a simple way by waiting for any jobs on the delayed workers queue before
      calling btrfs_commit_supper() at close_ctree(). This works because at this
      point no one can call btrfs_btree_balance_dirty() or
      btrfs_balance_delayed_items(), and if we end up waiting for any worker to
      complete, btrfs_commit_super() will commit the transaction created by the
      worker.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ed0c4db
    • Frieder Schrempf's avatar
      mtd: spinand: Do not erase the block before writing a bad block marker · d389050b
      Frieder Schrempf authored
      commit b645ad39 upstream.
      
      Currently when marking a block, we use spinand_erase_op() to erase
      the block before writing the marker to the OOB area. Doing so without
      waiting for the operation to finish can lead to the marking failing
      silently and no bad block marker being written to the flash.
      
      In fact we don't need to do an erase at all before writing the BBM.
      The ECC is disabled for raw accesses to the OOB data and we don't
      need to work around any issues with chips reporting ECC errors as it
      is known to be the case for raw NAND.
      
      Fixes: 7529df46 ("mtd: nand: Add core infrastructure to support SPI NANDs")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarFrieder Schrempf <frieder.schrempf@kontron.de>
      Reviewed-by: default avatarBoris Brezillon <boris.brezillon@collabora.com>
      Signed-off-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20200218100432.32433-4-frieder.schrempf@kontron.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d389050b
    • Frieder Schrempf's avatar
      mtd: spinand: Stop using spinand->oobbuf for buffering bad block markers · a8899631
      Frieder Schrempf authored
      commit 21489375 upstream.
      
      For reading and writing the bad block markers, spinand->oobbuf is
      currently used as a buffer for the marker bytes. During the
      underlying read and write operations to actually get/set the content
      of the OOB area, the content of spinand->oobbuf is reused and changed
      by accessing it through spinand->oobbuf and/or spinand->databuf.
      
      This is a flaw in the original design of the SPI NAND core and at the
      latest from 13c15e07 ("mtd: spinand: Handle the case where
      PROGRAM LOAD does not reset the cache") on, it results in not having
      the bad block marker written at all, as the spinand->oobbuf is
      cleared to 0xff after setting the marker bytes to zero.
      
      To fix it, we now just store the two bytes for the marker on the
      stack and let the read/write operations copy it from/to the page
      buffer later.
      
      Fixes: 7529df46 ("mtd: nand: Add core infrastructure to support SPI NANDs")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarFrieder Schrempf <frieder.schrempf@kontron.de>
      Reviewed-by: default avatarBoris Brezillon <boris.brezillon@collabora.com>
      Signed-off-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Link: https://lore.kernel.org/linux-mtd/20200218100432.32433-2-frieder.schrempf@kontron.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a8899631
    • Yilu Lin's avatar
      CIFS: Fix bug which the return value by asynchronous read is error · 9bc02258
      Yilu Lin authored
      commit 97adda8b upstream.
      
      This patch is used to fix the bug in collect_uncached_read_data()
      that rc is automatically converted from a signed number to an
      unsigned number when the CIFS asynchronous read fails.
      It will cause ctx->rc is error.
      
      Example:
      Share a directory and create a file on the Windows OS.
      Mount the directory to the Linux OS using CIFS.
      On the CIFS client of the Linux OS, invoke the pread interface to
      deliver the read request.
      
      The size of the read length plus offset of the read request is greater
      than the maximum file size.
      
      In this case, the CIFS server on the Windows OS returns a failure
      message (for example, the return value of
      smb2.nt_status is STATUS_INVALID_PARAMETER).
      
      After receiving the response message, the CIFS client parses
      smb2.nt_status to STATUS_INVALID_PARAMETER
      and converts it to the Linux error code (rdata->result=-22).
      
      Then the CIFS client invokes the collect_uncached_read_data function to
      assign the value of rdata->result to rc, that is, rc=rdata->result=-22.
      
      The type of the ctx->total_len variable is unsigned integer,
      the type of the rc variable is integer, and the type of
      the ctx->rc variable is ssize_t.
      
      Therefore, during the ternary operation, the value of rc is
      automatically converted to an unsigned number. The final result is
      ctx->rc=4294967274. However, the expected result is ctx->rc=-22.
      Signed-off-by: default avatarYilu Lin <linyilu@huawei.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Acked-by: default avatarRonnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9bc02258
    • Vitaly Kuznetsov's avatar
      KVM: VMX: fix crash cleanup when KVM wasn't used · f9971a89
      Vitaly Kuznetsov authored
      commit dbef2808 upstream.
      
      If KVM wasn't used at all before we crash the cleanup procedure fails with
       BUG: unable to handle page fault for address: ffffffffffffffc8
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 23215067 P4D 23215067 PUD 23217067 PMD 0
       Oops: 0000 [#8] SMP PTI
       CPU: 0 PID: 3542 Comm: bash Kdump: loaded Tainted: G      D           5.6.0-rc2+ #823
       RIP: 0010:crash_vmclear_local_loaded_vmcss.cold+0x19/0x51 [kvm_intel]
      
      The root cause is that loaded_vmcss_on_cpu list is not yet initialized,
      we initialize it in hardware_enable() but this only happens when we start
      a VM.
      
      Previously, we used to have a bitmap with enabled CPUs and that was
      preventing [masking] the issue.
      
      Initialized loaded_vmcss_on_cpu list earlier, right before we assign
      crash_vmclear_loaded_vmcss pointer. blocked_vcpu_on_cpu list and
      blocked_vcpu_on_cpu_lock are moved altogether for consistency.
      
      Fixes: 31603d4f ("KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support")
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20200401081348.1345307-1-vkuznets@redhat.com>
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9971a89
    • Sean Christopherson's avatar
      KVM: x86: Gracefully handle __vmalloc() failure during VM allocation · 4538f42a
      Sean Christopherson authored
      commit d18b2f43 upstream.
      
      Check the result of __vmalloc() to avoid dereferencing a NULL pointer in
      the event that allocation failres.
      
      Fixes: d1e5b0e9 ("kvm: Make VM ioctl do valloc for some archs")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4538f42a
    • Sean Christopherson's avatar
      KVM: VMX: Always VMCLEAR in-use VMCSes during crash with kexec support · a9f890aa
      Sean Christopherson authored
      commit 31603d4f upstream.
      
      VMCLEAR all in-use VMCSes during a crash, even if kdump's NMI shootdown
      interrupted a KVM update of the percpu in-use VMCS list.
      
      Because NMIs are not blocked by disabling IRQs, it's possible that
      crash_vmclear_local_loaded_vmcss() could be called while the percpu list
      of VMCSes is being modified, e.g. in the middle of list_add() in
      vmx_vcpu_load_vmcs().  This potential corner case was called out in the
      original commit[*], but the analysis of its impact was wrong.
      
      Skipping the VMCLEARs is wrong because it all but guarantees that a
      loaded, and therefore cached, VMCS will live across kexec and corrupt
      memory in the new kernel.  Corruption will occur because the CPU's VMCS
      cache is non-coherent, i.e. not snooped, and so the writeback of VMCS
      memory on its eviction will overwrite random memory in the new kernel.
      The VMCS will live because the NMI shootdown also disables VMX, i.e. the
      in-progress VMCLEAR will #UD, and existing Intel CPUs do not flush the
      VMCS cache on VMXOFF.
      
      Furthermore, interrupting list_add() and list_del() is safe due to
      crash_vmclear_local_loaded_vmcss() using forward iteration.  list_add()
      ensures the new entry is not visible to forward iteration unless the
      entire add completes, via WRITE_ONCE(prev->next, new).  A bad "prev"
      pointer could be observed if the NMI shootdown interrupted list_del() or
      list_add(), but list_for_each_entry() does not consume ->prev.
      
      In addition to removing the temporary disabling of VMCLEAR, open code
      loaded_vmcs_init() in __loaded_vmcs_clear() and reorder VMCLEAR so that
      the VMCS is deleted from the list only after it's been VMCLEAR'd.
      Deleting the VMCS before VMCLEAR would allow a race where the NMI
      shootdown could arrive between list_del() and vmcs_clear() and thus
      neither flow would execute a successful VMCLEAR.  Alternatively, more
      code could be moved into loaded_vmcs_init(), but that gets rather silly
      as the only other user, alloc_loaded_vmcs(), doesn't need the smp_wmb()
      and would need to work around the list_del().
      
      Update the smp_*() comments related to the list manipulation, and
      opportunistically reword them to improve clarity.
      
      [*] https://patchwork.kernel.org/patch/1675731/#3720461
      
      Fixes: 8f536b76 ("KVM: VMX: provide the vmclear function and a bitmap to support VMCLEAR in kdump")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200321193751.24985-2-sean.j.christopherson@intel.com>
      Reviewed-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9f890aa
    • Sean Christopherson's avatar
      KVM: x86: Allocate new rmap and large page tracking when moving memslot · 4a0efabb
      Sean Christopherson authored
      commit edd4fa37 upstream.
      
      Reallocate a rmap array and recalcuate large page compatibility when
      moving an existing memslot to correctly handle the alignment properties
      of the new memslot.  The number of rmap entries required at each level
      is dependent on the alignment of the memslot's base gfn with respect to
      that level, e.g. moving a large-page aligned memslot so that it becomes
      unaligned will increase the number of rmap entries needed at the now
      unaligned level.
      
      Not updating the rmap array is the most obvious bug, as KVM accesses
      garbage data beyond the end of the rmap.  KVM interprets the bad data as
      pointers, leading to non-canonical #GPs, unexpected #PFs, etc...
      
        general protection fault: 0000 [#1] SMP
        CPU: 0 PID: 1909 Comm: move_memory_reg Not tainted 5.4.0-rc7+ #139
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        RIP: 0010:rmap_get_first+0x37/0x50 [kvm]
        Code: <48> 8b 3b 48 85 ff 74 ec e8 6c f4 ff ff 85 c0 74 e3 48 89 d8 5b c3
        RSP: 0018:ffffc9000021bbc8 EFLAGS: 00010246
        RAX: ffff00617461642e RBX: ffff00617461642e RCX: 0000000000000012
        RDX: ffff88827400f568 RSI: ffffc9000021bbe0 RDI: ffff88827400f570
        RBP: 0010000000000000 R08: ffffc9000021bd00 R09: ffffc9000021bda8
        R10: ffffc9000021bc48 R11: 0000000000000000 R12: 0030000000000000
        R13: 0000000000000000 R14: ffff88827427d700 R15: ffffc9000021bce8
        FS:  00007f7eda014700(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f7ed9216ff8 CR3: 0000000274391003 CR4: 0000000000162eb0
        Call Trace:
         kvm_mmu_slot_set_dirty+0xa1/0x150 [kvm]
         __kvm_set_memory_region.part.64+0x559/0x960 [kvm]
         kvm_set_memory_region+0x45/0x60 [kvm]
         kvm_vm_ioctl+0x30f/0x920 [kvm]
         do_vfs_ioctl+0xa1/0x620
         ksys_ioctl+0x66/0x70
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x4c/0x170
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7f7ed9911f47
        Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 21 6f 2c 00 f7 d8 64 89 01 48
        RSP: 002b:00007ffc00937498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        RAX: ffffffffffffffda RBX: 0000000001ab0010 RCX: 00007f7ed9911f47
        RDX: 0000000001ab1350 RSI: 000000004020ae46 RDI: 0000000000000004
        RBP: 000000000000000a R08: 0000000000000000 R09: 00007f7ed9214700
        R10: 00007f7ed92149d0 R11: 0000000000000246 R12: 00000000bffff000
        R13: 0000000000000003 R14: 00007f7ed9215000 R15: 0000000000000000
        Modules linked in: kvm_intel kvm irqbypass
        ---[ end trace 0c5f570b3358ca89 ]---
      
      The disallow_lpage tracking is more subtle.  Failure to update results
      in KVM creating large pages when it shouldn't, either due to stale data
      or again due to indexing beyond the end of the metadata arrays, which
      can lead to memory corruption and/or leaking data to guest/userspace.
      
      Note, the arrays for the old memslot are freed by the unconditional call
      to kvm_free_memslot() in __kvm_set_memory_region().
      
      Fixes: 05da4558 ("KVM: MMU: large page support")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a0efabb
    • David Hildenbrand's avatar
      KVM: s390: vsie: Fix delivery of addressing exceptions · de2ac8a7
      David Hildenbrand authored
      commit 4d4cee96 upstream.
      
      Whenever we get an -EFAULT, we failed to read in guest 2 physical
      address space. Such addressing exceptions are reported via a program
      intercept to the nested hypervisor.
      
      We faked the intercept, we have to return to guest 2. Instead, right
      now we would be returning -EFAULT from the intercept handler, eventually
      crashing the VM.
      the correct thing to do is to return 1 as rc == 1 is the internal
      representation of "we have to go back into g2".
      
      Addressing exceptions can only happen if the g2->g3 page tables
      reference invalid g2 addresses (say, either a table or the final page is
      not accessible - so something that basically never happens in sane
      environments.
      
      Identified by manual code inspection.
      
      Fixes: a3508fbe ("KVM: s390: vsie: initial support for nested virtualization")
      Cc: <stable@vger.kernel.org> # v4.8+
      Signed-off-by: default avatarDavid Hildenbrand <david@redhat.com>
      Link: https://lore.kernel.org/r/20200403153050.20569-3-david@redhat.comReviewed-by: default avatarClaudio Imbrenda <imbrenda@linux.ibm.com>
      Reviewed-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      [borntraeger@de.ibm.com: fix patch description]
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      de2ac8a7
    • David Hildenbrand's avatar
      KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks · 50a59d2d
      David Hildenbrand authored
      commit a1d032a4 upstream.
      
      In case we have a region 1 the following calculation
      (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11)
      results in 64. As shifts beyond the size are undefined the compiler is
      free to use instructions like sllg. sllg will only use 6 bits of the
      shift value (here 64) resulting in no shift at all. That means that ALL
      addresses will be rejected.
      
      The can result in endless loops, e.g. when prefix cannot get mapped.
      
      Fixes: 4be130a0 ("s390/mm: add shadow gmap support")
      Tested-by: default avatarJanosch Frank <frankja@linux.ibm.com>
      Reported-by: default avatarJanosch Frank <frankja@linux.ibm.com>
      Cc: <stable@vger.kernel.org> # v4.8+
      Signed-off-by: default avatarDavid Hildenbrand <david@redhat.com>
      Link: https://lore.kernel.org/r/20200403153050.20569-2-david@redhat.comReviewed-by: default avatarClaudio Imbrenda <imbrenda@linux.ibm.com>
      Reviewed-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      [borntraeger@de.ibm.com: fix patch description, remove WARN_ON_ONCE]
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50a59d2d
    • Sean Christopherson's avatar
      KVM: nVMX: Properly handle userspace interrupt window request · deecbb36
      Sean Christopherson authored
      commit a1c77abb upstream.
      
      Return true for vmx_interrupt_allowed() if the vCPU is in L2 and L1 has
      external interrupt exiting enabled.  IRQs are never blocked in hardware
      if the CPU is in the guest (L2 from L1's perspective) when IRQs trigger
      VM-Exit.
      
      The new check percolates up to kvm_vcpu_ready_for_interrupt_injection()
      and thus vcpu_run(), and so KVM will exit to userspace if userspace has
      requested an interrupt window (to inject an IRQ into L1).
      
      Remove the @external_intr param from vmx_check_nested_events(), which is
      actually an indicator that userspace wants an interrupt window, e.g.
      it's named @req_int_win further up the stack.  Injecting a VM-Exit into
      L1 to try and bounce out to L0 userspace is all kinds of broken and is
      no longer necessary.
      
      Remove the hack in nested_vmx_vmexit() that attempted to workaround the
      breakage in vmx_check_nested_events() by only filling interrupt info if
      there's an actual interrupt pending.  The hack actually made things
      worse because it caused KVM to _never_ fill interrupt info when the
      LAPIC resides in userspace (kvm_cpu_has_interrupt() queries
      interrupt.injected, which is always cleared by prepare_vmcs12() before
      reaching the hack in nested_vmx_vmexit()).
      
      Fixes: 6550c4df ("KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"")
      Cc: stable@vger.kernel.org
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      deecbb36
    • Thomas Gleixner's avatar
      x86/entry/32: Add missing ASM_CLAC to general_protection entry · 7460d17c
      Thomas Gleixner authored
      commit 3d51507f upstream.
      
      All exception entry points must have ASM_CLAC right at the
      beginning. The general_protection entry is missing one.
      
      Fixes: e59d1b0a ("x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarFrederic Weisbecker <frederic@kernel.org>
      Reviewed-by: default avatarAlexandre Chartre <alexandre.chartre@oracle.com>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20200225220216.219537887@linutronix.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7460d17c