1. 23 Mar, 2019 40 commits
    • soc: qcom: rpmh: Avoid accessing freed memory from batch API · 02c55be5
      Stephen Boyd authored
      commit baef1c90 upstream.
      
      Using the batch API from the interconnect driver sometimes leads to a
      KASAN error due to an access to freed memory. This is easier to trigger
      with threadirqs on the kernel commandline.
      
       BUG: KASAN: use-after-free in rpmh_tx_done+0x114/0x12c
       Read of size 1 at addr fffffff51414ad84 by task irq/110-apps_rs/57
      
       CPU: 0 PID: 57 Comm: irq/110-apps_rs Tainted: G        W         4.19.10 #72
       Call trace:
        dump_backtrace+0x0/0x2f8
        show_stack+0x20/0x2c
        __dump_stack+0x20/0x28
        dump_stack+0xcc/0x10c
        print_address_description+0x74/0x240
        kasan_report+0x250/0x26c
        __asan_report_load1_noabort+0x20/0x2c
        rpmh_tx_done+0x114/0x12c
        tcs_tx_done+0x450/0x768
        irq_forced_thread_fn+0x58/0x9c
        irq_thread+0x120/0x1dc
        kthread+0x248/0x260
        ret_from_fork+0x10/0x18
      
       Allocated by task 385:
        kasan_kmalloc+0xac/0x148
        __kmalloc+0x170/0x1e4
        rpmh_write_batch+0x174/0x540
        qcom_icc_set+0x8dc/0x9ac
        icc_set+0x288/0x2e8
        a6xx_gmu_stop+0x320/0x3c0
        a6xx_pm_suspend+0x108/0x124
        adreno_suspend+0x50/0x60
        pm_generic_runtime_suspend+0x60/0x78
        __rpm_callback+0x214/0x32c
        rpm_callback+0x54/0x184
        rpm_suspend+0x3f8/0xa90
        pm_runtime_work+0xb4/0x178
        process_one_work+0x544/0xbc0
        worker_thread+0x514/0x7d0
        kthread+0x248/0x260
        ret_from_fork+0x10/0x18
      
       Freed by task 385:
        __kasan_slab_free+0x12c/0x1e0
        kasan_slab_free+0x10/0x1c
        kfree+0x134/0x588
        rpmh_write_batch+0x49c/0x540
        qcom_icc_set+0x8dc/0x9ac
        icc_set+0x288/0x2e8
        a6xx_gmu_stop+0x320/0x3c0
        a6xx_pm_suspend+0x108/0x124
        adreno_suspend+0x50/0x60
       cr50_spi spi5.0: SPI transfer timed out
        pm_generic_runtime_suspend+0x60/0x78
        __rpm_callback+0x214/0x32c
        rpm_callback+0x54/0x184
        rpm_suspend+0x3f8/0xa90
        pm_runtime_work+0xb4/0x178
        process_one_work+0x544/0xbc0
        worker_thread+0x514/0x7d0
        kthread+0x248/0x260
        ret_from_fork+0x10/0x18
      
       The buggy address belongs to the object at fffffff51414ac80
        which belongs to the cache kmalloc-512 of size 512
       The buggy address is located 260 bytes inside of
        512-byte region [fffffff51414ac80, fffffff51414ae80)
       The buggy address belongs to the page:
       page:ffffffbfd4505200 count:1 mapcount:0 mapping:fffffff51e00c680 index:0x0 compound_mapcount: 0
       flags: 0x4000000000008100(slab|head)
       raw: 4000000000008100 ffffffbfd4529008 ffffffbfd44f9208 fffffff51e00c680
       raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        fffffff51414ac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        fffffff51414ad00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       >fffffff51414ad80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                          ^
        fffffff51414ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        fffffff51414ae80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      The batch API sets the same completion for each rpmh message that's sent
      and then loops through all the messages and waits for that single
      completion declared on the stack to be completed before returning from
      the function and freeing the message structures. Unfortunately, some
      messages may still be in process and 'stuck' in the TCS. At some later
      point, the tcs_tx_done() interrupt will run and try to process messages
      that have already been freed at the end of rpmh_write_batch(). This will
      in turn access the 'needs_free' member of the rpmh_request structure and
      cause KASAN to complain. Furthermore, if there's a message that's
      completed in rpmh_tx_done() and freed immediately after the complete()
      call is made we'll be racing with potentially freed memory when
      accessing the 'needs_free' member:
      
      	CPU0                         CPU1
      	----                         ----
      	rpmh_tx_done()
      	 complete(&compl)
      	                             wait_for_completion(&compl)
      	                             kfree(rpm_msg)
      	 if (rpm_msg->needs_free)
      	 <KASAN warning splat>
      
      Let's fix this by allocating a chunk of completions for each message and
      waiting for all of them to be completed before returning from the batch
      API. Alternatively, we could wait for the last message in the batch, but
      that may be a more complicated change because it looks like
      tcs_tx_done() just iterates through the indices of the queue and
      completes each message instead of tracking the last inserted message and
      completing that first.
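      
      As a rough sketch of that idea (illustrative only, not the exact driver
      change; the function name is made up):
      
        /* Hedged sketch: one completion per message in the batch. */
        static int send_batch_and_wait(struct rpmh_request *rpm_msgs, int count)
        {
                struct completion *compls;
                int i;
      
                compls = kcalloc(count, sizeof(*compls), GFP_KERNEL);
                if (!compls)
                        return -ENOMEM;
      
                for (i = 0; i < count; i++) {
                        init_completion(&compls[i]);
                        rpm_msgs[i].completion = &compls[i];
                        /* hand rpm_msgs[i] over to the TCS here */
                }
      
                /* Only after every message has signalled is it safe to free them. */
                for (i = 0; i < count; i++)
                        wait_for_completion(&compls[i]);
      
                kfree(compls);
                return 0;
        }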
      
      Fixes: c8790cb6 ("drivers: qcom: rpmh: add support for batch RPMH request")
      Cc: Lina Iyer <ilina@codeaurora.org>
      Cc: "Raju P.L.S.S.S.N" <rplsssn@codeaurora.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: Lina Iyer <ilina@codeaurora.org>
      Reviewed-by: Evan Green <evgreen@chromium.org>
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Andy Gross <andy.gross@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      02c55be5
    • Btrfs: fix corruption reading shared and compressed extents after hole punching · 898488e2
      Filipe Manana authored
      commit 8e928218 upstream.
      
      In the past we had data corruption when reading compressed extents that
      are shared within the same file and are consecutive; this got fixed
      by commit 005efedf ("Btrfs: fix read corruption of compressed and
      shared extents") and by commit 808f80b4 ("Btrfs: update fix for read
      corruption of compressed and shared extents"). However there was a case
      that was missing in those fixes, which is when the shared and compressed
      extents are referenced with a non-zero offset. The following shell script
      creates a reproducer for this issue:
      
        #!/bin/bash
      
        mkfs.btrfs -f /dev/sdc &> /dev/null
        mount -o compress /dev/sdc /mnt/sdc
      
        # Create a file with 3 consecutive compressed extents, each has an
        # uncompressed size of 128Kb and a compressed size of 4Kb.
        for ((i = 1; i <= 3; i++)); do
            head -c 4096 /dev/zero
            for ((j = 1; j <= 31; j++)); do
                head -c 4096 /dev/zero | tr '\0' "\377"
            done
        done > /mnt/sdc/foobar
        sync
      
        echo "Digest after file creation:   $(md5sum /mnt/sdc/foobar)"
      
        # Clone the first extent into offsets 128K and 256K.
        xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
        xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
        sync
      
        echo "Digest after cloning:         $(md5sum /mnt/sdc/foobar)"
      
        # Punch holes into the regions that are already full of zeroes.
        xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
        xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
        xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
        sync
      
        echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"
      
        echo "Dropping page cache..."
        sysctl -q vm.drop_caches=1
        echo "Digest after hole punching:   $(md5sum /mnt/sdc/foobar)"
      
        umount /dev/sdc
      
      When running the script we get the following output:
      
        Digest after file creation:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
        linked 131072/131072 bytes at offset 131072
        128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
        linked 131072/131072 bytes at offset 262144
        128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
        Digest after cloning:         5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
        Digest after hole punching:   5a0888d80d7ab1fd31c229f83a3bbcc8  /mnt/sdc/foobar
        Dropping page cache...
        Digest after hole punching:   fba694ae8664ed0c2e9ff8937e7f1484  /mnt/sdc/foobar
      
      This happens because after reading all the pages of the extent in the
      range from 128K to 256K for example, we read the hole at offset 256K
      and then when reading the page at offset 260K we don't submit the
      existing bio, which is responsible for filling only the pages in the
      range 128K to 256K, and therefore we add the pages from the range 260K
      to 384K to the existing bio and submit it after iterating over the
      entire range. Once the bio completes, the uncompressed data fills only
      the pages in the range 128K to 256K because there's no more data read
      from disk, leaving the pages in the range 260K to 384K unfilled. It is
      just a slightly different variant of what was solved by commit
      005efedf ("Btrfs: fix read corruption of compressed and shared
      extents").
      
      Fix this by forcing a bio submit, during readpages(), whenever we find a
      compressed extent map for a page that is different from the extent map
      for the previous page or has a different starting offset (in case it's
      the same compressed extent), instead of the extent map's original start
      offset.
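      
      A hedged sketch of that check (illustrative fragment, not the exact
      extent_io.c hunk; 'prev_em_start' tracks the extent map used for the
      previous page):
      
        /* Force a submit when the compressed extent map changes between pages. */
        if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) &&
            prev_em_start && *prev_em_start != (u64)-1 &&
            *prev_em_start != em->start)
                force_bio_submit = true;
      
        if (prev_em_start)
                *prev_em_start = em->start;  /* before the fix this was em->orig_start */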
      
      A test case for fstests follows soon.
      Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
      Fixes: 808f80b4 ("Btrfs: update fix for read corruption of compressed and shared extents")
      Fixes: 005efedf ("Btrfs: fix read corruption of compressed and shared extents")
      Cc: stable@vger.kernel.org # 4.3+
      Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      898488e2
    • btrfs: ensure that a DUP or RAID1 block group has exactly two stripes · 1a00f7fd
      Johannes Thumshirn authored
      commit 349ae63f upstream.
      
      We recently had a customer issue with a corrupted filesystem. When
      trying to mount this image btrfs panicked with a division by zero in
      calc_stripe_length().
      
      The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
      takes this value and divides it by the number of copies the RAID profile
      is expected to have to calculate the amount of data stripes. As a DUP
      profile is expected to have 2 copies this division resulted in 1/2 = 0.
      Later then the 'data_stripes' variable is used as a divisor in the
      stripe length calculation which results in a division by 0 and thus a
      kernel panic.
      
      When encountering a filesystem with a DUP block group and a
      'num_stripes' value unequal to 2, refuse mounting as the image is
      corrupted and will lead to unexpected behaviour.
      
      Code inspection showed a RAID1 block group has the same issues.
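      
      A hedged sketch of the added validation (illustrative, roughly what the
      chunk sanity check would look like):
      
        if ((type & BTRFS_BLOCK_GROUP_DUP && num_stripes != 2) ||
            (type & BTRFS_BLOCK_GROUP_RAID1 && num_stripes != 2)) {
                btrfs_err(fs_info,
                          "invalid chunk: DUP/RAID1 needs exactly 2 stripes, have %u",
                          num_stripes);
                return -EIO;
        }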
      
      Fixes: e06cd3dd ("Btrfs: add validadtion checks for chunk loading")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a00f7fd
    • Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl · 6e24f5a1
      Filipe Manana authored
      commit a0873490 upstream.
      
      We are holding a transaction handle when setting an acl, therefore we can
      not allocate the xattr value buffer using GFP_KERNEL, as we could deadlock
      if reclaim is triggered by the allocation, therefore setup a nofs context.
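      
      The pattern is the scoped NOFS API; a hedged sketch (not the exact hunk):
      
        unsigned int nofs_flag;
      
        /* While the scope is active, GFP_KERNEL allocations behave as GFP_NOFS. */
        nofs_flag = memalloc_nofs_save();
        value = kmalloc(size, GFP_KERNEL);
        memalloc_nofs_restore(nofs_flag);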
      
      Fixes: 39a27ec1 ("btrfs: use GFP_KERNEL for xattr and acl allocations")
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6e24f5a1
    • Btrfs: setup a nofs context for memory allocation at btrfs_create_tree() · 61f92096
      Filipe Manana authored
      commit b89f6d1f upstream.
      
      We are holding a transaction handle when creating a tree, therefore we can
      not allocate the root using GFP_KERNEL, as we could deadlock if reclaim is
      triggered by the allocation, therefore setup a nofs context.
      
      Fixes: 74e4d827 ("btrfs: let callers of btrfs_alloc_root pass gfp flags")
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      61f92096
    • m68k: Add -ffreestanding to CFLAGS · fcbf12e2
      Finn Thain authored
      commit 28713169 upstream.
      
      This patch fixes a build failure when using GCC 8.1:
      
      /usr/bin/ld: block/partitions/ldm.o: in function `ldm_parse_tocblock':
      block/partitions/ldm.c:153: undefined reference to `strcmp'
      
      This is caused by a new optimization which effectively replaces a
      strncmp() call with a strcmp() call. This affects a number of strncmp()
      call sites in the kernel.
      
      The entire class of optimizations is avoided with -fno-builtin, which
      gets enabled by -ffreestanding. This may avoid possible future build
      failures in case new optimizations appear in future compilers.
      
      I haven't done any performance measurements with this patch but I did
      count the function calls in a defconfig build. For example, there are now
      23 more sprintf() calls and 39 fewer strcpy() calls. The effect on the
      other libc functions is smaller.
      
      If this harms performance we can tackle that regression by optimizing
      the call sites, ideally using semantic patches. That way, clang and ICC
      builds might benefit too.
      
      Cc: stable@vger.kernel.org
      Reference: https://marc.info/?l=linux-m68k&m=154514816222244&w=2
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcbf12e2
    • ovl: Do not lose security.capability xattr over metadata file copy-up · 205f149f
      Vivek Goyal authored
      commit 993a0b2a upstream.
      
      If a file has been copied up metadata only, and later data is copied up,
      upper loses any security.capability xattr it has (the underlying filesystem
      clears it upon file write).
      
      From a user's point of view, this is just a file copy-up and that should
      not result in losing security.capability xattr.  Hence, before data copy
      up, save security.capability xattr (if any) and restore it on upper after
      data copy up is complete.
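      
      A hedged sketch of that save/restore (helper and variable names are
      illustrative; do_copy_up_data() is a hypothetical stand-in for the data
      copy-up step):
      
        static int copy_up_data_keep_caps(struct dentry *upper)
        {
                char *cap = NULL;
                ssize_t cap_size;
                int err;
      
                /* May return -ENODATA when no capability is set; that is fine. */
                cap_size = vfs_getxattr_alloc(upper, XATTR_NAME_CAPS, &cap, 0, GFP_KERNEL);
      
                err = do_copy_up_data(upper);   /* hypothetical data copy-up step */
      
                /* Writing the data cleared the xattr on upper; put it back. */
                if (!err && cap_size > 0)
                        err = vfs_setxattr(upper, XATTR_NAME_CAPS, cap, cap_size, 0);
      
                kfree(cap);
                return err;
        }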
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Fixes: 0c288874 ("ovl: A new xattr OVL_XATTR_METACOPY for file on upper")
      Cc: <stable@vger.kernel.org> # v4.19+
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      205f149f
    • ovl: During copy up, first copy up data and then xattrs · 6f048ae2
      Vivek Goyal authored
      commit 5f32879e upstream.
      
      If a file with a capability set (and hence a security.capability xattr) is
      written, the kernel clears the security.capability xattr. For overlay,
      during file copy up, xattrs are copied up first and then the data is.
      This means data copy up will result in clearing of the security.capability
      xattr the file on lower has. And this can result in surprises. If a lower
      file has CAP_SETUID, then it should not be cleared over copy up (if
      nothing was actually written to the file).
      
      This also creates problems with the chown logic where it first copies up
      the file and then tries to clear the setuid bit. But by that time the
      security.capability xattr is already gone (due to data copy up), and the
      caller gets -ENODATA.
      This has been reported by Giuseppe here.
      
      https://github.com/containers/libpod/issues/2015#issuecomment-447824842
      
      Fix this by copying up data first and then metadata. This is a regression
      which has been introduced by my commit as part of the metadata only copy
      up patches.
      
      TODO: There will be some corner cases where a file is copied up metadata
      only and later data copy up happens and that will clear security.capability
      xattr. Something needs to be done about that too.
      
      Fixes: bd64e575 ("ovl: During copy up, first copy up metadata and then data")
      Cc: <stable@vger.kernel.org> # v4.19+
      Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f048ae2
    • splice: don't merge into linked buffers · 2af926fd
      Jann Horn authored
      commit a0ce2f0a upstream.
      
      Before this patch, it was possible for two pipes to affect each other after
      data had been transferred between them with tee():
      
      ============
      $ cat tee_test.c
      
      int main(void) {
        int pipe_a[2];
        if (pipe(pipe_a)) err(1, "pipe");
        int pipe_b[2];
        if (pipe(pipe_b)) err(1, "pipe");
        if (write(pipe_a[1], "abcd", 4) != 4) err(1, "write");
        if (tee(pipe_a[0], pipe_b[1], 2, 0) != 2) err(1, "tee");
        if (write(pipe_b[1], "xx", 2) != 2) err(1, "write");
      
        char buf[5];
        if (read(pipe_a[0], buf, 4) != 4) err(1, "read");
        buf[4] = 0;
        printf("got back: '%s'\n", buf);
      }
      $ gcc -o tee_test tee_test.c
      $ ./tee_test
      got back: 'abxx'
      $
      ============
      
      As suggested by Al Viro, fix it by creating a separate type for
      non-mergeable pipe buffers, then changing the types of buffers in
      splice_pipe_to_pipe() and link_pipe().
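      
      The shape of the change is roughly the following (hedged sketch; the
      ops-table fields match the 4.19-era pipe_buf_operations, and the names
      used here are illustrative rather than the upstream identifiers):
      
        static const struct pipe_buf_operations nomerge_pipe_buf_ops = {
                .can_merge = 0,            /* the only difference from the anon ops */
                .confirm   = generic_pipe_buf_confirm,
                .release   = anon_pipe_buf_release,
                .steal     = anon_pipe_buf_steal,
                .get       = generic_pipe_buf_get,
        };
      
        static void pipe_buf_mark_unmergeable(struct pipe_buffer *buf)
        {
                if (buf->ops == &anon_pipe_buf_ops)
                        buf->ops = &nomerge_pipe_buf_ops;
        }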
      
      Cc: <stable@vger.kernel.org>
      Fixes: 7c77f0b3 ("splice: implement pipe to pipe splicing")
      Fixes: 70524490 ("[PATCH] splice: add support for sys_tee()")
      Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2af926fd
    • fs/devpts: always delete dcache dentry-s in dput() · 1c2123ff
      Varad Gautam authored
      commit 73052b0d upstream.
      
      d_delete only unhashes an entry if it is reached with
      dentry->d_lockref.count != 1. Prior to commit 8ead9dd5 ("devpts:
      more pty driver interface cleanups"), d_delete was called on a dentry
      from devpts_pty_kill with two references held, which would trigger the
      unhashing, and the subsequent dputs would release it.
      
      Commit 8ead9dd5 reworked devpts_pty_kill to stop acquiring the second
      reference from d_find_alias, and the d_delete call left the dentries
      still on the hashed list without actually ever being dropped from dcache
      before explicit cleanup. This causes the number of negative dentries for
      devpts to pile up, and an `ls /dev/pts` invocation can take seconds to
      return.
      
      Provide always_delete_dentry() from simple_dentry_operations
      as .d_delete for devpts, to make the dentry be dropped from dcache.
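      
      A hedged sketch of where that hook goes (the assignment is the gist; the
      rest of the superblock setup is omitted here):
      
        static int devpts_fill_super(struct super_block *s, void *data, int silent)
        {
                /* simple_dentry_operations uses always_delete_dentry() for
                 * .d_delete, so dput() drops the dentry from the dcache
                 * instead of keeping a negative entry around. */
                s->s_d_op = &simple_dentry_operations;
      
                /* existing superblock setup continues here */
                return 0;
        }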
      
      Without this cleanup, the number of dentries in /dev/pts/ can be grown
      arbitrarily as:
      
      `python -c 'import pty; pty.spawn(["ls", "/dev/pts"])'`
      
      A systemtap probe on dcache_readdir to count d_subdirs shows this count
      to increase with each pty spawn invocation above:
      
      probe kernel.function("dcache_readdir") {
          subdirs = &@cast($file->f_path->dentry, "dentry")->d_subdirs;
          p = subdirs;
          p = @cast(p, "list_head")->next;
          i = 0
          while (p != subdirs) {
            p = @cast(p, "list_head")->next;
            i = i+1;
          }
          printf("number of dentries: %d\n", i);
      }
      
      Fixes: 8ead9dd5 ("devpts: more pty driver interface cleanups")
      Signed-off-by: Varad Gautam <vrd@amazon.de>
      Reported-by: Zheng Wang <wanz@amazon.de>
      Reported-by: Brandon Schwartz <bsschwar@amazon.de>
      Root-caused-by: Maximilian Heyne <mheyne@amazon.de>
      Root-caused-by: Nicolas Pernas Maradei <npernas@amazon.de>
      CC: David Woodhouse <dwmw@amazon.co.uk>
      CC: Maximilian Heyne <mheyne@amazon.de>
      CC: Stefan Nuernberger <snu@amazon.de>
      CC: Amit Shah <aams@amazon.de>
      CC: Linus Torvalds <torvalds@linux-foundation.org>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: Al Viro <viro@ZenIV.linux.org.uk>
      CC: Christian Brauner <christian.brauner@ubuntu.com>
      CC: Eric W. Biederman <ebiederm@xmission.com>
      CC: Matthew Wilcox <willy@infradead.org>
      CC: Eric Biggers <ebiggers@google.com>
      CC: <stable@vger.kernel.org> # 4.9+
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1c2123ff
    • scsi: qla2xxx: Fix LUN discovery if loop id is not assigned yet by firmware · d8ae662b
      Himanshu Madhani authored
      commit ec322937 upstream.
      
      This patch fixes LUN discovery when loop ID is not yet assigned by the
      firmware during driver load/sg_reset operations. Driver will now search for
      new loop id before retrying login.
      
      Fixes: 48acad09 ("scsi: qla2xxx: Fix N2N link re-connect")
      Cc: stable@vger.kernel.org #4.19
      Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d8ae662b
    • scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock · f4a9fd56
      Bart Van Assche authored
      commit 32e36bfb upstream.
      
      When using SCSI passthrough in combination with the iSCSI target driver
      then cmd->t_state_lock may be obtained from interrupt context. Hence, all
      code that obtains cmd->t_state_lock from thread context must disable
      interrupts first. This patch avoids that lockdep reports the following:
      
      WARNING: inconsistent lock state
      4.18.0-dbg+ #1 Not tainted
      --------------------------------
      inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
      iscsi_ttx/1800 [HC1[1]:SC0[2]:HE0:SE0] takes:
      000000006e7b0ceb (&(&cmd->t_state_lock)->rlock){?...}, at: target_complete_cmd+0x47/0x2c0 [target_core_mod]
      {HARDIRQ-ON-W} state was registered at:
       lock_acquire+0xd2/0x260
       _raw_spin_lock+0x32/0x50
       iscsit_close_connection+0x97e/0x1020 [iscsi_target_mod]
       iscsit_take_action_for_connection_exit+0x108/0x200 [iscsi_target_mod]
       iscsi_target_rx_thread+0x180/0x190 [iscsi_target_mod]
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30
      irq event stamp: 1281
      hardirqs last  enabled at (1279): [<ffffffff970ade79>] __local_bh_enable_ip+0xa9/0x160
      hardirqs last disabled at (1281): [<ffffffff97a008a5>] interrupt_entry+0xb5/0xd0
      softirqs last  enabled at (1278): [<ffffffff977cd9a1>] lock_sock_nested+0x51/0xc0
      softirqs last disabled at (1280): [<ffffffffc07a6e04>] ip6_finish_output2+0x124/0xe40 [ipv6]
      
      other info that might help us debug this:
      Possible unsafe locking scenario:
      
            CPU0
            ----
       lock(&(&cmd->t_state_lock)->rlock);
       <Interrupt>
         lock(&(&cmd->t_state_lock)->rlock);
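      
      The fix follows the usual pattern of taking the lock with interrupts
      disabled whenever it is acquired from thread context; a hedged sketch of
      that pattern (not the exact hunk):
      
        unsigned long flags;
      
        spin_lock_irqsave(&cmd->t_state_lock, flags);
        /* update command state; target_complete_cmd() may take the same lock
         * from hard-irq context, so a plain spin_lock() here can deadlock */
        spin_unlock_irqrestore(&cmd->t_state_lock, flags);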
      f4a9fd56
    • scsi: sd: Optimal I/O size should be a multiple of physical block size · 852a4ab2
      Martin K. Petersen authored
      commit a83da8a4 upstream.
      
      It was reported that some devices report an OPTIMAL TRANSFER LENGTH of
      0xFFFF blocks. That looks bogus, especially for a device with a
      4096-byte physical block size.
      
      Ignore OPTIMAL TRANSFER LENGTH if it is not a multiple of the device's
      reported physical block size.
      
      To make the sanity checking conditionals more readable--and to
      facilitate printing warnings--relocate the checking to a helper
      function. No functional change aside from the printks.
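      
      A hedged sketch of the check such a helper performs (names are
      illustrative):
      
        static bool sd_opt_xfer_is_valid(unsigned int opt_xfer_bytes,
                                         unsigned int physical_block_size)
        {
                if (opt_xfer_bytes % physical_block_size) {
                        pr_warn("sd: ignoring optimal transfer size %u: not a multiple of the physical block size %u\n",
                                opt_xfer_bytes, physical_block_size);
                        return false;
                }
                return true;
        }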
      
      Cc: <stable@vger.kernel.org>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199759
      Reported-by: Christoph Anton Mitterer <calestyo@scientia.net>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      852a4ab2
    • scsi: aacraid: Fix performance issue on logical drives · e6e738e2
      Sagar Biradar authored
      commit 0015437c upstream.
      
      Fix a performance issue where the queue depth for SmartIOC logical volumes
      is set to 1, and allow the usual logical volume code to be executed.
      
      Fixes: a052865f (aacraid: Set correct Queue Depth for HBA1000 RAW disks)
      Cc: stable@vger.kernel.org
      Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com>
      Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6e738e2
    • scsi: virtio_scsi: don't send sc payload with tmfs · bd8a0e65
      Felipe Franciosi authored
      commit 3722e6a5 upstream.
      
      The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of
      device-readable records and a single device-writable response entry:
      
          struct virtio_scsi_ctrl_tmf
          {
              // Device-readable part
              le32 type;
              le32 subtype;
              u8 lun[8];
              le64 id;
              // Device-writable part
              u8 response;
          }
      
      The above should be organised as two descriptor entries (or potentially
      more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64
      id" or after "u8 response".
      
      The Linux driver doesn't respect that, with virtscsi_abort() and
      virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf().  It
      results in the original scsi command payload (or writable buffers) being
      added to the tmf.
      
      This fixes the problem by leaving cmd->sc zeroed out, which makes
      virtscsi_kick_cmd() add the tmf to the control vq without any payload.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bd8a0e65
    • s390/virtio: handle find on invalid queue gracefully · 1653307c
      Halil Pasic authored
      commit 3438b2c0 upstream.
      
      A queue with a capacity of zero is clearly not a valid virtio queue.
      Some emulators report zero queue size if queried with an invalid queue
      index. Instead of crashing in this case let us just return -ENOENT. To
      make that work properly, let us fix the notifier cleanup logic as well.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1653307c
    • s390/setup: fix early warning messages · b52bdf53
      Martin Schwidefsky authored
      commit 87276384 upstream.
      
      The setup_lowcore() function creates a new prefix page for the boot CPU.
      The PSW mask for the system_call, external interrupt, i/o interrupt and
      the program check handler have the DAT bit set in this new prefix page.
      
      At the time setup_lowcore() is called the system still runs without
      virtual address translation; the paging_init() function creates the
      kernel page table and loads CR13 with the kernel ASCE.
      
      Any code between setup_lowcore() and the end of paging_init() that has
      a BUG or WARN statement will create a program check that can not be
      handled correctly as there is no kernel page table yet.
      
      To allow early WARN statements, initially set up the lowcore with DAT off
      and set the DAT bit only after paging_init() has completed.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b52bdf53
    • clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability · e19ca3fe
      Samuel Holland authored
      commit c950ca8c upstream.
      
      The Allwinner A64 SoC is known[1] to have an unstable architectural
      timer, which manifests itself most obviously in the time jumping forward
      a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a
      timer frequency of 24 MHz, implying that the time went slightly backward
      (and this was interpreted by the kernel as it jumping forward and
      wrapping around past the epoch).
      
      Investigation revealed instability in the low bits of CNTVCT at the
      point a high bit rolls over. This leads to power-of-two cycle forward
      and backward jumps. (Testing shows that forward jumps are about twice as
      likely as backward jumps.) Since the counter value returns to normal
      after an indeterminate read, each "jump" really consists of both a
      forward and backward jump from the software perspective.
      
      Unless the kernel is trapping CNTVCT reads, a userspace program is able
      to read the register in a loop faster than it changes. A test program
      running on all 4 CPU cores that reported jumps larger than 100 ms was
      run for 13.6 hours and reported the following:
      
       Count | Event
      -------+---------------------------
        9940 | jumped backward      699ms
         268 | jumped backward     1398ms
           1 | jumped backward     2097ms
       16020 | jumped forward       175ms
        6443 | jumped forward       699ms
        2976 | jumped forward      1398ms
           9 | jumped forward    356516ms
           9 | jumped forward    357215ms
           4 | jumped forward    714430ms
           1 | jumped forward   3578440ms
      
      This works out to a jump larger than 100 ms about every 5.5 seconds on
      each CPU core.
      
      The largest jump (almost an hour!) was the following sequence of reads:
          0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000
      
      Note that the middle bits don't necessarily all read as all zeroes or
      all ones during the anomalous behavior; however the low 10 bits checked
      by the function in this patch have never been observed with any other
      value.
      
      Also note that smaller jumps are much more common, with backward jumps
      of 2048 (2^11) cycles observed over 400 times per second on each core.
      (Of course, this is partially explained by lower bits rolling over more
      frequently.) Any one of these could have caused the 95 year time skip.
      
      Similar anomalies were observed while reading CNTPCT (after patching the
      kernel to allow reads from userspace). However, the CNTPCT jumps are
      much less frequent, and only small jumps were observed. The same program
      as before (except now reading CNTPCT) observed after 72 hours:
      
       Count | Event
      -------+---------------------------
          17 | jumped backward      699ms
          52 | jumped forward       175ms
        2831 | jumped forward       699ms
           5 | jumped forward      1398ms
      
      Further investigation showed that the instability in CNTPCT/CNTVCT also
      affected the respective timer's TVAL register. The following values were
      observed immediately after writing CNVT_TVAL to 0x10000000:
      
       CNTVCT             | CNTV_TVAL  | CNTV_CVAL          | CNTV_TVAL Error
      --------------------+------------+--------------------+-----------------
       0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000
       0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000
       0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000
       0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000
      
      The pattern of errors in CNTV_TVAL seemed to depend on exactly which
      value was written to it. For example, after writing 0x10101010:
      
       CNTVCT             | CNTV_TVAL  | CNTV_CVAL          | CNTV_TVAL Error
      --------------------+------------+--------------------+-----------------
       0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000
       0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000
       0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000
       0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000
       0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000
       0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000
      
      I was also twice able to reproduce the issue covered by Allwinner's
      workaround[4], that writing to TVAL sometimes fails, and both CVAL and
      TVAL are left with entirely bogus values. One was the following values:
      
       CNTVCT             | CNTV_TVAL  | CNTV_CVAL
      --------------------+------------+--------------------------------------
       0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past)
       Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      
      ========================================================================
      
      Because the CPU can read the CNTPCT/CNTVCT registers faster than they
      change, performing two reads of the register and comparing the high bits
      (like other workarounds) is not a workable solution. And because the
      timer can jump both forward and backward, no pair of reads can
      distinguish a good value from a bad one. The only way to guarantee a
      good value from consecutive reads would be to read _three_ times, and
      take the middle value only if the three values are 1) each unique and
      2) increasing. This takes at minimum 3 counter cycles (125 ns), or more
      if an anomaly is detected.
      
      However, since there is a distinct pattern to the bad values, we can
      optimize the common case (1022/1024 of the time) to a single read by
      simply ignoring values that match the error pattern. This still takes no
      more than 3 cycles in the worst case, and requires much less code. As an
      additional safety check, we still limit the loop iteration to the number
      of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods.
      
      For the TVAL registers, the simple solution is to not use them. Instead,
      read or write the CVAL and calculate the TVAL value in software.
      
      Although the manufacturer is aware of at least part of the erratum[4],
      there is no official name for it. For now, use the kernel-internal name
      "UNKNOWN1".
      
      [1]: https://github.com/armbian/build/commit/a08cd6fe7ae9
      [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/
      [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26
      [4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/clocksource/arm_arch_timer.c#L272
      Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
      Tested-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Samuel Holland <samuel@sholland.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e19ca3fe
    • clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown · ef8062e2
      Stuart Menefy authored
      commit d2f276c8 upstream.
      
      When shutting down the timer, ensure that after we have stopped the
      timer any pending interrupts are cleared. This fixes a problem when
      suspending, as interrupts are disabled before the timer is stopped,
      so the timer interrupt may still be asserted, preventing the system
      entering a low power state when the wfi is executed.
      Signed-off-by: Stuart Menefy <stuart.menefy@mathembedded.com>
      Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
      Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: <stable@vger.kernel.org> # v4.3+
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef8062e2
    • clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR · c1f45c10
      Stuart Menefy authored
      commit a5719a40 upstream.
      
      When a timer tick occurs and the clock is in one-shot mode, the timer
      needs to be stopped to prevent it triggering subsequent interrupts.
      Currently this code is in exynos4_mct_tick_clear(), but as it is
      only needed when an ISR occurs move it into exynos4_mct_tick_isr(),
      leaving exynos4_mct_tick_clear() just doing what its name suggests it
      should.
      Signed-off-by: Stuart Menefy <stuart.menefy@mathembedded.com>
      Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
      Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: stable@vger.kernel.org # v4.3+
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1f45c10
    • regulator: s2mpa01: Fix step values for some LDOs · 06607b1b
      Stuart Menefy authored
      commit 28c4f730 upstream.
      
      The step values for some of the LDOs appear to be incorrect, resulting
      in incorrect voltages (or at least, ones which are different from the
      Samsung 3.4 vendor kernel).
      Signed-off-by: Stuart Menefy <stuart.menefy@mathembedded.com>
      Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      06607b1b
    • regulator: max77620: Initialize values for DT properties · c288e34d
      Mark Zhang authored
      commit 0ab66b3c upstream.
      
      If the regulator DT node doesn't exist, its of_parse_cb callback
      function isn't called. Then all values for DT properties are
      filled with zero. This leads to wrong register updates for the
      FPS and POK settings.
      Signed-off-by: Jinyoung Park <jinyoungp@nvidia.com>
      Signed-off-by: Mark Zhang <markz@nvidia.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c288e34d
    • regulator: s2mps11: Fix steps for buck7, buck8 and LDO35 · 462aee48
      Krzysztof Kozlowski authored
      commit 56b5d4ea upstream.
      
      LDO35 uses 25 mV step, not 50 mV.  Bucks 7 and 8 use 12.5 mV step
      instead of 6.25 mV.  Wrong step caused over-voltage (LDO35) or
      under-voltage (buck7 and 8) if regulators were used (e.g. on Exynos5420
      Arndale Octa board).
      
      Cc: <stable@vger.kernel.org>
      Fixes: cb74685e ("regulator: s2mps11: Add samsung s2mps11 regulator driver")
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      462aee48
    • spi: pxa2xx: Setup maximum supported DMA transfer length · 15ead7e2
      Andy Shevchenko authored
      commit ef070b4e upstream.
      
      When the commit b6ced294
      
         ("spi: pxa2xx: Switch to SPI core DMA mapping functionality")
      
      switched to the SPI core provided DMA helpers, it missed setting up the
      maximum supported DMA transfer length for the controller, and thus users
      mistakenly try to send more data than supported, hitting the following
      warning:
      
        ili9341 spi-PRP0001:01: DMA disabled for transfer length 153600 greater than 65536
      
      Setup maximum supported DMA transfer length in order to make users know
      the limit.
      
      Fixes: b6ced294 ("spi: pxa2xx: Switch to SPI core DMA mapping functionality")
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      15ead7e2
    • spi: ti-qspi: Fix mmap read when more than one CS in use · e51c5ec9
      Vignesh R authored
      commit 673c865e upstream.
      
      Commit 4dea6c9b ("spi: spi-ti-qspi: add mmap mode read support") has
      has got order of parameter wrong when calling regmap_update_bits() to
      select CS for mmap access. Mask and value arguments are interchanged.
      Code will work on a system with single slave, but fails when more than
      one CS is in use. Fix this by correcting the order of parameters when
      calling regmap_update_bits().
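      
      For reference, regmap_update_bits() takes the mask before the value, so
      the broken and fixed calls differ only in argument order (generic names
      used here):
      
        /* broken: mask and value swapped, only the CS0 case happened to work */
        regmap_update_bits(map, reg, val, mask);
      
        /* fixed */
        regmap_update_bits(map, reg, mask, val);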
      
      Fixes: 4dea6c9b ("spi: spi-ti-qspi: add mmap mode read support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Vignesh R <vigneshr@ti.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e51c5ec9
    • netfilter: ipt_CLUSTERIP: fix warning unused variable cn · 0d98ecb1
      Anders Roxell authored
      commit 206b8cc5 upstream.
      
      When CONFIG_PROC_FS isn't set the variable cn isn't used.
      
      net/ipv4/netfilter/ipt_CLUSTERIP.c: In function ‘clusterip_net_exit’:
      net/ipv4/netfilter/ipt_CLUSTERIP.c:849:24: warning: unused variable ‘cn’ [-Wunused-variable]
        struct clusterip_net *cn = clusterip_pernet(net);
                              ^~
      
      Rework so the variable 'cn' is declared inside "#ifdef CONFIG_PROC_FS".
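      
      A hedged sketch of that rework (illustrative; the proc cleanup is the
      only user of 'cn'):
      
        static void __net_exit clusterip_net_exit(struct net *net)
        {
        #ifdef CONFIG_PROC_FS
                struct clusterip_net *cn = clusterip_pernet(net);
      
                proc_remove(cn->procdir);
                cn->procdir = NULL;
        #endif
        }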
      
      Fixes: b12f7bad ("netfilter: ipt_CLUSTERIP: remove wrong WARN_ON_ONCE in netns exit routine")
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d98ecb1
    • mmc: fix a bug when max_discard is 0 · 6bd9959a
      Jiong Wu authored
      commit d4721339 upstream.
      
      The original purpose of the code I fix is to replace max_discard with
      max_trim if max_trim is less than max_discard. When max_discard is 0
      we should replace max_discard with max_trim as well, because
      max_discard equals 0 happens only when the max_do_calc_max_discard
      process is overflowed, so if mmc_can_trim(card) is true, max_discard
      should be replaced by an available max_trim.
      However, in the original code, there are two lines of code that interfere
      with the right process.
      1) if (max_discard && mmc_can_trim(card))
      When max_discard is 0, this skips the check of whether max_discard
      needs to be replaced with max_trim.
      2) if (max_trim < max_discard)
      The condition is false when max_discard is 0, so it also skips the
      replacement of max_discard with max_trim; in fact, we should replace the
      0-valued max_discard with max_trim.
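      
      A hedged sketch of the corrected logic (close to, but not necessarily
      identical with, the actual hunk):
      
        max_discard = mmc_do_calc_max_discard(card, MMC_ERASE_ARG);
        if (mmc_can_trim(card)) {
                max_trim = mmc_do_calc_max_discard(card, MMC_TRIM_ARG);
                /* also take the trim value when max_discard computed to 0 */
                if (max_trim < max_discard || max_discard == 0)
                        max_discard = max_trim;
        }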
      Signed-off-by: Jiong Wu <Lohengrin1024@gmail.com>
      Fixes: b305882f (mmc: core: optimize mmc_calc_max_discard)
      Cc: stable@vger.kernel.org # v4.17+
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6bd9959a
    • mmc: sdhci-esdhc-imx: fix HS400 timing issue · 2946910e
      BOUGH CHEN authored
      commit de0a0dec upstream.
      
      Now tuning reset will be done when the timing is MMC_TIMING_LEGACY/
      MMC_TIMING_MMC_HS/MMC_TIMING_SD_HS. But for timing MMC_TIMING_MMC_HS,
      we can not do tuning reset, otherwise HS400 timing is not right.
      
      Here is the process of initializing HS400: first finish tuning in HS200
      mode, then switch to HS mode and 8 bit DDR mode, and finally switch to
      HS400 mode. If we do a tuning reset in HS mode, HS400 mode will lose
      the tuning setting, which will cause CRC errors.
      Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
      Cc: stable@vger.kernel.org # v4.12+
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Fixes: d9370424 ("mmc: sdhci-esdhc-imx: reset tuning circuit when power on mmc card")
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2946910e
    • ACPI / device_sysfs: Avoid OF modalias creation for removed device · c19b9673
      Andy Shevchenko authored
      commit f16eb8a4 upstream.
      
      If an SSDT overlay is loaded via ConfigFS and then unloaded, the device
      we would like to have an OF modalias for is already gone. Thus, acpi_get_name()
      returns no allocated buffer for such a case and the kernel crashes afterwards:
      
       ACPI: Host-directed Dynamic ACPI Table Unload
       ads7950 spi-PRP0001:00: Dropping the link to regulator.0
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
       #PF error: [normal kernel read fault]
       PGD 80000000070d6067 P4D 80000000070d6067 PUD 70d0067 PMD 0
       Oops: 0000 [#1] SMP PTI
       CPU: 0 PID: 40 Comm: kworker/u4:2 Not tainted 5.0.0+ #96
       Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
       Workqueue: kacpi_hotplug acpi_device_del_work_fn
       RIP: 0010:create_of_modalias.isra.1+0x4c/0x150
       Code: 00 00 48 89 44 24 18 31 c0 48 8d 54 24 08 48 c7 44 24 10 00 00 00 00 48 c7 44 24 08 ff ff ff ff e8 7a b0 03 00 48 8b 4c 24 10 <0f> b6 01 84 c0 74 27 48 c7 c7 00 09 f4 a5 0f b6 f0 8d 50 20 f6 04
       RSP: 0000:ffffa51040297c10 EFLAGS: 00010246
       RAX: 0000000000001001 RBX: 0000000000000785 RCX: 0000000000000000
       RDX: 0000000000001001 RSI: 0000000000000286 RDI: ffffa2163dc042e0
       RBP: ffffa216062b1196 R08: 0000000000001001 R09: ffffa21639873000
       R10: ffffffffa606761d R11: 0000000000000001 R12: ffffa21639873218
       R13: ffffa2163deb5060 R14: ffffa216063d1010 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffffa2163e000000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000000 CR3: 0000000007114000 CR4: 00000000001006f0
       Call Trace:
        __acpi_device_uevent_modalias+0xb0/0x100
        spi_uevent+0xd/0x40
      
       ...
      
      In order to fix the above, let create_of_modalias() check the status
      returned by acpi_get_name() and bail out in case of failure.
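      
      A hedged sketch of that check (the error value is illustrative):
      
        acpi_status status;
        struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL };
      
        status = acpi_get_name(acpi_dev->handle, ACPI_SINGLE_NAME, &buf);
        if (ACPI_FAILURE(status))
                return -ENODEV;   /* the device and its name are already gone */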
      
      Fixes: 8765c5ba ("ACPI / scan: Rework modalias creation when "compatible" is present")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201381
      Reported-by: Ferry Toth <fntoth@gmail.com>
      Tested-by: Ferry Toth <fntoth@gmail.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c19b9673
    • xen: fix dom0 boot on huge systems · 468ff43f
      Juergen Gross authored
      commit 01bd2ac2 upstream.
      
      Commit f7c90c2a ("x86/xen: don't write ptes directly in 32-bit
      PV guests") introduced a regression for booting dom0 on huge systems
      with lots of RAM (in the TB range).
      
      Reason is that on those hosts the p2m list needs to be moved early in
      the boot process and this requires temporary page tables to be created.
      Said commit modified xen_set_pte_init() to use a hypercall for writing
      a PTE, but this requires the page table being in the direct mapped
      area, which is not the case for the temporary page tables used in
      xen_relocate_p2m().
      
      As the page tables are completely written before being linked into the
      actual address space, a plain write to memory can be used in
      xen_relocate_p2m() instead of set_pte().
      
      Fixes: f7c90c2a ("x86/xen: don't write ptes directly in 32-bit PV guests")
      Cc: stable@vger.kernel.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      468ff43f
    • tracing/perf: Use strndup_user() instead of buggy open-coded version · 24d50976
      Jann Horn authored
      commit 83540fbc upstream.
      
      The first version of this method was missing the check for
      `ret == PATH_MAX`; then such a check was added, but it didn't call kfree()
      on error, so there was still a small memory leak in the error case.
      Fix it by using strndup_user() instead of open-coding it.
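      
      The replacement pattern is simply (hedged sketch, argument names
      illustrative):
      
        path = strndup_user(user_path, PATH_MAX);
        if (IS_ERR(path))
                return PTR_ERR(path);   /* -EINVAL, -EFAULT or -ENOMEM; nothing to kfree() */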
      
      Link: http://lkml.kernel.org/r/20190220165443.152385-1-jannh@google.com
      
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: 0eadcc7a ("perf/core: Fix perf_uprobe_init()")
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      24d50976
    • tracing: Do not free iter->trace in fail path of tracing_open_pipe() · f27077e5
      zhangyi (F) authored
      commit e7f0c424 upstream.
      
      Commit d716ff71 ("tracing: Remove taking of trace_types_lock in
      pipe files") use the current tracer instead of the copy in
      tracing_open_pipe(), but it forget to remove the freeing sentence in
      the error path.
      
      There's an error path that can call kfree(iter->trace) after the iter->trace
      was assigned to tr->current_trace, which would be bad to free.
      
      Link: http://lkml.kernel.org/r/1550060946-45984-1-git-send-email-yi.zhang@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: d716ff71 ("tracing: Remove taking of trace_types_lock in pipe files")
      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f27077e5
    • tracing: Use strncpy instead of memcpy for string keys in hist triggers · ebca08d7
      Tom Zanussi authored
      commit 9f0bbf31 upstream.
      
      Because there may be random garbage beyond a string's null terminator,
      it's not correct to copy the complete character array for use as a
      hist trigger key.  This results in multiple histogram entries for the
      'same' string key.
      
      So, in the case of a string key, use strncpy instead of memcpy to
      avoid copying in the extra bytes.
      
      Before, using the gdbus entries in the following hist trigger as an
      example:
      
        # echo 'hist:key=comm' > /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
        # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist
      
        ...
      
        { comm: ImgDecoder #4                      } hitcount:        203
        { comm: gmain                              } hitcount:        213
        { comm: gmain                              } hitcount:        216
        { comm: StreamTrans #73                    } hitcount:        221
        { comm: mozStorage #3                      } hitcount:        230
        { comm: gdbus                              } hitcount:        233
        { comm: StyleThread#5                      } hitcount:        253
        { comm: gdbus                              } hitcount:        256
        { comm: gdbus                              } hitcount:        260
        { comm: StyleThread#4                      } hitcount:        271
      
        ...
      
        # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l
        51
      
      After:
      
        # cat /sys/kernel/debug/tracing/events/sched/sched_waking/hist | egrep gdbus | wc -l
        1
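      
      A hedged sketch of the change in the key-update path (field names follow
      the hist trigger code, but treat this as illustrative):
      
        if (key_field->flags & HIST_FIELD_FL_STRING)
                /* stop at the NUL; bytes past it are uninitialized garbage */
                strncpy(compound_key + key_field->offset, (char *)key, key_field->size);
        else
                memcpy(compound_key + key_field->offset, key, key_field->size);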
      
      Link: http://lkml.kernel.org/r/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com
      
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: 79e577cb ("tracing: Support string type key properly")
      Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ebca08d7
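      To illustrate the hist-trigger fix above, a small stand-alone C
      demonstration (not the tracing code itself): with memcpy, stale bytes
      beyond the terminator make two copies of the same string compare as
      different fixed-size keys, while strncpy zero-pads the tail so they
      compare equal.
      
        #include <stdio.h>
        #include <string.h>
        
        #define KEY_SIZE 16
        
        int main(void)
        {
                char buf[KEY_SIZE], key_a[KEY_SIZE], key_b[KEY_SIZE];
        
                /* Simulate a buffer that previously held longer names. */
                memset(buf, 'X', sizeof(buf));
                strcpy(buf, "gdbus");
                memcpy(key_a, buf, KEY_SIZE);    /* copies garbage past the NUL */
        
                memset(buf, 'Y', sizeof(buf));
                strcpy(buf, "gdbus");
                memcpy(key_b, buf, KEY_SIZE);    /* different garbage this time */
        
                printf("memcpy keys equal:  %d\n", !memcmp(key_a, key_b, KEY_SIZE));
        
                strncpy(key_a, buf, KEY_SIZE);   /* pads the tail with NULs */
                memset(buf, 'Z', sizeof(buf));
                strcpy(buf, "gdbus");
                strncpy(key_b, buf, KEY_SIZE);
        
                printf("strncpy keys equal: %d\n", !memcmp(key_a, key_b, KEY_SIZE));
                return 0;
        }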
    • Pavel Shilovsky's avatar
      CIFS: Fix read after write for files with read caching · 43eaa6cc
      Pavel Shilovsky authored
      commit 6dfbd846 upstream.
      
      When we have a READ lease for a file and have just issued a write
      operation to the server we need to purge the cache and set oplock/lease
      level to NONE to avoid reading stale data. Currently we do that
      only if the write operation succeeded, thus not covering cases when
      a request was sent to the server but a negative error code was
      returned later for some other reason (e.g. -EIOCBQUEUED or -EINTR).
      Fix this by turning off caching regardless of the error code being
      returned.
      
      This patch fixes generic tests 075 and 112 from the xfs-tests.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      43eaa6cc
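      A sketch of the ordering change described above, using hypothetical
      names rather than the actual CIFS code: the locally cached data is
      invalidated before the return code is examined, because the write may
      have reached the server even when an error is reported.
      
        #include <stdio.h>
        
        struct file_ctx { int cache_valid; int oplock; };   /* stand-ins */
        
        static long send_write_to_server(struct file_ctx *f, const void *b, long n)
        {
                (void)f; (void)b; (void)n;
                return -4;               /* pretend the wait was interrupted */
        }
        
        static long cached_file_write(struct file_ctx *f, const void *buf, long len)
        {
                long rc = send_write_to_server(f, buf, len);
        
                /* The request may already have changed the file on the server
                 * even when rc is negative, so purge before looking at rc. */
                f->cache_valid = 0;
                f->oplock = 0;           /* oplock/lease level NONE */
        
                return rc;               /* old code skipped the purge on error */
        }
        
        int main(void)
        {
                struct file_ctx f = { 1, 2 };
                printf("rc=%ld cache_valid=%d\n",
                       cached_file_write(&f, "x", 1), f.cache_valid);
                return 0;
        }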
    • Pavel Shilovsky's avatar
      CIFS: Do not skip SMB2 message IDs on send failures · dc8e8ad9
      Pavel Shilovsky authored
      commit c781af7e upstream.
      
      When we hit failures during constructing MIDs or sending PDUs
      through the network, we end up not using message IDs assigned
      to the packet. The next SMB packet will skip those message IDs
      and continue with the next one. This behavior may lead to a server
      not granting us credits until we use the skipped IDs. Fix this by
      reverting the current ID to the original value if any errors occur
      before we push the packet through the network stack.
      
      This patch fixes the generic/310 test from the xfs-tests.
      
      Cc: <stable@vger.kernel.org> # 4.19.x
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc8e8ad9
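      A minimal sketch of the rollback described above (hypothetical names,
      and assuming the caller holds whatever lock protects the ID counter):
      the message ID is reserved optimistically and handed back if the
      request never reaches the wire, so no IDs are skipped.
      
        #include <stdio.h>
        
        struct conn { unsigned long long next_mid; };
        
        static int build_and_send(struct conn *c, unsigned long long mid)
        {
                (void)c; (void)mid;
                return -1;               /* pretend constructing the PDU failed */
        }
        
        static int send_request(struct conn *c)
        {
                unsigned long long mid = c->next_mid++;   /* reserve an ID */
                int rc = build_and_send(c, mid);
        
                if (rc < 0)
                        c->next_mid = mid;   /* nothing hit the wire: reuse it */
                return rc;
        }
        
        int main(void)
        {
                struct conn c = { 42 };
                send_request(&c);
                printf("next_mid=%llu\n", c.next_mid);    /* still 42, no gap */
                return 0;
        }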
    • Pavel Shilovsky's avatar
      CIFS: Do not reset lease state to NONE on lease break · 3ed9f22e
      Pavel Shilovsky authored
      commit 7b9b9edb upstream.
      
      Currently on lease break the client sets a caching level twice:
      when oplock is detected and when oplock is processed. While the
      1st attempt sets the level to the value provided by the server,
      the 2nd one resets the level to None unconditionally.
      This happens because the oplock/lease processing code was changed
      to avoid races between page cache flushes and oplock breaks.
      The commit c11f1df5 ("cifs: Wait for writebacks to complete
      before attempting write.") fixed the races for oplocks but didn't
      apply the same changes for leases, resulting in the server-granted
      value being overwritten with None. Fix this by properly processing
      lease breaks.
      Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ed9f22e
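      The behavioural difference described above, reduced to a hypothetical
      sketch (not the CIFS lease code): the second processing step should
      apply the state the server granted in the break rather than resetting
      it to none.
      
        #include <stdio.h>
        
        enum lease { LEASE_NONE, LEASE_READ, LEASE_READ_HANDLE };
        
        struct inode_state { enum lease lease; };            /* stand-in */
        
        static void process_lease_break(struct inode_state *st, enum lease granted)
        {
                /* old behaviour: st->lease = LEASE_NONE;  ('granted' ignored) */
                st->lease = granted;     /* keep what the server allowed */
        }
        
        int main(void)
        {
                struct inode_state st = { LEASE_READ_HANDLE };
                process_lease_break(&st, LEASE_READ);  /* server downgrades to READ */
                printf("lease after break: %d\n", st.lease);
                return 0;
        }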
    • Ard Biesheuvel's avatar
      crypto: arm64/aes-ccm - fix bugs in non-NEON fallback routine · 41e2d1c4
      Ard Biesheuvel authored
      commit 969e2f59 upstream.
      
      Commit 5092fcf3 ("crypto: arm64/aes-ce-ccm: add non-SIMD generic
      fallback") introduced C fallback code to replace the NEON routines
      when invoked from a context where the NEON is not available (i.e.,
      from the context of a softirq taken while the NEON is already being
      used in kernel process context).
      
      Fix two logical flaws in the MAC calculation of the associated data.
      Reported-by: Eric Biggers <ebiggers@kernel.org>
      Fixes: 5092fcf3 ("crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      41e2d1c4
    • Ard Biesheuvel's avatar
      crypto: arm64/aes-ccm - fix logical bug in AAD MAC handling · d5a5bded
      Ard Biesheuvel authored
      commit eaf46edf upstream.
      
      The NEON MAC calculation routine fails to correctly handle the case
      where there is already some data in the buffer and the input fills it up
      exactly. In this case, we enter the loop at the end with w8 == 0,
      while a negative value is assumed, and so the loop carries on until
      the increment of the 32-bit counter wraps around, which is quite
      obviously wrong.
      
      So omit the loop altogether in this case, and exit right away.
      Reported-by: Eric Biggers <ebiggers@kernel.org>
      Fixes: a3fd8210 ("arm64/crypto: AES in CCM mode using ARMv8 Crypto ...")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5a5bded
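      A C rendering of the control flow described above (an analog of the
      assembly, not the real routine): the tail loop assumes a strictly
      negative fill count on entry, so the exactly-full case has to bail out
      before the loop instead of entering it.
      
        #include <stdint.h>
        #include <stdio.h>
        
        #define BLOCK 16
        
        /* 'short_by' is a negative count of bytes still needed to fill the
         * MAC buffer, incremented towards zero as bytes are copied in. */
        static void absorb_tail(uint8_t *buf, const uint8_t *in, int32_t short_by)
        {
                if (short_by == 0)
                        return;          /* buffer exactly full: nothing to do */
        
                do {
                        buf[BLOCK + short_by] = *in++;
                } while (++short_by != 0);
                /* Without the early return, entering with short_by == 0 would
                 * write past the buffer and spin until the counter wrapped. */
        }
        
        int main(void)
        {
                uint8_t buf[BLOCK] = { 0 };
                const uint8_t in[] = { 1, 2, 3 };
        
                absorb_tail(buf, in, -3);    /* copy the last three bytes */
                absorb_tail(buf, in, 0);     /* exactly-full case: a no-op */
                printf("%u %u %u\n", buf[13], buf[14], buf[15]);
                return 0;
        }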
    • Eric Biggers's avatar
      crypto: x86/morus - fix handling chunked inputs and MAY_SLEEP · 66700c89
      Eric Biggers authored
      commit 2060e284 upstream.
      
      The x86 MORUS implementations all fail the improved AEAD tests because
      they produce the wrong result with some data layouts.  The issue is that
      they assume that if the skcipher_walk API gives 'nbytes' not aligned to
      the walksize (a.k.a. walk.stride), then it is the end of the data.  In
      fact, this can happen before the end.
      
      Also, when the CRYPTO_TFM_REQ_MAY_SLEEP flag is given, they can
      incorrectly sleep in the skcipher_walk_*() functions while preemption
      has been disabled by kernel_fpu_begin().
      
      Fix these bugs.
      
      Fixes: 56e8e57f ("crypto: morus - Add common SIMD glue code for MORUS")
      Cc: <stable@vger.kernel.org> # v4.18+
      Cc: Ondrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66700c89
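      A user-space analog of the first problem above (hypothetical walker,
      not the skcipher_walk API): a span handed out mid-stream may not be a
      multiple of the stride, so only the stride-aligned part can be handled
      as full blocks and the remainder must be carried forward instead of
      being finalized as the tail. The MAY_SLEEP issue is separate and not
      shown here.
      
        #include <stdio.h>
        #include <stddef.h>
        
        #define STRIDE 32
        
        /* Each call returns the size of the next span the walker exposes;
         * 0 means the data is exhausted. Spans need not be multiples of
         * STRIDE even in the middle of the stream. */
        static size_t next_span(void)
        {
                static size_t spans[] = { 48, 48, 0 };   /* 96 bytes total */
                static int i;
                return spans[i++];
        }
        
        int main(void)
        {
                size_t carry = 0, full_blocks = 0, n;
        
                while ((n = next_span()) != 0) {
                        size_t avail = carry + n;
        
                        full_blocks += avail / STRIDE;   /* whole blocks only */
                        carry = avail % STRIDE;          /* keep the rest */
                        /* buggy assumption: a non-multiple span meant end of
                         * data, so 'carry' was padded and finalized here */
                }
                printf("blocks=%zu tail=%zu\n", full_blocks, carry);  /* 3, 0 */
                return 0;
        }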
    • Eric Biggers's avatar
      crypto: x86/aesni-gcm - fix crash on empty plaintext · 8a9fcf4a
      Eric Biggers authored
      commit 3af34963 upstream.
      
      gcmaes_crypt_by_sg() dereferences the NULL pointer returned by
      scatterwalk_ffwd() when encrypting an empty plaintext and the source
      scatterlist ends immediately after the associated data.
      
      Fix it by only fast-forwarding to the src/dst data scatterlists if the
      data length is nonzero.
      
      This bug is reproduced by the "rfc4543(gcm(aes))" test vectors when run
      with the new AEAD test manager.
      
      Fixes: e8455207 ("crypto: aesni - Update aesni-intel_glue to use scatter/gather")
      Cc: <stable@vger.kernel.org> # v4.17+
      Cc: Dave Watson <davejwatson@fb.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a9fcf4a
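      A minimal stand-alone sketch of the guard described above (hypothetical
      names, not the aesni glue code): the fast-forward to the data portion
      is only attempted when there is data to process, because fast-forwarding
      exactly to the end of the described buffer yields no valid segment.
      
        #include <stdio.h>
        #include <stddef.h>
        
        struct segment { const char *p; size_t len; };
        
        /* Returns the segment containing 'offset', or NULL when 'offset'
         * lands exactly at the end of the described buffer. */
        static const struct segment *ffwd(const struct segment *segs, size_t n,
                                          size_t offset)
        {
                for (size_t i = 0; i < n; i++) {
                        if (offset < segs[i].len)
                                return &segs[i];
                        offset -= segs[i].len;
                }
                return NULL;
        }
        
        int main(void)
        {
                struct segment aad_only[] = { { "assoc-data", 10 } };
                size_t assoclen = 10, datalen = 0;
                const struct segment *data = NULL;
        
                if (datalen)             /* the fix: skip the ffwd when empty */
                        data = ffwd(aad_only, 1, assoclen);
        
                printf("data segment: %s\n", data ? "found" : "skipped (empty)");
                return 0;
        }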