1. 01 Oct, 2020 40 commits
    • bdev: Reduce time holding bd_mutex in sync in blkdev_close() · b6256c29
      Douglas Anderson authored
      [ Upstream commit b849dd84 ]
      
      While trying to "dd" to the block device for a USB stick, I
      encountered a hung task warning (blocked for > 120 seconds).  I
      managed to come up with an easy way to reproduce this on my system
      (where /dev/sdb is the block device for my USB stick) with:
      
        while true; do dd if=/dev/zero of=/dev/sdb bs=4M; done
      
      With my reproduction here are the relevant bits from the hung task
      detector:
      
       INFO: task udevd:294 blocked for more than 122 seconds.
       ...
       udevd           D    0   294      1 0x00400008
       Call trace:
        ...
        mutex_lock_nested+0x40/0x50
        __blkdev_get+0x7c/0x3d4
        blkdev_get+0x118/0x138
        blkdev_open+0x94/0xa8
        do_dentry_open+0x268/0x3a0
        vfs_open+0x34/0x40
        path_openat+0x39c/0xdf4
        do_filp_open+0x90/0x10c
        do_sys_open+0x150/0x3c8
        ...
      
       ...
       Showing all locks held in the system:
       ...
       1 lock held by dd/2798:
        #0: ffffff814ac1a3b8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x50/0x204
       ...
       dd              D    0  2798   2764 0x00400208
       Call trace:
        ...
        schedule+0x8c/0xbc
        io_schedule+0x1c/0x40
        wait_on_page_bit_common+0x238/0x338
        __lock_page+0x5c/0x68
        write_cache_pages+0x194/0x500
        generic_writepages+0x64/0xa4
        blkdev_writepages+0x24/0x30
        do_writepages+0x48/0xa8
        __filemap_fdatawrite_range+0xac/0xd8
        filemap_write_and_wait+0x30/0x84
        __blkdev_put+0x88/0x204
        blkdev_put+0xc4/0xe4
        blkdev_close+0x28/0x38
        __fput+0xe0/0x238
        ____fput+0x1c/0x28
        task_work_run+0xb0/0xe4
        do_notify_resume+0xfc0/0x14bc
        work_pending+0x8/0x14
      
      The problem appears related to the fact that my USB disk is terribly
      slow and that I have a lot of RAM in my system to cache things.
      Specifically my writes seem to be happening at ~15 MB/s and I've got
      ~4 GB of RAM in my system that can be used for buffering.  To write 4
      GB of buffer to disk thus takes ~4000 MB / ~15 MB/s = ~267 seconds.
      
      The 267 second number is a problem because in __blkdev_put() we call
      sync_blockdev() while holding the bd_mutex.  Any other callers who
      want the bd_mutex will be blocked for the whole time.
      
      The problem is made worse because I believe blkdev_put() specifically
      tells other tasks (namely udev) to go try to access the device at right
      around the same time we're going to hold the mutex for a long time.
      
      Putting some traces around this (after disabling the hung task detector),
      I could confirm:
       dd:    437.608600: __blkdev_put() right before sync_blockdev() for sdb
       udevd: 437.623901: blkdev_open() right before blkdev_get() for sdb
       dd:    661.468451: __blkdev_put() right after sync_blockdev() for sdb
       udevd: 663.820426: blkdev_open() right after blkdev_get() for sdb
      
      A simple fix for this is to realize that sync_blockdev() works fine if
      you're not holding the mutex.  Also, it's not the end of the world if
      you sync a little early (though it can have performance impacts).
      Thus we can make a guess that we're going to need to do the sync and
      then do it without holding the mutex.  We still do one last sync with
      the mutex but it should be much, much faster.
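The ordering described above can be sketched in user space. This is only an illustration under stated assumptions, not the kernel patch: `fake_bdev`, `fake_sync()` and `fake_bdev_put()` are hypothetical stand-ins for the block device, sync_blockdev() and __blkdev_put(), and a pthread mutex plays the role of bd_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device with dirty cached pages. */
struct fake_bdev {
    pthread_mutex_t mutex;  /* plays the role of bd_mutex */
    int openers;            /* number of open handles */
    long dirty_pages;       /* pages still waiting to be written */
    long flushed_locked;    /* pages written out while the mutex was held */
};

/* Stand-in for sync_blockdev(): writes out everything that is dirty. */
static long fake_sync(struct fake_bdev *bdev)
{
    long written = bdev->dirty_pages;

    bdev->dirty_pages = 0;
    return written;
}

/* Stand-in for the close path. */
static void fake_bdev_put(struct fake_bdev *bdev)
{
    assert(bdev != NULL && bdev->openers > 0);

    /*
     * Guess that we are the last opener and do the slow flush up front,
     * without the mutex. The unlocked read of openers is only a hint; a
     * wrong guess costs one unnecessary sync.
     */
    if (bdev->openers == 1)
        fake_sync(bdev);

    pthread_mutex_lock(&bdev->mutex);
    if (--bdev->openers == 0)
        bdev->flushed_locked += fake_sync(bdev);  /* little or nothing left */
    pthread_mutex_unlock(&bdev->mutex);
}
```

In this sketch the second sync still runs under the mutex, so correctness does not depend on the guess; only the amount of work done while other openers are blocked changes.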
      
      With this, my hung task warnings for my test case are gone.
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Guenter Roeck <groeck@chromium.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • KVM: Remove CREATE_IRQCHIP/SET_PIT2 race · 2c035666
      Steve Rutherford authored
      [ Upstream commit 7289fdb5 ]
      
      Fixes a NULL pointer dereference, caused by the PIT firing an interrupt
      before the interrupt table has been initialized.
      
      SET_PIT2 can race with the creation of the IRQchip. In particular,
      if SET_PIT2 is called with a low PIT timer period (after the creation of
      the IOAPIC, but before the instantiation of the irq routes), the PIT can
      fire an interrupt at an uninitialized table.
      Signed-off-by: Steve Rutherford <srutherford@google.com>
      Signed-off-by: Jon Cargille <jcargill@google.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Message-Id: <20200416191152.259434-1-jcargill@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • serial: uartps: Wait for tx_empty in console setup · 99e4fecd
      Raviteja Narayanam authored
      [ Upstream commit 42e11948 ]
      
      On some platforms, the log is corrupted while the console is being
      registered. When set_termios is called, there are still some bytes
      in the FIFO waiting to be transmitted.
      
      So, wait for tx_empty inside cdns_uart_console_setup before calling
      set_termios.
      Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com>
      Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
      Link: https://lore.kernel.org/r/1586413563-29125-2-git-send-email-raviteja.narayanam@xilinx.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: qedi: Fix termination timeouts in session logout · b860a828
      Nilesh Javali authored
      [ Upstream commit b9b97e69 ]
      
      The destroy connection ramrod timed out during session logout.  Fix the
      wait delay for graceful vs abortive termination as per the FW requirements.
      
      Link: https://lore.kernel.org/r/20200408064332.19377-7-mrangankar@marvell.com
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area · 6bee7991
      Jaewon Kim authored
      [ Upstream commit 09ef5283 ]
      
      When passing their requirements to vm_unmapped_area,
      arch_get_unmapped_area and arch_get_unmapped_area_topdown did not set
      align_offset.  Internally, in both unmapped_area and
      unmapped_area_topdown, info->align_offset is meaningless when
      info->align_mask is 0.
      
      But commit df529cab ("mm: mmap: add trace point of
      vm_unmapped_area") always prints info->align_offset even though it is
      uninitialized.
      
      Fix this uninitialized value issue by setting it to 0 explicitly.
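A small user-space sketch of the idea follows. The struct and function names are hypothetical simplifications of the request passed to vm_unmapped_area; the point is only that a caller filling a stack-allocated struct field by field must also set the field that a tracepoint will later print.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified version of the request structure. */
struct area_request {
    unsigned long length;
    unsigned long low_limit;
    unsigned long high_limit;
    unsigned long align_mask;
    unsigned long align_offset;
};

/*
 * Fill the request field by field, the way a caller with a stack-allocated
 * struct would. Any field that is skipped keeps whatever the stack held.
 */
static void fill_request(struct area_request *req, unsigned long len,
                         unsigned long lo, unsigned long hi)
{
    assert(req != NULL);

    req->length = len;
    req->low_limit = lo;
    req->high_limit = hi;
    req->align_mask = 0;
    /* Unused while align_mask is 0, but now well defined when traced. */
    req->align_offset = 0;
}
```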
      
      Before:
        vm_unmapped_area: addr=0x755b155000 err=0 total_vm=0x15aaf0 flags=0x1 len=0x109000 lo=0x8000 hi=0x75eed48000 mask=0x0 ofs=0x4022
      
      After:
        vm_unmapped_area: addr=0x74a4ca1000 err=0 total_vm=0x168ab1 flags=0x1 len=0x9000 lo=0x8000 hi=0x753d94b000 mask=0x0 ofs=0x0
      Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/20200409094035.19457-1-jaewon31.kim@samsung.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvmet-rdma: fix double free of rdma queue · 5fd750e8
      Israel Rukshin authored
      [ Upstream commit 21f90243 ]
      
      If rdma accept fails in nvmet_rdma_queue_connect(), release work is
      scheduled. Later on, because the cm_id was not destroyed, a new RDMA CM
      event may arrive and call nvmet_rdma_queue_connect_fail(), which
      schedules another release work. As a result, nvmet_rdma_free_queue is
      called twice. To fix this, implicitly destroy the cm_id by returning a
      non-zero ret code, which guarantees that new rdma_cm events will not
      arrive afterwards. Also add a qp pointer to the nvmet_rdma_queue
      structure, so it can be used when the cm_id pointer is NULL or was
      destroyed.
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Suggested-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mm/vmscan.c: fix data races using kswapd_classzone_idx · b73c7440
      Qian Cai authored
      [ Upstream commit 5644e1fb ]
      
      pgdat->kswapd_classzone_idx could be accessed concurrently in
      wakeup_kswapd().  Plain writes and reads without any lock protection
      result in data races.  Fix them by adding a pair of READ|WRITE_ONCE() as
      well as saving a branch (compilers might well optimize the original code
      in an unintentional way anyway).  While at it, also take care of
      pgdat->kswapd_order and non-kswapd threads in allow_direct_reclaim().  The
      data races were reported by KCSAN:
      
       BUG: KCSAN: data-race in wakeup_kswapd / wakeup_kswapd
      
       write to 0xffff9f427ffff2dc of 4 bytes by task 7454 on cpu 13:
        wakeup_kswapd+0xf1/0x400
        wakeup_kswapd at mm/vmscan.c:3967
        wake_all_kswapds+0x59/0xc0
        wake_all_kswapds at mm/page_alloc.c:4241
        __alloc_pages_slowpath+0xdcc/0x1290
        __alloc_pages_slowpath at mm/page_alloc.c:4512
        __alloc_pages_nodemask+0x3bb/0x450
        alloc_pages_vma+0x8a/0x2c0
        do_anonymous_page+0x16e/0x6f0
        __handle_mm_fault+0xcd5/0xd40
        handle_mm_fault+0xfc/0x2f0
        do_page_fault+0x263/0x6f9
        page_fault+0x34/0x40
      
       1 lock held by mtest01/7454:
        #0: ffff9f425afe8808 (&mm->mmap_sem#2){++++}, at:
       do_page_fault+0x143/0x6f9
       do_user_addr_fault at arch/x86/mm/fault.c:1405
       (inlined by) do_page_fault at arch/x86/mm/fault.c:1539
       irq event stamp: 6944085
       count_memcg_event_mm+0x1a6/0x270
       count_memcg_event_mm+0x119/0x270
       __do_softirq+0x34c/0x57c
       irq_exit+0xa2/0xc0
      
       read to 0xffff9f427ffff2dc of 4 bytes by task 7472 on cpu 38:
        wakeup_kswapd+0xc8/0x400
        wake_all_kswapds+0x59/0xc0
        __alloc_pages_slowpath+0xdcc/0x1290
        __alloc_pages_nodemask+0x3bb/0x450
        alloc_pages_vma+0x8a/0x2c0
        do_anonymous_page+0x16e/0x6f0
        __handle_mm_fault+0xcd5/0xd40
        handle_mm_fault+0xfc/0x2f0
        do_page_fault+0x263/0x6f9
        page_fault+0x34/0x40
      
       1 lock held by mtest01/7472:
        #0: ffff9f425a9ac148 (&mm->mmap_sem#2){++++}, at:
       do_page_fault+0x143/0x6f9
       irq event stamp: 6793561
       count_memcg_event_mm+0x1a6/0x270
       count_memcg_event_mm+0x119/0x270
       __do_softirq+0x34c/0x57c
       irq_exit+0xa2/0xc0
      
       BUG: KCSAN: data-race in kswapd / wakeup_kswapd
      
       write to 0xffff90973ffff2dc of 4 bytes by task 820 on cpu 6:
        kswapd+0x27c/0x8d0
        kthread+0x1e0/0x200
        ret_from_fork+0x27/0x50
      
       read to 0xffff90973ffff2dc of 4 bytes by task 6299 on cpu 0:
        wakeup_kswapd+0xf3/0x450
        wake_all_kswapds+0x59/0xc0
        __alloc_pages_slowpath+0xdcc/0x1290
        __alloc_pages_nodemask+0x3bb/0x450
        alloc_pages_vma+0x8a/0x2c0
        do_anonymous_page+0x170/0x700
        __handle_mm_fault+0xc9f/0xd00
        handle_mm_fault+0xfc/0x2f0
        do_page_fault+0x263/0x6f9
        page_fault+0x34/0x40
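The READ_ONCE/WRITE_ONCE pattern named in the description can be sketched in user space as follows. The macros here are simplified stand-ins for the kernel ones (they rely on GNU C `__typeof__`), and `fake_pgdat`, `record_wakeup()` and the value of `MAX_NR_ZONES` are assumptions made for illustration. The sketch shows the shape of the marked accesses: the shared field is loaded once into a local, and the decision is made on that local copy.

```c
#include <assert.h>

/* Simplified user-space stand-ins for the kernel macros (GNU C). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define MAX_NR_ZONES 4  /* illustrative; also means "no request pending" */

/* Hypothetical reduction of the per-node data touched by the wakeup path. */
struct fake_pgdat {
    int kswapd_classzone_idx;
    int kswapd_order;
};

static void record_wakeup(struct fake_pgdat *pgdat, int order, int classzone_idx)
{
    int curr_idx;

    assert(classzone_idx >= 0 && classzone_idx < MAX_NR_ZONES);

    /* Load the shared field exactly once, then decide on the local copy. */
    curr_idx = READ_ONCE(pgdat->kswapd_classzone_idx);
    if (curr_idx == MAX_NR_ZONES || curr_idx < classzone_idx)
        WRITE_ONCE(pgdat->kswapd_classzone_idx, classzone_idx);

    if (READ_ONCE(pgdat->kswapd_order) < order)
        WRITE_ONCE(pgdat->kswapd_order, order);
}
```

Marked accesses like these keep the compiler from tearing, fusing or re-reading the value; they do not make the read-compare-write sequence atomic.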
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Link: http://lkml.kernel.org/r/1582749472-5171-1-git-send-email-cai@lca.pw
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mm/filemap.c: clear page error before actual read · cebefe4f
      Xianting Tian authored
      [ Upstream commit faffdfa0 ]
      
      A mount failure occurs in the following scenario: an application forks
      dozens of threads, each mounting a separate cramfs image in docker, and
      several of the mounts fail with high probability.  The mounts fail
      because the page (read from the superblock of the loop dev) is found to
      be not uptodate after wait_on_page_locked(page) returns in cramfs_read:
      
         wait_on_page_locked(page);
         if (!PageUptodate(page)) {
            ...
         }
      
      The page is not uptodate for the following reason: systemd-udevd reads
      the loopX dev before the mount.  Because the status of loopX is
      Lo_unbound at this time, loop_make_request directly triggers the io_end
      handler end_buffer_async_read, which calls SetPageError(page).  As a
      result, the page cannot be set to uptodate in end_buffer_async_read:
      
         if(page_uptodate && !PageError(page)) {
            SetPageUptodate(page);
         }
      
      Then the mount operation is performed.  It uses the same page that was
      just accessed by systemd-udevd.  Because this page is not uptodate, the
      mount launches an actual read via submit_bh and then waits on the page
      by calling wait_on_page_locked(page).  When the I/O on the page is done,
      the io_end handler end_buffer_async_read is called.  Because nothing in
      the read path of the mount cleared the page error left by the
      systemd-udevd read, the page is still in the "PageError" state and
      cannot be set to uptodate in end_buffer_async_read, which causes the
      mount failure.
      
      But sometimes the mount succeeds even though systemd-udevd read the
      loopX dev just before.  The reason is that systemd-udevd launched
      another loopX read between steps 3.1 and 3.2, as shown below:
      
      1, loopX dev default status is Lo_unbound;
      2, systemd-udevd read loopX dev (page is set to PageError);
      3, mount operation
         1) set loopX status to Lo_bound;
         ==>systemd-udevd read loopX dev<==
         2) read loopX dev(page has no error)
         3) mount succeed
      
      Because the loopX dev status is set to Lo_bound after step 3.1, the
      other loopX dev read by systemd-udevd goes through the whole I/O stack.
      Part of the call trace is shown below:
      
         SYS_read
            vfs_read
                do_sync_read
                    blkdev_aio_read
                       generic_file_aio_read
                           do_generic_file_read:
                              ClearPageError(page);
                              mapping->a_ops->readpage(filp, page);
      
      Here, mapping->a_ops->readpage() is blkdev_readpage.  In the latest
      kernel some function names have changed, and the call trace is as below:
      
         blkdev_read_iter
            generic_file_read_iter
               generic_file_buffered_read:
                  /*
                   * A previous I/O error may have been due to temporary
                   * failures, eg. multipath errors.
                   * PG_error will be set again if readpage fails.
                   */
                  ClearPageError(page);
                  /* Start the actual read. The read will unlock the page. */
                  error = mapping->a_ops->readpage(filp, page);
      
      We can see that ClearPageError(page) is called before the actual read,
      so the read in step 3.2 succeeds.
      
      This patch adds a call to ClearPageError just before the actual read in
      the read path of the cramfs mount.  Without the patch, the call trace
      when performing a cramfs mount is as below:
      
         do_mount
            cramfs_read
               cramfs_blkdev_read
                  read_cache_page
                     do_read_cache_page:
                        filler(data, page);
                        or
                        mapping->a_ops->readpage(data, page);
      
      With the patch, the call trace when performing a mount is as below:
      
         do_mount
            cramfs_read
               cramfs_blkdev_read
                  read_cache_page:
                     do_read_cache_page:
                        ClearPageError(page); <== new add
                        filler(data, page);
                        or
                        mapping->a_ops->readpage(data, page);
      
      With the patch, the mount operation calls ClearPageError(page) before
      the actual read, so the page has no error unless a new page error occurs
      when the I/O completes.
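The sticky error flag described above can be modelled with two booleans. This is a toy simulation, not kernel code: `fake_page`, `fake_end_read()` and `fake_read_page()` are hypothetical names, and the `clear_first` parameter stands in for the ClearPageError() call added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page with just the two flags that matter for this bug. */
struct fake_page {
    bool error;
    bool uptodate;
};

/* I/O completion: the page becomes uptodate only if no error flag is set. */
static void fake_end_read(struct fake_page *page, bool io_ok)
{
    if (!io_ok)
        page->error = true;
    if (io_ok && !page->error)
        page->uptodate = true;
}

/*
 * Read path. clear_first models the added ClearPageError() call: it drops
 * a stale error left by an earlier failed read before issuing a new one.
 */
static bool fake_read_page(struct fake_page *page, bool clear_first, bool io_ok)
{
    assert(page != NULL);

    if (clear_first)
        page->error = false;
    fake_end_read(page, io_ok);
    return page->uptodate;
}
```

In this model a failed read followed by a successful one leaves the page not uptodate unless the second read clears the flag first, which mirrors the mount failure described in the message.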
      Signed-off-by: Xianting Tian <xianting_tian@126.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: <yubin@h3c.com>
      Link: http://lkml.kernel.org/r/1583318844-22971-1-git-send-email-xianting_tian@126.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mm/kmemleak.c: use address-of operator on section symbols · afe00148
      Nathan Chancellor authored
      [ Upstream commit b0d14fc4 ]
      
      Clang warns:
      
        mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
              if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
                                        ^
        mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
              if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
      
      These are not true arrays, they are linker defined symbols, which are just
      addresses.  Using the address-of operator silences the warning and does
      not change the resulting assembly with either clang/ld.lld or gcc/ld
      (tested with diff + objdump -Dr).
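Why the address-of form produces the same code can be seen in a tiny user-space example. `fake_section_start` is a hypothetical stand-in for a linker-defined symbol declared as an array; the sketch only shows that the decayed array and `&array` name the same address while having different types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a linker-defined section symbol, which C sees as an array. */
static char fake_section_start[16];

static bool same_address(void)
{
    /* The array decays to a pointer to its first element (char *)... */
    uintptr_t decayed = (uintptr_t)fake_section_start;
    /* ...while &array has type char (*)[16] but holds the same address. */
    uintptr_t explicit_addr = (uintptr_t)&fake_section_start;

    return decayed == explicit_addr;
}

/* Example check: aborts if the two spellings ever disagreed. */
static void check_section_symbol(void)
{
    assert(same_address());
}
```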
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://github.com/ClangBuiltLinux/linux/issues/895
      Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • NFS: Fix races nfs_page_group_destroy() vs nfs_destroy_unlinked_subrequests() · 1f39a7cc
      Trond Myklebust authored
      [ Upstream commit 08ca8b21 ]
      
      When a subrequest is being detached from the subgroup, we want to
      ensure that it is not holding the group lock, or in the process
      of waiting for the group lock.
      
      Fixes: 5b2b5187 ("NFS: Fix nfs_page_group_destroy() and nfs_lock_and_join_requests() race cases")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • PCI: pciehp: Fix MSI interrupt race · a8cc5227
      Stuart Hayes authored
      [ Upstream commit 8edf5332 ]
      
      Without this commit, a PCIe hotplug port can stop generating interrupts on
      hotplug events, so device adds and removals will not be seen:
      
      The pciehp interrupt handler pciehp_isr() reads the Slot Status register
      and then writes back to it to clear the bits that caused the interrupt.  If
      a different interrupt event bit gets set between the read and the write,
      pciehp_isr() returns without having cleared all of the interrupt event
      bits.  If this happens when the MSI isn't masked (which by default it isn't
      in handle_edge_irq(), and which it will never be when MSI per-vector
      masking is not supported), we won't get any more hotplug interrupts from
      that device.
      
      That is expected behavior, according to the PCIe Base Spec r5.0, section
      6.7.3.4, "Software Notification of Hot-Plug Events".
      
      Because the Presence Detect Changed and Data Link Layer State Changed event
      bits can both get set at nearly the same time when a device is added or
      removed, this is more likely to happen than it might seem.  The issue was
      found (and can be reproduced rather easily) by connecting and disconnecting
      an NVMe storage device on at least one system model where the NVMe devices
      were being connected to an AMD PCIe port (PCI device 0x1022/0x1483).
      
      Fix the issue by modifying pciehp_isr() to loop back and re-read the Slot
      Status register immediately after writing to it, until it sees that all of
      the event status bits have been cleared.
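The re-read loop can be illustrated with a simulated write-1-to-clear register. Everything here is an assumption made for the sketch: `fake_slot`, the helper names, and the two bit values standing in for the Presence Detect Changed and Data Link Layer State Changed events. The simulated device lets a second event arrive between the handler's read and its write, which is the race the message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative event bits; the values are assumptions for this sketch. */
#define FAKE_PDC    0x0008u  /* stands in for Presence Detect Changed */
#define FAKE_DLLSC  0x0100u  /* stands in for Data Link Layer State Changed */
#define FAKE_EVENTS (FAKE_PDC | FAKE_DLLSC)

/* Simulated slot with a write-1-to-clear status register. */
struct fake_slot {
    uint16_t status;
    uint16_t late_event;  /* bit that arrives between the read and the write */
};

static uint16_t slot_read(const struct fake_slot *s)
{
    return s->status & FAKE_EVENTS;
}

static void slot_write_clear(struct fake_slot *s, uint16_t bits)
{
    /* A new event lands just before the write reaches the device (once). */
    s->status |= s->late_event;
    s->late_event = 0;
    s->status &= (uint16_t)~bits;
}

/* Collect events, re-reading until no event bit is left set. */
static uint16_t handle_events(struct fake_slot *s)
{
    uint16_t events = 0;
    uint16_t status;

    assert(s != NULL);

    while ((status = slot_read(s)) != 0) {
        events |= status;
        slot_write_clear(s, status);
    }
    return events;
}
```

A handler that read and cleared only once would return with the late bit still set in the simulated register, which corresponds to the stuck-interrupt condition described above.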
      
      [lukas: drop loop count limitation, write "events" instead of "status",
      don't loop back in INTx and poll modes, tweak code comment & commit msg]
      Link: https://lore.kernel.org/r/78b4ced5072bfe6e369d20e8b47c279b8c7af12e.1582121613.git.lukas@wunner.de
      Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com>
      Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ALSA: usb-audio: Fix case when USB MIDI interface has more than one extra endpoint descriptor · 65d95462
      Andreas Steinmetz authored
      [ Upstream commit 5c6cd702 ]
      
      The Miditech MIDIFACE 16x16 (USB ID 1290:1749) has more than one extra
      endpoint descriptor.
      
      The first extra descriptor is: 0x06 0x30 0x00 0x00 0x00 0x00
      
      As the code in snd_usbmidi_get_ms_info() looks only at the
      first extra descriptor to find USB_DT_CS_ENDPOINT the device
      as such is recognized but there is neither input nor output
      configured.
      
      The patch iterates through the extra descriptors to find the
      proper one. With this patch the device is correctly configured.
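Iterating over extra descriptors amounts to walking a byte buffer in which each entry starts with its length and its type. The sketch below is a generic user-space version of that walk, not the driver code: `find_extra_descriptor()` is a hypothetical helper, and 0x25 is the class-specific endpoint descriptor type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DT_CS_ENDPOINT 0x25  /* class-specific endpoint descriptor type */

/*
 * Walk a buffer of back-to-back descriptors. Each one starts with
 * bLength (byte 0) and bDescriptorType (byte 1). Returns a pointer to the
 * first descriptor of the requested type, or NULL if none is found.
 */
static const uint8_t *find_extra_descriptor(const uint8_t *extra,
                                            size_t extralen, uint8_t type)
{
    assert(extra != NULL || extralen == 0);

    while (extralen >= 2) {
        uint8_t len = extra[0];

        if (len < 2 || len > extralen)
            break;              /* malformed descriptor: stop walking */
        if (extra[1] == type)
            return extra;
        extra += len;           /* skip to the next descriptor */
        extralen -= len;
    }
    return NULL;
}
```

With the six-byte descriptor quoted in the message (0x06 0x30 ...) at the front of the buffer, a check of only the first entry would miss a class-specific endpoint descriptor that follows it, while this walk skips over it.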
      Signed-off-by: Andreas Steinmetz <ast@domdv.de>
      Link: https://lore.kernel.org/r/1c3b431a86f69e1d60745b6110cdb93c299f120b.camel@domdv.de
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len · 2f0a77cc
      Liu Song authored
      [ Upstream commit acc5af3e ]
      
      In "ubifs_check_node", when the value of "node_len" is abnormal,
      the code jumps to the "out_len" label. Then, in the
      following "ubifs_dump_node", if inode type is "UBIFS_DATA_NODE",
      an out-of-bounds access may occur in "print_hex_dump" due to the
      wrong "ch->len".
      
      Therefore, when the value of "node_len" is abnormal, the data length
      should be adjusted to a reasonable safe range. At this point the
      structured data is not credible, so dump the corrupted data directly
      for analysis.
      Signed-off-by: Liu Song <liu.song11@zte.com.cn>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • PCI: Use ioremap(), not phys_to_virt() for platform ROM · 1841a993
      Mikel Rychliski authored
      [ Upstream commit 72e0ef0e ]
      
      On some EFI systems, the video BIOS is provided by the EFI firmware.  The
      boot stub code stores the physical address of the ROM image in pdev->rom.
      Currently we attempt to access this pointer using phys_to_virt(), which
      doesn't work with CONFIG_HIGHMEM.
      
      On these systems, attempting to load the radeon module on an x86_32 kernel
      can result in the following:
      
        BUG: unable to handle page fault for address: 3e8ed03c
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        *pde = 00000000
        Oops: 0000 [#1] PREEMPT SMP
        CPU: 0 PID: 317 Comm: systemd-udevd Not tainted 5.6.0-rc3-next-20200228 #2
        Hardware name: Apple Computer, Inc. MacPro1,1/Mac-F4208DC8, BIOS     MP11.88Z.005C.B08.0707021221 07/02/07
        EIP: radeon_get_bios+0x5ed/0xe50 [radeon]
        Code: 00 00 84 c0 0f 85 12 fd ff ff c7 87 64 01 00 00 00 00 00 00 8b 47 08 8b 55 b0 e8 1e 83 e1 d6 85 c0 74 1a 8b 55 c0 85 d2 74 13 <80> 38 55 75 0e 80 78 01 aa 0f 84 a4 03 00 00 8d 74 26 00 68 dc 06
        EAX: 3e8ed03c EBX: 00000000 ECX: 3e8ed03c EDX: 00010000
        ESI: 00040000 EDI: eec04000 EBP: eef3fc60 ESP: eef3fbe0
        DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010206
        CR0: 80050033 CR2: 3e8ed03c CR3: 2ec77000 CR4: 000006d0
        Call Trace:
         r520_init+0x26/0x240 [radeon]
         radeon_device_init+0x533/0xa50 [radeon]
         radeon_driver_load_kms+0x80/0x220 [radeon]
         drm_dev_register+0xa7/0x180 [drm]
         radeon_pci_probe+0x10f/0x1a0 [radeon]
         pci_device_probe+0xd4/0x140
      
      Fix the issue by updating all drivers which can access a platform provided
      ROM. Instead of calling the helper function pci_platform_rom() which uses
      phys_to_virt(), call ioremap() directly on the pdev->rom.
      
      radeon_read_platform_bios() previously directly accessed an __iomem
      pointer. Avoid this by calling memcpy_fromio() instead of kmemdup().
      
      pci_platform_rom() now has no remaining callers, so remove it.
      
      Link: https://lore.kernel.org/r/20200319021623.5426-1-mikel@mikelr.com
      Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • svcrdma: Fix leak of transport addresses · 308aeb36
      Chuck Lever authored
      [ Upstream commit 1a33d8a2 ]
      
      Kernel memory leak detected:
      
      unreferenced object 0xffff888849cdf480 (size 8):
        comm "kworker/u8:3", pid 2086, jiffies 4297898756 (age 4269.856s)
        hex dump (first 8 bytes):
          30 00 cd 49 88 88 ff ff                          0..I....
        backtrace:
          [<00000000acfc370b>] __kmalloc_track_caller+0x137/0x183
          [<00000000a2724354>] kstrdup+0x2b/0x43
          [<0000000082964f84>] xprt_rdma_format_addresses+0x114/0x17d [rpcrdma]
          [<00000000dfa6ed00>] xprt_setup_rdma_bc+0xc0/0x10c [rpcrdma]
          [<0000000073051a83>] xprt_create_transport+0x3f/0x1a0 [sunrpc]
          [<0000000053531a8e>] rpc_create+0x118/0x1cd [sunrpc]
          [<000000003a51b5f8>] setup_callback_client+0x1a5/0x27d [nfsd]
          [<000000001bd410af>] nfsd4_process_cb_update.isra.7+0x16c/0x1ac [nfsd]
          [<000000007f4bbd56>] nfsd4_run_cb_work+0x4c/0xbd [nfsd]
          [<0000000055c5586b>] process_one_work+0x1b2/0x2fe
          [<00000000b1e3e8ef>] worker_thread+0x1a6/0x25a
          [<000000005205fb78>] kthread+0xf6/0xfb
          [<000000006d2dc057>] ret_from_fork+0x3a/0x50
      
      Introduce a call to xprt_rdma_free_addresses() similar to the way
      that the TCP backchannel releases a transport's peer address
      strings.
      
      Fixes: 5d252f90 ("svcrdma: Add class for RDMA backwards direction transport")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()' · 38c46471
      Christophe JAILLET authored
      [ Upstream commit b25b60d7 ]
      
      'maxlen' is the total size of the destination buffer. There is only one
      caller and this value is 256.
      
      When we compute the size already used and what we would like to add in
      the buffer, the trailing NUL character is not taken into account.
      However, this trailing character will be added by the 'strcat' once we
      have checked that we have enough room.
      
      So, there is an off-by-one issue and 1 byte of the stack could be
      erroneously overwritten.
      
      Take into account the trailing NUL when checking if there is enough
      room in the destination buffer.
      
      While at it, also replace a 'sprintf' by a safer 'snprintf', check for
      output truncation and avoid a superfluous 'strlen'.
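The bounds check with room for the terminator can be sketched as below. This is a generic user-space illustration, not the svc_print_xprts() code: `append_entry()`, its "name port" format and the 80-byte scratch buffer are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append "name port\n" to the NUL-terminated string in buf, whose total
 * size is maxlen. Returns 0 on success, -1 if the entry does not fit.
 */
static int append_entry(char *buf, size_t maxlen, const char *name, int port)
{
    char tmp[80];
    size_t used;
    int len;

    assert(buf != NULL && name != NULL && maxlen > 0);

    used = strlen(buf);
    len = snprintf(tmp, sizeof(tmp), "%s %d\n", name, port);
    if (len < 0 || (size_t)len >= sizeof(tmp))
        return -1;                      /* formatting failed or truncated */

    /* The +1 accounts for the terminating NUL that the copy writes. */
    if (used + (size_t)len + 1 > maxlen)
        return -1;

    memcpy(buf + used, tmp, (size_t)len + 1);
    return 0;
}
```

Dropping the `+ 1` from the check would accept an entry that exactly fills the buffer and then write the terminator one byte past its end, which is the off-by-one the message describes.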
      
      Fixes: dc9a16e4 ("svc: Add /proc/sys/sunrpc/transport files")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      [ cel: very minor fix to documenting comment ]
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: hpsa: correct race condition in offload enabled · b125a752
      Don Brace authored
      [ Upstream commit 3e16e83a ]
      
      Correct race condition where ioaccel is re-enabled before the raid_map is
      updated. For RAID_1, RAID_1ADM, and RAID 5/6 there is a BUG_ON called which
      is bad.
      
       - Change event thread to disable ioaccel only. Send all requests down the
         RAID path instead.
      
       - Have rescan thread handle offload_enable.
      
       - Since there is only one rescan allowed at a time, turning
         offload_enabled on/off should not be racy. Each handler queues up a
         rescan if one is already in progress.
      
        - For timing diagram, offload_enabled is initially off due to a change
          (transformation: splitmirror/remirror), ...
      
        otbe = offload_to_be_enabled
        oe   = offload_enabled
      
        Time Event         Rescan              Completion     Request
             Worker        Worker              Thread         Thread
        ---- ------        ------              ----------     -------
         T0   |             |                       + UA      |
         T1   |             + rescan started        | 0x3f    |
         T2   + Event       |                       | 0x0e    |
         T3   + Ack msg     |                       |         |
         T4   |             + if (!dev[i]->oe &&    |         |
         T5   |             |     dev[i]->otbe)     |         |
         T6   |             |      get_raid_map     |         |
         T7   + otbe = 1    |                       |         |
         T8   |             |                       |         |
         T9   |             + oe = otbe             |         |
         T10  |             |                       |         + ioaccel request
         T11                                                  * BUG_ON
      
        T0 - I/O completion with UA 0x3f 0x0e sets rescan flag.
        T1 - rescan worker thread starts a rescan.
        T2 - event comes in
        T3 - event thread starts and issues "Acknowledge" message
        ...
        T6 - rescan thread has bypassed code to reload new raid map.
        ...
        T7 - event thread runs and sets offload_to_be_enabled
        ...
        T9 - rescan thread turns on offload_enabled.
        T10- request comes in and goes down ioaccel path.
        T11- BUG_ON.
      
       - After the patch is applied, ioaccel_enabled can only be re-enabled in
         the re-scan thread.
      
      Link: https://lore.kernel.org/r/158472877894.14200.7077843399036368335.stgit@brunhilda
      Reviewed-by: Scott Teel <scott.teel@microsemi.com>
      Reviewed-by: Matt Perricone <matt.perricone@microsemi.com>
      Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
      Signed-off-by: Don Brace <don.brace@microsemi.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b125a752
    • Zhu Yanjun's avatar
      RDMA/rxe: Set sys_image_guid to be aligned with HW IB devices · db96986c
      Zhu Yanjun authored
      [ Upstream commit d0ca2c35 ]
      
      The RXE driver doesn't set sys_image_guid and user space applications see
      zeros. This causes pyverbs tests to fail with the following traceback,
      because the IBTA spec requires a valid sys_image_guid.
      
       Traceback (most recent call last):
         File "./tests/test_device.py", line 51, in test_query_device
           self.verify_device_attr(attr)
         File "./tests/test_device.py", line 74, in verify_device_attr
           assert attr.sys_image_guid != 0
      
      In order to fix it, set sys_image_guid to be equal to node_guid.
      
      Before:
       5: rxe0: ... node_guid 5054:00ff:feaa:5363 sys_image_guid
       0000:0000:0000:0000
      
      After:
       5: rxe0: ... node_guid 5054:00ff:feaa:5363 sys_image_guid
       5054:00ff:feaa:5363
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Link: https://lore.kernel.org/r/20200323112800.1444784-1-leon@kernel.org
      Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      db96986c
    • Israel Rukshin's avatar
      nvme: Fix controller creation races with teardown flow · 6f7baf41
      Israel Rukshin authored
      [ Upstream commit ce151813 ]
      
      Calling nvme_sysfs_delete() when the controller is in the middle of
      creation may cause several bugs. If the controller is in NEW state we
      remove delete_controller file and don't delete the controller. The user
      will not be able to use nvme disconnect command on that controller again,
      although the controller may be active. Other bugs may happen if the
      controller is in the middle of create_ctrl callback and
      nvme_do_delete_ctrl() starts. For example, freeing I/O tagset at
      nvme_do_delete_ctrl() before it was allocated at create_ctrl callback.
      
      To fix all those races don't allow the user to delete the controller
      before it was fully created.

      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6f7baf41
    • John Meneghini's avatar
      nvme-multipath: do not reset on unknown status · b3dc81c1
      John Meneghini authored
      [ Upstream commit 764e9332 ]
      
      The nvme multipath error handling defaults to controller reset if the
      error is unknown. There are, however, no existing nvme status codes that
      indicate a reset should be used, and resetting causes unnecessary
      disruption to the rest of IO.
      
      Change nvme's error handling to first check if failover should happen.
      If not, let the normal error handling take over rather than reset the
      controller.

      Based-on-a-patch-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: John Meneghini <johnm@netapp.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b3dc81c1
    • Gabriel Ravier's avatar
      tools: gpio-hammer: Avoid potential overflow in main · 1d0e4829
      Gabriel Ravier authored
      [ Upstream commit d1ee7e1f ]
      
      If '-o' was used more than 64 times in a single invocation of gpio-hammer,
      this could lead to an overflow of the 'lines' array. This commit fixes
      this by avoiding the overflow and giving a proper diagnostic back to the
      user
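
The kind of bounds check described above might look like the following sketch. `MAX_LINES` and `add_line` are hypothetical names chosen for illustration; only the limit of 64 comes from the commit message, and the real tool's option parsing is not shown.

```c
#include <stddef.h>

#define MAX_LINES 64 /* capacity of the 'lines' array, per the commit text */

/*
 * Hypothetical helper: record one more line offset, refusing to write
 * past the end of the array. Returns 0 on success, -1 when the array is
 * already full so the caller can print a diagnostic and exit.
 */
static int add_line(unsigned int *lines, size_t *count, unsigned int offset)
{
	if (*count >= MAX_LINES)
		return -1;

	lines[(*count)++] = offset;
	return 0;
}
```

A caller would invoke `add_line` once per '-o' option and report an error to the user as soon as it returns -1.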

      Signed-off-by: Gabriel Ravier <gabravier@gmail.com>
      Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1d0e4829
    • Pratik Rajesh Sampat's avatar
      cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_work_fn · 68aaf039
      Pratik Rajesh Sampat authored
      [ Upstream commit d95fe371 ]
      
      The patch avoids allocating cpufreq_policy on stack hence fixing frame
      size overflow in 'powernv_cpufreq_work_fn'
      
      Fixes: 22794280 ("cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling")
      Signed-off-by: Pratik Rajesh Sampat <psampat@linux.ibm.com>
      Reviewed-by: Daniel Axtens <dja@axtens.net>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200316135743.57735-1-psampat@linux.ibm.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      68aaf039
    • Christophe JAILLET's avatar
      perf cpumap: Fix snprintf overflow check · 9a1d2d2e
      Christophe JAILLET authored
      [ Upstream commit d74b181a ]
      
      'snprintf' returns the number of characters which would be generated for
      the given input.
      
      If the returned value is *greater than* or equal to the buffer size, it
      means that the output has been truncated.
      
      Fix the overflow test accordingly.
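
As a minimal sketch of the corrected test, the helper below treats a return value greater than or equal to the buffer size as truncation. `format_cpu_name` and the "cpu%d" format are hypothetical and chosen only for this example; they are not taken from the perf sources.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "cpu<N>" into buf.
 * Returns 0 on success, -1 on an encoding error or if the output was
 * truncated.
 */
static int format_cpu_name(char *buf, size_t size, int cpu)
{
	int ret = snprintf(buf, size, "cpu%d", cpu);

	/*
	 * snprintf() returns the length the full string would have had,
	 * not counting the NUL. ret == size therefore already means the
	 * last character was dropped, so the test must be ">=", not ">".
	 */
	if (ret < 0 || (size_t)ret >= size)
		return -1;

	return 0;
}
```

With a 5-byte buffer, "cpu1" fits, while "cpu10" needs 6 bytes including the NUL and is reported as truncated; a check written with ">" would miss that case.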
      
      Fixes: 7780c25b ("perf tools: Allow ability to map cpus to nodes easily")
      Fixes: 92a7e127 ("perf cpumap: Add cpu__max_present_cpu()")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Suggested-by: David Laight <David.Laight@ACULAB.COM>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: He Zhe <zhe.he@windriver.com>
      Cc: Jan Stancek <jstancek@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-janitors@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200324070319.10901-1-christophe.jaillet@wanadoo.fr
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9a1d2d2e
    • Vignesh Raghavendra's avatar
      serial: 8250: 8250_omap: Terminate DMA before pushing data on RX timeout · 69077bd8
      Vignesh Raghavendra authored
      [ Upstream commit 7cf4df30 ]
      
      Terminate and flush DMA internal buffers before pushing RX data to the
      higher layer. Otherwise, this will lead to data corruption, as the driver
      would end up pushing stale buffer data to the higher layer while actual
      data is still stuck inside the DMA hardware and has not yet arrived in
      memory.

      While at it, replace deprecated dmaengine_terminate_all() with
      dmaengine_terminate_async().

      Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
      Link: https://lore.kernel.org/r/20200319110344.21348-2-vigneshr@ti.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      69077bd8
    • Peter Ujfalusi's avatar
      serial: 8250_omap: Fix sleeping function called from invalid context during probe · 10aa90fe
      Peter Ujfalusi authored
      [ Upstream commit 4ce35a36 ]
      
      When booting j721e the following bug is printed:
      
      [    1.154821] BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
      [    1.154827] in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 12, name: kworker/0:1
      [    1.154832] 3 locks held by kworker/0:1/12:
      [    1.154836]  #0: ffff000840030728 ((wq_completion)events){+.+.}, at: process_one_work+0x1d4/0x6e8
      [    1.154852]  #1: ffff80001214fdd8 (deferred_probe_work){+.+.}, at: process_one_work+0x1d4/0x6e8
      [    1.154860]  #2: ffff00084060b170 (&dev->mutex){....}, at: __device_attach+0x38/0x138
      [    1.154872] irq event stamp: 63096
      [    1.154881] hardirqs last  enabled at (63095): [<ffff800010b74318>] _raw_spin_unlock_irqrestore+0x70/0x78
      [    1.154887] hardirqs last disabled at (63096): [<ffff800010b740d8>] _raw_spin_lock_irqsave+0x28/0x80
      [    1.154893] softirqs last  enabled at (62254): [<ffff800010080c88>] _stext+0x488/0x564
      [    1.154899] softirqs last disabled at (62247): [<ffff8000100fdb3c>] irq_exit+0x114/0x140
      [    1.154906] CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.6.0-rc6-next-20200318-00094-g45e4089b0bd3 #221
      [    1.154911] Hardware name: Texas Instruments K3 J721E SoC (DT)
      [    1.154917] Workqueue: events deferred_probe_work_func
      [    1.154923] Call trace:
      [    1.154928]  dump_backtrace+0x0/0x190
      [    1.154933]  show_stack+0x14/0x20
      [    1.154940]  dump_stack+0xe0/0x148
      [    1.154946]  ___might_sleep+0x150/0x1f0
      [    1.154952]  __might_sleep+0x4c/0x80
      [    1.154957]  wait_for_completion_timeout+0x40/0x140
      [    1.154964]  ti_sci_set_device_state+0xa0/0x158
      [    1.154969]  ti_sci_cmd_get_device_exclusive+0x14/0x20
      [    1.154977]  ti_sci_dev_start+0x34/0x50
      [    1.154984]  genpd_runtime_resume+0x78/0x1f8
      [    1.154991]  __rpm_callback+0x3c/0x140
      [    1.154996]  rpm_callback+0x20/0x80
      [    1.155001]  rpm_resume+0x568/0x758
      [    1.155007]  __pm_runtime_resume+0x44/0xb0
      [    1.155013]  omap8250_probe+0x2b4/0x508
      [    1.155019]  platform_drv_probe+0x50/0xa0
      [    1.155023]  really_probe+0xd4/0x318
      [    1.155028]  driver_probe_device+0x54/0xe8
      [    1.155033]  __device_attach_driver+0x80/0xb8
      [    1.155039]  bus_for_each_drv+0x74/0xc0
      [    1.155044]  __device_attach+0xdc/0x138
      [    1.155049]  device_initial_probe+0x10/0x18
      [    1.155053]  bus_probe_device+0x98/0xa0
      [    1.155058]  deferred_probe_work_func+0x74/0xb0
      [    1.155063]  process_one_work+0x280/0x6e8
      [    1.155068]  worker_thread+0x48/0x430
      [    1.155073]  kthread+0x108/0x138
      [    1.155079]  ret_from_fork+0x10/0x18
      
      To fix the bug we need to first call pm_runtime_enable() prior to any
      pm_runtime calls.

      Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Link: https://lore.kernel.org/r/20200320125200.6772-1-peter.ujfalusi@ti.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      10aa90fe
    • Vignesh Raghavendra's avatar
      serial: 8250_port: Don't service RX FIFO if throttled · 20191760
      Vignesh Raghavendra authored
      [ Upstream commit f19c3f6c ]
      
      When port's throttle callback is called, it should stop pushing any more
      data into TTY buffer to avoid buffer overflow. This means driver has to
      stop HW from receiving more data and assert the HW flow control. For
      UARTs with auto HW flow control (such as 8250_omap) manual assertion of
      flow control line is not possible and only way is to allow RX FIFO to
      fill up, thus trigger auto HW flow control logic.
      
      Therefore make sure that 8250 generic IRQ handler does not drain data
      when port is stopped (i.e. UART_LSR_DR is unset in read_status_mask). Not
      servicing the RX FIFO triggers auto HW flow control when FIFO
      occupancy reaches the preset threshold, thus halting RX.

      Since error conditions in the UART_LSR register are cleared just by
      reading the register, data has to be drained in case there are FIFO
      errors, else error information will be lost.

      Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
      Link: https://lore.kernel.org/r/20200319103230.16867-2-vigneshr@ti.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      20191760
    • Ian Rogers's avatar
      perf parse-events: Fix 3 use after frees found with clang ASAN · a0100a36
      Ian Rogers authored
      [ Upstream commit d4953f7e ]
      
      Reproducible with a clang asan build and then running perf test in
      particular 'Parse event definition strings'.

      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: clang-built-linux@googlegroups.com
      Link: http://lore.kernel.org/lkml/20200314170356.62914-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a0100a36
    • Niklas Söderlund's avatar
      thermal: rcar_thermal: Handle probe error gracefully · 9d8b5dba
      Niklas Söderlund authored
      [ Upstream commit 39056e8a ]
      
      If the common register memory resource is not available, the driver needs
      to fail gracefully and disable PM. Instead of returning the error
      directly, store it in ret and use the already existing error path.
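
The pattern described above, storing the error and jumping to a shared unwind label instead of returning early, can be sketched as follows. Everything here is hypothetical: `sketch_probe`, the `pm_enabled` flag standing in for runtime PM state, the `error_unregister` label and the choice of -ENODEV are assumptions for illustration, not the rcar_thermal code.

```c
#include <errno.h>
#include <stdbool.h>

static bool pm_enabled; /* stands in for the device's runtime PM state */

/* Hypothetical probe routine showing a single shared error path. */
static int sketch_probe(bool have_common_regs)
{
	int ret;

	pm_enabled = true; /* PM is enabled early in probe */

	if (!have_common_regs) {
		/*
		 * Returning here directly would leave PM enabled.
		 * Store the error and take the common unwind path.
		 */
		ret = -ENODEV;
		goto error_unregister;
	}

	return 0;

error_unregister:
	pm_enabled = false; /* undo what probe set up so far */
	return ret;
}
```

Routing every failure through one label keeps the cleanup in a single place, so a newly added early exit cannot skip it.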

      Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/r/20200310114709.1483860-1-niklas.soderlund+renesas@ragnatech.se
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9d8b5dba
    • Nathan Chancellor's avatar
      tracing: Use address-of operator on section symbols · b92d156a
      Nathan Chancellor authored
      [ Upstream commit bf2cbe04 ]
      
      Clang warns:
      
      ../kernel/trace/trace.c:9335:33: warning: array comparison always
      evaluates to true [-Wtautological-compare]
              if (__stop___trace_bprintk_fmt != __start___trace_bprintk_fmt)
                                             ^
      1 warning generated.
      
      These are not true arrays, they are linker defined symbols, which are
      just addresses. Using the address of operator silences the warning and
      does not change the runtime result of the check (tested with some print
      statements compiled in with clang + ld.lld and gcc + ld.bfd in QEMU).
      
      Link: http://lkml.kernel.org/r/20200220051011.26113-1-natechancellor@gmail.com
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/893
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b92d156a
    • Jordan Crouse's avatar
      drm/msm/a5xx: Always set an OPP supported hardware value · 102bdec1
      Jordan Crouse authored
      [ Upstream commit 0478b4fc ]
      
      If the opp table specifies opp-supported-hw as a property but the driver
      has not set a supported hardware value the OPP subsystem will reject
      all the table entries.
      
      Set a "default" value that will match the default table entries but not
      conflict with any possible real bin values. Also fix a small memory leak
      and free the buffer allocated by nvmem_cell_read().

      Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
      Reviewed-by: Eric Anholt <eric@anholt.net>
      Signed-off-by: Rob Clark <robdclark@chromium.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      102bdec1
    • Pavel Machek's avatar
      drm/msm: fix leaks if initialization fails · 45e61801
      Pavel Machek authored
      [ Upstream commit 66be340f ]
      
      We should free resources in the unlikely case of allocation failure.

      Signed-off-by: Pavel Machek <pavel@denx.de>
      Signed-off-by: Rob Clark <robdclark@chromium.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      45e61801
    • Gustavo Romero's avatar
      KVM: PPC: Book3S HV: Treat TM-related invalid form instructions on P9 like the valid ones · 22840383
      Gustavo Romero authored
      [ Upstream commit 1dff3064 ]
      
      On P9 DD2.2 due to a CPU defect some TM instructions need to be emulated by
      KVM. This is handled at first by the hardware raising a softpatch interrupt
      when certain TM instructions that need KVM assistance are executed in the
      guest. Although some TM instructions are invalid forms per the Power ISA,
      they can raise a softpatch interrupt too. For instance, 'tresume.' instruction
      as defined in the ISA must have bit 31 set (1), but an instruction that
      matches 'tresume.' PO and XO opcode fields but has bit 31 not set (0), like
      0x7cfe9ddc, also raises a softpatch interrupt. Similarly for 'treclaim.'
      and 'trechkpt.' instructions with bit 31 = 0, i.e. 0x7c00075c and
      0x7c0007dc, respectively. Hence, if a code like the following is executed
      in the guest it will raise a softpatch interrupt just like a 'tresume.'
      when the TM facility is enabled ('tabort. 0' in the example is used only
      to enable the TM facility):
      
      int main() { asm("tabort. 0; .long 0x7cfe9ddc;"); }
      
      Currently in such a case KVM throws a complete trace like:
      
      [345523.705984] WARNING: CPU: 24 PID: 64413 at arch/powerpc/kvm/book3s_hv_tm.c:211 kvmhv_p9_tm_emulation+0x68/0x620 [kvm_hv]
      [345523.705985] Modules linked in: kvm_hv(E) xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat
      iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ebtable_filter ebtables ip6table_filter
      ip6_tables iptable_filter bridge stp llc sch_fq_codel ipmi_powernv at24 vmx_crypto ipmi_devintf ipmi_msghandler
      ibmpowernv uio_pdrv_genirq kvm opal_prd uio leds_powernv ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp
      libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456
      async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear tg3
      crct10dif_vpmsum crc32c_vpmsum ipr [last unloaded: kvm_hv]
      [345523.706030] CPU: 24 PID: 64413 Comm: CPU 0/KVM Tainted: G        W   E     5.5.0+ #1
      [345523.706031] NIP:  c0080000072cb9c0 LR: c0080000072b5e80 CTR: c0080000085c7850
      [345523.706034] REGS: c000000399467680 TRAP: 0700   Tainted: G        W   E      (5.5.0+)
      [345523.706034] MSR:  900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 24022428  XER: 00000000
      [345523.706042] CFAR: c0080000072b5e7c IRQMASK: 0
                      GPR00: c0080000072b5e80 c000000399467910 c0080000072db500 c000000375ccc720
                      GPR04: c000000375ccc720 00000003fbec0000 0000a10395dda5a6 0000000000000000
                      GPR08: 000000007cfe9ddc 7cfe9ddc000005dc 7cfe9ddc7c0005dc c0080000072cd530
                      GPR12: c0080000085c7850 c0000003fffeb800 0000000000000001 00007dfb737f0000
                      GPR16: c0002001edcca558 0000000000000000 0000000000000000 0000000000000001
                      GPR20: c000000001b21258 c0002001edcca558 0000000000000018 0000000000000000
                      GPR24: 0000000001000000 ffffffffffffffff 0000000000000001 0000000000001500
                      GPR28: c0002001edcc4278 c00000037dd80000 800000050280f033 c000000375ccc720
      [345523.706062] NIP [c0080000072cb9c0] kvmhv_p9_tm_emulation+0x68/0x620 [kvm_hv]
      [345523.706065] LR [c0080000072b5e80] kvmppc_handle_exit_hv.isra.53+0x3e8/0x798 [kvm_hv]
      [345523.706066] Call Trace:
      [345523.706069] [c000000399467910] [c000000399467940] 0xc000000399467940 (unreliable)
      [345523.706071] [c000000399467950] [c000000399467980] 0xc000000399467980
      [345523.706075] [c0000003994679f0] [c0080000072bd1c4] kvmhv_run_single_vcpu+0xa1c/0xb80 [kvm_hv]
      [345523.706079] [c000000399467ac0] [c0080000072bd8e0] kvmppc_vcpu_run_hv+0x5b8/0xb00 [kvm_hv]
      [345523.706087] [c000000399467b90] [c0080000085c93cc] kvmppc_vcpu_run+0x34/0x48 [kvm]
      [345523.706095] [c000000399467bb0] [c0080000085c582c] kvm_arch_vcpu_ioctl_run+0x244/0x420 [kvm]
      [345523.706101] [c000000399467c40] [c0080000085b7498] kvm_vcpu_ioctl+0x3d0/0x7b0 [kvm]
      [345523.706105] [c000000399467db0] [c0000000004adf9c] ksys_ioctl+0x13c/0x170
      [345523.706107] [c000000399467e00] [c0000000004adff8] sys_ioctl+0x28/0x80
      [345523.706111] [c000000399467e20] [c00000000000b278] system_call+0x5c/0x68
      [345523.706112] Instruction dump:
      [345523.706114] 419e0390 7f8a4840 409d0048 6d497c00 2f89075d 419e021c 6d497c00 2f8907dd
      [345523.706119] 419e01c0 6d497c00 2f8905dd 419e00a4 <0fe00000> 38210040 38600000 ebc1fff0
      
      and then treats the executed instruction as a 'nop'.
      
      However, the POWER9 User's Manual, in section "4.6.10 Book II Invalid
      Forms", states that for TM instructions bit 31 is in fact ignored, thus
      for the TM-related invalid forms ignoring bit 31 and handling them like the
      valid forms is an acceptable way to handle them. POWER8 behaves the same
      way too.
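
To illustrate what "ignoring bit 31" means for the encodings quoted above, here is a minimal sketch. It assumes IBM bit numbering, in which bit 31 of a 32-bit instruction word is the least significant bit; `TM_BIT31` and `same_tm_instruction` are hypothetical names and this is not the KVM emulation code.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB, so bit 31 is the LSB of the word. */
#define TM_BIT31 0x1u

/*
 * Hypothetical helper: compare two instruction words while ignoring
 * bit 31, so that an invalid form (bit 31 = 0) matches its valid form
 * (bit 31 = 1). Returns 1 if they match, 0 otherwise.
 */
static int same_tm_instruction(uint32_t a, uint32_t b)
{
	return (a | TM_BIT31) == (b | TM_BIT31);
}
```

Under this comparison 0x7c00075c and 0x7c00075d are treated as the same instruction, while 0x7c00075c and 0x7c0007dc remain distinct.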
      
      This commit changes the handling of the cases here described by treating
      the TM-related invalid forms that can generate a softpatch interrupt
      just like their valid forms (w/ bit 31 = 1) instead of as a 'nop' and by
      gently reporting any other unrecognized case to the host and treating it as
      illegal instruction instead of throwing a trace and treating it as a 'nop'.

      Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
      Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
      Acked-By: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      22840383
    • Jason Gunthorpe's avatar
      RDMA/cm: Remove a race freeing timewait_info · 851eba10
      Jason Gunthorpe authored
      [ Upstream commit bede86a3 ]
      
      When creating a cm_id during REQ the id immediately becomes visible to the
      other MAD handlers, and shortly after the state is moved to IB_CM_REQ_RCVD.
      
      This allows cm_rej_handler() to run concurrently and free the work:
      
              CPU 0                                CPU1
       cm_req_handler()
        ib_create_cm_id()
        cm_match_req()
          id_priv->state = IB_CM_REQ_RCVD
                                             cm_rej_handler()
                                               cm_acquire_id()
                                               spin_lock(&id_priv->lock)
                                               switch (id_priv->state)
                                               case IB_CM_REQ_RCVD:
                                                  cm_reset_to_idle()
                                                   kfree(id_priv->timewait_info);
         goto destroy
        destroy:
          kfree(id_priv->timewait_info);
                                                   id_priv->timewait_info = NULL
      
      Causing a double free or worse.
      
      Do not free the timewait_info without also holding the
      id_priv->lock. Simplify this entire flow by making the free unconditional
      during cm_destroy_id() and removing the confusing special case error
      unwind during creation of the timewait_info.
      
      This also fixes a leak of the timewait if cm_destroy_id() is called in
      IB_CM_ESTABLISHED with an XRC TGT QP. The state machine will be left in
      ESTABLISHED while it needed to transition through IB_CM_TIMEWAIT to
      release the timewait pointer.
      
      Also fix a leak of the timewait_info if the caller mis-uses the API and
      does ib_send_cm_reqs().
      
      Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
      Link: https://lore.kernel.org/r/20200310092545.251365-4-leon@kernel.org
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      851eba10
    • Trond Myklebust's avatar
      nfsd: Don't add locks to closed or closing open stateids · 1ab250aa
      Trond Myklebust authored
      [ Upstream commit a451b123 ]
      
      In NFSv4, the lock stateids are tied to the lockowner, and the open stateid,
      so that the action of closing the file also results in either an automatic
      loss of the locks, or an error of the form NFS4ERR_LOCKS_HELD.
      
      In practice this means we must not add new locks to the open stateid
      after the close process has been invoked. In fact, doing so can result
      in the following panic:
      
       kernel BUG at lib/list_debug.c:51!
       invalid opcode: 0000 [#1] SMP NOPTI
       CPU: 2 PID: 1085 Comm: nfsd Not tainted 5.6.0-rc3+ #2
       Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.14410784.B64.1908150010 08/15/2019
       RIP: 0010:__list_del_entry_valid.cold+0x31/0x55
       Code: 1a 3d 9b e8 74 10 c2 ff 0f 0b 48 c7 c7 f0 1a 3d 9b e8 66 10 c2 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 b0 1a 3d 9b e8 52 10 c2 ff <0f> 0b 48 89 fe 4c 89 c2 48 c7 c7 78 1a 3d 9b e8 3e 10 c2 ff 0f 0b
       RSP: 0018:ffffb296c1d47d90 EFLAGS: 00010246
       RAX: 0000000000000054 RBX: ffff8ba032456ec8 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: ffff8ba039e99cc8 RDI: ffff8ba039e99cc8
       RBP: ffff8ba032456e60 R08: 0000000000000781 R09: 0000000000000003
       R10: 0000000000000000 R11: 0000000000000001 R12: ffff8ba009a4abe0
       R13: ffff8ba032456e8c R14: 0000000000000000 R15: ffff8ba00adb01d8
       FS:  0000000000000000(0000) GS:ffff8ba039e80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fb213f0b008 CR3: 00000001347de006 CR4: 00000000003606e0
       Call Trace:
        release_lock_stateid+0x2b/0x80 [nfsd]
        nfsd4_free_stateid+0x1e9/0x210 [nfsd]
        nfsd4_proc_compound+0x414/0x700 [nfsd]
        ? nfs4svc_decode_compoundargs+0x407/0x4c0 [nfsd]
        nfsd_dispatch+0xc1/0x200 [nfsd]
        svc_process_common+0x476/0x6f0 [sunrpc]
        ? svc_sock_secure_port+0x12/0x30 [sunrpc]
        ? svc_recv+0x313/0x9c0 [sunrpc]
        ? nfsd_svc+0x2d0/0x2d0 [nfsd]
        svc_process+0xd4/0x110 [sunrpc]
        nfsd+0xe3/0x140 [nfsd]
        kthread+0xf9/0x130
        ? nfsd_destroy+0x50/0x50 [nfsd]
        ? kthread_park+0x90/0x90
        ret_from_fork+0x1f/0x40
      
      The fix is to ensure that lock creation tests for whether or not the
      open stateid is unhashed, and to fail if that is the case.
      
      Fixes: 659aefb6 ("nfsd: Ensure we don't recognise lock stateids after freeing them")
      Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1ab250aa
    • Alexandre Belloni's avatar
      rtc: ds1374: fix possible race condition · 142513a2
      Alexandre Belloni authored
      [ Upstream commit c11af813 ]
      
      The RTC IRQ is requested before the struct rtc_device is allocated,
      this may lead to a NULL pointer dereference in the IRQ handler.
      
      To fix this issue, allocate the rtc_device struct before requesting
      the RTC IRQ using devm_rtc_allocate_device, and use rtc_register_device
      to register the RTC device.
      
      Link: https://lore.kernel.org/r/20200306073404.56921-1-alexandre.belloni@bootlin.com
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      142513a2
    • Alexandre Belloni's avatar
      rtc: sa1100: fix possible race condition · e934a66d
      Alexandre Belloni authored
      [ Upstream commit f2997775 ]
      
      Both RTC IRQs are requested before the struct rtc_device is allocated,
      this may lead to a NULL pointer dereference in the IRQ handler.
      
      To fix this issue, allocate the rtc_device struct before requesting
      the IRQs using devm_rtc_allocate_device, and use rtc_register_device
      to register the RTC device.
      
      Link: https://lore.kernel.org/r/20200306010146.39762-1-alexandre.belloni@bootlin.com
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e934a66d
    • Stefan Berger's avatar
      tpm: ibmvtpm: Wait for buffer to be set before proceeding · abc5b427
      Stefan Berger authored
      [ Upstream commit d8d74ea3 ]
      
      Synchronize with the results from the CRQs before continuing with
      the initialization. This avoids trying to send TPM commands while
      the rtce buffer has not yet been allocated.
      
      This patch fixes an existing race condition that may occur if the
      hypervisor does not quickly respond to the VTPM_GET_RTCE_BUFFER_SIZE
      request sent during initialization and therefore the ibmvtpm->rtce_buf
      has not been allocated at the time the first TPM command is sent.
      
      Fixes: 132f7629 ("drivers/char/tpm: Add new device driver to support IBM vTPM")
      Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
      Acked-by: Nayna Jain <nayna@linux.ibm.com>
      Tested-by: Nayna Jain <nayna@linux.ibm.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      abc5b427
    • Dmitry Monakhov's avatar
      ext4: mark block bitmap corrupted when found instead of BUGON · ff331054
      Dmitry Monakhov authored
      [ Upstream commit eb576086 ]
      
      We already have similar code in ext4_mb_complex_scan_group(), but
      ext4_mb_simple_scan_group() is still affected.
      
      Other reports: https://www.spinics.net/lists/linux-ext4/msg60231.html
      
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Dmitry Monakhov <dmonakhov@gmail.com>
      Link: https://lore.kernel.org/r/20200310150156.641-1-dmonakhov@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ff331054
    • Darrick J. Wong's avatar
      xfs: mark dir corrupt when lookup-by-hash fails · 7fff3f7f
      Darrick J. Wong authored
      [ Upstream commit 2e107cf8 ]
      
      In xchk_dir_actor, we attempt to validate the directory hash structures
      by performing a directory entry lookup by (hashed) name.  If the lookup
      returns ENOENT, that means that the hash information is corrupt.  The
      _process_error functions don't catch this, so we have to add that
      explicitly.
      
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7fff3f7f
    • Darrick J. Wong's avatar
      xfs: don't ever return a stale pointer from __xfs_dir3_free_read · 6ab959f1
      Darrick J. Wong authored
      [ Upstream commit 1cb5deb5 ]
      
      If we decide that a directory free block is corrupt, we must take care
      not to leak a buffer pointer to the caller.  After xfs_trans_brelse
      returns, the buffer can be freed or reused, which means that we have to
      set *bpp back to NULL.
      
      Callers are supposed to notice the nonzero return value and not use the
      buffer pointer, but we should code more defensively, even if all current
      callers handle this situation correctly.
      
      Fixes: de14c5f5 ("xfs: verify free block header fields")
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6ab959f1
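      The defensive pattern described in the commit above (reset the
      out-parameter to NULL once the buffer has been released) can be
      sketched in plain C. This is not the XFS code: struct buf, buf_release
      and read_and_verify are hypothetical names, the error values are
      arbitrary, and malloc/free stand in for the buffer cache.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer handle; not struct xfs_buf. */
struct buf {
    int corrupt;   /* nonzero when the contents fail verification */
};

/* Hypothetical release helper: after it returns, the memory may be reused. */
static void buf_release(struct buf *bp)
{
    free(bp);
}

/*
 * Read a buffer and verify it.
 * On success: returns 0 and *bpp points at the buffer.
 * On failure: returns nonzero and *bpp is NULL, so that a caller that
 * forgets to check the return value cannot touch released memory.
 */
static int read_and_verify(int simulate_corruption, struct buf **bpp)
{
    *bpp = malloc(sizeof(**bpp));
    if (*bpp == NULL)
        return -1;
    (*bpp)->corrupt = simulate_corruption;

    if ((*bpp)->corrupt) {
        buf_release(*bpp);
        *bpp = NULL;   /* never hand a released pointer back to the caller */
        return -2;
    }
    return 0;
}
```

      Without the *bpp = NULL assignment the function would still return an
      error, but the caller's pointer would refer to released memory; clearing
      it turns a potential use-after-free into an easier-to-diagnose NULL
      dereference.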