1. 03 Sep, 2020 40 commits
    • usb: host: xhci: fix ep context print mismatch in debugfs · 567e1a91
      Li Jun authored
      commit 0077b1b2 upstream.
      
      dci is 0 based and xhci_get_ep_ctx() will do ep index increment to get
      the ep context.
      
      [rename dci to ep_index -Mathias]
      Cc: stable <stable@vger.kernel.org> # v4.15+
      Fixes: 02b6fdc2 ("usb: xhci: Add debugfs interface for xHCI driver")
      Signed-off-by: Li Jun <jun.li@nxp.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Link: https://lore.kernel.org/r/20200821091549.20556-2-mathias.nyman@linux.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN... · 32a4f37b
      Thomas Gleixner authored
      XEN uses irqdesc::irq_data_common::handler_data to store a per interrupt XEN data pointer which contains XEN specific information.
      
      commit c330fb1d upstream.
      
      handler data is meant for interrupt handlers and not for storing irq chip
      specific information as some devices require handler data to store internal
      per interrupt information, e.g. pinctrl/GPIO chained interrupt handlers.
      
      This obviously creates a conflict of interests and crashes the machine
      because the XEN pointer is overwritten by the driver pointer.
      
      As the XEN data is not handler specific it should be stored in
      irqdesc::irq_data::chip_data instead.
      
      A simple sed s/irq_[sg]et_handler_data/irq_[sg]et_chip_data/ cures that.
      
      Cc: stable@vger.kernel.org
      Reported-by: Roman Shaposhnik <roman@zededa.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Roman Shaposhnik <roman@zededa.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Link: https://lore.kernel.org/r/87lfi2yckt.fsf@nanos.tec.linutronix.de
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
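The pointer conflict described in the XEN commit above can be illustrated with a small user-space model. This is only a sketch: `toy_irq_desc`, `toy_xen_info` and `toy_gpio_bank` are hypothetical stand-ins, not the kernel's structures, and the model only shows why two owners cannot share one slot.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the two per-interrupt pointer slots named in the commit. */
struct toy_irq_desc {
	void *handler_data;	/* meant for interrupt handlers */
	void *chip_data;	/* meant for irq chip specific data */
};

struct toy_xen_info { int evtchn; };	/* hypothetical XEN per-irq data */
struct toy_gpio_bank { int base; };	/* hypothetical driver data */

/* A chained handler (e.g. pinctrl/GPIO) stores its own pointer. */
static void toy_driver_set_handler_data(struct toy_irq_desc *d,
					struct toy_gpio_bank *bank)
{
	assert(d != NULL);
	d->handler_data = bank;
}

/*
 * Returns 1 if the XEN pointer is still intact after the driver has
 * installed its handler data, 0 if it was overwritten.
 */
static int toy_xen_info_survives(int use_chip_data)
{
	struct toy_irq_desc desc = { NULL, NULL };
	struct toy_xen_info xen = { 42 };
	struct toy_gpio_bank bank = { 0 };
	void *seen;

	if (use_chip_data)
		desc.chip_data = &xen;		/* scheme after the fix */
	else
		desc.handler_data = &xen;	/* scheme before the fix */

	toy_driver_set_handler_data(&desc, &bank);

	seen = use_chip_data ? desc.chip_data : desc.handler_data;
	return seen == (void *)&xen;
}
```

Calling `toy_xen_info_survives(0)` models the old layout, where the driver's pointer replaces the XEN one; `toy_xen_info_survives(1)` models the layout after the fix.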
    • writeback: Fix sync livelock due to b_dirty_time processing · 7c3d77a3
      Jan Kara authored
      commit f9cae926 upstream.
      
      When we are processing writeback for sync(2), move_expired_inodes()
      didn't set any inode expiry value (older_than_this). This can result in
      writeback never completing if there's a steady stream of inodes added to
      the b_dirty_time list, as writeback rechecks dirty lists after each
      writeback round to see whether there's more work to be done. Fix the
      problem by using the sync(2) start time as the inode expiry value when
      processing the b_dirty_time list, similarly as for ordinarily dirtied
      inodes. This requires some refactoring of older_than_this handling which
      simplifies the code noticeably as a bonus.
      
      Fixes: 0ae45f63 ("vfs: add support for a lazytime mount option")
      CC: stable@vger.kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
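The livelock in the commit above can be reproduced in a toy simulation. The sketch below assumes one inode is written back per round and one new inode is dirtied per round; `sync_rounds()` and `MAX_ROUNDS` are hypothetical names, and the "cutoff" stands in for using the sync(2) start time as the expiry value.

```c
#include <assert.h>

#define MAX_ROUNDS 1000	/* arbitrary bound used to declare a livelock */

/*
 * pending_old: inodes dirtied before sync(2) started.
 * use_cutoff:  nonzero means only inodes older than the sync start
 *              time are eligible, as described in the commit.
 * Returns the number of rounds until no eligible work remains, or -1
 * if work was still pending after MAX_ROUNDS.
 */
static int sync_rounds(int pending_old, int use_cutoff)
{
	int old = pending_old;	/* dirtied before sync(2) started */
	int newer = 0;		/* dirtied after sync(2) started */
	int round;

	assert(pending_old >= 0);
	for (round = 0; round < MAX_ROUNDS; round++) {
		int eligible = use_cutoff ? old : old + newer;

		if (eligible == 0)
			return round;
		/* write back one eligible inode, oldest first */
		if (old > 0)
			old--;
		else
			newer--;
		/* steady stream: another inode gets dirtied meanwhile */
		newer++;
	}
	return -1;
}
```

With three old inodes, `sync_rounds(3, 1)` finishes after three rounds, while `sync_rounds(3, 0)` never runs out of work within the bound.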
    • writeback: Avoid skipping inode writeback · be0937e0
      Jan Kara authored
      commit 5afced3b upstream.
      
      Inode's i_io_list list head is used to attach inode to several different
      lists - wb->{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker
      prepares a list of inodes to writeback e.g. for sync(2), it moves inodes
      to b_io list. Thus it is critical for sync(2) data integrity guarantees
      that inode is not requeued to any other writeback list when inode is
      queued for processing by flush worker. That's the reason why
      writeback_single_inode() does not touch i_io_list (unless the inode is
      completely clean) and why __mark_inode_dirty() does not touch i_io_list
      if I_SYNC flag is set.
      
      However there are two flaws in the current logic:
      
      1) When inode has only I_DIRTY_TIME set but it is already queued in b_io
      list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC)
      can still move inode back to b_dirty list resulting in skipping
      writeback of inode time stamps during sync(2).
      
      2) When inode is on b_dirty_time list and writeback_single_inode() races
      with __mark_inode_dirty() like:
      
      writeback_single_inode()		__mark_inode_dirty(inode, I_DIRTY_PAGES)
        inode->i_state |= I_SYNC
        __writeback_single_inode()
      					  inode->i_state |= I_DIRTY_PAGES;
      					  if (inode->i_state & I_SYNC)
      					    bail
        if (!(inode->i_state & I_DIRTY_ALL))
        - not true so nothing done
      
      We end up with I_DIRTY_PAGES inode on b_dirty_time list and thus
      standard background writeback will not writeback this inode leading to
      possible dirty throttling stalls etc. (thanks to Martijn Coenen for this
      analysis).
      
      Fix these problems by tracking whether inode is queued in b_io or
      b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we
      know flush worker has queued inode and we should not touch i_io_list.
      On the other hand we also know that once flush worker is done with the
      inode it will requeue the inode to appropriate dirty list. When
      I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode
      to appropriate dirty list.

      Reported-by: Martijn Coenen <maco@android.com>
      Reviewed-by: Martijn Coenen <maco@android.com>
      Tested-by: Martijn Coenen <maco@android.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Fixes: 0ae45f63 ("vfs: add support for a lazytime mount option")
      CC: stable@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
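The role of the I_SYNC_QUEUED flag in the commit above can be sketched with a toy inode. This is a simplified model under stated assumptions: the `TOY_*` names are hypothetical, and the dirty path always requeues to one list, whereas the real code picks the appropriate dirty list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_I_DIRTY_SYNC	0x1u
#define TOY_I_DIRTY_TIME	0x2u
#define TOY_I_SYNC_QUEUED	0x4u

enum toy_wb_list { TOY_B_DIRTY, TOY_B_DIRTY_TIME, TOY_B_IO };

struct toy_inode {
	unsigned int state;
	enum toy_wb_list list;
};

/* Flush worker picks the inode up, e.g. for sync(2). */
static void toy_queue_io(struct toy_inode *inode)
{
	assert(inode != NULL);
	inode->list = TOY_B_IO;
	inode->state |= TOY_I_SYNC_QUEUED;
}

/*
 * honor_queued == 0 models the old behaviour (flaw 1 in the commit),
 * honor_queued != 0 models the behaviour described in the fix.
 */
static void toy_mark_inode_dirty(struct toy_inode *inode, unsigned int flags,
				 int honor_queued)
{
	assert(inode != NULL);
	inode->state |= flags;
	if (honor_queued && (inode->state & TOY_I_SYNC_QUEUED))
		return;	/* flush worker owns the list membership */
	inode->list = TOY_B_DIRTY;	/* simplified: always b_dirty */
}
```

In the old model an inode queued on b_io is pulled back to b_dirty by a concurrent dirtying; with the flag honoured it stays where the flush worker put it.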
    • writeback: Protect inode->i_io_list with inode->i_lock · 1f58ddc0
      Jan Kara authored
      commit b35250c0 upstream.
      
      Currently, operations on inode->i_io_list are protected by
      wb->list_lock. In the following patches we'll need to maintain
      consistency between inode->i_state and inode->i_io_list so change the
      code so that inode->i_lock protects also all inode's i_io_list handling.

      Reviewed-by: Martijn Coenen <maco@android.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      CC: stable@vger.kernel.org # Prerequisite for "writeback: Avoid skipping inode writeback"
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • serial: 8250: change lock order in serial8250_do_startup() · 2eb35a11
      Sergey Senozhatsky authored
      commit 205d300a upstream.
      
      We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
      lockdep reports coming from 8250 driver; this causes a bit of trouble
      to people, so let's fix it.
      
      The problem is reverse lock order in two different call paths:
      
      chain #1:
      
       serial8250_do_startup()
        spin_lock_irqsave(&port->lock);
         disable_irq_nosync(port->irq);
          raw_spin_lock_irqsave(&desc->lock)
      
      chain #2:
      
        __report_bad_irq()
         raw_spin_lock_irqsave(&desc->lock)
          for_each_action_of_desc()
           printk()
            spin_lock_irqsave(&port->lock);
      
      Fix this by changing the order of locks in serial8250_do_startup():
      do disable_irq_nosync() first, which grabs desc->lock, and grab
      uart->port after that, so that chain #1 and chain #2 have the same
      lock order.
      
      Full lockdep splat:
      
       ======================================================
       WARNING: possible circular locking dependency detected
       5.4.39 #55 Not tainted
       ======================================================
      
       swapper/0/0 is trying to acquire lock:
       ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57
      
       but task is already holding lock:
       ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #2 (&irq_desc_lock_class){-.-.}:
              _raw_spin_lock_irqsave+0x61/0x8d
              __irq_get_desc_lock+0x65/0x89
              __disable_irq_nosync+0x3b/0x93
              serial8250_do_startup+0x451/0x75c
              uart_startup+0x1b4/0x2ff
              uart_port_activate+0x73/0xa0
              tty_port_open+0xae/0x10a
              uart_open+0x1b/0x26
              tty_open+0x24d/0x3a0
              chrdev_open+0xd5/0x1cc
              do_dentry_open+0x299/0x3c8
              path_openat+0x434/0x1100
              do_filp_open+0x9b/0x10a
              do_sys_open+0x15f/0x3d7
              kernel_init_freeable+0x157/0x1dd
              kernel_init+0xe/0x105
              ret_from_fork+0x27/0x50
      
       -> #1 (&port_lock_key){-.-.}:
              _raw_spin_lock_irqsave+0x61/0x8d
              serial8250_console_write+0xa7/0x2a0
              console_unlock+0x3b7/0x528
              vprintk_emit+0x111/0x17f
              printk+0x59/0x73
              register_console+0x336/0x3a4
              uart_add_one_port+0x51b/0x5be
              serial8250_register_8250_port+0x454/0x55e
              dw8250_probe+0x4dc/0x5b9
              platform_drv_probe+0x67/0x8b
              really_probe+0x14a/0x422
              driver_probe_device+0x66/0x130
              device_driver_attach+0x42/0x5b
              __driver_attach+0xca/0x139
              bus_for_each_dev+0x97/0xc9
              bus_add_driver+0x12b/0x228
              driver_register+0x64/0xed
              do_one_initcall+0x20c/0x4a6
              do_initcall_level+0xb5/0xc5
              do_basic_setup+0x4c/0x58
              kernel_init_freeable+0x13f/0x1dd
              kernel_init+0xe/0x105
              ret_from_fork+0x27/0x50
      
       -> #0 (console_owner){-...}:
              __lock_acquire+0x118d/0x2714
              lock_acquire+0x203/0x258
              console_lock_spinning_enable+0x51/0x57
              console_unlock+0x25d/0x528
              vprintk_emit+0x111/0x17f
              printk+0x59/0x73
              __report_bad_irq+0xa3/0xba
              note_interrupt+0x19a/0x1d6
              handle_irq_event_percpu+0x57/0x79
              handle_irq_event+0x36/0x55
              handle_fasteoi_irq+0xc2/0x18a
              do_IRQ+0xb3/0x157
              ret_from_intr+0x0/0x1d
              cpuidle_enter_state+0x12f/0x1fd
              cpuidle_enter+0x2e/0x3d
              do_idle+0x1ce/0x2ce
              cpu_startup_entry+0x1d/0x1f
              start_kernel+0x406/0x46a
              secondary_startup_64+0xa4/0xb0
      
       other info that might help us debug this:
      
       Chain exists of:
         console_owner --> &port_lock_key --> &irq_desc_lock_class
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&irq_desc_lock_class);
                                      lock(&port_lock_key);
                                      lock(&irq_desc_lock_class);
         lock(console_owner);
      
        *** DEADLOCK ***
      
       2 locks held by swapper/0/0:
        #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
        #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181
      
       stack backtrace:
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55
       Hardware name: XXXXXX
       Call Trace:
        <IRQ>
        dump_stack+0xbf/0x133
        ? print_circular_bug+0xd6/0xe9
        check_noncircular+0x1b9/0x1c3
        __lock_acquire+0x118d/0x2714
        lock_acquire+0x203/0x258
        ? console_lock_spinning_enable+0x31/0x57
        console_lock_spinning_enable+0x51/0x57
        ? console_lock_spinning_enable+0x31/0x57
        console_unlock+0x25d/0x528
        ? console_trylock+0x18/0x4e
        vprintk_emit+0x111/0x17f
        ? lock_acquire+0x203/0x258
        printk+0x59/0x73
        __report_bad_irq+0xa3/0xba
        note_interrupt+0x19a/0x1d6
        handle_irq_event_percpu+0x57/0x79
        handle_irq_event+0x36/0x55
        handle_fasteoi_irq+0xc2/0x18a
        do_IRQ+0xb3/0x157
        common_interrupt+0xf/0xf
        </IRQ>

      Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Fixes: 768aec0b ("serial: 8250: fix shared interrupts issues with SMP and RT kernels")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Reported-by: Raul Rangel <rrangel@google.com>
      BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
      Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200817022646.1484638-1-sergey.senozhatsky@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
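The lock inversion in the 8250 commit above can be modelled with a tiny order tracker. This is a loose sketch in the spirit of lockdep, not its algorithm: all `toy_*` names are hypothetical, and the "new" startup path assumes disable_irq_nosync() takes and drops desc->lock before the port lock is taken.

```c
#include <assert.h>
#include <string.h>

enum toy_lock_id { TOY_PORT_LOCK, TOY_DESC_LOCK, TOY_NLOCKS };

struct toy_lockdep {
	int held[TOY_NLOCKS];
	int edge[TOY_NLOCKS][TOY_NLOCKS];	/* edge[a][b]: b taken while a held */
	int inversion;
};

static void toy_lockdep_init(struct toy_lockdep *ld)
{
	memset(ld, 0, sizeof(*ld));
}

static void toy_acquire(struct toy_lockdep *ld, enum toy_lock_id l)
{
	int h;

	assert(ld != NULL && !ld->held[l]);
	for (h = 0; h < TOY_NLOCKS; h++) {
		if (!ld->held[h])
			continue;
		ld->edge[h][l] = 1;
		if (ld->edge[l][h])
			ld->inversion = 1;	/* both orders were seen */
	}
	ld->held[l] = 1;
}

static void toy_release(struct toy_lockdep *ld, enum toy_lock_id l)
{
	ld->held[l] = 0;
}

/* Chain #1 before the fix: desc lock taken inside the port lock. */
static void toy_startup_old(struct toy_lockdep *ld)
{
	toy_acquire(ld, TOY_PORT_LOCK);
	toy_acquire(ld, TOY_DESC_LOCK);	/* disable_irq_nosync() */
	toy_release(ld, TOY_DESC_LOCK);
	toy_release(ld, TOY_PORT_LOCK);
}

/* Chain #1 after the fix: desc lock first, port lock afterwards. */
static void toy_startup_new(struct toy_lockdep *ld)
{
	toy_acquire(ld, TOY_DESC_LOCK);	/* disable_irq_nosync() */
	toy_release(ld, TOY_DESC_LOCK);
	toy_acquire(ld, TOY_PORT_LOCK);
	toy_release(ld, TOY_PORT_LOCK);
}

/* Chain #2: printk() under desc lock ends up taking the port lock. */
static void toy_report_bad_irq(struct toy_lockdep *ld)
{
	toy_acquire(ld, TOY_DESC_LOCK);
	toy_acquire(ld, TOY_PORT_LOCK);
	toy_release(ld, TOY_PORT_LOCK);
	toy_release(ld, TOY_DESC_LOCK);
}
```

Running `toy_startup_old()` followed by `toy_report_bad_irq()` sets `inversion`; replacing the first call with `toy_startup_new()` does not.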
    • serial: 8250_exar: Fix number of ports for Commtech PCIe cards · ce755e4e
      Valmer Huhn authored
      commit c6b9e95d upstream.
      
      The following in 8250_exar.c line 589 is used to determine the number
      of ports for each Exar board:
      
      nr_ports = board->num_ports ? board->num_ports : pcidev->device & 0x0f;
      
      If the number of ports a card has is not explicitly specified, it defaults
      to the rightmost 4 bits of the PCI device ID. This is prone to error since
      not all PCI device IDs contain a number which corresponds to the number of
      ports that card provides.
      
      This particular case involves COMMTECH_4222PCIE, COMMTECH_4224PCIE and
      COMMTECH_4228PCIE cards with device IDs 0x0022, 0x0020 and 0x0021.
      Currently the multiport cards receive 2, 0 and 1 port instead of 2, 4 and
      8 ports respectively.
      
      To fix this, each Commtech Fastcom PCIe card is given a struct where the
      number of ports is explicitly specified. This ensures 'board->num_ports'
      is used instead of the default 'pcidev->device & 0x0f'.
      
      Fixes: d0aeaa83 ("serial: exar: split out the exar code from 8250_pci")
      Signed-off-by: Valmer Huhn <valmer.huhn@concurrent-rt.com>
      Tested-by: Valmer Huhn <valmer.huhn@concurrent-rt.com>
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200813165255.GC345440@icarus.concurrent-rt.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
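The port-count fallback quoted in the 8250_exar commit above is easy to replay outside the driver. The helper below mirrors the quoted expression; `toy_exar_nr_ports()` and `toy_exar_demo()` are hypothetical names, and the device IDs are the ones listed in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * board_num_ports == 0 means the board entry does not specify a port
 * count, so the low 4 bits of the PCI device ID are used instead.
 */
static unsigned int toy_exar_nr_ports(unsigned int board_num_ports,
				      uint16_t pci_device_id)
{
	return board_num_ports ? board_num_ports
			       : (unsigned int)(pci_device_id & 0x0f);
}

/* Replays the three Commtech device IDs from the commit message. */
void toy_exar_demo(void)
{
	/* Fallback: 4222, 4224 and 4228 cards get 2, 0 and 1 ports. */
	assert(toy_exar_nr_ports(0, 0x0022) == 2);
	assert(toy_exar_nr_ports(0, 0x0020) == 0);
	assert(toy_exar_nr_ports(0, 0x0021) == 1);
	/* Explicit counts: 2, 4 and 8 ports regardless of the ID. */
	assert(toy_exar_nr_ports(2, 0x0022) == 2);
	assert(toy_exar_nr_ports(4, 0x0020) == 4);
	assert(toy_exar_nr_ports(8, 0x0021) == 8);
}
```

The fallback only happens to be right for the 0x0022 card, which is why the fix specifies the count explicitly for every Commtech board.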
    • serial: pl011: Don't leak amba_ports entry on driver register error · e12f3622
      Lukas Wunner authored
      commit 89efbe70 upstream.
      
      pl011_probe() calls pl011_setup_port() to reserve an amba_ports[] entry,
      then calls pl011_register_port() to register the uart driver with the
      tty layer.
      
      If registration of the uart driver fails, the amba_ports[] entry is not
      released.  If this happens 14 times (value of UART_NR macro), then all
      amba_ports[] entries will have been leaked and driver probing is no
      longer possible.  (To be fair, that can only happen if the DeviceTree
      doesn't contain alias IDs since they cause the same entry to be used for
      a given port.)   Fix it.
      
      Fixes: ef2889f7 ("serial: pl011: Move uart_register_driver call to device")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Cc: stable@vger.kernel.org # v3.15+
      Cc: Tushar Behera <tushar.behera@linaro.org>
      Link: https://lore.kernel.org/r/138f8c15afb2f184d8102583f8301575566064a6.1597316167.git.lukas@wunner.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
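The slot leak in the pl011 commit above follows a common reserve-then-register pattern. The sketch below models it with a fixed table of 14 entries (the UART_NR value given in the commit); every `toy_*` name and the plain negative return codes are assumptions for illustration, not the driver's code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UART_NR 14	/* value of UART_NR per the commit message */

struct toy_port { int id; };

static struct toy_port toy_storage[TOY_UART_NR];
static struct toy_port *toy_ports[TOY_UART_NR];

/* Reserve the first free slot; returns its index or -1 if full. */
static int toy_setup_port(void)
{
	int i;

	for (i = 0; i < TOY_UART_NR; i++) {
		if (toy_ports[i] == NULL) {
			toy_ports[i] = &toy_storage[i];
			return i;
		}
	}
	return -1;
}

/* Registration stub: fails when should_fail is nonzero. */
static int toy_register_port(int should_fail)
{
	return should_fail ? -1 : 0;
}

/* release_on_error selects between the leaky and the fixed flow. */
static int toy_probe(int should_fail, int release_on_error)
{
	int idx = toy_setup_port();

	if (idx < 0)
		return -1;	/* no slot left: probing impossible */
	if (toy_register_port(should_fail) != 0) {
		if (release_on_error)
			toy_ports[idx] = NULL;	/* give the slot back */
		return -2;
	}
	return idx;
}

static int toy_free_slots(void)
{
	int i, n = 0;

	for (i = 0; i < TOY_UART_NR; i++)
		n += (toy_ports[i] == NULL);
	return n;
}

static void toy_reset(void)
{
	int i;

	for (i = 0; i < TOY_UART_NR; i++)
		toy_ports[i] = NULL;
	assert(toy_free_slots() == TOY_UART_NR);
}
```

After 14 failed probes without the release, `toy_free_slots()` is 0 and even a probe that would succeed gets no slot; with the release, the table stays fully available.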
    • serial: pl011: Fix oops on -EPROBE_DEFER · eec2f7d9
      Lukas Wunner authored
      commit 27afac93 upstream.
      
      If probing of a pl011 gets deferred until after free_initmem(), an oops
      ensues because pl011_console_match() is called after it has been freed.
      
      Fix by removing the __init attribute from the function and those it
      calls.
      
      Commit 10879ae5 ("serial: pl011: add console matching function")
      introduced pl011_console_match() not just for early consoles but
      regular preferred consoles, such as those added by acpi_parse_spcr().
      Regular consoles may be registered after free_initmem() for various
      reasons, one being deferred probing, another being dynamic enablement
      of serial ports using a DeviceTree overlay.
      
      Thus, pl011_console_match() must not be declared __init and the
      functions it calls mustn't either.
      
      Stack trace for posterity:
      
      Unable to handle kernel paging request at virtual address 80c38b58
      Internal error: Oops: 8000000d [#1] PREEMPT SMP ARM
      PC is at pl011_console_match+0x0/0xfc
      LR is at register_console+0x150/0x468
      [<80187004>] (register_console)
      [<805a8184>] (uart_add_one_port)
      [<805b2b68>] (pl011_register_port)
      [<805b3ce4>] (pl011_probe)
      [<80569214>] (amba_probe)
      [<805ca088>] (really_probe)
      [<805ca2ec>] (driver_probe_device)
      [<805ca5b0>] (__device_attach_driver)
      [<805c8060>] (bus_for_each_drv)
      [<805c9dfc>] (__device_attach)
      [<805ca630>] (device_initial_probe)
      [<805c90a8>] (bus_probe_device)
      [<805c95a8>] (deferred_probe_work_func)
      
      Fixes: 10879ae5 ("serial: pl011: add console matching function")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Cc: stable@vger.kernel.org # v4.10+
      Cc: Aleksey Makarov <amakarov@marvell.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Christopher Covington <cov@codeaurora.org>
      Link: https://lore.kernel.org/r/f827ff09da55b8c57d316a1b008a137677b58921.1597315557.git.lukas@wunner.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • serial: samsung: Removes the IRQ not found warning · 8a0d860c
      Tamseel Shams authored
      commit 8c6c378b upstream.
      
      In a few older Samsung SoCs like s3c2410, s3c2412
      and s3c2440, the UART IP has 2 interrupt lines.
      However, in other SoCs like s3c6400, s5pv210,
      exynos5433, and exynos4210, the UART has only 1
      interrupt line. Due to this, the "platform_get_irq(platdev, 1)"
      call in the driver gives the following false-positive error
      on newer SoCs: "IRQ index 1 not found".

      This patch adds a condition to check for the Tx interrupt
      only on those SoCs which have 2 interrupt lines.

      Tested-by: Alim Akhtar <alim.akhtar@samsung.com>
      Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
      Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
      Signed-off-by: Tamseel Shams <m.shams@samsung.com>
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200810030021.45348-1-m.shams@samsung.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vt_ioctl: change VT_RESIZEX ioctl to check for error return from vc_resize() · 1221d11e
      George Kennedy authored
      commit bc5269ca upstream.
      
      vc_resize() can return with an error after failure. Change VT_RESIZEX ioctl
      to save struct vc_data values that are modified and restore the original
      values in case of error.

      Signed-off-by: George Kennedy <george.kennedy@oracle.com>
      Cc: stable <stable@vger.kernel.org>
      Reported-by: syzbot+38a3699c7eaf165b97a6@syzkaller.appspotmail.com
      Link: https://lore.kernel.org/r/1596213192-6635-2-git-send-email-george.kennedy@oracle.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
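The save-and-restore approach described in the VT_RESIZEX commit above can be sketched as follows. This is a minimal model, assuming a stub resize that fails for out-of-range sizes; `toy_console`, `toy_resize()` and `toy_resizex()` are hypothetical names and do not reflect the real struct vc_data layout.

```c
#include <assert.h>
#include <stddef.h>

struct toy_console {
	unsigned int scan_lines;
	unsigned int font_height;
	unsigned int rows;
	unsigned int cols;
};

/* Stub standing in for a resize that can fail with an error. */
static int toy_resize(struct toy_console *vc, unsigned int cols,
		      unsigned int rows)
{
	if (cols == 0 || rows == 0 || cols > 1024 || rows > 1024)
		return -1;
	vc->cols = cols;
	vc->rows = rows;
	return 0;
}

/* Modify fields first, then undo the changes if the resize fails. */
static int toy_resizex(struct toy_console *vc, unsigned int scan_lines,
		       unsigned int font_height, unsigned int cols,
		       unsigned int rows)
{
	unsigned int save_scan_lines, save_font_height;
	int ret;

	assert(vc != NULL);
	save_scan_lines = vc->scan_lines;
	save_font_height = vc->font_height;

	if (scan_lines)
		vc->scan_lines = scan_lines;
	if (font_height)
		vc->font_height = font_height;

	ret = toy_resize(vc, cols, rows);
	if (ret) {
		/* restore the original values on error */
		vc->scan_lines = save_scan_lines;
		vc->font_height = save_font_height;
	}
	return ret;
}
```

A failed call leaves the console exactly as it was, so a later successful call starts from consistent values.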
    • vt: defer kfree() of vc_screenbuf in vc_do_resize() · c1fe757d
      Tetsuo Handa authored
      commit f8d1653d upstream.
      
      syzbot is reporting UAF bug in set_origin() from vc_do_resize() [1], for
      vc_do_resize() calls kfree(vc->vc_screenbuf) before calling set_origin().
      
      Unfortunately, in set_origin(), vc->vc_sw->con_set_origin() might access
      vc->vc_pos when scroll is involved in order to manipulate cursor, but
      vc->vc_pos refers to the already released vc->vc_screenbuf until vc->vc_pos
      gets updated based on the result of vc->vc_sw->con_set_origin().
      
      Preserving old buffer and tolerating outdated vc members until set_origin()
      completes would be easier than preventing vc->vc_sw->con_set_origin() from
      accessing outdated vc members.
      
      [1] https://syzkaller.appspot.com/bug?id=6649da2081e2ebdc65c0642c214b27fe91099db3

      Reported-by: syzbot <syzbot+9116ecc1978ca3a12f43@syzkaller.appspotmail.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/1596034621-4714-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
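The deferred free described in the vt commit above amounts to keeping the old buffer alive until every stale pointer into it has been refreshed. The user-space sketch below shows that ordering; `toy_screen`, `toy_set_origin()` and `toy_do_resize()` are hypothetical, and the callback simply reads through the stale cursor pointer to stand in for a console driver hook.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_screen {
	unsigned short *buf;	/* screen buffer */
	unsigned short *pos;	/* cursor pointer into buf */
	size_t cells;
};

/* Stand-in for a hook that still dereferences the old cursor pointer. */
static unsigned short toy_set_origin(const struct toy_screen *s)
{
	return *s->pos;
}

static int toy_do_resize(struct toy_screen *s, size_t new_cells,
			 unsigned short *seen)
{
	unsigned short *newbuf, *oldbuf;
	size_t ncopy;

	assert(s != NULL && seen != NULL);
	if (new_cells == 0)
		return -1;
	newbuf = calloc(new_cells, sizeof(*newbuf));
	if (newbuf == NULL)
		return -1;

	ncopy = new_cells < s->cells ? new_cells : s->cells;
	memcpy(newbuf, s->buf, ncopy * sizeof(*newbuf));

	oldbuf = s->buf;	/* preserve the old buffer for now */
	s->buf = newbuf;
	s->cells = new_cells;

	*seen = toy_set_origin(s);	/* s->pos still points into oldbuf */
	s->pos = s->buf;		/* refresh the stale pointer */
	free(oldbuf);			/* only now release the old buffer */
	return 0;
}
```

Freeing `oldbuf` before the call to `toy_set_origin()` would turn the read through `s->pos` into a use-after-free, which is the bug class the commit addresses.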
    • USB: lvtest: return proper error code in probe · 7c451eae
      Evgeny Novikov authored
      commit 53141249 upstream.
      
      lvs_rh_probe() can return a nonnegative value from usb_control_msg()
      when that value is less than "USB_DT_HUB_NONVAR_SIZE + 2", even though
      such a short transfer is considered a failure. Make lvs_rh_probe()
      return -EINVAL in this case.
      
      Found by Linux Driver Verification project (linuxtesting.org).

      Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
      Cc: stable <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/20200805090643.3432-1-novikov@ispras.ru
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
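The return-code handling described in the lvtest commit above can be sketched as a small helper. The value 7 for `TOY_HUB_NONVAR_SIZE` is an assumption meant to mirror USB_DT_HUB_NONVAR_SIZE, and `toy_check_descriptor_read()` is a hypothetical name; `ret` models the result of a control transfer (negative errno or number of bytes transferred).

```c
#include <assert.h>
#include <errno.h>

#define TOY_HUB_NONVAR_SIZE 7	/* assumed to match USB_DT_HUB_NONVAR_SIZE */

/* Map a transfer result to what a probe function should return. */
static int toy_check_descriptor_read(int ret)
{
	if (ret < 0)
		return ret;		/* propagate the transfer error */
	if (ret < TOY_HUB_NONVAR_SIZE + 2)
		return -EINVAL;		/* short read is a failure too */
	return 0;
}

/* Small usage example covering the three cases. */
void toy_lvs_demo(void)
{
	assert(toy_check_descriptor_read(-EIO) == -EIO);
	assert(toy_check_descriptor_read(5) == -EINVAL);
	assert(toy_check_descriptor_read(9) == 0);
}
```

Without the middle branch, a short read such as 5 bytes would be returned as a positive value from probe instead of an error code.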
    • fbcon: prevent user font height or width change from causing potential out-of-bounds access · 72f09980
      George Kennedy authored
      commit 39b3cffb upstream.
      
      Add a check to fbcon_resize() to ensure that a possible change to user font
      height or user font width will not allow a font data out-of-bounds access.
      NOTE: must use original charcount in calculation as font charcount can
      change and cannot be used to determine the font data allocated size.

      Signed-off-by: George Kennedy <george.kennedy@oracle.com>
      Cc: stable <stable@vger.kernel.org>
      Reported-by: syzbot+38a3699c7eaf165b97a6@syzkaller.appspotmail.com
      Link: https://lore.kernel.org/r/1596213192-6635-1-git-send-email-george.kennedy@oracle.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
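The bounds check described in the fbcon commit above can be approximated like this. The size formula (height times bytes per row times character count) and all `toy_*` names are assumptions for illustration; the point is only that the check uses the original character count and the allocated size recorded when the font was set.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_font {
	unsigned int orig_charcount;	/* count when the data was allocated */
	size_t alloc_size;		/* bytes actually allocated */
};

/* Returns 0 if glyphs of the new size stay within the allocation. */
static int toy_font_resize_ok(const struct toy_font *f,
			      unsigned int new_width,
			      unsigned int new_height)
{
	size_t pitch, needed;

	assert(f != NULL);
	pitch = (new_width + 7u) / 8u;	/* bytes per glyph row */
	if (pitch == 0 || new_height == 0)
		return -EINVAL;

	needed = (size_t)new_height * pitch * f->orig_charcount;
	if (needed > f->alloc_size)
		return -EINVAL;		/* would read past the font data */
	return 0;
}
```

For a 256-glyph 8x16 font (4096 bytes), doubling either the height or the width is rejected, while shrinking to 8x8 is accepted.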
    • btrfs: fix space cache memory leak after transaction abort · b0186a11
      Filipe Manana authored
      commit bbc37d6e upstream.
      
      If a transaction aborts it can cause a memory leak of the pages array of
      a block group's io_ctl structure. The following steps explain how that can
      happen:
      
      1) Transaction N is committing, currently in state TRANS_STATE_UNBLOCKED
         and it's about to start writing out dirty extent buffers;
      
      2) Transaction N + 1 already started and another task, task A, just called
         btrfs_commit_transaction() on it;
      
      3) Block group B was dirtied (extents allocated from it) by transaction
         N + 1, so when task A calls btrfs_start_dirty_block_groups(), at the
         very beginning of the transaction commit, it starts writeback for the
         block group's space cache by calling btrfs_write_out_cache(), which
         allocates the pages array for the block group's io_ctl with a call to
         io_ctl_init(). Block group B is added to the io_list of transaction
         N + 1 by btrfs_start_dirty_block_groups();
      
      4) While transaction N's commit is writing out the extent buffers, it gets
         an IO error and aborts transaction N, also setting the file system to
         RO mode;
      
      5) Task A has already returned from btrfs_start_dirty_block_groups(), is at
         btrfs_commit_transaction() and has set transaction N + 1 state to
         TRANS_STATE_COMMIT_START. Immediately after that it checks that the
         filesystem was turned to RO mode, due to transaction N's abort, and
         jumps to the "cleanup_transaction" label. After that we end up at
         btrfs_cleanup_one_transaction() which calls btrfs_cleanup_dirty_bgs().
         That helper finds block group B in the transaction's io_list but it
         never releases the pages array of the block group's io_ctl, resulting in
         a memory leak.
      
      In fact at the point when we are at btrfs_cleanup_dirty_bgs(), the pages
      array points to pages that were already released by us at
      __btrfs_write_out_cache() through the call to io_ctl_drop_pages(). We end
      up freeing the pages array only after waiting for the ordered extent to
      complete through btrfs_wait_cache_io(), which calls io_ctl_free() to do
      that. But in the transaction abort case we don't wait for the space cache's
      ordered extent to complete through a call to btrfs_wait_cache_io(), so
      that's why we end up with a memory leak - we wait for the ordered extent
      to complete indirectly by shutting down the work queues and waiting for
      any jobs in them to complete before returning from close_ctree().
      
      We can solve the leak simply by freeing the pages array right after
      releasing the pages (with the call to io_ctl_drop_pages()) at
      __btrfs_write_out_cache(), since we will never use it anymore after that
      and the pages array points to already released pages at that point, which
      is currently not a problem since no one will use it after that, but not a
      good practice anyway since it can easily lead to use-after-free issues.
      
      So fix this by freeing the pages array right after releasing the pages at
      __btrfs_write_out_cache().
      
      This issue can often be reproduced with test case generic/475 from fstests
      and kmemleak can detect it and reports it with the following trace:
      
      unreferenced object 0xffff9bbf009fa600 (size 512):
        comm "fsstress", pid 38807, jiffies 4298504428 (age 22.028s)
        hex dump (first 32 bytes):
          00 a0 7c 4d 3d ed ff ff 40 a0 7c 4d 3d ed ff ff  ..|M=...@.|M=...
          80 a0 7c 4d 3d ed ff ff c0 a0 7c 4d 3d ed ff ff  ..|M=.....|M=...
        backtrace:
          [<00000000f4b5cfe2>] __kmalloc+0x1a8/0x3e0
          [<0000000028665e7f>] io_ctl_init+0xa7/0x120 [btrfs]
          [<00000000a1f95b2d>] __btrfs_write_out_cache+0x86/0x4a0 [btrfs]
          [<00000000207ea1b0>] btrfs_write_out_cache+0x7f/0xf0 [btrfs]
          [<00000000af21f534>] btrfs_start_dirty_block_groups+0x27b/0x580 [btrfs]
          [<00000000c3c23d44>] btrfs_commit_transaction+0xa6f/0xe70 [btrfs]
          [<000000009588930c>] create_subvol+0x581/0x9a0 [btrfs]
          [<000000009ef2fd7f>] btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
          [<00000000474e5187>] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
          [<00000000708ee349>] btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs]
          [<00000000ea60106f>] btrfs_ioctl+0x12c/0x3130 [btrfs]
          [<000000005c923d6d>] __x64_sys_ioctl+0x83/0xb0
          [<0000000043ace2c9>] do_syscall_64+0x33/0x80
          [<00000000904efbce>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • btrfs: reset compression level for lzo on remount · ee203be4
      Marcos Paulo de Souza authored
      commit 282dd7d7 upstream.
      
      Currently a user can set mount "-o compress" which will set the
      compression algorithm to zlib, and use the default compress level for
      zlib (3):
      
        relatime,compress=zlib:3,space_cache
      
      If the user remounts the fs using "-o compress=lzo", then the old
      compress_level is used:
      
        relatime,compress=lzo:3,space_cache
      
      But lzo does not expose any tunable compression level. The same happens
      if we set any compress argument with different level, also with zstd.
      
      Fix this by resetting the compress_level when compress=lzo is
      specified.  With the fix applied, lzo is shown without compress level:
      
        relatime,compress=lzo,space_cache
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
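The stale compression level from the btrfs remount commit above can be reproduced with a toy option parser. The sketch below is not the btrfs parser: `toy_opts`, `toy_apply_compress()` and `toy_show()` are hypothetical, and only the two algorithms needed for the example are modelled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum toy_compress { TOY_ZLIB, TOY_LZO };

struct toy_opts {
	enum toy_compress type;
	unsigned int level;	/* 0 means "no level shown" */
};

/* reset_lzo_level selects between the old and the fixed behaviour. */
static void toy_apply_compress(struct toy_opts *o, const char *arg,
			       int reset_lzo_level)
{
	assert(o != NULL && arg != NULL);
	if (strcmp(arg, "lzo") == 0) {
		o->type = TOY_LZO;
		if (reset_lzo_level)
			o->level = 0;	/* lzo has no tunable level */
	} else if (strcmp(arg, "zlib") == 0) {
		o->type = TOY_ZLIB;
		o->level = 3;		/* default zlib level per the commit */
	}
}

/* Format the option the way it would appear in the mount options. */
static void toy_show(const struct toy_opts *o, char *buf, size_t len)
{
	const char *name = (o->type == TOY_LZO) ? "lzo" : "zlib";

	if (o->level)
		snprintf(buf, len, "compress=%s:%u", name, o->level);
	else
		snprintf(buf, len, "compress=%s", name);
}
```

Applying "zlib" and then "lzo" without the reset yields `compress=lzo:3`; with the reset it yields `compress=lzo`, matching the before and after strings in the commit message.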
    • blk-mq: order adding requests to hctx->dispatch and checking SCHED_RESTART · 77064570
      Ming Lei authored
      commit d7d8535f upstream.
      
      The SCHED_RESTART code path is relied on to re-run the queue for
      dispatching requests in hctx->dispatch. Meanwhile, the SCHED_RESTART flag
      is checked when adding requests to hctx->dispatch.

      Memory barriers have to be used for ordering the following two pairs of
      operations:

      1) adding requests to hctx->dispatch and checking SCHED_RESTART in
      blk_mq_dispatch_rq_list()

      2) clearing SCHED_RESTART and checking if there is a request in
      hctx->dispatch in blk_mq_sched_restart().

      Without the added memory barriers, either:

      1) blk_mq_sched_restart() may miss requests added to hctx->dispatch while
      blk_mq_dispatch_rq_list() observes SCHED_RESTART and does not run the
      queue on the dispatch side

      or

      2) blk_mq_dispatch_rq_list() still sees SCHED_RESTART and does not run the
      queue on the dispatch side, while the check for requests in
      hctx->dispatch from blk_mq_sched_restart() is missed.
      
      IO hang in ltp/fs_fill test is reported by kernel test robot:
      
      	https://lkml.org/lkml/2020/7/26/77
      
      Turns out it is caused by the above out-of-order OPs. And the IO hang
      can't be observed any more after applying this patch.
      
      Fixes: bd166ef1 ("blk-mq-sched: add framework for MQ capable IO schedulers")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Jeffery <djeffery@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      77064570
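
      The ordering requirement in the blk-mq commit above can be sketched in
      user space with C11 atomics. This is only a model under stated
      assumptions: struct hctx_model and both function names are hypothetical,
      a counter stands in for the hctx->dispatch list, and a sequentially
      consistent fence stands in for the kernel's smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for hctx->dispatch and the SCHED_RESTART bit. */
struct hctx_model {
    atomic_int  dispatch_len;   /* requests parked on the "dispatch" list */
    atomic_bool sched_restart;  /* models the SCHED_RESTART flag */
};

/*
 * Dispatch side: park a request, then check the flag.
 * Returns true when this side has to re-run the queue itself.
 */
static bool model_dispatch_side(struct hctx_model *h)
{
    atomic_fetch_add_explicit(&h->dispatch_len, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* stands in for smp_mb() */
    return !atomic_load_explicit(&h->sched_restart, memory_order_relaxed);
}

/*
 * Restart side: clear the flag, then look for parked requests.
 * Returns true when this side has to run the queue.
 */
static bool model_restart_side(struct hctx_model *h)
{
    if (!atomic_load_explicit(&h->sched_restart, memory_order_relaxed))
        return false;
    atomic_store_explicit(&h->sched_restart, false, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* stands in for smp_mb() */
    return atomic_load_explicit(&h->dispatch_len, memory_order_relaxed) > 0;
}
```

      With both fences in place, at least one of the two sides observes the
      other's update, so a parked request is always picked up by someone. If
      either fence were dropped, both functions could return false for the
      same request, which corresponds to the hang described in the commit.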
    • Hans de Goede's avatar
      HID: i2c-hid: Always sleep 60ms after I2C_HID_PWR_ON commands · 30028c32
      Hans de Goede authored
      commit eef40162 upstream.
      
      Before this commit i2c_hid_parse() consists of the following steps:
      
      1. Send power on cmd
      2. usleep_range(1000, 5000)
      3. Send reset cmd
      4. Wait for reset to complete (device interrupt, or msleep(100))
      5. Send power on cmd
      6. Try to read HID descriptor
      
      Notice how there is a usleep_range(1000, 5000) after the first power-on
      command, but not after the second power-on command.
      
      Testing has shown that at least on the BMAX Y13 laptop's i2c-hid touchpad,
      not having a delay after the second power-on command causes the HID
      descriptor to read as all zeros.
      
      In case we hit this on other devices too, the descriptor being all zeros
      can be recognized by the following message being logged many, many times:
      
      hid-generic 0018:0911:5288.0002: unknown main item tag 0x0
      
      At the same time as the BMAX Y13's touchpad issue was debugged,
      Kai-Heng was working on debugging some issues with Goodix i2c-hid
      touchpads. It turns out that these need a delay after a PWR_ON command
      too, otherwise they stop working after a suspend/resume cycle.
      According to Goodix a minimum delay of 60ms is needed.
      
      Having multiple cases where we need a delay after sending the power-on
      command seems to indicate that we should always sleep after the power-on
      command.
      
      This commit fixes the mentioned issues by moving the existing 1ms sleep to
      the i2c_hid_set_power() function and changing it to a 60ms sleep.
      
      Cc: stable@vger.kernel.org
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208247
      Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Reported-and-tested-by: Andrea Borgia <andrea@borgia.bo.it>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      30028c32
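
      The idea of the i2c-hid commit above, putting the delay inside the
      power helper so that every power-on is followed by it, can be sketched
      as below. This is a user-space model, not the driver code: struct
      dev_model, model_msleep() and model_set_power() are hypothetical names,
      and the "sleep" only records the requested delay.

```c
/* Hypothetical power states, mirroring the on/sleep pair in the commit text. */
enum power_state_model { MODEL_PWR_ON, MODEL_PWR_SLEEP };

struct dev_model {
    unsigned int slept_ms;   /* total delay requested so far */
    unsigned int cmds_sent;  /* number of power commands "sent" */
};

/* Stand-in for msleep(): only records the request. */
static void model_msleep(struct dev_model *dev, unsigned int ms)
{
    dev->slept_ms += ms;
}

/*
 * Send a power command. The settle delay lives here, so no caller can
 * forget it after a power-on.
 */
static int model_set_power(struct dev_model *dev, enum power_state_model state)
{
    dev->cmds_sent++;
    if (state == MODEL_PWR_ON)
        model_msleep(dev, 60);  /* 60ms minimum, per the commit message */
    return 0;
}
```

      Because the delay is tied to the command rather than to one call site,
      the second power-on in the parse sequence gets the same settle time as
      the first.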
    • Ming Lei's avatar
      block: loop: set discard granularity and alignment for block device backed loop · 3e7f6159
      Ming Lei authored
      commit bcb21c8c upstream.
      
      In case of a block device backend, if the backend supports write zeros, the
      loop device will set the queue flag QUEUE_FLAG_DISCARD. However,
      limits.discard_granularity isn't set up, which is wrong;
      see the following description in Documentation/ABI/testing/sysfs-block:
      
      	A discard_granularity of 0 means that the device does not support
      	discard functionality.
      
      In particular, 9b15d109 ("block: improve discard bio alignment in
      __blkdev_issue_discard()") starts to use q->limits.discard_granularity
      for computing max discard sectors. A zero discard granularity may cause
      a kernel oops, or fail discard requests even though the loop queue claims
      discard support via QUEUE_FLAG_DISCARD.
      
      Fix the issue by setting up the discard granularity and alignment.
      
      Fixes: c52abf56 ("loop: Better discard support for block devices")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Coly Li <colyli@suse.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Xiao Ni <xni@redhat.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Gwendal Grignou <gwendal@chromium.org>
      Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e7f6159
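
      A rough model of the loop discard commit above: when the loop device
      advertises discard on top of a block device, it should also carry a
      non-zero granularity. The struct and function names are hypothetical,
      and falling back to the backing device's physical block size when the
      backing granularity is zero is an assumption made for this sketch.

```c
/* Hypothetical subset of queue limits, in the units the commit refers to. */
struct limits_model {
    unsigned int discard_granularity;       /* bytes; 0 means "no discard" */
    unsigned int discard_alignment;         /* bytes */
    unsigned int max_discard_sectors;
    unsigned int max_write_zeroes_sectors;
    unsigned int physical_block_size;       /* bytes */
};

/* Derive the loop device's discard limits from its block device backend. */
static void model_loop_config_discard(struct limits_model *loop,
                                      const struct limits_model *backing)
{
    if (backing->max_write_zeroes_sectors == 0) {
        /* Backend cannot write zeroes: do not advertise discard at all. */
        loop->discard_granularity = 0;
        loop->discard_alignment = 0;
        loop->max_discard_sectors = 0;
        return;
    }

    loop->max_discard_sectors = backing->max_write_zeroes_sectors;
    /* Never leave the granularity at 0 while discard is advertised. */
    loop->discard_granularity = backing->discard_granularity
                                    ? backing->discard_granularity
                                    : backing->physical_block_size;
    loop->discard_alignment = 0;
}
```

      The point of the sketch is the invariant: a non-zero
      max_discard_sectors never coexists with a zero discard_granularity.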
    • Athira Rajeev's avatar
      powerpc/perf: Fix soft lockups due to missed interrupt accounting · bcaa4604
      Athira Rajeev authored
      [ Upstream commit 17899eaf ]
      
      The performance monitor interrupt handler checks if any counter has
      overflowed and calls record_and_restart() in core-book3s, which invokes
      perf_event_overflow() to record the sample information. Apart from
      creating the sample, perf_event_overflow() also does the interrupt and
      period checks via perf_event_account_interrupt().
      
      Currently we record information only if the SIAR (Sampled Instruction
      Address Register) valid bit is set (using the siar_valid() check), and
      hence the interrupt check is also done only in that case.
      
      But it is possible that we do sampling for some events that are not
      generating valid SIAR, and hence there is no chance to disable the
      event if interrupts are more than max_samples_per_tick. This leads to
      soft lockup.
      
      Fix this by adding perf_event_account_interrupt() in the invalid SIAR
      code path for a sampling event, i.e. if SIAR is invalid, just do the
      interrupt check and don't record the sample information.

      Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1596717992-7321-1-git-send-email-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bcaa4604
    • Sumera Priyadarsini's avatar
      net: gianfar: Add of_node_put() before goto statement · 50b83d19
      Sumera Priyadarsini authored
      [ Upstream commit 989e4da0 ]
      
      Every iteration of for_each_available_child_of_node() decrements the
      reference count of the previous node. However, when control is
      transferred out of the middle of the loop, as in the case of
      a return, break or goto, there is no decrement, which ultimately
      results in a memory leak.
      
      Fix a potential memory leak in gianfar.c by inserting of_node_put()
      before the goto statement.
      
      Issue found with Coccinelle.

      Signed-off-by: Sumera Priyadarsini <sylphrenadin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      50b83d19
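
      The leak pattern fixed by the gianfar commit above can be illustrated
      with a toy reference-counted iteration. Everything here is
      hypothetical (struct node_model, node_get(), node_put(),
      walk_nodes()); the loop is written out by hand to show what an
      iterator that holds a reference on the current node does when
      advancing.

```c
#include <stddef.h>

struct node_model {
    int refcount;
    int bad;        /* non-zero: processing this node fails */
};

static void node_get(struct node_model *n) { n->refcount++; }
static void node_put(struct node_model *n) { n->refcount--; }

/*
 * Walk all nodes. The "iterator" takes a reference on the current node
 * and drops the previous one when it advances. Returns 0 or -1.
 */
static int walk_nodes(struct node_model *nodes, size_t count)
{
    struct node_model *cur = NULL;
    int err = 0;

    for (size_t i = 0; i < count; i++) {
        if (cur)
            node_put(cur);      /* iterator drops the previous node */
        cur = &nodes[i];
        node_get(cur);          /* iterator holds the current node */

        if (cur->bad) {
            err = -1;
            node_put(cur);      /* the fix: drop the reference before leaving */
            goto out;
        }
    }
    if (cur)
        node_put(cur);          /* normal loop end drops the last reference */
out:
    return err;
}
```

      Without the node_put() in front of the goto, the node that triggered
      the early exit would keep a reference count of one forever.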
    • Alvin Šipraga's avatar
      macvlan: validate setting of multiple remote source MAC addresses · b1215198
      Alvin Šipraga authored
      [ Upstream commit 8b61fba5 ]
      
      Remote source MAC addresses can be set on a 'source mode' macvlan
      interface via the IFLA_MACVLAN_MACADDR_DATA attribute. This commit
      tightens the validation of these MAC addresses to match the validation
      already performed when setting or adding a single MAC address via the
      IFLA_MACVLAN_MACADDR attribute.
      
      iproute2 uses IFLA_MACVLAN_MACADDR_DATA for its 'macvlan macaddr set'
      command, and IFLA_MACVLAN_MACADDR for its 'macvlan macaddr add' command,
      which demonstrates the inconsistent behaviour that this commit
      addresses:
      
       # ip link add link eth0 name macvlan0 type macvlan mode source
       # ip link set link dev macvlan0 type macvlan macaddr add 01:00:00:00:00:00
       RTNETLINK answers: Cannot assign requested address
       # ip link set link dev macvlan0 type macvlan macaddr set 01:00:00:00:00:00
       # ip -d link show macvlan0
       5: macvlan0@eth0: <BROADCAST,MULTICAST,DYNAMIC,UP,LOWER_UP> mtu 1500 ...
           link/ether 2e:ac:fd:2d:69:f8 brd ff:ff:ff:ff:ff:ff promiscuity 0
           macvlan mode source remotes (1) 01:00:00:00:00:00 numtxqueues 1 ...
      
      With this change, the 'set' command will (rightly) fail in the same way
      as the 'add' command.

      Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b1215198
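
      The validation that the macvlan commit above extends to the 'set' path
      amounts to rejecting addresses that cannot be a unicast source. A
      minimal user-space sketch, with hypothetical helper names and the
      assumption that "valid" means neither multicast nor all-zero:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ETH_ALEN 6

/* The lowest bit of the first octet marks a multicast (group) address. */
static bool mac_is_multicast(const uint8_t *addr)
{
    return (addr[0] & 0x01) != 0;
}

static bool mac_is_zero(const uint8_t *addr)
{
    for (size_t i = 0; i < MODEL_ETH_ALEN; i++)
        if (addr[i] != 0)
            return false;
    return true;
}

static bool mac_is_valid_source(const uint8_t *addr)
{
    return !mac_is_multicast(addr) && !mac_is_zero(addr);
}

/*
 * Check a packed list of addresses before accepting any of them.
 * Returns 0 if all are acceptable, -EADDRNOTAVAIL otherwise.
 */
static int validate_mac_list(const uint8_t *addrs, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (!mac_is_valid_source(addrs + i * MODEL_ETH_ALEN))
            return -EADDRNOTAVAIL;
    return 0;
}
```

      The address 01:00:00:00:00:00 from the example session has the
      multicast bit set, so this check rejects it on both paths.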
    • Saurav Kashyap's avatar
      Revert "scsi: qla2xxx: Fix crash on qla2x00_mailbox_command" · 2d1a5f56
      Saurav Kashyap authored
      [ Upstream commit de7e6194 ]
      
      FCoE adapter initialization failed for ISP8021 with the following patch
      applied. In addition, reproduction of the issue the patch originally tried
      to address has been unsuccessful.
      
      This reverts commit 3cb182b3.
      
      Link: https://lore.kernel.org/r/20200806111014.28434-11-njavali@marvell.com
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2d1a5f56
    • Quinn Tran's avatar
      scsi: qla2xxx: Fix null pointer access during disconnect from subsystem · f92ff03e
      Quinn Tran authored
      [ Upstream commit 83949613 ]
      
      An NVMe async command is being submitted to QLA while the same NVMe controller
      is in the middle of a reset. The reset path has deleted the association and
      freed aen_op->fcp_req.private. Add a check for this private pointer before
      issuing the command.
      
      ...
       6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e
          [exception RIP: qla_nvme_post_cmd+394]
          RIP: ffffffffc0d012ba  RSP: ffffb656ca11fd98  RFLAGS: 00010206
          RAX: ffff8fb039eda228  RBX: ffff8fb039eda200  RCX: 00000000000da161
          RDX: ffffffffc0d4d0f0  RSI: ffffffffc0d26c9b  RDI: ffff8fb039eda220
          RBP: 0000000000000013   R8: ffff8fb47ff6aa80   R9: 0000000000000002
          R10: 0000000000000000  R11: ffffb656ca11fdc8  R12: ffff8fb27d04a3b0
          R13: ffff8fc46dd98a58  R14: 0000000000000000  R15: ffff8fc4540f0000
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc]
       8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc]
       9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core]
      10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437
      11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef
      12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402
      13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255
      
      --
      PID: 37824  TASK: ffff8fb033063d80  CPU: 20  COMMAND: "kworker/u97:451"
       0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3
       1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8
       2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed
       3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf
       4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5
       5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core]
       6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc]
       7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437
       8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50
       9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402
      10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255
      
      Link: https://lore.kernel.org/r/20200806111014.28434-10-njavali@marvell.com
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Quinn Tran <qutran@marvell.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f92ff03e
    • Saurav Kashyap's avatar
      scsi: qla2xxx: Check if FW supports MQ before enabling · aba99748
      Saurav Kashyap authored
      [ Upstream commit dffa1145 ]
      
      OS boot during Boot from SAN was stuck at the dracut emergency shell after
      enabling the NVMe driver parameter. The driver was enabling MQ even when MQ
      is not supported. Add a check to confirm that the FW supports MQ.
      
      Link: https://lore.kernel.org/r/20200806111014.28434-9-njavali@marvell.com
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      aba99748
    • Stanley Chu's avatar
      scsi: ufs: Clean up completed request without interrupt notification · 023e0e6b
      Stanley Chu authored
      [ Upstream commit b10178ee ]
      
      If somehow no interrupt notification is raised for a completed request and
      its doorbell bit is cleared by the host, the UFS driver needs to clean up its
      outstanding bit in ufshcd_abort(). Otherwise, the system may behave abnormally
      in the following scenario:
      
      After ufshcd_abort() returns, this request will be requeued by SCSI layer
      with its outstanding bit set. Any future completed request will trigger
      ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At
      this time the "abnormal outstanding bit" will be detected and the "requeued
      request" will be chosen to execute request post-processing flow. This is
      wrong because this request is still "alive".
      
      Link: https://lore.kernel.org/r/20200811141859.27399-2-huobean@gmail.com
      Reviewed-by: Can Guo <cang@codeaurora.org>
      Acked-by: Avri Altman <avri.altman@wdc.com>
      Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
      Signed-off-by: Bean Huo <beanhuo@micron.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      023e0e6b
    • Adrian Hunter's avatar
      scsi: ufs: Improve interrupt handling for shared interrupts · 31871fb7
      Adrian Hunter authored
      [ Upstream commit 127d5f7c ]
      
      For shared interrupts, the interrupt status might be zero, so check that
      first.
      
      Link: https://lore.kernel.org/r/20200811133936.19171-2-adrian.hunter@intel.com
      Reviewed-by: Avri Altman <avri.altman@wdc.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      31871fb7
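
      The check described in the UFS shared-interrupt commit above follows a
      common pattern for handlers on a shared line: read the status first and
      report "not mine" when it is zero. The sketch below is a user-space
      model with hypothetical names (struct hba_model, model_intr(), the
      MODEL_IRQ_* values); it is not the ufshcd handler itself.

```c
#include <stdint.h>

enum irq_result_model { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

struct hba_model {
    uint32_t intr_status;   /* pending interrupt bits of this device */
    unsigned int handled;   /* how many interrupts were serviced */
};

/* Handler for a line that may be shared with other devices. */
static enum irq_result_model model_intr(struct hba_model *hba)
{
    uint32_t status = hba->intr_status;

    /* Nothing pending here: some other device raised the line. */
    if (status == 0)
        return MODEL_IRQ_NONE;

    hba->intr_status = 0;   /* acknowledge what we are about to service */
    hba->handled++;
    return MODEL_IRQ_HANDLED;
}
```

      Returning the "none" result lets the interrupt core try the other
      handlers registered on the same line instead of treating the interrupt
      as serviced.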
    • Stanley Chu's avatar
      scsi: ufs: Fix possible infinite loop in ufshcd_hold · 8f76a208
      Stanley Chu authored
      [ Upstream commit 93b6c5db ]
      
      In ufshcd_suspend(), after clk-gating is suspended and link is set
      as Hibern8 state, ufshcd_hold() is still possibly invoked before
      ufshcd_suspend() returns. For example, MediaTek's suspend vops may
      issue UIC commands which would call ufshcd_hold() during the command
      issuing flow.
      
      Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
      then ufshcd_hold() may enter infinite loops because there is no
      clk-ungating work scheduled or pending. In this case, ufshcd_hold()
      shall just bypass, and keep the link as Hibern8 state.
      
      Link: https://lore.kernel.org/r/20200809050734.18740-1-stanley.chu@mediatek.com
      Reviewed-by: Avri Altman <avri.altman@wdc.com>
      Co-developed-by: Andy Teng <andy.teng@mediatek.com>
      Signed-off-by: Andy Teng <andy.teng@mediatek.com>
      Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8f76a208
    • Mike Christie's avatar
      scsi: fcoe: Fix I/O path allocation · 2229e50f
      Mike Christie authored
      [ Upstream commit fa39ab51 ]
      
      ixgbe_fcoe_ddp_setup() can be called from the main I/O path and is called
      with a spin_lock held, so we have to use GFP_ATOMIC allocation instead of
      GFP_KERNEL.
      
      Link: https://lore.kernel.org/r/1596831813-9839-1-git-send-email-michael.christie@oracle.com
      cc: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2229e50f
    • Sylwester Nawrocki's avatar
      ASoC: wm8994: Avoid attempts to read unreadable registers · 00963a85
      Sylwester Nawrocki authored
      [ Upstream commit f082bb59 ]
      
      The driver supports WM1811, WM8994, WM8958 devices but according to
      documentation and the regmap definitions the WM8958_DSP2_* registers
      are only available on WM8958. In current code these registers are
      being accessed as if they were available on all the three chips.
      
      When starting playback on WM1811 CODEC multiple errors like:
      "wm8994-codec wm8994-codec: ASoC: error at soc_component_read_no_lock on wm8994-codec: -5"
      can be seen, which is caused by attempts to read an unavailable
      WM8958_DSP2_PROGRAM register. The issue has been uncovered by recent
      commit "e2329eeb ASoC: soc-component: add soc_component_err()".
      
      This patch adds a check in wm8958_aif_ev() callback so the DSP2 handling
      is only done for WM8958.

      Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
      Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
      Link: https://lore.kernel.org/r/20200731173834.23832-1-s.nawrocki@samsung.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      00963a85
    • Vineeth Vijayan's avatar
      s390/cio: add cond_resched() in the slow_eval_known_fn() loop · 017d36c5
      Vineeth Vijayan authored
      [ Upstream commit 0b8eb2ee ]
      
      Scanning through the subchannels when an event occurs can take a
      significant amount of time on platforms with lots of known
      subchannels. This might result in higher scheduling latencies
      for other tasks, especially on systems with a single CPU. Add a
      cond_resched() call, as the loop in slow_eval_known_fn() can
      run for a long time.

      Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      017d36c5
    • Amelie Delaunay's avatar
      spi: stm32: fix stm32_spi_prepare_mbr in case of odd clk_rate · ca57f450
      Amelie Delaunay authored
      [ Upstream commit 9cc61973 ]
      
      When spi->clk_rate is odd, round it down to the nearest even value,
      because the minimum SPI divider is 2.

      Signed-off-by: Amelie Delaunay <amelie.delaunay@st.com>
      Signed-off-by: Alain Volmat <alain.volmat@st.com>
      Link: https://lore.kernel.org/r/1597043558-29668-4-git-send-email-alain.volmat@st.com
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ca57f450
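
      The effect of the stm32 SPI commit above can be shown with a small
      worked example. The helper below is a hypothetical user-space sketch,
      assuming the divider is obtained by dividing the (evened) clock rate by
      the requested speed and rounding up; the names are not taken from the
      driver.

```c
#include <stdint.h>

/* Round-up integer division; assumes d > 0 and no overflow of n + d - 1. */
#define MODEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Compute the clock divider for a requested speed.
 * An odd clk_rate is first rounded down to the nearest even value.
 */
static uint32_t model_spi_div(uint32_t clk_rate, uint32_t speed_hz)
{
    uint32_t even_rate = clk_rate & ~UINT32_C(1);

    return MODEL_DIV_ROUND_UP(even_rate, speed_hz);
}
```

      For a clock of 100000001 Hz and a requested speed of 50000000 Hz, the
      unrounded clock would give a divider of 3, while the evened clock gives
      the minimum divider of 2.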
    • Xianting Tian's avatar
      fs: prevent BUG_ON in submit_bh_wbc() · 4aaac9c5
      Xianting Tian authored
      [ Upstream commit 377254b2 ]
      
      If a device is hot-removed --- for example, when a physical device is
      unplugged from a PCIe slot or an nbd device's network is shut down ---
      this can result in a BUG_ON() crash in submit_bh_wbc().  This is
      because when the block device dies, the buffer heads will have
      their Buffer_Mapped flag cleared, leading to the crash in
      submit_bh_wbc().
      
      We had attempted to work around this problem in commit a17712c8
      ("ext4: check superblock mapped prior to committing").  Unfortunately,
      it's still possible to hit the BUG_ON(!buffer_mapped(bh)) if the
      device dies between the work-around check in ext4_commit_super()
      and the point where submit_bh_wbc() is finally called:
      
      Code path:
      ext4_commit_super
          judge if 'buffer_mapped(sbh)' is false, return <== commit a17712c8
                lock_buffer(sbh)
                ...
                unlock_buffer(sbh)
                     __sync_dirty_buffer(sbh,...
                          lock_buffer(sbh)
                                    judge if 'buffer_mapped(sbh)' is false, return <== added by this patch
                                  submit_bh(...,sbh)
                                      submit_bh_wbc(...,sbh,...)
      
      [100722.966497] kernel BUG at fs/buffer.c:3095! <== BUG_ON(!buffer_mapped(bh)) in submit_bh_wbc()
      [100722.966503] invalid opcode: 0000 [#1] SMP
      [100722.966566] task: ffff8817e15a9e40 task.stack: ffffc90024744000
      [100722.966574] RIP: 0010:submit_bh_wbc+0x180/0x190
      [100722.966575] RSP: 0018:ffffc90024747a90 EFLAGS: 00010246
      [100722.966576] RAX: 0000000000620005 RBX: ffff8818a80603a8 RCX: 0000000000000000
      [100722.966576] RDX: ffff8818a80603a8 RSI: 0000000000020800 RDI: 0000000000000001
      [100722.966577] RBP: ffffc90024747ac0 R08: 0000000000000000 R09: ffff88207f94170d
      [100722.966578] R10: 00000000000437c8 R11: 0000000000000001 R12: 0000000000020800
      [100722.966578] R13: 0000000000000001 R14: 000000000bf9a438 R15: ffff88195f333000
      [100722.966580] FS:  00007fa2eee27700(0000) GS:ffff88203d840000(0000) knlGS:0000000000000000
      [100722.966580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [100722.966581] CR2: 0000000000f0b008 CR3: 000000201a622003 CR4: 00000000007606e0
      [100722.966582] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [100722.966583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [100722.966583] PKRU: 55555554
      [100722.966583] Call Trace:
      [100722.966588]  __sync_dirty_buffer+0x6e/0xd0
      [100722.966614]  ext4_commit_super+0x1d8/0x290 [ext4]
      [100722.966626]  __ext4_std_error+0x78/0x100 [ext4]
      [100722.966635]  ? __ext4_journal_get_write_access+0xca/0x120 [ext4]
      [100722.966646]  ext4_reserve_inode_write+0x58/0xb0 [ext4]
      [100722.966655]  ? ext4_dirty_inode+0x48/0x70 [ext4]
      [100722.966663]  ext4_mark_inode_dirty+0x53/0x1e0 [ext4]
      [100722.966671]  ? __ext4_journal_start_sb+0x6d/0xf0 [ext4]
      [100722.966679]  ext4_dirty_inode+0x48/0x70 [ext4]
      [100722.966682]  __mark_inode_dirty+0x17f/0x350
      [100722.966686]  generic_update_time+0x87/0xd0
      [100722.966687]  touch_atime+0xa9/0xd0
      [100722.966690]  generic_file_read_iter+0xa09/0xcd0
      [100722.966694]  ? page_cache_tree_insert+0xb0/0xb0
      [100722.966704]  ext4_file_read_iter+0x4a/0x100 [ext4]
      [100722.966707]  ? __inode_security_revalidate+0x4f/0x60
      [100722.966709]  __vfs_read+0xec/0x160
      [100722.966711]  vfs_read+0x8c/0x130
      [100722.966712]  SyS_pread64+0x87/0xb0
      [100722.966716]  do_syscall_64+0x67/0x1b0
      [100722.966719]  entry_SYSCALL64_slow_path+0x25/0x25
      
      To address this, add the check of 'buffer_mapped(bh)' to
      __sync_dirty_buffer().  This also has the benefit of fixing this for
      other file systems.
      
      With this addition, we can drop the workaround in ext4_commit_super().
      
      [ Commit description rewritten by tytso. ]
      Signed-off-by: Xianting Tian <xianting_tian@126.com>
      Link: https://lore.kernel.org/r/1596211825-8750-1-git-send-email-xianting_tian@126.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4aaac9c5
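
      The check that the submit_bh_wbc() commit above adds can be modelled
      in user space as follows. struct bh_model and
      model_sync_dirty_buffer() are hypothetical, the flags are plain
      booleans, and "submitting" only bumps a counter; the sketch only shows
      where the mapped test sits relative to the lock and the submit.

```c
#include <errno.h>
#include <stdbool.h>

struct bh_model {
    bool mapped;             /* models the Buffer_Mapped flag */
    bool dirty;
    bool locked;
    unsigned int submitted;  /* number of writes handed to the device */
};

/* Write out a dirty buffer; returns 0 on success or -EIO. */
static int model_sync_dirty_buffer(struct bh_model *bh)
{
    bh->locked = true;                  /* lock_buffer() */
    if (bh->dirty) {
        bh->dirty = false;
        /*
         * The device may have gone away after the caller's own check.
         * Fail the write here instead of reaching the BUG_ON further down.
         */
        if (!bh->mapped) {
            bh->locked = false;         /* unlock_buffer() */
            return -EIO;
        }
        bh->submitted++;                /* submit_bh() */
    }
    bh->locked = false;                 /* unlock after completion */
    return 0;
}
```

      Because the test is made with the buffer locked and immediately before
      submission, the window between check and use that the caller-side
      workaround left open is closed.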
    • Jan Kara's avatar
      ext4: correctly restore system zone info when remount fails · 7f6858a3
      Jan Kara authored
      [ Upstream commit 0f5bde1d ]
      
      When remounting filesystem fails late during remount handling and
      block_validity mount option is also changed during the remount, we fail
      to restore system zone information to a state matching the mount option.
      This is mostly harmless, just the block validity checking will not match
      the situation described by the mount option. Make sure these two are always
      consistent.

      Reported-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200728130437.7804-7-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7f6858a3
    • Jan Kara's avatar
      ext4: handle error of ext4_setup_system_zone() on remount · c279f7a4
      Jan Kara authored
      [ Upstream commit d176b1f6 ]
      
      ext4_setup_system_zone() can fail. Handle the failure in ext4_remount().

      Reviewed-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200728130437.7804-2-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c279f7a4
    • Lukas Czerner's avatar
      ext4: handle option set by mount flags correctly · bfb8d9b7
      Lukas Czerner authored
      [ Upstream commit f25391eb ]
      
      Currently there is a problem with mount options that can be set either by
      the VFS using mount flags or by string parsing in ext4.
      
      The i_version/iversion options get lost after remount, for example
      
      $ mount -o i_version /dev/pmem0 /mnt
      $ grep pmem0 /proc/self/mountinfo | grep i_version
      310 95 259:0 / /mnt rw,relatime shared:163 - ext4 /dev/pmem0 rw,seclabel,i_version
      $ mount -o remount,ro /mnt
      $ grep pmem0 /proc/self/mountinfo | grep i_version
      
      nolazytime gets ignored by ext4 on remount, for example
      
      $ mount -o lazytime /dev/pmem0 /mnt
      $ grep pmem0 /proc/self/mountinfo | grep lazytime
      310 95 259:0 / /mnt rw,relatime shared:163 - ext4 /dev/pmem0 rw,lazytime,seclabel
      $ mount -o remount,nolazytime /mnt
      $ grep pmem0 /proc/self/mountinfo | grep lazytime
      310 95 259:0 / /mnt rw,relatime shared:163 - ext4 /dev/pmem0 rw,lazytime,seclabel
      
      Fix it by applying the SB_LAZYTIME and SB_I_VERSION flags from *flags to
      s_flags before we parse the option and use the resulting state of the
      same flags in *flags at the end of successful remount.

      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
      Link: https://lore.kernel.org/r/20200723150526.19931-1-lczerner@redhat.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      bfb8d9b7
    • zhangyi (F)'s avatar
      jbd2: abort journal if free a async write error metadata buffer · 8eed535d
      zhangyi (F) authored
      [ Upstream commit c044f3d8 ]
      
      If we free a metadata buffer whose asynchronous background write-out
      has failed, the jbd2 checkpoint procedure will not detect this
      failure in jbd2_log_do_checkpoint(), so it may lead to filesystem
      inconsistency after cleaning up the journal tail. This patch aborts the
      journal when freeing a buffer that has the write_io_error flag set, to
      prevent potential further inconsistency.

      Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
      Link: https://lore.kernel.org/r/20200620025427.1756360-5-yi.zhang@huawei.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8eed535d
    • Lukas Czerner's avatar
      ext4: handle read only external journal device · 47788043
      Lukas Czerner authored
      [ Upstream commit 273108fa ]
      
      Ext4 uses blkdev_get_by_dev() to get the block_device for journal device
      which does check to see if the read-only block device was opened
      read-only.
      
      As a result ext4 will happily proceed mounting the file system with
      an external journal on a read-only device. This is bad as we would not be
      able to use the journal, leading to errors later on.
      
      Instead of simply failing to mount file system in this case, treat it in
      a similar way we treat internal journal on read-only device. Allow to
      mount with -o noload in read-only mode.
      
      This can be reproduced easily like this:
      
      mke2fs -F -O journal_dev $JOURNAL_DEV 100M
      mkfs.$FSTYPE -F -J device=$JOURNAL_DEV $FS_DEV
      blockdev --setro $JOURNAL_DEV
      mount $FS_DEV $MNT
      touch $MNT/file
      umount $MNT
      
      leading to error like this
      
      [ 1307.318713] ------------[ cut here ]------------
      [ 1307.323362] generic_make_request: Trying to write to read-only block-device dm-2 (partno 0)
      [ 1307.331741] WARNING: CPU: 36 PID: 3224 at block/blk-core.c:855 generic_make_request_checks+0x2c3/0x580
      [ 1307.341041] Modules linked in: ext4 mbcache jbd2 rfkill intel_rapl_msr intel_rapl_common isst_if_commd
      [ 1307.419445] CPU: 36 PID: 3224 Comm: jbd2/dm-2 Tainted: G        W I       5.8.0-rc5 #2
      [ 1307.427359] Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 2.3.10 08/15/2019
      [ 1307.434932] RIP: 0010:generic_make_request_checks+0x2c3/0x580
      [ 1307.440676] Code: 94 03 00 00 48 89 df 48 8d 74 24 08 c6 05 cf 2b 18 01 01 e8 7f a4 ff ff 48 c7 c7 50e
      [ 1307.459420] RSP: 0018:ffffc0d70eb5fb48 EFLAGS: 00010286
      [ 1307.464646] RAX: 0000000000000000 RBX: ffff9b33b2978300 RCX: 0000000000000000
      [ 1307.471780] RDX: ffff9b33e12a81e0 RSI: ffff9b33e1298000 RDI: ffff9b33e1298000
      [ 1307.478913] RBP: ffff9b7b9679e0c0 R08: 0000000000000837 R09: 0000000000000024
      [ 1307.486044] R10: 0000000000000000 R11: ffffc0d70eb5f9f0 R12: 0000000000000400
      [ 1307.493177] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
      [ 1307.500308] FS:  0000000000000000(0000) GS:ffff9b33e1280000(0000) knlGS:0000000000000000
      [ 1307.508396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1307.514142] CR2: 000055eaf4109000 CR3: 0000003dee40a006 CR4: 00000000007606e0
      [ 1307.521273] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1307.528407] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1307.535538] PKRU: 55555554
      [ 1307.538250] Call Trace:
      [ 1307.540708]  generic_make_request+0x30/0x340
      [ 1307.544985]  submit_bio+0x43/0x190
      [ 1307.548393]  ? bio_add_page+0x62/0x90
      [ 1307.552068]  submit_bh_wbc+0x16a/0x190
      [ 1307.555833]  jbd2_write_superblock+0xec/0x200 [jbd2]
      [ 1307.560803]  jbd2_journal_update_sb_log_tail+0x65/0xc0 [jbd2]
      [ 1307.566557]  jbd2_journal_commit_transaction+0x2ae/0x1860 [jbd2]
      [ 1307.572566]  ? check_preempt_curr+0x7a/0x90
      [ 1307.576756]  ? update_curr+0xe1/0x1d0
      [ 1307.580421]  ? account_entity_dequeue+0x7b/0xb0
      [ 1307.584955]  ? newidle_balance+0x231/0x3d0
      [ 1307.589056]  ? __switch_to_asm+0x42/0x70
      [ 1307.592986]  ? __switch_to_asm+0x36/0x70
      [ 1307.596918]  ? lock_timer_base+0x67/0x80
      [ 1307.600851]  kjournald2+0xbd/0x270 [jbd2]
      [ 1307.604873]  ? finish_wait+0x80/0x80
      [ 1307.608460]  ? commit_timeout+0x10/0x10 [jbd2]
      [ 1307.612915]  kthread+0x114/0x130
      [ 1307.616152]  ? kthread_park+0x80/0x80
      [ 1307.619816]  ret_from_fork+0x22/0x30
      [ 1307.623400] ---[ end trace 27490236265b1630 ]---
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Link: https://lore.kernel.org/r/20200717090605.2612-1-lczerner@redhat.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      47788043
    • Jan Kara's avatar
      ext4: don't BUG on inconsistent journal feature · a6d49257
      Jan Kara authored
      [ Upstream commit 11215630 ]
      
      A customer has reported a BUG_ON in ext4_clear_journal_err() being hit
      during LTP testing. Either this has been caused by a test setup
      issue where the filesystem was being overwritten while LTP was mounting
      it or the journal replay has overwritten the superblock with invalid
      data. In either case it is preferable we don't take the machine down
      with a BUG_ON. So handle the situation of an unexpectedly missing
      has_journal feature more gracefully. We issue a warning and fail the mount
      in the cases where the race window is narrow and the failed check is
      most likely a programming error. In cases where fs corruption is more
      likely, we do full ext4_error() handling before failing mount / remount.
      Reviewed-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200710140759.18031-1-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a6d49257
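The idea in the commit above (fail the mount instead of hitting a BUG_ON) can be illustrated with a small user-space sketch. This is not the ext4 code: `toy_sb`, `toy_clear_journal_err()`, `toy_mount()` and the choice of `-EINVAL` are all hypothetical names and values made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for superblock state; not the ext4 structure. */
struct toy_sb {
    bool has_journal;   /* models the has_journal feature flag */
    int  journal_errno; /* error recorded in the journal, 0 if none */
};

/*
 * Toy analogue of clearing a recorded journal error. Rather than asserting
 * that the journal feature is present (the BUG_ON pattern), it warns and
 * returns an error code so the caller can fail gracefully.
 */
static int toy_clear_journal_err(struct toy_sb *sb)
{
    if (!sb->has_journal) {
        fprintf(stderr, "toy: journal feature unexpectedly missing\n");
        return -EINVAL; /* assumed error code, chosen only for this sketch */
    }
    sb->journal_errno = 0;
    return 0;
}

/* Toy mount path: propagate the failure instead of taking the process down. */
static int toy_mount(struct toy_sb *sb)
{
    int err = toy_clear_journal_err(sb);

    if (err)
        return err; /* the mount fails, the machine keeps running */
    return 0;
}
```

Calling `toy_mount()` on a `toy_sb` with `has_journal` set to false returns a negative error code and leaves the recorded state untouched, which is the behaviour the commit message prefers over a crash.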
    • Lukas Czerner's avatar
      jbd2: make sure jh have b_transaction set in refile/unfile_buffer · 3b1a4ea0
      Lukas Czerner authored
      [ Upstream commit 24dc9864 ]
      
      Callers of __jbd2_journal_unfile_buffer() and
      __jbd2_journal_refile_buffer() assume that the b_transaction is set. In
      fact if it's not, we can end up with journal_head refcounting errors
      leading to a crash much later that might be very hard to track down. Add
      asserts to make sure that is the case.
      
      We also make sure that b_next_transaction is NULL in
      __jbd2_journal_unfile_buffer(), since the callers expect that as well
      and we should not reach that stage in this state anyway; doing so
      leads to problems later on.
      
      Tested with fstests.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20200617092549.6712-1-lczerner@redhat.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3b1a4ea0
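The precondition checks described in the commit above can be sketched in user space as follows. This is a simplified model, not the jbd2 implementation: `toy_jh`, `toy_transaction` and `toy_unfile_buffer()` are hypothetical names, and the reference counting is an assumption of the sketch (the real code may drop the reference elsewhere).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal transaction type used only by this sketch. */
struct toy_transaction {
    int id;
};

/* Hypothetical, simplified stand-in for a journal head. */
struct toy_jh {
    struct toy_transaction *b_transaction;      /* owning transaction */
    struct toy_transaction *b_next_transaction; /* queued follow-up, if any */
    int refcount;                               /* toy reference count */
};

/*
 * Toy unfile step. The asserts state the assumptions the callers rely on,
 * so a violation is caught here rather than showing up later as a
 * refcounting error that is hard to trace.
 */
static void toy_unfile_buffer(struct toy_jh *jh)
{
    assert(jh->b_transaction != NULL);      /* must belong to a transaction */
    assert(jh->b_next_transaction == NULL); /* nothing may be queued behind it */

    jh->b_transaction = NULL;
    jh->refcount--; /* in this toy model, unfiling drops the transaction's reference */
}
```

Failing early at the point where the assumption is broken keeps the failure close to its cause, which is the motivation the commit message gives for adding the asserts.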