1. 16 Oct, 2020 5 commits
    • David Howells's avatar
      afs: Add tracing for cell refcount and active user count · dca54a7b
      David Howells authored
      Add a tracepoint to log the cell refcount and active user count and pass in
      a reason code through various functions that manipulate these counters.
      
      Additionally, a helper function, afs_see_cell(), is provided to log
      interesting places that deal with a cell without actually doing any
      accounting directly.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      dca54a7b
    • David Howells's avatar
      afs: Fix cell removal · 1d0e850a
      David Howells authored
      Fix cell removal by inserting a more final state than AFS_CELL_FAILED that
      indicates that the cell has been unpublished in case the manager is already
      requeued and will go through again.  The new AFS_CELL_REMOVED state will
      just immediately leave the manager function.
      
      Going through a second time in the AFS_CELL_FAILED state will cause it to
      try to remove the cell again, potentially leading to the proc list being
      removed.
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Reported-by: syzbot+b994ecf2b023f14832c1@syzkaller.appspotmail.com
      Reported-by: syzbot+0e0db88e1eb44a91ae8d@syzkaller.appspotmail.com
      Reported-by: syzbot+2d0585e5efcd43d113c2@syzkaller.appspotmail.com
      Reported-by: syzbot+1ecc2f9d3387f1d79d42@syzkaller.appspotmail.com
      Reported-by: syzbot+18d51774588492bf3f69@syzkaller.appspotmail.com
      Reported-by: syzbot+a5e4946b04d6ca8fa5f3@syzkaller.appspotmail.com
      Suggested-by: default avatarHillf Danton <hdanton@sina.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      cc: Hillf Danton <hdanton@sina.com>
      1d0e850a
    • David Howells's avatar
      afs: Fix cell purging with aliases · 286377f6
      David Howells authored
      When the afs module is removed, one of the things that has to be done is to
      purge the cell database.  afs_cell_purge() cancels the management timer and
      then starts the cell manager work item to do the purging.  This does a
      single run through and then assumes that all cells are now purged - but
      this is no longer the case.
      
      With the introduction of alias detection, a later cell in the database can
      now be holding an active count on an earlier cell (cell->alias_of).  The
      purge scan passes by the earlier cell first, but this can't be got rid of
      until it has discarded the alias.  Ordinarily, afs_unuse_cell() would
      handle this by setting the management timer to trigger another pass - but
      afs_set_cell_timer() doesn't do anything if the namespace is being removed
      (net->live == false).  rmmod then hangs in the wait on cells_outstanding in
      afs_cell_purge().
      
      Fix this by making afs_set_cell_timer() directly queue the cell manager if
      net->live is false.  This causes additional management passes.
      
      Queueing the cell manager increments cells_outstanding to make sure the
      wait won't complete until all cells are destroyed.
      
      Fixes: 8a070a96 ("afs: Detect cell aliases 1 - Cells with root volumes")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      286377f6
    • David Howells's avatar
      afs: Fix cell refcounting by splitting the usage counter · 88c853c3
      David Howells authored
      Management of the lifetime of afs_cell struct has some problems due to the
      usage counter being used to determine whether objects of that type are in
      use in addition to whether anyone might be interested in the structure.
      
      This is made trickier by cell objects being cached for a period of time in
      case they're quickly reused as they hold the result of a setup process that
      may be slow (DNS lookups, AFS RPC ops).
      
      Problems include the cached root volume from alias resolution pinning its
      parent cell record, rmmod occasionally hanging and occasionally producing
      assertion failures.
      
      Fix this by splitting the count of active users from the struct reference
      count.  Things then work as follows:
      
       (1) The cell cache keeps +1 on the cell's activity count and this has to
           be dropped before the cell can be removed.  afs_manage_cell() tries to
           exchange the 1 to a 0 with the cells_lock write-locked, and if
           successful, the record is removed from the net->cells.
      
       (2) One struct ref is 'owned' by the activity count.  That is put when the
           active count is reduced to 0 (final_destruction label).
      
       (3) A ref can be held on a cell whilst it is queued for management on a
           work queue without confusing the active count.  afs_queue_cell() is
           added to wrap this.
      
       (4) The queue's ref is dropped at the end of the management.  This is
           split out into a separate function, afs_manage_cell_work().
      
       (5) The root volume record is put after a cell is removed (at the
           final_destruction label) rather then in the RCU destruction routine.
      
       (6) Volumes hold struct refs, but aren't active users.
      
       (7) Both counts are displayed in /proc/net/afs/cells.
      
      There are some management function changes:
      
       (*) afs_put_cell() now just decrements the refcount and triggers the RCU
           destruction if it becomes 0.  It no longer sets a timer to have the
           manager do this.
      
       (*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease
           the active count.  afs_unuse_cell() sets the management timer.
      
       (*) afs_queue_cell() is added to queue a cell with approprate refs.
      
      There are also some other fixes:
      
       (*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL.
      
       (*) Make sure that candidate cells in lookups are properly destroyed
           rather than being simply kfree'd.  This ensures the bits it points to
           are destroyed also.
      
       (*) afs_dec_cells_outstanding() is now called in cell destruction rather
           than at "final_destruction".  This ensures that cell->net is still
           valid to the end of the destructor.
      
       (*) As a consequence of the previous two changes, move the increment of
           net->cells_outstanding that was at the point of insertion into the
           tree to the allocation routine to correctly balance things.
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      88c853c3
    • David Howells's avatar
      afs: Fix rapid cell addition/removal by not using RCU on cells tree · 92e3cc91
      David Howells authored
      There are a number of problems that are being seen by the rapidly mounting
      and unmounting an afs dynamic root with an explicit cell and volume
      specified (which should probably be rejected, but that's a separate issue):
      
      What the tests are doing is to look up/create a cell record for the name
      given and then tear it down again without actually using it to try to talk
      to a server.  This is repeated endlessly, very fast, and the new cell
      collides with the old one if it's not quick enough to reuse it.
      
      It appears (as suggested by Hillf Danton) that the search through the RB
      tree under a read_seqbegin_or_lock() under RCU conditions isn't safe and
      that it's not blocking the write_seqlock(), despite taking two passes at
      it.  He suggested that the code should take a ref on the cell it's
      attempting to look at - but this shouldn't be necessary until we've
      compared the cell names.  It's possible that I'm missing a barrier
      somewhere.
      
      However, using an RCU search for this is overkill, really - we only need to
      access the cell name in a few places, and they're places where we're may
      end up sleeping anyway.
      
      Fix this by switching to an R/W semaphore instead.
      
      Additionally, draw the down_read() call inside the function (renamed to
      afs_find_cell()) since all the callers were taking the RCU read lock (or
      should've been[*]).
      
      [*] afs_probe_cell_name() should have been, but that doesn't appear to be
      involved in the bug reports.
      
      The symptoms of this look like:
      
      	general protection fault, probably for non-canonical address 0xf27d208691691fdb: 0000 [#1] PREEMPT SMP KASAN
      	KASAN: maybe wild-memory-access in range [0x93e924348b48fed8-0x93e924348b48fedf]
      	...
      	RIP: 0010:strncasecmp lib/string.c:52 [inline]
      	RIP: 0010:strncasecmp+0x5f/0x240 lib/string.c:43
      	 afs_lookup_cell_rcu+0x313/0x720 fs/afs/cell.c:88
      	 afs_lookup_cell+0x2ee/0x1440 fs/afs/cell.c:249
      	 afs_parse_source fs/afs/super.c:290 [inline]
      	...
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Reported-by: syzbot+459a5dce0b4cb70fd076@syzkaller.appspotmail.com
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      cc: Hillf Danton <hdanton@sina.com>
      cc: syzkaller-bugs@googlegroups.com
      92e3cc91
  2. 11 Oct, 2020 10 commits
  3. 10 Oct, 2020 6 commits
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · da690031
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Some more driver bugfixes for I2C. Including a revert - the updated
        series for it will come during the next merge window"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: owl: Clear NACK and BUS error bits
        Revert "i2c: imx: Fix reset of I2SR_IAL flag"
        i2c: meson: fixup rate calculation with filter delay
        i2c: meson: keep peripheral clock enabled
        i2c: meson: fix clock setting overwrite
        i2c: imx: Fix reset of I2SR_IAL flag
      da690031
    • Vladimir Zapolskiy's avatar
      cifs: Fix incomplete memory allocation on setxattr path · 64b7f674
      Vladimir Zapolskiy authored
      On setxattr() syscall path due to an apprent typo the size of a dynamically
      allocated memory chunk for storing struct smb2_file_full_ea_info object is
      computed incorrectly, to be more precise the first addend is the size of
      a pointer instead of the wanted object size. Coincidentally it makes no
      difference on 64-bit platforms, however on 32-bit targets the following
      memcpy() writes 4 bytes of data outside of the dynamically allocated memory.
      
        =============================================================================
        BUG kmalloc-16 (Not tainted): Redzone overwritten
        -----------------------------------------------------------------------------
      
        Disabling lock debugging due to kernel taint
        INFO: 0x79e69a6f-0x9e5cdecf @offset=368. First byte 0x73 instead of 0xcc
        INFO: Slab 0xd36d2454 objects=85 used=51 fp=0xf7d0fc7a flags=0x35000201
        INFO: Object 0x6f171df3 @offset=352 fp=0x00000000
      
        Redzone 5d4ff02d: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
        Object 6f171df3: 00 00 00 00 00 05 06 00 73 6e 72 75 62 00 66 69  ........snrub.fi
        Redzone 79e69a6f: 73 68 32 0a                                      sh2.
        Padding 56254d82: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
        CPU: 0 PID: 8196 Comm: attr Tainted: G    B             5.9.0-rc8+ #3
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
        Call Trace:
         dump_stack+0x54/0x6e
         print_trailer+0x12c/0x134
         check_bytes_and_report.cold+0x3e/0x69
         check_object+0x18c/0x250
         free_debug_processing+0xfe/0x230
         __slab_free+0x1c0/0x300
         kfree+0x1d3/0x220
         smb2_set_ea+0x27d/0x540
         cifs_xattr_set+0x57f/0x620
         __vfs_setxattr+0x4e/0x60
         __vfs_setxattr_noperm+0x4e/0x100
         __vfs_setxattr_locked+0xae/0xd0
         vfs_setxattr+0x4e/0xe0
         setxattr+0x12c/0x1a0
         path_setxattr+0xa4/0xc0
         __ia32_sys_lsetxattr+0x1d/0x20
         __do_fast_syscall_32+0x40/0x70
         do_fast_syscall_32+0x29/0x60
         do_SYSENTER_32+0x15/0x20
         entry_SYSENTER_32+0x9f/0xf2
      
      Fixes: 5517554e ("cifs: Add support for writing attributes on SMB2+")
      Signed-off-by: default avatarVladimir Zapolskiy <vladimir@tuxera.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      64b7f674
    • Hugh Dickins's avatar
      mm/khugepaged: fix filemap page_to_pgoff(page) != offset · 033b5d77
      Hugh Dickins authored
      There have been elusive reports of filemap_fault() hitting its
      VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page) on kernels built
      with CONFIG_READ_ONLY_THP_FOR_FS=y.
      
      Suren has hit it on a kernel with CONFIG_READ_ONLY_THP_FOR_FS=y and
      CONFIG_NUMA is not set: and he has analyzed it down to how khugepaged
      without NUMA reuses the same huge page after collapse_file() failed
      (whereas NUMA targets its allocation to the respective node each time).
      And most of us were usually testing with CONFIG_NUMA=y kernels.
      
      collapse_file(old start)
        new_page = khugepaged_alloc_page(hpage)
        __SetPageLocked(new_page)
        new_page->index = start // hpage->index=old offset
        new_page->mapping = mapping
        xas_store(&xas, new_page)
      
                                filemap_fault
                                  page = find_get_page(mapping, offset)
                                  // if offset falls inside hpage then
                                  // compound_head(page) == hpage
                                  lock_page_maybe_drop_mmap()
                                    __lock_page(page)
      
        // collapse fails
        xas_store(&xas, old page)
        new_page->mapping = NULL
        unlock_page(new_page)
      
      collapse_file(new start)
        new_page = khugepaged_alloc_page(hpage)
        __SetPageLocked(new_page)
        new_page->index = start // hpage->index=new offset
        new_page->mapping = mapping // mapping becomes valid again
      
                                  // since compound_head(page) == hpage
                                  // page_to_pgoff(page) got changed
                                  VM_BUG_ON_PAGE(page_to_pgoff(page) != offset)
      
      An initial patch replaced __SetPageLocked() by lock_page(), which did
      fix the race which Suren illustrates above.  But testing showed that it's
      not good enough: if the racing task's __lock_page() gets delayed long
      after its find_get_page(), then it may follow collapse_file(new start)'s
      successful final unlock_page(), and crash on the same VM_BUG_ON_PAGE.
      
      It could be fixed by relaxing filemap_fault()'s VM_BUG_ON_PAGE to a
      check and retry (as is done for mapping), with similar relaxations in
      find_lock_entry() and pagecache_get_page(): but it's not obvious what
      else might get caught out; and khugepaged non-NUMA appears to be unique
      in exposing a page to page cache, then revoking, without going through
      a full cycle of freeing before reuse.
      
      Instead, non-NUMA khugepaged_prealloc_page() release the old page
      if anyone else has a reference to it (1% of cases when I tested).
      
      Although never reported on huge tmpfs, I believe its find_lock_entry()
      has been at similar risk; but huge tmpfs does not rely on khugepaged
      for its normal working nearly so much as READ_ONLY_THP_FOR_FS does.
      Reported-by: default avatarDenis Lisov <dennis.lissov@gmail.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206569
      Link: https://lore.kernel.org/linux-mm/?q=20200219144635.3b7417145de19b65f258c943%40linux-foundation.orgReported-by: default avatarQian Cai <cai@lca.pw>
      Link: https://lore.kernel.org/linux-xfs/?q=20200616013309.GB815%40lca.pwReported-and-analyzed-by: default avatarSuren Baghdasaryan <surenb@google.com>
      Fixes: 87c460a0 ("mm/khugepaged: collapse_shmem() without freezing new_page")
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org # v4.9+
      Reviewed-by: default avatarMatthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      033b5d77
    • Cristian Ciocaltea's avatar
      i2c: owl: Clear NACK and BUS error bits · f5b3f433
      Cristian Ciocaltea authored
      When the NACK and BUS error bits are set by the hardware, the driver is
      responsible for clearing them by writing "1" into the corresponding
      status registers.
      
      Hence perform the necessary operations in owl_i2c_interrupt().
      
      Fixes: d211e62a ("i2c: Add Actions Semiconductor Owl family S900 I2C driver")
      Reported-by: default avatarManivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Signed-off-by: default avatarCristian Ciocaltea <cristian.ciocaltea@gmail.com>
      Signed-off-by: default avatarWolfram Sang <wsa@kernel.org>
      f5b3f433
    • Wolfram Sang's avatar
      Revert "i2c: imx: Fix reset of I2SR_IAL flag" · 5a02e7c4
      Wolfram Sang authored
      This reverts commit fa4d3055. An updated
      version was sent. So, revert this version and give the new version more
      time for testing.
      Signed-off-by: default avatarWolfram Sang <wsa@kernel.org>
      5a02e7c4
    • Linus Torvalds's avatar
      Merge tag 'spi-fix-v5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 6f2f486d
      Linus Torvalds authored
      Pull spi fix from Mark Brown:
       "One last minute fix for v5.9 which has been causing crashes in test
        systems with the fsl-dspi driver when they hit deferred probe (and
        which I probably let cook in next a bit longer than is ideal).
      
        And an update to MAINTAINERS reflecting Serge's extensive and
        detailed recent work on the DesignWare driver"
      
      * tag 'spi-fix-v5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        MAINTAINERS: Add maintainer of DW APB SSI driver
        spi: fsl-dspi: fix NULL pointer dereference
      6f2f486d
  4. 09 Oct, 2020 9 commits
  5. 08 Oct, 2020 10 commits