1. 30 Nov, 2017 40 commits
    • Andrew Elble's avatar
      nfsd: deal with revoked delegations appropriately · 584f0bb5
      Andrew Elble authored
      commit 95da1b3a upstream.
      
      If a delegation has been revoked by the server, operations using that
      delegation should error out with NFS4ERR_DELEG_REVOKED in the >4.1
      case, and NFS4ERR_BAD_STATEID otherwise.
      
      The server needs NFSv4.1 clients to explicitly free revoked delegations.
      If the server returns NFS4ERR_DELEG_REVOKED, the client will do that;
      otherwise it may just forget about the delegation and be unable to
      recover when it later sees SEQ4_STATUS_RECALLABLE_STATE_REVOKED set on a
      SEQUENCE reply.  That can cause the Linux 4.1 client to loop in its
      state manager.
      Signed-off-by: Andrew Elble <aweits@rit.edu>
      Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      584f0bb5
    • NeilBrown's avatar
      NFS: revalidate "." etc correctly on "open". · 57567073
      NeilBrown authored
      commit b688741c upstream.
      
      For correct close-to-open semantics, NFS must validate
      the change attribute of a directory (or file) on open.
      
      Since commit ecf3d1f1 ("vfs: kill FS_REVAL_DOT by adding a
      d_weak_revalidate dentry op"), open() of "." or a path ending ".." is
      not revalidated reliably (except when that directory is a mount point).
      
      Prior to that commit, "." was revalidated using nfs_lookup_revalidate()
      which checks the LOOKUP_OPEN flag and forces revalidation if the flag is
      set.
      Since that commit, nfs_weak_revalidate() is used for NFSv3 (which
      ignores the flags) and nothing is used for NFSv4.
      
      This is fixed by using nfs_lookup_verify_inode() in
      nfs_weak_revalidate().  This does the revalidation exactly when needed.
      Also, add a definition of .d_weak_revalidate for NFSv4.
      
      The incorrect behavior is easily demonstrated by running "echo *" in
      some non-mountpoint NFS directory while watching network traffic.
      Without this patch, "echo *" sometimes doesn't produce any traffic.
      With the patch it always does.
      
      Fixes: ecf3d1f1 ("vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op")
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      57567073
    • Anna Schumaker's avatar
      NFS: Avoid RCU usage in tracepoints · 2deb8945
      Anna Schumaker authored
      commit 3944369d upstream.
      
      There isn't an obvious way to acquire and release the RCU lock during a
      tracepoint, so we can't use the rpc_peeraddr2str() function here.
      Instead, rely on the client's cl_hostname, which should have similar
      enough information without needing an rcu_dereference().
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2deb8945
    • Chuck Lever's avatar
      nfs: Fix ugly referral attributes · aed1a433
      Chuck Lever authored
      commit c05cefcc upstream.
      
      Before traversing a referral and performing a mount, the mounted-on
      directory looks strange:
      
      dr-xr-xr-x. 2 4294967294 4294967294 0 Dec 31  1969 dir.0
      
      nfs4_get_referral is wiping out any cached attributes with what was
      returned via GETATTR(fs_locations), but the bit mask for that
      operation does not request any file attributes.
      
      Retrieve owner and timestamp information so that the memcpy in
      nfs4_get_referral fills in more attributes.
      
      Changes since v1:
      - Don't request attributes that the client unconditionally replaces
      - Request only MOUNTED_ON_FILEID or FILEID attribute, not both
      - encode_fs_locations() doesn't use the third bitmask word
      
      Fixes: 6b97fd3d ("NFSv4: Follow a referral")
      Suggested-by: Pradeep Thomas <pradeepthomas@gmail.com>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      aed1a433
    • Benjamin Coddington's avatar
      NFS: Revert "NFS: Move the flock open mode check into nfs_flock()" · 57f3c05d
      Benjamin Coddington authored
      commit fcfa4470 upstream.
      
      Commit e1293727 "NFS: Move the flock open mode check into nfs_flock()"
      changed NFSv3 behavior for flock() such that the open mode must match
      the lock type; however, that requirement shouldn't be enforced for
      flock().
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      57f3c05d
    • Joshua Watt's avatar
      NFS: Fix typo in nomigration mount option · afaacc00
      Joshua Watt authored
      commit f02fee22 upstream.
      
      The option was incorrectly masking off all other options.
      Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      afaacc00
    • Jaegeuk Kim's avatar
      f2fs: expose some sectors to user in inline data or dentry case · d628ac8a
      Jaegeuk Kim authored
      commit 5b4267d1 upstream.
      
      If there's some data written through inline data or dentry, we need to
      show st_blocks. This fixes reporting zero blocks even though there is
      a small amount of written data.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      [Jaegeuk Kim: avoid link file for quotacheck]
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d628ac8a
    • Josef Bacik's avatar
      btrfs: change how we decide to commit transactions during flushing · f1117628
      Josef Bacik authored
      commit 996478ca upstream.
      
      Nikolay reported that generic/273 was failing currently with ENOSPC.
      Turns out this is because we get to the point where the outstanding
      reservations are greater than the pinned space on the fs.  This is a
      mistake; previously we used the current reservation amount in
      may_commit_transaction, not the entire outstanding reservation amount.
      Fix this to find the minimum byte size needed to make progress in
      flushing, and pass that into may_commit_transaction.  From there we can
      make a smarter decision on whether to commit the transaction or not.
      This fixes the failure in generic/273.
      
      From Nikolay, IOW: when we go to the final stage of deciding whether to
      do the trans commit, instead of passing all the reservations from all
      tickets we just pass the reservation for the current ticket. Otherwise,
      in case all reservations exceed pinned space, we don't commit the
      transaction and fail prematurely. Before, we passed num_bytes from
      flush_space, where num_bytes was the sum of all pending reservations,
      but now all we do is take the first ticket and commit the trans if we
      can satisfy that.
      
      Fixes: 957780eb ("Btrfs: introduce ticketed enospc infrastructure")
      Reported-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Tested-by: Nikolay Borisov <nborisov@suse.com>
      [ added Nikolay's comment ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f1117628
    • Arnd Bergmann's avatar
      isofs: fix timestamps beyond 2027 · f2122d66
      Arnd Bergmann authored
      commit 34be4dbf upstream.
      
      isofs uses a 'char' variable to load the number of years since
      1900 for an inode timestamp. On architectures that use a signed
      char type by default, this results in an invalid date for
      anything beyond 2027.
      
      This changes the function argument to a 'u8' array, which
      is defined the same way on all architectures, and unambiguously
      lets us use years until 2155.
      
      This should be backported to all kernels that might still be
      in use by that date.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2122d66
    • Miklos Szeredi's avatar
      fanotify: fix fsnotify_prepare_user_wait() failure · 1dd7dd07
      Miklos Szeredi authored
      commit f37650f1 upstream.
      
      If fsnotify_prepare_user_wait() fails, we leave the event on the
      notification list.  Which will result in a warning in
      fsnotify_destroy_event() and later use-after-free.
      
      Instead of adding a new helper to remove the event from the list in this
      case, I opted to move the prepare/finish up into fanotify_handle_event().
      
      This will allow these to be moved further out into the generic code later,
      and perhaps let us move to non-sleeping RCU.
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 05f0e387 ("fanotify: Release SRCU lock when waiting for userspace response")
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1dd7dd07
    • Greg Edwards's avatar
      fs: guard_bio_eod() needs to consider partitions · 5c21c3dd
      Greg Edwards authored
      commit 67f2519f upstream.
      
      guard_bio_eod() needs to look at the partition capacity, not just the
      capacity of the whole device, when determining if truncation is
      necessary.
      
      [   60.268688] attempt to access beyond end of device
      [   60.268690] unknown-block(9,1): rw=0, want=67103509, limit=67103506
      [   60.268693] buffer_io_error: 2 callbacks suppressed
      [   60.268696] Buffer I/O error on dev md1p7, logical block 4524305, async page read
      
      Fixes: 74d46992 ("block: replace bi_bdev with a gendisk pointer and partitions index")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Edwards <gedwards@ddn.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c21c3dd
    • Coly Li's avatar
      bcache: check ca->alloc_thread initialized before wake up it · e9c80881
      Coly Li authored
      commit 91af8300 upstream.
      
      In bcache code, sysfs entries are created before all resources get
      allocated, e.g. allocation thread of a cache set.
      
      There is a possibility of a NULL pointer dereference if a resource is
      accessed before it is initialized. Indeed, Jorg Bornschein caught one
      on the cache set allocation thread and got a kernel oops.
      
      The reason for this bug is that when bch_bucket_alloc() is called
      during cache set registration and attaching, ca->alloc_thread is not
      yet properly allocated and initialized, so calling wake_up_process()
      on ca->alloc_thread triggers a NULL pointer dereference. A simple and
      fast fix is to check whether ca->alloc_thread is allocated before
      waking it, and only call wake_up_process() when it is not NULL.
      Signed-off-by: Coly Li <colyli@suse.de>
      Reported-by: Jorg Bornschein <jb@capsec.org>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Reviewed-by: Michael Lyle <mlyle@lyle.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9c80881
    • Eric Biggers's avatar
      libceph: don't WARN() if user tries to add invalid key · bcae2363
      Eric Biggers authored
      commit b1127085 upstream.
      
      The WARN_ON(!key->len) in set_secret() in net/ceph/crypto.c is hit if a
      user tries to add a key of type "ceph" with an invalid payload as
      follows (assuming CONFIG_CEPH_LIB=y):
      
          echo -e -n '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' \
      	| keyctl padd ceph desc @s
      
      This can be hit by fuzzers.  As this is merely bad input and not a
      kernel bug, replace the WARN_ON() with return -EINVAL.
      
      Fixes: 7af3ea18 ("libceph: stop allocating a new cipher on every crypto request")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bcae2363
    • Dan Carpenter's avatar
      eCryptfs: use after free in ecryptfs_release_messaging() · bc6e8968
      Dan Carpenter authored
      commit db86be3a upstream.
      
      We're freeing the list iterator so we should be using the _safe()
      version of hlist_for_each_entry().
      
      Fixes: 88b4a07e ("[PATCH] eCryptfs: Public key transport mechanism")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc6e8968
    • Eric Biggers's avatar
      fscrypt: lock mutex before checking for bounce page pool · ddf1264e
      Eric Biggers authored
      commit a0b3bc85 upstream.
      
      fscrypt_initialize(), which allocates the global bounce page pool when
      an encrypted file is first accessed, uses "double-checked locking" to
      try to avoid locking fscrypt_init_mutex.  However, it doesn't use any
      memory barriers, so it's theoretically possible for a thread to observe
      a bounce page pool which has not been fully initialized.  This is a
      classic bug with "double-checked locking".
      
      While "only a theoretical issue" in the latest kernel, in pre-4.8
      kernels the pointer that was checked was not even the last to be
      initialized, so it was easily possible for a crash (NULL pointer
      dereference) to happen.  This was changed only incidentally by the large
      refactor to use fs/crypto/.
      
      Solve both problems in a trivial way that can easily be backported: just
      always take the mutex.  It's theoretically less efficient, but it
      shouldn't be noticeable in practice as the mutex is only acquired very
      briefly once per encrypted file.
      
      Later I'd like to make this use a helper macro like DO_ONCE().  However,
      DO_ONCE() runs in atomic context, so we'd need to add a new macro that
      allows blocking.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ddf1264e
    • Andreas Rohner's avatar
      nilfs2: fix race condition that causes file system corruption · f9478266
      Andreas Rohner authored
      commit 31ccb1f7 upstream.
      
      There is a race condition between nilfs_dirty_inode() and
      nilfs_set_file_dirty().
      
      When a file is opened, nilfs_dirty_inode() is called to update the
      access timestamp in the inode.  It calls __nilfs_mark_inode_dirty() in a
      separate transaction.  __nilfs_mark_inode_dirty() caches the ifile
      buffer_head in the i_bh field of the inode info structure and marks it
      as dirty.
      
      After some data was written to the file in another transaction, the
      function nilfs_set_file_dirty() is called, which adds the inode to the
      ns_dirty_files list.
      
      Then the segment construction calls nilfs_segctor_collect_dirty_files(),
      which goes through the ns_dirty_files list and checks the i_bh field.
      If there is a cached buffer_head in i_bh it is not marked as dirty
      again.
      
      Since nilfs_dirty_inode() and nilfs_set_file_dirty() use separate
      transactions, it is possible that a segment construction that writes out
      the ifile occurs in-between the two.  If this happens the inode is not
      on the ns_dirty_files list, but its ifile block is still marked as dirty
      and written out.
      
      In the next segment construction, the data for the file is written out
      and nilfs_bmap_propagate() updates the b-tree.  Eventually the bmap root
      is written into the i_bh block, which is not dirty, because it was
      written out in another segment construction.
      
      As a result the bmap update can be lost, which leads to file system
      corruption.  Either the virtual block address points to an unallocated
      DAT block, or the DAT entry will be reused for something different.
      
      The error can remain undetected for a long time.  A typical error
      message would be one of the "bad btree" errors or a warning that a DAT
      entry could not be found.
      
      This bug can be reproduced reliably by a simple benchmark that creates
      and overwrites millions of 4k files.
      
      Link: http://lkml.kernel.org/r/1509367935-3086-2-git-send-email-konishi.ryusuke@lab.ntt.co.jp
      Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
      Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9478266
    • NeilBrown's avatar
      autofs: don't fail mount for transient error · 7b7f5437
      NeilBrown authored
      commit ecc0c469 upstream.
      
      Currently if the autofs kernel module gets an error when writing to the
      pipe which links to the daemon, then it marks the whole mountpoint as
      catatonic, and it will stop working.
      
      It is possible that the error is transient.  This can happen if the
      daemon is slow and more than 16 requests queue up.  If a subsequent
      process tries to queue a request, and is then signalled, the write to
      the pipe will return -ERESTARTSYS and autofs will take that as total
      failure.
      
      So change the code to treat -ERESTARTSYS and -ENOMEM as transient
      failures which only abort the current request, not the whole mountpoint.
      
      It isn't a crash or a data corruption, but having autofs mountpoints
      suddenly stop working is rather inconvenient.
      
      Ian said:
      
      : And given the problems with a half dozen (or so) user space applications
      : consuming large amounts of CPU under heavy mount and umount activity this
      : could happen more easily than we expect.
      
      Link: http://lkml.kernel.org/r/87y3norvgp.fsf@notabene.neil.brown.name
      Signed-off-by: NeilBrown <neilb@suse.com>
      Acked-by: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b7f5437
    • Vitaly Wool's avatar
      mm/z3fold.c: use kref to prevent page free/compact race · c1a14af3
      Vitaly Wool authored
      commit 5d03a661 upstream.
      
      There is a race in the current z3fold implementation between
      do_compact() called in a work queue context and the page release
      procedure when page's kref goes to 0.
      
      do_compact() may be waiting for page lock, which is released by
      release_z3fold_page_locked right before putting the page onto the
      "stale" list, and then the page may be freed as do_compact() modifies
      its contents.
      
      The mechanism currently implemented to handle that (checking the
      PAGE_STALE flag) is not reliable enough.  Instead, we'll use page's kref
      counter to guarantee that the page is not released if its compaction is
      scheduled.  It then becomes compaction function's responsibility to
      decrease the counter and quit immediately if the page was actually
      freed.
      
      Link: http://lkml.kernel.org/r/20171117092032.00ea56f42affbed19f4fcc6c@gmail.com
      Signed-off-by: Vitaly Wool <vitaly.wool@sonymobile.com>
      Cc: <Oleksiy.Avramchenko@sony.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1a14af3
    • Stanislaw Gruszka's avatar
      rt2x00usb: mark device removed when get ENOENT usb error · 769bfea5
      Stanislaw Gruszka authored
      commit bfa62a52 upstream.
      
      An ENOENT usb error means "specified interface or endpoint does not
      exist or is not enabled". Mark the device as not present when we
      encounter this error, similar to what we do with the ENODEV error.
      
      Otherwise we can have an infinite loop in rt2x00usb_work_rxdone(),
      because we remove RX entries and put them back on the queue
      indefinitely.
      
      We can have a similar situation when URB submission fails every time
      with some other error, so we should consider limiting the number of
      entries processed by the rxdone work. But for now, since this patch
      fixes a reproducible soft lockup issue on single processor systems
      and matches the meaning of the ENOENT error, let's apply this fix.
      
      The patch adds an additional ENOENT check not only in the rx kick
      routine, but also in the other places where we check for the ENODEV
      error.
      Reported-by: Richard Genoud <richard.genoud@gmail.com>
      Debugged-by: Richard Genoud <richard.genoud@gmail.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Tested-by: Richard Genoud <richard.genoud@gmail.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      769bfea5
    • Aleksandar Markovic's avatar
      MIPS: math-emu: Fix final emulation phase for certain instructions · 085d6651
      Aleksandar Markovic authored
      commit 409fcace upstream.
      
      Fix final phase of <CLASS|MADDF|MSUBF|MAX|MIN|MAXA|MINA>.<D|S>
      emulation. Provide proper generation of SIGFPE signal and updating
      debugfs FP exception stats in cases of any exception flags set in
      preceding phases of emulation.
      
      CLASS.<D|S> instruction may generate "Unimplemented Operation" FP
      exception. <MADDF|MSUBF>.<D|S> instructions may generate "Inexact",
      "Unimplemented Operation", "Invalid Operation", "Overflow", and
      "Underflow" FP exceptions. <MAX|MIN|MAXA|MINA>.<D|S> instructions
      can generate "Unimplemented Operation" and "Invalid Operation" FP
      exceptions.
      
      The proper final processing of the cases when any FP exception
      flag is set is achieved by replacing "break" statement with "goto
      copcsr" statement. With such solution, this patch brings the final
      phase of emulation of the above instructions consistent with the
      one corresponding to the previously implemented emulation of other
      related FPU instructions (ADD, SUB, etc.).
      
      Fixes: 38db37ba ("MIPS: math-emu: Add support for the MIPS R6 CLASS FPU instruction")
      Fixes: e24c3bec ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
      Fixes: 83d43305 ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")
      Fixes: a79f5f9b ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
      Fixes: 4e9561b2 ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")
      Signed-off-by: Aleksandar Markovic <aleksandar.markovic@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Douglas Leung <douglas.leung@mips.com>
      Cc: Goran Ferenc <goran.ferenc@mips.com>
      Cc: "Maciej W. Rozycki" <macro@imgtec.com>
      Cc: Miodrag Dinic <miodrag.dinic@mips.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petar Jovanovic <petar.jovanovic@mips.com>
      Cc: Raghu Gandham <raghu.gandham@mips.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/17581/
      Signed-off-by: James Hogan <jhogan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      085d6651
    • Mirko Parthey's avatar
      MIPS: BCM47XX: Fix LED inversion for WRT54GSv1 · 8d187fa8
      Mirko Parthey authored
      commit 56a46acf upstream.
      
      The WLAN LED on the Linksys WRT54GSv1 is active low, but the software
      treats it as active high. Fix the inverted logic.
      
      Fixes: 7bb26b16 ("MIPS: BCM47xx: Fix LEDs on WRT54GS V1.0")
      Signed-off-by: Mirko Parthey <mirko.parthey@web.de>
      Looks-ok-by: Rafał Miłecki <zajec5@gmail.com>
      Cc: Hauke Mehrtens <hauke@hauke-m.de>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/16071/
      Signed-off-by: James Hogan <jhogan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d187fa8
    • Maciej W. Rozycki's avatar
      MIPS: Fix an n32 core file generation regset support regression · dc3aceed
      Maciej W. Rozycki authored
      commit 547da673 upstream.
      
      Fix a commit 7aeb753b ("MIPS: Implement task_user_regset_view.")
      regression, then activated by commit 6a9c001b ("MIPS: Switch ELF
      core dumper to use regsets."), that caused n32 processes to dump o32
      core files by failing to set the EF_MIPS_ABI2 flag in the ELF core file
      header's `e_flags' member:
      
      $ file tls-core
      tls-core: ELF 32-bit MSB executable, MIPS, N32 MIPS64 rel2 version 1 (SYSV), [...]
      $ ./tls-core
      Aborted (core dumped)
      $ file core
      core: ELF 32-bit MSB core file MIPS, MIPS-I version 1 (SYSV), SVR4-style
      $
      
      Previously the flag was set as the result of a:
      
          #define ELF_CORE_EFLAGS EF_MIPS_ABI2
      
      statement placed in arch/mips/kernel/binfmt_elfn32.c, however in the
      regset case, i.e. when CORE_DUMP_USE_REGSET is set, ELF_CORE_EFLAGS is
      no longer used by `fill_note_info' in fs/binfmt_elf.c, and instead the
      `->e_flags' member of the regset view chosen is.  We have the views
      defined in arch/mips/kernel/ptrace.c, however only an o32 and an n64
      one, and the latter is used for n32 as well.  Consequently an o32 core
      file is incorrectly dumped from n32 processes (the ELF32 vs ELF64 class
      is chosen elsewhere, and the 32-bit one is correctly selected for n32).
      
      Correct the issue then by defining an n32 regset view and using it as
      appropriate.  Issue discovered in GDB testing.
      
      Fixes: 7aeb753b ("MIPS: Implement task_user_regset_view.")
      Signed-off-by: Maciej W. Rozycki <macro@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Djordje Todorovic <djordje.todorovic@rt-rk.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/17617/
      Signed-off-by: James Hogan <jhogan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc3aceed
    • Masahiro Yamada's avatar
      MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry · 43bce9f2
      Masahiro Yamada authored
      commit 3cad14d5 upstream.
      
      arch/mips/boot/dts/brcm/bcm96358nb4ser.dts does not exist, so
      we cannot build bcm96358nb4ser.dtb .
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Fixes: 69583551 ("MIPS: BMIPS: rename bcm96358nb4ser to bcm6358-neufbox4-sercom")
      Acked-by: James Hogan <jhogan@kernel.org>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      43bce9f2
    • James Hogan's avatar
      MIPS: Fix MIPS64 FP save/restore on 32-bit kernels · d6353404
      James Hogan authored
      commit 22b8ba76 upstream.
      
      32-bit kernels can be configured to support MIPS64, in which case
      neither CONFIG_64BIT nor CONFIG_CPU_MIPS32_R* will be set. This causes
      the CP0_Status.FR checks at the point of floating point register save
      and restore to be compiled out, which results in odd FP registers not
      being saved or restored to the task or signal context even when
      CP0_Status.FR is set.
      
      Fix the ifdefs to use CONFIG_CPU_MIPSR2 and CONFIG_CPU_MIPSR6, which are
      enabled for the relevant revisions of either MIPS32 or MIPS64, along
      with some other CPUs such as Octeon (r2), Loongson1 (r2), XLP (r2),
      Loongson 3A R2.
      
      The suspect code originates from commit 597ce172 ("MIPS: Support for
      64-bit FP with O32 binaries") in v3.14, however the code in
      __enable_fpu() was consistent and refused to set FR=1, falling back to
      software FPU emulation. This was suboptimal but should be functionally
      correct.
      
      Commit fcc53b5f ("MIPS: fpu.h: Allow 64-bit FPU on a 64-bit MIPS R6
      CPU") in v4.2 (and stable tagged back to 4.0) later introduced the bug
      by updating __enable_fpu() to set FR=1 but failing to update the other
      similar ifdefs to enable FR=1 state handling.
      
      Fixes: fcc53b5f ("MIPS: fpu.h: Allow 64-bit FPU on a 64-bit MIPS R6 CPU")
      Signed-off-by: James Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/16739/
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6353404
    • James Hogan's avatar
      MIPS: Fix odd fp register warnings with MIPS64r2 · 43292e65
      James Hogan authored
      commit c7fd89a6 upstream.
      
      Building 32-bit MIPS64r2 kernels produces warnings like the following
      on certain toolchains (such as GNU assembler 2.24.90, but not GNU
      assembler 2.28.51) since commit 22b8ba76 ("MIPS: Fix MIPS64 FP
      save/restore on 32-bit kernels"), due to the exposure of fpu_save_16odd
      from fpu_save_double and fpu_restore_16odd from fpu_restore_double:
      
      arch/mips/kernel/r4k_fpu.S:47: Warning: float register should be even, was 1
      ...
      arch/mips/kernel/r4k_fpu.S:59: Warning: float register should be even, was 1
      ...
      
      This appears to be because .set mips64r2 does not change the FPU ABI to
      64-bit when -march=mips64r2 (or e.g. -march=xlp) is provided on the
      command line on that toolchain, from the default FPU ABI of 32-bit due
      to the -mabi=32. This makes access to the odd FPU registers invalid.
      
      Fix by explicitly changing the FPU ABI with .set fp=64 directives in
      fpu_save_16odd and fpu_restore_16odd, and moving the undefine of fp up
      in asmmacro.h so fp doesn't turn into $30.
      
      Fixes: 22b8ba76 ("MIPS: Fix MIPS64 FP save/restore on 32-bit kernels")
      Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/17656/
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      43292e65
    • Mike Snitzer's avatar
      dm: discard support requires all targets in a table support discards · e39516d2
      Mike Snitzer authored
      commit 8a74d29d upstream.
      
      A DM device with a mix of discard capabilities (due to some underlying
      devices not having discard support) _should_ just return -EOPNOTSUPP for
      the region of the device that doesn't support discards (even if only by
      way of the underlying driver formally not supporting discards).  BUT,
      that does ask the underlying driver to handle something that it never
      advertised support for.  In doing so we're exposing users to the
      potential for an underlying disk driver hanging if/when a discard is
      issued to a device that is incapable and never claimed to support
      discards.
      
      Fix this by requiring that each DM target in a DM table provide discard
      support as a prereq for a DM device to advertise support for discards.
      
      This may cause some configurations that were happily supporting discards
      (even in the face of a mix of discard support) to stop supporting
      discards -- but the risk of users hitting driver hangs, and forced
      reboots, outweighs supporting those fringe mixed discard
      configurations.
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e39516d2
    • Hou Tao's avatar
      dm: fix race between dm_get_from_kobject() and __dm_destroy() · 3bfb87ec
      Hou Tao authored
      commit b9a41d21 upstream.
      
      The following BUG_ON was hit when testing repeat creation and removal of
      DM devices:
      
          kernel BUG at drivers/md/dm.c:2919!
          CPU: 7 PID: 750 Comm: systemd-udevd Not tainted 4.1.44
          Call Trace:
           [<ffffffff81649e8b>] dm_get_from_kobject+0x34/0x3a
           [<ffffffff81650ef1>] dm_attr_show+0x2b/0x5e
           [<ffffffff817b46d1>] ? mutex_lock+0x26/0x44
           [<ffffffff811df7f5>] sysfs_kf_seq_show+0x83/0xcf
           [<ffffffff811de257>] kernfs_seq_show+0x23/0x25
           [<ffffffff81199118>] seq_read+0x16f/0x325
           [<ffffffff811de994>] kernfs_fop_read+0x3a/0x13f
           [<ffffffff8117b625>] __vfs_read+0x26/0x9d
           [<ffffffff8130eb59>] ? security_file_permission+0x3c/0x44
           [<ffffffff8117bdb8>] ? rw_verify_area+0x83/0xd9
           [<ffffffff8117be9d>] vfs_read+0x8f/0xcf
           [<ffffffff81193e34>] ? __fdget_pos+0x12/0x41
           [<ffffffff8117c686>] SyS_read+0x4b/0x76
           [<ffffffff817b606e>] system_call_fastpath+0x12/0x71
      
      The bug can be easily triggered, if an extra delay (e.g. 10ms) is added
      between the test of DMF_FREEING & DMF_DELETING and dm_get() in
      dm_get_from_kobject().
      
      To fix it, we need to ensure the test of DMF_FREEING & DMF_DELETING and
      dm_get() are done in an atomic way, so _minor_lock is used.
      
      The other callers of dm_get() have also been checked to be OK: some
      callers invoke dm_get() under _minor_lock, some callers invoke it under
      _hash_lock, and dm_start_request() invokes it after increasing
      md->open_count.
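The fix pattern can be sketched in a small userspace analogue (struct fields and flag names are simplified from drivers/md/dm.c; `get_md_sketch()` is an illustrative stand-in, not the kernel function): the flag test and the reference grab happen under the same lock that the destroy path takes, so the device cannot be freed between the check and the dm_get().

```c
#include <pthread.h>
#include <stddef.h>

#define DMF_FREEING  0
#define DMF_DELETING 1

struct mapped_device {
    unsigned long flags;
    int holders;
};

static pthread_mutex_t _minor_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns md with a reference held, or NULL if teardown is racing.
 * Both the flag check and the reference increment sit inside the
 * same critical section, closing the window the BUG_ON exposed. */
static struct mapped_device *get_md_sketch(struct mapped_device *md)
{
    pthread_mutex_lock(&_minor_lock);
    if (md->flags & ((1UL << DMF_FREEING) | (1UL << DMF_DELETING))) {
        pthread_mutex_unlock(&_minor_lock);
        return NULL;            /* destroy in progress: bail out */
    }
    md->holders++;              /* dm_get() equivalent */
    pthread_mutex_unlock(&_minor_lock);
    return md;
}
```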
      Signed-off-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3bfb87ec
    • John Crispin's avatar
      MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver · 9be341ed
      John Crispin authored
      commit 8593b18a upstream.
      
      Switch the printk() call to the preferred pr_warn() api.
      
      Fixes: 7e5873d3 ("MIPS: pci: Add MT7620a PCIE driver")
      Signed-off-by: default avatarJohn Crispin <john@phrozen.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/15321/
      Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9be341ed
    • Steven Rostedt (Red Hat)'s avatar
      sched/rt: Simplify the IPI based RT balancing logic · f17c786b
      Steven Rostedt (Red Hat) authored
      commit 4bdced5c upstream.
      
      When a CPU lowers its priority (schedules out a high priority task for a
      lower priority one), a check is made to see if any other CPU has overloaded
      RT tasks (more than one). It checks the rto_mask to determine this and if so
      it will request to pull one of those tasks to itself if the non running RT
      task is of higher priority than the new priority of the next task to run on
      the current CPU.
      
      When we deal with a large number of CPUs, the original pull logic suffered
      from large lock contention on a single CPU run queue, which caused a huge
      latency across all CPUs. This was caused by only having one CPU having
      overloaded RT tasks and a bunch of other CPUs lowering their priority. To
      solve this issue, commit:
      
        b6366f04 ("sched/rt: Use IPI to trigger RT task push migration instead of pulling")
      
      changed the way to request a pull. Instead of grabbing the lock of the
      overloaded CPU's runqueue, it simply sent an IPI to that CPU to do the work.
      
      Although the IPI logic worked very well in removing the large latency build
      up, it still could suffer from a large number of IPIs being sent to a single
      CPU. On an 80 CPU box, I measured over 200us of processing IPIs. Worse yet,
      when I tested this on a 120 CPU box, with a stress test that had lots of
      RT tasks scheduling on all CPUs, it actually triggered the hard lockup
      detector! One CPU had so many IPIs sent to it, and due to the restart
      mechanism that is triggered when the source run queue has a priority status
      change, the CPU spent minutes! processing the IPIs.
      
      Thinking about this further, I realized there's no reason for each run queue
      to send its own IPI. As all CPUs with overloaded tasks must be scanned
      regardless of whether one or many CPUs are lowering their priority, because
      there's no current way to find the CPU with the highest priority task that
      can schedule to one of these CPUs, there really only needs to be one IPI
      being sent around at a time.
      
      This greatly simplifies the code!
      
      The new approach is to have each root domain have its own irq work, as the
      rto_mask is per root domain. The root domain has the following fields
      attached to it:
      
        rto_push_work	 - the irq work to process each CPU set in rto_mask
        rto_lock	 - the lock to protect some of the other rto fields
        rto_loop_start - an atomic that keeps contention down on rto_lock
      		    the first CPU scheduling in a lower priority task
      		    is the one to kick off the process.
        rto_loop_next	 - an atomic that gets incremented for each CPU that
      		    schedules in a lower priority task.
        rto_loop	 - a variable protected by rto_lock that is used to
      		    compare against rto_loop_next
        rto_cpu	 - The cpu to send the next IPI to, also protected by
      		    the rto_lock.
      
      When a CPU schedules in a lower priority task and wants to make sure
      overloaded CPUs know about it, it increments rto_loop_next. Then it
      atomically sets rto_loop_start with a cmpxchg. If the old value is not "0",
      then it is done, as another CPU is kicking off the IPI loop. If the old
      value is "0", then it will take the rto_lock to synchronize with a possible
      IPI being sent around to the overloaded CPUs.
      
      If rto_cpu is greater than or equal to nr_cpu_ids, then there's either no
      IPI being sent around, or one is about to finish. Then rto_cpu is set to the
      first CPU in rto_mask and an IPI is sent to that CPU. If there are no CPUs set
      in rto_mask, then there's nothing to be done.
      
      When the CPU receives the IPI, it will first try to push any RT tasks that are
      queued on the CPU but can't run because a higher priority RT task is
      currently running on that CPU.
      
      Then it takes the rto_lock and looks for the next CPU in the rto_mask. If it
      finds one, it simply sends an IPI to that CPU and the process continues.
      
      If there's no more CPUs in the rto_mask, then rto_loop is compared with
      rto_loop_next. If they match, everything is done and the process is over. If
      they do not match, then a CPU scheduled in a lower priority task while the IPI
      was being passed around, and the process needs to start again. The first CPU
      in rto_mask is sent the IPI.
      
      This change removes this duplication of work in the IPI logic, and greatly
      lowers the latency caused by the IPIs. This removed the lockup happening on
      the 120 CPU machine. It also simplifies the code tremendously. What else
      could anyone ask for?
      
      Thanks to Peter Zijlstra for simplifying the rto_loop_start atomic logic and
      supplying me with the rto_start_trylock() and rto_start_unlock() helper
      functions.
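The rto_loop_start gate described above can be sketched with C11 atomics. The helper names rto_start_trylock()/rto_start_unlock() come from the commit message; the bodies here are an illustration of the cmpxchg idea, not the kernel source. Only the first CPU to lower its priority starts the IPI loop; later CPUs just bump rto_loop_next and return.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int rto_loop_start;
static atomic_int rto_loop_next;

static bool rto_start_trylock(atomic_int *v)
{
    int expected = 0;
    /* cmpxchg(v, 0, 1): succeeds only when no loop is being started */
    return atomic_compare_exchange_strong(v, &expected, 1);
}

static void rto_start_unlock(atomic_int *v)
{
    atomic_store(v, 0);
}

/* Called when a CPU schedules in a lower priority task. Returns true
 * if this CPU is the one that kicks off the IPI loop. */
static bool tell_cpus_sketch(void)
{
    atomic_fetch_add(&rto_loop_next, 1);   /* record the state change */
    if (!rto_start_trylock(&rto_loop_start))
        return false;    /* another CPU is already running the loop */
    /* ...take rto_lock, pick the first CPU in rto_mask, send IPI... */
    return true;
}
```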
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott Wood <swood@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170424114732.1aac6dc4@gandalf.local.home
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f17c786b
    • Mikulas Patocka's avatar
      dm: allocate struct mapped_device with kvzalloc · 2bf483c9
      Mikulas Patocka authored
      commit 856eb091 upstream.
      
      The structure srcu_struct can be very big, its size is proportional to the
      value CONFIG_NR_CPUS. The Fedora kernel has CONFIG_NR_CPUS 8192, the field
      io_barrier in the struct mapped_device occupies 84kB in the debugging kernel
      and 50kB in the non-debugging kernel. The large size may result in failure
      of the function kzalloc_node.
      
      In order to avoid the allocation failure, we use the function
      kvzalloc_node, which falls back to vmalloc if a large contiguous
      chunk of memory is not available. This patch also moves the field
      io_barrier to the last position of struct mapped_device - the reason is
      that on many processor architectures, short memory offsets result in
      smaller code than long memory offsets - on x86-64 it reduces code size by
      320 bytes.
      
      Note to stable kernel maintainers - the kernels 4.11 and older don't have
      the function kvzalloc_node, you can use the function vzalloc_node instead.
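A userspace analogue of the kvzalloc_node() pattern, with stub allocators (the stubs and the size limit are illustrative, not kernel code): try the "physically contiguous" allocator first and fall back to the vmalloc-style one when a large contiguous chunk is unavailable.

```c
#include <stdlib.h>

#define CONTIG_LIMIT_SKETCH (64 * 1024)  /* pretend kmalloc limit */

static void *kzalloc_stub(size_t size)
{
    if (size > CONTIG_LIMIT_SKETCH)
        return NULL;          /* simulate contiguous-alloc failure */
    return calloc(1, size);
}

static void *vzalloc_stub(size_t size)
{
    return calloc(1, size);   /* only virtually contiguous needed */
}

/* Zeroed allocation that degrades gracefully for large sizes. */
static void *kvzalloc_sketch(size_t size)
{
    void *p = kzalloc_stub(size);
    return p ? p : vzalloc_stub(size);
}
```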
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bf483c9
    • Vivek Goyal's avatar
      ovl: Put upperdentry if ovl_check_origin() fails · 13e65600
      Vivek Goyal authored
      commit 5455f92b upstream.
      
      If ovl_check_origin() fails, we should put upperdentry. We have a reference
      on it by now. So goto out_put_upper instead of out.
      
      Fixes: a9d01957 ("ovl: lookup non-dir copy-up-origin by file handle")
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      13e65600
    • Eric Biggers's avatar
      dm bufio: fix integer overflow when limiting maximum cache size · 08720bf9
      Eric Biggers authored
      commit 74d4108d upstream.
      
      The default max_cache_size_bytes for dm-bufio is meant to be the lesser
      of 25% of the size of the vmalloc area and 2% of the size of lowmem.
      However, on 32-bit systems the intermediate result in the expression
      
          (VMALLOC_END - VMALLOC_START) * DM_BUFIO_VMALLOC_PERCENT / 100
      
      overflows, causing the wrong result to be computed.  For example, on a
      32-bit system where the vmalloc area is 520093696 bytes, the result is
      1174405 rather than the expected 130023424, which makes the maximum
      cache size much too small (far less than 2% of lowmem).  This causes
      severe performance problems for dm-verity users on affected systems.
      
      Fix this by using mult_frac() to correctly multiply by a percentage.  Do
      this for all places in dm-bufio that multiply by a percentage.  Also
      replace (VMALLOC_END - VMALLOC_START) with VMALLOC_TOTAL, which contrary
      to the comment is now defined in include/linux/vmalloc.h.
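The overflow and the fix can be reproduced in a small sketch, with uint32_t standing in for 32-bit kernel arithmetic (the macro below mirrors the kernel's mult_frac() helper; the function names are illustrative):

```c
#include <stdint.h>

/* Computes x * numer / denom without overflowing the intermediate
 * product, by splitting x into quotient and remainder w.r.t. denom. */
#define mult_frac(x, numer, denom) ({             \
    uint32_t _q = (x) / (denom);                  \
    uint32_t _r = (x) % (denom);                  \
    (_q * (numer)) + ((_r * (numer)) / (denom));  \
})

/* Pre-fix shape of the computation: the product wraps modulo 2^32. */
static uint32_t naive_percent(uint32_t total, uint32_t percent)
{
    return total * percent / 100;
}

/* With the vmalloc area size quoted above (520093696 bytes):
 *   naive_percent(520093696, 25)     -> 1174405   (wrong)
 *   mult_frac(520093696u, 25u, 100u) -> 130023424 (the expected 25%)
 */
```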
      
      Depends-on: 9993bc63 ("sched/x86: Fix overflow in cyc2ns_offset")
      Fixes: 95d402f0 ("dm: add bufio")
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08720bf9
    • Ming Lei's avatar
      dm mpath: remove annoying message of 'blk_get_request() returned -11' · 4d7a55f5
      Ming Lei authored
      commit 9dc112e2 upstream.
      
      It is very normal to see allocation failure, especially with blk-mq
      request_queues, so it's unnecessary to report this error and annoy
      people.
      
      In practice this 'blk_get_request() returned -11' error gets logged
      quite frequently when a blk-mq DM multipath device sees heavy IO.
      
      This change is marked for stable@ because the annoying message in
      question was included in stable@ commit 7083abbb.
      
      Fixes: 7083abbb ("dm mpath: avoid that path removal can trigger an infinite loop")
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d7a55f5
    • Damien Le Moal's avatar
      dm zoned: ignore last smaller runt zone · ac29afdb
      Damien Le Moal authored
      commit 114e0259 upstream.
      
      The SCSI layer allows ZBC drives to have a smaller last runt zone. For
      such a device, specifying the entire capacity for a dm-zoned target
      table entry fails because the specified capacity is not aligned on a
      device zone size indicated in the request queue structure of the
      device.
      
      Fix this problem by ignoring the last runt zone in the entry length
      when setting up the dm-zoned target (ctr method) and when iterating table
      entries of the target (iterate_devices method). This allows dm-zoned
      users to still easily setup a target using the entire device capacity
      (as mandated by dm-zoned) or the aligned capacity excluding the last
      runt zone.
      
      While at it, replace direct references to the device queue chunk_sectors
      limit with calls to the accessor blk_queue_zone_sectors().
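The trimming amounts to rounding the device length down to a whole number of zones, which can be sketched as follows (the helper name is illustrative, not the dm-zoned source):

```c
#include <stdint.h>

/* Trim a device length to a whole number of zones so that a smaller
 * last runt zone is ignored when validating the target capacity. */
static uint64_t usable_zone_sectors(uint64_t dev_sectors,
                                    uint64_t zone_sectors)
{
    return dev_sectors - (dev_sectors % zone_sectors);
}

/* Example: a drive with 524288-sector (256 MiB) zones and a
 * 1000-sector runt zone at the end; the runt tail is dropped, so a
 * table entry covering either length validates against the same
 * zone-aligned capacity. */
```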
      Reported-by: default avatarPeter Desnoyers <pjd@ccs.neu.edu>
      Signed-off-by: default avatarDamien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac29afdb
    • Mikulas Patocka's avatar
      dm crypt: allow unaligned bv_offset · 8f71f493
      Mikulas Patocka authored
      commit 0440d5c0 upstream.
      
      When slub_debug is enabled kmalloc returns unaligned memory. XFS uses
      this unaligned memory for its buffers (if an unaligned buffer crosses a
      page, XFS frees it and allocates a full page instead - see the function
      xfs_buf_allocate_memory).
      
      dm-crypt checks if bv_offset is aligned on page size and these checks
      fail with slub_debug and XFS.
      
      Fix this bug by removing the bv_offset checks. Switch to checking if
      bv_len is aligned instead of bv_offset (this check should be sufficient
      to prevent overruns if a bio with too small bv_len is received).
      
      Fixes: 8f0009a2 ("dm crypt: optionally support larger encryption sector size")
      Reported-by: default avatarBruno Prémont <bonbons@sysophe.eu>
      Tested-by: default avatarBruno Prémont <bonbons@sysophe.eu>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: default avatarMilan Broz <gmazyland@gmail.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8f71f493
    • Joe Thornber's avatar
      dm cache: fix race condition in the writeback mode overwrite_bio optimisation · ec544ec9
      Joe Thornber authored
      commit d1260e2a upstream.
      
      When a DM cache in writeback mode moves data between the slow and fast
      device it can often avoid a copy if the triggering bio either:
      
      i) covers the whole block (no point copying if we're about to overwrite it)
      ii) the migration is a promotion and the origin block is currently discarded
      
      Prior to this fix there was a race with case (ii).  The discard status
      was checked with a shared lock held (rather than exclusive).  This meant
      another bio could run in parallel and write data to the origin, removing
      the discard state.  After the promotion the parallel write would have
      been lost.
      
      With this fix the discard status is re-checked once the exclusive lock
      has been acquired.  If the block is no longer discarded it falls back to
      the slower full copy path.
      
      Fixes: b29d4986 ("dm cache: significant rework to leverage dm-bio-prison-v2")
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ec544ec9
    • Mikulas Patocka's avatar
      dm integrity: allow unaligned bv_offset · a502cd2d
      Mikulas Patocka authored
      commit 95b1369a upstream.
      
      When slub_debug is enabled kmalloc returns unaligned memory. XFS uses
      this unaligned memory for its buffers (if an unaligned buffer crosses a
      page, XFS frees it and allocates a full page instead - see the function
      xfs_buf_allocate_memory).
      
      dm-integrity checks if bv_offset is aligned on page size and this check
      fails with slub_debug and XFS.
      
      Fix this bug by removing the bv_offset check, leaving only the check for
      bv_len.
      
      Fixes: 7eada909 ("dm: add integrity target")
      Reported-by: default avatarBruno Prémont <bonbons@sysophe.eu>
      Reviewed-by: default avatarMilan Broz <gmazyland@gmail.com>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a502cd2d
    • Vijendar Mukunda's avatar
      ALSA: hda: Add Raven PCI ID · ca90f34e
      Vijendar Mukunda authored
      commit 9ceace3c upstream.
      
      This commit adds the PCI ID for the Raven platform.
      Signed-off-by: default avatarVijendar Mukunda <Vijendar.Mukunda@amd.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca90f34e
    • Vadim Lomovtsev's avatar
      PCI: Apply Cavium ThunderX ACS quirk to more Root Ports · a529422a
      Vadim Lomovtsev authored
      commit f2ddaf8d upstream.
      
      Extend the Cavium ThunderX ACS quirk to cover more device IDs and restrict
      it to only Root Ports.
      Signed-off-by: default avatarVadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
      [bhelgaas: changelog, stable tag]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a529422a
    • Vadim Lomovtsev's avatar
      PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF · f97ca607
      Vadim Lomovtsev authored
      commit 7f342678 upstream.
      
      The Cavium ThunderX (CN8XXX) family of PCIe Root Ports does not advertise
      an ACS capability.  However, the RTL internally implements similar
      protection as if ACS had Request Redirection, Completion Redirection,
      Source Validation, and Upstream Forwarding features enabled.
      
      Change Cavium ACS capabilities quirk flags accordingly.
      
      Fixes: b404bcfb ("PCI: Add ACS quirk for all Cavium devices")
      Signed-off-by: default avatarVadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
      [bhelgaas: tidy changelog, comment, stable tag]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f97ca607