1. 22 Sep, 2020 10 commits
      fscrypt: make "#define fscrypt_policy" user-only · c7f0207b
      Eric Biggers authored
      The fscrypt UAPI header defines fscrypt_policy to fscrypt_policy_v1,
      for source compatibility with old userspace programs.
      
      Internally, the kernel doesn't want that compatibility definition.
      Instead, fscrypt_private.h #undefs it and re-defines it to a union.
      
      That works for now.  However, in order to add
      fscrypt_operations::get_dummy_policy(), we'll need to forward declare
      'union fscrypt_policy' in include/linux/fscrypt.h.  That would cause
      build errors because "fscrypt_policy" is used in ioctl numbers.
      
      To avoid this, modify the UAPI header to make the fscrypt_policy
      compatibility definition conditional on !__KERNEL__, and make the ioctls
      use fscrypt_policy_v1 instead of fscrypt_policy.
      
      Note that this doesn't change the actual ioctl numbers.
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-11-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
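      A minimal userspace sketch of why switching the ioctl definitions to
      the v1 struct name leaves the ioctl numbers unchanged, assuming an
      encoding modeled on the common asm-generic layout (direction, size,
      type, number).  Every demo_* and DEMO_* name is a hypothetical
      stand-in, not the real UAPI header; the 12-byte struct is used only
      for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical stand-in for a v1 policy struct: 12 bytes, no padding. */
struct demo_policy_v1 {
    uint8_t version;
    uint8_t contents_mode;
    uint8_t filenames_mode;
    uint8_t flags;
    uint8_t master_key_descriptor[8];
};
static_assert(sizeof(struct demo_policy_v1) == 12, "unexpected padding");

/*
 * Compatibility alias, visible only when not building "kernel" code.
 * DEMO_KERNEL plays the role that __KERNEL__ plays in the commit message.
 */
#ifndef DEMO_KERNEL
#define demo_policy demo_policy_v1
#endif

/* Assumed encoding: dir in bits 30-31, size in 16-29, type in 8-15, nr in 0-7. */
#define DEMO_IOC(dir, type, nr, size)                      \
    (((uint32_t)(dir) << 30) | ((uint32_t)(size) << 16) |  \
     ((uint32_t)(type) << 8) | (uint32_t)(nr))
#define DEMO_IOR(type, nr, argtype) DEMO_IOC(2u, (type), (nr), sizeof(argtype))

/* The ioctl number spelled with the explicit v1 name. */
uint32_t demo_ioc_via_v1_name(void)
{
    return DEMO_IOR('f', 19, struct demo_policy_v1);
}

#ifndef DEMO_KERNEL
/* The same number spelled through the userspace-only alias. */
uint32_t demo_ioc_via_alias(void)
{
    return DEMO_IOR('f', 19, struct demo_policy);
}
#endif
```

      Because the number depends only on the direction, type, sequence
      number, and sizeof() of the argument, both spellings yield the same
      value in this model.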
      fscrypt: stop pretending that key setup is nofs-safe · 9dad5feb
      Eric Biggers authored
      fscrypt_get_encryption_info() has never actually been safe to call in a
      context that needs GFP_NOFS, since it calls crypto_alloc_skcipher().
      
      crypto_alloc_skcipher() isn't GFP_NOFS-safe, even if called under
      memalloc_nofs_save().  This is because it may load kernel modules, and
      also because it internally takes crypto_alg_sem.  Other tasks can do
      GFP_KERNEL allocations while holding crypto_alg_sem for write.
      
      The use of fscrypt_init_mutex isn't GFP_NOFS-safe either.
      
      So, stop pretending that fscrypt_get_encryption_info() is nofs-safe.
      I.e., when it allocates memory, just use GFP_KERNEL instead of GFP_NOFS.
      
      Note, another reason to do this is that GFP_NOFS is deprecated in favor
      of using memalloc_nofs_save() in the proper places.
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-10-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      fscrypt: require that fscrypt_encrypt_symlink() already has key · 4cc1a3e7
      Eric Biggers authored
      Now that all filesystems have been converted to use
      fscrypt_prepare_new_inode(), the encryption key for new symlink inodes
      is now already set up whenever we try to encrypt the symlink target.
      Enforce this rather than try to set up the key again when it may be too
      late to do so safely.
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-9-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      fscrypt: remove fscrypt_inherit_context() · e9d5e31d
      Eric Biggers authored
      Now that all filesystems have been converted to use
      fscrypt_prepare_new_inode() and fscrypt_set_context(),
      fscrypt_inherit_context() is no longer used.  Remove it.
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-8-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      fscrypt: adjust logging for in-creation inodes · ae9ff8ad
      Eric Biggers authored
      Now that a fscrypt_info may be set up for inodes that are currently
      being created and haven't yet had an inode number assigned, avoid
      logging confusing messages about "inode 0".
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-7-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      ubifs: use fscrypt_prepare_new_inode() and fscrypt_set_context() · 4c030fa8
      Eric Biggers authored
      Convert ubifs to use the new functions fscrypt_prepare_new_inode() and
      fscrypt_set_context().
      
      Unlike ext4 and f2fs, this doesn't appear to fix any deadlock bug.  But
      it does shorten the code slightly and get all filesystems using the same
      helper functions, so that fscrypt_inherit_context() can be removed.
      
      It also fixes an incorrect error code where ubifs returned EPERM instead
      of the expected ENOKEY.
      
      Link: https://lore.kernel.org/r/20200917041136.178600-6-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      f2fs: use fscrypt_prepare_new_inode() and fscrypt_set_context() · e075b690
      Eric Biggers authored
      Convert f2fs to use the new functions fscrypt_prepare_new_inode() and
      fscrypt_set_context().  This avoids calling
      fscrypt_get_encryption_info() from under f2fs_lock_op(), which can
      deadlock because fscrypt_get_encryption_info() isn't GFP_NOFS-safe.
      
      For more details about this problem, see the earlier patch
      "fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()".
      
      This also fixes a f2fs-specific deadlock when the filesystem is mounted
      with '-o test_dummy_encryption' and a file is created in an unencrypted
      directory other than the root directory:
      
          INFO: task touch:207 blocked for more than 30 seconds.
                Not tainted 5.9.0-rc4-00099-g729e3d09 #2
          "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          task:touch           state:D stack:    0 pid:  207 ppid:   167 flags:0x00000000
          Call Trace:
           [...]
           lock_page include/linux/pagemap.h:548 [inline]
           pagecache_get_page+0x25e/0x310 mm/filemap.c:1682
           find_or_create_page include/linux/pagemap.h:348 [inline]
           grab_cache_page include/linux/pagemap.h:424 [inline]
           f2fs_grab_cache_page fs/f2fs/f2fs.h:2395 [inline]
           f2fs_grab_cache_page fs/f2fs/f2fs.h:2373 [inline]
           __get_node_page.part.0+0x39/0x2d0 fs/f2fs/node.c:1350
           __get_node_page fs/f2fs/node.c:35 [inline]
           f2fs_get_node_page+0x2e/0x60 fs/f2fs/node.c:1399
           read_inline_xattr+0x88/0x140 fs/f2fs/xattr.c:288
           lookup_all_xattrs+0x1f9/0x2c0 fs/f2fs/xattr.c:344
           f2fs_getxattr+0x9b/0x160 fs/f2fs/xattr.c:532
           f2fs_get_context+0x1e/0x20 fs/f2fs/super.c:2460
           fscrypt_get_encryption_info+0x9b/0x450 fs/crypto/keysetup.c:472
           fscrypt_inherit_context+0x2f/0xb0 fs/crypto/policy.c:640
           f2fs_init_inode_metadata+0xab/0x340 fs/f2fs/dir.c:540
           f2fs_add_inline_entry+0x145/0x390 fs/f2fs/inline.c:621
           f2fs_add_dentry+0x31/0x80 fs/f2fs/dir.c:757
           f2fs_do_add_link+0xcd/0x130 fs/f2fs/dir.c:798
           f2fs_add_link fs/f2fs/f2fs.h:3234 [inline]
           f2fs_create+0x104/0x290 fs/f2fs/namei.c:344
           lookup_open.isra.0+0x2de/0x500 fs/namei.c:3103
           open_last_lookups+0xa9/0x340 fs/namei.c:3177
           path_openat+0x8f/0x1b0 fs/namei.c:3365
           do_filp_open+0x87/0x130 fs/namei.c:3395
           do_sys_openat2+0x96/0x150 fs/open.c:1168
           [...]
      
      That happened because f2fs_add_inline_entry() locks the directory
      inode's page in order to add the dentry, then f2fs_get_context() tries
      to lock it recursively in order to read the encryption xattr.  This
      problem is specific to "test_dummy_encryption" because normally the
      directory's fscrypt_info would be set up prior to
      f2fs_add_inline_entry() in order to encrypt the new filename.
      
      Regardless, the new design fixes this test_dummy_encryption deadlock as
      well as potential deadlocks with fs reclaim, by setting up any needed
      fscrypt_info structs prior to taking so many locks.
      
      The test_dummy_encryption deadlock was reported by Daniel Rosenberg.
      Reported-by: Daniel Rosenberg <drosen@google.com>
      Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-5-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
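      The recursive page lock described above can be modeled with a toy
      userspace sketch.  This is not f2fs code: every demo_* name is
      hypothetical, and the non-recursive page lock is reduced to a flag
      that returns -EDEADLK where a real sleeping lock would hang.

```c
#include <assert.h>   /* for assert() in the usage note */
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the directory inode's page and its non-recursive lock. */
struct demo_page {
    bool locked;
};

/* Returns 0 on success, -EDEADLK where a real lock would block forever. */
int demo_lock_page(struct demo_page *p)
{
    if (p->locked)
        return -EDEADLK;
    p->locked = true;
    return 0;
}

void demo_unlock_page(struct demo_page *p)
{
    p->locked = false;
}

/* Reading the encryption xattr needs the page lock for a short time. */
int demo_read_context(struct demo_page *dir_page)
{
    int err = demo_lock_page(dir_page);

    if (err)
        return err;
    demo_unlock_page(dir_page);
    return 0;
}

/* Old ordering: lock the page to add the dentry, then read the context. */
int demo_create_old(struct demo_page *dir_page)
{
    int err = demo_lock_page(dir_page);

    if (err)
        return err;
    err = demo_read_context(dir_page);   /* tries to lock the same page */
    demo_unlock_page(dir_page);
    return err;
}

/* New ordering: finish all key/context setup before taking the page lock. */
int demo_create_new(struct demo_page *dir_page)
{
    int err = demo_read_context(dir_page);

    if (err)
        return err;
    err = demo_lock_page(dir_page);
    if (err)
        return err;
    /* ... the dentry would be added here ... */
    demo_unlock_page(dir_page);
    return 0;
}
```

      In this model, demo_create_old() on an unlocked page returns
      -EDEADLK, while demo_create_new() returns 0, which mirrors the
      reordering the commit describes.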
      ext4: use fscrypt_prepare_new_inode() and fscrypt_set_context() · 02ce5316
      Eric Biggers authored
      Convert ext4 to use the new functions fscrypt_prepare_new_inode() and
      fscrypt_set_context().  This avoids calling
      fscrypt_get_encryption_info() from within a transaction, which can
      deadlock because fscrypt_get_encryption_info() isn't GFP_NOFS-safe.
      
      For more details about this problem, see the earlier patch
      "fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context()".
      
      Link: https://lore.kernel.org/r/20200917041136.178600-4-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      ext4: factor out ext4_xattr_credits_for_new_inode() · 177cc0e7
      Eric Biggers authored
      To compute a new inode's xattr credits, we need to know whether the
      inode will be encrypted or not.  When we switch to use the new helper
      function fscrypt_prepare_new_inode(), we won't find out whether the
      inode will be encrypted until slightly later than is currently the case.
      That will require moving the code block that computes the xattr credits.
      
      To make this easier and reduce the length of __ext4_new_inode(), move
      this code block into a new function ext4_xattr_credits_for_new_inode().
      
      Link: https://lore.kernel.org/r/20200917041136.178600-3-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      fscrypt: add fscrypt_prepare_new_inode() and fscrypt_set_context() · a992b20c
      Eric Biggers authored
      fscrypt_get_encryption_info() is intended to be GFP_NOFS-safe.  But
      actually it isn't, since it uses functions like crypto_alloc_skcipher()
      which aren't GFP_NOFS-safe, even when called under memalloc_nofs_save().
      Therefore it can deadlock when called from a context that needs
      GFP_NOFS, e.g. during an ext4 transaction or between f2fs_lock_op() and
      f2fs_unlock_op().  This happens when creating a new encrypted file.
      
      We can't fix this by just not setting up the key for new inodes right
      away, since new symlinks need their key to encrypt the symlink target.
      
      So we need to set up the new inode's key before starting the
      transaction.  But just calling fscrypt_get_encryption_info() earlier
      doesn't work, since it assumes the encryption context is already set,
      and the encryption context can't be set until the transaction.
      
      The recently proposed fscrypt support for the ceph filesystem
      (https://lkml.kernel.org/linux-fscrypt/20200821182813.52570-1-jlayton@kernel.org/T/#u)
      will have this same ordering problem too, since ceph will need to
      encrypt new symlinks before setting their encryption context.
      
      Finally, f2fs can deadlock when the filesystem is mounted with
      '-o test_dummy_encryption' and a new file is created in an existing
      unencrypted directory.  Similarly, this is caused by holding too many
      locks when calling fscrypt_get_encryption_info().
      
      To solve all these problems, add new helper functions:
      
      - fscrypt_prepare_new_inode() sets up a new inode's encryption key
        (fscrypt_info), using the parent directory's encryption policy and a
        new random nonce.  It neither reads nor writes the encryption context.
      
      - fscrypt_set_context() persists the encryption context of a new inode,
        using the information from the fscrypt_info already in memory.  This
        replaces fscrypt_inherit_context().
      
      Temporarily keep fscrypt_inherit_context() around until all filesystems
      have been converted to use fscrypt_set_context().
      Acked-by: Jeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/r/20200917041136.178600-2-ebiggers@kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
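      The two-phase ordering described above (set up the key before the
      transaction, persist the context inside it) can be sketched as a toy
      userspace model.  None of this is the kernel API: the demo_* names,
      structs, and return values are hypothetical stand-ins for the roles
      that fscrypt_prepare_new_inode() and fscrypt_set_context() play.

```c
#include <assert.h>   /* for assert() in the usage note */
#include <stdbool.h>

/* Toy inode: tracks only the two facts that matter for the ordering. */
struct demo_inode {
    bool key_ready;          /* in-memory key has been set up */
    bool context_persisted;  /* encryption context written to "disk" */
};

/* Toy filesystem state. */
struct demo_fs {
    bool in_transaction;     /* a context that would need GFP_NOFS */
    int unsafe_key_setups;   /* key setups attempted inside a transaction */
};

/*
 * Phase 1: set up the key.  In this model it is only allowed outside a
 * transaction, because key setup is not safe in a NOFS context.
 */
int demo_prepare_new_inode(struct demo_fs *fs, struct demo_inode *inode)
{
    if (fs->in_transaction) {
        fs->unsafe_key_setups++;
        return -1;
    }
    inode->key_ready = true;
    return 0;
}

/*
 * Phase 2: persist the context from the already prepared in-memory state.
 * It must run inside the transaction and needs no further key setup.
 */
int demo_set_context(struct demo_fs *fs, struct demo_inode *inode)
{
    if (!fs->in_transaction || !inode->key_ready)
        return -1;
    inode->context_persisted = true;
    return 0;
}

/* File creation using the new ordering. */
int demo_create(struct demo_fs *fs, struct demo_inode *inode)
{
    int err = demo_prepare_new_inode(fs, inode);  /* before the transaction */

    if (err)
        return err;
    fs->in_transaction = true;                    /* transaction starts */
    err = demo_set_context(fs, inode);
    fs->in_transaction = false;                   /* transaction ends */
    return err;
}
```

      In this model, demo_create() on zero-initialized state returns 0 with
      both flags set, while calling demo_prepare_new_inode() once
      in_transaction is true fails and bumps unsafe_key_setups.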
  2. 07 Sep, 2020 3 commits
  3. 06 Sep, 2020 4 commits
      Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block · a8205e31
      Linus Torvalds authored
      Pull more io_uring fixes from Jens Axboe:
       "Two followup fixes. One is fixing a regression from this merge window,
        the other is two commits fixing cancelation of deferred requests.
      
        Both have gone through full testing, and both spawned a few new
        regression test additions to liburing.
      
         - Don't play games with const, properly store the output iovec and
           assign it as needed.
      
         - Deferred request cancelation fix (Pavel)"
      
      * tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block:
        io_uring: fix linked deferred ->files cancellation
        io_uring: fix cancel of deferred reqs with ->files
        io_uring: fix explicit async read/write mapping for large segments
      Merge tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 2ccdd9f8
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL
         pointer dereference bug and serialize a hardware register access as
         required by the VT-d spec.
      
       - two patches for AMD IOMMU to force AMD GPUs into translation mode
         when memory encryption is active and disallow using IOMMUv2
         functionality.  This makes the AMDGPU driver work when memory
         encryption is active.
      
       - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping
         Table Entries.
      
       - MAINTAINERS file update for the Qualcomm IOMMU driver.
      
      * tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Handle 36bit addressing for x86-32
        iommu/amd: Do not use IOMMUv2 functionality when SME is active
        iommu/amd: Do not force direct mapping when SME is active
        iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE
        iommu/amd: Restore IRTE.RemapEn bit after programming IRTE
        iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()
        iommu/vt-d: Serialize IOMMU GCMD register modifications
        MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
      Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 015b3155
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - more generic entry code ABI fallout
      
       - debug register handling bugfixes
      
       - fix vmalloc mappings on 32-bit kernels
      
       - kprobes instrumentation output fix on 32-bit kernels
      
       - fix over-eager WARN_ON_ONCE() on !SMAP hardware
      
       - NUMA debugging fix
      
       - fix Clang related crash on !RETPOLINE kernels
      
      * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry: Unbreak 32bit fast syscall
        x86/debug: Allow a single level of #DB recursion
        x86/entry: Fix AC assertion
        tracing/kprobes, x86/ptrace: Fix regs argument order for i386
        x86, fakenuma: Fix invalid starting node ID
        x86/mm/32: Bring back vmalloc faulting on x86_32
        x86/cmdline: Disable jump tables for cmdline.c
      Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 68beef57
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
       "A small series for fixing a problem with Xen PVH guests when running
        as backends (e.g. as dom0).
      
        Mapping other guests' memory is now working via ZONE_DEVICE, thus not
        requiring to abuse the memory hotplug functionality for that purpose"
      
      * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: add helpers to allocate unpopulated memory
        memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC
        xen/balloon: add header guard
  4. 05 Sep, 2020 23 commits