1. 01 Mar, 2019 3 commits
    • Liu Song's avatar
      jbd2: jbd2_get_transaction does not need to return a value · 0df6f469
      Liu Song authored
      In jbd2_get_transaction, a new transaction is initialized,
      and set to the j_running_transaction. No need for a return
      value, so remove it.
      
      Also, adjust some comments to match the actual operation
      of this function.
      Signed-off-by: default avatarLiu Song <liu.song11@zte.com.cn>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      0df6f469
    • luojiajun's avatar
      jbd2: fix invalid descriptor block checksum · 6e876c3d
      luojiajun authored
      In jbd2_journal_commit_transaction(), if we are in abort mode,
      we may flush the buffer without setting descriptor block checksum
      by goto start_journal_io. Then fs is mounted,
      jbd2_descriptor_block_csum_verify() failed.
      
      [  271.379811] EXT4-fs (vdd): shut down requested (2)
      [  271.381827] Aborting journal on device vdd-8.
      [  271.597136] JBD2: Invalid checksum recovering block 22199 in log
      [  271.598023] JBD2: recovery failed
      [  271.598484] EXT4-fs (vdd): error loading journal
      
      Fix this problem by keep setting descriptor block checksum if the
      descriptor buffer is not NULL.
      
      This checksum problem can be reproduced by xfstests generic/388.
      Signed-off-by: default avatarluojiajun <luojiajun3@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      6e876c3d
    • Eric Whitney's avatar
      ext4: fix bigalloc cluster freeing when hole punching under load · 7bd75230
      Eric Whitney authored
      Ext4 may not free clusters correctly when punching holes in bigalloc
      file systems under high load conditions.  If it's not possible to
      extend and restart the journal in ext4_ext_rm_leaf() when preparing to
      remove blocks from a punched region, a retry of the entire punch
      operation is triggered in ext4_ext_remove_space().  This causes a
      partial cluster to be set to the first cluster in the extent found to
      the right of the punched region.  However, if the punch operation
      prior to the retry had made enough progress to delete one or more
      extents and a partial cluster candidate for freeing had already been
      recorded, the retry would overwrite the partial cluster.  The loss of
      this information makes it impossible to correctly free the original
      partial cluster in all cases.
      
      This bug can cause generic/476 to fail when run as part of
      xfstests-bld's bigalloc and bigalloc_1k test cases.  The failure is
      reported when e2fsck detects bad iblocks counts greater than expected
      in units of whole clusters and also detects a number of negative block
      bitmap differences equal to the iblocks discrepancy in cluster units.
      Signed-off-by: default avatarEric Whitney <enwlinux@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      7bd75230
  2. 21 Feb, 2019 7 commits
  3. 14 Feb, 2019 3 commits
    • Andreas Dilger's avatar
      ext4: don't update s_rev_level if not required · c9e716eb
      Andreas Dilger authored
      Don't update the superblock s_rev_level during mount if it isn't
      actually necessary, only if superblock features are being set by
      the kernel.  This was originally added for ext3 since it always
      set the INCOMPAT_RECOVER and HAS_JOURNAL features during mount,
      but this is not needed since no journal mode was added to ext4.
      
      That will allow Geert to mount his 20-year-old ext2 rev 0.0 m68k
      filesystem, as a testament of the backward compatibility of ext4.
      
      Fixes: 0390131b ("ext4: Allow ext4 to run without a journal")
      Signed-off-by: default avatarAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      c9e716eb
    • Theodore Ts'o's avatar
      jbd2: fold jbd2_superblock_csum_{verify,set} into their callers · a58ca992
      Theodore Ts'o authored
      The functions jbd2_superblock_csum_verify() and
      jbd2_superblock_csum_set() only get called from one location, so to
      simplify things, fold them into their callers.
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      a58ca992
    • Theodore Ts'o's avatar
      jbd2: fix race when writing superblock · 538bcaa6
      Theodore Ts'o authored
      The jbd2 superblock is lockless now, so there is probably a race
      condition between writing it so disk and modifing contents of it, which
      may lead to checksum error. The following race is the one case that we
      have captured.
      
      jbd2                                fsstress
      jbd2_journal_commit_transaction
       jbd2_journal_update_sb_log_tail
        jbd2_write_superblock
         jbd2_superblock_csum_set         jbd2_journal_revoke
                                           jbd2_journal_set_features(revork)
                                           modify superblock
         submit_bh(checksum incorrect)
      
      Fix this by locking the buffer head before modifing it.  We always
      write the jbd2 superblock after we modify it, so this just means
      calling the lock_buffer() a little earlier.
      
      This checksum corruption problem can be reproduced by xfstests
      generic/475.
      Reported-by: default avatarzhangyi (F) <yi.zhang@huawei.com>
      Suggested-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      538bcaa6
  4. 11 Feb, 2019 11 commits
    • Jan Kara's avatar
      ext4: fix crash during online resizing · f96c3ac8
      Jan Kara authored
      When computing maximum size of filesystem possible with given number of
      group descriptor blocks, we forget to include s_first_data_block into
      the number of blocks. Thus for filesystems with non-zero
      s_first_data_block it can happen that computed maximum filesystem size
      is actually lower than current filesystem size which confuses the code
      and eventually leads to a BUG_ON in ext4_alloc_group_tables() hitting on
      flex_gd->count == 0. The problem can be reproduced like:
      
      truncate -s 100g /tmp/image
      mkfs.ext4 -b 1024 -E resize=262144 /tmp/image 32768
      mount -t ext4 -o loop /tmp/image /mnt
      resize2fs /dev/loop0 262145
      resize2fs /dev/loop0 300000
      
      Fix the problem by properly including s_first_data_block into the
      computed number of filesystem blocks.
      
      Fixes: 1c6bd717 "ext4: convert file system to meta_bg if needed..."
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      f96c3ac8
    • Theodore Ts'o's avatar
      ext4: disallow files with EXT4_JOURNAL_DATA_FL from EXT4_IOC_SWAP_BOOT · 6e589291
      Theodore Ts'o authored
      A malicious/clueless root user can use EXT4_IOC_SWAP_BOOT to force a
      corner casew which can lead to the file system getting corrupted.
      There's no usefulness to allowing this, so just prohibit this case.
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      6e589291
    • yangerkun's avatar
      ext4: add mask of ext4 flags to swap · abdc644e
      yangerkun authored
      The reason is that while swapping two inode, we swap the flags too.
      Some flags such as EXT4_JOURNAL_DATA_FL can really confuse the things
      since we're not resetting the address operations structure.  The
      simplest way to keep things sane is to restrict the flags that can be
      swapped.
      Signed-off-by: default avataryangerkun <yangerkun@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      abdc644e
    • yangerkun's avatar
      ext4: update quota information while swapping boot loader inode · aa507b5f
      yangerkun authored
      While do swap between two inode, they swap i_data without update
      quota information. Also, swap_inode_boot_loader can do "revert"
      somtimes, so update the quota while all operations has been finished.
      Signed-off-by: default avataryangerkun <yangerkun@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      aa507b5f
    • yangerkun's avatar
      ext4: cleanup pagecache before swap i_data · a46c68a3
      yangerkun authored
      While do swap, we should make sure there has no new dirty page since we
      should swap i_data between two inode:
      1.We should lock i_mmap_sem with write to avoid new pagecache from mmap
      read/write;
      2.Change filemap_flush to filemap_write_and_wait and move them to the
      space protected by inode lock to avoid new pagecache from buffer read/write.
      Signed-off-by: default avataryangerkun <yangerkun@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      a46c68a3
    • yangerkun's avatar
      ext4: fix check of inode in swap_inode_boot_loader · 67a11611
      yangerkun authored
      Before really do swap between inode and boot inode, something need to
      check to avoid invalid or not permitted operation, like does this inode
      has inline data. But the condition check should be protected by inode
      lock to avoid change while swapping. Also some other condition will not
      change between swapping, but there has no problem to do this under inode
      lock.
      Signed-off-by: default avataryangerkun <yangerkun@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      67a11611
    • Xiaoguang Wang's avatar
      ext4: unlock unused_pages timely when doing writeback · a297b2fc
      Xiaoguang Wang authored
      In mpage_add_bh_to_extent(), when accumulated extents length is greater
      than MAX_WRITEPAGES_EXTENT_LEN or buffer head's b_stat is not equal, we
      will not continue to search unmapped area for this page, but note this
      page is locked, and will only be unlocked in mpage_release_unused_pages()
      after ext4_io_submit, if io also is throttled by blk-throttle or similar
      io qos, we will hold this page locked for a while, it's unnecessary.
      
      I think the best fix is to refactor mpage_add_bh_to_extent() to let it
      return some hints whether to unlock this page, but given that we will
      improve dioread_nolock later, we can let it done later, so currently
      the simple fix would just call mpage_release_unused_pages() before
      ext4_io_submit().
      Signed-off-by: default avatarXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      a297b2fc
    • zhangyi (F)'s avatar
      ext4: cleanup clean_bdev_aliases() calls · 16e08b14
      zhangyi (F) authored
      Now, we have already handle all cases of forgetting buffer in
      jbd2_journal_forget(), the buffer should not be mapped to blockdevice
      when reallocating it. So this patch remove all clean_bdev_aliases() and
      clean_bdev_bh_alias() calls which were invoked by ext4 explicitly.
      Suggested-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarzhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      16e08b14
    • zhangyi (F)'s avatar
      jbd2: discard dirty data when forgetting an un-journalled buffer · 59759926
      zhangyi (F) authored
      We do not unmap and clear dirty flag when forgetting a buffer without
      journal or does not belongs to any transaction, so the invalid dirty
      data may still be written to the disk later. It's fine if the
      corresponding block is never used before the next mount, and it's also
      fine that we invoke clean_bdev_aliases() related functions to unmap
      the block device mapping when re-allocating such freed block as data
      block. But this logic is somewhat fragile and risky that may lead to
      data corruption if we forget to clean bdev aliases. So, It's better to
      discard dirty data during forget time.
      
      We have been already handled all the cases of forgetting journalled
      buffer, this patch deal with the remaining two cases.
      
      - buffer is not journalled yet,
      - buffer is journalled but doesn't belongs to any transaction.
      
      We invoke __bforget() instead of __brelese() when forgetting an
      un-journalled buffer in jbd2_journal_forget(). After this patch we can
      remove all clean_bdev_aliases() related calls in ext4.
      Suggested-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarzhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      59759926
    • zhangyi (F)'s avatar
      jbd2: clear dirty flag when revoking a buffer from an older transaction · 904cdbd4
      zhangyi (F) authored
      Now, we capture a data corruption problem on ext4 while we're truncating
      an extent index block. Imaging that if we are revoking a buffer which
      has been journaled by the committing transaction, the buffer's jbddirty
      flag will not be cleared in jbd2_journal_forget(), so the commit code
      will set the buffer dirty flag again after refile the buffer.
      
      fsx                               kjournald2
                                        jbd2_journal_commit_transaction
      jbd2_journal_revoke                commit phase 1~5...
       jbd2_journal_forget
         belongs to older transaction    commit phase 6
         jbddirty not clear               __jbd2_journal_refile_buffer
                                           __jbd2_journal_unfile_buffer
                                            test_clear_buffer_jbddirty
                                             mark_buffer_dirty
      
      Finally, if the freed extent index block was allocated again as data
      block by some other files, it may corrupt the file data after writing
      cached pages later, such as during unmount time. (In general,
      clean_bdev_aliases() related helpers should be invoked after
      re-allocation to prevent the above corruption, but unfortunately we
      missed it when zeroout the head of extra extent blocks in
      ext4_ext_handle_unwritten_extents()).
      
      This patch mark buffer as freed and set j_next_transaction to the new
      transaction when it already belongs to the committing transaction in
      jbd2_journal_forget(), so that commit code knows it should clear dirty
      bits when it is done with the buffer.
      
      This problem can be reproduced by xfstests generic/455 easily with
      seeds (3246 3247 3248 3249).
      Signed-off-by: default avatarzhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
      904cdbd4
    • Nikolay Borisov's avatar
      ext4: replace opencoded i_writecount usage with inode_is_open_for_write() · 82dd124c
      Nikolay Borisov authored
      There is a function which clearly conveys the objective of checking
      i_writecount. Additionally the usage in ext4_mb_initialize_context was
      wrong, since a node would have wrongfully been reported as writable if
      i_writecount had a negative value (MMAP_DENY_WRITE).
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      82dd124c
  5. 01 Feb, 2019 2 commits
    • Xiaoguang Wang's avatar
      jbd2: fix deadlock while checkpoint thread waits commit thread to finish · 53cf9784
      Xiaoguang Wang authored
      This issue was found when I tried to put checkpoint work in a separate thread,
      the deadlock below happened:
               Thread1                                |   Thread2
      __jbd2_log_wait_for_space                       |
      jbd2_log_do_checkpoint (hold j_checkpoint_mutex)|
        if (jh->b_transaction != NULL)                |
          ...                                         |
          jbd2_log_start_commit(journal, tid);        |jbd2_update_log_tail
                                                      |  will lock j_checkpoint_mutex,
                                                      |  but will be blocked here.
                                                      |
          jbd2_log_wait_commit(journal, tid);         |
          wait_event(journal->j_wait_done_commit,     |
           !tid_gt(tid, journal->j_commit_sequence)); |
           ...                                        |wake_up(j_wait_done_commit)
        }                                             |
      
      then deadlock occurs, Thread1 will never be waken up.
      
      To fix this issue, drop j_checkpoint_mutex in jbd2_log_do_checkpoint()
      when we are going to wait for transaction commit.
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarXiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      53cf9784
    • Theodore Ts'o's avatar
      Revert "ext4: use ext4_write_inode() when fsyncing w/o a journal" · 8fdd60f2
      Theodore Ts'o authored
      This reverts commit ad211f3e.
      
      As Jan Kara pointed out, this change was unsafe since it means we lose
      the call to sync_mapping_buffers() in the nojournal case.  The
      original point of the commit was avoid taking the inode mutex (since
      it causes a lockdep warning in generic/113); but we need the mutex in
      order to call sync_mapping_buffers().
      
      The real fix to this problem was discussed here:
      
      https://lore.kernel.org/lkml/20181025150540.259281-4-bvanassche@acm.org
      
      The proposed patch was to fix a syzbot complaint, but the problem can
      also demonstrated via "kvm-xfstests -c nojournal generic/113".
      Multiple solutions were discused in the e-mail thread, but none have
      landed in the kernel as of this writing.  Anyway, commit
      ad211f3e is absolutely the wrong way to suppress the lockdep, so
      revert it.
      
      Fixes: ad211f3e ("ext4: use ext4_write_inode() when fsyncing w/o a journal")
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reported: Jan Kara <jack@suse.cz>
      8fdd60f2
  6. 21 Jan, 2019 3 commits
  7. 20 Jan, 2019 11 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7d0ae236
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix endless loop in nf_tables, from Phil Sutter.
      
       2) Fix cross namespace ip6_gre tunnel hash list corruption, from
          Olivier Matz.
      
       3) Don't be too strict in phy_start_aneg() otherwise we might not allow
          restarting auto negotiation. From Heiner Kallweit.
      
       4) Fix various KMSAN uninitialized value cases in tipc, from Ying Xue.
      
       5) Memory leak in act_tunnel_key, from Davide Caratti.
      
       6) Handle chip errata of mv88e6390 PHY, from Andrew Lunn.
      
       7) Remove linear SKB assumption in fou/fou6, from Eric Dumazet.
      
       8) Missing udplite rehash callbacks, from Alexey Kodanev.
      
       9) Log dirty pages properly in vhost, from Jason Wang.
      
      10) Use consume_skb() in neigh_probe() as this is a normal free not a
          drop, from Yang Wei. Likewise in macvlan_process_broadcast().
      
      11) Missing device_del() in mdiobus_register() error paths, from Thomas
          Petazzoni.
      
      12) Fix checksum handling of short packets in mlx5, from Cong Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (96 commits)
        bpf: in __bpf_redirect_no_mac pull mac only if present
        virtio_net: bulk free tx skbs
        net: phy: phy driver features are mandatory
        isdn: avm: Fix string plus integer warning from Clang
        net/mlx5e: Fix cb_ident duplicate in indirect block register
        net/mlx5e: Fix wrong (zero) TX drop counter indication for representor
        net/mlx5e: Fix wrong error code return on FEC query failure
        net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
        tools: bpftool: Cleanup license mess
        bpf: fix inner map masking to prevent oob under speculation
        bpf: pull in pkt_sched.h header for tooling to fix bpftool build
        selftests: forwarding: Add a test case for externally learned FDB entries
        selftests: mlxsw: Test FDB offload indication
        mlxsw: spectrum_switchdev: Do not treat static FDB entries as sticky
        net: bridge: Mark FDB entries that were added by user as such
        mlxsw: spectrum_fid: Update dummy FID index
        mlxsw: pci: Return error on PCI reset timeout
        mlxsw: pci: Increase PCI SW reset timeout
        mlxsw: pci: Ring CQ's doorbell before RDQ's
        MAINTAINERS: update email addresses of liquidio driver maintainers
        ...
      7d0ae236
    • Kees Cook's avatar
      pstore/ram: Avoid allocation and leak of platform data · 5631e857
      Kees Cook authored
      Yue Hu noticed that when parsing device tree the allocated platform data
      was never freed. Since it's not used beyond the function scope, this
      switches to using a stack variable instead.
      Reported-by: default avatarYue Hu <huyue2@yulong.com>
      Fixes: 35da6094 ("pstore/ram: add Device Tree bindings")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      5631e857
    • Ard Biesheuvel's avatar
      gcc-plugins: arm_ssp_per_task_plugin: fix for GCC 9+ · 2c88c742
      Ard Biesheuvel authored
      GCC 9 reworks the way the references to the stack canary are
      emitted, to prevent the value from being spilled to the stack
      before the final comparison in the epilogue, defeating the
      purpose, given that the spill slot is under control of the
      attacker that we are protecting ourselves from.
      
      Since our canary value address is obtained without accessing
      memory (as opposed to pre-v7 code that will obtain it from a
      literal pool), it is unlikely (although not guaranteed) that
      the compiler will spill the canary value in the same way, so
      let's just disable this improvement when building with GCC9+.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      2c88c742
    • Ard Biesheuvel's avatar
      gcc-plugins: arm_ssp_per_task_plugin: sign extend the SP mask · 560706d5
      Ard Biesheuvel authored
      The ARM per-task stack protector GCC plugin hits an assert in
      the compiler in some case, due to the fact the the SP mask
      expression is not sign-extended as it should be. So fix that.
      Suggested-by: default avatarKugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      560706d5
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · bb617b9b
      Linus Torvalds authored
      Pull virtio/vhost fixes and cleanups from Michael Tsirkin:
       "Fixes and cleanups all over the place"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vhost/scsi: Use copy_to_iter() to send control queue response
        vhost: return EINVAL if iovecs size does not match the message size
        virtio-balloon: tweak config_changed implementation
        virtio: don't allocate vqs when names[i] = NULL
        virtio_pci: use queue idx instead of array idx to set up the vq
        virtio: document virtio_config_ops restrictions
        virtio: fix virtio_config_ops description
      bb617b9b
    • Linus Torvalds's avatar
      Merge tag 'for-5.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 1be969f4
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A handful of fixes (some of them in testing for a long time):
      
         - fix some test failures regarding cleanup after transaction abort
      
         - revert of a patch that could cause a deadlock
      
         - delayed iput fixes, that can help in ENOSPC situation when there's
           low space and a lot data to write"
      
      * tag 'for-5.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: wakeup cleaner thread when adding delayed iput
        btrfs: run delayed iputs before committing
        btrfs: wait on ordered extents on abort cleanup
        btrfs: handle delayed ref head accounting cleanup in abort
        Revert "btrfs: balance dirty metadata pages in btrfs_finish_ordered_io"
      1be969f4
    • Linus Torvalds's avatar
      Merge tags 'compiler-attributes-for-linus-v5.0-rc3' and... · 315a6d85
      Linus Torvalds authored
      Merge tags 'compiler-attributes-for-linus-v5.0-rc3' and 'clang-format-for-linus-v5.0-rc3' of git://github.com/ojeda/linux
      
      Pull misc clang fixes from Miguel Ojeda:
      
        - A fix for OPTIMIZER_HIDE_VAR from Michael S Tsirkin
      
        - Update clang-format with the latest for_each macro list from Jason
          Gunthorpe
      
      * tag 'compiler-attributes-for-linus-v5.0-rc3' of git://github.com/ojeda/linux:
        include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
      
      * tag 'clang-format-for-linus-v5.0-rc3' of git://github.com/ojeda/linux:
        clang-format: Update .clang-format with the latest for_each macro list
      315a6d85
    • Florian La Roche's avatar
      fix int_sqrt64() for very large numbers · fbfaf851
      Florian La Roche authored
      If an input number x for int_sqrt64() has the highest bit set, then
      fls64(x) is 64.  (1UL << 64) is an overflow and breaks the algorithm.
      
      Subtracting 1 is a better guess for the initial value of m anyway and
      that's what also done in int_sqrt() implicitly [*].
      
      [*] Note how int_sqrt() uses __fls() with two underscores, which already
          returns the proper raw bit number.
      
          In contrast, int_sqrt64() used fls64(), and that returns bit numbers
          illogically starting at 1, because of error handling for the "no
          bits set" case. Will points out that he bug probably is due to a
          copy-and-paste error from the regular int_sqrt() case.
      Signed-off-by: default avatarFlorian La Roche <Florian.LaRoche@googlemail.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fbfaf851
    • Will Deacon's avatar
      x86: uaccess: Inhibit speculation past access_ok() in user_access_begin() · 6e693b3f
      Will Deacon authored
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'")
      makes the access_ok() check part of the user_access_begin() preceding a
      series of 'unsafe' accesses.  This has the desirable effect of ensuring
      that all 'unsafe' accesses have been range-checked, without having to
      pick through all of the callsites to verify whether the appropriate
      checking has been made.
      
      However, the consolidated range check does not inhibit speculation, so
      it is still up to the caller to ensure that they are not susceptible to
      any speculative side-channel attacks for user addresses that ultimately
      fail the access_ok() check.
      
      This is an oversight, so use __uaccess_begin_nospec() to ensure that
      speculation is inhibited until the access_ok() check has passed.
      Reported-by: default avatarJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6e693b3f
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · b0f3e768
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Three arm64 fixes for -rc3.
      
        We've plugged a couple of nasty issues involving KASLR-enabled
        kernels, and removed a redundant #define that was introduced as part
        of the KHWASAN fixes from akpm at -rc2.
      
         - Fix broken kpti page-table rewrite in bizarre KASLR configuration
      
         - Fix module loading with KASLR
      
         - Remove redundant definition of ARCH_SLAB_MINALIGN"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        kasan, arm64: remove redundant ARCH_SLAB_MINALIGN define
        arm64: kaslr: ensure randomized quantities are clean to the PoC
        arm64: kpti: Update arm64_kernel_use_ng_mappings() when forced on
      b0f3e768
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 6436408e
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2019-01-20
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix a out-of-bounds access in __bpf_redirect_no_mac, from Willem.
      
      2) Fix bpf_setsockopt to reset sock dst on SO_MARK changes, from Peter.
      
      3) Fix map in map masking to prevent out-of-bounds access under
         speculative execution, from Daniel.
      
      4) Fix bpf_setsockopt's SO_MAX_PACING_RATE to support TCP internal
         pacing, from Yuchung.
      
      5) Fix json writer license in bpftool, from Thomas.
      
      6) Fix AF_XDP to check if an actually queue exists during umem
         setup, from Krzysztof.
      
      7) Several fixes to BPF stackmap's build id handling. Another fix
         for bpftool build to account for libbfd variations wrt linking
         requirements, from Stanislav.
      
      8) Fix BPF samples build with clang by working around missing asm
         goto, from Yonghong.
      
      9) Fix libbpf to retry program load on signal interrupt, from Lorenz.
      
      10) Various minor compile warning fixes in BPF code, from Mathieu.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6436408e