- 18 Nov, 2020 2 commits
-
-
Darrick J. Wong authored
We always know the correct state of the rmap record flags (attr, bmbt, unwritten) so check them by direct comparison. Fixes: d852657c ("xfs: cross-reference reverse-mapping btree") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
The comment and logic in xchk_btree_check_minrecs for dealing with inode-rooted btrees isn't quite correct. While the direct children of the inode root are allowed to have fewer records than what would normally be allowed for a regular ondisk btree block, this is only true if there is only one child block and the number of records don't fit in the inode root. Fixes: 08a3a692 ("xfs: btree scrub should check minrecs") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
- 11 Nov, 2020 5 commits
-
-
Christoph Hellwig authored
We also need to drop the iolock when invalidate_inode_pages2 fails, not only on all other error or successful cases. Fixes: 52785112 ("xfs: implement pNFS export operations") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Darrick J. Wong authored
Fix some serious WTF in the reference count scrubber's rmap fragment processing. The code comment says that this loop is supposed to move all fragment records starting at or before bno onto the worklist, but there's no obvious reason why nr (the number of items added) should increment starting from 1, and breaking the loop when we've added the target number seems dubious since we could have more rmap fragments that should have been added to the worklist. This seems to manifest in xfs/411 when adding one to the refcount field. Fixes: dbde19da ("xfs: cross-reference the rmapbt data with the refcountbt") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Keys for extent interval records in the reverse mapping btree are supposed to be computed as follows: (physical block, owner, fork, is_btree, is_unwritten, offset) This provides users the ability to look up a reverse mapping from a bmbt record -- start with the physical block; then if there are multiple records for the same block, move on to the owner; then the inode fork type; and so on to the file offset. However, the key comparison functions incorrectly remove the fork/btree/unwritten information that's encoded in the on-disk offset. This means that lookup comparisons are only done with: (physical block, owner, offset) This means that queries can return incorrect results. On consistent filesystems this hasn't been an issue because blocks are never shared between forks or with bmbt blocks; and are never unwritten. However, this bug means that online repair cannot always detect corruption in the key information in internal rmapbt nodes. Found by fuzzing keys[1].attrfork = ones on xfs/371. Fixes: 4b8ed677 ("xfs: add rmap btree operations") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
When the bmbt scrubber is looking up rmap extents, we need to set the extent flags from the bmbt record fully. This will matter once we fix the rmap btree comparison functions to check those flags correctly. Fixes: d852657c ("xfs: cross-reference reverse-mapping btree") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
Pass the same oldext argument (which contains the existing rmapping's unwritten state) to xfs_rmap_lookup_le_range at the start of xfs_rmap_convert_shared. At this point in the code, flags is zero, which means that we perform lookups using the wrong key. Fixes: 3f165b33 ("xfs: convert unwritten status of reverse mappings for shared files") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
- 05 Nov, 2020 1 commit
-
-
Darrick J. Wong authored
There's no reason to flush an entire file when we're unsharing part of a file. Therefore, only initiate writeback on the selected range. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
-
- 04 Nov, 2020 5 commits
-
-
Darrick J. Wong authored
The kernel has always allowed directories to have the rtinherit flag set, even if there is no rt device, so this check is wrong. Fixes: 80e4e126 ("xfs: scrub inodes") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Darrick J. Wong authored
In commit 7588cbee, we tried to fix a race stemming from the lack of coordination between higher level code that wants to allocate and remap CoW fork extents into the data fork. Christoph cites as examples the always_cow mode, and a directio write completion racing with writeback. According to the comments before the goto retry, we want to restart the lookup to catch the extent in the data fork, but we don't actually reset whichfork or cow_fsb, which means the second try executes using stale information. Up until now I think we've gotten lucky that either there's something left in the CoW fork to cause cow_fsb to be reset, or either data/cow fork sequence numbers have advanced enough to force a fresh lookup from the data fork. However, if we reach the retry with an empty stable CoW fork and a stable data fork, neither of those things happens. The retry foolishly re-calls xfs_convert_blocks on the CoW fork which fails again. This time, we toss the write. I've recently been working on extending reflink to the realtime device. When the realtime extent size is larger than a single block, we have to force the page cache to CoW the entire rt extent if a write (or fallocate) are not aligned with the rt extent size. The strategy I've chosen to deal with this is derived from Dave's blocksize > pagesize series: dirtying around the write range, and ensuring that writeback always starts mapping on an rt extent boundary. This has brought this race front and center, since generic/522 blows up immediately. However, I'm pretty sure this is a bug outright, independent of that. Fixes: 7588cbee ("xfs: retry COW fork delalloc conversion when no extent was found") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Brian Foster authored
The iomap writepage error handling logic is a mash of old and slightly broken XFS writepage logic. When keepwrite writeback state tracking was introduced in XFS in commit 0d085a52 ("xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly"), XFS had an additional cluster writeback context that scanned ahead of ->writepage() to process dirty pages over the current ->writepage() extent mapping. This context expected a dirty page and required retention of the TOWRITE tag on partial page processing so the higher level writeback context would revisit the page (in contrast to ->writepage(), which passes a page with the dirty bit already cleared). The cluster writeback mechanism was eventually removed and some of the error handling logic folded into the primary writeback path in commit 150d5be0 ("xfs: remove xfs_cancel_ioend"). This patch accidentally conflated the two contexts by using the keepwrite logic in ->writepage() without accounting for the fact that the page is not dirty. Further, the keepwrite logic has no practical effect on the core ->writepage() caller (write_cache_pages()) because it never revisits a page in the current function invocation. Technically, the page should be redirtied for the keepwrite logic to have any effect. Otherwise, write_cache_pages() may find the tagged page but will skip it since it is clean. Even if the page was redirtied, however, there is still no practical effect to keepwrite since write_cache_pages() does not wrap around within a single invocation of the function. Therefore, the dirty page would simply end up retagged on the next writeback sequence over the associated range. All that being said, none of this really matters because redirtying a partially processed page introduces a potential infinite redirty -> writeback failure loop that deviates from the current design principle of clearing the dirty state on writepage failure to avoid building up too much dirty, unreclaimable memory on the system. Therefore, drop the spurious keepwrite usage and dirty state clearing logic from iomap_writepage_map(), treat the partially processed page the same as a fully processed page, and let the imminent ioend failure clean up the writeback state. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Brian Foster authored
iomap writeback mapping failure only calls into ->discard_page() if the current page has not been added to the ioend. Accordingly, the XFS callback assumes a full page discard and invalidation. This is problematic for sub-page block size filesystems where some portion of a page might have been mapped successfully before a failure to map a delalloc block occurs. ->discard_page() is not called in that error scenario and the bio is explicitly failed by iomap via the error return from ->prepare_ioend(). As a result, the filesystem leaks delalloc blocks and corrupts the filesystem block counters. Since XFS is the only user of ->discard_page(), tweak the semantics to invoke the callback unconditionally on mapping errors and provide the file offset that failed to map. Update xfs_discard_page() to discard the corresponding portion of the file and pass the range along to iomap_invalidatepage(). The latter already properly handles both full and sub-page scenarios by not changing any iomap or page state on sub-page invalidations. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Brian Foster authored
It is possible to expose non-zeroed post-EOF data in XFS if the new EOF page is dirty, backed by an unwritten block and the truncate happens to race with writeback. iomap_truncate_page() will not zero the post-EOF portion of the page if the underlying block is unwritten. The subsequent call to truncate_setsize() will, but doesn't dirty the page. Therefore, if writeback happens to complete after iomap_truncate_page() (so it still sees the unwritten block) but before truncate_setsize(), the cached page becomes inconsistent with the on-disk block. A mapped read after the associated page is reclaimed or invalidated exposes non-zero post-EOF data. For example, consider the following sequence when run on a kernel modified to explicitly flush the new EOF page within the race window: $ xfs_io -fc "falloc 0 4k" -c fsync /mnt/file $ xfs_io -c "pwrite 0 4k" -c "truncate 1k" /mnt/file ... $ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file 00000400: 00 00 00 00 00 00 00 00 ........ $ umount /mnt/; mount <dev> /mnt/ $ xfs_io -c "mmap 0 4k" -c "mread -v 1k 8" /mnt/file 00000400: cd cd cd cd cd cd cd cd ........ Update xfs_setattr_size() to explicitly flush the new EOF page prior to the page truncate to ensure iomap has the latest state of the underlying block. Fixes: 68a9f5e7 ("xfs: implement iomap based buffered write path") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
- 29 Oct, 2020 1 commit
-
-
Darrick J. Wong authored
Make sure that we actually initialize xefi_discard when we're scheduling a deferred free of an AGFL block. This was (eventually) found by the UBSAN while I was banging on realtime rmap problems, but it exists in the upstream codebase. While we're at it, rearrange the structure to reduce the struct size from 64 to 56 bytes. Fixes: fcb762f5 ("xfs: add bmapi nodiscard flag") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
-
- 25 Oct, 2020 17 commits
-
-
Linus Torvalds authored
-
Joe Perches authored
Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences. Remove the quote operator # from compiler_attributes.h __section macro. Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms. Conversion done using the script at: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.plSigned-off-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rasmus Villemoes authored
tid_addr is not a "pointer to (pointer to int in userspace)"; it is in fact a "pointer to (pointer to int in userspace) in userspace". So sparse rightfully complains about passing a kernel pointer to put_user(). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric Biggers authored
Commit 453431a5 ("mm, treewide: rename kzfree() to kfree_sensitive()") renamed kzfree() to kfree_sensitive(), but it left a compatibility definition of kzfree() to avoid being too disruptive. Since then a few more instances of kzfree() have slipped in. Just get rid of them and remove the compatibility definition once and for all. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
If set, use the environment variable GIT_DIR to change the default .git location of the kernel git tree. If GIT_DIR is unset, keep using the current ".git" default. Link: https://lkml.kernel.org/r/c5e23b45562373d632fccb8bc04e563abba4dd1d.camel@perches.comSigned-off-by: Joe Perches <joe@perches.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fixes from Thomas Gleixner: "A time namespace fix and a matching selftest. The futex absolute timeouts which are based on CLOCK_MONOTONIC require time namespace corrected. This was missed in the original time namesapce support" * tag 'timers-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/timens: Add a test for futex() futex: Adjust absolute futex timeouts with per time namespace offset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixes from Thomas Gleixner: "Two scheduler fixes: - A trivial build fix for sched_feat() to compile correctly with CONFIG_JUMP_LABEL=n - Replace a zero lenght array with a flexible array" * tag 'sched-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/features: Fix !CONFIG_JUMP_LABEL case sched: Replace zero-length array with flexible-array
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fix from Thomas Gleixner: "A single fix to compute the field offset of the SNOOPX bit in the data source bitmask of perf events correctly" * tag 'perf-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: correct SNOOPX field offset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull locking fix from Thomas Gleixner: "Just a trivial fix for kernel-doc warnings" * tag 'locking-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/seqlocks: Fix kernel-doc warnings
-
git://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB fixes from Jon Mason. * tag 'ntb-5.10' of git://github.com/jonmason/ntb: NTB: Use struct_size() helper in devm_kzalloc() ntb: intel: Fix memleak in intel_ntb_pci_probe NTB: hw: amd: fix an issue about leak system resources
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c fix from Wolfram Sang: "Regression fix for rc1 and stable kernels as well" * 'i2c/for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: core: Restore acpi_walk_dep_device_list() getting called after registering the ACPI i2c devs
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull more cifs updates from Steve French: "Add support for stat of various special file types (WSL reparse points for char, block, fifo)" * tag '5.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number smb3: add some missing definitions from MS-FSCC smb3: remove two unused variables smb3: add support for stat of WSL reparse points for special file types
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull more parisc updates from Helge Deller: - During this merge window O_NONBLOCK was changed to become 000200000, but we missed that the syscalls timerfd_create(), signalfd4(), eventfd2(), pipe2(), inotify_init1() and userfaultfd() do a strict bit-wise check of the flags parameter. To provide backward compatibility with existing userspace we introduce parisc specific wrappers for those syscalls which filter out the old O_NONBLOCK value and replaces it with the new one. - Prevent HIL bus driver to get stuck when keyboard or mouse isn't attached - Improve error return codes when setting rtc time - Minor documentation fix in pata_ns87415.c * 'parisc-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: ata: pata_ns87415.c: Document support on parisc with superio chip parisc: Add wrapper syscalls to fix O_NONBLOCK flag usage hil/parisc: Disable HIL driver when it gets stuck parisc: Improve error return codes when setting rtc time
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds authored
Pull more xen updates from Juergen Gross: - a series for the Xen pv block drivers adding module parameters for better control of resource usge - a cleanup series for the Xen event driver * tag 'for-linus-5.10b-rc1c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: Documentation: add xen.fifo_events kernel parameter description xen/events: unmask a fifo event channel only if it was masked xen/events: only register debug interrupt for 2-level events xen/events: make struct irq_info private to events_base.c xen: remove no longer used functions xen-blkfront: Apply changed parameter name to the document xen-blkfront: add a parameter for disabling of persistent grants xen-blkback: add a parameter for disabling of persistent grants
-
git://github.com/micah-morton/linuxLinus Torvalds authored
Pull SafeSetID updates from Micah Morton: "The changes are mostly contained to within the SafeSetID LSM, with the exception of a few 1-line changes to change some ns_capable() calls to ns_capable_setid() -- causing a flag (CAP_OPT_INSETID) to be set that is examined by SafeSetID code and nothing else in the kernel. The changes to SafeSetID internally allow for setting up GID transition security policies, as already existed for UIDs" * tag 'safesetid-5.10' of git://github.com/micah-morton/linux: LSM: SafeSetID: Fix warnings reported by test bot LSM: SafeSetID: Add GID security policy handling LSM: Signal to SafeSetID when setting group IDs
-
git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandomLinus Torvalds authored
Pull random32 updates from Willy Tarreau: "Make prandom_u32() less predictable. This is the cleanup of the latest series of prandom_u32 experimentations consisting in using SipHash instead of Tausworthe to produce the randoms used by the network stack. The changes to the files were kept minimal, and the controversial commit that used to take noise from the fast_pool (f227e3ec) was reverted. Instead, a dedicated "net_rand_noise" per_cpu variable is fed from various sources of activities (networking, scheduling) to perturb the SipHash state using fast, non-trivially predictable data, instead of keeping it fully deterministic. The goal is essentially to make any occasional memory leakage or brute-force attempt useless. The resulting code was verified to be very slightly faster on x86_64 than what is was with the controversial commit above, though this remains barely above measurement noise. It was also tested on i386 and arm, and build- tested only on arm64" Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/ * tag '20201024-v4-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom: random32: add a selftest for the prandom32 code random32: add noise from network and scheduling activity random32: make prandom_u32() output unpredictable
-
Hans de Goede authored
Commit 21653a41 ("i2c: core: Call i2c_acpi_install_space_handler() before i2c_acpi_register_devices()")'s intention was to only move the acpi_install_address_space_handler() call to the point before where the ACPI declared i2c-children of the adapter where instantiated by i2c_acpi_register_devices(). But i2c_acpi_install_space_handler() had a call to acpi_walk_dep_device_list() hidden (that is I missed it) at the end of it, so as an unwanted side-effect now acpi_walk_dep_device_list() was also being called before i2c_acpi_register_devices(). Move the acpi_walk_dep_device_list() call to the end of i2c_acpi_register_devices(), so that it is once again called *after* the i2c_client-s hanging of the adapter have been created. This fixes the Microsoft Surface Go 2 hanging at boot. Fixes: 21653a41 ("i2c: core: Call i2c_acpi_install_space_handler() before i2c_acpi_register_devices()") Link: https://bugzilla.kernel.org/show_bug.cgi?id=209627Reported-by: Rainer Finke <rainer@finke.cc> Reported-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Suggested-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
-
- 24 Oct, 2020 9 commits
-
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: - NVMe pull request from Christoph - rdma error handling fixes (Chao Leng) - fc error handling and reconnect fixes (James Smart) - fix the qid displace when tracing ioctl command (Keith Busch) - don't use BLK_MQ_REQ_NOWAIT for passthru (Chaitanya Kulkarni) - fix MTDT for passthru (Logan Gunthorpe) - blacklist Write Same on more devices (Kai-Heng Feng) - fix an uninitialized work struct (zhenwei pi)" - lightnvm out-of-bounds fix (Colin) - SG allocation leak fix (Doug) - rnbd fixes (Gioh, Guoqing, Jack) - zone error translation fixes (Keith) - kerneldoc markup fix (Mauro) - zram lockdep fix (Peter) - Kill unused io_context members (Yufen) - NUMA memory allocation cleanup (Xianting) - NBD config wakeup fix (Xiubo) * tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block: (27 commits) block: blk-mq: fix a kernel-doc markup nvme-fc: shorten reconnect delay if possible for FC nvme-fc: wait for queues to freeze before calling update_hr_hw_queues nvme-fc: fix error loop in create_hw_io_queues nvme-fc: fix io timeout to abort I/O null_blk: use zone status for max active/open nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru nvmet: cleanup nvmet_passthru_map_sg() nvmet: limit passthru MTDS by BIO_MAX_PAGES nvmet: fix uninitialized work for zero kato nvme-pci: disable Write Zeroes on Sandisk Skyhawk nvme: use queuedata for nvme_req_qid nvme-rdma: fix crash due to incorrect cqe nvme-rdma: fix crash when connect rejected block: remove unused members for io_context blk-mq: remove the calling of local_memory_node() zram: Fix __zram_bvec_{read,write}() locking order skd_main: remove unused including <linux/version.h> sgl_alloc_order: fix memory leak lightnvm: fix out-of-bounds write to array devices->info[] ...
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: - fsize was missed in previous unification of work flags - Few fixes cleaning up the flags unification creds cases (Pavel) - Fix NUMA affinities for completely unplugged/replugged node for io-wq - Two fallout fixes from the set_fs changes. One local to io_uring, one for the splice entry point that io_uring uses. - Linked timeout fixes (Pavel) - Removal of ->flush() ->files work-around that we don't need anymore with referenced files (Pavel) - Various cleanups (Pavel) * tag 'io_uring-5.10-2020-10-24' of git://git.kernel.dk/linux-block: splice: change exported internal do_splice() helper to take kernel offset io_uring: make loop_rw_iter() use original user supplied pointers io_uring: remove req cancel in ->flush() io-wq: re-set NUMA node affinities if CPUs come online io_uring: don't reuse linked_timeout io_uring: unify fsize with def->work_flags io_uring: fix racy REQ_F_LINK_TIMEOUT clearing io_uring: do poll's hash_node init in common code io_uring: inline io_poll_task_handler() io_uring: remove extra ->file check in poll prep io_uring: make cached_cq_overflow non atomic_t io_uring: inline io_fail_links() io_uring: kill ref get/drop in personality init io_uring: flags-based creds init in queue
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull libata fixes from Jens Axboe: "Two minor libata fixes: - Fix a DMA boundary mask regression for sata_rcar (Geert) - kerneldoc markup fix (Mauro)" * tag 'libata-5.10-2020-10-24' of git://git.kernel.dk/linux-block: ata: fix some kernel-doc markups ata: sata_rcar: Fix DMA boundary mask
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull misc vfs updates from Al Viro: "Assorted stuff all over the place (the largest group here is Christoph's stat cleanups)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: remove KSTAT_QUERY_FLAGS fs: remove vfs_stat_set_lookup_flags fs: move vfs_fstatat out of line fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat fs: remove vfs_statx_fd fs: omfs: use kmemdup() rather than kmalloc+memcpy [PATCH] reduce boilerplate in fsid handling fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS selftests: mount: add nosymfollow tests Add a "nosymfollow" mount option.
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: - document the new dma_{alloc,free}_pages() API - two fixups for the dma-mapping.h split * tag 'dma-mapping-5.10-1' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: document dma_{alloc,free}_pages dma-mapping: move more functions to dma-map-ops.h ARM/sa1111: add a missing include of dma-map-ops.h
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Paolo Bonzini: "Two fixes for this merge window, and an unrelated bugfix for a host hang" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: ioapic: break infinite recursion on lazy EOI KVM: vmx: rename pi_init to avoid conflict with paride KVM: x86/mmu: Avoid modulo operator on 64-bit value to fix i386 build
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 SEV-ES fixes from Borislav Petkov: "Three fixes to SEV-ES to correct setting up the new early pagetable on 5-level paging machines, to always map boot_params and the kernel cmdline, and disable stack protector for ../compressed/head{32,64}.c. (Arvind Sankar)" * tag 'x86_seves_fixes_for_v5.10_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/64: Explicitly map boot_params and command line x86/head/64: Disable stack protection for head$(BITS).o x86/boot/64: Initialize 5-level paging variables earlier
-
Willy Tarreau authored
Given that this code is new, let's add a selftest for it as well. It doesn't rely on fixed sets, instead it picks 1024 numbers and verifies that they're not more correlated than desired. Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/ Cc: George Spelvin <lkml@sdf.org> Cc: Amit Klein <aksecurity@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: tytso@mit.edu Cc: Florian Westphal <fw@strlen.de> Cc: Marc Plumb <lkml.mplumb@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
-
Willy Tarreau authored
With the removal of the interrupt perturbations in previous random32 change (random32: make prandom_u32() output unpredictable), the PRNG has become 100% deterministic again. While SipHash is expected to be way more robust against brute force than the previous Tausworthe LFSR, there's still the risk that whoever has even one temporary access to the PRNG's internal state is able to predict all subsequent draws till the next reseed (roughly every minute). This may happen through a side channel attack or any data leak. This patch restores the spirit of commit f227e3ec ("random32: update the net random state on interrupt and activity") in that it will perturb the internal PRNG's statee using externally collected noise, except that it will not pick that noise from the random pool's bits nor upon interrupt, but will rather combine a few elements along the Tx path that are collectively hard to predict, such as dev, skb and txq pointers, packet length and jiffies values. These ones are combined using a single round of SipHash into a single long variable that is mixed with the net_rand_state upon each invocation. The operation was inlined because it produces very small and efficient code, typically 3 xor, 2 add and 2 rol. The performance was measured to be the same (even very slightly better) than before the switch to SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC (i40e), the connection rate dropped from 556k/s to 555k/s while the SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps. Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/ Cc: George Spelvin <lkml@sdf.org> Cc: Amit Klein <aksecurity@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: tytso@mit.edu Cc: Florian Westphal <fw@strlen.de> Cc: Marc Plumb <lkml.mplumb@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
-