- 10 Mar, 2016 1 commit
-
-
Petr Mladek authored
This patch moves the deferred work from the "vballoon" kthread into a system freezable workqueue. We do not need to maintain and run a dedicated kthread. Also the event driven workqueues API makes the logic much easier. Especially, we do not longer need an own wait queue, wait function, and freeze point. The conversion is pretty straightforward. One cycle of the main loop is put into a work. The work is queued instead of waking the kthread. fill_balloon() and leak_balloon() have a limit for the amount of modified pages. The work re-queues itself when necessary. For this, we make fill_balloon() to return the number of really modified pages. Note that leak_balloon() already did this. virtballoon_restore() queues the work only when really needed. The only complication is that we need to prevent queuing the work when the balloon is being removed. It was easier before because the kthread simply removed itself from the wait queue. We need an extra boolean and spin lock now. My initial idea was to use a dedicated workqueue. Michael S. Tsirkin suggested using a system one. Tejun Heo confirmed that the system workqueue has a pretty high concurrency level (256) by default. Therefore we need not be afraid of too long blocking. Signed-off-by: Petr Mladek <pmladek@suse.cz> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
- 02 Mar, 2016 16 commits
-
-
Cornelia Huck authored
SET_IND takes as a payload the _address_ of the indicators, meaning that we have one of the rare cases where kmalloc(sizeof(&...)) is actually correct. Let's clarify that with a comment. The count for the ccw, however, was only correct because the indicators are 64 bit. Let's use the correct value. Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Geliang Tang authored
Use dev_to_virtio() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@163.com> Message-Id: <912bf59bd3a48f2d4d4994681e898dc084fe29d3.1451484163.git.geliangtang@163.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Greg Kurz authored
Looking at how callers use this, maybe we should just rename init_used to vhost_vq_init_access. The _used suffix was a hint that we access the vq used ring. But maybe what callers care about is that it must be called after access_ok. Also, this function manipulates the vq->is_le field which isn't related to the vq used ring. This patch simply renames vhost_init_used() to vhost_vq_init_access() as suggested by Michael. No behaviour change. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Greg Kurz authored
The default use case for vhost is when the host and the vring have the same endianness (default native endianness). But there are cases where they differ and vhost should byteswap when accessing the vring. The first case is when the host is big endian and the vring belongs to a virtio 1.0 device, which is always little endian. This is covered by the vq->is_le field. This field is initialized when userspace calls the VHOST_SET_FEATURES ioctl. It is reset when the device stops. We already have a vhost_init_is_le() helper, but the reset operation is opencoded as follows: vq->is_le = virtio_legacy_is_little_endian(); It isn't clear that we are resetting vq->is_le here. This patch moves the code to a helper with a more explicit name. The other case where we may have to byteswap is when the architecture can switch endianness at runtime (bi-endian). If endianness differs in the host and in the guest, then legacy devices need to be used in cross-endian mode. This mode is available with CONFIG_VHOST_CROSS_ENDIAN_LEGACY=y, which introduces a vq->user_be field. Userspace may enable cross-endian mode by calling the SET_VRING_ENDIAN ioctl before the device is started. The cross-endian mode is disabled when the device is stopped. The current names of the helpers that manipulate vq->user_be are unclear. This patch renames those helpers to clearly show that this is cross-endian stuff and with explicit enable/disable semantics. No behaviour change. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Michael S. Tsirkin authored
Latest virtio spec says the feature bit name is VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_WCE is the legacy name. virtio blk header says exactly the reverse - fix that and update driver code to match. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Andy Lutomirski authored
Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
-
Andy Lutomirski authored
This switches to vring_create_virtqueue, simplifying the driver and adding DMA API support. This fixes virtio-pci on platforms and busses that have IOMMUs. This will break the experimental QEMU Q35 IOMMU support until QEMU is fixed. In exchange, it fixes physical virtio hardware as well as virtio-pci running under Xen. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Andy Lutomirski authored
This switches to vring_create_virtqueue, simplifying the driver and adding DMA API support. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Andy Lutomirski authored
This leaves vring_new_virtqueue alone for compatbility, but it adds two new improved APIs: vring_create_virtqueue: Creates a virtqueue backed by automatically allocated coherent memory. (Some day it this could be extended to support non-coherent memory, too, if there ends up being a platform on which it's worthwhile.) __vring_new_virtqueue: Creates a virtqueue with a manually-specified layout. This should allow mic_virtio to work much more cleanly. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Andy Lutomirski authored
virtio_ring currently sends the device (usually a hypervisor) physical addresses of its I/O buffers. This is okay when DMA addresses and physical addresses are the same thing, but this isn't always the case. For example, this never works on Xen guests, and it is likely to fail if a physical "virtio" device ever ends up behind an IOMMU or swiotlb. The immediate use case for me is to enable virtio on Xen guests. For that to work, we need vring to support DMA address translation as well as a corresponding change to virtio_pci or to another driver. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Andy Lutomirski authored
This is a kludge, but no one has come up with a a better idea yet. We'll introduce DMA API support guarded by vring_use_dma_api(). Eventually we may be able to return true on more and more systems, and hopefully we can get rid of vring_use_dma_api() entirely some day. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Christian Borntraeger authored
As virtio-ccw will have dma ops, we can no longer default to the zPCI ones. Make use of dev_archdata to keep the dma_ops per device. The pci devices now use that to override the default, and the default is changed to use the noop ops for everything that does not specify a device specific one. To compile without PCI support we will enable HAS_DMA all the time, via the default config in lib/Kconfig. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Christian Borntraeger authored
Some of the alpha pci noop dma ops are identical to the common ones. Use them. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Christian Borntraeger authored
We are going to require dma_ops for several common drivers, even for systems that do have an identity mapping. Lets provide some minimal no-op dma_ops that can be used for that purpose. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Greg Kurz authored
We don't want side effects. If something fails, we rollback vq->is_le to its previous value. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
Ladi Prosek authored
Looks like a copy-paste bug. The value is used as an optimization and a wrong value probably isn't causing any serious damage. Found when porting this code to Windows. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-
- 28 Feb, 2016 15 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Thomas Gleixner: "A rather largish series of 12 patches addressing a maze of race conditions in the perf core code from Peter Zijlstra" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Robustify task_function_call() perf: Fix scaling vs. perf_install_in_context() perf: Fix scaling vs. perf_event_enable() perf: Fix scaling vs. perf_event_enable_on_exec() perf: Fix ctx time tracking by introducing EVENT_TIME perf: Cure event->pending_disable race perf: Fix race between event install and jump_labels perf: Fix cloning perf: Only update context time when active perf: Allow perf_release() with !event->ctx perf: Do not double free perf: Close install vs. exit race
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Thomas Gleixner: "This update contains: - Hopefully the last ASM CLAC fixups - A fix for the Quark family related to the IMR lock which makes kexec work again - A off-by-one fix in the MPX code. Ironic, isn't it? - A fix for X86_PAE which addresses once more an unsigned long vs phys_addr_t hickup" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mpx: Fix off-by-one comparison with nr_registers x86/mm: Fix slow_virt_to_phys() for X86_PAE again x86/entry/compat: Add missing CLAC to entry_INT80_32 x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32 x86/platform/intel/quark: Change the kernel's IMR lock bit to false
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixlet from Thomas Gleixner: "A trivial printk typo fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix trivial typo in printk() message
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fixes from Thomas Gleixner: "Four small fixes for irqchip drivers: - Add missing low level irq handler initialization on mxs, so interrupts can acutally be delivered - Add a missing barrier to the GIC driver - Two fixes for the GIC-V3-ITS driver, addressing a double EOI write and a cache flush beyond the actual region" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3: Add missing barrier to 32bit version of gic_read_iar() irqchip/mxs: Add missing set_handle_irq() irqchip/gicv3-its: Avoid cache flush beyond ITS_BASERn memory size irqchip/gic-v3-its: Fix double ICC_EOIR write for LPI in EOImode==1
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging/android fix from Greg KH: "Here is one patch, for the android binder driver, to resolve a reported problem. Turns out it has been around for a while (since 3.15), so it is good to finally get it resolved. It has been in linux-next for a while with no reported issues" * tag 'staging-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: drivers: android: correct the size of struct binder_uintptr_t for BC_DEAD_BINDER_DONE
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg KH: "Here are a few USB fixes for 4.5-rc6 They fix a reported bug for some USB 3 devices by reverting the recent patch, a MAINTAINERS change for some drivers, some new device ids, and of course, the usual bunch of USB gadget driver fixes. All have been in linux-next for a while with no reported issues" * tag 'usb-4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: MAINTAINERS: drop OMAP USB and MUSB maintainership usb: musb: fix DMA for host mode usb: phy: msm: Trigger USB state detection work in DRD mode usb: gadget: net2280: fix endpoint max packet for super speed connections usb: gadget: gadgetfs: unregister gadget only if it got successfully registered usb: gadget: remove driver from pending list on probe error Revert "usb: hub: do not clear BOS field during reset device" usb: chipidea: fix return value check in ci_hdrc_pci_probe() usb: chipidea: error on overflow for port_test_write USB: option: add "4G LTE usb-modem U901" USB: cp210x: add IDs for GE B650V3 and B850V3 boards USB: option: add support for SIM7100E usb: musb: Fix DMA desired mode for Mentor DMA engine usb: gadget: fsl_qe_udc: fix IS_ERR_VALUE usage usb: dwc2: USB_DWC2 should depend on HAS_DMA usb: dwc2: host: fix the data toggle error in full speed descriptor dma usb: dwc2: host: fix logical omissions in dwc2_process_non_isoc_desc usb: dwc3: Fix assignment of EP transfer resources usb: dwc2: Add extra delay when forcing dr_mode
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs fixes from Al Viro. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: do_last(): ELOOP failure exit should be done after leaving RCU mode should_follow_link(): validate ->d_seq after having decided to follow namei: ->d_inode of a pinned dentry is stable only for positives do_last(): don't let a bogus return value from ->open() et.al. to confuse us fs: return -EOPNOTSUPP if clone is not supported hpfs: don't truncate the file when delete fails
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fixes from Olof Johansson: "We didn't have a batch last week, so this one is slightly larger. None of them are scary though, a handful of fixes for small DT pieces, replacing properties with newer conventions. Highlights: - N900 fix for setting system revision - onenand init fix to avoid filesystem corruption - Clock fix for audio on Beaglebone-x15 - Fixes on shmobile to deal with CONFIG_DEBUG_RODATA (default y in 4.6) + misc smaller stuff" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: MAINTAINERS: Extend info, add wiki and ml for meson arch MAINTAINERS: alpine: add a new maintainer and update the entry ARM: at91/dt: fix typo in sama5d2 pinmux descriptions ARM: OMAP2+: Fix onenand initialization to avoid filesystem corruption Revert "regulator: tps65217: remove tps65217.dtsi file" ARM: shmobile: Remove shmobile_boot_arg ARM: shmobile: Move shmobile_smp_{mpidr, fn, arg}[] from .text to .bss ARM: shmobile: r8a7779: Remove remainings of removed SCU boot setup code ARM: shmobile: Move shmobile_scu_base from .text to .bss ARM: OMAP2+: Fix omap_device for module reload on PM runtime forbid ARM: OMAP2+: Improve omap_device error for driver writers ARM: DTS: am57xx-beagle-x15: Select SYS_CLK2 for audio clocks ARM: dts: am335x/am57xx: replace gpio-key,wakeup with wakeup-source property ARM: OMAP2+: Set system_rev from ATAGS for n900 ARM: dts: orion5x: fix the missing mtd flash on linkstation lswtgl ARM: dts: kirkwood: use unique machine name for ds112 ARM: dts: imx6: remove bogus interrupt-parent from CAAM node
-
Al Viro authored
... or we risk seeing a bogus value of d_is_symlink() there. Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
... otherwise d_is_symlink() above might have nothing to do with the inode value we've got. Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
both do_last() and walk_component() risk picking a NULL inode out of dentry about to become positive, *then* checking its flags and seeing that it's not negative anymore and using (already stale by then) value they'd fetched earlier. Usually ends up oopsing soon after that... Cc: stable@vger.kernel.org # v3.13+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
... into returning a positive to path_openat(), which would interpret that as "symlink had been encountered" and proceed to corrupt memory, etc. It can only happen due to a bug in some ->open() instance or in some LSM hook, etc., so we report any such event *and* make sure it doesn't trick us into further unpleasantness. Cc: stable@vger.kernel.org # v3.6+, at least Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Christoph Hellwig authored
-EBADF is a rather confusing error if an operations is not supported, and nfsd gets rather upset about it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Mikulas Patocka authored
The delete opration can allocate additional space on the HPFS filesystem due to btree split. The HPFS driver checks in advance if there is available space, so that it won't corrupt the btree if we run out of space during splitting. If there is not enough available space, the HPFS driver attempted to truncate the file, but this results in a deadlock since the commit 7dd29d8d ("HPFS: Introduce a global mutex and lock it on every callback from VFS"). This patch removes the code that tries to truncate the file and -ENOSPC is returned instead. If the user hits -ENOSPC on delete, he should try to delete other files (that are stored in a leaf btree node), so that the delete operation will make some space for deleting the file stored in non-leaf btree node. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: stable@vger.kernel.org # 2.6.39+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 27 Feb, 2016 8 commits
-
-
Linus Torvalds authored
Merge fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: dax: move writeback calls into the filesystems dax: give DAX clearing code correct bdev ext4: online defrag not supported with DAX ext2, ext4: only set S_DAX for regular inodes block: disable block device DAX by default ocfs2: unlock inode if deleting inode from orphan fails mm: ASLR: use get_random_long() drivers: char: random: add get_random_long() mm: numa: quickly fail allocations for NUMA balancing on full nodes mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext2/4 DAX fix from Ted Ts'o: "This fixes a file system corruption bug with DAX" * tag 'tags/ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext2, ext4: fix issue with missing journal entry in ext4_dax_mkwrite()
-
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds authored
Pull PCI fixes from Bjorn Helgaas: "Enumeration: Revert x86 pcibios_alloc_irq() to fix regression (Bjorn Helgaas) Marvell MVEBU host bridge driver: Restrict build to 32-bit ARM (Thierry Reding)" * tag 'pci-v4.5-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: mvebu: Restrict build to 32-bit ARM Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()" Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed" Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"
-
Ross Zwisler authored
As it is currently written ext4_dax_mkwrite() assumes that the call into __dax_mkwrite() will not have to do a block allocation so it doesn't create a journal entry. For a read that creates a zero page to cover a hole followed by a write that actually allocates storage this is incorrect. The ext4_dax_mkwrite() -> __dax_mkwrite() -> __dax_fault() path calls get_blocks() to allocate storage. Fix this by having the ->page_mkwrite fault handler call ext4_dax_fault() as this function already has all the logic needed to allocate a journal entry and call __dax_fault(). Also update the ext2 fault handlers in this same way to remove duplicate code and keep the logic between ext2 and ext4 the same. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds authored
Pull clk fix from Stephen Boyd: "One small fix to keep OMAP platforms working across a suspend/resume cycle" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: ti: omap3+: dpll: use non-locking version of clk_get_rate
-
Ross Zwisler authored
Previously calls to dax_writeback_mapping_range() for all DAX filesystems (ext2, ext4 & xfs) were centralized in filemap_write_and_wait_range(). dax_writeback_mapping_range() needs a struct block_device, and it used to get that from inode->i_sb->s_bdev. This is correct for normal inodes mounted on ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time files. Instead, call dax_writeback_mapping_range() directly from the filesystem ->writepages function so that it can supply us with a valid block device. This also fixes DAX code to properly flush caches in response to sync(2). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ross Zwisler authored
dax_clear_blocks() needs a valid struct block_device and previously it was using inode->i_sb->s_bdev in all cases. This is correct for normal inodes on mounted ext2, ext4 and XFS filesystems, but is incorrect for DAX raw block devices and for XFS real-time devices. Instead, rename dax_clear_blocks() to dax_clear_sectors(), and change its arguments to take a bdev and a sector instead of an inode and a block. This better reflects what the function does, and it allows the filesystem and raw block device code to pass in an appropriate struct block_device. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ross Zwisler authored
Online defrag operations for ext4 are hard coded to use the page cache. See ext4_ioctl() -> ext4_move_extents() -> move_extent_per_page() When combined with DAX I/O, which circumvents the page cache, this can result in data corruption. This was observed with xfstests ext4/307 and ext4/308. Fix this by only allowing online defrag for non-DAX files. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jens Axboe <axboe@fb.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-