- 21 Nov, 2014 28 commits
-
-
Josef Bacik authored
We use the modified list to keep track of which extents have been modified so we know which ones are candidates for logging at fsync() time. Newly modified extents are added to the list at modification time, around the same time the ordered extent is created. We do this so that we don't have to wait for ordered extents to complete before we know what we need to log. The problem is when something like this happens:

  log extent 0-4k on inode 1
  copy csum for 0-4k from ordered extent into log
  sync log
  commit transaction
  log some other extent on inode 1
  ordered extent for 0-4k completes and adds itself onto modified list again
  log changed extents
  see ordered extent for 0-4k has already been logged
    at this point we assume the csum has been copied
  sync log
  crash

On replay we will see the extent 0-4k in the log, drop the original 0-4k extent (which is the same one that we are replaying), which also drops the csum, and then we won't find the csum in the log for that bytenr. This of course causes us to have errors about not having csums for certain ranges of our inode. So remove the modified list manipulation in unpin_extent_cache; any modified extents should have been added well before now, and we don't want them re-logged. This fixes the test case with which I could reliably reproduce this problem. Thanks,

cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Josef Bacik authored
Liu Bo pointed out that my previous fix would lose the generation update in the scenario I described. It is actually much worse than that: we could lose the entire extent if we lose power right after the transaction commits. Consider the following:

  write extent 0-4k
  log extent in log tree
  commit transaction
    < power fail happens here
  ordered extent completes

We would lose the 0-4k extent because it hasn't updated the actual fs tree, and the transaction commit will reset the log so it isn't replayed. If we lose power before the transaction commit we are safe, otherwise we are not. Fix this by keeping track of all extents we logged in this transaction. Then when we go to commit the transaction, make sure we wait for all of those ordered extents to complete before proceeding. This will make sure that if we lose power after the transaction commit we still have our data. This also fixes the problem of the improperly updated extent generation. Thanks,

cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Josef Bacik authored
If two fsync()'s race on different subvols, one will do all of its work to get into the log_tree, wait on its outstanding IO, and then allow the log_tree to finish its commit. The problem is we were just freeing that subvol's logged extents instead of waiting on them, so whoever lost the race wouldn't really have their data on disk. Fix this by waiting properly instead of freeing the logged extents. Thanks,

cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
David Sterba authored
The sizes that are obtained from space infos are in raw units and have to be adjusted according to the raid factor. This was missing for f_bavail, so df reported a doubled size for raid1.

Reported-by: Martin Steigerwald <Martin@lichtvoll.de>
Fixes: ba7b6e62 ("btrfs: adjust statfs calculations according to raid profiles")
CC: stable@vger.kernel.org
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
-
Gui Hecheng authored
This can be reproduced by fstests: btrfs/070

The scenario is like the following:

  replace worker thread              defrag thread
  ---------------------              -------------
  copy_nocow_pages_worker            btrfs_defrag_file
    copy_nocow_pages_for_inode         ...
                                     btrfs_writepages
    |A| lock_extent_bits               extent_write_cache_pages
                                     |B|   lock_page
                                             __extent_writepage
      ...                                      writepage_delalloc
                                                 find_lock_delalloc_range
                                     |A|           lock_extent_bits
    find_or_create_page
      pagecache_get_page
    |B| lock_page

This leads to an ABBA pattern deadlock. To fix it:

o we just change it to an AABB pattern, which means to @unlock_extent_bits() before we @lock_page(); this way @extent_read_full_page_nolock() is no longer in a locked context, so change it back to @extent_read_full_page() to regain protection.

o since we @unlock_extent_bits() earlier, by the time we reach @write_page_nocow() the extent may no longer point at the physical block we want, so we have to check it before the write.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Tested-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
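A rough illustration of the resulting AABB ordering in the replace worker (a sketch built from the function names above, not the literal btrfs code; extent_still_matches() is a hypothetical helper standing in for the re-check):

    lock_extent_bits(tree, start, end, 0, &cached);        /* take A */
    /* look up the extent, remember the physical block it maps to */
    unlock_extent_bits(tree, start, end, &cached);         /* drop A first */

    page = find_or_create_page(mapping, index, GFP_NOFS);  /* take B after A */
    /* the extent was unlocked above and may have changed meanwhile, so
     * re-check it still points at the expected physical block before
     * the nocow write (the second bullet above) */
    if (extent_still_matches(inode, start, physical))      /* hypothetical helper */
            /* ... write_page_nocow() ... */
    unlock_page(page);                                     /* drop B */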
-
Filipe Manana authored
Replacing an xattr consists of doing a lookup for its existing value, deleting the current value from the respective leaf, releasing the search path and then finally inserting the new value. This leaves a time window where readers (getxattr, listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs, so this has security implications.

This change also fixes 2 other existing issues:

*) Deleting the old xattr value without verifying first if the new xattr will fit in the existing leaf item (in case multiple xattrs are packed in the same item due to name hash collision);

*) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't exist but we have an existing item that packs multiple xattrs with the same name hash as the input xattr. In this case we should return ENOSPC.

A test case for xfstests follows soon.

Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace implementation.

Reported-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
We try to allocate an extent state structure before acquiring the extent state tree's spinlock, as we might need a new one later and therefore avoid doing an atomic allocation later while holding the tree's spinlock. However we returned -ENOMEM if that initial non-atomic allocation failed, which is a bit excessive since we might end up not needing the pre-allocated extent state at all - for the case where no extent state in the tree partially covers the input range and needs to be split. Therefore don't return -ENOMEM if that pre-allocation fails.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
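The optimistic-preallocation pattern in question looks roughly like this (a minimal sketch, assuming a kmem cache for extent states; names are illustrative, not the exact btrfs code):

    struct extent_state *prealloc;
    int err = 0;

    /* sleepable allocation before taking the spinlock;
     * do NOT fail here - we may never need the result */
    prealloc = kmem_cache_zalloc(extent_state_cache, GFP_NOFS);

    spin_lock(&tree->lock);
    /* walk the tree; only if an existing state must be split: */
    if (need_split) {
            if (!prealloc) {
                    /* last resort: atomic allocation under the lock */
                    prealloc = kmem_cache_zalloc(extent_state_cache,
                                                 GFP_ATOMIC);
                    if (!prealloc) {
                            err = -ENOMEM;  /* only fail here */
                            goto out;
                    }
            }
            /* use prealloc for the split ... */
    }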
-
Josef Bacik authored
Our gluster boxes get several thousand statfs() calls per second, which begins to suck hardcore with all of the lock contention on the chunk mutex and dev list mutex. We don't really need to hold these locks here; if statfs() is transiently weird because of the chunk allocator, we don't care, so remove this locking. We still need the dev_list lock if you mount with -o alloc_start, however, which is a good argument for nuking that thing from orbit, but that's a patch for another day. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Josef Bacik authored
Our gluster boxes were spending lots of time in statfs because our filesystems are huge. The problem is that statfs loops through all of the block groups looking for read-only block groups, and when you have several terabytes worth of data that ends up being a lot of block groups. Move the read-only block groups onto a dedicated read-only list and only process that list in btrfs_account_ro_block_groups_free_space to reduce the amount of churn. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
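The shape of the change is roughly the following (a hedged sketch; the field names ro_bgs and ro_list are assumptions for illustration and the free-space arithmetic is simplified):

    /* each read-only block group also sits on its space_info's ro list */
    struct btrfs_block_group_cache {
            /* ... */
            struct list_head ro_list;   /* assumed field name */
    };

    u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo)
    {
            struct btrfs_block_group_cache *bg;
            u64 free_bytes = 0;

            spin_lock(&sinfo->lock);
            /* walk only the (short) read-only list, not every block group */
            list_for_each_entry(bg, &sinfo->ro_bgs, ro_list)
                    free_bytes += bg->key.offset -
                                  btrfs_block_group_used(&bg->item);
            spin_unlock(&sinfo->lock);

            return free_bytes;
    }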
-
David Sterba authored
Fix copy&paste errors in some messages and add a few more missing macro accessors.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
-
Stefan Behrens authored
The xfstest btrfs/014, which tests the balance operation, caused the check_int module to complain that known blocks changed their physical location. Since this is not an error in this case, only print such messages if the verbose mode was enabled.

Reported-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Tested-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Stefan Behrens authored
The xfstest btrfs/014, which tests the balance operation, caused issues with the check_int module. The attempt was made to use btrfs_map_block() to find the physical location for a written block. However, this was not needed at all, since the location of the written block was already known - a hook to submit_bio() was the reason for entering the check_int module. Additionally, after a block relocation it happened that btrfs_map_block() failed, causing misleading error messages afterwards. This patch changes the check_int module to use the known information of the physical location from the bio.

Reported-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Tested-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
We try to allocate an extent state before acquiring the tree's spinlock, just in case we end up needing to split an existing extent state into two. If that allocation failed, we would return -ENOMEM. However, our single caller (the transaction/log commit code) passes in an extent state that was cached from a call to find_first_extent_bit(), and that has a very high chance to match exactly the input range (always true for a transaction commit and very often, but not always, true for a log commit) - in this case we end up not needing the initial extent state used for an eventual split at all. Therefore just don't return -ENOMEM if we can't allocate the temporary extent state, since we might not need it at all, and if we do end up needing one, we'll allocate it later anyway.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
Right now the only caller of find_first_extent_bit() that is interested in caching extent states (transaction or log commit) never gets an extent state cached. This is because find_first_extent_bit() only caches states that have at least one of the flags EXTENT_IOBITS or EXTENT_BOUNDARY, and the transaction/log commit caller always passes a tree that never has extent states with any of those flags (they can only have one of the following flags: EXTENT_DIRTY, EXTENT_NEW or EXTENT_NEED_WAIT). This change, together with the following one in the patch series (titled "Btrfs: avoid returning -ENOMEM in convert_extent_bit() too early"), will help significantly reduce the chances of calls to convert_extent_bit() failing with -ENOMEM when called from the transaction/log commit code.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
When committing a transaction or a log, we look for btree extents that need to be durably persisted by searching for ranges in an io tree that have some bits set (EXTENT_DIRTY or EXTENT_NEW). We then attempt to clear those bits and set the EXTENT_NEED_WAIT bit, with calls to the function convert_extent_bit, and then start writeback for the extents.

That function however can return an error (at the moment only -ENOMEM is possible, especially when it does GFP_ATOMIC allocation requests through alloc_extent_state_atomic) - that means the ranges didn't get the EXTENT_NEED_WAIT bit set (or at least not for the whole range), which in turn means a call to btrfs_wait_marked_extents() won't find those ranges for which we started writeback, causing a transaction commit or a log commit to persist a new superblock without waiting for the writeback of extents in that range to finish first.

Therefore if a crash happens after persisting the new superblock and before writeback finishes, we have a superblock pointing to roots that weren't fully persisted, or roots that point to nodes or leafs that weren't fully persisted, causing all sorts of unexpected/bad behaviour as we end up reading garbage from disk or the content of some node/leaf from a past generation that got cowed or deleted and is no longer valid (for this latter case we end up getting error messages like "parent transid verify failed on X wanted Y found Z" when reading btree nodes/leafs from disk).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Eryu Guan authored
Device replace could fail due to another running scrub process or any other errors btrfs_scrub_dev() may hit, but this failure doesn't get returned to userspace. The following steps could reproduce this issue:

  mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
  mount /dev/sdb1 /mnt/btrfs
  while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
  btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
  # if this replace succeeded, do the following and repeat until
  # you see this log in dmesg
  # BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
  #btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs

  # once you see the error log in dmesg, check return value of
  # replace
  echo $?

Introduce a new dev replace result, BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS, to catch -EINPROGRESS explicitly and return other errors directly to userspace.

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Shilong Wang authored
The size of @btrfsic_state is more than 2M, so allocating it with kzalloc() is very likely to fail. See the following message:

  [91428.902148] Call Trace:
   [<ffffffff816f6e0f>] dump_stack+0x4d/0x66
   [<ffffffff811b1c7f>] warn_alloc_failed+0xff/0x170
   [<ffffffff811b66e1>] __alloc_pages_nodemask+0x951/0xc30
   [<ffffffff811fd9da>] alloc_pages_current+0x11a/0x1f0
   [<ffffffff811b1e0b>] ? alloc_kmem_pages+0x3b/0xf0
   [<ffffffff811b1e0b>] alloc_kmem_pages+0x3b/0xf0
   [<ffffffff811d1018>] kmalloc_order+0x18/0x50
   [<ffffffff811d1074>] kmalloc_order_trace+0x24/0x140
   [<ffffffffa06c097b>] btrfsic_mount+0x8b/0xae0 [btrfs]
   [<ffffffff810af555>] ? check_preempt_curr+0x85/0xa0
   [<ffffffff810b2de3>] ? try_to_wake_up+0x103/0x430
   [<ffffffffa063d200>] open_ctree+0x1bd0/0x2130 [btrfs]
   [<ffffffffa060fdde>] btrfs_mount+0x62e/0x8b0 [btrfs]
   [<ffffffff811fd9da>] ? alloc_pages_current+0x11a/0x1f0
   [<ffffffff811b0a5e>] ? __get_free_pages+0xe/0x50
   [<ffffffff81230429>] mount_fs+0x39/0x1b0
   [<ffffffff812509fb>] vfs_kern_mount+0x6b/0x150
   [<ffffffff812537fb>] do_mount+0x27b/0xc30
   [<ffffffff811b0a5e>] ? __get_free_pages+0xe/0x50
   [<ffffffff812544f6>] SyS_mount+0x96/0xf0
   [<ffffffff81701970>] system_call_fastpath+0x16/0x1b

Since we are allocating memory for a hash table array, physically contiguous pages would be nice to have but are not required. Fix this problem by first trying kzalloc(); if that fails, fall back to vzalloc().

Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
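The fallback pattern looks roughly like this (a minimal sketch with illustrative function names, assuming kvfree() is available for the common free path):

    #include <linux/slab.h>
    #include <linux/vmalloc.h>
    #include <linux/mm.h>

    static void *btrfsic_state_alloc(size_t size)
    {
            /* try physically contiguous memory first, suppressing the
             * allocation-failure warning since we have a fallback */
            void *p = kzalloc(size, GFP_KERNEL | __GFP_NOWARN);

            if (!p)
                    p = vzalloc(size);  /* virtually contiguous fallback */
            return p;
    }

    static void btrfsic_state_free(void *p)
    {
            kvfree(p);  /* frees either kmalloc'd or vmalloc'd memory */
    }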
-
Filipe Manana authored
If cow_file_range_inline() failed when called from compress_file_range(), we were tagging the locked page for writeback, ending its writeback and unlocking it, but not marking it with an error nor setting AS_EIO in the inode mapping's flags. This made it impossible for a caller of filemap_fdatawrite_range (writepages) or filemap_fdatawait_range() to know that an error happened. And the return value of compress_file_range() is useless, because it's returned to a workqueue task and not to the task calling filemap_fdatawrite_range (writepages).

This change applies on top of the previous patchset starting at the patch titled "[1/5] Btrfs: set page and mapping error on compressed write failure", which changed extent_clear_unlock_delalloc() to use SetPageError and mapping_set_error().

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
Add a helper, so that we stop duplicating this double filemap_fdatawrite_range() call for inodes with async extents (compressed writes) so often.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
For compressed writes, after doing the first filemap_fdatawrite_range() we don't get the pages tagged for writeback immediately. Instead we create a workqueue task, which is run by another kthread, and keep the pages locked. That other kthread compresses data, creates the respective ordered extent/s, tags the pages for writeback and unlocks them. Therefore we need a second call to filemap_fdatawrite_range() if we have compressed writes, as this second call will wait for the pages to become unlocked, then see that they became tagged for writeback and finally wait for the writeback to finish.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
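The series wraps this double flush in a helper (see the entry above); roughly, it looks like the following sketch, where the flag name BTRFS_INODE_HAS_ASYNC_EXTENT is an assumption about how "this inode has compressed writes in flight" is tracked:

    int btrfs_fdatawrite_range(struct inode *inode, loff_t start, loff_t end)
    {
            int ret;

            /* first pass: kicks writepages, which may only hand the pages
             * to the compression workqueue and keep them locked */
            ret = filemap_fdatawrite_range(inode->i_mapping, start, end);

            /* second pass: blocks on the still-locked pages until the
             * worker has tagged them for writeback, then waits for that
             * writeback to finish */
            if (!ret && test_bit(BTRFS_INODE_HAS_ASYNC_EXTENT,
                                 &BTRFS_I(inode)->runtime_flags))
                    ret = filemap_fdatawrite_range(inode->i_mapping,
                                                   start, end);

            return ret;
    }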
-
Filipe Manana authored
Its return value is useless: its single caller ignores it and can't do anything with it anyway, since it's a workqueue task and not the task calling filemap_fdatawrite_range (writepages) nor filemap_fdatawait_range(). Failure is communicated to such functions via the start and end of writeback, with the respective pages tagged with an error and the AS_EIO flag set in the inode's mapping.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Shilong Wang authored
Steps to reproduce:

  # mkfs.btrfs -f /dev/sdb
  # mount -t btrfs /dev/sdb /mnt -o compress=lzo
  # dd if=/dev/zero of=/mnt/data bs=$((33*4096)) count=1

After the previous steps, the inode will be detected as having a bad compression ratio, and the NOCOMPRESS flag will be set for that inode. The reason is that compression has a per-pass limit of 128K; if a 132K write comes in, it will be split into two writes (128K + 4K). This bug is a leftover from commit 68bb462d ("Btrfs: don't compress for a small write"). Fix this problem by checking before each compression pass: if it is a small write (<= blocksize), we bail out and fall into nocompression directly.

Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
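Inside the compression loop, the check presumably ends up looking something like this (a hedged sketch, not the literal patch; the label name follows existing btrfs conventions):

    /* each pass compresses at most 128K; start/end bound the current
     * chunk and advance every iteration */
    if (end - start + 1 <= blocksize) {
            /* too small for compression to save any disk space -
             * don't let this tail chunk flag the whole inode as
             * incompressible */
            goto cleanup_and_bail_uncompressed;
    }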
-
Filipe Manana authored
Our compressed bio write end callback was essentially ignoring the error parameter. When a write error happens, it must pass a value of 0 to the inode's write_page_end_io_hook callback, call SetPageError on the respective pages and set AS_EIO in the inode mapping's flags, so that a call to filemap_fdatawait_range() / filemap_fdatawait() can find out that errors happened (we surely don't want silent failures on fsync for example).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
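In the end-io callback, the error propagation amounts to something like the following fragment (a sketch assuming the callback receives the bio error as err; the surrounding lookup of page/start/end is elided and illustrative):

    /* inside the compressed write end-io callback */
    if (err) {
            SetPageError(page);
            /* records AS_EIO (or AS_ENOSPC) in the mapping so a later
             * filemap_fdatawait_range() reports the failure */
            mapping_set_error(page->mapping, err);
    }
    /* hand 0 (not uptodate) to the hook on error, so the ordered
     * extent completes with the failure recorded */
    tree->ops->writepage_end_io_hook(page, start, end, NULL, !err);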
-
Filipe Manana authored
Its return value is completely ignored by its single caller and it's useless anyway, since errors are indicated through SetPageError and the bit AS_EIO set in the flags of the inode's mapping. The caller can't do anything with the value, as it's invoked from a workqueue task and not by the task calling filemap_fdatawrite_range (which calls the writepages address space callback, which in turn calls the inode's fill_delalloc callback).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
If we had an error when processing one of the async extents from our list, we were not processing the remaining async extents, meaning we would leak those async_extent structs, never release the pages with the compressed data and never unlock and clear the dirty flag from the inode's pages (those that correspond to the uncompressed content).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
In inode.c:submit_compressed_extents(), if we fail before calling btrfs_submit_compressed_write(), or when that function fails, we were freeing the async_extent structure without releasing its pages and freeing the pages array.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
In inode.c:submit_compressed_extents(), before calling btrfs_submit_compressed_write() we start writeback for all pages, clear their dirty flag, unlock them, etc, but if btrfs_submit_compressed_write() fails (at the moment it can only fail with -ENOMEM), we never end the writeback on the pages, so any filemap_fdatawait_range() call will hang forever. We were also not calling the writepage end io hook, which means the corresponding ordered extent will never complete and all its waiters will block forever, such as a full fsync (via btrfs_wait_ordered_range()).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
Filipe Manana authored
If we fail in submit_compressed_extents() before calling btrfs_submit_compressed_write(), we start and end the writeback for the pages (clear their dirty flag, unlock them, etc) but we don't tag the pages, nor the inode's mapping, with an error. This makes it impossible for a caller of filemap_fdatawait_range() (fsync, or transaction commit, for example) to know that there was an error. Note that the return value of submit_compressed_extents() is useless, as that function is executed by a workqueue task and not directly by the fill_delalloc callback. This means the writepage/s callbacks of the inode's address space operations don't get that return value.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
- 17 Nov, 2014 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds authored
Pull ARM SoC fixes from Olof Johansson:
 "Another small set of fixes:

   - some DT compatible typo fixes
   - irq setup fix dealing with irq storms on orion
   - i2c quirk generalization for mvebu
   - a handful of smaller fixes for OMAP
   - a couple of added file patterns for OMAP entries in MAINTAINERS"

* tag 'armsoc-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: at91/dt: Fix sama5d3x typos
  pinctrl: dra: dt-bindings: Fix output pull up/down
  MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
  MAINTAINERS: add more files under OMAP SUPPORT
  ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
  ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
  ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
  ARM: dts: am335x-evm: Fix 5th NAND partition's name
  ARM: orion: Fix for certain sequence of request_irq can cause irq storm
  ARM: mvebu: armada xp: Generalize use of i2c quirk
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds authored
Pull sparc fixes from David Miller:

 1) Fix NULL oops in Schizo PCI controller error handler.

 2) Fix race between xchg and other operations on 32-bit sparc, from
    Andreas Larsson.

 3) swab*() helpers need a dummy memory input operand to show data flow
    on 64-bit sparc.

 4) Fix RCU warnings due to missing irq_{enter,exit}() around
    generic_smp_call_function*() calls.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix constraints on swab helpers.
  sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks
  sparc64: Do irq_{enter,exit}() around generic_smp_call_function*().
  sparc64: Fix crashes in schizo_pcierr_intr_other().
-
- 16 Nov, 2014 9 commits
-
-
git://neil.brown.name/md
Linus Torvalds authored
Pull md bugfix from Neil Brown:
 "One fix for md for 3.18. This fixes a regression introduced in 3.13"

* tag 'md/3.18-fix' of git://neil.brown.name/md:
  md: Always set RECOVERY_NEEDED when clearing RECOVERY_FROZEN
-
Peter Rosin authored
Some DT files had a typo: a missing "5" in the first sama5d3x compatible string.

Signed-off-by: Peter Rosin <peda@axentia.se>
[nicolas.ferre@atmel.com: modify commit log]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
-
Olof Johansson authored
Merge tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Merge "omap fixes against v3.18-rc4" from Tony Lindgren:

 Few omap fixes for hangs and wrong pinctrl defines, and update
 MAINTAINERS file to avoid missing PMIC and SoC related patches:

 - Fix random hangs on am437x because of incorrect default value for
   the DDR regulator
 - Fix wrong partition name for NAND on am335x-evm
 - Fix wrong pinctrl defines for dra7xx
 - Update maintainers entries for PMICs and SoCs

* tag 'omap-fixes-against-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  pinctrl: dra: dt-bindings: Fix output pull up/down
  MAINTAINERS: Update entry for omap related .dts files to cover new SoCs
  MAINTAINERS: add more files under OMAP SUPPORT
  ARM: dts: AM437x-SK-EVM: Fix DCDC3 voltage
  ARM: dts: AM437x-GP-EVM: Fix DCDC3 voltage
  ARM: dts: AM43x-EPOS-EVM: Fix DCDC3 voltage
  ARM: dts: am335x-evm: Fix 5th NAND partition's name

Signed-off-by: Olof Johansson <olof@lixom.net>
-
git://git.infradead.org/linux-mvebu
Olof Johansson authored
Merge "mvebu fixes for v3.18" from Jason Cooper: - Armada XP - Generalize i2c quirk - orion - Fix irq storm caused by specific sequence of request_irq * tag 'mvebu-fixes-3.18' of git://git.infradead.org/linux-mvebu: ARM: orion: Fix for certain sequence of request_irq can cause irq storm ARM: mvebu: armada xp: Generalize use of i2c quirk
-
NeilBrown authored
md_check_recovery will skip any recovery and also clear MD_RECOVERY_NEEDED if MD_RECOVERY_FROZEN is set. So when we clear _FROZEN, we must set _NEEDED and ensure that md_check_recovery gets run. Otherwise we could miss out on something that is needed. In particular, this can make it impossible to remove a failed device from an array if the 'recovery-needed' processing didn't happen. Suitable for stable kernels since 3.13.

Cc: stable@vger.kernel.org (3.13+)
Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Fixes: 30b8feb7
Signed-off-by: NeilBrown <neilb@suse.de>
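Concretely, the rule is that clearing the frozen bit must be paired with raising the needed bit and waking the md thread; roughly (a sketch following the flag names above, not the literal md.c hunk):

    if (cmd_match(page, "frozen")) {
            set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
    } else {
            clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
            /* anything deferred while frozen must be re-evaluated */
            set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
            md_wakeup_thread(mddev->thread);
    }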
-
David S. Miller authored
We are reading the memory location, so we have to have a memory constraint in there, purely for the sake of showing the data flow to the compiler.

Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
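For one of the swab helpers, the fix is along these lines (a sketch of the constraint change; the exact instruction and ASI come from sparc64's little-endian load helpers, but treat the details as illustrative):

    static inline __u16 __arch_swab16p(const __u16 *addr)
    {
            __u16 ret;

            /* the "m" (*addr) input is the dummy memory operand: it tells
             * the compiler this asm reads *addr, so prior stores to it
             * cannot be reordered past (or dropped before) the load */
            __asm__ __volatile__ ("lduha [%2] %3, %0"
                                  : "=r" (ret)
                                  : "m" (*addr), "r" (addr), "i" (ASI_PL));
            return ret;
    }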
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds authored
Pull SCSI fixes from James Bottomley:
 "This is a set of six fixes and a MAINTAINER update. The fixes are
  two multipath (one in Test Unit Ready handling for the path checkers
  and one in the section of code that sends a start unit after
  failover; both of these were perturbed by the scsi-mq update), a
  CD-ROM door locking fix that was likewise introduced by scsi-mq and
  three driver fixes for a previous code update in cxgb4i,
  megaraid_sas and bnx2fc"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  bnx2fc: fix tgt spinlock locking
  megaraid_sas: fix bug in handling return value of pci_enable_msix_range()
  cxgb4i: send abort_rpl correctly
  cxgbi: add maintainer for cxgb3i/cxgb4i
  scsi: TUR path is down after adapter gets reset with multipath
  scsi: call device handler for failed TUR command
  scsi: only re-lock door after EH on devices that were reset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 fixes from Ingo Molnar:
 "Microcode fixes, a Xen fix and a KASLR boot loading fix with certain
  memory layouts"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode, AMD: Fix ucode patch stashing on 32-bit
  x86/core, x86/xen/smp: Use 'die_complete' completion when taking CPU down
  x86, microcode: Fix accessing dis_ucode_ldr on 32-bit
  x86, kaslr: Prevent .bss from overlaping initrd
  x86, microcode, AMD: Fix early ucode loading on 32-bit
-
Linus Torvalds authored
Al Viro pointed out that the x86-64 csum_partial_copy_from_user() is somewhat confused about what it should do on errors: notably, it mostly clears the uncopied end result buffer, but misses that for the initial alignment case.

All users should check for errors, so it's dubious whether the clearing is even necessary, and Al also points out that we should probably clean up the calling conventions, but regardless of any future changes to this function, the fact that it is inconsistent is just annoying.

So make the __get_user() failure path use the same error exit as all the other errors do.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-