- 25 Mar, 2013 4 commits
-
-
Jens Axboe authored
Less error prone if we just kill it, it's only used once anyway. Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kent Overstreet authored
Took out some nested functions, and fixed some more checkpatch complaints. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Kent Overstreet authored
config: make ARCH=i386 allmodconfig All error/warnings: drivers/md/bcache/bset.c: In function 'bch_ptr_bad': >> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/debug.c: In function 'bch_pbtree': >> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/btree.c: In function 'bch_btree_read_done': >> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/closure.o: In function `closure_debug_init': >> (.init.text+0x0): multiple definition of `init_module' >> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
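The -Wformat warnings quoted above appear on i386 because `size_t` is not `long` there. A minimal userspace sketch of the usual remedy, the `z` length modifier, follows; the commit's actual diff is not shown here, so this is an assumption about the fix, and `format_size` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * "%zu" matches size_t on both 32-bit and 64-bit targets, whereas
 * "%lu" / "%li" only match where size_t happens to be a long type.
 */
static int format_size(char *buf, size_t buflen, size_t value)
{
	return snprintf(buf, buflen, "%zu", value);
}
```

Calling `format_size(buf, sizeof(buf), 4096)` writes the decimal digits of the value into `buf` without any format warning on either word size.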
-
Jens Axboe authored
Merge branch 'bcache-for-upstream' of http://evilpiepirate.org/git/linux-bcache into for-3.10/drivers
-
- 23 Mar, 2013 23 commits
-
-
Kent Overstreet authored
Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>
-
Kent Overstreet authored
Hack, but bcache needs a way around lockdep for locking during garbage collection - we need to keep multiple btree nodes locked for coalescing and rw_lock_nested() isn't really sufficient or appropriate here. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Ingo Molnar <mingo@redhat.com>
-
Kent Overstreet authored
Exported so it can be used by bcache's tracepoints Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: Ingo Molnar <mingo@redhat.com>
-
Kent Overstreet authored
Needed for bcache - need a cheap source of random numbers for perturbing IO sizes, for rate limiting IO to the SSD. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: "Theodore Ts'o" <tytso@mit.edu>
-
Kent Overstreet authored
This reverts commit 11b80f45. Bcache needs rw semaphores for cache coherency in writeback mode - writes have to take a read lock on a per cache device rw sem, and release it when the bio completes. But since this is for bios it's naturally not in the context of the process that originally took the lock. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Christoph Hellwig <hch@infradead.org> CC: David Howells <dhowells@redhat.com>
-
Lars Ellenberg authored
Now that the on-disk activity-log ring buffer size is adjustable, the maximum active set can become larger, and is now limited by the use of 16bit "labels". This increases the maximum working set from 6433 to 65534 extents, each of which covers an area of 4MiB. Which means that if you use the maximum, you'd have to resync more than 250 GiB after an unclean Primary shutdown. With capable backend storage and replication links, this is entirely feasible. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
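The resync figure above follows from simple arithmetic: 65534 extents × 4 MiB = 262136 MiB, just under 256 GiB, while the old limit of 6433 extents gives 25732 MiB, roughly 25 GiB. A small sketch of that calculation (the names `AL_EXTENT_SIZE_MIB` and `al_resync_mib` are hypothetical, chosen for illustration):

```c
#include <stdint.h>

/* Each activity-log extent covers 4 MiB, as stated in the commit message. */
#define AL_EXTENT_SIZE_MIB 4u

/* Worst-case amount of data (in MiB) to resync for a given active set size. */
static uint64_t al_resync_mib(uint64_t extents)
{
	return extents * AL_EXTENT_SIZE_MIB;
}
```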
-
Lars Ellenberg authored
There may have been more incoming requests while we were preparing the current transaction. Try to consolidate more updates into this transaction until we make no more progress. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
The IO accounting of the drbd "queue depth" was misleading. We only started IO accounting once we already wrote the activity log. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Depending on current IO depth, try to consolidate as many updates as possible into one activity log transaction. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
New helper to be able to consolidate more updates into a single transaction. Without this, we can only grab a single refcount on an updated element while preparing a transaction. lc_get_cumulative - like lc_get; also finds to-be-changed elements @lc: the lru cache to operate on @enr: the label to look up Unlike lc_get this also returns the element for @enr, if it is belonging to a pending transaction, so the return values are like for lc_get(), plus: pointer to an element already on the "to_be_changed" list. In this case, the cache was already marked %LC_DIRTY. Caller needs to make sure that the pending transaction is completed, before proceeding to actually use this element. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Fixed up by Jens to export lc_get_cumulative(). Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
To make the code easier to follow, use an explicit find_active_resync_extent(), and add a "nonblock" parameter to _al_get(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
This is in preparation to be able to defer requests that need to wait for an activity log transaction to a submitter workqueue. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
A request hitting an already "hot" extent should proceed right away, even if some other requests need to wait for pending transactions. Without that short-circuit, several simultaneous make_request contexts race for committing the transaction, possibly penalizing the innocent. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
We used to calculate all on-disk meta data offsets, and then compare the stored offsets, basically treating them as magic numbers. Now with the activity log striping, the activity log size is no longer fixed. We need to first read the super block, then base the activity log and bitmap offsets on the stored offsets/al stripe settings. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Make it obvious that this value is in units of 512 Byte sectors. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Now we have the cached meta_dev_idx member, we can get rid of a few rcu_read_lock() sections and rcu_dereference(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k The intended use case is activity log on RAID 0 or similar. Logically consecutive transactions will advance their on-disk position by al_stripe_size_4k 4kB (transaction sized) blocks. Right now, these are still asserted to be the backward compatible values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB). Also introduce a caching member for meta_dev_idx in the in-core structure: even though it is initially passed in in the rcu-protected disk_conf structure, it cannot change without a detach/attach cycle. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
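One way to picture the striping described above is as a mapping from a logical transaction slot to an on-disk 4 kB block. The sketch below is an assumption built only from the commit text (consecutive transactions advance by `al_stripe_size_4k` blocks and rotate through `al_stripes` stripes); the function name is hypothetical and this is not claimed to be the driver's actual code.

```c
/*
 * Hypothetical mapping, for illustration only.
 * t is the logical transaction slot, assumed to be already reduced
 * modulo the total log size in 4 kB blocks.
 */
static unsigned int al_slot_to_block_4k(unsigned int t,
					unsigned int al_stripes,
					unsigned int al_stripe_size_4k)
{
	unsigned int stripe = t % al_stripes; /* consecutive slots rotate through the stripes */
	unsigned int offset = t / al_stripes; /* position inside the selected stripe */

	return stripe * al_stripe_size_4k + offset;
}
```

With the backward compatible values `al_stripes = 1` and `al_stripe_size_4k = 8`, the mapping is the identity over a single 8 × 4 kB = 32 kB log, which matches the fixed layout mentioned in the message.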
-
Lars Ellenberg authored
Add a comment about our meta data layout variants, and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT) to make it clear that they are short hand for fixed constants, and not arbitrarily to be redefined as one may see fit. Properly pad struct meta_data_on_disk to 4kB, and initialize to zero not only the first 512 Byte, but all of it in drbd_md_sync(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Lars Ellenberg authored
This fixes ASSERT( mdev->state.disk == D_FAILED ) in drivers/block/drbd/drbd_main.c When we detach from local disk, we let the local refcount hit zero twice. First, we transition to D_FAILED, so we won't give out new references to incoming requests; we still may give out *internal* references, though. Once the refcount hits zero [1] while in D_FAILED, we queue a transition to D_DISKLESS to our worker. We need to queue it, because we may be in atomic context when putting the reference. Once the transition to D_DISKLESS actually happened [2] from worker context, we don't give out new internal references either. Between hitting zero the first time [1] and actually transitioning to D_DISKLESS [2], there may be a few very short lived internal get/put, so we may hit zero more than once while being in D_FAILED, or even see a race where an internal get_ldev() happened while D_FAILED, but the corresponding put_ldev() happens just after the transition to D_DISKLESS. That's why we have the additional test_and_set_bit(GO_DISKLESS,); and that's why the assert was placed wrong. Since there was exactly one code path left to drbd_go_diskless(), and that already checks for D_FAILED, drop that assert, and fold in the drbd_queue_work(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
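The "hit zero more than once, but queue only once" behaviour described above can be modelled in userspace with C11 atomics. This is a simplified sketch under stated assumptions, not the DRBD code: `struct ldev_model` and the `model_*` functions are hypothetical, and `atomic_flag_test_and_set()` stands in for the kernel's `test_and_set_bit()`.

```c
#include <stdatomic.h>

enum disk_state { D_DISKLESS, D_FAILED, D_UP_TO_DATE };

/* Hypothetical model of the local-disk reference counting. */
struct ldev_model {
	atomic_int local_cnt;    /* outstanding references to the local disk */
	atomic_flag go_diskless; /* set once the D_DISKLESS transition is queued */
	enum disk_state disk;
	int queued;              /* how many times the transition was queued */
};

static void model_init(struct ldev_model *m, enum disk_state s, int refs)
{
	atomic_init(&m->local_cnt, refs);
	atomic_flag_clear(&m->go_diskless);
	m->disk = s;
	m->queued = 0;
}

static void model_get_ldev(struct ldev_model *m)
{
	atomic_fetch_add(&m->local_cnt, 1);
}

static void model_put_ldev(struct ldev_model *m)
{
	/* atomic_fetch_sub returns the old value; subtract 1 for the new one */
	int remaining = atomic_fetch_sub(&m->local_cnt, 1) - 1;

	/* The count may reach zero several times while in D_FAILED;
	 * the flag ensures the transition is queued only the first time. */
	if (remaining == 0 && m->disk == D_FAILED &&
	    !atomic_flag_test_and_set(&m->go_diskless))
		m->queued++;
}
```

Because the flag, not the disk state alone, guards the queueing, a short-lived internal get/put pair after the first zero crossing does not queue a second transition.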
-
- 17 Mar, 2013 4 commits
-
-
Linus Torvalds authored
-
David Rientjes authored
Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after suspend/resume") introduces a link failure since perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL: arch/x86/power/built-in.o: In function `restore_processor_state': (.text+0x45c): undefined reference to `perf_restore_debug_store' Fix it by defining the dummy function appropriately. Signed-off-by: David Rientjes <rientjes@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
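The "dummy function" approach mentioned above is the common pattern of providing an empty inline stub when a feature is compiled out, so that callers still link. A minimal sketch, assuming a hypothetical `DEMO_CPU_SUP_INTEL` switch in place of `CONFIG_CPU_SUP_INTEL` and hypothetical `demo_*` names in place of the kernel functions:

```c
/* Hypothetical feature switch standing in for CONFIG_CPU_SUP_INTEL. */
#ifdef DEMO_CPU_SUP_INTEL
/* Real implementation would be provided by another translation unit. */
void demo_restore_debug_store(void);
#else
/* No-op stub: callers compile and link even when the feature is absent. */
static inline void demo_restore_debug_store(void) { }
#endif

/* Caller that must build in both configurations. */
static int demo_restore_processor_state(void)
{
	demo_restore_debug_store();
	return 0;
}
```

Without the `#else` branch, building with the switch undefined would produce the same kind of "undefined reference" link error quoted in the message.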
-
Linus Torvalds authored
Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after suspend/resume") fixed a crash when doing PEBS performance profiling after resuming, but in using init_debug_store_on_cpu() to restore the DS_AREA MSR it also resulted in a new WARN_ON() triggering. init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU cross-calls to do the MSR update. Which is not really valid at the early resume stage, and the warning is quite reasonable. Now, it all happens to _work_, for the simple reason that smp_call_function_single() ends up just doing the call directly on the CPU when the CPU number matches, but we really should just do the wrmsr() directly instead. This duplicates the wrmsr() logic, but hopefully we can just remove the wrmsr_on_cpu() version eventually. Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Linus Torvalds authored
Pull btrfs fixes from Chris Mason: "Eric's rcu barrier patch fixes a long standing problem with our unmount code hanging on to devices in workqueue helpers. Liu Bo nailed down a difficult assertion for in-memory extent mappings." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix warning of free_extent_map Btrfs: fix warning when creating snapshots Btrfs: return as soon as possible when edquot happens Btrfs: return EIO if we have extent tree corruption btrfs: use rcu_barrier() to wait for bdev puts at unmount Btrfs: remove btrfs_try_spin_lock Btrfs: get better concurrency for snapshot-aware defrag work
-
- 16 Mar, 2013 8 commits
-
-
Liu Bo authored
Users report that an extent map's list is still linked when it's actually going to be freed from cache. The story is that a) when we're going to drop an extent map and may split this large one into smaller ems, and if this large one is flagged as EXTENT_FLAG_LOGGING which means that it's on the list to be logged, then the smaller ems split from it will also be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected. b) we'll keep ems from unlinking the list and freeing when they are flagged with EXTENT_FLAG_LOGGING, because the log code holds one reference. The end result is the warning, but the truth is that we set the flag EXTENT_FLAG_LOGGING only during fsync. So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one. Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
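The fix described above amounts to not inheriting one flag when a large extent map is split. A minimal sketch of that idea, using hypothetical `demo_*` types and bitmask flags rather than the btrfs structures and helpers:

```c
#include <stdint.h>

/* Hypothetical flags; only the LOGGING one must not be inherited. */
enum {
	DEMO_FLAG_PINNED  = 1u << 0,
	DEMO_FLAG_LOGGING = 1u << 1, /* set only while the map is on the log list */
};

struct demo_em {
	uint64_t start;
	uint64_t len;
	unsigned int flags;
};

/* Build the tail piece of a split: copy the flags, but clear LOGGING,
 * because the new piece is not on the list of maps to be logged. */
static struct demo_em demo_split_tail(const struct demo_em *em, uint64_t split_at)
{
	struct demo_em tail = {
		.start = split_at,
		.len   = em->start + em->len - split_at,
		.flags = em->flags & ~(unsigned int)DEMO_FLAG_LOGGING,
	};

	return tail;
}
```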
-
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds authored
Pull kbuild fix from Michal Marek: "One fix for make headers_install/headers_check to not require make 3.81. The requirement has been accidentally introduced in 3.7." * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: kbuild: fix make headers_check with make 3.80
-
git://openrisc.net/jonas/linux
Linus Torvalds authored
Pull OpenRISC bug fixes from Jonas Bonn: - The GPIO descriptor work has exposed how broken the non-GPIOLIB bits for OpenRISC were. We now require GPIOLIB as this is the preferred way forward. - The system.h split introduced a bug in llist.h for arches using asm-generic/cmpxchg.h directly, which is currently only OpenRISC. The patch here moves two defines from asm-generic/atomic.h to asm-generic/cmpxchg.h to make things work as they should. - The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does not have the virt_to_bus methods, so there's a patch to remove it again. * tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux: openrisc: remove HAVE_VIRT_TO_BUS asm-generic: move cmpxchg*_local defs to cmpxchg.h openrisc: require gpiolib
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Linus Torvalds authored
Pull char/misc fixes from Greg Kroah-Hartman: "Here are some tiny fixes for the w1 drivers and the final removal patch for getting rid of CONFIG_EXPERIMENTAL (all users of it are now gone from your tree, this just drops the Kconfig item itself.) All have been in the linux-next tree for a while" * tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: final removal of CONFIG_EXPERIMENTAL w1: fix oops when w1_search is called from netlink connector w1-gpio: fix unused variable warning w1-gpio: remove erroneous __exit and __exit_p() ARM: w1-gpio: fix erroneous gpio requests
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds authored
Pull sound fixes from Takashi Iwai: "A collection of small fixes, as expected for the middle rc: - A couple of fixes for potential NULL dereferences and out-of-range array accesses revealed by static code parsers - A fix for the wrong error handling detected by trinity - A regression fix for missing audio on some MacBooks - CA0132 DSP loader fixes - Fix for EAPD control of IDT codecs on machines w/o speaker - Fix a regression in the HD-audio widget list parser code - Workaround for the NuForce UDH-100 USB audio" * tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs sound: sequencer: cap array index in seq_chn_common_event() ALSA: hda/ca0132 - Remove extra setting of dsp_state. ALSA: hda/ca0132 - Check download state of DSP. ALSA: hda/ca0132 - Check if dspload_image succeeded. ALSA: hda - Disable IDT eapd_switch if there are no internal speakers ALSA: hda - Fix snd_hda_get_num_raw_conns() to return a correct value ALSA: usb-audio: add a workaround for the NuForce UDH-100 ALSA: asihpi - fix potential NULL pointer dereference ALSA: seq: Fix missing error handling in snd_seq_timer_open()
-
git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Linus Torvalds authored
Pull DMA-mapping fix from Marek Szyprowski: "An important fix for all ARM architectures which use ZONE_DMA. Without it dma_alloc_* calls with GFP_ATOMIC flag might have allocated buffers outside the DMA zone." * 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation
-
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes
Linus Torvalds authored
Pull MFD fixes from Samuel Ortiz: "This is the first batch of MFD fixes for 3.9. With this one we have: - An ab8500 build failure fix. - An ab8500 device tree parsing fix. - A fix for twl4030_madc remove routine to work properly (when built-in). - A fix for properly registering palmas interrupt handler. - A fix for omap-usb init routine to actually write into the hostconfig register. - A couple of warning fixes for ab8500-gpadc and tps65912" * tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes: mfd: twl4030-madc: Remove __exit_p annotation mfd: ab8500: Kill "reg" property from binding mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO mfd: wm831x: Don't forward declare enum wm831x_auxadc mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource() mfd: tps65912: Declare and use tps65912_irq_exit() mfd: palmas: Provide irq flags through DT/platform data mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error mfd: omap-usb-host: Actually update hostconfig
-
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Linus Torvalds authored
Pull hwmon fixes from Guenter Roeck: "Bug fixes for pmbus, ltc2978, and lineage-pem drivers Added specific maintainer for some hwmon drivers" * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (pmbus/ltc2978) Fix temperature reporting hwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute() hwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes MAINTAINERS: Add maintainer for MAX6697, INA209, and INA2XX drivers
-
- 15 Mar, 2013 1 commit
-
-
Stephane Eranian authored
This patch fixes a kernel crash when using precise sampling (PEBS) after a suspend/resume. Turns out the CPU notifier code is not invoked on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly by the kernel and keeps its power-on/resume value of 0, causing any PEBS measurement to crash when running on CPU0. The workaround is to add a hook in the actual resume code to restore the DS Area MSR value. It is invoked for all CPUs. So for all but CPU0, the DS_AREA will be restored twice but this is harmless. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-