- 11 Sep, 2009 21 commits
-
-
Jens Axboe authored
Test results here look good, and on big OLTP runs it has also shown to significantly increase cycles attributed to the database and cause a performance boost. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Shan Wei authored
The blktrace tools can show process id when cfq dispatched a request, using cfq_log_cfqq() instead of cfq_log(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Marcin Slusarz authored
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Tim Waugh <tim@cyberelk.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Eric Dumazet authored
commit 22bece00 (cciss: fix regression firmware not displayed in procfs) added a small memory leak in cciss_init_one() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Miklos Szeredi authored
Splice should update the modification and access times on regular files just like read and write. Not updating mtime will confuse backup tools, etc... This patch only adds the time updates for regular files. For pipes and other special files that splice touches the need for updating the times is less clear. Let's discuss and fix that separately. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
Return 0 if we successfully marked this iopoll structure as ours for scheduling, instead of 1. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
It's not currently used, as pointed out by Gui Jianfeng <guijianfeng@cn.fujitsu.com>. We already check the wait_request flag to allow an idling queue priority allocation access, so we don't need this extra flag. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
We already have interrupts disabled at that point, so use the __raise_softirq_irqoff() variant. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
It's not exported, I doubt we'll have a reason to change this... Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
Note sure why they happened in the first place, probably some bad terminal setting. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
This borrows some code from NAPI and implements a polled completion mode for block devices. The idea is the same as NAPI - instead of doing the command completion when the irq occurs, schedule a dedicated softirq in the hopes that we will complete more IO when the iopoll handler is invoked. Devices have a budget of commands assigned, and will stay in polled mode as long as they continue to consume their budget from the iopoll softirq handler. If they do not, the device is set back to interrupt completion mode. This patch holds the core bits for blk-iopoll, device driver support sold separately. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
Instead of just checking whether this device uses block layer tagging, we can improve the detection by looking at the maximum queue depth it has reached. If that crosses 4, then deem it a queuing device. This is important on high IOPS devices, since plugging hurts the performance there (it can be as much as 10-15% of the sys time). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
Get rid of any functions that test for these bits and make callers use bio_rw_flagged() directly. Then it is at least directly apparent what variable and flag they check. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
Makes for a saner interface, instead of returning the bit position. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Hannes Reinecke authored
Whenever a block device changes it's read-only attribute notify the userspace about it. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Vivek Goyal authored
o Get rid of busy_rt_queues infrastructure. Looks like it is redundant. o Once an RT queue gets request it will preempt any of the BE or IDLE queues immediately. Otherwise this queue will be put on service tree and scheduler will anyway select this queue before any of the BE or IDLE queue. Hence looks like there is no need to keep track of how many busy RT queues are currently on service tree. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
To lessen the impact of async IO on sync IO, let the device drain of any async IO in progress when switching to a sync cfqq that has idling enabled. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Update scsi_io_completion() such that it only fails requests till the next error boundary and retry the leftover. This enables block layer to merge requests with different failfast settings and still behave correctly on errors. Allow merge of requests of different failfast settings. As SCSI is currently the only subsystem which follows failfast status, there's no need to worry about other block drivers for now. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Niel Lambrechts <niel.lambrechts@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Failfast has characteristics from other attributes. When issuing, executing and successuflly completing requests, failfast doesn't make any difference. It only affects how a request is handled on failure. Allowing requests with different failfast settings to be merged cause normal IOs to fail prematurely while not allowing has performance penalties as failfast is used for read aheads which are likely to be located near in-flight or to-be-issued normal IOs. This patch introduces the concept of 'mixed merge'. A request is a mixed merge if it is merge of segments which require different handling on failure. Currently the only mixable attributes are failfast ones (or lack thereof). When a bio with different failfast settings is added to an existing request or requests of different failfast settings are merged, the merged request is marked mixed. Each bio carries failfast settings and the request always tracks failfast state of the first bio. When the request fails, blk_rq_err_bytes() can be used to determine how many bytes can be safely failed without crossing into an area which requires further retrials. This allows request merging regardless of failfast settings while keeping the failure handling correct. This patch only implements mixed merge but doesn't enable it. The next one will update SCSI to make use of mixed merge. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Niel Lambrechts <niel.lambrechts@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
bio and request use the same set of failfast bits. This patch makes the following changes to simplify things. * enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_* bits coincide with __REQ_FAILFAST_* bits. * The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV but the matching is useless anyway. init_request_from_bio() is responsible for setting FAILFAST bits on FS requests and non-FS requests never use BIO_RW_AHEAD. Drop the code and comment from blk_rq_bio_prep(). * Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and simplify FAILFAST flags handling in init_request_from_bio(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 10 Sep, 2009 3 commits
-
-
Geert Uytterhoeven authored
Commit b8313b6d ("dm log: remove incorrect field from userspace table output") added a call to strstr() with a single-character "needle" string parameter. Unfortunately some versions of gcc replace such calls to strstr() by calls to strchr() behind our back. This causes linking errors if strchr() is defined as an inline function in <asm/string.h> (e.g. on m68k): | WARNING: "strchr" [drivers/md/dm-log-userspace.ko] undefined! Avoid this by explicitly calling strchr() instead. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
* lookup-permissions-cleanup: jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()' ext[234]: move over to 'check_acl' permission model shmfs: use 'check_acl' instead of 'permission' Make 'check_acl()' a first-class filesystem op Simplify exec_permission_lite(), part 3 Simplify exec_permission_lite() further Simplify exec_permission_lite() logic Do not call 'ima_path_check()' for each path component
-
Roland McGrath authored
In fs/binfmt_elf.c, load_elf_interp() calls padzero() for .bss even if the PT_LOAD has no PROT_WRITE and no .bss. This generates EFAULT. Here is a small test case. (Yes, there are other, useful PT_INTERP which have only .text and no .data/.bss.) ----- ptinterp.S _start: .globl _start nop int3 ----- $ gcc -m32 -nostartfiles -nostdlib -o ptinterp ptinterp.S $ gcc -m32 -Wl,--dynamic-linker=ptinterp -o hello hello.c $ ./hello Segmentation fault # during execve() itself After applying the patch: $ ./hello Trace trap # user-mode execution after execve() finishes If the ELF headers are actually self-inconsistent, then dying is fine. But having no PROT_WRITE segment is perfectly normal and correct if there is no segment with p_memsz > p_filesz (i.e. bss). John Reiser suggested checking for PROT_WRITE in the bss logic. I think it makes most sense to simply apply the bss logic only when there is bss. This patch looks less trivial than it is due to some reindentation. It just moves the "if (last_bss > elf_bss) {" test up to include the partial-page bss logic as well as the more-pages bss logic. Reported-by: John Reiser <jreiser@bitwagon.com> Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 Sep, 2009 3 commits
-
-
Linus Torvalds authored
-
Ed Cashin authored
Andy Whitcroft reported an oops in aoe triggered by use of an incorrectly initialised request_queue object: [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add an uninitialized object, something is seriously wrong. [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu [ 2645.959107] Call Trace: [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70 [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0 [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160 [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe] The request queue of an aoe device is not used but can be allocated in code that does not sleep. Bruno bisected this regression down to cd43e26f block: Expose stacked device queues in sysfs "This seems to generate /sys/block/$device/queue and its contents for everyone who is using queues, not just for those queues that have a non-NULL queue->request_fn." Addresses http://bugs.launchpad.net/bugs/410198 Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942 Note that embedding a queue inside another object has always been an illegal construct, since the queues are reference counted and must persist until the last reference is dropped. So aoe was always buggy in this respect (Jens). Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Bruno Premont <bonbons@linux-vserver.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Linus Torvalds authored
Reinette Chatre reports a frozen system (with blinking keyboard LEDs) when switching from graphics mode to the text console, or when suspending (which does the same thing). With netconsole, the oops turned out to be BUG: unable to handle kernel NULL pointer dereference at 0000000000000084 IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915] and it's due to the i915_gem.c code doing drm_irq_uninstall() after having done i915_gem_idle(). And the i915_gem_idle() path will do i915_gem_idle() -> i915_gem_cleanup_ringbuffer() -> i915_gem_cleanup_hws() -> dev_priv->hw_status_page = NULL; but if an i915 interrupt comes in after this stage, it may want to access that hw_status_page, and gets the above NULL pointer dereference. And since the NULL pointer dereference happens from within an interrupt, and with the screen still in graphics mode, the common end result is simply a silently hung machine. Fix it by simply uninstalling the irq handler before idling rather than after. Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13819Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 08 Sep, 2009 9 commits
-
-
Linus Torvalds authored
This avoids an indirect call in the VFS for each path component lookup. Well, at least as long as you own the directory in question, and the ACL check is unnecessary. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Don't implement per-filesystem 'extX_permission()' functions that have to be called for every path component operation, and instead just expose the actual ACL checking so that the VFS layer can now do it for us. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
shmfs wants purely standard POSIX ACL semantics, so we can use the new generic VFS layer POSIX ACL checking rather than cooking our own 'permission()' function. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
This is stage one in flattening out the callchains for the common permission testing. Rather than have most filesystem implement their own inode->i_op->permission function that just calls back down to the VFS layers 'generic_permission()' with the per-filesystem ACL checking function, the filesystem can just expose its 'check_acl' function directly, and let the VFS layer do everything for it. This is all just preparatory - no filesystem actually enables this yet. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Don't call down to the generic inode_permission() function just to call the inode-specific permission function - just do it directly. The generic inode_permission() code does things like checking MAY_WRITE and devcgroup_inode_permission(), neither of which are relevant for the light pathname walk permission checks (we always do just MAY_EXEC, and the inode is never a special device). Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
This function is only called for path components that are already known to be directories (they have a '->lookup' method). So don't bother doing that whole S_ISDIR() testing, the whole point of the 'lite()' version is that we know that we are looking at a directory component, and that we're only checking name lookup permission. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Instead of returning EAGAIN and having the caller do something special for that case, just do the special case directly. Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Not only is that a supremely timing-critical path, but it's hopefully some day going to be lockless for the common case, and ima can't do that. Plus the integrity code doesn't even care about non-regular files, so it was always a total waste of time and effort. Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhenyu Wang authored
eDP is exclusive connector too, and add missing crtc_mask setting for TV. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=14139Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reported-and-tested-by: Carlos R. Mafra <crmafra2@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Sep, 2009 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6Linus Torvalds authored
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register.
-
Linus Torvalds authored
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: IMA: update ima_counts_put
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: gianfar: Fix build.
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6: pcmcia: add CNF-CDROM-ID for ide
-