- 19 Oct, 2004 22 commits
-
-
Jens Axboe authored
This patch modularizes the io schedulers completely, so that they can be built and loaded as modules. Additionally, it enables online switching of io schedulers. See also http://lwn.net/Articles/102593/ .

There's a scheduler file in the sysfs directory for the block device queue:

    axboe@router:/sys/block/hda/queue> ls
    iosched            max_sectors_kb  read_ahead_kb
    max_hw_sectors_kb  nr_requests     scheduler

If you list the contents of the file, it will show the available schedulers and the active one:

    axboe@router:/sys/block/hda/queue> cat scheduler
    [cfq]

Let's load a few more.

    router:/sys/block/hda/queue # modprobe deadline-iosched
    router:/sys/block/hda/queue # modprobe as-iosched
    router:/sys/block/hda/queue # cat scheduler
    [cfq] deadline anticipatory

Changing is done with

    router:/sys/block/hda/queue # echo deadline > scheduler
    router:/sys/block/hda/queue # cat scheduler
    cfq [deadline] anticipatory

deadline is now the new active io scheduler for hda.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
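The scheduler file format described above (every available scheduler on one line, the active one in square brackets) is easy to read from a script. The following is only a userspace sketch; `parse_scheduler_line` is a hypothetical helper name and not part of the patch.

```python
def parse_scheduler_line(line):
    """Split the contents of a queue/scheduler file into (active, available).

    Hypothetical helper: the bracketed entry is taken to be the active
    scheduler, as in the transcripts in the commit message.
    """
    active = None
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


if __name__ == "__main__":
    # Sample line copied from the commit message, not read from a live system.
    print(parse_scheduler_line("cfq [deadline] anticipatory"))
```

On a machine with this patch applied, the input line would come from reading /sys/block/hda/queue/scheduler, and writing a scheduler name to the same file switches the active scheduler.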
-
Andrew Morton authored
davej points out that in this code local variable `ret' is already known to be positive non-zero, so this test is meaningless. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Processes can sleep in do_get_write_access(), waiting for buffers to be removed from the BJ_Shadow state. We did this by doing a wake_up_buffer() in the commit path and sleeping on the buffer in do_get_write_access().

With the filtered bit-level wakeup code this doesn't work properly any more - the wake_up_buffer() accidentally wakes up tasks which are sleeping in lock_buffer() as well. Those tasks now implicitly assume that the buffer came unlocked. Net effect: Bogus I/O errors when reading journal blocks, because the buffer isn't up to date yet. Hence the recent spate of journal_bmap() failure reports.

The patch creates a new jbd-private BH flag purely for this wakeup function. So a wake_up_bit(..., BH_Unshadow) doesn't wake up someone who is waiting for a wake_up_bit(BH_Lock).

JBD was the only user of wake_up_buffer(), so remove it altogether.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
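The effect of giving the shadow wakeup its own bit can be illustrated with a toy model of a filtered wait queue. This is a userspace sketch, not kernel code: the `Waiter` tuple, the Python `wake_up_bit` function and the bit numbers are all assumptions made for illustration.

```python
from collections import namedtuple

# A waiter records which word and which bit of that word it sleeps on.
Waiter = namedtuple("Waiter", ["task", "word", "bit"])

# Placeholder bit numbers, not the kernel's actual values.
BH_LOCK = 2
BH_UNSHADOW = 9


def wake_up_bit(queue, word, bit):
    """Wake only the waiters whose (word, bit) key matches; return their tasks."""
    key = (word, bit)
    woken = [w.task for w in queue if (w.word, w.bit) == key]
    queue[:] = [w for w in queue if (w.word, w.bit) != key]
    return woken


if __name__ == "__main__":
    queue = [
        Waiter("lock_buffer sleeper", "bh", BH_LOCK),
        Waiter("do_get_write_access sleeper", "bh", BH_UNSHADOW),
    ]
    print(wake_up_bit(queue, "bh", BH_UNSHADOW))
    print([w.task for w in queue])
```

With separate keys, the commit-path wakeup reaches only the task waiting for the buffer to leave the shadow state; the task sleeping on the lock bit stays queued until the buffer is really unlocked.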
-
William Lee Irwin III authored
Document the requirement to use a memory barrier prior to wake_up_bit(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Some of the parameters to __wait_on_bit() and __wait_on_bit_lock() are redundant, as the wait_bit_queue parameter holds the flags word and the bit number. This patch updates __wait_on_bit() and __wait_on_bit_lock() to fetch that information from the wait_bit_queue passed to them and so reduce the number of parameters so that -mregparm may be more effective. Incremental atop the complete out-of-lining of the contention cases and the fastcall and wait_on_bit_lock()/test_and_set_bit() fixes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Move the slow paths of wait_on_bit() and wait_on_bit_lock() out of line. Also uninline wake_up_bit() to reduce the number of callsites generated, and adjust loop startup in __wait_on_bit_lock() to properly reflect its usage in the contention case. Incremental atop the fastcall and wait_on_bit_lock()/test_and_set_bit() fixes. Successfully tested on x86-64. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Eliminate the inode waitqueue hashtable using bit_waitqueue() via wait_on_bit() and wake_up_bit() to locate the waitqueue head associated with a bit. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Eliminate the bh waitqueue hashtable using bit_waitqueue() via wait_on_bit() and wake_up_bit() to locate the waitqueue head associated with a bit. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Consolidate bit waiting code patterns for page waitqueues using __wait_on_bit() and __wait_on_bit_lock(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
Eliminate specialized page and bh waitqueue hashing structures in favor of a standardized structure, using wake_up_bit() to wake waiters using the standardized wait_bit_key structure. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
William Lee Irwin III authored
The following patch series consolidates the various instances of waitqueue hashing to use a uniform structure and share the per-zone hashtable among all waitqueue hashers. This is expected to increase the number of hashtable buckets available for waiting on bh's and inodes and eliminate statically allocated kernel data structures for greater node locality and reduced kernel image size. Some attempt was made to look similar to Oleg Nesterov's suggested API in order to provide some kind of credit for independent invention of something very similar (the original versions of these patches predated my public postings on the subject of filtered waitqueues).

These patches have the further benefit and intention of enabling aio to use filtered wakeups by standardizing the data structure passed to wake functions so that embedded waitqueue elements in aio structures may be successfully passed to the filtered wakeup wake functions, though this patch series doesn't implement that particular functionality.

Successfully stress-tested on x86-64, and ia64 in recent prior versions.

This patch:

Move waitqueue-related functions not needing static functions in sched.c to kernel/wait.c

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Olaf Dabrunz authored
The ioctl TIOCCONS allows any user to redirect console output to another tty. This allows anyone to suppress messages to the console at will. AFAIK nowadays not many programs write to /dev/console, except for start scripts and the kernel (printk() above console log level). Still, I believe that administrators and operators would not like any user to be able to hijack messages that were written to the console. The only user of TIOCCONS that I am aware of is bootlogd/blogd, which runs as root. Please comment if there are other users. Is there any reason why normal users should be able to use TIOCCONS? Otherwise I would suggest to restrict access to root (CAP_SYS_ADMIN), e.g. with this patch. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paulo Marques authored
This patch is an improvement over my first kallsyms speedup patch posted about 2 weeks ago.

It changes scripts/kallsyms so as to produce a different format for kallsyms_names and extra data to speed up lookups.

The compression algorithm is quite simple: it uses all the char codes not actually used in symbols to build a lookup table that translates these codes into small strings. For instance, in my test runs the code 0xFE was being translated into "acpi_", giving a 4 byte saving on every translation.

The advantage of this algorithm is that to translate a symbol we only require information that is stored at that symbol's position, and never need to go back in the compressed stream to get information from other symbols.

To give an idea about the benefits of this algorithm, here are some benchmark results on a P4 2.8GHz with a symbol table with 10000 entries:

    kallsyms_lookup average time:
        vanilla            1346.0 us
        speedup              14.4 us
        with this patch       0.5 us

    total data produced by scripts/kallsyms:
        uncompressed        169 Kb
        vanilla             134 Kb
        with this patch      91 Kb

(speedup was my latest patch, that only changed the way kallsyms_lookup worked and not the data format)

I removed a cond_resched() from the /proc/kallsyms handling code path, because using stem compression, if the current position went backwards, the whole stream would be uncompressed up to the current position. It seemed that by removing this loop it would be safe to remove the conditional reschedule altogether.

There is just one catch with this patch: the time it takes to compile the kernel goes up just a bit (about 0.8s on a P4 2.8GHz with defconfig). If this delay is not acceptable I can change the compression algorithm so that it can use the previous table (calculating a new table is what consumes most of the time, and not doing the actual compression) and check to see if it obtains a similar compression ratio. If it does, then this is a sign that the symbol patterns haven't changed that much and this table is still good to use.

This would not only cut the time down to half on any compilation (because of the 2 pass symbol build method), but in frequent cases where a developer is compiling a single file and linking everything over and over again, the table optimization process would never run.

I'm CC'ing Brent Casavant on this email, because last June he sent a patch trying a different approach that used a 32 entry symbol cache, because there was a problem with the time "top" took to read "/proc/<pid>/wchan". I was hoping he would be willing to test this patch and comment on the results.

Signed-off-by: Paulo Marques <pmarques@grupopie.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
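The table-based compression described in this message can be sketched in a few lines. This is a simplified illustration, not the code in scripts/kallsyms: the function names are hypothetical, and the substrings to tokenize are passed in by hand, whereas the real script computes a good table itself (which is what the message says takes most of the time).

```python
def build_table(symbols, tokens):
    """Map character codes that occur in no symbol to the given substrings.

    Simplification: `tokens` is supplied by the caller instead of being
    derived from substring statistics.
    """
    used = {ord(ch) for name in symbols for ch in name}
    free_codes = [code for code in range(255, 0, -1) if code not in used]
    return dict(zip(free_codes, tokens))


def compress(name, table):
    """Replace each table substring in `name` with its one-character code."""
    for code, token in table.items():
        name = name.replace(token, chr(code))
    return name


def expand(packed, table):
    """Expand one symbol using only the table and the symbol's own bytes."""
    return "".join(table.get(ord(ch), ch) for ch in packed)


if __name__ == "__main__":
    symbols = ["acpi_bus_init", "acpi_ec_read", "sys_read"]
    table = build_table(symbols, ["acpi_"])
    packed = compress("acpi_bus_init", table)
    print(len("acpi_bus_init") - len(packed), expand(packed, table))
```

Because `expand` needs nothing but the table and the symbol's own characters, a lookup never has to walk back through earlier symbols, which is the property the message contrasts with stem compression.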
-
David Howells authored
The feature set the patch includes:

 - Key attributes:
   - Key type
   - Description (by which a key of a particular type can be selected)
   - Payload
   - UID, GID and permissions mask
   - Expiry time
 - Keyrings (just a type of key that holds links to other keys)
 - User-defined keys
 - Key revocation
 - Access controls
 - Per user key-count and key-memory consumption quota
 - Three std keyrings per task: per-thread, per-process, session
 - Two std keyrings per user: per-user and default-user-session
 - prctl() functions for key and keyring creation and management
 - Kernel interfaces for filesystem, blockdev, net stack access
 - JIT key creation by usermode helper

There are also two utility programs available:

 (*) http://people.redhat.com/~dhowells/keys/keyctl.c

     A comprehensive key management tool, permitting all the interfaces available to userspace to be exercised.

 (*) http://people.redhat.com/~dhowells/keys/request-key

     An example shell script (to be installed in /sbin) for instantiating a key.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch adds the new error codes I added for key-related errors to those archs that don't make use of <asm-generic/errno.h>, including Alpha, MIPS, PA-RISC, Sparc and Sparc64. This is required to compile with CONFIG_KEYS on those platforms. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
Here's a patch to add some new error codes specific to key management. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
Rename reiserfs's `struct key' to `struct reiserfs_key' to avoid namespace clashes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Matthew Dobson authored
The idea behind this patch is to create a nodemask_t as a node analog of cpumask_t. As NUMA machines become more common, the need for a standard, cross-platform bitmap of both online & possible nodes becomes more apparent. We believe we've worked out most of the kinks of the variable length bitmap types with the recent cpumask_t patches. Nodemasks are also currently far less widespread than cpumasks. Further, inclusion at this point in the kernel would mean consistency in node handling between 2.6 and 2.7. Future goals would be to get rid of the 'numnodes' variable used to count the number of online nodes, and replace with node_online_map. This would allow arbitrary node numbering and facilitate node hotplugging. (Nothing actually uses this yet, but several projects need it, and it does model a well-defined physical grouping). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
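As a rough illustration of why a bitmap of online nodes can replace a plain node count, here is a small model using a Python integer as the bitmap. The helper names are chosen by analogy with the cpumask operations and `MAX_NUMNODES = 8` is an arbitrary value for the example; none of this is the patch's actual code.

```python
MAX_NUMNODES = 8  # arbitrary size for this sketch


def node_set(mask, node):
    """Return `mask` with `node` marked online."""
    if not 0 <= node < MAX_NUMNODES:
        raise ValueError("node out of range")
    return mask | (1 << node)


def node_isset(mask, node):
    """True if `node` is marked in `mask`."""
    return bool((mask >> node) & 1)


def nodes_weight(mask):
    """Number of nodes set in `mask` (what a plain counter would hold)."""
    return bin(mask).count("1")


def online_nodes(mask):
    """List the node numbers set in `mask`, in ascending order."""
    return [n for n in range(MAX_NUMNODES) if node_isset(mask, n)]


if __name__ == "__main__":
    node_online_map = 0
    for node in (0, 2, 5):  # sparse numbering, e.g. after a node was removed
        node_online_map = node_set(node_online_map, node)
    print(nodes_weight(node_online_map), online_nodes(node_online_map))
```

A counter can only say that three nodes are online; the map also says which three, so node numbers no longer have to be contiguous.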
-
Peter Osterlund authored
The problem is that some drives fail the "GET CONFIGURATION" command when asked to only return 8 bytes. This happens for example on my drive, which is identified as: hdc: HL-DT-ST DVD+RW GCA-4040N, ATAPI CD/DVD-ROM drive Since the cdrom_mmc3_profile() function already allocates 32 bytes for the reply buffer, this patch is enough to make the command succeed on my drive. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Peter Osterlund authored
This patch implements CDRW packet writing as a kernel block device. Usage instructions are in the packet-writing.txt file. A hint: If you don't want to wait for a complete disc format, you can format just a part of the disc. For example: cdrwtool -d /dev/hdc -m 10240 This will format 10240 blocks, ie 20MB. Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
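The "10240 blocks, ie 20MB" figure follows if each block is 2048 bytes, the usual data block size on CD media. That block size is an assumption here, since the message does not state it. A quick check:

```python
CD_BLOCK_BYTES = 2048  # assumed data block size; not stated in the message


def formatted_size_mib(blocks, block_bytes=CD_BLOCK_BYTES):
    """Size in MiB of a partial format covering `blocks` blocks."""
    return blocks * block_bytes / (1024 * 1024)


if __name__ == "__main__":
    print(formatted_size_mib(10240))
```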
-
Peter Osterlund authored
Nigel pointed out that the earlier patches contained attributions that are not present in this patch. The 2.4 patch contains:

    Nov 5 2001, Aug 8 2002. Modified by Andy Polyakov <appro@fy.chalmers.se> to support MMC-3 complaint DVD+RW units.

and Nigel changed it to this in his 2.6 patch:

    Modified by Nigel Kukard <nkukard@lbsd.net> - support DVD+RW 2.4.x patch by Andy Polyakov <appro@fy.chalmers.se>

The patch I sent you deleted most of the earlier work and moved the rest to cdrom.c, but the comments were not moved over, since the earlier authors didn't modify cdrom.c.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Peter Osterlund authored
This patch adds support for using DVD+RW drives as writable block devices. The patch is based on work from:

    Andy Polyakov <appro@fy.chalmers.se> - Wrote the 2.4 patch
    Nigel Kukard <nkukard@lbsd.net> - Initial porting to 2.6.x

It works for me using an Iomega Super DVD 8x USB drive.

    Nov 5 2001, Aug 8 2002. Modified by Andy Polyakov <appro@fy.chalmers.se> to support MMC-3 complaint DVD+RW units.

    Modified by Nigel Kukard <nkukard@lbsd.net> - support DVD+RW 2.4.x patch by Andy Polyakov <appro@fy.chalmers.se>

This patch implements CDRW packet writing as a kernel block device. Usage instructions are in the packet-writing.txt file.

A hint: If you don't want to wait for a complete disc format, you can format just a part of the disc. For example:

    cdrwtool -d /dev/hdc -m 10240

This will format 10240 blocks, i.e. 20MB.

Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 18 Oct, 2004 18 commits
-
-
Linus Torvalds authored
Including the proper header file showed that they didn't match the declared prototypes.
-
Linus Torvalds authored
The proper C99 syntax is much preferred.
-
Linus Torvalds authored
-
Jens Axboe authored
This has been around for a while. Return the full scsi result byte in rq->errors for SG_IO generated requests. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ingo Molnar authored
This patch fixes all the preempt-after-task->state-is-TASK_DEAD problems we had. Right now, the moment procfs does a down() that sleeps in proc_pid_flush() [it could], our TASK_DEAD state is zapped and we might be back to TASK_RUNNING, and we trigger this assert:

        schedule();
        BUG();
        /* Avoid "noreturn function does return". */
        for (;;) ;

I have split out TASK_ZOMBIE and TASK_DEAD into a separate p->exit_state field, to allow the detaching of exit-signal/parent/wait-handling from descheduling a dead task. Dead-task freeing is done via PF_DEAD.

Tested the patch on x86 SMP and UP, but all architectures should work fine.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Ingo Molnar authored
This patch fixes an interaction between the numa=fake=<domains> feature, the domain setup code and cpu_siblings_map[]. The bug leads to a bootup crash when using numa=fake=2 on a 2-way/4-way SMP+HT box.

When SCHED_SMT is turned on, the domains-setup code relies on siblings not spanning multiple domains (which makes perfect sense). But numa=fake=2 creates an asymmetric 1101/0010 splitup between CPUs, which results in two siblings being on different nodes.

The patch adds a check_siblings_map() function that checks the sibling maps and fixes them up if they violate this rule. (It also prints a warning in that case.)

The patch also turns SCHED_DOMAIN_DEBUG back on - had this been enabled we'd have noticed this bug much earlier.

From: Badari Pulavarty <pbadari@us.ibm.com>

    arch/x86_64/mm/numa.c: In function `numa_setup':
    arch/x86_64/mm/numa.c:332: error: `numa_fake' undeclared (first use in this function)
    arch/x86_64/mm/numa.c:332: error: (Each undeclared identifier is reported only once
    arch/x86_64/mm/numa.c:332: error: for each function it appears in.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
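The rule being enforced (no CPU may list a sibling that lives on another node) can be modelled as follows. The message does not say how the kernel function repairs a bad map, so dropping the cross-node siblings is an assumption of this sketch, as are the data layout, the sibling pairs {0,1} and {2,3}, and the placement of CPU 1 alone on the second node.

```python
def check_siblings_map(siblings, cpu_to_node):
    """Return (fixed_map, offenders) for a sibling map.

    Assumed repair strategy: remove any sibling that sits on a different
    node from the CPU itself. `offenders` lists CPUs whose entry changed.
    """
    fixed = {}
    offenders = []
    for cpu, sibs in siblings.items():
        same_node = {s for s in sibs if cpu_to_node[s] == cpu_to_node[cpu]}
        if same_node != set(sibs):
            offenders.append(cpu)
        fixed[cpu] = same_node
    return fixed, offenders


if __name__ == "__main__":
    # Hypothetical 4-CPU HT box: node 0 holds CPUs 0, 2, 3; node 1 holds CPU 1.
    siblings = {0: {0, 1}, 1: {0, 1}, 2: {2, 3}, 3: {2, 3}}
    cpu_to_node = {0: 0, 1: 1, 2: 0, 3: 0}
    fixed, offenders = check_siblings_map(siblings, cpu_to_node)
    for cpu in offenders:
        print("warning: siblings of CPU %d span nodes" % cpu)
    print(fixed)
```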
-
Matthew Dobson authored
NODE_BALANCE_RATE is defined all over the place, but used nowhere. Let's remove it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Matthew Dobson authored
Here's yet another version of a patch to implement per-arch SD_*_INITs. This follows the same basic idea of my last patch, but 1) defines an arch-specific SD_NODE_INIT for the 4 NUMA arches (i386, x86_64, IA64 & PPC64), 2) defines *default* SD_CPU_INIT & SD_SIBLING_INIT for *all* arches, with the possibility of them being overridden by simply defining an arch-specific version in include/asm/topology.h. The motivation behind the third version of this patch is that Martin feels that there should be no "default" NUMA initializer because NUMA characteristics are *very* arch/platform specific, and hence a "default" NUMA initializer can only lead to confusion. I agree with most of that, but don't quite see as much harm in having a default as he does. Nevertheless, to keep him quiet, I've run up this version of the patch. Martin, please run this through your magic test suite and make sure I didn't break anything trivial. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Peter Williams authored
Problem: In the function try_to_wake_up(), when the runqueue's nr_uninterruptible field is decremented it's possible (on SMP systems) that the pointer no longer points to the runqueue that the task being woken was on when it went to sleep. This would cause the wrong runqueue's field to be decremented and the correct one to remain unchanged.

Fix: Save a pointer to the old runqueue at the beginning of the function and use it when decrementing nr_uninterruptible.

Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
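A toy model of the fix, with Python objects standing in for runqueues and a dict for the task. The names `RunQueue` and `wake_up` and the task layout are hypothetical; only the idea of saving the old runqueue before the task can migrate comes from the message.

```python
class RunQueue:
    """Minimal stand-in for a per-CPU runqueue."""

    def __init__(self, name):
        self.name = name
        self.nr_uninterruptible = 0


def wake_up(task, target_rq):
    """Wake `task`, possibly moving it to `target_rq` on another CPU."""
    old_rq = task["rq"]        # saved up front, before any migration
    task["rq"] = target_rq     # the wakeup may place the task elsewhere
    if task["uninterruptible"]:
        # Decrement the queue the task slept on, not task["rq"],
        # which may have changed by now.
        old_rq.nr_uninterruptible -= 1
        task["uninterruptible"] = False


if __name__ == "__main__":
    rq0, rq1 = RunQueue("cpu0"), RunQueue("cpu1")
    rq0.nr_uninterruptible = 1  # the task went to sleep on cpu0
    task = {"rq": rq0, "uninterruptible": True}
    wake_up(task, rq1)
    print(rq0.nr_uninterruptible, rq1.nr_uninterruptible)
```

Decrementing through `task["rq"]` after the migration would have left cpu0 at 1 and pushed cpu1 to -1, which is the miscounting the message describes.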
-
Andrew Morton authored
Better debugging output when the CPU scheduler detects atomicity errors. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Still having some trouble with ia64 domain setup on the Altixes. Jesse hasn't had much time to look into it, and I'm lacking an Altix, so I'm not sure if this is right or not... Anyway, it again does the right thing on the NUMAQ, and fixes some real bugs, so can you include it please?

 * Increase SD_NODES_PER_DOMAIN to 6 from 4 to better match Altix's topology. A setting of 4 will include this node, the other one in the brick, and the 2 nodes in the next closest brick, while 6 will catch 2 other bricks. Probably it could be increased even more.

 * Work correctly with sparse and not completely full node maps.

 * Nasty typo fixed in find_next_best_node:

       - val = node_distance(node, i);
       + val = node_distance(node, n);

 * Ensure all nodes are themselves a member of their numa balancing domain. This is more a precaution against creative implementations of node_distance.. but it makes the setup easier to verify without having to look at a table of node_distance's, which is possibly generated at runtime.

So again, I'm not too sure if this will fix the Altix setup or not. But if you do a release, it will surely be less broken than it was before.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
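The find_next_best_node fix is easier to see in a simplified model of the search: from a starting node, repeatedly pick the nearest node not yet used. This sketch assumes a square distance table indexed by node number and leaves out the kernel's handling of nodes without CPUs; the Python function is an illustration only.

```python
def find_next_best_node(node, used, distance):
    """Return the unused node closest to `node` and mark it used.

    `distance[a][b]` plays the role of node_distance(a, b).
    Returns None once every node has been used.
    """
    best, best_val = None, None
    for n in range(len(distance)):
        if n in used:
            continue
        val = distance[node][n]  # distance to the candidate n itself
        if best_val is None or val < best_val:
            best, best_val = n, val
    if best is not None:
        used.add(best)
    return best


if __name__ == "__main__":
    distance = [
        [10, 20, 40],
        [20, 10, 20],
        [40, 20, 10],
    ]
    used = set()
    print([find_next_best_node(0, used, distance) for _ in range(4)])
```

The corrected line measures the distance to the candidate that will actually be returned. Because every node's distance to itself is the smallest entry in its row, the starting node is picked first, so each node ends up in its own balancing domain.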
-
Nick Piggin authored
Use CPU_DOWN_FAILED notifier in the sched-domains hotplug code. This goes with 4/8 "integrate cpu hotplug and sched domains" Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Introduce CPU_DOWN_FAILED notifier, so we can cope with a failure after a CPU_DOWN_PREPARE notice. This fixes 3/8 "add CPU_DOWN_PREPARE notifier" to be useful Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
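The point of the extra event is that a subscriber which prepared for a CPU going away needs a way to undo that work when the CPU stays up after all. A minimal model of the sequence follows; the string constants, the `cpu_down` signature and the `take_down` callback are stand-ins for illustration, and vetoing from within a notifier is left out.

```python
CPU_DOWN_PREPARE = "CPU_DOWN_PREPARE"
CPU_DOWN_FAILED = "CPU_DOWN_FAILED"
CPU_DEAD = "CPU_DEAD"


def cpu_down(cpu, notifiers, take_down):
    """Try to offline `cpu`, telling every notifier what happened.

    `take_down(cpu)` is a hypothetical callback returning True on success.
    """
    for notify in notifiers:
        notify(CPU_DOWN_PREPARE, cpu)
    if not take_down(cpu):
        # Give subscribers the chance to roll back their preparation.
        for notify in notifiers:
            notify(CPU_DOWN_FAILED, cpu)
        return False
    for notify in notifiers:
        notify(CPU_DEAD, cpu)
    return True


if __name__ == "__main__":
    log = []
    cpu_down(1, [lambda event, cpu: log.append((event, cpu))], lambda cpu: False)
    print(log)
```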
-
Nick Piggin authored
Actually turn on SD_LOAD_BALANCE for the regular domains. Introduced by 5/8 "sched add load balance flag". Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Fix an oops in the domain debug code when isolated CPUs are specified. Introduced by 5/8 "sched add load balance flag" Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Implement disjoint NUMA domain setup for IA64 architecture. Most of the code was what was ripped out of kernel/sched.c, which was written by Jesse Barnes <jbarnes@sgi.com>. I fixed up the tricky NUMA groups initialisation. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Allow sched domain setup to be overridden by arch code. This functionality is needed again.

From: Paul Jackson <pj@sgi.com>

Builds of 2.6.9-rc1-mm5 ia64 NUMA configs fail, with many complaints that SD_NODE_INIT is defined twice, in asm/processor.h and linux/sched.h. I guess that the preprocessor conditionals were wrong when Nick added the per-arch override ability of SD_NODE_INIT again. At least this change lets me rebuild ia64 again.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Nick Piggin authored
Remove the disjoint NUMA domains setup code. It was broken. Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-