- 19 May, 2004 40 commits
-
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> From: "J. Bruce Fields" <bfields@fieldses.org> Currently we are counting the number of threads already asleep and returning an immediate NFS4ERR_DELAY (==JUKEBOX) error if more than half are already asleep. This patch removes that logic, so instead we only return NFS4ERR_DELAY if an upcall times out (if it takes more than a second to return).

With the thread counting there is the risk that even when all the relevant subsystems are responsive, the client may still see occasional NFS4ERR_DELAY returns just because, by coincidence, several upcalls were initiated at the same time. I expect clients will delay several seconds before retrying after NFS4ERR_DELAY, so this will be quite noticeable to users. Sporadic long delays like this are likely to lead users to suspect a problem somewhere, when in fact there is none.

The current scheme ensures that we can still process requests not depending on upcalls, even when all threads would otherwise be tied up waiting on upcalls. However, this is not something that should happen under normal circumstances; if a server spends a significant portion of its time with all threads waiting for upcalls, this is a sign that something is seriously wrong. In such a circumstance (e.g., an ldap server dies), we can, at least, bound the waiting time to a second without the need for counting threads. In short, removing the thread-counting will allow us to behave predictably when things are working, while still allowing some progress when they don't.

It would be a worthwhile project to measure the amount of time threads spend waiting for upcalls (or for reads, for that matter); if a significant portion of the time they spend handling requests is spent sleeping, then there's an opportunity to improve nfsd performance: if we can break the one-to-one mapping between requests and threads, then we can lower the number of threads required to keep the nfs server busy.

However, both the currently available options for doing this are problematic: returning JUKEBOX/DELAY errors at random times will lead to unpredictable performance, and saving a copy of the request to be processed from scratch again later is wasteful and makes it difficult to provide correct semantics, especially in the NFSv4 case. So for now I believe waits with short timeouts are the best option.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> From: "J. Bruce Fields" <bfields@fieldses.org> 1 second should be plenty of time; if we're going to take longer than that it's probably better just to return NFS4ERR_DELAY and let the client retry anyway.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> From: "J. Bruce Fields" <bfields@fieldses.org> Slightly better behavior on failed mapping (which may happen either because idmapd is not running, or because it has told us it doesn't know the mapping): on name->id (setattr), return BADNAME. (I used ESRCH to communicate BADNAME, just because it was the first error in include/asm-generic/errno-base.h that had something to do with nonexistence of something, and that we weren't already using.) On id->name (getattr), return a string representation of the numerical id. This is probably useless to the client, especially since we're unlikely to accept such a string on a setattr, but perhaps some client will find it mildly helpful.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> From: "J. Bruce Fields" <bfields@fieldses.org> Also fix leaks on error; split up code a bit to make it easier to verify correctness.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> nfsd_cross_mnt can release the reference to the passed svc_export structure when it returns a different svc_export structure. So we need to make sure we have a counted reference before, and drop the reference afterwards.
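The reference-counting rule described above can be illustrated with a small userspace sketch. Everything here is hypothetical: `struct export`, `exp_get`, `exp_put`, `cross_mnt`, and `lookup` are stand-ins invented for illustration, not the nfsd code. The sketch only assumes what the message states: the crossing routine may drop the reference on the export it was passed when it hands back a different one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for svc_export: only the reference count matters. */
struct export {
    int refcount;
};

static void exp_get(struct export *e)
{
    e->refcount++;
}

static void exp_put(struct export *e)
{
    assert(e->refcount > 0);   /* dropping a reference we never held is a bug */
    e->refcount--;
}

/*
 * Hypothetical stand-in for a mount-crossing helper. If another export is
 * mounted here, it takes a reference on the new export, releases the
 * reference on the one passed in, and returns the new one through *expp.
 */
static void cross_mnt(struct export **expp, struct export *mounted)
{
    if (mounted != NULL) {
        exp_get(mounted);
        exp_put(*expp);
        *expp = mounted;
    }
}

/*
 * Caller that was only lent 'borrowed'. It takes its own counted reference
 * before the call, so the release inside cross_mnt() consumes that reference
 * rather than the owner's, and it drops whatever it holds afterwards.
 */
static void lookup(struct export *borrowed, struct export *mounted)
{
    struct export *exp = borrowed;

    exp_get(exp);               /* counted reference before the call */
    cross_mnt(&exp, mounted);   /* may swap exp and release the old one */
    /* ... exp would be used here ... */
    exp_put(exp);               /* drop the reference afterwards */
}
```

With this pattern the counts seen by the original owners are unchanged after `lookup` returns, whether or not a mount point was crossed.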
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> fh_compose currently consumes a reference to the dentry but not the export point. This is both inconsistent and confusing. It is better if a routine like this doesn't consume reference points, so with this patch, it doesn't. This fixes a couple of very subtle and unusual reference counting errors.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> We currently serialize all writes to these caches with queue_io_sem, so we only need one buffer. There is some need for larger-than-one-page writes, so we can just statically allocate a buffer.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> This is important for update-in-place caches which may change from being negative to positive. Thanks to "J. Bruce Fields" <bfields@fieldses.org> and Olaf Kirch <okir@suse.de>
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> With the _bh, we can deadlock.
-
Andrew Morton authored
From: "Randy.Dunlap" <rddunlap@osdl.org> We need to call mca_system_init() to register MCA bus struct, otherwise find_mca_adapter() oopses with a NULL ptr dereference. Fixes this oops reported last week: http://marc.theaimsgroup.com/?l=linux-kernel&m=108455738606747&w=2 Thanks to James Bottomley for pointing this out.
-
Andrew Morton authored
From: "Luiz Fernando N. Capitulino" <lcapitulino@prefeitura.sp.gov.br> drivers/cdrom/aztcd.c:379: warning: `pa_ok' defined but not used
-
Andrew Morton authored
We just tested the page's uptodateness, no point in doing it again.
-
Andrew Morton authored
From: Matt Domsch <Matt_Domsch@dell.com>
  * Adds MODULE_VERSION
  * Remove check for efi_enabled in efivars_exit() - we aborted module load at init based on this already.
-
Andrew Morton authored
From: Matt Domsch <Matt_Domsch@dell.com> EDD: Remove no longer needed SCSI header file inclusion. Thanks to ArjanV for reminding me.
-
Andrew Morton authored
From: Jonathan Corbet <corbet@lwn.net> I noticed a patch went in to Documentation/SubmittingDrivers which tweaked the URL for KernelTraffic. Here's a self-serving patch which makes that section more complete; to be fair, I added two other sites too. Just in case it's useful.
-
Andrew Morton authored
From: Jan Kara <jack@ucw.cz> This patch fixes possible quota files corruption which could happen when root did not have any inodes&space allocated. Originally this could not happen as structure would not be written to disk in that case but with journalled quota we need to write even all-zero structure. The fix is not very nice but change of the format on disk is probably worse (I made a mistake with not including the usage-bitmaps into format :().
-
Andrew Morton authored
From: Stephen Smalley <sds@epoch.ncsc.mil> This patch against 2.6.6 fixes error handling for two out-of-memory conditions in selinuxfs, avoiding potential deadlock due to returning without releasing a semaphore. The patch was submitted by Karl MacMillan of Tresys.
-
Andrew Morton authored
From: "Randy.Dunlap" <rddunlap@osdl.org> The module parameter name is incorrect (looks like a thinko).
-
Andrew Morton authored
From: Andreas Gruenbacher <agruen@suse.de> Here is a patch that re-adds support for more than one directory in SUBDIRS. We have a number of packages that use this. The FORCE dependency of crmodverdir seems unnecessary; removing. (acked by Sam)
-
Andrew Morton authored
From: Christoph Hellwig <hch@lst.de> This one is missing updates from the v4l1 interfaces in 2.4 to the 2.6ish v4l2 and thus doesn't compile. While we're at it also remove the MOD_{INC,DEC}_USE_COUNT calls in it that were bogus even in 2.4 to avoid false positives in grep.
-
Andrew Morton authored
From: Christoph Hellwig <hch@lst.de> If we want new drivers to not use obsolete interfaces we're better off not mentioning it in the documentation.
-
Andrew Morton authored
From: Christoph Hellwig <hch@lst.de> This driver is unloadable for the pci case, but not if vlb cards are found so we can't use the module_exit removal to lock it into memory. Replace the MOD_INC_USE_COUNT with __module_get in its module_init routine.
-
Andrew Morton authored
It no longer exists.
-
Andrew Morton authored
drivers/atm/fore200e.c: In function `fore200e_close': drivers/atm/fore200e.c:1659: warning: use of cast expressions as lvalues is deprecated
-
Andrew Morton authored
From: "Randy.Dunlap" <rddunlap@osdl.org> kexec is a fairly major and popular feature. People are shipping it in products, although it is not known if Linux distributors plan to ship it. The patch reserves the kexec syscall slots to pin the ABI down for everyone.
  - add kexec_load prototype to syscalls.h
  - add LINUX_REBOOT_CMD_KEXEC to reboot.h
  - add kexec_load syscall for ia32, ia64, x86_64, ppc32, ppc64
-
Andrew Morton authored
From: Mathieu Chouquet-Stringer <mchouque@online.fr> If you use O=/someotherdir or KBUILD_OUTPUT=/someotherdir on the following architectures: alpha, mips, sh and cris, the build process is probably going to fail at one point or another, depending on the target you used, because make can't find scripts/Makefile.build or scripts/Makefile.clean. The following patch fixes this. I grepped the whole tree and these four were the only "offenders" I found.
-
Andrew Morton authored
From: "Sergey S. Kostyliov" <rathamahata@php4.ru>
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> The new domain scheduler got miscompiled on x86-64 with gcc 3.3.3-hammer, which is shipping with some distributions. The kernel deadlocks eventually under light stress on SMP systems with the right options. After some experiments it seems this simple change avoids the miscompilation. It also doesn't pessimize the code unduly for other architectures.
-
Andrew Morton authored
From: Manfred Spraul <manfred@colorfullife.com> The attached patch adds a simple kmem_cache_alloc_node function: allocate memory on a given node. The function is intended for cpu bound structures. It's used for alloc_percpu and for the slab-internal per-cpu structures. Jack Steiner reported a ~3% performance increase for AIM7 on a 64-way Itanium 2. Port maintainers: The patch could cause problems if CPU_UP_PREPARE is called for a cpu on a node before the corresponding memory is attached and/or if alloc_pages_node doesn't fall back to memory from another node if there is no memory in the requested node. I think no one does that, but I'm not sure.
-
Andrew Morton authored
From: Manfred Spraul <manfred@colorfullife.com> The slab allocator keeps track of the free objects in a slab with a linked list of integers (typedef'ed to kmem_bufctl_t). Right now unsigned int is used for kmem_bufctl_t, i.e. 4 bytes per-object overhead. The attached patch implements a per-arch definition for this type: Theoretically, unsigned short is sufficient for kmem_bufctl_t and this would reduce the per-object overhead to 2 bytes. But some archs cannot operate on 16-bit values efficiently, thus it's not possible to switch everyone to ushort. The chosen types are a result of discussions with the various arch maintainers.
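To make the "linked list of integers" concrete, here is a minimal userspace sketch of an index-based free list. It is not the slab allocator's code: `bufctl_t`, `BUFCTL_END`, `NUM_OBJS`, `struct slab`, and the three helpers are names assumed for illustration, and the `unsigned short` typedef is just the 2-byte variant the message mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for kmem_bufctl_t; one entry is kept per object. */
typedef unsigned short bufctl_t;

#define BUFCTL_END ((bufctl_t)~0U)   /* hypothetical end-of-list marker */
#define NUM_OBJS   8                 /* hypothetical objects per slab */

struct slab {
    bufctl_t free;             /* index of the first free object */
    bufctl_t next[NUM_OBJS];   /* next[i] = index of the free object after i */
};

/* Chain every object into the free list: 0 -> 1 -> ... -> END. */
static void slab_init(struct slab *s)
{
    for (bufctl_t i = 0; i < NUM_OBJS; i++)
        s->next[i] = (i + 1 < NUM_OBJS) ? (bufctl_t)(i + 1) : BUFCTL_END;
    s->free = 0;
}

/* Pop the first free index, or return BUFCTL_END if the slab is full. */
static bufctl_t slab_alloc(struct slab *s)
{
    bufctl_t idx = s->free;

    if (idx != BUFCTL_END)
        s->free = s->next[idx];
    return idx;
}

/* Push an index back onto the head of the free list. */
static void slab_free(struct slab *s, bufctl_t idx)
{
    assert(idx < NUM_OBJS);
    s->next[idx] = s->free;
    s->free = idx;
}
```

The per-object cost of this bookkeeping is exactly `sizeof(bufctl_t)`, which is why the width of the typedef determines the overhead discussed in the message.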
-
Andrew Morton authored
From: Manfred Spraul <manfred@colorfullife.com> The attached patch switches the SLAB_HWCACHE_ALIGN alignment from the compile time L1 cache line size to the runtime detected value for i386. x86-64 already uses the runtime detection.
-
Andrew Morton authored
From: Nick Piggin <nickpiggin@yahoo.com.au> If the zone has a very small number of inactive pages, local variable `ratio' can be huge and we do way too much scanning. So much so that Ingo hit an NMI watchdog expiry, although that was because the zone would have had a single refcount-zero page in it, and that logic recently got fixed up via get_page_testone(). Nick's patch simply puts a sane-looking upper bound on the number of pages which we'll scan in this round. It fixes another failure case: if the inactive list becomes very small compared to the size of the active list, active list scanning (and therefore inactive list refilling) also becomes small. This patch causes inactive list scanning to be keyed off the size of the active+inactive lists. It has the plus of hiding active and inactive balancing implementation from the higher level scanning code. It will slightly change other aspects of scanning behaviour, but probably not significantly.
-
Andrew Morton authored
Experimenting with various values of DENTRY_STORAGE:

  dentry size   objs/slab   dentry size * objs/slab   inline string
  148           26          3848                      32
  152           26          3952                      36
  156           25          3900                      40
  160           24          4000                      44

We're currently at 160. The patch fairly arbitrarily takes it down to 152, so we can fit a 35-char name into the inline part of the dentry. Also, go back to the old way of sizing d_iname so that any arch-specific compiler-forced alignments are honoured.
-
Andrew Morton authored
Fix http://bugme.osdl.org/show_bug.cgi?id=2710. When the user passes madvise a length of -1 through -4095, madvise blindly rounds this up to 0 then "succeeds".
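The wraparound behind this bug is easy to reproduce in userspace. The sketch below is not the kernel's madvise code: `SKETCH_PAGE_SIZE`, `SKETCH_PAGE_MASK`, and `round_len_checked` are names assumed for illustration, a 4096-byte page is assumed, and returning `-EINVAL` is one plausible way to reject the length rather than a statement of what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: 4096-byte pages; names are local to this sketch. */
#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Round len_in up to a whole number of pages. A "negative" length such as
 * (size_t)-1 wraps around during the addition and rounds to 0, so a nonzero
 * input that rounds to zero is rejected instead of silently succeeding.
 */
static int round_len_checked(size_t len_in, size_t *len_out)
{
    assert(len_out != NULL);

    size_t len = (len_in + ~SKETCH_PAGE_MASK) & SKETCH_PAGE_MASK;

    if (len_in != 0 && len == 0)
        return -EINVAL;        /* overflow: the rounding wrapped to zero */

    *len_out = len;
    return 0;
}
```

A genuine zero length still passes through unchanged, so only the wrapped case is treated as an error.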
-
Andrew Morton authored
From: Brian Gerst <bgerst@didntduck.org> Generate offsets for thread_info, cpuinfo_x86, and a few others instead of hardcoding them.
-
Andrew Morton authored
It conflicts with the readq() I/O function.
-
Andrew Morton authored
From: Chris Wright <chrisw@osdl.org> Add disable param to capabilities module. Similar to the SELinux param for disabling at boot time. This allows vendors to ship a single binary image with capabilities compiled statically, and disable it if they provide another security model compiled as a module.
-
Andrew Morton authored
From: Ram Pai <linuxram@us.ibm.com> Currently the readahead code tends to read one more page than it should with seeky database-style loads. This was to prevent bogus readahead triggering when we step into the last page of the current window. The patch removes that workaround and fixes up the suboptimal logic instead.

wrt the "rounding errors" mentioned in this patch, Ram provided the following description: Say the i/o size is 20 pages. Our algorithm starts with an initial average i/o size of 'ra_pages/2', which is mostly, say, 16. Now every time we take an average, the 'average' progresses as follows: (16+20)/2=18, (18+20)/2=19, (19+20)/2=19, (19+20)/2=19, ... and the rounding error makes it never touch 20.

Benchmarking sitrep: IOZONE run on an nfs mounted filesystem:
  client machine 2proc, 733MHz, 2GB memory
  server machine 8proc, 700Mhz, 8GB memory
  ./iozone -c -t1 -s 4096m -r 128k
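The rounding error Ram describes can be reproduced with a few lines of integer arithmetic. This is a sketch, not the readahead code: `avg_step` and `converge` are hypothetical helpers, and the round-to-nearest variant is shown only as one way to let the average reach the true i/o size, not as the fix the patch applies.

```c
#include <assert.h>

/*
 * One averaging step. With round_up == 0 the integer division truncates,
 * as in the sequence quoted above; with round_up != 0 the result is rounded
 * to the nearest integer instead (an assumption made for comparison).
 */
static unsigned long avg_step(unsigned long avg, unsigned long io, int round_up)
{
    return (avg + io + (round_up ? 1 : 0)) / 2;
}

/* Apply the averaging step 'rounds' times, starting from 'start'. */
static unsigned long converge(unsigned long start, unsigned long io,
                              int rounds, int round_up)
{
    unsigned long avg = start;

    assert(rounds >= 0);
    for (int i = 0; i < rounds; i++)
        avg = avg_step(avg, io, round_up);
    return avg;
}
```

Starting from 16 with a 20-page i/o size, the truncating form goes 18, 19, 19, ... and stays at 19, while the rounding form goes 18, 19, 20 and stays at 20.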
-
Andrew Morton authored
drivers/scsi/dpt_i2o.c: In function `adpt_queue': drivers/scsi/dpt_i2o.c:442: warning: use of cast expressions as lvalues is deprecated drivers/scsi/dpt_i2o.c: In function `adpt_scsi_register': drivers/scsi/dpt_i2o.c:2213: warning: use of cast expressions as lvalues is deprecated
-
Andrew Morton authored
From: Arthur Othieno <a.othieno@bluewin.ch> CONFIG_MAC_SERIAL (drivers/macintosh/macserial.c) is marked obsolete and currently doesn't build. benh says: "I thought build got fixed recently ... well, anyway, the driver is indeed obsolete, there's a new one in drivers/serial now."
-