- 24 Mar, 2009 19 commits
-
-
Steven Whitehouse authored
The logic requires that we mark the glock dirty in page_mkwrite otherwise we might not flush correctly in the case that no allocation was required in the process of dirying the page. Also we need to set the shared write flag early for the same reason. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This cleans up a number of bits of code mostly based in glops.c. A couple of simple functions have been merged into the callers to make it more obvious what is going on, the mysterious raising of i_writecount around the truncate_inode_pages() call has been removed. The meta_go_* operations have been renamed rgrp_go_* since that is the only lock type that they are used with. The unused argument of gfs2_read_sb has been removed. Also a bug has been fixed where a check for the rindex inode was in the wrong callback. More comments are added, and the debugging code is improved too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Benjamin Marzinski authored
After calling out to the dlm, GFS2 sets the new state of a glock to gl_target in gdlm_ast(). However, gl_target is not always the lock state that was requested. If a conversion from shared to exclusive fails, finish_xmote() will call do_xmote() with LM_ST_UNLOCKED, instead of gl->gl_target, so that it can reacquire the lock in exlusive the next time around. In this case, setting the lock to gl_target in gdlm_ast() will make GFS2 think that it has the glock in exclusive mode, when really, it doesn't have the glock locked at all. This patch adds a new field to the gfs2_glock structure, gl_req, to track the mode that was requested. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Hisashi Hifumi authored
I introduced "is_partially_uptodate" aops for GFS2. A page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate on pagesize != blocksize environment. This aops checks that all buffers which correspond to a part of a file that we want to read are uptodate. If so, we do not have to issue actual read IO to HDD even if a page is not uptodate because the portion we want to read are uptodate. "block_is_partially_uptodate" function is already used by ext2/3/4. With the following patch random read/write mixed workloads or random read after random write workloads can be optimized and we can get performance improvement. I did a performance test using the sysbench. #sysbench --num-threads=16 --max-requests=200000 --test=fileio --file-num=1 --file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0 --file-rw-ratio=1 run -2.6.29-rc6 Test execution summary: total time: 202.6389s total number of events: 200000 total time taken by event execution: 2580.0480 per-request statistics: min: 0.0000s avg: 0.0129s max: 49.5852s approx. 95 percentile: 0.0462s -2.6.29-rc6-patched Test execution summary: total time: 177.8639s total number of events: 200000 total time taken by event execution: 2419.0199 per-request statistics: min: 0.0000s avg: 0.0121s max: 52.4306s approx. 95 percentile: 0.0444s arch: ia64 pagesize: 16k blocksize: 4k Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Hannes Eder authored
Impact: Make symbol static. Fix this sparse warning: fs/gfs2/rgrp.c:188:5: warning: symbol 'gfs2_bitfit' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Hannes Eder authored
Fix this sparse warnings: fs/gfs2/rgrp.c:156:23: warning: constant 0xffffffffffffffff is so big it is unsigned long long fs/gfs2/rgrp.c:157:23: warning: constant 0xaaaaaaaaaaaaaaaa is so big it is unsigned long long fs/gfs2/rgrp.c:158:23: warning: constant 0x5555555555555555 is so big it is long long fs/gfs2/rgrp.c:194:20: warning: constant 0x5555555555555555 is so big it is long long fs/gfs2/rgrp.c:204:44: warning: constant 0x5555555555555555 is so big it is long long Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This adds support for "quota" and "noquota" mount options in addition to the existing "quota=on/off/account" so that we are compatible with the names by which these options are more generally known. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
An alignment issue with the existing bitfit algorithm was reported on IA64. This patch attempts to fix that, and also to tidy up the code a bit. There is now more documentation about how this works and it has survived a number of different tests. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This adds a sysfs file called demote_rq to GFS2's per filesystem directory. Its possible to use this file to demote arbitrary glocks in exactly the same way as if a request had come in from a remote node. This is intended for testing issues relating to caching of data under glocks. Despite that, the interface is generic enough to send requests to any type of glock, but be careful as its not always safe to send an arbitrary message to an arbitrary glock. For that reason and to prevent DoS, this interface is restricted to root only. The messages look like this: <type>:<glocknumber> <mode> Example: echo -n "2:13324 EX" >/sys/fs/gfs2/unity:myfs/demote_rq Which means "please demote inode glock (type 2) number 13324 so that I can get an EX (exclusive) lock". The lock modes are those which would normally be sent by a remote node in its callback so if you want to unlock a glock, you use EX, to demote to shared, use SH or PR (depending on whether you like GFS2 or DLM lock modes better!). If the glock doesn't exist, you'll get -ENOENT returned. If the arguments don't make sense, you'll get -EINVAL returned. The plan is that this interface will be used in combination with the blktrace patch which I recently posted for comments although it is, of course, still useful in its own right. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
Since we have a UUID, we ought to expose it to the user via sysfs and uevents. We already have the fs name in both of these places (a combination of the lock proto and lock table name) so if we add the UUID as well, we have a full set. For older filesystems (i.e. those created before mkfs.gfs2 was writing UUIDs by default) the sysfs file will appear zero length, and no UUID env var will be added to the uevents. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This patch allows GFS2 to generate discard requests for blocks which are no longer useful to the filesystem (i.e. those which have been freed as the result of an unlink operation). The requests are generated at the time which those blocks become available for reuse in the filesystem. In order to use this new feature, you have to specify the "discard" mount option. The code coalesces adjacent blocks into a single extent when generating the discard requests, thus generating the minimum number. If an error occurs when the request has been sent to the block device, then it will print a message and turn off the requests for that filesystem. If the problem is temporary, then you can use remount to turn the option back on again. There is also a nodiscard mount option so that you can use remount to turn discard requests off, if required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This patch fixes a deadlock when the journal is flushed and there are dirty inodes other than the one which caused the journal flush. Originally the journal flushing code was trying to obtain the transaction glock while running the flush code for an inode glock. We no longer require the transaction glock at this point in time since we know that any attempt to get the transaction glock from another node will result in a journal flush. So if we are flushing the journal, we can be sure that the transaction lock is still cached from when the transaction was started. By inlining a version of gfs2_trans_begin() (minus the bit which gets the transaction glock) we can avoid the deadlock problems caused if there is a demote request queued up on the transaction glock. In addition I've also moved the umount rwsem so that it covers the glock workqueue, since it all demotions are done by this workqueue now. That fixes a bug on umount which I came across while fixing the original problem. Reported-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
We were keeping hold of an extra ref to the root inode in one of the error paths, that resulted in a hang. Reported-by: Nate Straz <nstraz@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Tested-by: Robert Peterson <rpeterso@redhat.com>
-
Steven Whitehouse authored
The time stamp field is unused in the glock now that we are using a shrinker, so that we can remove it and save sizeof(unsigned long) bytes in each glock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
We only really need a single spin lock for the quota data, so lets just use the lru lock for now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Abhijith Das <adas@redhat.com>
-
Abhijith Das authored
Deallocation of gfs2_quota_data objects now happens on-demand through a shrinker instead of routinely deallocating through the quotad daemon. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Abhijith Das authored
The quota code uses lvbs and this is currently not implemented in lock_nolock, thereby causing panics when quota is enabled with lock_nolock. This patch adds the relevant bits. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
The following patch fixes an issue relating to remount and argument parsing. After this fix is applied, remount becomes atomic in that it either succeeds changing the mount to the new state, or it fails and leaves it in the old state. Previously it was possible for the parsing of options to fail part way though and for the fs to be left in a state where some of the new arguments had been applied, but some had not. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
- 23 Mar, 2009 11 commits
-
-
Linus Torvalds authored
-
Kyle McMartin authored
With a sufficiently new compiler and binutils, code which wasn't previously generating .eh_frame sections has begun to. Certain architectures (powerpc, in this case) may generate unexpected relocation formats in response to this, preventing modules from loading. While the new relocation types should probably be handled, revert to the previous behaviour with regards to generation of .eh_frame sections. (This was reported against Fedora, which appears to be the only distro doing any building against gcc-4.4 at present: RH bz#486545.) Signed-off-by: Kyle McMartin <kyle@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Alexandre Oliva <aoliva@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jody McIntyre authored
Revert the change to the orphan dates of Windows 95, DOS, compression. Add a new orphan date for OS/2. Signed-off-by: Jody McIntyre <scjody@sun.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) ucc_geth: Fix oops when using fixed-link support dm9000: locking bugfix net: update dnet.c for bus_id removal dnet: DNET should depend on HAS_IOMEM dca: add missing copyright/license headers nl80211: Check that function pointer != NULL before using it sungem: missing net_device_ops be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle be2net: replenish when posting to rx-queue is starved in out of mem conditions bas_gigaset: correctly allocate USB interrupt transfer buffer smsc911x: reset last known duplex and carrier on open sh_eth: Fix mistake of the address of SH7763 sh_eth: Change handling of IRQ netns: oops in ip[6]_frag_reasm incrementing stats net: kfree(napi->skb) => kfree_skb net: fix sctp breakage ipv6: fix display of local and remote sit endpoints net: Document /proc/sys/net/core/netdev_budget tulip: fix crash on iface up with shirq debug virtio_net: Make virtio_net support carrier detection ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix crash with /proc/iomem sparc64: Reschedule KGDB capture to a software interrupt. sbus: Auto-load openprom module when device opened.
-
Miklos Szeredi authored
This patch fixes bug #12208: Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12208 Subject : uml is very slow on 2.6.28 host This turned out to be not a scheduler regression, but an already existing problem in ptrace being triggered by subtle scheduler changes. The problem is this: - task A is ptracing task B - task B stops on a trace event - task A is woken up and preempts task B - task A calls ptrace on task B, which does ptrace_check_attach() - this calls wait_task_inactive(), which sees that task B is still on the runq - task A goes to sleep for a jiffy - ... Since UML does lots of the above sequences, those jiffies quickly add up to make it slow as hell. This patch solves this by not rescheduling in read_unlock() after ptrace_stop() has woken up the tracer. Thanks to Oleg Nesterov and Ingo Molnar for the feedback. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpcLinus Torvalds authored
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc: powerpc/mm: Fix Respect _PAGE_COHERENT on classic ppc32 SW TLB load machines
-
Kumar Gala authored
Grant picked up the wrong version of "Respect _PAGE_COHERENT on classic ppc32 SW" (commit a4bd6a93) It was missing the code to actually deal with the fixup of _PAGE_COHERENT based on the CPU feature. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
-
Anton Vorontsov authored
commit b1c4a9dd ("ucc_geth: Change uec phy id to the same format as gianfar's") introduced a regression in the ucc_geth driver that causes this oops when fixed-link is used: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0151270 Oops: Kernel access of bad area, sig: 11 [#1] TMCUTU NIP: c0151270 LR: c0151270 CTR: c0017760 REGS: cf81fa60 TRAP: 0300 Not tainted (2.6.29-rc8) MSR: 00009032 <EE,ME,IR,DR> CR: 24024042 XER: 20000000 DAR: 00000000, DSISR: 20000000 TASK = cf81cba0[1] 'swapper' THREAD: cf81e000 GPR00: c0151270 cf81fb10 cf81cba0 00000000 c0272e20 c025f354 00001e80 cf86b08c GPR08: d1068200 cffffb74 06000000 d106c200 42024042 10085148 0fffd000 0ffc81a0 GPR16: 00000001 00000001 00000000 007ffeb0 00000000 0000c000 cf83f36c cf83f000 GPR24: 00000030 cf83f360 cf81fb20 00000000 d106c200 20000000 00001e80 cf83f360 NIP [c0151270] ucc_geth_open+0x330/0x1efc LR [c0151270] ucc_geth_open+0x330/0x1efc Call Trace: [cf81fb10] [c0151270] ucc_geth_open+0x330/0x1efc (unreliable) [cf81fba0] [c0187638] dev_open+0xbc/0x12c [cf81fbc0] [c0187e38] dev_change_flags+0x8c/0x1b0 This patch fixes the issue by removing offending (and somewhat duplicate) code from init_phy() routine, and changes _probe() function to use uec_mdio_bus_name(). Also, since we fully construct phy_bus_id in the _probe() routine, we no longer need ->phy_address and ->mdio_bus fields in ucc_geth_info structure. I wish the patch would be a bit shorter, but it seems like the only way to fix the issue in a sane way. Luckily, the patch has been tested with real PHYs and fixed-link, so no further regressions expected. Reported-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Brownell authored
This fixes a locking bug in the dm9000 driver. It calls request_irq() without setting IRQF_DISABLED ... which is correct for handlers that support IRQ sharing, since that behavior is not guaranteed for shared IRQs. However, its IRQ handler then wrongly assumes that IRQs are blocked. So the fix just uses the right spinlock primitives in the IRQ handler. NOTE: this is a classic example of the type of bug which lockdep currently masks by forcibly setting IRQF_DISABLED on IRQ handlers that did not request that flag. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Rothwell authored
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 22 Mar, 2009 6 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixesLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: kconfig: improve seed in randconfig kconfig: fix randconfig for choice blocks
-
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommuLinus Torvalds authored
* 'fix-includes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: merge the non-MMU and MMU versions of siginfo.h m68k: use the MMU version of unistd.h for all m68k platforms m68k: merge the non-MMU and MMU versions of signal.h m68k: merge the non-MMU and MMU versions of ptrace.h m68k: use MMU version of setup.h for both MMU and non-MMU m68k: merge the non-MMU and MMU versions of sigcontext.h m68k: merge the non-MMU and MMU versions of swab.h m68k: merge the non-MMU and MMU versions of param.h
-
Gertjan van Wingerde authored
Update all previous incarnations of my email address to the correct one. Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tyler Hicks authored
If ecryptfs_encrypted_view or ecryptfs_xattr_metadata were being specified as mount options, a NULL pointer dereference of crypt_stat was possible during lookup. This patch moves the crypt_stat assignment into ecryptfs_lookup_and_interpose_lower(), ensuring that crypt_stat will not be NULL before we attempt to dereference it. Thanks to Dan Carpenter and his static analysis tool, smatch, for finding this bug. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Acked-by: Dustin Kirkland <kirkland@canonical.com> Cc: Dan Carpenter <error27@gmail.com> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tyler Hicks authored
When allocating the memory used to store the eCryptfs header contents, a single, zeroed page was being allocated with get_zeroed_page(). However, the size of an eCryptfs header is either PAGE_CACHE_SIZE or ECRYPTFS_MINIMUM_HEADER_EXTENT_SIZE (8192), whichever is larger, and is stored in the file's private_data->crypt_stat->num_header_bytes_at_front field. ecryptfs_write_metadata_to_contents() was using num_header_bytes_at_front to decide how many bytes should be written to the lower filesystem for the file header. Unfortunately, at least 8K was being written from the page, despite the chance of the single, zeroed page being smaller than 8K. This resulted in random areas of kernel memory being written between the 0x1000 and 0x1FFF bytes offsets in the eCryptfs file headers if PAGE_SIZE was 4K. This patch allocates a variable number of pages, calculated with num_header_bytes_at_front, and passes the number of allocated pages along to ecryptfs_write_metadata_to_contents(). Thanks to Florian Streibelt for reporting the data leak and working with me to find the problem. 2.6.28 is the only kernel release with this vulnerability. Corresponds to CVE-2009-0787 Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Acked-by: Dustin Kirkland <kirkland@canonical.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Eugene Teo <eugeneteo@kernel.sg> Cc: Greg KH <greg@kroah.com> Cc: dann frazier <dannf@dannf.org> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Florian Streibelt <florian@f-streibelt.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Benjamin Herrenschmidt authored
This fixes a regression introduced when we switched to using the core pci_set_power_state(). The chip seems to need the state to be written over and over again until it sticks, so we do that. Note that the code is a bit blunt, without timeout, etc... but that's pretty much because I put back in there the code exactly as it used to be before the regression. I still add a call to pci_set_power_state() at the end so that ACPI gets called appropriately on x86. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Raymond Wooninck <tittiatcoke@gmail.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 Mar, 2009 2 commits
-
-
Ilya Yanok authored
Signed-off-by: Ilya Yanok <yanok@emcraft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maciej Sosnowski authored
In two dca files copyright and license headers are missing. This patch adds them there. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Mar, 2009 2 commits
-
-
Jouni Malinen authored
NL80211_CMD_GET_MESH_PARAMS and NL80211_CMD_SET_MESH_PARAMS handlers did not verify whether a function pointer is NULL (not supported by the driver) before trying to call the function. The former nl80211 command is available for unprivileged users, too, so this can potentially allow normal users to kill networking (or worse..) if mac80211 is built without CONFIG_MAC80211_MESH=y. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>