- 13 May, 2013 25 commits
-
-
Bjørn Mork authored
commit a2a2d6c7 upstream.

Adding support for a Mediatek based device labelled as D-Link Model: DWM-156, H/W Ver: A7

Also adding two other device IDs found in the Debian(!) packages included on the embedded device driver CD.

This is a composite MBIM + serial ports + card reader device:

T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 14 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=2001 ProdID=7d01 Rev= 3.00
S: Manufacturer=D-Link,Inc
S: Product=D-Link DWM-156
C:* #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=500mA
A: FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00
I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim
E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=125us
I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=02 Prot=01 Driver=option
E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=500us
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Namhyung Kim authored
commit 39e30cd1 upstream. The first page was allocated separately, so no need to start from 0. Link: http://lkml.kernel.org/r/1364820385-32027-2-git-send-email-namhyung@kernel.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
J. Bruce Fields authored
commit 0c7c3e67 upstream. Don't actually close any opens until we don't need them at all. This means being left with write access when it's not really necessary, but that's better than putting a file that might still have posix locks held on it, as we have been. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
David Engraf authored
commit 51fd36f3 upstream.

One can trigger an overflow when using ktime_add_ns() on a 32-bit architecture not supporting CONFIG_KTIME_SCALAR. When passing a very high value for u64 nsec, e.g. 7881299347898368000, the do_div() function converts this value to seconds (7881299347), which is still too high to pass to the ktime_set() function as a long. The result is a negative value.

The problem on my system occurs in tick-sched.c, in tick_nohz_stop_sched_tick(), when time_delta is set to timekeeping_max_deferment(). The check for time_delta < KTIME_MAX is valid, thus ktime_add_ns() is called with a too large value, resulting in a negative expire value. This leads to an endless loop in the ticker code:

time_delta: 7881299347898368000
expires = ktime_add_ns(last_update, time_delta)
expires: negative value

This fix caps the value to KTIME_MAX.

This error doesn't occur on 64-bit or on architectures supporting CONFIG_KTIME_SCALAR (e.g. ARM, x86-32).

Signed-off-by: David Engraf <david.engraf@sysgo.com>
[jstultz: Minor tweaks to commit message & header]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
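The capping idea can be tried out in userspace. What follows is only a minimal sketch under stated assumptions, not the kernel patch: model_ktime, MODEL_SEC_MAX and model_ktime_from_ns() are hypothetical names, and the signed 32-bit seconds field stands in for a long on a 32-bit architecture.

```c
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000ULL
/* Assumption: seconds live in a signed 32-bit field, like a long on 32-bit. */
#define MODEL_SEC_MAX INT32_MAX

/* Hypothetical split representation of a time value (seconds + nanoseconds). */
struct model_ktime {
    int32_t sec;
    int32_t nsec;
};

/* Convert nanoseconds to the split form, saturating instead of wrapping negative. */
static struct model_ktime model_ktime_from_ns(uint64_t nsec)
{
    struct model_ktime t;
    uint64_t sec = nsec / MODEL_NSEC_PER_SEC;

    if (sec > (uint64_t)MODEL_SEC_MAX) {
        /* Too large for the seconds field: cap at the largest representable value. */
        t.sec = MODEL_SEC_MAX;
        t.nsec = (int32_t)(MODEL_NSEC_PER_SEC - 1);
        return t;
    }
    t.sec = (int32_t)sec;
    t.nsec = (int32_t)(nsec % MODEL_NSEC_PER_SEC);
    return t;
}
```

With the value from the report, 7881299347898368000 ns, the seconds count exceeds the 32-bit range and the sketch returns the capped value rather than a negative one.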
-
Prarit Bhargava authored
commit 8f294b5a upstream. The settimeofday01 test in the LTP testsuite effectively does gettimeofday(current time); settimeofday(Jan 1, 1970 + 100 seconds); settimeofday(current time); This test causes a stack trace to be displayed on the console during the setting of timeofday to Jan 1, 1970 + 100 seconds: [ 131.066751] ------------[ cut here ]------------ [ 131.096448] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x135/0x140() [ 131.104935] Hardware name: Dinar [ 131.108150] Modules linked in: sg nfsv3 nfs_acl nfsv4 auth_rpcgss nfs dns_resolver fscache lockd sunrpc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables kvm_amd kvm sp5100_tco bnx2 i2c_piix4 crc32c_intel k10temp fam15h_power ghash_clmulni_intel amd64_edac_mod pcspkr serio_raw edac_mce_amd edac_core microcode xfs libcrc32c sr_mod sd_mod cdrom ata_generic crc_t10dif pata_acpi radeon i2c_algo_bit drm_kms_helper ttm drm ahci pata_atiixp libahci libata usb_storage i2c_core dm_mirror dm_region_hash dm_log dm_mod [ 131.176784] Pid: 0, comm: swapper/28 Not tainted 3.8.0+ #6 [ 131.182248] Call Trace: [ 131.184684] <IRQ> [<ffffffff810612af>] warn_slowpath_common+0x7f/0xc0 [ 131.191312] [<ffffffff8106130a>] warn_slowpath_null+0x1a/0x20 [ 131.197131] [<ffffffff810b9fd5>] clockevents_program_event+0x135/0x140 [ 131.203721] [<ffffffff810bb584>] tick_program_event+0x24/0x30 [ 131.209534] [<ffffffff81089ab1>] hrtimer_interrupt+0x131/0x230 [ 131.215437] [<ffffffff814b9600>] ? cpufreq_p4_target+0x130/0x130 [ 131.221509] [<ffffffff81619119>] smp_apic_timer_interrupt+0x69/0x99 [ 131.227839] [<ffffffff8161805d>] apic_timer_interrupt+0x6d/0x80 [ 131.233816] <EOI> [<ffffffff81099745>] ? 
sched_clock_cpu+0xc5/0x120 [ 131.240267] [<ffffffff814b9ff0>] ? cpuidle_wrap_enter+0x50/0xa0 [ 131.246252] [<ffffffff814b9fe9>] ? cpuidle_wrap_enter+0x49/0xa0 [ 131.252238] [<ffffffff814ba050>] cpuidle_enter_tk+0x10/0x20 [ 131.257877] [<ffffffff814b9c89>] cpuidle_idle_call+0xa9/0x260 [ 131.263692] [<ffffffff8101c42f>] cpu_idle+0xaf/0x120 [ 131.268727] [<ffffffff815f8971>] start_secondary+0x255/0x257 [ 131.274449] ---[ end trace 1151a50552231615 ]--- When we change the system time to a low value like this, the value of timekeeper->offs_real will be a negative value. It seems that the WARN occurs because an hrtimer has been started in the time between the releasing of the timekeeper lock and the IPI call (via a call to on_each_cpu) in clock_was_set() in the do_settimeofday() code. The end result is that a REALTIME_CLOCK timer has been added with softexpires = expires = KTIME_MAX. The hrtimer_interrupt() fires/is called and the loop at kernel/hrtimer.c:1289 is executed. In this loop the code subtracts the clock base's offset (which was set to timekeeper->offs_real in do_settimeofday()) from the current hrtimer_cpu_base->expiry value (which was KTIME_MAX): KTIME_MAX - (a negative value) = overflow A simple check for an overflow can resolve this problem. Using KTIME_MAX instead of the overflow value will result in the hrtimer function being run, and the reprogramming of the timer after that. Cc: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> [jstultz: Tweaked commit subject] Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
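The overflow check mentioned above can be sketched in plain C. This is an illustration under assumptions, not the hrtimer code: MODEL_KTIME_MAX and model_sub_offset() are hypothetical names, time is modelled as a signed 64-bit nanosecond count, and expires is assumed to be non-negative.

```c
#include <stdint.h>

/* Stand-in for KTIME_MAX in this sketch: largest signed 64-bit nanosecond count. */
#define MODEL_KTIME_MAX INT64_MAX

/*
 * Subtract a clock base offset from an expiry time.
 * Assumes expires >= 0, so a positive offset cannot underflow.
 */
static int64_t model_sub_offset(int64_t expires, int64_t offset)
{
    /*
     * A negative offset makes expires - offset larger than expires.
     * Test for the overflow before doing the subtraction and clamp.
     */
    if (offset < 0 && expires > MODEL_KTIME_MAX + offset)
        return MODEL_KTIME_MAX;
    return expires - offset;
}
```

For expires = KTIME_MAX and a negative offset, the sketch returns KTIME_MAX instead of a wrapped value, which matches the behaviour the message describes.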
-
Stefani Seibold authored
commit 58f8b6c4 upstream. This patch adds a missing USB device ID for the GDMBoost V1.x device. The patch is against 3.9-rc5. Signed-off-by: Stefani Seibold <stefani@seibold.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Christian Lamparter authored
commit 9e9dd0e8 upstream. The "Mobile Sandy Bridge CPUs" in the Fujitsu Esprimo Q900 mini desktop PCs are probably misleading the LVDS detection code in intel_lvds_supported. Nothing is connected to the LVDS ports in these systems. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Dmitry Monakhov authored
commit 794446c6 upstream.

The following race is possible:

[kjournald2]                                other_task
jbd2_journal_commit_transaction()
  j_state = T_FINISHED;
  spin_unlock(&journal->j_list_lock);
                                            ->jbd2_journal_remove_checkpoint()
                                              ->jbd2_journal_free_transaction();
                                                ->kmem_cache_free(transaction)
  ->j_commit_callback(journal, transaction);
    -> USE_AFTER_FREE

WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
Hardware name:
list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107
Call Trace:
[<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
[<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
[<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
[<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
[<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
[<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
[<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
[<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
[<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
[<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
[<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
[<ffffffff810ac6be>] kthread+0x10e/0x120
[<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
[<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
[<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70

In order to demonstrate this issue one should mount ext4 with the mount -o discard option on an SSD disk. This makes the callback take longer and the race window becomes wider.

In order to fix this we should mark the transaction as finished only after the callbacks have completed.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
[bwh: Backported to 3.2: s/jbd2_journal_free_transaction/kfree/]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Theodore Ts'o authored
commit d76a3a77 upstream. In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily", attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd2_log_start_commit(). Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c that might call jbd2_log_start_commit() with a stale tid, those functions will subsequently call jbd2_log_wait_commit() with the same stale tid, and then wait for a very long time. To fix this, we replace the calls to jbd2_log_start_commit() and jbd2_log_wait_commit() with a call to a new function, jbd2_complete_transaction(), which will correctly handle stale tid's. As a bonus, jbd2_complete_transaction() will avoid locking j_state_lock for writing unless a commit needs to be started. This should have a small (but probably not measurable) improvement for ext4's scalability. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: George Barnett <gbarnett@atlassian.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
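The wraparound itself is easy to reproduce outside the kernel. The sketch below is an assumption-laden model, not the jbd2 code: model_tid_t and model_tid_geq() are hypothetical names, and the comparison is written with unsigned arithmetic, which should be equivalent to a signed-difference test on a 32-bit sequence number.

```c
#include <stdint.h>

/* Hypothetical 32-bit transaction id, modelled on a wrapping sequence number. */
typedef uint32_t model_tid_t;

/* Returns 1 if x is at or after y in sequence order, within half the number space. */
static int model_tid_geq(model_tid_t x, model_tid_t y)
{
    /*
     * Unsigned subtraction wraps modulo 2^32. A difference below 2^31 means
     * x is not behind y; anything larger is treated as "x is older than y".
     */
    return (model_tid_t)(x - y) < 0x80000000u;
}
```

Once the current tid has moved more than 2**31 past a stale tid, model_tid_geq(current, stale) returns 0 and model_tid_geq(stale, current) returns 1, so the stale tid looks like a transaction that has not committed yet. A caller that then waits for that tid waits for a very long time, which is the situation the message describes.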
-
fanchaoting authored
commit b022032e upstream. We should return the error status directly when nfs4_preprocess_stateid_op() returns an error. Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Ben Jencks authored
commit e7d3b6e2 upstream. Add the Apple 24" LED Cinema display to the supported devices. Signed-off-by: Ben Jencks <ben@bjencks.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Ming Lei authored
commit f7db5e76 upstream. The inode->i_mutex isn't held when updating filp->f_pos in read()/write(), so filp->f_pos might be read as 0 or 1 in readdir() when there is a concurrent read()/write() on the same file, which may then cause a use after free in readdir(). The bug can be reproduced with Li Zefan's test code at: https://patchwork.kernel.org/patch/2160771/ This patch fixes the use after free in this situation. Reported-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: file position is child inode number, not hash] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Tony Luck authored
commit d303e9e9 upstream. Back in 2010, during a revamp of the irq code, some initializations were moved from ia64_mca_init() to ia64_mca_late_init() in commit c75f2aa1 "Cannot use register_percpu_irq() from ia64_mca_init()". But this was hideously wrong. First of all, these initializations are now done far too late, specifically after all the other cpus have been brought up and have initialized their own CMC vectors from smp_callin(). Also, ia64_mca_late_init() may be called from any cpu, so the line: ia64_mca_cmc_vector_setup(); /* Setup vector on BSP */ is generally not executed on the BSP, and so the CMC vector isn't set up at all on that processor. Make use of the arch_early_irq_init() hook to get this code executed at just the right moment: not too early, not too late. Reported-by: Fred Hartnett <fred.hartnett@hp.com> Tested-by: Fred Hartnett <fred.hartnett@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Alex Deucher authored
commit 41167828 upstream. Monitors seem to prefer it. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=37696 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [bwh: Backported to 3.2: - Adjust context - Add to pll->flags, not radeon_crtc->pll_flags] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Alex Deucher authored
commit bf05d998 upstream. It doesn't work reliably. Just report back the currently selected engine clock. Partially fixes: https://bugs.freedesktop.org/show_bug.cgi?id=62493 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Yinghai Lu authored
commit 545d6e18 upstream.

A problem was found on a system whose firmware can handle PCI AER. The firmware gets error reports when a PCI error is injected before the OS boots. But after the OS boots, the firmware cannot get reports anymore, even when pci=noaer is passed.

Root cause: the BIOS _OSC has a problem with query bit checking. It turns out that the BIOS vendor is copying example code from the ACPI Spec. In ACPI Spec 5.0, page 290:

If (Not(And(CDW1,1))) // Query flag clear?
{
    // Disable GPEs for features granted native control.
    If (And(CTRL,0x01)) // Hot plug control granted?
    {
        Store(0,HPCE) // clear the hot plug SCI enable bit
        Store(1,HPCS) // clear the hot plug SCI status bit
    }
    ...
}

When the Query flag is set, And(CDW1,1) will be 1 and Not(1) will return 0xfffffffe. So it will get into the code path that should be for control set only. The BIOS ACPI code should be changed to "If (LEqual(And(CDW1,1), 0))".

The current kernel code uses an _OSC query to notify the firmware about support from the OS and then uses _OSC to set the control bits. During the support query, the current code passes all possible controls, so the firmware executes code that should run only in the control set stage. That causes a problem when pci=noaer or AER firmware_first is used: the firmware already treats AER control as handed to the OS in the support query stage, but the OS will not do AER handling later.

We should avoid passing all possible controls and just use osc_control_set instead. That should work around BIOS bugs on affected systems in the field, as more BIOS vendors are copying sample code from the ACPI spec.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
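The difference between the two ASL tests can be mirrored in C. This is only an illustrative sketch: model_query_clear_buggy() and model_query_clear_fixed() are hypothetical names, and CDW1 is modelled as a 32-bit unsigned value, consistent with the 0xfffffffe result quoted above.

```c
#include <stdint.h>

/*
 * Mirrors ASL Not(And(CDW1,1)). Not is a bitwise complement, so the result is
 * 0xffffffff or 0xfffffffe and therefore non-zero for both flag values.
 */
static int model_query_clear_buggy(uint32_t cdw1)
{
    return ~(cdw1 & 1u) != 0;
}

/* Mirrors ASL LEqual(And(CDW1,1), 0): true only when the query flag is clear. */
static int model_query_clear_fixed(uint32_t cdw1)
{
    return (cdw1 & 1u) == 0;
}
```

The buggy form reports "query flag clear" even when bit 0 of CDW1 is set, which is why the control-set path runs during a query.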
-
Li Zefan authored
commit 3ac1707a upstream. The 3rd parameter of flex_array_prealloc() is the number of elements, not the index of the last element. The effect of the bug is that, when opening cgroup.procs, a flex array is allocated with all of its elements allocated with the GFP_KERNEL flag except the last one, which is allocated with GFP_ATOMIC, and if we fail to allocate memory for it, it'll trigger a BUG_ON(). Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Stephan Schreiber authored
commit de53e9ca upstream. The Linux Kernel contains some inline assembly source code which has wrong asm register constraints in arch/ia64/kvm/vtlb.c. I observed this on Kernel 3.2.35 but it is also true on the most recent Kernel 3.9-rc1. File arch/ia64/kvm/vtlb.c: u64 guest_vhpt_lookup(u64 iha, u64 *pte) { u64 ret; struct thash_data *data; data = __vtr_lookup(current_vcpu, iha, D_TLB); if (data != NULL) thash_vhpt_insert(current_vcpu, data->page_flags, data->itir, iha, D_TLB); asm volatile ( "rsm psr.ic|psr.i;;" "srlz.d;;" "ld8.s r9=[%1];;" "tnat.nz p6,p7=r9;;" "(p6) mov %0=1;" "(p6) mov r9=r0;" "(p7) extr.u r9=r9,0,53;;" "(p7) mov %0=r0;" "(p7) st8 [%2]=r9;;" "ssm psr.ic;;" "srlz.d;;" "ssm psr.i;;" "srlz.d;;" : "=r"(ret) : "r"(iha), "r"(pte):"memory"); return ret; } The list of output registers is : "=r"(ret) : "r"(iha), "r"(pte):"memory"); The constraint "=r" means that the GCC has to maintain that these vars are in registers and contain valid info when the program flow leaves the assembly block (output registers). But "=r" also means that GCC can put them in registers that are used as input registers. Input registers are iha, pte on the example. If the predicate p7 is true, the 8th assembly instruction "(p7) mov %0=r0;" is the first one which writes to a register which is maintained by the register constraints; it sets %0. %0 means the first register operand; it is ret here. This instruction might overwrite the %2 register (pte) which is needed by the next instruction: "(p7) st8 [%2]=r9;;" Whether it really happens depends on how GCC decides what registers it uses and how it optimizes the code. The attached patch fixes the register operand constraints in arch/ia64/kvm/vtlb.c. The register constraints should be : "=&r"(ret) : "r"(iha), "r"(pte):"memory"); The & means that GCC must not use any of the input registers to place this output register in. This is Debian bug#702639 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702639). 
The patch is applicable on Kernel 3.9-rc1, 3.2.35 and many other versions. Signed-off-by: Stephan Schreiber <info@fs-driver.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
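The effect of the & modifier can be shown with a much smaller block than the ia64 one. The sketch below is not the vtlb.c code: model_add_earlyclobber() is a hypothetical function, the assembly is x86-64 AT&T syntax rather than ia64, and other targets fall back to plain C so the example still builds.

```c
/* Adds a and b with two instructions; the output is written before b is read. */
static long model_add_earlyclobber(long a, long b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    long out;

    /*
     * "=&r" marks out as early-clobbered, so the compiler may not place it in
     * a register that holds a or b. With plain "=r", out could share b's
     * register, and the mov would destroy b before the add reads it.
     */
    __asm__("mov %1, %0\n\t"
            "add %2, %0"
            : "=&r"(out)
            : "r"(a), "r"(b)
            : "cc");
    return out;
#else
    /* Fallback for targets where the x86-64 assembly above does not apply. */
    return a + b;
#endif
}
```

As in the report, whether the plain "=r" form actually misbehaves depends on the register allocation the compiler happens to choose, which is why such bugs can stay hidden for a long time.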
-
Stephan Schreiber authored
commit 136f39dd upstream. The Linux Kernel contains some inline assembly source code which has wrong asm register constraints in arch/ia64/include/asm/futex.h. I observed this on Kernel 3.2.23 but it is also true on the most recent Kernel 3.9-rc1. File arch/ia64/include/asm/futex.h: static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newval) { if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))) return -EFAULT; { register unsigned long r8 __asm ("r8"); unsigned long prev; __asm__ __volatile__( " mf;; \n" " mov %0=r0 \n" " mov ar.ccv=%4;; \n" "[1:] cmpxchg4.acq %1=[%2],%3,ar.ccv \n" " .xdata4 \"__ex_table\", 1b-., 2f-. \n" "[2:]" : "=r" (r8), "=r" (prev) : "r" (uaddr), "r" (newval), "rO" ((long) (unsigned) oldval) : "memory"); *uval = prev; return r8; } } The list of output registers is : "=r" (r8), "=r" (prev) The constraint "=r" means that the GCC has to maintain that these vars are in registers and contain valid info when the program flow leaves the assembly block (output registers). But "=r" also means that GCC can put them in registers that are used as input registers. Input registers are uaddr, newval, oldval on the example. The second assembly instruction " mov %0=r0 \n" is the first one which writes to a register; it sets %0 to 0. %0 means the first register operand; it is r8 here. (The r0 is read-only and always 0 on the Itanium; it can be used if an immediate zero value is needed.) This instruction might overwrite one of the other registers which are still needed. Whether it really happens depends on how GCC decides what registers it uses and how it optimizes the code. The objdump utility can give us disassembly. The futex_atomic_cmpxchg_inatomic() function is inline, so we have to look for a module that uses the funtion. 
This is the cmpxchg_futex_value_locked() function in kernel/futex.c: static int cmpxchg_futex_value_locked(u32 *curval, u32 __user *uaddr, u32 uval, u32 newval) { int ret; pagefault_disable(); ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval); pagefault_enable(); return ret; } Now the disassembly. At first from the Kernel package 3.2.23 which has been compiled with GCC 4.4, remeber this Kernel seemed to work: objdump -d linux-3.2.23/debian/build/build_ia64_none_mckinley/kernel/futex.o 0000000000000230 <cmpxchg_futex_value_locked>: 230: 0b 18 80 1b 18 21 [MMI] adds r3=3168,r13;; 236: 80 40 0d 00 42 00 adds r8=40,r3 23c: 00 00 04 00 nop.i 0x0;; 240: 0b 50 00 10 10 10 [MMI] ld4 r10=[r8];; 246: 90 08 28 00 42 00 adds r9=1,r10 24c: 00 00 04 00 nop.i 0x0;; 250: 09 00 00 00 01 00 [MMI] nop.m 0x0 256: 00 48 20 20 23 00 st4 [r8]=r9 25c: 00 00 04 00 nop.i 0x0;; 260: 08 10 80 06 00 21 [MMI] adds r2=32,r3 266: 00 00 00 02 00 00 nop.m 0x0 26c: 02 08 f1 52 extr.u r16=r33,0,61 270: 05 40 88 00 08 e0 [MLX] addp4 r8=r34,r0 276: ff ff 0f 00 00 e0 movl r15=0xfffffffbfff;; 27c: f1 f7 ff 65 280: 09 70 00 04 18 10 [MMI] ld8 r14=[r2] 286: 00 00 00 02 00 c0 nop.m 0x0 28c: f0 80 1c d0 cmp.ltu p6,p7=r15,r16;; 290: 08 40 fc 1d 09 3b [MMI] cmp.eq p8,p9=-1,r14 296: 00 00 00 02 00 40 nop.m 0x0 29c: e1 08 2d d0 cmp.ltu p10,p11=r14,r33 2a0: 56 01 10 00 40 10 [BBB] (p10) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2a6: 02 08 00 80 21 03 (p08) br.cond.dpnt.few 2b0 <cmpxchg_futex_value_locked+0x80> 2ac: 40 00 00 41 (p06) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2b0: 0a 00 00 00 22 00 [MMI] mf;; 2b6: 80 00 00 00 42 00 mov r8=r0 2bc: 00 00 04 00 nop.i 0x0 2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;; 2c6: 10 1a 85 22 20 00 cmpxchg4.acq r33=[r33],r35,ar.ccv 2cc: 00 00 04 00 nop.i 0x0;; 2d0: 10 00 84 40 90 11 [MIB] st4 [r32]=r33 2d6: 00 00 00 02 00 00 nop.i 0x0 2dc: 20 00 00 40 br.few 2f0 <cmpxchg_futex_value_locked+0xc0> 2e0: 09 40 c8 f9 ff 27 [MMI] mov r8=-14 
2e6: 00 00 00 02 00 00 nop.m 0x0 2ec: 00 00 04 00 nop.i 0x0;; 2f0: 0b 58 20 1a 19 21 [MMI] adds r11=3208,r13;; 2f6: 20 01 2c 20 20 00 ld4 r18=[r11] 2fc: 00 00 04 00 nop.i 0x0;; 300: 0b 88 fc 25 3f 23 [MMI] adds r17=-1,r18;; 306: 00 88 2c 20 23 00 st4 [r11]=r17 30c: 00 00 04 00 nop.i 0x0;; 310: 11 00 00 00 01 00 [MIB] nop.m 0x0 316: 00 00 00 02 00 80 nop.i 0x0 31c: 08 00 84 00 br.ret.sptk.many b0;; The lines 2b0: 0a 00 00 00 22 00 [MMI] mf;; 2b6: 80 00 00 00 42 00 mov r8=r0 2bc: 00 00 04 00 nop.i 0x0 2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;; 2c6: 10 1a 85 22 20 00 cmpxchg4.acq r33=[r33],r35,ar.ccv 2cc: 00 00 04 00 nop.i 0x0;; are the instructions of the assembly block. The line 2b6: 80 00 00 00 42 00 mov r8=r0 sets the r8 register to 0 and after that 2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;; prepares the 'oldvalue' for the cmpxchg but it takes it from r8. This is wrong. What happened here is what I explained above: An input register is overwritten which is still needed. The register operand constraints in futex.h are wrong. (The problem doesn't occur when the Kernel is compiled with GCC 4.6.) The attached patch fixes the register operand constraints in futex.h. The code after patching of it: static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newval) { if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))) return -EFAULT; { register unsigned long r8 __asm ("r8") = 0; unsigned long prev; __asm__ __volatile__( " mf;; \n" " mov ar.ccv=%4;; \n" "[1:] cmpxchg4.acq %1=[%2],%3,ar.ccv \n" " .xdata4 \"__ex_table\", 1b-., 2f-. \n" "[2:]" : "+r" (r8), "=&r" (prev) : "r" (uaddr), "r" (newval), "rO" ((long) (unsigned) oldval) : "memory"); *uval = prev; return r8; } } I also initialized the 'r8' var with the C programming language. The _asm qualifier on the definition of the 'r8' var forces GCC to use the r8 processor register for it. I don't believe that we should use inline assembly for zeroing out a local variable. 
The constraint is "+r" (r8) what means that it is both an input register and an output register. Note that the page fault handler will modify the r8 register which will be the return value of the function. The real fix is "=&r" (prev) The & means that GCC must not use any of the input registers to place this output register in. Patched the Kernel 3.2.23 and compiled it with GCC4.4: 0000000000000230 <cmpxchg_futex_value_locked>: 230: 0b 18 80 1b 18 21 [MMI] adds r3=3168,r13;; 236: 80 40 0d 00 42 00 adds r8=40,r3 23c: 00 00 04 00 nop.i 0x0;; 240: 0b 50 00 10 10 10 [MMI] ld4 r10=[r8];; 246: 90 08 28 00 42 00 adds r9=1,r10 24c: 00 00 04 00 nop.i 0x0;; 250: 09 00 00 00 01 00 [MMI] nop.m 0x0 256: 00 48 20 20 23 00 st4 [r8]=r9 25c: 00 00 04 00 nop.i 0x0;; 260: 08 10 80 06 00 21 [MMI] adds r2=32,r3 266: 20 12 01 10 40 00 addp4 r34=r34,r0 26c: 02 08 f1 52 extr.u r16=r33,0,61 270: 05 40 00 00 00 e1 [MLX] mov r8=r0 276: ff ff 0f 00 00 e0 movl r15=0xfffffffbfff;; 27c: f1 f7 ff 65 280: 09 70 00 04 18 10 [MMI] ld8 r14=[r2] 286: 00 00 00 02 00 c0 nop.m 0x0 28c: f0 80 1c d0 cmp.ltu p6,p7=r15,r16;; 290: 08 40 fc 1d 09 3b [MMI] cmp.eq p8,p9=-1,r14 296: 00 00 00 02 00 40 nop.m 0x0 29c: e1 08 2d d0 cmp.ltu p10,p11=r14,r33 2a0: 56 01 10 00 40 10 [BBB] (p10) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2a6: 02 08 00 80 21 03 (p08) br.cond.dpnt.few 2b0 <cmpxchg_futex_value_locked+0x80> 2ac: 40 00 00 41 (p06) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2b0: 0b 00 00 00 22 00 [MMI] mf;; 2b6: 00 10 81 54 08 00 mov.m ar.ccv=r34 2bc: 00 00 04 00 nop.i 0x0;; 2c0: 09 58 8c 42 11 10 [MMI] cmpxchg4.acq r11=[r33],r35,ar.ccv 2c6: 00 00 00 02 00 00 nop.m 0x0 2cc: 00 00 04 00 nop.i 0x0;; 2d0: 10 00 2c 40 90 11 [MIB] st4 [r32]=r11 2d6: 00 00 00 02 00 00 nop.i 0x0 2dc: 20 00 00 40 br.few 2f0 <cmpxchg_futex_value_locked+0xc0> 2e0: 09 40 c8 f9 ff 27 [MMI] mov r8=-14 2e6: 00 00 00 02 00 00 nop.m 0x0 2ec: 00 00 04 00 nop.i 0x0;; 2f0: 0b 88 20 1a 19 21 [MMI] adds r17=3208,r13;; 2f6: 30 
01 44 20 20 00 ld4 r19=[r17] 2fc: 00 00 04 00 nop.i 0x0;; 300: 0b 90 fc 27 3f 23 [MMI] adds r18=-1,r19;; 306: 00 90 44 20 23 00 st4 [r17]=r18 30c: 00 00 04 00 nop.i 0x0;; 310: 11 00 00 00 01 00 [MIB] nop.m 0x0 316: 00 00 00 02 00 80 nop.i 0x0 31c: 08 00 84 00 br.ret.sptk.many b0;; Much better. There is a 270: 05 40 00 00 00 e1 [MLX] mov r8=r0 which was generated by C code r8 = 0. Below 2b6: 00 10 81 54 08 00 mov.m ar.ccv=r34 what means that oldval is no longer overwritten. This is Debian bug#702641 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702641). The patch is applicable on Kernel 3.9-rc1, 3.2.23 and many other versions. Signed-off-by: Stephan Schreiber <info@fs-driver.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Steven Rostedt (Red Hat) authored
commit 7fe70b57 upstream. ftrace_dump() had a lot of issues. What ftrace_dump() does is, when ftrace_dump_on_oops is set (via a kernel parameter or sysctl), it dumps out the ftrace buffers to the console when either an oops, panic, or a sysrq-z occurs. This was written a long time ago when ftrace was fragile to recursion. But it wasn't written well even for that. There's a possible deadlock that can occur if a ftrace_dump() is happening and an NMI triggers another dump. This is because it grabs a lock before checking if the dump ran. It also totally disables ftrace, and tracing, for no good reason. As the ring_buffer now checks if it is read via an oops or NMI, where there's a chance that the buffer gets corrupted, it will disable itself. No need to have ftrace_dump() do the same. ftrace_dump() is now cleaned up so that it uses an atomic counter to make sure only one dump happens at a time. A simple atomic_inc_return() is all that is needed for both other CPUs and NMIs. No need for a spinlock, as if one CPU is running the dump, no other CPU needs to do it too. The tracing_on variable is turned off and not turned on. The original code did this, but it wasn't pretty. By just disabling this variable we get the result of not seeing traces that happen between crashes. For sysrq-z, it doesn't get turned on, but the user can always write a '1' to the tracing_on file. If they are using sysrq-z, then they should know about tracing_on. The new code is much easier to read and less error prone. No more deadlock possibility when an NMI triggers here. Reported-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
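The "only one dump at a time" guard can be sketched with C11 atomics. This is a userspace model under assumptions, not the tracing code: model_dump_running, model_dump_begin() and model_dump_end() are hypothetical names, and atomic_fetch_add() plus one stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>

/* Counts callers currently inside the dump path; zero-initialized at start. */
static atomic_int model_dump_running;

/* Returns 1 if the caller may run the dump, 0 if another dump is in progress. */
static int model_dump_begin(void)
{
    /* atomic_fetch_add returns the old value, so old + 1 is the incremented count. */
    if (atomic_fetch_add(&model_dump_running, 1) + 1 != 1) {
        /* Someone else got there first: undo our increment and back off. */
        atomic_fetch_sub(&model_dump_running, 1);
        return 0;
    }
    return 1;
}

/* Called by the winner once the dump has finished. */
static void model_dump_end(void)
{
    atomic_fetch_sub(&model_dump_running, 1);
}
```

Because the loser never blocks, a second entry from an NMI on the same CPU simply returns instead of spinning on a lock that the interrupted context already holds.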
-
Steven Rostedt (Red Hat) authored
commit 4df29712 upstream.

Currently, the depth reported in the stack tracer stack_trace file does not match the stack_max_size file. This is because the stack_max_size includes the overhead of the stack tracer itself while the depth does not.

The first time a max is triggered, a calculation is now performed that figures out the overhead of the stack tracer and subtracts it from the stack_max_size variable. The overhead is stored and is subtracted from the reported stack size when comparing for a new max.

Now the stack_max_size corresponds to the reported depth:

# cat stack_max_size
4640

# cat stack_trace
        Depth    Size   Location    (48 entries)
        -----    ----   --------
  0)     4640      32   _raw_spin_lock+0x18/0x24
  1)     4608     112   ____cache_alloc+0xb7/0x22d
  2)     4496      80   kmem_cache_alloc+0x63/0x12f
  3)     4416      16   mempool_alloc_slab+0x15/0x17
[...]

While testing against an older gcc on x86 that uses mcount instead of fentry, I found that passing in ip + MCOUNT_INSN_SIZE let the stack trace show one more function deep which was missing before.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Steven Rostedt (Red Hat) authored
commit d4ecbfc4 upstream. When gcc 4.6 on x86 is used, the function tracer will use the new option -mfentry which does a call to "fentry" at every function instead of "mcount". The significance of this is that fentry is called as the first operation of the function instead of the mcount usage of being called after the stack. This causes the stack tracer to show some bogus results for the size of the last function traced, as well as showing "ftrace_call" instead of the function. This is due to the stack frame not being set up by the function that is about to be traced. # cat stack_trace Depth Size Location (48 entries) ----- ---- -------- 0) 4824 216 ftrace_call+0x5/0x2f 1) 4608 112 ____cache_alloc+0xb7/0x22d 2) 4496 80 kmem_cache_alloc+0x63/0x12f The 216 size for ftrace_call includes both the ftrace_call stack (which includes the saving of registers it does), as well as the stack size of the parent. To fix this, if CC_USING_FENTRY is defined, then the stack_tracer will reserve the first item in stack_dump_trace[] array when calling save_stack_trace(), and it will fill it in with the parent ip. Then the code will look for the parent pointer on the stack and give the real size of the parent's stack pointer: # cat stack_trace Depth Size Location (14 entries) ----- ---- -------- 0) 2640 48 update_group_power+0x26/0x187 1) 2592 224 update_sd_lb_stats+0x2a5/0x4ac 2) 2368 160 find_busiest_group+0x31/0x1f1 3) 2208 256 load_balance+0xd9/0x662 I'm Cc'ing stable, although it's not urgent, as it only shows bogus size for item #0, the rest of the trace is legit. It should still be corrected in previous stable releases. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Steven Rostedt (Red Hat) authored
commit 87889501 upstream. Use the stack of stack_trace_call() instead of check_stack() as the test pointer for max stack size. It makes it a bit cleaner and a little more accurate. Adding stable, as a later fix depends on this patch. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Zhao Hongjiang authored
commit 91d80a84 upstream. dprintk() shouldn't access @ring after it's unmapped. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 3.2: keep the second argument to kunmap_atomic()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Mathias Krause authored
commit 72a763d8 upstream. The current code does not set the msg_namelen member to 0 and therefore makes net/socket.c leak the local sockaddr_storage variable to userland -- 128 bytes of kernel stack memory. Fix that. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
- 25 Apr, 2013 15 commits
-
-
Ben Hutchings authored
-
Tejun Heo authored
commit 383efcd0 upstream. try_to_wake_up_local() should only be invoked to wake up another task in the same runqueue and BUG_ON()s are used to enforce the rule. Missing try_to_wake_up_local() can stall workqueue execution but such stalls are likely to be finite either by another work item being queued or the one blocked getting unblocked. There's no reason to trigger BUG while holding rq lock crashing the whole system. Convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130318192234.GD3042@htj.dyndns.org Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.2: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Andrew Honig authored
commit 8f964525 upstream. This patch adds support for kvm_gfn_to_hva_cache_init functions for reads and writes that will cross a page. If the range falls within the same memslot, then this will be a fast operation. If the range is split between two memslots, then the slower kvm_read_guest and kvm_write_guest are used. Tested: Test against kvm_clock unit tests. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> [bwh: Backported to 3.2: - Drop change in lapic.c - Keep using __gfn_to_memslot() in kvm_gfn_to_hva_cache_init()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Andy Honig authored
commit a2c118bf upstream. If the guest specifies an IOAPIC_REG_SELECT with an invalid value and follows that with a read of the IOAPIC_REG_WINDOW, KVM does not properly validate that request. ioapic_read_indirect contains an ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in non-debug builds. In recent kernels this allows a guest to cause a kernel oops by reading invalid memory. In older kernels (pre-3.3) this allows a guest to read from large ranges of host memory. Tested: tested against apic unit tests. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
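The missing validation described above can be illustrated with a small userspace sketch. This is not the KVM code: `toy_ioapic`, `toy_read_indirect` and `TOY_NUM_PINS` are hypothetical names, the index arithmetic is an assumption made for the example, and returning all ones for an out-of-range index is just one plausible policy.

```c
#include <stdint.h>

#define TOY_NUM_PINS 24  /* hypothetical table size for this sketch */

struct toy_ioapic {
    uint32_t ioregsel;                /* guest-controlled register select */
    uint64_t redirtbl[TOY_NUM_PINS];  /* redirection table entries */
};

/*
 * Return the selected 32-bit half of a redirection entry.
 * The index is checked at run time instead of relying on an
 * assertion that is compiled out of non-debug builds.
 */
static uint32_t toy_read_indirect(const struct toy_ioapic *io)
{
    /* Unsigned arithmetic: selects below 0x10 wrap to a huge index. */
    uint32_t redir_index = (io->ioregsel - 0x10) >> 1;
    uint64_t content;

    if (redir_index < TOY_NUM_PINS)
        content = io->redirtbl[redir_index];
    else
        content = ~0ULL;  /* out of range: never touch the table */

    /* Odd selects read the high half, even selects the low half. */
    return (io->ioregsel & 1) ? (uint32_t)(content >> 32)
                              : (uint32_t)content;
}
```

Because the comparison is unsigned, a select value below the table base is rejected by the same check as one beyond the end.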
-
Andy Honig authored
commit 0b79459b upstream. There is a potential use after free issue with the handling of MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable memory such as frame buffers then KVM might continue to write to that address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins the page in memory so it's unlikely to cause an issue, but if the user space component re-purposes the memory previously used for the guest, then the guest will be able to corrupt that memory. Tested: Tested against kvmclock unit test Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> [bwh: Backported to 3.2: - Adjust context - We do not implement the PVCLOCK_GUEST_STOPPED flag] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Andy Honig authored
commit c300aa64 upstream. If the guest sets the GPA of the time_page so that the request to update the time straddles a page then KVM will write onto an incorrect page. The write is done by using kmap atomic to get a pointer to the page for the time structure and then performing a memcpy to that page starting at an offset that the guest controls. Well-behaved guests always provide a 32-byte aligned address; however, a malicious guest could use this to corrupt host kernel memory. Tested: Tested against kvmclock unit test. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
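The straddling condition can be expressed as a simple offset check. The following is a userspace sketch only: `toy_write_fits_in_page` and `TOY_PAGE_SIZE` are hypothetical names, a 4096-byte page is assumed, and the patch itself may enforce the constraint differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Return true when a write of len bytes starting at guest physical
 * address gpa stays inside the page that contains gpa.
 */
static bool toy_write_fits_in_page(uint64_t gpa, size_t len)
{
    uint64_t offset = gpa & (TOY_PAGE_SIZE - 1);

    /* offset < TOY_PAGE_SIZE, so the subtraction cannot underflow. */
    return len <= TOY_PAGE_SIZE - offset;
}
```

With a 32-byte structure, any 32-byte aligned address passes this check, which matches the observation that well-behaved guests are unaffected.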
-
Vyacheslav Dubeyko authored
commit 12f267a2 upstream. Change a u32 to loff_t in hfsplus_file_truncate(). Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hin-Tak Leung <htl10@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Dave Airlie authored
commit 054430e7 upstream. Okay so Alan's patch handled the case where there was no registered fbcon, however the other path entered in set_con2fb_map pit. In there we called fbcon_takeover, but we also took the console lock in a couple of places. So push the console lock out to the callers of set_con2fb_map, this means fbmem and switcheroo needed to take the lock around the fb notifier entry points that lead to this. This should fix the efifb regression seen by Maarten. Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Tested-by: Lu Hua <huax.lu@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
David Woodhouse authored
commit f5cf8f07 upstream. This code was broken because it assumed that all MTD devices were map-based. Disable it for now, until it can be fixed properly for the next merge window. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
hayeswang authored
commit e2409d83 upstream. It would cause no link after suspending or shutting down when the nic changes the speed to 10M and connects to a link partner which forces the speed to 100M. Check the link partner ability to determine which speed to set. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Arnd Bergmann authored
commit c678ef52 upstream. As found by gcc-4.8, the QUEUE_SYSFS_BIT_FNS macro creates functions that use a value generated by queue_var_store independent of whether that value was set or not. block/blk-sysfs.c: In function 'queue_store_nonrot': block/blk-sysfs.c:244:385: warning: 'val' may be used uninitialized in this function [-Wmaybe-uninitialized] Unlike most other such warnings, this one is not a false positive: writing any non-number string into the sysfs files indeed has an undefined result, rather than returning an error. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
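The pattern behind this warning, and one way to close the gap, can be sketched in userspace C. The names `toy_var_store` and `toy_store_flag` are hypothetical stand-ins, not the kernel functions, and `strtoul` is used here purely for illustration: the parser reports failure, and the caller checks the result before reading the value.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal number from page into *var.
 * On bad input, leave *var untouched and return -EINVAL.
 */
static int toy_var_store(unsigned long *var, const char *page)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(page, &end, 10);
    if (end == page || errno != 0)
        return -EINVAL;  /* no digits consumed, or out of range */

    *var = v;
    return 0;
}

/* Set or clear bit in *flags according to the number in page. */
static int toy_store_flag(unsigned int *flags, unsigned int bit,
                          const char *page)
{
    unsigned long val;
    int ret = toy_var_store(&val, page);

    if (ret < 0)
        return ret;  /* val was never set, so it must not be read */

    if (val)
        *flags |= bit;
    else
        *flags &= ~bit;
    return 0;
}
```

In this sketch a non-number string leaves the flags unchanged and produces an error code instead of acting on an indeterminate value.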
-
David Henningsson authored
commit aeb3a972 upstream. Rename "Digitial In" to "Digital In". This function is only used for proc output, so should not cause any problems to change. Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Rainer Koenig authored
commit 1d87caa6 upstream. * Added the device ID to the modalias list and assigned ALC662 patches for it * Added 4 port support for the device ID 0671 in alc662_parse_auto_config Signed-off-by: Rainer Koenig <Rainer.Koenig@ts.fujitsu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-
Maxim Mikityanskiy authored
commit 51c94491 upstream. Fix memory leak - don't forget to kfree ACPI object when returning from msi_wmi_notify() after suppressing key event. Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com> Acked-by: Anisse Astier <anisse@astier.eu> Signed-off-by: Lee, Chun-Yi <jlee@suse.com> Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
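A leak of this kind typically arises when an early return skips the free at the end of a handler. The sketch below shows one common remedy, a single exit path. It is a userspace illustration with hypothetical names (`toy_get_event`, `toy_put_event`, `toy_notify`), and the actual patch may simply add a kfree() before the early return.

```c
#include <stdlib.h>

static int toy_live_objects;  /* allocations not yet freed */

/* Allocate an event object; stands in for fetching an ACPI object. */
static int *toy_get_event(int code)
{
    int *obj = malloc(sizeof *obj);

    if (obj) {
        *obj = code;
        toy_live_objects++;
    }
    return obj;
}

/* Release an event object obtained from toy_get_event(). */
static void toy_put_event(int *obj)
{
    if (obj) {
        free(obj);
        toy_live_objects--;
    }
}

/* Return 1 if the event was reported, 0 if it was suppressed. */
static int toy_notify(int code, int suppress)
{
    int reported = 0;
    int *obj = toy_get_event(code);

    if (!obj)
        return 0;

    if (suppress)
        goto out;  /* early exit still reaches the free below */

    reported = 1;
out:
    toy_put_event(obj);
    return reported;
}
```

Routing the suppressed case through the same label as the normal case keeps the allocation and its release paired on every path.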
-
Jani Nikula authored
commit 5559ecad upstream. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44156 Reported-by: Alan Zimmerman <alan.zimm@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
-