• Chuck Lever's avatar
    NFS: Fix "kernel BUG at fs/aio.c:554!" · 839f7ad6
    Chuck Lever authored
    Nick Piggin reports:
    
    > I'm getting use after frees in aio code in NFS
    >
    > [ 2703.396766] Call Trace:
    > [ 2703.396858]  [<ffffffff8100b057>] ? native_sched_clock+0x27/0x80
    > [ 2703.396959]  [<ffffffff8108509e>] ? put_lock_stats+0xe/0x40
    > [ 2703.397058]  [<ffffffff81088348>] ? lock_release_holdtime+0xa8/0x140
    > [ 2703.397159]  [<ffffffff8108a2a5>] lock_acquire+0x95/0x1b0
    > [ 2703.397260]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
    > [ 2703.397361]  [<ffffffff81039701>] ? get_parent_ip+0x11/0x50
    > [ 2703.397464]  [<ffffffff81612a31>] _raw_spin_lock_irq+0x41/0x80
    > [ 2703.397564]  [<ffffffff811627db>] ? aio_put_req+0x2b/0x60
    > [ 2703.397662]  [<ffffffff811627db>] aio_put_req+0x2b/0x60
    > [ 2703.397761]  [<ffffffff811647fe>] do_io_submit+0x2be/0x7c0
    > [ 2703.397895]  [<ffffffff81164d0b>] sys_io_submit+0xb/0x10
    > [ 2703.397995]  [<ffffffff8100307b>] system_call_fastpath+0x16/0x1b
    >
    > Adding some tracing, it is due to nfs completing the request then
    > returning something other than -EIOCBQUEUED, so aio.c
    > also completes the request.
    
    To address this, prevent the NFS direct I/O engine from completing
    async iocbs when the forward path returns an error without starting
    any I/O.
    
    This fix appears to survive ^C during both "xfstest no. 208" and "fsx
    -Z."
    
    It's likely this bug has existed for a very long while, as we are seeing
    very similar symptoms in OEL 5.  Copying stable.
    
    Cc: Stable <stable@kernel.org>
    Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
    Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
    839f7ad6
direct.c 27.5 KB