1. 10 Feb, 2016 1 commit
  2. 01 Feb, 2016 1 commit
    • Tejun Heo's avatar
      libata: fix sff host state machine locking while polling · 8eee1d3e
      Tejun Heo authored
      The bulk of ATA host state machine is implemented by
      ata_sff_hsm_move().  The function is called from either the interrupt
      handler or, if polling, a work item.  Unlike from the interrupt path,
      the polling path calls the function without holding the host lock and
      ata_sff_hsm_move() selectively grabs the lock.
      
      This is completely broken.  If an IRQ triggers while polling is in
      progress, the two can easily race and end up accessing the hardware
      and updating state machine state at the same time.  This can put the
      state machine in an illegal state and lead to a crash like the
      following.
      
        kernel BUG at drivers/ata/libata-sff.c:1302!
        invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
        Modules linked in:
        CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000
        RIP: 0010:[<ffffffff83a83409>]  [<ffffffff83a83409>] ata_sff_hsm_move+0x619/0x1c60
        ...
        Call Trace:
         <IRQ>
         [<ffffffff83a84c31>] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584
         [<ffffffff83a85611>] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877
         [<     inline     >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629
         [<ffffffff83a85bf3>] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902
         [<ffffffff81479f98>] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157
         [<ffffffff8147a717>] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205
         [<ffffffff81484573>] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623
         [<     inline     >] generic_handle_irq_desc include/linux/irqdesc.h:146
         [<ffffffff811a92bc>] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78
         [<ffffffff811a7e4d>] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240
         [<ffffffff86653d4c>] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520
         <EOI>
         [<     inline     >] rcu_lock_acquire include/linux/rcupdate.h:490
         [<     inline     >] rcu_read_lock include/linux/rcupdate.h:874
         [<ffffffff8164b4a1>] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145
         [<     inline     >] do_fault_around mm/memory.c:2943
         [<     inline     >] do_read_fault mm/memory.c:2962
         [<     inline     >] do_fault mm/memory.c:3133
         [<     inline     >] handle_pte_fault mm/memory.c:3308
         [<     inline     >] __handle_mm_fault mm/memory.c:3418
         [<ffffffff816efb16>] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447
         [<ffffffff8127dc16>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238
         [<ffffffff8127e358>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
         [<ffffffff8126f514>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
         [<ffffffff86655578>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986
      
      Fix it by ensuring that the polling path is holding the host lock
      before entering ata_sff_hsm_move() so that all hardware accesses and
      state updates are performed under the host lock.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-and-tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.com
      Cc: stable@vger.kernel.org
      8eee1d3e
  3. 29 Jan, 2016 1 commit
    • Tejun Heo's avatar
      libata-sff: use WARN instead of BUG on illegal host state machine state · a588afc9
      Tejun Heo authored
      ata_sff_hsm_move() triggers BUG if it sees a host state machine state
      that it dind't expect.  The risk for data corruption when the
      condition occurs is low as it's highly unlikely that it would lead to
      spurious completion of commands.  The BUG occasionally triggered for
      subtle race conditions in the driver.  Let's downgrade it to WARN so
      that it doesn't kill the machine unnecessarily.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      a588afc9
  4. 25 Jan, 2016 3 commits
  5. 24 Jan, 2016 34 commits