• Sagi Grimberg's avatar
    nvme-tcp: fix possible crash in recv error flow · 39d06079
    Sagi Grimberg authored
    If the target misbehaves and sends us unexpected payload we
    need to make sure to fail the controller and stop processing
    the input stream. We clear the rd_enabled flag and stop
    the io_work, but we may still requeue it if we still have pending
    sends and then in the next invocation we will process the input
    stream as the check is only in the .data_ready upcall.
    
    To fix this we need to make sure not to self-requeue io_work
    upon a recv flow error.
    
    This fixes the crash:
     nvme nvme2: receive failed:  -22
     BUG: unable to handle page fault for address: ffffbeb5816c3b48
     nvme_ns_head_make_request: 29 callbacks suppressed
     block nvme0n5: no usable path - requeuing I/O
     block nvme0n5: no usable path - requeuing I/O
     block nvme0n7: no usable path - requeuing I/O
     block nvme0n7: no usable path - requeuing I/O
     block nvme0n3: no usable path - requeuing I/O
     block nvme0n3: no usable path - requeuing I/O
     block nvme0n3: no usable path - requeuing I/O
     block nvme0n7: no usable path - requeuing I/O
     block nvme0n3: no usable path - requeuing I/O
     block nvme0n3: no usable path - requeuing I/O
     #PF: supervisor read access inkernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 1039157067 P4D 1039157067 PUD 103915a067 PMD 102719f067 PTE 0
     Oops: 0000 [#1] SMP PTI
     CPU: 8 PID: 411 Comm: kworker/8:1H Not tainted 5.3.0-40-generic #32~18.04.1-Ubuntu
     Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
     Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
     RIP: 0010:nvme_tcp_recv_skb+0x2ae/0xb50 [nvme_tcp]
     RSP: 0018:ffffbeb5806cfd10 EFLAGS: 00010246
     RAX: ffffbeb5816c3b48 RBX: 00000000000003d0 RCX: 0000000000000008
     RDX: 00000000000003d0 RSI: 0000000000000001 RDI: ffff9a3040684b40
     RBP: ffffbeb5806cfd90 R08: 0000000000000000 R09: ffffffff946e6900
     R10: ffffbeb5806cfce0 R11: 0000000000000001 R12: 0000000000000000
     R13: ffff9a2ff86501c0 R14: 00000000000003d0 R15: ffff9a30b85f2798
     FS:  0000000000000000(0000) GS:ffff9a30bf800000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: ffffbeb5816c3b48 CR3: 000000088400a006 CR4: 00000000003626e0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     Call Trace:
      tcp_read_sock+0x8c/0x290
      ? __release_sock+0x9d/0xe0
      ? nvme_tcp_write_space+0xb0/0xb0 [nvme_tcp]
      nvme_tcp_io_work+0x4b4/0x830 [nvme_tcp]
      ? finish_task_switch+0x163/0x270
      process_one_work+0x1fd/0x3f0
      worker_thread+0x34/0x410
      kthread+0x121/0x140
      ? process_one_work+0x3f0/0x3f0
      ? kthread_park+0xb0/0xb0
      ret_from_fork+0x35/0x40
    Reported-by: default avatarRoy Shterman <roys@lightbitslabs.com>
    Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
    Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
    39d06079
tcp.c 61.6 KB