    IB/umad: Fix use of unprotected device pointer · f23a5350
    Jack Morgenstein authored
    ib_umad_write() runs under the umad file mutex.
    However, it accesses file->port->ib_dev, which is protected only by the
    port's mutex (field file_mutex), not by the file mutex.
    
    The ib_umad_remove_one() calls ib_umad_kill_port() which sets
    port->ib_dev to NULL under the port mutex (NOT the file mutex).
    It then sets the mad agent to "dead" under the umad file mutex.
    
    This creates a race: there is a window in which port->ib_dev is
    already NULL while the agent is not yet marked "dead", so a concurrent
    writer passes the agent check and then dereferences the NULL pointer.
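
The ordering problem can be sketched as a small user-space model. This is not driver code: the `model_*` types, the two-step split of the teardown, and `in_race_window()` are hypothetical names introduced only for illustration, and the mutexes are represented by comments rather than real locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; none of these are the kernel's own types. */
struct model_dev { int id; };

struct model_port {
    struct model_dev *ib_dev;   /* in the driver: guarded by the port mutex */
};

struct model_file {
    struct model_port *port;
    bool agents_dead;           /* in the driver: guarded by the file mutex */
};

/* Teardown step 1: clear the device pointer (port mutex held in the driver). */
void kill_port_step1(struct model_port *port)
{
    assert(port != NULL);
    port->ib_dev = NULL;
}

/* Teardown step 2: mark the agents dead (file mutex held in the driver). */
void kill_port_step2(struct model_file *file)
{
    assert(file != NULL);
    file->agents_dead = true;
}

/*
 * True when a writer that holds only the file mutex would still see a
 * live agent but would find a NULL device pointer on the port.
 */
bool in_race_window(const struct model_file *file)
{
    return !file->agents_dead && file->port->ib_dev == NULL;
}
```

In this model the window is open exactly between `kill_port_step1()` and `kill_port_step2()`; before the first step and after the second, `in_race_window()` reports false.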
    
    As a result, we saw stack traces like:
    
    [16490.678059] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
    [16490.678246] IP: ib_umad_write+0x29c/0xa3a [ib_umad]
    [16490.678333] PGD 0 P4D 0
    [16490.678404] Oops: 0000 [#1] SMP PTI
    [16490.678466] Modules linked in: rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx4_en(OE) ptp pps_core mlx4_ib(OE-) ib_core(OE) mlx4_core(OE) mlx_compat
    (OE) memtrack(OE) devlink mst_pciconf(OE) mst_pci(OE) netconsole nfsv3 nfs_acl nfs lockd grace fscache cfg80211 rfkill esp6_offload esp6 esp4_offload esp4 sunrpc kvm_intel kvm ppdev parport_pc irqbypass
    parport joydev i2c_piix4 virtio_balloon cirrus drm_kms_helper ttm drm e1000 serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg [last unloaded: mlxfw]
    [16490.679202] CPU: 4 PID: 3115 Comm: sminfo Tainted: G           OE   4.14.13-300.fc27.x86_64 #1
    [16490.679339] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
    [16490.679477] task: ffff9cf753890000 task.stack: ffffaf70c26b0000
    [16490.679571] RIP: 0010:ib_umad_write+0x29c/0xa3a [ib_umad]
    [16490.679664] RSP: 0018:ffffaf70c26b3d90 EFLAGS: 00010202
    [16490.679747] RAX: 0000000000000010 RBX: ffff9cf75610fd80 RCX: 0000000000000000
    [16490.679856] RDX: 0000000000000001 RSI: 00007ffdf2bfd714 RDI: ffff9cf6bb2a9c00
    
    In the above trace, ib_umad_write is trying to dereference the NULL
    file->port->ib_dev pointer.
    
    Fix this by using the agent's device pointer (the device field
    in struct ib_mad_agent) -- which IS protected by the umad file mutex.
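
The idea behind the fix can be sketched in user space as follows. This is an illustrative model under stated assumptions, not the actual patch: the `sketch_*` names are hypothetical, the caller is assumed to hold the file mutex, and returning `-EINVAL` for a dead agent is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; none of these are the kernel's own types. */
struct sketch_dev { int id; };

/* Models the device field of struct ib_mad_agent. */
struct sketch_agent { struct sketch_dev *device; };

struct sketch_port { struct sketch_dev *ib_dev; };

struct sketch_file {
    struct sketch_port *port;
    struct sketch_agent *agent;
    bool agents_dead;           /* set during teardown under the file mutex */
};

/* Return the live agent, or NULL once teardown has marked it dead. */
struct sketch_agent *sketch_get_agent(struct sketch_file *file)
{
    return file->agents_dead ? NULL : file->agent;
}

/*
 * Write path after the fix: the device is taken from the agent, whose
 * liveness is checked under the same (file) mutex, instead of from
 * file->port->ib_dev, which teardown may already have cleared.
 */
int sketch_write(struct sketch_file *file, struct sketch_dev **dev_out)
{
    struct sketch_agent *agent;

    assert(file != NULL && dev_out != NULL);
    agent = sketch_get_agent(file);
    if (agent == NULL)
        return -EINVAL;         /* agent already dead: refuse the write */
    *dev_out = agent->device;   /* not file->port->ib_dev */
    return 0;
}
```

With the port's pointer already cleared but the agent still live, `sketch_write()` still yields a valid device; once the agent is marked dead, it fails cleanly without touching any device pointer.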
    
    Cc: <stable@vger.kernel.org> # v4.11
    Fixes: 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
    Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>