    IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests · 00cbce5c
    Patrick Kelsey authored
    hfi1 user SDMA request processing has two bugs that can cause data
    corruption for user SDMA requests that have multiple payload iovecs
    where an iovec other than the tail iovec does not run up to the page
    boundary for the buffer pointed to by that iovec.
    
    Here are the specific bugs:
    1. user_sdma_txadd() does not use struct user_sdma_iovec->iov.iov_len.
       Rather, user_sdma_txadd() will add up to PAGE_SIZE bytes from iovec
       to the packet, even if some of those bytes are past
       iovec->iov.iov_len and are thus not intended to be in the packet.
    2. user_sdma_txadd() and user_sdma_send_pkts() fail to advance to the
       next iovec in user_sdma_request->iovs when the current iovec
       is not PAGE_SIZE and does not contain enough data to complete the
       packet. The transmitted packet will contain the wrong data from the
       iovec pages.
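
    As an illustration only, the two points above can be sketched in
    user-space C. This is not the driver code: struct sketch_iovec,
    sketch_fill_packet() and SKETCH_PAGE_SIZE are hypothetical names, and
    memcpy() stands in for adding pages to an SDMA descriptor. The sketch
    limits each chunk by the bytes left in the iovec (not only by the page
    boundary) and advances to the next iovec when the current one runs out
    before the packet is complete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size for the sketch */

/* Hypothetical stand-in for struct user_sdma_iovec; not the driver layout. */
struct sketch_iovec {
    const unsigned char *base; /* start of the user buffer */
    size_t len;                /* iov_len: bytes that belong to the request */
    size_t offset;             /* bytes already consumed from this iovec */
};

/*
 * Copy up to pkt_len bytes into pkt, walking iovs[*idx .. niovs).
 * Each chunk is limited by three things: the packet remainder, the bytes
 * left in the current iovec, and the bytes left in the current page.
 * Returns the number of bytes copied.
 */
static size_t sketch_fill_packet(struct sketch_iovec *iovs, size_t niovs,
                                 size_t *idx, unsigned char *pkt,
                                 size_t pkt_len)
{
    size_t copied = 0;

    while (copied < pkt_len && *idx < niovs) {
        struct sketch_iovec *iov = &iovs[*idx];
        size_t left_in_iov, page_off, chunk;

        assert(iov->offset <= iov->len);
        left_in_iov = iov->len - iov->offset;
        if (left_in_iov == 0) {
            /* Current iovec is exhausted: advance to the next one. */
            (*idx)++;
            continue;
        }

        /* Bytes remaining in the page that holds the current position. */
        page_off = (uintptr_t)(iov->base + iov->offset) &
                   (SKETCH_PAGE_SIZE - 1);
        chunk = SKETCH_PAGE_SIZE - page_off;

        /* Never take bytes past iov_len or past the packet length. */
        if (chunk > left_in_iov)
            chunk = left_in_iov;
        if (chunk > pkt_len - copied)
            chunk = pkt_len - copied;

        memcpy(pkt + copied, iov->base + iov->offset, chunk);
        iov->offset += chunk;
        copied += chunk;
    }
    return copied;
}
```

    With a 3-byte iovec followed by a 5-byte iovec, a 6-byte packet built
    this way takes all 3 bytes of the first iovec and the first 3 bytes of
    the second, and none of the bytes that follow the first iovec in its
    page.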
    
    This has not been an issue with SDMA packets from hfi1 Verbs or PSM2
    because they only produce iovecs that end short of PAGE_SIZE as the tail
    iovec of an SDMA request.
    
    Fixing these bugs exposes other bugs with the SDMA pin cache
    (struct mmu_rb_handler) that get in the way of supporting user SDMA requests
    with multiple payload iovecs whose buffers do not end at PAGE_SIZE. So
    this commit fixes those issues as well.
    
    Here are the mmu_rb_handler bugs that non-PAGE_SIZE-end multi-iovec
    payload user SDMA requests can hit:
    1. Overlapping memory ranges in mmu_rb_handler will result in duplicate
       pinnings.
    2. When extending an existing mmu_rb_handler entry (struct mmu_rb_node),
       the mmu_rb code (1) removes the existing entry under a lock, (2)
       releases that lock and pins the new pages, then (3) reacquires the
       lock to insert the extended mmu_rb_node.
    
       If someone else comes in and inserts an overlapping entry between (2)
       and (3), insert in (3) will fail.
    
       The failure path code in this case unpins _all_ pages in either the
       original mmu_rb_node or the new mmu_rb_node that was inserted between
       (2) and (3).
    3. In hfi1_mmu_rb_remove_unless_exact(), mmu_rb_node->refcount is
       incremented outside of mmu_rb_handler->lock. As a result, mmu_rb_node
       could be evicted by another thread that gets mmu_rb_handler->lock and
       checks mmu_rb_node->refcount before mmu_rb_node->refcount is
       incremented.
    4. Related to #2 above, the SDMA request submission failure path does
       not check mmu_rb_node->refcount before freeing the mmu_rb_node
       object.
    
       If other SDMA requests in progress have iovecs that point to the
       now-freed mmu_rb_node(s), those stale pointers will be dereferenced
       when those SDMA requests complete.
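
    The refcount races in items 3 and 4 can be illustrated with a small
    user-space sketch. It is not the driver code: struct sketch_node,
    struct sketch_handler and the sketch_*() helpers are hypothetical, a
    pthread mutex stands in for mmu_rb_handler->lock, and a "released"
    flag stands in for unpinning and freeing. The idea shown is that
    lookup and refcount increment happen in one critical section, and that
    eviction checks the refcount under the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mmu_rb_node. */
struct sketch_node {
    unsigned long addr;
    unsigned long len;
    int refcount;  /* protected by sketch_handler.lock */
    bool released; /* set where real code would unpin and free */
};

/* Hypothetical stand-in for struct mmu_rb_handler: a one-entry cache. */
struct sketch_handler {
    pthread_mutex_t lock;
    struct sketch_node *node;
};

/* Look up addr and take a reference without dropping the lock between. */
static struct sketch_node *sketch_get(struct sketch_handler *h,
                                      unsigned long addr)
{
    struct sketch_node *n;

    pthread_mutex_lock(&h->lock);
    n = h->node;
    if (n && addr >= n->addr && addr < n->addr + n->len)
        n->refcount++;
    else
        n = NULL;
    pthread_mutex_unlock(&h->lock);
    return n;
}

/* Drop a reference taken by sketch_get(). */
static void sketch_put(struct sketch_handler *h, struct sketch_node *n)
{
    pthread_mutex_lock(&h->lock);
    assert(n->refcount > 0);
    n->refcount--;
    pthread_mutex_unlock(&h->lock);
}

/* Evict the cached node only if nobody holds a reference to it. */
static bool sketch_evict(struct sketch_handler *h)
{
    bool evicted = false;

    pthread_mutex_lock(&h->lock);
    if (h->node && h->node->refcount == 0) {
        h->node->released = true;
        h->node = NULL;
        evicted = true;
    }
    pthread_mutex_unlock(&h->lock);
    return evicted;
}
```

    In this sketch an eviction attempt made while a request still holds a
    reference fails, and a failure path would call sketch_put() rather
    than release the node directly.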
    
    Fixes: 7be85676 ("IB/hfi1: Don't remove RB entry when not needed.")
    Fixes: 77241056 ("IB/hfi1: add driver files")
    Signed-off-by: Brendan Cunningham <bcunningham@cornelisnetworks.com>
    Signed-off-by: Patrick Kelsey <pat.kelsey@cornelisnetworks.com>
    Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
    Link: https://lore.kernel.org/r/168088636445.3027109.10054635277810177889.stgit@252.162.96.66.static.eigbox.net
    Signed-off-by: Leon Romanovsky <leon@kernel.org>