1. 12 Jan, 2023 2 commits
  2. 11 Jan, 2023 5 commits
  3. 10 Jan, 2023 23 commits
  4. 09 Jan, 2023 9 commits
  5. 07 Jan, 2023 1 commit
    • Merge tag 'rxrpc-fixes-20230107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 571f3dd0
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Fix race between call connection, data transmit and call disconnect
      
      Here are patches to fix an oops[1] caused by a race between call
      connection, initial packet transmission and call disconnection which
      results in something like:
      
              kernel BUG at net/rxrpc/peer_object.c:413!
      
      when the syzbot test is run.  The problem is that the connection procedure
      is effectively split across two threads, and the window between them can
      be widened by taking an interrupt, so that the call is added to the peer
      error distribution list *after* it has been disconnected (say by the rxrpc
      socket shutting down).
      
      The easiest solution is to look at the fourth set of I/O thread
      conversion/SACK table expansion patches that didn't get applied[2] and take
      from it those patches that move call connection and disconnection into the
      I/O thread.  Moving these things into the I/O thread means that the
      sequencing is managed by all being done in the same thread - and the race
      can no longer happen.
      
      This is preferable to introducing an extra lock as adding an extra lock
      would make the I/O thread have to wait for the app thread in yet another
      place.
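
The serialisation argument above can be illustrated with a small model.  This is only a sketch under stated assumptions, not the rxrpc code: the names (sketch_call, io_queue, io_post, io_thread_run) are hypothetical, the queue is single-threaded for determinism, and a real implementation would need a lock or lockless list on the enqueue side.  It shows that when one thread applies connect and disconnect requests in order, a late connect cannot put a released call back on the error list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request types that an app thread hands to the I/O thread. */
enum req_type { REQ_CONNECT, REQ_DISCONNECT };

/* Hypothetical, much-reduced model of a call's connection state. */
struct sketch_call {
    bool connected;     /* holds a channel slot on a connection */
    bool on_error_list; /* linked on the peer error distribution list */
    bool released;      /* a disconnect has already been processed */
};

struct request {
    enum req_type type;
    struct sketch_call *call;
};

#define QUEUE_MAX 16

/* FIFO consumed only by the I/O thread. */
struct io_queue {
    struct request reqs[QUEUE_MAX];
    size_t head, tail;
};

/* App side: queue a request instead of changing call state directly. */
static bool io_post(struct io_queue *q, enum req_type type,
                    struct sketch_call *call)
{
    if (q->tail - q->head == QUEUE_MAX)
        return false; /* queue full */
    q->reqs[q->tail++ % QUEUE_MAX] = (struct request){ type, call };
    return true;
}

/* I/O side: the only place where call state changes. */
static void io_thread_run(struct io_queue *q)
{
    while (q->head != q->tail) {
        struct request *r = &q->reqs[q->head++ % QUEUE_MAX];
        struct sketch_call *c = r->call;

        switch (r->type) {
        case REQ_CONNECT:
            if (c->released)
                break; /* already disconnected: ignore the late connect */
            c->connected = true;
            c->on_error_list = true;
            break;
        case REQ_DISCONNECT:
            c->on_error_list = false;
            c->connected = false;
            c->released = true;
            break;
        }
        /* Invariant: a released call is never on the error list. */
        assert(!(c->released && c->on_error_list));
    }
}
```

Because every state change happens inside io_thread_run(), no additional lock is needed to order connect against disconnect in this model; the ordering falls out of the queue.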
      
      The changes can be considered as a number of logical parts:
      
       (1) Move all of the call state changes into the I/O thread.
      
       (2) Make client connection ID space per-local endpoint so that the I/O
           thread doesn't need locks to access it.
      
       (3) Move actual abort generation into the I/O thread and clean it up.  If
           sendmsg or recvmsg want to cause an abort, they have to delegate it.
      
       (4) Offload the setting up of the security context on a connection to the
           thread of one of the apps that's starting a call.  We don't want to be
           doing any sort of crypto in the I/O thread.
      
       (5) Connect calls (i.e. assign them to channel slots on connections) in the
           I/O thread.  Calls are set up by sendmsg/kafs and passed to the I/O
           thread to connect.  Connections are allocated in the I/O thread after
           this.
      
       (6) Disconnect calls in the I/O thread.
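
The delegation described in part (3) might look roughly like the following.  This is a sketch under assumptions rather than the actual patch: abort_call, app_propose_abort and io_process_call are hypothetical names, the "poke" is modelled as a plain flag, and real cross-thread code would need atomic or otherwise ordered accesses to these fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced call state for illustrating abort delegation. */
struct abort_call {
    uint32_t pending_abort; /* code requested by an app thread, 0 = none */
    bool     poked;         /* the I/O thread has been asked to look */
    bool     aborted;       /* set only by the I/O thread */
    uint32_t abort_code;    /* code actually used for the abort */
    unsigned abort_packets; /* count of ABORT packets "transmitted" */
};

/* App side (e.g. a sendmsg or recvmsg path): record the wish and poke.
 * No packet is generated here. */
static void app_propose_abort(struct abort_call *call, uint32_t code)
{
    if (call->pending_abort == 0) /* the first proposal wins */
        call->pending_abort = code;
    call->poked = true;
}

/* I/O side: the only place where the abort is actually generated. */
static void io_process_call(struct abort_call *call)
{
    if (!call->poked)
        return;
    call->poked = false;

    if (call->pending_abort != 0 && !call->aborted) {
        call->aborted = true;
        call->abort_code = call->pending_abort;
        call->abort_packets++; /* stands in for transmitting an ABORT */
    }
}
```

In this model the app thread never touches the aborted state itself, so repeated proposals produce at most one abort and it is always issued from the I/O thread.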
      
      I've also added a patch for an unrelated bug that cropped up during
      testing, whereby a race can occur between an incoming call and socket
      shutdown.
      
      Note that whilst this fixes the original syzbot bug, another bug may get
      triggered if this one is fixed:
      
              INFO: rcu detected stall in corrupted
              rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P5792 } 2657 jiffies s: 2825 root: 0x0/T
              rcu: blocking rcu_node structures (internal RCU debug):
      
      It doesn't look like this should have anything to do with rxrpc, though, as I've
      tested an additional patch[3] that removes practically all the RCU usage
      from rxrpc and it still occurs.  It seems likely that it is being caused by
      something in the tunnelling setup that the syzbot test does, but there's
      not enough info to go on.  It also seems unlikely to be anything to do with
      the afs driver as the test doesn't use that.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>