1. 10 Jan, 2023 14 commits
  2. 09 Jan, 2023 7 commits
  3. 07 Jan, 2023 3 commits
    • Merge tag 'rxrpc-fixes-20230107' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 571f3dd0
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Fix race between call connection, data transmit and call disconnect
      
      Here are patches to fix an oops[1] caused by a race between call
      connection, initial packet transmission and call disconnection which
      results in something like:
      
              kernel BUG at net/rxrpc/peer_object.c:413!
      
      when the syzbot test is run.  The problem is that the connection procedure
      is effectively split across two threads and can get expanded by taking an
      interrupt, thereby adding the call to the peer error distribution list
      *after* it has been disconnected (say by the rxrpc socket shutting down).
      
      The easiest solution is to look at the fourth set of I/O thread
      conversion/SACK table expansion patches that didn't get applied[2] and take
      from it those patches that move call connection and disconnection into the
      I/O thread.  Moving these things into the I/O thread means that the
      sequencing is managed by all being done in the same thread - and the race
      can no longer happen.
      
      This is preferable to introducing an extra lock as adding an extra lock
      would make the I/O thread have to wait for the app thread in yet another
      place.
      
      The changes can be considered as a number of logical parts:
      
       (1) Move all of the call state changes into the I/O thread.
      
       (2) Make client connection ID space per-local endpoint so that the I/O
           thread doesn't need locks to access it.
      
       (3) Move actual abort generation into the I/O thread and clean it up.  If
           sendmsg or recvmsg want to cause an abort, they have to delegate it.
      
       (4) Offload the setting up of the security context on a connection to the
           thread of one of the apps that's starting a call.  We don't want to be
           doing any sort of crypto in the I/O thread.
      
       (5) Connect calls (ie. assign them to channel slots on connections) in the
           I/O thread.  Calls are set up by sendmsg/kafs and passed to the I/O
           thread to connect.  Connections are allocated in the I/O thread after
           this.
      
       (6) Disconnect calls in the I/O thread.
      
      I've also added a patch for an unrelated bug that cropped up during
      testing, whereby a race can occur between an incoming call and socket
      shutdown.
      
      Note that whilst this fixes the original syzbot bug, another bug may get
      triggered if this one is fixed:
      
              INFO: rcu detected stall in corrupted
              rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { P5792 } 2657 jiffies s: 2825 root: 0x0/T
              rcu: blocking rcu_node structures (internal RCU debug):
      
      It doesn't look like this should be anything to do with rxrpc, though, as I've
      tested an additional patch[3] that removes practically all the RCU usage
      from rxrpc and it still occurs.  It seems likely that it is being caused by
      something in the tunnelling setup that the syzbot test does, but there's
      not enough info to go on.  It also seems unlikely to be anything to do with
      the afs driver as the test doesn't use that.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
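
The central argument of the cover letter above, that connection, transmission and disconnection cannot race once a single thread performs all three, can be sketched in userspace C. This is a minimal model under stated assumptions and not the rxrpc code: `model_call`, `io_request`, `io_thread_handle()` and `io_thread_run()` are hypothetical names, and the peer error distribution list is reduced to a single flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request types that app threads hand to the I/O thread. */
enum io_op { IO_CONNECT, IO_TRANSMIT, IO_DISCONNECT };

struct model_call {
	bool connected;          /* assigned to a channel slot */
	bool on_peer_error_list; /* would receive peer error notifications */
	unsigned int tx_dropped; /* transmissions refused while not connected */
};

struct io_request {
	enum io_op op;
	struct model_call *call;
};

/* Only the I/O thread runs this, so each step sees the result of the last. */
static void io_thread_handle(const struct io_request *req)
{
	struct model_call *call = req->call;

	switch (req->op) {
	case IO_CONNECT:
		call->connected = true;
		break;
	case IO_TRANSMIT:
		if (!call->connected) {
			/* Too late: never re-add a disconnected call. */
			call->tx_dropped++;
			break;
		}
		call->on_peer_error_list = true;
		break;
	case IO_DISCONNECT:
		call->connected = false;
		call->on_peer_error_list = false;
		break;
	}
}

/* Drain the queue strictly in order, as a single thread would. */
static void io_thread_run(const struct io_request *queue, size_t n)
{
	for (size_t i = 0; i < n; i++)
		io_thread_handle(&queue[i]);
}
```

Queuing connect, transmit, disconnect and then a late transmit leaves the call off the list, because the late transmit is evaluated after the disconnect rather than concurrently with it. No extra lock is needed to get that ordering.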
    • rxrpc: Fix incoming call setup race · 42f229c3
      David Howells authored
      An incoming call can race with rxrpc socket destruction, leading to a
      leaked call.  This may result in an oops when the call timer eventually
      expires:
      
         BUG: kernel NULL pointer dereference, address: 0000000000000874
         RIP: 0010:_raw_spin_lock_irqsave+0x2a/0x50
         Call Trace:
          <IRQ>
          try_to_wake_up+0x59/0x550
          ? __local_bh_enable_ip+0x37/0x80
          ? rxrpc_poke_call+0x52/0x110 [rxrpc]
          ? rxrpc_poke_call+0x110/0x110 [rxrpc]
          ? rxrpc_poke_call+0x110/0x110 [rxrpc]
          call_timer_fn+0x24/0x120
      
      with a warning in the kernel log looking something like:
      
         rxrpc: Call 00000000ba5e571a still in use (1,SvAwtACK,1061d,0)!
      
      incurred during rmmod of rxrpc.  The 1061d is the call flags:
      
         RECVMSG_READ_ALL, RX_HEARD, BEGAN_RX_TIMER, RX_LAST, EXPOSED,
         IS_SERVICE, RELEASED
      
      but no DISCONNECTED flag (0x800), so it's an incoming (service) call and
      it's still connected.
      
      The race appears to be that:
      
       (1) rxrpc_new_incoming_call() consults the service struct, checks sk_state
           and allocates a call - then pauses, possibly for an interrupt.
      
       (2) rxrpc_release_sock() sets RXRPC_CLOSE, nulls the service pointer,
           discards the prealloc and releases all calls attached to the socket.
      
       (3) rxrpc_new_incoming_call() resumes, launching the new call, including
           its timer and attaching it to the socket.
      
      Fix this by read-locking local->services_lock to access the AF_RXRPC socket
      providing the service rather than RCU in rxrpc_new_incoming_call().
      There's no real need to use RCU here as local->services_lock is only
      write-locked by the socket side in two places: when binding and when
      shutting down.
      
      Fixes: 5e6ef4f1 ("rxrpc: Make the I/O thread take over the call and local processor work")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-afs@lists.infradead.org
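
The flags reading in the message above can be checked mechanically. The sketch below uses only the two values quoted there, the flags word 1061d (hex) and the DISCONNECTED flag 0x800. The helper names are hypothetical, and no bit positions are assigned to the other flag names because the message does not give them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the commit message: DISCONNECTED is flag 0x800. */
#define CALL_FLAG_DISCONNECTED 0x800u

/* Count the flag bits set in a call's flags word. */
static unsigned int count_flags(uint32_t flags)
{
	unsigned int n = 0;

	for (; flags; flags &= flags - 1) /* clear the lowest set bit */
		n++;
	return n;
}

/* True if the DISCONNECTED flag is present in the flags word. */
static bool call_is_disconnected(uint32_t flags)
{
	return (flags & CALL_FLAG_DISCONNECTED) != 0;
}
```

For 0x1061d, `count_flags()` gives seven, matching the seven flag names listed, and `call_is_disconnected()` is false, which is what identifies the leaked call as still connected.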
    • octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enable · b4e9b876
      Angela Czubak authored
      PF netdev can request AF to enable or disable reception and transmission
      on assigned CGX::LMAC. Instead of only enabling or disabling reception and
      transmission, the current code also enables or disables the LMAC itself.
      This patch fixes this issue.
      
      Fixes: 1435f66a ("octeontx2-af: CGX Rx/Tx enable/disable mbox handlers")
      Signed-off-by: Angela Czubak <aczubak@marvell.com>
      Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Link: https://lore.kernel.org/r/20230105160107.17638-1-hkelam@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
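
The distinction this fix draws, toggling the Rx/Tx bits without touching the LMAC enable bit, can be sketched as plain bit manipulation. The macro names and bit positions below are illustrative assumptions and not taken from the driver. `lmac_rx_tx_enable()` is a hypothetical helper operating on a register value, not the driver's cgx_lmac_rx_tx_enable().

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; the real register layout is not given above. */
#define CFG_LMAC_EN (UINT64_C(1) << 55)
#define CFG_RX_EN   (UINT64_C(1) << 54)
#define CFG_TX_EN   (UINT64_C(1) << 53)

/* Toggle only reception and transmission, leaving the LMAC enable bit alone. */
static uint64_t lmac_rx_tx_enable(uint64_t cfg, bool enable)
{
	if (enable)
		cfg |= CFG_RX_EN | CFG_TX_EN;
	else
		cfg &= ~(CFG_RX_EN | CFG_TX_EN);
	return cfg;
}
```

Whatever state the LMAC enable bit had on entry is what it has on return, so a request to stop Rx/Tx no longer takes the whole LMAC down in this model.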
  4. 06 Jan, 2023 16 commits
    • tipc: fix unexpected link reset due to discovery messages · c244c092
      Tung Nguyen authored
      This unexpected behavior is observed:
      
      node 1                    | node 2
      ------                    | ------
      link is established       | link is established
      reboot                    | link is reset
      up                        | send discovery message
      receive discovery message |
      link is established       | link is established
      send discovery message    |
                                | receive discovery message
                                | link is reset (unexpected)
                                | send reset message
      link is reset             |
      
      It is due to delayed re-discovery as described in function
      tipc_node_check_dest(): "this link endpoint has already reset
      and re-established contact with the peer, before receiving a
      discovery message from that node."
      
      However, commit 598411d7 changed the condition for calling
      tipc_node_link_down(), which had been the acceptance of a new media address.
      
      This commit fixes this by restoring the old and correct behavior.
      
      Fixes: 598411d7 ("tipc: make resetting of links non-atomic")
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Move client call connection to the I/O thread · 9d35d880
      David Howells authored
      Move the connection setup of client calls to the I/O thread so that a whole
      load of locking and barrierage can be eliminated.  This necessitates the
      app thread waiting for connection to complete before it can begin
      encrypting data.
      
      This also completes the fix for a race that exists between call connection
      and call disconnection whereby the data transmission code adds the call to
      the peer error distribution list after the call has been disconnected (say
      by the rxrpc socket getting closed).
      
      The fix is to complete the process of moving call connection, data
      transmission and call disconnection into the I/O thread and thus forcibly
      serialising them.
      
      Note that the issue may predate the overhaul to an I/O thread model that
      was included in the merge window for v6.2, but the timing is very much
      changed by the change given below.
      
      Fixes: cf37b598 ("rxrpc: Move DATA transmission into call processor work item")
      Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Move the client conn cache management to the I/O thread · 0d6bf319
      David Howells authored
      Move the management of the client connection cache to the I/O thread rather
      than managing it from the namespace as an aggregate across all the local
      endpoints within the namespace.
      
      This will allow a load of locking to be got rid of in a future patch as
      only the I/O thread will be looking at this.
      
      The downside is that the total number of cached connections on the system
      can get higher because the limit is now per-local rather than per-netns.
      We can, however, keep the number of client conns in use across the entire
      netns and use that to reduce the expiration time of idle connections.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Remove call->state_lock · 96b4059f
      David Howells authored
      All the setters of call->state are now in the I/O thread and thus the state
      lock is now unnecessary.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Move call state changes from recvmsg to I/O thread · 93368b6b
      David Howells authored
      Move the call state changes that are made in rxrpc_recvmsg() to the I/O
      thread.  This means that, thenceforth, only the I/O thread does this and
      the call state lock can be removed.
      
      This requires the Rx phase to be ended when the last packet is received,
      not when it is processed.
      
      Since this now changes the rxrpc call state to SUCCEEDED before we've
      consumed all the data from it, rxrpc_kernel_check_life() mustn't say the
      call is dead until the recvmsg queue is empty (unless the call has failed).
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
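
The liveness rule stated in the last paragraph of the message above reduces to a small predicate. The sketch below is a model of that rule only. `model_completion` and `model_check_life()` are hypothetical names, and the real rxrpc_kernel_check_life() takes different arguments that are not shown in the message.

```c
#include <stdbool.h>

/* Hypothetical completion values; only success versus failure matters here. */
enum model_completion { CALL_IN_PROGRESS, CALL_SUCCEEDED, CALL_FAILED };

/*
 * Report whether a call should still be treated as alive.  A failed call is
 * dead at once; a succeeded call stays alive until its data has been consumed.
 */
static bool model_check_life(enum model_completion completion,
			     bool recvmsg_queue_empty)
{
	switch (completion) {
	case CALL_IN_PROGRESS:
		return true;
	case CALL_SUCCEEDED:
		return !recvmsg_queue_empty;
	case CALL_FAILED:
	default:
		return false;
	}
}
```

The case that changes with this commit is the middle one: the state can now read as succeeded while received data is still queued, and the caller must not give up on the call at that point.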
    • rxrpc: Move call state changes from sendmsg to I/O thread · 2d689424
      David Howells authored
      Move all the call state changes that are made in rxrpc_sendmsg() to the I/O
      thread.  This is a step towards removing the call state lock.
      
      This requires the switch to the RXRPC_CALL_CLIENT_AWAIT_REPLY and
      RXRPC_CALL_SERVER_SEND_REPLY states to be done when the last packet is
      decanted from ->tx_sendmsg to ->tx_buffer in the I/O thread, not when it is
      added to ->tx_sendmsg by sendmsg().
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
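
The timing change described above, with the state switch happening when the last packet is decanted rather than when sendmsg() queues it, can be modelled with two counters over one array. This is a sketch under assumptions: the struct layout, the fixed queue size of 8 and the function names are hypothetical. The two state names merely echo RXRPC_CALL_CLIENT_AWAIT_REPLY and its predecessor.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_state { CLIENT_SEND_REQUEST, CLIENT_AWAIT_REPLY };

struct model_call {
	enum model_state state;
	bool tx_sendmsg_last[8]; /* "last packet" marker per queued packet */
	size_t nr_sendmsg;       /* packets queued by sendmsg() */
	size_t nr_buffer;        /* packets decanted into the Tx buffer */
};

/* App thread: queue a packet; the call state is NOT touched here. */
static bool model_sendmsg(struct model_call *call, bool last)
{
	if (call->nr_sendmsg >= 8)
		return false;
	call->tx_sendmsg_last[call->nr_sendmsg++] = last;
	return true;
}

/* I/O thread: move packets across and change state on the last one. */
static void model_decant(struct model_call *call)
{
	while (call->nr_buffer < call->nr_sendmsg) {
		bool last = call->tx_sendmsg_last[call->nr_buffer++];

		if (last)
			call->state = CLIENT_AWAIT_REPLY;
	}
}
```

After queuing a final packet the state is still CLIENT_SEND_REQUEST; it only becomes CLIENT_AWAIT_REPLY once `model_decant()` has run, so the only writer of `state` is the I/O-thread side.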
    • rxrpc: Wrap accesses to get call state to put the barrier in one place · d41b3f5b
      David Howells authored
      Wrap accesses to get the state of a call from outside of the I/O thread in
      a single place so that the barrier needed to order wrt the error code and
      abort code is in just that place.
      
      Also use a barrier when setting the call state and again when reading the
      call state such that the auxiliary completion info (error code, abort code)
      can be read without taking a read lock on the call state lock.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
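
The ordering described above, where the writer sets the error and abort code before publishing the state and readers fetch the state through one accessor before looking at them, is the usual release/acquire pairing. The sketch below uses C11 atomics as a stand-in for the kernel's barrier primitives. The struct and function names are hypothetical and the two-value state enum is a simplification.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum model_state { CALL_ACTIVE, CALL_COMPLETE };

struct model_call {
	_Atomic int state;       /* published with release, read with acquire */
	int error;               /* auxiliary completion info, written first */
	unsigned int abort_code;
};

/* Writer (the I/O thread in this model): fill in the details, then publish. */
static void call_set_completion(struct model_call *call, int error,
				unsigned int abort_code)
{
	call->error = error;
	call->abort_code = abort_code;
	atomic_store_explicit(&call->state, CALL_COMPLETE, memory_order_release);
}

/* Single accessor for readers outside the I/O thread. */
static int call_state(struct model_call *call)
{
	return atomic_load_explicit(&call->state, memory_order_acquire);
}

/* Reader: error and abort code are only valid once COMPLETE was observed. */
static bool call_get_result(struct model_call *call, int *error,
			    unsigned int *abort_code)
{
	if (call_state(call) != CALL_COMPLETE)
		return false;
	*error = call->error;
	*abort_code = call->abort_code;
	return true;
}
```

Because every outside reader goes through `call_state()`, the acquire lives in exactly one place, and a reader that sees CALL_COMPLETE is guaranteed to see the error and abort code written before it without taking a lock.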
    • rxrpc: Split out the call state changing functions into their own file · 0b9bb322
      David Howells authored
      Split out the functions that change the state of an rxrpc call into their
      own file.  The idea is to remove anything to do with changing the state
      of a call directly from the rxrpc sendmsg() and recvmsg() paths and have
      all that done in the I/O thread only, with the ultimate aim of removing the
      state lock entirely.  Moving the code out of sendmsg.c and recvmsg.c makes
      that easier to manage.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Set up a connection bundle from a call, not rxrpc_conn_parameters · 1bab27af
      David Howells authored
      Use the information now stored in struct rxrpc_call to configure the
      connection bundle and thence the connection, rather than using the
      rxrpc_conn_parameters struct.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Offload the completion of service conn security to the I/O thread · 2953d3b8
      David Howells authored
      Offload the completion of the challenge/response cycle on a service
      connection to the I/O thread.  After the RESPONSE packet has been
      successfully decrypted and verified by the work queue, offloading the
      changing of the call states to the I/O thread makes iteration over the
      conn's channel list simpler.
      
      Do this by marking the RESPONSE skbuff and putting it onto the receive
      queue for the I/O thread to collect.  We put it on the front of the queue
      as we've already received the packet for it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Make the set of connection IDs per local endpoint · f06cb291
      David Howells authored
      Make the set of connection IDs per local endpoint so that endpoints don't
      cause each other's connections to get dismissed.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Tidy up abort generation infrastructure · 57af281e
      David Howells authored
      Tidy up the abort generation infrastructure in the following ways:
      
       (1) Create an enum and string mapping table to list the reasons an abort
           might be generated in tracing.
      
       (2) Replace the 3-char string with the values from (1) in the places that
           use that to log the abort source.  This gets rid of a memcpy() in the
           tracepoint.
      
       (3) Subsume the rxrpc_rx_eproto tracepoint with the rxrpc_abort tracepoint
           and use values from (1) to indicate the trace reason.
      
       (4) Always make a call to an abort function at the point of the abort
           rather than stashing the values into variables and using goto to get
           to a place where it is reported.  The C optimiser will collapse the calls
           together as appropriate.  The abort functions return a value that can
           be returned directly if appropriate.
      
      Note that this extends into afs also at the points where that generates an
      abort.  To aid with this, the afs sources need to #define
      RXRPC_TRACE_ONLY_DEFINE_ENUMS before including the rxrpc tracing header
      because they don't have access to the rxrpc internal structures that some
      of the tracepoints make use of.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
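
Points (1), (2) and (4) above describe a common C pattern: one list that generates both an enum and its string table, plus an abort helper whose return value can be passed straight back to the caller. The sketch below uses an X-macro to show the shape of that pattern. The reason names, strings and `model_abort()` are hypothetical and are not the entries or functions in the rxrpc tracing header.

```c
#include <stddef.h>

/* Hypothetical abort reasons; the real list lives in the rxrpc tracing header. */
#define ABORT_REASONS(X)                   \
	X(abort_bad_packet,   "BAD-PKT")   \
	X(abort_call_timeout, "TIMEOUT")   \
	X(abort_shutdown,     "SHUTDOWN")

/* The enum and the string table are both generated from the one list. */
enum abort_reason {
#define X(sym, str) sym,
	ABORT_REASONS(X)
#undef X
	abort_reason__count
};

static const char *const abort_reason_names[] = {
#define X(sym, str) [sym] = str,
	ABORT_REASONS(X)
#undef X
};

struct abort_record {
	enum abort_reason why;   /* enum value logged instead of a short string */
	unsigned int abort_code;
};

/* Record the abort where it is detected and hand back the error to return. */
static int model_abort(struct abort_record *rec, enum abort_reason why,
		       unsigned int abort_code, int err)
{
	rec->why = why;
	rec->abort_code = abort_code;
	return err;
}

/* Map a reason to its display string, tolerating out-of-range values. */
static const char *abort_reason_name(enum abort_reason why)
{
	return (size_t)why < abort_reason__count ? abort_reason_names[why] : "?";
}
```

A caller can then write `return model_abort(&rec, abort_call_timeout, code, err);` at the point of failure instead of stashing values and jumping to a shared label, and logging an enum value avoids copying a string into each trace record.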
    • rxrpc: Clean up connection abort · a00ce28b
      David Howells authored
      Clean up connection abort, using the connection state_lock to gate access
      to change that state, and use an rxrpc_call_completion value to indicate
      the difference between local and remote aborts as these can be pasted
      directly into the call state.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Implement a mechanism to send an event notification to a connection · f2cce89a
      David Howells authored
      Provide a means by which an event notification can be sent to a connection
      such that the I/O thread can pick it up and handle it rather than
      doing it in a separate workqueue.
      
      This is then used to move the deferred final ACK of a call into the I/O
      thread rather than a separate work queue as part of the drive to do all
      transmission from the I/O thread.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
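
One plausible shape for such a notification mechanism is a pending-events bitmask that any thread may set and only the I/O thread clears. The sketch below is an assumption about the general technique and not the mechanism this commit adds. The event names, `model_conn` and both functions are hypothetical, and waking the I/O thread is reduced to a flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical event bits for a connection. */
enum conn_event { CONN_EV_FINAL_ACK = 1u << 0, CONN_EV_ABORT = 1u << 1 };

struct model_conn {
	atomic_uint events;     /* pending events, set from any thread */
	atomic_bool poked;      /* stands in for waking the I/O thread */
	unsigned int acks_sent; /* touched only by the I/O thread */
};

/* Any thread: flag the event and poke the I/O thread. */
static void conn_post_event(struct model_conn *conn, enum conn_event ev)
{
	atomic_fetch_or(&conn->events, (unsigned int)ev);
	atomic_store(&conn->poked, true);
}

/* I/O thread: collect all pending events at once and handle them. */
static unsigned int conn_process_events(struct model_conn *conn)
{
	unsigned int pending;

	atomic_store(&conn->poked, false);
	pending = atomic_exchange(&conn->events, 0u);
	if (pending & CONN_EV_FINAL_ACK)
		conn->acks_sent++; /* the deferred final ACK goes out here */
	return pending;
}
```

Posting the same event twice before the I/O thread runs coalesces into a single pending bit, so the deferred ACK is sent once. The transmission itself happens only on the I/O-thread side, which is the property the commit is after.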
    • rxrpc: Only disconnect calls in the I/O thread · 03fc55ad
      David Howells authored
      Only perform call disconnection in the I/O thread to reduce the locking
      requirement.
      
      This is the first part of a fix for a race that exists between call
      connection and call disconnection whereby the data transmission code adds
      the call to the peer error distribution list after the call has been
      disconnected (say by the rxrpc socket getting closed).
      
      The fix is to complete the process of moving call connection, data
      transmission and call disconnection into the I/O thread and thus forcibly
      serialising them.
      
      Note that the issue may predate the overhaul to an I/O thread model that
      was included in the merge window for v6.2, but the timing is very much
      changed by the change given below.
      
      Fixes: cf37b598 ("rxrpc: Move DATA transmission into call processor work item")
      Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Only set/transmit aborts in the I/O thread · a343b174
      David Howells authored
      Only set the abort call completion state in the I/O thread and only
      transmit ABORT packets from there.  rxrpc_abort_call() can then be made to
      actually send the packet.
      
      Further, ABORT packets should only be sent if the call has been exposed to
      the network (ie. at least one attempted DATA transmission has occurred for
      it).
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org