  1. 01 Dec, 2022 6 commits
    • rxrpc: Transmit ACKs at the point of generation · b0346843
      David Howells authored
      For ACKs generated inside the I/O thread, transmit the ACK at the point of
      generation.  Where the ACK is generated outside of the I/O thread, it's
      offloaded to the I/O thread to transmit it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Remove the _bh annotation from all the spinlocks · 3dd9c8b5
      David Howells authored
      None of the spinlocks in rxrpc need a _bh annotation now as the RCU
      callback routines no longer take spinlocks and the bulk of the packet
      wrangling code is now run in the I/O thread, not softirq context.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Make the I/O thread take over the call and local processor work · 5e6ef4f1
      David Howells authored
      Move the functions from the call->processor and local->processor work items
      into the domain of the I/O thread.
      
      The call event processor, now called from the I/O thread, then takes over
      the job of cranking the call state machine, processing incoming packets and
      transmitting DATA, ACK and ABORT packets.  In a future patch,
      rxrpc_send_ACK() will transmit the ACK on the spot rather than queuing it
      for later transmission.
      
      The call event processor becomes purely received-skb driven.  It only
      transmits things in response to events.  We use "pokes" to queue a dummy
      skb to make it do things like start/resume transmitting data.  Timer expiry
      also results in pokes.
      
      The connection event processor becomes similar, though crypto events, such
      as dealing with CHALLENGE and RESPONSE packets, are offloaded to a work item
      to avoid doing crypto in the I/O thread.
      
      The local event processor is removed and VERSION response packets are
      generated directly from the packet parser.  Similarly, ABORTs generated in
      response to protocol errors will be transmitted immediately rather than
      being pushed onto a queue for later transmission.
      
      Changes:
      ========
      ver #2)
       - Fix a couple of introduced lock context imbalances.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Move DATA transmission into call processor work item · cf37b598
      David Howells authored
      Move DATA transmission into the call processor work item.  In a future
      patch, this will be called from the I/O thread rather than being its own
      work item.
      
      This will allow DATA transmission to be driven directly by incoming ACKs,
      pokes and timers as those are processed.
      
      The Tx queue is also split: The queue of packets prepared by sendmsg is now
      placed in call->tx_sendmsg and the packet dispatcher decants the packets
      into call->tx_buffer as space becomes available in the transmission
      window.  This allows sendmsg to run ahead of the available space to try and
      prevent an underflow in transmission.
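
The split queue described above can be sketched roughly as follows. This is a userspace illustration only, not the kernel code: the struct layouts, the helper names (txq_push, txq_pop, decant_tx_queue) and the idea of measuring the window purely by a packet count are assumptions made for the example; only the names tx_sendmsg and tx_buffer come from the commit message.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct txbuf {
	struct txbuf *next;
	unsigned int seq;
};

struct txqueue {
	struct txbuf *head;
	struct txbuf *tail;
	unsigned int count;
};

struct call {
	struct txqueue tx_sendmsg;	/* Filled by sendmsg(); may run ahead */
	struct txqueue tx_buffer;	/* Packets inside the Tx window */
	unsigned int tx_winsize;	/* Assumed window size, in packets */
};

/* Append a buffer to the tail of a queue. */
static void txq_push(struct txqueue *q, struct txbuf *b)
{
	b->next = NULL;
	if (q->tail)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
	q->count++;
}

/* Remove and return the buffer at the head of a queue, or NULL if empty. */
static struct txbuf *txq_pop(struct txqueue *q)
{
	struct txbuf *b = q->head;

	if (!b)
		return NULL;
	q->head = b->next;
	if (!q->head)
		q->tail = NULL;
	b->next = NULL;
	q->count--;
	return b;
}

/* Move packets from tx_sendmsg to tx_buffer while the window has room.
 * Returns the number of packets decanted.
 */
static unsigned int decant_tx_queue(struct call *call)
{
	unsigned int moved = 0;

	while (call->tx_buffer.count < call->tx_winsize) {
		struct txbuf *b = txq_pop(&call->tx_sendmsg);

		if (!b)
			break;
		txq_push(&call->tx_buffer, b);
		moved++;
	}
	return moved;
}
```

In this sketch sendmsg would only ever call txq_push() on tx_sendmsg, while the dispatcher calls decant_tx_queue() whenever an ACK frees window space, so the producer is never blocked by the window itself.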
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: trace: Don't use __builtin_return_address for rxrpc_call tracing · cb0fc0c9
      David Howells authored
      In rxrpc tracing, use enums to generate lists of points of interest rather
      than __builtin_return_address() for the rxrpc_call tracepoint.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: trace: Don't use __builtin_return_address for rxrpc_peer tracing · 47c810a7
      David Howells authored
      In rxrpc tracing, use enums to generate lists of points of interest rather
      than __builtin_return_address() for the rxrpc_peer tracepoint.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
  2. 08 Nov, 2022 7 commits
    • rxrpc: Fix congestion management · 1fc4fa2a
      David Howells authored
      rxrpc has a problem in its congestion management in that it saves the
      congestion window size (cwnd) from one call to another, but if this is 0 at
      the time it is saved, then the next call may never manage to transmit
      anything.
      
      To this end:
      
       (1) Don't save cwnd between calls, but rather reset back down to the
           initial cwnd and re-enter slow-start if data transmission is idle for
           more than an RTT.
      
       (2) Preserve ssthresh instead, as that is a handy estimate of pipe
           capacity.  Knowing roughly when to stop slow start and enter
           congestion avoidance can reduce the tendency to overshoot and drop
           larger amounts of packets when probing.
      
      In future, cwnd growth also needs to be constrained when the window isn't
      being filled due to being application limited.
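
A minimal sketch of points (1) and (2), under stated assumptions: the struct, the function names and the INITIAL_CWND value are invented for illustration, and times are plain microsecond counters rather than kernel ktime values. It shows only the policy: carry ssthresh over between calls, never cwnd, and drop back to slow start after more than one RTT of idleness.

```c
#include <stdbool.h>
#include <stdint.h>

#define INITIAL_CWND 4	/* Assumed initial window, in packets */

enum cong_mode { SLOW_START, CONGESTION_AVOIDANCE };

/* Hypothetical per-call congestion state. */
struct call_cong {
	unsigned int cwnd;
	unsigned int ssthresh;
	enum cong_mode mode;
	uint64_t last_tx_us;	/* Time of the last DATA transmission */
};

/* Start a call: seed ssthresh from the saved estimate, but never cwnd. */
static void cong_init(struct call_cong *c, unsigned int saved_ssthresh,
		      uint64_t now_us)
{
	c->cwnd = INITIAL_CWND;
	c->ssthresh = saved_ssthresh;
	c->mode = SLOW_START;
	c->last_tx_us = now_us;
}

/* Before transmitting, fall back to slow start if transmission has been
 * idle for more than one RTT.  Returns true if the window was reset.
 */
static bool cong_check_idle(struct call_cong *c, uint64_t now_us,
			    uint64_t rtt_us)
{
	if (now_us - c->last_tx_us <= rtt_us)
		return false;
	c->cwnd = INITIAL_CWND;
	c->mode = SLOW_START;
	return true;
}
```

Because cwnd always restarts from a non-zero initial value, a call can no longer inherit a zero window, while the preserved ssthresh still tells slow start roughly where to stop.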
      Reported-by: Simon Wilkinson <sxw@auristor.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Save last ACK's SACK table rather than marking txbufs · d57a3a15
      David Howells authored
      Improve the tracking of which packets need to be transmitted by saving the
      last ACK packet that we receive that has a populated soft-ACK table rather
      than marking packets.  Then we can step through the soft-ACK table and look
      at the packets we've transmitted beyond that to determine which packets we
      might want to retransmit.
      
      We also look at the highest serial number that has been acked to try and
      guess which packets we've transmitted the peer is likely to have seen.  If
      necessary, we send a ping to retrieve that number.
      
      One downside that might be a problem is that we can't then compare the
      previous acked/unacked state so easily in rxrpc_input_soft_acks() - which
      is a potential problem for the slow-start algorithm.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Don't use a ring buffer for call Tx queue · a4ea4c47
      David Howells authored
      Change the way the Tx queueing works to make the following ends easier to
      achieve:
      
       (1) The filling of packets, the encryption of packets and the transmission
           of packets can be handled in parallel by separate threads, rather than
           rxrpc_sendmsg() allocating, filling, encrypting and transmitting each
           packet before moving onto the next one.
      
       (2) Get rid of the fixed-size ring which sets a hard limit on the number
           of packets that can be retained in the ring.  This allows the number
           of packets to increase without having to allocate a very large ring or
           having variable-sized rings.
      
           [Note: the downside of this is that it's then less efficient to locate
           a packet for retransmission as we then have to step through a list and
           examine each buffer in the list.]
      
       (3) Allow the filler/encrypter to run ahead of the transmission window.
      
       (4) Make it easier to do zero copy UDP from the packet buffers.
      
       (5) Make it easier to do zero copy from userspace to the packet buffers -
           and thence to UDP (only for unauthenticated connections).
      
      To that end, the following changes are made:
      
       (1) Use the new rxrpc_txbuf struct instead of sk_buff for keeping packets
           to be transmitted in.  This allows them to be placed on multiple
           queues simultaneously.  An sk_buff isn't really necessary as it's
           never passed on to lower-level networking code.
      
       (2) Keep the transmissible packets in a linked list on the call struct
           rather than in a ring.  As a consequence, the annotation buffer isn't
           used either; rather a flag is set on the packet to indicate ackedness.
      
       (3) Use the RXRPC_CALL_TX_LAST flag to indicate that the last packet to be
           transmitted has been queued.  Add RXRPC_CALL_TX_ALL_ACKED to indicate
           that all packets up to and including the last got hard acked.
      
       (4) Wire headers are now stored in the txbuf rather than being concocted
           on the stack and they're stored immediately before the data, thereby
           allowing zerocopy of a single span.
      
       (5) Don't bother with instant-resend on transmission failure; rather,
           leave it for a timer or an ACK packet to trigger.
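
Change (4) is easiest to see as a layout. The sketch below is not the kernel's rxrpc_txbuf definition: the field names, the payload capacity and the helper txbuf_wire_span() are assumptions, and the header is shown in host byte order for brevity. It only illustrates why placing the wire header immediately before the data lets one contiguous span be handed to UDP.

```c
#include <stddef.h>
#include <stdint.h>

#define TXBUF_DATA_MAX 1412	/* Assumed payload capacity, in bytes */

/* Simplified stand-in for the Rx wire header (host byte order here). */
struct wire_header {
	uint32_t epoch;
	uint32_t cid;
	uint32_t call_number;
	uint32_t seq;
	uint32_t serial;
	uint8_t  type;
	uint8_t  flags;
	uint8_t  user_status;
	uint8_t  security_index;
	uint16_t cksum;
	uint16_t service_id;
};

/* Hypothetical Tx buffer: list linkage and state, then header and payload. */
struct txbuf {
	struct txbuf *next;	/* Linked list on the call, not a ring slot */
	uint16_t len;		/* Bytes of payload held in data[] */
	uint8_t  acked;		/* Per-packet flag replacing the annotation buffer */
	struct wire_header hdr;
	uint8_t  data[TXBUF_DATA_MAX];
};

/* The payload must start exactly where the header ends. */
_Static_assert(offsetof(struct txbuf, data) ==
	       offsetof(struct txbuf, hdr) + sizeof(struct wire_header),
	       "header and data must be contiguous");

/* Return the single span (start pointer and length) covering header + data. */
static size_t txbuf_wire_span(const struct txbuf *b, const uint8_t **start)
{
	*start = (const uint8_t *)&b->hdr;
	return sizeof(b->hdr) + b->len;
}
```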
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Clean up ACK handling · 530403d9
      David Howells authored
      Clean up the rxrpc_propose_ACK() function.  If deferred PING ACK proposal
      is split out, it's only really needed for deferred DELAY ACKs.  All other
      ACKs, bar the terminal IDLE ACK, are sent immediately.  The deferred IDLE ACK
      submission can be handled by conversion of a DELAY ACK into an IDLE ACK if
      there's nothing to be SACK'd.
      
      Also, because there's a delay between an ACK being generated and being
      transmitted, it's possible that other ACKs of the same type will be
      generated during that interval.  Apart from the ACK time and the serial
      number responded to, most of the ACK body, including window and SACK
      parameters, are not filled out till the point of transmission - so we can
      avoid generating a new ACK if there's one pending that will cover the SACK
      data we need to convey.
      
      Therefore, don't propose a new DELAY or IDLE ACK for a call if there's one
      already pending.
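
The "don't propose a second pending ACK" rule can be sketched like this. The enum, the struct and both function names are hypothetical; the real rxrpc_propose_ACK() has a different signature and more state. The sketch relies on the point made above: since window and SACK data are filled in at transmission time, an already-pending DELAY or IDLE ACK will carry whatever a new one would have.

```c
#include <stdbool.h>

/* Hypothetical subset of ACK reasons. */
enum ack_reason { ACK_DELAY, ACK_IDLE, ACK_PING, ACK_REQUESTED };

/* Hypothetical per-call ACK bookkeeping. */
struct call_acks {
	bool delay_ack_pending;
	bool idle_ack_pending;
	unsigned int acks_queued;
};

/* Propose an ACK.  Returns true if a new ACK record was queued, false if
 * an equivalent DELAY or IDLE ACK was already pending.
 */
static bool propose_ack(struct call_acks *call, enum ack_reason why)
{
	switch (why) {
	case ACK_DELAY:
		if (call->delay_ack_pending)
			return false;
		call->delay_ack_pending = true;
		break;
	case ACK_IDLE:
		if (call->idle_ack_pending)
			return false;
		call->idle_ack_pending = true;
		break;
	default:
		break;	/* Other ACK types are always queued */
	}
	call->acks_queued++;
	return true;
}

/* Called once an ACK has gone out; allows the next proposal of that type. */
static void ack_transmitted(struct call_acks *call, enum ack_reason why)
{
	if (why == ACK_DELAY)
		call->delay_ack_pending = false;
	else if (why == ACK_IDLE)
		call->idle_ack_pending = false;
}
```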
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Allocate ACK records at proposal and queue for transmission · 72f0c6fb
      David Howells authored
      Allocate rxrpc_txbuf records for ACKs and put them onto a queue for the
      transmitter thread to dispatch.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Remove the flags from the rxrpc_skb tracepoint · 27f699cc
      David Howells authored
      Remove the flags from the rxrpc_skb tracepoint as we're no longer going to
      be using this for the transmission buffers and so marking which are
      transmission buffers isn't going to be necessary.
      
      Note that this also removes the rxrpc skb flag that indicates if this is a
      transmission buffer and so the count is not updated for the moment.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Add stats procfile and DATA packet stats · b0154246
      David Howells authored
      Add a procfile, /proc/net/rxrpc/stats, to display some statistics about
      what rxrpc has been doing.  Writing a blank line to the stats file will
      clear the increment-only counters.  Allocated resource counters don't get
      cleared.
      
      Add some counters to count various things about DATA packets, including the
      number created, transmitted and retransmitted and the number received, the
      number of ACK-request markings and the number of jumbo packets received.
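
As a small usage sketch, clearing the increment-only counters from a C program amounts to writing a single newline to the file. The helper name clear_rxrpc_stats() is invented here, and the path is passed in as a parameter so the function can be tried against an ordinary file; on a system with rxrpc loaded the caller would pass /proc/net/rxrpc/stats and would need permission to write to it.

```c
#include <stdio.h>

/* Write a blank line to the given stats file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int clear_rxrpc_stats(const char *path)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputc('\n', f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```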
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
  3. 25 Aug, 2022 1 commit
    • rxrpc: Fix locking in rxrpc's sendmsg · b0f571ec
      David Howells authored
      Fix three bugs in rxrpc's sendmsg implementation:
      
       (1) rxrpc_new_client_call() should release the socket lock when returning
           an error from rxrpc_get_call_slot().
      
       (2) rxrpc_wait_for_tx_window_intr() will return without the call mutex
           held in the event that we're interrupted by a signal whilst waiting
           for tx space on the socket or relocking the call mutex afterwards.
      
           Fix this by: (a) moving the unlock/lock of the call mutex up to
           rxrpc_send_data() such that the lock is not held around all of
           rxrpc_wait_for_tx_window*() and (b) indicating to higher callers
           whether we're returning with the lock dropped.  Note that this means
           recvmsg() will not block on this call whilst we're waiting.
      
       (3) After dropping and regaining the call mutex, rxrpc_send_data() needs
           to go and recheck the state of the tx_pending buffer and the
           tx_total_len check in case we raced with another sendmsg() on the same
           call.
      
      Thinking on this some more, it might make sense to have different locks for
      sendmsg() and recvmsg().  There's probably no need to make recvmsg() wait
      for sendmsg().  It does mean that recvmsg() can return MSG_EOR indicating
      that a call is dead before a sendmsg() to that call returns - but that can
      currently happen anyway.
      
      Without fix (2), something like the following can be induced:
      
      	WARNING: bad unlock balance detected!
      	5.16.0-rc6-syzkaller #0 Not tainted
      	-------------------------------------
      	syz-executor011/3597 is trying to release lock (&call->user_mutex) at:
      	[<ffffffff885163a3>] rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
      	but there are no more locks to release!
      
      	other info that might help us debug this:
      	no locks held by syz-executor011/3597.
      	...
      	Call Trace:
      	 <TASK>
      	 __dump_stack lib/dump_stack.c:88 [inline]
      	 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
      	 print_unlock_imbalance_bug include/trace/events/lock.h:58 [inline]
      	 __lock_release kernel/locking/lockdep.c:5306 [inline]
      	 lock_release.cold+0x49/0x4e kernel/locking/lockdep.c:5657
      	 __mutex_unlock_slowpath+0x99/0x5e0 kernel/locking/mutex.c:900
      	 rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
      	 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:561
      	 sock_sendmsg_nosec net/socket.c:704 [inline]
      	 sock_sendmsg+0xcf/0x120 net/socket.c:724
      	 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
      	 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
      	 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
      	 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      	 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      [Thanks to Hawkins Jiawei and Khalid Masum for their attempts to fix this]
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Reported-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
      cc: Hawkins Jiawei <yin31149@gmail.com>
      cc: Khalid Masum <khalid.masum.92@gmail.com>
      cc: Dan Carpenter <dan.carpenter@oracle.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/166135894583.600315.7170979436768124075.stgit@warthog.procyon.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  4. 22 May, 2022 1 commit
    • rxrpc: Return an error to sendmsg if call failed · 4ba68c51
      David Howells authored
      If at the end of rxrpc sendmsg() or rxrpc_kernel_send_data() the call that
      was being given data was aborted remotely or otherwise failed, return an
      error rather than returning the amount of data buffered for transmission.
      
      The call (presumably) did not complete, so there's not much point
      continuing with it.  AF_RXRPC considers it "complete" and so will be
      unwilling to do anything else with it - and won't send a notification for
      it, deeming the return from sendmsg sufficient.
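
The rule described above, sketched with hypothetical types: if the call has completed with an error by the time sendmsg finishes, the error wins over the byte count. The struct fields and the function name are assumptions for the example, not the kernel's definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the call state relevant to the return value. */
struct call_result {
	bool completed;	/* AF_RXRPC considers the call complete */
	int error;	/* Negative errno if it failed, 0 otherwise */
};

/* Decide what sendmsg should return after buffering `copied` bytes. */
static long sendmsg_result(const struct call_result *call, size_t copied)
{
	if (call->completed && call->error < 0)
		return call->error;	/* e.g. aborted remotely */
	return (long)copied;
}
```

With this behaviour a caller such as afs sees the failure directly from the send path instead of waiting for a notification that will not arrive.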
      
      Not returning an error causes afs to incorrectly handle a StoreData
      operation that gets interrupted by a change of address due to NAT
      reconfiguration.
      
      This doesn't normally affect most operations since their request parameters
      tend to fit into a single UDP packet and afs_make_call() returns before the
      server responds; StoreData is different as it involves transmission of a
      lot of data.
      
      This can be triggered on a client by doing something like:
      
      	dd if=/dev/zero of=/afs/example.com/foo bs=1M count=512
      
      at one prompt, and then changing the network address at another prompt,
      e.g.:
      
      	ifconfig enp6s0 inet 192.168.6.2 && route add 192.168.6.1 dev enp6s0
      
      Tracing packets on an Auristor fileserver looks something like:
      
      192.168.6.1 -> 192.168.6.3  RX 107 ACK Idle  Seq: 0  Call: 4  Source Port: 7000  Destination Port: 7001
      192.168.6.3 -> 192.168.6.1  AFS (RX) 1482 FS Request: Unknown(64538) (64538)
      192.168.6.3 -> 192.168.6.1  AFS (RX) 1482 FS Request: Unknown(64538) (64538)
      192.168.6.1 -> 192.168.6.3  RX 107 ACK Idle  Seq: 0  Call: 4  Source Port: 7000  Destination Port: 7001
      <ARP exchange for 192.168.6.2>
      192.168.6.2 -> 192.168.6.1  AFS (RX) 1482 FS Request: Unknown(0) (0)
      192.168.6.2 -> 192.168.6.1  AFS (RX) 1482 FS Request: Unknown(0) (0)
      192.168.6.1 -> 192.168.6.2  RX 107 ACK Exceeds Window  Seq: 0  Call: 4  Source Port: 7000  Destination Port: 7001
      192.168.6.1 -> 192.168.6.2  RX 74 ABORT  Seq: 0  Call: 4  Source Port: 7000  Destination Port: 7001
      192.168.6.1 -> 192.168.6.2  RX 74 ABORT  Seq: 29321  Call: 4  Source Port: 7000  Destination Port: 7001
      
      The Auristor fileserver logs code -453 (RXGEN_SS_UNMARSHAL), but the abort
      code received by kafs is -5 (RX_PROTOCOL_ERROR) as the rx layer sees the
      condition and generates an abort first and the unmarshal error is a
      consequence of that at the application layer.
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-afs@lists.infradead.org
      Link: http://lists.infradead.org/pipermail/linux-afs/2021-December/004810.html # v1
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 23 Nov, 2020 2 commits
  6. 05 Oct, 2020 1 commit
    • rxrpc: Fix accept on a connection that need securing · 2d914c1b
      David Howells authored
      When a new incoming call arrives at a userspace rxrpc socket on a new
      connection that has a security class set, the code currently pushes it onto
      the accept queue to hold a ref on it for the socket.  This doesn't work,
      however, as recvmsg() pops it off, notices that it's in the SERVER_SECURING
      state and discards the ref.  This means that the call runs out of refs too
      early and the kernel oopses.
      
      By contrast, a kernel rxrpc socket manually pre-charges the incoming call
      pool with calls that already have user call IDs assigned, so they are ref'd
      by the call tree on the socket.
      
      Change the mode of operation for userspace rxrpc server sockets to work
      like this too.  Although this is a UAPI change, server sockets aren't
      currently functional.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
  7. 23 Aug, 2020 1 commit
  8. 30 Jul, 2020 1 commit
    • rxrpc: Fix race between recvmsg and sendmsg on immediate call failure · 65550098
      David Howells authored
      There's a race between rxrpc_sendmsg setting up a call, but then failing to
      send anything on it due to an error, and recvmsg() seeing the call
      completion occur and trying to return the state to the user.
      
      An assertion fails in rxrpc_recvmsg() because the call has already been
      released from the socket and is about to be released again as recvmsg deals
      with it.  (The recvmsg_q queue on the socket holds a ref, so there's no
      problem with use-after-free.)
      
      We also have to be careful not to end up reporting an error twice, in such
      a way that both returns indicate to userspace that the user ID supplied
      with the call is no longer in use - which could cause the client to
      malfunction if it recycles the user ID fast enough.
      
      Fix this by the following means:
      
       (1) When sendmsg() creates a call after the point that the call has been
           successfully added to the socket, don't return any errors through
           sendmsg(), but rather complete the call and let recvmsg() retrieve
           them.  Make sendmsg() return 0 at this point.  Further calls to
           sendmsg() for that call will fail with ESHUTDOWN.
      
           Note that at this point, we haven't sent any packets yet, so the
           server doesn't yet know about the call.
      
       (2) If sendmsg() returns an error when it was expected to create a new
           call, it means that the user ID wasn't used.
      
       (3) Mark the call disconnected before marking it completed to prevent an
           oops in rxrpc_release_call().
      
       (4) recvmsg() will then retrieve the error and set MSG_EOR to indicate
           that the user ID is no longer known by the kernel.
      
      An oops like the following is produced:
      
      	kernel BUG at net/rxrpc/recvmsg.c:605!
      	...
      	RIP: 0010:rxrpc_recvmsg+0x256/0x5ae
      	...
      	Call Trace:
      	 ? __init_waitqueue_head+0x2f/0x2f
      	 ____sys_recvmsg+0x8a/0x148
      	 ? import_iovec+0x69/0x9c
      	 ? copy_msghdr_from_user+0x5c/0x86
      	 ___sys_recvmsg+0x72/0xaa
      	 ? __fget_files+0x22/0x57
      	 ? __fget_light+0x46/0x51
      	 ? fdget+0x9/0x1b
      	 do_recvmmsg+0x15e/0x232
      	 ? _raw_spin_unlock+0xa/0xb
      	 ? vtime_delta+0xf/0x25
      	 __x64_sys_recvmmsg+0x2c/0x2f
      	 do_syscall_64+0x4c/0x78
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 357f5ef6 ("rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()")
      Reported-by: syzbot+b54969381df354936d96@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 21 Jul, 2020 1 commit
  10. 05 Jun, 2020 2 commits
    • rxrpc: Fix missing notification · 5ac0d622
      David Howells authored
      Under some circumstances, rxrpc will fail to transmit a packet through the
      underlying UDP socket (ie. UDP sendmsg returns an error).  This may result
      in a call getting stuck.
      
      In the instance being seen, where AFS tries to send a probe to the Volume
      Location server, tracepoints show the UDP Tx failure (in this case returning
      error 99 EADDRNOTAVAIL) and then nothing more:
      
       afs_make_vl_call: c=0000015d VL.GetCapabilities
       rxrpc_call: c=0000015d NWc u=1 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000dd89ee8a
       rxrpc_call: c=0000015d Gus u=2 sp=rxrpc_new_client_call+0x14f/0x580 [rxrpc] a=00000000e20e4b08
       rxrpc_call: c=0000015d SEE u=2 sp=rxrpc_activate_one_channel+0x7b/0x1c0 [rxrpc] a=00000000e20e4b08
       rxrpc_call: c=0000015d CON u=2 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000e20e4b08
       rxrpc_tx_fail: c=0000015d r=1 ret=-99 CallDataNofrag
      
      The problem is that if the initial packet fails and the retransmission
      timer hasn't been started, the call is set to completed and an error is
      returned from rxrpc_send_data_packet() to rxrpc_queue_packet().  Though
      rxrpc_instant_resend() is called, this does nothing because the call is
      marked completed.
      
      So rxrpc_notify_socket() isn't called and the error is passed back up to
      rxrpc_send_data(), rxrpc_kernel_send_data() and thence to afs_make_call()
      and afs_vl_get_capabilities() where it is simply ignored because it is
      assumed that the result of a probe will be collected asynchronously.
      
      Fileserver probing is similarly affected via afs_fs_get_capabilities().
      
      Fix this by always issuing a notification in __rxrpc_set_call_completion()
      if it shifts a call to the completed state, even if an error is also
      returned to the caller through the function return value.
      
      Also put in a little bit of optimisation to avoid taking the call
      state_lock and disabling softirqs if the call is already in the completed
      state and remove some now redundant rxrpc_notify_socket() calls.
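
A rough userspace sketch of the fix: the completion helper itself issues the notification whenever it is the one that moves the call to the completed state, and it returns early without side effects if the call is already complete. All names are hypothetical, and locking is omitted; the kernel's __rxrpc_set_call_completion() runs under the call's state lock.

```c
#include <stdbool.h>

enum call_state { CALL_ACTIVE, CALL_COMPLETE };

/* Hypothetical call record; `notifications` stands in for socket wakeups. */
struct call {
	enum call_state state;
	int error;
	unsigned int notifications;
};

static void notify_socket(struct call *call)
{
	call->notifications++;
}

/* Move the call to the completed state.  Returns true if this invocation
 * performed the transition, in which case the socket has been notified,
 * regardless of whether the caller also returns the error upward.
 */
static bool set_call_completion(struct call *call, int error)
{
	if (call->state == CALL_COMPLETE)
		return false;	/* Already complete: nothing to do */
	call->state = CALL_COMPLETE;
	call->error = error;
	notify_socket(call);
	return true;
}
```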
      
      Fixes: f5c17aae ("rxrpc: Calls should only have one terminal state")
      Reported-by: Gerry Seidman <gerry@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
    • rxrpc: Move the call completion handling out of line · 3067bf8c
      David Howells authored
      Move the handling of call completion out of line so that the next patch can
      add more code in that area.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
  11. 11 May, 2020 1 commit
    • rxrpc: Fix the excessive initial retransmission timeout · c410bf01
      David Howells authored
      rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
      sufficiently sampled.  This can cause problems with some fileservers with
      calls to the cache manager in the afs filesystem being dropped from the
      fileserver because a packet goes missing and the retransmission timeout is
      greater than the call expiry timeout.
      
      Fix this by:
      
       (1) Copying the RTT/RTO calculation code from Linux's TCP implementation
           and altering it to fit rxrpc.
      
       (2) Altering the various users of the RTT to make use of the new SRTT
           value.
      
       (3) Replacing the use of rxrpc_resend_timeout to use the calculated RTO
           value instead (which is needed in jiffies), along with a backoff.
      
      Notes:
      
       (1) rxrpc provides RTT samples by matching the serial numbers on outgoing
           DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
           against the reference serial number in incoming REQUESTED ACK and
           PING-RESPONSE ACK packets.
      
       (2) Each packet that is transmitted on an rxrpc connection gets a new
           per-connection serial number, even for retransmissions, so an ACK can
           be cross-referenced to a specific trigger packet.  This allows RTT
           information to be drawn from retransmitted DATA packets also.
      
       (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
           on an rxrpc_call because many RPC calls won't live long enough to
           generate more than one sample.
      
       (4) The calculated SRTT value is in units of 8ths of a microsecond rather
           than nanoseconds.
      
      The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
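
To make note (4) concrete, here is a simplified version of the classic smoothed-RTT estimator with the SRTT kept in units of 1/8 microsecond. It is not the code copied from TCP, which carries additional state for the deviation; the struct, the function names and the choice RTO = SRTT + 4 * mean deviation are assumptions for this sketch, and the RTO is left in microseconds rather than converted to jiffies.

```c
#include <stdint.h>

/* Hypothetical per-peer RTT state. */
struct peer_rtt {
	int64_t srtt_us8;	/* Smoothed RTT in 1/8 us units (0 = no sample) */
	int64_t mdev_us4;	/* Mean deviation in 1/4 us units */
};

/* Fold one RTT sample (in microseconds) into the estimate. */
static void rtt_add_sample(struct peer_rtt *p, int64_t sample_us)
{
	int64_t err;

	if (p->srtt_us8 == 0) {
		/* First sample: srtt = sample, mdev = sample / 2 */
		p->srtt_us8 = sample_us * 8;
		p->mdev_us4 = sample_us * 2;
		return;
	}
	err = sample_us - p->srtt_us8 / 8;
	p->srtt_us8 += err;		/* srtt = 7/8 srtt + 1/8 sample */
	if (err < 0)
		err = -err;
	p->mdev_us4 += err - p->mdev_us4 / 4;	/* mdev = 3/4 mdev + 1/4 |err| */
}

/* Retransmission timeout in microseconds: srtt + 4 * mdev. */
static int64_t rtt_rto_us(const struct peer_rtt *p)
{
	return p->srtt_us8 / 8 + p->mdev_us4;
}
```

Keeping the scaled values avoids losing the fractional part of the 1/8 and 1/4 gains on each update, which is why the displayed SRTT has to be shifted down by 3 to read it in microseconds.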
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
  12. 13 Mar, 2020 3 commits
    • rxrpc: Fix sendmsg(MSG_WAITALL) handling · 498b5776
      David Howells authored
      Fix the handling of sendmsg() with MSG_WAITALL for userspace to round the
      timeout for when a signal occurs up to at least two jiffies as a 1 jiffy
      timeout may end up being effectively 0 if jiffies wraps at the wrong time.
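
The rounding itself is a one-line clamp. The function name below is invented, and how the input timeout is derived is outside this sketch.

```c
/* Clamp a wait timeout, in jiffies, to a minimum of two so that a tick
 * boundary landing just after the wait starts cannot reduce it to nothing.
 */
static long clamp_signal_timeout(long timeout_jiffies)
{
	return timeout_jiffies < 2 ? 2 : timeout_jiffies;
}
```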
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix call interruptibility handling · e138aa7d
      David Howells authored
      Fix the interruptibility of kernel-initiated client calls so that they're
      either only interruptible when they're waiting for a call slot to come
      available or they're not interruptible at all.  Either way, they're not
      interruptible during transmission.
      
      This should help prevent StoreData calls from being interrupted when
      writeback is in progress.  It doesn't, however, handle interruption during
      the receive phase.
      
      Userspace-initiated calls are still interruptible.  After the signal has
      been handled, sendmsg() will return the amount of data copied out of the
      buffer and userspace can perform another sendmsg() call to continue
      transmission.
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Abstract out the calculation of whether there's Tx space · 158fe666
      David Howells authored
      Abstract out the calculation of there being sufficient Tx buffer space.
      This is reproduced several times in the rxrpc sendmsg code.
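
One plausible shape for such a helper is shown below. The field names and the exact formula are assumptions rather than a copy of the kernel function: the sketch takes the usable window to be the smaller of the advertised window and the congestion window (plus any extra allowance), and reports space when fewer packets than that are outstanding.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of call state used by the check. */
struct call_tx {
	unsigned int tx_winsize;	/* Window advertised by the peer */
	unsigned int cong_cwnd;		/* Congestion window */
	unsigned int cong_extra;	/* Extra allowance on top of cwnd */
	unsigned int tx_hard_ack;	/* Highest hard-acked sequence number */
	unsigned int tx_top;		/* Highest queued sequence number */
};

/* Return true if another packet may be queued.  If `window` is not NULL,
 * the usable window size is stored through it.
 */
static bool check_tx_space(const struct call_tx *call, unsigned int *window)
{
	unsigned int win = call->cong_cwnd + call->cong_extra;

	if (win > call->tx_winsize)
		win = call->tx_winsize;
	if (window)
		*window = win;
	return call->tx_top - call->tx_hard_ack < win;
}
```

Having a single helper means each wait loop in the sendmsg path asks the same question in the same way instead of repeating the arithmetic.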
      Signed-off-by: David Howells <dhowells@redhat.com>
  13. 07 Oct, 2019 2 commits
    • rxrpc: Fix call crypto state cleanup · 91fcfbe8
      David Howells authored
      Fix the cleanup of the crypto state on a call after the call has been
      disconnected.  As the call has been disconnected, its connection ref has
      been discarded and so we can't go through that to get to the security ops
      table.
      
      Fix this by caching the security ops pointer in the rxrpc_call struct and
      using that when freeing the call security state.  Also use this in other
      places we're dealing with call-specific security.
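
The caching approach can be sketched as follows, using invented types and function names. The point is only that the call keeps its own pointer to the security ops table, taken while the connection is still attached, so cleanup does not have to go through a connection pointer that has since been dropped.

```c
#include <stddef.h>

struct call;

/* Hypothetical security ops table. */
struct security_ops {
	void (*free_call_crypto)(struct call *call);
};

struct connection {
	const struct security_ops *security;
};

struct call {
	struct connection *conn;		/* Dropped at disconnection */
	const struct security_ops *security;	/* Cached; outlives conn */
	void *crypto_state;
};

/* Attach the call to a connection and cache the ops pointer. */
static void call_connected(struct call *call, struct connection *conn)
{
	call->conn = conn;
	call->security = conn->security;
}

/* Disconnect: the connection ref is gone, the cached ops pointer stays. */
static void call_disconnected(struct call *call)
{
	call->conn = NULL;
}

/* Free per-call crypto state without touching call->conn. */
static void call_free_crypto(struct call *call)
{
	if (call->security && call->security->free_call_crypto)
		call->security->free_call_crypto(call);
}

/* Example ops implementation used to exercise the sketch. */
static int demo_freed_count;

static void demo_free_call_crypto(struct call *call)
{
	call->crypto_state = NULL;
	demo_freed_count++;
}

static const struct security_ops demo_security = {
	.free_call_crypto = demo_free_call_crypto,
};
```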
      
      The symptoms look like:
      
          BUG: KASAN: use-after-free in rxrpc_release_call+0xb2d/0xb60
          net/rxrpc/call_object.c:481
          Read of size 8 at addr ffff888062ffeb50 by task syz-executor.5/4764
      
      Fixes: 1db88c53 ("rxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto")
      Reported-by: syzbot+eed305768ece6682bb7f@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      91fcfbe8
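
      The caching approach described above can be sketched as follows.  This
      is an illustration only: the struct layouts, the function names and the
      demo ops table are all hypothetical stand-ins, not the kernel's
      definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the security ops table and its users. */
struct security_ops {
	void (*free_call_crypto)(void *crypto_state);
};

struct connection {
	const struct security_ops *security;
};

struct call {
	struct connection *conn;              /* NULL once disconnected */
	const struct security_ops *security;  /* cached copy, outlives conn */
	void *crypto_state;
};

/* Demo ops table that just counts how often cleanup is invoked. */
static int freed_count;
static void demo_free(void *state) { (void)state; freed_count++; }
static const struct security_ops demo_ops = { .free_call_crypto = demo_free };

/* Cache the ops pointer while the connection is still attached. */
static void call_attach(struct call *call, struct connection *conn)
{
	call->conn = conn;
	call->security = conn->security;
}

/* Drop the connection; the cached ops pointer remains usable. */
static void call_disconnect(struct call *call)
{
	call->conn = NULL;
}

/* Clean up through the cached pointer, never through call->conn. */
static void call_free_crypto(struct call *call)
{
	if (call->security && call->crypto_state)
		call->security->free_call_crypto(call->crypto_state);
	call->crypto_state = NULL;
}
```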
    • rxrpc: Fix call ref leak · c48fc11b
      David Howells authored
      When sendmsg() finds a call to continue on with, if the call is in an
      inappropriate state, it doesn't release the ref it just got on that call
      before returning an error.
      
      This causes the following symptom to show up with kasan:
      
      	BUG: KASAN: use-after-free in rxrpc_send_keepalive+0x8a2/0x940
      	net/rxrpc/output.c:635
      	Read of size 8 at addr ffff888064219698 by task kworker/0:3/11077
      
      where line 635 is:
      
      	whdr.epoch	= htonl(peer->local->rxnet->epoch);
      
      The local endpoint (which cannot be pinned by the call) has been released,
      but not the peer (which is pinned by the call).
      
      Fix this by releasing the call in the error path.
      
      Fixes: 37411cad ("rxrpc: Fix potential NULL-pointer exception")
      Reported-by: syzbot+d850c266e3df14da1d31@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      c48fc11b
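
      The shape of the fix can be sketched with a simplified refcounted
      object.  The types, function names and the -EINVAL error code here are
      hypothetical and chosen only for illustration; the real code differs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical refcounted call object. */
struct call {
	int refs;
	bool usable;   /* false if the call is in an inappropriate state */
};

static struct call *call_get(struct call *call)
{
	call->refs++;
	return call;
}

static void call_put(struct call *call)
{
	call->refs--;
}

/* Take a ref on the call and make sure every exit path drops it again. */
static int send_on_call(struct call *found)
{
	struct call *call = call_get(found);
	int ret = 0;

	if (!call->usable) {
		ret = -EINVAL;
		/* The bug described above returned here without the put. */
		goto out_put;
	}

	/* ... queue data for transmission ... */

out_put:
	call_put(call);
	return ret;
}
```

      Routing the error through a single exit label keeps the get and the put
      paired, so the ref count is unchanged whether the send succeeds or
      fails.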
  14. 27 Aug, 2019 2 commits
  15. 30 Jul, 2019 1 commit
    • rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet · c69565ee
      David Howells authored
      Fix the fact that a notification isn't sent to the recvmsg side to indicate
      a call failed when sendmsg() fails to transmit a DATA packet with the error
      ENETUNREACH, EHOSTUNREACH or ECONNREFUSED.
      
      Without this notification, the afs client just sits there waiting for the
      call to complete in some manner (which it's not now going to do), which
      also pins the rxrpc call in place.
      
      This can be seen if the client has a scope-level IPv6 address, but not a
      global-level IPv6 address, and we try to transmit an operation to a
      server's IPv6 address.
      
      Looking in /proc/net/rxrpc/calls shows completed calls just sat there with
      an abort code of RX_USER_ABORT and an error code of -ENETUNREACH.
      
      Fixes: c54e43d7 ("rxrpc: Fix missing start of call timeout")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      c69565ee
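
      As a rough sketch of the behaviour described, a transmit path could
      classify the three errors named above and flag the call so that the
      receive side is woken.  The struct and function names are hypothetical;
      only the errno values come from the commit message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical call state as seen by the receive side. */
struct call {
	bool completed;
	int error;
	bool notified;   /* set when the recvmsg side has been told */
};

/* The three errors listed in the commit message. */
static bool is_unreachable_error(int err)
{
	return err == -ENETUNREACH || err == -EHOSTUNREACH ||
	       err == -ECONNREFUSED;
}

/* Complete the call and notify the receiver if the send failed that way. */
static void handle_send_result(struct call *call, int err)
{
	if (!is_unreachable_error(err))
		return;
	call->completed = true;
	call->error = err;
	call->notified = true;
}
```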
  16. 24 May, 2019 1 commit
  17. 16 May, 2019 1 commit
    • rxrpc: Allow the kernel to mark a call as being non-interruptible · b960a34b
      David Howells authored
      Allow kernel services using AF_RXRPC to indicate that a call should be
      non-interruptible.  This allows kafs to make things like lock-extension and
      writeback data storage calls non-interruptible.
      
      If this is set, signals will be ignored for operations on that call where
      possible - such as waiting to get a call channel on an rxrpc connection.
      
      It doesn't prevent UDP sendmsg from being interrupted, but that will be
      handled by packet retransmission.
      
      rxrpc_kernel_recv_data() isn't affected by this since that never waits,
      preferring instead to return -EAGAIN and leave the waiting to the caller.
      
      Userspace initiated calls can't be set to be uninterruptible at this time.
      Signed-off-by: David Howells <dhowells@redhat.com>
      b960a34b
  18. 12 Apr, 2019 1 commit
  19. 16 Jan, 2019 1 commit
    • Revert "rxrpc: Allow failed client calls to be retried" · e122d845
      David Howells authored
      The changes introduced to allow rxrpc calls to be retried create an issue
      when it comes to refcounting afs_call structs.  The problem is that when
      rxrpc_send_data() queues the last packet for an asynchronous call, the
      following sequence can occur:
      
       (1) The notify_end_tx callback is invoked which causes the state in the
           afs_call to be changed from AFS_CALL_CL_REQUESTING or
           AFS_CALL_SV_REPLYING.
      
       (2) afs_deliver_to_call() can then process event notifications from rxrpc
           on the async_work queue.
      
       (3) Delivery of events, such as an abort from the server, can cause the
           afs_call state to be changed to AFS_CALL_COMPLETE on async_work.
      
       (4) For an asynchronous call, afs_process_async_call() notes that the call
           is complete and tries to clean up all the refs on async_work.
      
       (5) rxrpc_send_data() might return the amount of data transferred
           (success) or an error - which could in turn reflect a local error or a
           received error.
      
      Synchronising the clean up after rxrpc_kernel_send_data() returns an error
      with the asynchronous cleanup is then tricky to get right.
      
      Mostly revert commit c038a58c.  The two API
      functions the original commit added aren't currently used.  This makes
      rxrpc_kernel_send_data() always return successfully if it queued the data
      it was given.
      
      Note that this doesn't affect synchronous calls since their Rx notification
      function merely pokes a wait queue and does no refcounting.  The
      asynchronous call notification function *has* to do refcounting and pass a
      ref over the work item to avoid the need to sync the workqueue in call
      cleanup.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e122d845
  20. 10 May, 2018 1 commit
    • rxrpc: Fix missing start of call timeout · c54e43d7
      David Howells authored
      The expect_rx_by call timeout is supposed to be set when a call is started
      to indicate that we need to receive a packet by that point.  This is
      currently put back every time we receive a packet, but it isn't started
      when we first send a packet.  Without this, the call may wait forever if
      the server doesn't deign to reply.
      
      Fix this by setting the timeout upon a successful UDP sendmsg call for the
      first DATA packet.  The timeout is initiated only for initial transmission
      and not for subsequent retries as we don't want the retry mechanism to
      extend the timeout indefinitely.
      
      Fixes: a158bdd3 ("rxrpc: Fix call timeouts")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      c54e43d7
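
      The rule described above (arm the timeout on the first successful
      transmission, never on a retry) can be sketched as below.  Only the
      expect_rx_by name comes from the commit message; the struct, the
      function and its parameters are hypothetical, and a value of 0 is
      assumed to mean "not armed".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical call with a single receive deadline. */
struct call {
	uint64_t expect_rx_by;   /* deadline in ticks; 0 means not armed */
};

/*
 * Called after attempting to send a DATA packet.  The deadline is armed
 * only by a successful initial transmission, so retries cannot keep
 * pushing it further into the future.
 */
static void after_data_send(struct call *call, int send_ret,
			    bool retransmission, uint64_t now,
			    uint64_t timeout)
{
	if (send_ret < 0 || retransmission)
		return;
	if (call->expect_rx_by == 0)
		call->expect_rx_by = now + timeout;
}
```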
  21. 30 Mar, 2018 3 commits
    • rxrpc: Fix leak of rxrpc_peer objects · 17226f12
      David Howells authored
      When a new client call is requested, an rxrpc_conn_parameters struct object
      is passed in with a bunch of parameters set, such as the local endpoint to
      use.  A pointer to the target peer record is also placed in there by
      rxrpc_get_client_conn() - and this is removed if and only if a new
      connection object is allocated.  Thus it leaks if a new connection object
      isn't allocated.
      
      Fix this by putting any peer object attached to the rxrpc_conn_parameters
      object in the function that allocated it.
      
      Fixes: 19ffa01c ("rxrpc: Use structs to hold connection params and protocol info")
      Signed-off-by: David Howells <dhowells@redhat.com>
      17226f12
    • rxrpc: Fix checker warnings and errors · 88f2a825
      David Howells authored
      Fix various issues detected by checker.
      
      Errors:
      
       (*) rxrpc_discard_prealloc() should be using rcu_assign_pointer to set
           call->socket.
      
      Warnings:
      
       (*) rxrpc_service_connection_reaper() should be passing NULL rather than 0 to
           trace_rxrpc_conn() as the where argument.
      
       (*) rxrpc_disconnect_client_call() should get its net pointer via the
           call->conn rather than call->sock to avoid a warning about accessing
           an RCU pointer without protection.
      
       (*) Proc seq start/stop functions need annotation as they pass locks
           between the functions.
      
      False positives:
      
       (*) Checker doesn't correctly handle the seq-retry lock context balance in
           rxrpc_find_service_conn_rcu().
      
       (*) Checker thinks execution may proceed past the BUG() in
           rxrpc_publish_service_conn().
      
       (*) Variable length array warnings from SKCIPHER_REQUEST_ON_STACK() in
           rxkad.c.
      Signed-off-by: David Howells <dhowells@redhat.com>
      88f2a825
    • rxrpc: Fix Tx ring annotation after initial Tx failure · 03877bf6
      David Howells authored
      rxrpc calls have a ring of packets that are awaiting ACK or retransmission
      and a parallel ring of annotations that tracks the state of those packets.
      If the initial transmission of a packet on the underlying UDP socket fails
      then the packet annotation is marked for resend - but the setting of this
      mark accidentally erases the last-packet mark also stored in the same
      annotation slot.  If this happens, a call won't switch out of the Tx phase
      when all the packets have been transmitted.
      
      Fix this by retaining the last-packet mark and only altering the packet
      state.
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      03877bf6
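
      The fix amounts to a masked update of the annotation byte.  A minimal
      sketch, assuming the packet state lives in the low bits and the
      last-packet mark is a separate flag bit; the macro names and bit values
      are hypothetical and may not match the kernel's.

```c
#include <stdint.h>

/* Hypothetical bit layout for one annotation slot. */
#define ANNO_STATE_MASK  0x03u  /* low bits: packet state */
#define ANNO_UNACK       0x00u
#define ANNO_ACK         0x01u
#define ANNO_RETRANS     0x02u
#define ANNO_LAST        0x04u  /* flag: final packet of the Tx phase */

/* Change only the state bits, leaving flags such as ANNO_LAST intact. */
static uint8_t anno_set_state(uint8_t anno, uint8_t state)
{
	return (uint8_t)((anno & ~ANNO_STATE_MASK) | (state & ANNO_STATE_MASK));
}
```

      A plain assignment such as anno = ANNO_RETRANS would clear ANNO_LAST,
      which is the kind of accidental erasure the commit describes.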