  1. 01 Dec, 2022 7 commits
    • rxrpc: Move packet reception processing into I/O thread · 446b3e14
      David Howells authored
      Split the packet input handler to make the softirq side just dump the
      received packet into the local endpoint receive queue and then call the
      remainder of the input function from the I/O thread.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Don't hold a ref for call timer or workqueue · 3feda9d6
      David Howells authored
      Currently, rxrpc gives the call timer a ref on the call when it starts it
      and this is passed along to the workqueue by the timer expiration function.
      The problem comes when queue_work() fails (ie. the work item is already
      queued): the timer routine must put the ref - but this may cause the
      cleanup code to run.
      
      This has the unfortunate effect that the cleanup code may then be run in
      softirq context - which means that any spinlocks it might need to touch
      have to be guarded to disable softirqs (ie. they need a "_bh" suffix).
      
      Fix this by:
      
       (1) Not giving a ref to the timer.
      
       (2) Making the expiration function not do anything if the refcount is 0.
           Note that this is more of an optimisation.
      
       (3) Making sure that the cleanup routine waits for the timer to complete.
      
      However, this has the consequence that the timer cannot give a ref to the
      work item.  Therefore the following fixes are also necessary:
      
       (4) Don't give a ref to the work item.
      
       (5) Make the work item return asap if it sees the ref count is 0.
      
       (6) Make sure that the cleanup routine waits for the work item to
           complete.
      
      Unfortunately, neither the timer nor the work item can simply get around
      the problem by just using refcount_inc_not_zero() as the waits would still
      have to be done, and there would still be the possibility of having to put
      the ref in the expiration function.
      
      Note the call work item is going to go away with the work being transferred
      to the I/O thread, so the wait in (6) will become obsolete.
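      
      As a rough userspace illustration of step (2), the sketch below shows an
      expiration function that holds no ref of its own and simply returns if
      the refcount has already reached 0.  The struct and function names are
      hypothetical stand-ins, not the kernel's rxrpc_call code, and the waits
      described in (3) and (6) are not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a call record; not the kernel's struct rxrpc_call. */
struct call_sketch {
	atomic_int ref;          /* 0 means the last ref is gone and cleanup has begun */
	atomic_bool work_queued; /* set while the work item is pending */
	int expiries_handled;    /* expirations that actually did something */
};

void call_sketch_init(struct call_sketch *call)
{
	atomic_init(&call->ref, 1);
	atomic_init(&call->work_queued, false);
	call->expiries_handled = 0;
}

/*
 * Timer expiry.  The timer holds no ref of its own, so there is nothing to
 * put here and the cleanup code cannot be triggered from this context.
 */
bool call_sketch_timer_expired(struct call_sketch *call)
{
	/* Step (2): skip the work if the call is already being torn down. */
	if (atomic_load(&call->ref) == 0)
		return false;

	call->expiries_handled++;
	/* Queue the work item without handing it a ref; harmless if already queued. */
	atomic_store(&call->work_queued, true);
	return true;
}
```

      Because nothing is put in the expiry path, the sketch never needs to run
      cleanup from the timer; the cost is that teardown has to wait for the
      timer and the work item, as the list above notes.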
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: trace: Don't use __builtin_return_address for sk_buff tracing · 9a36a6bc
      David Howells authored
      In rxrpc tracing, use enums to generate lists of points of interest rather
      than __builtin_return_address() for the sk_buff tracepoint.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: trace: Don't use __builtin_return_address for rxrpc_call tracing · cb0fc0c9
      David Howells authored
      In rxrpc tracing, use enums to generate lists of points of interest rather
      than __builtin_return_address() for the rxrpc_call tracepoint.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: trace: Don't use __builtin_return_address for rxrpc_local tracing · 0fde882f
      David Howells authored
      In rxrpc tracing, use enums to generate lists of points of interest rather
      than __builtin_return_address() for the rxrpc_local tracepoint.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Drop rxrpc_conn_parameters from rxrpc_connection and rxrpc_bundle · 2cc80086
      David Howells authored
      Remove the rxrpc_conn_parameters struct from the rxrpc_connection and
      rxrpc_bundle structs and emplace the members directly.  These are going to
      get filled in from the rxrpc_call struct in future.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Fix call leak · 49df54a6
      David Howells authored
      When retransmitting a packet, rxrpc_resend() shouldn't be attaching a ref
      to the call to the txbuf as that pins the call and prevents the call from
      clearing the packet buffer.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Fixes: d57a3a15 ("rxrpc: Save last ACK's SACK table rather than marking txbufs")
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
  2. 08 Nov, 2022 7 commits
    • rxrpc: Save last ACK's SACK table rather than marking txbufs · d57a3a15
      David Howells authored
      Improve the tracking of which packets need to be transmitted by saving the
      last ACK packet that we receive that has a populated soft-ACK table rather
      than marking packets.  Then we can step through the soft-ACK table and look
      at the packets we've transmitted beyond that to determine which packets we
      might want to retransmit.
      
      We also look at the highest serial number that has been acked to try and
      guess which packets we've transmitted the peer is likely to have seen.  If
      necessary, we send a ping to retrieve that number.
      
      One downside that might be a problem is that we can't then compare the
      previous acked/unacked state so easily in rxrpc_input_soft_acks() - which
      is a potential problem for the slow-start algorithm.
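      
      The idea of stepping through a saved soft-ACK table can be sketched as
      follows.  This is only an illustration under stated assumptions: the
      function name, the one-byte-per-packet layout and the 0 = NACK / 1 = ACK
      encoding are choices made for this sketch, not a description of the
      kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of one soft-ACK table entry in this sketch. */
enum { SKETCH_NACK = 0, SKETCH_ACK = 1 };

/*
 * Walk a saved soft-ACK table whose first entry describes packet first_seq
 * and collect the sequence numbers that were reported as not received.
 * Returns the number of entries written to out[].
 */
size_t sketch_collect_nacked(uint32_t first_seq,
			     const uint8_t *sack, size_t nr_sack,
			     uint32_t *out, size_t max_out)
{
	size_t n = 0;

	for (size_t i = 0; i < nr_sack && n < max_out; i++) {
		if (sack[i] == SKETCH_NACK)
			out[n++] = first_seq + (uint32_t)i;
	}
	return n;
}
```

      Packets transmitted beyond the end of the table are not covered by this
      walk; per the description above, those are judged separately using the
      highest acked serial number.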
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Remove call->lock · 4e76bd40
      David Howells authored
      call->lock is no longer necessary, so remove it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Don't use a ring buffer for call Tx queue · a4ea4c47
      David Howells authored
      Change the way the Tx queueing works to make the following ends easier to
      achieve:
      
       (1) The filling of packets, the encryption of packets and the transmission
           of packets can be handled in parallel by separate threads, rather than
           rxrpc_sendmsg() allocating, filling, encrypting and transmitting each
           packet before moving onto the next one.
      
       (2) Get rid of the fixed-size ring which sets a hard limit on the number
           of packets that can be retained in the ring.  This allows the number
           of packets to increase without having to allocate a very large ring or
           having variable-sized rings.
      
           [Note: the downside of this is that it's then less efficient to locate
           a packet for retransmission as we then have to step through a list and
           examine each buffer in the list.]
      
       (3) Allow the filler/encrypter to run ahead of the transmission window.
      
       (4) Make it easier to do zero copy UDP from the packet buffers.
      
       (5) Make it easier to do zero copy from userspace to the packet buffers -
           and thence to UDP (only for unauthenticated connections).
      
      To that end, the following changes are made:
      
       (1) Use the new rxrpc_txbuf struct instead of sk_buff for keeping packets
           to be transmitted in.  This allows them to be placed on multiple
           queues simultaneously.  An sk_buff isn't really necessary as it's
           never passed on to lower-level networking code.
      
       (2) Keep the transmissible packets in a linked list on the call struct
           rather than in a ring.  As a consequence, the annotation buffer isn't
           used either; rather a flag is set on the packet to indicate ackedness.
      
       (3) Use the RXRPC_CALL_TX_LAST flag to indicate that the last packet to be
           transmitted has been queued.  Add RXRPC_CALL_TX_ALL_ACKED to indicate
           that all packets up to and including the last got hard acked.
      
       (4) Wire headers are now stored in the txbuf rather than being concocted
           on the stack and they're stored immediately before the data, thereby
           allowing zerocopy of a single span.
      
       (5) Don't bother with instant-resend on transmission failure; rather,
           leave it for a timer or an ACK packet to trigger.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Clean up ACK handling · 530403d9
      David Howells authored
      Clean up the rxrpc_propose_ACK() function.  If the deferred PING ACK
      proposal is split out, the function is only really needed for deferred
      DELAY ACKs.  All other ACKs, bar the terminal IDLE ACK, are sent
      immediately.  The deferred IDLE ACK submission can be handled by
      conversion of a DELAY ACK into an IDLE ACK if there's nothing to be SACK'd.
      
      Also, because there's a delay between an ACK being generated and being
      transmitted, it's possible that other ACKs of the same type will be
      generated during that interval.  Apart from the ACK time and the serial
      number responded to, most of the ACK body, including window and SACK
      parameters, is not filled out until the point of transmission - so we can
      avoid generating a new ACK if there's one pending that will cover the SACK
      data we need to convey.
      
      Therefore, don't propose a new DELAY or IDLE ACK for a call if there's one
      already pending.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Allocate ACK records at proposal and queue for transmission · 72f0c6fb
      David Howells authored
      Allocate rxrpc_txbuf records for ACKs and put onto a queue for the
      transmitter thread to dispatch.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Record statistics about ACK types · f2a676d1
      David Howells authored
      Record statistics about the different types of ACKs that have been
      transmitted and received and the number of ACKs that have been filled out
      and transmitted or that have been skipped.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
    • rxrpc: Add stats procfile and DATA packet stats · b0154246
      David Howells authored
      Add a procfile, /proc/net/rxrpc/stats, to display some statistics about
      what rxrpc has been doing.  Writing a blank line to the stats file will
      clear the increment-only counters.  Allocated resource counters don't get
      cleared.
      
      Add some counters to count various things about DATA packets, including the
      number created, transmitted and retransmitted and the number received, the
      number of ACK-request markings and the number of jumbo packets received.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
  3. 01 Sep, 2022 1 commit
  4. 22 May, 2022 2 commits
    • rxrpc: Don't try to resend the request if we're receiving the reply · 114af61f
      David Howells authored
      rxrpc has a timer to trigger resending of unacked data packets in a call.
      This is not cancelled when a client call switches to the receive phase on
      the basis that most calls don't last long enough for it to ever expire.
      However, if it *does* expire after we've started to receive the reply, we
      shouldn't then go into trying to retransmit or pinging the server to find
      out if an ack got lost.
      
      Fix this by skipping the resend code if we're into receiving the reply to a
      client call.
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc, afs: Fix selection of abort codes · de696c47
      David Howells authored
      The RX_USER_ABORT code should really only be used to indicate that the user
      of the rxrpc service (ie. userspace) implicitly caused a call to be aborted
      - for instance if the AF_RXRPC socket is closed whilst the call was in
      progress.  (The user may also explicitly abort a call and specify the abort
      code to use).
      
      Change some of the points of generation to use other abort codes instead:
      
       (1) Abort the call with RXGEN_SS_UNMARSHAL or RXGEN_CC_UNMARSHAL if we see
           ENOMEM and EFAULT during received data delivery and abort with
           RX_CALL_DEAD in the default case.
      
       (2) Abort with RXGEN_SS_MARSHAL if we get ENOMEM whilst trying to send a
           reply.
      
       (3) Abort with RX_CALL_DEAD if we stop hearing from the peer if we had
           heard from the peer and abort with RX_CALL_TIMEOUT if we hadn't.
      
       (4) Abort with RX_CALL_DEAD if we try to disconnect a call that's not
           completed successfully or been aborted.
      Reported-by: Jeffrey Altman <jaltman@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 31 Mar, 2022 1 commit
  6. 22 Jan, 2022 1 commit
  7. 17 Jun, 2020 1 commit
    • rxrpc: Fix afs large storage transmission performance drop · 02c28dff
      David Howells authored
      Commit 2ad6691d, which moved the modification of the status annotation
      for a packet in the Tx buffer prior to the retransmission, moved the state
      clearance but managed to lose the bit that set it to UNACK.
      
      Consequently, if a retransmission occurs, the packet is accidentally
      changed to the ACK state (ie. 0) by masking it off, which means that the
      packet isn't counted towards the tally of newly-ACK'd packets if it gets
      hard-ACK'd.  This then prevents the congestion control algorithm from
      recovering properly.
      
      Fix by reinstating the change of state to UNACK.
      
      Spotted by the generic/460 xfstest.
      
      Fixes: 2ad6691d ("rxrpc: Fix race between incoming ACK parser and retransmitter")
      Signed-off-by: David Howells <dhowells@redhat.com>
  8. 12 Jun, 2020 1 commit
    • rxrpc: Fix race between incoming ACK parser and retransmitter · 2ad6691d
      David Howells authored
      There's a race between the retransmission code and the received ACK parser.
      The problem is that the retransmission loop has to drop the lock under
      which it is iterating through the transmission buffer in order to transmit
      a packet, but whilst the lock is dropped, the ACK parser can crank the Tx
      window round and discard the packets from the buffer.
      
      The retransmission code then updated the annotations for the wrong packet
      and a later retransmission thought it had to retransmit a packet that
      wasn't there, leading to a NULL pointer dereference.
      
      Fix this by:
      
       (1) Moving the annotation change to before we drop the lock prior to
           transmission.  This means we can't vary the annotation depending on
           the outcome of the transmission, but that's fine - we'll retransmit
           again later if it failed now.
      
       (2) Skipping the packet if the skb pointer is NULL.
      
      The following oops was seen:
      
      	BUG: kernel NULL pointer dereference, address: 000000000000002d
      	Workqueue: krxrpcd rxrpc_process_call
      	RIP: 0010:rxrpc_get_skb+0x14/0x8a
      	...
      	Call Trace:
      	 rxrpc_resend+0x331/0x41e
      	 ? get_vtime_delta+0x13/0x20
      	 rxrpc_process_call+0x3c0/0x4ac
      	 process_one_work+0x18f/0x27f
      	 worker_thread+0x1a3/0x247
      	 ? create_worker+0x17d/0x17d
      	 kthread+0xe6/0xeb
      	 ? kthread_delayed_work_timer_fn+0x83/0x83
      	 ret_from_fork+0x1f/0x30
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 05 Jun, 2020 1 commit
    • rxrpc: Fix missing notification · 5ac0d622
      David Howells authored
      Under some circumstances, rxrpc will fail to transmit a packet through the
      underlying UDP socket (ie. UDP sendmsg returns an error).  This may result
      in a call getting stuck.
      
      In the instance being seen, where AFS tries to send a probe to the Volume
      Location server, tracepoints show the UDP Tx failure (in this case returning
      error 99 EADDRNOTAVAIL) and then nothing more:
      
       afs_make_vl_call: c=0000015d VL.GetCapabilities
       rxrpc_call: c=0000015d NWc u=1 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000dd89ee8a
       rxrpc_call: c=0000015d Gus u=2 sp=rxrpc_new_client_call+0x14f/0x580 [rxrpc] a=00000000e20e4b08
       rxrpc_call: c=0000015d SEE u=2 sp=rxrpc_activate_one_channel+0x7b/0x1c0 [rxrpc] a=00000000e20e4b08
       rxrpc_call: c=0000015d CON u=2 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000e20e4b08
       rxrpc_tx_fail: c=0000015d r=1 ret=-99 CallDataNofrag
      
      The problem is that if the initial packet fails and the retransmission
      timer hasn't been started, the call is set to completed and an error is
      returned from rxrpc_send_data_packet() to rxrpc_queue_packet().  Though
      rxrpc_instant_resend() is called, this does nothing because the call is
      marked completed.
      
      So rxrpc_notify_socket() isn't called and the error is passed back up to
      rxrpc_send_data(), rxrpc_kernel_send_data() and thence to afs_make_call()
      and afs_vl_get_capabilities() where it is simply ignored because it is
      assumed that the result of a probe will be collected asynchronously.
      
      Fileserver probing is similarly affected via afs_fs_get_capabilities().
      
      Fix this by always issuing a notification in __rxrpc_set_call_completion()
      if it shifts a call to the completed state, even if an error is also
      returned to the caller through the function return value.
      
      Also put in a little bit of optimisation to avoid taking the call
      state_lock and disabling softirqs if the call is already in the completed
      state and remove some now redundant rxrpc_notify_socket() calls.
      
      Fixes: f5c17aae ("rxrpc: Calls should only have one terminal state")
      Reported-by: Gerry Seidman <gerry@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
  10. 11 May, 2020 1 commit
    • rxrpc: Fix the excessive initial retransmission timeout · c410bf01
      David Howells authored
      rxrpc currently uses a fixed 4s retransmission timeout until the RTT is
      sufficiently sampled.  This can cause problems with some fileservers with
      calls to the cache manager in the afs filesystem being dropped from the
      fileserver because a packet goes missing and the retransmission timeout is
      greater than the call expiry timeout.
      
      Fix this by:
      
       (1) Copying the RTT/RTO calculation code from Linux's TCP implementation
           and altering it to fit rxrpc.
      
       (2) Altering the various users of the RTT to make use of the new SRTT
           value.
      
       (3) Replacing the use of rxrpc_resend_timeout with the calculated RTO
           value (which is needed in jiffies), along with a backoff.
      
      Notes:
      
       (1) rxrpc provides RTT samples by matching the serial numbers on outgoing
           DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets
           against the reference serial number in incoming REQUESTED ACK and
           PING-RESPONSE ACK packets.
      
       (2) Each packet that is transmitted on an rxrpc connection gets a new
           per-connection serial number, even for retransmissions, so an ACK can
           be cross-referenced to a specific trigger packet.  This allows RTT
           information to be drawn from retransmitted DATA packets also.
      
       (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than
           on an rxrpc_call because many RPC calls won't live long enough to
           generate more than one sample.
      
       (4) The calculated SRTT value is in units of 8ths of a microsecond rather
           than nanoseconds.
      
      The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers.
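      
      For orientation, the general shape of a TCP-style estimator that keeps
      the smoothed RTT in 8ths of a microsecond can be sketched as below.  This
      is a simplified fixed-point form of the classic SRTT/deviation
      calculation, written for illustration; the struct, field and function
      names are hypothetical, and the kernel code adapted from TCP differs in
      detail (for example in how it bounds the deviation and applies backoff).

```c
#include <stdint.h>

/* Hypothetical per-peer RTT state for this sketch. */
struct rtt_sketch {
	uint32_t srtt_us8; /* smoothed RTT, in units of 1/8 us */
	uint32_t mdev_us4; /* mean deviation, in units of 1/4 us */
};

/* Fold one RTT sample (in microseconds) into the estimator. */
void rtt_sketch_sample(struct rtt_sketch *p, uint32_t sample_us)
{
	int64_t m = sample_us;

	if (p->srtt_us8 == 0) {
		/* First sample: SRTT = sample, deviation = sample / 2. */
		p->srtt_us8 = sample_us << 3;
		p->mdev_us4 = sample_us << 1;
		return;
	}

	m -= p->srtt_us8 >> 3;  /* error against the current estimate */
	p->srtt_us8 += m;       /* srtt = 7/8 srtt + 1/8 sample */
	if (m < 0)
		m = -m;
	m -= p->mdev_us4 >> 2;
	p->mdev_us4 += m;       /* mdev = 3/4 mdev + 1/4 |error| */
}

/* Retransmission timeout in microseconds: SRTT + 4 * deviation. */
uint32_t rtt_sketch_rto_us(const struct rtt_sketch *p)
{
	return (p->srtt_us8 >> 3) + p->mdev_us4;
}
```

      Keeping the scaled values means each update needs only shifts and
      additions, which is the reason the SRTT ends up in 8ths of a microsecond.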
      
      Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
      Signed-off-by: David Howells <dhowells@redhat.com>
  11. 27 Aug, 2019 1 commit
    • rxrpc: Use the tx-phase skb flag to simplify tracing · 987db9f7
      David Howells authored
      Use the previously-added transmit-phase skbuff private flag to simplify the
      socket buffer tracing a bit.  Which phase the skbuff comes from can now be
      divined from the skb rather than having to be guessed from the call state.
      
      We can also reduce the number of rxrpc_skb_trace values by eliminating the
      difference between Tx and Rx in the symbols.
      Signed-off-by: David Howells <dhowells@redhat.com>
  12. 09 Aug, 2019 1 commit
  13. 30 May, 2019 1 commit
  14. 03 Nov, 2018 1 commit
    • rxrpc: Fix lockup due to no error backoff after ack transmit error · c7e86acf
      David Howells authored
      If the network becomes (partially) unavailable, say by disabling IPv6, the
      background ACK transmission routine can get itself into a tizzy by
      proposing immediate ACK retransmission.  Since we're in the call event
      processor, that happens immediately without returning to the workqueue
      manager.
      
      The condition should clear after a while when either the network comes back
      or the call times out.
      
      Fix this by:
      
       (1) When re-proposing an ACK on failed Tx, don't schedule it immediately.
           This will allow a certain amount of time to elapse before we try
           again.
      
       (2) Enforce a return to the workqueue manager after a certain number of
           iterations of the call processing loop.
      
       (3) Add a backoff delay that increases the delay on deferred ACKs by a
           jiffy per failed transmission to a limit of HZ.  The backoff delay is
           cleared on a successful return from kernel_sendmsg().
      
       (4) Cancel calls immediately if the opening sendmsg fails.  The layer
           above can arrange retransmission or rotate to another server.
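      
      The backoff in (3) can be pictured with the small sketch below.  The
      names and the tick rate are assumptions made for the example
      (SKETCH_HZ stands in for the kernel's HZ); it only models the counter,
      not the ACK machinery around it.

```c
/* Hypothetical tick rate standing in for the kernel's HZ. */
#define SKETCH_HZ 1000UL

/* Hypothetical per-call backoff state. */
struct ack_backoff_sketch {
	unsigned long backoff; /* extra delay, in ticks */
};

/* Record the result of a transmission attempt (negative means failure). */
void ack_backoff_note_tx(struct ack_backoff_sketch *b, int ret)
{
	if (ret < 0) {
		/* One more tick per failed transmission, capped at one second. */
		if (b->backoff < SKETCH_HZ)
			b->backoff++;
	} else {
		/* A successful send clears the backoff. */
		b->backoff = 0;
	}
}

/* Delay to apply to the next deferred ACK, in ticks. */
unsigned long ack_backoff_delay(const struct ack_backoff_sketch *b,
				unsigned long base_delay)
{
	return base_delay + b->backoff;
}
```

      A linear, capped increase keeps retries coming at least roughly once a
      second while the network is down, yet stops the call processor from
      spinning on immediate re-proposals.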
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 01 Aug, 2018 1 commit
  16. 04 Jun, 2018 1 commit
    • rxrpc: Fix handling of call quietly cancelled out on server · 1a025028
      David Howells authored
      Sometimes an in-progress call will stop responding on the fileserver when
      the fileserver quietly cancels the call with an internally marked abort
      (RX_CALL_DEAD), without sending an ABORT to the client.
      
      This causes the client's call to eventually expire from lack of incoming
      packets directed its way, which currently leads to it being cancelled
      locally with ETIME.  Note that it's not currently clear as to why this
      happens as it's really hard to reproduce.
      
      The rotation policy implemented by kAFS, however, doesn't differentiate
      between ETIME meaning we didn't get any response from the server and ETIME
      meaning the call got cancelled mid-flow.  The latter leads to an oops when
      fetching data as the rotation partially resets the afs_read descriptor,
      which can result in a cleared page pointer being dereferenced because that
      page has already been filled.
      
      Handle this by the following means:
      
       (1) Set a flag on a call when we receive a packet for it.
      
       (2) Store the highest packet serial number so far received for a call
           (bearing in mind this may wrap).
      
       (3) If, when the "not received anything recently" timeout expires on a
           call, we've received at least one packet for a call and the connection
           as a whole has received packets more recently than that call, then
           cancel the call locally with ECONNRESET rather than ETIME.
      
           This indicates that the call was definitely in progress on the server.
      
       (4) In kAFS, if the rotation algorithm sees ECONNRESET rather than ETIME,
           don't try the next server, but rather abort the call.
      
           This avoids the oops as we don't try to reuse the afs_read struct.
           Rather, as-yet ungotten pages will be reread at a later date.
      
      Also:
      
       (5) Add an rxrpc tracepoint to log detection of the call being reset.
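      
      The decision in (3) can be sketched as below.  The struct and function
      names are hypothetical, and comparing serial numbers is one plausible
      reading of "the connection has received packets more recently than that
      call"; the comparison is written so that it still works when the 32-bit
      serial wraps.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-call receive state for this sketch. */
struct call_rx_sketch {
	bool rx_seen;       /* (1) set once any packet arrives for the call */
	uint32_t rx_serial; /* (2) highest serial received for the call */
};

/* Wrap-safe test for "serial a is later than serial b". */
bool sketch_serial_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Pick the local error when the "nothing received recently" timer expires. */
int sketch_expiry_error(const struct call_rx_sketch *call,
			uint32_t conn_hi_serial)
{
	/* The call was live and the connection has moved on without it. */
	if (call->rx_seen && sketch_serial_after(conn_hi_serial, call->rx_serial))
		return -ECONNRESET;
	/* Otherwise we cannot tell the server ever saw the call. */
	return -ETIME;
}
```

      A caller such as the rotation algorithm in (4) would then treat
      -ECONNRESET as "abort the call" and only -ETIME as "try the next server".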
      
      Without this, I occasionally see an oops like the following:
      
          general protection fault: 0000 [#1] SMP PTI
          ...
          RIP: 0010:_copy_to_iter+0x204/0x310
          RSP: 0018:ffff8800cae0f828 EFLAGS: 00010206
          RAX: 0000000000000560 RBX: 0000000000000560 RCX: 0000000000000560
          RDX: ffff8800cae0f968 RSI: ffff8800d58b3312 RDI: 0005080000000000
          RBP: ffff8800cae0f968 R08: 0000000000000560 R09: ffff8800ca00f400
          R10: ffff8800c36f28d4 R11: 00000000000008c4 R12: ffff8800cae0f958
          R13: 0000000000000560 R14: ffff8800d58b3312 R15: 0000000000000560
          FS:  00007fdaef108080(0000) GS:ffff8800ca680000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fb28a8fa000 CR3: 00000000d2a76002 CR4: 00000000001606e0
          Call Trace:
           skb_copy_datagram_iter+0x14e/0x289
           rxrpc_recvmsg_data.isra.0+0x6f3/0xf68
           ? trace_buffer_unlock_commit_regs+0x4f/0x89
           rxrpc_kernel_recv_data+0x149/0x421
           afs_extract_data+0x1e0/0x798
           ? afs_wait_for_call_to_complete+0xc9/0x52e
           afs_deliver_fs_fetch_data+0x33a/0x5ab
           afs_deliver_to_call+0x1ee/0x5e0
           ? afs_wait_for_call_to_complete+0xc9/0x52e
           afs_wait_for_call_to_complete+0x12b/0x52e
           ? wake_up_q+0x54/0x54
           afs_make_call+0x287/0x462
           ? afs_fs_fetch_data+0x3e6/0x3ed
           ? rcu_read_lock_sched_held+0x5d/0x63
           afs_fs_fetch_data+0x3e6/0x3ed
           afs_fetch_data+0xbb/0x14a
           afs_readpages+0x317/0x40d
           __do_page_cache_readahead+0x203/0x2ba
           ? ondemand_readahead+0x3a7/0x3c1
           ondemand_readahead+0x3a7/0x3c1
           generic_file_buffered_read+0x18b/0x62f
           __vfs_read+0xdb/0xfe
           vfs_read+0xb2/0x137
           ksys_read+0x50/0x8c
           do_syscall_64+0x7d/0x1a0
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Note the weird value in RDI which is a result of trying to kmap() a NULL
      page pointer.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  17. 30 Mar, 2018 2 commits
    • rxrpc: Fix resend event time calculation · 59299aa1
      Marc Dionne authored
      Commit a158bdd3 ("rxrpc: Fix call timeouts") reworked the time calculation
      for the next resend event.  For this calculation, "oldest" will be before
      "now", so ktime_sub(oldest, now) will yield a negative value.  When passed
      to nsecs_to_jiffies which expects an unsigned value, the end result will be
      a very large value, and a resend event scheduled far into the future.  This
      could cause calls to stall if some packets were lost.
      
      Fix by ordering the arguments to ktime_sub correctly.
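      
      The effect is easy to reproduce in userspace.  In the sketch below, the
      helpers are stand-ins written for this example (a signed subtraction in
      place of ktime_sub() and a hypothetical 1 ms tick in place of
      nsecs_to_jiffies()); only the argument-order problem is modelled.

```c
#include <stdint.h>

/* Hypothetical tick length: 1 ms, expressed in nanoseconds. */
#define SKETCH_NSEC_PER_TICK 1000000ULL

/* Stand-in for ktime_sub(): signed nanosecond subtraction, lhs - rhs. */
int64_t sketch_ktime_sub(int64_t lhs, int64_t rhs)
{
	return lhs - rhs;
}

/* Stand-in for a conversion that takes an unsigned nanosecond count. */
uint64_t sketch_nsecs_to_ticks(uint64_t ns)
{
	return ns / SKETCH_NSEC_PER_TICK;
}

/* Buggy order: oldest - now is negative and wraps to a huge unsigned value. */
uint64_t sketch_age_wrong(int64_t oldest, int64_t now)
{
	return sketch_nsecs_to_ticks((uint64_t)sketch_ktime_sub(oldest, now));
}

/* Fixed order: now - oldest is the non-negative age of the oldest packet. */
uint64_t sketch_age_right(int64_t oldest, int64_t now)
{
	return sketch_nsecs_to_ticks((uint64_t)sketch_ktime_sub(now, oldest));
}
```

      With oldest = 5 ms and now = 8 ms, the fixed order gives 3 ticks, whereas
      the buggy order gives a value of the order of 2^64 ns converted to ticks,
      which pushes the resend event far into the future.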
      
      Fixes: a158bdd3 ("rxrpc: Fix call timeouts")
      Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Fix a bit of time confusion · f82eb88b
      David Howells authored
      The rxrpc_reduce_call_timer() function should be passed the 'current time'
      in jiffies, not the current ktime time.  It's confusing in rxrpc_resend
      because that has to deal with both.  Pass the correct current time in.
      
      Note that this only affects the trace produced and not the functioning of
      the code.
      
      Fixes: a158bdd3 ("rxrpc: Fix call timeouts")
      Signed-off-by: David Howells <dhowells@redhat.com>
  18. 27 Mar, 2018 1 commit
  19. 29 Nov, 2017 2 commits
  20. 24 Nov, 2017 4 commits
    • rxrpc: Add keepalive for a call · 415f44e4
      David Howells authored
      We need to transmit a packet every so often to act as a keepalive for the
      peer (which has a timeout from the last time it received a packet) and also
      to prevent any intervening firewalls from closing the route.
      
      Do this by resetting a timer every time we transmit a packet.  If the timer
      ever expires, we transmit a PING ACK packet and thereby also elicit a PING
      RESPONSE ACK from the other side - which prevents our last-rx timeout from
      expiring.
      
      The timer is set to 1/6 of the last-rx timeout so that we can detect the
      other side going away if it misses 6 replies in a row.
      
      This is particularly necessary for servers where the processing of the
      service function may take a significant amount of time.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • rxrpc: Add a timeout for detecting lost ACKs/lost DATA · bd1fdf8c
      David Howells authored
      Add an extra timeout that is set/updated when we send a DATA packet that
      has the request-ack flag set.  This allows us to detect if we don't get an
      ACK in response to the latest flagged packet.
      
      The ACK packet is adjudged to have been lost if it doesn't turn up within
      2*RTT of the transmission.
      
      If the timeout occurs, we schedule the sending of a PING ACK to find out
      the state of the other side.  If a new DATA packet is ready to go sooner,
      we cancel the sending of the ping and set the request-ack flag on that
      instead.
      
      If we get back a PING-RESPONSE ACK that indicates a lower tx_top than what
      we had at the time of the ping transmission, we adjudge all the DATA
      packets sent between the response tx_top and the ping-time tx_top to have
      been lost and retransmit immediately.
      
      Rather than sending a PING ACK, we could just pick a DATA packet and
      speculatively retransmit that with request-ack set.  It should result in
      either a REQUESTED ACK or a DUPLICATE ACK which we can then use in lieu of
      the PING-RESPONSE ACK mentioned above.
      Signed-off-by: David Howells <dhowells@redhat.com>
      bd1fdf8c
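      The two rules in the commit above (an ACK counts as lost after 2*RTT,
      and a PING-RESPONSE ACK with a lower tx_top marks a range of DATA
      packets as lost) can be sketched like this.  It is a simplified model
      under stated assumptions: times are in nanoseconds, sequence-number
      wraparound is ignored, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t seq_t;

/* The ACK for a request-ack DATA packet is judged lost once more than
 * 2*RTT has elapsed since that packet was transmitted. */
bool ack_is_lost(uint64_t now_ns, uint64_t tx_time_ns, uint64_t rtt_ns)
{
	assert(now_ns >= tx_time_ns);
	return now_ns - tx_time_ns > 2 * rtt_ns;
}

struct lost_range {
	seq_t first;	/* first sequence number to retransmit */
	seq_t count;	/* number of packets to retransmit; 0 if none */
};

/* Packets after the tx_top reported in the ping response, up to and
 * including the tx_top recorded when the ping was sent, are presumed lost. */
struct lost_range lost_after_ping(seq_t ping_tx_top, seq_t resp_tx_top)
{
	struct lost_range r = { 0, 0 };

	if (resp_tx_top < ping_tx_top) {
		r.first = resp_tx_top + 1;
		r.count = ping_tx_top - resp_tx_top;
	}
	return r;
}
```

      For instance, if tx_top was 10 when the ping went out and the response
      reports 7, packets 8 through 10 would be retransmitted immediately.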
    • David Howells's avatar
      rxrpc: Express protocol timeouts in terms of RTT · beb8e5e4
      David Howells authored
      Express protocol timeouts for data retransmission and deferred ack
      generation in terms of RTT rather than specified timeouts once we have
      sufficient RTT samples.
      
      For the moment, this requires just one RTT sample to be able to use this
      for ack deferral and two for data retransmission.
      
      The data retransmission timeout is set at RTT*1.5 and the ACK deferral
      timeout is set at RTT.
      
      Note that the calculated timeout is limited to a minimum of 4ns to make
      sure it doesn't happen too quickly.
      Signed-off-by: David Howells <dhowells@redhat.com>
      beb8e5e4
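      The selection logic in the commit above can be sketched as below.  This
      is a model only: the helper names, the `fallback` parameter (standing
      in for the previously specified timeout) and the assumption that the
      floor applies only to the calculated value are illustrative and are
      not taken from the rxrpc source.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_CALC_TIMEOUT 4	/* floor from the commit message, in its units */

static uint64_t clamp_min(uint64_t t)
{
	return t < MIN_CALC_TIMEOUT ? MIN_CALC_TIMEOUT : t;
}

/* Data retransmission: RTT * 1.5 once at least two RTT samples exist. */
uint64_t resend_timeout(uint64_t rtt, unsigned int samples, uint64_t fallback)
{
	if (samples < 2)
		return fallback;
	return clamp_min(rtt + rtt / 2);
}

/* ACK deferral: one RTT once at least one RTT sample exists. */
uint64_t ack_delay_timeout(uint64_t rtt, unsigned int samples, uint64_t fallback)
{
	if (samples < 1)
		return fallback;
	return clamp_min(rtt);
}
```

      Integer arithmetic (rtt + rtt / 2) is used for the 1.5 factor to avoid
      floating point, which is the usual constraint in kernel-style code.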
    • David Howells's avatar
      rxrpc: Fix call timeouts · a158bdd3
      David Howells authored
      Fix the rxrpc call expiration timeouts and make them settable from
      userspace.  By analogy with other rx implementations, there should be three
      timeouts:
      
       (1) "Normal timeout"
      
           This is set for all calls and is triggered if we haven't received any
           packets from the peer in a while.  It is measured from the last time
           we received any packet on that call.  This is not reset by any
           connection packets (such as CHALLENGE/RESPONSE packets).
      
           If a service operation takes a long time, the server should generate
           PING ACKs at a duration that's substantially less than the normal
           timeout so as to keep both sides alive.  This is set at 1/6 of the
           normal timeout.
      
       (2) "Idle timeout"
      
           This is set only for a service call and is triggered if we stop
           receiving the DATA packets that comprise the request data.  It is
           measured from the last time we received a DATA packet.
      
       (3) "Hard timeout"
      
           This can be set for a call and specifies the maximum lifetime of that
           call.  It should not be specified by default.  Some operations (such
           as volume transfer) take a long time.
      
      Allow userspace to set/change the timeouts on a call with sendmsg, using a
      control message:
      
      	RXRPC_SET_CALL_TIMEOUT
      
      The data to the message is a number of 32-bit words, not all of which need
      be given:
      
      	u32 hard_timeout;	/* sec from first packet */
      	u32 idle_timeout;	/* msec from data Rx */
      	u32 normal_timeout;	/* msec from packet Rx */
      
      This can be set in combination with any other sendmsg() that affects a
      call.
      Signed-off-by: David Howells <dhowells@redhat.com>
      a158bdd3
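      Building the timeout control message described in the commit above
      might look like the following userspace sketch.  Assumptions:
      build_timeout_cmsg() is a hypothetical helper; the fallback values for
      SOL_RXRPC (272) and the control message type (13, named
      RXRPC_SET_CALL_TIMEOUT in the kernel's uapi header as far as I know)
      should be checked against <linux/rxrpc.h> on the target system.  A real
      sendmsg() would also need to identify the call, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_RXRPC
#define SOL_RXRPC 272			/* assumed value; verify locally */
#endif
#ifndef RXRPC_SET_CALL_TIMEOUT
#define RXRPC_SET_CALL_TIMEOUT 13	/* assumed value; verify locally */
#endif

/* Fill `buf` with one control message carrying `n` (1 to 3) 32-bit words in
 * the order hard (sec), idle (msec), normal (msec).  `buf` must be suitably
 * aligned for struct cmsghdr.  Returns the bytes used, or 0 if `buf` is too
 * small. */
size_t build_timeout_cmsg(void *buf, size_t buflen,
			  const uint32_t *timeouts, unsigned int n)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	size_t datalen = n * sizeof(uint32_t);

	assert(n >= 1 && n <= 3);
	if (buflen < CMSG_SPACE(datalen))
		return 0;

	memset(buf, 0, CMSG_SPACE(datalen));
	memset(&msg, 0, sizeof(msg));
	msg.msg_control = buf;
	msg.msg_controllen = CMSG_SPACE(datalen);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_SET_CALL_TIMEOUT;
	cmsg->cmsg_len = CMSG_LEN(datalen);
	memcpy(CMSG_DATA(cmsg), timeouts, datalen);
	return CMSG_SPACE(datalen);
}
```

      Passing n = 1 supplies only the hard timeout, matching the note that
      not all of the words need be given.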
  21. 02 Nov, 2017 1 commit
    • David Howells's avatar
      rxrpc: Fix call expiry handling · dcbefc30
      David Howells authored
      Fix call expiry handling in the following ways:
      
       (1) If all the request data from a client call is acked, don't send a
           follow up IDLE ACK with firstPacket == 1 and previousPacket == 0 as
           this appears to fool some servers into thinking everything has been
           accepted.
      
       (2) Never send an abort back to the server once it has ACK'd all the
           request packets; rather just try to reuse the channel for the next
           call.  The first request DATA packet of the next call on the same
           channel will implicitly ACK the entire reply of the dead call - even
           if we haven't transmitted it yet.
      
       (3) Don't send RX_CALL_TIMEOUT in an ABORT packet; librx uses abort codes
           to pass local errors to the caller in addition to remote errors, and
           this is meant to be local only.
      
      The following also need to be addressed in future patches:
      
       (4) Service calls should send PING ACKs as 'keep alives' if the server is
           still processing the call.
      
       (5) VERSION REPLY packets should be sent to the peers of service
           connections to act as keep-alives.  This is used to keep firewall
           routes in place.  The AFS CM should enable this.
      Signed-off-by: David Howells <dhowells@redhat.com>
      dcbefc30
  22. 06 Apr, 2017 1 commit