Commit d40fd600 authored by Jacob Keller's avatar Jacob Keller Committed by Tony Nguyen

ice: handle flushing stale Tx timestamps in ice_ptp_tx_tstamp

In the event of a PTP clock time change due to .adjtime or .settime, the
ice driver needs to update the cached copy of the PHC time and also discard
any outstanding Tx timestamps.

This is required because otherwise the wrong copy of the PHC time will be
used when extending the Tx timestamp. This could result in reporting
incorrect timestamps to the stack.

The current approach taken to handle this is to call
ice_ptp_flush_tx_tracker, which will discard any timestamps which are not
yet complete.

This is problematic for two reasons:

1) it could lead to a potential race condition where the wrong timestamp is
   associated with a future packet.

   This can occur with the following flow:

   1. Thread A gets request to transmit a timestamped packet, and picks an
      index and transmits the packet

   2. Thread B calls ice_ptp_flush_tx_tracker and sees the index in use,
      marking is as disarded. No timestamp read occurs because the status
      bit is not set, but the index is released for re-use

   3. Thread A gets a new request to transmit another timestamped packet,
      picks the same (now unused) index and transmits that packet.

   4. The PHY transmits the first packet and updates the timestamp slot and
      generates an interrupt.

   5. The ice_ptp_tx_tstamp thread executes and sees the interrupt and a
      valid timestamp but associates it with the new Tx SKB and not the one
      that actual timestamp for the packet as expected.

   This could result in the previous timestamp being assigned to a new
   packet producing incorrect timestamps and leading to incorrect behavior
   in PTP applications.

   This is most likely to occur when the packet rate for Tx timestamp
   requests is very high.

2) on E822 hardware, we must avoid reading a timestamp index more than once
   each time its status bit is set and an interrupt is generated by
   hardware.

   We do have some extensive checks for the unread flag to ensure that only
   one of either the ice_ptp_flush_tx_tracker or ice_ptp_tx_tstamp threads
   read the timestamp. However, even with this we can still have cases
   where we "flush" a timestamp that was actually completed in hardware.
   This can lead to cases where we don't read the timestamp index as
   appropriate.

To fix both of these issues, we must avoid calling ice_ptp_flush_tx_tracker
outside of the teardown path.

Rather than using ice_ptp_flush_tx_tracker, introduce a new state bitmap,
the stale bitmap. Start this as cleared when we begin a new timestamp
request. When we're about to extend a timestamp and send it up to the
stack, first check to see if that stale bit was set. If so, drop the
timestamp without sending it to the stack.

When we need to update the cached PHC timestamp out of band, just mark all
currently outstanding timestamps as stale. This will ensure that once
hardware completes the timestamp we'll ignore it correctly and avoid
reporting bogus timestamps to userspace.

With this change, we fix potential issues caused  by calling
ice_ptp_flush_tx_tracker during normal operation.
Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
parent c1f3414d
...@@ -625,11 +625,13 @@ ice_ptp_is_tx_tracker_up(struct ice_ptp_tx *tx) ...@@ -625,11 +625,13 @@ ice_ptp_is_tx_tracker_up(struct ice_ptp_tx *tx)
* *
* If a given index has a valid timestamp, perform the following steps: * If a given index has a valid timestamp, perform the following steps:
* *
* 1) copy the timestamp out of the PHY register * 1) check that the timestamp request is not stale
* 4) clear the timestamp valid bit in the PHY register * 2) check that a timestamp is ready and available in the PHY memory bank
* 5) unlock the index by clearing the associated in_use bit. * 3) read and copy the timestamp out of the PHY register
* 2) extend the 40b timestamp value to get a 64bit timestamp * 4) unlock the index by clearing the associated in_use bit
* 3) send that timestamp to the stack * 5) check if the timestamp is stale, and discard if so
* 6) extend the 40 bit timestamp value to get a 64 bit timestamp value
* 7) send this 64 bit timestamp to the stack
* *
* Returns true if all timestamps were handled, and false if any slots remain * Returns true if all timestamps were handled, and false if any slots remain
* without a timestamp. * without a timestamp.
...@@ -640,16 +642,16 @@ ice_ptp_is_tx_tracker_up(struct ice_ptp_tx *tx) ...@@ -640,16 +642,16 @@ ice_ptp_is_tx_tracker_up(struct ice_ptp_tx *tx)
* interrupt. In some cases hardware might not interrupt us again when the * interrupt. In some cases hardware might not interrupt us again when the
* timestamp is captured. * timestamp is captured.
* *
* Note that we only take the tracking lock when clearing the bit and when * Note that we do not hold the tracking lock while reading the Tx timestamp.
* checking if we need to re-queue this task. The only place where bits can be * This is because reading the timestamp requires taking a mutex that might
* set is the hard xmit routine where an SKB has a request flag set. The only * sleep.
* places where we clear bits are this work function, or when flushing the Tx
* timestamp tracker.
* *
* If the Tx tracker gets flushed while we're processing a packet, we catch * The only place where we set in_use is when a new timestamp is initiated
* this because we grab the SKB pointer under lock. If the SKB is NULL we know * with a slot index. This is only called in the hard xmit routine where an
* that another thread already discarded the SKB and we can avoid passing it * SKB has a request flag set. The only places where we clear this bit is this
* up to the stack. * function, or during teardown when the Tx timestamp tracker is being
* removed. A timestamp index will never be re-used until the in_use bit for
* that index is cleared.
* *
* If a Tx thread starts a new timestamp, we might not begin processing it * If a Tx thread starts a new timestamp, we might not begin processing it
* right away but we will notice it at the end when we re-queue the task. * right away but we will notice it at the end when we re-queue the task.
...@@ -658,10 +660,11 @@ ice_ptp_is_tx_tracker_up(struct ice_ptp_tx *tx) ...@@ -658,10 +660,11 @@ ice_ptp_is_tx_tracker_up(struct ice_ptp_tx *tx)
* interrupt for that timestamp should re-trigger this function once * interrupt for that timestamp should re-trigger this function once
* a timestamp is ready. * a timestamp is ready.
* *
* Note that minimizing the time we hold the lock is important. If we held the * In cases where the PTP hardware clock was directly adjusted, some
* lock for the entire function we would unnecessarily block the Tx hot path * timestamps may not be able to safely use the timestamp extension math. In
* which needs to set the timestamp index. Limiting how long we hold the lock * this case, software will set the stale bit for any outstanding Tx
* ensures we do not block Tx threads. * timestamps when the clock is adjusted. Then this function will discard
* those captured timestamps instead of sending them to the stack.
* *
* If a Tx packet has been waiting for more than 2 seconds, it is not possible * If a Tx packet has been waiting for more than 2 seconds, it is not possible
* to correctly extend the timestamp using the cached PHC time. It is * to correctly extend the timestamp using the cached PHC time. It is
...@@ -750,6 +753,8 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx) ...@@ -750,6 +753,8 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx)
clear_bit(idx, tx->in_use); clear_bit(idx, tx->in_use);
skb = tx->tstamps[idx].skb; skb = tx->tstamps[idx].skb;
tx->tstamps[idx].skb = NULL; tx->tstamps[idx].skb = NULL;
if (test_and_clear_bit(idx, tx->stale))
drop_ts = true;
spin_unlock(&tx->lock); spin_unlock(&tx->lock);
/* It is unlikely but possible that the SKB will have been /* It is unlikely but possible that the SKB will have been
...@@ -794,21 +799,24 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx) ...@@ -794,21 +799,24 @@ static bool ice_ptp_tx_tstamp(struct ice_ptp_tx *tx)
static int static int
ice_ptp_alloc_tx_tracker(struct ice_ptp_tx *tx) ice_ptp_alloc_tx_tracker(struct ice_ptp_tx *tx)
{ {
unsigned long *in_use, *stale;
struct ice_tx_tstamp *tstamps; struct ice_tx_tstamp *tstamps;
unsigned long *in_use;
tstamps = kcalloc(tx->len, sizeof(*tstamps), GFP_KERNEL); tstamps = kcalloc(tx->len, sizeof(*tstamps), GFP_KERNEL);
in_use = bitmap_zalloc(tx->len, GFP_KERNEL); in_use = bitmap_zalloc(tx->len, GFP_KERNEL);
stale = bitmap_zalloc(tx->len, GFP_KERNEL);
if (!tstamps || !in_use) { if (!tstamps || !in_use || !stale) {
kfree(tstamps); kfree(tstamps);
bitmap_free(in_use); bitmap_free(in_use);
bitmap_free(stale);
return -ENOMEM; return -ENOMEM;
} }
tx->tstamps = tstamps; tx->tstamps = tstamps;
tx->in_use = in_use; tx->in_use = in_use;
tx->stale = stale;
tx->init = 1; tx->init = 1;
spin_lock_init(&tx->lock); spin_lock_init(&tx->lock);
...@@ -836,6 +844,7 @@ ice_ptp_flush_tx_tracker(struct ice_pf *pf, struct ice_ptp_tx *tx) ...@@ -836,6 +844,7 @@ ice_ptp_flush_tx_tracker(struct ice_pf *pf, struct ice_ptp_tx *tx)
pf->ptp.tx_hwtstamp_flushed++; pf->ptp.tx_hwtstamp_flushed++;
} }
clear_bit(idx, tx->in_use); clear_bit(idx, tx->in_use);
clear_bit(idx, tx->stale);
spin_unlock(&tx->lock); spin_unlock(&tx->lock);
/* Clear any potential residual timestamp in the PHY block */ /* Clear any potential residual timestamp in the PHY block */
...@@ -844,6 +853,25 @@ ice_ptp_flush_tx_tracker(struct ice_pf *pf, struct ice_ptp_tx *tx) ...@@ -844,6 +853,25 @@ ice_ptp_flush_tx_tracker(struct ice_pf *pf, struct ice_ptp_tx *tx)
} }
} }
/**
* ice_ptp_mark_tx_tracker_stale - Mark unfinished timestamps as stale
* @tx: the tracker to mark
*
* Mark currently outstanding Tx timestamps as stale. This prevents sending
* their timestamp value to the stack. This is required to prevent extending
* the 40bit hardware timestamp incorrectly.
*
* This should be called when the PTP clock is modified such as after a set
* time request.
*/
static void
ice_ptp_mark_tx_tracker_stale(struct ice_ptp_tx *tx)
{
spin_lock(&tx->lock);
bitmap_or(tx->stale, tx->stale, tx->in_use, tx->len);
spin_unlock(&tx->lock);
}
/** /**
* ice_ptp_release_tx_tracker - Release allocated memory for Tx tracker * ice_ptp_release_tx_tracker - Release allocated memory for Tx tracker
* @pf: Board private structure * @pf: Board private structure
...@@ -869,6 +897,9 @@ ice_ptp_release_tx_tracker(struct ice_pf *pf, struct ice_ptp_tx *tx) ...@@ -869,6 +897,9 @@ ice_ptp_release_tx_tracker(struct ice_pf *pf, struct ice_ptp_tx *tx)
bitmap_free(tx->in_use); bitmap_free(tx->in_use);
tx->in_use = NULL; tx->in_use = NULL;
bitmap_free(tx->stale);
tx->stale = NULL;
tx->len = 0; tx->len = 0;
} }
...@@ -987,20 +1018,13 @@ static int ice_ptp_update_cached_phctime(struct ice_pf *pf) ...@@ -987,20 +1018,13 @@ static int ice_ptp_update_cached_phctime(struct ice_pf *pf)
* @pf: Board specific private structure * @pf: Board specific private structure
* *
* This function must be called when the cached PHC time is no longer valid, * This function must be called when the cached PHC time is no longer valid,
* such as after a time adjustment. It discards any outstanding Tx timestamps, * such as after a time adjustment. It marks any currently outstanding Tx
* and updates the cached PHC time for both the PF and Rx rings. If updating * timestamps as stale and updates the cached PHC time for both the PF and Rx
* the PHC time cannot be done immediately, a warning message is logged and * rings.
* the work item is scheduled. *
* * If updating the PHC time cannot be done immediately, a warning message is
* These steps are required in order to ensure that we do not accidentally * logged and the work item is scheduled immediately to minimize the window
* report a timestamp extended by the wrong PHC cached copy. Note that we * with a wrong cached timestamp.
* do not directly update the cached timestamp here because it is possible
* this might produce an error when ICE_CFG_BUSY is set. If this occurred, we
* would have to try again. During that time window, timestamps might be
* requested and returned with an invalid extension. Thus, on failure to
* immediately update the cached PHC time we would need to zero the value
* anyways. For this reason, we just zero the value immediately and queue the
* update work item.
*/ */
static void ice_ptp_reset_cached_phctime(struct ice_pf *pf) static void ice_ptp_reset_cached_phctime(struct ice_pf *pf)
{ {
...@@ -1024,8 +1048,12 @@ static void ice_ptp_reset_cached_phctime(struct ice_pf *pf) ...@@ -1024,8 +1048,12 @@ static void ice_ptp_reset_cached_phctime(struct ice_pf *pf)
msecs_to_jiffies(10)); msecs_to_jiffies(10));
} }
/* Flush any outstanding Tx timestamps */ /* Mark any outstanding timestamps as stale, since they might have
ice_ptp_flush_tx_tracker(pf, &pf->ptp.port.tx); * been captured in hardware before the time update. This could lead
* to us extending them with the wrong cached value resulting in
* incorrect timestamp values.
*/
ice_ptp_mark_tx_tracker_stale(&pf->ptp.port.tx);
} }
/** /**
...@@ -2373,6 +2401,7 @@ s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb) ...@@ -2373,6 +2401,7 @@ s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb)
* requests. * requests.
*/ */
set_bit(idx, tx->in_use); set_bit(idx, tx->in_use);
clear_bit(idx, tx->stale);
tx->tstamps[idx].start = jiffies; tx->tstamps[idx].start = jiffies;
tx->tstamps[idx].skb = skb_get(skb); tx->tstamps[idx].skb = skb_get(skb);
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS; skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
......
...@@ -113,6 +113,7 @@ struct ice_tx_tstamp { ...@@ -113,6 +113,7 @@ struct ice_tx_tstamp {
* @lock: lock to prevent concurrent access to fields of this struct * @lock: lock to prevent concurrent access to fields of this struct
* @tstamps: array of len to store outstanding requests * @tstamps: array of len to store outstanding requests
* @in_use: bitmap of len to indicate which slots are in use * @in_use: bitmap of len to indicate which slots are in use
* @stale: bitmap of len to indicate slots which have stale timestamps
* @block: which memory block (quad or port) the timestamps are captured in * @block: which memory block (quad or port) the timestamps are captured in
* @offset: offset into timestamp block to get the real index * @offset: offset into timestamp block to get the real index
* @len: length of the tstamps and in_use fields. * @len: length of the tstamps and in_use fields.
...@@ -125,6 +126,7 @@ struct ice_ptp_tx { ...@@ -125,6 +126,7 @@ struct ice_ptp_tx {
spinlock_t lock; /* lock protecting in_use bitmap */ spinlock_t lock; /* lock protecting in_use bitmap */
struct ice_tx_tstamp *tstamps; struct ice_tx_tstamp *tstamps;
unsigned long *in_use; unsigned long *in_use;
unsigned long *stale;
u8 block; u8 block;
u8 offset; u8 offset;
u8 len; u8 len;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment