    e1000e: locking bug introduced by commit 67fd4fcb · a90b412c
    Bruce Allan authored
    Commit 67fd4fcb (e1000e: convert to stats64) added the ability to update
    statistics more accurately and on-demand through the net_device_ops
    .ndo_get_stats64 hook, but introduced a locking bug on 82577/8/9 when
    linked at half-duplex (seen on kernels with CONFIG_DEBUG_ATOMIC_SLEEP=y and
    CONFIG_PROVE_LOCKING=y).  The commit introduced code paths that lock a
    mutex, which may sleep, in atomic contexts.  For example, rcu_read_lock is
    held when irqbalance reads the stats from /sys/class/net/ethX/statistics,
    and the mutex is then locked to read the Phy half-duplex statistics
    registers.
    
    The mutex was originally introduced a few years back, when there was an
    issue with the NVM getting corrupted, to prevent concurrent accesses of
    resources (the NVM and Phy) shared by the driver, firmware and hardware.
    It was later split into two mutexes, one for the NVM and one for the Phy,
    when it was determined the NVM, unlike the Phy, should not be protected by
    the software/firmware/hardware semaphore (arbitration of which is done in
    part with the SWFLAG bit in the EXTCNF_CTRL register).  This latter
    semaphore should be sufficient to prevent resource contention of the Phy in
    the driver (i.e. the mutex for Phy accesses is not needed), but to be sure,
    the mutex is replaced with an atomic bit flag which will warn if any
    contention is possible.
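
    As a rough illustration of the idea (not the driver code itself), the
    userspace C11 sketch below replaces a sleeping lock with a non-blocking
    test-and-set flag that only warns when a second accessor shows up.  The
    names phy_access_in_progress, contention_warnings, phy_access_begin and
    phy_access_end are hypothetical and do not appear in e1000e; the real
    change operates on the adapter's own state inside the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's "shared resource in use" bit. */
static atomic_flag phy_access_in_progress = ATOMIC_FLAG_INIT;

/* Hypothetical counter so callers can observe how often contention was seen. */
static atomic_uint contention_warnings;

/*
 * Mark the Phy as being accessed.  This never sleeps, so it is usable from
 * contexts where taking a mutex would be a bug.  Returns true if the caller
 * set the flag, false (after warning) if the flag was already set.
 */
static bool phy_access_begin(void)
{
    if (atomic_flag_test_and_set(&phy_access_in_progress)) {
        atomic_fetch_add(&contention_warnings, 1);
        fprintf(stderr, "warning: contention for Phy access\n");
        return false;
    }
    return true;
}

/* Clear the flag.  Call only after phy_access_begin() returned true. */
static void phy_access_end(void)
{
    atomic_flag_clear(&phy_access_in_progress);
}
```

    In this sketch a caller that gets false back must not call
    phy_access_end(), since the flag belongs to whoever set it.  Whether to
    proceed or back off after the warning is left to the caller here.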
    
    Also add debug output to help determine when the sw/fw/hw semaphore is
    owned by the firmware or hardware.
    Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
    Reported-by: Francois Romieu <romieu@fr.zoreil.com>
    Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>