Commits · ed36edf870d447b4ab7cbd217ba5c6600f20e32d · Kirill Smelkov / linux

02 Oct, 2017 12 commits

Merge branch 'bcm63xx_enet-small-fixes-and-cleanups' · ed36edf8

David S. Miller authored Oct 01, 2017

Jonas Gorski says:

====================
bcm63xx_enet: small fixes and cleanups

This patch set fixes a few theoretical issues and cleans up the code a
bit. It also adds a bit more managed function usage to simplify clock
and iomem usage.

Based on net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ed36edf8

bcm63xx_enet: remove unneeded include · 840f9223

Jonas Gorski authored Oct 01, 2017

We don't use anyhing from that file, so drop it.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

840f9223

bcm63xx_enet: drop unneeded NULL phy_clk check · 4e78e5c5

Jonas Gorski authored Oct 01, 2017

clk_disable and clk_unprepare are NULL-safe, so need to duplicate the
NULL check of the functions.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e78e5c5

bcm63xx_enet: use managed functions for clock/ioremap · 7e697ce9

Jonas Gorski authored Oct 01, 2017

Use managed functions where possible to reduce the amount of resource
handling on error and remove paths.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e697ce9

bcm63xx_enet: do not rely on probe order · 527a4871

Jonas Gorski authored Oct 01, 2017

Do not rely on the shared device being probed before the enet(sw)
devices. This makes it easier to eventually move out the shared
device as a dma controller driver (what it should be).
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

527a4871

bcm63xx_enet: do not write to random DMA channel on BCM6345 · d6213c1f

Jonas Gorski authored Oct 01, 2017

The DMA controller regs actually point to DMA channel 0, so the write to
ENETDMA_CFG_REG will actually modify a random DMA channel.

Since DMA controller registers do not exist on BCM6345, guard the write
with the usual check for dma_has_sram.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6213c1f

bcm63xx_enet: correct clock usage · 9c86b846

Jonas Gorski authored Oct 01, 2017

Check the return code of prepare_enable and change one last instance of
enable only to prepare_enable. Also properly disable and release the
clock in error paths and on remove for enetsw.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9c86b846

net-ipv6: remove unused IP6_ECN_clear() function · b80ccfe9

Maciej Żenczykowski authored Sep 26, 2017

This function is unused, and furthermore it is buggy since it suffers
from the same issue that requires IP6_ECN_set_ce() to take a pointer
to the skb so that it may (in case of CHECKSUM_COMPLETE) update skb->csum

Instead of fixing it, let's just outright remove it.

Tested: builds, and 'git grep IP6_ECN_clear' comes up empty
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b80ccfe9

ipv4: Namespaceify tcp_fastopen_blackhole_timeout knob · 3733be14

Haishuang Yan authored Sep 27, 2017

Different namespace application might require different time period in
second to disable Fastopen on active TCP sockets.

Tested:
Simulate following similar situation that the server's data gets dropped
after 3WHS.
C ---- syn-data ---> S
C <--- syn/ack ----- S
C ---- ack --------> S
S (accept & write)
C?  X <- data ------ S
	[retry and timeout]

And then print netstat of TCPFastOpenBlackhole, the counter increased as
expected when the firewall blackhole issue is detected and active TFO is
disabled.
# cat /proc/net/netstat | awk '{print $91}'
TCPFastOpenBlackhole
1
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3733be14

ipv4: Namespaceify tcp_fastopen_key knob · 43713848

Haishuang Yan authored Sep 27, 2017

Different namespace application might require different tcp_fastopen_key
independently of the host.

David Miller pointed out there is a leak without releasing the context
of tcp_fastopen_key during netns teardown. So add the release action in
exit_batch path.

Tested:
1. Container namespace:
# cat /proc/sys/net/ipv4/tcp_fastopen_key:
2817fff2-f803cf97-eadfd1f3-78c0992b

cookie key in tcp syn packets:
Fast Open Cookie
    Kind: TCP Fast Open Cookie (34)
    Length: 10
    Fast Open Cookie: 1e5dd82a8c492ca9

2. Host:
# cat /proc/sys/net/ipv4/tcp_fastopen_key:
107d7c5f-68eb2ac7-02fb06e6-ed341702

cookie key in tcp syn packets:
Fast Open Cookie
    Kind: TCP Fast Open Cookie (34)
    Length: 10
    Fast Open Cookie: e213c02bf0afbc8a
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

43713848

ipv4: Remove the 'publish' logic in tcp_fastopen_init_key_once · dd000598

Haishuang Yan authored Sep 27, 2017

The 'publish' logic is not necessary after commit dfea2aa6 ("tcp:
Do not call tcp_fastopen_reset_cipher from interrupt context"), because
in tcp_fastopen_cookie_gen，it wouldn't call tcp_fastopen_init_key_once.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd000598

ipv4: Namespaceify tcp_fastopen knob · e1cfcbe8

Haishuang Yan authored Sep 27, 2017

Different namespace application might require enable TCP Fast Open
feature independently of the host.

This patch series continues making more of the TCP Fast Open related
sysctl knobs be per net-namespace.
Reported-by: Luca BRUNO <lucab@debian.org>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1cfcbe8

01 Oct, 2017 16 commits

Merge branch 'dsa_ptr-port' · 506d0a3e

David S. Miller authored Oct 01, 2017

Vivien Didelot says:

====================
net: dsa: change dsa_ptr for a dsa_port

With DSA, a master net_device is physically wired to a dedicated CPU
switch port. For interaction with the DSA layer, the struct net_device
contains a dsa_ptr, which currently points to a dsa_switch_tree object.

This is only valid for a switch fabric with a single CPU port. In order
to support switch fabrics with multiple CPU ports, we first need to
change the type of dsa_ptr to what it really is: a dsa_port object.

This is what this patchset does. The first patches adds a
dsa_master_get_slave helper and cleans up portions of DSA core to make
the next patches more readable. These next patches prepare the xmit and
receive hot paths and finally change dsa_ptr.

Changes in v2:
  - introduce dsa_master_get_slave helper to simplify patch 6
  - keep hot path data at beginning of dsa_port for cacheline 1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

506d0a3e

net: dsa: remove tag ops from the switch tree · aa193d9b

Vivien Didelot authored Sep 29, 2017

Now that the dsa_ptr is a dsa_port instance, there is no need to keep
the tag operations in the dsa_switch_tree structure. Remove it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa193d9b

net: dsa: change dsa_ptr for a dsa_port · 2f657a60

Vivien Didelot authored Sep 29, 2017

With DSA, a master net device (CPU facing interface) has a dsa_ptr
pointer to which hangs a dsa_switch_tree. This is not correct because a
master interface is wired to a dedicated switch port, and because we can
theoretically have several master interfaces pointing to several CPU
ports of the same switch fabric.

Change the master interface's dsa_ptr for the CPU dsa_port pointer.
This is a step towards supporting multiple CPU ports.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f657a60

net: dsa: prepare master receive hot path · 3e41f93b

Vivien Didelot authored Sep 29, 2017

In preparation to make DSA master devices point to their corresponding
CPU port instead of the whole tree, add copies of dst and rcv in the
dsa_port structure so that we keep fast access in the receive hot path.

Also keep the copies at the beginning of the dsa_port structure in order
to ensure they are available in cacheline 1.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3e41f93b

net: dsa: add tagging ops to port · 15240248

Vivien Didelot authored Sep 29, 2017

The DSA tagging protocol operations are specific to each CPU port,
thus the dsa_device_ops pointer belongs to the dsa_port structure.

>From now on assign a slave's xmit copy from its CPU port tagging
operations. This will ease the future support for multiple CPU ports.

Also keep the tag_ops at the beginning of the dsa_port structure so that
we ensure copies for hot path are in cacheline 1.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15240248

net: dsa: use temporary dsa_device_ops variable · 62fc9587

Vivien Didelot authored Sep 29, 2017

When resolving the DSA tagging protocol used by a CPU switch, use a
temporary "tag_ops" variable to store the dsa_device_ops instead of
using directly dst->tag_ops. This will make the future patches moving
this pointer around easier to read.

There is no functional changes.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

62fc9587

net: dsa: use cpu_dp in master code · 7ec764ee

Vivien Didelot authored Sep 29, 2017

Make it clear that the master device is linked to a CPU port by using
"cpu_dp" for the dsa_port variable in master.c instead of "port", then
use a "port" variable to describe the port index, as usually seen in
other places of DSA core.

This will make the future patch touching dsa_ptr more readable. There is
no functional changes.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7ec764ee

net: dsa: add master helper to look up slaves · 3775b1b7

Vivien Didelot authored Sep 29, 2017

The DSA tagging code does not need to know about the DSA architecture,
it only needs to return the slave device corresponding to the source
port index (and eventually the source device index for cascade-capable
switches) parsed from the frame received on the master device.

For this purpose, provide an inline dsa_master_get_slave helper which
validates the device and port indexes and look up the slave device.

This makes the tagging rcv functions more concise and robust, and also
makes dsa_get_cpu_port obsolete.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3775b1b7

net: hns3: fix null pointer dereference before null check · 075cfdd6

Colin Ian King authored Sep 29, 2017

pointer ndev is being dereferenced with the call to netif_running
before it is being null checked. Re-order the code to only dereference
ndev after it has been null checked.

Detected by CoverityScan, CID#1457206 ("Dereference before null check")

Fixes: 9df8f79a ("net: hns3: Add DCB support when interacting with network stack")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

075cfdd6

hv_netvsc: report stop_queue and wake_queue · 09af87d1

Simon Xiao authored Sep 29, 2017

Report the numbers of events for stop_queue and wake_queue in
ethtool stats.

Example:
ethtool -S eth0
NIC statistics:
	...
	stop_queue: 7
	wake_queue: 7
	...
Signed-off-by: Simon Xiao <sixiao@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09af87d1

bpf: Fix compiler warning on info.map_ids for 32bit platform · 721e08da

Martin KaFai Lau authored Sep 29, 2017

This patch uses u64_to_user_ptr() to cast info.map_ids to a userspace ptr.
It also tags the user_map_ids with '__user' for sparse check.

Fixes: cb4d2b3f ("bpf: Add name, load_time, uid and map_ids to bpf_prog_info")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

721e08da

net_sched: remove redundant assignment to ret · b1c49d14

Colin Ian King authored Sep 29, 2017

The assignment of -EINVAL to variable ret is redundant as it
is being overwritten on the following error exit paths or
to the return value from the following call to basic_set_parms.
Fix this up by removing it. Cleans up clang warning message:

net/sched/cls_basic.c:185:2: warning: Value stored to 'err' is never read

Fixes: 1d8134fe ("net_sched: use idr to allocate basic filter handles")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1c49d14

net: ipmr: make function ipmr_notifier_init static · ef739d8a

Colin Ian King authored Sep 29, 2017

The function ipmr_notifier_init is local to the source and does
not need to be in global scope, so make it static.

Cleans up sparse warning:
warning: symbol 'ipmr_notifier_init' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef739d8a

ibmvnic: Set state UP · e876a8a7

Mick Tarsel authored Sep 28, 2017

State is initially reported as UNKNOWN. Before register call
netif_carrier_off(). Once the device is opened, call netif_carrier_on() in
order to set the state to UP.
Signed-off-by: Mick Tarsel <mjtarsel@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e876a8a7

Revert "net: dsa: bcm_sf2: Defer port enabling to calling port_enable" · 21a2774e

Florian Fainelli authored Sep 28, 2017

This reverts commit e85ec74a ("net: dsa: bcm_sf2: Defer port
enabling to calling port_enable") because this now makes an unbind
followed by a bind to fail connecting to the ingrated PHY.

What this patch missed is that we need the PHY to be enabled with
bcm_sf2_gphy_enable_set() before probing it on the MDIO bus. This is
correctly done in the ops->setup() function, but by the time
ops->port_enable() runs, this is too late. Upon unbind we would power
down the PHY, and so when we would bind again, the PHY would be left
powered off.

Fixes: e85ec74a ("net: dsa: bcm_sf2: Defer port enabling to calling port_enable")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21a2774e

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · a6992ebe

David S. Miller authored Oct 01, 2017

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-09-29

This series contains updates to i40e and i40evf only.

Jake provides several of the changes starting with the renaming of a
variable to clarify what the value is actually calculating. Found we
were misusing the __I40E_RECOVERY_PENDING bit to determine when we
should actually request a new IRQ in i40e_setup_misc_vector(), which
lead to a design mistake, so to resolve the issue, use a separate
state bit for miscellaneous IRQ setup and fix up the design while we
are at it.  Cleaned up the old legacy PM support in the driver since
we support the newer generic PM callbacks.  Fixed a failure to
hibernate issue, where on some platforms with a large number of CPUs,
we would allocate many IRQ vectors which we would try to migrate to
CPU0 when hibernating.

Sudheer cleans up a check for unqualified module inside i40e_up_complete()
because the link state information is in flux at time, so log messages
are getting logged with incorrect link state information.  Also provided
additional log message cleanups and simplify member variable access in
the printing of the link messages.

Mariusz relaxes the firmware check since Fortville and Fort Park NICs
can and do have different firmware versions, so only warn for older
Fortville firmware.  Fixed an errata with a flow director statistic that
was not wrapping as expected, simply reset after reading.

Mitch prevents consternation by lowering the log level to debug on a
message seen regularly on VF reset or unload, which is meaningless under
normal circumstances.  Refactor the firmware version checking since
Fortville and Fort Park devices can have different firmware versions.

Alan fixes a ring to vector mapping, where the past implementation
attempted to map each Tx and Rx ring to its own vector, however we use
combined queues so we should be mapping the Tx/Rx rings together on one
vector.  Adds the ability for the VF to request a different number of
queues allocated to it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a6992ebe

30 Sep, 2017 3 commits

mkiss: remove redundant check on len being zero · 45c1fd61

Colin Ian King authored Sep 27, 2017

The check on len is redundant as it is always greater than 1,
so just remove it and make the printk less complex.

Detected by CoverityScan, CID#1226729 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

45c1fd61

net-ipv6: add support for sockopt(SOL_IPV6, IPV6_FREEBIND) · 84e14fe3

Maciej Żenczykowski authored Sep 26, 2017

So far we've been relying on sockopt(SOL_IP, IP_FREEBIND) being usable
even on IPv6 sockets.

However, it turns out it is perfectly reasonable to want to set freebind
on an AF_INET6 SOCK_RAW socket - but there is no way to set any SOL_IP
socket option on such a socket (they're all blindly errored out).

One use case for this is to allow spoofing src ip on a raw socket
via sendmsg cmsg.

Tested:
  built, and booted
  # python
  >>> import socket
  >>> SOL_IP = socket.SOL_IP
  >>> SOL_IPV6 = socket.IPPROTO_IPV6
  >>> IP_FREEBIND = 15
  >>> IPV6_FREEBIND = 78
  >>> s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, 0)
  >>> s.getsockopt(SOL_IP, IP_FREEBIND)
  0
  >>> s.getsockopt(SOL_IPV6, IPV6_FREEBIND)
  0
  >>> s.setsockopt(SOL_IPV6, IPV6_FREEBIND, 1)
  >>> s.getsockopt(SOL_IP, IP_FREEBIND)
  1
  >>> s.getsockopt(SOL_IPV6, IPV6_FREEBIND)
  1
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

84e14fe3

net: ipv6: send NS for DAD when link operationally up · 1f372c7b

Mike Manning authored Sep 25, 2017

The NS for DAD are sent on admin up as long as a valid qdisc is found.
A race condition exists by which these packets will not egress the
interface if the operational state of the lower device is not yet up.
The solution is to delay DAD until the link is operationally up
according to RFC2863. Rather than only doing this, follow the existing
code checks by deferring IPv6 device initialization altogether. The fix
allows DAD on devices like tunnels that are controlled by userspace
control plane. The fix has no impact on regular deployments, but means
that there is no IPv6 connectivity until the port has been opened in
the case of port-based network access control, which should be
desirable.
Signed-off-by: Mike Manning <mmanning@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f372c7b

29 Sep, 2017 9 commits

i40e: refactor FW version checking · 22b96551

Mitch Williams authored Jul 14, 2017

The i40e driver now supports two different devices with two different
firmware versions. So be smart about how we handle these. Move the FW
version macros to the appropriate header file, and add a convenience
macro that checks the version based on the device. Then use this macro
to check whether or not the driver can use the new link info API.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

22b96551

i40e: Enable VF to negotiate number of allocated queues · a3f5aa90

Alan Brady authored Jul 14, 2017

Currently the PF allocates a default number of queues for each VF and
cannot be changed.  This patch enables the VF to request a different
number of queues allocated to it.  This patch also adds a new virtchnl
op and capability flag to facilitate this negotiation.

After the PF receives a request message, it will set a requested number
of queues for that VF.  Then when the VF resets, its VSI will get a new
number of queues allocated to it.

This is a best effort request and since we only allocate a guaranteed
default number, if the VF tries to ask for more than the guaranteed
number, there may not be enough in HW to accommodate it unless other
queues for other VFs are freed. It should also be noted decreasing the
number queues allocated to a VF to below the default will NOT enable the
allocation of more than 32 VFs per PF and will not free queues guaranteed
to each VF by default.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a3f5aa90

i40evf: fix ring to vector mapping · c97fc9b6

Alan Brady authored Jul 14, 2017

The current implementation for mapping queues to vectors is broken
because it attempts to map each Tx and Rx ring to its own vector,
however we use combined queues so we should actually be mapping the
Tx/Rx rings together on one vector.

Also in the current implementation, in the case where we have more
queues than vectors, we attempt to group the queues together into
'chunks' and map each 'chunk' of queues to a vector.  Chunking them
together would be more ideal if, and only if, we only had RSS because of
the way the hashing algorithm works but in the case of a future patch
that enables VF ADq, round robin assignment is better and still works
with RSS.

This patch resolves both those issues and simplifies the code needed to
accomplish this.  Instead of treating the case where we have more queues
than vectors as special, if we notice our vector index is greater than
vectors, reset the vector index to zero and continue mapping.  This
should ensure that in both cases, whether we have enough vectors for
each queue or not, the queues get appropriately mapped.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c97fc9b6

i40e: shutdown all IRQs and disable MSI-X when suspended · b980c063

Jacob Keller authored Jul 14, 2017

On some platforms with a large number of CPUs, we will allocate many IRQ
vectors. When hibernating, the system will attempt to migrate all of the
vectors back to CPU0 when shutting down all the other CPUs. It is
possible that we have so many vectors that it cannot re-assign them to
CPU0. This is even more likely if we have many devices installed in one
platform.

The end result is failure to hibernate, as it is not possible to
shutdown the CPUs. We can avoid this by disabling MSI-X and clearing our
interrupt scheme when the device is suspended. A more ideal solution
would be some method for the stack to properly handle this for all
drivers, rather than on a case-by-case basis for each driver to fix
itself.

However, until this more ideal solution exists, we can do our part and
shutdown our IRQs during suspend, which should allow systems with
a large number of CPUs to safely suspend or hibernate.

It may be worth investigating if we should shut down even further when
we suspend as it may make the path cleaner, but this was the minimum fix
for the hibernation issue mentioned here.

Testing-hints:
  This affects systems with a large number of CPUs, and with multiple
  devices enabled. Without this change, those platforms are unable to
  hibernate at all.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b980c063

i40e: prevent service task from running while we're suspended · 5c499228

Jacob Keller authored Jul 14, 2017

Although the service task does check the suspended status before
running, it might already be part way through running when we go to
suspend. Lets ensure that the service task is stopped and will not be
restarted again until we finish resuming. This ensures that service task
code does not cause strange interactions with the suspend/resume
handlers.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

5c499228

i40e: don't clear suspended state until we finish resuming · 401586c2

Jacob Keller authored Jul 14, 2017

When handling suspend and resume callbacks we want to make sure that (a)
we don't suspend again if we're already suspended and (b) we don't
resume again if we're already resuming. Lets make sure we test_and_set
the __I40E_SUSPENDED bit in i40e_suspend which ensures that a suspend
call when already suspended will exit early. Additionally, if
__I40E_SUSPENDED is not set when we begin resuming, exit early as well.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

401586c2

i40e: use newer generic PM support instead of legacy PM callbacks · 0e5d3da4

Jacob Keller authored Jul 14, 2017

Stop using the old legacy PM support, since we now have stable support
for the newer generic PM callbacks.

This has several advantages. First, we no longer have to manage our
own pci_save_state() and power changes, as it's preferred to have the
PCI stack do this. Second, these routines get called for both hibernate
and suspend to ram, so we can have the driver properly handle all the
suspend/resume flows that it needs to.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

0e5d3da4

i40e: use separate state bit for miscellaneous IRQ setup · c17401a1

Jacob Keller authored Jul 14, 2017

We currently (mis)use the __I40E_RECOVERY_PENDING bit to determine when
we should actually request a new IRQ in i40e_setup_misc_vector().

This led to a design mistake where we open-coded the re-setup of the
miscellaneous vector in i40e_resume() instead of using the function
provided. If we did not open-code this and instead tried to use the
i40e_setup_misc_vector() function, it would lead to never reallocating
the IRQ.

This would lead to a second i40e_suspend() call failing to free the
vector due to a NULL pointer dereference.

A future patch is going to re-work how the i40e_suspend() and
i40e_resume() flows work to clear all IRQ vectors, which would require
us to use i40e_setup_misc_vector() directly. Since during this time the
__I40E_RECOVERY_PENDING bit is set, we'll never re-allocate the vector.

Rather than leaving the open-coded setup in i40e_resume() lets just fix
the problem properly in i40e_setup_misc_vector().

Introduce a new state bit which indicates when the IRQ has been
assigned, which will be set when i40e_setup_misc_vector is first called.
This ultimately resolves the issue of re-requesting the vector, without
overloading the __I40E_RECOVERY_PENDING state. This ensures that the
suspend/resume cycle can use the setup function instead of open-coding
the re-request during resume.

Additionally, since the only callers of i40e_stop_misc_vector also want
to free it, move this code directly into the function to avoid
duplication. Due to the new functionality, rename it to
i40e_free_misc_vector().

This lets us drop the extra calls to free and re-enable the vector
during i40e_suspend() and i40e_resume(). We don't need to call
i40e_setup_misc_Vector() in i40e_resume() because it gets called by the
i40e_rebuild() call.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c17401a1

i40evf: lower message level · 905770fa

Mitch Williams authored Jul 14, 2017

We see this message regularly on VF reset or unload (which invokes a
reset). It's essentially meaningless unless it's happening constantly.
To prevent consternation, lower the log level to debug so it's not seen
under normal circumstance.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

905770fa