Commits · 0cead3a6a1abd1f1656e302d876e985aadf8d8d2 · Kirill Smelkov / linux

08 Jan, 2017 12 commits

net: netcp: ethss: get phy-handle only if link interface is MAC-to-PHY · 0cead3a6

Karicheri, Muralidharan authored Jan 06, 2017

Currently to parse phy-handle, driver doesn't check if the interface is
MAC to PHY. This patch add this check for all MAC to PHY interface types
supported by the driver.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0cead3a6

net: netcp: store network statistics in 64 bits · 6a8162e9

Michael Scherban authored Jan 06, 2017

Previously the network statistics were stored in 32 bit variable
which can cause some stats to roll over after several minutes of
high traffic. This implements 64 bit storage so larger numbers
can be stored.
Signed-off-by: Michael Scherban <m-scherban@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a8162e9

net: netcp: remove the redundant memmov() · aa255101

Karicheri, Muralidharan authored Jan 06, 2017

The psdata is populated with command data by netcp modules
to the tail of the buffer and set_words() copy the same
to the front of the psdata. So remove the redundant memmov
function call.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa255101

net: netcp: extract eflag from desc for rx_hook handling · 69d707d0

Karicheri, Muralidharan authored Jan 06, 2017

Extract the eflag bits from the received desc and pass it down
the rx_hook chain to be available for netcp modules. Also the
psdata and epib data has to be inspected by the netcp modules.
So the desc can be freed only after returning from the rx_hook.
So move knav_pool_desc_put() after the rx_hook processing.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

69d707d0

Merge branch 'cpsw-cpdma-DDR' · b14ad90c

David S. Miller authored Jan 07, 2017

Grygorii Strashko says:

====================
net: ethernet: ti: cpsw: support placing CPDMA descriptors into DDR

This series intended to add support for placing CPDMA descriptors into
DDR by introducing new module parameter "descs_pool_size" to specify
size of descriptor's pool. The "descs_pool_size" defines total number
of CPDMA CPPI descriptors to be used for both ingress/egress packets
processing. If not specified - the default value 256 will be used
which will allow to place descriptor's pool into the internal CPPI
RAM.

In addition, added ability to re-split CPDMA pool of descriptors
between RX and TX path via ethtool '-G' command wich will allow to
configure and fix number of descriptors used by RX and TX path, which,
then, will be split between RX/TX channels proportionally depending on
number of RX/TX channels and its weight.

This allows significantly to reduce UDP packets drop rate for
bandwidth >301 Mbits/sec (am57x).

Before enabling this feature, the am437x SoC has to be fixed as it's
proved that it's not working when CPDMA descriptors placed in DDR.
So, the patch 1 fixes this issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b14ad90c

Documentation: DT: net: cpsw: remove no_bd_ram property · c40d8883

Grygorii Strashko authored Jan 06, 2017

Even if no_bd_ram property is described in TI CPSW bindings the support for
it has never been introduced in CPSW driver, so there are no real users of
it. Hence, remove no_bd_ram property from documentation and DT files.

Cc: 'Rob Herring <robh+dt@kernel.org>'
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c40d8883

net: ethernet: ti: cpsw: add support for ringparam configuration · be034fc1

Grygorii Strashko authored Jan 06, 2017

The CPDMA uses one pool of descriptors for both RX and TX which by default
split between all channels proportionally depending on total number of
CPDMA channels and number of TX and RX channels. As result, more
descriptors will be consumed by TX path if there are more TX channels and
there is no way now to dedicate more descriptors for RX path.

So, add the ability to re-split CPDMA pool of descriptors between RX and TX
path via ethtool '-G' command wich will allow to configure and fix number
of descriptors used by RX and TX path, which, then, will be split between
RX/TX channels proportionally depending on RX/TX channels number and
weight. ethtool '-G' command will accept only number of RX entries and rest
of descriptors will be arranged for TX automatically.

Command:
  ethtool -G <devname> rx <number of descriptors>

defaults and limitations:
- minimum number of rx descriptors is 10% of total number of descriptors in
  CPDMA pool
- maximum number of rx descriptors is 90% of total number of descriptors in
  CPDMA pool
- by default, descriptors will be split equally between RX/TX path
- any values passed in "tx" parameter will be ignored

Usage:

 # ethtool -g eth0
	Pre-set maximums:
	RX:             7372
	RX Mini:        0
	RX Jumbo:       0
	TX:             0
	Current hardware settings:
	RX:             4096
	RX Mini:        0
	RX Jumbo:       0
	TX:             4096

 # ethtool -G eth0 rx 7372
 # ethtool -g eth0
	Ring parameters for eth0:
	Pre-set maximums:
	RX:             7372
	RX Mini:        0
	RX Jumbo:       0
	TX:             0
	Current hardware settings:
	RX:             7372
	RX Mini:        0
	RX Jumbo:       0
	TX:             820
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be034fc1

net: ethernet: ti: cpsw: add support for descs pool size configuration · 90225bf0

Grygorii Strashko authored Jan 06, 2017

The CPSW CPDMA can process buffer descriptors placed as in internal
CPPI RAM as in DDR. This patch adds support in CPSW and CPDMA for
descs_pool_size mudule parameter, which defines total number of CPDMA CPPI
descriptors to be used for both ingress/egress packets processing:
 - memory size, required for CPDMA descriptor pool, is calculated basing
on number of descriptors specified by user in descs_pool_size and
CPDMA descriptor size and allocated from coherent memory (CMA area);
 - CPDMA descriptor pool will be allocated in DDR if pool memory size >
internal CPPI RAM or use internal CPPI RAM otherwise;
 - if descs_pool_size not specified in DT - the default value 256 will
be used which will allow to place CPDMA descriptors pool into the
internal CPPI RAM (current default behaviour);
 - CPDMA will ignore descs_pool_size if descs_pool_size = 0 for
backward comaptiobility with davinci_emac.

descs_pool_size is boot time setting and can't be changed once
CPSW/CPDMA is initialized.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

90225bf0

net: ethernet: ti: cpdma: use devm_ioremap · 7f3b490a

Grygorii Strashko authored Jan 06, 2017

Use devm_ioremap() and simplify the code.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7f3b490a

net: ethernet: ti: cpdma: minimize number of parameters in cpdma_desc_pool_create/destroy() · 5fcc40a9

Grygorii Strashko authored Jan 06, 2017

Update cpdma_desc_pool_create/destroy() to accept only one parameter
struct cpdma_ctlr*, as this structure contains all required
information for pool creation/destruction.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5fcc40a9

net: ethernet: ti: cpdma: fix desc re-queuing · 12a303e3

Grygorii Strashko authored Jan 06, 2017

The currently processing cpdma descriptor with EOQ flag set may
contain two values in Next Descriptor Pointer field:
- valid pointer: means CPDMA missed addition of new desc in queue;
- null: no more descriptors in queue.
In the later case, it's not required to write to HDP register, but now
CPDMA does it.

Hence, add additional check for Next Descriptor Pointer != null in
cpdma_chan_process() function before writing in HDP register.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12a303e3

net: ethernet: ti: cpdma: am437x: allow descs to be plased in ddr · a6c83ccf

Grygorii Strashko authored Jan 06, 2017

It's observed that cpsw/cpdma is not working properly when CPPI
descriptors are placed in DDR instead of internal CPPI RAM on am437x
SoC:
- rx/tx silently stops processing packets;
- or - after boot it's working for sometime, but stuck once Network
load is increased (ping is working, but iperf is not).
(The same issue has not been reproduced on am335x and am57xx).

It seems that write to HDP register processed faster by interconnect
than writing of descriptor memory buffer in DDR, which is probably
caused by store buffer / write buffer differences as these functions
are implemented differently across devices. So, to fix this i come up
with two minimal, required changes:

1) all accesses to the channel register HDP/CP/RXFREE registers should
be done using sync IO accessors readl()/writel(), because all previous
memory writes writes have to be completed before starting channel
(write to HDP) or completing desc processing.

2) the change 1 only doesn't work on am437x and additional reading of
desc's field is required right after the new descriptor was filled
with data and before pointer on it will be stored in
prev_desc->hw_next field or HDP register.

In addition, to above changes this patch eliminates all relaxed ordering
I/O accessors in this driver as suggested by David Miller to avoid such
kind of issues in the future, but with one exception - relaxed IO accessors
will still be used to fill desc in cpdma_chan_submit(), which is safe as
there is read barrier at the end of write sequence, and because sync IO
accessors usage here will affect on net performance.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6c83ccf

07 Jan, 2017 9 commits

Merge branch 'l2tp-cleanup-socket-lookup-code' · 350a4718

David S. Miller authored Jan 06, 2017

Guillaume Nault says:

====================
l2tp: cleanup socket lookup code in l2tp_ip and l2tp_ip6

First three patches remove redundant tests and add missing "const"
qualifiers.

Fourth patch splits the conditionals found in __l2tp_ip*_bind_lookup(),
to make these functions easier to review. In the process, I found that
some corner cases were still not handled properly. So I've added the
missing tests in this patch too, because they're pretty simple and the
whole "if" statements are modified anyway.

I expect it to be easier to review this way. If not, I can split up
patch #4, post the missing tests separately to -net, and later repost
this series as pure cleanup. Just let me know.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

350a4718

l2tp: rework socket comparison in __l2tp_ip*_bind_lookup() · c5fdae04

Guillaume Nault authored Jan 06, 2017

Split conditions, so that each test becomes clearer.

Also, for l2tp_ip, check if "laddr" is 0. This prevents a socket from
binding to the unspecified address when other sockets are already bound
using the same device (if any), connection ID and namespace.

Same thing for l2tp_ip6: add ipv6_addr_any(laddr) and
ipv6_addr_any(raddr) tests to ensure that an IPv6 unspecified address
passed as parameter is properly treated a wildcard.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5fdae04

l2tp: remove useless NULL check in __l2tp_ip*_bind_lookup() · 986f7cbc

Guillaume Nault authored Jan 06, 2017

If "l2tp" was NULL, that'd mean "sk" is NULL too. This can't happen
since "sk" is returned by sk_for_each_bound().
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

986f7cbc

l2tp: make __l2tp_ip*_bind_lookup() parameters 'const' · bb39b0bd

Guillaume Nault authored Jan 06, 2017

Add const qualifier wherever possible for __l2tp_ip_bind_lookup() and
__l2tp_ip6_bind_lookup().
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb39b0bd

l2tp: remove redundant addr_len check in l2tp_ip_bind() · 8cf2f704

Guillaume Nault authored Jan 06, 2017

addr_len's value has already been verified at this point.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

8cf2f704

RDS: validate the requested traces user input against max supported · 780e9829

santosh.shilimkar@oracle.com authored Jan 06, 2017

Larger than supported value can lead to array read/write overflow.
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

780e9829

Merge tag 'rxrpc-rewrite-20170106' of... · 1c966c5d

David S. Miller authored Jan 06, 2017

Merge tag 'rxrpc-rewrite-20170106' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
afs: Implement bulk read

This pair of patches implements bulk data reading from an AFS server.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1c966c5d

sctp: prepare asoc stream for stream reconf · a8386317

Xin Long authored Jan 06, 2017

sctp stream reconf, described in RFC 6525, needs a structure to
save per stream information in assoc, like stream state.

In the future, sctp stream scheduler also needs it to save some
stream scheduler params and queues.

This patchset is to prepare the stream array in assoc for stream
reconf. It defines sctp_stream that includes stream arrays inside
to replace ssnmap.

Note that we use different structures for IN and OUT streams, as
the members in per OUT stream will get more and more different
from per IN stream.

v1->v2:
  - put these patches into a smaller group.
v2->v3:
  - define sctp_stream to contain stream arrays, and create stream.c
    to put stream-related functions.
  - merge 3 patches into 1, as new sctp_stream has the same name
    with before.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8386317

udp: inuse checks can quit early for reuseport · df560056

Eric Garver authored Jan 05, 2017

UDP lib inuse checks will walk the entire hash bucket to check if the
portaddr is in use. In the case of reuseport we can stop searching when
we find a matching reuseport.

On a 16-core VM a test program that spawns 16 threads that each bind to
1024 sockets (one per 10ms) takes 1m45s. With this change it takes 11s.

Also add a cond_resched() when the port is not specified.
Signed-off-by: Eric Garver <e@erig.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

df560056

06 Jan, 2017 19 commits

cxgb4: Add port description for new cards. · 89eb9835

Ganesh Goudar authored Jan 06, 2017

Add port description for 25G and 100G cards, and also
change few port descriptions in compliance with the new
naming convention.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

89eb9835

cxgb4/cxgb4vf: Display 25G and 100G link speed · 5e78f7fd

Ganesh Goudar authored Jan 06, 2017

Add support to report 25G and 100G links, which was missed
as part of commit "eb97ad99".
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e78f7fd

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · c5e70991

David S. Miller authored Jan 06, 2017

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2017-01-06

This series contains updates/fixes to igb and e1000e.

Joe fixes indentation and improper line wrapping in igb.

David Singleton fixes an issue in e1000e where in systemd, where things
are done in parallel and can create a condition where e1000_shutdown is
called after e1000_close, hitting BUG_ON assert in free_msi_irqs.

Cao Jin fixes a code comment on the wakeup status register.  Also fixes
a possible NULL pointer dereference by using igb_adapter->io_addr
instead of e1000_hw->hw_addr in igb_configure_tx_ring().

Chris Arges works around a firmware issue, which can cause probe of i210
NIC to fail, so zero the page select register during igb_get_phy_id() to
workaround the issue.  Aaron Sierra adds also a check for this issue
during the initialization of PHY parameters to ensure that this same
issue happens after probe.

Todd fixes a possible race condition in close/suspend by extending
the rtnl_lock() to protect the call to netif_device_detach() and
igb_clear_interrupt_scheme().  Also adds i211 to a known i210/i211
workaround.

Hannu Lounento fixes inverted logic on a debug statement.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c5e70991

net: ipv4: make fib_select_default static · c7b371e3

David Ahern authored Jan 05, 2017

fib_select_default has a single caller within the same file.
Make it static.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7b371e3

net: ipv4: Simplify rt_fill_info · 0c8d803f

David Ahern authored Jan 05, 2017

rt_fill_info has only 1 caller and both of the last 2 args -- nowait
and flags -- are hardcoded to 0. Given that remove them as input arguments
and simplify rt_fill_info accordingly.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c8d803f

cxgb4: Synchronize access to mailbox · 4055ae5e

Hariprasad Shenai authored Jan 06, 2017

The issue comes when there are multiple threads attempting to use
the mailbox facility at the same time.
When DCB operations and interface up/down is run in a loop for every
0.1 sec, we observed mailbox collisions. And out of the two commands
one would fail with the present code, since we don't queue the second
command.

To overcome the above issue, added a queue to access the mailbox.
Whenever a mailbox command is issued add it to the queue. If its at
the head issue the mailbox command, else wait for the existing command
to complete. Usually command takes less than a milli-second to
complete.

Also timeout from the loop, if the command under execution takes
long time to run.

In reality, the number of mailbox access collisions is going to be
very rare since no one runs such abusive script.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4055ae5e

net: dsa: b53: Utilize common helpers for u64/MAC · 4b92ea81

Florian Fainelli authored Jan 05, 2017

Utilize the two functions recently introduced: u64_to_ether() and
ether_to_u64() instead of our own versions.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b92ea81

net: dsa: remove version string · 7558828a

Vivien Didelot authored Jan 05, 2017

The dsa_driver_version string is irrelevant and has not been bumped
since its introduction about 9 years ago. Kill it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7558828a

liquidio: fix wrong information about channels reported to ethtool · 026b471b

Weilin Chang authored Jan 04, 2017

Information reported to ethtool about channels is sometimes wrong for PF,
and always wrong for VF.  Fix them by getting the information from the
right fields from the right structs.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

026b471b

ipv6: do not send RTM_DELADDR for tentative addresses · f784ad3d

Mahesh Bandewar authored Jan 04, 2017

RTM_NEWADDR notification is sent when IFA_F_TENTATIVE is cleared from
the address. So if the address is added and deleted before DAD probes
completes, the RTM_DELADDR will be sent for which there was no
RTM_NEWADDR causing asymmetry in notification. However if the same
logic is used while sending RTM_DELADDR notification, this asymmetry
can be avoided.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
CC: Patrick McHardy <kaber@trash.net>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f784ad3d

liquidio VF: fix incorrect struct being used · 90028b07

Prasad Kanneganti authored Jan 04, 2017

The VF driver is using the wrong struct when sending commands to the NIC
firmware, sometimes causing adverse effects in the firmware. The right
struct is the one that the PF is using, so make the VF use that as well.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

90028b07

afs: Make afs_readpages() fetch data in bulk · 91b467e0

David Howells authored Jan 05, 2017

Make afs_readpages() use afs_vnode_fetch_data()'s new ability to take a
list of pages and do a bulk fetch.
Signed-off-by: David Howells <dhowells@redhat.com>

91b467e0

afs: Make afs_fs_fetch_data() take a list of pages · 196ee9cd

David Howells authored Jan 05, 2017

Make afs_fs_fetch_data() take a list of pages for bulk data transfer.  This
will allow afs_readpages() to be made more efficient.
Signed-off-by: David Howells <dhowells@redhat.com>

196ee9cd

igb: Fix hw_dbg logging in igb_update_flash_i210 · 76ed5a8f

Hannu Lounento authored Jan 02, 2017

Fix an if statement with hw_dbg lines where the logic was inverted with
regards to the corresponding return value used in the if statement.
Signed-off-by: Hannu Lounento <hannu.lounento@ge.com>
Signed-off-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

76ed5a8f

igb: add i211 to i210 PHY workaround · 5bc8c230

Todd Fujinaka authored Nov 28, 2016

i210 and i211 share the same PHY but have different PCI IDs. Don't
forget i211 for any i210 workarounds.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

5bc8c230

igb: close/suspend race in netif_device_detach · 9474933c

Todd Fujinaka authored Nov 15, 2016

Similar to ixgbe, when an interface is part of a namespace it is
possible that igb_close() may be called while __igb_shutdown() is
running which ends up in a double free WARN and/or a BUG in
free_msi_irqs().

Extend the rtnl_lock() to protect the call to netif_device_detach() and
igb_clear_interrupt_scheme() in __igb_shutdown() and check for
netif_device_present() to avoid calling igb_clear_interrupt_scheme() a
second time in igb_close().

Also extend the rtnl lock in igb_resume() to netif_device_attach().
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9474933c

igb: re-assign hw address pointer on reset after PCI error · 69b97cf6

Guilherme G Piccoli authored Nov 10, 2016

Whenever the igb driver detects the result of a read operation returns
a value composed only by F's (like 0xFFFFFFFF), it will detach the
net_device, clear the hw_addr pointer and warn to the user that adapter's
link is lost - those steps happen on igb_rd32().

In case a PCI error happens on Power architecture, there's a recovery
mechanism called EEH, that will reset the PCI slot and call driver's
handlers to reset the adapter and network functionality as well.

We observed that once hw_addr is NULL after the error is detected on
igb_rd32(), it's never assigned back, so in the process of resetting
the network functionality we got a NULL pointer dereference in both
igb_configure_tx_ring() and igb_configure_rx_ring(). In order to avoid
such bug, this patch re-assigns the hw_addr value in the slot_reset
handler.
Reported-by: Anthony H Thai <ahthai@us.ibm.com>
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

69b97cf6

igb: reset the PHY before reading the PHY ID · 18278533

Aaron Sierra authored Nov 29, 2016

Several people have reported firmware leaving the I210/I211 PHY's page
select register set to something other than the default of zero. This
causes the first accesses, PHY_IDx register reads, to access something
else, resulting in device probe failure:

    igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
    igb: Copyright (c) 2007-2014 Intel Corporation.
    igb: probe of 0000:01:00.0 failed with error -2

This problem began for them after a previous patch I submitted was
applied:

    commit 2a3cdead
    Author: Aaron Sierra <asierra@xes-inc.com>
    Date:   Tue Nov 3 12:37:09 2015 -0600

        igb: Remove GS40G specific defines/functions

I personally experienced this problem after attempting to PXE boot from
I210 devices using this firmware:

    Intel(R) Boot Agent GE v1.5.78
    Copyright (C) 1997-2014, Intel Corporation

Resetting the PHY before reading from it, ensures the page select
register is in its default state and doesn't make assumptions about
the PHY's register set before the PHY has been probed.

Cc: Matwey V. Kornilov <matwey@sai.msu.ru>
Cc: Chris Arges <carges@vectranetworks.com>
Cc: Jochen Henneberg <jh@henneberg-systemdesign.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Tested-by: Matwey V. Kornilov <matwey@sai.msu.ru>
Tested-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

18278533

igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr · 629823b8

Cao jin authored Nov 08, 2016

When running as guest, under certain condition, it will oops as following.
writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
is NULL. While other register access won't oops kernel because they use
wr32/rd32 which have a defense against NULL pointer.

    [  141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
    error received: id=0101
    [  141.225523] igb 0000:01:00.1: PCIe Bus Error:
    severity=Uncorrected (Fatal), type=Unaccessible,
    id=0101(Unregistered Agent ID)
    [  141.299442] igb 0000:01:00.1: broadcast error_detected message
    [  141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
    detached
    [  141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
    detached
    [  143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
    [  143.465994] igb 0000:01:00.1: broadcast slot_reset message
    [  143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
    [  144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
    [  145.312078] igb 0000:01:00.1: broadcast resume message
    [  145.322211] BUG: unable to handle kernel paging request at
    0000000000003818
    [  145.361275] IP: [<ffffffffa02fd38d>]
    igb_configure_tx_ring+0x14d/0x280 [igb]
    [  145.400048] PGD 0
    [  145.438007] Oops: 0002 [#1] SMP

A similar issue & solution could be found at:
    http://patchwork.ozlabs.org/patch/689592/Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

629823b8