Commits · a1ef48e1e8843e2f6be631b8cf1c21b24579b9d6 · nexedi / linux

21 Sep, 2015 10 commits

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · a1ef48e1

David S. Miller authored Sep 20, 2015

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-09-17

This series contains updates to i40e and i40evf.

Shannon provides updates to i40e and i40evf to resolve an issue with the
nvmupdate utility.  First renames a variable name to reduce confusion and
to differentiate it from the actual user variable.  Then added the ability
to save the admin queue write back descriptor if a caller supplies a
buffer for it to be saved into.  Added a new GetStatus command so that
the NVM update tool can query the current status instead of doing fake
write requests to probe for readiness.  Added wait states to the NVM
update state machine to signify when waiting for an update operation to
finish, whether we are in the middle of a set of write operations, or we
are now idle but waiting.  Then added a facility to run admin queue
commands through the NVM update utility in order to allow the update
tools to interact with the firmware and do special commands needed for
updates and configuration changes.  Also added a facility to recover the
result of a previously run admin queue command.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a1ef48e1

Merge tag 'linux-can-next-for-4.4-20150917' of... · eaf9a992

David S. Miller authored Sep 20, 2015

Merge tag 'linux-can-next-for-4.4-20150917' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2015-09-17

this is a pull request of two patches for net-next/master.

Gerhard Bertelsmann adds support for the CAN controller found on the
Allwinner A10/A20 SoC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

eaf9a992

rxrpc: Replace get_seconds with ktime_get_seconds · 22a3f9a2

Ksenija Stanojevic authored Sep 17, 2015

Replace time_t type and get_seconds function which are not y2038 safe
on 32-bit systems. Function ktime_get_seconds use monotonic instead of
real time and therefore will not cause overflow.
Signed-off-by: Ksenija Stanojevic <ksenija.stanojevic@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

22a3f9a2

Merge branch 'hsilicon-net-subsys' · 62e3b1d0

David S. Miller authored Sep 20, 2015

huangdaode says:

====================
net: Hisilicon Network Subsystem support

This is V2 of Hisilicon Network Subsystem(HNS) patchesets taking care
about LKML comments.

Please find out the changes from the change logs.
This patchset is rebased on mainline kernel Linux 4.3-rc1 branch.

[PATCH v2 1/5] Device Tree Binding Documentation
[PATCH v2 2/5] Merge MDIO Module
[PATCH v2 3/5] Hisilicon Network Acceleration Engine Framework
[PATCH v2 4/5] Distributed System Area Fabric Module
[PATCH v2 5/5] Basic Ethernet Driver Module

Changes from V1:
1. Remove "inline" in C file (according to LKML comment, same in below).
2. Fix a bug about class_find_device.
3. Change the DTS pattern on hnae, restruct it to compatible with Hi1610 soc.
4. Unified hip04_mdio and hip05_mdio into hns_mdio, which is more usaul for
   later SOCs.

V1 Patches Reference: https://lkml.org/lkml/2015/8/14/165
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

62e3b1d0

net: add Hisilicon Network Subsystem basic ethernet support · b5996f11

huangdaode authored Sep 17, 2015

This is to add basic ethernet support for HNS. It is one of the way to
use the HNS acceleration engine. But most of the decoding/encoding
capability of the AE cannot be used in this way.

This submit contains the basic feature as a ethernet driver. More will
be added later.
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b5996f11

net: add Hisilicon Network Subsystem DSAF support · 511e6bc0

huangdaode authored Sep 17, 2015

DSAF, namely Distributed System Area Fabric, is one of the HNS
acceleration engine implementation. This patch add DSAF driver to the
system.

hns_ae_adapt: the adaptor for registering the driver to HNAE framework
hns_dsaf_mac: MAC cover interface for GE and XGE
hns_dsaf_gmac: GE (10/100/1000G Ethernet) MAC function
hns_dsaf_xgmac: XGE (10000+G Ethernet) MAC function
hns_dsaf_main: the platform device driver for the whole hardware
hns_dsaf_misc: some misc helper function, such as LED support
hns_dsaf_ppe: packet process engine function
hns_dsaf_rcb: ring buffer function
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

511e6bc0

net: add Hisilicon Network Subsystem hnae framework support · 6fe6611f

huangdaode authored Sep 17, 2015

HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a
unified ring buffer interface for Hisilicon Network Acceleration
Engines.

With the interface, upper layer can work as ethernet driver, ODP driver
or other service driver on purpose.
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6fe6611f

net: add Hisilicon Network Subsystem MDIO support · 5b904d39

huangdaode authored Sep 17, 2015

The MDIO support for Hisilicon Network Subsystem. It is used in Hislicon
hip04, hip05 and Hi1610 SoC to control the external PHY
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b904d39

net: add Hisilicon Network Subsystem support (config and documents) · fc7e37c6

huangdaode authored Sep 17, 2015

The Hisilicon Network Subsystem is a long term evolution IP which is
supposed to be used in Hisilicon ICT SoC. The IP, which is called hns
for short, is a TCP/IP acceleration engine, which can directly decode
TCP/IP stream and distribute them to different ring buffers.

HNS can be configured to work on different mode for different scenario.
This patch make use only some of the mode to make it as standard
ethernet NIC. The other mode will be added soon.

The whole function has 4 kernel sub-modules:

hnae: the HNS acceleration engine framework. It provides a abstract
interface between the engine and the upper layers which make use of the
engine by ring buffer.

hns_enet_drv: a standard ethernet driver that base on the ring buffer.

hns_dsaf: one of the implementation of HNS acceleration engine, which is
applied on Hililicon hip05, Hi1610 and other later-on SoCs

hns_mdio: the mdio control to the PHY, used by acceleration engine

This submit add basic config and documents
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: Kenneth Lee <liguozhu@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc7e37c6

xen-netfront: always set num queues if possible · 812494d9

chas williams authored Sep 16, 2015

If netfront connects with two (or more) queues and then reconnects with
only one queue it fails to delete or rewrite the multi-queue-num-queues
key and netback will try to use the wrong number of queues.

Always write the num-queues field if the backend has multi-queue support.
Signed-off-by: Chas Williams <3chas3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

812494d9

18 Sep, 2015 30 commits

sch_dsmark: improve memory locality · 47bbbb30

Eric Dumazet authored Sep 17, 2015

Memory placement in sch_dsmark is silly : Better place mask/value
in the same cache line.

Also, we can embed small arrays in the first cache line and
remove a potential cache miss.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

47bbbb30

Merge branch 'bcmgenet-irq-coalesce' · 25354001

David S. Miller authored Sep 17, 2015

Florian Fainelli says:

====================
net: bcmgenet: Interrupt coalescing

This patch series adds support for interrupt coalescing for GENET
adapters.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

25354001

net: bcmgenet: Implement RX coalescing control knobs · 4a29645b

Florian Fainelli authored Sep 16, 2015

Add support for the ethtool rx-frames coalescing parameter which allows
defining the number of RX interrupts per frames received. The RDMA
engine supports a configurable timeout with a resolution of
approximately 8.192 us.

We can no longer enable the BDONE/PDONE interrupts as those would
fire for each packet/buffer received, which would defeat the MBDONE
interrupt purpose. The MBDONE interrupt is guaranteed to correspond to a
PDONE/BDONE interrupt when the threshold is set to 1.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a29645b

net: bcmgenet: Implement TX coalescing control knobs · 2f913070

Florian Fainelli authored Sep 16, 2015

Configuring the ethtool tx-frames property, which translates into N
packets before a TX interrupt is the simplest configuration scheme
because it requires no locking neither at the softare nor hardware
level, and is completely indepedent from the link speed. Since ethtool
does not allow per-tx queue coalescing parameters, we apply the same
setting to any transmit queue.

We can no longer enable the BDONE/PDONE interrupts as those would fire
for each packet/buffer received, which would defeat the MBDONE interrupt
purpose. The MBDONE interrupt is guaranteed to correspond to a
PDONE/BDONE interrupt when the threshold is set to 1, but offers
interrupt coalescing when the value is > 1.

Since the HW is configured to generate an interrupt when the ring
becomes emtpy, we have to deny any timeout/timer settings coming from
user-space to indicate we can only generate an interrupt very <N>
packets.

While we are at it, fix the DMA_INTR_THRESHOLD_MASK value which was off
by one bit (0xff vs. 0x1ff).
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f913070

lan78xx: Remove not defined MAC_CR_GMII_EN_ bit from MAC_CR. · 9110fe4a

Woojung.Huh@microchip.com authored Sep 16, 2015

Remove not defined MAC_CR_GMII_EN_ bit from MAC_CR.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9110fe4a

lan78xx: Create lan78xx_get_mdix_status() and lan78xx_set_mdix_status() for MDIX control. · 758c5c11

Woojung.Huh@microchip.com authored Sep 16, 2015

Create lan78xx_get_mdix_status() and lan78xx_set_mdix_status() for MDIX control.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

758c5c11

lan78xx: Remove phy defines in lan78xx.h and use defines in include/linux/microchipphy.h · bdfba55e

Woojung.Huh@microchip.com authored Sep 16, 2015

Remove phy defines in lan78xx.h and use defines in include/linux/microchipphy.h.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bdfba55e

lan78xx: Update to use phylib instead of mii_if_info. · ce85e13a

Woojung.Huh@microchip.com authored Sep 16, 2015

Update to use phylib instead of mii_if_info.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce85e13a

lan78xx: Add PHYLIB and MICROCHIP_PHY as default config. · 05fe68c0

Woojung.Huh@microchip.com authored Sep 16, 2015

Add PHYLIB and MICROCHIP_PHY as default configuration for lan78xx.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05fe68c0

lan78xx: Check device ready bit (PMT_CTL_READY_) after reset the PHY · 6c595b03

Woojung.Huh@microchip.com authored Sep 16, 2015

Check device ready bit (PMT_CTL_READY_) after reset the PHY.
Device may not be ready even if PHY_RST_ is cleared depends on configuration.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c595b03

net: Initialize table in fib result · bde6f9de

David Ahern authored Sep 16, 2015

Sergey, Richard and Fabio reported an oops in ip_route_input_noref. e.g., from Richard:

[    0.877040] BUG: unable to handle kernel NULL pointer dereference at 0000000000000056
[    0.877597] IP: [<ffffffff8155b5e2>] ip_route_input_noref+0x1a2/0xb00
[    0.877597] PGD 3fa14067 PUD 3fa6e067 PMD 0
[    0.877597] Oops: 0000 [#1] SMP
[    0.877597] Modules linked in: virtio_net virtio_pci virtio_ring virtio
[    0.877597] CPU: 1 PID: 119 Comm: ifconfig Not tainted 4.2.0+ #1
[    0.877597] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.877597] task: ffff88003fab0bc0 ti: ffff88003faa8000 task.ti: ffff88003faa8000
[    0.877597] RIP: 0010:[<ffffffff8155b5e2>]  [<ffffffff8155b5e2>] ip_route_input_noref+0x1a2/0xb00
[    0.877597] RSP: 0018:ffff88003ed03ba0  EFLAGS: 00010202
[    0.877597] RAX: 0000000000000046 RBX: 00000000ffffff8f RCX: 0000000000000020
[    0.877597] RDX: ffff88003fab50b8 RSI: 0000000000000200 RDI: ffffffff8152b4b8
[    0.877597] RBP: ffff88003ed03c50 R08: 0000000000000000 R09: 0000000000000000
[    0.877597] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003fab6f00
[    0.877597] R13: ffff88003fab5000 R14: 0000000000000000 R15: ffffffff81cb5600
[    0.877597] FS:  00007f6de5751700(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
[    0.877597] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.877597] CR2: 0000000000000056 CR3: 000000003fa6d000 CR4: 00000000000006e0
[    0.877597] Stack:
[    0.877597]  0000000000000000 0000000000000046 ffff88003fffa600 ffff88003ed03be0
[    0.877597]  ffff88003f9e2c00 697da8c0017da8c0 ffff880000000000 000000000007fd00
[    0.877597]  0000000000000000 0000000000000046 0000000000000000 0000000400000000
[    0.877597] Call Trace:
[    0.877597]  <IRQ>
[    0.877597]  [<ffffffff812bfa1f>] ? cpumask_next_and+0x2f/0x40
[    0.877597]  [<ffffffff8158e13c>] arp_process+0x39c/0x690
[    0.877597]  [<ffffffff8158e57e>] arp_rcv+0x13e/0x170
[    0.877597]  [<ffffffff8151feec>] __netif_receive_skb_core+0x60c/0xa00
[    0.877597]  [<ffffffff81515795>] ? __build_skb+0x25/0x100
[    0.877597]  [<ffffffff81515795>] ? __build_skb+0x25/0x100
[    0.877597]  [<ffffffff81521ff6>] __netif_receive_skb+0x16/0x70
[    0.877597]  [<ffffffff81522078>] netif_receive_skb_internal+0x28/0x90
[    0.877597]  [<ffffffff8152288f>] napi_gro_receive+0x7f/0xd0
[    0.877597]  [<ffffffffa0017906>] virtnet_receive+0x256/0x910 [virtio_net]
[    0.877597]  [<ffffffffa0017fd8>] virtnet_poll+0x18/0x80 [virtio_net]
[    0.877597]  [<ffffffff815234cd>] net_rx_action+0x1dd/0x2f0
[    0.877597]  [<ffffffff81053228>] __do_softirq+0x98/0x260
[    0.877597]  [<ffffffff8164969c>] do_softirq_own_stack+0x1c/0x30

The root cause is use of res.table uninitialized.

Thanks to Nikolay for noticing the uninitialized use amongst the maze of
gotos.

As Nikolay pointed out the second initialization is not required to fix
the oops, but rather to fix a related problem where a valid lookup should
be invalidated before creating the rth entry.

Fixes: b7503e0c ("net: Add FIB table id to rtable")
Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Reported-by: Richard Alpe <richard.alpe@ericsson.com>
Reported-by: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bde6f9de

Merge branch 'bpf_avoid_clone' · 41a9802f

David S. Miller authored Sep 17, 2015

Alexei Starovoitov says:

====================
bpf: performance improvements

v1->v2: dropped redundant iff_up check in patch 2

At plumbers we discussed different options on how to get rid of skb_clone
from bpf_clone_redirect(), the patch 2 implements the best option.
Patch 1 adds 'integrated exts' to cls_bpf to improve performance by
combining simple actions into bpf classifier.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

41a9802f

bpf: add bpf_redirect() helper · 27b29f63

Alexei Starovoitov authored Sep 15, 2015

Existing bpf_clone_redirect() helper clones skb before redirecting
it to RX or TX of destination netdev.
Introduce bpf_redirect() helper that does that without cloning.

Benchmarked with two hosts using 10G ixgbe NICs.
One host is doing line rate pktgen.
Another host is configured as:
$ tc qdisc add dev $dev ingress
$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
so it receives the packet on $dev and immediately xmits it on $dev + 1
The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
that does bpf_clone_redirect() and performance is 2.0 Mpps

$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
   action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
which is using bpf_redirect() - 2.4 Mpps

and using cls_bpf with integrated actions as:
$ tc filter add dev $dev root pref 10 \
  bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
performance is 2.5 Mpps

To summarize:
u32+act_bpf using clone_redirect - 2.0 Mpps
u32+act_bpf using redirect - 2.4 Mpps
cls_bpf using redirect - 2.5 Mpps

For comparison linux bridge in this setup is doing 2.1 Mpps
and ixgbe rx + drop in ip_rcv - 7.8 Mpps
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27b29f63

cls_bpf: introduce integrated actions · 045efa82

Daniel Borkmann authored Sep 15, 2015

Often cls_bpf classifier is used with single action drop attached.
Optimize this use case and let cls_bpf return both classid and action.
For backwards compatibility reasons enable this feature under
TCA_BPF_FLAG_ACT_DIRECT flag.

Then more interesting programs like the following are easier to write:
int cls_bpf_prog(struct __sk_buff *skb)
{
  /* classify arp, ip, ipv6 into different traffic classes
   * and drop all other packets
   */
  switch (skb->protocol) {
  case htons(ETH_P_ARP):
    skb->tc_classid = 1;
    break;
  case htons(ETH_P_IP):
    skb->tc_classid = 2;
    break;
  case htons(ETH_P_IPV6):
    skb->tc_classid = 3;
    break;
  default:
    return TC_ACT_SHOT;
  }

  return TC_ACT_OK;
}

Joint work with Daniel Borkmann.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

045efa82

net: only check perm protocol when register proto · f6c53334

Junwei Zhang authored Sep 18, 2015

The permanent protocol nodes are at the head of the list,
So only need check all these nodes.

No matter the new node is permanent or not,
insert the new node after the last permanent protocol node,

If the new node conflicts with existing permanent node,
return error.
Signed-off-by: Martin Zhang <martinbj2008@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f6c53334

bonding: use l4 hash if available · 4b1b865e

Eric Dumazet authored Sep 15, 2015

If skb carries a l4 hash, no need to perform a flow dissection.

Performance is slightly better :

lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39012e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39393e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.39988e+06

After patch :

lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.43579e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.44304e+06
lpaa5:~# ./super_netperf 200 -H lpaa6 -t TCP_RR -l 100
2.44312e+06
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4b1b865e

tcp: provide skb->hash to synack packets · 58d607d3

Eric Dumazet authored Sep 15, 2015

In commit b73c3d0e ("net: Save TX flow hash in sock and set in skbuf
on xmit"), Tom provided a l4 hash to most outgoing TCP packets.

We'd like to provide one as well for SYNACK packets, so that all packets
of a given flow share same txhash, to later enable bonding driver to
also use skb->hash to perform slave selection.

Note that a SYNACK retransmit shuffles the tx hash, as Tom did
in commit 265f94ff ("net: Recompute sk_txhash on negative routing
advice") for established sockets.

This has nice effect making TCP flows resilient to some kind of black
holes, even at connection establish phase.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

58d607d3

i40e/i40evf: Bump i40e to 1.3.21 and i40evf to 1.3.13 · f91638af

Catherine Sullivan authored Aug 28, 2015

Bump.

Change-ID: If7ce84218361defa209142d1d8c6f69d48c2d7ad
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

f91638af

i40e/i40evf: add get AQ result command to nvmupdate utility · b72dc7b1

Shannon Nelson authored Aug 28, 2015

Add a facility to recover the result of a previously run AQ command.

Change-ID: I21afec2c20c1a5e6ba60c7fbfcbedfff78c10e45
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b72dc7b1

i40e/i40evf: add exec_aq command to nvmupdate utility · e4c83c20

Shannon Nelson authored Aug 28, 2015

Add a facility to run AQ commands through the nvmupdate utility in order
to allow the update tools to interact with the FW and do special
commands needed for updates and configuration changes.

Change-ID: I5c41523e4055b37f8e4ee479f7a0574368f4a588
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e4c83c20

i40e/i40evf: add wait states to NVM state machine · 2f1b5bc8

Shannon Nelson authored Aug 28, 2015

This adds wait states to the NVM update state machine to signify when
waiting for an update operation to finish, whether we're in the middle
of a set of Write operations, or we're now idle but waiting.

Change-ID: Iabe91d6579ef6a2ea560647e374035656211ab43
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

2f1b5bc8

i40e/i40evf: add GetStatus command for nvmupdate · 0af8e9db

Shannon Nelson authored Aug 28, 2015

This adds a new GetStatus command so that the NVM update tool can query
the current status instead of doing fake write requests to probe for
readiness.

Change-ID: I671ec6ccd4dfc9dbac3a03b964589d693fda5cd8
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

0af8e9db

i40e/i40evf: add handling of writeback descriptor · 6b5c1b89

Shannon Nelson authored Aug 28, 2015

If the writeback descriptor buffer was previously created, this gives it
to the AQ command request to be used to save the results.

Change-ID: I8c8a1af81e6ebed6d0a15ed31697fe1a6c4e3708
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

6b5c1b89

i40e/i40evf: save aq writeback for future inspection · 87db27a9

Shannon Nelson authored Aug 27, 2015

Add the ability to save the AdminQ write back descriptor if a
caller supplies a buffer for it to be saved into.

Change-ID: I3d1301d26360b39a2d66dc8569e851f54133a3af
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

87db27a9

i40e: rename variable to prevent clash of understanding · 79afe839

Shannon Nelson authored Jul 23, 2015

This code returns something that becomes the errno value from ethtool and
passes around a pointer to an errno variable. This patch changes the name
slightly to differentiate it from the actual user errno variable.

Change-ID: Idaa37845c069e66f4cea072e90f471bb2142454d
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

79afe839

Merge branch 'nf_hook_netns' · bbe83731

David S. Miller authored Sep 17, 2015

Eric W. Biederman says:

====================
Passing net through the netfilter hooks

My primary goal with this patchset and it's follow ups is to cleanup the
network routing paths so that we do not look at the output device to
derive the network namespace.  My plan is to pass the network namespace
of the transmitting socket through the output path, to replace code that
looks at the output network device today.  Once that is done we can have
routes with output devices outside of the current network namespace.
Which should allow reception and transmission of packets in network
namespaces to be as fast as normal packet reception and transmission
with early demux disabled, because it will same code path.

Once skb_dst(skb)->dev is a little better under control I think it will
also be possible to use rcu to cleanup the ancient hack that sets
dst->dev to loopback_dev when a network device is removed.

The work to get there is a series of code cleanups.  I am starting with
passing net into the netfilter hooks and into the functions that are
called after the netfilter hooks.  This removes from netfilter the
need to guess which network namespace it is working on.

To get there I perform a series of minor prep patches so the big changes
at the end are possible to audit without getting lost in the noise.  In
particular I have a lot of patches computing net into a local variable
and then using it through out the function.

So this patchset encompases removing dead code, sorting out the _sk
functions that were added last time someone pushed a prototype change
through the post netfilter functions.  Cleaning up individual functions
use of the network namespace.  Passing net into the netfilter hooks.
Passing net into the post netfilter functions.  Using state->net in
the netfilter code where it is available and trivially usable.

Pablo, Dave I don't know whose tree this makes more sense to go
through.  I am assuming at least initially Pablos as netfilter is
involved.  From what I have seen there will be a lot of back and forth
between the netfilter code paths and the routing code paths.

The patches are also available (against 4.3-rc1) at:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/net-next.git master
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bbe83731

netfilter: Add blank lines in callers of netfilter hooks · be10de0a

Eric W. Biederman authored Sep 17, 2015

In code review it was noticed that I had failed to add some blank lines
in places where they are customarily used.  Taking a second look at the
code I have to agree blank lines would be nice so I have added them
here.
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be10de0a

netfilter: Pass net into okfn · 0c4b51f0

Eric W. Biederman authored Sep 15, 2015

This is immediately motivated by the bridge code that chains functions that
call into netfilter.  Without passing net into the okfns the bridge code would
need to guess about the best expression for the network namespace to process
packets in.

As net is frequently one of the first things computed in continuation functions
after netfilter has done it's job passing in the desired network namespace is in
many cases a code simplification.

To support this change the function dst_output_okfn is introduced to
simplify passing dst_output as an okfn.  For the moment dst_output_okfn
just silently drops the struct net.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c4b51f0

netfilter: Use nf_hook_state.net · 9dff2c96

Eric W. Biederman authored Sep 15, 2015

Instead of saying "net = dev_net(state->in?state->in:state->out)"
just say "state->net".  As that information is now availabe,
much less confusing and much less error prone.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9dff2c96

netfilter: Pass struct net into the netfilter hooks · 29a26a56

Eric W. Biederman authored Sep 15, 2015

Pass a network namespace parameter into the netfilter hooks.  At the
call site of the netfilter hooks the path a packet is taking through
the network stack is well known which allows the network namespace to
be easily and reliabily.

This allows the replacement of magic code like
"dev_net(state->in?:state->out)" that appears at the start of most
netfilter hooks with "state->net".

In almost all cases the network namespace passed in is derived
from the first network device passed in, guaranteeing those
paths will not see any changes in practice.

The exceptions are:
xfrm/xfrm_output.c:xfrm_output_resume()         xs_net(skb_dst(skb)->xfrm)
ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont()      ip_vs_conn_net(cp)
ipvs/ip_vs_xmit.c:ip_vs_send_or_cont()          ip_vs_conn_net(cp)
ipv4/raw.c:raw_send_hdrinc()                    sock_net(sk)
ipv6/ip6_output.c:ip6_xmit()			sock_net(sk)
ipv6/ndisc.c:ndisc_send_skb()                   dev_net(skb->dev) not dev_net(dst->dev)
ipv6/raw.c:raw6_send_hdrinc()                   sock_net(sk)
br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev

In all cases these exceptions seem to be a better expression for the
network namespace the packet is being processed in then the historic
"dev_net(in?in:out)".  I am documenting them in case something odd
pops up and someone starts trying to track down what happened.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29a26a56